  1. Samba/pretrain.py at main · microsoft/Samba · GitHub

    Should rewrite it in the future. if resume: if curr_iter < initial_iter: curr_iter += 1 continue else: resume = False curr_iter = -1 fabric.barrier() fabric.print("resume finished, taken {} seconds".format …

  2. Samba: Simple Hybrid State Space Models for Efficient ... - GitHub

    Samba is a simple yet powerful hybrid model with an unlimited context length. Its architecture is frustratingly simple: Samba = Mamba + MLP + Sliding Window Attention + MLP stacking at the layer …

  3. Samba/lit_gpt/model.py at main · microsoft/Samba · GitHub

    [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling - Samba/lit_gpt/model.py at main · microsoft/Samba

  4. Pulse · microsoft/Samba · GitHub

    [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling - Pulse · microsoft/Samba

  5. LitGPT · Issue #6 · microsoft/Samba - GitHub

    Jun 14, 2024 · Congrats on this research milestone 🙌! And it’s nice to see that our LitGPT library has been useful for this project. However, note that LitGPT is an open-source project, and the …

  6. Models · Issue #4 · microsoft/Samba - GitHub

    Jun 12, 2024 · [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling - microsoft/Samba

  7. finetune code · Issue #20 · microsoft/Samba · GitHub

    I'm currently pretraining with the Samba architecture. But I want to pretrain this model and finetune it to suit a specific task. Wondering if there's any related code or material I can help with.

  8. Error when using Docker · Issue #10 · microsoft/Samba · GitHub


  9. Inferrence Code · Issue #8 · microsoft/Samba · GitHub

    Jun 18, 2024 · Amazing work, team! Thank you sincerely for sharing. I have trained a toy model but have completely failed creating an inference script. Sharing one would be sincerely appreciated!

  10. Samba/assets/Samba-pic.webp at main · microsoft/Samba · GitHub

    [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling - microsoft/Samba