The Smol Training Playbook 📚 • Running on CPU Upgrade • Featured • 2.94k • The secrets to building world-class LLMs
moonshotai/Kimi-Linear-48B-A3B-Instruct Text Generation • 49B • Updated Dec 16, 2025 • 36.8k • 531
OpenGVLab/InternVL3_5-241B-A28B Image-Text-to-Text • 241B • Updated Aug 29, 2025 • 1.44k • 134
moonshotai/Kimi-K2-Instruct Text Generation • 1T • Updated about 23 hours ago • 178k • 2.31k
Article • You could have designed state of the art positional encoding • Nov 25, 2024 • 442
Article • A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons • Feb 4, 2025 • 28
moonshotai/Kimi-VL-A3B-Thinking-2506 Image-Text-to-Text • 16B • Updated about 23 hours ago • 152k • 342
The Ultra-Scale Playbook 🌌 • Running • 3.67k • The ultimate guide to training LLM on large GPU Clusters
Article • nanoVLM: The simplest repository to train your VLM in pure PyTorch • May 21, 2025 • 251