The Smol Training Playbook: The Secrets to Building World-Class LLMs • Explore loss curves for training LLMs • 1.8k
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch • May 21 • 229
The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters • 3.45k
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 247
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights Paper • 2502.09619 • Published Feb 13 • 35
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO Paper • 2505.22453 • Published May 28 • 46
Article Introducing smolagents: simple agents that write actions in code. • Dec 31, 2024 • 1.14k
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 248