MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling • Paper • 2511.11793 • Published 4 days ago • 97
LiteAttention: A Temporal Sparse Attention for Diffusion Transformers • Paper • 2511.11062 • Published 5 days ago • 26
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B • Paper • 2511.06221 • Published 10 days ago • 103
DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation • Paper • 2511.06307 • Published 10 days ago • 49
Scaling Agent Learning via Experience Synthesis • Paper • 2511.03773 • Published 13 days ago • 75
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats • Paper • 2510.25602 • Published 20 days ago • 70
moonshotai/Kimi-Linear-48B-A3B-Instruct • Text Generation • 49B • Updated 1 day ago • 312k • 453
Scaling Latent Reasoning via Looped Language Models • Paper • 2510.25741 • Published 20 days ago • 211
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning • Paper • 2510.19338 • Published 28 days ago • 111
Efficient Long-context Language Model Training by Core Attention Disaggregation • Paper • 2510.18121 • Published 29 days ago • 118