Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published 12 days ago • 83
Article Granite 4.0 Nano: Just how small can you go? By ibm-granite and 1 other • 12 days ago • 108
Attention Is All You Need for KV Cache in Diffusion LLMs Paper • 2510.14973 • Published 23 days ago • 37
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning Paper • 2510.12693 • Published 25 days ago • 26
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published 30 days ago • 48
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward Paper • 2510.03222 • Published Oct 3 • 49
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 26 days ago • 173
Efficient Intent Detection with Dual Sentence Encoders Paper • 2003.04807 • Published Mar 10, 2020 • 2
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention Paper • 2510.04212 • Published Oct 5 • 22
Free Lunch Alignment of Text-to-Image Diffusion Models without Preference Image Pairs Paper • 2509.25771 • Published Sep 30 • 10
Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models Paper • 2510.03561 • Published Oct 3 • 23
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models Paper • 2510.05034 • Published Oct 6 • 46
No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping Paper • 2509.21880 • Published Sep 26 • 51
LongCodeZip: Compress Long Context for Code Language Models Paper • 2510.00446 • Published Oct 1 • 107
LucidFlux: Caption-Free Universal Image Restoration via a Large-Scale Diffusion Transformer Paper • 2509.22414 • Published Sep 26 • 21
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing Paper • 2509.22651 • Published Sep 26 • 22
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation Paper • 2509.25849 • Published Sep 30 • 47
More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models Paper • 2509.25848 • Published Sep 30 • 78