Article Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models 6 days ago • 21
ServiceNow-AI/Apriel-H1-15b-Thinker-SFT Text Generation • 16B • Updated 21 days ago • 842 • 25
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published 14 days ago • 99
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 188
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2 • 224
How to Train Your LLM Web Agent: A Statistical Diagnosis Paper • 2507.04103 • Published Jul 5 • 50 • 3
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks Paper • 2407.05291 • Published Jul 7, 2024 • 2
GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models Paper • 2411.05830 • Published Nov 5, 2024 • 21