TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments Paper • 2510.01179 • Published Oct 1 • 25
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning Paper • 2505.14625 • Published May 20 • 13