VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models Paper • 2509.19803 • Published Sep 24, 2025 • 118
Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models Paper • 2403.03432 • Published Mar 6, 2024 • 1
AirRAG: Activating Intrinsic Reasoning for Retrieval Augmented Generation via Tree-based Search Paper • 2501.10053 • Published Jan 17, 2025
PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning Paper • 2508.21104 • Published Aug 28, 2025 • 35