Generalizing Test-time Compute-optimal Scaling as an Optimizable Graph Paper • 2511.00086 • Published 26 days ago • 41
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning Paper • 2509.22576 • Published Sep 26 • 133
AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks Paper • 2508.00890 • Published Jul 26 • 6
A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness Paper • 2411.03350 • Published Nov 4, 2024 • 2
Article A Survey of Small Language Models in the Era of LLMs: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness Jul 16 • 4
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge Paper • 2411.16594 • Published Nov 25, 2024 • 41