Let it Calm: Exploratory Annealed Decoding for Verifiable Reinforcement Learning Paper • 2510.05251 • Published Oct 6
ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning Paper • 2504.21370 • Published Apr 30