Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models Paper • 2506.01413 • Published Jun 2 • 16
MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs Paper • 2505.24858 • Published May 30 • 17
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models Paper • 2505.07591 • Published May 12 • 11
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 186
Cost-of-Pass: An Economic Framework for Evaluating Language Models Paper • 2504.13359 • Published Apr 17 • 4
IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval Paper • 2503.04644 • Published Mar 6 • 21
Language Models' Factuality Depends on the Language of Inquiry Paper • 2502.17955 • Published Feb 25 • 33
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12, 2024 • 126
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published Feb 20 • 100
Linear Correlation in LM's Compositional Generalization and Hallucination Paper • 2502.04520 • Published Feb 6 • 11
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 222