The Alignment Waltz: Jointly Training Agents to Collaborate for Safety Paper • 2510.08240 • Published Oct 9 • 40
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning Paper • 2509.22621 • Published Sep 26 • 8
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Article • Published Feb 11 • 85
Jointly Reinforcing Diversity and Quality in Language Model Generations Paper • 2509.02534 • Published Sep 2 • 24
Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback Paper • 2506.11930 • Published Jun 13 • 53