Jian Hu

chuyi777

https://hujian.website

hijkzzz

AI & ML interests

Reinforcement Learning

Recent Activity

liked a model 2 months ago

moonshotai/Kimi-K2-Instruct-0905

upvoted a paper 20 days ago

DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

Paper • 2510.15110 • Published 25 days ago • 15

upvoted a paper about 1 month ago

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1 • 17

upvoted a paper 3 months ago

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper • 2508.08221 • Published Aug 11 • 48

upvoted an article 5 months ago

Learn the Hugging Face Kernel Hub in 5 Minutes

Article • Jun 12 • 149

upvoted a paper 5 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 139

upvoted 2 papers 7 months ago

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Paper • 2504.15271 • Published Apr 21 • 67

A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Paper • 2504.11343 • Published Apr 15 • 19

upvoted 2 papers 9 months ago

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published Feb 20 • 47

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 124

upvoted a paper 10 months ago

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 102

upvoted a paper 11 months ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 84

upvoted an article over 1 year ago

4D masks support in Transformers

Article • Jan 8, 2024 • 30

upvoted a paper over 1 year ago

OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published May 20, 2024 • 41