Hanning Zhang

HanningZhang
AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago
CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents
upvoted a paper 26 days ago
GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving
upvoted a paper 26 days ago
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Organizations

RLHFlow, mytestdpo, ScaleBio Baseline, UIUC ScaleML Lab

authored a paper 6 months ago

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5 • 25
authored a paper 8 months ago

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 83