
Wei Xiong

weqweasdas
https://weixiongust.github.io/WeiXiongUST/index.html

AI & ML interests

Machine learning, RLHF

Recent Activity

upvoted a paper 6 days ago
Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning
updated a dataset 14 days ago
weqweasdas/qwen15b_train_simple_subset5k_for_difficulty_transition
published a dataset 14 days ago
weqweasdas/qwen15b_train_simple_subset5k_for_difficulty_transition

Organizations

reward modeling, raft_study, Directional Preference Alignment, RLHFlow, RRLHF, TIRData, feedbackagent, myselfrew, selfcorrexp, selfcorrexp2, mytestdpo, tmpmodelsave, qwselfcorr, dsrtrain, dsrselfcorr, ptllama, raftstudy, Reinforce, UIUC ScaleML Lab

authored 4 papers over 1 year ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 71

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

Paper • 2312.11456 • Published Dec 18, 2023 • 1

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

Paper • 2306.12420 • Published Jun 21, 2023 • 2

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Paper • 2304.06767 • Published Apr 13, 2023 • 2