Wei Xiong

weqweasdas

https://weixiongust.github.io/WeiXiongUST/index.html

AI & ML interests

Machine learning, RLHF

Recent Activity

upvoted a paper 6 days ago

Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning

updated a dataset 14 days ago

weqweasdas/qwen15b_train_simple_subset5k_for_difficulty_transition

published a dataset 14 days ago

weqweasdas/qwen15b_train_simple_subset5k_for_difficulty_transition

Authored 4 papers (over 1 year ago)

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 71

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

Paper • 2312.11456 • Published Dec 18, 2023 • 1

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

Paper • 2306.12420 • Published Jun 21, 2023 • 2

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Paper • 2304.06767 • Published Apr 13, 2023 • 2