Wujian Peng(SII)

wjpoom

https://scholar.google.com/citations?user=GTuWk9YAAAAJ&hl=zh-CN

AI & ML interests

None yet

Recent Activity

upvoted a paper 29 days ago

RoboOmni: Proactive Robot Manipulation in Omni-modal Context

Paper • 2510.23763 • Published about 1 month ago • 53

upvoted a paper about 1 month ago

LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models

Paper • 2510.13626 • Published Oct 15 • 44

upvoted a paper 3 months ago

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28 • 89

upvoted 2 papers 6 months ago

Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Paper • 2505.18600 • Published May 24 • 48

CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

Paper • 2505.12504 • Published May 18 • 24

upvoted a paper 7 months ago

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

Paper • 2505.03318 • Published May 6 • 93

upvoted a paper 8 months ago

CoMP: Continual Multimodal Pre-training for Vision Foundation Models

Paper • 2503.18931 • Published Mar 24 • 30

upvoted 2 papers 9 months ago

World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning

Paper • 2503.10480 • Published Mar 13 • 55

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 123

upvoted 3 papers 12 months ago

Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding

Paper • 2312.00081 • Published Nov 30, 2023 • 2

Cross-Modality Safety Alignment

Paper • 2406.15279 • Published Jun 21, 2024 • 5

Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning

Paper • 2412.03565 • Published Dec 4, 2024 • 11

upvoted 5 papers over 1 year ago

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Paper • 2407.07053 • Published Jul 9, 2024 • 47

Video Diffusion Alignment via Reward Gradients

Paper • 2407.08737 • Published Jul 11, 2024 • 49

upvoted an article over 1 year ago

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Article • Apr 15, 2024 • 190

upvoted 2 papers over 1 year ago

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

Paper • 2407.01284 • Published Jul 1, 2024 • 82

Multi-Object Hallucination in Vision-Language Models

Paper • 2407.06192 • Published Jul 8, 2024 • 12
