SII-LeeSXian

LEE0v0

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 12 days ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published 13 days ago • 191

upvoted a paper 21 days ago

RoboOmni: Proactive Robot Manipulation in Omni-modal Context

Paper • 2510.23763 • Published 23 days ago • 53

upvoted a collection 3 months ago

EO-Robotics

EmbodiedOneVision is a unified framework for multimodal embodied reasoning and robot control, featuring interleaved vision-text-action pretraining. • 5 items • Updated Sep 16 • 8

upvoted 2 papers 8 months ago

Unicorn: Text-Only Data Synthesis for Vision Language Model Training

Paper • 2503.22655 • Published Mar 28 • 39

DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation

Paper • 2503.06053 • Published Mar 8 • 138

upvoted a paper over 1 year ago

RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

Paper • 2407.05131 • Published Jul 6, 2024 • 27