Generalist Foundation Models Are Not Clinical Enough for Hospital Operations • Paper • 2511.13703 • Published 6 days ago • 17
SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking • Paper • 2511.16618 • Published 3 days ago • 6
First Frame Is the Place to Go for Video Content Customization • Paper • 2511.15700 • Published 4 days ago • 46
VisPlay: Self-Evolving Vision-Language Models from Images • Paper • 2511.15661 • Published 4 days ago • 38
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation • Paper • 2511.14993 • Published 4 days ago • 181
VIDEOP2R: Video Understanding from Perception to Reasoning • Paper • 2511.11113 • Published 9 days ago • 104
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space • Paper • 2511.10555 • Published 10 days ago • 52
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image • Paper • 2511.13648 • Published 6 days ago • 46
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data • Paper • 2511.12609 • Published 7 days ago • 98
Depth Anything 3: Recovering the Visual Space from Any Views • Paper • 2511.10647 • Published 10 days ago • 80
UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist • Paper • 2511.08521 • Published 12 days ago • 37
Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising • Paper • 2511.08633 • Published 14 days ago • 51
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds • Paper • 2511.08892 • Published 11 days ago • 175
GUI-360: A Comprehensive Dataset and Benchmark for Computer-Using Agents • Paper • 2511.04307 • Published 17 days ago • 14
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions • Paper • 2511.03334 • Published 18 days ago • 50
How Far Are Surgeons from Surgical World Models? A Pilot Study on Zero-shot Surgical Video Generation with Expert Assessment • Paper • 2511.01775 • Published 20 days ago • 6
Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals • Paper • 2510.27684 • Published 23 days ago • 21
MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency • Paper • 2510.25897 • Published 25 days ago • 16