VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published 9 days ago • 104
ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries Paper • 2511.14349 • Published 5 days ago • 15
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published 17 days ago • 195
ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation Paper • 2511.01163 • Published 20 days ago • 31
World Simulation with Video Foundation Models for Physical AI Paper • 2511.00062 • Published 26 days ago • 39
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents Paper • 2510.23691 • Published 27 days ago • 51
Gauss Gym Datasets Collection Datasets used for the Gauss Gym photorealistic simulator • 4 items • Updated Oct 17 • 8
FlashWorld: High-quality 3D Scene Generation within Seconds Paper • 2510.13678 • Published Oct 15 • 70
Introduction to MedVideoCap-55K: A New, Large-Scale, High-Quality Medical Video-Caption Pair Dataset Article • Jun 25 • 10
CommonForms: A Large, Diverse Dataset for Form Field Detection Paper • 2509.16506 • Published Sep 20 • 19
EgoLife Collection CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated Mar 7 • 20
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published Jan 7 • 81
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation Paper • 2509.19296 • Published Sep 23 • 23
Matrix-3D: Omnidirectional Explorable 3D World Generation Paper • 2508.08086 • Published Aug 11 • 75
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning Paper • 2508.10433 • Published Aug 14 • 143
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing Paper • 2508.10881 • Published Aug 14 • 52
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 383