HoloScene: Simulation-Ready Interactive 3D Worlds from a Single Video • Paper • 2510.05560 • Published Oct 7 • 7
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning • Paper • 2510.06217 • Published Oct 7 • 62
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation • Paper • 2510.08673 • Published about 1 month ago • 121
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI • Paper • 2510.05684 • Published Oct 7 • 136
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives • Paper • 2510.20822 • Published 16 days ago • 38
DeepAgent: A General Reasoning Agent with Scalable Toolsets • Paper • 2510.21618 • Published 15 days ago • 92
Video-As-Prompt: Unified Semantic Control for Video Generation • Paper • 2510.20888 • Published 16 days ago • 44
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation • Paper • 2510.21583 • Published 15 days ago • 30
RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging • Paper • 2510.20479 • Published 16 days ago • 10