DreamOmni2: Multimodal Instruction-based Editing and Generation Paper • 2510.06679 • Published Oct 8 • 73
MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech Paper • 2509.25131 • Published Sep 29 • 15
MGM-Omni Collection MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech • 18 items • Updated Oct 11 • 10
Training-Free Efficient Video Generation via Dynamic Token Carving Paper • 2505.16864 • Published May 22 • 24
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning Paper • 2505.12081 • Published May 17 • 18
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published Dec 5, 2024 • 118
ControlNeXt: Powerful and Efficient Control for Image and Video Generation Paper • 2408.06070 • Published Aug 12, 2024 • 55