Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling Paper • 2504.13169 • Published Apr 17 • 39
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published Mar 30 • 94
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing Paper • 2503.19385 • Published Mar 25 • 34
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 10 • 344
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data Paper • 2401.10891 • Published Jan 19, 2024 • 62
HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models Paper • 2307.06949 • Published Jul 13, 2023 • 51