arXiv:2509.14232
Zhaokai Wang
wzk1015
AI & ML interests
Computer Vision
Music Generation
Multimodal Large Language Models
Recent Activity
liked a model 16 days ago
Zhenxin-Lei/MetaCaptioner
upvoted a paper 20 days ago
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
upvoted a paper about 1 month ago
NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints