Enrich VLMs’ vision-centric reasoning capabilities via Chain-of-Visual-Thought!
YM Qin
Wakals
AI & ML interests
Computer Vision, Vision-language Model, Generative Model
Recent Activity
Updated a collection about 8 hours ago: CoVT: Chain-of-Visual-Thought
Updated a dataset about 21 hours ago: Wakals/CoVT-Dataset
Updated a model about 22 hours ago: Wakals/CoVT-LLaVA-13B-depth
Organizations: None yet