Data and Models for Extract+Think as part of Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models
- markendo/Visual-Extraction-Tuning-382K
  Viewer • Updated • 382k • 66
- markendo/llava-extract-qwen3-0.6B
  Image-Text-to-Text • 1.0B • Updated • 24
- markendo/llava-extract-qwen3-1.7B
  Image-Text-to-Text • 2B • Updated • 30
- markendo/llava-extract-from-scratch-qwen3-0.6B
  Image-Text-to-Text • 1.0B • Updated • 15