Resolving Discrepancies in Compute-Optimal Scaling of Language Models Paper • 2406.19146 • Published Jun 27, 2024 • 1
Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets Paper • 2506.04598 • Published Jun 5 • 6
Reproducible scaling laws for contrastive language-image learning Paper • 2212.07143 • Published Dec 14, 2022 • 2
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 131
Text-to-Image Base Models Collection All open-source text-to-image base models, with their respective licenses • 28 items • Updated May 10, 2024 • 27
OpenCLIP DataComp Collection OpenCLIP models trained on DataComp (https://huggingface.co/papers/2304.14108). • 6 items • Updated Oct 20 • 7
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 46