INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published 21 days ago • 71
Llama-3.2-Taiwan Collection Based on the meta-llama/Llama-3.2-*B model, we continue pre-training on a large corpus of Traditional Chinese and non-Chinese language data. • 9 items • Updated Apr 26 • 1
UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset Paper • 2510.20661 • Published 27 days ago • 13
Formosa-1 Series Collection A collection of Formosa-1 (F1) reasoning models and datasets focused on Traditional Chinese instruction-following and logic. • 4 items • Updated Oct 13 • 4
Eval Logs Collection Benchmark log generated with Twinkle Eval, recording the model's outputs for each prompt. • 2 items • Updated Oct 13 • 4
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25 • 205
InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28 • 103
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 381
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 75
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9 • 716
FLUX.1 Collection A collection of our FLUX.1 models and LoRAs. • 10 items • Updated Oct 14 • 243