GGUF models
These models were converted from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B with llama.cpp. Note that they contain bias tensors in both the q_proj and k_proj layers.
The bf16 and f16 models were converted with the llama.cpp b4514 release, because the latest main branch of llama.cpp failed to convert the HF model to GGUF format (as of 2025/7/28).
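For reference, a minimal conversion sketch, assuming b4514 is available as a git tag; the local directory and output filenames are illustrative:

```sh
# Fetch the original HF checkpoint (local directory name is an assumption)
huggingface-cli download deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
  --local-dir DeepSeek-R1-Distill-Qwen-1.5B

# Pin llama.cpp to the b4514 release used for conversion
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout b4514

# Convert the HF model to GGUF; repeat with --outtype f16 for the f16 variant
python convert_hf_to_gguf.py ../DeepSeek-R1-Distill-Qwen-1.5B \
  --outfile ../DeepSeek-R1-Distill-Qwen-1.5B-bf16.gguf \
  --outtype bf16
```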
Both of them can be evaluated with the latest llama.cpp via ./build/bin/llama-perplexity. However, when we evaluate the bf16 and f16 models with lighteval on the vLLM backend, it seems to fail to load the models correctly (as of 2025/8/5).
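A sketch of the perplexity check, assuming a CMake build of current llama.cpp; the .gguf filename is illustrative, and any plain-text corpus (commonly the WikiText-2 test split) can be passed to -f:

```sh
# Build current llama.cpp with CMake
cmake -B build
cmake --build build --config Release

# Compute perplexity over a plain-text corpus; the model path is an example
./build/bin/llama-perplexity \
  -m DeepSeek-R1-Distill-Qwen-1.5B-bf16.gguf \
  -f wikitext-2-raw/wiki.test.raw
```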
The collection includes 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit variants.
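The lower-bit variants can presumably be reproduced from the 16-bit GGUF with llama-quantize; a minimal sketch with illustrative filenames, using Q4_K_M as a stand-in for whichever 4-bit type the collection ships:

```sh
# Quantize the f16 GGUF down to a 4-bit K-quant (type name is an example)
./build/bin/llama-quantize \
  DeepSeek-R1-Distill-Qwen-1.5B-f16.gguf \
  DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf \
  Q4_K_M
```

Other bit-widths correspond to llama.cpp quantization types such as Q3_K_M, Q5_K_M, Q6_K, and Q8_0.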