---
license: apache-2.0
base_model: Qwen/Qwen3-30B-A3B
base_model_relation: quantized
quantized_by: turboderp
tags:
- exl3
---

EXL3 quants of [Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B)

[2.25 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/2.25bpw)

[3.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/3.0bpw)

[4.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/4.0bpw)

[5.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/5.0bpw)

[6.00 bits per weight](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/6.0bpw)

[8.00 bits per weight / H8](https://huggingface.co/turboderp/Qwen3-30B-A3B-exl3/tree/8.0bpw_H8)


| Model    | HumanEval pass@1 | KL-div vs FP16 (wiki2 20k tokens) | Top-1 agreement vs FP16 |
|----------|------------------|-----------------------------------|-------------------------|
| 2.25 bpw | 88.41%           | 0.1416                            | 84.78%                  |
| 3.00 bpw | 89.63%           | 0.0688                            | 89.44%                  |
| 4.00 bpw | 92.07%           | 0.0215                            | 94.33%                  |
| 5.00 bpw | 93.29%           | 0.0094                            | 96.24%                  |
| 6.00 bpw | 92.68%           | 0.0054                            | 97.45%                  |
| 8.00 bpw | 91.46%           | 0.0020                            | 98.36%                  |
| FP16     | 91.46%           | -                                 | -                       |
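
For readers unfamiliar with the two comparison columns, the sketch below shows one common way such numbers are obtained from per-token logits. It is an illustration only: the card does not state how the figures were computed, so the KL direction (FP16 as the reference distribution), the per-token averaging, and the function names are all assumptions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one token's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(ref_logits, test_logits):
    """KL(ref || test) for one token position (assumed direction)."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(test_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def top1(logits):
    """Index of the highest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


def compare(ref_rows, test_rows):
    """Return (mean KL divergence, top-1 agreement) over all positions."""
    n = len(ref_rows)
    pairs = list(zip(ref_rows, test_rows))
    mean_kl = sum(kl_divergence(r, t) for r, t in pairs) / n
    agreement = sum(top1(r) == top1(t) for r, t in pairs) / n
    return mean_kl, agreement
```

Here `ref_rows` would hold the FP16 model's logits and `test_rows` the quantized model's logits for the same token positions. A lower mean KL and a higher agreement both indicate outputs closer to the reference.
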