Official AQLM quantization of meta-llama/Meta-Llama-3.1-70B-Instruct, fine-tuned with PV-Tuning.

For this quantization, we used 1 codebook of 16 bits and a group size of 8.
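The "2Bit" in the model name follows from the configuration above. A back-of-the-envelope sketch, assuming the usual AQLM storage cost (each group of `group_size` weights is encoded by `num_codebooks` codes of `codebook_bits` bits each; codebook storage is amortized and ignored here):

```python
# Quantization config from this card.
num_codebooks = 1
codebook_bits = 16
group_size = 8

# Bits spent per weight: one 16-bit code per group of 8 weights.
bits_per_weight = num_codebooks * codebook_bits / group_size
print(bits_per_weight)  # 2.0 -> the "2Bit" in the model name

# Rough size of the quantized weight codes alone for ~70B parameters
# (embeddings, norms, and the codebooks themselves push the real
# checkpoint to the ~21.9 GB reported below).
approx_gb = 70e9 * bits_per_weight / 8 / 1e9
print(round(approx_gb, 1))  # 17.5
```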

Results:

| Quantization | MMLU (5-shot) | ArcC | ArcE | HellaSwag | PiQA | Winogrande | Model size, GB |
|--------------|---------------|------|------|-----------|------|------------|----------------|
| fp16         | 0.8213        | 0.6246 | 0.8683 | 0.6516 | 0.8313 | 0.7908 | 141  |
| 1x16g8       | 0.7814        | 0.5478 | 0.8270 | 0.6284 | 0.8036 | 0.7814 | 21.9 |
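A minimal loading sketch for this checkpoint (an illustration, not part of the original card): it assumes the `aqlm` and `transformers` packages are installed, and the import is deferred so that nothing is downloaded until `load()` is actually called (the checkpoint is ~22 GB).

```python
MODEL_ID = "ISTA-DASLab/Meta-Llama-3.1-70B-Instruct-AQLM-PV-2Bit-1x16"

def load():
    """Download and load the quantized model and its tokenizer.

    Requires `pip install aqlm[gpu] transformers` and enough GPU memory
    for the ~22 GB checkpoint.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",
        device_map="auto",  # shard layers across available GPUs
    )
    return tokenizer, model
```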