Update README.md
Browse files
README.md
CHANGED
|
@@ -2,4 +2,31 @@
|
|
| 2 |
license: apache-2.0
|
| 3 |
base_model:
|
| 4 |
- Qwen/Qwen3-32B
|
| 5 |
-
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---
license: apache-2.0
base_model:
- Qwen/Qwen3-32B
---
|
| 6 |
+
|
| 7 |
+
This model was created with the following code:
|
| 8 |
+
|
| 9 |
+
```python
|
| 10 |
+
# Quantize Qwen/Qwen3-32B to 4-bit GPTQ and save it into the local HF cache.
import os

from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig
from huggingface_hub import constants

model_id = "Qwen/Qwen3-32B"

# Save the quantized model in the HF cache directory, under a
# "models--quantized--<org>--<name>" folder mirroring the hub cache layout.
cache_dir = constants.HF_HUB_CACHE
quant_path = os.path.join(cache_dir, "models--quantized--" + model_id.replace("/", "--"))
os.makedirs(quant_path, exist_ok=True)

# Load calibration data (1024 samples from C4)
calibration_dataset = load_dataset(
    "allenai/c4",
    data_files="en/c4-train.00001-of-01024.json.gz",
    split="train"
).select(range(1024))["text"]

# Configure and run quantization (4-bit weights, group size 128)
quant_config = QuantizeConfig(bits=4, group_size=128)
model = GPTQModel.load(model_id, quant_config)
model.quantize(calibration_dataset, batch_size=2)
model.save(quant_path)
```
|