Model Overview
- Model Architecture: Qwen3-30B-A3B-Thinking-2507
- Input: Text
- Output: Text
- Supported Hardware Microarchitecture: AMD MI350/MI355
- ROCm: 7.0
- Operating System(s): Linux
- Inference Engine: vLLM
- Model Optimizer: AMD-Quark
- Weight quantization: Per-channel, FP8 E4M3, static
- Activation quantization: Per-token, FP8 E4M3, dynamic
- Calibration Dataset: Pile
This model was built from the Qwen3-30B-A3B-Thinking-2507 model by applying AMD-Quark for PTPC (per-token activation, per-channel weight) FP8 quantization.
Model Quantization
The model was quantized from Qwen/Qwen3-30B-A3B-Thinking-2507 using AMD-Quark. The weights are quantized per-channel to FP8 with static scales, and the activations are quantized per-token to FP8 with dynamic scales.
Quantization scripts:
```
cd Quark/examples/torch/language_modeling/llm_ptq/
python3 internal_scripts/quantize_quark.py --model_dir Qwen/Qwen3-30B-A3B-Thinking-2507 \
    --quant_scheme w_fp8_per_channel_static_a_fp8_per_token_dynamic \
    --exclude_layers "*lm_head" "*mlp.gate" \
    --num_calib_data 512 \
    --output_dir amd/Qwen3-30B-A3B-Thinking-2507-ptpc \
    --model_export hf_format
```
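For intuition, here is a minimal PyTorch sketch of the scale computation implied by the w_fp8_per_channel_static_a_fp8_per_token_dynamic scheme. This is an illustrative assumption about the numerics, not the AMD-Quark implementation:
```python
import torch

FP8_E4M3_MAX = 448.0  # largest representable magnitude in FP8 E4M3 (e4m3fn)

def per_channel_weight_scales(weight: torch.Tensor) -> torch.Tensor:
    # Static: one scale per output channel, computed once from the weights.
    absmax = weight.abs().amax(dim=1, keepdim=True)  # [out_channels, 1]
    return absmax.clamp(min=1e-12) / FP8_E4M3_MAX

def per_token_activation_scales(activation: torch.Tensor) -> torch.Tensor:
    # Dynamic: one scale per token, recomputed from each activation at runtime.
    absmax = activation.abs().amax(dim=-1, keepdim=True)  # [..., tokens, 1]
    return absmax.clamp(min=1e-12) / FP8_E4M3_MAX

def fake_quant_fp8(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Quantize to FP8 E4M3 and dequantize back, to mimic the numeric effect.
    # Clamp before casting since e4m3fn has no representable infinity.
    q = (x / scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX).to(torch.float8_e4m3fn)
    return q.to(x.dtype) * scale
```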
Accuracy
| Benchmark | Qwen3-30B-A3B-Thinking-2507 | Qwen3-30B-A3B-Thinking-2507-ptpc (this model) |
|-----------|-----------------------------|-----------------------------------------------|
| GSM8K     | 0.755                       | 0.720                                         |
Reproduction
The GSM8K result was obtained with lm-evaluation-harness using the vLLM backend.
GSM8K
```
lm_eval --model vllm \
    --model_args pretrained=/model_path/Qwen/Qwen3-30B-A3B-Thinking-2507-ptpc,add_bos_token=true,tensor_parallel_size=2 \
    --tasks gsm8k \
    --num_fewshot 5 \
    --batch_size auto \
    --limit 200
```
Deployment
Use with vLLM
This model can be deployed efficiently using the vLLM backend.
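As a starting point, here is a minimal offline-inference sketch using vLLM's Python API. The sampling settings and tensor_parallel_size=2 (mirroring the evaluation command above) are illustrative assumptions, not requirements:
```python
from vllm import LLM, SamplingParams

# Load the quantized checkpoint; adjust tensor_parallel_size to your GPU count.
llm = LLM(model="amd/Qwen3-30B-A3B-Thinking-2507-ptpc", tensor_parallel_size=2)

sampling_params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=1024)

prompts = ["Explain the difference between static and dynamic quantization."]
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.outputs[0].text)
```
The model can also be served over an OpenAI-compatible API with `vllm serve amd/Qwen3-30B-A3B-Thinking-2507-ptpc`.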
Evaluation
Additional evaluation results and reproduction scripts are being prepared.
License
Modifications Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.