Qwen2.5-72B – Supervised Fine-Tuning (SFT) with LoRA Adapters
Model type: Causal Language Model
Base model: Qwen/Qwen2.5-72B
License: Apache 2.0
Framework: Axolotl + DeepSpeed ZeRO-1
Overview
qwen2.5-72b-sft is a supervised fine-tuned version of Qwen2.5-72B, trained with LoRA adapters on top of a 4-bit NF4-quantized base for efficient adaptation.
This release contains only the LoRA adapters and training configuration, allowing users to load them on top of the official Qwen 2.5-72B base model.
The SFT objective refines the model's question-answering and conversational skills using synthetic QA data.
Training was performed on the Leonardo EuroHPC supercomputer using Axolotl 0.6 + DeepSpeed ZeRO-1 optimization with bfloat16 computation.
Training Setup
| Component | Specification |
|---|---|
| Objective | Supervised fine-tuning (chat QA pairs) |
| Adapter type | LoRA |
| Quantization | 4-bit NF4 (bnb) |
| Precision | bfloat16 |
| Hardware | 8 nodes × 2 × NVIDIA A100 64 GB GPUs |
| Framework | Axolotl + DeepSpeed ZeRO-1 (PyTorch 2.5.1 + CUDA 12.1) |
| Runtime | ≈ 24 hours |
| Checkpoints | Saved every 1/10 of an epoch |
| Loss watchdog | threshold = 5.0, patience = 3 |
Dataset
Name: axolotl_deduplicated_synthetic_qa.jsonl
Type: Instruction-following synthetic QA dataset
Each sample follows the QA/chat format expected by Axolotl's alpaca_chat.load_qa schema.
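For orientation, here is a minimal sketch of what one record in the JSONL file might look like, assuming the question/answer field names typically consumed by Axolotl's alpaca_chat.load_qa loader (the exact schema should be checked against the Axolotl documentation):

```python
import json

# Hypothetical example record; the field names are an assumption based on
# the alpaca_chat.load_qa prompt strategy, not taken from the actual dataset.
record = {
    "question": "What is the role of AI in renewable energy optimization?",
    "answer": "AI models forecast demand and generation, helping operators balance the grid.",
}

# One JSON object per line, as expected for a .jsonl dataset file.
with open("axolotl_deduplicated_synthetic_qa.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```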
Hyperparameters
| Parameter | Value |
|---|---|
| Sequence length | 2048 |
| Micro batch size | 1 |
| Gradient accumulation | 4 |
| Epochs | 1 |
| Learning rate | 0.0001 |
| LR scheduler | cosine |
| Optimizer | AdamW (8-bit) |
| Warmup steps | 20 |
| Weight decay | 0.0 |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Gradient checkpointing | ✓ |
| Flash attention | ✓ |
| Auto resume | ✓ |
| bnb 4-bit compute dtype | bfloat16 |
| bnb 4-bit quant type | nf4 |
| bnb double quant | true |
| Validation set size | 0.3 (fraction of the dataset held out) |
| Evals per epoch | 10 |
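For reference, the LoRA and quantization rows above map onto the following PEFT / bitsandbytes objects. This is a hedged sketch of equivalent settings, not the actual training code, which was driven by an Axolotl configuration:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization and bfloat16 compute,
# matching the bnb settings listed in the table above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter configuration mirroring r=16, alpha=32, dropout=0.05
# on the attention and MLP projection modules.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```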
Tokenizer
Tokenizer type: AutoTokenizer
Special token: <|end_of_text|> as pad_token
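Since the repository ships tokenizer_config.json and special_tokens_map.json (see Files Included below), loading the tokenizer directly from the adapter repo should pick up this pad-token choice; a quick check:

```python
from transformers import AutoTokenizer

# Loading from the adapter repo reuses the shipped tokenizer files,
# including the pad_token mapping described above.
tokenizer = AutoTokenizer.from_pretrained("ubitech-edg/qwen2.5-72b-sft")
print(tokenizer.pad_token, tokenizer.pad_token_id)
```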
Files Included
This repository hosts LoRA adapters and Axolotl metadata only.
Contents:
- adapter_config.json
- adapter_model.safetensors
- config.json
- special_tokens_map.json
- tokenizer_config.json
- tokenizer.json
- README.md
Usage – Load and Apply the Adapters
To use this SFT variant in Python:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "Qwen/Qwen2.5-72B"
sft_adapter = "ubitech-edg/qwen2.5-72b-sft"

# Load the base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, device_map="auto", torch_dtype=torch.bfloat16
)

# Attach the SFT LoRA adapters
model = PeftModel.from_pretrained(model, sft_adapter)
model.eval()

# Generate a response
prompt = "What is the role of AI in renewable energy optimization?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
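Loading the base model in bfloat16 as above needs roughly 144 GB of GPU memory for the 72B weights alone. On smaller setups, the base model can instead be loaded in the same 4-bit NF4 quantization used during training before attaching the adapters; a sketch, assuming bitsandbytes is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Mirror the 4-bit NF4 settings used during training.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("ubitech-edg/qwen2.5-72b-sft")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-72B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "ubitech-edg/qwen2.5-72b-sft")
model.eval()
```

Generation then proceeds exactly as in the bfloat16 example above.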