## Setting up

In [1]:
%%capture
%pip install -U transformers
%pip install -U datasets 
%pip install -U accelerate 
%pip install -U peft 
%pip install -U trl 
%pip install -U bitsandbytes
%pip install huggingface_hub[hf_xet]

In [2]:
from huggingface_hub import login
import os

hf_token = os.environ.get("HF_TOKEN")
login(hf_token)

Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.


## Loading the model and tokenizer

In [3]:
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch


In [4]:
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
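
As a rough sanity check on what `load_in_4bit=True` buys, the weight footprint can be estimated by hand. The sketch below assumes about 32.8 billion parameters for Qwen3-32B, a quantization block size of 64, and one 32-bit scale per block (no double quantization, matching `bnb_4bit_use_double_quant=False`). These figures are assumptions rather than values read from this notebook, and `quantized_weight_gb` is a hypothetical helper. It also treats every parameter as quantized, which slightly understates the real footprint because embeddings and norm layers stay in higher precision.

```python
def quantized_weight_gb(n_params, bits=4, block_size=64, scale_bits=32):
    """Estimate the size of blockwise-quantized weights in GB (10**9 bytes)."""
    weight_bytes = n_params * bits / 8                       # packed low-bit weights
    scale_bytes = (n_params / block_size) * scale_bits / 8   # one scale per block
    return (weight_bytes + scale_bytes) / 1e9


N_PARAMS = 32.8e9  # assumed parameter count for Qwen3-32B

nf4_gb = quantized_weight_gb(N_PARAMS)
bf16_gb = N_PARAMS * 2 / 1e9  # 2 bytes per parameter in bfloat16

print(f"4-bit weights: ~{nf4_gb:.1f} GB, bf16 weights: ~{bf16_gb:.1f} GB")
```

Memory reported by `nvidia-smi` also counts the CUDA context and allocator overhead, so it is not expected to match this estimate exactly.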

In [5]:
# Load tokenizer & model

model_dir = "Qwen/Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)

model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    quantization_config=bnb_config,   
    device_map="auto",  
    torch_dtype=torch.bfloat16,
    trust_remote_code=True             
)

model.config.use_cache = False    # The KV cache is not needed during training
model.config.pretraining_tp = 1   # Llama-style setting; likely has no effect on Qwen3

In [6]:
!nvidia-smi

Tue Apr 29 13:19:53 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100-SXM4-80GB          On  |   00000000:9B:00.0 Off |                    0 |
| N/A   29C    P0             84W /  400W |   31789MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                

## Loading and processing the dataset

In [7]:
train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
Please answer the following medical question. 

### Question:
{}

### Response:
<think>
{}
</think>
{}"""

In [8]:
EOS_TOKEN = tokenizer.eos_token  # Marks the end of each training example

def formatting_prompts_func(examples):
    """Render each (question, chain of thought, response) triple into one training string."""
    inputs = examples["Question"]
    complex_cots = examples["Complex_CoT"]
    outputs = examples["Response"]
    texts = []
    for question, cot, response in zip(inputs, complex_cots, outputs):
        # Append the EOS token to the response if it's not already there
        if not response.endswith(EOS_TOKEN):
            response += EOS_TOKEN
        text = train_prompt_style.format(question, cot, response)
        texts.append(text)
    return {"text": texts}
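
To see what the formatting function produces without loading a tokenizer or the dataset, here is a minimal sketch on a toy batch. The `EOS` string, the shortened `TEMPLATE`, and the `format_batch` name are placeholders invented for this illustration; only the column names and the EOS logic mirror the cell above. With `batched=True`, `map` passes a dict of column name to list of values, which is the shape used here.

```python
# Placeholder values so the sketch runs on its own (not the real tokenizer or prompt).
EOS = "<eos>"
TEMPLATE = "### Question:\n{}\n\n### Response:\n<think>\n{}\n</think>\n{}"


def format_batch(examples, eos=EOS):
    """Build one training string per row, ending each with exactly one EOS marker."""
    texts = []
    rows = zip(examples["Question"], examples["Complex_CoT"], examples["Response"])
    for question, cot, response in rows:
        if not response.endswith(eos):
            response += eos
        texts.append(TEMPLATE.format(question, cot, response))
    return {"text": texts}


toy_batch = {
    "Question": ["Q1?", "Q2?"],
    "Complex_CoT": ["reasoning 1", "reasoning 2"],
    "Response": ["Answer 1", "Answer 2" + EOS],  # second row already ends with EOS
}

formatted = format_batch(toy_batch)
print(formatted["text"][0])
```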

In [9]:
from datasets import load_dataset

dataset = load_dataset(
    "FreedomIntelligence/medical-o1-reasoning-SFT",
    "en",
    split="train[0:2000]",
    trust_remote_code=True,
)
dataset = dataset.map(
    formatting_prompts_func,
    batched=True,
)
dataset["text"][10]

"Below is an instruction that describes a task, paired with an input that provides further context. \nWrite a response that appropriately completes the request. \nBefore answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n### Instruction:\nYou are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. \nPlease answer the following medical question. \n\n### Question:\nIn a patient with dermatomyositis as indicated by fatigue, muscle weakness, a scaly rash, elevated creatine kinase-MB, anti-Jo-1 antibodies, and perimysial inflammation, which type of cancer is most often associated with this condition?\n\n### Response:\n<think>\nAlright, so when I'm thinking about dermatomyositis, I know it's an inflammatory condition with muscle weakness and a telltale skin rash. It's sometimes linked to certain cancers. \n\nNow, I remember reading somewhere that when you have

In [10]:
from transformers import DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)
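
With `mlm=False`, the collator pads each batch and copies `input_ids` into `labels`, replacing padding positions with `-100` so they are ignored by the loss. The shift by one position for next-token prediction happens inside the model, not in the collator. The following is a simplified pure-Python sketch of that behavior, not the library's implementation. It assumes right padding, and `collate_causal_lm` is a hypothetical name.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def collate_causal_lm(sequences, pad_token_id):
    """Pad token-id lists to equal length and build labels for causal LM training."""
    max_len = max(len(seq) for seq in sequences)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids = seq + [pad_token_id] * n_pad
        batch["input_ids"].append(input_ids)
        batch["attention_mask"].append([1] * len(seq) + [0] * n_pad)
        # Labels mirror the inputs; every pad-token id is masked out of the loss.
        batch["labels"].append(
            [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]
        )
    return batch


batch = collate_causal_lm([[5, 6, 7], [8, 9]], pad_token_id=0)
print(batch["labels"])
```

Because masking is keyed on the token id, a tokenizer whose pad token equals its EOS token would also have its EOS labels masked, which is worth checking before training.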

## Model inference before fine-tuning

In [11]:
inference_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request. 
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. 
Please answer the following medical question. 

### Question:
{}

### Response:
<think>{}"""

In [13]:
question = dataset[10]['Question']
inputs = tokenizer(
    [inference_prompt_style.format(question, "")],  # No EOS here: the prompt must stay open for generation
    return_tensors="pt"
).to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    eos_token_id=tokenizer.eos_token_id,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(response[0].split("### Response:")[1])


<think>

Okay, let's tackle this question. So, the user is asking about the type of cancer most often associated with dermatomyositis based on the given symptoms and lab results.

First, I need to recall what dermatomyositis is. It's an autoimmune inflammatory myopathy characterized by muscle weakness, skin rash, and specific lab findings like elevated creatine kinase. The presence of anti-Jo-1 antibodies points towards a specific subset, maybe the polymyositis/dermatomyositis overlap with interstitial lung disease. But the main question is about the associated cancer.

I remember that dermatomyositis is linked to an increased risk of malignancy, especially in adults. The classic teaching is that certain cancers are more commonly associated with it. Which ones come to mind? Ovarian cancer? Lung cancer? Maybe breast or lymphomas?

Wait, I think the most commonly associated cancers in dermatomyositis include ovarian cancer, lung cancer (especially non-small cell), and sometimes lymphoma
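
The decoded text mixes the chain of thought with the final answer, so it can help to separate the two when comparing outputs before and after fine-tuning. A generation can also stop before `</think>` is emitted, for example when `max_new_tokens` runs out. Below is a small sketch of a parser that handles both cases; `split_reasoning` is a hypothetical helper, not part of any library used here.

```python
import re


def split_reasoning(generated: str):
    """Return (reasoning, answer); answer is '' when no closing </think> tag exists."""
    match = re.search(r"<think>(.*?)</think>(.*)", generated, flags=re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    # Truncated generation: treat everything after <think> as reasoning.
    reasoning = generated.split("<think>", 1)[-1]
    return reasoning.strip(), ""


complete = "\n<think>\nStep one. Step two.\n</think>\nFinal answer."
truncated = "\n<think>\nStep one. Step tw"

print(split_reasoning(complete))
print(split_reasoning(truncated))
```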

## Setting up the model

In [14]:
from peft import LoraConfig, get_peft_model

# LoRA config
peft_config = LoraConfig(
    lora_alpha=16,                           # Scaling factor for LoRA
    lora_dropout=0.05,                       # Add slight dropout for regularization
    r=64,                                    # Rank of the LoRA update matrices
    bias="none",                             # No bias reparameterization
    task_type="CAUSAL_LM",                   # Task type: Causal Language Modeling
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],  # Target modules for LoRA
)

model = get_peft_model(model, peft_config)
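
To get a feel for how much this configuration trains, the adapter size can be worked out by hand: each adapted linear layer gains `r * (in_features + out_features)` parameters, and the update is scaled by `lora_alpha / r`. The sketch below assumes the Qwen3-32B dimensions listed in the comments. They are not stated in this notebook and should be checked against the model's `config.json`.

```python
# Assumed Qwen3-32B dimensions (not taken from this notebook).
HIDDEN = 5120
N_LAYERS = 64
Q_OUT = 64 * 128      # attention heads * head dimension
KV_OUT = 8 * 128      # key/value heads * head dimension
INTERMEDIATE = 25600

R, ALPHA = 64, 16     # r and lora_alpha from the LoraConfig above

# (in_features, out_features) for each targeted module in one decoder layer
MODULE_SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r):
    """Parameters in the A (r x in) and B (out x r) matrices of one adapter."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in MODULE_SHAPES.values())
total = per_layer * N_LAYERS
scaling = ALPHA / R

print(f"{total:,} trainable LoRA parameters, scaling factor {scaling}")
```

Under these assumptions the adapter holds roughly 537 million parameters, or about 2.15 GB if stored in float32. For comparison, `model.print_trainable_parameters()` reports the actual count on the loaded model.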

In [15]:
from trl import SFTTrainer
from transformers import TrainingArguments


# Training Arguments
training_arguments = TrainingArguments(
    output_dir="output",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,
    optim="paged_adamw_32bit",
    num_train_epochs=1,
    logging_steps=0.2,
    warmup_steps=10,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
    group_by_length=True,
    report_to="none"
)

# Initialize the Trainer
trainer = SFTTrainer(
    model=model,
    args=training_arguments,
    train_dataset=dataset,
    peft_config=peft_config,
    data_collator=data_collator,
)
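
The training arguments determine how many optimizer steps the run takes and how often the loss is logged. A `logging_steps` value below 1 is interpreted as a fraction of the total steps. Here is a minimal sketch of that arithmetic for the settings above, assuming a single GPU; `training_schedule` is a hypothetical helper, and the exact rounding inside `Trainer` can differ between versions when the numbers do not divide evenly.

```python
import math


def training_schedule(n_examples, per_device_batch, grad_accum, epochs,
                      logging_ratio, n_devices=1):
    """Return (effective batch size, total optimizer steps, logging interval)."""
    effective_batch = per_device_batch * grad_accum * n_devices
    steps_per_epoch = math.ceil(n_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    log_every = math.ceil(total_steps * logging_ratio)
    return effective_batch, total_steps, log_every


effective_batch, total_steps, log_every = training_schedule(
    n_examples=2000,      # split="train[0:2000]"
    per_device_batch=1,   # per_device_train_batch_size
    grad_accum=2,         # gradient_accumulation_steps
    epochs=1,             # num_train_epochs
    logging_ratio=0.2,    # logging_steps
)
print(effective_batch, total_steps, log_every)
```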

No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


## Model training

In [16]:
import gc, torch
gc.collect()
torch.cuda.empty_cache()
model.config.use_cache = False
trainer.train()

| Step | Training Loss |
|------|---------------|
| 200  | 1.1861        |
| 400  | 1.1253        |
| 600  | 1.1196        |
| 800  | 1.1201        |
| 1000 | 1.0959        |


TrainOutput(global_step=1000, training_loss=1.1294009857177734, metrics={'train_runtime': 2515.0296, 'train_samples_per_second': 0.795, 'train_steps_per_second': 0.398, 'total_flos': 2.7579217282460467e+17, 'train_loss': 1.1294009857177734})
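
The throughput figures in `TrainOutput` can be reproduced from the runtime, and the average loss can be turned into a perplexity for easier interpretation. The short sketch below uses only the numbers printed above. The perplexity is computed from the mean training loss, which covers prompt tokens as well as responses, so it is a rough training-set figure rather than an evaluation metric.

```python
import math

# Values copied from the TrainOutput above.
train_runtime = 2515.0296        # seconds
global_step = 1000
train_loss = 1.1294009857177734
n_samples = 2000                 # train[0:2000], one epoch

samples_per_second = n_samples / train_runtime
steps_per_second = global_step / train_runtime
perplexity = math.exp(train_loss)  # exp of mean cross-entropy

print(f"{samples_per_second:.3f} samples/s, {steps_per_second:.3f} steps/s")
print(f"training perplexity ~ {perplexity:.2f}")
```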

## Model inference after fine-tuning

In [18]:
question = dataset[10]['Question']
inputs = tokenizer(
    [inference_prompt_style.format(question, "")],  # No EOS here: the prompt must stay open for generation
    return_tensors="pt"
).to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    eos_token_id=tokenizer.eos_token_id,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(response[0].split("### Response:")[1])


<think>
In dermatomyositis, there's a known link between the condition and certain cancers. The connection is especially significant in adult patients. So, which type of cancer is most commonly associated with dermatomyositis?

First, I know that dermatomyositis can be an indicator of underlying cancer. It's like a sign that something else might be going on in the body. When we talk about the cancers linked to dermatomyositis, several come to mind, like lung cancer, breast cancer, ovarian cancer, and sometimes even pancreatic cancer. But there's one that really stands out.

Let's think about it: lung cancer is often mentioned as being closely tied to dermatomyositis. It seems that lung cancer, especially non-small cell lung cancer, is the most frequently associated with this condition. It's not just a coincidence; it's been observed in many studies that when dermatomyositis is diagnosed in adults, there's a higher likelihood of lung cancer being present.

Now, why is lung cancer so cl

In [19]:
question = dataset[100]['Question']
inputs = tokenizer(
    [inference_prompt_style.format(question, "")],  # No EOS here: the prompt must stay open for generation
    return_tensors="pt"
).to("cuda")

outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=1200,
    eos_token_id=tokenizer.eos_token_id,
    use_cache=True,
)
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(response[0].split("### Response:")[1])


<think>
Toxic shock syndrome, or TSS, is usually associated with bacterial toxins, especially those produced by Staphylococcus aureus. These toxins can cause a massive release of cytokines, leading to symptoms like fever, rash, and systemic issues. The toxins involved are known as superantigens.

Superantigens are special because they don't need to be processed by antigen-presenting cells to activate T cells. Instead, they bind directly to the outside of MHC class II molecules on antigen-presenting cells. This binding allows the toxin to interact with a large number of T cells, not just specific ones. This interaction is a key factor in the massive immune response seen in TSS.

So, these superantigens bind to T cell receptors, but in a different way than typical antigens. They don't bind to the antigen-binding site of the T cell receptor but rather to the variable region, specifically the β-chain. This unique binding causes the activation of many T cells, which leads to the release of

## Saving the model

In [21]:
new_model_name = "Qwen-3-32B-Medical-Reasoning"
model.push_to_hub(new_model_name)
tokenizer.push_to_hub(new_model_name)

No files have been modified since last commit. Skipping to prevent empty commit.


No files have been modified since last commit. Skipping to prevent empty commit.


CommitInfo(commit_url='https://huggingface.co/kingabzpro/Qwen-3-32B-Medical-Reasoning/commit/028c43854a330c79b2dc214feda3ebfe6b4b85a0', commit_message='Upload tokenizer', commit_description='', oid='028c43854a330c79b2dc214feda3ebfe6b4b85a0', pr_url=None, repo_url=RepoUrl('https://huggingface.co/kingabzpro/Qwen-3-32B-Medical-Reasoning', endpoint='https://huggingface.co', repo_type='model', repo_id='kingabzpro/Qwen-3-32B-Medical-Reasoning'), pr_revision=None, pr_num=None)