---
license: cc-by-4.0
base_model: StanfordAIMI/CheXagent-2-3b
tags:
- medical
- radiology
- chest-x-ray
- multimodal
- report-generation
- structured-reporting
- contextualized
- temporal-reasoning
- findings
- lora
- medical-imaging
- clinical-nlp
language:
- en
pipeline_tag: image-text-to-text
library_name: transformers
datasets:
- erjui/csrrg_ift_dataset
---

# CheXagent-2-3b: Contextualized Structured Radiology Report Generation (Findings)

This model is a fine-tuned version of [StanfordAIMI/CheXagent-2-3b](https://huggingface.co/StanfordAIMI/CheXagent-2-3b) for generating the **FINDINGS** section of contextualized structured chest X-ray radiology reports.
It was trained with LoRA (Low-Rank Adaptation) on the [csrrg_ift_dataset](https://huggingface.co/datasets/erjui/csrrg_ift_dataset), which contains instruction-following examples drawn from the MIMIC-CXR and CheXpert+ datasets.

## Model Description

This model performs **Contextualized Structured Radiology Report Generation (CSRRG)** for chest X-rays. It generates detailed findings sections conditioned on rich clinical context (patient history, imaging technique, and comparison to prior studies) and reasons about change over time across examinations.

**Key characteristics:**
- Generates the **FINDINGS** section of radiology reports
- Incorporates **clinical history/indication**, **technique**, and **comparison** to prior studies
- Performs temporal reasoning across multiple examinations
- Produces structured, clinically relevant observations with contextual awareness
- Fine-tuned with LoRA for parameter-efficient adaptation

## Intended Use

### Primary Use Cases
- Research on contextualized radiology report generation
- Development of temporal reasoning systems for medical imaging
- Clinical decision support with longitudinal patient data
- Medical AI and multimodal model research
- Educational tools for radiology training

### Intended Users
- Medical AI researchers
- Healthcare technology developers
- Clinical informatics specialists
- Radiology departments (research use only)

### Out-of-Scope Use
- **NOT intended for clinical diagnosis without physician review**
- Must not replace human radiologists in clinical practice
- Must be validated before any clinical deployment

## Training Details

### Training Data
- **Dataset**: [csrrg_ift_dataset](https://huggingface.co/datasets/erjui/csrrg_ift_dataset) (csrrg_ift_dataset_findings subset)
- **Training samples**: ~181,874 instruction-following examples
- **Data sources**: MIMIC-CXR and CheXpert+ chest X-ray datasets
- **Task format**: Instruction fine-tuning with rich clinical context
- **Context includes**: Clinical history/indication, imaging technique, comparison to prior studies, current and prior images

### Training Procedure

**Fine-tuning method**: LoRA (Low-Rank Adaptation)

**LoRA Configuration:**
- Rank (r): 32
- Alpha: 64
- Dropout: 0.1
- Target modules: `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj`

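For reference, the configuration above can be sketched in plain Python. This is a minimal illustration, not the training code: `LoraSettings` and `adapter_params` are hypothetical names, the scaling factor assumes the standard LoRA convention of `alpha / r`, and the 2048-wide layer in the last line is an arbitrary example size rather than a dimension of CheXagent-2-3b.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container mirroring the LoRA values listed above."""

    r: int = 32
    alpha: int = 64
    dropout: float = 0.1
    target_modules: tuple = (
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    )

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r), so one adapted
        # linear layer adds r * (d_in + d_out) trainable parameters.
        return self.r * (d_in + d_out)


settings = LoraSettings()
print(settings.scaling)
print(settings.adapter_params(2048, 2048))  # illustrative layer size only
```

With rank 32 and alpha 64, the low-rank update is scaled by 2.0 under this convention.
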
**Training hyperparameters:**
- Learning rate: 2e-4
- Batch size: 4 per device
- Gradient accumulation steps: 32 (effective batch size: 128)
- Epochs: 1
- Optimizer: AdamW
- Learning rate scheduler: Cosine with 3% warmup
- Precision: bfloat16
- Attention implementation: Flash Attention 2
- Max sequence length: 2048
- Max images per sample: 2

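The arithmetic behind these settings can be worked through as a rough sketch. It assumes a single training device (which is what 4 × 32 = 128 implies), treats the approximate sample count as exact, keeps the last partial batch, and uses a linear warmup followed by cosine decay to zero. The constant names and `lr_at` are illustrative, not taken from the training code.

```python
import math

NUM_SAMPLES = 181_874   # approximate training-set size stated in this card
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 32
NUM_DEVICES = 1         # assumption: 4 * 32 = 128 implies one device
BASE_LR = 2e-4
WARMUP_RATIO = 0.03

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES
# One epoch, counting the final partial batch as a step (assumption).
total_steps = math.ceil(NUM_SAMPLES / effective_batch)
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions a single epoch is 1,421 optimizer steps, of which the first 43 are warmup; the learning rate peaks at 2e-4 at the end of warmup and decays to zero by the final step.
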
**Hardware:**
- GPU: NVIDIA H100
- Training framework: HuggingFace Transformers + PEFT

## Usage

### Loading the Model

```python
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image
import torch

# Load model and processor
model_name = "erjui/CheXagent-2-3b-csrrg-findings"
model = AutoModelForVision2Seq.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
processor = AutoProcessor.from_pretrained("StanfordAIMI/CheXagent-2-3b", trust_remote_code=True)

# Load chest X-ray images (current and prior studies)
# CSRRG models support multiple images for temporal comparison (max_images_per_sample: 2)
current_image = Image.open("current_xray.jpg")
prior_image = Image.open("prior_xray.jpg")

# Prepare input with clinical context
messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are an expert radiologist."}]
    },
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": """Analyze the chest X-ray images and write the FINDINGS section of a radiology report. Use standard medical terminology and organize findings by anatomical regions. Consider the available clinical contexts when formulating your findings.

=== CLINICAL HISTORY/INDICATION ===
Male patient status post acetabular surgery with concern for pleural effusion.

=== TECHNIQUE ===
Portable semi-erect single frontal chest radiograph.

=== CURRENT IMAGES ==="""
            },
            {"type": "image"},  # Current image
            {"type": "image"}   # Prior image (supports multiple images for temporal comparison)
        ]
    }
]

# Process and generate
inputs = processor(images=[current_image, prior_image], text=messages, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
generated_text = processor.decode(outputs[0], skip_special_tokens=True)

print(generated_text)
```

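The prompt text in the example follows a fixed layout: an instruction, then labelled context blocks, then the image marker. A small helper can assemble it from individual fields. This is a sketch only: `build_prompt` is a hypothetical function, and leaving out a header when its field is missing is an assumption about the prompt format, not documented behavior of the model or dataset.

```python
INSTRUCTION = (
    "Analyze the chest X-ray images and write the FINDINGS section of a "
    "radiology report. Use standard medical terminology and organize "
    "findings by anatomical regions. Consider the available clinical "
    "contexts when formulating your findings."
)


def build_prompt(history=None, technique=None):
    """Hypothetical helper: assemble the user prompt in the layout shown above."""
    parts = [INSTRUCTION]
    if history:
        parts.append(f"=== CLINICAL HISTORY/INDICATION ===\n{history}")
    if technique:
        parts.append(f"=== TECHNIQUE ===\n{technique}")
    # The image placeholders in the message content follow this marker.
    parts.append("=== CURRENT IMAGES ===")
    return "\n\n".join(parts)


prompt = build_prompt(
    history="Male patient status post acetabular surgery with concern for pleural effusion.",
    technique="Portable semi-erect single frontal chest radiograph.",
)
print(prompt)
```

The returned string can be used as the `"text"` value of the user message in place of the hard-coded prompt.
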
### Expected Output Format

```
FINDINGS:
Lungs and Airways:
- No pleural effusion or pneumothorax detected
- Bibasilar atelectasis present

Cardiovascular:
- Mild left ventricular enlargement

Musculoskeletal and Chest Wall:
- Bilateral rib fractures noted
```

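Because the output is organized into anatomical regions with bulleted observations, it can be split into a dictionary for downstream use. The following is a minimal sketch assuming the generated text follows the layout above exactly; `parse_findings` is a hypothetical helper, and real generations may deviate from this format.

```python
def parse_findings(report: str) -> dict:
    """Split a structured FINDINGS text into {region: [observations]}."""
    sections = {}
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        # Skip blank lines and the top-level "FINDINGS:" header.
        if not line or line.upper() == "FINDINGS:":
            continue
        if line.startswith("- "):
            # Bullet: attach to the most recent region header, if any.
            if current is not None:
                sections[current].append(line[2:].strip())
        elif line.endswith(":"):
            # Region header such as "Lungs and Airways:".
            current = line[:-1].strip()
            sections.setdefault(current, [])
    return sections


example = """FINDINGS:
Lungs and Airways:
- No pleural effusion or pneumothorax detected
- Bibasilar atelectasis present

Cardiovascular:
- Mild left ventricular enlargement

Musculoskeletal and Chest Wall:
- Bilateral rib fractures noted
"""

parsed = parse_findings(example)
for region, observations in parsed.items():
    print(region, observations)
```
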
## Citation

If you use this model, please cite:

```bibtex
@article{kang2025automated,
  title={Automated Structured Radiology Report Generation with Rich Clinical Context},
  author={Kang, Seongjae and Lee, Dong Bok and Jung, Juho and Kim, Dongseop and Kim, Won Hwa and Joo, Sunghoon},
  journal={arXiv preprint arXiv:2510.00428},
  year={2025}
}
```

Also cite the base model:
```bibtex
@article{chen2024chexagent,
  title={Chexagent: Towards a foundation model for chest x-ray interpretation},
  author={Chen, Zhihong and Varma, Maya and Delbrouck, Jean-Benoit and Paschali, Magdalini and Blankemeier, Louis and Van Veen, Dave and Valanarasu, Jeya Maria Jose and Youssef, Alaa and Cohen, Joseph Paul and Reis, Eduardo Pontes and others},
  journal={arXiv preprint arXiv:2401.12208},
  year={2024}
}
```

## Model Card Authors

Seongjae Kang (erjui)

## Model Card Contact

For questions or issues, please open a discussion on the [model repository](https://huggingface.co/erjui/CheXagent-2-3b-csrrg-findings/discussions).