Issues with Fine Tuning

#37
by rirv938 - opened

Hi. Great work on the models. The Qwen team always produces great models.

I am running into an issue when fine-tuning this model with the Transformers Trainer: with either DeepSpeed ZeRO Stage 3 or FSDP, training hangs on the first training step.

I may be misconfiguring something for training (but I use the same script for many other models), so an example fine-tuning script would be useful here. gpt-oss provides a similar example script: https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers

Thanks for any help here.

Robert.
