# Model Card

## Model Summary
This model is a LoRA fine-tuned variant of huggingface/your-base-model, trained with the Unsloth library for parameter-efficient fine-tuning (PEFT).
LoRA fine-tuning enables efficient training and inference with significantly reduced VRAM usage while preserving the base model's quality.
- Base model: huggingface/your-base-model
- Fine-tuning method: LoRA (rank = 8, alpha = 16)
- Framework: Hugging Face Transformers + Unsloth
- Intended use: Instruction-following and text generation tasks
## Training Details

- Optimizer: AdamW
- LoRA config:
  - Rank: 8
  - Alpha: 16
  - Dropout: 0.0
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- Gradient checkpointing: Enabled (`unsloth` mode, reduces VRAM by ~30%)
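For reference, the configuration above corresponds roughly to the following Unsloth setup. This is a minimal sketch, not the exact training script; the base-model name is the placeholder from the summary above.

```python
from unsloth import FastLanguageModel

# Load the base model (name is a placeholder, not the actual checkpoint used)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "huggingface/your-base-model",
    max_seq_length = 1024,
    load_in_4bit = True,
)

# Attach LoRA adapters matching the configuration listed above
model = FastLanguageModel.get_peft_model(
    model,
    r = 8,                      # LoRA rank
    lora_alpha = 16,            # LoRA alpha
    lora_dropout = 0.0,         # no dropout
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing = "unsloth",  # Unsloth's VRAM-saving mode
)
```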
## Datasets

- Primary dataset: HuggingFaceH4/Multilingual-Thinking
- Data type: instruction–response pairs
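A minimal sketch of loading and formatting the dataset for supervised fine-tuning, reusing the `tokenizer` from the sketch above. The `messages` column name is an assumption about the dataset schema.

```python
from datasets import load_dataset

# Load the instruction–response dataset from the Hub
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split = "train")

# Render each conversation into a single training string
# ("messages" column name is an assumption about the schema)
def to_text(example):
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize = False)}

dataset = dataset.map(to_text)
```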
## Intended Use
This model is suitable for:
- Text generation
- Instruction-following tasks
- Educational, research, or prototyping purposes
⚠️ Not intended for unsafe or malicious content generation.
## Limitations

- Performance depends on the quality and size of the fine-tuning dataset.
- May produce hallucinations on knowledge-intensive queries.
- Language coverage is limited to the languages represented in the fine-tuning dataset.
## How to Use

Note: fine-tunes can currently only be loaded via Unsloth; vLLM and GGUF export support is in progress.

```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the fine-tuned model (4-bit quantized for reduced VRAM usage)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "devxyasir/gpt-oss-20b-Multilingual-Thinking-finetuned",
    max_seq_length = 1024,
    dtype = None,          # auto-detect dtype
    load_in_4bit = True,
)

messages = [
    {"role": "system", "content": "reasoning language: French\n\nYou are a helpful assistant that can solve mathematical problems."},
    {"role": "user", "content": "Solve x^5 + 3x^4 - 10 = 3."},
]

# Build model inputs from the chat template
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
    return_dict = True,
    reasoning_effort = "high",  # gpt-oss chat-template option
).to(model.device)

# Stream the generated tokens to stdout
_ = model.generate(**inputs, max_new_tokens = 64, streamer = TextStreamer(tokenizer))
```
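To capture the output as a string instead of streaming it, a small variation works (same `inputs` as above):

```python
# Generate without streaming and decode only the newly generated tokens
output_ids = model.generate(**inputs, max_new_tokens = 256)
new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens = True))
```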
## Citation

If you use this model in your work, please cite:

```bibtex
@misc{yasir2025gptoss20bmultilingualthinking,
  author       = {Muhammad Yasir},
  title        = {LoRA Fine-tuned Model via Unsloth},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/devxyasir/gpt-oss-20b-Multilingual-Thinking-finetuned}}
}
```