# Model Card for gpt-oss-20b-triton-kernel
This model is a fine-tuned version of [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b). It has been trained using [TRL](https://github.com/huggingface/trl).
## About
We introduce gpt-oss-20b-triton-kernel, a large language model based on gpt-oss-20b that has been fine-tuned specifically for authoring GPU kernels in Triton. The filtered and compiled training dataset is KernelBook-messages, a modified version of KernelBook. Training took 4 hours and 55 minutes.
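For context, here is a minimal sketch of the kind of kernel the model is meant to author. It is a standard elementwise-add Triton kernel written for illustration, not an output of this model:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance processes one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds accesses
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    # Launch one program per BLOCK_SIZE-sized chunk of the input.
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```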
## Quick start
```python
from transformers import pipeline

question = "Your prompt here"
# Load the fine-tuned model as a chat-style text-generation pipeline on GPU.
generator = pipeline("text-generation", model="Nadiveedishravanreddy/gpt-oss-20b-triton-kernel", device="cuda")
# Pass the prompt as a chat message; return_full_text=False keeps only the newly generated reply.
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
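As a concrete illustration, a prompt in the model's target domain might look like the following (this example prompt is hypothetical, not taken from the training data):

```python
question = "Write a Triton kernel that applies ReLU elementwise to a float32 CUDA tensor."
```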
## Training procedure
This model was trained with supervised fine-tuning (SFT).
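A minimal sketch of what the SFT setup may have looked like with TRL's SFTTrainer is shown below; the dataset id and every hyperparameter are illustrative assumptions, not the exact configuration used:

```python
# A minimal SFT sketch with TRL; all values marked "assumed" are guesses.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# KernelBook-messages is named above; this Hub id is a placeholder.
dataset = load_dataset("username/KernelBook-messages", split="train")

config = SFTConfig(
    output_dir="gpt-oss-20b-triton-kernel",
    per_device_train_batch_size=1,   # assumed
    gradient_accumulation_steps=8,   # assumed
    learning_rate=2e-5,              # assumed
    num_train_epochs=1,              # assumed
)

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",  # base model from the card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```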
### Framework versions
- TRL: 0.22.2
- Transformers: 4.56.1
- Pytorch: 2.6.0+git684f6f2
- Datasets: 4.0.0
- Tokenizers: 0.22.0
## Citations

Cite this model as:
```bibtex
@software{gpt_oss_20b_triton_kernel_2025,
  title   = {GPT-OSS-20B Triton Kernel Fine-tuned Model},
  author  = {Nadiveedi, Shravan Reddy},
  year    = {2025},
  month   = {9},
  url     = {https://huggingface.co/Nadiveedishravanreddy/gpt-oss-20b-triton-kernel},
  version = {v1.0},
  note    = {Hugging Face Model Hub}
}
```