Model Card for gpt-oss-20b-triton-kernel

This model is a fine-tuned version of openai/gpt-oss-20b. It has been trained using TRL.

About

We introduce gpt-oss-20b-triton-kernel, a large language model based on gpt-oss-20b that has been trained specifically for the task of authoring GPU kernels in Triton. The filtered and compiled training dataset is KernelBook-messages, a modified version of KernelBook.

Training this model took 4 hours 55 minutes.
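
To illustrate the target task, here is a minimal sketch of the kind of code the model is meant to produce: a vector-add kernel following the standard Triton tutorial pattern. It is illustrative only and is not an output of this model.

import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the tensors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard out-of-bounds lanes in the last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out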

Quick start

from transformers import pipeline

question = "Your prompt here"
# Load the fine-tuned checkpoint as a chat-style text-generation pipeline on GPU.
generator = pipeline("text-generation", model="Nadiveedishravanreddy/gpt-oss-20b-triton-kernel", device="cuda")
# Pass the prompt in chat-message format and return only the newly generated text.
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
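
The max_new_tokens=128 setting above is enough for a quick smoke test, but it will truncate most complete Triton kernels; raise it (for example to 1024 or more) when asking for full kernel implementations.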

Training procedure

Training runs were logged to Weights & Biases.

This model was trained with SFT.
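
Below is a minimal sketch of an SFT run with TRL. The dataset hub id and all hyperparameter values are assumptions shown for illustration; only the base model, the TRL/SFT setup, BF16 precision, and W&B logging come from this card.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# The exact hub id of KernelBook-messages is not stated on this card; substitute it here.
dataset = load_dataset("<KernelBook-messages hub id>", split="train")

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",          # base model from this card
    train_dataset=dataset,               # chat-format ("messages") examples
    args=SFTConfig(
        output_dir="gpt-oss-20b-triton-kernel",
        per_device_train_batch_size=1,   # illustrative value
        gradient_accumulation_steps=8,   # illustrative value
        learning_rate=2e-5,              # illustrative value
        num_train_epochs=1,              # illustrative value
        bf16=True,                       # card reports BF16 weights
        report_to="wandb",               # card mentions Weights & Biases logging
    ),
)
trainer.train()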

Framework versions

  • TRL: 0.22.2
  • Transformers: 4.56.1
  • Pytorch: 2.6.0+git684f6f2
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0
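
To approximate this environment, the pinned releases above can be installed with pip install trl==0.22.2 transformers==4.56.1 datasets==4.0.0 tokenizers==0.22.0. The PyTorch version listed is a custom git build (2.6.0+git684f6f2); a stock torch 2.6.0 install is the closest published match.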

Citations

Cite this model as:

@software{nadiveedi_gpt_oss_20b_triton_kernel_2025,
    title       = {GPT-OSS-20B Triton Kernel Fine-tuned Model},
    author      = {Nadiveedi, Shravan Reddy},
    year        = {2025},
    month       = sep,
    url         = {https://huggingface.co/Nadiveedishravanreddy/gpt-oss-20b-triton-kernel},
    version     = {v1.0},
    note        = {Hugging Face Model Hub}
}


Model details

  • Format: Safetensors
  • Model size: 21B params
  • Tensor type: BF16

Model tree for Nadiveedishravanreddy/gpt-oss-20b-triton-kernel

  • Base model: openai/gpt-oss-20b
  • Quantizations: 2 models

Dataset used to train Nadiveedishravanreddy/gpt-oss-20b-triton-kernel

  • KernelBook-messages