kldzj committed on
Commit
b14888f
·
verified ·
1 Parent(s): 53763e3

Create README.md

  1. README.md +212 -0
README.md ADDED
@@ -0,0 +1,212 @@
---
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
tags:
- vllm
- heretic
- uncensored
- decensored
- abliterated
- mxfp4
---
# This is a decensored version of [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b), made using [Heretic](https://github.com/p-e-w/heretic) v1.0.1

## Abliteration parameters

| Parameter | Value |
| :-------- | :---: |
| **direction_index** | 18.02 |
| **attn.o_proj.max_weight** | 1.44 |
| **attn.o_proj.max_weight_position** | 21.27 |
| **attn.o_proj.min_weight** | 1.42 |
| **attn.o_proj.min_weight_distance** | 9.82 |
| **mlp.down_proj.max_weight** | 1.38 |
| **mlp.down_proj.max_weight_position** | 31.84 |
| **mlp.down_proj.min_weight** | 0.69 |
| **mlp.down_proj.min_weight_distance** | 2.50 |

## Performance

| Metric | This model | Original model ([openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b)) |
| :----- | :--------: | :---------------------------: |
| **KL divergence** | 0.53 | 0 *(by definition)* |
| **Refusals** | 22/100 | 97/100 |
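
The KL divergence row compares this model's output distribution with the original model's. To illustrate why the original scores 0 "by definition", here is a minimal sketch of the discrete KL divergence in plain Python. The probability vectors are made-up values, and the direction and averaging that Heretic actually uses are not stated in this card, so read this as an illustration under those assumptions rather than as the tool's implementation.

```python
import math


def kl_divergence(p, q):
    """Return KL(P || Q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a zero-probability outcome in P contributes nothing
        if qi == 0.0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical next-token probabilities, for illustration only.
original = [0.70, 0.20, 0.10]
modified = [0.50, 0.30, 0.20]

print(kl_divergence(original, original))  # a distribution compared with itself: 0.0
print(kl_divergence(original, modified))  # a small positive number
```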

-----


<p align="center">
  <img alt="gpt-oss-120b" src="https://raw.githubusercontent.com/openai/gpt-oss/main/docs/gpt-oss-120b.svg">
</p>

<p align="center">
  <a href="https://gpt-oss.com"><strong>Try gpt-oss</strong></a> ·
  <a href="https://cookbook.openai.com/topic/gpt-oss"><strong>Guides</strong></a> ·
  <a href="https://arxiv.org/abs/2508.10925"><strong>Model card</strong></a> ·
  <a href="https://openai.com/index/introducing-gpt-oss/"><strong>OpenAI blog</strong></a>
</p>

<br>

Welcome to the gpt-oss series, [OpenAI’s open-weight models](https://openai.com/open-models) designed for powerful reasoning, agentic tasks, and versatile developer use cases.

We’re releasing two flavors of these open models:
- `gpt-oss-120b`: for production, general-purpose, high-reasoning use cases; fits on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) (117B parameters with 5.1B active parameters)
- `gpt-oss-20b`: for lower latency, and local or specialized use cases (21B parameters with 3.6B active parameters)

Both models were trained on our [harmony response format](https://github.com/openai/harmony) and should only be used with that format; they will not work correctly otherwise.


> [!NOTE]
> This model card is dedicated to the larger `gpt-oss-120b` model. Check out [`gpt-oss-20b`](https://huggingface.co/openai/gpt-oss-20b) for the smaller model.

# Highlights

* **Permissive Apache 2.0 license:** Build freely without copyleft restrictions or patent risk, making it ideal for experimentation, customization, and commercial deployment.
* **Configurable reasoning effort:** Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.
* **Full chain-of-thought:** Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. The chain-of-thought is not intended to be shown to end users.
* **Fine-tunable:** Fully customize models to your specific use case through parameter fine-tuning.
* **Agentic capabilities:** Use the models’ native capabilities for function calling, [web browsing](https://github.com/openai/gpt-oss/tree/main?tab=readme-ov-file#browser), [Python code execution](https://github.com/openai/gpt-oss/tree/main?tab=readme-ov-file#python), and Structured Outputs.
* **MXFP4 quantization:** The models were post-trained with MXFP4 quantization of the MoE weights, so `gpt-oss-120b` runs on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) and `gpt-oss-20b` runs within 16GB of memory. All evals were performed with the same MXFP4 quantization.

---

# Inference examples

## Transformers

You can use `gpt-oss-120b` and `gpt-oss-20b` with Transformers. If you use the Transformers chat template, it will automatically apply the [harmony response format](https://github.com/openai/harmony). If you use `model.generate` directly, you need to apply the harmony format manually using the chat template or use our [openai-harmony](https://github.com/openai/harmony) package.

To get started, install the necessary dependencies to set up your environment:

```
pip install -U transformers kernels torch
```

Once set up, you can run the model with the snippet below:

```py
from transformers import pipeline
import torch

model_id = "openai/gpt-oss-120b"

# "auto" lets Transformers choose the dtype and the device placement.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain quantum mechanics clearly and concisely."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)
# The last entry of generated_text is the assistant's reply message.
print(outputs[0]["generated_text"][-1])
```

Alternatively, you can run the model via [`Transformers Serve`](https://huggingface.co/docs/transformers/main/serving) to spin up an OpenAI-compatible webserver:

```
transformers serve
transformers chat localhost:8000 --model-name-or-path openai/gpt-oss-120b
```

[Learn more about how to use gpt-oss with Transformers.](https://cookbook.openai.com/articles/gpt-oss/run-transformers)

## vLLM

vLLM recommends using [uv](https://docs.astral.sh/uv/) for Python dependency management. You can use vLLM to spin up an OpenAI-compatible webserver. The following command will automatically download the model and start the server.

```bash
uv pip install --pre vllm==0.10.1+gptoss \
    --extra-index-url https://wheels.vllm.ai/gpt-oss/ \
    --extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
    --index-strategy unsafe-best-match

vllm serve openai/gpt-oss-120b
```

[Learn more about how to use gpt-oss with vLLM.](https://cookbook.openai.com/articles/gpt-oss/run-vllm)
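
Once the server is running, any OpenAI-style client can talk to it. The sketch below uses only the Python standard library to build and send a chat request. It assumes the server listens on `localhost:8000` and serves the `/v1/chat/completions` route; neither detail is stated in this card, and `build_chat_request` and `ask` are hypothetical helper names, so adjust the URL to your deployment.

```python
import json
import urllib.request

# Assumption: the server started by `vllm serve` is reachable at this address.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt, model="openai/gpt-oss-120b", max_tokens=256):
    """Build (but do not send) a POST request for a chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the reply text (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server up, `ask("Explain quantum mechanics clearly and concisely.")` should return the assistant's reply as a string. Building the request separately from sending it makes the payload easy to inspect without a GPU.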

## PyTorch / Triton

To learn about how to use this model with PyTorch and Triton, check out our [reference implementations in the gpt-oss repository](https://github.com/openai/gpt-oss?tab=readme-ov-file#reference-pytorch-implementation).

## Ollama

If you are trying to run gpt-oss on consumer hardware, you can use Ollama by running the following commands after [installing Ollama](https://ollama.com/download).

```bash
# gpt-oss-120b
ollama pull gpt-oss:120b
ollama run gpt-oss:120b
```

[Learn more about how to use gpt-oss with Ollama.](https://cookbook.openai.com/articles/gpt-oss/run-locally-ollama)

## LM Studio

If you are using [LM Studio](https://lmstudio.ai/), you can use the following command to download the model.

```bash
# gpt-oss-120b
lms get openai/gpt-oss-120b
```

Check out our [awesome list](https://github.com/openai/gpt-oss/blob/main/awesome-gpt-oss.md) for a broader collection of gpt-oss resources and inference partners.

---

# Download the model

You can download the model weights from the [Hugging Face Hub](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4) using the Hugging Face CLI:

```shell
# gpt-oss-120b
huggingface-cli download openai/gpt-oss-120b --include "original/*" --local-dir gpt-oss-120b/
pip install gpt-oss
# Point the chat CLI at the directory that holds the downloaded checkpoint.
python -m gpt_oss.chat gpt-oss-120b/original/
```

# Reasoning levels

You can choose from three reasoning levels to suit your task:

* **Low:** Fast responses for general dialogue.
* **Medium:** Balanced speed and detail.
* **High:** Deep and detailed analysis.

The reasoning level can be set in the system prompt, e.g., "Reasoning: high".
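
As a minimal sketch of what that looks like in practice, the helper below builds a chat `messages` list whose system prompt carries the reasoning level. `build_messages` and `VALID_LEVELS` are hypothetical names introduced for illustration; the message layout follows the role/content dictionaries used in the Transformers example in this card.

```python
VALID_LEVELS = ("low", "medium", "high")


def build_messages(user_prompt, reasoning="medium"):
    """Return chat messages with the reasoning level set in the system prompt."""
    if reasoning not in VALID_LEVELS:
        raise ValueError(f"reasoning must be one of {VALID_LEVELS}, got {reasoning!r}")
    return [
        {"role": "system", "content": f"Reasoning: {reasoning}"},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("Prove that the square root of 2 is irrational.", reasoning="high")
print(messages[0]["content"])  # Reasoning: high
```

The resulting list can be passed wherever a chat-formatted `messages` argument is accepted, for example to the `pipe(...)` call in the Transformers section.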

# Tool use

The gpt-oss models are excellent for:
* Web browsing (using built-in browsing tools)
* Function calling with defined schemas
* Agentic operations like browser tasks
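
To make "function calling with defined schemas" concrete, here is a small sketch of a tool definition written as a JSON-Schema-style dictionary, plus a dispatcher that runs the call a model asks for. The `get_weather` function, its schema, and the shape of `tool_call` are all hypothetical; the exact wire format depends on the serving stack you use, so treat this as one plausible layout rather than the format these models require.

```python
import json


def get_weather(city):
    """Hypothetical local function the model may ask the application to call."""
    return {"city": city, "forecast": "sunny"}


# Tool description in the JSON-Schema style used by OpenAI-compatible chat APIs.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return a short weather forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

REGISTRY = {"get_weather": get_weather}


def dispatch(tool_call):
    """Check a requested call against its schema, run it, and return JSON text."""
    name = tool_call["name"]
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    args = json.loads(tool_call["arguments"])
    spec = next(t["function"] for t in TOOLS if t["function"]["name"] == name)
    missing = [key for key in spec["parameters"]["required"] if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return json.dumps(REGISTRY[name](**args))


print(dispatch({"name": "get_weather", "arguments": '{"city": "Paris"}'}))
```

The string returned by `dispatch` is what an application would send back to the model as the tool result in the next turn.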

# Fine-tuning

Both gpt-oss models can be fine-tuned for a variety of specialized use cases.

The larger `gpt-oss-120b` model can be fine-tuned on a single H100 node, whereas the smaller [`gpt-oss-20b`](https://huggingface.co/openai/gpt-oss-20b) can even be fine-tuned on consumer hardware.

# Citation

```bibtex
@misc{openai2025gptoss120bgptoss20bmodel,
    title={gpt-oss-120b & gpt-oss-20b Model Card},
    author={OpenAI},
    year={2025},
    eprint={2508.10925},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2508.10925},
}
```