Updated Readme.md
README.md
CHANGED
|
@@ -17,10 +17,13 @@ tags:
|
|
| 17 |
# Model Overview
|
| 18 |
|
| 19 |
## Description:
|
| 20 |
-
The NVIDIA gpt-oss-120b Eagle model is the Eagle head of OpenAI’s gpt-oss-120b model, which is an auto-regressive language model that uses a mixture-of-experts (MoE) architecture with
|
| 21 |
|
| 22 |
This model is ready for commercial/non-commercial use. <br>
|
| 23 |
|
| 24 |
### License/Terms of Use:
|
| 25 |
[nvidia-open-model-license](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/)
|
| 26 |
|
|
|
|
| 17 |
# Model Overview
|
| 18 |
|
| 19 |
## Description:
|
| 20 |
+
The NVIDIA gpt-oss-120b Eagle model is the Eagle head of OpenAI’s gpt-oss-120b model, an auto-regressive language model that uses a mixture-of-experts (MoE) architecture with 5 billion activated parameters and 120 billion total parameters. For more information, see the [gpt-oss-120b model card](https://huggingface.co/openai/gpt-oss-120b). The NVIDIA gpt-oss-120b Eagle3 model incorporates Eagle speculative decoding with [TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer).
|
| 21 |
|
| 22 |
This model is ready for commercial/non-commercial use. <br>
|
| 23 |
|
| 24 |
+
### Note
|
| 25 |
+
For use cases with context lengths below 8k, please consider using [gpt-oss-120b-Eagle3-v2](https://huggingface.co/nvidia/gpt-oss-120b-Eagle3-v2).
|
| 26 |
+
|
| 27 |
### License/Terms of Use:
|
| 28 |
[nvidia-open-model-license](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/)
|
| 29 |
|