Update README.md
Teapot LLM is fine-tuned from [flan-t5-large](https://huggingface.co/google/flan-t5-large).

- [Hardware] Teapot was trained for ~10 hours on an A100 GPU provided by Google Colab.
- [Hyperparameters] The model was trained with several learning rates and monitored to ensure task-specific performance was learned without catastrophic forgetting.

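The learning-rate sweep described above can be sketched as a simple selection rule: accept a run only if its loss on a held-out general task does not regress beyond a tolerance (a rough proxy for catastrophic forgetting), then pick the best task loss among the survivors. All function names, thresholds, and loss values here are illustrative assumptions, not the actual training code.

```python
def forgetting_check(base_general_loss: float, general_loss: float,
                     tolerance: float = 0.10) -> bool:
    """True if held-out general-task loss regressed more than `tolerance` (relative)."""
    return (general_loss - base_general_loss) / base_general_loss > tolerance


def select_learning_rate(runs: dict, tolerance: float = 0.10):
    """Pick the learning rate with the best task loss among runs that did not forget.

    `runs` maps learning rate -> (task_loss, base_general_loss, general_loss).
    Returns None if every run regressed past the tolerance.
    """
    safe = {
        lr: task_loss
        for lr, (task_loss, base, gen) in runs.items()
        if not forgetting_check(base, gen, tolerance)
    }
    return min(safe, key=safe.get) if safe else None


# Illustrative sweep: 3e-4 learns the task fastest but forgets; 1e-4 is the
# best run that stays within the 10% general-loss tolerance.
runs = {
    3e-4: (0.42, 1.00, 1.25),  # 25% general-loss regression -> rejected
    1e-4: (0.48, 1.00, 1.04),
    5e-5: (0.61, 1.00, 1.01),
}
print(select_learning_rate(runs))  # 0.0001
```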
### Model Evaluation
TeapotLLM is focused on in-context reasoning tasks, so most standard benchmarks are not suitable for evaluating it. We want TeapotLLM to be a practical tool for question answering and information extraction, so we have developed custom datasets to benchmark its performance.
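As one way such a custom QnA benchmark could score answers, here is a minimal sketch of SQuAD-style token-level F1 between a predicted and a reference answer. The function names and normalization rules are illustrative assumptions, not the actual scoring code from the notebook.

```python
import re
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split an answer into tokens."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return float(pred == ref)  # both empty counts as a match
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(token_f1("a cup of tea", "cup of tea"))  # ~0.857
```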
[Evaluation Notebook Here](https://github.com/zakerytclarke/teapot/blob/main/docs/evals/TeapotLLM_Benchmark.ipynb)