zakerytclarke committed (commit 3d1fa9e, verified, parent: 78398a7)

Update README.md

Files changed (1): README.md (+2 −2)
@@ -238,7 +238,7 @@ print(answer[0].get('generated_text')) # => The Eiffel Tower stands at a height
 
 
 ## Model Details
-Teapot LLM is fine-tuned from [flan-t5-base](https://huggingface.co/google/flan-t5-base) on a [synthetic dataset](https://huggingface.co/datasets/teapotai/synthqa) of LLM tasks generated using [Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B).
+Teapot LLM is fine-tuned from [flan-t5-base](https://huggingface.co/google/flan-t5-base) on a [synthetic dataset](https://huggingface.co/datasets/teapotai/synthqa) of LLM tasks generated using [DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3).
 
 ### Conversational Question Answering
 Teapot is fine-tuned to provide friendly, conversational answers using context and documents provided as references.
@@ -255,7 +255,7 @@ Teapot has been trained to extract succint answers in a variety of format enabli
 ### Training Details
 - [Dataset] ~4mb synthetic dataset consisting of QnA pairs with a variety of task specific formats.
 - [Methodology] The model is trained to mimic task specific output formats, and is scored based on its ability to output relevant, succint and verifiable answers in the requested format.
-- [Hardware] Teapot was trained for ~2hr on an A100 provided by Google Colab.
+- [Hardware] Teapot was trained for ~10hr on an A100 provided by Google Colab.
 - [Hyperparameters] The model was trained with various learning rates and monitored to ensure task specific performance was learned without catastrophic forgetting.
 
 ### Limitations and Risks
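The hunk header above shows the README reading the model's reply via `answer[0].get('generated_text')`, which matches the transformers pipeline API. As a sketch only: the helper below assumes a `text2text-generation` pipeline (the standard task for flan-t5-derived models) and a hypothetical model id `teapotai/teapotllm` — the diff names the `teapotai` org but not the model repo — and the `context` + newline + `question` prompt template is likewise an assumption, not quoted from the README.

```python
def build_prompt(context: str, question: str) -> str:
    """Join reference context and the question into one input string.

    The exact template Teapot expects is an assumption; the README's
    examples only show context-grounded conversational QA.
    """
    return f"{context}\n{question}"


def ask_teapot(context: str, question: str,
               model_id: str = "teapotai/teapotllm") -> str:
    """Run one conversational-QA query against the model.

    `model_id` is hypothetical -- substitute the actual repo name.
    `transformers` is imported lazily so the prompt helper above stays
    usable without the package installed.
    """
    from transformers import pipeline  # text2text-generation suits T5-family models
    pipe = pipeline("text2text-generation", model=model_id)
    answer = pipe(build_prompt(context, question))
    return answer[0].get("generated_text")


# Prompt construction is cheap to inspect without downloading the model:
print(build_prompt("The Eiffel Tower stands at a height of 330 meters.",
                   "How tall is the Eiffel Tower?"))
```

Keeping the pipeline call inside a function also makes it easy to swap in a locally cached checkpoint when experimenting offline.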