Update README.md
## Model Details
Teapot LLM is fine-tuned from [flan-t5-base](https://huggingface.co/google/flan-t5-base) on a [synthetic dataset](https://huggingface.co/datasets/teapotai/synthqa) of LLM tasks generated using [DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3).
### Conversational Question Answering
Teapot is fine-tuned to provide friendly, conversational answers using context and documents provided as references.
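Because Teapot LLM is a flan-t5-base derivative, it can be queried through the Hugging Face `text2text-generation` pipeline, with the reference document supplied as context in the prompt. The sketch below is a minimal illustration: the model ID `teapotai/teapotllm`, the `build_prompt` helper, and the exact prompt layout are assumptions, so check the model card for the canonical usage.

```python
def build_prompt(context: str, question: str) -> str:
    # Assumed prompt layout: reference context first, then the question.
    return f"{context}\n\n{question}"

def answer_question(context: str, question: str,
                    model_id: str = "teapotai/teapotllm") -> str:
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text2text-generation", model=model_id)
    result = generator(build_prompt(context, question))
    return result[0]["generated_text"]
```

With a context such as "The Eiffel Tower is 330 meters tall." and the question "How tall is the Eiffel Tower?", the pipeline returns a short conversational answer grounded in the provided text.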
### Training Details
- [Dataset] ~4 MB synthetic dataset consisting of Q&A pairs in a variety of task-specific formats.
- [Methodology] The model is trained to mimic task-specific output formats and is scored on its ability to produce relevant, succinct, and verifiable answers in the requested format.
- [Hardware] Teapot was trained for ~10 hours on an A100 GPU provided by Google Colab.
- [Hyperparameters] The model was trained with various learning rates and monitored to ensure task-specific performance was learned without catastrophic forgetting.
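The hyperparameter approach above amounts to a small sweep: one fine-tuning run per candidate learning rate, each scored on the task-specific evaluation set, keeping the best run. A minimal sketch of the selection step; all learning rates and scores below are hypothetical, not Teapot's actual values.

```python
# One fine-tuning run per candidate learning rate, each scored on the
# task-specific evaluation set; keep the best-scoring run.
CANDIDATE_LRS = [1e-4, 3e-4, 1e-3]

def select_learning_rate(eval_scores: dict) -> float:
    """Return the learning rate whose run scored highest."""
    return max(eval_scores, key=eval_scores.get)

# Hypothetical evaluation scores from three runs:
scores = {1e-4: 0.71, 3e-4: 0.78, 1e-3: 0.64}
best_lr = select_learning_rate(scores)  # 3e-4 in this hypothetical sweep
```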
### Limitations and Risks