Update README.md
README.md CHANGED
@@ -40,9 +40,14 @@ The following checkpoints are from our paper titled Goldfish Loss: Mitigating Me
 - The control model differs only in that it did not use the canaries dataset for memorization and was simply pre-trained on 20B Redpajama tokens.
 - The Canaries dataset, which contains 2000 Wikidocs, is repeated 50 times throughout pre-training. Thus, it contains ~204M tokens in total (including padding).

+# Technical Specification
+
+Each checkpoint mentioned above uses a randomly initialized [TinyLLaMA-1.1B](https://huggingface.co/TinyLlama/TinyLlama_v1.1) architecture.
+For pretraining details, please check our [GitHub](https://github.com/ahans30/goldfish-loss) repository.
+
 # Cite our work

-If you find our
+If you find our model, codebase or dataset beneficial, please consider citing our work:

 ```bibtex
 @misc{hans2024like,
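A quick sanity check on the ~204M figure in the canaries bullet: 2000 Wikidocs repeated 50 times is 100,000 sequences, which lands on ~204M tokens if each document is padded to a fixed 2048-token sequence. A minimal sketch of that arithmetic, where the 2048-token padded length is our assumption (it matches TinyLlama-1.1B's context window but is not stated above):

```python
# Back-of-the-envelope check of the canaries token count.
# seq_len = 2048 is an assumption (TinyLlama-1.1B's context window);
# the text above only states the ~204M total.
num_docs = 2000   # Wikidocs in the canaries dataset
repeats = 50      # each document recurs 50 times during pre-training
seq_len = 2048    # assumed padded tokens per document

total_tokens = num_docs * repeats * seq_len
print(f"{total_tokens / 1e6:.1f}M tokens")  # -> 204.8M, i.e. the ~204M quoted
```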