Update README.md
README.md CHANGED
@@ -40,9 +40,14 @@ The following checkpoints are from our paper titled Goldfish Loss: Mitigating Me
 - The control model differs only in that it did not use the canaries dataset for memorization and was simply pre-trained on 20B Redpajama tokens.
 - The Canaries dataset, which contains 2000 Wikidocs, is repeated 50 times throughout pre-training. Thus, it contains ~204M tokens in total (including padding).

+# Technical Specification
+
+Each checkpoint mentioned above uses a randomly initialized [TinyLLaMA-1.1B](https://huggingface.co/TinyLlama/TinyLlama_v1.1) architecture.
+For pretraining details, please check our [GitHub](https://github.com/ahans30/goldfish-loss) repository.
+
 # Cite our work

-If you find our
+If you find our model, codebase or dataset beneficial, please consider citing our work:

 ```bibtex
 @misc{hans2024like,
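A quick sanity check on the ~204M figure in the canaries bullet: 2000 Wikidocs repeated 50 times is 100,000 sequences, which lands on ~204M tokens if each document is padded to a fixed 2048-token sequence. A minimal sketch of that arithmetic, where the 2048-token padded length is our assumption (it matches TinyLlama-1.1B's context window but is not stated above):

```python
# Back-of-the-envelope check of the canaries token count.
# seq_len = 2048 is an assumption (TinyLlama-1.1B's context window);
# the text above only states the ~204M total.
num_docs = 2000   # Wikidocs in the canaries dataset
repeats = 50      # each document recurs 50 times during pre-training
seq_len = 2048    # assumed padded tokens per document

total_tokens = num_docs * repeats * seq_len
print(f"{total_tokens / 1e6:.1f}M tokens")  # -> 204.8M, i.e. the ~204M quoted
```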