Llama-3.1-8B Knowledge Recall Model
This is a Llama-3.1-8B model fine-tuned for knowledge recall tasks. The checkpoint was released alongside the paper at https://arxiv.org/abs/2509.11167.
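A minimal loading sketch follows, assuming the checkpoint is published on the Hugging Face Hub under the repository id pmahdavi/Llama-3.1-8B-knowledge-recall (the id shown in the model tree on this page) and that the `transformers` and `torch` packages are installed. The prompt format in `build_prompt` is a hypothetical placeholder, since this card does not specify one, and the `answer` helper is illustrative rather than part of the release.

```python
REPO_ID = "pmahdavi/Llama-3.1-8B-knowledge-recall"


def build_prompt(question: str) -> str:
    """Hypothetical plain-text prompt; the card does not prescribe a template."""
    return f"Question: {question.strip()}\nAnswer:"


def answer(question: str, max_new_tokens: int = 32) -> str:
    """Download the checkpoint and greedily decode a short continuation."""
    # Imported lazily so build_prompt stays usable without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `answer("Who wrote Hamlet?")` downloads the full 8B-parameter weights on first use, so it needs enough disk space and memory for a model of that size.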
Model Details
- Base model: Llama-3.1-8B
- Training dataset: tulu3_mixture_knowledge_recall
- Learning rate: 5e-06
- Effective batch size: 128
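The effective batch size above is usually the product of the per-device batch size, the number of gradient accumulation steps, and the number of devices. The card does not say how the 128 was split, so the numbers in this sketch (4 sequences per device on 8 devices) are purely hypothetical, as are the helper names.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int) -> int:
    """Number of sequences that contribute to one optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def grad_accum_steps_for(target: int, per_device_batch: int, num_devices: int) -> int:
    """Accumulation steps needed to reach a target effective batch size."""
    per_forward_pass = per_device_batch * num_devices
    if target % per_forward_pass != 0:
        raise ValueError(
            f"target {target} is not a multiple of {per_forward_pass} sequences per pass"
        )
    return target // per_forward_pass


# Hypothetical hardware split; only the target of 128 comes from the card.
steps = grad_accum_steps_for(target=128, per_device_batch=4, num_devices=8)
total = effective_batch_size(per_device_batch=4, grad_accum_steps=steps, num_devices=8)
```

With this assumed split, `steps` is 4 and `total` is 128. Any other split that multiplies out to 128 reproduces the same effective batch size.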
Export Files
This repository includes export files for state averaging and other advanced techniques.
Model tree for pmahdavi/Llama-3.1-8B-knowledge-recall
- Base model: meta-llama/Llama-3.1-8B
- Spaces using pmahdavi/Llama-3.1-8B-knowledge-recall: 1
Evaluation results
- Training loss on tulu3_mixture_knowledge_recall (self-reported): 1.050