Llama-3.1-8B Knowledge Recall Model

This is a Llama-3.1-8B model fine-tuned for knowledge recall tasks. The checkpoint was released alongside the paper at https://arxiv.org/abs/2509.11167.

Model Details

Base model: Llama-3.1-8B
Training dataset: tulu3_mixture_knowledge_recall
Learning rate: 5e-06
Effective batch size: 128
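
The effective batch size above is the number of examples that contribute to one optimizer step. As a rough illustration, it is commonly the product of the per-device batch size, the number of gradient accumulation steps, and the number of devices. The card does not state how 128 was split, so the values 4, 4, and 8 below are purely hypothetical, as is the helper name `effective_batch_size`.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int) -> int:
    """Number of examples that contribute to a single optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


# Hypothetical split (not taken from the model card): 4 examples per device,
# 4 accumulation steps, 8 devices.
total = effective_batch_size(per_device_batch=4, grad_accum_steps=4, num_devices=8)
print(total)
```

Any other factorization of 128 (for example 16 x 8 x 1) would give the same effective batch size.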

Export Files

This repository includes export files for state averaging and other advanced techniques.

Safetensors

Model size: 8B params
Tensor type: BF16

Model tree for pmahdavi/Llama-3.1-8B-knowledge-recall

Base model: meta-llama/Llama-3.1-8B
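
A minimal loading sketch, assuming the checkpoint works with the standard Hugging Face `transformers` causal-LM classes and that `torch` is installed. The card itself does not document a loading recipe or a prompt format, so the helper names `load_model` and `complete`, the plain-text prompt, and the generation settings are illustrative assumptions.

```python
MODEL_ID = "pmahdavi/Llama-3.1-8B-knowledge-recall"


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model in BF16, the tensor type listed on the card."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()
    return tokenizer, model


def complete(tokenizer, model, prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy completion of a plain-text prompt (prompt format is an assumption)."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `tokenizer, model = load_model()` downloads roughly 8B parameters of weights, so it is best run on a machine with enough memory for a BF16 model of that size.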

Evaluation results

Training loss on tulu3_mixture_knowledge_recall (self-reported): 1.050