Update README.md
README.md
CHANGED
@@ -68,6 +68,13 @@ This setup preserved general reasoning ability while improving spatial accuracy.

 # Evaluation

+| Model                                           | MMLU-Geography (%) | Spatial Eval (%) | Babi Task 17 (%) |
+|-------------------------------------------------|--------------------|------------------|------------------|
+| mistralai/Mistral-7B-Instruct-v0.3 (base model) | 75.63%             | 36.07%           | 51%              |
+| sareena/spatial_lora_mistral (fine-tuned)       | 76.17%             | 0%               | 53%              |
+| meta-llama/Llama-2-7b-hf                        | 42.42%             | 18.25%           | 48.00%           |
+| google/gemma-7b                                 | 80.30%             | 7.01%            | 58.00%           |
+
 ## Benchmark Tasks

 ### SpatialQA Benchmark