sareena committed
Commit d19155d · verified · 1 Parent(s): 629fd00

Update README.md

Files changed (1): README.md +7 -2
README.md CHANGED
@@ -39,7 +39,10 @@ through fine-tuning, but there is limited work targeting spatial reasoning.
 
 ## Main Results
 
-
+The fine-tuned model slightly improved on general knowledge tasks such as MMLU Geography and Babi Task 17
+compared to the original Mistral-7B base model. However, its performance on spatial reasoning benchmarks like SpatialEval
+significantly declined, suggesting that fine-tuning may have led to incompatibility between the prompt style used for training with StepGame
+and the multiple-choice formatting in SpatialEval.
 
 # Training Data
 
@@ -108,7 +111,9 @@ fine-tuning task.
 
 
 ## Comparison Models
-
+LLaMA-2 and Gemma represent strong alternatives from Meta and Google respectively, offering diverse architectural approaches with a similar number of parameters and
+training data sources. Including these models allowed for a more meaningful evaluation of how my fine-tuned model performs
+not just against its own baseline, but also against state-of-the-art peers on spatial reasoning and general knowledge tasks.
 
 # Usage and Intended Uses
 This model is designed to assist with natural language spatial reasoning, particularly in tasks that involve multi-step relational