HariomSahu committed on
Commit 9fa2b06 (verified)
1 Parent(s): 615fb83

Training run v20250727_173457 - F1: 88.7676, EM: 80.4541

Files changed (4)
  1. README.md +8 -8
  2. eval_results.json +2 -2
  3. model.safetensors +1 -1
  4. training_config.json +3 -3
README.md CHANGED
@@ -22,7 +22,7 @@ model-index:
   - type: exact_match
     value: N/A
   - type: f1
-    value: 89.56708898636393
+    value: 89.93540108105752
 ---
 
 # albert-base-v2 fine-tuned on SQuAD
@@ -37,12 +37,12 @@ This model is a fine-tuned version of [albert-base-v2](https://huggingface.co/al
 - **Dataset**: SQuAD
 - **Optimizer**: adamw
 - **Learning Rate Scheduler**: cosine_with_restarts
-- **Learning Rate**: 8e-05
+- **Learning Rate**: 6e-05
-- **Batch Size**: 24 per device
+- **Batch Size**: 28 per device
-- **Total Batch Size**: 192
+- **Total Batch Size**: 224
 - **Epochs**: 6 (with early stopping)
 - **Weight Decay**: 0.005
-- **Warmup Ratio**: 0.03
+- **Warmup Ratio**: 0.08
 - **Max Gradient Norm**: 0.5
 
 ### Early Stopping
@@ -78,11 +78,11 @@ print(f"Answer: {answer}")
 
 The model achieved the following results on the evaluation set:
 
-- **Exact Match**: 79.2999
-- **F1 Score**: 88.0907
+- **Exact Match**: 80.4541
+- **F1 Score**: 88.7676
 
 ## Training Configuration Hash
 
-Config Hash: d92d5758
+Config Hash: a8d23824
 
 This hash can be used to reproduce the exact training configuration.
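This commit lowers the peak learning rate to 6e-05 and raises the warmup ratio to 0.08 under a `cosine_with_restarts` scheduler. As a minimal sketch of how those two numbers shape the per-step learning rate, the multiplier can be computed like this (the number of restart cycles is an assumption; the commit does not record it):

```python
import math

def lr_multiplier(step, total_steps, warmup_ratio=0.08, num_cycles=2):
    """Linear warmup followed by cosine decay with hard restarts.

    Mirrors the shape of transformers'
    get_cosine_with_hard_restarts_schedule_with_warmup; num_cycles=2 is
    a guess, since this commit does not record it.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return step / max(1, warmup_steps)  # ramps 0 -> 1 over warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # position inside the current cosine cycle; "% 1.0" restarts the cosine
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))

peak_lr = 6e-05  # the new learning rate from this commit
# multiplier is 0 at step 0, reaches 1.0 (i.e. peak_lr) at the end of warmup,
# then decays along a cosine that restarts num_cycles times
```

With warmup_ratio 0.08 the warmup spans the first 8% of training steps, after which the rate decays from 6e-05 and periodically resets.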
eval_results.json CHANGED
@@ -1,4 +1,4 @@
 {
-  "exact_match": 82.05298013245033,
-  "f1": 89.56708898636393
+  "exact_match": 82.69631031220435,
+  "f1": 89.93540108105752
 }
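The `exact_match` and `f1` values updated above are the standard SQuAD metrics. As a simplified sketch of how they are computed per prediction (the official SQuAD script additionally normalizes case, punctuation, and articles, and takes the max over all reference answers):

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the (trimmed) strings match exactly, else 0.0."""
    return float(prediction.strip() == reference.strip())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over tokens."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    common = Counter(pred_tokens) & Counter(ref_tokens)  # multiset overlap
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

score = token_f1("in the park", "the park")  # partial overlap scores between 0 and 1
```

The dataset-level numbers in eval_results.json are these per-example scores averaged over the evaluation set and scaled to percentages.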
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1c1b8a698d2b09adb89ae35f8dae34da1657e13cd438f720d72951743d65b517
+oid sha256:191e91d64faf0b47d869e6b4936c080d5182f28224dd5f1615880bfef5bd6fc7
 size 44381360
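The model.safetensors diff only changes a Git LFS pointer: the `oid sha256:` line is the SHA-256 digest of the actual weights file. A downloaded checkpoint can therefore be verified against the pointer with a sketch like this (the 4 MiB chunk size is an arbitrary choice):

```python
import hashlib

def lfs_oid(path: str, chunk_size: int = 4 * 1024 * 1024) -> str:
    """Return the sha256 hex digest Git LFS records as the object id."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # hash in chunks so large checkpoints don't need to fit in memory
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the new pointer from this commit:
# lfs_oid("model.safetensors") should equal
# "191e91d64faf0b47d869e6b4936c080d5182f28224dd5f1615880bfef5bd6fc7"
```

Note the `size` line is unchanged (44381360 bytes): the new weights have the same shape, only different values.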
training_config.json CHANGED
@@ -10,11 +10,11 @@
   "context_dropout": 0.05,
   "question_paraphrasing": true,
   "negative_sampling": true,
-  "batch_size": 24,
+  "batch_size": 28,
   "num_epochs": 6,
-  "learning_rate": 8e-05,
+  "learning_rate": 6e-05,
   "weight_decay": 0.005,
-  "warmup_ratio": 0.03,
+  "warmup_ratio": 0.08,
   "gradient_accumulation_steps": 2,
   "max_grad_norm": 0.5,
   "optimizer_type": "adamw",
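The new per-device batch size (28) and the unchanged `gradient_accumulation_steps` (2) are consistent with the README's total batch size of 224 only if training ran on 4 devices; that device count is an inference, not something recorded in the config:

```python
per_device_batch_size = 28       # "batch_size" in training_config.json
gradient_accumulation_steps = 2  # unchanged in this commit
num_devices = 4                  # assumed; implied by 224 / (28 * 2)

# Effective batch size = per-device batch x accumulation steps x devices
total_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
print(total_batch_size)  # 224, matching the README's "Total Batch Size"
```

The same arithmetic explains the old values: 24 x 2 x 4 = 192, the total reported before this commit.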