Training run v20250727_161609 - F1: 88.0907, EM: 79.2999

Files changed (4)

README.md CHANGED

@@ -22,7 +22,7 @@ model-index:
     - type: exact_match
       value: N/A
     - type: f1
-      value: 90.16042070890077
 ---
 # albert-base-v2 fine-tuned on SQuAD
@@ -37,17 +37,17 @@ This model is a fine-tuned version of [albert-base-v2](https://huggingface.co/al
 - **Dataset**: SQuAD
 - **Optimizer**: adamw
 - **Learning Rate Scheduler**: cosine_with_restarts
-- **Learning Rate**: 1e-05
-- **Batch Size**: 20 per device
-- **Total Batch Size**: 160
-- **Epochs**: 10 (with early stopping)
 - **Weight Decay**: 0.005
 - **Warmup Ratio**: 0.03
 - **Max Gradient Norm**: 0.5
 ### Early Stopping
-- **Patience**: 10
 - **Metric**: f1
 - **Best Epoch**: 2
@@ -78,11 +78,11 @@ print(f"Answer: {answer}")
 The model achieved the following results on the evaluation set:
-- **Exact Match**: 80.4825
-- **F1 Score**: 88.4728
 ## Training Configuration Hash
-Config Hash: 5627841c
 This hash can be used to reproduce the exact training configuration.

     - type: exact_match
       value: N/A
     - type: f1
+      value: 89.56708898636393
 ---
 # albert-base-v2 fine-tuned on SQuAD
 - **Dataset**: SQuAD
 - **Optimizer**: adamw
 - **Learning Rate Scheduler**: cosine_with_restarts
+- **Learning Rate**: 8e-05
+- **Batch Size**: 24 per device
+- **Total Batch Size**: 192
+- **Epochs**: 6 (with early stopping)
 - **Weight Decay**: 0.005
 - **Warmup Ratio**: 0.03
 - **Max Gradient Norm**: 0.5
 ### Early Stopping
+- **Patience**: 4
 - **Metric**: f1
 - **Best Epoch**: 2
 The model achieved the following results on the evaluation set:
+- **Exact Match**: 79.2999
+- **F1 Score**: 88.0907
 ## Training Configuration Hash
+Config Hash: d92d5758
 This hash can be used to reproduce the exact training configuration.
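
The batch-size figures in the README can be cross-checked against `gradient_accumulation_steps` in training_config.json. The sketch below assumes the total is per-device batch size × accumulation steps × device count, and assumes 4 devices; the device count is not stated anywhere in this diff and is inferred only because it makes the numbers agree.

```python
def total_batch_size(per_device: int, grad_accum_steps: int, num_devices: int) -> int:
    """Effective batch size per optimizer step (assumed formula)."""
    return per_device * grad_accum_steps * num_devices


GRAD_ACCUM_STEPS = 2      # "gradient_accumulation_steps" in training_config.json
ASSUMED_NUM_DEVICES = 4   # assumption: not given in the diff

old_total = total_batch_size(20, GRAD_ACCUM_STEPS, ASSUMED_NUM_DEVICES)
new_total = total_batch_size(24, GRAD_ACCUM_STEPS, ASSUMED_NUM_DEVICES)

print(old_total, new_total)
```

Under these assumptions the results match the README's "Total Batch Size" values of 160 (before) and 192 (after).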

eval_results.json CHANGED

@@ -1,4 +1,4 @@
 {
-  "exact_match": 83.02743614001892,
-  "f1": 90.16042070890077
 }

 {
+  "exact_match": 82.05298013245033,
+  "f1": 89.56708898636393
 }
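
To quantify the change between the two versions of eval_results.json, the values can be compared directly. This is a minimal sketch with the two JSON bodies copied inline from the diff; the variable names are illustrative only.

```python
import json

# Values copied from the removed (-) and added (+) sides of the diff.
old_results = json.loads('{"exact_match": 83.02743614001892, "f1": 90.16042070890077}')
new_results = json.loads('{"exact_match": 82.05298013245033, "f1": 89.56708898636393}')

# Positive delta means the new run scored higher.
delta = {name: new_results[name] - old_results[name] for name in old_results}

for name, change in delta.items():
    print(f"{name}: {change:+.4f}")
```

Both metrics are lower in the new file: exact match drops by roughly 0.97 points and F1 by roughly 0.59 points.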

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a865323a0aae82688859f4b390249e5e0fc39a95017d434f76d4c89b6f0dc413
 size 44381360

 version https://git-lfs.github.com/spec/v1
+oid sha256:1c1b8a698d2b09adb89ae35f8dae34da1657e13cd438f720d72951743d65b517
 size 44381360

training_config.json CHANGED

@@ -10,9 +10,9 @@
   "context_dropout": 0.05,
   "question_paraphrasing": true,
   "negative_sampling": true,
-  "batch_size": 20,
-  "num_epochs": 10,
-  "learning_rate": 1e-05,
   "weight_decay": 0.005,
   "warmup_ratio": 0.03,
   "gradient_accumulation_steps": 2,
@@ -27,7 +27,7 @@
   "scheduler_power": 0.5,
   "scheduler_eta_min": 5e-07,
   "scheduler_num_cycles": 0.5,
-  "early_stopping_patience": 10,
   "early_stopping_threshold": 0.0002,
   "early_stopping_metric": "f1",
   "log_interval": 50,

   "context_dropout": 0.05,
   "question_paraphrasing": true,
   "negative_sampling": true,
+  "batch_size": 24,
+  "num_epochs": 6,
+  "learning_rate": 8e-05,
   "weight_decay": 0.005,
   "warmup_ratio": 0.03,
   "gradient_accumulation_steps": 2,
   "scheduler_power": 0.5,
   "scheduler_eta_min": 5e-07,
   "scheduler_num_cycles": 0.5,
+  "early_stopping_patience": 4,
   "early_stopping_threshold": 0.0002,
   "early_stopping_metric": "f1",
   "log_interval": 50,
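
For readers unfamiliar with how `early_stopping_patience` and `early_stopping_threshold` typically interact, here is a hedged sketch of a common patience rule. It is not taken from this repository's training code: it assumes one evaluation per epoch, a higher-is-better metric, and an absolute improvement threshold. The function name and the score history are hypothetical.

```python
def stop_epoch(scores, patience=4, threshold=0.0002):
    """Return the 1-based epoch at which training would stop, or None.

    An epoch counts as an improvement only if it beats the best score so far
    by more than `threshold`; training stops after `patience` consecutive
    non-improving evaluations.
    """
    best = float("-inf")
    stale = 0
    for epoch, score in enumerate(scores, start=1):
        if score > best + threshold:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Hypothetical F1 history whose best value falls at epoch 2.
hypothetical_f1 = [87.5, 88.0, 87.9, 87.8, 87.9, 87.7]
print(stop_epoch(hypothetical_f1, patience=4))
```

Under this rule, a run whose best epoch is 2 and whose patience is 4 would stop after epoch 6, which is also the new `num_epochs` limit; with the previous patience of 10, the same rule could not fire within 10 epochs unless the best score came at the very first evaluation.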