Training run v20250727_173457 - F1: 88.7676, EM: 80.4541

Files changed:
- README.md (+8 -8)
- eval_results.json (+2 -2)
- model.safetensors (+1 -1)
- training_config.json (+3 -3)
README.md

@@ -22,7 +22,7 @@ model-index:
       - type: exact_match
         value: N/A
       - type: f1
-        value: 89.
+        value: 89.93540108105752
 ---
 
 # albert-base-v2 fine-tuned on SQuAD
@@ -37,12 +37,12 @@ This model is a fine-tuned version of [albert-base-v2](https://huggingface.co/al
 - **Dataset**: SQuAD
 - **Optimizer**: adamw
 - **Learning Rate Scheduler**: cosine_with_restarts
-- **Learning Rate**:
-- **Batch Size**:
-- **Total Batch Size**:
+- **Learning Rate**: 6e-05
+- **Batch Size**: 28 per device
+- **Total Batch Size**: 224
 - **Epochs**: 6 (with early stopping)
 - **Weight Decay**: 0.005
-- **Warmup Ratio**: 0.
+- **Warmup Ratio**: 0.08
 - **Max Gradient Norm**: 0.5
 
 ### Early Stopping
@@ -78,11 +78,11 @@ print(f"Answer: {answer}")
 
 The model achieved the following results on the evaluation set:
 
-- **Exact Match**:
-- **F1 Score**: 88.
+- **Exact Match**: 80.4541
+- **F1 Score**: 88.7676
 
 ## Training Configuration Hash
 
-Config Hash:
+Config Hash: a8d23824
 
 This hash can be used to reproduce the exact training configuration.
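The README says the config hash can be used to reproduce the exact training configuration. One plausible way such an 8-character hash could be derived (an assumption — the diff does not show the repo's actual hashing scheme) is truncating a SHA-256 over the canonically serialized config:

```python
import hashlib
import json

def config_hash(config: dict) -> str:
    """Hypothetical scheme: first 8 hex chars of SHA-256 over
    canonical (sorted-key, compact) JSON. The repo's real scheme
    is not shown in this commit."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]

# Sorting keys makes the hash independent of dict insertion order,
# which is what a reproducibility hash needs.
cfg = {"batch_size": 28, "learning_rate": 6e-05, "warmup_ratio": 0.08}
assert config_hash(cfg) == config_hash(dict(reversed(list(cfg.items()))))
```

The point of canonical serialization is that two runs with the same hyperparameters hash identically regardless of how the config dict was built.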
eval_results.json

@@ -1,4 +1,4 @@
 {
-  "exact_match": 82.
-  "f1": 89.
+  "exact_match": 82.69631031220435,
+  "f1": 89.93540108105752
 }
model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:191e91d64faf0b47d869e6b4936c080d5182f28224dd5f1615880bfef5bd6fc7
 size 44381360
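The model.safetensors entry in this commit is a Git LFS pointer, not the weights themselves: the 44,381,360-byte checkpoint is fetched separately, and a download can be checked against the pointer's `oid sha256`. A minimal verification sketch (the local filename is an assumption):

```python
import hashlib

def sha256_of_file(path: str) -> str:
    """Stream the file in 1 MiB chunks so large checkpoints
    don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# After `git lfs pull`, the local file should match the pointer's oid:
# assert sha256_of_file("model.safetensors") == (
#     "191e91d64faf0b47d869e6b4936c080d5182f28224dd5f1615880bfef5bd6fc7"
# )
```

A digest mismatch would indicate a corrupted or stale download rather than a problem with the commit itself.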
training_config.json

@@ -10,11 +10,11 @@
   "context_dropout": 0.05,
   "question_paraphrasing": true,
   "negative_sampling": true,
-  "batch_size":
+  "batch_size": 28,
   "num_epochs": 6,
-  "learning_rate":
+  "learning_rate": 6e-05,
   "weight_decay": 0.005,
-  "warmup_ratio": 0.
+  "warmup_ratio": 0.08,
   "gradient_accumulation_steps": 2,
   "max_grad_norm": 0.5,
   "optimizer_type": "adamw",
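The README's "Total Batch Size: 224" follows from this config only once a device count is factored in: 28 per device with 2 gradient-accumulation steps gives an effective 56 per device, so 4 devices would be needed to reach 224. The device count is an assumption — it is not recorded anywhere in this diff:

```python
# Values from training_config.json in this commit.
per_device_batch_size = 28
gradient_accumulation_steps = 2

# Assumed, not recorded in the diff; chosen so the numbers agree
# with the README's stated total of 224.
num_devices = 4

total_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)
assert total_batch_size == 224
```

If the run actually used a different device count, the effective batch size would differ accordingly, which is worth noting when reproducing the run on other hardware.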