Jack Li
committed on
Upload README.md with huggingface_hub
README.md
CHANGED
@@ -27,13 +27,13 @@ This model is part of the [StepLaw-N_1.0B-D_1.0B](https://huggingface.co/collect
 
 ### Training Parameters
 - **Learning rate (lr)**: 3.906e-03
-- **Batch size (bs)**:
+- **Batch size (bs)**: 262144
 - **Training iterations**: 7629
 - **Training tokens (D)**: 2.0B
 
 ## Model Description
 
-StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 3.906e-03 and batch size
+StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 3.906e-03 and batch size 262144 for 7629 iterations, using a total of 2.0B training tokens.
 
 ## Usage Example
 
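
The values this commit fills in can be cross-checked against each other. The sketch below assumes the batch size is counted in tokens per optimizer step. The README does not say this, but the arithmetic is consistent with that reading. The function and variable names are illustrative and are not part of the StepLaw code.

```python
# Values copied from the "Training Parameters" list in the README.
batch_size_tokens = 262144    # assumption: batch size is measured in tokens
training_iterations = 7629
reported_tokens = 2.0e9       # "2.0B" as stated in the README


def total_training_tokens(batch_size: int, iterations: int) -> int:
    """Tokens consumed if every iteration processes one full batch."""
    return batch_size * iterations


total = total_training_tokens(batch_size_tokens, training_iterations)

# Relative difference between the computed total and the rounded "2.0B".
relative_gap = abs(total - reported_tokens) / reported_tokens

print(total)                                   # 1999896576
print(f"relative gap: {relative_gap:.4%}")
```

Under that assumption the product is 1,999,896,576 tokens, which rounds to the reported 2.0B.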