StepLaw/StepLaw-N_1.0B-D_1.0B-LR3.906e-03-BS262144

@@ -27,13 +27,13 @@ This model is part of the [StepLaw-N_1.0B-D_1.0B](https://huggingface.co/collect
 ### Training Parameters
 - **Learning rate (lr)**: 3.906e-03
-- **Batch size (bs)**: 128
+- **Batch size (bs)**: 262144
 - **Training iterations**: 7629
 - **Training tokens (D)**: 2.0B
 ## Model Description
-StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 3.906e-03 and batch size 128 for 7629 iterations, using a total of 2.0B training tokens.
+StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This specific model was trained with learning rate 3.906e-03 and batch size 262144 for 7629 iterations, using a total of 2.0B training tokens.
 ## Usage Example
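
Review note on the batch size change: the new value agrees with the stated token count only if the batch size is counted in tokens per step, since 262144 × 7629 is roughly 2.0B. The old value of 128 gives the same total only if it counted sequences of 2048 tokens each (128 × 2048 = 262144). Both the token-unit reading and the 2048-token sequence length are assumptions inferred from the arithmetic; the model card does not state either. A minimal sketch of the check:

```python
# Sanity check of the training parameters shown in this diff.
# Assumption: the new batch size (262144) is measured in tokens per step.
ITERATIONS = 7629
BATCH_NEW = 262144        # value added by this diff
BATCH_OLD = 128           # value removed by this diff
ASSUMED_SEQ_LEN = 2048    # hypothetical sequence length, not stated in the card


def total_tokens(tokens_per_step: int, iterations: int) -> int:
    """Total training tokens for a fixed number of tokens per step."""
    return tokens_per_step * iterations


# New value read as tokens per step.
new_total = total_tokens(BATCH_NEW, ITERATIONS)

# Old value read as tokens per step (does not reach 2.0B).
old_total_raw = total_tokens(BATCH_OLD, ITERATIONS)

# Old value read as sequences per step, each ASSUMED_SEQ_LEN tokens long.
old_total_as_sequences = total_tokens(BATCH_OLD * ASSUMED_SEQ_LEN, ITERATIONS)

print(f"new, as tokens:     {new_total:,}")
print(f"old, as tokens:     {old_total_raw:,}")
print(f"old, as sequences:  {old_total_as_sequences:,}")
```

Under these assumptions the two batch sizes describe the same training run in different units, which would make the diff a unit correction rather than a change of hyperparameter.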