Model Card
- Source: https://arxiv.org/abs/2509.02046
- Optimizer: kron
- Model size: 130m
- Data size: 10B
Best configuration
| Hyperparameter | Value |
|---|---|
| beta1 | 0.95 |
| block_size | 256 |
| learning_rate | 0.002 |
| max_grad_norm | 1 |
| min_lr_ratio | 0 |
| normalize_grads | True |
| partition_grads_into_blocks | True |
| preconditioner_init_scale | 1 |
| preconditioner_lr | 0.2 |
| preconditioner_update_probability | 0.05 |
| train_batch_size | 128 |
| update_prob_flat_start | 2000 |
| warmup | 1000 |
| weight_decay | 0.5 |
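
The values in the table can be kept in a single typed object so that a training script reads them from one place. The snippet below is a minimal sketch, not the code used to train this model: the class name `KronConfig` and the helper `final_learning_rate` are hypothetical, and reading `min_lr_ratio` as the fraction of the peak learning rate reached at the end of the schedule is an assumption that the card does not confirm.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class KronConfig:
    """Hypothetical container for the hyperparameters listed in the table."""
    beta1: float = 0.95
    block_size: int = 256
    learning_rate: float = 0.002
    max_grad_norm: float = 1.0
    min_lr_ratio: float = 0.0
    normalize_grads: bool = True
    partition_grads_into_blocks: bool = True
    preconditioner_init_scale: float = 1.0
    preconditioner_lr: float = 0.2
    preconditioner_update_probability: float = 0.05
    train_batch_size: int = 128
    update_prob_flat_start: int = 2000
    warmup: int = 1000
    weight_decay: float = 0.5


def final_learning_rate(cfg: KronConfig) -> float:
    # Assumption: min_lr_ratio scales the peak learning rate to give the
    # value the schedule decays to. The card does not state this.
    return cfg.learning_rate * cfg.min_lr_ratio


config = KronConfig()

# Print every hyperparameter as a "name = value" line.
for name, value in asdict(config).items():
    print(f"{name} = {value}")
```

Freezing the dataclass keeps the reported configuration from being modified by accident during a run; a plain dictionary would work equally well if mutability is not a concern.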