# Fine-tuned on AQuA-RAT

This repo contains a fine-tuned version of Apertus, trained on the [AQuA-RAT dataset](https://huggingface.co/datasets/deepmind/aqua_rat).

The fine-tuning was performed using Unsloth on a single GPU (RTX A6000, 48 GB) with the following parameters (a sketch of the corresponding setup follows the list):

- per_device_train_batch_size: 8
- gradient_accumulation_steps: 4 (effective batch size: 32)
- warmup_steps: 10
- …
- eval_strategy: steps
- eval_steps: 150
- packing: True
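
For reference, a training setup with these values might look like the sketch below. It is illustrative rather than the exact script used for this checkpoint: the base-model id, prompt template, `max_seq_length`, and `output_dir` are assumptions, and the parameters elided above are omitted.

```python
# Illustrative sketch only, not the exact training script for this checkpoint.
# Assumptions: base-model id, prompt template, max_seq_length, output_dir.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="swiss-ai/Apertus-8B-Instruct-2509",  # assumed base checkpoint
    max_seq_length=2048,                             # assumed
)

def to_text(example):
    # Assumed prompt template: question + options as input, rationale + answer as target.
    options = "\n".join(example["options"])
    example["text"] = (
        f"Question: {example['question']}\nOptions:\n{options}\n"
        f"Rationale: {example['rationale']}\nAnswer: {example['correct']}"
    )
    return example

train_ds = load_dataset("deepmind/aqua_rat", split="train").map(to_text)
eval_ds = load_dataset("deepmind/aqua_rat", split="validation").map(to_text)

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,  # `tokenizer=` on older TRL versions
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=8,
        gradient_accumulation_steps=4,  # effective batch size: 8 * 4 = 32
        warmup_steps=10,
        eval_strategy="steps",
        eval_steps=150,
        packing=True,
        output_dir="outputs",           # assumed
    ),
)
trainer.train()
```

With packing enabled, short AQuA-RAT examples are concatenated into full-length sequences, and gradient accumulation raises the effective batch size to 32 while keeping per-step memory within a single 48 GB card.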
## How to use

You can run this fine-tuned version using the instructions below:
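
As a minimal sketch, assuming this checkpoint loads like any standard Hugging Face causal LM; the repo id and prompt below are placeholders, not the original instructions:

```python
# Minimal sketch, assuming the checkpoint loads as a standard causal LM.
# The repo id and prompt below are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<this-repo-id>"  # placeholder: replace with this repository's id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = (
    "Question: A train covers 120 km in 2 hours. What is its average speed?\n"
    "Options:\nA) 40 km/h\nB) 50 km/h\nC) 60 km/h\nD) 70 km/h\nE) 80 km/h\n"
    "Rationale:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```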
|