End of training

Files changed (3)

README.md +17 -12
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [vidore/colpaligemma-3b-pt-448-base](https://huggingface.co/vidore/colpaligemma-3b-pt-448-base) on the gajanhcc/fashion-query-dataset-10samples dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0035
-- Model Preparation Time: 0.0058
 ## Model description
@@ -38,27 +38,32 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 4
-- eval_batch_size: 4
 - seed: 42
-- gradient_accumulation_steps: 4
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 100
-- num_epochs: 1.5
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time |
 |:-------------:|:-----:|:----:|:---------------:|:----------------------:|
-| No log        | 0.01  | 1    | 0.2341          | 0.0058                 |
-| 0.0067        | 1.0   | 100  | 0.0033          | 0.0058                 |
 ### Framework versions
 - Transformers 4.47.1
-- Pytorch 2.5.1+cu124
-- Datasets 3.3.2
-- Tokenizers 0.21.0

 This model is a fine-tuned version of [vidore/colpaligemma-3b-pt-448-base](https://huggingface.co/vidore/colpaligemma-3b-pt-448-base) on the gajanhcc/fashion-query-dataset-10samples dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0004
+- Model Preparation Time: 0.0063
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 2
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 50
+- num_epochs: 3
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time |
 |:-------------:|:-----:|:----:|:---------------:|:----------------------:|
+| No log        | 0.01  | 1    | 0.0104          | 0.0063                 |
+| 0.0028        | 0.5   | 50   | 0.0026          | 0.0063                 |
+| 0.0           | 1.0   | 100  | 0.0012          | 0.0063                 |
+| 0.0001        | 1.5   | 150  | 0.0009          | 0.0063                 |
+| 0.0049        | 2.0   | 200  | 0.0006          | 0.0063                 |
+| 0.0002        | 2.5   | 250  | 0.0004          | 0.0063                 |
+| 0.0           | 3.0   | 300  | 0.0004          | 0.0063                 |
 ### Framework versions
 - Transformers 4.47.1
+- Pytorch 2.6.0+cu124
+- Datasets 3.4.1
+- Tokenizers 0.21.1
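
The hyperparameter change keeps `total_train_batch_size` at 16 while trading gradient accumulation for a larger per-device batch, and it halves the warmup. The sketch below checks that arithmetic and shows the general shape of a linear schedule with warmup. It assumes a single device (which is what 8 × 2 = 16 implies) and takes 300 total optimizer steps from the last row of the results table. The helper names `effective_batch_size` and `linear_lr` are illustrative, not part of any library, and the function approximates the schedule rather than reproducing the Trainer's exact internals.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step: int, base_lr: float = 5e-5,
              warmup_steps: int = 50, total_steps: int = 300) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 (assumed shape)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Old run: train_batch_size=4, gradient_accumulation_steps=4
old_total = effective_batch_size(4, 4)
# New run: train_batch_size=8, gradient_accumulation_steps=2
new_total = effective_batch_size(8, 2)

if __name__ == "__main__":
    print(old_total, new_total)
    for step in (0, 25, 50, 175, 300):
        print(step, linear_lr(step))
```

Because the effective batch size is unchanged, the step counts in the old and new results tables are comparable: step 100 corresponds to epoch 1.0 in both.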

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4fd94fd36aa9e0f131f044d90244b1f421ab774cc60d269b9d9c426ec129b508
 size 157071680

 version https://git-lfs.github.com/spec/v1
+oid sha256:8273a6a2e39cd05daf9c53fb76f53c6af39933d1342f8a87f6ddc7c843aaf36d
 size 157071680

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c1c560b6e024b37e975b368dba6899704efc80176b615e815386d7162689095e
 size 5304

 version https://git-lfs.github.com/spec/v1
+oid sha256:f751ab26bf13641a4cd3c087dadb60f84eae221cf9b924bef5d30e415ad69a33
 size 5304
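
Both binary files are stored as Git LFS pointers, so the diff shows only a new `oid` with an unchanged `size`. In a Git LFS pointer, the `oid sha256:` value is the SHA-256 digest of the real file's contents, which makes it possible to check a downloaded file against the pointer. The following is a minimal sketch of that check; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path passed to `matches_pointer` would be wherever the file was downloaded.

```python
import hashlib

# Pointer text for the new adapter_model.safetensors, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8273a6a2e39cd05daf9c53fb76f53c6af39933d1342f8a87f6ddc7c843aaf36d
size 157071680
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and SHA-256 digest in `pointer`."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large weight files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
```

Calling `matches_pointer("adapter_model.safetensors", pointer)` on a local copy would then confirm that the download corresponds to this commit rather than the previous one.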