cisco-ehsan committed on Oct 6

Commit

96d3e32

verified ·

1 Parent(s): 77f34be

Upload folder using huggingface_hub

Files changed (18)

README.md +364 -0
config.json +55 -0
model.safetensors +3 -0
optimizer.pt +3 -0
rng_state_0.pth +3 -0
rng_state_1.pth +3 -0
rng_state_2.pth +3 -0
rng_state_3.pth +3 -0
rng_state_4.pth +3 -0
rng_state_5.pth +3 -0
rng_state_6.pth +3 -0
rng_state_7.pth +3 -0
scheduler.pt +3 -0
special_tokens_map.json +37 -0
tokenizer.json +0 -0
tokenizer_config.json +952 -0
trainer_state.json +119 -0
training_args.bin +3 -0

README.md ADDED

	@@ -0,0 +1,364 @@

+---
+language:
+- en
+license: apache-2.0
+tags:
+- sentence-transformers
+- cross-encoder
+- reranker
+- generated_from_trainer
+- dataset_size:35705
+- loss:CachedMultipleNegativesRankingLoss
+pipeline_tag: text-ranking
+library_name: sentence-transformers
+---
+# SecureBERT 2.0 Cross-Encoder fine-tuned
+This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model trained using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
+## Model Details
+### Model Description
+- **Model Type:** Cross Encoder
+<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+- **Maximum Sequence Length:** 1024 tokens
+- **Number of Output Labels:** 1 label
+<!-- - **Training Dataset:** Unknown -->
+- **Language:** en
+- **License:** apache-2.0
+### Model Sources
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
+## Usage
+### Direct Usage (Sentence Transformers)
+First install the Sentence Transformers library:
+```bash
+pip install -U sentence-transformers
+```
+Then you can load this model and run inference.
+```python
+from sentence_transformers import CrossEncoder
+# Download from the 🤗 Hub
+model = CrossEncoder("cross_encoder_model_id")
+# Get scores for pairs of texts
+pairs = [
+    ['What is the purpose of the withdrawn draft document mentioned?', 'The withdrawn draft document is provided solely for historical purposes.'],
+    ["What are the implications of Stealc's use of Sqlite3 Dll for data retrieval from browsers?", "Stealc's use of the Sqlite3 Dll for data retrieval from browsers has several implications for cybersecurity. Firstly, it demonstrates the malware's sophistication in interacting directly with browser databases, allowing it to bypass simpler data extraction methods that might be more easily detected. This approach enables Stealc to execute complex queries against the browser's SQLite databases to extract sensitive information such as cookies, saved passwords, and browsing history. The reliance on Sqlite3 Dll also indicates that Stealc can potentially adapt to extract data from any application that uses SQLite for data storage, broadening its threat landscape. For cybersecurity professionals, this underscores the need for robust monitoring and protection mechanisms at the database level, as well as the importance of securing applications that use SQLite databases against unauthorized access. Additionally, it highlights the necessity for continuous monitoring of system and application dependencies for signs of tampering or malicious use, as legitimate tools like Sqlite3 Dll can be exploited by malware to facilitate data theft."],
+    ['Which of the following is a key security requirement for an effective Information Security Awareness program?', 'Segmenting the audience based on their role'],
+    ['What steps should be taken post-acquisition to integrate cybersecurity practices effectively?', "Post-acquisition, effectively integrating cybersecurity practices involves several critical steps to ensure a seamless transition and maintain a strong security posture: 1. Conduct a Comprehensive Security Assessment: Perform a detailed assessment of the acquired company's cybersecurity infrastructure, policies, and practices to identify any gaps or vulnerabilities. 2. Align Cybersecurity Policies and Procedures: Harmonize the cybersecurity policies and procedures of both companies to ensure consistent standards and practices across the merged entity. This includes data protection, incident response, and access control policies. 3. Integrate Security Technologies: Evaluate and integrate security technologies from both companies, such as firewalls, intrusion detection systems, and endpoint protection solutions, to create a unified security architecture. 4. Consolidate Security Teams: Merge the cybersecurity teams of both companies to streamline operations and foster collaboration. Ensure that roles, responsibilities, and reporting structures are clearly defined. 5. Provide Training and Awareness: Conduct comprehensive training sessions for all employees to familiarize them with the integrated company's cybersecurity policies, practices, and tools. 6. Establish Continuous Monitoring and Threat Hunting: Implement continuous monitoring of the integrated network and systems to detect and respond to threats promptly. Engage in proactive threat hunting to identify and mitigate potential security issues before they can be exploited. 7. Review and Update Incident Response Plans: Update the incident response plans to reflect the integrated company's structure and capabilities. Conduct regular drills to ensure readiness in the event of a cybersecurity incident. By following these steps, companies can effectively integrate cybersecurity practices post-acquisition, minimizing risks and ensuring a secure and resilient IT environment."],
+    ['How can you architect zero‐trust principles to render Kerberoasting ineffective?', "Architecting zero-trust principles to mitigate Kerberoasting attacks requires implementing comprehensive identity verification, continuous monitoring, and segmented network access controls that fundamentally challenge the assumptions underlying this MITRE ATT&CK technique (T1558.003).\\n\\n**Identity-Centric Security Architecture:**\\nImplement multi-factor authentication (MFA) universally across all privileged accounts, eliminating password-only dependencies that Kerberoasting exploits. Deploy privileged access management (PAM) solutions with just-in-time access provisioning, ensuring service accounts receive minimal necessary permissions and elevated credentials only during specific operational windows. This aligns with NIST CSF's Protect function (PR.AC-1) by implementing identity governance frameworks that continuously validate user and service account legitimacy.\\n\\n**Kerberos Protocol Hardening:**\\nConfigure domain controllers to implement Kerberos Armoring (FAST), which encrypts pre-authentication data using AES encryption rather than RC4. This prevents offline password cracking attempts characteristic of Kerberoasting workflows. Additionally, deploy Managed Service Accounts (MSAs) and Group Managed Service Accounts (gMSAs) to eliminate static service account passwords entirely, replacing them with automatically managed credentials that cannot be extracted from memory.\\n\\n**Network Segmentation and Zero-Trust Network Access:**\\nImplement microsegmentation strategies that limit lateral movement capabilities even after successful credential compromise. Deploy zero-trust network access (ZTNA) solutions that authenticate and authorize every connection attempt, regardless of source location. This addresses MITRE ATT&CK's lateral movement tactics by ensuring compromised service accounts cannot freely traverse the network infrastructure.\\n\\n**Continuous Monitoring and Detection:**\\nEstablish behavioral analytics platforms monitoring Kerberos ticket requests for anomalous patterns indicating potential Kerberoasting attempts. Deploy endpoint detection and response (EDR) solutions capable of identifying unusual memory access patterns targeting service account credentials. Implement Security Information and Event Management (SIEM) correlation rules detecting multiple failed authentication attempts against high-privilege service accounts.\\n\\n**Credential Hygiene and Rotation:**\\nImplement automated credential rotation policies for all service accounts, ensuring passwords change frequently enough to limit attack windows. Deploy password complexity requirements exceeding common Kerberoasting cracking capabilities, incorporating extended character sets and minimum length requirements that significantly increase computational costs for offline attacks.\\n\\nThis comprehensive approach transforms the traditional trust-on-first-use model into a continuous verification paradigm where every authentication event requires fresh validation, making Kerberoasting economically infeasible while maintaining operational efficiency through automated policy enforcement."],
+]
+scores = model.predict(pairs)
+print(scores.shape)
+# (5,)
+# Or rank different texts based on similarity to a single text
+ranks = model.rank(
+    'What is the purpose of the withdrawn draft document mentioned?',
+    [
+        'The withdrawn draft document is provided solely for historical purposes.',
+        "Stealc's use of the Sqlite3 Dll for data retrieval from browsers has several implications for cybersecurity. Firstly, it demonstrates the malware's sophistication in interacting directly with browser databases, allowing it to bypass simpler data extraction methods that might be more easily detected. This approach enables Stealc to execute complex queries against the browser's SQLite databases to extract sensitive information such as cookies, saved passwords, and browsing history. The reliance on Sqlite3 Dll also indicates that Stealc can potentially adapt to extract data from any application that uses SQLite for data storage, broadening its threat landscape. For cybersecurity professionals, this underscores the need for robust monitoring and protection mechanisms at the database level, as well as the importance of securing applications that use SQLite databases against unauthorized access. Additionally, it highlights the necessity for continuous monitoring of system and application dependencies for signs of tampering or malicious use, as legitimate tools like Sqlite3 Dll can be exploited by malware to facilitate data theft.",
+        'Segmenting the audience based on their role',
+        "Post-acquisition, effectively integrating cybersecurity practices involves several critical steps to ensure a seamless transition and maintain a strong security posture: 1. Conduct a Comprehensive Security Assessment: Perform a detailed assessment of the acquired company's cybersecurity infrastructure, policies, and practices to identify any gaps or vulnerabilities. 2. Align Cybersecurity Policies and Procedures: Harmonize the cybersecurity policies and procedures of both companies to ensure consistent standards and practices across the merged entity. This includes data protection, incident response, and access control policies. 3. Integrate Security Technologies: Evaluate and integrate security technologies from both companies, such as firewalls, intrusion detection systems, and endpoint protection solutions, to create a unified security architecture. 4. Consolidate Security Teams: Merge the cybersecurity teams of both companies to streamline operations and foster collaboration. Ensure that roles, responsibilities, and reporting structures are clearly defined. 5. Provide Training and Awareness: Conduct comprehensive training sessions for all employees to familiarize them with the integrated company's cybersecurity policies, practices, and tools. 6. Establish Continuous Monitoring and Threat Hunting: Implement continuous monitoring of the integrated network and systems to detect and respond to threats promptly. Engage in proactive threat hunting to identify and mitigate potential security issues before they can be exploited. 7. Review and Update Incident Response Plans: Update the incident response plans to reflect the integrated company's structure and capabilities. Conduct regular drills to ensure readiness in the event of a cybersecurity incident. By following these steps, companies can effectively integrate cybersecurity practices post-acquisition, minimizing risks and ensuring a secure and resilient IT environment.",
+        "Architecting zero-trust principles to mitigate Kerberoasting attacks requires implementing comprehensive identity verification, continuous monitoring, and segmented network access controls that fundamentally challenge the assumptions underlying this MITRE ATT&CK technique (T1558.003).\\n\\n**Identity-Centric Security Architecture:**\\nImplement multi-factor authentication (MFA) universally across all privileged accounts, eliminating password-only dependencies that Kerberoasting exploits. Deploy privileged access management (PAM) solutions with just-in-time access provisioning, ensuring service accounts receive minimal necessary permissions and elevated credentials only during specific operational windows. This aligns with NIST CSF's Protect function (PR.AC-1) by implementing identity governance frameworks that continuously validate user and service account legitimacy.\\n\\n**Kerberos Protocol Hardening:**\\nConfigure domain controllers to implement Kerberos Armoring (FAST), which encrypts pre-authentication data using AES encryption rather than RC4. This prevents offline password cracking attempts characteristic of Kerberoasting workflows. Additionally, deploy Managed Service Accounts (MSAs) and Group Managed Service Accounts (gMSAs) to eliminate static service account passwords entirely, replacing them with automatically managed credentials that cannot be extracted from memory.\\n\\n**Network Segmentation and Zero-Trust Network Access:**\\nImplement microsegmentation strategies that limit lateral movement capabilities even after successful credential compromise. Deploy zero-trust network access (ZTNA) solutions that authenticate and authorize every connection attempt, regardless of source location. This addresses MITRE ATT&CK's lateral movement tactics by ensuring compromised service accounts cannot freely traverse the network infrastructure.\\n\\n**Continuous Monitoring and Detection:**\\nEstablish behavioral analytics platforms monitoring Kerberos ticket requests for anomalous patterns indicating potential Kerberoasting attempts. Deploy endpoint detection and response (EDR) solutions capable of identifying unusual memory access patterns targeting service account credentials. Implement Security Information and Event Management (SIEM) correlation rules detecting multiple failed authentication attempts against high-privilege service accounts.\\n\\n**Credential Hygiene and Rotation:**\\nImplement automated credential rotation policies for all service accounts, ensuring passwords change frequently enough to limit attack windows. Deploy password complexity requirements exceeding common Kerberoasting cracking capabilities, incorporating extended character sets and minimum length requirements that significantly increase computational costs for offline attacks.\\n\\nThis comprehensive approach transforms the traditional trust-on-first-use model into a continuous verification paradigm where every authentication event requires fresh validation, making Kerberoasting economically infeasible while maintaining operational efficiency through automated policy enforcement.",
+    ]
+)
+# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
+```
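The reranking step in the snippet above can be sketched without downloading the model. This is a minimal illustration, not the library's implementation: `rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in for the cross-encoder so the example runs offline. The output mirrors the `corpus_id`/`score` dictionaries shown in the comment above.

```python
from typing import Callable, List, Optional


def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[List[List[str]]], List[float]],
    top_k: Optional[int] = None,
) -> List[dict]:
    """Score every (query, document) pair and sort from best to worst."""
    pairs = [[query, doc] for doc in documents]
    scores = score_fn(pairs)
    ranked = [
        {"corpus_id": idx, "score": float(score)}
        for idx, score in enumerate(scores)
    ]
    ranked.sort(key=lambda item: item["score"], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


def toy_overlap_score(pairs: List[List[str]]) -> List[float]:
    """Hypothetical stand-in scorer: fraction of query words found in the document."""
    scores = []
    for query, doc in pairs:
        query_words = set(query.lower().split())
        doc_words = set(doc.lower().split())
        scores.append(len(query_words & doc_words) / max(len(query_words), 1))
    return scores


candidates = [
    "Segmenting the audience based on their role",
    "The withdrawn draft document is provided solely for historical purposes.",
]
results = rerank(
    "What is the purpose of the withdrawn draft document mentioned?",
    candidates,
    score_fn=toy_overlap_score,
)
print(results[0]["corpus_id"])  # index of the best-scoring candidate
```

With the real model loaded, passing `score_fn=model.predict` should slot in the same way, since `predict` takes a list of text pairs and returns one score per pair.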
+<!--
+### Direct Usage (Transformers)
+<details><summary>Click to see the direct usage in Transformers</summary>
+</details>
+-->
+<!--
+### Downstream Usage (Sentence Transformers)
+You can finetune this model on your own dataset.
+<details><summary>Click to expand</summary>
+</details>
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Dataset
+#### Unnamed Dataset
+* Size: 35,705 training samples
+* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | sentence1                                                                                       | sentence2                                                                                         | label                                                         |
+  |:--------|:------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|:--------------------------------------------------------------|
+  | type    | string                                                                                          | string                                                                                            | float                                                         |
+  | details | <ul><li>min: 24 characters</li><li>mean: 98.46 characters</li><li>max: 415 characters</li></ul> | <ul><li>min: 3 characters</li><li>mean: 1468.34 characters</li><li>max: 5413 characters</li></ul> | <ul><li>min: 1.0</li><li>mean: 1.0</li><li>max: 1.0</li></ul> |
+* Samples:
+  | sentence1                                                                                                                                                                        | sentence2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | label            |
+  |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+  | <code>Explain the implementation details of C2 protocols that use IPv6 extension headers and fragmentation for covert data channels while evading deep packet inspection.</code> | <code>IPv6 offers many new possibilities for malicious code authors in terms of evasion. The protocol introduces a number of new features such as multiple addresses per interface, auto-configuration, routing header options, and others. Some of these can be used to hide traffic and data transfers, making it difficult for packet inspection systems to detect them or associate with known malware command-and-control (C2) channels. For example, the 41 protocol number is reserved for IPv6 over IPv4 tunneling. This feature allows packets encapsulated in an IPv6 header to be transported over an IPv4 network. However, there are no restrictions on what the encapsulated traffic can be. A malicious actor could encapsulate data from a botnet or C2 channel within an IPv6 tunnel that is routed through an IPv4 network. Another example is the 137 protocol number reserved for Dynamic Host Configuration Protocol (DHCP). DHCP uses broadcast to obtain IP addresses and configuration information, so using this as a...</code> | <code>1.0</code> |
+  | <code>What does each switchport represent on a switch?</code>                                                                                                                    | <code>Collision domain</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | <code>1.0</code> |
+  | <code>How can defenders avoid alert fatigue by converting reactive detections into hunt signature exceptions?</code>                                                             | <code>Alert fatigue represents a critical challenge in cybersecurity operations, where security teams become desensitized to frequent notifications, potentially missing genuine threats buried within noise. Converting reactive detections into hunt signature exceptions offers a strategic approach to mitigate this issue while enhancing proactive threat hunting capabilities.\n\nThe foundation lies in establishing a robust detection framework aligned with NIST Cybersecurity Framework's Detect (DE) function, specifically DE.AE (Anomalies and Events) and DE.CM (Security Continuous Monitoring). Organizations should implement tiered alert severity classifications using the Common Vulnerability Scoring System (CVSS) or organization-specific risk matrices. This categorization enables defenders to prioritize high-fidelity alerts while systematically analyzing lower-priority notifications during structured hunt cycles.\n\nHunt signature exceptions operate by transforming previously reactive, rule-based d...</code> | <code>1.0</code> |
+* Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 10.0,
+      "num_negatives": 10,
+      "activation_fn": "torch.nn.modules.activation.Sigmoid",
+      "mini_batch_size": 24
+  }
+  ```
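To make the loss parameters concrete, here is a rough sketch of how a multiple-negatives ranking objective combines them, as I understand it: each raw model output passes through the sigmoid activation, is multiplied by `scale`, and a cross-entropy is taken over one positive plus `num_negatives` negatives with the positive as the target. The function names are hypothetical, the sketch uses plain Python rather than PyTorch, and it ignores the gradient caching that `mini_batch_size` controls, which affects memory use rather than the loss value. The library source remains the authority on the exact computation.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def ranking_loss(
    positive_logit: float,
    negative_logits: List[float],
    scale: float = 10.0,
) -> float:
    """Cross-entropy over [positive, negatives...] with the positive as target."""
    # Squash raw outputs into (0, 1), then stretch them by `scale`.
    scaled = [scale * sigmoid(z) for z in [positive_logit] + negative_logits]
    # Numerically stable log-sum-exp over all candidates.
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    # Negative log-probability assigned to the positive (index 0).
    return log_sum_exp - scaled[0]


# One positive against num_negatives = 10 negatives.
uninformed = ranking_loss(0.0, [0.0] * 10)   # model cannot tell them apart
confident = ranking_loss(8.0, [-8.0] * 10)   # positive clearly separated
print(uninformed, confident)
```

When all eleven candidates score the same, the loss equals log(11), about 2.398. Because the sigmoid bounds every score, the scaled values stay within 0 to 10, so the loss approaches a small positive floor rather than zero.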
+### Evaluation Dataset
+#### Unnamed Dataset
+* Size: 8,927 evaluation samples
+* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | sentence1                                                                                       | sentence2                                                                                         | label                                                         |
+  |:--------|:------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|:--------------------------------------------------------------|
+  | type    | string                                                                                          | string                                                                                            | float                                                         |
+  | details | <ul><li>min: 17 characters</li><li>mean: 97.23 characters</li><li>max: 341 characters</li></ul> | <ul><li>min: 2 characters</li><li>mean: 1537.31 characters</li><li>max: 5375 characters</li></ul> | <ul><li>min: 1.0</li><li>mean: 1.0</li><li>max: 1.0</li></ul> |
+* Samples:
+  | sentence1                                                                                                                  | sentence2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | label            |
+  |:---------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+  | <code>What is the purpose of the withdrawn draft document mentioned?</code>                                                | <code>The withdrawn draft document is provided solely for historical purposes.</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | <code>1.0</code> |
+  | <code>What are the implications of Stealc's use of Sqlite3 Dll for data retrieval from browsers?</code>                    | <code>Stealc's use of the Sqlite3 Dll for data retrieval from browsers has several implications for cybersecurity. Firstly, it demonstrates the malware's sophistication in interacting directly with browser databases, allowing it to bypass simpler data extraction methods that might be more easily detected. This approach enables Stealc to execute complex queries against the browser's SQLite databases to extract sensitive information such as cookies, saved passwords, and browsing history. The reliance on Sqlite3 Dll also indicates that Stealc can potentially adapt to extract data from any application that uses SQLite for data storage, broadening its threat landscape. For cybersecurity professionals, this underscores the need for robust monitoring and protection mechanisms at the database level, as well as the importance of securing applications that use SQLite databases against unauthorized access. Additionally, it highlights the necessity for continuous monitoring of system and application dep...</code> | <code>1.0</code> |
+  | <code>Which of the following is a key security requirement for an effective Information Security Awareness program?</code> | <code>Segmenting the audience based on their role</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | <code>1.0</code> |
+* Loss: [<code>CachedMultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#cachedmultiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 10.0,
+      "num_negatives": 10,
+      "activation_fn": "torch.nn.modules.activation.Sigmoid",
+      "mini_batch_size": 24
+  }
+  ```
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `eval_strategy`: steps
+- `per_device_train_batch_size`: 20
+- `per_device_eval_batch_size`: 20
+- `learning_rate`: 2e-05
+- `num_train_epochs`: 10
+- `warmup_ratio`: 0.1
+- `seed`: 12
+- `bf16`: True
+- `load_best_model_at_end`: True
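The `learning_rate`, `warmup_ratio`, and `num_train_epochs` values above, together with the linear scheduler, imply a concrete learning-rate curve. The sketch below works it out in plain Python. It assumes 223 optimizer steps per epoch (inferred from the training log, where step 50 corresponds to epoch 0.2242) and a linear warmup followed by linear decay to zero; `linear_schedule_lr` is a hypothetical helper, not a library function.

```python
import math

LEARNING_RATE = 2e-05   # learning_rate
WARMUP_RATIO = 0.1      # warmup_ratio
NUM_TRAIN_EPOCHS = 10   # num_train_epochs
STEPS_PER_EPOCH = 223   # assumption: inferred from the training log


def linear_schedule_lr(step: int, total_steps: int, warmup_steps: int,
                       peak_lr: float = LEARNING_RATE) -> float:
    """Learning rate after `step` scheduler updates (hypothetical helper)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from the peak down to 0 at the final step.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, remaining)


total_steps = NUM_TRAIN_EPOCHS * STEPS_PER_EPOCH       # 2230 under the assumption
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)   # first 10% of training

for step in (0, 49, 223, 249, 1000, total_steps):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps))
```

Under these assumptions the `learning_rate` values recorded in `trainer_state.json` appear to match the schedule evaluated one update earlier: the value logged at step 50 (4.394618834080718e-06) equals 2e-05 × 49/223, and the value logged at step 250 (1.974090682610862e-05) equals 2e-05 × 1981/2007.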
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: steps
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 20
+- `per_device_eval_batch_size`: 20
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 2e-05
+- `weight_decay`: 0.0
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 10
+- `max_steps`: -1
+- `lr_scheduler_type`: linear
+- `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.1
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: True
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 12
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: True
+- `fp16`: False
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: True
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: True
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: False
+- `resume_from_checkpoint`: None
+- `hub_model_id`: None
+- `hub_strategy`: every_save
+- `hub_private_repo`: None
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `include_for_metrics`: []
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `use_liger_kernel`: False
+- `eval_use_gather_object`: False
+- `average_tokens_across_devices`: False
+- `prompts`: None
+- `batch_sampler`: batch_sampler
+- `multi_dataset_batch_sampler`: proportional
+- `router_mapping`: {}
+- `learning_rate_mapping`: {}
+</details>
+### Training Logs
+| Epoch  | Step | Training Loss | Validation Loss |
+|:------:|:----:|:-------------:|:---------------:|
+| 0.0045 | 1    | 0.0007        | -               |
+| 0.2242 | 50   | 0.0031        | -               |
+| 0.4484 | 100  | 0.0023        | -               |
+| 0.6726 | 150  | 0.0029        | -               |
+| 0.8969 | 200  | 0.0029        | -               |
+| 1.1211 | 250  | 0.0034        | -               |
+| 1.3453 | 300  | 0.0029        | -               |
+| 1.5695 | 350  | 0.0034        | -               |
+| 1.7937 | 400  | 0.0048        | -               |
+| 2.0179 | 450  | 0.0064        | -               |
+| 2.2422 | 500  | 0.0052        | 0.0395          |
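The fractional epochs in the table can be reproduced from the training configuration. The sketch below is a worked check, not part of the training code. It assumes 8 training processes (one per `rng_state_*.pth` file in this commit) and that the last incomplete batch of each epoch is dropped (`dataloader_drop_last: True`); the dataset size comes from the `dataset_size:35705` tag in the model card metadata.

```python
DATASET_SIZE = 35705    # dataset_size tag in the model card metadata
PER_DEVICE_BATCH = 20   # per_device_train_batch_size
NUM_DEVICES = 8         # assumption: one process per rng_state_*.pth file
EVAL_STEPS = 500        # eval_steps in trainer_state.json

# Samples consumed per optimizer step across all devices.
global_batch = PER_DEVICE_BATCH * NUM_DEVICES

# Integer division because the incomplete final batch is dropped.
steps_per_epoch = DATASET_SIZE // global_batch


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return step / steps_per_epoch


print("global batch size:", global_batch)
print("steps per epoch:", steps_per_epoch)
for step in (1, 50, 500):
    print(step, round(epoch_at(step), 4))
print("first evaluation at epoch:", round(epoch_at(EVAL_STEPS), 4))
```

With `eval_steps` set to 500, the first evaluation falls at step 500, which is consistent with the table reporting a validation loss only in its last row.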
+### Framework Versions
+- Python: 3.10.10
+- Sentence Transformers: 5.0.0
+- Transformers: 4.52.4
+- PyTorch: 2.7.0+cu128
+- Accelerate: 1.9.0
+- Datasets: 3.6.0
+- Tokenizers: 0.21.1
+## Citation
+### BibTeX
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->

config.json ADDED Viewed

	@@ -0,0 +1,55 @@

+{
+  "architectures": [
+    "ModernBertForSequenceClassification"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 50281,
+  "classifier_activation": "gelu",
+  "classifier_bias": false,
+  "classifier_dropout": 0.0,
+  "classifier_pooling": "mean",
+  "cls_token_id": 50281,
+  "decoder_bias": true,
+  "deterministic_flash_attn": false,
+  "embedding_dropout": 0.0,
+  "eos_token_id": 50282,
+  "global_attn_every_n_layers": 3,
+  "global_rope_theta": 160000.0,
+  "gradient_checkpointing": false,
+  "hidden_activation": "gelu",
+  "hidden_size": 768,
+  "id2label": {
+    "0": "LABEL_0"
+  },
+  "initializer_cutoff_factor": 2.0,
+  "initializer_range": 0.02,
+  "intermediate_size": 1152,
+  "label2id": {
+    "LABEL_0": 0
+  },
+  "layer_norm_eps": 1e-05,
+  "local_attention": 128,
+  "local_rope_theta": 10000.0,
+  "max_position_embeddings": 8192,
+  "mlp_bias": false,
+  "mlp_dropout": 0.0,
+  "model_type": "modernbert",
+  "norm_bias": false,
+  "norm_eps": 1e-05,
+  "num_attention_heads": 12,
+  "num_hidden_layers": 22,
+  "pad_token_id": 50283,
+  "position_embedding_type": "absolute",
+  "repad_logits_with_grad": false,
+  "sentence_transformers": {
+    "activation_fn": "torch.nn.modules.activation.Sigmoid",
+    "version": "5.0.0"
+  },
+  "sep_token_id": 50282,
+  "sparse_pred_ignore_index": -100,
+  "sparse_prediction": false,
+  "torch_dtype": "float32",
+  "transformers_version": "4.52.4",
+  "vocab_size": 50368
+}

model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5bd289c4ec77b5fa46480fbfda4c2ffee8b29fe2ffbec5466ff796f1dfc63b8b
+size 598436708

optimizer.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:347a6fcb7d9239e1d15fce7b1771c2d954f5aec13547a51df8a45bb6336eff21
+size 1196961739

rng_state_0.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:243a4925489c190ee58a88c7361972d99c4762450fc884f52e3e46de42986d61
+size 16389

rng_state_1.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ad4d35093cde1af968aa653fe59ed06c71e182eeae5add06dcba2a1696f9c44a
+size 16389

rng_state_2.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d2815c0d98d71c1752edf1e56158a78643505d99b9adb763317f56db13fba11c
+size 16389

rng_state_3.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:91a8923ce784f0d1596d46d0ce31824c4432ced50ccac53b2eaa2289d383bd17
+size 16389

rng_state_4.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:55c32d07cb66cc830b633f109ae405bf9f451747291e377d14d8b654944e8472
+size 16389

rng_state_5.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a8a73c6adaaa388a14609ba29c9003671b41e7f6fd97b0d960e06f7aa4cc5ba3
+size 16389

rng_state_6.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cd21b528d8b7d76abf247ea00694429c53f6d11ab8158d9071f96c39ab938805
+size 16389

rng_state_7.pth ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f59970869f06ddf9a28a0736381e30ba7b2bfb9cba0beb65c41a0e81c4751064
+size 16389

scheduler.pt ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5bc50cd91c9364b8f5f7be52e743b30cf1c0d97c97475282e10e4b9e2abe99a5
+size 1465

special_tokens_map.json ADDED Viewed

	@@ -0,0 +1,37 @@

+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": true,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED Viewed

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED Viewed

	@@ -0,0 +1,952 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "|||IP_ADDRESS|||",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "1": {
+      "content": "<|padding|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50254": {
+      "content": "                        ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50255": {
+      "content": "                       ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50256": {
+      "content": "                      ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50257": {
+      "content": "                     ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50258": {
+      "content": "                    ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50259": {
+      "content": "                   ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50260": {
+      "content": "                  ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50261": {
+      "content": "                 ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50262": {
+      "content": "                ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50263": {
+      "content": "               ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50264": {
+      "content": "              ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50265": {
+      "content": "             ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50266": {
+      "content": "            ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50267": {
+      "content": "           ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50268": {
+      "content": "          ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50269": {
+      "content": "         ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50270": {
+      "content": "        ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50271": {
+      "content": "       ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50272": {
+      "content": "      ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50273": {
+      "content": "     ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50274": {
+      "content": "    ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50275": {
+      "content": "   ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50276": {
+      "content": "  ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50277": {
+      "content": "|||EMAIL_ADDRESS|||",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50278": {
+      "content": "|||PHONE_NUMBER|||",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50279": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50280": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50281": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50282": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50283": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50284": {
+      "content": "[MASK]",
+      "lstrip": true,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50285": {
+      "content": "[unused0]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50286": {
+      "content": "[unused1]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50287": {
+      "content": "[unused2]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50288": {
+      "content": "[unused3]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50289": {
+      "content": "[unused4]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50290": {
+      "content": "[unused5]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50291": {
+      "content": "[unused6]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50292": {
+      "content": "[unused7]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50293": {
+      "content": "[unused8]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50294": {
+      "content": "[unused9]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50295": {
+      "content": "[unused10]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50296": {
+      "content": "[unused11]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50297": {
+      "content": "[unused12]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50298": {
+      "content": "[unused13]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50299": {
+      "content": "[unused14]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50300": {
+      "content": "[unused15]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50301": {
+      "content": "[unused16]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50302": {
+      "content": "[unused17]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50303": {
+      "content": "[unused18]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50304": {
+      "content": "[unused19]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50305": {
+      "content": "[unused20]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50306": {
+      "content": "[unused21]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50307": {
+      "content": "[unused22]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50308": {
+      "content": "[unused23]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50309": {
+      "content": "[unused24]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50310": {
+      "content": "[unused25]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50311": {
+      "content": "[unused26]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50312": {
+      "content": "[unused27]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50313": {
+      "content": "[unused28]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50314": {
+      "content": "[unused29]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50315": {
+      "content": "[unused30]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50316": {
+      "content": "[unused31]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50317": {
+      "content": "[unused32]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50318": {
+      "content": "[unused33]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50319": {
+      "content": "[unused34]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50320": {
+      "content": "[unused35]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50321": {
+      "content": "[unused36]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50322": {
+      "content": "[unused37]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50323": {
+      "content": "[unused38]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50324": {
+      "content": "[unused39]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50325": {
+      "content": "[unused40]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50326": {
+      "content": "[unused41]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50327": {
+      "content": "[unused42]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50328": {
+      "content": "[unused43]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50329": {
+      "content": "[unused44]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50330": {
+      "content": "[unused45]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50331": {
+      "content": "[unused46]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50332": {
+      "content": "[unused47]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50333": {
+      "content": "[unused48]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50334": {
+      "content": "[unused49]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50335": {
+      "content": "[unused50]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50336": {
+      "content": "[unused51]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50337": {
+      "content": "[unused52]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50338": {
+      "content": "[unused53]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50339": {
+      "content": "[unused54]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50340": {
+      "content": "[unused55]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50341": {
+      "content": "[unused56]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50342": {
+      "content": "[unused57]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50343": {
+      "content": "[unused58]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50344": {
+      "content": "[unused59]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50345": {
+      "content": "[unused60]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50346": {
+      "content": "[unused61]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50347": {
+      "content": "[unused62]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50348": {
+      "content": "[unused63]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50349": {
+      "content": "[unused64]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50350": {
+      "content": "[unused65]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50351": {
+      "content": "[unused66]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50352": {
+      "content": "[unused67]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50353": {
+      "content": "[unused68]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50354": {
+      "content": "[unused69]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50355": {
+      "content": "[unused70]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50356": {
+      "content": "[unused71]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50357": {
+      "content": "[unused72]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50358": {
+      "content": "[unused73]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50359": {
+      "content": "[unused74]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50360": {
+      "content": "[unused75]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50361": {
+      "content": "[unused76]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50362": {
+      "content": "[unused77]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50363": {
+      "content": "[unused78]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50364": {
+      "content": "[unused79]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50365": {
+      "content": "[unused80]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50366": {
+      "content": "[unused81]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50367": {
+      "content": "[unused82]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "[CLS]",
+  "extra_special_tokens": {},
+  "mask_token": "[MASK]",
+  "max_length": 1024,
+  "model_input_names": [
+    "input_ids",
+    "attention_mask"
+  ],
+  "model_max_length": 1024,
+  "pad_to_multiple_of": null,
+  "pad_token": "[PAD]",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
+  "sep_token": "[SEP]",
+  "stride": 0,
+  "tokenizer_class": "PreTrainedTokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
+  "unk_token": "[UNK]"
+}

trainer_state.json ADDED

	@@ -0,0 +1,119 @@

+{
+  "best_global_step": 500,
+  "best_metric": 0.03948367014527321,
+  "best_model_checkpoint": "/teamspace/studios/this_studio/secure_modern_bert/Models/reranker-securebert2-cmnrl/checkpoint-500",
+  "epoch": 2.242152466367713,
+  "eval_steps": 500,
+  "global_step": 500,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.004484304932735426,
+      "grad_norm": 3.019306495843921e-05,
+      "learning_rate": 0.0,
+      "loss": 0.0007,
+      "step": 1
+    },
+    {
+      "epoch": 0.2242152466367713,
+      "grad_norm": 0.4142322838306427,
+      "learning_rate": 4.394618834080718e-06,
+      "loss": 0.0031,
+      "step": 50
+    },
+    {
+      "epoch": 0.4484304932735426,
+      "grad_norm": 0.06909839808940887,
+      "learning_rate": 8.878923766816144e-06,
+      "loss": 0.0023,
+      "step": 100
+    },
+    {
+      "epoch": 0.672645739910314,
+      "grad_norm": 3.889437675476074,
+      "learning_rate": 1.3363228699551571e-05,
+      "loss": 0.0029,
+      "step": 150
+    },
+    {
+      "epoch": 0.8968609865470852,
+      "grad_norm": 0.008452645502984524,
+      "learning_rate": 1.7847533632286997e-05,
+      "loss": 0.0029,
+      "step": 200
+    },
+    {
+      "epoch": 1.1210762331838564,
+      "grad_norm": 0.017008045688271523,
+      "learning_rate": 1.974090682610862e-05,
+      "loss": 0.0034,
+      "step": 250
+    },
+    {
+      "epoch": 1.3452914798206277,
+      "grad_norm": 0.12214717268943787,
+      "learning_rate": 1.9242650722471353e-05,
+      "loss": 0.0029,
+      "step": 300
+    },
+    {
+      "epoch": 1.5695067264573992,
+      "grad_norm": 0.01061748992651701,
+      "learning_rate": 1.8744394618834082e-05,
+      "loss": 0.0034,
+      "step": 350
+    },
+    {
+      "epoch": 1.7937219730941703,
+      "grad_norm": 0.017459532245993614,
+      "learning_rate": 1.824613851519681e-05,
+      "loss": 0.0048,
+      "step": 400
+    },
+    {
+      "epoch": 2.0179372197309418,
+      "grad_norm": 0.0639125257730484,
+      "learning_rate": 1.7747882411559544e-05,
+      "loss": 0.0064,
+      "step": 450
+    },
+    {
+      "epoch": 2.242152466367713,
+      "grad_norm": 1.5931968688964844,
+      "learning_rate": 1.7249626307922273e-05,
+      "loss": 0.0052,
+      "step": 500
+    },
+    {
+      "epoch": 2.242152466367713,
+      "eval_loss": 0.03948367014527321,
+      "eval_runtime": 38.1763,
+      "eval_samples_per_second": 233.836,
+      "eval_steps_per_second": 1.467,
+      "step": 500
+    }
+  ],
+  "logging_steps": 50,
+  "max_steps": 2230,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 10,
+  "save_steps": 500,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 0.0,
+  "train_batch_size": 20,
+  "trial_name": null,
+  "trial_params": null
+}

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1c53fcf7e62c84e8f4247e66c55343aa03d38abe8cd9f386f43ce2f11db8acd3
+size 6161