---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:400
- loss:MultipleNegativesRankingLoss
base_model: Qwen/Qwen3-Embedding-0.6B
widget:
- source_sentence: "Wrapper for calling the clean method of services attribute\n\n\
    \        :return: None"
  sentences:
  - "def import_from_nhmmer_table(hmmout_path):\n        \n        \n        \n  \
    \      \n        res=HMMSearchResult()\n        res.fields = [\n             \
    \          SequenceSearchResult.QUERY_ID_FIELD,\n                       SequenceSearchResult.HMM_NAME_FIELD,\n\
    \                       SequenceSearchResult.ALIGNMENT_LENGTH_FIELD,\n       \
    \                SequenceSearchResult.QUERY_FROM_FIELD,\n                    \
    \   SequenceSearchResult.QUERY_TO_FIELD,\n                       SequenceSearchResult.HIT_FROM_FIELD,\n\
    \                       SequenceSearchResult.HIT_TO_FIELD,\n                 \
    \      SequenceSearchResult.ALIGNMENT_BIT_SCORE,\n                       SequenceSearchResult.ALIGNMENT_DIRECTION,\n\
    \                       ]\n        \n        for row in [x.rstrip().split() for\
    \ x in open(hmmout_path) if not x.startswith()]:\n            alifrom    = int(row[6])\n\
    \            alito      = int(row[7])\n            aln_length = (alito-alifrom\
    \ if alito-alifrom>0 else alifrom-alito)\n            res.results.append([row[0],\n\
    \                                row[2],\n                                aln_length,\n\
    \                                int(row[4]),\n                              \
    \  int(row[5]),\n                                alifrom,\n                  \
    \              alito,\n                                row[13],\n            \
    \                    alito > alifrom\n                                ])\n   \
    \     return res"
  - "def clean(self):\n        \n        logger.debug(\"Cleaning configuration objects\
    \ before configuration sending:\")\n        types_creations = self.__class__.types_creations\n\
    \        for o_type in types_creations:\n            (_, _, inner_property, _,\
    \ _) = types_creations[o_type]\n            logger.debug(\"  . for %s\", inner_property,\
    \ )\n            inner_object = getattr(self, inner_property)\n            inner_object.clean()"
  - "def index_modules(idx=None, path=None):\n    \n    suppress_output()\n    modules\
    \ = defaultdict(list)\n    pkglist = pkgutil.walk_packages(onerror=lambda x: True)\n\
    \    print(pkglist)\n    if path:\n        pkglist = pkgutil.walk_packages(path,\
    \ onerror=lambda x: True)\n    for modl, name, ispkg in pkglist:\n        try:\n\
    \            path = os.path.join(modl.path, name.split()[-1])\n        except\
    \ AttributeError:\n            \n            continue\n\n        if os.path.isdir(path):\n\
    \            path = os.path.join(path, )\n        path += \n\n        objs = []\n\
    \n        if os.path.exists(path):\n            try:\n                objs = read_objs_from_path(path)\n\
    \            except:\n                continue\n        elif not re.search(MODULE_BLACKLIST,\
    \ name):\n            try:\n                mod = __import__(name)\n         \
    \       objs = [k for k in dir(mod) if not k.startswith()]\n            except:\n\
    \                continue\n        else:\n            continue\n\n        for\
    \ obj in objs:\n            if name not in modules[obj]:\n                modules[obj].append(name)\n\
    \    suppress_output(True)\n    return merge_dicts(idx, dict(modules))"
- source_sentence: Try to import the aeneas package and return ``True`` if that fails.
  sentences:
  - "def check_import():\n    \n    try:\n        import aeneas\n        print_success(u\"\
    aeneas         OK\")\n        return False\n    except ImportError:\n        print_error(u\"\
    aeneas         ERROR\")\n        print_info(u\"  Unable to load the aeneas Python\
    \ package\")\n        print_info(u\"  This error is probably caused by:\")\n \
    \       print_info(u\"    A. you did not download/git-clone the aeneas package\
    \ properly; or\")\n        print_info(u\"    B. you did not install the required\
    \ Python packages:\")\n        print_info(u\"      1. BeautifulSoup4\")\n    \
    \    print_info(u\"      2. lxml\")\n        print_info(u\"      3. numpy\")\n\
    \    except Exception as e:\n        print_error(e)\n    return True"
  - "def simplify(source, kink=20):\n    \n    source_coord = map(lambda o: {\"lng\"\
    : o.coordinates[0], \"lat\": o.coordinates[1]}, source)\n\n    \n    \n    \n\
    \    F = (math.pi / 180.0) * 0.5\n    index = [] \n    sig_start = [] \n    sig_end\
    \ = []\n\n    \n    count = len(source_coord)\n    if count < 3:\n        return\
    \ source_coord \n\n    \n\n    band_sqr = kink * 360.0 / (2.0 * math.pi * 6378137.0)\
    \ \n    band_sqr *= band_sqr\n    n_dest = 0\n    sig_start[0] = 0\n    sig_end[0]\
    \ = count - 1\n    n_stack = 1\n\n    \n    while n_stack > 0:\n        \n   \
    \     start = sig_start[n_stack - 1]\n        end = sig_end[n_stack - 1]\n   \
    \     n_stack -= 1\n\n        if (end - start) > 1: \n            \n         \
    \   x12 = source[end][\"lng\"] - source[start][\"lng\"]\n            y12 = source[end][\"\
    lat\"] - source[start][\"lat\"]\n            if math.fabs(x12) > 180.0:\n    \
    \            x12 = 360.0 - math.fabs(x12)\n            x12 *= math.cos(F * (source[end][\"\
    lat\"] + source[start][\"lat\"])) \n            d12 = (x12 * x12) + (y12 * y12)\n\
    \n            i = start + 1\n            sig = start\n            max_dev_sqr\
    \ = -1.0\n            while i < end:\n                x13 = source[i][\"lng\"\
    ] - source[start][\"lng\"]\n                y13 = source[i][\"lat\"] - source[start][\"\
    lat\"]\n                if math.fabs(x13) > 180.0:\n                    x13 =\
    \ 360.0 - math.fabs(x13)\n                x13 *= math.cos(F * (source[i][\"lat\"\
    ] + source[start][\"lat\"]))\n                d13 = (x13 * x13) + (y13 * y13)\n\
    \                x23 = source[i][\"lng\"] - source[end][\"lng\"]\n           \
    \     y23 = source[i][\"lat\"] - source[end][\"lat\"]\n                if math.fabs(x23)\
    \ > 180.0:\n                    x23 = 360.0 - math.fabs(x23)\n               \
    \ x23 *= math.cos(F * (source[i][\"lat\"] + source[end][\"lat\"]))\n         \
    \       d23 = (x23 * x23) + (y23 * y23)\n\n                if d13 >= (d12 + d23):\n\
    \                    dev_sqr = d23\n                elif d23 >= (d12 + d13):\n\
    \                    dev_sqr = d13\n                else:\n                  \
    \  dev_sqr = (x13 * y12 - y13 * x12) * (x13 * y12 - y13 * x12) / d12 \n      \
    \          if dev_sqr > max_dev_sqr:\n                    sig = i\n          \
    \          max_dev_sqr = dev_sqr\n                i += 1\n\n\n            if max_dev_sqr\
    \ < band_sqr: \n            \n                index[n_dest] = start\n        \
    \        n_dest += 1\n            else: \n                n_stack += 1\n     \
    \           sig_start[n_stack - 1] = sig\n                sig_end[n_stack - 1]\
    \ = end\n                n_stack += 1\n                sig_start[n_stack - 1]\
    \ = start\n                sig_end[n_stack - 1] = sig\n\n        else:  \n   \
    \         index[n_dest] = start\n            n_dest += 1\n\n    \n    index[n_dest]\
    \ = count - 1\n    n_dest += 1\n\n    \n    r = []\n    for i in range(0, n_dest):\n\
    \        r.append(source_coord[index[i]])\n\n    return map(lambda o:  {\"type\"\
    : \"Point\",\"coordinates\": [o.lng, o.lat]}, r)"
  - "def smooth(data, fw):\r\n    \r\n    if fw == 0:\r\n        fdata = data\r\n\
    \    else:\r\n        fdata = lfilter(np.ones(fw)/fw, 1, data)\r\n    return fdata"
- source_sentence: Start response processing.
  sentences:
  - "async def start(self, connection: ) -> :\n        \n        self._closed = False\n\
    \        self._protocol = connection.protocol\n        self._connection = connection\n\
    \n        with self._timer:\n            while True:\n                \n     \
    \           try:\n                    message, payload = await self._protocol.read()\
    \  \n                except http.HttpProcessingError as exc:\n               \
    \     raise ClientResponseError(\n                        self.request_info, self.history,\n\
    \                        status=exc.code,\n                        message=exc.message,\
    \ headers=exc.headers) from exc\n\n                if (message.code < 100 or\n\
    \                        message.code > 199 or message.code == 101):\n       \
    \             break\n\n                if self._continue is not None:\n      \
    \              set_result(self._continue, True)\n                    self._continue\
    \ = None\n\n        \n        payload.on_eof(self._response_eof)\n\n        \n\
    \        self.version = message.version\n        self.status = message.code\n\
    \        self.reason = message.reason\n\n        \n        self._headers = message.headers\
    \  \n        self._raw_headers = message.raw_headers  \n\n        \n        self.content\
    \ = payload\n\n        \n        for hdr in self.headers.getall(hdrs.SET_COOKIE,\
    \ ()):\n            try:\n                self.cookies.load(hdr)\n           \
    \ except CookieError as exc:\n                client_logger.warning(\n       \
    \             , exc)\n        return self"
  - "def solve(self, verbose=False, allow_brute_force=True):\n        \n        while\
    \ not self.is_solved:\n            \n            self._update()\n\n          \
    \  \n            singles_found = False or self._fill_naked_singles() or self._fill_hidden_singles()\n\
    \n            \n            \n            \n            if not singles_found:\n\
    \                if allow_brute_force:\n                    solution = None\n\
    \                    try:\n                        dlxs = DancingLinksSolver(copy.deepcopy(self._matrix))\n\
    \                        solutions = dlxs.solve()\n                        solution\
    \ = next(solutions)\n                        more_solutions = next(solutions)\n\
    \                    except StopIteration as e:\n                        if solution\
    \ is not None:\n                            self._matrix = solution\n        \
    \                else:\n                            raise SudokuHasNoSolutionError(\"\
    Dancing Links solver could not find any solution.\")\n                    except\
    \ Exception as e:\n                        raise SudokuHasNoSolutionError(\"Brute\
    \ Force method failed.\")\n                    else:\n                       \
    \ \n                        \n                        raise SudokuHasMultipleSolutionsError(\"\
    This Sudoku has multiple solutions!\")\n                    self.solution_steps.append(\"\
    BRUTE FORCE - Dancing Links\")\n                    break\n                else:\n\
    \                    print(self)\n                    raise SudokuTooDifficultError(\"\
    This Sudoku requires more advanced methods!\")\n        if verbose:\n        \
    \    print(\"Sudoku solved in {0} iterations!\\n{1}\".format(len(self.solution_steps),\
    \ self))\n            for step in self.solution_steps:\n                print(step)"
  - "def get_peer_ips(self):\n        \n        presponse = [ord(i) for i in self.tracker_response[]]\n\
    \        while presponse:\n            peer_ip = ((.join(str(x) for x in presponse[0:4]),\n\
    \                       256 * presponse[4] + presponse[5]))\n            if peer_ip\
    \ not in self.peer_ips:\n                self.peer_ips.append(peer_ip)\n     \
    \       presponse = presponse[6:]"
- source_sentence: "Setter method for ipv6_phy_intf_cmds, mapped from YANG variable\
    \ /interface/fortygigabitethernet/ipv6/ipv6_phy_intf_cmds (container)\n    If\
    \ this variable is read-only (config: false) in the\n    source YANG file, then\
    \ _set_ipv6_phy_intf_cmds is considered as a private\n    method. Backends looking\
    \ to populate this variable should\n    do so via calling thisObj._set_ipv6_phy_intf_cmds()\
    \ directly."
  sentences:
  - "def _trim_xpath(self, xpath, prop):\n        \n\n        xroot = self._get_xroot_for(prop)\n\
    \n        if xroot is None and isinstance(xpath, string_types):\n            xtags\
    \ = xpath.split(XPATH_DELIM)\n\n            if xtags[-1] in _iso_tag_primitives:\n\
    \                xroot = XPATH_DELIM.join(xtags[:-1])\n\n        return xroot"
  - "def _set_ipv6_phy_intf_cmds(self, v, load=False):\n    \n    if hasattr(v, \"\
    _utype\"):\n      v = v._utype(v)\n    try:\n      t = YANGDynClass(v,base=ipv6_phy_intf_cmds.ipv6_phy_intf_cmds,\
    \ is_container=, presence=False, yang_name=\"ipv6-phy-intf-cmds\", rest_name=\"\
    \", parent=self, path_helper=self._path_helper, extmethods=self._extmethods, register_paths=True,\
    \ extensions={u: {u: u, u: None, u: u}}, namespace=, defining_module=, yang_type=,\
    \ is_config=True)\n    except (TypeError, ValueError):\n      raise ValueError({\n\
    \          : ,\n          : \"container\",\n          : ,\n        })\n\n    self.__ipv6_phy_intf_cmds\
    \ = t\n    if hasattr(self, ):\n      self._set()"
  - "def create_snapshot(self, xml_bytes):\n        \n        root = XML(xml_bytes)\n\
    \        snapshot_id = root.findtext(\"snapshotId\")\n        volume_id = root.findtext(\"\
    volumeId\")\n        status = root.findtext(\"status\")\n        start_time =\
    \ root.findtext(\"startTime\")\n        start_time = datetime.strptime(\n    \
    \        start_time[:19], \"%Y-%m-%dT%H:%M:%S\")\n        progress = root.findtext(\"\
    progress\")[:-1]\n        progress = float(progress or \"0\") / 100.\n       \
    \ return model.Snapshot(\n            snapshot_id, volume_id, status, start_time,\
    \ progress)"
- source_sentence: "Generates samples of text from the provided vocabulary.\n\n  Args:\n\
    \    plain_vocab: vocabulary.\n    distribution: distribution.\n    train_samples:\
    \ samples for training.\n    length: length.\n\n  Returns:\n    train_indices\
    \ (np.array of Integers): random integers for training.\n      shape = [num_samples,\
    \ length]\n    test_indices (np.array of Integers): random integers for testing.\n\
    \      shape = [num_samples, length]\n    plain_vocab   (list of Integers): unique\
    \ vocabularies."
  sentences:
  - "def late_filling(target, pressure=,\n                 Pc_star=,\n           \
    \      Swp_star=0.2, eta=3):\n    r\n    element = pressure.split()[0]\n    network\
    \ = target.project.network\n    phase = target.project.find_phase(target)\n  \
    \  pc_star = phase[Pc_star]\n    Pc = phase[pressure]\n    \n        Ts = network.map_throats(throats=target.Ts,\
    \ origin=target)\n        values = values[Ts]\n    else:\n        Ps = network.map_pores(pores=target.Ps,\
    \ origin=target)\n        values = values[Ps]\n    return values"
  - "def switch(self, name):\n        \n        try:\n            switch = self.storage[self.__namespaced(name)]\n\
    \        except KeyError:\n            if not self.autocreate:\n             \
    \   raise ValueError(\"No switch named  registered in \" % (name, self.namespace))\n\
    \n            switch = self.__create_and_register_disabled_switch(name)\n\n  \
    \      switch.manager = self\n        return switch"
  - "def generate_plaintext_random(plain_vocab, distribution, train_samples,\n   \
    \                           length):\n  \n  if distribution is not None:\n   \
    \ assert len(distribution) == len(plain_vocab)\n\n  train_indices = np.random.choice(\n\
    \      range(len(plain_vocab)), (train_samples, length), p=distribution)\n\n \
    \ return train_indices"
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@1
- cosine_ndcg@5
- cosine_ndcg@10
- cosine_mrr@1
- cosine_mrr@5
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on Qwen/Qwen3-Embedding-0.6B
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: Unknown
      type: unknown
    metrics:
    - type: cosine_accuracy@1
      value: 0.99
      name: Cosine Accuracy@1
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.99
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3333333333333334
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19999999999999996
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09999999999999998
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.99
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@1
      value: 0.99
      name: Cosine Ndcg@1
    - type: cosine_ndcg@5
      value: 0.9963092975357145
      name: Cosine Ndcg@5
    - type: cosine_ndcg@10
      value: 0.9963092975357145
      name: Cosine Ndcg@10
    - type: cosine_mrr@1
      value: 0.99
      name: Cosine Mrr@1
    - type: cosine_mrr@5
      value: 0.995
      name: Cosine Mrr@5
    - type: cosine_mrr@10
      value: 0.995
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.995
      name: Cosine Map@100
---

# SentenceTransformer based on Qwen/Qwen3-Embedding-0.6B

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. The training and evaluation pairs in this card match natural-language descriptions (<code>query</code>) to Python functions (<code>code</code>).

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) <!-- at revision c54f2e6e80b2d7b7de06f51cec4959f6b3e03418 -->
- **Maximum Sequence Length:** 32768 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 32768, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
```
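
The `Pooling` and `Normalize` modules above select the hidden state of the last token (`pooling_mode_lasttoken: True`) and scale it to unit length. The following is a minimal plain-Python sketch of those two steps on toy data; the function names `last_token_pool` and `l2_normalize` are illustrative and are not the library's internals.

```python
import math

def last_token_pool(token_embeddings, attention_mask):
    """Return the hidden state of the last non-padding token."""
    # Last position whose mask is 1; this works for left or right padding.
    last = max(i for i, m in enumerate(attention_mask) if m == 1)
    return token_embeddings[last]

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Toy input: 4 token positions with 3-dimensional hidden states.
# The final position is padding, so position 2 is the last real token.
hidden = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [3.0, 4.0, 0.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]

embedding = l2_normalize(last_token_pool(hidden, mask))
print(embedding)  # [0.6, 0.8, 0.0]
```

Because the output has unit length, the cosine similarity between two embeddings reduces to their dot product.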

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("JacobLinCool/Qwen3-Embedding-0.6B-GIR-1")
# Run inference
queries = [
    "Generates samples of text from the provided vocabulary.\n\n  Args:\n    plain_vocab: vocabulary.\n    distribution: distribution.\n    train_samples: samples for training.\n    length: length.\n\n  Returns:\n    train_indices (np.array of Integers): random integers for training.\n      shape = [num_samples, length]\n    test_indices (np.array of Integers): random integers for testing.\n      shape = [num_samples, length]\n    plain_vocab   (list of Integers): unique vocabularies.",
]
documents = [
    'def generate_plaintext_random(plain_vocab, distribution, train_samples,\n                              length):\n  \n  if distribution is not None:\n    assert len(distribution) == len(plain_vocab)\n\n  train_indices = np.random.choice(\n      range(len(plain_vocab)), (train_samples, length), p=distribution)\n\n  return train_indices',
    'def switch(self, name):\n        \n        try:\n            switch = self.storage[self.__namespaced(name)]\n        except KeyError:\n            if not self.autocreate:\n                raise ValueError("No switch named  registered in " % (name, self.namespace))\n\n            switch = self.__create_and_register_disabled_switch(name)\n\n        switch.manager = self\n        return switch',
    'def late_filling(target, pressure=,\n                 Pc_star=,\n                 Swp_star=0.2, eta=3):\n    r\n    element = pressure.split()[0]\n    network = target.project.network\n    phase = target.project.find_phase(target)\n    pc_star = phase[Pc_star]\n    Pc = phase[pressure]\n    \n        Ts = network.map_throats(throats=target.Ts, origin=target)\n        values = values[Ts]\n    else:\n        Ps = network.map_pores(pores=target.Ps, origin=target)\n        values = values[Ps]\n    return values',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 1024) (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.8344, -0.0822,  0.0233]])
```
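
The similarity matrix has one row per query and one column per document. A small sketch of turning a row into a ranking follows; the scores are hard-coded from the example output above so that the snippet runs without downloading the model, and `doc_names` and `rank_documents` are illustrative names.

```python
# Scores copied from the example output above: one row per query.
similarities = [[0.8344, -0.0822, 0.0233]]
# Function names of the three example documents, in the same order.
doc_names = ["generate_plaintext_random", "switch", "late_filling"]

def rank_documents(scores_row):
    """Return document indices sorted from most to least similar."""
    return sorted(range(len(scores_row)), key=lambda j: scores_row[j], reverse=True)

for q, row in enumerate(similarities):
    order = rank_documents(row)
    print(f"query {q}: best match is {doc_names[order[0]]}")
    for j in order:
        print(f"  {row[j]:+.4f}  {doc_names[j]}")
```

Here the matching function `generate_plaintext_random` ranks first, well ahead of the two unrelated functions.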

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval

* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.99       |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.99       |
| cosine_precision@3  | 0.3333     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.99       |
| cosine_recall@3     | 1.0        |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@1       | 0.99       |
| cosine_ndcg@5       | 0.9963     |
| **cosine_ndcg@10**  | **0.9963** |
| cosine_mrr@1        | 0.99       |
| cosine_mrr@5        | 0.995      |
| cosine_mrr@10       | 0.995      |
| cosine_map@100      | 0.995      |
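
A precision@3 of 0.3333 alongside a recall@3 of 1.0 is what one would expect if each query has exactly one relevant code snippet. That is an assumption here, suggested by the paired <code>query</code> and <code>code</code> columns. Under that assumption, every metric follows from the rank of the single relevant document. The sketch below uses a hypothetical rank list (99 queries at rank 1 and one at rank 2), chosen because it reproduces the reported values for a 100-query evaluation set; the actual per-query ranks are not given in this card.

```python
import math

def ir_metrics(ranks, k):
    """Metrics at cutoff k when each query has exactly one relevant document.

    ranks: 1-based rank of the relevant document for each query.
    """
    n = len(ranks)
    hits = [r for r in ranks if r <= k]  # queries whose relevant doc is in the top k
    return {
        "accuracy": len(hits) / n,
        "precision": len(hits) / (k * n),  # at most 1 relevant doc among k results
        "recall": len(hits) / n,
        "mrr": sum(1.0 / r for r in hits) / n,
        # Ideal DCG is 1 with a single relevant document, so NDCG equals DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r in hits) / n,
    }

# Hypothetical ranks, not taken from the evaluation logs.
ranks = [1] * 99 + [2]
at1, at3, at10 = ir_metrics(ranks, 1), ir_metrics(ranks, 3), ir_metrics(ranks, 10)
print(round(at1["accuracy"], 4), round(at3["precision"], 4),
      round(at10["mrr"], 4), round(at10["ndcg"], 4))
# 0.99 0.3333 0.995 0.9963
```

With one relevant document per query, precision@k cannot exceed 1/k, so the low precision@5 and precision@10 values do not indicate retrieval errors.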

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 400 training samples
* Columns: <code>query</code> and <code>code</code>
* Approximate statistics based on the first 400 samples:
  |         | query                                                                               | code                                                                                  |
  |:--------|:------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                                |
  | details | <ul><li>min: 2 tokens</li><li>mean: 67.12 tokens</li><li>max: 3156 tokens</li></ul> | <ul><li>min: 24 tokens</li><li>mean: 126.98 tokens</li><li>max: 1236 tokens</li></ul> |
* Samples:
  | query                                                                                                                                                                                                                                                                                 | code                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
  |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>For memory actions, get a list of addresses it operates on.<br><br>        :param SimAction action: The action object to work with.<br>        :return:                 A list of addresses that are accessed with that action.<br>        :rtype:                  list</code> | <code>def _get_actual_addrs(action, state):<br>        <br><br>        if action.actual_addrs is None:<br>            <br>                addr_list = {0x60000000}  <br>        else:<br>            addr_list = set(action.actual_addrs)<br><br>        return addr_list</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
  | <code>Construct the input file of the calculation.</code>                                                                                                                                                                                                                             | <code>def make_input(self, with_header=False):<br>        <br>        s = str(self.input)<br>        if with_header: s = str(self) + "\n" + s<br>        return s</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
  | <code>Check worker status route</code>                                                                                                                                                                                                                                                | <code>def check_worker_status():<br>    <br>    if  not in request.args:<br>        resp = {"status": "bad request"}<br>        return jsonify(**resp)<br>    else:<br>        worker_id = request.args[]<br>        assignment_id = request.args[]<br>        allow_repeats = CONFIG.getboolean(, )<br>        if allow_repeats: <br>            try:<br>                part = Participant.query.\<br>                    filter(Participant.workerid == worker_id).\<br>                    filter(Participant.assignmentid == assignment_id).one()<br>                status = part.status<br>            except exc.SQLAlchemyError:<br>                status = NOT_ACCEPTED<br>        else: <br>            try:<br>                matches = Participant.query.\<br>                    filter(Participant.workerid == worker_id).all()<br>                numrecs = len(matches)<br>                if numrecs==0: <br>                    status = NOT_ACCEPTED<br>                else:<br>                    status = max([record.status for record in matches])<br>            except exc.SQLAlchemyError:<br>  ...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
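
As a rough illustration of what these parameters mean, the sketch below scores every query in a batch against every code snippet with cosine similarity, multiplies the scores by `scale`, and applies cross-entropy so that the matching pair on the diagonal is the target. It is a plain-Python approximation for intuition, not the library's implementation; the helper names (`cos_sim`, `mnr_loss`) and the toy 2-D embeddings are made up for this example.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(query_embs, code_embs, scale=20.0):
    """Mean cross-entropy over the batch; row i's correct column is i.

    Every other code snippet in the batch acts as an in-batch negative.
    """
    losses = []
    for i, query in enumerate(query_embs):
        logits = [scale * cos_sim(query, code) for code in code_embs]
        # Numerically stable log-sum-exp over the row of scaled similarities.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(v - top) for v in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical values, chosen only to show the effect).
queries = [[1.0, 0.0], [0.0, 1.0]]
codes_aligned = [[0.9, 0.1], [0.2, 0.8]]   # each code is close to its own query
codes_swapped = [[0.2, 0.8], [0.9, 0.1]]   # each code is close to the other query

loss_aligned = mnr_loss(queries, codes_aligned)
loss_swapped = mnr_loss(queries, codes_swapped)
print(f"aligned: {loss_aligned:.4f}  swapped: {loss_swapped:.4f}")
```

A larger `scale` sharpens the softmax, so well-separated pairs approach zero loss while mismatched pairs are penalized more heavily.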

### Evaluation Dataset

#### Unnamed Dataset

* Size: 100 evaluation samples
* Columns: <code>query</code> and <code>code</code>
* Approximate statistics based on the first 100 samples:
  |         | query                                                                              | code                                                                                 |
  |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
  | type    | string                                                                             | string                                                                               |
  | details | <ul><li>min: 5 tokens</li><li>mean: 66.56 tokens</li><li>max: 548 tokens</li></ul> | <ul><li>min: 24 tokens</li><li>mean: 142.11 tokens</li><li>max: 901 tokens</li></ul> |
* Samples:
  | query | code |
  |:------|:-----|
  | <code>Return the value of the android prefixed attribute in a specific tag.<br><br>        This function will always try to get the attribute with a android: prefix first,<br>        and will try to return the attribute without the prefix, if the attribute could not be found.<br>        This is useful for some broken AndroidManifest.xml, where no android namespace is set,<br>        but could also indicate malicious activity (i.e. wrongly repackaged files).<br>        A warning is printed if the attribute is found without a namespace prefix.<br><br>        If you require to get the exact result you need to query the tag directly:<br><br>        example::<br>            >>> from lxml.etree import Element<br>            >>> tag = Element('bar', nsmap={'android': 'http://schemas.android.com/apk/res/android'})<br>            >>> tag.set('{http://schemas.android.com/apk/res/android}foobar', 'barfoo')<br>            >>> tag.set('name', 'baz')<br>            # Assume that `a` is some APK object<br>            >>> a.get_value_from_tag(tag, 'name'...</code> | <code>def get_value_from_tag(self, tag, attribute):<br>        <br><br>        <br>        <br>        value = tag.get(self._ns(attribute))<br>        if value is None:<br>            value = tag.get(attribute)<br><br>            if value:<br>                <br>                log.warning("Failed to get the attribute  on tag  with namespace. "<br>                            "But found the same attribute without namespace!".format(attribute, tag.tag))<br>        return value</code> |
  | <code>Get information about this object as a dictionary.  Used by WebSocket interface to pass some<br>            relevant information to client applications.</code> | <code>def get_as_datadict(self):<br>        <br>        return dict(type=self.__class__.__name__, tags=list(self.tags))</code> |
  | <code>Makes forecast with the estimated model<br><br>        Parameters<br>        ----------<br>        h : int (default : 5)<br>            How many steps ahead would you like to forecast?<br><br>        past_values : int (default : 20)<br>            How many past observations to show on the forecast graph?<br><br>        intervals : Boolean<br>            Would you like to show 95% prediction intervals for the forecast?<br><br>        Returns<br>        ----------<br>        - Plot of the forecast</code> | <code>def plot_predict(self,h=5,past_values=20,intervals=True,**kwargs):      <br>        <br>        import matplotlib.pyplot as plt<br>        import seaborn as sns<br><br>        figsize = kwargs.get(,(10,7))<br><br>        if self.latent_variables.estimated is False:<br>            raise Exception("No latent variables estimated!")<br>        else:<br>            <br>            scale, shape, skewness = self._get_scale_and_shape(self.latent_variables.get_z_values(transformed=True))<br>            previous_value = self.data[-1]  <br>            forecasted_values = np.ones(h)*self.states[-1]  <br>            date_index = self.shift_dates(h)<br>            simulations = 10000<br>            sim_vector = np.zeros([simulations,h])<br>            t_params = self.transform_z()<br><br>            for n in range(0,simulations):  <br>                rnd_q = np.random.normal(0,np.sqrt(self.latent_variables.get_z_values(transformed=True)[0]),h) <br>                exp = forecasted_values.copy()<br><br>                for t in range(0,h):<br>                    if t == 0:...</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `seed`: 2025
- `bf16`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch
- `push_to_hub`: True
- `hub_model_id`: JacobLinCool/Qwen3-Embedding-0.6B-GIR-1
- `hub_private_repo`: False
- `gradient_checkpointing`: True
- `eval_on_start`: True
- `batch_sampler`: no_duplicates

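To see how these settings interact, the sketch below works through the arithmetic for a run of this size: 400 training pairs (the `dataset_size` tag in the card metadata) at a batch size of 64 for one epoch, with `warmup_ratio` 0.1 and the linear scheduler and `learning_rate` of 5e-05 listed under All Hyperparameters. It assumes a single device, no gradient accumulation, and that the `no_duplicates` sampler does not change the number of batches; the function names are illustrative, not library APIs.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of optimizer steps in one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def linear_lr(step, total_steps, warmup_steps, base_lr):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


num_epochs = 1
total = steps_per_epoch(400, 64) * num_epochs   # ceil(400 / 64) = 7
warmup = math.ceil(total * 0.1)                 # warmup_ratio 0.1 rounds up to 1 step
schedule = [linear_lr(s, total, warmup, 5e-05) for s in range(total + 1)]

print(f"total steps: {total}, warmup steps: {warmup}")
for step, lr in enumerate(schedule):
    print(f"step {step}: lr = {lr:.2e}")
```

Under these assumptions the run lasts 7 steps, which agrees with the step count in the training logs, and only the first step is spent warming up.
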
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 2025
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: None
- `hub_model_id`: JacobLinCool/Qwen3-Embedding-0.6B-GIR-1
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: True
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: True
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch   | Step  | Validation Loss | cosine_ndcg@10 |
|:-------:|:-----:|:---------------:|:--------------:|
| 0       | 0     | 0.0616          | 0.9926         |
| **1.0** | **7** | **0.0358**      | **0.9963**     |
| -1      | -1    | -               | 0.9963         |

* The bold row denotes the saved checkpoint.

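For readers unfamiliar with the `cosine_ndcg@10` column, the sketch below computes NDCG@10 from a ranked list of relevance labels. It assumes binary relevance with a single correct code snippet per query, ranked by cosine similarity; the card does not state which evaluator produced the numbers, so treat this as a generic definition of the metric rather than the exact evaluation code. The function name and the example scenario are hypothetical.

```python
import math


def ndcg_at_k(ranked_relevance, k=10):
    """NDCG@k for one query, given relevance labels in ranked order."""
    def dcg(labels):
        # Position 0 is rank 1, so the discount is log2(rank + 1) = log2(pos + 2).
        return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(labels[:k]))

    ideal = sorted(ranked_relevance, reverse=True)
    idcg = dcg(ideal)
    return dcg(ranked_relevance) / idcg if idcg > 0 else 0.0


top_hit = ndcg_at_k([1, 0, 0])          # correct snippet ranked first
second_hit = ndcg_at_k([0, 1, 0])       # correct snippet ranked second
missed = ndcg_at_k([0] * 10 + [1])      # correct snippet outside the top 10

# Hypothetical scenario: 99 of 100 queries ranked first, one ranked second.
scenario_mean = (99 * top_hit + 1 * second_hit) / 100

print(top_hit, round(second_hit, 4), missed, round(scenario_mean, 4))
```

With one relevant item per query, the score depends only on the rank of the correct snippet, so a value near 1.0 means almost every query retrieves its snippet at the top. The scenario above is only one distribution of ranks that happens to average to about 0.9963; it is not a claim about the actual per-query results.
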
### Framework Versions
- Python: 3.11.11
- Sentence Transformers: 5.1.1
- Transformers: 4.56.2
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.1.1
- Tokenizers: 0.22.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
