pandora-s Rocketknight1 HF Staff committed on
Commit 1c56091 · verified · 1 Parent(s): 18f5d8b

Align tokenizer with mistral-common (#45)

- Align tokenizer with mistral-common (53f216c52ce4534a38a71c21861acd514fa8a904)
- Defend the honour of the Hugging Face tokenizer (684c1751c210aa11e0b187c0eac1b7b2bd4d7967)
- Update to tokenizer v3 with correct proper special tokens (106a1b0c338ddbd0e3e42dbeb63634bc85d6f71b)
- Re-add chat template (3256c7e7ea279386e0cdd18553202ed78c4d735b)


Co-authored-by: Matthew Carrigan <[email protected]>

Files changed (4)
  1. README.md +0 -5
  2. tokenizer.json +0 -0
  3. tokenizer.model +2 -2
  4. tokenizer_config.json +0 -0
README.md CHANGED
@@ -13,11 +13,6 @@ extra_gated_description: If you want to learn more about how we process your per
 
  # Model Card for Codestral-22B-v0.1
 
- ###
-
- > [!WARNING]
- > 🚫
- > The `transformers` tokenizer is not properly configured. Make sure that your encoding and decoding is correct by using `mistral-common` as shown below:
 
  ## Encode and Decode with `mistral_common`
 
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
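
Because the tokenizer.json diff is not rendered, the effect of the alignment can only be checked by comparing outputs. The following is a minimal sketch, not part of this commit: `first_mismatch` is a hypothetical helper that compares two token id sequences and reports where they first diverge. It assumes the ids have already been produced elsewhere, for example by `mistral-common` as the reference and by the `transformers` tokenizer as the candidate.

```python
from itertools import zip_longest
from typing import Optional, Sequence, Tuple

Mismatch = Tuple[int, Optional[int], Optional[int]]


def first_mismatch(
    reference_ids: Sequence[int], candidate_ids: Sequence[int]
) -> Optional[Mismatch]:
    """Return (index, reference_id, candidate_id) at the first difference.

    Returns None when both sequences are identical. If one sequence is
    shorter, its missing entry is reported as None.
    """
    pairs = zip_longest(reference_ids, candidate_ids)
    for index, (ref_id, cand_id) in enumerate(pairs):
        if ref_id != cand_id:
            return (index, ref_id, cand_id)
    return None


# Illustrative ids only; these are not real Codestral token ids.
print(first_mismatch([1, 2, 3], [1, 2, 3]))
print(first_mismatch([1, 2, 3], [1, 5, 3]))
```

Wiring the helper to the two real tokenizers is left out here because it requires the model files; a result of `None` over a representative set of prompts would suggest the two encoders agree on those prompts.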
 
tokenizer.model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:37f00374dea48658ee8f5d0f21895b9bc55cb0103939607c8185bfd1c6ca1f89
- size 587404
+ oid sha256:9addc8bdce5988448ae81b729336f43a81262160ae8da760674badab9d4c7d33
+ size 587591
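
The tokenizer.model change above is a Git LFS pointer update: the repository stores only the SHA-256 digest and byte size of the real file. As a rough sketch of how a downloaded file could be checked against the new pointer, the hypothetical helpers below (`parse_lfs_pointer` and `blob_matches_pointer`, neither of which comes from this repository) parse the pointer text and compare size and digest. Only SHA-256 pointers are assumed.

```python
import hashlib

# New pointer contents, copied from the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9addc8bdce5988448ae81b729336f43a81262160ae8da760674badab9d4c7d33
size 587591
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def blob_matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check raw file bytes against a parsed pointer (SHA-256 only)."""
    if pointer["algo"] != "sha256":
        raise ValueError("this sketch only handles sha256 pointers")
    if len(data) != pointer["size"]:
        return False
    return hashlib.sha256(data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["size"], pointer["digest"][:8])
```

With a local copy of the file, the check would be `blob_matches_pointer(open("tokenizer.model", "rb").read(), pointer)`; the size comparison runs first so an obviously wrong file is rejected without hashing.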
tokenizer_config.json CHANGED
The diff for this file is too large to render. See raw diff