inferencerlabs committed
Commit de44468 · verified · 1 Parent(s): b7c634d

Upload complete model

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -10,9 +10,9 @@ base_model: moonshotai/Kimi-K2-Instruct-0905
  **CURRENTLY UPLOADING**
  *Notice will be removed once complete*
 
- **See Kimi-K2-Instruct-0905 Dynamic MLX in action - [COMING SOON](https://youtu.be/-zfUvA2CDqE)**
+ **See Kimi-K2-Instruct-0905 Dynamic MLX in action - [https://youtu.be/Ia-q3Ll4tAY](https://youtu.be/Ia-q3Ll4tAY)**
 
- *q3.824bit dynamic quant typically achieves 1.... perplexity in our testing, slotting closer to q4 perplexity (1.168) than q3 perplexity (1.900).*
+ *q3.824bit dynamic quant typically achieves 1.256 perplexity in our testing, slotting closer to q4 perplexity (1.168) than q3 perplexity (1.900).*
  | Quantization | Perplexity |
  |:------------:|:----------:|
  | **q2** | 41.293 |
@@ -28,8 +28,8 @@ base_model: moonshotai/Kimi-K2-Instruct-0905
 
  * Runs on a single M3 Ultra 512GB RAM using [Inferencer app](https://inferencer.com)
  * Does not require expanding VRAM limit
- * However expanding it will allow you to avoid slow downs with larger context windows:
+ * However expanding it will allow you to use larger context windows:
    * `sudo sysctl iogpu.wired_limit_mb=507000`
  * Expect ~20 tokens/s
  * Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
- * For more details see [demonstration video](https://youtu.be/-zfUvA2CDqE) or visit [Kimi K2](https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905).
+ * For more details see [demonstration video](https://youtu.be/Ia-q3Ll4tAY) or visit [Kimi K2](https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905).
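As a rough sketch (not part of the commit above), the `iogpu.wired_limit_mb` step referenced in the README can be checked and applied like this on an Apple Silicon Mac. Only the 507000 MB figure comes from the model card; the rest is a general macOS assumption rather than anything stated in the diff:

```sh
# Inspect the current GPU wired-memory limit (0 usually means macOS's built-in default cap applies)
sysctl iogpu.wired_limit_mb

# Raise the cap to ~507 GB on a 512 GB M3 Ultra, per the README, so the quantized
# model plus a larger context window can stay wired for the GPU
sudo sysctl iogpu.wired_limit_mb=507000

# A value set this way is not persistent; it reverts to the default after a reboot
```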