This model [mlx-community/LongCat-Flash-Chat-mlx-DQ6_K_M](https://huggingface.co/mlx-community/LongCat-Flash-Chat-mlx-DQ6_K_M) was
converted to MLX format from [meituan-longcat/LongCat-Flash-Chat](https://huggingface.co/meituan-longcat/LongCat-Flash-Chat)
using mlx-lm version **0.28.1**.
This quant is created for people using a single Apple Mac Studio M3 Ultra with 512 GB of unified memory. The 8-bit version of LongCat-Flash-Chat does not fit. Drawing on quantization research, we aim to get almost-8-bit performance from a slightly smaller and smarter quantization. It should also leave enough memory for a useful context window.
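To make the size constraint concrete, here is a rough back-of-the-envelope sketch. The 560 B total-parameter count for LongCat-Flash and the uniform bits-per-weight figures are illustrative assumptions, not measured file sizes:

```python
# Rough weight-memory estimate for an N-parameter model at a given
# average bit-width. All figures are illustrative assumptions.

def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    # params (billions) * bits / 8 bits-per-byte = gigabytes
    return params_billions * bits_per_weight / 8

TOTAL_RAM_GB = 512   # Mac Studio M3 Ultra configuration targeted here
PARAMS_B = 560       # assumed LongCat-Flash total parameter count

print(f"8-bit:    {weights_gb(PARAMS_B, 8.0):.0f} GB")   # → 560 GB, over budget
print(f"~6.5-bit: {weights_gb(PARAMS_B, 6.5):.0f} GB")   # → 455 GB, fits
```

At roughly 6.5 bits per weight the weights come in under 512 GB with tens of gigabytes to spare for the KV cache, which is the trade-off described above.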
You can find more MLX model quants like this one, sized for a single Apple Mac Studio M3 Ultra with 512 GB, at https://huggingface.co/bibproj.

```bash
pip install mlx-lm
```
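After installing, the model can be loaded and sampled from with the standard mlx-lm Python API (`load` and `generate` are mlx-lm's documented entry points; the prompt and token budget below are only illustrative, and the first call downloads the full multi-hundred-GB weight set):

```python
from mlx_lm import load, generate

# Downloads the quantized weights on first use; requires the 512 GB machine.
model, tokenizer = load("mlx-community/LongCat-Flash-Chat-mlx-DQ6_K_M")

# Chat models expect the chat template to be applied to the conversation.
messages = [{"role": "user", "content": "Explain MoE routing in two sentences."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```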