Could you please upload a 99-100 GB MLX quantization of this model so that it can be deployed locally on a Mac with 128 GB of RAM? Thank you very much!
#3 opened by mimeng1990
The model at https://model.lmstudio.ai/download/nightmedia/LIMI-Air-qx86-hi-mlx is 97 GB and works very well on a Mac with 128 GB of RAM. However, I would also like to try a debugged Iceblink model, and a quantization between 97 GB and 100 GB would make a local deployment directly comparable. 4-bit quantization is clearly too small and causes a significant quality loss; I've tested it, and the text-processing results are not ideal. 6-bit quantization is likewise not ideal. In my opinion, LIMI-Air-qx86-hi-mlx is currently the best model for processing Chinese text on a Mac with 128 GB of RAM; its output seems better than Qwen3-80B.
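
For reference, this is roughly how I test such a quantization locally with mlx-lm. It is only a minimal sketch: the model path and the Chinese prompt are placeholders, and the exact `generate` arguments may vary slightly between mlx-lm versions.

```python
# Minimal sketch: load an MLX quantization and run a short Chinese prompt.
# Assumes mlx-lm is installed (pip install mlx-lm); the model path below is a placeholder.
from mlx_lm import load, generate

model_path = "nightmedia/LIMI-Air-qx86-hi-mlx"  # placeholder: swap in the quant being tested
model, tokenizer = load(model_path)

prompt = "请用一句话总结这段文字的要点。"  # example Chinese text-processing prompt
response = generate(model, tokenizer, prompt=prompt, max_tokens=200)
print(response)
```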