Note: ik_llama.cpp can also run regular GGUFs
README.md CHANGED
@@ -15,6 +15,8 @@ tags:
 ## `ik_llama.cpp` imatrix Quantizations of Hunyuan-A13B-Instruct
 
 This quant collection **REQUIRES** the [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork to support ik's latest SOTA quants and optimizations! Do **not** download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc.!
 
+*NOTE* `ik_llama.cpp` can also run your existing GGUFs from bartowski, unsloth, mradermacher, etc. if you want to try it out before downloading my quants.
+
 Some of ik's new quants are supported with the [Nexesenex/croco.cpp](https://github.com/Nexesenex/croco.cpp) fork of KoboldCpp.
 
 These quants provide best-in-class perplexity for the given memory footprint.
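For reference, a minimal sketch of building `ik_llama.cpp` and serving a GGUF with it, assuming the usual llama.cpp-style CMake build and standard `llama-server` flags carry over to the fork; the model filename below is a placeholder, not a specific file from this collection:

```bash
# Build ik_llama.cpp (assumed to follow the standard llama.cpp CMake flow; CPU-only build shown)
git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build
cmake --build build --config Release -j

# Serve a quant from this collection, or any existing GGUF (placeholder path/filename)
./build/bin/llama-server \
  -m /path/to/Hunyuan-A13B-Instruct-XXX.gguf \
  -c 8192 \
  --host 127.0.0.1 --port 8080
```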