ubergarm/Hunyuan-A13B-Instruct-GGUF

@@ -130,18 +130,15 @@ export model=/mnt/models/ubergarm/Hunyuan-A13B-Instruct-GGUF/Hunyuan-A13B-Instru
 ```
 ## *NOTE* Building Experimental PRs
-This PR is based on basically three currently un-released PRs so is quite experimental. To build it before PRs are merged try something like this:
 ```bash
 # get the code setup
 cd projects
 git clone https://github.com/ikawrakow/ik_llama.cpp.git
 cd ik_llama.cpp
-git fetch origin
 git remote add ubergarm https://github.com/ubergarm/ik_llama.cpp
 git fetch ubergarm
 git checkout ug/hunyuan-moe-2
-git checkout -b merge-stuff-here
-git merge ikawrakow/ik/iq3_ks_v2
 # build for CUDA
 cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON -DGGML_VULKAN=OFF -DGGML_RPC=OFF -DGGML_BLAS=OFF -DGGML_CUDA_F16=ON -DGGML_SCHED_MAX_COPIES=1
@@ -153,6 +150,7 @@ git branch -D merge-stuff-here
 ```
 ## VRAM Estimations
 *  8k = 3790MiB total with KV self size  =  544.00 MiB, K (q8_0):  272.00 MiB, V (q8_0):  272.00 MiB
 * 32k = 5462MiB total with KV self size  = 2176.00 MiB, K (q8_0): 1088.00 MiB, V (q8_0): 1088.00 MiB

 ```
 ## *NOTE* Building Experimental PRs
+This PR is based on currently unreleased PRs, so it is quite experimental. To build it before those PRs are merged, try something like this:
 ```bash
 # get the code setup
 cd projects
 git clone https://github.com/ikawrakow/ik_llama.cpp.git
 cd ik_llama.cpp
 git remote add ubergarm https://github.com/ubergarm/ik_llama.cpp
 git fetch ubergarm
 git checkout ug/hunyuan-moe-2
 # build for CUDA
 cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON -DGGML_VULKAN=OFF -DGGML_RPC=OFF -DGGML_BLAS=OFF -DGGML_CUDA_F16=ON -DGGML_SCHED_MAX_COPIES=1
 ```
 ## VRAM Estimations
+Context length = VRAM use:
 *  8k = 3790MiB total with KV self size  =  544.00 MiB, K (q8_0):  272.00 MiB, V (q8_0):  272.00 MiB
 * 32k = 5462MiB total with KV self size  = 2176.00 MiB, K (q8_0): 1088.00 MiB, V (q8_0): 1088.00 MiB
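
The two measurements above are consistent with a KV cache that grows linearly with context: 544 MiB at 8k and 2176 MiB at 32k both work out to 68 KiB per token with q8_0 K and V. The remainder (total minus KV) is 3246 MiB at 8k and 3286 MiB at 32k. The sketch below interpolates other context lengths from those numbers. It is a rough estimate, not a measurement: it assumes "8k" means 8192 tokens, that the same q8_0 cache types and offload settings are used, and that the non-KV remainder stays near the larger observed value. The function names `kv_mib` and `vram_mib` are hypothetical helpers, not part of ik_llama.cpp.

```bash
#!/usr/bin/env bash
# Rough VRAM interpolation from the two measured points above (q8_0 KV cache).
# Assumption: KV cache scales linearly with context length in tokens.

# KV cache size in MiB for a given context length.
# 544 MiB / 8192 tokens = 68 KiB per token (K and V combined).
kv_mib() {
  local ctx_tokens=$1
  echo $(( ctx_tokens * 68 / 1024 ))
}

# Estimated total VRAM in MiB: KV cache plus the non-KV remainder.
# Remainder observed: 3790 - 544 = 3246 MiB (8k), 5462 - 2176 = 3286 MiB (32k).
# The larger value is used so the estimate errs on the high side.
vram_mib() {
  local ctx_tokens=$1
  echo $(( $(kv_mib "$ctx_tokens") + 3286 ))
}

kv_mib 16384     # prints 1088
vram_mib 16384   # prints 4374
```

For the measured points, `vram_mib 32768` reproduces 5462 and `vram_mib 8192` gives 3830, about 40 MiB above the measured 3790, so treat the result as an upper-leaning guide when picking a context size.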