ubergarm committed on
Commit 745d01c · 1 Parent(s): e3a98eb

Update experimental PR build instructions

Files changed (1)
  1. README.md +2 -4
README.md CHANGED
@@ -130,18 +130,15 @@ export model=/mnt/models/ubergarm/Hunyuan-A13B-Instruct-GGUF/Hunyuan-A13B-Instru
 ```
 
 ## *NOTE* Building Experimental PRs
-This PR is based on basically three currently un-released PRs so is quite experimental. To build it before PRs are merged try something like this:
+This PR is based on currently un-released PRs so is quite experimental. To build it before PRs are merged try something like this:
 ```bash
 # get the code setup
 cd projects
 git clone https://github.com/ikawrakow/ik_llama.cpp.git
 cd ik_llama.cpp
-git fetch origin
 git remote add ubergarm https://github.com/ubergarm/ik_llama.cpp
 git fetch ubergarm
 git checkout ug/hunyuan-moe-2
-git checkout -b merge-stuff-here
-git merge ikawrakow/ik/iq3_ks_v2
 
 # build for CUDA
 cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON -DGGML_VULKAN=OFF -DGGML_RPC=OFF -DGGML_BLAS=OFF -DGGML_CUDA_F16=ON -DGGML_SCHED_MAX_COPIES=1
@@ -153,6 +150,7 @@ git branch -D merge-stuff-here
 ```
 
 ## VRAM Estimations
+Context length = VRAM use:
 
 * 8k = 3790MiB total with KV self size = 544.00 MiB, K (q8_0): 272.00 MiB, V (q8_0): 272.00 MiB
 * 32k = 5462MiB total with KV self size = 2176.00 MiB, K (q8_0): 1088.00 MiB, V (q8_0): 1088.00 MiB
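
The two measurements above can be turned into a rough estimator for other context lengths. The following is only a sketch under stated assumptions: "8k" and "32k" are taken to mean 8192 and 32768 tokens, and VRAM use is assumed to grow linearly between the two measured points. The function names `kv_mib` and `total_mib` are hypothetical helpers and are not part of ik_llama.cpp.

```shell
#!/usr/bin/env bash
# Rough VRAM estimator fitted to the two measurements listed above.
# Assumption: 8k = 8192 tokens and 32k = 32768 tokens.
# Assumption: cost grows linearly with context length. This matches the
# q8_0 KV cache sizes exactly but is only an approximation for the total.

# KV cache size in MiB for a context of $1 tokens.
# 544 MiB / 8192 tokens = 68 KiB per token (K and V combined, q8_0).
kv_mib() {
  local ctx=$1
  echo $(( ctx * 68 / 1024 ))
}

# Total VRAM in MiB, linearly interpolated between the 8k and 32k points.
total_mib() {
  local ctx=$1
  echo $(( 3790 + (ctx - 8192) * (5462 - 3790) / (32768 - 8192) ))
}

for ctx in 8192 16384 32768; do
  echo "ctx=${ctx} kv=$(kv_mib "$ctx")MiB total~$(total_mib "$ctx")MiB"
done
```

The 16384-token line is an interpolated guess rather than a measurement, so treat it as a starting point and confirm against the actual log output when loading the model.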