Baichuan-M2-32B-gguf
A medical knowledge LLM.
The LLM is an invention as important as electricity, and far more important than the internet, phones, and computers combined.
Make sure you have enough RAM/GPU memory to run the model. On the right side of the model card you can see the size of each quantized model.
Use the model in Ollama
First, download and install Ollama.
Command
In the Windows command line, or in a terminal on Ubuntu, type:
ollama run hf.co/John1604/Baichuan-M2-32B-gguf:q4_k_s
(q4_k_s is the quantization type; q5_k_s, q4_k_m, ..., can also be used)
C:\Users\developer>ollama run hf.co/John1604/Baichuan-M2-32B-gguf:q4_k_s
pulling manifest
...
verifying sha256 digest
writing manifest
success
>>>
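Once the model has been pulled, you can also query it programmatically through Ollama's local API instead of the interactive prompt. Below is a minimal sketch using the official `ollama` Python client (`pip install ollama`); it assumes the Ollama server is running, and the sample medical question is only an illustration.

```python
# Minimal sketch: query the locally pulled model via the ollama Python client.
# Assumes the Ollama server is running and the model tag matches the quant
# pulled above (q4_k_s here).
import ollama

response = ollama.chat(
    model="hf.co/John1604/Baichuan-M2-32B-gguf:q4_k_s",
    messages=[
        {"role": "user", "content": "What are common symptoms of iron-deficiency anemia?"}
    ],
)
print(response["message"]["content"])
```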
Use the model in LM Studio
Download and install LM Studio.
Discover models
In LM Studio, click the "Discover" icon. The "Mission Control" popup window will be displayed.
In the "Mission Control" search bar, type "John1604/Baichuan-M2-32B-gguf" and check "GGUF"; the model should be found.
Download the model.
You may choose a quantized variant.
Load the model.
Ask questions.
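Besides the chat UI, LM Studio can expose the loaded model through a local OpenAI-compatible server (enabled from its Developer tab, default port 1234). The sketch below assumes that server is running and uses the `openai` Python package; the model id shown is hypothetical, so check LM Studio's model list for the actual name on your machine.

```python
# Minimal sketch: talk to the model through LM Studio's local
# OpenAI-compatible server. Assumes the server is enabled in LM Studio
# (default http://localhost:1234/v1) and the GGUF model is loaded.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")  # key is unused locally

completion = client.chat.completions.create(
    model="baichuan-m2-32b-gguf",  # hypothetical local model id; check LM Studio's model list
    messages=[{"role": "user", "content": "Explain first-line treatment options for hypertension."}],
)
print(completion.choices[0].message.content)
```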
Quantized models
| Type | Bits | Quality | Description |
|---|---|---|---|
| Q2_K | 2-bit | 🟥 Low | Minimal footprint; only for tests |
| Q3_K_S | 3-bit | 🟧 Low | “Small” variant (less accurate) |
| Q3_K_M | 3-bit | 🟧 Low–Med | “Medium” variant |
| Q4_K_S | 4-bit | 🟨 Med | Small, faster, slightly less quality |
| Q4_K_M | 4-bit | 🟩 Med–High | “Medium” — best 4-bit balance |
| Q5_K_S | 5-bit | 🟩 High | Slightly smaller than Q5_K_M |
| Q5_K_M | 5-bit | 🟩🟩 High | Excellent general-purpose quant |
| Q6_K | 6-bit | 🟩🟩🟩 Very High | Almost FP16 quality, larger size |
| Q8_0 | 8-bit | 🟩🟩🟩🟩 | Near-lossless baseline |
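As a rough rule of thumb, a quantized model's file size is about (parameter count × bits per weight) / 8, plus some overhead, and you need at least that much RAM/VRAM to load it. The sketch below illustrates the arithmetic only: the ~32B parameter count comes from the model name, the 5% overhead factor is an assumption, and K-quants mix precisions internally, so actual files run somewhat larger than their nominal bit width suggests.

```python
# Rough size estimate per quant: params * bits / 8 bytes, plus overhead.
# The 5% overhead factor is an assumption (metadata, higher-precision tensors, etc.).
PARAMS = 32e9  # Baichuan-M2-32B: roughly 32 billion parameters

for name, bits in [("Q2_K", 2), ("Q3_K_M", 3), ("Q4_K_M", 4),
                   ("Q5_K_M", 5), ("Q6_K", 6), ("Q8_0", 8)]:
    gib = PARAMS * bits / 8 / 2**30 * 1.05
    print(f"{name}: ~{gib:.1f} GiB")
```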