cgus/silly-v0.2-exl2


cgus committed on Sep 6

Commit 79daa35 · verified · 1 Parent(s): 964f255

Update README.md

Files changed (1)

README.md +23 -2


@@ -1,9 +1,30 @@
 ---
 license: apache-2.0
 base_model:
-- mistralai/Mistral-Nemo-Base-2407
-library_name: transformers
 ---
 # silly-v0.2
 Finetune of [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) designed to emulate the writing style of character.ai models.

 ---
 license: apache-2.0
 base_model:
+- wave-on-discord/silly-v0.2
+library_name: exllamav2
 ---
+# silly-v0.2-exl2
+Original model: [silly-v0.2](https://huggingface.co/wave-on-discord/silly-v0.2) by [wave-on-discord](https://huggingface.co/wave-on-discord)
+Based on: [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) by [Mistral AI](https://huggingface.co/mistralai)
+## Quants
+[4bpw h6 (main)](https://huggingface.co/cgus/silly-v0.2-exl2/tree/main)
+[4.5bpw h6](https://huggingface.co/cgus/silly-v0.2-exl2/tree/4.5bpw-h6)
+[5bpw h6](https://huggingface.co/cgus/silly-v0.2-exl2/tree/5bpw-h6)
+[6bpw h6](https://huggingface.co/cgus/silly-v0.2-exl2/tree/6bpw-h6)
+[8bpw h8](https://huggingface.co/cgus/silly-v0.2-exl2/tree/8bpw-h8)
+## Quantization notes
+Made with Exllamav2 0.3.1 with the default dataset.
+The model can be used with Nvidia RTX GPUs on Windows or RTX/AMD ROCm on Linux with TabbyAPI or Text-Generation-WebUI.
+Should be usable at 6bpw/16k context with something like an RTX3060/12GB, or at 6bpw/32k with an RTX4060Ti/16GB, both with Q8 cache.
+In my brief testing the model had an interesting writing style, but it's very fragile and easily starts looping or repeating.
+I guess it should be used with the DRY sampler to avoid repetition/loops. Both TabbyAPI and TGW (Text-Generation-WebUI) have it.
+I don't recommend using the repetition_penalty or frequency_penalty samplers for this, as they are far more destructive than DRY.
+# Original model card
 # silly-v0.2
 Finetune of [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) designed to emulate the writing style of character.ai models.
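
The VRAM guidance in the quantization notes (6bpw at 16k context on 12GB, 6bpw at 32k on 16GB, both with Q8 cache) can be sanity-checked with a rough back-of-the-envelope estimate. The sketch below is not from the README: the parameter count, layer count, KV head count, and head dimension are assumed values for a Mistral-Nemo-style model, "16k" is taken to mean 16384 tokens, and a Q8 cache is approximated as one byte per cached value, ignoring scale overhead, activations, and runtime buffers.

```python
# Rough VRAM estimate for an EXL2 quant: quantized weights + quantized KV cache.
# Every architecture number below is an ASSUMPTION (Mistral-Nemo-style config),
# not a value taken from this repository.

N_PARAMS = 12.2e9   # assumed total parameter count
N_LAYERS = 40       # assumed number of transformer layers
N_KV_HEADS = 8      # assumed number of key/value heads (grouped-query attention)
HEAD_DIM = 128      # assumed dimension per attention head
GIB = 2**30         # bytes per GiB


def weights_gib(bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return N_PARAMS * bits_per_weight / 8 / GIB


def kv_cache_gib(context_tokens: int, bytes_per_value: float = 1.0) -> float:
    """Approximate KV cache size in GiB (1 byte per value roughly models Q8)."""
    # Keys and values (factor 2) are stored for every layer and KV head.
    bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value
    return context_tokens * bytes_per_token / GIB


def estimate_gib(bits_per_weight: float, context_tokens: int) -> float:
    """Weights plus KV cache; excludes activations and runtime overhead."""
    return weights_gib(bits_per_weight) + kv_cache_gib(context_tokens)


if __name__ == "__main__":
    for bpw, ctx in [(6, 16384), (6, 32768)]:
        total = estimate_gib(bpw, ctx)
        print(f"{bpw}bpw @ {ctx} tokens: ~{total:.1f} GiB before overhead")
```

Under these assumptions both configurations land below the card sizes mentioned in the notes, with the remaining headroom left for activations and runtime overhead. Treat the result as an order-of-magnitude check rather than a guarantee that a given setup will load.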