Model collection for the paper "Optimal Brain Restoration for Joint Sparsification and Quantization of LLMs". GitHub: https://github.com/csguoh/OBR
- HangGuo/Llama2-70B-QuaRot-OBR-GPTQ-W4A4KV4S50 (Text Generation)
- HangGuo/Llama2-70B-SpinQuant-OBR-GPTQ-W4A4KV4S50 (Text Generation)
- HangGuo/Llama3-70B-SpinQuant-OBR-RTN-W4A4KV4S50 (Text Generation)
- HangGuo/Llama2-70B-SpinQuant-OBR-RTN-W4A4KV4S50 (Text Generation)
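
The repository names in this collection follow a regular pattern, so they can be split into their components programmatically. The sketch below is a hypothetical helper, not part of the OBR codebase. It assumes that `W4A4KV4` stands for 4-bit weights, activations, and KV cache, that `S50` stands for 50% sparsity, and that the segment before `OBR` names the rotation method (QuaRot or SpinQuant) while the segment after it names the quantizer (GPTQ or RTN). The names `ObrCheckpoint` and `parse_repo_id` are made up for illustration.

```python
import re
from typing import NamedTuple


class ObrCheckpoint(NamedTuple):
    """Fields decoded from a repository name (field meanings are assumptions)."""
    owner: str
    base_model: str
    rotation: str
    quantizer: str
    weight_bits: int
    act_bits: int
    kv_bits: int
    sparsity_pct: int


# Matches names such as "HangGuo/Llama2-70B-QuaRot-OBR-GPTQ-W4A4KV4S50".
_PATTERN = re.compile(
    r"^(?P<owner>[^/]+)/(?P<base>[^-]+-[^-]+)-(?P<rotation>[^-]+)-OBR-"
    r"(?P<quantizer>[^-]+)-W(?P<w>\d+)A(?P<a>\d+)KV(?P<kv>\d+)S(?P<s>\d+)$"
)


def parse_repo_id(repo_id: str) -> ObrCheckpoint:
    """Split a repository name from this collection into its components."""
    match = _PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repository name: {repo_id!r}")
    return ObrCheckpoint(
        owner=match["owner"],
        base_model=match["base"],
        rotation=match["rotation"],
        quantizer=match["quantizer"],
        weight_bits=int(match["w"]),
        act_bits=int(match["a"]),
        kv_bits=int(match["kv"]),
        sparsity_pct=int(match["s"]),
    )


if __name__ == "__main__":
    for name in (
        "HangGuo/Llama2-70B-QuaRot-OBR-GPTQ-W4A4KV4S50",
        "HangGuo/Llama3-70B-SpinQuant-OBR-RTN-W4A4KV4S50",
    ):
        print(parse_repo_id(name))
```

A helper like this can be handy for filtering the collection, for example to pick only the RTN checkpoints. Loading the checkpoints themselves is not covered here; see the GitHub repository for the authors' instructions.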