Request to Add AWQ Quantization Model

#13
by wunu - opened

AWQ (Activation-aware Weight Quantization) is a quantization technique that can significantly reduce the memory footprint and computational cost of large language models while maintaining high accuracy. Publishing an AWQ-quantized version of this model would enable more efficient deployment, especially in resource-constrained environments. Could you please consider providing one? Thank you.
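For reference, producing an AWQ checkpoint with the AutoAWQ library typically looks something like the sketch below. The repo ID and output path are placeholders, not actual paths from this repo, and the quantization settings shown are just the common 4-bit defaults:

```python
# Minimal sketch of AWQ quantization using AutoAWQ
# (https://github.com/casper-hansen/AutoAWQ).
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "org/model-name"   # placeholder: the full-precision repo ID
quant_path = "model-name-awq"   # placeholder: output dir for quantized weights

# Common 4-bit AWQ settings: zero-point quantization, group size 128.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run activation-aware calibration and quantize the weights to 4 bits.
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized model and tokenizer for upload to the Hub.
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```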

I’d like to see that too; it would be very helpful for deployment.
