Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 14 items • Updated 4 days ago • 35
GLM-4.5-THIREUS-SPECIAL_SPLIT Collection These model shards are meant to be used with Thireus' GGUF Tool Suite - https://gguf.thireus.com/ • 56 items • Updated Oct 5 • 2
INT8 LLMs for vLLM Collection Accurate INT8 quantized models by Neural Magic, ready for use with vLLM! • 50 items • Updated Sep 26, 2024 • 17