Models

3,141

Active filters: compressed-tensors

moonshotai/Kimi-K2-Thinking

Text Generation • Updated 2 days ago • 57.1k • 882

allenai/olmOCR-2-7B-1025-FP8

Image-to-Text • 8B • Updated 19 days ago • 106k • 132

cyankiwi/Kimi-Linear-48B-A3B-Instruct-AWQ-4bit

9B • Updated 6 days ago • 32k • 13

cpatonn/GLM-4.5-Air-AWQ-4bit

Text Generation • 19B • Updated Sep 2 • 292k • 21

cpatonn/Qwen3-30B-A3B-Instruct-2507-AWQ-4bit

Text Generation • 5B • Updated Aug 29 • 35.8k • 16

cpatonn/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit

Text Generation • 5B • Updated Aug 28 • 26k • 21

cpatonn/Qwen3-Omni-30B-A3B-Instruct-AWQ-4bit

Any-to-Any • 10B • Updated Sep 28 • 7.63k • 23

dazipe/Qwen3-Next-80B-A3B-Instruct-GPTQ-Int4A16

Updated 12 days ago • 141 • 2

Firworks/Kimi-Linear-48B-A3B-Instruct-nvfp4

28B • Updated 10 days ago • 318 • 2

unsloth/Kimi-K2-Thinking

Text Generation • Updated 3 days ago • 38 • 2

nm-testing/Meta-Llama-3-8B-Instruct-W8A8-FP8-Channelwise-compressed-tensors

Text Generation • 8B • Updated Oct 9, 2024 • 2 • 1

RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a16

Text Generation • 3B • Updated Oct 23, 2024 • 3.92k • 12

RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8

Text Generation • 8B • Updated Sep 22 • 22k • 18

RedHatAI/Llama-3.3-70B-Instruct-FP8-dynamic

Text Generation • Updated Sep 22 • 1.67M • 12

root-signals/RootSignals-Judge-Llama-70B

Text Generation • 71B • Updated Jul 21 • 81 • 17

RedHatAI/whisper-large-v3-turbo-FP8-dynamic

Automatic Speech Recognition • 0.9B • Updated Apr 22 • 288 • 6

gaunernst/gemma-3-4b-it-qat-compressed-tensors

Image-Text-to-Text • Updated Apr 8 • 1.23k • 3

NeoChen1024/llama-joycaption-beta-one-hf-llava-FP8-Dynamic

8B • Updated Sep 12 • 459 • 9

jeffcookio/Mistral-Small-3.2-24B-Instruct-2506-awq-sym

5B • Updated Jul 4 • 4.26k • 9

RedHatAI/Kimi-K2-Instruct-quantized.w4a16

Text Generation • 146B • Updated 28 days ago • 316 • 12

zai-org/GLM-4.5-Air-FP8

Text Generation • 111B • Updated Aug 12 • 113k • 68

cpatonn/Qwen3-30B-A3B-Thinking-2507-AWQ-4bit

Text Generation • 5B • Updated Sep 22 • 45.9k • 11

cpatonn/Qwen3-Coder-30B-A3B-Instruct-GPTQ-4bit

Text Generation • 5B • Updated Aug 2 • 809 • 5

cpatonn/Qwen3-30B-A3B-Instruct-2507-AWQ-8bit

Text Generation • 9B • Updated Aug 29 • 758 • 2

zai-org/GLM-4.5V-FP8

Image-Text-to-Text • 108B • Updated 16 days ago • 365k • 36

llmat/Qwen3-30B-A3B-Instruct-2507-NVFP4

Text Generation • 17B • Updated Aug 27 • 143 • 1

NousResearch/Hermes-4-14B-FP8

Text Generation • 15B • Updated Sep 3 • 462 • 12

warshanks/Hermes-4-14B-AWQ

Text Generation • 3B • Updated Sep 3 • 4 • 1

cpatonn/Hermes-4-14B-AWQ-4bit

Text Generation • 4B • Updated Sep 4 • 59 • 2

cpatonn/Qwen3-Next-80B-A3B-Instruct-AWQ-4bit

Text Generation • Updated Sep 24 • 56.8k • 41