Models

40,850

Active filters: 4-bit

QuantTrio/GLM-4.7-AWQ

Text Generation • 358B • Updated 1 day ago • 8.57k • 12

Disty0/Qwen-Image-Edit-2511-SDNQ-uint4-svd-r32

Image-to-Image • Updated 7 days ago • 241 • 6

tencent/HY-MT1.5-1.8B-GPTQ-Int4

Translation • 2B • Updated about 11 hours ago • 5

tencent/HY-MT1.5-7B-GPTQ-Int4

Translation • 8B • Updated about 11 hours ago • 5

QuantTrio/GLM-4.7-GPTQ-Int4-Int8Mix

Text Generation • 390B • Updated 1 day ago • 80 • 4

mlx-community/MiniMax-M2.1-4bit

Text Generation • 229B • Updated 4 days ago • 680 • 4

QuantTrio/MiniMax-M2.1-AWQ

Text Generation • 229B • Updated about 12 hours ago • 586 • 4

TheBloke/CapybaraHermes-2.5-Mistral-7B-GPTQ

7B • Updated Jan 31, 2024 • 195 • 61

MaziyarPanahi/TheTop-5x7B-Instruct-S5-v0.1-GGUF

Text Generation • 7B • Updated Feb 19, 2024 • 45 • 3

unsloth/Phi-3-mini-4k-instruct-bnb-4bit

Text Generation • 4B • Updated Sep 3, 2024 • 37.5k • 42

MaziyarPanahi/Mistral-Nemo-Instruct-2407-GGUF

Text Generation • 12B • Updated Jul 22, 2024 • 165k • 50

ICEPVP8977/Uncensored_Qwen2.5_Coder_7B_4_bit_quantized_Seaftensors

8B • Updated Mar 19 • 35 • 3

Qwen/Qwen3-4B-MLX-4bit

Text Generation • 0.6B • Updated Aug 29 • 70.8k • 22

lmstudio-community/Devstral-Small-2507-MLX-4bit

Text Generation • 24B • Updated Jul 10 • 27.6k • 5

mlx-community/gpt-oss-20b-MXFP4-Q8

Text Generation • Updated Aug 29 • 676k • 22

nota-ai/Qwen3-30B-A3B-NotaMoEQuant-Int4

Text Generation • 0.6B • Updated 6 days ago • 129 • 4

Disty0/Qwen-Image-Layered-SDNQ-uint4-svd-r32

Updated 8 days ago • 48 • 3

nota-ai/GLM-4.5-Air-NotaMoeQuant-Int4

Text Generation • 1B • Updated 1 day ago • 63 • 2

nightmedia/Qwen3-4B-Agent-F32-dwq4-mlx

Text Generation • 0.8B • Updated 3 days ago • 183 • 2

fifrio/gemma-3-4b-it-gptq-4bit-calibration-Swahili-128samples

4B • Updated 7 days ago • 77 • 2

mlx-community/maya1-4bit

Text-to-Speech • 0.5B • Updated 6 days ago • 25 • 2

TevunahAi/Nemotron-3-Nano-30B-A3B-GPTQ

Text Generation • 6B • Updated 5 days ago • 856 • 2

Intel/GLM-4.7-int4-mixed-AutoRound

Text Generation • 2B • Updated 1 day ago • 22 • 2

TheBloke/WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ

Text Generation • 33B • Updated Aug 21, 2023 • 40 • 93

MaziyarPanahi/TheTop-5x7B-Instruct-T-v0.1-GGUF

Text Generation • 7B • Updated Feb 19, 2024 • 51 • 1

CohereLabs/c4ai-command-r-v01-4bit

Text Generation • 35B • Updated Apr 16 • 31 • 176

Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4

Text Generation • 14B • Updated Jun 9, 2024 • 1.34k • 49

casperhansen/llama-3-8b-instruct-awq

Text Generation • 8B • Updated Jun 8, 2024 • 8.46k • 26

casperhansen/llama-3-70b-instruct-awq

Text Generation • 71B • Updated Apr 19, 2024 • 1.1k • 70

solidrust/Llama-3-8B-Lexi-Uncensored-AWQ

Text Generation • 8B • Updated Sep 3, 2024 • 81.4k • 4