Typhoon2.5-Qwen3-30B-A3B (GGUF)

GGUF-format conversions of the scb10x/typhoon2.5-qwen3-30b-a3b model for efficient inference with llama.cpp and compatible runtimes.

Converted using llama.cpp’s convert_hf_to_gguf.py script and the llama-quantize tool. No additional training or fine-tuning was performed.
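For reference, a minimal sketch of that conversion pipeline. The local paths and output filenames are illustrative assumptions, not the exact commands used to produce this repo:

```bash
# Paths and filenames are assumptions for illustration; adjust to your setup.
# 1. Convert the Hugging Face checkpoint to a BF16 GGUF.
python convert_hf_to_gguf.py ./typhoon2.5-qwen3-30b-a3b \
  --outtype bf16 \
  --outfile typhoon2.5-qwen3-30b-a3b-BF16.gguf

# 2. Quantize the BF16 GGUF down to a smaller variant (Q8_0, Q4_K_M, ...).
./llama-quantize typhoon2.5-qwen3-30b-a3b-BF16.gguf \
  typhoon2.5-qwen3-30b-a3b-Q4_K_M.gguf Q4_K_M
```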


🧩 Variants

Variant   File size
BF16      61.1 GB
F16       61.1 GB
Q8_0      32.5 GB
Q4_K_M    18.6 GB
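Any of these files can be run directly with llama.cpp. A minimal sketch, assuming the Q4_K_M file and a recent llama.cpp build (the filename is illustrative):

```bash
# -ngl 99 offloads all layers to the GPU; use -ngl 0 for CPU-only inference.
./llama-cli -m typhoon2.5-qwen3-30b-a3b-Q4_K_M.gguf \
  -p "Hello, who are you?" -n 256 -ngl 99
```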

πŸ“ Notes

  • Source model: scb10x/typhoon2.5-qwen3-30b-a3b
  • Architecture: qwen3moe (Qwen3 mixture-of-experts), ~31B parameters
  • Tokenizer and metadata preserved during conversion
  • Choose BF16 for best fidelity, F16 if your GPU does not support BF16, Q8_0 for a quality/size balance, and Q4_K_M for the lowest memory footprint; a serving example follows below
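For serving, the same files work with llama.cpp’s OpenAI-compatible server. A minimal sketch, again with an assumed filename:

```bash
# Start an OpenAI-compatible HTTP server on port 8080.
./llama-server -m typhoon2.5-qwen3-30b-a3b-Q4_K_M.gguf \
  -c 4096 -ngl 99 --port 8080

# Query the standard chat completions endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```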

βš–οΈ License

Weights inherit the upstream model’s license.
This repository redistributes format-converted copies only.
Please review and comply with the upstream terms before use.


πŸ“ Acknowledgments

Original model by SCB10X.
GGUF conversion and quantization performed with llama.cpp tooling.
