Typhoon2.5-Qwen3-30B-A3B (GGUF)

GGUF-format conversions of the scb10x/typhoon2.5-qwen3-30b-a3b model for efficient inference with llama.cpp and compatible runtimes.

Converted using llama.cpp’s convert_hf_to_gguf.py script and the llama-quantize tool. No additional training or fine-tuning was performed.
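For reference, a minimal sketch of that conversion pipeline. The local paths and output filenames are illustrative assumptions, not the exact commands used to produce this repo:

```bash
# Paths and filenames are assumptions for illustration; adjust to your setup.
# 1. Convert the Hugging Face checkpoint to a BF16 GGUF.
python convert_hf_to_gguf.py ./typhoon2.5-qwen3-30b-a3b \
  --outtype bf16 \
  --outfile typhoon2.5-qwen3-30b-a3b-BF16.gguf

# 2. Quantize the BF16 GGUF down to a smaller variant (Q8_0, Q4_K_M, ...).
./llama-quantize typhoon2.5-qwen3-30b-a3b-BF16.gguf \
  typhoon2.5-qwen3-30b-a3b-Q4_K_M.gguf Q4_K_M
```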


🧩 Variants

Variant   File size
BF16      61.1 GB
F16       61.1 GB
Q8_0      32.5 GB
Q4_K_M    18.6 GB
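Any of these files can be run directly with llama.cpp. A minimal sketch, assuming the Q4_K_M file and a recent llama.cpp build (the filename is illustrative):

```bash
# -ngl 99 offloads all layers to the GPU; use -ngl 0 for CPU-only inference.
./llama-cli -m typhoon2.5-qwen3-30b-a3b-Q4_K_M.gguf \
  -p "Hello, who are you?" -n 256 -ngl 99
```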

πŸ“ Notes

  • Source model: scb10x/typhoon2.5-qwen3-30b-a3b
  • Architecture: qwen3moe (Qwen3 mixture-of-experts), ~31B parameters
  • Tokenizer and metadata preserved during conversion
  • Choose BF16 for best fidelity, F16 if your GPU does not support BF16, Q8_0 for a quality/size balance, and Q4_K_M for the lowest memory footprint; a serving example follows below
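For serving, the same files work with llama.cpp’s OpenAI-compatible server. A minimal sketch, again with an assumed filename:

```bash
# Start an OpenAI-compatible HTTP server on port 8080.
./llama-server -m typhoon2.5-qwen3-30b-a3b-Q4_K_M.gguf \
  -c 4096 -ngl 99 --port 8080

# Query the standard chat completions endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```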

βš–οΈ License

Weights inherit the upstream model’s license.
This repository redistributes format-converted copies only.
Please review and comply with the upstream terms before use.


πŸ“ Acknowledgments

Original model by SCB10X.
GGUF conversion and quantization performed with llama.cpp tooling.
