---
license: mit
base_model:
- unsloth/DeepSeek-R1-BF16
---
|
|
## Model Details |
|
|
|
|
|
This model card describes the MXFP8, MXFP4, and NVFP4 quantizations of [unsloth/DeepSeek-R1-BF16](https://huggingface.co/unsloth/DeepSeek-R1-BF16) produced with [intel/auto-round](https://github.com/intel/auto-round).
|
|
The quantized models themselves cannot be published here due to storage limitations. Please follow the Intel Neural Compressor (INC) example README linked below to generate and evaluate the low-precision models.
|
|
|
|
|
## How to Use |
|
|
|
|
|
A step-by-step README covering quantization and evaluation is available in the [Intel Neural Compressor Examples](https://github.com/intel/neural-compressor/blob/master/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/deepseek/README.md).
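
For orientation, here is a minimal sketch of quantizing the model with AutoRound's Python API. The `data_type`, `group_size`, and export format strings for the MXFP8/MXFP4/NVFP4 variants are assumptions and may differ from the settings used for the results below; treat them as placeholders and follow the example README above for the exact supported arguments.

```python
# Minimal sketch, assuming a recent auto-round release with MX/NV FP support.
# data_type="mx_fp4", group_size=32, and the output path are placeholders;
# see the INC example README for the exact supported settings.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "unsloth/DeepSeek-R1-BF16"
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

autoround = AutoRound(
    model,
    tokenizer,
    data_type="mx_fp4",  # placeholder: choose the target dtype (MXFP8/MXFP4/NVFP4)
    bits=4,
    group_size=32,       # MX formats quantize in 32-element blocks
)
autoround.quantize()
autoround.save_quantized("./DeepSeek-R1-MXFP4", format="auto_round")
```

Note that quantizing the full 671B-parameter model requires substantial host memory and multiple accelerators, so the launch scripts in the example README are the recommended path.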
|
|
|
|
|
## Evaluation Results
|
|
|
|
|
|
|
|
| Task | Backend | BF16 | MXFP8 | MXFP4 | NVFP4 |
|:-----------:|:-------:|:----------:|:----------:|:----------:|:----------:|
| hellaswag | vllm | 0.6903 | 0.6956 | 0.6898 | 0.6953 |
| piqa | vllm | 0.8319 | 0.8324 | 0.8297 | 0.8303 |
| mmlu | vllm | 0.8489 | 0.8532 | 0.8426 | 0.8495 |
| gsm8k | vllm | 0.9568 | 0.9583 | 0.9553 | 0.9606 |
| **average** | vllm | **0.8320** | **0.8349** | **0.8294** | **0.8339** |
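
The results above were collected with the vLLM backend. As a rough sketch (not necessarily the exact invocation used here), lm-evaluation-harness can drive vLLM as shown below; the model path, parallelism, and memory settings are placeholders.

```python
# Hedged sketch of an lm-evaluation-harness run through the vLLM backend.
# The model path, tensor_parallel_size, and gpu_memory_utilization are placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=./DeepSeek-R1-MXFP4,tensor_parallel_size=8,gpu_memory_utilization=0.9",
    tasks=["hellaswag", "piqa", "mmlu", "gsm8k"],
    batch_size="auto",
)
print(results["results"])  # per-task accuracy metrics
```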
|
|
|
|
|
|
|
|
## Ethical Considerations and Limitations |
|
|
|
|
|
The model can produce factually incorrect output and should not be relied on for factually accurate information.
|
|
Because of the limitations of the pretrained model and the finetuning datasets, this model may generate lewd, biased, or otherwise offensive outputs.
|
|
|
|
|
Therefore, before deploying any applications of the model, developers should perform safety testing. |
|
|
|
|
|
## Caveats and Recommendations |
|
|
|
|
|
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. |
|
|
|
|
|
Here are a couple of useful links to learn more about Intel's AI software: |
|
|
|
|
|
- [Intel Neural Compressor](https://github.com/intel/neural-compressor) |
|
|
- [AutoRound](https://github.com/intel/auto-round) |
|
|
|
|
|
## Disclaimer |
|
|
|
|
|
The license on this model does not constitute legal advice. |
|
|
We are not responsible for the actions of third parties who use this model. |
|
|
Please consult an attorney before using this model for commercial purposes. |
|
|
|