---
language:
- en
license: mit
library_name: mlx
pipeline_tag: image-text-to-text
base_model: showlab/ShowUI-2B
tags:
- mlx
- mlx-vlm
- safetensors
- apple-silicon
- conversational
- gui
- vision-language-model
- qwen2_vl
- showui
- gui-agents
- vision-language-action
- computer-use
- grounding
- 6-bit
- quantized
---

# ShowUI-2B 6bit

This is a 6-bit quantized MLX conversion of [showlab/ShowUI-2B](https://huggingface.co/showlab/ShowUI-2B), optimized for Apple Silicon.

ShowUI is a lightweight `2B` vision-language-action model designed for GUI agents. Upstream, it is framed around GUI grounding and UI navigation, with point-style localization and atomic action dictionaries over screenshots.

This artifact was derived from the validated local MLX `bf16` reference conversion and then quantized with `mlx-vlm`. It was validated locally with both `mlx_vlm` prompt-packet checks and `vllm-mlx` OpenAI-compatible serve checks.

## Conversion Details

| Field | Value |
|---|---|
| Upstream model | `showlab/ShowUI-2B` |
| Artifact type | `6bit quantized MLX conversion` |
| Source artifact | local validated `bf16` MLX artifact |
| Repo action | `update existing mlx-community repo` |
| Conversion tool | `mlx_vlm.convert` via `mlx-vlm 0.3.12` |
| Python | `3.11.14` |
| MLX | `0.31.0` |
| Transformers | `5.2.0` |
| Validation backend | `vllm-mlx (phase/p1 @ 8a5d41b)` |
| Quantization | `6bit` |
| Group size | `64` |
| Quantization mode | `affine` |
| Converter dtype note | `bfloat16` |
| Reported effective bits per weight | `9.088` |
| Artifact size | `2.60G` |
| Template repair | `tokenizer_config.json["chat_template"]` was re-injected after quantization |

Additional notes:

- This quantized artifact inherits the fresh-source posture of the validated local `bf16` base artifact.
- `chat_template.json`, `chat_template.jinja`, and `tokenizer_config.json["chat_template"]` were kept aligned after quantization.
- This family was validated on the Track B packet revision aligned to ShowUI's native point/action contract.

## Validation

This artifact passed local validation in this workspace:

- `mlx_vlm` prompt-packet validation: `PASS`
- `vllm-mlx` OpenAI-compatible serve validation: `PASS`

Local validation notes:

- All four Track B packet prompts matched the local `bf16` outputs exactly.
- The same coordinate drift between non-stream and streamed serve outputs remained present.
- No new regression appeared in packet shape, multimodal detection, or the serve path after quantization.

## Performance

- Artifact size on disk: `2.60G`
- Local fixed-packet `mlx_vlm` validation used about `4.35 GB` peak memory
- Local `vllm-mlx` serve validation completed in about `20.15s` non-stream and `21.13s` streamed

These are local validation measurements, not a full benchmark suite.

## Usage

### Install

```bash
pip install -U mlx-vlm
```

### CLI

```bash
python -m mlx_vlm.generate \
  --model mlx-community/ShowUI-2B-6bit-v2 \
  --image path/to/image.png \
  --prompt "Based on the screenshot, return the clickable location for the API Host field as [x, y] on a 0-1 scale." \
  --max-tokens 128 \
  --temperature 0.0
```

### Python

```python
from mlx_vlm import load, generate

model, processor = load("mlx-community/ShowUI-2B-6bit-v2")
result = generate(
    model,
    processor,
    prompt="Based on the screenshot, return the clickable location for the API Host field as [x, y] on a 0-1 scale.",
    image="path/to/image.png",
    max_tokens=128,
    temp=0.0,
)
print(result.text)
```

### vllm-mlx Serve

```bash
python -m vllm_mlx.cli serve mlx-community/ShowUI-2B-6bit-v2 --mllm --localhost --port 8000
```

## Links

- Upstream model: [showlab/ShowUI-2B](https://huggingface.co/showlab/ShowUI-2B)
- Paper: [ShowUI: One Vision-Language-Action Model for GUI Visual Agent](https://arxiv.org/abs/2411.17465)
- GitHub: [showlab/ShowUI](https://github.com/showlab/ShowUI/tree/main)
- Demo Space: [showlab/ShowUI Space](https://huggingface.co/spaces/showlab/ShowUI)
- Dataset: [showlab/ShowUI-desktop-8K](https://huggingface.co/datasets/showlab/ShowUI-desktop-8K)
- Base model lineage: [Qwen/Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct)
- MLX framework: [ml-explore/mlx](https://github.com/ml-explore/mlx)
- mlx-vlm: [Blaizzy/mlx-vlm](https://github.com/Blaizzy/mlx-vlm)

## Other Quantizations

Planned sibling repos in this wave:

- [`mlx-community/ShowUI-2B-bf16-v2`](https://huggingface.co/mlx-community/ShowUI-2B-bf16-v2)
- [`mlx-community/ShowUI-2B-6bit-v2`](https://huggingface.co/mlx-community/ShowUI-2B-6bit-v2) - this model

## Notes and Limitations

- This card reports local MLX conversion and validation results only.
- Upstream benchmark claims belong to the original ShowUI model family and were not re-run here unless explicitly stated.
- This family remains tied to the Track B point/action packet rather than the Track A bounding-box packet.
- The original `mlx-community/ShowUI-2B-bf16-6bit` repo already existed, so this refreshed artifact is published under the `-v2` repo id.

## Citation

If you use this MLX conversion, please also cite the original ShowUI work:

```bibtex
@misc{lin2024showui,
      title={ShowUI: One Vision-Language-Action Model for GUI Visual Agent},
      author={Kevin Qinghong Lin and Linjie Li and Difei Gao and Zhengyuan Yang and Shiwei Wu and Zechen Bai and Weixian Lei and Lijuan Wang and Mike Zheng Shou},
      year={2024},
      eprint={2411.17465},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17465},
}
```

## License

This repo follows the upstream model license: MIT.
See the upstream model card for the authoritative license details:
[showlab/ShowUI-2B](https://huggingface.co/showlab/ShowUI-2B).