mmBERT Detector for "Why Do Multilingual Reasoning Gaps Emerge in Reasoning Language Models?"
This repository provides the mmBERT-based detector checkpoints introduced in the paper above. We release detectors trained for Qwen3-4B under two training setups:
- mgsm_filtered
- mmlu_prox_lite_dev
For each setup, we provide three independent seeds:
- seed 32
- seed 42
- seed 52
These detectors are intended for research use in analyzing multilingual reasoning gaps and understandability-related behaviors.
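As a starting point, the sketch below shows one plausible way to load a detector checkpoint and score texts with the `transformers` library. The repo id is taken from this card; the exact label meanings and any subfolder layout for the setup/seed variants are assumptions you should check against the released files.

```python
import torch


def load_detector(repo_id: str):
    """Load a detector tokenizer and classification head from the Hugging Face Hub.

    repo_id (e.g. "deokhk/mmbert_ft_understandability_Qwen3-4B") is taken from
    this model card; adjust it for the training setup and seed you need.
    """
    # Lazy import so the scoring helper below stays usable without a download.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id)
    model.eval()
    return tokenizer, model


def score_texts(texts, tokenizer, model):
    """Return per-class probabilities for a batch of input texts."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return torch.softmax(logits, dim=-1)
```

Typical usage would be `tokenizer, model = load_detector("deokhk/mmbert_ft_understandability_Qwen3-4B")` followed by `score_texts(["some reasoning trace"], tokenizer, model)`; which class index corresponds to "understandable" should be confirmed from the checkpoint's label config.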
Citation
If you find this repository useful, please cite:
@misc{kang2025multilingualreasoninggapsemerge,
  title={Why Do Multilingual Reasoning Gaps Emerge in Reasoning Language Models?},
  author={Deokhyung Kang and Seonjeong Hwang and Daehui Kim and Hyounghun Kim and Gary Geunbae Lee},
  year={2025},
  eprint={2510.27269},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2510.27269},
}
Base model
The deokhk/mmbert_ft_understandability_Qwen3-4B detectors are fine-tuned from jhu-clsp/mmBERT-base.