Automatic Speech Recognition
ESPnet
multilingual
audio
phone-recognition
grapheme-to-phoneme
phoneme-to-grapheme
Instructions to use espnet/powsm with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- ESPnet
How to use espnet/powsm with ESPnet:
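import soundfile
from espnet2.bin.asr_inference import Speech2Text

# Download the pretrained model from the Hugging Face Hub
model = Speech2Text.from_pretrained("espnet/powsm")

# Read the waveform and its sample rate, then take the top hypothesis
speech, rate = soundfile.read("speech.wav")
text, *_ = model(speech)[0]
- Notebooks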
- Google Colab
- Kaggle
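The ESPnet snippet above reads `rate` from `soundfile.read` but never uses it. A minimal sketch of a pre-check that could run before calling the model follows. The helper name `prepare_waveform` is hypothetical, and the 16 kHz default is an assumption, not something this page states; check the model card for the rate the model actually expects.

```python
# Hypothetical helper, not part of ESPnet or soundfile.
# Assumption: the model expects 16 kHz mono input (not stated on this page).
EXPECTED_RATE = 16_000


def prepare_waveform(speech, rate, expected_rate=EXPECTED_RATE):
    """Return a 1-D waveform, or raise if the sample rate is unexpected.

    `speech` is the array returned by soundfile.read: shape (frames,) for
    mono files and (frames, channels) for multi-channel files.
    """
    if rate != expected_rate:
        raise ValueError(
            f"expected {expected_rate} Hz audio, got {rate} Hz; resample first"
        )
    if speech.ndim == 1:
        return speech
    if speech.ndim == 2:
        # Average the channels to obtain a mono signal.
        return speech.mean(axis=1)
    raise ValueError(f"unexpected waveform with {speech.ndim} dimensions")
```

With this in place, the call in the snippet would become `model(prepare_waveform(speech, rate))[0]`. Raising on a rate mismatch, instead of resampling silently, keeps the sketch free of extra dependencies.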
update arxiv
--- a/README.md
+++ b/README.md
@@ -91,6 +91,13 @@ print(pred)
 ### Citations
 
 ```BibTex
-@article{powsm
+@article{powsm,
+      title={POWSM: A Phonetic Open Whisper-Style Speech Foundation Model},
+      author={Chin-Jou Li and Kalvin Chang and Shikhar Bharadwaj and Eunjung Yeo and Kwanghee Choi and Jian Zhu and David Mortensen and Shinji Watanabe},
+      year={2025},
+      eprint={2510.24992},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2510.24992},
 }
 ```