Paraformer-zh-streaming
Streaming speech recognition: real-time transcription with low latency for Chinese.
Streaming variant of Paraformer that processes audio in chunks and delivers results in real time as audio arrives.
Quick Start
from funasr import AutoModel
model = AutoModel(
    model="funasr/paraformer-zh-streaming",
    hub="hf",
    device="cuda",
)

# Streaming inference with chunk processing
# [lookback, chunk, lookahead] in 60 ms frames: 600 ms chunk, 300 ms lookahead
chunk_size = [0, 10, 5]
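result = model.generate(
    input="audio.wav",
    chunk_size=chunk_size,
    encoder_chunk_look_back=4,  # past chunks visible to encoder self-attention
    decoder_chunk_look_back=1,  # past encoder chunks visible to decoder cross-attention
)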
print(result[0]["text"])
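
The call above passes a whole file at once. To feed audio incrementally, the upstream FunASR examples slice the waveform into fixed-size chunks and pass a shared `cache` dict plus an `is_final` flag on each call. The sketch below follows that pattern under several assumptions: 16 kHz input, 60 ms frames as in the `chunk_size` comment, and the `cache` and `is_final` keyword arguments behaving as in those upstream examples. The helper names `chunk_stride_samples`, `iter_chunks`, and `stream_transcribe` are hypothetical and not part of FunASR.

```python
FRAME_MS = 60  # frame duration assumed from the chunk_size comment


def chunk_stride_samples(chunk_size, sample_rate=16000):
    """Samples per chunk: chunk_size[1] frames of FRAME_MS milliseconds each."""
    return chunk_size[1] * FRAME_MS * sample_rate // 1000


def iter_chunks(speech, chunk_size, sample_rate=16000):
    """Yield (chunk, is_final) pairs that cover `speech` without overlap."""
    stride = chunk_stride_samples(chunk_size, sample_rate)
    total = max(1, -(-len(speech) // stride))  # ceiling division, at least one chunk
    for i in range(total):
        yield speech[i * stride:(i + 1) * stride], i == total - 1


def stream_transcribe(generate, speech, chunk_size=(0, 10, 5), **kwargs):
    """Feed `speech` to `generate` chunk by chunk and join the returned text.

    `generate` is expected to accept input, cache, is_final and chunk_size
    keyword arguments (assumed to match model.generate for streaming models).
    """
    cache = {}  # carries decoder state between calls
    pieces = []
    for chunk, is_final in iter_chunks(speech, chunk_size):
        res = generate(
            input=chunk,
            cache=cache,
            is_final=is_final,  # True only for the last chunk
            chunk_size=list(chunk_size),
            **kwargs,
        )
        if res and res[0].get("text"):
            pieces.append(res[0]["text"])
    return "".join(pieces)
```

With a loaded model and a 16 kHz waveform read via, for example, `soundfile.read("audio.wav")`, this could be called as `stream_transcribe(model.generate, speech, encoder_chunk_look_back=4, decoder_chunk_look_back=1)`. In a live setting the loop body would run as each chunk arrives from the microphone or socket rather than over a preloaded array.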
Features
- Real-time streaming recognition
- Low latency chunk-by-chunk processing
- Chinese + English support
- Compatible with WebSocket deployment
Model Details
| Property | Value |
|---|---|
| Architecture | Paraformer-Streaming |
| Parameters | 220M |
| Languages | Chinese, English |
| Mode | Streaming (online) |
| Sample Rate | 16kHz |
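
Since the table lists a 16 kHz sample rate, it can help to verify an input file before transcription. The following is a minimal sketch using only the standard-library `wave` module; the helper name `check_wav` is hypothetical, and the mono requirement is an assumption rather than something stated in this card.

```python
import wave

EXPECTED_RATE = 16000  # from the Model Details table


def check_wav(path, expected_rate=EXPECTED_RATE):
    """Return a list of problems found in a WAV file; empty if none."""
    problems = []
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != 1:  # assumption: the model expects mono audio
        problems.append(f"{channels} channels, expected mono")
    return problems
```

Files that fail the check would need to be resampled or downmixed with an audio tool before being passed to the model.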
Links
- GitHub: FunASR
- Docs: modelscope.github.io/FunASR
- Offline version: funasr/paraformer-zh