# whisper-tiny-he-acft
Hebrew Whisper Tiny with ACFT (Audio-Context Fine-Tuning) for optimized short-audio performance, compatible with FUTO Keyboard and whisper.cpp.
## Training

Two-stage pipeline:

1. Fine-tune: openai/whisper-tiny on ivrit-ai/whisper-training (~400 h of Hebrew) → amitkot/whisper-tiny-he
2. ACFT: the fine-tuned model on google/fleurs (he_il) using FUTO-aligned ACFT (partial encoder with truncated positional embeddings, 8 epochs, batch_size=1)

- Hardware: Apple M4 (MPS)
- Method: distillation-based; teaches the model to handle short audio contexts without repeating itself
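The "truncated positional embeddings" step above amounts to keeping only the leading rows of the encoder's positional table, so the encoder sees a shorter audio context during training. A minimal numpy sketch of that truncation (the row/second arithmetic follows the standard Whisper architecture; the function name is illustrative, not from the pipeline):

```python
import numpy as np

# Whisper's encoder represents 30 s of audio as 1500 positional rows
# (3000 mel frames, downsampled 2x by the conv stem) -> 50 rows per second.
ROWS_PER_SECOND = 50
D_MODEL = 384  # hidden size of whisper-tiny

def truncate_positions(pos_emb: np.ndarray, context_seconds: float) -> np.ndarray:
    """Keep only the leading rows of the positional table for a shorter context."""
    rows = int(context_seconds * ROWS_PER_SECOND)
    return pos_emb[:rows]

pos_emb = np.random.randn(30 * ROWS_PER_SECOND, D_MODEL)  # full 30 s table
short = truncate_positions(pos_emb, 10)  # 10-second training context
print(short.shape)  # (500, 384)
```

Training on these shortened contexts is what lets the model transcribe brief utterances without the repetition loops that stock Whisper produces on heavily padded short audio.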
## Usage

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("amitkot/whisper-tiny-he-acft")
model = WhisperForConditionalGeneration.from_pretrained("amitkot/whisper-tiny-he-acft")
```
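A short end-to-end sketch of transcription with the loaded model, using one second of silence as a stand-in for real audio. This assumes the checkpoint is reachable on the Hub and a recent transformers version whose `generate` accepts `language`/`task` keywords; with real speech you would load a 16 kHz waveform instead of zeros:

```python
import numpy as np
from transformers import WhisperProcessor, WhisperForConditionalGeneration

model_id = "amitkot/whisper-tiny-he-acft"
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id)

# One second of silence at Whisper's expected 16 kHz sample rate.
audio = np.zeros(16_000, dtype=np.float32)
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")

# Force Hebrew transcription so the model does not language-detect on silence.
ids = model.generate(inputs.input_features, language="he", task="transcribe")
text = processor.batch_decode(ids, skip_special_tokens=True)
print(text)
```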
For FUTO Keyboard / whisper.cpp, convert the model to ggml format.

## Training pipeline

Trained using whisper-acft-pipeline:

```shell
uv run python scripts/pipeline.py \
  --finetune-config configs/hebrew_tiny_finetune.yaml \
  --config configs/hebrew_tiny_acft.yaml
```
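The card mentions converting to ggml for FUTO Keyboard / whisper.cpp but does not show the command. A hedged sketch using whisper.cpp's bundled converter (all paths are placeholders; the script also needs a local clone of the openai/whisper repo for its tokenizer and mel-filter assets):

```shell
# From a checkout of ggerganov/whisper.cpp; paths below are hypothetical.
python models/convert-h5-to-ggml.py \
  /path/to/whisper-tiny-he-acft \   # downloaded Hugging Face checkpoint
  /path/to/openai-whisper-repo \    # clone of github.com/openai/whisper
  ./models                          # output directory for the ggml file
```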
## See also

- amitkot/whisper-tiny-he: base fine-tuned model (before ACFT)
- FUTO whisper-acft: ACFT method reference