TTS, VC
Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like
Paper
• 2402.07383
• Published
• 16
Matcha-TTS: A fast TTS architecture with conditional flow matching
Paper
• 2309.03199
• Published
• 15
Natural language guidance of high-fidelity text-to-speech with synthetic annotations
Paper
• 2402.01912
• Published
• 13
Fast Timing-Conditioned Latent Audio Diffusion
Paper
• 2402.04825
• Published
• 8
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper
• 2404.14700
• Published
• 32
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
Paper
• 2406.02430
• Published
• 38
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes
Paper
• 2406.02897
• Published
• 16
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
Paper
• 2406.05370
• Published
• 17
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS
Paper
• 2406.18009
• Published
• 22
Towards Robust Speech Representation Learning for Thousands of Languages
Paper
• 2407.00837
• Published
• 11
Autoregressive Speech Synthesis without Vector Quantization
Paper
• 2407.08551
• Published
• 17
Paper
• 2407.14358
• Published
• 26
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Paper
• 2407.14329
• Published
• 5
Paper
• 2407.15595
• Published
• 14
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis
Paper
• 2407.09732
• Published
• 10
Qwen2-Audio Technical Report
Paper
• 2407.10759
• Published
• 64
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer
Paper
• 2308.06873
• Published
• 28
MulliVC: Multi-lingual Voice Conversion With Cycle Consistency
Paper
• 2408.04708
• Published
• 8
Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos
Paper
• 2408.10998
• Published
• 9
Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization
Paper
• 2408.08019
• Published
• 11
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
Paper
• 2408.07547
• Published
• 9
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
Paper
• 2408.14608
• Published
• 8
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Paper
• 2407.02687
• Published
• 24
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
Paper
• 2408.16725
• Published
• 53