Manipulate images by dragging points
Generate realistic audio from text
Generate audio from text using the VITS model