A neural model for text-to-speech synthesis in Latvian. Trained using VITS on a 20-hour speech corpus of audiobooks read in a male voice. Currently released for research purposes only.
A neural model for text-to-speech (TTS) synthesis in Latvian. Trained using VITS on a 25-hour speech corpus of audiobooks read in a male voice. Available for academic and non-commercial purposes via an API. To get access to the API, please send a request to info@ailab.lv.