305 points by mfiguiere 73 days ago | 179 comments
Zonos – Apache 2.0 licensed, Multilingual, Text to Speech model(zyphra.com) We are excited to announce the release of Zonos-v0.1 beta, featuring two expressive and real-time text-to-speech (TTS) models with high-fidelity voice cloning. We are releasing our 1.6B transformer and 1.6B hybrid under an Apache 2.0 license.
PlayAI's new Dialog model achieves 3:1 preference in human evals(play.ht) PlayAI’s Dialog Text-to-Speech model is now in general availability, bringing multilingual capabilities and exceptional performance to applications requiring emotive, human-like speech. In recent third-party benchmark tests, Dialog was preferred by 10:1 vs. ElevenLabs v2.5 Turbo, and by over 3:1 vs. ElevenLabs Multilingual v2.0. Play the video below to find out what it sounds like, or visit our AI voiceover Studio to try it for yourself.
Edge TTS(github.com/rany2) edge-tts is a Python module that allows you to use Microsoft Edge's online text-to-speech service from within your Python code or using the provided edge-tts or edge-playback command.
420 points by csantini 101 days ago | 246 comments
MathReader: Text-to-Speech for Mathematical Documents [pdf](arxiv.org) TTS (Text-to-Speech) document readers from Microsoft, Adobe, Apple, and OpenAI are in service worldwide. They provide relatively good TTS results for general plain text, but sometimes skip content or give unsatisfactory results for mathematical expressions.
A CC-By Open-Source TTS Model with Voice Cloning(huggingface.co) OuteTTS-0.1-350M is a novel text-to-speech synthesis model that leverages pure language modeling without external adapters or complex architectures. Built upon the LLaMa architecture using our Oute3-350M-DEV base model, it demonstrates that high-quality speech synthesis is achievable through a straightforward approach using crafted prompts and audio tokens.