Hacker News with Generative AI: Text-to-Speech

ElevenReader (elevenreader.io)
Bring any book, article, PDF, newsletter, or text to life with ultra realistic AI narration in one app
Zonos – Apache 2.0 licensed, Multilingual, Text to Speech model (zyphra.com)
We are excited to announce the release of Zonos-v0.1 beta, featuring two expressive and real-time text-to-speech (TTS) models with high-fidelity voice cloning. We are releasing our 1.6B transformer and 1.6B hybrid under an Apache 2.0 license.
PlayAI's new Dialog model achieves 3:1 preference in human evals (play.ht)
PlayAI’s Dialog Text-to-Speech model is now in general availability, bringing multilingual capabilities and exceptional performance to applications requiring emotive, human-like speech. In recent third-party benchmark tests, Dialog was preferred by 10:1 vs. ElevenLabs v2.5 Turbo, and by over 3:1 vs. ElevenLabs Multilingual v2.0.
Show HN: Voice Cloning and Multilingual TTS in One Click (Windows) (github.com/abus-aikorea)
Voice-Pro is a cutting-edge AI-powered web application designed to revolutionize multimedia content processing.
Edge TTS (github.com/rany2)
edge-tts is a Python module that allows you to use Microsoft Edge's online text-to-speech service from within your Python code or using the provided edge-tts or edge-playback command.
Generate audiobooks from E-books with Kokoro-82M (claudio.uk)
Kokoro v0.19 is a recently published text-to-speech model with just 82M params and very high-quality output.
MathReader: Text-to-Speech for Mathematical Documents [pdf] (arxiv.org)
TTS (Text-to-Speech) document readers from Microsoft, Adobe, Apple, and OpenAI are in service worldwide. They provide relatively good TTS results for general plain text, but sometimes skip content or give unsatisfactory results for mathematical expressions.
Show HN: New Cartesia Text-to-Speech Model (cartesia.ai)
Real-time multimodal intelligence for every device
Play Dialog: A contextual turn-taking TTS model like NotebookLM Playground (play.ai)
A CC-By Open-Source TTS Model with Voice Cloning (huggingface.co)
OuteTTS-0.1-350M is a novel text-to-speech synthesis model that leverages pure language modeling without external adapters or complex architectures. Built upon the LLaMa architecture using our Oute3-350M-DEV base model, it demonstrates that high-quality speech synthesis is achievable through a straightforward approach using crafted prompts and audio tokens.
Show HN: I made a tutorial of how to use free edge TTS API with deno.js [video] (youtube.com)
Play 3.0 mini – A lightweight, reliable, cost-efficient Multilingual TTS model (play.ht)
Today we’re releasing our most capable and conversational voice model that can speak in 30+ languages using any voice or accent, with industry-leading speed and accuracy. We’re also releasing 50+ new conversational AI voices across languages.
Free Text-to-Speech App with natural voices (elevenlabs.io)
Show HN: Using AI to generate custom sounds from text (image-effects.com)
Show HN: I generated 70k audiobooks with OpenAI Text-to-Speech (listenly.io)
Coqui.ai TTS: A Deep Learning Toolkit for Text-to-Speech (github.com/coqui-ai)
Show HN: MARS5, open-source, insanely prosodic TTS model (github.com/Camb-ai)
ChatTTS-Best open source TTS Model (github.com/2noise)
Show HN: PaperTube – Turn YouTube Videos into Kindle-Ready Articles (papertube.site)
Show HN: Affordable text-to-speech for long-form content (audiowaveai.com)
ElevenLabs Music (twitter.com)