Hacker News with Generative AI: Language Modeling

Jargonic: Industry-Tunable ASR Model (aiola.ai)
Automatic Speech Recognition (ASR) has made significant strides over the last decade, but most ASR models on the market offer general-purpose transcription. They perform well in clean, controlled environments but break down when handling:
A CC-BY Open-Source TTS Model with Voice Cloning (huggingface.co)
OuteTTS-0.1-350M is a novel text-to-speech synthesis model that leverages pure language modeling without external adapters or complex architectures. Built upon the LLaMa architecture using our Oute3-350M-DEV base model, it demonstrates that high-quality speech synthesis is achievable through a straightforward approach using crafted prompts and audio tokens.
Have we stopped to think about what LLMs model? (theregister.com)