Send Data with Sound(github.com/solst-ice) This application allows you to transmit and receive data through sound. It uses a simple encoding scheme to convert text into audio frequencies, which can be played through your speakers and picked up by a microphone.
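As a rough illustration of the idea, here is a minimal sketch that maps each byte of a text to one sine tone and recovers it by finding the strongest tone in each time slot. The project's actual encoding scheme is not described here, so everything in the sketch is an assumption: the one-tone-per-byte mapping, the constants `SAMPLE_RATE`, `SYMBOL_SAMPLES`, `BASE_HZ`, `STEP_HZ`, and the helper names `encode`, `tone_power`, and `decode`. It works on an in-memory sample list only and does not touch a speaker or microphone.

```python
import math

# All constants below are illustrative assumptions, not the project's values.
SAMPLE_RATE = 16000    # samples per second
SYMBOL_SAMPLES = 800   # samples per tone (0.05 s at 16 kHz)
BASE_HZ = 1000.0       # frequency used for byte value 0
STEP_HZ = 20.0         # spacing between adjacent byte values


def encode(text):
    """Turn text into a list of audio samples, one sine tone per UTF-8 byte."""
    samples = []
    for byte in text.encode("utf-8"):
        freq = BASE_HZ + STEP_HZ * byte
        samples.extend(
            math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(SYMBOL_SAMPLES)
        )
    return samples


def tone_power(chunk, freq):
    """Correlate a chunk with a sine and cosine at freq and return the power."""
    re = sum(s * math.cos(2 * math.pi * freq * i / SAMPLE_RATE)
             for i, s in enumerate(chunk))
    im = sum(s * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
             for i, s in enumerate(chunk))
    return re * re + im * im


def decode(samples):
    """Recover text by picking the strongest candidate tone in each slot."""
    out = bytearray()
    for start in range(0, len(samples) - SYMBOL_SAMPLES + 1, SYMBOL_SAMPLES):
        chunk = samples[start:start + SYMBOL_SAMPLES]
        best = max(range(256),
                   key=lambda b: tone_power(chunk, BASE_HZ + STEP_HZ * b))
        out.append(best)
    return out.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode(encode("Hi!")))
```

The 20 Hz spacing is chosen so that every candidate tone completes a whole number of cycles in one 0.05 s slot, which keeps the tones from leaking into each other in this noise-free sketch. A real acoustic link would also need synchronization and tolerance to noise and echo.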
INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations(grisoon.github.io) We present INFP, an audio-driven interactive head generation framework for dyadic conversations. Given the dual-track audio of a dyadic conversation and a single portrait image of an arbitrary agent, our framework can dynamically synthesize verbal, non-verbal, and interactive agent videos with lifelike facial expressions and rhythmic head pose movements. Additionally, our framework is lightweight yet powerful, making it practical in instant communication scenarios such as video conferencing. INFP denotes that our method is Interactive, Natural, Flash and Person-generic.
Fish Speech 1.5(github.com/fishaudio) This codebase and all models are released under the CC-BY-NC-SA-4.0 License. Please refer to LICENSE for more details.
314 points by thunderbong 180 days ago | 64 comments
Hertz-dev, the first open-source base model for conversational audio(si.inc) For the last few months, we at Standard Intelligence have focused on fundamental research on the frontier of audio-only speech generation. We're excited to announce that we're open-sourcing current checkpoints of our full-duplex, audio-only transformer base model, hertz-dev, with a total of 8.5 billion parameters.