Moshi: A speech-text foundation model for real-time dialogue (github.com/kyutai-labs)
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework.