EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer (haidog-yaqub.github.io)
EzAudio is a text-to-audio (T2A) generation model that creates high-quality audio from text prompts. It sets a new standard for open-source T2A models by generating realistic sound effects quickly and efficiently.