Hacker News with Generative AI: Video Generation

Fast Video Generation with Sliding Tile Attention (hao-ai-lab.github.io)
TL;DR: Video generation with DiTs is painfully slow – HunyuanVideo takes 16 minutes to generate just a 5-second video on an H100 with FlashAttention-3. Our sliding tile attention (STA) slashes this to 5 minutes with zero quality loss, no extra training required. Specifically, STA accelerates attention alone by 2.8–17x over FlashAttention-2 and 1.6–10x over FlashAttention-3.
Goku Flow Based Video Generative Foundation Models (github.com/Saiyan-World)
Goku is a new family of joint image-and-video generation models based on rectified flow Transformers.
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation (hila-chefer.github.io)
Despite tremendous recent progress, generative video models still struggle to capture real-world motion, dynamics, and physics.
Veo 2: Our video generation model (deepmind.google)
Veo creates videos with realistic motion and high quality output, up to 4K. Explore different styles and find your own with extensive camera controls.
Veo and Imagen 3: Announcing new video and image generation models on Vertex AI (cloud.google.com)
Generative AI is leading to real business growth and transformation. Among enterprise companies with gen AI in production, 86% report an increase in revenue, with an estimated 6% growth. That’s why Google is investing in its AI technology with new models like Veo, our most advanced video generation model, and Imagen 3, our highest quality image generation model.
OpenAI's Sora has been leaked (techcrunch.com)
A group appears to have leaked access to Sora, OpenAI’s video generator, in protest of what they’re calling duplicity and “art washing” on OpenAI’s part.
The Matrix: a foundation world model for generating infinite-length videos (twitter.com)
Mochi 1 preview. A new SOTA in open-source video generation. Apache 2.0 (twitter.com)
Sora-like text-to-video model from Chinese startup Minimax, 10 examples (twitter.com)
CogVideoX: A Cutting-Edge Video Generation Model (medium.com)
Show HN: AnimeGenAi – AI-powered anime style image and video generator (animegenai.com)
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation (stevenlsw.github.io)
Open-Sora does pretty good video generation on consumer GPUs (backprop.co)
Gen-3 Alpha: A New Frontier for Video Generation (runwayml.com)
Highly realistic talking head video generation (github.com/fudan-generative-vision)
Google announces Veo, their Sora competing text/image-to-video model (aitestkitchen.withgoogle.com)
StoryDiffusion: Long-range image and video generation (storydiffusion.github.io)
China's VIDU Video Generation AI Competes with OpenAI's Sora (medium.com)