Hacker News with Generative AI: AI Models

Omnivision-968M: Vision Language Model with 9x Tokens Reduction for Edge Devices (nexa.ai)
Show HN: A real time AI video agent with under 1 second of latency (ycombinator.com)
Hey, it’s Hassaan & Quinn – co-founders of Tavus, an AI research company and developer platform for video APIs. We’ve been building AI video models for ‘digital twins’ or ‘avatars’ since 2020.
Llama can now see and run on your device – welcome Llama 3.2 (huggingface.co)
Llama 3.2 is out! Today we welcome the next iteration of the Llama collection to Hugging Face. This time, we’re excited to collaborate with Meta on the release of multimodal and small models. Ten open-weight models (5 multimodal models and 5 text-only ones) are available on the Hub.
Two new Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more (googleblog.com)
Pixtral 12B (mistral.ai)
Pixtral 12B - the first-ever multimodal Mistral model. Apache 2.0.
Smaller Gemini 1.5 Flash-8B, stronger Gemini 1.5 Pro, improved Gemini 1.5 Flash (twitter.com)
Gemini Pro 1.5 experimental "version 0801" available for early testing (deepmind.google)
Llama 3.1: 405B, the largest openly available model released (github.com/meta-llama)
OpenAI is releasing GPT-4o Mini, a cheaper, smarter model (theverge.com)
My finetuned models beat OpenAI's GPT-4 (mlops.systems)
OpenPipe Mixture of Agents: Outperform GPT-4 at 1/25th the Cost (openpipe.ai)
Snowflake releases a flagship generative AI model of its own (techcrunch.com)
Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat (huggingface.co)