Hacker News with Generative AI: AI Models

An upgraded dev experience in Google AI Studio (googleblog.com)
Google AI Studio is the fastest place to start building with the Gemini API, with access to our most capable models, including Gemini 2.5 preview models, and generative media models like Imagen, Lyria RealTime, and Veo. At Google I/O, we announced new features to help you build and deploy complete applications, new model capabilities, and new features in the Google Gen AI SDK.

Generative AI, Machine Learning, Software Development, AI Models

197 points by meetpateltech 430 days ago | 110 comments

LTXVideo 13B AI video generation (ltxv.video)
A groundbreaking 13B-parameter AI model by Lightricks, revolutionizing video creation with unprecedented speed and quality. 30x faster than comparable models, powered by advanced multiscale rendering technology.

Artificial Intelligence, Video Generation, Technology, AI Models

216 points by zoudong376 441 days ago | 64 comments

How to reverse engineer AI models: a study on Google Photos (skyld.io)
Google Photos is one of the most widely-used photo management applications globally, pre-installed on almost every Android device running Google Mobile Services (GMS). It is appreciated by users because it offers powerful features like “Magic Eraser” and advanced AI-powered photo editing tools. Of course, Google doesn’t open-source its AI models to keep its competitive edge.

Reverse Engineering, AI Models, Google Photos, Image Processing

10 points by superkitten 452 days ago | 6 comments

TinyChat15M: 15M param conversational model designed to run with 60 MB RAM (github.com/starhopp3r)
TinyChat15M is a 15-million parameter conversational language model built on the Meta Llama 2 architecture.

Conversational AI, Open Source, AI Models, Resources

4 points by klaussilveira 460 days ago | 0 comments

Gemini Advanced is free for college students through finals 2026 (gemini.google)
Students get free access to our best AI model with Gemini Advanced. Prep for your exams, perfect your writing, and tackle your homework with the best of Google AI: Gemini Advanced, NotebookLM Plus, Whisk. Plus free 2TB of storage. Available in the US only. Sign up by June 30, 2025.

Generative AI, Education, Google, AI Models, Students

3 points by vyrotek 464 days ago | 0 comments

Docker Model Runner (docker.com)
Generative AI is transforming software development, but building and running AI models locally is still harder than it should be.

Generative AI, Software Development, Docker, AI Models

100 points by kordlessagain 471 days ago | 47 comments

Show HN: Comparing product rankings by OpenAI, Anthropic, and Perplexity (productrank.ai)

Generative AI, Product Rankings, AI Models

125 points by the1024 472 days ago | 35 comments

Meta got caught gaming LMArena (theverge.com)
With Llama 4, Meta fudged benchmarks to appear as though its new AI model is better than the competition.

Artificial Intelligence, Benchmarking, AI Models, Meta

11 points by weavedfreedunes 474 days ago | 2 comments

Amazon introduces Nova Chat (aboutamazon.com)
Amazon makes it easier for developers and tech enthusiasts to explore Amazon Nova, its advanced Gen AI models.

Amazon, Generative AI, AI Models, Cloud Computing, Tech

78 points by ao98 481 days ago | 54 comments

DeepSeek releases their latest DeepSeek v3 model, now featuring an MIT license (simonwillison.net)
Chinese AI lab DeepSeek just released the latest version of their enormous DeepSeek v3 model, baking the release date into the name DeepSeek-V3-0324.

Generative AI, Open Source, AI Models

34 points by spenvo 488 days ago | 1 comment

Google calls Gemma 3 the most powerful AI model you can run on one GPU (theverge.com)
A little over a year after releasing two “open” Gemma AI models built from the same technology behind its Gemini AI, Google is updating the family with Gemma 3.

Generative AI, Google, AI Models, Computer Hardware

127 points by gmays 492 days ago | 100 comments

voyage-3-large (voyageai.com)
TL;DR – Introducing voyage-3-large, a new state-of-the-art general-purpose and multilingual embedding model that ranks first across eight evaluated domains spanning 100 datasets, including law, finance, and code. It outperforms OpenAI-v3-large and Cohere-v3-English by an average of 9.74% and 20.71%, respectively. Enabled by Matryoshka learning and quantization-aware training, voyage-3-large supports smaller dimensions and int8 and binary quantization that dramatically reduce vectorDB costs with minimal impact on retrieval quality.

Generative AI, AI Models, Machine Learning

3 points by fzliu 496 days ago | 0 comments

Gemma3 – The current strongest model that fits on a single GPU (ollama.com)
Gemma is a lightweight family of models from Google built on Gemini technology. The Gemma 3 models are multimodal—processing text and images—and feature a 128K context window with support for over 140 languages. Available in 1B, 4B, 12B, and 27B parameter sizes, they excel in tasks like question answering, summarization, and reasoning, while their compact design allows deployment on resource-limited devices.

Generative AI, AI Models, Multimodal AI, Computer Vision

252 points by brylie 500 days ago | 138 comments

Evaluating Mistral OCR Against Gemini 2.0 Flash (reducto.ai)
Today, Mistral AI released a new OCR model, claiming to be state-of-the-art (SOTA) on unreleased benchmarks. We decided to put the model to the test.

OCR, Generative AI, AI Models, Benchmarking, Technology

15 points by raunakchowdhuri 506 days ago | 0 comments

Amazon says that Alexa+ is 'model agnostic' (techcrunch.com)
Amazon says that the new and improved Alexa unveiled on Wednesday, Alexa+, is powered by a “model agnostic” system that’s always using the “best” AI model for any given task.

Amazon, Artificial Intelligence, AI Models

5 points by marban 514 days ago | 3 comments

Claude 3.7 Sonnet and Claude Code (anthropic.com)
Today, we’re announcing Claude 3.7 Sonnet, our most intelligent model to date and the first hybrid reasoning model on the market.

Generative AI, AI Models, Language Models, Reasoning

2127 points by bakugo 516 days ago | 963 comments

How to Run DeepSeek R1 Distilled Reasoning Models on RyzenAI and Radeon GPUs (guru3d.com)

Deep Learning, Computer Hardware, GPUs, AI Models

82 points by waltercool 539 days ago | 22 comments

DeepSeek R1: Don't Put All Your Eggs in One LLM Basket (notdiamond.ai)
Over the last week, the world has been on fire because of DeepSeek’s new R1 reasoning model. But the stock predictions surrounding R1 don’t matter, and neither do the conspiracy theories. Even the model itself—while impressive—doesn’t really matter. DeepSeek R1 really matters because it means the number of frontier AI models is about to explode.

Artificial Intelligence, Technology, AI Models

10 points by achompas 540 days ago | 0 comments

Sky-T1: Train your own O1 preview model within $450 (novasky-ai.github.io)
We introduce Sky-T1-32B-Preview, our reasoning model that performs on par with o1-preview on popular reasoning and coding benchmarks. Remarkably, Sky-T1-32B-Preview was trained for less than $450, demonstrating that it is possible to replicate high-level reasoning capabilities affordably and efficiently. All code is open-source.

Generative AI, Open Source, AI Models, Cost-Effectiveness

44 points by fofoz 558 days ago | 5 comments

Notes on the New Deepseek v3 (composio.dev)
DeepSeek released its flagship model, v3, a 671B mixture-of-experts model with 37B active parameters. Currently, it is the best open-source model, beating Llama 3.1 405B, Qwen, and Mistral. On benchmarks, it is on par with OpenAI GPT-4o and Claude 3.5 Sonnet. It is the first such model to perform on par with, and on some tasks better than, the big closed models.

Generative AI, Open Source, Benchmarking, AI Models

99 points by soham123 569 days ago | 24 comments

We fine-tuned Llama and got 4.2x Sonnet 3.5 accuracy for code generation (finecodex.com)

Generative AI, AI Models, Code Generation, Accuracy

136 points by banddk 573 days ago | 77 comments

You don't need to pay for OpenAI - Gemini 2.0 Free & More (github.com/EliasPereirah)
Orion is a web-based chat interface that simplifies interactions with multiple AI model providers.

OpenAI, AI Models, Free Software, Chatbots, Web Applications

41 points by singularis 580 days ago | 7 comments

Google releases its own 'reasoning' AI model (techcrunch.com)
Google has released what it’s calling a new “reasoning” AI model — but it’s in the experimental stages, and from our brief testing, there’s certainly room for improvement.

Artificial Intelligence, Google, AI Models

5 points by kjhughes 583 days ago | 0 comments

Show HN: Anthropic's MCP Server Directory (glama.ai)
Model Context Protocol (MCP) is an open protocol that enables AI models to interact with local and remote resources through standardized server implementations.

Generative AI, Open Source, Software, AI Models

11 points by punkpeye 585 days ago | 6 comments

Gemini 2.0: our new AI model for the agentic era (google)
Google DeepMind introduces Gemini 2.0, a new AI model designed for the "agentic era."

Generative AI, Artificial Intelligence, Google, AI Models

1015 points by meetpateltech 591 days ago | 490 comments

32k context length text embedding models (voyageai.com)
TL;DR – We are excited to announce voyage-3 and voyage-3-lite embedding models, advancing the frontier of retrieval quality, latency, and cost. voyage-3 outperforms OpenAI v3 large by 7.55% on average across all evaluated domains, including code, law, finance, multilingual, and long-context, with 2.2x lower costs and 3x smaller embedding dimension, resulting in 3x lower vectorDB costs. voyage-3-lite offers 3.82% better retrieval accuracy than OpenAI v3 large while costing 6x less and having 6x smaller embedding dimension.

Generative AI, Language Models, Text Embedding, AI Models, Cost Efficiency

101 points by fzliu 609 days ago | 32 comments

Omnivision-968M: Vision Language Model with 9x Tokens Reduction for Edge Devices (nexa.ai)

Computer Vision, Edge Computing, AI Models, Tokenization

69 points by BUFU 618 days ago | 12 comments

Show HN: A real time AI video agent with under 1 second of latency (ycombinator.com)
Hey it’s Hassaan & Quinn – co-founders of Tavus, an AI research company and developer platform for video APIs. We’ve been building AI video models for ‘digital twins’ or ‘avatars’ since 2020.

Artificial Intelligence, Video, Real-Time, AI Models

455 points by hassaanr 662 days ago | 256 comments

Llama can now see and run on your device – welcome Llama 3.2 (huggingface.co)
Llama 3.2 is out! Today we welcome the next iteration of the Llama collection to Hugging Face. This time, we’re excited to collaborate with Meta on the release of multimodal and small models. Ten open-weight models (5 multimodal models and 5 text-only ones) are available on the Hub.

Generative AI, AI Models, New Releases, Computer Vision

26 points by nitramm 668 days ago | 1 comment

Two new Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more (googleblog.com)

Google, AI Models, Pricing, Updates

193 points by meetpateltech 669 days ago | 137 comments