Hacker News with Generative AI: AI

Tenstorrent Launches Blackhole Developer Products at Tenstorrent Dev Day (tenstorrent.com)
Tenstorrent launched the next generation Blackhole™ chip family today at their DevDay event in San Francisco.
Show HN: MCP Server to let agents control the browser (github.com/Skyvern-AI)
Skyvern's MCP server implementation helps connect your AI applications to the browser. This allows your AI applications to do things like fill out forms, download files, research information on the web, and more.
How to write good prompts for generating code from LLMs (github.com/potpie-ai)
Large Language Models (LLMs) have revolutionized code generation, but to get high-quality, useful output, creating effective prompts is crucial.
I stopped using AI code editors (lucianonooijen.com)
In late 2022, I used AI tools for the first time, even before the first version of ChatGPT. In 2023, I started using AI-based tools in my development workflow. Initially, I was super impressed with the capabilities of these LLMs. The fact that I could just copy and paste obscure compiler errors along with the C++ source code, and be told where the error is caused felt like magic.
Firefox 137.0 released with vertical tabs (mozilla.org)
Firefox’s new sidebar lets you move tabs to the side, pin key sites and keep your AI assistant handy.
Search-R1: Training LLMs to Reason and Leverage Search Engines with RL (arxiv.org)
Efficiently acquiring external knowledge and up-to-date information is essential for effective reasoning and text generation in large language models (LLMs).
GitSummarize – Generate living documentation from any GitHub repo (gitsummarize.com)
Turn any GitHub repository into a comprehensive AI-powered documentation hub.
Show HN: Arrakis – Open-source, self-hostable sandboxing service for AI Agents (github.com/abshkbh)
AI agents can generate malicious or buggy code that can attack the host system it runs on.
Wikipedia is struggling with voracious AI bot crawlers (engadget.com)
Wikimedia has seen a 50 percent increase in bandwidth used for downloading multimedia content since January 2024, the foundation said in an update. But it's not because human readers have suddenly developed a voracious appetite for consuming Wikipedia articles and for watching videos or downloading files from Wikimedia Commons. No, the spike in usage came from AI crawlers, or automated programs scraping Wikimedia's openly licensed images, videos, articles and other files to train generative artificial intelligence models.
We need a better term for GenAI output – "slop" is too benign (rockpapershotgun.com)
Earlier this month, Snail Games put out a widely and justifiably clowned-on genAI trailer for Ark: Survival Evolved's Aquatica DLC.
Kai Scheduler: Kubernetes Native scheduler for AI workloads at large scale (github.com/NVIDIA)
KAI Scheduler is a robust, efficient, and scalable Kubernetes scheduler that optimizes GPU resource allocation for AI and machine learning workloads.
Show HN: VaporVibe – auto-generate video demos for vibe-coded projects (influme.ai)
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad (arxiv.org)
Recent math benchmarks for large language models (LLMs) such as MathArena indicate that state-of-the-art reasoning models achieve impressive performance on mathematical competitions like AIME, with the leading model, o3-mini, achieving scores comparable to top human competitors.
Text to Bark from ElevenLabs [video] (youtube.com)
Why MCP Is Mostly Bullshit (lycee.ai)
If you follow the AI space closely, you’ve surely noticed the increased interest in MCP (Model Context Protocol).
How AI is creating a rift at McKinsey, Bain, and BCG (the-ken.com)
Clients’ increasing access to AI tools is transforming the way consulting firms operate.
Show HN: Qwen-2.5-32B is now the best open source OCR model (github.com/getomni-ai)
A benchmarking tool that compares the OCR and data extraction capabilities of different large multimodal models such as gpt-4o, evaluating both text and JSON extraction accuracy. The goal is to publish a comprehensive benchmark of OCR accuracy across traditional OCR providers and multimodal language models. The evaluation dataset and methodologies are all open source, and we encourage expanding this benchmark to encompass any additional providers.
Dual RTX 5090 Beats $25,000 H100 in Real-World LLM Performance (hardware-corner.net)
AI enthusiasts looking for top-tier performance in local LLMs have long considered NVIDIA’s H100 to be the gold standard for inference, thanks to its high-bandwidth HBM3 memory and optimized tensor cores. However, recent benchmarks show that a dual RTX 5090 setup, while still pricey, outperforms the H100 in sustained output token generation, making it an ideal choice for those seeking the best possible performance for home use, especially for models up to 70B parameters.
NaNoWriMo shut down after AI, content moderation scandals (techcrunch.com)
NaNoWriMo, a 25-year-old online writing community-turned-nonprofit, announced on Monday evening that it is shutting down.
LLM providers on the cusp of an 'extinction' phase as capex realities bite (theregister.com)
Gartner says the market for large language model (LLM) providers is on the cusp of an extinction phase as it grapples with the capital-intensive costs of building products in a competitive market.
OpenAI plans to release a new 'open' AI language model in the coming months (techcrunch.com)
OpenAI says that it intends to release its first “open” language model since GPT‑2 “in the coming months.”
Show HN: Neuronpedia, an open source platform for AI interpretability (neuronpedia.org)
Neuronpedia is an open source interpretability platform.
Ask HN: What's the best way to get started with LLM-assisted programming? (ycombinator.com)
Currently, I use Perplexity or ChatGPT via the web prompt for small coding tasks, but sometimes I'll use Ollama. Stuff like writing a shell script to perform some task, or maybe a small Python function. I'd like to get to the next level, but I don't know where to start.
Aim: Supercharged open-source experiment tracker (github.com/aimhubio)
Aim logs your training runs and any AI metadata, provides a UI to compare and observe them, and offers an API to query them programmatically.
Launch HN: Augento (YC W25) – Fine-tune your agents with reinforcement learning (ycombinator.com)
Hi HN, we’re the cofounders of Augento (https://augento.ai/). We’re building Deepseek R1-like fine-tuning as a service. You connect your agent, tell us when it’s right or wrong, and we deliver an LLM optimized for that agent.
DeepSeek surpasses ChatGPT in new monthly visits (indiatimes.com)
There is no Vibe Engineering (serce.me)
You've probably heard about "vibe coding" by now. The term was recently coined by Andrej Karpathy in his tweet. Andrej defines Vibe Coding as "a new kind of coding, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists". The key difference between vibe coding and normal coding is that the engineer doesn’t interact with the codebase directly, and instead converses with the agent and inspects the final outcome.
Show HN: AI-powered reading companion that helps you read hard books (collabai.live)
AbletonMCP – Ableton Live Model Context Protocol Integration (github.com/ahujasid)
AbletonMCP connects Ableton Live to Claude AI through the Model Context Protocol (MCP), allowing Claude to directly interact with and control Ableton Live. This integration enables prompt-assisted music production, track creation, and Live session manipulation.
Show HN: I built an open-source NotebookLM alternative using Morphik (github.com/morphik-org)
Morphik is an open-source database designed for AI applications that simplifies working with unstructured data. It provides advanced RAG (Retrieval Augmented Generation) capabilities with multi-modal support, knowledge graphs, and intuitive APIs.