Hacker News with Generative AI: AI

GitHub Copilot Pro+ (github.blog)
Today, we’re introducing GitHub Copilot Pro+, a new individual tier for developers who want to take their coding experience to the next level.
DOGE's AI Push at the Department of Veterans Affairs (wired.com)
Elon Musk’s so-called Department of Government Efficiency (DOGE) has been clear about its plans to fire tens of thousands of employees at the Department of Veterans Affairs. New WIRED reporting sheds light on the specific DOGE operatives at the VA and the ways they’re trying to infiltrate and drastically change the agency.
Microsoft 50th Anniversary (microsoft.com)
Microsoft empowers you with AI
Microsoft employee disrupts 50th anniversary and calls AI boss 'war profiteer' (theverge.com)
A Microsoft employee disrupted the company’s 50th anniversary event to protest its use of AI.
GitHub Copilot rolls out Agent mode and MCP for VS Code (github.blog)
In celebration of Microsoft's 50th anniversary, we're rolling out Agent Mode with MCP support to all VS Code users. We are also announcing the new GitHub Copilot Pro+ plan with premium requests, the general availability of models from Anthropic, Google, and OpenAI, next edit suggestions for code completions, and the Copilot code review agent.
Tenstorrent Launches Blackhole Developer Products at Tenstorrent Dev Day (tenstorrent.com)
Tenstorrent launched the next generation Blackhole™ chip family today at their DevDay event in San Francisco.
Show HN: MCP Server to let agents control the browser (github.com/Skyvern-AI)
Skyvern's MCP server implementation helps connect your AI applications to the browser. This allows your AI applications to do things like fill out forms, download files, research information on the web, and more.
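To make the idea concrete, here is a minimal, hypothetical sketch of how a Python application could talk to a browser-control MCP server via the official MCP Python SDK. The server launch command and the `fill_form` tool name are assumptions for illustration; Skyvern's actual server command and tool names may differ.

```python
# Hypothetical sketch: connecting a Python app to a browser-control MCP server.
# The server command and the "fill_form" tool name are illustrative assumptions,
# not Skyvern's documented interface.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Launch the MCP server as a subprocess speaking stdio (command is assumed).
    server = StdioServerParameters(command="python", args=["-m", "browser_mcp_server"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover the tools the server exposes (e.g. form filling, downloads).
            tools = await session.list_tools()
            print("Available tools:", [tool.name for tool in tools.tools])

            # Invoke a hypothetical form-filling tool.
            result = await session.call_tool(
                "fill_form",
                arguments={
                    "url": "https://example.com/signup",
                    "fields": {"email": "demo@example.com"},
                },
            )
            print(result)


asyncio.run(main())
```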
Show HN: Docsumo's OCR Benchmark Report – Surpassing Mistral and Landing AI (docsumo.com)
In the past month, the AI community witnessed the launch of two much-anticipated OCR solutions: Mistral OCR by the Mistral team (known for their LLMs) and Agentic Document Extraction by Landing AI, Andrew Ng's company. At Docsumo, we live and breathe Document AI. So when these releases hit the market, we couldn't resist putting them to the test.
How to write good prompts for generating code from LLMs (github.com/potpie-ai)
Large language models (LLMs) have revolutionized code generation, but crafting effective prompts is crucial for getting high-quality, useful output.
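The article's full advice is behind the link, but as an illustration of the general idea, a code-generation prompt that bundles context, constraints, and an output format tends to work better than a bare one-liner. The project details below (the SQLAlchemy model and function name) are invented for this example and are not taken from the post.

```python
# Illustrative only: one common way to structure a code-generation prompt.
# The codebase details here are invented for the example.
prompt = """You are contributing to an existing Python 3.11 codebase.

Context:
- The project uses SQLAlchemy 2.x and pytest.
- Relevant model (verbatim):
    class User(Base):
        __tablename__ = "users"
        id: Mapped[int] = mapped_column(primary_key=True)
        email: Mapped[str]

Task:
- Write a function get_user_by_email(session, email) that returns the matching
  User or None.

Constraints:
- Type hints on all parameters and the return value.
- No raw SQL; use the SQLAlchemy 2.x select() API.

Output format:
- A single Python code block, no prose explanation.
"""
```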
I stopped using AI code editors (lucianonooijen.com)
In late 2022, I used AI tools for the first time, even before the first version of ChatGPT. In 2023, I started using AI-based tools in my development workflow. Initially, I was super impressed with the capabilities of these LLMs. The fact that I could just copy and paste obscure compiler errors along with the C++ source code and be told where the error was coming from felt like magic.
Firefox 137.0 released with vertical tabs (mozilla.org)
Firefox’s new sidebar lets you move tabs to the side, pin key sites and keep your AI assistant handy.
Search-R1: Training LLMs to Reason and Leverage Search Engines with RL (arxiv.org)
Efficiently acquiring external knowledge and up-to-date information is essential for effective reasoning and text generation in large language models (LLMs).
GitSummarize – Generate living documentation from any GitHub repo (gitsummarize.com)
Turn any GitHub repository into a comprehensive AI-powered documentation hub.
Show HN: Arrakis – Open-source, self-hostable sandboxing service for AI Agents (github.com/abshkbh)
AI agents can generate malicious or buggy code that can attack the host system it runs on.
Scaling Up Reinforcement Learning for Traffic Smoothing (bair.berkeley.edu)
We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone.
Wikipedia is struggling with voracious AI bot crawlers (engadget.com)
Wikimedia has seen a 50 percent increase in bandwidth used for downloading multimedia content since January 2024, the foundation said in an update. But it's not because human readers have suddenly developed a voracious appetite for consuming Wikipedia articles and for watching videos or downloading files from Wikimedia Commons. No, the spike in usage came from AI crawlers, or automated programs scraping Wikimedia's openly licensed images, videos, articles and other files to train generative artificial intelligence models.
We need a better term for GenAI output – "slop" is too benign (rockpapershotgun.com)
Earlier this month, Snail Games put out a widely and justifiably clowned-on genAI trailer for Ark: Survival Evolved's Aquatica DLC.
Kai Scheduler: Kubernetes Native scheduler for AI workloads at large scale (github.com/NVIDIA)
KAI Scheduler is a robust, efficient, and scalable Kubernetes scheduler that optimizes GPU resource allocation for AI and machine learning workloads.
Show HN: VaporVibe – auto-generate video demos for vibe-coded projects (influme.ai)
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad (arxiv.org)
Recent math benchmarks for large language models (LLMs) such as MathArena indicate that state-of-the-art reasoning models achieve impressive performance on mathematical competitions like AIME, with the leading model, o3-mini, achieving scores comparable to top human competitors.
Text to Bark from ElevenLabs [video] (youtube.com)
Why MCP Is Mostly Bullshit (lycee.ai)
If you follow the AI space closely, you’ve surely noticed the increased interest in MCP (Model Context Protocol).
How AI is creating a rift at McKinsey, Bain, and BCG (the-ken.com)
Clients’ increasing access to AI tools is transforming the way consulting firms operate
Show HN: Qwen-2.5-32B is now the best open source OCR model (github.com/getomni-ai)
A benchmarking tool that compares the OCR and data extraction capabilities of different large multimodal models, such as gpt-4o, evaluating both text and JSON extraction accuracy. The goal is to publish a comprehensive benchmark of OCR accuracy across traditional OCR providers and multimodal language models. The evaluation dataset and methodologies are all open source, and we encourage expanding this benchmark to encompass any additional providers.
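For context on what "JSON extraction accuracy" means in practice, here is a generic, simplified scoring sketch that compares a model's extracted fields against a flat ground-truth record. It illustrates the shape of the metric only; it is not the benchmark's actual scoring implementation.

```python
# Generic sketch of field-level JSON extraction accuracy against a flat
# ground-truth record. Not the benchmark's actual scoring implementation.
def json_accuracy(predicted: dict, ground_truth: dict) -> float:
    """Return the fraction of ground-truth fields the model reproduced exactly."""
    if not ground_truth:
        return 1.0
    correct = sum(
        1
        for key, expected in ground_truth.items()
        if str(predicted.get(key, "")).strip() == str(expected).strip()
    )
    return correct / len(ground_truth)


# Example: two of three invoice fields match, so the score is about 0.67.
print(json_accuracy(
    predicted={"invoice_number": "INV-001", "total": "120.50", "date": "2024-1-05"},
    ground_truth={"invoice_number": "INV-001", "total": "120.50", "date": "2024-01-05"},
))
```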
Dual RTX 5090 Beats $25,000 H100 in Real-World LLM Performance (hardware-corner.net)
AI enthusiasts looking for top-tier performance in local LLMs have long considered NVIDIA’s H100 to be the gold standard for inference, thanks to its high-bandwidth HBM3 memory and optimized tensor cores. However, recent benchmarks show that a dual RTX 5090 setup, while still pricey, outperforms the H100 in sustained output token generation, making it an ideal choice for those seeking the best possible performance for home use, especially for models up to 70B parameters.
NaNoWriMo shut down after AI, content moderation scandals (techcrunch.com)
NaNoWriMo, a 25-year-old online writing community-turned-nonprofit, announced on Monday evening that it is shutting down.
LLM providers on the cusp of an 'extinction' phase as capex realities bite (theregister.com)
Gartner says the market for large language model (LLM) providers is on the cusp of an extinction phase as it grapples with the capital-intensive costs of building products in a competitive market.
OpenAI plans to release a new 'open' AI language model in the coming months (techcrunch.com)
OpenAI says that it intends to release its first “open” language model since GPT‑2 “in the coming months.”
Show HN: Neuronpedia, an open source platform for AI interpretability (neuronpedia.org)
Neuronpedia is an open source interpretability platform.
Ask HN: What's the best way to get started with LLM-assisted programming? (ycombinator.com)
Currently, I use Perplexity or ChatGPT via the web prompt for small coding tasks, but sometimes I'll use Ollama. Stuff like writing a shell script to perform some task, or maybe a small Python function. I'd like to get to the next level, but I don't know where to start.
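One low-friction next step from web-prompt usage is scripting the same small tasks against a local model. Below is a minimal sketch using Ollama's Python client; the model name is an assumption about what has been pulled locally.

```python
# Minimal sketch: scripting a small coding task against a local Ollama model
# (pip install ollama). Assumes a model such as "llama3.2" was pulled first
# with `ollama pull llama3.2`; substitute whatever model you run locally.
from ollama import chat

task = (
    "Write a POSIX shell script that renames every *.jpeg file in the current "
    "directory to *.jpg, skipping files whose target name already exists."
)

response = chat(
    model="llama3.2",
    messages=[{"role": "user", "content": task}],
)
print(response["message"]["content"])
```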