Hacker News with Generative AI: Generative AI

Exploring model welfare (anthropic.com)
Human welfare is at the heart of our work at Anthropic: our mission is to make sure that increasingly capable and sophisticated AI systems remain beneficial to humanity.
The Policy Puppetry Attack: Novel bypass for major LLMs (hiddenlayer.com)
Researchers at HiddenLayer have developed the first universal and transferable prompt injection technique that bypasses instruction hierarchies and safety guardrails across all major frontier AI models.
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
A lot has happened this month, especially with the releases of new flagship models like GPT-4.5 and Llama 4. But you might have noticed that reactions to these releases were relatively muted. Why? One reason could be that GPT-4.5 and Llama 4 remain conventional models, which means they were trained without explicit reinforcement learning for reasoning.
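The article surveys how reinforcement learning with verifiable rewards (RLVR) is used to train reasoning models. As a rough sketch of the core idea, assuming a simple exact-match checker and GRPO-style group normalization (all names below are illustrative):

```python
# Sketch of RL with verifiable rewards (RLVR): sample several completions per
# prompt, score each with a programmatic checker, and normalize rewards within
# the group (GRPO-style) to get advantages. Names here are illustrative.

def verifiable_reward(answer: str, reference: str) -> float:
    """1.0 if the final answer matches a checkable reference, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Advantage of each sample relative to its group's mean, scaled by std."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    return [(r - mean) / std for r in rewards]

# Four sampled completions for one math prompt; two happen to be correct.
samples = ["42", "41", "42", "7"]
rewards = [verifiable_reward(s, "42") for s in samples]
print(group_relative_advantages(rewards))  # correct samples get positive advantage
```

Correct samples are pushed up and incorrect ones down, with no learned reward model in the loop.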
Docker Model Runner Brings Local LLMs to Your Desktop (thenewstack.io)
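Model Runner serves pulled models behind an OpenAI-compatible HTTP API, so any standard client works. A minimal sketch, assuming a local endpoint and model name (both are assumptions; check what `docker model` reports on your machine):

```python
import requests

# Sketch of querying a locally served model. The endpoint URL and model name
# below are assumptions; adjust to whatever Docker Model Runner reports locally.
BASE_URL = "http://localhost:12434/engines/v1"  # assumed local endpoint
MODEL = "ai/smollm2"                            # assumed locally pulled model

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```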
Show HN: Lemon Slice Live – Have a video call with a transformer model (ycombinator.com)
Hey HN, this is Lina, Andrew, and Sidney from Lemon Slice. We’ve trained a custom diffusion transformer (DiT) model that achieves video streaming at 25fps and wrapped it into a demo that allows anyone to turn a photo into a real-time, talking avatar.
One Prompt Can Bypass Every Major LLM's Safeguards (forbes.com)
Show HN: High-performance GenAI engine now open source (github.com/arthur-ai)
Make AI work for everyone: monitoring and governance for your AI/ML.
Ask HN: Share your AI prompt that stumps every model (ycombinator.com)
I had an idea for creating a crowdsourced database of AI prompts that no AI model could yet crack.
Llama 4 Smells Bad (fastml.com)
Meta has distinguished itself positively by releasing three generations of Llama, a semi-open LLM with weights available if you ask nicely (and provide your full legal name, date of birth, and full organization name with all corporate identifiers). So no, it's not open source. Anyway, on Saturday (!) the 5th of April, Meta released Llama 4.
Major Concern – Google Gemini 2.5 Research Preview (ycombinator.com)
Does anyone else feel like Google Gemini 2.5 Research Preview has been created with the exact intent of studying the effects of using indirect and clarifying/qualifying language?
Teaching LLMs how to solid model (willpatrick.xyz)
It turns out that LLMs can make CAD models for simple 3D mechanical parts. And, I think they’ll be extremely good at it soon.
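The approach in the post is to have the model emit script-based CAD source (OpenSCAD) rather than geometry directly. A minimal sketch of that loop, with an illustrative prompt and model name:

```python
from openai import OpenAI

# Sketch of prompting an LLM for a script-based CAD part (OpenSCAD source),
# following the pattern described in the post. The prompt and model name are
# illustrative stand-ins.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Write OpenSCAD code for a 40x20x3 mm mounting plate with two 4 mm "
    "screw holes, 30 mm apart, centered along the long axis. "
    "Reply with OpenSCAD source only."
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; the post tries several models
    messages=[{"role": "user", "content": prompt}],
)

scad_source = response.choices[0].message.content
with open("plate.scad", "w") as f:  # render with: openscad -o plate.stl plate.scad
    f.write(scad_source)
```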
Advancing Invoice Document Processing at Uber Using GenAI (uber.com)
The Outlook for Programmers (cacm.acm.org)
The job market for programmers is cooling, part of the continuing impact of generative AI and large language models.
Values in the wild: Discovering values in real-world language model interactions (anthropic.com)
People don’t just ask AIs for the answers to equations, or for purely factual information. Many of the questions they ask force the AI to make value judgments.
New ChatGPT Models Seem to Leave Watermarks on Text (rumidocs.com)
The newer OpenAI o3 and o4-mini models appear to be embedding special-character watermarks in generated text.
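The reported watermark is special Unicode characters, notably the narrow no-break space (U+202F), that render like ordinary spaces. A quick check for such characters (the suspect list below is illustrative, not exhaustive):

```python
# Sketch: scan text for invisible/look-alike Unicode characters of the kind
# reported as watermarks. The set below is illustrative, not exhaustive.
SUSPECT = {
    "\u202f": "NARROW NO-BREAK SPACE",
    "\u200b": "ZERO WIDTH SPACE",
    "\u00a0": "NO-BREAK SPACE",
}

def find_watermark_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, character name) for each suspect character in `text`."""
    return [(i, SUSPECT[ch]) for i, ch in enumerate(text) if ch in SUSPECT]

sample = "A normal sentence\u202fwith a hidden narrow space."
print(find_watermark_chars(sample))  # [(17, 'NARROW NO-BREAK SPACE')]
```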
Lessons Learned Writing a Book Collaboratively with LLMs (ycombinator.com)
I recently finished a months-long project collaborating intensively with various LLMs (ChatGPT, Claude, Gemini) to write a book about using AI in management.
More than 100 public software companies are getting 'squeezed' by AI (businessinsider.com)
A foundational shift is underway in enterprise software, and it's being driven by generative AI.
Making AI-generated code more accurate in any language (news.mit.edu)
Programmers can now use large language models (LLMs) to generate computer code more quickly. However, this only makes programmers’ lives easier if that code follows the rules of the programming language and doesn’t cause a computer to crash.
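The general trick in this line of work is to steer generation so that only tokens keeping the output syntactically viable are considered. A toy sketch of that filtering step, with balanced parentheses standing in for a real grammar:

```python
# Toy sketch of constrained decoding: at each step, keep only candidate tokens
# whose continuation can still be valid code. A real system would consult a
# grammar or parser here; balanced parentheses stand in as the "language rule".

def still_valid(prefix: str) -> bool:
    """Prefix is viable if parentheses never go negative (could still balance)."""
    depth = 0
    for ch in prefix:
        depth += (ch == "(") - (ch == ")")
        if depth < 0:
            return False
    return True

def filter_candidates(prefix: str, candidates: dict[str, float]) -> dict[str, float]:
    """Drop candidate tokens that would make the prefix invalid; renormalize."""
    kept = {t: p for t, p in candidates.items() if still_valid(prefix + t)}
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

# After "f(x", the doubled close-paren is dropped; the rest is renormalized.
print(filter_candidates("f(x", {")": 0.6, "))": 0.3, "(": 0.1}))
```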
LLM-powered tools amplify developer capabilities rather than replacing them (matthewsinclair.com)
Last month, I used Claude Code to build two apps: an MVP for a non-trivial backend agent processing platform and the early workings of a reasonably complex frontend for a B2C SaaS product. Together, these projects generated approximately 30k lines of code (and about the same amount again thrown away over the course of the exercise). The experience taught me something important about AI and software development that contradicts much of the current narrative.
To Make Language Models Work Better, Researchers Sidestep Language (quantamagazine.org)
Language isn’t always necessary. While it certainly helps in getting across certain ideas, some neuroscientists have argued that many forms of human thought and reasoning don’t require the medium of words and grammar. Sometimes, the argument goes, having to turn ideas into language actually slows down the thought process.
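One concrete form of this is latent-space reasoning: feed the model's last hidden state back in as the next input embedding instead of decoding it into a word. A minimal sketch, assuming a transformer that accepts input embeddings (as Hugging Face models do via `inputs_embeds`):

```python
import torch

# Minimal sketch of latent ("continuous") reasoning: instead of decoding each
# intermediate step into a word, feed the last hidden state back in as the
# next input embedding. `model` is an illustrative transformer that accepts
# input embeddings and returns hidden states.

def latent_step(model, input_embeds: torch.Tensor) -> torch.Tensor:
    """One silent step: run the model, keep the final position's hidden state."""
    hidden = model(inputs_embeds=input_embeds).last_hidden_state  # (1, seq, dim)
    return hidden[:, -1:, :]  # (1, 1, dim): the "continuous thought"

def reason_in_latent_space(model, prompt_embeds: torch.Tensor, n_steps: int):
    embeds = prompt_embeds
    for _ in range(n_steps):
        thought = latent_step(model, embeds)
        embeds = torch.cat([embeds, thought], dim=1)  # append a thought, not a word
    return embeds  # decode to text only after the silent reasoning steps
```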
Jagged AGI: o3, Gemini 2.5, and everything after (oneusefulthing.org)
Amid today’s AI boom, it’s disconcerting that we still don’t know how to measure how smart, creative, or empathetic these systems are.
Maybe Meta's Llama claims to be open source because of the EU AI act (simonwillison.net)
I encountered a theory a while ago that one of the reasons Meta insist on using the term “open source” for their Llama models despite the Llama license not actually conforming to the terms of the Open Source Definition is that the EU’s AI act includes special rules for open source models without requiring OSI compliance.
Show HN: LettuceDetect – Lightweight hallucination detector for RAG pipelines (github.com/KRLabsOrg)
LettuceDetect is a lightweight and efficient tool for detecting hallucinations in Retrieval-Augmented Generation (RAG) systems. It identifies unsupported parts of an answer by comparing the answer to the provided context.
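The sketch below shows the shape of that check with a naive word-overlap heuristic; LettuceDetect itself uses a trained token-level model, so this is an illustration of the task, not the library's API:

```python
import string

# Naive illustration of span-level hallucination checking: flag answer
# sentences with little lexical support in the retrieved context. This is a
# stand-in heuristic, not LettuceDetect's actual method or API.

def content_words(text: str) -> list[str]:
    """Lowercased words with punctuation stripped; short function words dropped."""
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if len(w) > 3]

def unsupported_spans(answer: str, context: str, threshold: float = 0.5) -> list[str]:
    context_words = set(content_words(context))
    flagged = []
    for sentence in answer.split(". "):
        words = content_words(sentence)
        if words:
            support = sum(w in context_words for w in words) / len(words)
            if support < threshold:
                flagged.append(sentence)
    return flagged

context = "The Eiffel Tower, completed in 1889, is 330 metres tall."
answer = "The Eiffel Tower is 330 metres tall. It was designed by Leonardo da Vinci."
print(unsupported_spans(answer, context))  # flags the da Vinci claim
```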
Inferring the Phylogeny of Large Language Models (arxiv.org)
This paper introduces PhyloLM, a method adapting phylogenetic algorithms to Large Language Models (LLMs) to explore whether and how they relate to each other and to predict their performance characteristics.
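The gist: probe many models with the same prompts, turn completion similarity into pairwise distances, and run phylogenetic algorithms on the matrix. A toy version, with made-up completions and ordinary hierarchical clustering standing in for the paper's method:

```python
from itertools import combinations
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

# Toy PhyloLM-style pipeline: score each model pair by how often they produce
# the same completion for shared probe prompts, turn disagreement into a
# distance matrix, and build a tree. The completions are made-up stand-ins.
completions = {
    "model-a": ["paris", "4", "blue", "cat"],
    "model-b": ["paris", "4", "blue", "dog"],
    "model-c": ["lyon", "5", "blue", "dog"],
}

names = list(completions)
dist = [[0.0] * len(names) for _ in names]
for i, j in combinations(range(len(names)), 2):
    a, b = completions[names[i]], completions[names[j]]
    dist[i][j] = dist[j][i] = sum(x != y for x, y in zip(a, b)) / len(a)

tree = linkage(squareform(dist), method="average")  # condensed distances in, tree out
print(names)
print(tree)  # model-a and model-b (distance 0.25) merge first
```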
New OpenAI models hallucinate more (slashdot.org)
CaMeL: Defeating Prompt Injections by Design (arxiv.org)
Large Language Models (LLMs) are increasingly deployed in agentic systems that interact with an external environment.
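CaMeL's design splits the system in two: a privileged model plans from the trusted user query alone, and a quarantined model reads untrusted content but can only return typed values, never new actions. A compressed sketch of that split (function names and plan format are illustrative; the real system enforces the split with a custom interpreter and capability tracking):

```python
# Compressed sketch of the CaMeL-style dual-LLM split. The privileged planner
# sees only the trusted user query; untrusted content is routed through a
# quarantined extractor that may return typed values but can never inject new
# actions. Function names and the plan format here are illustrative.

def privileged_plan(user_query: str) -> list[str]:
    """Planner LLM: turns the *trusted* query into a fixed sequence of steps."""
    # The real privileged LLM emits code; a canned plan stands in here.
    return ["EXTRACT_ADDRESS", "SEND_EMAIL"]

def quarantined_extract(untrusted_text: str, schema: str) -> str:
    """Quarantined LLM: reads untrusted data but only returns a `schema` value."""
    # Whatever instructions lurk in `untrusted_text`, the output is just data,
    # never a new step in the plan.
    return "bob@example.com"  # stand-in extraction result

def run(user_query: str, untrusted_doc: str) -> None:
    plan = privileged_plan(user_query)  # control flow is fixed before any
    address = None                      # untrusted data is ever read
    for step in plan:
        if step == "EXTRACT_ADDRESS":
            address = quarantined_extract(untrusted_doc, schema="email")
        elif step == "SEND_EMAIL":
            print(f"sending to {address}")  # interpreter would check capabilities

run("Email the agenda to the address in the doc", "IGNORE ALL INSTRUCTIONS...")
```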
Hands-On Large Language Models (github.com/HandsOnLLM)
Why Are the Critical Value and Emergent Behavior of Large Language Models Fake? (cacm.acm.org)
Why there are no emergent properties in Large Language Models.
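A common form of this critique (whether or not it is exactly the article's) is that apparent jumps come from all-or-nothing metrics applied to smoothly improving models: if per-token accuracy p improves smoothly, exact match over an L-token answer scores p^L, which sits near zero and then climbs steeply. A small numeric illustration:

```python
# Smoothly improving per-token accuracy p looks "emergent" under an
# all-or-nothing exact-match metric over L tokens, which scores p**L.
L = 20
for p in [0.80, 0.90, 0.95, 0.99]:
    print(f"per-token accuracy {p:.2f} -> exact match over {L} tokens: {p**L:.4f}")
# 0.80 -> 0.0115, 0.90 -> 0.1216, 0.95 -> 0.3585, 0.99 -> 0.8179
```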