LLM providers on the cusp of an 'extinction' phase as capex realities bite
(theregister.com)
Gartner says the market for large language model (LLM) providers is on the cusp of an extinction phase as it grapples with the capital-intensive costs of building products in a competitive market.
OpenAI plans to release a new 'open' AI language model in the coming months
(techcrunch.com)
OpenAI says that it intends to release its first “open” language model since GPT‑2 “in the coming months.”
Everything is Ghibli
(carly.substack.com)
OpenAI unleashed its native image generation in ChatGPT on Tuesday. By Wednesday morning, every social feed was drowning in Studio Ghibli-style portraits. (LinkedIn, check back next week.) What happened—and why—is another signal of where AI, art, and our attention are headed.
GPT-4o helped me re-create classic cartoons with myself as a character
(twitter.com)
LLM Workflows then Agents: Getting Started with Apache Airflow
(github.com/astronomer)
This repository contains an SDK for working with LLMs from Apache Airflow, based on Pydantic AI. It allows users to call LLMs and orchestrate agent calls directly within their Airflow pipelines using decorator-based tasks. The SDK leverages the familiar Airflow @task syntax with extensions like @task.llm, @task.llm_branch, and @task.agent.
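The decorator pattern the SDK describes can be illustrated with a minimal, self-contained sketch. Note this is a toy stand-in, not the SDK's real API: `call_llm`, the `task` class, and the decorator parameters here are all hypothetical stubs that only mimic the `@task.llm` shape.

```python
# Toy sketch of a @task.llm-style decorator (NOT the real Airflow AI SDK).
# The wrapped function prepares the input; the wrapper routes it through
# a (stubbed) model call.
from functools import wraps

def call_llm(model: str, system_prompt: str, user_input: str) -> str:
    """Stub standing in for a real LLM client call."""
    return f"[{model}] {system_prompt}: {user_input}"

class task:
    @staticmethod
    def llm(model: str, system_prompt: str):
        def decorator(fn):
            @wraps(fn)
            def wrapper(*args, **kwargs):
                # Run the user's task body, then hand its output to the model.
                return call_llm(model, system_prompt, fn(*args, **kwargs))
            return wrapper
        return decorator

@task.llm(model="example-model", system_prompt="Summarize")
def summarize(text: str) -> str:
    return text.strip()

print(summarize("  Airflow pipelines can orchestrate LLM calls.  "))
```

In the real SDK the decorator additionally registers the function as an Airflow task, so the LLM call participates in scheduling, retries, and lineage like any other pipeline step.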
RLHF Is Cr*P, It's a Paint Job on a Rusty Car: Geoffrey Hinton
(officechai.com)
RLHF, or Reinforcement Learning from Human Feedback, is behind some of the recent advances in AI, but one of the pioneers of the field doesn’t think highly of it.
Amazon introduces Nova Chat
(aboutamazon.com)
Amazon makes it easier for developers and tech enthusiasts to explore Amazon Nova, its advanced Gen AI models
Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison
(composio.dev)
Google just launched Gemini 2.5 Pro on March 26th, claiming it is the best at coding, reasoning, and just about everything else. But what I mostly care about is how it compares against the best available coding model, Claude 3.7 Sonnet (thinking), released at the end of February, which I have been using with great results.
GPT-4o draws itself as a consistent type of guy
(danielpaleka.com)
When asked to draw itself as a person, the ChatGPT Create Image feature introduced on March 25, 2025, consistently portrays itself as a white male in his 20s with brown hair, often sporting facial hair and glasses.
AI Experts Say We're on the Wrong Path to Achieving Human-Like AI
(gizmodo.com)
According to a panel of hundreds of artificial intelligence researchers, the field is currently pursuing artificial general intelligence the wrong way.
What Anthropic Researchers Found After Reading Claude's 'Mind' Surprised Them
(singularityhub.com)
Despite popular analogies to thinking and reasoning, we have a very limited understanding of what goes on in an AI’s “mind.”
What is Zombie Prompting: in 5 simple images
(twitter.com)
Qwen2.5-Omni Technical Report
(huggingface.co)
In this report, we present Qwen2.5-Omni, an end-to-end multimodal model designed to perceive diverse modalities, including text, images, audio, and video, while simultaneously generating text and natural speech responses in a streaming manner.
Karpathy's 'Vibe Coding' Movement Considered Harmful
(nmn.gl)
Last Tuesday at 1 AM, I was debugging a critical production issue in my AI dev tool. As I dug through layers of functions, I suddenly realized — unlike the new generation of developers, I was grateful that I could actually understand my codebase. That’s when I started thinking more about Karpathy’s recent statements on vibe coding.
Show HN: I built a tool that generates quizzes from documents using LLMs
(ycombinator.com)
Hey everyone! I recently built this little side project that takes any document you upload and turns it into practice quizzes using LLMs to generate the questions: https://www.cuiz-ai.com
Ask HN: Are LLMs just answering what we want to hear?
(ycombinator.com)
I keep seeing those tweets and posts where users ask ChatGPT or a similar LLM to describe them etc... and it always answers positive cool stuff which reinforces what the user wants to hear.
ChatGPT is shifting rightwards politically
(psypost.org)
An examination of a large number of ChatGPT responses found that the model consistently exhibits values aligned with the libertarian-left segment of the political spectrum. However, newer versions of ChatGPT show a noticeable shift toward the political right. The paper was published in Humanities & Social Sciences Communications.
We hacked Gemini's Python sandbox and leaked its source code (at least some)
(landh.tech)
In 2024 we released the blog post We Hacked Google A.I. for $50,000, recounting how Joseph "rez0" Thacker, Justin "Rhynorater" Gardner, and I, Roni "Lupin" Carta, set off in 2023 on a hacking journey spanning Las Vegas, Tokyo, and France, all in pursuit of Gemini vulnerabilities during Google's LLM bugSWAT event. Well, we did it again …
Gemini hackers can deliver more potent attacks with a helping hand from Gemini
(arstechnica.com)
Hacking LLMs has always been more art than science. A new attack on Gemini could change that.
OpenAI's Ghibli frenzy took a dark turn real fast
(businessinsider.com)
From meme madness to copyright concerns, the release of OpenAI's new image generator this week has been nothing short of dramatic.
The Jailbreak Bible
(generalanalysis.com)
The rapid evolution of Large Language Models (LLMs) has unlocked remarkable new possibilities, but with these advances come unexpected blind spots. Even rigorously safety-aligned LLMs can be subtly manipulated through carefully designed adversarial prompts, commonly known as "jailbreaks." By exploiting linguistic nuances, these jailbreaks can sidestep safeguards, enabling models to divulge toxic content, propagate misinformation, or even disclose detailed instructions related to dangerous chemical, biological, radiological, and nuclear threats (Anthropic, 2023a).
FFN Fusion: Rethinking Sequential Computation in Large Language Models
(arxiv.org)
We introduce FFN Fusion, an architectural optimization technique that reduces sequential computation in large language models by identifying and exploiting natural opportunities for parallelization.
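The core idea can be sketched numerically: when consecutive residual FFN blocks each make only a small update, applying them all to the same input and summing their outputs (which can run in parallel) closely approximates applying them one after another. This toy sketch uses random small-scale weights, not the paper's actual models or fusion criterion.

```python
# Toy illustration of fusing sequential residual FFN blocks into one
# parallel computation (a sketch of the idea, not the paper's method).
import numpy as np

rng = np.random.default_rng(0)
d, d_ff = 8, 32

def make_ffn():
    W1 = rng.normal(scale=0.02, size=(d, d_ff))
    W2 = rng.normal(scale=0.02, size=(d_ff, d))
    return lambda x: np.maximum(x @ W1, 0.0) @ W2  # ReLU FFN

ffns = [make_ffn() for _ in range(3)]
x = rng.normal(size=(d,))

# Sequential: each block reads the previous block's residual output.
h = x.copy()
for ffn in ffns:
    h = h + ffn(h)

# Fused: all blocks read the SAME input and their outputs are summed,
# so the three FFNs can be evaluated in parallel.
fused = x + sum(ffn(x) for ffn in ffns)

print(float(np.abs(h - fused).max()))  # small when each update is small
```

The gap between the two forms is second-order in the size of each block's update, which is why fusion works best on layers whose contributions are nearly independent.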
In a first, OpenAI removes influence operations tied to Russia, China and Israel
(npr.org)
OpenAI, the company behind generative artificial intelligence tools such as ChatGPT, announced Thursday that it had taken down influence operations tied to Russia, China, Iran and Israel.
Parameter-free KV cache compression for memory-efficient long-context LLMs
(arxiv.org)
The linear growth of key-value (KV) cache memory and quadratic computational complexity pose significant bottlenecks for large language models (LLMs) in long-context processing.
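The memory problem, and one simple training-free mitigation, can be sketched as follows. This sketch uses a sliding window plus a few always-kept "sink" tokens (in the spirit of StreamingLLM); it is a generic illustration of bounding KV memory, not the compression method of the paper above.

```python
# Sketch: bounding KV cache memory with sink tokens + a recent window.
# (Illustrative only -- not the paper's compression method.)
from collections import deque

class KVCache:
    def __init__(self, n_sink=4, window=256):
        self.n_sink = n_sink
        self.sink = []                       # first few tokens, always kept
        self.recent = deque(maxlen=window)   # fixed-size recent window

    def append(self, kv):
        if len(self.sink) < self.n_sink:
            self.sink.append(kv)
        else:
            self.recent.append(kv)  # oldest non-sink entry is evicted

    def __len__(self):
        return len(self.sink) + len(self.recent)

cache = KVCache(n_sink=4, window=256)
for t in range(100_000):  # memory stays bounded even for very long contexts
    cache.append((f"k{t}", f"v{t}"))
print(len(cache))  # 260 entries, not 100,000
```

A full cache would hold one key/value pair per token (linear growth); the bounded variant caps memory at `n_sink + window` entries regardless of context length.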
Circuit Tracing: Revealing Computational Graphs in Language Models
(transformer-circuits.pub)
We introduce a method to uncover mechanisms underlying behaviors of language models. We produce graph descriptions of the model’s computation on prompts of interest by tracing individual computational steps in a “replacement model”. This replacement model substitutes a more interpretable component (here, a “cross-layer transcoder”) for parts of the underlying model (here, the multi-layer perceptrons) that it is trained to approximate.
New Function Calling Guide for Gemini
(google.dev)
Function calling lets you connect models to external tools and APIs. Instead of generating text responses, the model understands when to call specific functions and provides the necessary parameters to execute real-world actions. This allows the model to act as a bridge between natural language and real-world actions and data. Function calling has 3 primary use cases:
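The call-and-execute loop the guide describes can be sketched in a self-contained way. Here `fake_model` is a stub standing in for a real Gemini call, and `get_weather` is a hypothetical tool: the model returns either plain text or a structured function call (name plus arguments) that the application executes.

```python
# Sketch of the function-calling loop (stubbed model, hypothetical tool).
import json

def get_weather(city: str) -> str:
    """Example tool the model may choose to call (returns fake data)."""
    return json.dumps({"city": city, "temp_c": 21})

TOOLS = {"get_weather": get_weather}

def fake_model(prompt: str):
    # A real model decides this from the prompt and the tool declarations.
    if "weather" in prompt.lower():
        return {"function_call": {"name": "get_weather",
                                  "args": {"city": "Paris"}}}
    return {"text": "Hello!"}

def run(prompt: str) -> str:
    reply = fake_model(prompt)
    if "function_call" in reply:
        call = reply["function_call"]
        result = TOOLS[call["name"]](**call["args"])  # execute the tool
        # In a real loop, the tool result is sent back to the model so it
        # can compose a final natural-language answer; returned directly here.
        return result
    return reply["text"]

print(run("What's the weather in Paris?"))
```

In production code the function declarations (names, parameter schemas) are passed to the model up front, and the dispatch table maps each declared name to the actual implementation.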