Hacker News with Generative AI: Language Models

Judges Shouldn't Rely on AI for the Ordinary Meaning of Text (lawfaremedia.org)
Judges are debating how large language models (LLMs) should fit into judicial work. One popular idea is to consult LLMs for the “ordinary meaning” of text, a key issue in statutory interpretation. At first glance, this may seem promising: These models, trained on massive amounts of human language, should reflect everyday usage.
Understand anything, anywhere with the new NotebookLM app (google)
Listen to Audio Overviews on the go with the new iOS and Android apps.
Internet Search Is Not a Naive Information Retrieval Problem (gojiberries.io)
The research demonstrates something interesting about language models' ability to simulate search behavior in controlled conditions. But claiming equivalence to a "real search engine" is like saying you've built a military defense system because your soldiers performed well in peacetime maneuvers. The real test isn't whether it works when nobody's trying to break it—it's whether it works when half the internet is trying to game it for profit.
Large Language Models Are More Persuasive Than Incentivized Human Persuaders (arxiv.org)
We directly compare the persuasion capabilities of a frontier large language model (LLM; Claude Sonnet 3.5) against incentivized human persuaders in an interactive, real-time conversational quiz setting.
Experimentation Matters: Why Nuenki isn't using pairwise evaluations (nuenki.app)
Nuenki's old language translation quality benchmark used a simple system where a suite of LLMs would score the outputs of other LLMs between 1 and 10.
ChatGPT may be polite, but it's not cooperating with you (theguardian.com)
Big tech companies have exploited human language for AI gain. Now they want us to see their products as trustworthy collaborators
ChatGPT Blows Mapmaking 101 (garymarcus.substack.com)
People keep telling me ChatGPT is smart. Is it really?
Block Diffusion: Interpolating Autoregressive and Diffusion Language Models (m-arriola.com)
Diffusion language models offer unique benefits over autoregressive models due to their potential for parallelized generation and controllability, yet they lag in likelihood modeling and are limited to fixed-length generation. In this work, we introduce a class of block diffusion language models that interpolate between discrete denoising diffusion and autoregressive models.
Hypermode Model Router Preview – OpenRouter Alternative (hypermode.com)
Today, we’re excited to introduce Model Router, a powerful new feature in Hypermode that enables developers to connect to both open-source and commercial language models through a single, unified API.
A Brief History of Cursor's Tab-Completion (coplay.dev)
In building our own Unity tab-completion feature, I explored existing solutions and their fascinating history. This led me to the story of Cursor acquiring Babble, the best tab-completion model available today. Originating from Jacob Jackson's early code-completion tool TabNine, Babble revolutionized the space by using edit sequences for training—far superior to traditional Fill-in-the-Middle methods. With a groundbreaking 1M context window, Babble dramatically outperformed competitors in speed and scope.
ChatGPT Hyphen Causing Issues? (rollingstone.com)
In the escalating arms race between AI models like ChatGPT and human beings trying to determine if what they’re reading was machine-generated, there is no easy or surefire way to spot the bot — or is there?
You sent the message, but did you write it? (davidduncan.substack.com)
Last week, I got a message from someone I’ve known for ten years. It was articulate, thoughtful…and definitely not written by him.
Russian disinformation network flooded training data to manipulate frontier LLMs (meduza.io)
A Russian disinformation network called Pravda (“Truth”) has influenced leading AI chatbots’ output by publishing numerous articles that made their way into the bots’ training data, a new report from the analysis group NewsGuard reveals.
Making of Monkeys.zip (lukeschaefer.dev)
It’s been one month since the launch of monkeys.zip. In that time, we’ve gathered over 11,000 monkeys, which have written over 6 billion words - completing well over 75% of the words in Shakespeare’s works. In fact, they’ve recently finished writing every four-letter word!
When ChatGPT broke the field of NLP: An oral history (quantamagazine.org)
Asking scientists to identify a paradigm shift, especially in real time, can be tricky. After all, truly ground-shifting updates in knowledge may take decades to unfold. But you don’t necessarily have to invoke the P-word to acknowledge that one field in particular — natural language processing, or NLP — has changed. A lot.
Phi Silica, small but mighty on-device SL (windows.com)
This blog is the first installment in a new series of technical content designed to provide insights into AI innovation on Windows. Today we will share how the Applied Sciences team used an interdisciplinary approach to achieve a breakthrough in power efficiency, inference speed, and memory efficiency for a state-of-the-art small language model (SLM), Phi Silica.
AI suggestions make writing more generic, Western (news.cornell.edu)
Artificial intelligence-based writing assistants are popping up everywhere – from phones to email apps to social media platforms.
NotebookLM Audio Overviews are now available in over 50 languages (google)
Audio Overviews are now multilingual, and you can try it out today.
Do Large Language Models know who did what to whom? (arxiv.org)
Large Language Models (LLMs) are commonly criticized for not understanding language. However, many critiques focus on cognitive abilities that, in humans, are distinct from language processing. Here, we instead study a kind of understanding tightly linked to language: inferring who did what to whom (thematic roles) in a sentence.
Analysis of US congressional speeches reveals a shift from evidence to intuition (nature.com)
Pursuit of honest and truthful decision-making is crucial for governance and accountability in democracies.
Introducing Arcana: AI Voices with Vibes (rime.ai)
Rime's newest spoken language model is the most realistic you've ever heard.
Values in the wild: Discovering values in real-world language model interactions (anthropic.com)
People don’t just ask AIs for the answers to equations, or for purely factual information. Many of the questions they ask force the AI to make value judgments.
Show HN: Dia, an open-weights TTS model for generating realistic dialogue (github.com/nari-labs)
Dia is a 1.6B parameter text to speech model created by Nari Labs.
New ChatGPT Models Seem to Leave Watermarks on Text (rumidocs.com)
The newer o3 and o4-mini models appear to be embedding special-character watermarks in generated text.
To Make Language Models Work Better, Researchers Sidestep Language (quantamagazine.org)
Language isn’t always necessary. While it certainly helps in getting across certain ideas, some neuroscientists have argued that many forms of human thought and reasoning don’t require the medium of words and grammar. Sometimes, the argument goes, having to turn ideas into language actually slows down the thought process.
ChatGPT now performs well at GeoGuesser (flausch.social)
TeapotLLM: an open-source <1B model for hallucination-resistant Q&A on a CPU (huggingface.co)
Teapot is an open-source small language model (~800 million parameters) fine-tuned on synthetic data and optimized to run locally on resource-constrained devices such as smartphones and CPUs. Teapot is trained to only answer using context from documents, reducing hallucinations. Teapot can perform a variety of tasks, including hallucination-resistant Question Answering (QnA), Retrieval-Augmented Generation (RAG), and JSON extraction. Teapot is a model built by and for the community.
AI generated text is forbidden with the exception of automated translation (grapheneos.org)
Liquid: Language models are scalable and unified multi-modal generators (foundationvision.github.io)
We present Liquid, an auto-regressive generation paradigm that seamlessly integrates visual comprehension and generation by tokenizing images into discrete codes and learning these code embeddings alongside text tokens within a shared feature space for both vision and language.