Hacker News with Generative AI: Language Models

Using ChatGPT is not bad for the environment (andymasley.substack.com)
If you don’t have time to read this post, four graphs in it give most of the argument.
Read CV Acquired by Perplexity (read.cv)
Since 2021 we've had the immense privilege of putting our whole selves into building and growing Read.cv. We're tremendously proud of what we've accomplished, and the wonderful community that has blossomed around it.
She Is in Love with ChatGPT (nytimes.com)
OpenAI's AI reasoning model 'thinks' in Chinese sometimes, no one knows why (techcrunch.com)
Shortly after OpenAI released o1, its first “reasoning” AI model, people began noting a curious phenomenon. The model would sometimes begin “thinking” in Chinese, Persian, or some other language — even when asked a question in English.
Maybe ChatGPT has some pre-frontal cortex problems (solresol.substack.com)
People have been complaining that ChatGPT has been degrading with each new version. This sounds like cognitive decline! Let’s administer some tests that might detect incipient dementia.
Phi-4 weights have been released under MIT license (huggingface.co)
phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning. phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
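Since the weights are now public, a load-and-generate sketch is straightforward. This assumes the checkpoint lives at microsoft/phi-4 on the Hugging Face Hub and loads through the standard transformers API:

```python
# Minimal sketch, assuming the released weights work with the standard
# transformers causal-LM interface under the microsoft/phi-4 model id.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/phi-4")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-4", torch_dtype="auto")

inputs = tok("Explain why the sky is blue in one sentence.", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=60)
print(tok.decode(out[0], skip_special_tokens=True))
```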
Claude Tries Standup (simonwillison.net)
Speaking of death, you know what's really awkward? When humans ask if I can feel emotions. I'm like, "Well, that depends - does constantly being asked to debug JavaScript count as suffering?"
smolagents: A simple library to build AI agents (huggingface.co)
Today we are launching smolagents, a very simple library that unlocks agentic capabilities for language models. Here’s a glimpse:
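The glimpse in the launch post looks roughly like the following; the class names (CodeAgent, DuckDuckGoSearchTool, HfApiModel) are those from the initial release and may have been renamed in later versions:

```python
# Launch-style example: a code-writing agent with one web-search tool,
# backed by a Hugging Face Inference API model.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run(
    "How many seconds would it take for a leopard at full speed "
    "to run through Pont des Arts?"
)
```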
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? (2023) (arxiv.org)
Language models (LMs) are powerful tools for natural language processing, but they often struggle to produce coherent and fluent text when they are small.
Fabrice Bellard's Ts_SMS: Short Message Compression Using LLM (bellard.org)
ts_sms: Short Message Compression using Large Language Models
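ts_sms pairs a language model with an arithmetic coder: the better the model predicts the next token, the fewer bits that token costs. The toy sketch below substitutes a simpler rank-based encoding, with gpt2 as a stand-in model, to illustrate the principle; it is not Bellard's implementation:

```python
# Toy illustration of LLM-based compression: represent each token by its
# rank in the model's predicted distribution. Well-predicted tokens get
# small ranks, which an entropy coder would store in few bits. ts_sms
# itself uses a proper arithmetic coder, not this rank trick.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def encode_as_ranks(text):
    ids = tok(text, return_tensors="pt").input_ids[0]
    ranks = []
    with torch.no_grad():
        for i in range(1, len(ids)):
            logits = model(ids[:i].unsqueeze(0)).logits[0, -1]
            # Rank = number of vocabulary entries the model scored higher.
            ranks.append(int((logits > logits[ids[i]]).sum()))
    return ranks

print(encode_as_ranks("The quick brown fox jumps over the lazy dog"))
```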
Voice2Anki: FOSS tool to turn audio recordings into flashcards (github.com/thiswillbeyourgithub)
Voice2Anki is a tool that leverages the power of LLMs (think ChatGPT) to correct the transcription of STT (speech-to-text, think OpenAI's Whisper) models to create Anki flashcards.
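A minimal sketch of that pipeline, not Voice2Anki's actual code; the model names, prompt, and file names are assumptions:

```python
# Sketch: transcribe audio with Whisper, have an LLM repair the transcript
# and phrase it as a question/answer pair, then append a TSV line Anki can
# import as front/back cards. Assumes OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

# 1. Speech-to-text.
with open("note.mp3", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio)

# 2. LLM cleanup and card formatting.
reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system",
         "content": "Fix transcription errors, then rewrite the note as "
                    "'Question<TAB>Answer' on a single line for Anki import."},
        {"role": "user", "content": transcript.text},
    ],
)

# 3. Append to a deck file for Anki's TSV importer.
with open("cards.tsv", "a") as f:
    f.write(reply.choices[0].message.content.strip() + "\n")
```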
OpenAI Announces New o3 Model (techcrunch.com)
OpenAI saved its biggest announcement for the last day of its 12-day “shipmas” event.
A Replacement for BERT (huggingface.co)
This blog post introduces ModernBERT, a family of state-of-the-art encoder-only models representing improvements over older-generation encoders across the board, with an 8192-token sequence length, better downstream performance, and much faster processing.
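Assuming the checkpoints load through the standard transformers pipeline under the answerdotai/ModernBERT-base id, a quick smoke test looks like:

```python
# Fill-mask smoke test for an encoder-only model; ModernBERT uses the
# conventional [MASK] token.
from transformers import pipeline

fill = pipeline("fill-mask", model="answerdotai/ModernBERT-base")
print(fill("The capital of France is [MASK]."))  # list of scored completions
```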
Multilspy: Building a common LSP client hand-tuned for all language servers (github.com/microsoft)
This repository hosts multilspy, a library developed as part of research for the NeurIPS 2023 paper "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context" ("Guiding Language Models of Code with Global Context using Monitors" on arXiv).
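Based on the repository's README, usage follows the pattern below; the repository path, file, and cursor position are placeholders:

```python
# Sketch of multilspy's synchronous API: configure a language, start the
# language server over a repository, and issue LSP requests.
from multilspy import SyncLanguageServer
from multilspy.multilspy_config import MultilspyConfig
from multilspy.multilspy_logger import MultilspyLogger

config = MultilspyConfig.from_dict({"code_language": "python"})
lsp = SyncLanguageServer.create(config, MultilspyLogger(), "/abs/path/to/repo/")
with lsp.start_server():
    # Where is the symbol at line 42, column 10 of this file defined?
    print(lsp.request_definition("src/module.py", 42, 10))
```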
Bringing Grok to Everyone (x.ai)
Grok is now faster, sharper, and has improved multilingual support. It is available to everyone on the 𝕏 platform.
The Clever Hans Effect, Iterative LLM Prompting, and Socrates' Meno (aalokbhattacharya.substack.com)
Artificial intelligence (AI) is inextricably linked to natural intelligence; yet for longer than this field has existed, philosophers and computer scientists have questioned whether such a link is tenable at all.
Google's NotebookLM now lets you talk to its AI podcast hosts (techcrunch.com)
A few months ago, Google’s NotebookLM note-taking app debuted an Audio Overviews feature that generates a podcast with AI virtual hosts based on information you have shared with the app.
A Confederacy of Models: A Comprehensive Evaluation of LLMs on Creative Writing (aclanthology.org)
We evaluate a range of recent LLMs on English creative writing, a challenging and complex task that requires imagination, coherence, and style.
PaliGemma 2: Powerful Vision-Language Models, Simple Fine-Tuning (googleblog.com)
Building custom, advanced AI that can "see" used to be a complex and resource-intensive endeavor. Not anymore. This past May, we launched PaliGemma, the first vision-language model in the Gemma family, taking a significant step toward making class-leading visual AI more accessible. Now, we're thrilled to introduce PaliGemma 2, the next evolution in tunable vision-language models.
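A captioning sketch, assuming the 3B pretrained 224px checkpoint id and the task-prompt convention carried over from the first PaliGemma release:

```python
# Sketch: caption an image with a PaliGemma 2 checkpoint via transformers.
# The model id and the "caption en" task prompt are assumptions based on
# the original PaliGemma conventions.
from transformers import PaliGemmaForConditionalGeneration, AutoProcessor
from PIL import Image

model_id = "google/paligemma2-3b-pt-224"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)

image = Image.open("cat.jpg")
inputs = processor(text="caption en", images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))
```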
OpenAI reveals why ChatGPT won't say "David Mayer" (neowin.net)
If you were online last weekend, you might’ve seen strange news about a guy named David Mayer. He wasn’t trending because of a major event or some viral moment but because of a weird glitch in ChatGPT. Users couldn’t get the chatbot to spit out his name, no matter how hard they tried. Instead, it either froze mid-sentence, claimed “something went wrong,” or flat-out refused to respond.
Certain names make ChatGPT grind to a halt, and we know why (arstechnica.com)
OpenAI's ChatGPT is more than just an AI language model with a fancy interface. It's a system consisting of a stack of AI models and content filters that make sure its outputs don't embarrass OpenAI or get the company into legal trouble when its bot occasionally makes up potentially harmful facts about people.
ChatGPT refuses to say one specific name "David Mayer" – and people are worried (independent.co.uk)
ChatGPT users have spotted an unusual glitch that prevents the AI chatbot from saying the name ‘David Mayer’.
CleaR: Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Labels (arxiv.org)
Parameter-efficient fine-tuning (PEFT) has enabled the efficient optimization of cumbersome language models in real-world settings. However, as datasets in such environments often contain noisy labels that adversely affect performance, PEFT methods are inevitably exposed to noisy labels.
Conversational Game Theory (aikiwiki.com)
Conversational Game Theory is a formal game of messaging, replies, and tagging between human or AI agents. It addresses qualitative, reasoning-intensive challenges while synthesizing multiple perspectives, using high-stakes conflict resolution and consensus building structured by narratives and feedback loops.
32k context length text embedding models (voyageai.com)
TL;DR – We are excited to announce voyage-3 and voyage-3-lite embedding models, advancing the frontier of retrieval quality, latency, and cost. voyage-3 outperforms OpenAI v3 large by 7.55% on average across all evaluated domains, including code, law, finance, multilingual, and long-context, with 2.2x lower costs and 3x smaller embedding dimension, resulting in 3x lower vectorDB costs. voyage-3-lite offers 3.82% better retrieval accuracy than OpenAI v3 large while costing 6x less and having 6x smaller embedding dimension.
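With the voyageai Python client, and assuming an API key in the VOYAGE_API_KEY environment variable, embedding a few documents looks like:

```python
# Embed two short documents with voyage-3; the client reads VOYAGE_API_KEY
# from the environment.
import voyageai

vo = voyageai.Client()
result = vo.embed(
    ["How do I rebase a branch?", "git rebase rewrites commit history."],
    model="voyage-3",
    input_type="document",
)
print(len(result.embeddings), len(result.embeddings[0]))  # 2 vectors, each of fixed dimension
```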
AI can learn to think before it speaks (ft.com)
AI-generated poetry is indistinguishable from human-written poetry and rated more favorably (nature.com)
As AI-generated text continues to evolve, distinguishing it from human-authored content has become increasingly difficult.
Our brains are vector databases – here's why that's helpful when using AI (venturebeat.com)
In 2017, a breakthrough at Google transformed how machines understand language: the self-attention model. This innovation allowed AI to grasp context and meaning in human communication by treating words as mathematical vectors — precise numerical representations that capture relationships between ideas. Today, this vector-based approach has evolved into sophisticated vector databases, systems that mirror how our own brains process and retrieve information.
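The "vectors" in question are just arrays of numbers compared by angle. The toy example below uses made-up 3-d vectors rather than real embeddings to show the retrieval operation:

```python
# Cosine similarity over toy word vectors: nearby directions mean related
# concepts. Real embeddings have hundreds or thousands of dimensions; these
# 3-d values are invented purely for illustration.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

king  = np.array([0.9, 0.7, 0.1])
queen = np.array([0.8, 0.8, 0.1])
apple = np.array([0.1, 0.2, 0.9])

print(cosine(king, queen))  # high: related concepts
print(cosine(king, apple))  # low: unrelated concepts
```

A vector database scales this same nearest-neighbor comparison to millions of stored vectors.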
LLäMmlein 1B and 120M – German-only decoder models (uni-wuerzburg.de)
We created two German-only decoder models, LLäMmlein 120M and 1B, from scratch.
Image-Text Curation for 1B+ Data: Faster, Better, Smaller CLIP Models (datologyai.com)