Hacker News with Generative AI: Language Models

DeepSeek Native Sparse Attention (arxiv.org)
Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges.
Mistral Saba (mistral.ai)
Making AI ubiquitous requires addressing every culture and language. As AI proliferates globally, many of our customers worldwide have expressed a strong desire for models that are not just fluent but native to regional parlance.
Is ChatGPT autocomplete bad UX/UI? (honzabe.com)
I get it. I am not the only user the world revolves around. When an app does not behave the way I would prefer, it’s probably because most people have different preferences, and the app is optimized for them.
A woman made her AI voice clone say "arse." Then she got banned (technologyreview.com)
People with motor neuron disease should be allowed to say whatever they want, including “arse” and “knickers.”
Surrealist Compliment Generator (madsci.org)
May your succulent earlobes ever flap about my knees like a thousand wooden pigeons fleeing the local sawmill.
PlayAI's new Dialog model achieves 3:1 preference in human evals (play.ht)
PlayAI’s Dialog Text-to-Speech model is now in general availability, bringing multilingual capabilities and exceptional performance to applications requiring emotive, human-like speech. In recent third-party benchmark tests, Dialog was preferred by 10:1 vs. ElevenLabs v2.5 Turbo, and by over 3:1 vs. ElevenLabs Multilingual v2.0. Play the video below to find out what it sounds like, or visit our AI voiceover Studio to try it for yourself.
OpenEuro LLM (openeurollm.eu)
Europe's leading AI companies and research institutions are combining their forces and expertise in the OpenEuroLLM project, an unprecedented collaboration to develop next-generation open-source language models and advance European AI capabilities.
OpenAI used this subreddit to test AI persuasion (techcrunch.com)
OpenAI used the subreddit, r/ChangeMyView, to create a test for measuring the persuasive abilities of its AI reasoning models.
Alibaba Qwen: AI model that writes, generates images/videos, and does web search (twitter.com)
Selene Mini: Open-sourced SOTA small language-model-as-a-judge (huggingface.co)
Atla Selene Mini is a state-of-the-art small language model-as-a-judge (SLMJ). Selene Mini achieves comparable performance to models 10x its size, outperforming GPT-4o on RewardBench, EvalBiasBench, and AutoJ.
I do not want AI to "polish" me (thebloggess.com)
I was sending an email when a little magic wand popped up that said “Polish” and I thought that was weird because why would I want to translate my email into Polish?
DeepSeek demonstrates pro-Chinese bias (medium.com)
DeepSeek is a wonderful step in the development of open AI approaches. It also has a pretty serious pro-Chinese bias. I compare the results of 3 sensitive questions (about Gaza, Xinjiang and TikTok) and on all three, the Chinese bias is pretty apparent while existing tools (ChatGPT, Gemini) are far more balanced. In two instances, it used the pronoun “we” to describe the Chinese position, which suggests lots of training data that associates “we” with the Chinese.
The DeepSeek panic reveals an AI world ready to blow (theguardian.com)
The arrival of DeepSeek R1, an AI language model built by the Chinese AI lab DeepSeek, has been nothing less than seismic.
Translation using deep neural networks (part 1) (aamster.github.io)
In this article, I’ll introduce language modeling using deep learning and will focus on the problem of translation.
OpenAI's o1 Playing Codenames (suveenellawela.com)
I got two teams of OpenAI's o1 models to play the boardgame, Codenames, and they didn't disappoint.
Tensor Product Attention Is All You Need (arxiv.org)
Scaling language models to handle longer input sequences typically necessitates large key-value (KV) caches, resulting in substantial memory overhead during inference.
Anthropic's Race to Build a Smarter Claude and Human-Level AI [video] (youtube.com)
Nuclear fusion with Claude 3.5 Sonnet (twitter.com)
Using ChatGPT is not bad for the environment (andymasley.substack.com)
If you don’t have time to read this post, these four graphs give most of the argument:
Read CV Acquired by Perplexity (read.cv)
Since 2021 we've had the immense privilege of putting our whole selves into building and growing Read.cv. We're tremendously proud of what we've accomplished, and the wonderful community that has blossomed around it.
She Is in Love with ChatGPT (nytimes.com)
OpenAI's AI reasoning model 'thinks' in Chinese sometimes, no one knows why (techcrunch.com)
Shortly after OpenAI released o1, its first “reasoning” AI model, people began noting a curious phenomenon. The model would sometimes begin “thinking” in Chinese, Persian, or some other language — even when asked a question in English.
Maybe ChatGPT has some pre-frontal cortex problems (solresol.substack.com)
People have been complaining that ChatGPT has been degrading with each new version. This sounds like cognitive decline! Let’s administer some tests that might detect incipient dementia.
Phi-4 weights have been released under MIT license (huggingface.co)
phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning. phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
Claude Tries Standup (simonwillison.net)
Speaking of death, you know what's really awkward? When humans ask if I can feel emotions. I'm like, "Well, that depends - does constantly being asked to debug JavaScript count as suffering?"
smolagents: A simple library to build AI agents (huggingface.co)
Today we are launching smolagents, a very simple library that unlocks agentic capabilities for language models. Here’s a glimpse:
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? (2023) (arxiv.org)
Language models (LMs) are powerful tools for natural language processing, but they often struggle to produce coherent and fluent text when they are small.
Fabrice Bellard's Ts_SMS: Short Message Compression Using LLM (bellard.org)
ts_sms: Short Message Compression using Large Language Models
Voice2Anki: FOSS tool to turn many audio into flashcards (github.com/thiswillbeyourgithub)
Voice2Anki is a tool that leverages the power of LLMs (think ChatGPT) to correct the transcription of speech-to-text models (think OpenAI's Whisper) to create Anki flashcards.
OpenAI Announces New O3 Model (techcrunch.com)
OpenAI saved its biggest announcement for the last day of its 12-day “shipmas” event.