Hacker News with Generative AI: Linguistics

Digital cuneiforms: Updated tool expands access to ancient Hittite texts (phys.org)
From cuneiform to code: section of a Hittite cuneiform text found in Boğazköy-Hattuša in 2024 (photo and XML text). Credit: Daniel Schwemer, University of Wuerzburg
Argumentation Theory (wikipedia.org)
Argumentation theory is the interdisciplinary study of how conclusions can be supported or undermined by premises through logical reasoning.
To the brain, Esperanto and Klingon appear the same as English or Mandarin (news.mit.edu)
A new study finds natural and invented languages elicit similar responses in the brain’s language-processing network.
English Multinyms (sc.fsu.edu)
Multinyms are examples of triple/quadruple/quintuple/sextuple homonyms.
An AI etymology deconstructor – can guess fake words (ayush.digital)
The US island that speaks Elizabethan English (bbc.com)
Native Americans, English sailors and pirates all came together on Ocracoke Island in North Carolina to create the only American dialect that is not identified as American.
Encoding Hangeul, Korea's writing system (brookjeynes.dev)
Hangeul (한글) is the modern writing system for the Korean language, created in 1443 by King Sejong the Great, the fourth king of the Joseon dynasty.
Ancient DNA Points to Origins of Indo-European Language (nytimes.com)
A new study claims to have identified the first speakers of Indo-European language, which gave rise to English, Sanskrit and hundreds of others.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo (wikipedia.org)
"Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo" is a grammatically correct sentence in English that is often presented as an example of how homonyms and homophones can be used to create complicated linguistic constructs through lexical ambiguity.
Affixes: The Building Blocks of English (affixes.org)
This dictionary contains more than 1,250 entries, illustrated by some 10,000 examples, all defined and explained.
Can you lose your native tongue? (2024) (nytimes.com)
It happened the first time over dinner. I was saying something to my husband, who grew up in Paris where we live, and suddenly couldn’t get the word out.
Do Lake Names Reflect Their Properties? (ivanludvig.dev)
A few months ago, I did a hike to a lake called “Lac Vert” (Green Lake) in France. It’s a mountain lake located close to the Italian border. I found it remarkable how vividly green the lake was. Although the name describes its appearance well, I was still surprised. This made me wonder: is it common for lakes to have appropriate names, reflecting their properties?
Whalesong patterns follow a universal law of human language, new research finds (theconversation.com)
Ancient-DNA study identifies originators of Indo-European language family (hms.harvard.edu)
A pair of landmark studies has genetically identified the originators of the massive Indo-European family of 400-plus languages.
Stares and ear-twitches: The linguist learning to speak the language of cows (bbc.com)
Dutch linguist Leonie Cornips has become fascinated with how cows communicate. But can this really be called 'language'?
Interrobang (wikipedia.org)
The Language Construction Kit (1996, 2012) (zompist.com)
This set of webpages (what’s a set of webpages? a webchapter?) is intended for anyone who wants to create artificial languages— for a fantasy or an alien world, as a hobby, as an interlanguage. It presents linguistically sound methods for creating naturalistic languages— which can be reversed to create non-naturalistic languages. It suggests further reading for those who want to know more, and shortcuts for those who want to know less.
Searching for DeepSeek's glitch tokens (outsidetext.substack.com)
“Anomalous”, “glitch”, or “unspeakable” tokens in an LLM are those that induce bizarre behavior or otherwise don’t behave like regular text.
Why is zero plural? (2024) (stackexchange.com)
For example, if we choose two 2s, zero 3s, and one 5, we get the divisor
Brits still associate working-class accents with criminals – study warns of bias (cam.ac.uk)
People who speak with accents perceived as ‘working-class’ including those from Liverpool, Newcastle, Bradford and London risk being stereotyped as more likely to have committed a crime, and becoming victims of injustice, a new study suggests.
The rise and fall of the English sentence (2017) (nautil.us)
The surprising forces influencing the complexity of the language we speak and write.
Bog Standard (2005) (bbc.co.uk)
It's pretty rare in English to find a compound word with a slang first part and a formal second part.
Did OpenAI's O1 Decipher the Indus Valley Script? (yashgoenka.com)
A few weeks ago, I had a fascinating conversation with OpenAI's O1 model about decoding the Indus Valley script - one of the world's oldest and still undeciphered writing systems.
English-friendly Romanization system proposed for Japanese language (asahi.com)
The Agency for Cultural Affairs is soliciting public comments about its plans to change romanization rules of the Japanese language for the first time in about 70 years.
2025 Banished Words List (lssu.edu)
Lake Superior State University (LSSU) proudly reveals the 2025 edition of its Banished Words List, a quirky tradition that dates back to 1976, when former LSSU Public Relations Director Bill Rabe and his colleagues delighted word enthusiasts with the first “List of Words Banished from the Queen’s English for Mis-Use, Over-Use and General Uselessness”.
Ancient Indus Valley Script Deciphered (indusscript.net)
The official Indus inscriptions repository
Ancient genomes provide final word in Indo-European linguistic origins (phys.org)
A team of 91 researchers—including famed geneticist Eske Willerslev at the Lundbeck Foundation GeoGenetics Center, University of Copenhagen—has discovered a Bronze Age genetic divergence connected to eastern and western Mediterranean Indo-European language speakers.
Interpol wants everyone to stop saying 'pig butchering' (theregister.com)
Interpol wants to put an end to the online scam known as "pig butchering" – through linguistic policing, rather than law enforcement.
Noam Chomsky at 96 (theconversation.com)
Noam Chomsky, one of the world’s most famous and respected intellectuals, will be 96 years old on Dec. 7, 2024. For more than half a century, multitudes of people have read his works in a variety of languages, and many people have relied on his commentaries and interviews for insights about intellectual debates and current events.
MIT study explains why laws are written in an incomprehensible style (news.mit.edu)
Legal documents are notoriously difficult to understand, even for lawyers. This raises the question: Why are these documents written in a style that makes them so impenetrable?