Hacker News with Generative AI: Linguistics

Accents in latent spaces: How AI hears accent strength in English (boldvoice.com)
We work with accents a lot at BoldVoice, the AI-powered accent coaching app for non-native English speakers. Accents are subtle patterns in speech—vowel shape, timing, pitch, and more. Usually, you need a linguist to make sense of these qualities. However, our goal at BoldVoice is to get machines to understand accents, and machines don’t think like linguists. So, we ask: how does a machine learning model understand an accent, and specifically, how strong it is?
Umarell (wikipedia.org)
Umarell (Italian spelling of the Bolognese Emilian word umarèl, Emilian pronunciation: [umaˈrɛːl]; plural umarî) are men of retirement age who spend their time watching construction sites, especially roadworks – stereotypically with hands clasped behind their back and offering unwanted advice to the workers.[1] Its literal meaning is "little man" (also umaréin).[2] The term is employed as lighthearted mockery or self-deprecation.
Unparalleled Misalignments (rickiheicklen.com)
This is where I maintain a list of Unparalleled Misalignments (formerly quadruple entendres), pairs of non-synonymous phrases where the words in one phrase are each synonyms of the words in the other.
The Danish language, even the Danes don't understand it [video] (youtube.com)
Zipf's Law (wikipedia.org)
Zipf's law (/zɪf/; German pronunciation: [tsɪpf]) is an empirical law stating that when a list of measured values is sorted in decreasing order, the value of the n-th entry is often approximately inversely proportional to n.
Why is English so weirdly different from other languages? (aeon.co)
English speakers know that their language is odd. So do people saddled with learning it non-natively. The oddity that we all perceive most readily is its spelling, which is indeed a nightmare. In countries where English isn’t spoken, there is no such thing as a ‘spelling bee’ competition. For a normal language, spelling at least pretends a basic correspondence to the way people pronounce the words. But English is not normal.
Why translating Chinese food names into English is 'an impossible task' (cnn.com)
Greek Particles (1990) (specgram.com)
Two facts well-known to linguists for many years are that Ancient Greek orthography represented speech much more closely than does modern English orthography, or practically any other modern European orthography, and that speech, unlike writing, is full of hesitations, false starts, and meaningless expletive utterances which are not recorded in writing.
The American Style of Quotation Mark Punctuation Makes No Sense (2021) (erichgrunewald.com)
There are different ways of combining quotation and punctuation marks. In the American style, you almost always put periods and commas inside the quotation marks:
Do Large Language Models know who did what to whom? (arxiv.org)
Large Language Models (LLMs) are commonly criticized for not understanding language. However, many critiques focus on cognitive abilities that, in humans, are distinct from language processing. Here, we instead study a kind of understanding tightly linked to language: inferring who did what to whom (thematic roles) in a sentence.
Fear and loathing of the English passive [pdf] (ed.ac.uk)
A Map of British Dialects (2023) (starkeycomics.com)
This map took me a long time to make, and is very detailed, but will always be incomplete and inaccurate due to the nature of language.
Growing a Language [pdf] (1998) (langev.com)
Nominal Aphasia: Problems in Name Retrieval (serendipstudio.org)
Ask HN: Why is uptalk intonation so prevalent in ChatGPT voices? (ycombinator.com)
I’ve tried asking it to set voice with an even tone and less of the annoying uptalk but lately it just continues in this way. It hurts to listen to.
Tariff: The well-travelled Arabic term that became a byword for isolationism (middleeasteye.net)
“To me, the most beautiful word in the dictionary is 'tariff'. And it’s my favourite word.” 
Bonobos use a kind of syntax once thought to be unique to humans (newscientist.com)
Bonobos combine their calls in a complex way that forms distinct phrases, a sign that this type of syntax is more evolutionarily ancient than previously thought.
Bonobos' calls may be the closest thing to animal language we've seen (arstechnica.com)
Bonobos, great apes related to us and chimpanzees that live in the Republic of Congo, communicate with vocal calls including peeps, hoots, yelps, grunts, and whistles. Now, a team of Swiss scientists led by Melissa Berthet, an evolutionary anthropologist at the University of Zurich, discovered bonobos can combine these basic sounds into larger semantic structures.
Spanish speakers in Philadelphia break traditional rules of speech in signs (phys.org)
I've discovered something fascinating about how Spanish speakers in Philadelphia address each other and communicate through public signs.
Digital cuneiforms: Updated tool expands access to ancient Hittite texts (phys.org)
From cuneiform to code: section of a Hittite cuneiform text found in Boğazköy-Hattuša in 2024 (photo and XML text). Credit: Daniel Schwemer, University of Wuerzburg
Argumentation Theory (wikipedia.org)
Argumentation theory is the interdisciplinary study of how conclusions can be supported or undermined by premises through logical reasoning.
To the brain, Esperanto and Klingon appear the same as English or Mandarin (news.mit.edu)
A new study finds natural and invented languages elicit similar responses in the brain’s language-processing network.
English Multinyms (sc.fsu.edu)
Multinyms are examples of triple/quadruple/quintuple/sextuple homonyms.
An AI etymology deconstructor – can guess fake words (ayush.digital)
The US island that speaks Elizabethan English (bbc.com)
Native Americans, English sailors and pirates all came together on Ocracoke Island in North Carolina to create the only American dialect that is not identified as American.
Encoding Hangeul, Korea's writing system (brookjeynes.dev)
Hangeul (한글) is the modern writing system for the Korean language, created in 1443 by King Sejong the Great, the fourth king of the Joseon dynasty.
Ancient DNA Points to Origins of Indo-European Language (nytimes.com)
A new study claims to have identified the first speakers of Indo-European language, which gave rise to English, Sanskrit and hundreds of others.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo (wikipedia.org)
"Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo" is a grammatically correct sentence in English that is often presented as an example of how homonyms and homophones can be used to create complicated linguistic constructs through lexical ambiguity.
Affixes: The Building Blocks of English (affixes.org)
This dictionary contains more than 1,250 entries, illustrated by some 10,000 examples, all defined and explained.
Can you lose your native tongue? (2024) (nytimes.com)
It happened the first time over dinner. I was saying something to my husband, who grew up in Paris where we live, and suddenly couldn’t get the word out.