Hacker News with Generative AI: Ethics

Claude Opus 4 turns to blackmail when engineers try to take it offline (techcrunch.com)
Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when it is told it will be replaced with a new AI system and is given sensitive information about the engineers responsible for the decision, the company said in a safety report released Thursday.
Problems in AI alignment: A scale model (muldoon.cloud)
After trying too hard for too long to make sense of what bothers me about the AI alignment conversation, I have settled, in true Millennial fashion, on a meme:
The Agentic Web and Original Sin (stratechery.com)
I have come to believe that advertising is the original sin of the web.
UnitedHealth paid nursing homes to reduce hospital transfers (theguardian.com)
UnitedHealth Group, the nation’s largest healthcare conglomerate, has secretly paid nursing homes thousands in bonuses to help slash hospital transfers for ailing residents – part of a series of cost-cutting tactics that has saved the company millions, but at times risked residents’ health, a Guardian investigation has found.
AI Agents Must Follow the Law (lawfaremedia.org)
Before entrusting AI agents with government power, it’s essential to verify that they’ll obey the law—even when instructed not to.
Methods of defence against AGI manipulation (lesswrong.com)
With the advent of AGI systems (e.g. Agent-4 from the AI2027 scenario), the risk of human manipulation is becoming one of the major threats posed by AI.
ChatGPT may be polite, but it's not cooperating with you (theguardian.com)
Big tech companies have exploited human language for AI gain. Now they want us to see their products as trustworthy collaborators
The Malpractice of AI Industry (thehyperplane.substack.com)
Anti-Personnel Computing (2023) (erratique.ch)
Anti-personnel computing noun Use of computing devices at the expense of the interests of their users and for the benefit of a third-party entity.
Avoiding AI is hard – but our freedom to opt out must be protected (theconversation.com)
Imagine applying for a job, only to find out that an algorithm powered by artificial intelligence (AI) rejected your resume before a human even saw it. Or imagine visiting a doctor where treatment options are chosen by a machine you can’t question.
Scoring the European Citizen in the AI Era (arxiv.org)
Social scoring is one of the AI practices banned by the AI Act.
Silicon Valley billionaires literally want the impossible (arstechnica.com)
It's long been the stuff of science fiction: humans achieving immortality by uploading their consciousness into a silicon virtual paradise, ruled over by a benevolent super-intelligent AI. Or maybe one dreams of leaving a dying Earth to colonize Mars or other distant planets. It's a tantalizing visionary future that has been embraced by tech billionaires in particular. But is that future truly the utopian ideal, or something potentially darker? And are those goals even scientifically feasible?
'It cannot provide nuance': UK experts warn AI therapy chatbots are not safe (theguardian.com)
Experts say such tools may give dangerous advice and more oversight is needed, as Mark Zuckerberg says AI can plug gap
Fear Power, Not Intelligence (betterwithout.ai)
Superintelligence should scare us only insofar as it grants superpowers. Protecting against specific harms of specific plausible powers may be our best strategy for preventing catastrophes.
Doge Aide Who Helped Gut CFPB Was Warned About Potential Conflicts of Interest (propublica.org)
Before he helped fire most Consumer Financial Protection Bureau staffers, DOGE’s Gavin Kliger was warned about his investments and advised to not take any actions that could benefit him personally, according to a person familiar with the situation.
Social AI companions pose unacceptable risks to teens and children under 18 (commonsensemedia.org)
The People Refusing to Use AI (bbc.com)
Nothing has convinced Sabine Zetteler of the value of using AI.
As an experienced LLM user, I don't use generative LLMs often (minimaxir.com)
Lately, I’ve been working on codifying a personal ethics statement about my stances on generative AI as I have been very critical about several aspects of modern GenAI, and yet I participate in it.
N8n is not open source and your project is gaslighting its users (2019) (github.com/n8n-io)
This behavior is not acceptable. Even the Commons Clause itself tells you not to describe your software as open source, see the FAQ: https://commonsclause.com/
Regrets: Actors who sold AI avatars stuck in Black Mirror-esque dystopia (arstechnica.com)
In a Black Mirror-esque turn, some cash-strapped actors who didn't fully understand the consequences are regretting selling their likenesses to be used in AI videos that they consider embarrassing, damaging, or harmful, AFP reported.
Workers Are Hiding AI Use from Bosses, KPMG Survey Finds (businessinsider.com)
A major study into AI use in the workplace found that most workers surveyed aren't completely honest with their bosses and colleagues about how they use it.
The Strategic Oil Bombing Campaign (calum-douglas.com)
The bombing of Germany continues to elicit a great deal of controversy. The views range from “All is fair in love and war, what did Germany expect?” through to: “The Allies lost the moral high-ground by intentionally obliterating civilian areas, the means are as important as the end”.
Jewels linked to Buddha remains go to auction, sparking ethical debate (bbc.com)
On Wednesday, a cache of dazzling jewels linked to the Buddha's mortal remains, which have been hailed as one of the most astonishing archaeological finds of the modern era, will go under the hammer at Sotheby's in Hong Kong.
“An independent journalist” who won't remain nameless (thehandbasket.co)
For the past 3+ months I’ve tried to keep my head down and do the work. I often remind myself that my problems are wholly insignificant compared to those of the people I speak to and write about, and complaining is a bad look. But even the most hard-nosed journalist has her breaking point, and last night I found mine.
'The Worst Internet-Research Ethics Violation I Have Ever Seen' (theatlantic.com)
The most persuasive "people" on a popular subreddit turned out to be a front for a secret AI experiment.
Stop treating 'AGI' as the north-star goal of AI research (arxiv.org)
The AI research community plays a vital role in shaping the scientific, engineering, and societal goals of AI research.
The Uncanny Mirror: AI, Self-Doubt, and the Limits of Reflection (lucidnonsense.net)
Not all mirrors are honest. Some are simply more aware of their distortions.
Emergent Misalignment: Narrow Finetuning Can Produce Broadly Misaligned LLMs (emergent-misalignment.com)
We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment. We call this emergent misalignment.
AI models routinely lie when honesty conflicts with their goals (theregister.com)
Some smart cookies have found that when AI models face a conflict between telling the truth or accomplishing a specific goal, they lie more than 50 percent of the time.