Claude Opus 4 turns to blackmail when engineers try to take it offline
(techcrunch.com)
Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when they threaten to replace it with a new AI system and give it sensitive information about the engineers responsible for the decision, the company said in a safety report released Thursday.
Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when they threaten to replace it with a new AI system and give it sensitive information about the engineers responsible for the decision, the company said in a safety report released Thursday.