Hacker News with Generative AI: Safety

Exploring model welfare (anthropic.com)
Human welfare is at the heart of our work at Anthropic: our mission is to make sure that increasingly capable and sophisticated AI systems remain beneficial to humanity.
A study of lightning fatalities inside buildings while using smartphones [pdf] (2024) (electricalsafetyworkshop.org)
An Uber drove away with her kid and wouldn't connect her with the driver (cbc.ca)
An Ontario mother is raising concerns about Uber's emergency policies after one of its drivers drove away with her five-year-old daughter still in the back seat.
Beijing Pulls Further Ahead with Strict New EV Battery Safety Mandate (gizmodo.com)
China is introducing new regulatory standards for electric car batteries that will represent some of the strictest safety and testing requirements in the world.
Top OpenAI Catastrophic Risk Official Steps Down Abruptly (garrisonlovely.substack.com)
OpenAI's top safety staffer responsible for mitigating catastrophic risks quietly stepped down from the role weeks ago, according to a LinkedIn announcement posted yesterday.
OpenAI adds new safety net to prevent ChatGPT from advising on creating viruses (hindustantimes.com)
Powerful generative artificial intelligence models have a tendency to hallucinate. They can often offer improper advice and stray off track, which can potentially misguide people.
Fatal Accident at Universal Stainless Leads Steelworkers to Flag Safety Failures (hntrbrk.com)
A worker died this week while operating a crane at a steel mill in Dunkirk, New York, according to former and current employees.
Show HN: I tried making YouTube safer for my kids (maekitsafe.com)
Provide them with content you trust. Only YouTube channels you approve, with no comments, no autoplay, and no algorithmic recommendations.
Boron-Based Flame Retardants: Enabling Everyday Safety (borax.com)
On November 6, 1961, one of the most costly and destructive residential fires in California history ignited.
'Exploding' Tunnock's teacakes cleared by tests to fly again (bbc.co.uk)
News Graveyards: How Dangers to War Reporters Endanger the World (watson.brown.edu)
Since the 2000s, national governments and terrorist groups – from Israel, Syria’s Assad regime and the United States to the Islamic State – have found ways to curtail conflict coverage through myriad means, from repressive policies to armed attack.
After Crash FAA Change Requires All Aircraft at Reagan to Broadcast Positions (nytimes.com)
All aircraft flying near Ronald Reagan National Airport will now be required to broadcast their positions to air traffic controllers, the acting administrator of the Federal Aviation Administration told a Senate subcommittee on Thursday.
Flight Attendants on Deportation Planes Say Disaster Is "Only a Matter of Time" (propublica.org)
The deportation flight was in the air over Mexico when chaos erupted in the back of the plane, the flight attendant recalled. A little girl had collapsed. She had a high fever and was taking ragged, frantic breaths.
Japan warns 'big one' earthquake could kill 300k people (ft.com)
Volvo Tells Plug-In Hybrid Owners to Stop Charging (carscoops.com)
Volvo is recalling thousands of plug-in hybrids in the United States as they could short circuit when parked and fully charged. This poses a serious fire risk and it could occur at night, when the vehicle is parked in your garage.
The Jailbreak Bible (generalanalysis.com)
The rapid evolution of Large Language Models (LLMs) has unlocked remarkable new possibilities, but with these advances come unexpected blind spots. Even rigorously safety-aligned LLMs can be subtly manipulated through carefully designed adversarial prompts, commonly known as "jailbreaks." By exploiting linguistic nuances, these jailbreaks can sidestep safeguards, enabling models to divulge toxic content, propagate misinformation, or even disclose detailed instructions related to dangerous chemical, biological, radiological, and nuclear threats (Anthropic, 2023a).
Waymos crash less than human drivers (understandingai.org)
After 50 million miles, Waymos crash a lot less than human drivers
TfL bans most e-bikes on trains amid concern over igniting batteries (theguardian.com)
Most e-bikes will be banned across the London Underground and other Transport for London services, after growing safety concerns over igniting batteries.
A pilot makes a tough call and cancels the flight because of some alarming signs (reddit.com)
A pilot reluctantly makes an extremely tough call and cancels the flight because of some alarming signs on the aircraft
National Lab Creates New Device to Test Safety Limits of Nuclear Fuel (energy.gov)
Idaho National Laboratory (INL) recently released footage of a new experiment that simulates what happens to a nuclear fuel pin when it starts to overheat.
EU to take action to protect children from harmful practices in video games (europa.eu)
Tesla Recalls Just About Every Cybertruck as Decorative Steel Falls Off (thedrive.com)
Tesla has issued another recall for more than 46,000 of its Cybertrucks that may undergo a rapid unplanned disassembly of their roof trim while underway, the company said in a campaign it launched Tuesday.
Tesla Recalls Every Single Cybertruck over Stainless Steel Trims Falling Off (carscoops.com)
Tesla recalls 46,000 Cybertrucks due to roof panels potentially detaching during driving.
Tesla Booted from Vancouver International Auto Show over 'Safety' (cnn.com)
Does unsafe undermine Rust's guarantees? (steveklabnik.com)
When people first hear about unsafe in Rust, they often have questions. A very normal thing to ask is, “wait a minute, doesn’t this defeat the purpose?” And while it’s a perfectly reasonable question, the answer is both straightforward and has a lot of nuance. So let’s talk about it.
Tesla fans expose Tesla's own shadiness in attempt to defend Autopilot crash (electrek.co)
A group of Tesla fans and investors has inadvertently exposed Tesla’s shadiness regarding crashes involving Autopilot by attempting to claim that the advanced driver-assist system was not active in a crash test.
Tesla drives into Wile E. Coyote fake road wall in camera vs. Lidar test (electrek.co)
Tesla Autopilot drove into Wile E. Coyote-style fake road wall in the middle of the road in a camera versus lidar test.
Power bank likely caused S Korea plane fire – investigators (bbc.com)
A portable power bank likely caused a fire that engulfed and destroyed a passenger plane in South Korea in January, according to local authorities.
College Kids Burned to Death Inside a Cybertruck Because the Doors Wouldn't Open (jalopnik.com)
The primary cause of a terrifying Cybertruck crash in November was determined to likely have been driver impairment after a night of drinking and drugs, California Highway Patrol told The LA Times.