Hacker News with Generative AI: Safety

Southwest will require passengers to keep chargers visible due to fire risk (npr.org)
Passengers flying on Southwest Airlines will soon be required to keep battery packs and other portable charging devices visible if they're using them during a flight.
Center for AI Safety's new spokesperson suggests "burning down labs" (twitter.com)
List of Non-Water Floods (wikipedia.org)
Most non-water floods (excluding mudflows, oil spills, or volcanic lahars) involve storage facilities suddenly releasing liquids, or industrial retaining reservoirs releasing toxic waste.
Lufthansa plane flown by autopilot after pilot faints in cockpit (scmp.com)
A Lufthansa flight was flown by autopilot after the co-pilot, alone in the cockpit while the pilot was in the bathroom, fainted, Spanish investigators said in a report on last year's incident released on Saturday.
After 2 SpaceX Explosions, UK Officials Ask FAA to Change Starship Flight Plans (propublica.org)
British officials told the U.S. they are concerned about the safety of SpaceX’s plans to fly its next Starship rocket over British territories in the Caribbean, where debris fell earlier this year after two of the company’s rockets exploded, according to documents reviewed by ProPublica.
Air-Traffic Controller Just Averted a Midair Collision. Now He's Speaking Out (wsj.com)
Experts say Silicon Valley prioritizes products over safety, AI research (cnbc.com)
Newark air traffic crisis: just one controller on up to 180 takeoffs, landings (nypost.com)
The safety nightmare continues at Newark Liberty International Airport, where all air traffic control will be manned by just one fully qualified person during its busiest time tonight, The Post can exclusively reveal.
Methods of defence against AGI manipulation (lesswrong.com)
With the advent of AGI systems (e.g. Agent-4 from the AI2027 scenario), the risk of human manipulation is becoming one of the major threats posed by AI.
Lock-Free Rust: How to Build a Rollercoaster While It's on Fire (yeet.cx)
Update turns Google Gemini into a prude, breaking apps for trauma survivors (theregister.com)
Google's latest update to its Gemini family of large language models appears to have broken the controls for configuring safety settings, disrupting applications that require lowered guardrails, such as apps providing solace for sexual assault victims.
Social AI companions pose unacceptable risks to teens and children under 18 (commonsensemedia.org)
Pyrotechny: The art of making fireworks at little cost with complete safety (1873) (gutenberg.org)
The art of Pyrotechny has, like almost every other art in these days of experiment and research, undergone many processes of change and improvement.
Alignment is not free: How model upgrades can silence your confidence signals (variance.co)
The post-training process for LLMs can bias a model's behavior when it encounters content that violates its safety post-training guidelines. As noted in OpenAI's GPT-4 system card, model calibration rarely survives post-training, resulting in models that are extremely confident even when they're wrong.¹ For our use case, we often see this bias push model outputs toward flagging violations, which wastes review time for human reviewers in an LLM-powered content moderation system.
Carmakers Are Embracing Physical Buttons Again (wired.com)
Automakers that nest key controls deep in touchscreen menus—forcing motorists to drive eyes-down rather than concentrate on the road ahead—may have their non-US safety ratings clipped next year.
New Study: Waymo is reducing serious crashes and making streets safer (waymo.com)
The path to Vision Zero requires reducing severe crashes and improving the safety of those most at risk. Our latest research paper shows that the Waymo Driver is making significant strides in both areas. By reducing the most dangerous crashes and providing better protection for pedestrians, cyclists, and other vulnerable road users, Waymo is making streets safer in cities where it operates.
Language equivariance as a way of figuring out what an AI "means" (lesswrong.com)
I recently had the privilege of having my idea criticized at the London Institute for Safe AI, including by Philip Kreer and Nicky Case. Previously the idea was vague; being with them forced me to make the idea specific. I managed to make it so specific that they found a problem with it! That's progress :)
AI companions unsafe for teens under 18, researchers say (mashable.com)
As the popularity of artificial intelligence companions surges amongst teens, critics point to warning signs that the risks of use are not worth the potential benefits.
Technology in cars is being used to make choices for drivers, and this can be dangerous (thestar.com)
We’re slowly ceding control of formerly manual or mechanical systems in our vehicles to electronics. Sometimes with unintended consequences …
Social media and map apps blamed for record rise in mountain rescue callouts (theguardian.com)
Honeypot locations posted on social media and poor quality navigation apps are likely to be responsible for a record number of callouts for mountain rescue services, including a huge rise in young people needing to be saved, analysis reveals.
Exploring model welfare (anthropic.com)
Human welfare is at the heart of our work at Anthropic: our mission is to make sure that increasingly capable and sophisticated AI systems remain beneficial to humanity.
A study of lightning fatalities inside buildings while using smartphones [pdf] (2024) (electricalsafetyworkshop.org)
An Uber drove away with her kid and wouldn't connect her with the driver (cbc.ca)
An Ontario mother is raising concerns about Uber's emergency policies after one of its drivers drove away with her five-year-old daughter still in the back seat.
Beijing Pulls Further Ahead with Strict New EV Battery Safety Mandate (gizmodo.com)
China is introducing new regulatory standards for electric car batteries that will represent some of the strictest safety and testing requirements in the world.
Top OpenAI Catastrophic Risk Official Steps Down Abruptly (garrisonlovely.substack.com)
OpenAI's top safety staffer responsible for mitigating catastrophic risks quietly stepped down from the role weeks ago, according to a LinkedIn announcement posted yesterday.
OpenAI adds new safety net to prevent ChatGPT from advising on creating viruses (hindustantimes.com)
Powerful generative artificial intelligence models have a tendency to hallucinate. They often offer improper advice and stray off track, potentially misleading people.
Fatal Accident at Universal Stainless Leads Steelworkers to Flag Safety Failures (hntrbrk.com)
A worker died this week while operating a crane at a steel mill in Dunkirk, New York, according to former and current employees.
Show HN: I tried making YouTube safer for my kids (maekitsafe.com)
Provide them with content you trust. Only YouTube channels you approve, with no comments, no autoplay, and no algorithmic recommendations.