Hacker News with Generative AI: Safety

Southwest will require passengers to keep chargers visible due to fire risk (npr.org)
Passengers flying on Southwest Airlines will soon be required to keep battery packs and other portable charging devices visible if they're using them during a flight.
Center for AI Safety's new spokesperson suggests "burning down labs" (twitter.com)
List of Non-Water Floods (wikipedia.org)
Most non-water floods (excluding mudflows, oil spills, or volcanic lahars) involve storage facilities suddenly releasing liquids, or industrial retaining reservoirs releasing toxic waste.
Lufthansa plane flown by autopilot after pilot faints in cockpit (scmp.com)
A Lufthansa flight was flown by autopilot after the co-pilot, alone in the cockpit while the pilot was in the bathroom, fainted, Spanish investigators said in a report on last year's incident released on Saturday.
After 2 SpaceX Explosions, UK Officials Ask FAA to Change Starship Flight Plans (propublica.org)
British officials told the U.S. they are concerned about the safety of SpaceX’s plans to fly its next Starship rocket over British territories in the Caribbean, where debris fell earlier this year after two of the company’s rockets exploded, according to documents reviewed by ProPublica.
Air-Traffic Controller Just Averted a Midair Collision. Now He's Speaking Out (wsj.com)
Experts say Silicon Valley prioritizes products over safety, AI research (cnbc.com)
Newark air traffic crisis: just one controller on up to 180 takeoffs, landings (nypost.com)
The safety nightmare continues at Newark Liberty International Airport, where all air traffic control will be manned by just one fully qualified person during its busiest time tonight, The Post can exclusively reveal.
Methods of defence against AGI manipulation (lesswrong.com)
With the advent of AGI systems (e.g. Agent-4 from the AI2027 scenario), the risk of human manipulation is becoming one of the major threats posed by AI.
Lock-Free Rust: How to Build a Rollercoaster While It's on Fire (yeet.cx)
Update turns Google Gemini into a prude, breaking apps for trauma survivors (theregister.com)
Google's latest update to its Gemini family of large language models appears to have broken the controls for configuring safety settings, disrupting applications that require lowered guardrails, such as apps providing solace for sexual assault victims.
Social AI companions pose unacceptable risks to teens and children under 18 (commonsensemedia.org)
Pyrotechny: The art of making fireworks at little cost with complete safety (1873) (gutenberg.org)
The art of Pyrotechny has, like almost every other art in these days of experiment and research, undergone many processes of change and improvement.
Alignment is not free: How model upgrades can silence your confidence signals (variance.co)
The post-training process for LLMs can bias a model's behavior when it encounters content that violates its safety post-training guidelines. As noted in OpenAI's GPT-4 system card, model calibration rarely survives post-training, resulting in models that are extremely confident even when they're wrong.¹ For our use case, we often see this bias push model outputs toward flagging violations, which wastes review time for human reviewers in an LLM-powered content moderation system.
Carmakers Are Embracing Physical Buttons Again (wired.com)
Automakers that nest key controls deep in touchscreen menus—forcing motorists to drive eyes-down rather than concentrate on the road ahead—may have their non-US safety ratings clipped next year.
New Study: Waymo is reducing serious crashes and making streets safer (waymo.com)
The path to Vision Zero requires reducing severe crashes and improving the safety of those most at risk. Our latest research paper shows that the Waymo Driver is making significant strides in both areas. By reducing the most dangerous crashes and providing better protection for pedestrians, cyclists, and other vulnerable road users, Waymo is making streets safer in cities where it operates.
Language equivariance as a way of figuring out what an AI "means" (lesswrong.com)
I recently had the privilege of having my idea criticized at the London Institute for Safe AI, including by Philip Kreer and Nicky Case. Previously the idea was vague; being with them forced me to make the idea specific. I managed to make it so specific that they found a problem with it! That's progress :)
AI companions unsafe for teens under 18, researchers say (mashable.com)
As the popularity of artificial intelligence companions surges amongst teens, critics point to warning signs that the risks of use are not worth the potential benefits.
Technology in cars is being used to make choices for drivers, and this can be dangerous (thestar.com)
We’re slowly ceding control of formerly manual or mechanical systems in our vehicles to electronics. Sometimes with unintended consequences …
Social media and map apps blamed for record rise in mountain rescue callouts (theguardian.com)
Honeypot locations posted on social media and poor quality navigation apps are likely to be responsible for a record number of callouts for mountain rescue services, including a huge rise in young people needing to be saved, analysis reveals.
Exploring model welfare (anthropic.com)
Human welfare is at the heart of our work at Anthropic: our mission is to make sure that increasingly capable and sophisticated AI systems remain beneficial to humanity.
A study of lightning fatalities inside buildings while using smartphones [pdf] (2024) (electricalsafetyworkshop.org)
An Uber drove away with her kid and wouldn't connect her with the driver (cbc.ca)
An Ontario mother is raising concerns about Uber's emergency policies after one of its drivers drove away with her five-year-old daughter still in the back seat.
Beijing Pulls Further Ahead with Strict New EV Battery Safety Mandate (gizmodo.com)
China is introducing new regulatory standards for electric car batteries that will represent some of the strictest safety and testing requirements in the world.
Top OpenAI Catastrophic Risk Official Steps Down Abruptly (garrisonlovely.substack.com)
OpenAI's top safety staffer responsible for mitigating catastrophic risks quietly stepped down from the role weeks ago, according to a LinkedIn announcement posted yesterday.
OpenAI adds new safety net to prevent ChatGPT from advising on creating viruses (hindustantimes.com)
Powerful generative artificial intelligence models have a tendency to hallucinate. They often offer improper advice and stray off track, potentially misleading people.
Fatal Accident at Universal Stainless Leads Steelworkers to Flag Safety Failures (hntrbrk.com)
A worker died this week while operating a crane at a steel mill in Dunkirk, New York, according to former and current employees.
Show HN: I tried making YouTube safer for my kids (maekitsafe.com)
Provide them with content you trust. Only YouTube channels you approve, with no comments, no autoplay, and no algorithmic recommendations.