Hacker News with Generative AI: Reliability

Are SSDs more reliable than hard drives? (2021) (backblaze.com)
Solid-state drives (SSDs) continue to become more and more a part of the data storage landscape. And while our SSD 101 series has covered topics like upgrading, troubleshooting, and recycling your SSDs, we’d like to test one of the more popular declarations from SSD proponents: that SSDs fail much less often than our old friend, the hard disk drive (HDD).
12 years of Backblaze data center storage drives, visualized (benjdd.com)
Doubts remain over reliability of Texas power grid (npr.org)
Four years after deadly blackout, doubts remain over reliability of Texas’ power grid. Officials say they've improved the grid, but new challenges have emerged as demand grows.
Tolerating full cloud outages with Monzo Stand-in (monzo.com)
Our customers reasonably expect to be able to spend on their card, make bank transfers and pay their bills 24 hours a day, 365 days a year. Their lives don’t have downtime for maintenance so nor should we. We dedicate a lot of our engineering effort to minimise the risk of downtime during technical migrations and other day-to-day operations, but unforeseen incidents that cause unexpected outages are impossible to eliminate entirely.
What Is the Byzantine Generals Problem in Distributed Systems? (scalablethread.com)
The Byzantine Generals Problem is a thought experiment in distributed computing that illustrates the challenge of reaching consensus when some nodes may be untrustworthy or unreliable. It shows how difficult it can be to coordinate actions when some members of the system may act dishonestly.
Every System is a Log: Avoiding coordination in distributed applications (restate.dev)
Building resilient distributed applications remains a tough challenge.
Engineering "home-cooked" software (ownerofhappy.org)
Can you believe the pyramids have had 100% up-time with no human maintenance? If the pyramids can do it, why can't your notes app?
Ente Photo Storage – Reliability and Replication Architecture (ente.io)
The sibling of security is reliability.
How to build 99.999% uptime payment systems (alvaroduran.com)
In May next year, I’ll be at WebExpo in Prague to give a talk on how to build payment systems, and I’d love for you to be there.
Trends in Cars that Could be Affecting Toyota's Legendary Reliability (topspeed.com)
One in 20 new Wikipedia pages seem to be written with the help of AI (newscientist.com)
Nearly 5 per cent of new Wikipedia pages that are published in English seem to contain text generated by artificial intelligence, which could reduce the site’s reliability.
You're Overcomplicating Production (bearblog.dev)
You're going to have outages in production. They're inevitable. The question is how to best minimize outages, both their frequency and duration.
A protocol for reliable notifications over a 1 bit fallible connection (paper.wf)
Imagine you have two devices, a client and a server, connected in a peculiar way:
Practices of Reliable Software Design (entropicthoughts.com)
I was nerd-sniped. Out of the blue, a friend asked me,
Why is F# code robust and reliable? (microsoft.com)
At Access Softek, we’ve been developing software for financial institutions using C# and .NET for two decades, all the while suffering from lots of bugs.
Yes, you can have exactly-once delivery (rongarret.info)
Ask HN: What are your use cases for o1 so far? (ycombinator.com)
Feel like there aren't a lot of use cases where o1 radically changes the reliability issue of LLMs in a way that could make human in the loop approaches less necessary.
The "R" in MTTR: Repair or Recover? What's the Difference? (causely.io)
There are so many ways to measure application reliability today, with hundreds of key performance indicators (KPIs) to measure availability, error rates, user experiences, and quality of service (QoS). Yet every organization I speak with struggles to effectively use these metrics.
Show HN: Harp Proxy – open-source API Proxy for reliability and observability (harp-proxy.net)
HARP Proxy is an open-source HTTP proxy for your remote APIs, running alongside your applications.
Why is F# code so robust and reliable? (microsoft.com)
At Access Softek, we’ve been developing software for financial institutions using C# and .NET for two decades, all the while suffering from lots of bugs.
Lexus and Toyota are the most reliable used-car brands, Tesla third from bottom (cbsnews.com)
Analysis of EV charging stations finds reliability issues galore (emergingtechbrew.com)
SREBench Competition (sreben.ch)
A tale of using chaos engineering at scale to keep our systems resilient (tines.com)
Ask HN: What is the Backup plan for another outage like this? (ycombinator.com)
On Building Systems That Will Fail (1991) (mit.edu:8001)
Fly.io deleted all my apps without notifying me (ycombinator.com)
Copying collectors with block-structured heaps are unreliable (wingolog.org)
Why does SQLite (in production) have such a bad rep? (avi.im)
Simplicity – Google SRE Handbook (2017) (sre.google)