Hacker News with Generative AI: Distributed Systems

Distributed Transactions at Scale in Amazon DynamoDB (2023) (blogspot.com)
This paper appeared in July at USENIX ATC 2023. If you haven't read about the architecture and operation of DynamoDB, please first read my summary of the DynamoDB ATC 2022 paper. The big omission in that paper was any discussion of transactions. This paper amends that. It is great to see DynamoDB, and AWS in general, publishing and sharing more widely than before.
Misty: A secure distributed actor language (mistysystem.com)
Public Domain 2025 Douglas Crockford
Encore – Back end framework for type-safe distributed systems (encore.dev)
Did we miss P In CAP? Partial Progress Conjecture under Asynchrony (arxiv.org)
Each application developer desires to provide its users with consistent results and an always-available system despite failures. Boldly, the CAP theorem disagrees. It states that it is hard to design a system that is both consistent and available under network partitions; select at most two out of these three properties.
We Have Google Drive at Home: Musings on Merkle-Tree Based File Sharing (dolthub.com)
Suppose you have a directory of files that you want to sync with your friends. When the files change, you want your friends to be able to download just the changes without needing to re-download the entire directory again. And you want this to scale, no matter how many or how large the files are. What's the best way to do this?
The end-to-end principle in distributed systems (tedinski.com)
The theme for the last couple weeks has been basic design considerations for distributed systems.
Beyond Gradient Averaging in Parallel Optimization (arxiv.org)
We introduce Gradient Agreement Filtering (GAF) to improve on gradient averaging in distributed deep learning optimization.
Show HN: Jido – Run 10k agents at 25KB each (Elixir) (github.com/agentjido)
Jido is a foundational framework for building autonomous, distributed agent systems in Elixir.
Exploring Alternatives to UUIDv4; Enter ULIDs (jirevwe.github.io)
UUIDv4 is a standardized, commonly used format for generating unique identifiers, and it is widely used in distributed systems. Recently there have been attempts to introduce new identifier formats that are shorter, URL-friendly, lexicographically sortable, and collision-safe during generation.
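To make "lexicographically sortable" concrete, here is a minimal sketch of a ULID-style generator, assuming the layout in the public ULID spec (a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford Base32 characters). The function name `new_ulid` and its `timestamp_ms` parameter are hypothetical, and the sketch leaves out the spec's monotonicity rule for IDs created within the same millisecond.

```python
import os
import time
from typing import Optional

# Crockford Base32 alphabet (no I, L, O, U); characters are in ASCII order,
# so comparing encoded strings matches comparing the underlying integers.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Return a 26-character ULID-style identifier (illustrative sketch)."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = (timestamp_ms << 80) | randomness           # 128 bits in total

    chars = []
    for _ in range(26):                                  # 26 * 5 bits >= 128 bits
        chars.append(ALPHABET[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))
```

Because the timestamp occupies the most significant bits, IDs generated in different milliseconds sort by creation time when compared as plain strings.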
Use of Logical Clocks in Databases (blogspot.com)
Sometimes I cache: implementing lock-free probabilistic caching (cloudflare.com)
HTTP caching is conceptually simple: if the response to a request is in the cache, serve it, and if not, pull it from your origin, put it in the cache, and return it.
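The "conceptually simple" flow described above can be sketched in a few lines. This is an illustration only: the class name `ReadThroughCache` and the `fetch_from_origin` callback are hypothetical, and the sketch ignores expiry, cache-control headers, and the lock-free probabilistic technique the article itself is about.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Serve from the cache on a hit; on a miss, fetch, store, then return."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Cache hit: the origin is not contacted.
            self.hits += 1
            return self._store[url]
        # Cache miss: pull from the origin, remember the response, return it.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body
```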
400TB Single Cluster: OceanBase Powers Kwai's Core Business (oceanbase.github.io)
Kwai is a short video app boasting more than 10 million daily active users. How does it efficiently process highly concurrent user requests? Kwai once deployed multiple MySQL clusters in the backend to support high traffic with large data storage and satisfactory performance. What are the weak points of this conventional sharding solution? What pushed Kwai to select distributed databases and eventually deploy OceanBase Database?
Show HN: Rivet Actors – Durable Objects built with Rust, FoundationDB, Isolates (github.com/rivet-gg)
🔩 Run and scale realtime applications with Rivet Actors
CRDTs and Collaborative Playground (cerbos.dev)
At Cerbos, we specialize in simplifying complex authorization logic to empower developers with the tools to implement secure, scalable, and maintainable access control systems.
Load is not what you should balance: Introducing Prequal (usenix.org)
We present PReQuaL (Probing to Reduce Queuing and Latency), a load balancer for distributed multi-tenant systems.
Designing a distributed circuit breaker in Golang (getconvoy.io)
One of the major problems of designing a webhook delivery system is designing around bad/zombie endpoints. Zombie endpoints are dead endpoints that fail continuously and, over time, clog up your queues, create back pressure, and delay event delivery to legitimate webhook endpoints. Circuit breakers are the best-known mechanism for dealing with unreliable HTTP API endpoints, preventing failures from upstream services from cascading into our system.
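As a rough illustration of the circuit-breaker idea, here is a single-process sketch in Python. It is not the article's Go implementation and it is not distributed (no shared state between workers); the names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the breaker is closed

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                # Open: fail fast instead of queueing work for a dead endpoint.
                raise CircuitOpenError("endpoint temporarily disabled")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                # Trip the breaker, or re-open it after a failed trial call.
                self._opened_at = self._clock()
            raise
        # Success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A webhook sender would keep one breaker per endpoint and wrap each delivery attempt in `call`, so a zombie endpoint is skipped quickly rather than retried on every event.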
Eventual Consistency Is Tricky (systemdesigncodex.com)
The concept of eventual consistency refers to a system condition where all parts of the system reach the same state, even though they may be temporarily inconsistent due to delays or failures.
The Acton Programming Language (acton-lang.org)
Acton is a general purpose programming language, designed to be useful for a wide range of applications, from desktop applications to embedded and distributed systems.
Distributed Erlang (vereis.com)
Building a distributed log using S3 (under 150 lines of Go) (avi.im)
I will show how we can implement a durable, distributed, and highly available log using S3. This post is the third part in the series:
A simple way to understand CRDTs (interjectedfuture.com)
There's a simple way to understand CRDTs: It leverages algebra to unmix the inevitable mixing of data when syncing over an unreliable network.
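One small, well-known CRDT that shows the algebra at work is the grow-only counter (G-Counter). The sketch below is a generic illustration, not code from the linked article; the class and method names are assumptions. The merge takes a per-node maximum, which is commutative, associative, and idempotent, so replicas converge no matter how often or in what order they sync.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max: merging the same state twice changes nothing.
        for node, n in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())
```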
A short introduction to Interval Tree Clocks (2017) (separateconcerns.com)
One of the things I work on at Lima is master-master filesystem replication. In this kind of system, we need to track causality. In a nutshell, given two events modifying a given piece of data and originating from different nodes in the system, we want to know if one of those events could have influenced the other one, or in other words if one of those events “happened before” the other one.
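The "happened before" question can be illustrated with plain vector clocks, a simpler relative of the Interval Tree Clocks the article covers. This sketch assumes each event carries a mapping from node name to counter; the function names are hypothetical.

```python
from typing import Dict

Clock = Dict[str, int]  # node name -> number of events seen from that node


def _leq(a: Clock, b: Clock) -> bool:
    """True if every entry of a is <= the matching entry of b (missing = 0)."""
    return all(n <= b.get(node, 0) for node, n in a.items())


def happened_before(a: Clock, b: Clock) -> bool:
    """a could have influenced b: a <= b everywhere and the clocks differ."""
    return _leq(a, b) and not _leq(b, a)


def concurrent(a: Clock, b: Clock) -> bool:
    """Neither event could have influenced the other."""
    return not _leq(a, b) and not _leq(b, a)
```

For example, an event stamped `{"A": 1}` happened before one stamped `{"A": 1, "B": 1}`, while `{"A": 1, "B": 1}` and `{"A": 2}` are concurrent and need conflict resolution.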
How Distributed Systems Avoid Race Conditions Using Pessimistic Locking? (scalablethread.com)
When two updates to a data source are executed in a single-process system, they are always run one after the other, as the process will only be working on one update at a particular time. However, in a multi-process system, there is a chance that the two updates can be executed at the same time by the processes on the shared data source.
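A minimal sketch of the pessimistic approach: take an exclusive lock before the read-modify-write so that concurrent updates run one after the other. Threads and `threading.Lock` stand in here for separate processes and a database or distributed lock, which is an assumption made to keep the example self-contained; `Account` and `run_concurrent_withdrawals` are hypothetical names.

```python
import threading
from typing import List


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount: int) -> bool:
        # Acquire the lock first, so the balance check and the update
        # cannot interleave with another withdrawal.
        with self._lock:
            if self.balance < amount:
                return False
            self.balance -= amount
            return True


def run_concurrent_withdrawals(account: Account, amount: int, workers: int) -> int:
    """Run `workers` withdrawals in parallel and return how many succeeded."""
    results: List[bool] = []
    results_lock = threading.Lock()

    def worker() -> None:
        ok = account.withdraw(amount)
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(True)
```

Without the lock inside `withdraw`, two workers could both pass the balance check before either subtracts, which is the race the snippet describes.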
Distributed Systems 4th Edition (distributed-systems.net)
This is the fourth edition of “Distributed Systems.” We have stayed close to the setup of the third edition, including examples of (part of) existing distributed systems close to where general principles are discussed. For example, we have included material on blockchain systems, and discuss their various components throughout the book. We have, again, used special boxed sections for material that can be skipped at first reading.
The Fallacies of Distributed Systems (francofernando.com)
More than 20 years ago, Peter Deutsch and others at Sun Microsystems came up with a list of false assumptions that many developers new to distributed applications always make.
Netflix's Distributed Counter Abstraction (netflixtechblog.com)
In our previous blog post, we introduced Netflix’s TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction. This counting service, built on top of the TimeSeries Abstraction, enables distributed counting at scale while maintaining similar low latency performance. As with all our abstractions, we use our Data Gateway Control Plane to shard, configure, and deploy this service globally.
Jepsen: Bufstream 0.1 (jepsen.io)
Bufstream is a Kafka-compatible streaming system which stores records directly in an object storage service like S3. We found three safety and two liveness issues in Bufstream, including stuck consumers and producers, spurious zero offsets, and the loss of acknowledged writes in healthy clusters. These problems were resolved by version 0.1.3.
The P Programming Language: Formal modeling and analysis of distributed systems (github.com/p-org)
Challenge: Distributed systems are notoriously hard to get right. Programming these systems is challenging because of the need to reason about correctness in the presence of myriad possible interleaving of messages and failures. Unsurprisingly, it is common for service teams to uncover correctness bugs after deployment. Formal methods can play an important role in addressing this challenge!
This Microservice Should Have Been a Library (github.com)
About 6 years ago, when I was a PHP ecommerce dev, I always wanted to work with distributed systems and microservices.
DAWN: Designing Distributed Agents in a Worldwide Network (arxiv.org)
The rapid evolution of Large Language Models (LLMs) has transformed them from basic conversational tools into sophisticated entities capable of complex reasoning and decision-making.