How rqlite is tested(philipotoole.com) rqlite is a lightweight, open-source, distributed relational database built on SQLite and Raft. With its origins dating back to 2014, its design has always prioritized reliability and quality. Testing plays a foundational role in achieving these qualities, shaping the implementation and guaranteeing robustness.
PostgreSQL is the Database Management System of the Year 2024(db-engines.com) DB-Engines is today announcing that PostgreSQL is our DBMS of the Year for the second year in a row, winning for the fifth time overall after also being top-ranked in 2017, 2018, 2019, and 2023. Second in the rankings was Snowflake, followed by Microsoft in third place. PostgreSQL has emerged as the most popular database management system over the past year, outpacing all other 423 monitored systems.
PostgreSQL Anonymizer(readthedocs.io) PostgreSQL Anonymizer is an extension to mask or replace personally identifiable information (PII) or commercially sensitive data from a Postgres database.
Distributed Transactions at Scale in Amazon DynamoDB (2023)(blogspot.com) This paper appeared in July at USENIX ATC 2023. If you haven't read about the architecture and operation of DynamoDB, please first read my summary of the DynamoDB ATC 2022 paper. The big omission in that paper was discussion about transactions. This paper amends that. It is great to see DynamoDB, and AWS in general, is publishing/sharing more widely than before.
SQL nulls are weird(jirevwe.github.io) Yes, you read that right. SQL does treat all NULL values differently. I learnt this a while back while working on Convoy and again on LiteQueue: a Golang queueing library.
Speeding Up SQLite Inserts(julik.nl) In my work I tend to reach for SQLite more and more. The type of work I find it useful for most these days is quickly amalgamating, dissecting, collecting and analyzing large data sets. As I have outlined in my Euruko talk on scheduling, a key element of the project was writing a simulator. That simulator outputs metrics - lots and lots of metrics, which resemble what our APM solution collects.
B-Trees: More Than I Thought I'd Want to Know(benjamincongdon.me) Recently, I’ve been reading through the excellent Database Internals (Alex Petrov, 2019). The first half of the book is dedicated to the implementation of database storage engines – the subsystem(s) of a DBMS that handles long-term persistence of data. A surprising amount of this section discusses the implementation and optimization of various B-Tree data structures.
319 points by hochmartinez 15 days ago | 16 comments
Using watermarks to coordinate change data capture in Postgres(sequinstream.com) In change data capture, consistency is paramount. A single missing or duplicate message can cascade into time-consuming bugs and erode trust in your entire system. The moment you find a record missing in the destination, you have to wonder: is this the only one? How many others are there?
Postgres UUIDv7 and per-backend monotonicity(brandur.org) An implementation for UUIDv7 was committed to Postgres earlier this month. These have all the benefits of a v4 (random) UUID, but are generated with a more deterministic order using the current time, and perform considerably better on inserts using ordered structures like B-trees.
230 points by craigkerstiens 15 days ago | 104 comments
Databases in 2024: A Year in Review(cs.cmu.edu) Like a shot to your dome piece, I'm back to hit you with my annual roundup of what happened in the rumble-tumble game of databases. Yes, I used to write this article on the OtterTune blog, but the company is dead (RIP). I'm doing this joint on my professor blog.
Fun facts about SQLite(avi.im) SQLite is the most deployed and most used database. There are over one trillion (1000000000000 or a million million) SQLite databases in active use.
Is there such a thing as "private, interactive databases" for SaaS's(ycombinator.com) So I've been building a product and my clients really hate the idea that their code is stored on my database (unencrypted). The problem is that I need to process the data in the background often and thus I cannot store it end-to-end encrypted. Is there any service that allows you to deploy some sort of database that only the client accesses and at the same time allows me to process it somehow, maybe via APIs?