Hacker News with Generative AI: Databases

Build your own SQLite in Rust, Part 5: Evaluating queries (sylver.dev)
In the previous posts, we've explored the SQLite file format and built a simple SQL parser. It's time to put these pieces together and implement a query evaluator!
A Language Server for Postgres (github.com/supabase-community)
A collection of language tools and a Language Server Protocol (LSP) implementation for Postgres, focusing on developer experience and reliable SQL tooling.
PostgreSQL Lands Self-Join Elimination Optimization (phoronix.com)
More than seven years in the making, a self-join elimination (SJE) feature was merged into PostgreSQL yesterday as a performance optimization for some queries.
Representing Graphs in PostgreSQL (richard-towers.com)
Let’s say we’ve got some graph-like data, such as a social network.
Searchcode.com’s SQLite database is probably 6 terabytes bigger than yours (boyter.org)
searchcode.com’s SQLite database is probably one of the largest in the world, at least for a public-facing website. Its actual size is 6.4 TB, which is probably 6 terabytes bigger than yours.
The Impact of Metadata Configurations on Text-to-SQL Performance [pdf] (corraldata.com)
Siren Call of SQLite on the Server (pid1.dev)
At Terrateam, we are big fans of Fly.io. The service is hosted there and it’s served us well. Just deploy your TOML file, get your infrastructure, do something else with the rest of your day.
Critical PostgreSQL bug tied to zero-day attack on US Treasury (theregister.com)
A high-severity SQL injection bug in the PostgreSQL interactive tool was exploited alongside the zero-day used to break into the US Treasury in December, researchers say.
MySQL at Uber (uber.com)
Bulk inserts on ClickHouse: How to avoid overstuffing your instance (runportcullis.co)
As we hit the midway point of the second month in 2025, a lot of you might be starting to really dig in on new data initiatives and planning key infrastructure changes to your company’s data stack.
Microsoft open sources PostgreSQL extensions (theregister.com)
When Microsoft rolled out an open source extension stack for PostgreSQL to handle document-style data, it wasn't just taking aim at MongoDB – the dominant NoSQL player – but also blurring the lines between relational and non-relational databases, according to one expert.
How about trailing commas in SQL? (eisentraut.org)
Anecdotally, this might be the most requested feature in SQL: Allow some trailing commas.
The problem with MySQL foreign key constraints in Online Schema Changes (2021) (openark.org)
This post explains the inherent problem of running online schema changes in MySQL, on tables participating in a foreign key relationship. We’ll lay some ground rules and facts, sketch a simplified schema, and dive into an online schema change operation.
Show HN: Reducing memory allocs in convoy's flatten pkg (medium.com)
At Convoy, we started off using MongoDB as our primary data store and it was super great for a long time. However, as data complexities grew, we quickly realised a relational database was superior for our use case. We detail our reasons for switching to PostgreSQL here.
Disaggregated OLTP Systems (transactional.blog)
These notes were first prepared for an informal presentation on the various cloud-native disaggregated OLTP RDBMS designs that have been getting published and it cherry-picked one paper per notable design decision. For the papers covered then, I’ve included a summary of the discussion we had after each paper. This page is being actively extended to cover all disaggregated OLTP papers, even for papers that are similar between two different vendors. ("Actively" meaning as of 2025-02-09.)
The missing tier for query compilers (scattered-thoughts.net)
Database query engines used to be able to assume that disk latency was so high that the overhead of interpreting the query plan didn't matter. Unfortunately, these days a cheap NVMe SSD can supply data much faster than a query interpreter can process it.
PostgreSQL Best Practices (speakdatascience.com)
PostgreSQL (Postgres) is one of the most powerful and popular relational database management systems available today. Whether you’re a database administrator, developer, or DevOps engineer, following best practices ensures optimal performance, security, and maintainability of your database systems.
Scaling with PostgreSQL without boiling the ocean (shayon.dev)
“Postgres was great when we started but now that our service is being used heavily we are running into a lot of ‘weird’ issues”
Solving Postgres' Search Limitations (paradedb.com)
We recently completed one of our biggest engineering bets to date: migrating pg_search, a Postgres extension for full text search and analytics, to Postgres' block storage system. In doing so, pg_search is the first-ever extension to port an external file format to Postgres block storage.
Azure Data Studio Retirement (microsoft.com)
We’re announcing the upcoming retirement of Azure Data Studio (ADS) on February 6, 2025, as we focus on delivering a modern, streamlined SQL development experience.
Show HN: SQLite disk page explorer (github.com/QuadrupleA)
A small GUI application built in redbean that lets you explore your SQLite databases "page by page" the way SQLite sees them.
DuckDB 1.2.0 (duckdb.org)
The DuckDB team is happy to announce that today we're releasing DuckDB version 1.2.0, codenamed “Histrionicus”.
The Slotted Counter Pattern (2020) (planetscale.com)
It is a common database pattern to increment an INT column when an event happens, such as a download or page view.
Build your own SQLite, Part 4: reading tables metadata (sylver.dev)
As we saw in the opening post, SQLite stores metadata about tables in a special "schema table" starting on page 1. We've been reading records from this table to list the tables in the current database, but before we can start evaluating SQL queries against user-defined tables, we need to extract more information from the schema table.
Search logs faster than Sonic – Log search engine internals (vegasecurity.com)
Have you ever wondered how Elasticsearch works? How is it so fast? What makes it different from other databases like PostgreSQL? What cool data structures are at play?
Show HN: Gave Claude LSD SQL (github.com/lsd-so)
GarminDB (github.com/tcgoetz)
Python scripts for parsing health data into a SQLite database and manipulating it there. SQLite is a lightweight database that doesn't require a server.
Earthstar – A database for private, distributed, offline-first applications (earthstar-project.org)
Earthstar is a specification and JavaScript library for building connected applications owned and run by their users.
Hierarchical CSV (github.com/Ericson2314)
Suppose I have a simple database of IDs and names. I can do this in JSON5 with:
MillenniumDB: Property graph and RDF engine, still in development (github.com/MillenniumDB)
MillenniumDB is a graph-oriented database management system developed by the Millennium Institute for Foundational Research on Data (IMFD).