Hacker News with Generative AI: Data Structures

Implementing complex numbers and FFT with just datatypes (2023) (github.com)
In this article, I'll explain why implementing numbers with just algebraic datatypes is desirable.
Dividing an array into fair sized chunks (lemire.me)
Suppose that you have an array of N elements and you want to divide it into M chunks. It is a common task when trying to spread N units of work over M threads, for example.
A Run of CRDT Posts (jhellerstein.github.io)
Over the next few days, I'm going to post a number of observations about CRDTs: Convergent Conflict-free Replicated Data Types. These are data structures that aspire to help us with coordination-free distributed programming, a topic that interests me a lot. How can developers (or languages/compilers) deliver distributed programs that are safe or correct in important ways, without employing expensive mechanisms for coordination that make the global cloud run as slowly as a sequential computer?
Kangaroo: A flash cache optimized for tiny objects (2021) (engineering.fb.com)
Kangaroo is a new flash cache that enables more efficient caching of tiny objects (objects that are ~100 bytes or less) and overcomes the challenges presented by existing flash cache designs.
CRDTs: Pros and Cons (Lattices and Lettuces?) (jhellerstein.github.io)
Over the next few days, I'm going to post a number of observations about CRDTs: Convergent Replicated Data Types. These are data structures that aspire to help us with coordination-free distributed programming, a topic that interests me a lot. How can developers (or languages/compilers) deliver distributed programs that are safe or correct in important ways, without employing expensive mechanisms for coordination that make the global cloud run as slowly as a sequential computer?
Optimal bounds for open addressed hash tables without reordering (arxiv.org)
In this paper, we revisit one of the simplest problems in data structures: the task of inserting elements into an open-addressed hash table so that elements can later be retrieved with as few probes as possible.
Wavelet Trees: An Introduction (2011) (alexbowe.com)
Today I will talk about an elegant way of answering rank queries on sequences over larger alphabets – a structure called the Wavelet Tree.
Show HN: Slice-tree – A piece table data structure implemented using RB tree (github.com/eu-ge-ne)
In computing, a piece table is a data structure typically used to represent a text document while it is edited in a text editor.
Adaptive Hashing (quotenil.com)
At the 2024 ELS, I gave a talk on adaptive hashing, which focusses on making general purpose hash tables faster and more robust at the same time.
Reservoir Sampling (samwho.dev)
Reservoir sampling is a technique for selecting a fair random sample when you don't know the size of the set you're sampling from.
The Design of Compact Elastic Binary Trees (Cebtree) (blogspot.com)
Those who often hear me discuss about my week-end projects have been accustomed to hearing about deuterium fusion (that's for another post), laser engraving, and the compact version of the ebtrees, aka compact elastic binary trees, without knowing all the details. That's what we'll be discussing here.
Adaptive Hashing (quotenil.com)
At the 2024 ELS, I gave a talk on adaptive hashing, which focusses on making general purpose hash tables faster and more robust at the same time.
Bloom Filters (thegreenplace.net)
The original motivation for the creation of Bloom filters is efficient set membership, using a probabilistic approach to significantly reduce the time and space required to reject items that are not members in a certain set.
Swarm Testing Data Structures (tigerbeetle.com)
We discovered a cute little pattern the other day when refactoring TigerBeetle’s intrusive queue — using Zig’s comptime reflection for exhaustively testing data structure’s public API. Isn’t it cool when your property test fails when you add a new API, because “public API is tested” is one of the properties you test?!
Gems in Geospatial Indexing (chaos.social)
Programming languages should have a tree traversal primitive (tylerglaiel.com)
There should be a control flow construct in programming languages that can handle tree-like traversal in a nice way, similar to how for/foreach loops can handle linear traversal. It's a bit of a missing gap in the current set of control flow constructs most languages these days have settled on. Its a thing I end up having to do *all the time* and it seems like there should be some shortcuts for it.
Programming languages should have a tree traversal primitive (tylerglaiel.com)
There should be a control flow construct in programming languages that can handle tree-like traversal in a nice way, similar to how for/foreach loops can handle linear traversal.
Packed Data Support in Haskell (arthi-chaud.github.io)
This blog post aims to be a short and accessible summary of a paper that will be published at ECOOP 2025, titled Type-safe and portable support for packed data.
Pahole: Analysing Memory Layout of Complex Data Structures with Ease (pramodkumbhar.com)
Sunday, 5th November 2023: Putting together this blog post feels like a positive stride! As I mentioned in the previous post, (core-to-core latency tool), I'm aiming to integrate more consistent writing into my routine. While it took a month to pen this down, it's progress from the previous year 😇. Hoping that the upward trend will continue...!
Stop Writing `__init__` Methods (glyph.im)
YEARS OF DATACLASSES yet NO REAL-WORLD USE FOUND for overriding special methods just so you can have some attributes.
Show HN: Bptree – A B+ tree implementation in C (github.com/habedi)
Bptree is a lightweight single-header B+ tree implementation written in C.
Zig's new LinkedList API (it's time to learn fieldParentPtr) (openmymind.net)
In a recent, post-Zig 0.14 commit, Zig's SinglyLinkedList and DoublyLinkedList saw significant changes.
Fibonacci Hashing: The Optimization That the World Forgot (probablydance.com)
I recently posted a blog post about a new hash table, and whenever I do something like that, I learn at least one new thing from my comments. In my last comment section Rich Geldreich talks about his hash table which uses “Fibonacci Hashing”, which I hadn’t heard of before.
Lisp Programs Don't Have Parentheses (blogspot.com)
Lisp programs don't have parentheses — they are made of nested linked lists. The parentheses only exist in the printed representation — the ASCII serialization — of a Lisp program. They tell the Lisp reader where the nested lists begin and end. Parenthesis are the contour lines in the topographic map of your Lisp program.
Stop Treating YAML Like a String (theyamlengineer.com)
Koreo is a data structure orchestration engine. Although it's primarily designed for Kubernetes resource orchestration, Koreo's core functionality can orchestrate and manage virtually any structured data. What Koreo provides today, however, is a new approach to Kubernetes configuration management empowering developers and platform teams through programmable workflows. This approach draws upon the strengths of existing tools like Helm, Kustomize, and Crossplane while addressing some of their limitations.
Vector Sets are part of Redis (antirez.com)
Yesterday we finally merged vector sets into Redis, here you can find the README that explains in detail what you get:
Two Attacks on Naive Tree Hashes (jacko.io)
A diagram of C23 basic types (wordpress.com)
This week on the C committee mailing list we had a discussion about how C’s types are organized into different categories. At the end I came up with a diagram with that organization. It basically translates the section “6.2.5 Types” of the C23 standard into a graph of inclusions.
Writing a tiny undo/redo stack in JavaScript (julik.nl)
I’ve needed this before - a couple of times. Third time I figured I needed something small, nimble - yet complete. And - at the same time - wondering about how to do it in a very simple manner. I think it worked out great, so let’s dig in.
Shift-to-Middle Array: A Faster Alternative to Std:Deque? (github.com/attilatorda)
The Shift-To-Middle Array is a dynamic array designed to optimize insertions and deletions at both ends, offering a high-performance alternative to std::deque, std::vector, and linked lists.