Hacker News with Generative AI: Data Structures

Relaxed Radix Balanced Trees (2024) (horne-khan.com)
I’m adding immutable vectors to my language, Ivan, and needed to pick a suitable data structure to implement them.
TOML [Tom's Obvious Minimal Language] (toml.io)
TOML aims to be a minimal configuration file format that's easy to read due to obvious semantics. TOML is designed to map unambiguously to a hash table. TOML should be easy to parse into data structures in a wide variety of languages.
This New Algorithm for Sorting Books or Files Is Close to Perfection (wired.com)
Computer scientists often deal with abstract problems that are hard to comprehend, but an exciting new algorithm matters to anyone who owns books and at least one shelf.
What's in a ring buffer? And using them in Rust (ntietz.com)
Working on my cursed MIDI project, I needed a way to store the most recent messages without risking an unbounded amount of memory usage.
JesseSort: A novel sorting algorithm that is faster than Python's default sort. (github.com/lewj85)
JesseSort is a novel sorting algorithm that introduces a new data structure called a Rainbow to efficiently organize and merge elements. It achieves a runtime of O(n log n).
Tiny Pointers (arxiv.org)
This paper introduces a new data-structural object that we call the tiny pointer.
Go's new map implementation in 1.24 is powered by Swiss Tables (twitter.com)
Solving the ABA Problem in Rust with Tagged Pointers (minikin.me)
Concurrent programming is challenging, especially when designing lock-free data structures. One subtle issue that can arise is the ABA problem, leading to unexpected behavior in compare-and-swap (CAS) operations. In this post, we’ll explore the ABA problem and how to solve it in Rust using idiomatic and efficient techniques.
Generating Voronoi diagrams using Fortune's algorithm (redpenguin101.github.io)
Generating Voronoi Diagrams using Fortune’s Algorithm
Implementation of a RingBuffer in Java with optional FIFO like semantics (github.com/evolvedbinary)
Some extra Collection classes targetting Java 8.
Elastic Binary Trees (2011) (blogspot.com)
In computer science, an elastic binary tree (or EB tree or EBtree) is a binary search tree specially optimized to very frequently store, retrieve and delete discrete integer or binary data without having to deal with memory allocation.
Go Data Structures (2009) (swtch.com)
When explaining Go to new programmers, I've found that it often helps to explain what Go values look like in memory, to build the right intuition about which operations are expensive and which are not. This post is about basic types, structs, arrays, and slices.
Search logs faster than Sonic – Log search engine internals (vegasecurity.com)
Have you ever wondered how Elasticsearch works? How is it so fast? What makes it different from other databases like PostgreSQL? What cool data structures are at play?
Show HN: Multi-/BiKeyMap (Go Module) (github.com/aeimer)
A go lib which handles maps with multiple keys.
Hierarchical CSV (github.com/Ericson2314)
Suppose I have a simple database of IDs and names. I can do this in JSON5 with:
Show HN: Ldump – serialize any Lua data (github.com/girvel)
ldump is a flexible serializer, able to serialize any data, starting with circular references, tables as keys, functions with upvalues, metatables and ending with coroutines, threads and userdata (by defining how they should be serialized). It outputs valid Lua code that recreates the original object, doing the deserialization through load(data)(). It aims for functionality and flexibility instead of speed and size, allowing full serialization of complex data, such as video game saves.
[x<<1 + 2, x<<1 + 3] is a Binary Tree (ivanbelenky.com)
From the post $$x + 1 \ \text{is a list}$$ I bring to you the binary tree.
You could have invented Fenwick trees (cambridge.org)
Fenwick trees, also known as binary indexed trees are a clever solution to the problem of maintaining a sequence of values while allowing both updates and range queries in sublinear time.
Examples of quick hash tables and dynamic arrays in C (nullprogram.com)
This article durably captures my reddit comment showing techniques for std::unordered_map and std::vector equivalents in C programs. The core, important features of these data structures require only a dozen or so lines of code apiece. They compile quickly, and tend to run faster in debug builds than release builds of their C++ equivalents. What they lack in genericity they compensate in simplicity. Nothing here will be new. Everything has been covered in greater detail previously, which I will reference when appropriate.
Examples of quick hash tables and dynamic arrays in C (nullprogram.com)
This article durably captures my reddit comment showing techniques for std::unordered_map and std::vector equivalents in C programs. The core, important features of these data structures require only a dozen or so lines of code apiece. They compile quickly, and tend to run faster in debug builds than release builds of their C++ equivalents. What they lack in genericity they compensate in simplicity. Nothing here will be new. Everything has been covered in greater detail previously, which I will reference when appropriate.
Ropey – A UTF8 text rope for manipulating and editing large text (github.com/cessen)
Ropey is a utf8 text rope for Rust, designed to be the backing text-buffer for applications such as text editors. Ropey is fast, robust, and can handle huge texts and memory-incoherent edits with ease.
Data evolution with set-theoretic types (dashbit.co)
Recently I have been working on projects that integrate Elixir with native code in C and Rust. One of the Rust libraries defines the following struct (with fields removed for simplicity):
Using black magic to make a fast circular buffer (2017) (calho.st)
Yesterday, I took a glance at the Wikipedia page for the circular buffer and was intrigued by an alleged optimization technique that I was not familiar with:
Flattening ASTs and other compiler data structures (2023) (cs.cornell.edu)
Arenas, a.k.a. regions, are everywhere in modern language implementations. One form of arenas is both super simple and surprisingly effective for compilers and compiler-like things. Maybe because of its simplicity, I haven’t seen the basic technique in many compiler courses—or anywhere else in a CS curriculum for that matter. This post is an introduction to the idea and its many virtues.
Today I learned that bash has hashmaps (2024) (xeiaso.net)
Hashmaps (associative arrays) are a great way to store a bag of key-value data. At work I was writing something that needed me to spawn a bunch of GPU instances, GPU availability is spread out by region and GPU type. I wanted to store a mapping of GPU kind to region name and for some reason I thought it would be a good idea to do it in bash.
Why is hash(-1) == hash(-2) in Python? (omairmajid.com)
While browsing Reddit the other day waiting for my code to compile, I ran across this question on r/Python:
Show HN: SPath is a Rust lib for query JSONPath over any semi-structured data (github.com/cratesland)
SPath: Query expressions for semi-structured data
Parallel-hashmap: drop-in replacement for unordered_map, unordered_set (github.com/greg7mdp)
This repository aims to provide a set of excellent hash map implementations, as well as a btree alternative to std::map and std::set, with the following characteristics:
B-Trees: More Than I Thought I'd Want to Know (benjamincongdon.me)
Recently, I’ve been reading through the excellent Database Internals (Alex Petrov, 2019). The first half of the book is dedicated to the implementation of database storage engines – the subsystem(s) of a DBMS that handles long-term persistence of data. A surprising amount of this section discusses the implementation and optimization of various B-Tree data structures.
Static search trees: faster than binary search (curiouscoding.nl)
In this post, we will implement a static search tree (S+ tree) for high-throughput searching of sorted data, as introduced on Algorithmica.