Hacker News with Generative AI: Performance Optimization

Golang sync.Pool is not a silver bullet (wundergraph.com)
When it comes to performance optimization in Go, sync.Pool often appears as a tempting solution. It promises to reduce memory allocations and garbage collection pressure by reusing objects. But is it always the right choice? Let's dive deep into this fascinating topic.
Growing Buffers to Avoid Copying Data (johnnysswlab.com)
We at Johnny’s Software Lab LLC are experts in performance. If performance is in any way a concern in your software project, feel free to contact us.
Why Adding a Full Hard Drive Can Make a Computer More Powerful (wired.com)
“Obviously” is a dangerous word, even in scenarios that seem simple. Suppose, for instance, you need to do an important computation. You get to choose between two computers that are almost identical, except that one has an extra hard drive full of precious family photos. It’s natural to assume that the two options are equally good—that an extra drive with no space remaining won’t aid your computation.
Ubuntu Provides More Insight into Their Decision Not to "-O3" All Packages (phoronix.com)
Since last year, Canonical has been investigating the use of -O3 compiler optimizations for its Ubuntu package builds in the name of delivering better performance for Ubuntu Linux.
Disk I/O bottlenecks in GitHub Actions (depot.dev)
When your CI pipelines are slow, you can only optimize so much. Bottlenecks in CPU, Network, Memory, and Disk I/O can all contribute to slow CI pipelines. Let's take a look at how disk I/O can be a bottleneck in GitHub Actions.
Faster interpreters in Go: Catching up with C++ (planetscale.com)
The SQL evaluation engine that ships with Vitess, the open-source database that powers PlanetScale, was originally implemented as an AST evaluator that used to operate directly on the SQL AST generated by our parser. Over this past year, we've gradually replaced it with a Virtual Machine which, despite being written natively in Go, performs similarly to the original C++ evaluation code in MySQL.
The Curious Case of BEAM CPU Usage (2019) (stressgrid.com)
While benchmarking Go vs Elixir vs Node, we discovered that Elixir (running on the BEAM virtual machine) had much higher CPU usage than Go, and yet its responsiveness remained excellent. Some of our readers suggested that busy waiting may be responsible for this behavior.
Prospero challenge, now with more garbage collection (bernsteinbear.com)
Matt Keeter put up The Prospero Challenge, which is like catnip for me. It’s a well-scoped project: we have a slow program. Make it faster within these constraints. In this post, I will describe two very small changes that can speed up his sample program with minimal effort.
Btrfs Adding Fast/Realtime ZSTD Compression and Other Performance Optimizations (phoronix.com)
David Sterba of SUSE sent in all of the Btrfs file-system updates today for the now-open Linux 6.15 kernel merge window.
Fast columnar JSON decoding with arrow-rs (arroyo.dev)
JSON is the most common serialization format used in streaming pipelines, so it pays to be able to deserialize it fast. This post covers in detail how the arrow-json library works to perform very efficient columnar JSON decoding, and the additions we've made for streaming use cases.
Optimizing by 1700x by not being silly (ayende.com)
I care about the performance of RavenDB. Enough that I would go to epic lengths to fix performance issues. Here I use “epic” both in terms of the Agile meaning of multi-month journeys and the actual amount of work required. See my recent posts about RavenDB 7.1 I/O work.
High-Performance PNG Decoding (blend2d.com)
It's been some time since I wrote about a High-Performance QOI Codec, which joined the other codecs offered by the Blend2D library in 2024. Development of image codecs has continued, and now I would like to announce a new high-performance PNG codec, which is much faster than other available codecs written in C, C++, and other programming languages.
Show HN: I've created the fastest open-source DNS bruteforcer using AF_XDP (github.com/c3l3si4n)
An experimental high-performance DNS query bruteforce tool built with AF_XDP for extremely fast and accurate bulk DNS lookups.
DAMON Self-Tuned Memory Tiering Shows Nice Improvement for Linux Servers (phoronix.com)
Linux developer SeongJae Park has posted a set of patches for the Linux kernel's wonderful DAMON code to provide for self-tuned memory tiering that "just works" and is racking up some nice performance wins.
One Billion Row Challenge in Racket (defn.io)
I decided to have some fun tonight and work on a Racket solution to the One Billion Row Challenge.
Make Ubuntu packages 90% faster by rebuilding them (github.com)
You can take the same source code package that Ubuntu uses to build jq, compile it again, and realize 90% better performance.
Zlib-rs is faster than C (trifectatech.org)
We've released version 0.4.2 of zlib-rs, featuring a number of substantial performance improvements. We are now (to our knowledge) the fastest api-compatible zlib implementation for decompression, and beat the competition in the most important compression cases too.
Matching Regexps 200 Times Faster (eregon.me)
You might have seen @byroot’s excellent blog post series on optimizing the json gem. From the first blog post it’s clear most of the time for generating JSON is spent in generate_json_string() and specifically in convert_UTF8_to_JSON(), i.e., in converting Ruby Strings to JSON Strings.
Going down the rabbit hole of Git's new bundle-URI (gitbutler.com)
Git's new bundle-uri could help significantly speed up clones, but what bugs lurk within?
Java Is Fast, If You Don't Create Many Objects (2022) (vanillajava.blog)
This article looks at a benchmark passing events over TCP/IP at 4 billion events per minute using the net.openhft.chronicle.wire.channel package in Chronicle Wire, and at why we still avoid object allocations.
Caching Strategies for Ultra-High Performance in Ruby on Rails, Part 1 (scoutapm.com)
When it comes to optimizing web applications, a proper caching strategy is critical because it can significantly reduce load times and improve the overall user experience.
Show HN: Krep, a High-Performance String Search Utility Written in C (davidesantangelo.github.io)
A blazingly fast string search utility for performance-critical applications
Representing Type Lattices Compactly (bernsteinbear.com)
The Cinder JIT compiler does some cool stuff with how they represent types so I’m going to share it with you here. The core of it is thinking about types as sets (lattices, even), and picking a compact representation. Compilers will create and manipulate types with abandon, so all operations have to be fast.
Goodbye Dockerfile, Hello Bazel: Doubling Our CI Speed (plaid.com)
In the first half of 2024, Plaid’s Developer Efficiency team set out to speed up our largest CI pipeline without disrupting developer workflows—and ended up cutting CI times by 50%, shrinking container images by 90%, and making local iteration up to 5x faster.
Xcode now supports Processor Trace profiling on M4/A18 (apple.com)
An Attempt to Catch Up with JITs: The False Lead of Optimizing Inline Caches (programming-journal.org)
Is it possible to improve the performance of AoT compilers by adding Dynamic Binary Modification (DBM) to the executions?
Arranging invisible icons in quadratic time (2021) (wordpress.com)
Near the end of January I was pointed to a twitter thread where a Windows user with a powerful machine was hitting random hangs in explorer. Lots of unscientific theories were being proposed. I don’t generally do random analysis of strangers’ performance problems but the case sounded interesting so I thought I’d take a look.
Compilation of JavaScript to WASM, Part 3: Partial Evaluation (cfallin.org)
This is the final post of a three-part series covering my work on "fast JS on Wasm"; the first post covered PBL, a portable interpreter that supports inline caches, the second post covered ahead-of-time compilation in general terms, and this post discusses how we actually build the ahead-of-time compiler backends. Please read the first two posts for useful context!
Show HN: ProgrammerHumor.io – WordPress to Phoenix LiveView ~ 7x faster (programmerhumor.io)
Python and Linux: 1991's best inventions.
WordPress Web Performance Optimization (josh.blog)
Last year, I set out to see how much I could improve my site’s performance without relying on commercial performance plugins.