Hacker News with Generative AI: Performance Optimization

Show HN: AutoThink – Boosts local LLM performance with adaptive reasoning (ycombinator.com)
I built AutoThink, a technique that makes local LLMs reason more efficiently by adaptively allocating computational resources based on query complexity.

AI, Machine Learning, Performance Optimization, Software

397 points by codelion 44 days ago | 68 comments

Look Ma, No Bubbles: Designing a Low-Latency Megakernel for Llama-1B (hazyresearch.stanford.edu)
There are some applications that benefit from running LLMs really, really fast. This low-latency regime encompasses applications like chatbots and human-in-the-loop workflows, where users care a lot about seeing responses come back immediately.

Performance Optimization, Computer Science, Artificial Intelligence

236 points by ljosifov 44 days ago | 31 comments

Python Pandas Ditches NumPy for Speedier PyArrow (thenewstack.io)

Python, Pandas, PyArrow, Performance Optimization, Data Analysis

17 points by puttycat 44 days ago | 5 comments

Unlocking Ractors: class instance variables in Ruby (byroot.github.io)
In a previous post about ractors, I explained why I think it’s really unlikely you’d ever be able to run an entire application inside a ractor, but that they could still be situationally very useful to move CPU-bound work out of the main thread, and to unlock some parallel algorithms.

Ruby, Programming, Multithreading, Performance Optimization

82 points by hahahacorn 45 days ago | 0 comments

Improving performance of original dav1d video decoder (videolan.org)
I noticed a very clickbait bounty, and I quickly realized that the company's real goal was not to overtake the original implementation but to advertise that Rust is 5% slower than C. Whether it actually pays out is another matter. The main thing for Prossimo was to make a fuss about the current rav1d implementation being only 5% slower, so that the general public would think the two languages are equally fast.

Video Encoding, Performance Optimization, Rust, C, Open Source

50 points by ycomb_anon 47 days ago | 23 comments

Faster Firewalls with Bpfilter (lwn.net)
From servers in a data center to desktop computers, many devices communicating on a network will eventually have to filter network traffic, whether it's for security or performance reasons. As a result, this is a domain where a lot of work is put into improving performance: a tiny performance improvement can have considerable gains.

Network Security, Performance Optimization, Linux

23 points by signa11 49 days ago | 1 comment

Accelerating Docker Builds by Halving EC2 Boot Time (depot.dev)
We at Depot like making shit fast, whether that's Docker image builds, GitHub Actions runners, Bazel caching, Turborepo, or even our own infrastructure.

Docker, DevOps, Cloud Computing, Performance Optimization

20 points by Telstrom90 49 days ago | 9 comments

Whippet GC notes on Guile, heuristics, and heap growth (wingolog.org)
Greets all! Another brief note today. I have gotten Guile working with one of the Nofl-based collectors, specifically the one that scans all edges conservatively (heap-conservative-mmc / heap-conservative-parallel-mmc). Hurrah!

Garbage Collection, Programming Languages, Guile, Memory Management, Performance Optimization

80 points by paroneayea 49 days ago | 1 comment

Slack, Notion, and VSCode Improved Electron App Performance (palette.dev)
Leading the development of electron-react-boilerplate for over a decade has taught me a lot about bottlenecks in Electron apps and how to work around them. Properly engineered, Electron apps can closely rival the performance of native apps. This post is a complete guide on exploiting every Electron performance optimization I know so that you can get the most mileage.

Electron, Performance Optimization, Web Development, Software, Productivity Tools

5 points by amilajack 50 days ago | 1 comment

More than you ever wanted to know about font loading on the web (2021) (industrialempathy.com)
When I started thinking about writing a post about web font loading my intention was to propose relatively sophisticated ideas that I've been playing with for a while. However, as I was trying to use them in real-world websites I realized that deployment of the more advanced techniques is de-facto impossible without the creation of new web standards.

Web Development, Font Loading, Web Standards, Performance Optimization

6 points by Tomte 52 days ago | 0 comments

FUSE to Enjoy a Performance Improvement with Linux 6.16 (phoronix.com)
Queued up via the FUSE "for-next" Git branch ahead of the upcoming Linux 6.16 merge window is a change that increases the read directory buffer size, which in turn improves performance.

Linux, Operating Systems, Performance Optimization

7 points by bundie 52 days ago | 0 comments

Understanding the Go Scheduler (nghiant3223.github.io)
Understanding the Go scheduler is crucial for Go programmers to write efficient concurrent programs. It also helps us become better at troubleshooting performance issues or tuning the performance of our Go programs. In this post, we will explore how the Go scheduler evolved over time, and how the Go code we write runs under the hood.

Go, Programming Languages, Concurrency, Performance Optimization

180 points by gnabgib 53 days ago | 26 comments

SQL OFFSET is worse than keyset pagination (use-the-index-luke.com)
After implementing a pipelined top-N query to retrieve the first page efficiently, you will often also need another query to fetch the next pages. The resulting challenge is that it has to skip the rows from the previous pages.

Databases, Performance Optimization, SQL

7 points by fanf2 54 days ago | 0 comments

Precomputing Transparency Order in 3D (jacobdoescode.com)
Transparency — or more precisely, translucency — remains a problem when rendering in 3D. When you have translucent shapes, the order in which they get rendered is very important. Consider what happens if this is done incorrectly.

3D Rendering, Computer Graphics, Performance Optimization

14 points by jacobp100 54 days ago | 3 comments

Jetrelay: A high-performance ATproto relay in 500 LOC (asayers.com)
This post explains the design of jetrelay, a pub/sub server compatible with Bluesky’s “jetstream” data feed. Using a few pertinent Linux kernel features, it avoids doing almost any work itself. As a result, it’s highly efficient: it can saturate a 10 Gbps network connection with just 8 CPU cores.

Networking, Open Source, Performance Optimization, Software

16 points by todsacerdoti 56 days ago | 5 comments

USE Method: Linux Performance Checklist (brendangregg.com)
The USE Method provides a strategy for performing a complete check of system health, identifying common bottlenecks and errors. For each system resource, metrics for utilization, saturation and errors are identified and checked. Any issues discovered are then investigated using further strategies.

Linux, Performance Optimization, System Administration, Debugging, Troubleshooting

7 points by Tomte 58 days ago | 0 comments

Binary Formats Are Better Than JSON in Browsers (adamfaulkner.github.io)
JSON used to be faster than alternatives in browsers, but that's not the case anymore. For performance sensitive web apps, it is worth considering Avro, Protobuf, or Bebop.

Web Development, Performance Optimization, Data Formats

75 points by adamkf 58 days ago | 16 comments

Show HN: LoopMix128 – Fast C PRNG (.46ns), 2^128 Period, BigCrush/PractRand Pass (github.com/danielcota)
This repository contains LoopMix128, an extremely fast pseudo-random number generator (PRNG) with a guaranteed period of 2^128, proven injectivity, and clean passes in both BigCrush and PractRand (32TB). It is designed for non-cryptographic applications where speed and statistical quality are important.

C Programming, Performance Optimization, Software, GitHub

76 points by the_othernet 61 days ago | 34 comments

21 GB/s CSV Parsing Using SIMD on AMD 9950X (nietras.com)
Sep 0.10.0 was released April 22nd, 2025 with optimizations for AVX-512 capable CPUs like the AMD 9950X (Zen 5) and updated benchmarks including the 9950X. Sep now achieves a staggering 21 GB/s on the 9950X for the low-level CSV parsing. 🚀 Before 0.10.0, Sep achieved ~18 GB/s on 9950X.

Performance Optimization, CSV Parsing, AMD CPUs, Benchmarking, Software

322 points by zigzag312 62 days ago | 169 comments

Implementing a Struct of Arrays (brevzin.github.io)
Recently, I watched Andrew Kelley’s talk on Practical Data Oriented Design. It goes into some of the architectural changes he’s been making to the Zig compiler, with pretty significant performance benefit. Would definitely recommend checking out the talk, even if you’re like me and have never written any Zig.

Programming, Software Architecture, Performance Optimization

126 points by mpweiher 63 days ago | 50 comments

V8 JavaScript engine gets eager compilation hints (devclass.com)
The V8 JavaScript engine, used by the Chrome web browser, Node.js and elsewhere, has a new feature which lets developers mark a file for early compilation, with strong benefits for load time provided the option is used sparingly.

JavaScript, Performance Optimization, Web Development, Chrome, Node.js

3 points by gmac 63 days ago | 0 comments

QUIC restarts, slow problems: udpgrm to the rescue (cloudflare.com)
At Cloudflare, we do everything we can to avoid interruption to our services. We frequently deploy new versions of the code that delivers the services, so we need to be able to restart the server processes to upgrade them without missing a beat. In particular, performing graceful restarts (also known as "zero downtime") for UDP servers has proven to be surprisingly difficult.

Networking, Server Management, Performance Optimization, UDP

6 points by emot 63 days ago | 0 comments

Inheritance was invented as a performance hack (2021) (catern.com)
Inheritance was invented by the Simula language as a way to support intrusive lists, save memory, and simplify the garbage collector.

Programming, History, Performance Optimization, Garbage Collection

194 points by aquastorm 66 days ago | 244 comments

Critical CSS (kigo.studio)

Web Development, CSS, Performance Optimization

234 points by stevenpotts 66 days ago | 116 comments

Another look into PostgreSQL CTE materialization and non-idempotent subqueries (shayon.dev)
A few days ago, I wrote about a surprising planner behavior with CTEs, DELETE, and LIMIT in PostgreSQL, a piece I hastily put together on a bus ride.

PostgreSQL, SQL, Database Management, Performance Optimization

5 points by craigkerstiens 67 days ago | 0 comments

Distributed Continuous GPU Profiling (zymtrace.com)
Identify performance bottlenecks in CUDA kernels, optimize inference batch size, and eliminate idle GPU cycles, all with zero friction.

GPU Profiling, CUDA, Performance Optimization, Machine Learning, Distributed Computing

12 points by tdullien 69 days ago | 2 comments

Making PyPI's test suite 81% faster (trailofbits.com)
Trail of Bits has collaborated with PyPI for several years to add features and improve security defaults across the Python packaging ecosystem.

Python, Software Development, Security, Performance Optimization

8 points by zdw 70 days ago | 0 comments

Making PyPI's test suite 81% faster (trailofbits.com)
Trail of Bits has collaborated with PyPI for several years to add features and improve security defaults across the Python packaging ecosystem.

Python, Software Development, Security, Performance Optimization

11 points by woodruffw 70 days ago | 2 comments

Optimizing eBPF I/O latency accounting when running 37M IOPS, on 384 CPUs (tanelpoder.com)
In this post I will introduce a much more efficient method for accounting block I/O latencies with eBPF on Linux.

Performance Optimization, Linux, eBPF, I/O

22 points by tanelpoder 72 days ago | 0 comments

Dataframely: A polars-native data frame validation library (quantco.com)
At QuantCo, we are constantly trying to improve the quality of our code bases to ensure that they remain easily maintainable. More recently, this often involved migrating data pipelines from pandas to polars in order to achieve significant performance gains.

Data Science, Libraries, Performance Optimization

39 points by sito42 72 days ago | 8 comments