Hacker News with Generative AI: Optimization

Having your compile-time cake and eating it too (0x44.xyz)
As programmers, we like it when our programs run well.

Programming, Software Development, Optimization

32 points by signa11 61 days ago | 6 comments

A SomewhatMaxSAT Solver (jak-linux.org)
As you may recall from previous posts and elsewhere I have been busy writing a new solver for APT. Today I want to share some of the latest changes in how to approach solving.

Software, Programming, Logic, Optimization

9 points by JNRowe 62 days ago | 0 comments

Kangaroo: A flash cache optimized for tiny objects (2021) (engineering.fb.com)
Kangaroo is a new flash cache that enables more efficient caching of tiny objects (objects that are ~100 bytes or less) and overcomes the challenges presented by existing flash cache designs.

Software, Caching, Facebook, Optimization, Data Structures

22 points by PaulHoule 64 days ago | 1 comment

Loading Pydantic models from JSON without running out of memory (pythonspeed.com)
You have a large JSON file, and you want to load the data into Pydantic. Unfortunately, this uses a lot of memory, to the point where large JSON files are very difficult to read. What to do?

Python, Data Processing, Memory Management, Optimization, Pydantic

134 points by itamarst 64 days ago | 45 comments

Fast Allocations in Ruby 3.5 (railsatscale.com)
Many Ruby applications allocate objects. What if we could make allocating objects six times faster? We can! Read on to learn more!

Ruby, Performance, Programming Languages, Optimization

267 points by tekknolagi 64 days ago | 67 comments

Improving performance of rav1d video decoder (ohadravid.github.io)
Making the rav1d Video Decoder 1% Faster

Video Encoding, Optimization, Performance, Software, Open Source

305 points by todsacerdoti 64 days ago | 115 comments

Fast Allocations in Ruby 3.5 (railsatscale.com)
Many Ruby applications allocate objects. What if we could make allocating objects six times faster? We can! Read on to learn more!

Ruby, Performance, Optimization

12 points by Ocha 65 days ago | 0 comments

Too Much Go Misdirection (tedunangst.com)
Poking through layers of indirection in Go trying to recover some efficiency.

Programming, Go, Optimization

186 points by todsacerdoti 67 days ago | 97 comments

Layers All the Way Down: The Untold Story of Shader Compilation (moonside.games)
As a game developer who works primarily in frameworks instead of engines, one of the biggest pain points is the need to render on multiple platforms efficiently.

Game Development, Graphics, Optimization, Shaders

91 points by birdculture 68 days ago | 48 comments

Backtrace is finally cheap by abusing x86/Linux's shadow stack (intmainreturn0.com)
A backtrace is a very helpful debugging tool in native programming: it gives the source location at each call level. Unfortunately, getting a backtrace is expensive.

Debugging, Programming, Linux, Optimization, x86

16 points by htfy96 68 days ago | 0 comments

Show HN: KVSplit – Run 2-3x longer contexts on Apple Silicon (github.com/dipampaul17)
Run larger context windows and heavier LLMs on your Mac by applying different quantization precision to keys vs values in the attention mechanism's KV cache. KVSplit enables you to:

Apple Silicon, Optimization, Machine Learning

272 points by dipampaul17 70 days ago | 40 comments

New Life Hack: Using LLMs and Constraint Solvers for Personal Logistics Tasks (emschwartz.me)
I enjoy doing escape rooms and was planning to do a couple of them with a group of friends this weekend. The very minor and not-very-important challenge, however, was that I couldn't figure out how to assign friends to rooms. I want to do at least one room with each person, different people are arriving and leaving at different times, and there are only so many time slots.

Generative AI, Life Hacks, Time Management, Optimization, Personal Logistics

12 points by emschwartz 70 days ago | 0 comments

X X^t can be faster (arxiv.org)
We present a new algorithm, RXTX, that computes the product of a matrix by its transpose, $XX^{t}$. RXTX uses $5\%$ fewer multiplications and additions than the state of the art and achieves accelerations even for small matrices $X$. The algorithm was discovered by combining machine-learning-based search methods with combinatorial optimization.

Machine Learning, Optimization, Algorithms, Matrix Multiplication

201 points by robinhouston 70 days ago | 60 comments

Solving the local optima problem – NQueens (github.com/Dpbm)

Optimization, Algorithms, N-Queens, Programming, AI

11 points by ColinWright 70 days ago | 3 comments

A leap year check in three instructions (hueffner.de)
With the following code, we can check whether a year 0 ≤ y ≤ 102499 is a leap year with only about 3 CPU instructions:

Programming, Optimization, Algorithms

434 points by gnabgib 71 days ago | 164 comments

Determinate Nix 3.5: introducing lazy trees (determinate.systems)
Lazy trees have been one of the most hotly requested Nix features for quite some time. They make Nix much more efficient in larger repositories, particularly in massive monorepos. And so we’re excited to announce that lazy trees have landed in Determinate Nix version 3.5.2, based on version 2.28.3 of upstream Nix.

Software, New Releases, Nix, Programming, Optimization

18 points by biggestlou 71 days ago | 1 comment

The smallest possible Docker image (github.com/MarkMcCulloh)
This is (hopefully) the smallest possible docker image that can be successfully executed.

Docker, Containerization, Optimization

4 points by chriscbr 72 days ago | 0 comments

Backslash: Rate Constrained Optimized Training of Large Language Models (arxiv.org)
The rapid advancement of large-language models (LLMs) has driven extensive research into parameter compression after training has been completed, yet compression during the training phase remains largely unexplored.

Machine Learning, Optimization, Training

3 points by PaulHoule 74 days ago | 0 comments

JEP 515: Ahead-of-Time Method Profiling (openjdk.org)
Improve warmup time by making method-execution profiles from a previous run of an application instantly available, when the HotSpot Java Virtual Machine starts. This will enable the JIT compiler to generate native code immediately upon application startup, rather than having to wait for profiles to be collected.

Java, Optimization, Performance, Virtual Machines, Programming Languages

101 points by cempaka 75 days ago | 10 comments

15 Years of Shader Minification (ctrl-alt-test.fr)
How do demosceners create complex computer animations in just a few kilobytes? One of our secret weapons is Shader Minifier, a tool that minifies GLSL code. Over the years, it has evolved to pack more data into tiny executables, pushing the boundaries of what’s possible. In this blog post, we’ll go through its evolution.

Computer Graphics, Programming, Optimization, Demoparty, History

147 points by laurentlb 77 days ago | 25 comments

A whippet waypoint / Nofl: A Precise Immix (wingolog.org)
Hey peoples! Tonight, some meta-words. As you know I am fascinated by compilers and language implementations, and I just want to know all the things and implement all the fun stuff: intermediate representations, flow-sensitive source-to-source optimization passes, register allocation, instruction selection, garbage collection, all of that.

Compilers, Programming Languages, Optimization, Intermediate Representations

13 points by matt_d 77 days ago | 0 comments

Engineering Design Optimization Textbook (mdobook.github.io)
A graduate-level textbook covering a range of fundamental to advanced optimization theory and algorithms with practical tips, numerous illustrations, and engineering examples.

Engineering, Optimization, Education, Algorithms, Textbooks

5 points by TheHideout 77 days ago | 0 comments

Optimizing an HTML5 game engine using composition over inheritance (radicalfishgames.com)
We started with HTML5 game development around the end of 2011. We bought an impact.js license and started working on CrossCode. And since CrossCode demanded 3D collision, we modified the engine – and continued doing so until almost every nook and cranny was changed in one way or the other. So it’s safe to say that we did not only develop a game but a whole game engine with it.

Game Development, HTML5, Software Engineering, Optimization, Inheritance

7 points by JSLegendDev 77 days ago | 0 comments

Blazeio.SharpEvent: A Python Async Primitive That Scales to 1M Waiters with O(1) (ycombinator.com)
I’ve been working on a Python async library ([Blazeio](https://github.com/anonyxbiz/Blazeio)) and stumbled into a shockingly simple optimization that makes `asyncio.Event` look like a relic.

Python, Asynchronous Programming, Optimization, Performance, Libraries

6 points by anonyxbiz 78 days ago | 0 comments

Linear Programming for Fun and Profit: Finding Arbitrages in the GPU Market (modal.com)
If you haven’t noticed, the GPU market is highly volatile. NVIDIA repeatedly spews out new chip architectures, doubling FLOPS every few years. Everyone shifts towards the newest cards, causing temporary supply crunches and high prices. But Modal’s customers don’t want to think about these price fluctuations. They want GPUs of all kinds at predictable and good prices, and the ability to demand thousands of GPUs on a moment’s notice, without having to worry about pricing, capacity planning, or supply.

Linear Programming, GPU Market, Artificial Intelligence, Optimization, Supply Chain

11 points by cweld510 79 days ago | 0 comments

Optimizing Common Lisp (fosskers.ca)
I recently released a Parser Combinator library for Common Lisp, but was unhappy with its performance. This article is a description of how I used sb-sprof, built in to SBCL, to identify both CPU and memory allocation hotspots, improving the runtime speed of the parcom/json module by 3x and decreasing memory allocation by 25x.

Programming, Optimization, Common Lisp, Performance

52 points by todsacerdoti 79 days ago | 6 comments

Faster sorting with SIMD CUDA intrinsics (2024) (winwang.blog)
Recently, I finished a batch at the Recurse Center… is what I would have said if this post were written when I intended to write it (i.e. 3 months ago). My project there focused on a questionable application of CUDA (mostly irrelevant to this post), but it got me thinking more about other GPU-friendly algorithms.

CUDA, GPU Programming, Sorting Algorithms, Optimization

92 points by winwang 81 days ago | 11 comments

Load-Store Conflicts (zeux.io)
meshoptimizer implements several geometry compression algorithms that are designed to take advantage of redundancies common in mesh data and decompress quickly - targeting many gigabytes per second in decoding throughput.

Computer Graphics, Optimization, Game Development, Data Compression

117 points by ashvardanian 82 days ago | 5 comments

Minecraft runs on 8MB of VRAM using a 20-year-old GPU (tomshardware.com)

Gaming, Hardware, Optimization, Minecraft

7 points by 01-_- 82 days ago | 4 comments

Fast(er) regular expression engines in Ruby (serpapi.com)
Performance-oriented comparison of alternative regexp engines that may (or may not) speed up your Ruby code.

Ruby, Regular Expressions, Performance, Optimization

60 points by davidsojevic 85 days ago | 6 comments