Hacker News with Generative AI: Compression

Lzbench compression benchmark (morotti.github.io)
lzbench is an in-memory benchmark of open-source LZ77/LZSS/LZMA compressors.
Bzip3: A spiritual successor to BZip2 (github.com/kspalaiologos)
A better, faster and stronger spiritual successor to BZip2. Features higher compression ratios and better performance thanks to an order-0 context mixing entropy coder, a fast Burrows-Wheeler transform code making use of suffix arrays and an RLE with Lempel Ziv+Prediction pass based on LZ77-style string matching and PPM-style context modeling.
Deflate Decompression in C++23 (garymm.org)
In this post I describe some things I learned while working on Starflate, an implementation of Deflate decompression in C++23 that I wrote with my friend Oliver Lee.
Query Engines: Gatekeepers of the Parquet File Format (duckdb.org)
TL;DR: Mainstream query engines do not support reading newer Parquet encodings, forcing systems like DuckDB to default to writing older encodings, thereby sacrificing compression.
Show HN: Plik – a tiny FUSE filesystem with compression and deduplication (sytes.net)
File system in user space (FUSE) with compression and deduplication. Written in a single C file (1K LOC). Uses OpenSSL for hashing, zlib for compression and SQLite for data storage.
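The combination described here (hash for deduplication, zlib for compression, SQLite for storage) can be illustrated with a minimal sketch. This is not Plik's code: the schema, the fixed `CHUNK_SIZE`, the function names, and the use of Python's `hashlib` in place of OpenSSL are all assumptions made for illustration, and the FUSE layer is left out entirely.

```python
import hashlib
import sqlite3
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, not taken from Plik


def open_store(path=":memory:"):
    """Create the (assumed) two-table schema: unique chunks and file layouts."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunks (hash TEXT PRIMARY KEY, data BLOB)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(name TEXT, seq INTEGER, hash TEXT, PRIMARY KEY (name, seq))"
    )
    return db


def write_file(db, name, payload):
    """Split payload into chunks; store each distinct chunk once, compressed."""
    db.execute("DELETE FROM files WHERE name = ?", (name,))
    for seq, start in enumerate(range(0, len(payload), CHUNK_SIZE)):
        chunk = payload[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        # INSERT OR IGNORE is what deduplicates: a known hash is not stored again.
        db.execute(
            "INSERT OR IGNORE INTO chunks VALUES (?, ?)",
            (digest, zlib.compress(chunk)),
        )
        db.execute("INSERT INTO files VALUES (?, ?, ?)", (name, seq, digest))
    db.commit()


def read_file(db, name):
    """Reassemble a file by decompressing its chunks in order."""
    rows = db.execute(
        "SELECT c.data FROM files f JOIN chunks c ON c.hash = f.hash "
        "WHERE f.name = ? ORDER BY f.seq",
        (name,),
    )
    return b"".join(zlib.decompress(data) for (data,) in rows)


def chunk_count(db):
    """Number of distinct chunks physically stored."""
    return db.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
```

Writing the same payload under two names stores its chunks only once, since the second write finds every hash already present.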
Fabrice Bellard's Ts_SMS: Short Message Compression Using LLM (bellard.org)
ts_sms: Short Message Compression using Large Language Models
Short Message Compression Using LLMs (bellard.org)
Linux EFI Zboot Abandoning "Compression Library Museum", Focusing on Gzip, ZSTD (phoronix.com)
The Linux kernel's EFI Zboot code, which carries the kernel image in compressed form on EFI systems, is doing away with its "compression library museum" of Gzip, LZ4, LZMA, LZO, XZ, and Zstd options and will instead support only Gzip and Zstd.
BC7 optimal solid-color blocks (wordpress.com)
That’s right, it’s another texture compression blog post! I’ll keep it short. By “solid-color block”, I mean a 4×4 block of pixels that all have the same color. ASTC has a dedicated encoding for these (“void-extent blocks”), BC7 does not. Therefore we have an 8-bit RGBA input color and want to figure out how to best encode that color with the encoding options we have.
PeaZip 10.0.0 Released (peazip.github.io)
PeaZip 10.0.0 comes with a revamped GUI, providing more icon sizes, updated Themes and compression pre-sets, and better organized menus.
RRR: A Succinct Rank/Select Index for Bit Vectors (2011) (alexbowe.com)
This blog post will give an overview of a static bitsequence data structure known as RRR, which answers arbitrary length rank queries in $\mathcal{O}(1)$ time, and provides implicit compression.
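To make the idea of a rank query concrete, here is a minimal sketch of a block-based rank index. It is deliberately not RRR: it stores the bits uncompressed and only shows how precomputed per-block counts reduce `rank1(i)` to one table lookup plus one popcount. The class name `RankBitVector` and the 64-bit block size are assumptions chosen for illustration.

```python
class RankBitVector:
    """Static bit vector answering rank1(i): the number of 1 bits in [0, i)."""

    BLOCK = 64  # hypothetical block size in bits

    def __init__(self, bits):
        self.n = len(bits)
        self.words = []    # one integer per block, bit k of the word = bit k of the block
        self.prefix = [0]  # prefix[b] = number of 1 bits before block b
        for start in range(0, self.n, self.BLOCK):
            word = 0
            for offset, bit in enumerate(bits[start:start + self.BLOCK]):
                if bit:
                    word |= 1 << offset
            self.words.append(word)
            self.prefix.append(self.prefix[-1] + bin(word).count("1"))

    def rank1(self, i):
        if not 0 <= i <= self.n:
            raise IndexError("rank position out of range")
        block, offset = divmod(i, self.BLOCK)
        if offset == 0:
            return self.prefix[block]
        # Count only the low `offset` bits of the block containing position i.
        mask = (1 << offset) - 1
        return self.prefix[block] + bin(self.words[block] & mask).count("1")
```

RRR, as the post describes it, additionally compresses the blocks themselves, which this sketch does not attempt.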
Maximum best compression for binary data: xz (LZMA) vs. ZSTD vs. 7z vs. bzip2 (dwaves.de)
Once upon a time, compressing massive amounts of binary data was required.
RFC9659: Window Sizing for Zstandard Content Encoding on the Web (rfc-editor.org)
SQLite Transparent Compression (github.com/phiresky)
An SQLite extension that provides transparent, dictionary-based, row-level compression. This lets you compress entries in an SQLite database almost as well as if you were compressing the whole DB file, while retaining random access.
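The reason a shared dictionary helps with small rows can be shown with a short sketch. It uses zlib's preset-dictionary support from the Python standard library as a stand-in; it is not the extension's API, and the helper names `compress_row` and `decompress_row` and the sample rows are assumptions for illustration. Each row is compressed independently, so any single row can still be decompressed on its own.

```python
import zlib


def compress_row(row: bytes, dictionary: bytes) -> bytes:
    """Compress one row on its own, primed with a shared dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(row) + compressor.flush()


def decompress_row(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress one row; only the blob and the shared dictionary are needed."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical dictionary: a sample of content that typical rows have in common.
DICTIONARY = b'{"user_id": 0, "status": "active", "country": "US", "plan": "free"}'

row = b'{"user_id": 1234, "status": "active", "country": "DE", "plan": "free"}'
with_dict = compress_row(row, DICTIONARY)
without_dict = zlib.compress(row, 9)
```

A short row has little internal redundancy, so compressing it alone gains almost nothing; the dictionary supplies the repeated structure that whole-file compression would otherwise find across rows.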
An SVE backend for astcenc (Adaptive Scalable Texture Compression Encoder) (solidpixel.github.io)
The Evolution of Extreme LLM Compression: From Quip to AQLM with PV-Tuning (medium.com)
AQLM and PV-Tuning: methods that compress LLMs by 8 times, retain 95% quality (github.com/Vahe1994)
Borg 2.0 beta (deduplicating backup program with compression and encryption) (borgbackup.org)
It's quite fascinating – I fit into a 17.6 MB text file (github.com/0x77dev)
Windows File Explorer will be more powerful with version control and 7z (theverge.com)
Decompress Anything with "X Uz" (x-cmd.com)
FastLanes Compression Layout: Decoding >100B integers/sec with scalar code [pdf] (2023) (ir.cwi.nl)
The Neuralink compression challenge seems impossible (ycombinator.com)
Neuralink Compression Challenge (neuralink.com)
Windows 11 now supports 7-zip and TAR files (xda-developers.com)
FC8 – Faster 68K Decompression (2016) (bigmessowires.com)