Hacker News with Generative AI: Computer Science

Implementing a spellchecker on 64 kB of RAM in the 1970s led to a compression algorithm (pcgamer.com)
Show HN: TypeScript as a proof assistant for intuitionistic propositional logic (github.com)
TypeScript can be used just like a proof assistant (for intuitionistic propositional logic)!
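A minimal Curry-Howard sketch of the claim (my own encoding, not necessarily the repo's): propositions become types, implication becomes the function type, and writing a total term of a type is proving the proposition.

```typescript
// Propositions as types, proofs as total terms.
type And<A, B> = [A, B];          // conjunction as a pair
type Or<A, B>  = { tag: "inl"; val: A } | { tag: "inr"; val: B }; // disjunction
type False     = never;           // the empty type has no proofs
type Not<A>    = (a: A) => False; // negation as A -> False

// Proof of A -> (B -> A): the K combinator type-checks, so the theorem holds.
const k = <A, B>(a: A) => (_b: B): A => a;

// Proof of A /\ B -> B /\ A: swap the components of the pair.
const andComm = <A, B>([a, b]: And<A, B>): And<B, A> => [b, a];

// Proof of (A -> B) -> (not B -> not A): contraposition.
const contrapose =
  <A, B>(f: (a: A) => B) => (nb: Not<B>): Not<A> => (a) => nb(f(a));
```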
Algorithm Design (in C) (ime.usp.br)
The 20th century was the century of the equation; the 21st century is the century of the algorithm.
Circuit Tracing: Revealing Computational Graphs in Language Models (Anthropic) (transformer-circuits.pub)
We introduce a method to uncover mechanisms underlying behaviors of language models. We produce graph descriptions of the model’s computation on prompts of interest by tracing individual computational steps in a “replacement model”. This replacement model substitutes a more interpretable component (here, a “cross-layer transcoder”) for parts of the underlying model (here, the multi-layer perceptrons) that it is trained to approximate.
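A toy sketch of the substitution idea, illustrative only: the features and weights below are invented, whereas the paper's cross-layer transcoder is trained to approximate the real MLPs. The point is that each named feature's contribution to the output is a separate, inspectable term, i.e. a node in the trace.

```typescript
const relu = (x: number) => Math.max(0, x);

// Opaque original component: a tiny 1-in/1-out MLP with one hidden unit.
const mlp = (x: number) => 0.9 * relu(2 * x - 1);

// Interpretable replacement: labeled features with separate decoder weights.
const features = [
  { name: "feature A (fires for x > 0.5)", enc: (x: number) => relu(2 * x - 1), dec: 0.7 },
  { name: "feature B (fires for x > 0.8)", enc: (x: number) => relu(5 * x - 4), dec: 0.4 },
];

function traceReplacement(x: number) {
  const contributions = features.map(f => ({
    node: f.name,
    contribution: f.enc(x) * f.dec, // one edge in the attribution graph
  }));
  const out = contributions.reduce((s, c) => s + c.contribution, 0);
  return { out, approxOf: mlp(x), contributions };
}

console.log(traceReplacement(0.9)); // both features fire; compare out vs approxOf
```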
Four Lectures on Standard ML (1989) [pdf] (cs.tufts.edu)
Where Do the Bytes Go? (tedunangst.com)
Or perhaps more precisely, how do they get there? What happens when you call write?
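A short Node.js illustration of the gap the post explores: write() returning only means the kernel has the bytes (typically in the page cache); fsync() is what forces them to the device.

```typescript
import * as fs from "node:fs";

const fd = fs.openSync("/tmp/bytes.txt", "w");
const buf = Buffer.from("where do the bytes go?\n");

fs.writeSync(fd, buf);  // copies into kernel buffers; may not touch the disk yet
fs.fsyncSync(fd);       // blocks until the kernel has flushed to stable storage
fs.closeSync(fd);       // close alone does not imply a flush to disk
```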
Mathematical Compact Models of Advanced Transistors [pdf] (eecs.berkeley.edu)
Superhyperbola (johndcook.com)
Learning Theory from First Principles [pdf] (di.ens.fr)
Nature of Code (natureofcode.com)
Hi! Welcome! You can read the whole book here, thank you Creative Commons!
Parameter-free KV cache compression for memory-efficient long-context LLMs (arxiv.org)
The linear growth of key-value (KV) cache memory and quadratic computational complexity pose significant bottlenecks for large language models (LLMs) in long-context processing.
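A back-of-envelope calculation of the linear term, with assumed (roughly Llama-2-7B-like) shapes rather than anything from the paper:

```typescript
// KV cache size: 2 tensors (K and V) per layer, each [kvHeads, seqLen, headDim].
function kvCacheBytes(seqLen: number, layers = 32, kvHeads = 32,
                      headDim = 128, bytesPerElem = 2 /* fp16 */): number {
  return 2 * layers * kvHeads * headDim * seqLen * bytesPerElem;
}

for (const n of [4_096, 32_768, 131_072]) {
  console.log(`${n} tokens -> ${(kvCacheBytes(n) / 2 ** 30).toFixed(1)} GiB`);
}
// 4k tokens -> 2.0 GiB, 128k tokens -> 64.0 GiB: memory grows linearly with
// context; attention over the cache contributes the quadratic compute term.
```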
C. elegans: The worm that no computer scientist can crack (wired.com)
One of the simplest, most over-studied organisms in the world is the C. elegans nematode. For 13 years, a project called OpenWorm has tried—and utterly failed—to simulate it.
Three Tribes of Programming (2017) (josephg.com)
There's an old joke that computer science is a lie, because it's not really about computers, and it's not really a science.
Postel's Law and the Three Ring Circus (alexgaynor.net)
Postel’s Law famously states that “implementations should follow a general principle of robustness: be conservative in what you do, be liberal in what you accept from others.”
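A tiny illustration of the principle in practice, liberal on input and conservative on output (my example, not the post's):

```typescript
// Accept sloppy boolean inputs liberally, but always emit one canonical form.
function parseBoolLiberal(raw: string): boolean | undefined {
  const v = raw.trim().toLowerCase();
  if (["true", "yes", "1", "on"].includes(v)) return true;
  if (["false", "no", "0", "off"].includes(v)) return false;
  return undefined; // unrecognized: don't guess
}

const emitBoolConservative = (b: boolean): string => (b ? "true" : "false");

console.log(parseBoolLiberal("  YES "));  // true
console.log(emitBoolConservative(true));  // always "true", never "YES"
// The post's critique: once receivers accept all of these variants, every
// other implementation is pressured into accepting them too.
```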
Tao: Using test-time compute to train efficient LLMs without labeled data (databricks.com)
Escher's Art and Computer Science (replicated.wiki)
While on a short vacation in The Hague I had a blissful chance to visit the Escher Museum. It is now hosted in an actual royal palace, which many Europeans may find surprisingly modest. Maurits Cornelis Escher was a 1940s Dutch graphic artist known for his math-rich works.
SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs (arxiv.org)
The quantization of large language models (LLMs) is crucial for deploying them on devices with limited computational resources.
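For context, a generic symmetric linear quantizer; this is the textbook baseline, not SplitQuantV2's method:

```typescript
// Map floats to b-bit signed integers with a single scale, then reconstruct.
function quantize(xs: number[], bits: number) {
  const qmax = 2 ** (bits - 1) - 1; // e.g. 7 for 4-bit
  const scale = Math.max(...xs.map(Math.abs)) / qmax || 1;
  const q = xs.map(x => Math.max(-qmax - 1, Math.min(qmax, Math.round(x / scale))));
  return { q, scale, dequant: q.map(v => v * scale) };
}

const w = [0.12, -0.53, 0.97, -0.08];
console.log(quantize(w, 4)); // 4-bit codes plus the scale needed to decode
// Low-bit error is dominated by outliers in w; methods like SplitQuantV2
// restructure layers to tame exactly that problem.
```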
Land ahoy: leaving the Sea of Nodes (v8.dev)
V8’s end-tier optimizing compiler, Turbofan, is famously one of the few large-scale production compilers to use Sea of Nodes (SoN). However, almost 3 years ago we began moving away from Sea of Nodes and back to a more traditional Control-Flow Graph (CFG) Intermediate Representation (IR), which we named Turboshaft.
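A structural sketch of the two IR shapes being contrasted; these types are illustrative, not Turbofan's or Turboshaft's actual definitions:

```typescript
// Sea of Nodes: one big graph; control is just another edge kind, and
// operations float freely until scheduling pins them down.
interface SoNNode {
  op: string;
  valueInputs: SoNNode[];
  controlInputs: SoNNode[]; // control flow woven into the same graph
}

// CFG: operations live in an ordered list inside a basic block, and blocks
// are linked by explicit successor edges.
interface Instruction { op: string; operands: number[] }
interface BasicBlock {
  instructions: Instruction[]; // order within a block is fixed
  successors: BasicBlock[];    // explicit control-flow edges
}
```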
Writing your own C++ standard library from scratch (blogspot.com)
The C++ standard library (also known as the STL) is, without a doubt, an astounding piece of work. Its scope, performance and incredible backwards compatibility have taken decades of work by many of the world's best programmers. My hat's off to all those people who have contributed to it.
Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares (web.stanford.edu)
This book is used as the textbook for our own courses ENGR108 (Stanford) and EE133A (UCLA), where you will find additional related material.
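The least-squares problem the title refers to, in its standard closed form (assuming A has linearly independent columns):

```latex
\hat{x} = \arg\min_x \|Ax - b\|_2^2 = (A^{\mathsf T} A)^{-1} A^{\mathsf T} b
```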
Yann LeCun "Mathematical Obstacles on the Way to Human-Level AI" [video] (youtube.com)
MIT 6.S191: Deep Generative Modeling [video] (youtube.com)
Every Flop Counts: Scaling a 300B LLM Without Premium GPUs (arxiv.org)
In this technical report, we tackle the challenges of training large-scale Mixture of Experts (MoE) models, focusing on overcoming cost inefficiency and resource limitations prevalent in such systems.
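Why MoE moves the FLOPs needle: per token, only the routed experts run. The split below is an assumed illustration, not the paper's architecture:

```typescript
// Per-token active parameters under top-k routing.
function activeParams(topK: number, paramsPerExpert: number,
                      sharedParams: number): number {
  return sharedParams + topK * paramsPerExpert;
}

const total = 64 * 4e9 + 44e9;             // ~300B total (assumed split)
const active = activeParams(2, 4e9, 44e9); // ~52B active per token
console.log({ totalB: total / 1e9, activeB: active / 1e9 });
// Compute per token tracks active, not total, parameters; that is the lever
// for training a 300B-class model without premium GPUs.
```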
BeeFormer: CF and CBF hybrid approach for recommendation systems (github.com/recombee)
This is the official implementation provided with our paper "beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems".
Coding Theory and Cryptography [pdf] (ualberta.ca)
Math for Computer Science and Machine Learning [pdf] (cis.upenn.edu)
Concise Machine Learning [pdf] (eecs.berkeley.edu)
Advanced Algorithms [pdf] (ethz.ch)
Piccolo: Large-Scale Graph Processing with Fine-Grained In-Memory Scatter-Gather (arxiv.org)
Graph processing requires irregular, fine-grained random access patterns incompatible with contemporary off-chip memory architecture, leading to inefficient data access.
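The access pattern at issue, sketched generically over a CSR adjacency (not Piccolo's in-memory mechanism):

```typescript
// Gathering neighbor values through an index array produces fine-grained
// random reads of `values`, which is what off-chip DRAM handles poorly.
function gatherNeighborSums(rowPtr: number[], colIdx: number[],
                            values: number[]): number[] {
  const out = new Array<number>(rowPtr.length - 1).fill(0);
  for (let v = 0; v < out.length; v++) {
    for (let e = rowPtr[v]; e < rowPtr[v + 1]; e++) {
      out[v] += values[colIdx[e]]; // random access: colIdx[e] jumps anywhere
    }
  }
  return out;
}

// 3-node graph: 0->{1,2}, 1->{2}, 2->{0}
console.log(gatherNeighborSums([0, 2, 3, 4], [1, 2, 2, 0], [10, 20, 30]));
// [50, 30, 10]; each edge is a small, cache-unfriendly read
```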
Graph Theory and Additive Combinatorics (yufeizhao.com)
Using the dichotomy of structure and pseudorandomness as a central theme, this accessible text provides a modern introduction to extremal graph theory and additive combinatorics.