Hacker News with Generative AI: Benchmarks

ARC-AGI-2 and ARC Prize 2025 (arcprize.org)
Good AGI benchmarks act as useful progress indicators. Better AGI benchmarks clearly discern capabilities. The best AGI benchmarks do all this and actively inspire research and guide innovation.
Open-source Rust database tops JSONBench using DataFusion (greptime.com)
GreptimeDB Takes on the Billion-JSON-Document Challenge - Outperforms ClickHouse, VictoriaLogs, and Competitors
MySQL transactions per second vs. fsyncs per second (2020) (sirupsen.com)
How many transactions (‘writes’) per second is MySQL capable of?
Apple M3 Ultra SoC: Disappointing CPU Benchmark Result Surfaces (techpowerup.com)
Just recently, Apple somewhat stunned the industry with the introduction of its refreshed Mac Studio with the M4 Max and M3 Ultra SoCs.
How the Ubuntu Linux Performance Has Evolved for RISC-V over the Last 4 Years (phoronix.com)
SiFive recently sent over their new HiFive Premier P550 developer board and as part of that fresh RISC-V CPU testing I've also been re-testing the prior SiFive HiFive Unmatched developer board from 2020~2021 for reference.
Intel Xeon 6700P and 6500P Granite Rapids-SP for the Masses Initial Benchmarks (servethehome.com)
This Intel Xeon 6 series has been a launch we have been waiting on for around three quarters. Today, we have the launch of the Intel Xeon 6700P and Xeon 6500P series. These are the smaller Socket E2 (LGA4710-2) processors compared to the big socket Intel Xeon 6900P series. At the same time, they are not necessarily “lower-end” as they can scale to four sockets, eight sockets, and beyond.
Nvidia GeForce RTX 5090 Linux GPU Compute Benchmarks (phoronix.com)
While there have been a lot of GeForce RTX 5090 Windows gaming benchmarks since the review embargo lift yesterday, for those more fascinated by this high-end Blackwell desktop graphics card for its GPU compute potential on Linux, this article is for you.
Nvidia GeForce GTX 980 Through GeForce RTX 5080/5090 GPU Compute Performance (phoronix.com)
Complementing the recent Linux GPU benchmarks of the NVIDIA GeForce RTX 5080 and GeForce RTX 5090 looking at both the Linux / Steam Play gaming performance as well as GPU compute and other areas, in today's testing is a wide multi-generation look seeing how the NVIDIA GeForce performance has evolved going back to the GeForce GTX 980 Maxwell GPUs up through the newest GeForce RTX 5080/5090 graphics cards.
Can We Trust AI Benchmarks? A Review of Current Issues in AI Evaluation (arxiv.org)
Quantitative Artificial Intelligence (AI) Benchmarks have emerged as fundamental tools for evaluating the performance, capability, and safety of AI models and systems.
The average CPU performance of PCs and notebooks fell for the first time (cpubenchmark.net)
Over 1,000,000 CPUs Benchmarked
Linux 6.13 Performance for 250Hz vs. 1000Hz Timer Frequency Comparison (phoronix.com)
Given the recent patch proposal to raise the Linux kernel's default timer frequency from 250Hz to 1000Hz, I ran some fresh benchmarks looking at the 250Hz vs. 1000Hz comparison on some modern desktop hardware.
The first yearly drop in average CPU performance in its 20 years of benchmarks (tomshardware.com)
VictoriaLogs Beats Elasticsearch, MongoDB and PostgreSQL in ClickBench (ycombinator.com)
Killed by LLM (r0bk.github.io)
A memorial to the benchmarks that defined—and were defeated by—AI progress
Fair Go vs. Elixir Benchmarks (github.com/antonputra)
The code previously used Jason.encode!, but Jason.encode_to_iodata! should be preferred when the output is written to IO devices. This should increase performance and reduce memory usage. This is what frameworks such as Phoenix would have used by default.
Intel Compute Runtime 24.45 vs. ROCm 6.3 vs. Nvidia R565 Linux GPU Benchmarks (phoronix.com)
Complementing yesterday's fresh Linux gaming benchmarks of mid-range Intel Arc Graphics "Alchemist" vs. NVIDIA GeForce RTX 40 vs. AMD Radeon RX 7000 series cards ahead of the upcoming Battlemage availability, today's article is providing a fresh look at the latest Intel Compute Runtime performance for Level Zero / OpenCL on current-gen Intel discrete graphics compared to mid-range AMD Radeon GPUs on ROCm 6.3 and similar NVIDIA GeForce RTX 40 Ada graphics cards on the R565 driver.
Intel Arc B580 trades blows with the RTX 4060 and RX 7600 in early benchmarks (tomshardware.com)
1B Nested Loop Iterations (benjdd.com)
Timings taken via hyperfine on an M3 MacBook Pro with 16 GB RAM. Input value of 40 given to each.
OrioleDB beta7: Benchmarks (orioledb.com)
OrioleDB is a storage extension for PostgreSQL which uses PostgreSQL's pluggable storage system.
Microbenchmarks Are Experiments (mrale.ph)
Benchmarks are not numerology. Their results are not a divine revelation. Benchmarks are experiments. Their results are meaningless without interpretation and validation.
1B nested loop iterations (benjdd.com)
Ran each three times and used the lowest timing for each.
Google Axion ARM CPU (C4A) vs. AWS Graviton4, Performance benchmark (phoronix.com)
Last week Google announced the general availability of their C4A instances powered by their in-house Axion processors.
AMD Ryzen 7 9800X3D Linux Performance: Zen 5 With 3D V-Cache (phoronix.com)
Ahead of tomorrow's availability of the AMD Ryzen 7 9800X3D processor as the first Zen 5 CPU released with 3D V-Cache, today the review embargo lifts. Here is a look at how this 8-core / 16-thread Zen 5 CPU with 64MB of 3D V-Cache is performing under Ubuntu Linux compared to a variety of other Intel Core and AMD Ryzen desktop processors.
Apple's M4 Max is the single-core performance king in Geekbench 6 (tomshardware.com)
Mac Mini with M4 Pro is the fastest Mac ever benchmarked (macrumors.com)
The first Geekbench 6 benchmark results for the M4 Pro chip surfaced today. Impressively, the results that are available so far show that the highest-end M4 Pro chip is faster than the highest-end M2 Ultra chip in terms of peak multi-core CPU performance.
Benchmarks of Google's Axion Arm-Based CPU (phoronix.com)
Earlier this year Google announced Axion as their first Arm-based CPU for the Google Cloud. Today already they are taking Axion to general availability with the new C4A instances. These new C4A instances are advertised as offering up to 50% better performance and up to 60% better energy efficiency than their current generation x86 instance types.
Encore.ts: A New Type of Framework (encore.dev)
We recently published performance benchmarks showing how Encore.ts achieves 9x request throughput compared to Express.js, and 2x compared to Fastify.
AMD Zen 5 Epyc Turin dominates previous Zen 4, Intel by 40% (phoronix.com)
Across more than 140 benchmarks the AMD EPYC 9005 series processors were delivering great performance, power efficiency, and value. Those interested can see all 140 benchmarks via this result file.
AMD EPYC Turin delivers better performance/power efficiency than AmpereOne (phoronix.com)
The AMD EPYC 9965 Turin Dense processor was delivering dominating performance in most of the HPC benchmarks tested compared to the AmpereOne A192-32X flagship ARM server processor.
AMD EPYC 9755 / 9575F / 9965 Benchmarks Show Dominating Performance Review (phoronix.com)
Last month Intel introduced their Xeon 6 "Granite Rapids" processors with up to 128 P cores, MRDIMM support, and other improvements as a big step-up in performance and power efficiency for their server processors.