Hacker News with Generative AI: AMD

AMD Previews Mysterious Linux Runtime Stack for Ryzen AI NPUs (phoronix.com)
AMD was late in getting their Ryzen AI NPU support out for Linux but for a few months now the kernel accelerator driver is in the mainline kernel and the associated user-space software/toolchain is publicly available via GitHub. Thus coming as a surprise three weeks ago was an announcement of a Linux runtime stack "preview" now being available:
CUDA version of GROMACS is faster on AMD than HIP port (scale-lang.com)
With the release of version 1.3.1, SCALE has reached a major compatibility milestone: the ability to run the CUDA version of GROMACS on AMD GPUs.
Ryzen AI Max+ "Strix Halo" Delivers Best Performance on Linux over Windows 11 (phoronix.com)
Now having shown the very strong AMD Ryzen AI Max+ PRO 395 Linux performance for this "Strix Halo" SoC with Radeon 8060S iGPU for its integrated graphics, you may be wondering on the same hardware how this compares to Microsoft Windows 11.
AMD EPYC 4565P and EPYC 4585PX Benchmarks Against Xeon 6369P (phoronix.com)
With today's announcement of the AMD EPYC 4005P "Grado" entry-level server processors, up for review today are the EPYC 4565P and EPYC 4585PX processors as the top-end Zen 5 processors for budget server builds and basic bare metal server hosting.
'World's first' AMD GPU driven via USB3 (tomshardware.com)
China Export Controls Whack AMD Datacenter GPU Business (nextplatform.com)
As far as we can tell, the export controls on crippled GPU compute engines announced by the US Department of Commerce back in April have had a disproportionately hard impact on AMD compared to Nvidia. These controls did not affect AMD’s first quarter financial results announced this week, but the controls have removed $1.5 billion of revenue from AMD’s 2025 and $700 million of that will come out of the hide of its second quarter.
AMD GPU Programming in Julia (juliagpu.org)
Achieving 11M IOPS and 66 GiB/s IO on a Single Threadripper Workstation (2021) (tanelpoder.com)
TL;DR Modern disks are so fast that system performance bottleneck shifts to RAM access and CPU. With up to 64 cores, PCIe 4.0 and 8 memory channels, even a single-socket AMD ThreadRipper Pro workstation makes a hell of a powerful machine - if you do it right!
AMD 2.0 – New Sense of Urgency (semianalysis.com)
Ever since SemiAnalysis published an article in December 2024 detailing mediocre AMD software and the lack of usability, AMD has kicked into a higher gear and has made rapid progress in the past four months on many items we laid out. We view AMD’s new sense of urgency as a massive positive in its journey to catch up to Nvidia. AMD is now in a wartime stance, but there are still many battles ahead of it.
AMD Publishes Open-Source Driver for GPU Virtualization, Radeon "In the Roadmap" (phoronix.com)
AMD has published as open-source their "GPU-IOV Module" used for virtualization with Instinct accelerators. Bringing virtualization support to their client (Radeon) discrete GPUs is also reportedly on their roadmap.
AMD 2.0 – New Sense of Urgency, MI450X Chance to Beat Nvidia, Nvidia's New Moat (semianalysis.com)
Ever since SemiAnalysis published an article in December 2024 detailing mediocre AMD software and the lack of usability, AMD has kicked into a higher gear and has made rapid progress in the past four months on many items we laid out.
Framework 13 AMD Ryzen AI 300 Series Strix Point Makes for a Great Linux Laptop (phoronix.com)
Today the review embargo lifts on the Framework 13 with AMD Ryzen AI 300 "Strix Point" SoCs: wow, what an upgrade! I've spent the past week testing out the Framework 13 with the AMD Ryzen AI 9 HX 370 and it's been terrific.
AMD teases its first 2nm chip, EPYC 'Venice' fabbed on TSMC N2 node (tomshardware.com)
Chinese project aims to run RISC-V code on AMD Zen processors (tomshardware.com)
Run RISC-V Binaries on AMD Zen-Series CPUs via Microcode Modification (rvspoc.org)
Current AMD Zen-series CPUs (e.g., EPYC 9004 series) have begun integrating RISC-V coprocessors for specific acceleration tasks.
Google Cloud's New C4D VMs Deliver Remarkable Performance with AMD EPYC Turin (phoronix.com)
As part of the announcements coming out today from Google Cloud Next 2025, the embargo has now lifted on the new Google Cloud C4D VMs.
Linux 6.15 Features Deliver a Lot for Intel and AMD, Many Other Changes (phoronix.com)
The Linux 6.15 merge window ended on Sunday with the release of Linux 6.15-rc1. There are a lot of exciting features and updates that were merged during the two-week merge window. Here is a look at all of the most prominent changes to be found with Linux 6.15.
Dynamic Register Allocation on AMD's RDNA 4 GPU Architecture (chipsandcheese.com)
Modern GPUs often make a difficult tradeoff between occupancy (active thread count) and register count available to each thread.
Ask HN: Why hasn’t AMD made a viable CUDA alternative? (ycombinator.com)
I appreciate developing ROCm into something competitive with CUDA would require a lot of work, both internally within AMD and with external contributions to the relevant open source libraries.
Linux 6.15 Goes Heavy on Intel and AMD x86_64 CPU Changes (phoronix.com)
Merged today for the recently-opened Linux 6.15 merge window were all of the "x86/core" changes that are particularly heavy on new feature work for both Intel and AMD x86/x86_64 processors.
An Interview with Zen Chief Architect Mike Clark (computerenhance.com)
Zen is one of the most important microarchitectures in the history of the x86 ecosystem. Not only is it the reigning champion in many x64 benchmarks, but it is also the architecture that enabled AMD’s dramatic rise in CPU marketshare over the past eight years: from 10% when the first Zen processor was launched, to 25% at the introduction of Zen 5.
RDNA 4's “Out-of-Order” Memory Accesses (chipsandcheese.com)
AMD's RDNA 4 brings a variety of memory subsystem enhancements. Among those, one slide stood out because it dealt with out-of-order memory accesses. According to the slide, RDNA 4 allows requests from different shaders to be satisfied out-of-order, and adds new out-of-order queues for memory requests.
Aiter: AI Tensor Engine for ROCm (blogs.amd.com)
AMD launches Gaia open source project for running LLMs locally on any PC (tomshardware.com)
Gaia: An Open-Source Project from AMD for Running Local LLMs (amd.com)
AMD has launched a new open-source project called GAIA (pronounced /ˈɡaɪ.ə/), an awesome application that leverages the power of the Ryzen AI Neural Processing Unit (NPU) to run private and local large language models (LLMs).
Beyond the ROCm Software, AMD Has Been Making Great Strides in Documentation (phoronix.com)
AMD recently allowed me some time with their AMD Accelerator Cloud (AAC) leveraging multiple Instinct MI300X accelerators. During this brief opportunity to try out their latest software advancements with the Instinct MI300X and the ROCm compute stack, one of the most striking takeaways was their documentation improvements compared to previous forays into ROCm+Instinct compute. In addition, AMD is now offering more robust container options for easier Instinct compute deployments with more software options available and being more regularly updated.
AMD's Strix Halo under the hood (chipsandcheese.com)
At CES 2025 I got the chance to sit down with Mahesh Subramony, AMD Senior Fellow, to talk about AMD's upcoming Strix Halo SoC which is a brand new type of product for AMD and is the big iGPU SoC that many of us have been waiting for from AMD for a long while.
TSMC Pitches Intel Foundry JV to Nvidia, AMD and Broadcom (cnbc.com)
TSMC pitches Intel foundry joint venture to Nvidia, AMD and Broadcom (seekingalpha.com)