22 points by goranmoomin 31 days ago | 34 comments
AMD Previews Mysterious Linux Runtime Stack for Ryzen AI NPUs (phoronix.com) AMD was late in getting their Ryzen AI NPU support out for Linux but for a few months now the kernel accelerator driver is in the mainline kernel and the associated user-space software/toolchain is publicly available via GitHub. Thus coming as a surprise three weeks ago was an announcement of a Linux runtime stack "preview" now being available:
AMD Ryzen 9 9950X3D Delivers Excellent Performance for Linux Developers (phoronix.com) Ahead of tomorrow's availability of the Ryzen 9 9900X3D and Ryzen 9 9950X3D CPUs in retail channels, today the embargo lifts on being able to deliver Ryzen 9 9950X3D reviews and performance benchmarks. Simply put, for Linux creators, developers, enthusiasts, and others running technical computing workloads and other similar tasks on their desktop, the Ryzen 9 9950X3D with its 16 cores / 32 threads and 144MB total cache makes for an excellent desktop CPU.
How to make any AMD Zen CPU always generate 4 from RDRAND (theregister.com) Googlers have not only figured out how to break AMD's security – allowing them to load unofficial microcode into its processors to modify the silicon's behavior as they wish – but also demonstrated this by producing a microcode patch that makes the chips always output 4 when asked for a random number.
New speculative attacks on Apple CPUs (predictors.fail) We present SLAP, a new speculative execution attack that arises from optimizing data dependencies, as opposed to control flow dependencies. More specifically, we show that Apple CPUs starting with the M2/A15 are equipped with a Load Address Predictor (LAP), which improves performance by guessing the next memory address the CPU will retrieve data from based on prior memory access patterns.
Disabling Zen 5's Op Cache and Exploring Its Clustered Decoder (chipsandcheese.com) Zen 5 has an interesting frontend setup with a pair of fetch and decode clusters. Each cluster serves one of the core’s two SMT threads. That creates parallels to AMD’s Steamroller architecture from the pre-Zen days. Zen 5 and Steamroller can both decode up to eight instructions per cycle with two threads active, or up to four per cycle for a single thread.
Turning Off Zen 4's Op Cache for Curiosity and Giggles (chipsandcheese.com) CPUs start executing instructions by fetching those instruction bytes from memory and decoding them into internal operations (micro-ops). Getting data from memory and operating on it consumes power and incurs latency. Micro-op caching is a popular technique to improve on both fronts, and involves caching micro-ops that correspond to frequently executed instructions.
AMD Disables Zen 4's Loop Buffer (chipsandcheese.com) A loop buffer sits at a CPU's frontend, where it holds a small number of previously fetched instructions. Small loops can be contained within the loop buffer, after which they can be executed with some frontend stages shut off. That saves power, and can improve performance by bypassing any limitations present in prior frontend stages. It's an old but popular technique that has seen use by Intel, Arm, and AMD cores.
Antenna Diodes in the Pentium Processor (righto.com) I was studying the silicon die of the Pentium processor and noticed some puzzling structures where signal lines were connected to the silicon substrate for no apparent reason.
266 points by chmaynard 216 days ago | 51 comments
IBM Power11 CPUs Launching 2025 – Linux 6.13 Preps KVM Nested Guests for Power11 (phoronix.com) IBM isn't formally releasing Power11 processors until next year, but their software engineers continue being quite busy preparing the Linux kernel and other open-source software for Power11. The newest on the kernel side is enabling support for KVM nested guests on IBM Power11 platforms.
AMD Ryzen 7 9800X3D Linux Performance: Zen 5 With 3D V-Cache (phoronix.com) Ahead of tomorrow's availability of the AMD Ryzen 7 9800X3D processor as the first Zen 5 CPU released with 3D V-Cache, today the review embargo lifts. Here is a look at how this 8-core / 16-thread Zen 5 CPU with 64MB of 3D V-Cache is performing under Ubuntu Linux compared to a variety of other Intel Core and AMD Ryzen desktop processors.