Hacker News with Generative AI: Testing

Using tests as a debugging tool for logic errors (qodo.ai)
In Java development, logic errors constitute a unique class of defects where code executes flawlessly according to its written instructions while systematically violating business requirements.
Grafana K6 v1.0.0 (github.com/grafana)
After 9 years of iteration and countless community contributions, we're thrilled to announce Grafana k6 v1.0.
Elm Test Distributions (janiczek.cz)
…in which I’ll tell you how you can make sure your property based tests are testing the interesting cases.
Swarm Testing Data Structures (tigerbeetle.com)
We discovered a cute little pattern the other day when refactoring TigerBeetle’s intrusive queue: using Zig’s comptime reflection to exhaustively test a data structure’s public API. Isn’t it cool when your property test fails when you add a new API, because “public API is tested” is one of the properties you test?!
Show HN: Magnitude – open-source, AI-native test framework for web apps (github.com/magnitudedev)
Magnitude: The open source, AI-native testing framework for web apps
AI helped write California bar exam, sparking uproar (arstechnica.com)
On Monday, the State Bar of California revealed that it used AI to develop a portion of multiple-choice questions on its February 2025 bar exam, causing outrage among law school faculty and test takers.
Unpowered SSD endurance investigation finds data loss and performance issues (tomshardware.com)
Antithesis Driven Testing (sqlsync.dev)
I want a test system smart enough to discover the bugs I can’t anticipate.
Streaming Platform for Canadian/American Content (Need Testers) (ycombinator.com)
Test Spies in Haskell (jezenthomas.com)
When testing a web application, you often want to make sure that a certain email would be sent — without actually sending it. How do you test that?
Unpowered SSD endurance investigation finds data loss, performance issues (tomshardware.com)
Scenario: Using Agents to Test Your Agents (github.com/langwatch)
Scenario is a library for testing agents end-to-end as a human would, but without having to manually do it. The automated testing agent covers every single scenario for you.
Ask HN: How to unit test AI responses? (ycombinator.com)
I am tasked to build a customer support chat. The AI should be trained with company docs. How can I be sure the AI will not hallucinate a bad response to a customer?
Local CI. Sign off on your own work (github.com/basecamp)
A GitHub CLI extension for local CI. Run your tests on your own machine and sign off when they pass.
Cargo-mutants: Inject bugs and see if your tests catch them (github.com/sourcefrog)
cargo-mutants helps you improve your program's quality by finding places where bugs could be inserted without causing any tests to fail.
Try: Test anti-framework via CL Condition System (github.com/melisgl)
Try is an extensible test anti-framework with equal support for interactive and non-interactive workflows.
Setup QEMU Output to Serial Console and Automate Tests with Shell Scripts (2019) (fadeevab.com)
While struggling to automate a QEMU guest (communicating with it and controlling it from shell scripts), I ran into a lot of incomplete, partially working solutions around the Internet. I now have a decent collection of working recipes for tuning a QEMU guest, so I decided to organize them here, where they could be useful to anyone else.
Pytest for Neovim (github.com/richardhapb)
Testing integrated into Neovim with pytest, including Docker support. This project is in progress; I will be adding more features in the future and am open to contributions.
Deterministic simulation testing for async Rust (s2.dev)
You Don't Have Time Not to Test (medium.com)
Testing isn’t a sunk cost. It’s a compounding return that shapes better code and ultimately accelerates your team.
The right way to do data fixtures in Go (brandur.org)
Every test suite should start early in building a strong convention to generate data fixtures. If it doesn’t, data fixtures will still emerge (they’re that necessary), but in a way that’s poorly designed, with no API (or a poorly designed one), and not standardized.
Our own worst best customer (antithesis.com)
At Antithesis, our job is to break software before it breaks in production – ours included. We’ve spent years stress-testing our systems with property-based testing and deterministic simulation, not just because it makes our software more reliable, but because it actually makes us faster.
Show HN: Donobu – Mac app turns prompts into deterministic browser tests (ycombinator.com)
Hi HN, we’re Vasusen and Justin, and we’re building Donobu (https://www.donobu.com), a Mac desktop app. It turns prompts like “ensure onboarding works” into reliable browser tests, with optional AI (BYOK). It’s local-first, privacy-focused, and built with insights from our Coursera days—where testing hundreds of features across thousands of pages was a nightmare.
Just write a test for it (kobzol.github.io)
This is a short appreciation post about Rust continuously guiding me towards doing The Right Thing™.
Testing Without Mocks: A Pattern Language (2023) (jamesshore.com)
Dead Simple Snapshot Testing in Zig (kristoff.it)
I’ve recently added snapshot testing support to Zine, my static site generator, and I’ll share here how to get a similar setup going for your projects.
Verification-First Development (buttondown.com)
A while back I argued on the Blue Site that "test-first development" (TFD) was different than "test-driven development" (TDD). The former is "write tests before you write code", the latter is a paradigm, culture, and collection of norms that's based on TFD. More broadly, TFD is a special case of Verification-First Development and TDD is not.
Testing Begins for Community Notes on Facebook, Instagram and Threads (about.fb.com)
In January, Meta announced that we will end our third party fact checking program and move to a crowd-sourced Community Notes approach, starting in the United States. On March 18th, we will begin testing this new approach by allowing contributors from our community to write and rate notes on content across Facebook, Instagram and Threads.
Show HN: Testeranto – the AI driven test framework for TypeScript projects (npmjs.com)
🚧 WARNING: Testeranto is still under development and is not ready for production yet. 🚧
Problems with New California Bar Exam Enrage Test Takers and Cloud Their Futures (nytimes.com)
Even under normal circumstances, the California bar exam is one final harrowing hurdle before aspiring lawyers can practice. But last week was worse than any other, as they were thrown into limbo by technical glitches, delays and what many said were bizarrely written questions on a revamped test that didn’t match anything in preparation.