Hacker News with Generative AI: Testing

How rqlite is tested (philipotoole.com)
rqlite is a lightweight, open-source, distributed relational database built on SQLite and Raft. With its origins dating back to 2014, its design has always prioritized reliability and quality. Testing plays a foundational role in achieving these qualities, shaping the implementation and guaranteeing robustness.
Snyk Security Labs Testing Update: Cursor.com AI Code Editor (snyk.io)
Snyk’s Security Labs team aims to find and help mitigate vulnerabilities in software used by developers around the world, with an overarching goal to improve the state of software security.
Why Rust nextest is process-per-test (sunshowers.io)
I’m often asked why the Rust test runner I maintain, cargo-nextest, runs every test in a separate process. Here’s my best attempt at explaining the rationale behind it.
Documentation is more important than tests (anonel.substack.com)
Without the intention of making a click-bait-y title, I really think docs and tests are almost equal in importance, but if a company would only have resources to do one, they should choose documentation.
Power up and tear down of a Rohde and Schwarz SKTU BN 4151/2/5 noise generator (makertube.net)
Ruby-refrigerator: Freeze all core Ruby classes (github.com/jeremyevans)
Refrigerator offers an easy way to freeze all ruby core classes and modules. It’s designed to be used in production and when testing to make sure that no code is making unexpected changes to core classes or modules at runtime.
Database Release and End-to-End Testing: ClickHouse Database Cloning (notion.site)
Writing and testing a paginated API iterator in Go (thibaut-rousseau.com)
Go 1.23, amongst other features, brought Iterators to the standard library.
Webhook Tester/Debugger (hooklistener.com)
Optimize, Test, and Debug Webhooks with Precision
Testing for Thermal Issues Becomes More Difficult (semiengineering.com)
Increasingly complex and heterogeneous architectures, coupled with the adoption of high-performance materials, are making it much more difficult to identify and test for thermal issues in advanced packages.
Make your QEMU faster (2022) (schreibt.jetzt)
NixOS uses virtual machines based on QEMU extensively for running its test suite. In order to avoid generating a disk image for every test, the test driver usually boots using a Plan 9 File Protocol (9p) share (server implemented by QEMU) for the Nix store, which contains all the programs and config necessary for the test.
SeleniumBase: Python APIs for web automation and bypassing bot-detection (github.com/seleniumbase)
Python APIs for web automation, testing, and bypassing bot-detection.
Jupyter Notebooks as E2E Tests (exotext.com)
Recently, I've been involved in building a new library, and it ended up containing a half dozen notebooks, covering everything from a quick start guide to niche applications and configuration examples. It being a completely new product, we wanted our users to have extensive interactive documentation from the start.
Scores for adults are dropping on tests of basic skills (nataliewexler.substack.com)
An international test of adults’ “basic skills” shows that an increasing number of Americans are struggling to do moderately complex tasks that involve reading and math.
Test (defense.gov)
What TDD is good for (wordpress.com)
In an earlier article, I tore through some terrible arguments used to advocate for TDD that I see all too often (even by experienced engineers). I said in that piece that I would eventually go through what I think are better arguments for TDD, so that’s what I’m gonna do now.
Does Your Code Pass the Turkey Test? (2008) (moserware.com)
Over the past 6 years or so, I’ve failed each item on “The Turkey Test.” It’s very simple: will your code work properly on a person’s machine in or around the country of Turkey? Take this simple test.
Introducing Qodo Cover: Automate Test Coverage (qodo.ai)
With AI-generated code becoming a cornerstone of modern software development (Google recently revealed that a staggering 25% of their new code is AI-generated), reliable code is more critical than ever. As AI takes on more coding tasks, the burden of ensuring code maintainability and reliability falls on processes like unit testing and regression testing, critical safeguards that verify functionality remains intact as code evolves.
Test Driven Development (TDD) for your LLMs? Yes please, more of that please (helix.ml)
Testing LLM-based applications has become one of the most crucial challenges in modern software development. While traditional software testing gives us clear pass/fail criteria, how do you verify that your AI is consistently giving good responses? When is a response "correct enough"? And how do you automate this testing process in a way that scales?
Qodo's autonomous agent tackles the complexities of regression testing (venturebeat.com)
Code is continuously evolving in the software development process, requiring ongoing testing for quality and maintainability. This is the root of regression testing, in which existing tests are re-run to ensure that modified code continues to function as intended.
Launch HN: Vocera (YC F24) – Testing and Observability for Voice AI (ycombinator.com)
Hey HN, we’re Shashij, Sidhant, and Tarush, founders of Vocera AI (https://www.vocera.ai) – a platform that automates the testing and monitoring of AI voice agents.
Vitest vs. Jest (speakeasy.com)
Effective testing frameworks are essential in building reliable JavaScript applications, helping you minimize bugs and catch them early to keep your customers happy. Choosing the right testing framework saves hours of configuration hassles and improves developer experience.
Use cocotb to test and verify chip designs in Python (cocotb.org)
Use cocotb to test and verify chip designs in Python. Productive, and with a smile.
How Did You Do on the AI Art Turing Test? (astralcodexten.com)
Last month, I challenged 11,000 people to classify fifty pictures as either human art or AI-generated images.
How do cars do in out-of-sample crash testing? (2020) (danluu.com)
While having car crash test results is obviously better than not having them, the results themselves don't tell us what happens when we get into an accident that doesn't exactly match a benchmark.
Writing Healthy Health-Checks (lorentz.app)
Packages, Not Programs (bitfieldconsulting.com)
This is the first of a two-part tutorial on designing Go packages, guided by tests.
Trunk Flaky Tests – Detect, Quarantine, and Eliminate Flaky Tests (trunk.io)
Almost eight months ago we opened early access to our flaky tests solution to help companies find and neutralize their flaky tests. Since then we’ve processed 20.2 million uploads from our development partners. Today we are excited to announce the next phase of this product: we are opening our Flaky Tests solution to the public.
LocalStack raises $25M to help developers emulate and test cloud apps locally (techcrunch.com)
Knowing how your cloud application will behave in production usually requires significant development and testing in the environment in which it will be deployed, be that AWS, Azure, Google Cloud, or wherever. But this can be a resource-intensive endeavor, particularly with issues relating to latency (the time it takes to constantly send data) and the costs associated with this.
Measuring keyboard-to-photon latency with a light sensor (2023) (thume.ca)
For a long time when I’ve wanted to test the latency of computers and UIs I’ve used the Is It Snappy app with my iPhone’s high speed camera to count frames between when I press a key and when the screen changes. However, the problem with that is that it takes a while to find the exact frames you want, which is annoying when doing a bunch of testing.