Hacker News with Generative AI: Statistics

This is How Many Startup Businesses Fail in the First Year (+Survival Tips) (54collective.vc)
The startup journey is not for the faint-hearted. Pondering how many startup businesses fail in the first year is a prudent thing to do at pre-launch. This startup survival guide outlines the most critical startup failure statistics to keep in mind when planning your launch.
The average American spent 2.5 months on their phone in 2024 (pcmag.com)
There's a good chance that you're currently reading this article on your phone. If you’re like one of the Americans surveyed by Reviews.org, this is one of 205 times today that you’ll be checking the device in your hand.
Lots of driving is proof we have a healthy economy: Misuse of VMT and GDP charts (urbanismspeakeasy.com)
Motordom's defenders will frequently overlay vehicle miles traveled (VMT) and gross domestic product (GDP) as if the story must be "more driving equals more prosperity."
Lies. Damned Lies. P-value thresholds (newyorker.com)
Harold Eddleston, a seventy-seven-year-old from Greater Manchester, was still reeling from a cancer diagnosis he had been given that week when, on a Saturday morning in February, 1998, he received the worst possible news. He would have to face the future alone: his beloved wife had died unexpectedly, from a heart attack.
Optimality of Frequency Moment Estimation (weizmann.ac.il)
Kelly Can't Fail (win-vector.com)
You may have heard of the Kelly bet allocation strategy. It is a system for correctly exploiting information or bias in a gambling situation. It is also known as a maximally aggressive or high variance strategy, in that betting more than the Kelly selection can be quite ruinous.
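As a rough illustration of the "betting more than Kelly is ruinous" remark, here is a minimal sketch for the textbook case of a single repeated bet with win probability `p` and net odds `b`. This simple setup and the function names are assumptions made for illustration; the linked article may analyze a different game.

```python
import math

def kelly_fraction(p: float, b: float) -> float:
    """Textbook Kelly fraction for a bet won with probability p at net odds b."""
    return p - (1.0 - p) / b

def expected_log_growth(p: float, b: float, f: float) -> float:
    """Expected log growth per bet when staking a fraction f of the bankroll."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)

if __name__ == "__main__":
    p, b = 0.6, 1.0  # assumed example: 60% win chance at even odds
    f_star = kelly_fraction(p, b)
    for f in (0.5 * f_star, f_star, 2.0 * f_star):
        print(f"stake {f:.2f}: growth {expected_log_growth(p, b, f):+.5f}")
```

In this assumed example the Kelly stake is 20% of the bankroll, and doubling it pushes the expected log growth below zero, so the bankroll tends to shrink over time even though each bet has a positive edge.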
Only 15% of all Steam users' time was spent playing games released in 2024 (pcgamer.com)
Lies, damn lies, and shoplifting statistics (popular.info)
For 32 years, the National Retail Federation (NRF) — the lobbying group representing major retailers in the United States — has produced the "National Retail Security Survey."
The distribution of eigenvalues of GUE and its minors at fixed index (wordpress.com)
Leadership Power Tools: SQL and Statistics (blwt.io)
A common pattern I’ve seen over the years has been folks in engineering leadership positions who are not super comfortable with extracting and interpreting data from stores, be it databases, CSV files in an object store, or even just a spreadsheet.
Why probability probably doesn't exist (but it is useful to act like it does) (nature.com)
All of statistics and much of science depends on probability — an astonishing achievement, considering no one’s really sure what it is.
Maximum likelihood estimation and loss functions (rish-01.github.io)
When I started learning about loss functions, I could always understand the intuition behind them. For example, the mean squared error (MSE) for regression seemed logical—penalizing large deviations from the ground-truth makes sense. But one thing always bothered me: I could never come up with those loss functions on my own. Where did they come from? Why do we use these specific formulas and not something else?
Datasaurus dozen – Different datasets with the same descriptive statistics (wikipedia.org)
The Datasaurus dozen comprises thirteen data sets that have nearly identical simple descriptive statistics to two decimal places, yet have very different distributions and appear very different when graphed.[1] It was inspired by the smaller Anscombe's quartet that was created in
Assisted dying now accounts for one in 20 Canada deaths (bbc.co.uk)
Medically-assisted dying – also known as voluntary euthanasia – accounted for 4.7% of deaths in Canada in 2023, new government data shows.
San Francisco is on track to have lowest homicide rate in 60 years – KTVU FOX 2 (ktvu.com)
San Francisco is on track to have the lowest homicide rate in 60 years, according to the police department and mayor's office.
Too many people are killed by supersized cars. This new rule could help (vox.com)
The deadly consequences of “autobesity,” in 3 charts.
US airlines transported passengers over two light-years since the last crash (ourworldindata.org)
When an airplane crashes, we all hear about it. Large crashes are major news events, with shocking pictures repeated endlessly across our television screens.
Review of "Statistics" by Freedman, Pisani, and Purves (2017) (cadlag.org)
Statistics depends crucially on mathematics, but it is not subordinate in this dependence. Much of the power of statistics is in common sense, amplified by appropriate mathematical tools, and refined through careful analysis. An introductory course in statistics is then, first and foremost, a course in proper reasoning – with a quantitative bent.
Factoring in the Chicken McNugget monoid (2017) (arxiv.org)
Every day, 34 million Chicken McNuggets are sold worldwide.
Hey, wait – is employee performance Gaussian distributed? (timdellinger.substack.com)
It’s probably Pareto-distributed, not Gaussian, which elucidates a few things about some of the problems that performance management processes have at large corporations, and also speaks to why it’s so hard to hire good people. Oh, and for the economists: the Marginal Productivity Theory of Wages is cleverly combined with the Gini Coefficient to arrive at the key insight.
A statistical approach to model evaluations (anthropic.com)
Suppose an AI model outperforms another model on a benchmark of interest—testing its general knowledge, for example, or its ability to solve computer-coding questions. Is the difference in capabilities real, or could one model simply have gotten lucky in the choice of questions on the benchmark?
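One common way to approach the "real or lucky" question is a confidence interval on the per-question score difference. The sketch below is a generic paired comparison, not necessarily the method in the linked article; the function name, the normal approximation (`z = 1.96`), and the toy scores are all assumptions.

```python
import math
import statistics

def paired_difference_ci(scores_a, scores_b, z=1.96):
    """Mean per-question score difference (A minus B) with an approximate 95% interval.

    Assumes both models answered the same questions in the same order.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean = statistics.fmean(diffs)
    # Standard error of the mean difference across questions.
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, (mean - z * se, mean + z * se)

if __name__ == "__main__":
    # Hypothetical 0/1 correctness scores on four shared questions.
    model_a = [1, 1, 0, 1]
    model_b = [1, 0, 0, 0]
    mean, (low, high) = paired_difference_ci(model_a, model_b)
    print(f"mean difference {mean:.2f}, interval ({low:.2f}, {high:.2f})")
```

If the interval contains zero, as it does for this tiny toy sample, the benchmark alone does not separate the two models. A real benchmark would have far more questions, and the normal approximation is only reasonable at that scale.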
The Birthday Paradox Experiment (2018) (pudding.cool)
The chance that two people in the same room have the same birthday: that is the Birthday Paradox 🎉. And according to fancy math, there is a 50.7% chance when there are just 23 people in a room. (This is in a hypothetical world. In reality, people aren’t born evenly throughout the year, and leap years are excluded. However, the numbers should still be pretty close. More on this in the appendix.)
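The 50.7% figure can be reproduced with a short calculation. This is a minimal sketch under the same idealized assumptions the snippet mentions (365 equally likely birthdays, no leap years); the function name is hypothetical.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    if people > days:
        return 1.0  # pigeonhole: a shared birthday is certain
    p_all_distinct = 1.0
    for k in range(people):
        # Person k must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

if __name__ == "__main__":
    print(f"{shared_birthday_probability(23):.1%}")  # prints 50.7%
```

With 22 people the probability is still below one half, which is why 23 is the usual headline number.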
500 Million, But Not a Single One More (laneless.substack.com)
We will never know their names.
9.5% of software engineers are ghosts (twitter.com)
The number of exceptional people: Fewer than 85 per 1M across key traits (sciencedirect.com)
Cognitive biases can lead to overestimating the expected prevalence of exceptional multi-talented candidates, leading to potential dissatisfaction in recruitment contexts.
How do cars do in out-of-sample crash testing? (2020) (danluu.com)
While having car crash test results is obviously better than not having them, the results themselves don't tell us what happens when we get into an accident that doesn't exactly match a benchmark.
Overview of differential geometry for Hamiltonian Monte Carlo (arxiv.org)
Hamiltonian Monte Carlo has proven a remarkable empirical success, but only recently have we begun to develop a rigorous understanding of why it performs so well on difficult problems and how it is best applied in practice.
Statistical Rethinking (2024 Edition) (github.com/rmcelreath)
This course teaches data analysis, but it focuses on scientific models.
Three-Quarters of U.S. Adults Are Now Overweight or Obese (nytimes.com)
Nearly three quarters of U.S. adults are overweight or obese, according to a sweeping new study.
Trust no one: why we can't trust most stats about the cybersecurity industry (ventureinsecurity.net)
There is a problem in cybersecurity: solid industry analysis is hard to come by.