Hacker News with Generative AI: Statistics

Drug-Sniffing Dogs Are Wrong More Often Than Right (npr.org)
The Chicago Tribune sifted through three years' worth of cases in which law enforcement used dogs to sniff out drugs in cars in suburban Chicago. According to the analysis, officers found drugs or paraphernalia in only 44 percent of cases in which the dogs had alerted them.

Law Enforcement, Drugs, Statistics, Canine Detection

14 points by Anon84 48 days ago | 2 comments

Lieferando.de has captured 5.7% of restaurant related domain names (mondaybits.com)
I recently decided to compile a very large list of domain names for the German country code top-level domain .de.

Domain Names, Germany, Business, Statistics, Restaurants

313 points by __natty__ 49 days ago | 200 comments

It is time to stop teaching frequentism to non-statisticians (2012) (arxiv.org)
We should cease teaching frequentist statistics to undergraduates and switch to Bayes. Doing so will reduce the amount of confusion and over-certainty rife among users of statistics.

Statistics, Education, Bayesian Statistics

94 points by Tomte 51 days ago | 100 comments

Frequentism and Bayesianism: A Practical Introduction (2014) (jakevdp.github.io)
One of the first things a scientist hears about statistics is that there are two different approaches: frequentism and Bayesianism. Despite their importance, many scientific researchers never have the opportunity to learn the distinctions between them and the different practical approaches that result. The purpose of this post is to synthesize the philosophical and pragmatic aspects of the frequentist and Bayesian approaches, so that scientists like myself might be better prepared to understand the types of data analysis people do.

Statistics, Science, Bayesian Statistics, Frequentist Statistics, Data Analysis

7 points by Tomte 51 days ago | 0 comments

Visualizing Bayes Theorem (2009) (oscarbonilla.com)
I recently came up with what I think is an intuitive way to explain Bayes’ Theorem. I searched on Google for a while and could not find any article that explains it in this particular way.

Bayes Theorem, Probability, Statistics, Visualization

7 points by Tomte 51 days ago | 0 comments

I'm in the final third of my life (sive.rs)
According to statistics, I’m in the final third of my life.

Life Stages, Statistics, Personal Reflections

12 points by sebg 53 days ago | 0 comments

Bayesian Modeling and Computation in Python (2021) (bayesiancomputationbook.com)

Bayesian Statistics, Python, Machine Learning, Statistics, Books

8 points by Tomte 56 days ago | 0 comments

Ask HN: Anyone working in traditional ML/stats research instead of LLMs? (ycombinator.com)
I am curious about those who are working in the machine learning or statistics domain but are focusing on traditional ML research rather than large language models (LLMs).

Machine Learning, Statistics, Research

21 points by itsmekali321 59 days ago | 11 comments

Perfect Recession Predictors (perfectpredictors.com)
Each line shows the fraction of perfect yield curve spreads that were negative for a given prediction window (12, 18, or 24 months). An index value of 0 implies the lowest likely chance of recession, whereas an index value of 1 means the highest chance of recession.

Economics, Recession, Finance, Statistics

7 points by gscott 59 days ago | 1 comment

Initial USA Unemployment Claims (stlouisfed.org)

Economy, Employment, Statistics

37 points by mooreds 63 days ago | 5 comments

Why Are ADHD Rates So Much Higher in the U.S.? (gizmodo.com)
Roughly 11% of children and 6% of adults in the U.S. are currently diagnosed with ADHD, rates that are significantly higher than those reported in most other countries.

Health, ADHD, United States, Statistics

26 points by rntn 65 days ago | 39 comments

How to avoid P hacking (nature.com)
It can happen so easily. You’re excited about an experiment, so you sneak an early peek at the data to see if the P value — a measure of statistical significance — has dipped below the threshold of 0.05. Or maybe you’ve tried analysing your results in several different ways, hoping one will give you that significant finding. These temptations are common, especially in the cut-throat world of publish-or-perish academia.

Science, Research, Statistics, Academia

116 points by benocodes 66 days ago | 87 comments

P hacking – Five ways it could happen to you (nature.com)
It can happen so easily. You’re excited about an experiment, so you sneak an early peek at the data to see if the P value — a measure of statistical significance — has dipped below the threshold of 0.05. Or maybe you’ve tried analysing your results in several different ways, hoping one will give you that significant finding. These temptations are common, especially in the cut-throat world of publish-or-perish academia.

Science, Research, Statistics, Academic Publishing

8 points by gnabgib 67 days ago | 0 comments

Don't Die of Heart Disease (empirical.health)
Heart disease kills more people than all cancers combined—but it’s also the area of health most in your control. 80% of heart attacks can be avoided, and your risk is predictable using statistical models up to 30 years in advance.

Health, Heart Disease, Statistics, Prevention

25 points by brandonb 67 days ago | 8 comments

Backstory to the Survivorship Bias Plane (yuxi-liu-wired.github.io)
I discover the exact backstory to that picture of an airplane with red dots on top of it.

Statistics, Visualizations, Data Analysis, History

4 points by YuxiLiuWired 72 days ago | 0 comments

Zipf's Law (wikipedia.org)
Zipf's law (/zɪf/; German pronunciation: [tsɪpf]) is an empirical law stating that when a list of measured values is sorted in decreasing order, the value of the n-th entry is often approximately inversely proportional to n.

Statistics, Mathematics, Linguistics, Language, Data Analysis

6 points by baxtr 74 days ago | 0 comments
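
The relationship described in the Zipf's Law entry above (the n-th value is roughly proportional to 1/n) can be sketched as follows. This is only an idealized illustration: the function name and the top value of 1200 are assumptions made here, not taken from the linked article, and real data follows the law only approximately.

```python
def zipf_values(top_value, count):
    """Idealized Zipf sequence: the value at rank n is top_value / n."""
    return [top_value / rank for rank in range(1, count + 1)]


# Hypothetical top value of 1200 for the rank-1 entry.
values = zipf_values(1200, 4)

# Under an exact Zipf law, rank * value is the same for every entry.
products = [rank * value for rank, value in enumerate(values, start=1)]
```

Comparing rank times value across a sorted list of real measurements is a quick way to see how closely the data tracks the idealized law.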

Normalizing Ratings (blogspot.com)

Data Analysis, Statistics, Normalization

51 points by Symmetry 74 days ago | 52 comments

Derivation and Intuition behind Poisson distribution (notion.site)

Probability, Statistics, Mathematics, Poisson Distribution

105 points by sebg 74 days ago | 34 comments

Liverpool's title win has completed a mysterious Fibonacci sequence (bbc.com)
Liverpool FC's victory at the weekend has clinched them their second Premier League title but it also resulted in something curious – producing a strange series of numbers in the league's record books.

Sports, Football, Liverpool FC, Mathematics, Statistics

112 points by pseudolus 75 days ago | 55 comments

Kids twice as likely to die if hit by SUV than car (rte.ie)
Pedestrians and cyclists are 44% more likely to die if they are hit by an SUV or similar-sized vehicle rather than a traditional car, a study has found.

Traffic Safety, Automobiles, Studies, Statistics, Public Health

40 points by colinprince 75 days ago | 2 comments

Liverpool's title win has completed a mysterious Fibonacci sequence (bbc.com)
Liverpool FC's victory at the weekend has clinched them their second Premier League title but it also resulted in something curious – producing a strange series of numbers in the league's record books.

Sports, Football, Statistics, Liverpool FC

10 points by Lyngbakr 77 days ago | 0 comments

Can LLMs do randomness? (rnikhil.com)
While LLMs theoretically understand “randomness,” their training data distributions may create unexpected patterns. In this article we will test different LLMs from OpenAI and Anthropic to see if they provide unbiased results. For the first experiment we will make it toss a fair coin and for the next, we will make it guess a number between 0 and 10 and see whether the guesses are equally distributed between even and odd. I know the sample sizes are small and probably not very statistically significant.

Artificial Intelligence, Randomness, Statistics

61 points by whoami_nr 77 days ago | 67 comments
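
One way to judge whether a small batch of coin tosses, like the experiment described in the entry above, is consistent with a fair coin is an exact two-sided binomial test. The sketch below is not the linked article's method; the function name and the counts in the usage note are hypothetical.

```python
from math import comb


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null success rate p.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so ties are not lost to float rounding.
    cutoff = probs[k] * (1 + 1e-9)
    return min(1.0, sum(q for q in probs if q <= cutoff))
```

For a hypothetical 8 heads in 10 tosses, `binom_two_sided_p(8, 10)` is about 0.109, so a sample that small would not be strong evidence of bias.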

Drug Overdose Deaths in the United States, 2003–2023 (cdc.gov)
The age-adjusted rate of drug overdose deaths declined 4.0% between 2022 and 2023, which follows a nonsignificant increase between 2021 and 2022 (1). Previously, rates had generally increased across most years over the period 2003–2023.

Public Health, Drug Abuse, Statistics, United States

21 points by djoldman 77 days ago | 8 comments

Are 1/3 of American Millennials Flat Earthers? (stackexchange.com)
A Forbes article and the University of Melbourne, among other sources, claim “Only Two-Thirds Of American Millennials Believe The Earth Is Round”, which seems to imply that one third of American Millennials are flat Earthers or similar.

Generations, Social Science, Statistics

7 points by onzeinternets 78 days ago | 3 comments

Economists don't know what's going on (economist.com)
The British government has launched an investigation into the Office for National Statistics. Last month the ONS found errors in some numbers that underpin its GDP calculations, and investors no longer trust its monthly jobs report. The episode hints at a wider trend: global economic data have become alarmingly poor.

Economics, Statistics, Government, Trust

92 points by pseudolus 79 days ago | 192 comments

San Francisco crime is down, way down (growsf.org)
Citywide crime in San Francisco is now at its lowest point in 23 years. And in the past year, San Francisco saw one of the biggest drops in crime among major U.S. cities, including a 45% drop in property crime in the first quarter of 2025 alone.

Crime, San Francisco, Statistics, Urban Planning

9 points by JumpCrisscross 84 days ago | 0 comments

A puzzle of two unreliable sensors (wordpress.com)
Suppose you are trying to measure a value P and you have two unreliable sensors. Sensor A returns 0.5P + 0.5U, where U is uniform random noise over the same domain as P. Sensor B will return either P or U with 50% likelihood. In other words, sensor A is a noisy measurement of your variable, and B is sometimes the correct value and sometimes pure noise.

Data Analysis, Sensors, Probability, Statistics, Randomness

18 points by tibbar 90 days ago | 10 comments
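
The two sensors in the puzzle above can be compared with a short simulation. This is a sketch under stated assumptions, not the linked post's analysis: the domain is taken to be [0, 1], P itself is drawn uniformly, each noise draw is independent, and the function name, sample size, and seed are hypothetical.

```python
import random


def simulate_sensors(n=100_000, seed=0):
    """Return the mean absolute error of sensor A and of sensor B."""
    rng = random.Random(seed)
    err_a = err_b = 0.0
    for _ in range(n):
        p = rng.random()  # true value, assumed uniform on [0, 1]
        # Sensor A: equal blend of the true value and uniform noise.
        a = 0.5 * p + 0.5 * rng.random()
        # Sensor B: the true value half the time, pure noise otherwise.
        b = p if rng.random() < 0.5 else rng.random()
        err_a += abs(a - p)
        err_b += abs(b - p)
    return err_a / n, err_b / n


mean_err_a, mean_err_b = simulate_sensors()
```

Under these assumptions both sensors have the same expected absolute error, 1/6, because the expected distance between two independent uniform draws on [0, 1] is 1/3. The sensors differ in how that error is distributed: A is always somewhat off, while B is either exact or arbitrarily wrong.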

Markov Chain Monte Carlo Without All the Bullshit (2015) (jeremykun.com)
I have a little secret: I don’t like the terminology, notation, and style of writing in statistics. I find it unnecessarily complicated.

Statistics, Machine Learning, Mathematics

229 points by ibobev 90 days ago | 48 comments

Prevalence and Early Identification of ASD Among Children Aged 4 and 8 Years (cdc.gov)
Prevalence of ASD among children aged 8 years was higher in 2022 than in previous years.

Children's Health, Public Health, Statistics

3 points by bookofjoe 90 days ago | 1 comment

Monte Carlo Crash Course: Sampling (thenumb.at)
In the previous chapter, we assumed that we can uniformly randomly sample our domain. However, it’s not obvious how to actually do so—in fact, how can a deterministic computer even generate random numbers?

Probability, Computer Science, Statistics

108 points by ibobev 91 days ago | 20 comments
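
The question raised in the entry above, how a deterministic computer can produce random-looking numbers, can be illustrated with a linear congruential generator. This sketch is not necessarily the approach the linked chapter takes; the class name is hypothetical, and the multiplier and increment are the commonly cited Numerical Recipes constants for a 32-bit modulus.

```python
class LCG:
    """Linear congruential generator: state = (a * state + c) mod m."""

    A = 1664525
    C = 1013904223
    M = 2**32

    def __init__(self, seed=0):
        self.state = seed % self.M

    def random(self):
        """Advance the state and return a float in [0, 1)."""
        self.state = (self.A * self.state + self.C) % self.M
        return self.state / self.M


gen = LCG(seed=42)
samples = [gen.random() for _ in range(5)]
```

The same seed always reproduces the same sequence, which is why such output is called pseudorandom. A generator this simple is useful for illustration but is not suitable for cryptography or demanding simulations.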