Hacker News with Generative AI: Human-AI Comparison

I compared my daughter against SOTA models on math puzzles (michalprzadka.com)
I created an AI math reasoning benchmark using puzzles from this year’s GMIL competition, a long-running international mathematical challenge that I participated in myself back in 1998. Some of the most advanced AI models performed comparably to my 11-year-old daughter, while others struggled significantly. The experiment offers some amusing insights into current AI capabilities in mathematical reasoning, especially when compared to human performance at the middle school level.

Artificial Intelligence, Math, Education, Benchmarking, Human-AI Comparison

15 points by przadka 535 days ago | 3 comments