Hacker News with Generative AI: Web Crawlers

How crawlers impact the operations of the Wikimedia projects (wikimedia.org)
Since the beginning of 2024, the demand for the content created by the Wikimedia volunteer community – especially for the 144 million images, videos, and other files on Wikimedia Commons – has grown significantly. In this post, we’ll discuss the reasons for this trend and its impact.
Nepenthes is a tarpit to catch AI web crawlers (zadzmo.org)
This is a tarpit intended to catch web crawlers. Specifically, it's targeting crawlers that scrape data for LLMs, but really, like the plants it is named after, it'll eat just about anything that finds its way inside.
Reddit's robots.txt disallows all web crawlers (reddit.com)
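For readers unfamiliar with what a blanket disallow looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text, the `ExampleBot` user agent, the `example.com` URL, and the `is_allowed` helper are all illustrative assumptions, not copied from Reddit or any other real site.

```python
from urllib.robotparser import RobotFileParser

# Assumed blanket-disallow policy, written for illustration only.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical crawler name and URL.
    print(is_allowed(DISALLOW_ALL, "ExampleBot", "https://example.com/any/path"))
```

Note that robots.txt is advisory: a well-behaved crawler runs a check like this before each request, but nothing forces a crawler to honor it.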