DEDA – Tracking Dots Extraction, Decoding and Anonymisation Toolkit(github.com/dfd-tud) Document Colour Tracking Dots, or yellow dots, are small systematic dots that encode information about the printer and/or the printout itself. This mechanism is built into almost every commercial colour laser printer, which means that almost every printout contains coded information about the source device, such as its serial number.
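DEDA's actual pipeline is more involved (dot-grid detection and pattern matching against known vendor encodings), but the first step, isolating the faint yellow dots, can be sketched as simple channel thresholding. A toy illustration with made-up thresholds and a synthetic "scan" standing in for a real image:

```python
import numpy as np

def yellow_dot_mask(rgb, r_min=200, g_min=200, b_max=120):
    """Mark pixels that look like faint yellow tracking dots:
    high red and green, low blue. Thresholds are illustrative only."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (r >= r_min) & (g >= g_min) & (b <= b_max)

# Synthetic 64x64 "scan": a white page with a few yellow dots dropped in.
page = np.full((64, 64, 3), 255, dtype=np.uint8)
for y, x in [(10, 12), (10, 40), (30, 12), (52, 40)]:
    page[y, x] = (255, 255, 60)   # a yellow pixel

mask = yellow_dot_mask(page)
coords = [tuple(map(int, c)) for c in zip(*np.nonzero(mask))]
print(coords)   # → [(10, 12), (10, 40), (30, 12), (52, 40)]
```

On real scans the dots are far subtler, so DEDA works at high scan resolution; recovering the serial number then means mapping the detected dot grid onto a vendor-specific encoding.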
375 points by lucgommans 6 days ago | 181 comments
CNCF Git Data Miner(github.com/cncf) This is the Cloud Native Computing Foundation's fork of Jon Corbet and Greg KH's gitdm tool for calculating contributions based on developers and their companies.
Matrix Profiles(aneksteind.github.io) Lately I’ve been thinking about time series analysis to aid in Reflect’s insights features. Towards this end, I’ve had a Hacker News thread about anomaly detection bookmarked in Later. I finally got to looking at it and there was a comment that mentioned the article left out matrix profiles, which I had never heard of, so I decided to look into them.
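For context, a matrix profile records, for each fixed-length subsequence of a time series, the distance to its nearest-neighbour subsequence elsewhere in the series: low values mark repeated motifs, high values mark anomalies (discords). A brute-force sketch in Python (window size and toy series are made up for illustration; real workloads would use an optimized library such as stumpy):

```python
import numpy as np

def matrix_profile(ts, m):
    """Brute-force matrix profile: for each length-m subsequence, the
    z-normalized Euclidean distance to its nearest non-trivial match."""
    n = len(ts) - m + 1
    # z-normalize every subsequence so matching is shape-based, not level-based
    subs = np.array([ts[i:i + m] for i in range(n)], dtype=float)
    subs = (subs - subs.mean(axis=1, keepdims=True)) / subs.std(axis=1, keepdims=True)
    profile = np.empty(n)
    for i in range(n):
        dists = np.linalg.norm(subs - subs[i], axis=1)
        # exclude trivial matches: subsequences overlapping the query
        lo, hi = max(0, i - m // 2), min(n, i + m // 2 + 1)
        dists[lo:hi] = np.inf
        profile[i] = dists.min()
    return profile

# A repeating sine wave with one injected spike: the anomaly surfaces
# as the largest value in the profile.
t = np.sin(np.linspace(0, 8 * np.pi, 400))
t[200:210] += 3.0                  # the anomaly
mp = matrix_profile(t, m=25)
print(int(np.argmax(mp)))          # an index near the injected spike
```

The brute-force version is O(n^2) in the number of subsequences; the appeal of the matrix profile literature is that algorithms like STOMP and SCRIMP++ compute the same result fast enough for long series.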
Some Reflections After a Month of Tracking My Own Online Activity(mcwhittemore.com) Since 8:38 PM on February 22nd, I’ve been recording all my browsing activity in a database I manage using a custom-built browser extension and a wrapper around @rosskevin/ifvisible. The result? I now have a clear picture of just how much time I’ve spent on the web this past month. And, well… I spend a lot of time reading email. Go figure.
Honking Complaints Plunge 69% Inside Congestion Pricing Zone(thecity.nyc) Honking-mad motorists are laying off the horn in the core of Manhattan since the January launch of congestion pricing, data reveals — with New Yorkers’ beefs about blaring horns plummeting nearly 70% from the same time last year.
The Business of Phish (2013)(priceonomics.com) Over the past four years, the rock band Phish has generated over $120 million in ticket sales, handily surpassing more well known artists like Radiohead, The Black Keys, and One Direction.
32 points by cassianoleal 16 days ago | 45 comments
Cloudflare Analyzes Login Credentials(benjojo.co.uk) Based on Cloudflare's observed traffic between September and November 2024, 41% of successful logins across websites protected by Cloudflare involve compromised passwords.
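One common way services detect compromised passwords without shipping plaintext breach data around is a k-anonymity-style hashed lookup, as popularized by Have I Been Pwned: the client reveals only a short hash prefix and matches the suffix locally. A minimal sketch of the general technique (this illustrates the pattern, not Cloudflare's specific implementation; the toy corpus stands in for a real breach API):

```python
import hashlib

# Toy breach corpus indexed by 5-character SHA-1 prefix.
BREACHED = {"hunter2", "password123", "letmein"}
BREACH_INDEX = {}
for pw in BREACHED:
    digest = hashlib.sha1(pw.encode()).hexdigest().upper()
    BREACH_INDEX.setdefault(digest[:5], set()).add(digest[5:])

def is_compromised(password):
    """k-anonymity check: only the 5-char hash prefix would cross the
    network; the full-hash suffix is compared against returned candidates."""
    digest = hashlib.sha1(password.encode()).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    return suffix in BREACH_INDEX.get(prefix, set())

print(is_compromised("hunter2"))        # True
print(is_compromised("correct horse"))  # False (not in the toy corpus)
```

Because many distinct hashes share any given 5-character prefix, the lookup service learns little about which password was actually checked.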
Using traditional ML and LLMs to analyze Executive Orders (1789 – 2025)(hyperarc.com) Executive orders have been making the news recently, but aside from basic counts and individual analysis, it’s been hard to make sense of the entirety of all 11,000 accessible documents — especially for numerical analysis and trending. Thankfully we have LLMs to help with that.
DOGE Makes Its Latest Errors Harder to Find(nytimes.com) Elon Musk’s Department of Government Efficiency has repeatedly posted error-filled data that inflated its success at saving taxpayer money. But after a series of news reports called out those mistakes, the group changed its tactics.
168 points by PourquoiPas 35 days ago | 20 comments
Show HN: Telescope – an open-source web-based log viewer for logs in ClickHouse(github.com/iamtelescope) Telescope is a web application designed to provide an intuitive interface for exploring log data. It is built to work with any type of logs, as long as they are stored in ClickHouse. Users can easily configure connections to their ClickHouse databases and run queries to filter, search, and analyze logs efficiently.
ChatGPT clicks convert 6.8x higher than Google organic(medium.com) Here’s the deal: I recently dug into some data in GA4 for our website and found that while Google Organic brings in more traffic, ChatGPT clicks convert way better, 6.8x better for free trial conversions, to be exact.
189 points by cainxinth 37 days ago | 263 comments
Winners of the $10k ISBN visualization bounty(annas-archive.org) A few months ago we announced a $10,000 bounty to make the best possible visualization of our data showing the ISBN space. We emphasized showing which files we have/haven’t archived already, and we later released a dataset describing how many libraries hold each ISBN (a measure of rarity).
The Deep Research problem(ben-evans.com) Most of what I do for a living is research and analysis. I think of data I’d like to see and go looking for it; I compile and collate it, make charts, decide they’re boring and try again, find new ways and new data to understand and explain the issue, and produce text and charts that try to express what I’m thinking. Then I go and talk to people about it.