Hacker News with Generative AI: Text Extraction

Show HN: Kreuzberg – Modern async Python library for document text extraction (github.com/Goldziher)
Kreuzberg is a Python library for text extraction from documents. It provides a unified async interface for extracting text from PDFs, images, office documents, and more.
Ask HN: What is the best method for turning a scanned book as a PDF into text? (ycombinator.com)
I like reading philosophy, particularly directly from the authors rather than through secondhand accounts.
Show HN: PDF to MD by LLMs – Extract Text/Tables/Image Descriptives by GPT4o (github.com/yigitkonur)
Swift OCR: LLM Powered Fast OCR ⚡
Extracting Words from Scanned Books: A Step-by-Step Tutorial with Python, OpenCV (github.com/feitgemel)