Hacker News with Generative AI: Vision Models

How do open source VLMs perform at OCR (getomni.ai)
For several months, we’ve been evaluating how well vision models handle OCR. Our initial benchmark focused on the closed-source models (GPT, Gemini, and Claude) and how they compare to traditional OCR providers (AWS, Azure, GCP, etc.).

Open Source, Vision Models, OCR

4 points by tosh 483 days ago | 0 comments

Improving Accessibility Using Vision Models (myswamp.substack.com)
One project I worked on recently was migrating a massive set of math courses from one platform to another. Along the way, we realized some of our math courses had not been updated in quite some time, and some schools were still using these courses to teach.

Accessibility, Vision Models, Education

63 points by bearjaws 660 days ago | 11 comments