Hacker News with Generative AI: Image Recognition

Pixtral Large (mistral.ai)
Today we announce Pixtral Large, a 124B open-weights multimodal model built on top of Mistral Large 2. Pixtral Large is the second model in our multimodal family and demonstrates frontier-level image understanding.
Claude can now view images within a PDF (twitter.com)
Implementing neural networks on the "3 cent" 8-bit microcontroller (wordpress.com)
Buoyed by the surprisingly good performance of neural networks with quantization-aware training on the CH32V003, I wondered how far this can be pushed. How much can we compress a neural network while still achieving good test accuracy on the MNIST dataset?
Kagi Update: AI Image Filter for Search Results (kagi.com)
As AI-generated images become increasingly prevalent across the web, many users find their image search results cluttered with artificial content. This can be particularly frustrating when searching for authentic, human-created images or specific real-world references.
Answer any question about your photo albums with OmniQuery (jiahaoli.net)
OmniQuery enables free-form question answering on personal memories (i.e., private data in albums) with RAG. Specifically, it applies contextual data augmentation (taxonomy-based) to enhance the retrieval accuracy, and uses LLMs to generate answers based on the retrieved memory instances.
Can you tell if these images are real or generated? (britannicaeducation.com)
Show HN: Cluttr – A local first utility to make images searchable using Ollama (cluttr.ai)
Ultra simplified "MNIST" in 60 lines of Python with NumPy (github.com/tonio-m)
Google's Nonconsensual Explicit Images Problem Is Getting Worse (wired.com)
Image Self Supervised Learning on a Shoestring (theadamcolton.github.io)
The super effectiveness of Pokémon embeddings using only raw JSON and images (minimaxir.com)
Matryoshka Representation Learning with CLIP (marqo.ai)
Just got doxxed to within 15 miles by a vision model, from only a single photo (twitter.com)
CatLIP: CLIP Vision Accuracy with 2.7x Faster Pre-Training on Web-Scale Data (arxiv.org)