Hacker News with Generative AI: Image Recognition

Homomorphic encryption in iOS 18 (boehs.org)
You are Apple. You want to make search work like magic in the Photos app, so the user can find all their “dog” pictures with ease. You devise a way to numerically represent the concepts of an image, so that you can find how closely images are related in meaning. Then, you create a database of known images and their numerical representations (“this number means car”), and find the closest matches. To preserve privacy, you put this database on the phone.
How we used GPT-4o for image detection with 350 similar illustrations (pages.dev)
Show HN: Pixie – A tool to shop for clothes using pictures (ShopWithPixie.com)
Apple auto-opts everyone into having their photos analyzed by AI for landmarks (theregister.com)
Apple last year deployed a mechanism for identifying landmarks and places of interest in images stored in the Photos application on its customers' iOS and macOS devices and enabled it by default, seemingly without explicit consent.
DreamSim: Learning New Dimensions of Human Visual Similarity (2023) (dreamsim-nights.github.io)
Which image, A or B, is most similar to the reference? We generate a new benchmark of synthetic image triplets that span a wide range of mid-level variations, labeled with human similarity judgments.
Pixtral Large (mistral.ai)
Today we announce Pixtral Large, a 124B open-weights multimodal model built on top of Mistral Large 2. Pixtral Large is the second model in our multimodal family and demonstrates frontier-level image understanding.
Claude can now view images within a PDF (twitter.com)
Implementing neural networks on the "3 cent" 8-bit microcontroller (wordpress.com)
Buoyed by the surprisingly good performance of neural networks with quantization-aware training on the CH32V003, I wondered how far this can be pushed. How much can we compress a neural network while still achieving good test accuracy on the MNIST dataset?
Kagi Update: AI Image Filter for Search Results (kagi.com)
As AI-generated images become increasingly prevalent across the web, many users find their image search results cluttered with artificial content. This can be particularly frustrating when searching for authentic, human-created images or specific real-world references.
Answer any question about your photo albums with OmniQuery (jiahaoli.net)
OmniQuery enables free-form question answering on personal memories (i.e., private data in albums) with RAG. Specifically, it applies contextual data augmentation (taxonomy-based) to enhance the retrieval accuracy, and uses LLMs to generate answers based on the retrieved memory instances.
Can you tell if these images are real or generated? (britannicaeducation.com)
Show HN: Cluttr – A local first utility to make images searchable using Ollama (cluttr.ai)
Ultra simplified "MNIST" in 60 lines of Python with NumPy (github.com/tonio-m)
Google's Nonconsensual Explicit Images Problem Is Getting Worse (wired.com)
Image Self Supervised Learning on a Shoestring (theadamcolton.github.io)
The super effectiveness of Pokémon embeddings using only raw JSON and images (minimaxir.com)
Matryoshka Representation Learning with CLIP (marqo.ai)
Just got doxxed to within 15 miles by a vision model, from only a single photo (twitter.com)
CatLIP: CLIP Vision Accuracy with 2.7x Faster Pre-Training on Web-Scale Data (arxiv.org)