Advancements in embedding-based retrieval at Pinterest Homefeed
(medium.com)
At Pinterest Homefeed, embedding-based retrieval (a.k.a. Learned Retrieval) is a key candidate generator for retrieving highly personalized, engaging, and diverse content that fulfills various user intents and enables multiple forms of actionability, such as Pin saving and shopping.
DeepSeek's cutoff date is July 2024: We extracted DeepSeek's system prompt
(knostic.ai)
We extracted DeepSeek’s system prompt; below, we show how we did it and what we found. It isn't inherently hidden by design, but it's certainly interesting.
Add "fucking" to your Google searches to neutralize AI summaries
(gizmodo.com)
If you are tired of Google’s AI-powered search results leading you astray with poor information from bad sources, there is some good news. It turns out that if you include any expletives in your search query, Google will not return an AI Overview, as they are called, at the top of the results page.
Supercharge vector search with ColBERT rerank in PostgreSQL
(vectorchord.ai)
Traditional vector search methods typically employ sentence embeddings to locate similar content. However, generating sentence embeddings through pooling token embeddings can potentially sacrifice fine-grained details present at the token level. ColBERT overcomes this by representing text as token-level multi-vectors rather than a single, aggregated vector. This approach, leveraging contextual late interaction at the token level, allows ColBERT to retain more nuanced information and improve search accuracy compared to methods relying solely on sentence embeddings.
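As a rough illustration of late interaction, here is a minimal NumPy sketch of ColBERT's MaxSim scoring: each query token embedding takes its maximum similarity over all document token embeddings, and those maxima are summed. The embeddings below are toy values, not output from a real ColBERT model.

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """ColBERT-style late interaction: for each query token embedding,
    take its max similarity over all document token embeddings, then sum."""
    # Normalize rows so dot products are cosine similarities.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T  # shape: (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())

# Toy example: 2 query token vectors, two candidate documents.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_a = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])  # matches both tokens
doc_b = np.array([[0.5, 0.5], [0.4, 0.6]])              # matches neither well
```

Because each query token is matched against its best document token individually, a document covering all query tokens (like `doc_a`) outscores one that only partially matches, detail a single pooled vector can blur.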
Anthropic – Citations
(anthropic.com)
Claude is capable of providing detailed citations when answering questions about documents, helping you track and verify information sources in responses.
VideoRAG: Retrieval-Augmented Generation over Video Corpus
(arxiv.org)
Retrieval-Augmented Generation (RAG) is a powerful strategy to address the issue of generating factually incorrect outputs in foundation models by retrieving external knowledge relevant to queries and incorporating it into their generation process.
How outdated information hides in LLM token generation probabilities
(anj.ai)
The internet usually has the correct answer somewhere, but it’s also full of conflicting and outdated information. How do large language models (LLMs) such as ChatGPT, trained on internet scale data, handle cases where there’s conflicting or outdated information? (Hint: it’s not always the most recent answer as of the knowledge cutoff date; think about what LLMs are trained to do)
Wikipedia searches reveal differing styles of curiosity
(scientificamerican.com)
Mapping explorers of Wikipedia rabbit holes revealed three different styles of human inquisitiveness: the “busybody,” the “hunter” and the “dancer”
Embedding Models for Information Retrieval in 2025
(datastax.com)
The just-released Voyage-3-large is the surprise leader in embedding relevance
RAG a 40GB Outlook inbox – Long term Staff member leaving, keeping knowledge
(reddit.com)
I've been fascinated by this concept since the early days of AI, and using ChatGPT has made it feel incredibly achievable, and I've only just understood the concept of RAG. The idea is to leverage a local LLM paired with an open web UI to create vector or other databases of the inbox
Unifying Generative and Dense Retrieval for Sequential Recommendation
(arxiv.org)
Sequential dense retrieval models utilize advanced sequence learning techniques to compute item and user representations, which are then used to rank relevant items for a user through inner product computation between the user and all item representations.
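The ranking step described above can be sketched in a few lines: score every item by the inner product of its embedding with the user embedding, then sort. The vectors here are toy values standing in for learned representations.

```python
import numpy as np

def rank_items(user_vec: np.ndarray, item_vecs: np.ndarray, k: int = 3):
    """Dense-retrieval ranking: score each item by inner product with the
    user representation and return top-k item indices, best first."""
    scores = item_vecs @ user_vec      # one score per item
    return np.argsort(-scores)[:k].tolist()

# Toy catalog: 4 items in a 3-dim embedding space.
items = np.array([
    [0.1, 0.9, 0.0],
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.6],
    [0.0, 0.0, 1.0],
])
user = np.array([0.9, 0.1, 0.2])
```

At production scale this exact inner-product search is typically served with an approximate nearest-neighbor index rather than scoring every item, which is precisely the external dependency the generative-retrieval paradigm tries to remove.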
Prism: Manipulating Concepts in Latent Space
(thesephist.com)
Foundation models gesture at a way of interacting with information that’s at once more natural and powerful than “classic” knowledge tools. But to build the kind of rich, directly interactive information interfaces I imagine, current foundation models and embeddings are far too opaque to humans.
The Anatomy of a Large-Scale Hypertextual Web Search Engine (1998)
(infolab.stanford.edu)
The web creates new challenges for information retrieval. The amount of information on the web is growing rapidly, as well as the number of new users inexperienced in the art of web research.
A new product that solves your tab hoarding problem and forgotten saved items
(getstasher.com)
Stasher saves what you browse and brings up related links just when you need them
The Tao of Topic Maps (2000)
(ontopia.net)
Someone once said that “a book without an index is like a country without a map”.
Understanding the BM25 full text search algorithm
(emschwartz.me)
BM25, or Best Match 25, is a widely used algorithm for full text search. It is the default in Lucene/Elasticsearch and SQLite, among others. Recently, it has become common to combine full text search and vector similarity search into "hybrid search". I wanted to understand how full text search works, and specifically BM25, so here is my attempt at understanding by re-explaining.
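In the same re-explaining spirit, here is a compact, self-contained sketch of the classic BM25 scoring formula (term frequency saturated by `k1`, document length normalized by `b`); real engines like Lucene tune and cache these statistics rather than recomputing them per query.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Score one document against a query with classic BM25.
    `doc` and each corpus entry are pre-tokenized lists of terms."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        # Document frequency: how many documents contain the term.
        df = sum(1 for d in corpus if term in d)
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        f = tf[term]  # term frequency in this document
        denom = f + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * f * (k1 + 1) / denom
    return score

corpus = [["the", "cat", "sat"], ["the", "dog", "ran"], ["cat", "and", "dog"]]
```

A document containing a query term always outscores one that lacks it, and rarer terms (higher IDF) contribute more, which is the intuition hybrid search combines with vector similarity.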
Ask HN: Niche technical knowledge not found on the internet?
(ycombinator.com)
What niche subjects are you interested in for which the knowledge is hard to come by on the internet?
Ask HN: The Web Post ChatGPT?
(ycombinator.com)
The humble chat thread is rapidly becoming the de facto interface to information for so many right now.
Ask HN: Local RAG with private knowledge base
(ycombinator.com)
Looking for a free, local, open source RAG solution for running a reference library with 1000s of technical PDFs and word docs.
The Knowledge Graph: things, not strings (2012)
(google)
Search is a lot about discovery—the basic human need to learn and broaden your horizons. But searching still requires a lot of hard work by you, the user. So today I’m really excited to launch the Knowledge Graph, which will help you discover new information quickly and easily.
Bridging Search and Recommendation in Generative Retrieval
(dl.acm.org)
Generative retrieval for search and recommendation is a promising paradigm for retrieving items, offering an alternative to traditional methods that depend on external indexes and nearest-neighbor searches.
NotebookLM launches feature to customize and guide audio overviews
(google)
NotebookLM is a tool for understanding, built with Gemini 1.5. When you upload your sources, it instantly becomes an expert, grounding its responses in your material and giving you powerful ways to transform information. And since it’s your notebook, your personal data is never used to train NotebookLM.
Phrase matching in Marginalia Search
(marginalia.nu)
Marginalia Search now properly supports phrase matching. This not only permits a more robust implementation of quoted search queries, but also helps promote results where the search terms occur in the document exactly in the same order as they do in the query.
A new semantic chunking approach for RAG
(gpt3experiments.substack.com)
As we saw in my last blog post, there is a shape for stories.
Two kinds of LLM responses: Informational vs. Instructional
(shabie.github.io)
When thinking of LLM evals especially in the context of RAGs, it occurred to me that there are two kinds of distinct responses people get from LLMs: informational and instructional.
"As We May Think" by Vannevar Bush (1945)
(theatlantic.com)
As Director of the Office of Scientific Research and Development, Dr. Vannevar Bush has coordinated the activities of some six thousand leading American scientists in the application of science to warfare. In this significant article he holds up an incentive for scientists when the fighting has ceased. He urges that men of science should then turn to the massive task of making more accessible our bewildering store of knowledge.