Embeddings - The Foundation of Semantic Search
Why do we need them?
We’ve got summaries.
We’ve extracted the keywords and can search through them.
However, we need to upgrade our game if we want to answer user questions.
This is where embeddings come in!
This is where we are in the process of creating LexGPT!
Why embeddings?
Embeddings are the foundation of semantic search. They convert text (e.g. summaries or sentences) into lists of numbers, which we call vectors.
Transforming a question into an embedding.
These vectors capture the essence and context of the text and, most importantly, they allow us to search for similar documents.
Searching for texts that have similar meanings is called semantic search.
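To make "similar" concrete, here is a minimal sketch of how two vectors can be compared using cosine similarity, one common choice for this. The three-dimensional vectors and document titles below are made up for illustration; a real embedding model would produce them for you, with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written vectors. They are NOT the output of a real model.
documents = {
    "advice for young people": [0.9, 0.1, 0.0],
    "career tips for students": [0.8, 0.2, 0.1],
    "how rockets reach orbit": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a question about life advice.
query = [0.9, 0.05, 0.0]

# Rank documents from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query, documents[title]),
    reverse=True,
)

for title in ranked:
    print(f"{cosine_similarity(query, documents[title]):.3f}  {title}")
```

The two advice-like documents score close to 1, while the rocket document scores close to 0. That is semantic search in a nutshell: rank by how closely the vectors point in the same direction.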
An example space of embeddings can look something like this:
An example of embedding space
Each point on the graph represents one document from a given category. We can see that similar documents are grouped close together.
This is the mind-blowing property of text embeddings!
Embeddings for question-answering
What’s even crazier?
We can embed user questions into this space and look for documents that answer the user’s questions!
This is what the process looks like:
The user asks a question related to the podcast, e.g. “What’s the best advice for young people?”
We embed the question into the embedding space and look for documents “similar” to the question.
We provide these documents as context for the LLM.
We get an answer based on the question and documents.
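The steps above can be sketched in a few lines of Python. This is a rough illustration under loud assumptions, not the code behind the app: `embed` is a toy word-count stand-in for a real embedding model (it only matches shared words, not meaning), the summaries are invented, and `retrieve` and `build_prompt` are hypothetical helper names. The final call to an LLM is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse vector of word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    question_vector = embed(question)
    return sorted(
        documents,
        key=lambda doc: cosine_similarity(question_vector, embed(doc)),
        reverse=True,
    )[:k]


def build_prompt(question, context_docs):
    """Combine the retrieved documents and the question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Invented summaries, for illustration only.
summaries = [
    "The guest shares advice for young people: work hard and stay curious.",
    "A discussion about rocket engines and reaching orbit.",
    "The best advice the guest received was to read widely.",
]

question = "What's the best advice for young people?"
top_docs = retrieve(question, summaries, k=2)
prompt = build_prompt(question, top_docs)
print(prompt)  # this prompt would then be sent to the LLM of your choice
```

In a real setup you would swap `embed` for a proper embedding model and store the document vectors once instead of recomputing them for every question.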
Simple - yet powerful!
This is the tenth day of the 30-day AI challenge. We are 1/3 through!
Over the next month, I will be building the Lex Fridman AI engine with you!
If you're reading this, I assume you'd like to build things. If you stick with this newsletter, after a month you will have a running project and know the technology you need to build AI apps.
I've recently built PodcastGPT and want to share the process with the community. If you haven't seen the app yet, you can get access here: PodcastGPT
This is all for now! See you tomorrow.
Stay focused!
Luke