Lex Fridman Podcast Transcripts
How to get them?
Finally!
We get to the building part!
Hope you're just as excited as I am!
All of the project code is available in the lex-gpt GitHub repository, so make sure to clone it to follow along. The project's Python dependencies are handled via Poetry, so setup should be fairly easy.
If you don’t know how to configure the project, use this prompt in ChatGPT:
How to start using python project that has poetry.lock and pyproject.toml files already inside it?
For IDE, I recommend using VSCode, because it does a lot of stuff automatically for you and integrates with Poetry fairly well.
The first step of our project will be getting Lex Fridman Podcast transcripts:

We'll do that using the YouTube Transcription API.
Today, I will show you how the YouTube Transcription API works and how we'll use it to get all of the transcripts.
First, you need to get your YouTube API Key. If you don't know how to get it, follow this video.
When you have your API key, add it to your .env file like this:

Make sure you don’t share your API key with anyone!
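If you want to read the key back in Python, here is a minimal sketch. It assumes the key is stored under the name YOUTUBE_API_KEY, which is a hypothetical name used only for illustration, so use whatever name your .env file has. The load_env_file helper is also my own illustration, not something from the repository.

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values: Dict[str, str] = {}
    for line in env_path.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env", name: str = "YOUTUBE_API_KEY") -> str:
    """Return the key from the process environment, falling back to the .env file.

    YOUTUBE_API_KEY is an assumed variable name, not one confirmed by the project.
    """
    key = os.environ.get(name) or load_env_file(path).get(name)
    if not key:
        raise RuntimeError(f"{name} is not set in the environment or in {path}")
    return key
```

Keeping the key in .env (and keeping .env out of version control) means it never ends up in a notebook or a commit.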
The Python library that lets us get transcripts of YouTube videos is called youtube-transcript-api (it is already in your Poetry environment).
Now, let's get the transcript of the latest Lex Fridman Podcast episode:

In order to do it, we'll need to use the video_id of the episode. It's available in the URL of the video.
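If you'd rather not copy the ID by hand, here is a small sketch that pulls it out of the URL using only the standard library. The extract_video_id helper is my own illustration, and it only handles the two common URL shapes (youtube.com/watch?v=... and youtu.be/...).

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video_id from a YouTube URL, or None if it can't be found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Short links: https://youtu.be/<video_id>
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    # Regular links: https://www.youtube.com/watch?v=<video_id>
    if host == "youtube.com" or host.endswith(".youtube.com"):
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

The video_id is the value of the v query parameter, so extra parameters such as a start time don't get in the way.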
Getting the transcript is as simple as:
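Here is a hedged sketch of that call. It is not necessarily the exact code from the notebook. As far as I know, older releases of youtube-transcript-api expose a static get_transcript method, while newer releases use an instance fetch method, so the sketch tries both. The fetch_transcript and join_text helpers are hypothetical names of my own.

```python
from typing import Any, Dict, List


def fetch_transcript(video_id: str) -> List[Dict[str, Any]]:
    """Fetch a transcript as a list of {"text", "start", "duration"} dicts."""
    # Imported here so the helpers below work even if the library is missing.
    from youtube_transcript_api import YouTubeTranscriptApi

    # Older releases: static method that returns a list of dicts directly.
    if hasattr(YouTubeTranscriptApi, "get_transcript"):
        return YouTubeTranscriptApi.get_transcript(video_id)
    # Newer releases: instance method; to_raw_data() converts to a list of dicts.
    return YouTubeTranscriptApi().fetch(video_id).to_raw_data()


def join_text(snippets: List[Dict[str, Any]]) -> str:
    """Join the text of all transcript snippets into one string."""
    return " ".join(s["text"].strip() for s in snippets)


# Usage (needs network access; replace VIDEO_ID with the ID from the URL):
# transcript = fetch_transcript("VIDEO_ID")
# print(join_text(transcript)[:500])
```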

The code is available in the day-6.ipynb notebook. Each transcript entry has a start timestamp and a duration.
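To make those two fields easier to work with, here is a small sketch. It assumes each entry is a dict with text, start, and duration keys, with times in seconds. Both helper names are hypothetical and used only for illustration.

```python
from typing import Any, Dict, List


def format_timestamp(seconds: float) -> str:
    """Convert seconds into an HH:MM:SS string (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def with_end_times(snippets: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the snippets with an added "end" field (start + duration)."""
    return [{**s, "end": s["start"] + s["duration"]} for s in snippets]
```

Having an explicit end time will come in handy when we group entries by video segment.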
Nice! That's how we got a transcript for a single video.
But wait... the Lex Fridman Podcast has 393 episodes. We can't fetch them one by one like that!
Yes... This is exactly what I'll show you tomorrow!
In the next episode, we'll:
- collect the metadata for all of the videos (video_ids, segment timestamps, etc.)
- get the transcripts for all of the videos
- group the transcripts per video segment.
All of that by using ChatGPT!
This is the sixth day of the 30-day AI challenge.
Over the next month, I will be building the Lex Fridman AI engine with you!
If you're reading this, I assume you'd like to build things. If you stick with this newsletter, after a month you will have a running project and know the technology you need to build AI apps.
I've recently built PodcastGPT and want to share the process with the community. If you haven't seen the app yet, you can get access here: PodcastGPT
This is all for now! See you tomorrow.
Stay focused!
Luke