🧠I just cloned my voice with AI
It's dead simple
Hello from the heights!
It’s Luke Skyward, The AI Synthesizer!
Recent X engagement (or rather the lack of it) got me thinking about diversifying more and returning to YouTube.
I’ve already made a few videos, but I found creating them time-consuming, so I focused solely on X.
I cannot let this happen again! I’ll try to automate as much of the video creation process as possible this time.
One of the things I find most tedious and time-consuming?
Recording voiceovers.
That’s why, in today’s newsletter, I’ll show you how I cloned my voice to automate voiceovers for my future videos.
Ready to join me on this journey? Let’s do it!
Quick plug: If you find leveraging AI in your life challenging, you can book a free 1-1 call with me.
Let’s see if I can help you with your problems!
Cloning your voice with ElevenLabs
ElevenLabs is the best AI tool for voice generation - hands down.
They recently reached unicorn status (a valuation of over $1 billion).
That’s why I chose ElevenLabs to clone my voice.
In the next section, I’ll show you how to clone your voice so you don’t have to record voiceovers yourself again.
Here’s a sneak peek of what I’ve been able to achieve:
How to clone your voice - step-by-step
To clone your voice, you’ll need to subscribe to a $5 plan, but you get the first month for $1. This lets you experiment with voice cloning almost for free if you’re not planning to use it a lot:
The ElevenLabs Starter plan includes Instant Voice Cloning
Even so, I think $5 is not a lot if it saves you hours of re-recording a voiceover until you get the right take (I struggled with this a lot).
The feature that lets us clone our voice is called Instant Voice Cloning.
Inside the UI, we need to select Voices > Create:
Select Add Generative or Cloned Voice and then Instant Voice Cloning:
Next, we need to upload at least one minute of audio. ElevenLabs also lets us record it on the spot:
To get as realistic a voice as possible, we need to cover as many sounds as possible. Here’s a script that you can use when recording:
"In the bustling city, people hurriedly pass by, lost in their own worlds."
"The serene sunset painted the sky with hues of orange, pink, and purple."
"Sizzling bacon and freshly brewed coffee filled the kitchen with enticing aromas."
"Gentle waves lapped against the shore, creating a soothing symphony of nature."
"A mysterious figure emerged from the shadows, casting a long silhouette on the dimly lit alley."
"In the heart of the forest, the rustling leaves whispered secrets of ancient times."
"With a burst of laughter, friends gathered around a crackling bonfire under the starlit night."
"The rhythmic clatter of a distant train echoed through the quiet countryside."
"As raindrops tapped on the window pane, a cozy blanket of warmth enveloped the room."
"A melodic piano played softly, filling the room with a sense of nostalgia."
"A charismatic storyteller captivated the audience with tales of adventure and intrigue."
"The chirping of crickets signaled the arrival of a tranquil summer evening."
"An enthusiastic crowd cheered as fireworks illuminated the night sky."
"The professor passionately explained complex theories, engaging the eager students."
"A child's laughter echoed in the playground, echoing the innocence of youth."
"Thunder rumbled in the distance, heralding the arrival of a summer storm."
"A skilled chef expertly chopped vegetables, creating a symphony of culinary sounds."
"A news anchor delivered breaking headlines with a calm and authoritative voice."
"The bustling market was alive with the chatter of vendors and the rustle of shoppers."
"Soft whispers of wind danced through the autumn leaves, creating a gentle melody."
"A lone wolf's howl echoed through the moonlit wilderness, haunting yet beautiful."
"The bustling metropolis came to life with the hum of traffic and the chatter of pedestrians."
"A child recited a poem with innocent enthusiasm, capturing the hearts of those around."
"The skilled blacksmith forged metal with rhythmic hammering, shaping raw material into art."
"A group of friends engaged in a lively debate, each voice adding to the vibrant discussion."
"A scientist explained groundbreaking discoveries, unraveling the mysteries of the universe."
"The bustling kitchen of a restaurant buzzed with the clinking of dishes and the sizzle of pans."
"A jazz band played with improvisational flair, creating a tapestry of musical expression."
"The distant hum of traffic merged with the urban soundtrack, a symphony of city life."
"A child's delighted gasp filled the air as they discovered a hidden treasure."
It should take about 5 minutes to record all of it.
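If you record the sample in another app and upload the file, it can be worth checking that it clears the one-minute minimum before you upload it. Here’s a minimal Python sketch that does this for a WAV file using only the standard library. The function names and the `MIN_SECONDS` constant are my own, not anything ElevenLabs provides, and it assumes an uncompressed WAV recording:

```python
import wave

# The one-minute minimum mentioned earlier in this issue.
MIN_SECONDS = 60.0


def wav_duration_seconds(path: str) -> float:
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()   # total number of audio frames
        rate = wav_file.getframerate()   # frames per second
    return frames / float(rate)


def is_long_enough(path: str, minimum: float = MIN_SECONDS) -> bool:
    """Return True if the recording is at least `minimum` seconds long."""
    return wav_duration_seconds(path) >= minimum
```

For example, `is_long_enough("my_voice_sample.wav")` (a hypothetical file name) returns `True` once the recording reaches 60 seconds.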
The last step is to add labels and a description for your voice:
Voila! At this point, your cloned voice should be ready. You can test it in the Speech > Create tab:
Go give it a try!
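Since the whole point is automation, you don’t have to paste every script into the UI. ElevenLabs also offers a REST API, and the sketch below shows roughly how a text-to-speech request for a cloned voice could be built in Python. Treat it as a sketch under assumptions: the endpoint path, the `xi-api-key` header, and the `eleven_multilingual_v2` model ID reflect my understanding of the public API and may change, so check the official API reference first. `build_tts_request` and `save_voiceover` are hypothetical helper names, and the voice ID and API key are placeholders you’d copy from your own account:

```python
import json
import urllib.request

# Assumed base URL of the ElevenLabs REST API; verify against the official docs.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(voice_id: str, api_key: str, text: str,
                      model_id: str = "eleven_multilingual_v2") -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request for the given voice."""
    payload = {"text": text, "model_id": model_id}  # model_id is an assumption
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def save_voiceover(request: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out_file:
        out_file.write(response.read())
```

A call would look like `save_voiceover(build_tts_request("YOUR_VOICE_ID", "YOUR_API_KEY", "Hello from the heights!"), "intro.mp3")`. Keeping the request building separate from the sending makes it easy to inspect what will go over the wire before spending any of your plan’s quota.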
Time to wrap up
Thank you for staying till the end!
If you’d like to grab a coffee with me and talk about synthesizing information with AI, here’s my Calendly link.
Thanks for joining this episode of The AI Synthesizer. I'll see you in the next issue. Until then, keep reaching for the stars! 🌌
Clear skies,
Luke Skyward