Using custom and stock voices for video narration


Looking for a fast and easy way to create professional voiceovers for your videos?
Meet HeyGen AI Voice, your go-to tool for high-quality video narration. Let’s walk through the simple steps to bring your videos to life with HeyGen’s AI-powered voice engine.
What’s covered in this guide
In this guide, we’ll cover:
- What is HeyGen AI Voice?
- How to create your own voice
- How to customize your voice
- How to add a new voice to a video
- How to generate AI Voices
- Prompting guide
- Common issues and solutions
What is HeyGen AI Voice?
HeyGen AI Voice offers a vast library of AI-generated voices covering 175 languages and a range of emotions, from friendly to serious and everything in between. There are also numerous tone, accent, and style options to work with.

Whether you're narrating a corporate presentation, a YouTube video, or an educational course, there's a voice that fits perfectly. You can even create your own emotions for your voice by uploading simple audio files.
For example, if you want to add a sad version of your voice to the mix, you can upload a separate audio file of you speaking in a sad voice. The options are endless!
How to create your own voice
Start by logging into your HeyGen account.

From the dashboard, click ‘AI Voice’ to discover our library of AI voices.
Next, click ‘Create New Voice’. Note that you can also select an existing public voice from the library instead.
You’ll then be asked to provide your own audio, for example by uploading an audio file.
For best results, upload a file that’s 2 minutes long. You can also create an avatar and voice through the Video Avatar process.
Finally, click on ‘Create new voice’.
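If you'd like to check a recording's length before uploading it, a small local script can help. The sketch below is not part of HeyGen; it assumes your sample is an uncompressed WAV file and uses only Python's standard `wave` module. The function names and the 15-second tolerance are illustrative assumptions, chosen to reflect the 2-minute recommendation above.

```python
import wave

TARGET_SECONDS = 120  # the guide recommends a file that's 2 minutes long


def wav_duration_seconds(path):
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_near_target(duration, target=TARGET_SECONDS, tolerance=15):
    """Return True when duration is within `tolerance` seconds of the target.

    The 15-second tolerance is an arbitrary assumption, not a HeyGen rule.
    """
    return abs(duration - target) <= tolerance
```

For example, `is_near_target(wav_duration_seconds("my_voice.wav"))` tells you whether a hypothetical `my_voice.wav` is close to the recommended length. Other formats such as MP3 would need a different library.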
How to customize your voice
Once the voice is generated, you’ll find it in the AI Voice library.
Click into your new voice, and you’ll notice a handful of emotions that are available. Choose one that you’d like to be the default, and you’re ready to go.

Next, let’s add a custom emotion to the mix. When you click into the new voice you’ve created, you can click ‘Add emotion’ and then upload a specific audio file.
Examples of emotions include sad, silly, whispering, excited, and more.
Once you’ve uploaded your specific audio file, all you have to do is name it, and you’re set!
How to add a new voice to a video
When you’re creating a video, you can apply the voice to any portion of your script.
Click the portion of the script you’d like to use it for, then select the voice you’ve created.
How to generate AI Voices
In addition to HeyGen's high-quality AI voiceover options, users on a paid Creator, Teams, Agency or Enterprise plan have the ability to generate custom voices using text prompts.
Start by clicking the "Create New Voice" button, then choose the "Generate Voice" option.
Name your voice, then select the desired age, gender, and ethnicity. These details help shape the overall tone and character of the voice.

Write a short description explaining how you want the voice to sound, whether it's professional, friendly, calm, or energetic. If you’re unsure, you can click on "Try a Sample" to hear example voices for inspiration.

Once you’re satisfied with your customizations, click "Generate Voice." HeyGen will provide you with three voice options to choose from. Listen to each and select the one that best matches your needs.

By generating an AI voice, you add a new layer of personalization to your videos, enhancing how your message is delivered!
Prompting guide
Voice Design Types
Realistic Voice Design:
To create an original, realistic voice, you can specify attributes such as age, accent/nationality, gender, tone, pitch, intonation, speed, and emotion. Example prompts include:
- “A young Indian female with a soft, high voice. Conversational, slow, and calm.”
- “An old British male with a raspy, deep voice. Professional, relaxed, and assertive.”
- “A middle-aged Australian female with a warm, low voice. Corporate, fast, and happy.”
Character Voice Design:
For creative characters, simpler prompts work well to generate unique voices. Example prompts include:
- “A massive evil ogre, troll.”
- “A sassy little squeaky mouse.”
- “An angry old pirate, shouting.”
Other characters we’ve had success with include Goblin, Vampire, Elf, Troll, Werewolf, Ghost, Alien, Giant, Witch, Wizard, Zombie, Demon, Devil, Pirate, Genie, Ogre, Orc, Knight, Samurai, Banshee, Yeti, Druid, Robot, Monkey, Monster, and Dracula.
Voice Attributes
Each attribute varies in importance when designing your AI voice:
- Age (High Importance): Options include Young, Teenage, Adult, Middle-Aged, and Old.
- Accent/Nationality (High Importance): Options include British, Indian, Polish, American, and more.
- Gender (High Importance): Select from Male, Female, or Gender Neutral.
- Tone (Optional): Examples include Gruff, Soft, Warm, and Raspy.
- Pitch (Optional): Options like Deep, Low, High, and Squeaky are available.
- Intonation (Optional): Options include Conversational, Professional, Corporate, Urban, and Posh.
- Speed (Optional): You can set the speed to Fast, Quick, Slow, or Relaxed.
- Emotion/Delivery (Optional): Choose emotions such as Angry, Calm, Scared, Happy, Assertive, Whispering, or Shouting.
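If you write many prompts, it can help to assemble them from the attributes above so they stay consistent. The sketch below is one possible way to do that in Python. It is not a HeyGen feature or API: the function name and parameters are hypothetical, and the output simply mirrors the wording of the realistic voice examples in this guide.

```python
def _article(word):
    """Pick 'a' or 'an' from the first letter (a rough heuristic)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def build_voice_prompt(age, accent, gender, tone=None, pitch=None,
                       intonation=None, speed=None, emotion=None):
    """Compose a realistic-voice prompt from the attributes in this guide."""
    # High-importance attributes form the opening phrase.
    age_text = age.lower()
    sentence = f"{_article(age_text).capitalize()} {age_text} {accent} {gender.lower()}"

    # Tone and pitch describe the voice itself, e.g. "a soft, high voice".
    qualities = [q.lower() for q in (tone, pitch) if q]
    if qualities:
        sentence += f" with {_article(qualities[0])} {', '.join(qualities)} voice"
    sentence += "."

    # Intonation, speed, and emotion describe the delivery.
    delivery = [d.lower() for d in (intonation, speed, emotion) if d]
    if not delivery:
        return sentence
    if len(delivery) == 1:
        text = delivery[0]
    elif len(delivery) == 2:
        text = f"{delivery[0]} and {delivery[1]}"
    else:
        text = ", ".join(delivery[:-1]) + f", and {delivery[-1]}"
    return f"{sentence} {text[0].upper()}{text[1:]}."
```

Calling `build_voice_prompt("Young", "Indian", "Female", tone="Soft", pitch="High", intonation="Conversational", speed="Slow", emotion="Calm")` reproduces the first example prompt from the Realistic Voice Design section. You would then paste the resulting text into the voice description field.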
Common issues and solutions
Doesn’t sound like me
Voice clones may not perfectly replicate the source voice due to issues in the training data or audio quality. To address this:
- Use Professional Voice Cloning (PVC) with high-quality, consistent audio.
- Record clean audio samples with no background noise.
- Enable the Remove Background Noise option when submitting your audio.
- If issues persist, record new samples or regenerate the clone with the same input.
Pronunciation
Pronunciation accuracy depends on the language of the training data. To improve this:
- Record training samples in the language you plan to use.
- Ensure natural inflection and clear articulation during recording.
Voice instability
Voice instability can result from inconsistent training data or audio quality. Improve stability by:
- Recording high-quality training samples with consistent tone and volume.
- Using sufficient training data to provide the AI with varied but stable input.
Incorrect tone
Achieving the desired tone requires tonal consistency in the input samples. To correct tone issues:
- Record training samples in the target tone.
- Ensure that all samples maintain a consistent emotional or tonal expression.
Too monotone
Monotone voice clones can lack natural expression. Enhance expressiveness by:
- Including training data with natural inflections and varied tones.
- Recording new samples with expressive delivery and adding them as a custom emotion to your voice.
- Adjusting settings like stability, clarity/similarity, and style exaggeration for a more dynamic result.
Emotions don’t sound like me
To capture emotional nuances, training samples must reflect the desired emotional expressions. Include diverse samples showcasing different emotions to enhance the clone's ability to replicate them.
Accent
For unique accents, consistent training data is essential. To address accent issues:
- Use Professional Voice Cloning and record samples in the desired accent.
- For non-English TTS, use the Multilingual v2 model. For English, use Turbo v2.
- If problems persist, regenerate the voice clone using the same or new samples.
- Adjust the accent in HeyGen Studio if necessary, keeping in mind this may slightly alter the voice’s authenticity.
Best practices for successful voice cloning
- Use high-quality audio: Ensure training samples are free from background noise and recorded in a quiet environment.
- Provide sufficient data: Include a diverse range of samples, covering various tones, emotions, and expressions.
- Maintain consistency: Ensure all samples are recorded with consistent tone, volume, and quality.
- Enable background noise removal: Use HeyGen’s Remove Background Noise option for cleaner input.
- Test and adjust: Experiment with settings such as stability, clarity, and style exaggeration to fine-tune the output.
Recap
In this guide, you learned:
- What is HeyGen AI Voice?
- How to create your own voice
- How to customize your voice
- How to add a new voice to a video
- How to generate AI Voices
- Prompting guide
- Common issues and solutions
We’re looking forward to seeing what you’ll create with HeyGen!