Getting the most out of Avatar V comes down to three inputs: your motion recording, your voice clone, and your base look. This guide covers how to optimize each one for the best possible results.
New to Avatar V? Start with the Avatar V guide, then come back here to level up your results.
Motion recording
Your 15-second video is the most important input in the entire process. Avatar V learns your gestures, expressions, and mannerisms from this single clip.
Key principles:
- Be extremely expressive - more than feels natural
- Vary your tone and use your hands
- Make direct eye contact with the camera
- Bring energy even if it feels like you're overdoing it
The energy you put in is the energy you get out. A flat recording produces a stiff avatar. An expressive recording produces one that feels alive.
Voice clone
A dedicated standalone voice clone produces noticeably more realistic results than using audio from your base video.
Key principles:
- Record a standalone voice clone rather than relying on your base video audio
- Be expressive and vary your delivery when recording
- Iterate until it genuinely sounds like you at your best
Base look
Your base look is your avatar's identity. Every outfit, setting, and scene you generate is built from this single photo - so choosing the right one is critical.
Key principles:
- Use a close-up or half-body shot with your face clearly visible
- Keep your expression subtle - avoid big smiles or unusual angles
- Avoid accessories that obscure your face
- Choose a photo where you feel you look your best
Example:
| Base look | Result |
|---|---|
| Unclear face, accessories, unusual angle | Inconsistent, lower-quality generated looks |
| Clear close-up, subtle expression, no accessories | Consistently strong generated looks |
Test a few options before committing. The right base look will produce noticeably better outputs across all your prompts.
Refining your looks
Use the Edit feature to make targeted adjustments to any generated look without starting from scratch. This is useful for tweaking outfit details, swapping backgrounds, or refining specific elements.
Motion reference
If you've uploaded multiple video looks to your avatar group, you can select a specific one as your motion reference in Advanced Settings. Your motion reference shapes the style and energy of movement in your output.
Key principles:
- Match your motion reference to the tone of the video you're making
- Use a high-energy recording for dynamic, expressive output
- Use a calm recording for natural, subtle output
- Use a side-angle recording for better results with side-angle photos
Common issues and solutions
| Issue | Solution |
|---|---|
| Avatar looks stiff or robotic | Your motion recording may lack expressiveness. Re-record with more energy, gestures, and facial expression. |
| Voice doesn't sound like me | Use a standalone voice clone rather than your base video audio. Record expressively and iterate until satisfied. |
| Generated looks are inconsistent | Your base look may not be strong enough. Try a clearer close-up with a subtle expression and no accessories. |
| Output doesn't match the photo angle | Select a motion reference that matches the angle of your photo. For side-angle photos, use a side-angle motion reference. |