Tutorial
April 8, 2026
Seedance Avatar Shots prompting guide: get the best results in HeyGen

# Seedance Avatar Shots
How to prompt Seedance in HeyGen for cinematic AI videos

Avatar Shots is powered by Seedance, which responds exceptionally well to detailed, cinematic prompting. Unlike other tools where a simple description is enough, Avatar Shots rewards you when you think like a director. The more specific and structured your prompt, the better your result, and since each generation uses credits with no undo, a well-crafted prompt is always worth the extra time.
New to Avatar Shots? Start with the How to Create Videos with Avatar Shots guide first, then come back here to level up your prompts.
What's covered in this guide
- Core prompting principles
- How to describe camera movement
- How to use script vs. voice-over
- How to create multi-shot videos
- How to use reference elements
- How to create longer videos across multiple shots
- Language prompting tips
Core prompting principles
Think in shots, not sentences
Structure your scenes like a storyboard rather than a description. Seedance handles multi-shot prompts extremely well. Instead of writing what you want to happen, describe it as a sequence of shots.
✅ "Start wide on both presenters → cut to close-up of left avatar → pull back to full scene."
Go bold and specific
Generic prompts produce generic videos. Specific details like "golden hour light through venetian blinds," "dust particles in the air," or "lightning crackling in slow motion" make a real difference to the final output.
Name the vibe
Adding a style reference helps anchor the visual tone. Try phrases like "cinematic, film-grade," "music video style," "documentary feel," "anime aesthetic," or "commercial product shoot."
Plan before you generate
Because each generation uses credits with no undo, spend time refining your prompt before hitting generate. The better your brief, the fewer attempts you'll need. See the credit cost breakdown for more details.
Describing camera movement
Seedance follows cinematic camera language with high consistency. Use specific terms to get exactly the movement you want.
| What you want | How to prompt it |
|---|---|
| Camera moving closer | "slow dolly-in" / "slow push-in to close-up" |
| Camera pulling back | "dolly out" / "pull back to reveal" |
| Camera from above | "crane shot overhead" |
| Camera following movement | "tracking shot from the right" |
| Wide scene-setting shot | "wide establishing shot panning right" |
| Stationary close-up | "medium shot — static, facing camera" |
Script vs. voice-over: know the difference
The key difference is whether your avatar's mouth is visibly speaking. Never mix the two in the same prompt; combining them often produces poor results.
Use Script when your avatar needs to deliver a message directly to camera with visible lip-sync.
Prompt: An avatar dressed professionally sits in a modern office. Slow push-in camera. Script: "Success today is about clarity, speed, and execution. The right mindset, and the right tools, make all the difference."
Use Voice-Over when you want cinematic action without direct speech: your avatar moves, reacts, or performs while the narration plays over the footage.
Prompt: An avatar begins writing notes, then stands and walks toward a nearby window. Voiceover: "Success today is about clarity, speed, and execution. The right mindset, and the right tools, make all the difference."
Describing audio
Since you cannot upload audio files, describe what you want the audio to sound like directly in your prompt. Seedance will generate it.
- "ambient coffee shop sounds"
- "cinematic orchestral swell"
- "upbeat lo-fi music"
- "epic orchestral score building to a peak"
- "ambient city sounds with distant traffic"
Tip: If you need full control over your audio after generation, bring your Avatar Shots clip into HeyGen AI Studio to layer in your own music or sound effects.
Creating multi-shot videos in one generation
Seedance supports multiple shots in a single prompt. You can define different camera angles, movements, and timing for each shot while keeping your avatar, environment, and identity consistent throughout.
Use timestamps to clearly define each shot, and describe the environment once at the top so it stays consistent across all shots.
Example prompt:
An avatar dressed professionally is in a clean, modern office environment with a desk and soft natural window light, visible throughout the sequence.
[0s–5s]: Frontal view — medium shot, slight push-in. The avatar faces the camera and begins speaking: "But turning ideas into something real?"
[5s–10s]: Side profile — medium shot. The camera shifts to a clean side angle as the avatar continues: "That takes focus… and the right tools."
[10s–15s]: Wide shot — dolly out. The camera pulls back, revealing more of the workspace as the avatar finishes: "Because execution is what actually makes the difference."
Cinematic 4K, shallow depth of field, soft natural lighting, subtle film grain. Smooth transitions between angles, stable framing, consistent identity across shots, accurate lip sync, no distortion.
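If you write many multi-shot prompts, assembling them from a small data structure keeps the environment description and style block consistent across every shot. A minimal sketch in Python; the `Shot` class and `build_prompt` helper are hypothetical conveniences for drafting text, not part of any HeyGen or Seedance API:

```python
from dataclasses import dataclass

@dataclass
class Shot:
    start: int    # shot start time in seconds
    end: int      # shot end time in seconds
    framing: str  # e.g. "Frontal view, medium shot, slight push-in"
    action: str   # what the avatar does or says in this shot

def build_prompt(environment: str, shots: list[Shot], style: str) -> str:
    """Assemble a timestamped multi-shot prompt (hypothetical helper)."""
    lines = [environment]
    for s in shots:
        lines.append(f"[{s.start}s–{s.end}s]: {s.framing}. {s.action}")
    lines.append(style)
    return "\n".join(lines)

prompt = build_prompt(
    "An avatar in a clean, modern office, visible throughout the sequence.",
    [
        Shot(0, 5, "Frontal view, medium shot, slight push-in",
             'The avatar begins speaking: "But turning ideas into something real?"'),
        Shot(5, 10, "Side profile, medium shot",
             'The avatar continues: "That takes focus… and the right tools."'),
    ],
    "Cinematic 4K, shallow depth of field, consistent identity across shots.",
)
print(prompt)
```

The environment line appears exactly once at the top and the style line once at the bottom, matching the structure recommended above.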
Using reference elements
You can upload up to 3 reference images per generation to guide what appears in the video. Elements cannot include a human face; supported types are products, environments, clothing, and similar non-human assets.
| Goal | How to use elements |
|---|---|
| Avatar wears a specific outfit | Upload an image of the clothing as an element |
| Consistent background across shots | Upload an environment photo as an element |
| Avatar holds or interacts with a product | Upload the product image and prompt the interaction |
| Match a specific visual style or mood | Upload a reference image with the color palette or aesthetic you want |
Example prompt using all three elements (avatar + product + environment):
An avatar faces the camera in a modern office environment, wearing a pastel outfit while holding a HeyGen mug, speaking with confident and engaging energy. The avatar naturally integrates all elements — lightly gesturing with the mug, maintaining posture that highlights the outfit, and interacting naturally within the workspace. The composition keeps all elements clearly visible without clutter. Slow push-in camera. Cinematic 4K, shallow depth of field, soft balanced lighting, film grain.
Script: "Great tools, great style, and the right environment, everything works together. That's how you create faster, smarter, and better."
Reference photos:

Outcome video:

Creating longer videos across multiple shots
Since the maximum per generation is 15 seconds, longer videos require multiple individual generations stitched together in HeyGen AI Studio.
To keep a consistent look across shots:
- Add the same clothing reference as an element in each generation
- Add the same environment reference as an element in each generation
- Keep the style description (lighting, camera quality, film grain) identical in each prompt
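The consistency rules above can be enforced by sharing one style block and one element list across every scene prompt you draft. A rough sketch; the filenames, `ELEMENTS` list, and `scene_prompt` helper are hypothetical examples, not a HeyGen feature:

```python
# Re-attach the same reference images in every generation (hypothetical filenames).
ELEMENTS = ["host_outfit.png", "podcast_studio.png"]

# One shared style block, appended verbatim to every scene prompt.
STYLE = ("Cinematic 4K, shallow depth of field, soft studio lighting, "
         "subtle film grain, consistent identity across shots.")

def scene_prompt(action: str) -> str:
    """Append the shared style block to a scene's action description."""
    return f"{action}\n{STYLE}"

scene_1 = scene_prompt('Two avatars sit across from each other in a podcast studio. '
                       'One asks: "So what is the biggest shift happening right now?"')
scene_2 = scene_prompt('The first avatar nods and follows up: '
                       '"It feels like everyone is expected to create now."')
```

Because both scene prompts end with the identical style text, the stitched clips are far more likely to match in lighting and grain.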
Example: a two-part podcast conversation.
Scene 1 prompt:
Two avatars sit across from each other in a modern podcast studio with microphones and a clean, softly lit background. They engage in a natural conversation, taking turns speaking and reacting with subtle head movements and hand gestures. One avatar speaks first: "So what do you think is the biggest shift happening right now?" The second avatar responds thoughtfully: "Honestly, it's how fast content is evolving — everything is becoming more dynamic and accessible." They maintain eye contact with each other.

Scene 2 prompt (continuing the same scene):
The first avatar nods and follows up: "Yeah, and it feels like everyone is expected to create now, not just consume." The second avatar leans slightly forward and replies: "Exactly — and the barrier to entry is basically gone. Anyone with an idea can bring it to life."
Multi-avatar prompting
When placing multiple avatars in one scene, be explicit about how they interact with each other and what each one says.
- Up to 3 avatars can speak in the same scene
- Avatars can interact with and react to each other
Tips for multi-avatar prompts:
- Label each avatar clearly: "the first avatar," "the avatar on the left"
- Describe their relative positions: "sitting across from each other," "standing side by side"
- Describe reactions, not just speech: "the second avatar leans forward and replies"
Language prompting tips
Seedance officially supports English, Mandarin Chinese (including Cantonese), Japanese, Korean, Spanish, French, German, and Portuguese.
You can also mix up to 2 supported languages in the same prompt to have your avatar switch between languages within a single scene.
Example: a bilingual scene.
Medium shot — static. Avatar seated, facing camera. Friendly, teacher-like tone. Subtle hand gestures.
Avatar (English): "Let's learn a simple phrase in Spanish." (small pause, slight smile)
Avatar (Spanish): "La comunicación es clave." (brief pause)
Avatar (English): "That means: communication is key." (encouraging tone)
Avatar (Spanish, slower pronunciation): "La… comunicación… es clave."
Tip: If your language isn't on the supported list, generate the video in English first and use HeyGen's Translate feature to convert it.
Quality checklist before you generate
Before clicking generate, run through this list:
- Did I describe the setting, mood, and lighting?
- Did I specify camera movement?
- Did I describe the audio I want?
- Did I use Script or Voice-Over (not both)?
- Did I add element references for product, clothing, or environment if needed?
- Did I define timestamps if using multiple shots?
- Did I name the visual style or vibe?
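A few of these checks are mechanical enough to automate before you paste a prompt in. A heuristic sketch, not an official validator; the marker spellings (`Script:`, `Voiceover:`) are assumptions based on the examples in this guide:

```python
import re

def lint_prompt(prompt: str) -> list[str]:
    """Flag common checklist violations in a draft prompt (heuristic sketch)."""
    warnings = []
    has_script = "Script:" in prompt
    has_voiceover = bool(re.search(r"Voice-?over:", prompt, re.IGNORECASE))
    # Script and Voice-Over must never appear together.
    if has_script and has_voiceover:
        warnings.append("Prompt mixes Script and Voice-Over; use only one.")
    # Multi-shot prompts should carry [Xs–Ys] timestamps.
    if "cut" in prompt.lower() and not re.search(r"\[\d+s.\d+s\]", prompt):
        warnings.append("Multiple shots implied but no [Xs–Ys] timestamps found.")
    # At least one piece of camera language should be present.
    if not re.search(r"dolly|push-in|pull back|tracking|crane|static|pan",
                     prompt, re.IGNORECASE):
        warnings.append("No camera movement specified.")
    return warnings
```

For example, `lint_prompt('Slow dolly-in. Script: "Hi."')` passes cleanly, while a draft containing both `Script:` and `Voiceover:` gets flagged.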