Tutorial
April 8, 2026
Seedance Avatar Shots prompting guide: get the best results in HeyGen

# Seedance Avatar Shots
How to prompt Seedance in HeyGen for cinematic AI videos

Avatar Shots is powered by Seedance, which responds exceptionally well to detailed, cinematic prompting. Unlike other tools where a simple description is enough, Avatar Shots rewards you when you think like a director. The more specific and structured your prompt, the better your result, and since each generation uses credits with no undo, a well-crafted prompt is always worth the extra time.
New to Avatar Shots? Start with the How to Create Videos with Avatar Shots guide first, then come back here to level up your prompts.
What's covered in this guide
- Core prompting principles
- How to describe camera movement
- How to use script vs. voice-over
- How to create multi-shot videos
- How to use reference elements
- How to create longer videos across multiple shots
- Language prompting tips
Core prompting principles
Think in shots, not sentences
Structure your scenes like a storyboard rather than a description. Seedance handles multi-shot prompts extremely well. Instead of writing what you want to happen, describe it as a sequence of shots.
✅ "Start wide on both presenters → cut to close-up of left avatar → pull back to full scene."
Go bold and specific
Generic prompts produce generic videos. Specific details like "golden hour light through venetian blinds," "dust particles in the air," or "lightning crackling in slow motion" make a real difference to the final output.
Name the vibe
Adding a style reference helps anchor the visual tone. Try phrases like "cinematic, film-grade," "music video style," "documentary feel," "anime aesthetic," or "commercial product shoot."
Plan before you generate
Because each generation uses credits with no undo, spend time refining your prompt before hitting generate. The better your brief, the fewer attempts you'll need. See the credit cost breakdown for more details.
Describing camera movement
Seedance follows cinematic camera language with high consistency. Use specific terms to get exactly the movement you want.
| What you want | How to prompt it |
|---|---|
| Camera moving closer | "slow dolly-in" / "slow push-in to close-up" |
| Camera pulling back | "dolly out" / "pull back to reveal" |
| Camera from above | "crane shot overhead" |
| Camera following movement | "tracking shot from the right" |
| Wide scene-setting shot | "wide establishing shot panning right" |
| Stationary close-up | "medium shot — static, facing camera" |
Script vs. voice-over: know the difference
The key difference is whether your avatar's mouth is visibly speaking. Never mix the two in the same prompt; combining them often produces poor results.
Use Script when your avatar needs to deliver a message directly to camera with visible lip-sync.
Prompt: An avatar dressed professionally sits in a modern office. Slow push-in camera. Script: "Success today is about clarity, speed, and execution. The right mindset, and the right tools, make all the difference."
Use Voice-Over when you want cinematic action without direct speech: your avatar moves, reacts, or performs while the narration plays over the footage.
Prompt: An avatar begins writing notes, then stands and walks toward a nearby window. Voiceover: "Success today is about clarity, speed, and execution. The right mindset, and the right tools, make all the difference."
Describing audio
Since you cannot upload audio files, describe what you want the audio to sound like directly in your prompt. Seedance will generate it.
- "ambient coffee shop sounds"
- "cinematic orchestral swell"
- "upbeat lo-fi music"
- "epic orchestral score building to a peak"
- "ambient city sounds with distant traffic"
Tip: If you need full control over your audio after generation, bring your Avatar Shots clip into HeyGen AI Studio to layer in your own music or sound effects.
Creating multi-shot videos in one generation
Seedance supports multiple shots in a single prompt. You can define different camera angles, movements, and timing for each shot while keeping your avatar, environment, and identity consistent throughout.
Use timestamps to clearly define each shot, and describe the environment once at the top so it stays consistent across all shots.
Example prompt:
An avatar dressed professionally is in a clean, modern office environment with a desk and soft natural window light, visible throughout the sequence.
[0s–5s]: Frontal view — medium shot, slight push-in. The avatar faces the camera and begins speaking: "But turning ideas into something real?"
[5s–10s]: Side profile — medium shot. The camera shifts to a clean side angle as the avatar continues: "That takes focus… and the right tools."
[10s–15s]: Wide shot — dolly out. The camera pulls back, revealing more of the workspace as the avatar finishes: "Because execution is what actually makes the difference."
Cinematic 4K, shallow depth of field, soft natural lighting, subtle film grain. Smooth transitions between angles, stable framing, consistent identity across shots, accurate lip sync, no distortion.
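If you write many multi-shot prompts, assembling them from a small data structure keeps the environment description and style block consistent across every shot. A minimal sketch in Python; the `Shot` class and `build_prompt` helper are hypothetical conveniences for drafting text, not part of any HeyGen or Seedance API:

```python
from dataclasses import dataclass

@dataclass
class Shot:
    start: int    # shot start time in seconds
    end: int      # shot end time in seconds
    framing: str  # e.g. "Frontal view, medium shot, slight push-in"
    action: str   # what the avatar does or says in this shot

def build_prompt(environment: str, shots: list[Shot], style: str) -> str:
    """Assemble a timestamped multi-shot prompt (hypothetical helper)."""
    lines = [environment]
    for s in shots:
        lines.append(f"[{s.start}s–{s.end}s]: {s.framing}. {s.action}")
    lines.append(style)
    return "\n".join(lines)

prompt = build_prompt(
    "An avatar in a clean, modern office, visible throughout the sequence.",
    [
        Shot(0, 5, "Frontal view, medium shot, slight push-in",
             'The avatar begins speaking: "But turning ideas into something real?"'),
        Shot(5, 10, "Side profile, medium shot",
             'The avatar continues: "That takes focus… and the right tools."'),
    ],
    "Cinematic 4K, shallow depth of field, consistent identity across shots.",
)
print(prompt)
```

The environment line appears exactly once at the top and the style line once at the bottom, matching the structure recommended above.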
Using reference elements
You can upload up to 3 reference images per generation to guide what appears in the video. Elements cannot include a human face; supported types are products, environments, clothing, and similar non-human assets.
| Goal | How to use elements |
|---|---|
| Avatar wears a specific outfit | Upload an image of the clothing as an element |
| Consistent background across shots | Upload an environment photo as an element |
| Avatar holds or interacts with a product | Upload the product image and prompt the interaction |
| Match a specific visual style or mood | Upload a reference image with the color palette or aesthetic you want |
Example prompt using all three elements (avatar + product + environment):
An avatar faces the camera in a modern office environment, wearing a pastel outfit while holding a HeyGen mug, speaking with confident and engaging energy. The avatar naturally integrates all elements — lightly gesturing with the mug, maintaining posture that highlights the outfit, and interacting naturally within the workspace. The composition keeps all elements clearly visible without clutter. Slow push-in camera. Cinematic 4K, shallow depth of field, soft balanced lighting, film grain.
Script: "Great tools, great style, and the right environment, everything works together. That's how you create faster, smarter, and better."
Reference photos:

Outcome video:

Creating longer videos across multiple shots
Since the maximum per generation is 15 seconds, longer videos require multiple individual generations stitched together in HeyGen AI Studio.
To keep a consistent look across shots:
- Add the same clothing reference as an element in each generation
- Add the same environment reference as an element in each generation
- Keep the style description (lighting, camera quality, film grain) identical in each prompt
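The consistency rules above can be enforced by sharing one style block and one element list across every scene prompt you draft. A rough sketch; the filenames, `ELEMENTS` list, and `scene_prompt` helper are hypothetical examples, not a HeyGen feature:

```python
# Re-attach the same reference images in every generation (hypothetical filenames).
ELEMENTS = ["host_outfit.png", "podcast_studio.png"]

# One shared style block, appended verbatim to every scene prompt.
STYLE = ("Cinematic 4K, shallow depth of field, soft studio lighting, "
         "subtle film grain, consistent identity across shots.")

def scene_prompt(action: str) -> str:
    """Append the shared style block to a scene's action description."""
    return f"{action}\n{STYLE}"

scene_1 = scene_prompt('Two avatars sit across from each other in a podcast studio. '
                       'One asks: "So what is the biggest shift happening right now?"')
scene_2 = scene_prompt('The first avatar nods and follows up: '
                       '"It feels like everyone is expected to create now."')
```

Because both scene prompts end with the identical style text, the stitched clips are far more likely to match in lighting and grain.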
Example: a two-part podcast conversation.
Scene 1 prompt:
Two avatars sit across from each other in a modern podcast studio with microphones and a clean, softly lit background. They engage in a natural conversation, taking turns speaking and reacting with subtle head movements and hand gestures. One avatar speaks first: "So what do you think is the biggest shift happening right now?" The second avatar responds thoughtfully: "Honestly, it's how fast content is evolving — everything is becoming more dynamic and accessible." They maintain eye contact with each other.

Scene 2 prompt (continuing the same scene):
The first avatar nods and follows up: "Yeah, and it feels like everyone is expected to create now, not just consume." The second avatar leans slightly forward and replies: "Exactly — and the barrier to entry is basically gone. Anyone with an idea can bring it to life."
Multi-avatar prompting
When placing multiple avatars in one scene, be explicit about how they interact with each other and what each one says.
- Up to 3 avatars can speak in the same scene
- Avatars can interact with and react to each other
Tips for multi-avatar prompts:
- Label each avatar clearly: "the first avatar," "the avatar on the left"
- Describe their relative positions: "sitting across from each other," "standing side by side"
- Describe reactions, not just speech: "the second avatar leans forward and replies"
Language prompting tips
Seedance officially supports English, Mandarin Chinese (including Cantonese), Japanese, Korean, Spanish, French, German, and Portuguese.
You can also mix up to 2 supported languages in the same prompt to have your avatar switch between languages within a single scene.
Example: a bilingual scene.
Medium shot — static. Avatar seated, facing camera. Friendly, teacher-like tone. Subtle hand gestures.
Avatar (English): "Let's learn a simple phrase in Spanish." (small pause, slight smile)
Avatar (Spanish): "La comunicación es clave." (brief pause)
Avatar (English): "That means: communication is key." (encouraging tone)
Avatar (Spanish, slower pronunciation): "La… comunicación… es clave."
Tip: If your language isn't on the supported list, generate the video in English first and use HeyGen's Translate feature to convert it.
Quality checklist before you generate
Before clicking generate, run through this list:
- Did I describe the setting, mood, and lighting?
- Did I specify camera movement?
- Did I describe the audio I want?
- Did I use Script or Voice-Over (not both)?
- Did I add element references for product, clothing, or environment if needed?
- Did I define timestamps if using multiple shots?
- Did I name the visual style or vibe?
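A few of these checks are mechanical enough to automate before you paste a prompt in. A heuristic sketch, not an official validator; the marker spellings (`Script:`, `Voiceover:`) are assumptions based on the examples in this guide:

```python
import re

def lint_prompt(prompt: str) -> list[str]:
    """Flag common checklist violations in a draft prompt (heuristic sketch)."""
    warnings = []
    has_script = "Script:" in prompt
    has_voiceover = bool(re.search(r"Voice-?over:", prompt, re.IGNORECASE))
    # Script and Voice-Over must never appear together.
    if has_script and has_voiceover:
        warnings.append("Prompt mixes Script and Voice-Over; use only one.")
    # Multi-shot prompts should carry [Xs–Ys] timestamps.
    if "cut" in prompt.lower() and not re.search(r"\[\d+s.\d+s\]", prompt):
        warnings.append("Multiple shots implied but no [Xs–Ys] timestamps found.")
    # At least one piece of camera language should be present.
    if not re.search(r"dolly|push-in|pull back|tracking|crane|static|pan",
                     prompt, re.IGNORECASE):
        warnings.append("No camera movement specified.")
    return warnings
```

For example, `lint_prompt('Slow dolly-in. Script: "Hi."')` passes cleanly, while a draft containing both `Script:` and `Voiceover:` gets flagged.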