I am working with the same avatars I generated from photos. When I use the HeyGen voice with text as input, the movement is generally good and realistic, but when I use my own audio (the output of a TTS model on a different platform) as input, th...