The way their model works is that you upload 1 minute or less of footage of your character speaking (it doesn't need to have audio, and it can be synthetic).
It then trains a mini model on your character to do the lipsync.