We're introducing Seamless Interaction, a research project dedicated to modeling interpersonal dynamics, with potential applications to Avatars and Virtual Agents. Learn more at ai.facebook.com/research/sea... Thread 👇

We built a family of Audio-Visual (AV) Dyadic Motion research models. Conditioned on speech from both parties in a conversation, our models jointly generate facial expressions and body gestures.

We trained all our dyadic motion models on the Seamless Interaction Dataset, a first-of-its-kind dataset in scale and breadth: 4,000+ hours of interactions from 4,000+ participants.

Our models can be controlled to increase facial expressiveness, with potential applications in building more attentive or empathetic virtual listeners.

Current limitations: our work is still at the research stage, and we recognize open challenges, including generating more semantically relevant gestures (e.g., by coupling gesture generation more closely with large language models) and reducing latency.