Nvidia Audio2Face

What's That?

Nvidia's Audio2Face is a combination of AI-based technologies that generate facial motion and lip sync derived entirely from an audio source. It can be used at runtime or to generate facial animation for more traditional content creation pipelines. The resulting blendshape weights can be exported to .json files, which can in turn be imported into Blender via Faceit. Audio2Face is part of the Nvidia Omniverse platform.
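
To make the idea of exported blendshape weights more concrete, here is a minimal Python sketch of reading such a file outside of Blender. It assumes the .json file stores a list of expression names under `facsNames` and one row of weights per frame under `weightMat`; these key names, the `exportFps` field and the sample values are assumptions for illustration and should be checked against an actual export. The helper `load_weight_curves` is hypothetical and not part of Audio2Face or Faceit.

```python
import json
from typing import Dict, List


def load_weight_curves(json_text: str) -> Dict[str, List[float]]:
    """Turn an exported weight matrix into one list of weights per expression."""
    data = json.loads(json_text)
    names = data["facsNames"]   # assumed key: expression names, one per column
    frames = data["weightMat"]  # assumed key: one row of weights per frame
    curves: Dict[str, List[float]] = {name: [] for name in names}
    for row in frames:
        if len(row) != len(names):
            raise ValueError("frame width does not match number of expression names")
        for name, weight in zip(names, row):
            curves[name].append(float(weight))
    return curves


# Tiny hand-written sample, not real Audio2Face output.
SAMPLE = json.dumps({
    "exportFps": 30,
    "facsNames": ["jawDrop", "lipPucker"],
    "weightMat": [[0.0, 0.1], [0.5, 0.2], [1.0, 0.0]],
})

curves = load_weight_curves(SAMPLE)
```

Each entry in `curves` is the animation curve for one expression, which is roughly what ends up driving a shape key after import.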


Extensive documentation for the Audio2Face tool can be found here.

Audio2Face comes with Digital Mark, a default character that can be used to quickly preview or export animations.


RTX only

For now, Audio2Face requires an Nvidia RTX video card and a Windows operating system. See this page for details on hardware requirements.

46 Audio2Face Expressions

By default, A2F uses a combination of 46 micro expressions (FACS) to drive the facial motion. Any 3D character equipped with a set of shape keys resembling the 46 captured micro expressions can be animated.
While A2F provides a full character transfer pipeline for creating the 46 micro expressions in the Audio2Face app, Faceit also comes with an Audio2Face expression preset. It doesn't matter whether the expressions are created with A2F, with Faceit, or by hand.
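
Since only the presence of matching shape keys matters, a quick sanity check is to compare a character's shape key names against the expected expression names. The following is a minimal sketch in plain Python; the function `missing_expressions` and the example names are hypothetical, and the real list of 46 expression names has to be taken from Audio2Face or the Faceit preset.

```python
from typing import Iterable, List


def missing_expressions(required: Iterable[str], shape_keys: Iterable[str]) -> List[str]:
    """Return the required expression names that have no matching shape key."""
    available = set(shape_keys)
    return [name for name in required if name not in available]


# Hypothetical names for illustration only, not the actual A2F expression list.
required_names = ["jawDrop", "lipPucker", "browRaise"]
character_shape_keys = ["Basis", "jawDrop", "browRaise"]

missing = missing_expressions(required_names, character_shape_keys)
```

An empty result means every expected expression has a counterpart on the character. Inside Blender, the shape key names could be read from the mesh's shape key blocks instead of a hand-written list.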

Audio2Face expressions can be generated with Faceit

... Or via the Audio2Face character transfer pipeline.