Automated lip sync is not a new technology, but Disney Research, in tandem with a group of researchers at the University of East Anglia (England), Caltech, and Carnegie Mellon University, has added a twist to it: deep learning.
By training a neural network, the researchers are using a deep learning approach to generate real-time animated speech. In addition to automatically generating lip sync for English-speaking actors, the new software can be applied to singing or adapted for foreign languages. The technology was presented at the most recent SIGGRAPH computer graphics conference in Los Angeles.
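The pipeline the article describes, audio in and per-frame animation out, can be sketched roughly as follows. This is a minimal illustrative model with invented dimensions and randomly initialised weights standing in for a trained network; it is not the researchers' actual architecture, only an assumption-laden sketch of the general idea of mapping a sliding window of audio features to mouth-shape parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes here are illustrative assumptions, not the real model's.
WINDOW = 11        # frames of audio context per prediction
AUDIO_DIM = 13     # e.g. spectral features per audio frame
ANIM_DIM = 6       # e.g. mouth-shape control parameters
HIDDEN = 32

# Random weights stand in for a network trained on speech/animation pairs.
W1 = rng.standard_normal((WINDOW * AUDIO_DIM, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, ANIM_DIM)) * 0.1
b2 = np.zeros(ANIM_DIM)

def predict_frame(audio_window):
    """Map one window of audio features to one frame of animation parameters."""
    h = np.tanh(audio_window.ravel() @ W1 + b1)
    return h @ W2 + b2

# One second of audio at 100 feature-frames per second.
audio = rng.standard_normal((100, AUDIO_DIM))

# Slide the window over the audio to produce an animation curve over time.
frames = np.stack([
    predict_frame(audio[i:i + WINDOW])
    for i in range(len(audio) - WINDOW + 1)
])
print(frames.shape)  # (90, 6): one mouth-shape vector per output frame
```

Because each output frame depends only on a short window of audio, a model of this shape can run as the audio streams in, which is what makes real-time animation plausible.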
“Realistic speech animation is essential for effective character animation,” said lead researcher Dr. Sarah Taylor, from UEA’s School of Computing Sciences. “Done badly, it can be distracting and lead to a box office flop. Doing it well, however, is both time-consuming and costly, as it has to be manually produced by a skilled animator. Our goal is to automatically generate production-quality animated speech for any style of character, given only audio speech as an input.”