Neural network taught to read lips

10 December 2019 - 17:13 | Interesting information
A group of researchers from China and the USA has trained a neural network to read lips, recognizing speech from video of a speaker.

Attempts to create efficient lip-reading algorithms have been made before. However, even the most modern programs are less accurate than algorithms that recognize speech from audio. Specialists at Zhejiang University have developed a method called LIBS, which draws on the way speech recognizers work. LIBS extracts the relevant audio data from the video while also taking into account the context of what is happening and the movements of the speaker's lips. The neural network then correlates this information with the video information by identifying the correspondence between the two, and uses a filtering parameter to refine the candidate options.

Knowledge distillation rests on the idea that a neural network trained on a large amount of data acts as a teacher model for a student network. Both networks receive the same data set, and the student tries to reproduce the teacher's outputs. In the new study, a speech-recognition network trained on audio recordings acts as the teacher for an algorithm that learns to read lips. According to the results, the new algorithm recognizes speech from lip movements 7.66% better than previously created solutions.
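
The teacher and student setup described above can be sketched as a loss function. The Python example below shows a generic form of knowledge distillation, not the specific procedure used in LIBS: the student is penalized both for deviating from the teacher's softened predictions and for missing the true label. The function names, the `temperature` and `alpha` parameters, and the sample scores are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_index,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft (teacher-matching) loss and a hard (label) loss.

    Illustrative only: temperature and alpha are hypothetical settings.
    """
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # KL divergence: how far the student's softened distribution
    # is from the teacher's.
    soft_loss = sum(t * math.log(t / s)
                    for t, s in zip(teacher_soft, student_soft) if t > 0)
    # Cross-entropy against the ground-truth label at temperature 1.
    hard_loss = -math.log(softmax(student_logits)[true_index])
    return alpha * temperature ** 2 * soft_loss + (1 - alpha) * hard_loss


# Hypothetical scores over three candidate words; index 0 is the spoken one.
teacher = [4.0, 1.0, 0.5]          # e.g. an audio speech recognizer
close_student = [3.5, 1.2, 0.4]    # a lip reader that mostly agrees
far_student = [0.5, 1.0, 4.0]      # a lip reader that disagrees

print(distillation_loss(close_student, teacher, true_index=0))
print(distillation_loss(far_student, teacher, true_index=0))
```

The first printed value is smaller than the second, because a student whose predictions resemble the teacher's is penalized less. Training would adjust the student network to drive this loss down.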