A method for directly generating speech from lip videos
A lip-video technology, applied in the field of generating speech directly from lip video, that addresses the problem of being unable to output speech and achieves easier training and improved conversion efficiency.
Embodiment
[0034] As shown in Figure 1, the present invention first captures a video containing the lips with a video capture device and extracts the lip region images to obtain the lip video V, which consists of a sequence of images I_1, I_2, ..., I_n in order. A lip feature FI is then extracted from each image I, giving the lip feature sequence FI_1, FI_2, ..., FI_n. The lip feature sequence is fed in order into the lip sound converter P, and the speech coding parameter sequence FA_1, FA_2, ..., FA_m is obtained at the output of P. Using speech synthesis technology, the speech coding parameter sequence is then synthesized into the speech frame sequence A_1, A_2, ..., A_m.
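The data flow of paragraph [0034] can be sketched in Python as follows. This is only an illustrative outline, not the patented implementation: the function names (`lip_video_to_speech`, `mean_intensity`, `toy_converter`, `toy_synth`), the list-based data types, and the toy stand-ins for feature extraction, the converter P, and the speech synthesizer are all assumptions made for the example. The toy converter emits two parameter vectors per lip feature purely to show that m need not equal n.

```python
from typing import Callable, List, Sequence

# Hypothetical data representations, chosen only for this sketch.
Image = List[List[float]]    # one lip image I_k as rows of gray levels
Feature = List[float]        # one lip feature FI_k
CodingParams = List[float]   # one speech coding parameter vector FA_j
SpeechFrame = List[float]    # one synthesized speech frame A_j


def lip_video_to_speech(
    lip_video: Sequence[Image],
    extract_feature: Callable[[Image], Feature],
    converter: Callable[[Sequence[Feature]], List[CodingParams]],
    synthesize: Callable[[CodingParams], SpeechFrame],
) -> List[SpeechFrame]:
    """Map lip images I_1..I_n to speech frames A_1..A_m."""
    # I_1..I_n -> FI_1..FI_n
    features = [extract_feature(image) for image in lip_video]
    # FI_1..FI_n -> FA_1..FA_m (stand-in for the lip sound converter P)
    coding_params = converter(features)
    # FA_1..FA_m -> A_1..A_m
    return [synthesize(fa) for fa in coding_params]


# Toy stand-ins (assumptions, not the components described in the patent).
def mean_intensity(image: Image) -> Feature:
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels)]


def toy_converter(features: Sequence[Feature]) -> List[CodingParams]:
    # Two parameter vectors per feature, so m = 2n in this toy case.
    params: List[CodingParams] = []
    for feature in features:
        params.append([feature[0]])
        params.append([feature[0]])
    return params


def toy_synth(fa: CodingParams) -> SpeechFrame:
    # A constant 4-sample frame whose amplitude is the parameter value.
    return [fa[0]] * 4


video = [
    [[0.0, 1.0], [1.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
frames = lip_video_to_speech(video, mean_intensity, toy_converter, toy_synth)
```

In a real system each stand-in would be replaced by the corresponding component: a lip feature extractor, the trained converter P, and a speech decoder that matches the chosen coding parameters.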
[0035] The specific process of the conversion method in this embodiment is described as follows.
[0036] (1) The first step is to obtain the lip video: use a camera to capture video containing the lips (no audio needs to be captured), and extract the lip are...

