Method and system for non-parallel corpus speech conversion based on an autoregressive network
A speech conversion technique based on an autoregressive network, applicable to speech analysis, speech synthesis, speech recognition, etc. It addresses problems such as uneven waveform trajectories and pronunciation errors, and achieves smoother generated waveform trajectories, fewer pronunciation errors, and improved stability.
Embodiment Construction
[0058] Specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.
[0059] The non-parallel corpus speech conversion method based on an autoregressive network works as follows: phoneme posterior probabilities are obtained from a pre-trained speech recognition model; a convolutional neural network and a gated recurrent unit model the context information in the text; an adaptive attention mechanism integrates the text features of the current moment with the acoustic features of the previous moment; a long short-term memory network predicts the acoustic features of the target speaker; and speech is synthesized by the LPCNet vocoder. As a result, the naturalness of the converted speech and its similarity to the target speaker are improved.
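The autoregressive step described in [0059], in which attention combines the encoded text features with the acoustic frame from the previous moment, can be illustrated with a minimal sketch. This is not the patented implementation: plain dot-product attention stands in for the adaptive attention mechanism, the hypothetical `toy_step` function stands in for the long short-term memory predictor, and all function names and the all-zero starting frame are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Dot-product attention (assumed stand-in for the adaptive attention).

    query: previous acoustic frame; keys: encoded text feature frames.
    Returns the context vector and the attention weights.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(keys[0])
    context = [sum(w * key[d] for w, key in zip(weights, keys))
               for d in range(dim)]
    return context, weights


def toy_step(context, prev):
    """Hypothetical placeholder for the LSTM predictor: a simple average."""
    return [0.5 * (c + p) for c, p in zip(context, prev)]


def autoregressive_decode(encoder_frames, step_fn, n_steps):
    """Generate acoustic frames one at a time, feeding each output back in."""
    dim = len(encoder_frames[0])
    prev = [0.0] * dim  # assumed all-zero starting frame
    outputs = []
    for _ in range(n_steps):
        context, _ = attend(prev, encoder_frames)
        prev = step_fn(context, prev)
        outputs.append(prev)
    return outputs


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0]]  # toy encoded text features
    for frame in autoregressive_decode(frames, toy_step, 3):
        print(frame)
```

Because each predicted frame is conditioned on the frame before it, consecutive outputs change gradually, which is the property the autoregressive design relies on. In a real system the predicted acoustic features would then be passed to a vocoder such as LPCNet; that stage is omitted here.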
[0060] As shown in Figure 1 ...


