Voice synthesis method, vocoder training method, device, equipment and medium

A speech synthesis and vocoder technology, applied in speech analysis, instruments, and related fields, that addresses the problem of slow speech synthesis speed. It avoids a decrease in prediction accuracy, improves prediction speed, and ensures the quality of speech synthesis.

Active Publication Date: 2021-12-10
TENCENT TECH (SHENZHEN) CO LTD
Cites: 5; Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] Although predicting a single sampling point at a time can improve the stability and accuracy of speech synthesis, it slows synthesis down, especially in speech synthesis scenarios with high sampling rates.

Detailed Description of the Embodiments

[0044] To make the purpose, technical solution, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.

[0045] For ease of understanding, the terms used in the embodiments of the present application are described below.

[0046] Linear prediction: When the vocoder predicts the sampling value of a sampling point, the sampling value is decomposed into a linear part and a nonlinear part. The linear part is obtained by linear prediction based on digital signal processing, and the nonlinear part is predicted by the neural network, which reduces the difficulty of predicting the sampling points. In some embodiments, the linear prediction result of sampling point t is denoted $p_t$, the nonlinear prediction result (also called the prediction residual or excitation) of sampling point t is denoted $e_t$, t...
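
The decomposition in [0046] can be illustrated with a minimal sketch. The source text is truncated before it gives a formula, so the sketch assumes the textbook linear-prediction form, in which $p_t$ is a weighted sum of previous samples and the sample value is $p_t + e_t$. The function names, the coefficient values, and the sign convention are illustrative assumptions, not taken from the patent.

```python
from typing import Sequence


def linear_prediction(history: Sequence[float], lpc: Sequence[float]) -> float:
    """Assumed form: p_t = sum_k a_k * s_{t-k}, where history[-1] is s_{t-1}."""
    return sum(a * history[-k] for k, a in enumerate(lpc, start=1))


def reconstruct_sample(history: Sequence[float], lpc: Sequence[float],
                       residual: float) -> float:
    """Sample value = linear part p_t + nonlinear part e_t (the residual)."""
    return linear_prediction(history, lpc) + residual


# Toy values chosen only for illustration (hypothetical coefficients).
lpc = [0.5, 0.25]        # a_1, a_2
history = [1.0, 2.0]     # s_{t-2}, s_{t-1}
e_t = 0.125              # stand-in for the residual a neural network would predict

p_t = linear_prediction(history, lpc)         # 0.5 * 2.0 + 0.25 * 1.0
s_t = reconstruct_sample(history, lpc, e_t)   # p_t + e_t
print(p_t, s_t)
```

In this split the neural network only has to model the residual, which is the sense in which the paragraph says the prediction difficulty is reduced.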

Abstract

The invention discloses a speech synthesis method, a vocoder training method, an apparatus, a device, and a medium, and relates to the field of artificial intelligence. The method comprises the following steps: performing feature encoding on the acoustic features of a target speech frame to obtain an encoding vector; performing temporal feature extraction based on the encoding vector and s groups of historical prediction data to obtain a temporal feature vector, wherein the historical prediction data comprises a historical linear prediction result, a historical sampling point prediction result, and a historical prediction residual, and s is an integer greater than or equal to 2; performing sampling point prediction based on the encoding vector and the temporal feature vector to obtain sampling point prediction results for s sampling points in the target speech frame; and performing speech synthesis based on the sampling point prediction results of the s sampling points. With the scheme provided by the embodiments of the invention, speech synthesis speed can be improved while the stability and accuracy of speech synthesis are maintained.
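
The control flow described in the abstract can be sketched as follows, under loud assumptions: the feature encoding, temporal feature extraction, and sampling point prediction are collapsed into one hypothetical callable, `predict_residuals`, which returns s residuals per call, and each sample is rebuilt as linear prediction plus residual. None of these names or the toy predictor come from the patent; the sketch only shows why producing s samples per step reduces the number of network steps per frame.

```python
from typing import Callable, List, Sequence, Tuple

History = List[Tuple[float, float, float]]  # (linear prediction, sample, residual)


def linear_prediction(samples: Sequence[float], lpc: Sequence[float]) -> float:
    """Assumed form: p_t = sum_k a_k * s_{t-k}; missing history counts as 0."""
    return sum(a * (samples[-k] if k <= len(samples) else 0.0)
               for k, a in enumerate(lpc, start=1))


def synthesize_frame(frame_size: int, s: int, lpc: Sequence[float],
                     predict_residuals: Callable[[History, int], List[float]]
                     ) -> Tuple[List[float], int]:
    """Generate frame_size samples, s per step; return (samples, network steps)."""
    if frame_size % s != 0:
        raise ValueError("frame_size must be a multiple of s")
    samples: List[float] = []
    history: History = []
    steps = 0
    while len(samples) < frame_size:
        # One (hypothetical) network step sees the last s groups of history
        # and yields residuals for the next s sampling points.
        residuals = predict_residuals(history[-s:], s)
        steps += 1
        for e in residuals:
            p = linear_prediction(samples, lpc)
            x = p + e
            samples.append(x)
            history.append((p, x, e))
    return samples, steps


def toy_predictor(recent: History, s: int) -> List[float]:
    """Stand-in for the neural network: always predicts a residual of 0.5."""
    return [0.5] * s


single, steps_single = synthesize_frame(4, 1, [0.5], toy_predictor)
multi, steps_multi = synthesize_frame(4, 2, [0.5], toy_predictor)
print(steps_single, steps_multi)
```

With the constant toy predictor, both runs produce the same four samples while the s = 2 run uses half as many steps. A trained network would not be constant, so this equality is a property of the toy only. The s = 1 run is included purely as a baseline; the abstract requires s to be at least 2.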

Description

Technical field

[0001] The embodiments of the present application relate to the field of artificial intelligence, and in particular to a speech synthesis method, a vocoder training method, an apparatus, a device, and a medium.

Background

[0002] In speech synthesis, the front end predicts the acoustic features of the speech from text, and the back end uses a vocoder to synthesize speech from those acoustic features.

[0003] Linear Predictive Coding network (LPCNet), a vocoder that combines digital signal processing with a neural network, performs well in real-time speech synthesis. In the related art, an LPCNet-based vocoder predicts a single sampling point at a time, and speech is synthesized from the consecutive predicted sampling points.

[0004] Although predicting a single sampling point each time can improve the stability and accuracy of spe...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L19/16, G10L25/30
CPC: G10L19/16, G10L25/30
Inventor: 郑艺斌
Owner: TENCENT TECH (SHENZHEN) CO LTD