Training method, device and electronic equipment for speech spectrum generation model

A training method for a speech spectrum generation model, applied in speech synthesis, biological neural network models, speech analysis, and related fields. It addresses problems such as blurry generated spectra, MSE-based modeling that cannot reflect the nature of the spectrum, and inconsistency between vocoder training and inference. The effect is a clearer generated spectrum sequence.

Active Publication Date: 2022-01-07
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] In the prior art, the speech spectrum generation model uses the mean square error (MSE) loss function to feed back the error of the generated spectrum. However, modeling based on the MSE loss function cannot reflect the nature of the spectrum, so the generated spectrum is very blurry. When the vocoder is trained on real, clear spectra, feeding it such blurry spectra makes its training and inference inconsistent, which seriously affects the stability of the vocoder and the quality of the final synthesized audio.
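
The blurring effect can be illustrated with a toy example. The sketch below is not taken from the application; it assumes, purely for illustration, that two equally plausible sharp spectrum frames exist for the same input, and shows that the prediction minimizing the expected MSE is their per-bin average, which is flatter than either one. The names `mse`, `expected_mse`, `targets`, `blurred`, and `sharp` are hypothetical.

```python
# Toy illustration (not from the application): why an MSE-trained model
# can produce a flattened, "blurry" spectrum frame.

def mse(pred, target):
    """Mean squared error between two equal-length frames."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def expected_mse(pred, targets):
    """Average MSE of one prediction against several equally likely targets."""
    return sum(mse(pred, t) for t in targets) / len(targets)

# Hypothetical data: two equally plausible sharp frames for the same text,
# each with a single energy peak in a different frequency bin.
targets = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# The per-bin mean of the targets minimizes the expected MSE ...
blurred = [sum(column) / len(targets) for column in zip(*targets)]
# ... while committing to either sharp frame scores worse on average.
sharp = targets[0]

print("blurred frame:", blurred, "expected MSE:", expected_mse(blurred, targets))
print("sharp frame:  ", sharp, "expected MSE:", expected_mse(sharp, targets))
```

In this toy setting the averaged frame has a lower expected MSE than a sharp frame even though its peaks are only half as high, which is one common way to understand why an MSE objective alone tends to favor smeared spectra.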



Detailed Description of Embodiments

[0030] Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and they should be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

[0031] Spectrum generation is a very important part of speech synthesis. It converts a text sequence into a spectrum sequence, and the spectrum sequence serves as a bridge linking the input text sequence to the final synthesized audio.

[0032] The spectrum generation technology in the prior art usually uses the Tacotron model. The Tac...


Abstract

The application discloses a training method, device and electronic equipment for a speech spectrum generation model, and relates to the technical fields of speech synthesis and deep learning. The specific implementation is as follows: a first text sequence is input into the speech spectrum generation model to generate a simulated spectrum sequence corresponding to the first text sequence, and a first loss value of the simulated spectrum sequence is obtained according to a preset loss function; the simulated spectrum sequence corresponding to the first text sequence is input into an adversarial loss function model to obtain a second loss value of the simulated spectrum sequence, the adversarial loss function model being a generative adversarial network model; and the speech spectrum generation model is trained according to the first loss value and the second loss value. The adversarial loss function model can learn a loss function based on the generative adversarial network and, in combination with the preset loss function, train the speech spectrum generation model so that the spectrum sequences it generates are clearer.
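
The two loss values described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation claimed in the application: the preset loss is taken to be MSE, the discriminator is a hypothetical single-layer logistic scorer, the adversarial term uses the common `-log D(G(x))` form, and the two values are combined by a weighted sum with a hypothetical `adv_weight`. The abstract itself does not specify any of these choices.

```python
import math

def mse_loss(simulated, real):
    """First loss value: MSE between simulated and real spectrum sequences."""
    count = sum(len(frame) for frame in simulated)
    total = sum(
        (s - r) ** 2
        for sim_frame, real_frame in zip(simulated, real)
        for s, r in zip(sim_frame, real_frame)
    )
    return total / count

def discriminator(spectrum, weights, bias):
    """Hypothetical stand-in for the adversarial loss function model.

    Returns the estimated probability that the spectrum sequence is real,
    using a linear score averaged over frames followed by a sigmoid.
    """
    frame_scores = [sum(w * x for w, x in zip(weights, frame)) for frame in spectrum]
    score = sum(frame_scores) / len(frame_scores) + bias
    return 1.0 / (1.0 + math.exp(-score))

def adversarial_loss(simulated, weights, bias):
    """Second loss value (assumed form): -log D(simulated spectrum)."""
    return -math.log(discriminator(simulated, weights, bias))

def combined_loss(simulated, real, weights, bias, adv_weight=1.0):
    """Assumed combination: first loss plus weighted second loss."""
    first = mse_loss(simulated, real)
    second = adversarial_loss(simulated, weights, bias)
    return first + adv_weight * second, first, second

# Hypothetical one-frame sequences and discriminator parameters.
real = [[1.0, 0.0, 0.0, 0.0]]
simulated = [[0.5, 0.0, 0.5, 0.0]]
weights, bias = [1.0, -1.0, -1.0, -1.0], 0.0

total, first, second = combined_loss(simulated, real, weights, bias)
print("first loss:", first, "second loss:", second, "total:", total)
```

In a full training loop the generator parameters would be updated to reduce the combined value while the discriminator is trained separately to tell real spectra from simulated ones; both update steps are omitted here.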

Description

Technical field

[0001] The present application relates to the technical field of data processing, specifically to speech synthesis and deep learning, and more particularly to a training method, device and electronic equipment for a speech spectrum generation model.

Background

[0002] Spectrum generation is a very important technology in speech synthesis. The spectrum acts as a bridge linking the input text sequence to the final synthesized audio.

[0003] In the prior art, the speech spectrum generation model uses the mean square error (MSE) loss function to feed back the error of the generated spectrum. However, modeling based on the MSE loss function cannot reflect the nature of the spectrum, so the generated spectrum is very blurry. When the vocoder is trained on real, clear spectra, feeding it such blurry spectra will cause inconsistencies in the training and inference ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L13/08, G10L13/047, G10L13/04, G10L25/30
CPC: G10L13/08, G10L13/047, G10L25/30, G06N3/088, G10L13/027, G06N3/047, G06N3/045, G10L25/18
Inventors: 陈志杰, 孙涛, 贾磊
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD