
Speech synthesis model training method and device, electronic equipment and storage medium

A speech synthesis model training technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses the problems of low-quality generated speech, increased training cost and training time, and accumulated errors, with the effects of improving speech synthesis quality, reducing training cost, and eliminating accumulated errors.

Pending Publication Date: 2022-05-17
CLOUDMINDS BEIJING TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] However, the inventors of the present application have found that speech synthesis which first converts text information into specific acoustic features and then converts those acoustic features into audio waveform data is a two-stage technology. The conversion of text information into acoustic features is performed by an acoustic model, and the conversion of acoustic features into audio waveform data is performed by a vocoder. The acoustic model and the vocoder must be trained separately, and when the trained acoustic model and vocoder are connected in series to form a speech synthesis system, cumulative errors arise, which lowers the quality of the generated speech. If the acoustic model and the vocoder are instead spliced together before training, a large number of training samples are needed for the model to converge, which greatly increases training cost and training time.



Examples


Embodiment Construction

[0025] To make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that many technical details are provided in each embodiment so that readers can better understand the application. Even without these technical details, and with various changes and modifications based on the following embodiments, the technical solutions claimed in this application can still be realized. The division into the following embodiments is for convenience of description and should not limit the specific implementation of the present application; the embodiments can be combined and can refer to each other provided there is no contradiction.

[0026] An embodiment of the present application relates to a method for train...


Abstract

The embodiment of the invention relates to the technical field of natural language processing, and discloses a speech synthesis model training method and device, electronic equipment and a storage medium. The model comprises a generator and a discriminator. The training method comprises the following steps: obtaining a plurality of first audio data labeled with text information to generate a training sample set; inputting the text information into the generator and obtaining second audio data output by the generator; inputting the first audio data and the second audio data into the discriminator and obtaining a discrimination result output by the discriminator, where the discrimination result represents the degree of similarity between the first audio data and the second audio data; and iteratively training the speech synthesis model according to the discrimination result and a preset loss function. Because the method contains no intermediate conversion step, accumulated errors are eliminated and the speech synthesis quality of the model is improved. The model can be trained to convergence with only a small number of training samples, which reduces training cost.
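The training steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not specify the preset loss function, so the sketch assumes the standard binary cross-entropy GAN losses, and it assumes the discriminator scores each audio clip separately with a probability of being recorded audio. The names `bce`, `adversarial_losses`, `toy_generator` and `toy_discriminator` are hypothetical, and parameter updates are left out because they depend on the model framework.

```python
import math
from typing import Callable, List, Sequence, Tuple

Audio = Sequence[float]


def bce(prob: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one probability (assumed loss, not from the source)."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def adversarial_losses(
    text: str,
    first_audio: Audio,
    generator: Callable[[str], Audio],
    discriminator: Callable[[Audio], float],
) -> Tuple[float, float]:
    """Return (discriminator_loss, generator_loss) for one training sample."""
    # The generator maps text directly to audio, with no intermediate features.
    second_audio = generator(text)
    # The discriminator scores the recorded and the generated audio.
    d_real = discriminator(first_audio)
    d_fake = discriminator(second_audio)
    # Discriminator: push recorded audio toward 1 and generated audio toward 0.
    d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)
    # Generator: make generated audio look recorded to the discriminator.
    g_loss = bce(d_fake, 1.0)
    return d_loss, g_loss


def toy_generator(text: str) -> List[float]:
    """Hypothetical stand-in: one pseudo-sample per character."""
    return [(ord(ch) % 7) / 7.0 for ch in text]


def toy_discriminator(audio: Audio) -> float:
    """Hypothetical stand-in: sigmoid of the mean sample value."""
    mean = sum(audio) / len(audio) if audio else 0.0
    return 1.0 / (1.0 + math.exp(-mean))


# One pass over a tiny, made-up training sample set of (text, first audio) pairs.
samples = [("hello", [0.1, 0.4, 0.3, 0.2, 0.5]), ("hi", [0.2, 0.6])]
for text, first_audio in samples:
    d_loss, g_loss = adversarial_losses(text, first_audio, toy_generator, toy_discriminator)
    print(f"{text!r}: d_loss={d_loss:.4f} g_loss={g_loss:.4f}")
```

In an actual training run, the two losses would be minimized alternately over many iterations, updating the discriminator's parameters with `d_loss` and the generator's parameters with `g_loss`.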

Description

Technical field

[0001] The embodiments of the present application relate to the technical field of natural language processing, and in particular to a training method and device for a speech synthesis model, an electronic device and a storage medium.

Background

[0002] Speech synthesis technology converts text information into corresponding speech information. Traditional speech synthesis technology is generally divided into speech synthesis based on statistical parameter modeling (also known as parametric synthesis), speech synthesis based on unit selection and waveform splicing (also known as splicing synthesis), and speech synthesis based on neural network models. Among these, speech synthesis based on a neural network model first converts text information into specific acoustic features, such as the Mel spectrum, and then uses a vocoder to convert acoustic features such as the Mel spectrum ...


Application Information

IPC(8): G10L13/02, G10L13/08, G10L13/04, G10L25/30, G10L25/51, G10L19/16
CPC: G10L13/02, G10L13/08, G10L13/04, G10L25/30, G10L25/51, G10L19/16
Inventor 徐建明
Owner CLOUDMINDS BEIJING TECH CO LTD