Speech synthesis model training method and device, computer equipment and storage medium

A speech synthesis model training technology, applied in the field of computer processing, that addresses poor synthesized speech quality, hard-to-obtain data sets, and insufficient neural network training.

Pending Publication Date: 2020-05-08
UBTECH ROBOTICS CORP LTD
14 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

Training a neural-network-based speech synthesis model requires a large amount of text data, and such data sets are usually difficult to obtain. With a limited data set, the neural network is insufficiently trained and the quality of the synthesized speech is poor.



Examples


Detailed Description of Embodiments

[0030] To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it.

[0031] As shown in Figure 1, a training method for a speech synthesis model is proposed. The training method can be applied to a terminal or to a server; this embodiment takes application to a terminal as an example. The training method specifically includes the following steps:

[0032] Step 102, acquiring training text data and training speech features corresponding to the training text data.

[0033] Here, the training text data refers to the text data used for training the speech synthesis...
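Step 102 can be pictured with a minimal Python sketch, offered as an illustration rather than as the patent's implementation. It assumes the training text and the pre-extracted speech features are each stored under an utterance ID; the `TrainingSample` class, the `pair_text_with_features` function, and the frame-list layout of the features are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrainingSample:
    """Hypothetical container for one training example."""
    text: str                           # training text data
    speech_features: List[List[float]]  # assumed layout: frames x feature dimension


def pair_text_with_features(
    texts: Dict[str, str],
    features: Dict[str, List[List[float]]],
) -> List[TrainingSample]:
    """Pair each utterance's text with its speech features by utterance ID.

    Utterances that lack either the text or the features are skipped.
    """
    samples = []
    for utt_id in sorted(texts):
        if utt_id in features:
            samples.append(TrainingSample(texts[utt_id], features[utt_id]))
    return samples
```

Keying both sides by utterance ID keeps each text aligned with the features extracted from its own recording, which is what "corresponding" requires in this step.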


Abstract

The invention relates to a speech synthesis model training method. The method comprises: obtaining training text data and the training speech features corresponding to the training text data; obtaining training phoneme data corresponding to the training text data from the training text data; and training the speech synthesis model with the training text data and the training phoneme data as its input and the corresponding training speech features as its expected output, to obtain a target speech synthesis model. Because the training text data and the training phoneme data both serve as input to the speech synthesis model at the same time, the training data set is enriched and the quality and accuracy of the synthesized speech are improved. The invention further provides a speech synthesis model training device, computer equipment, and a storage medium.
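The data flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the toy lexicon, the function names `text_to_phonemes` and `build_training_example`, and the choice to present text and phonemes as two parallel sequences are all hypothetical, since the abstract does not say how the phonemes are derived or how the two inputs are combined.

```python
from typing import Dict, List, Tuple

# Toy pronunciation lexicon (illustrative entries only); a real system
# would use a full grapheme-to-phoneme tool or dictionary.
TOY_LEXICON: Dict[str, List[str]] = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def text_to_phonemes(text: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Derive phoneme data from text data by lexicon lookup."""
    phonemes: List[str] = []
    for word in text.lower().split():
        # Assumption: unknown words fall back to their letters
        # so that no part of the text is silently dropped.
        phonemes.extend(lexicon.get(word, list(word)))
    return phonemes


def build_training_example(
    text: str,
    speech_features: List[List[float]],
    lexicon: Dict[str, List[str]],
) -> Tuple[Tuple[List[str], List[str]], List[List[float]]]:
    """Return ((characters, phonemes), expected_output) for one utterance.

    Both the text and its phonemes form the model input; the speech
    features are the expected output used as the training target.
    """
    characters = list(text.lower())
    phonemes = text_to_phonemes(text, lexicon)
    return (characters, phonemes), speech_features
```

For example, `build_training_example("Hello world", feats, TOY_LEXICON)` yields the character sequence and the phoneme sequence as a pair of inputs, with `feats` passed through unchanged as the target.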

Description

Technical field

[0001] The invention relates to the field of computer processing, and in particular to a training method and device, computer equipment, and storage medium for a speech synthesis model.

Background

[0002] Speech synthesis models are systems that process text input and generate human-like speech. As deep learning technology has matured and computer performance has improved, deep neural networks have come to be widely used for training speech synthesis models. Training a neural-network-based speech synthesis model requires a large amount of text data, and such data sets are usually difficult to obtain; with a limited data set, the neural network is insufficiently trained and the quality of the synthesized speech is poor.

[0003] Therefore, a training method for a speech synthesis model that yields good synthesized speech quality is urgently needed.

Summary of the invention

[0004] Based on this, it is necessary to provide a train...


Application Information

IPC(8): G10L13/02, G10L13/04, G10L25/30, G06N3/08, G06N3/04
CPC: G10L13/02, G10L13/04, G10L25/30, G06N3/08, G06N3/044, G06N3/045
Inventors: 钱程浩, 黄东延, 熊友军
Owner: UBTECH ROBOTICS CORP LTD