Speech synthesis model training method and device, computer equipment and storage medium

A speech synthesis model training technology, applied in the field of computer processing, that addresses poor synthesized speech quality, hard-to-obtain data sets, and insufficient neural network training.

Pending Publication Date: 2020-05-08

UBTECH ROBOTICS CORP LTD

Problems solved by technology

Training a neural-network-based speech synthesis model requires a large amount of text data, and such data sets are usually difficult to obtain. With a limited data set, the neural network is insufficiently trained and the quality of the synthesized speech is poor.

Embodiment Construction

[0030] To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here only explain the invention and do not limit it.

[0031] As shown in Figure 1, a training method for a speech synthesis model is proposed. The method can be applied to a terminal or to a server; this embodiment uses application to a terminal as the example. The training method specifically includes the following steps:

[0032] Step 102, acquiring training text data and training speech features corresponding to the training text data.

[0033] Wherein, the training text data refers to the text data used for training the speech synthesis...
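Step 102 can be pictured as pairing each training sentence with the speech features extracted from its recording. The following is a minimal sketch under stated assumptions, not the patent's implementation: the `TrainingPair` type, the `pair_text_with_features` helper, the utterance IDs, and the tiny feature matrices are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# One frame of speech features is a list of floats; an utterance is a list of frames.
Features = List[List[float]]


@dataclass(frozen=True)
class TrainingPair:
    """Hypothetical container: one training text with its speech features."""
    utterance_id: str
    text: str
    speech_features: Features


def pair_text_with_features(
    texts: Dict[str, str], features: Dict[str, Features]
) -> List[TrainingPair]:
    """Keep only utterances that have both a transcript and speech features."""
    pairs: List[TrainingPair] = []
    for utt_id in sorted(texts):
        if utt_id in features:
            pairs.append(TrainingPair(utt_id, texts[utt_id], features[utt_id]))
    return pairs


# Illustrative data only (assumed, not taken from the patent).
texts = {"utt1": "hello world", "utt2": "good morning", "utt3": "no audio"}
features = {"utt1": [[0.1, 0.2], [0.3, 0.4]], "utt2": [[0.5, 0.6]]}

pairs = pair_text_with_features(texts, features)
print([p.utterance_id for p in pairs])
```

Utterances without matching speech features (here `utt3`) are dropped, so every remaining text has a corresponding expected output.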

Abstract

The invention relates to a speech synthesis model training method. The method comprises: obtaining training text data and training speech features corresponding to the training text data; obtaining training phoneme data corresponding to the training text data according to the training text data; and training the speech synthesis model, with the training text data and the training phoneme data as its input and the training speech features corresponding to the training text data as its expected output, to obtain a target speech synthesis model. Because the training text data and the training phoneme data both serve as input to the speech synthesis model, the training data set is enriched, and the quality and accuracy of the synthesized speech are improved. The invention further provides a speech synthesis model training device, computer equipment, and a storage medium.
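The data flow in the abstract can be sketched as follows: phoneme data is derived from the text, both are supplied together as the model input, and the speech features are the expected output. This is an illustrative sketch only. The toy lexicon, the `<unk>` fallback token, and the function names `text_to_phonemes` and `build_example` are assumptions; the source does not specify how phonemes are derived or how the two inputs are encoded.

```python
from typing import Dict, List, Tuple

# Hypothetical pronunciation lexicon. A real system would use a full
# grapheme-to-phoneme converter; the source does not specify one.
TOY_LEXICON: Dict[str, List[str]] = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

Features = List[List[float]]
ModelInput = Tuple[List[str], List[str]]


def text_to_phonemes(text: str) -> List[str]:
    """Derive training phoneme data from training text data by lexicon lookup."""
    phonemes: List[str] = []
    for word in text.lower().split():
        # Words missing from the toy lexicon map to a placeholder token.
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes


def build_example(text: str, speech_features: Features) -> Tuple[ModelInput, Features]:
    """Return ((characters, phonemes), expected_output) for one utterance."""
    model_input = (list(text), text_to_phonemes(text))
    return model_input, speech_features


# Illustrative values only (assumed, not taken from the patent).
(chars, phonemes), target = build_example("hello world", [[0.1, 0.2], [0.3, 0.4]])
print(phonemes)
```

Keeping the character sequence and the phoneme sequence as two separate inputs mirrors the abstract's point that both are fed to the model at the same time, rather than one replacing the other.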

Description

Technical field

[0001] The invention relates to the field of computer processing, and in particular to a training method, device, computer equipment, and storage medium for a speech synthesis model.

Background

[0002] Speech synthesis models are systems that process text input and generate human-like speech. As deep learning technology has matured and computer performance has improved, deep neural networks have become widely used for training speech synthesis models. Training a neural-network-based speech synthesis model requires a large amount of text data, and such data sets are usually difficult to obtain; with a limited data set, the neural network is insufficiently trained and the synthesized speech is of poor quality.

[0003] A training method that yields a speech synthesis model with good synthesized speech quality is therefore urgently needed.

Contents of the invention

[0004] Based on this, it is necessary to provide a train...

Application Information

IPC(8): G10L13/02, G10L13/04, G10L25/30, G06N3/08, G06N3/04

CPC: G10L13/02, G10L13/04, G10L25/30, G06N3/08, G06N3/044, G06N3/045

Inventors: 钱程浩, 黄东延, 熊友军

Owner: UBTECH ROBOTICS CORP LTD
