Speech synthesis model training method and device, equipment, storage medium and product

A speech synthesis model training technology, applied in the computer field, that addresses the time-consuming, labor-intensive, and deployment-unfriendly process of updating all model parameters, and thereby makes deployment easier.

Pending Publication Date: 2022-05-10

上海鱼尔网络科技有限公司


AI Technical Summary

Problems solved by technology

[0004] Updating the specified timbre of a trained model requires updating all of its parameters, which is time-consuming and laborious and is not conducive to deployment.


Examples


[0095] In one embodiment, a speech synthesis method includes: acquiring text information; inputting the text information into a target model obtained through transfer learning to obtain a target speech corresponding to the text information.
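The two steps in [0095] (acquire text, then pass it to the target model) can be sketched as a small interface. This is only an illustration under stated assumptions: the `TargetModel` protocol, the `synthesize` method name, and the `SilentStubModel` placeholder are hypothetical and do not come from the patent; the stub returns silence rather than real speech.

```python
from typing import List, Protocol


class TargetModel(Protocol):
    """Hypothetical interface for a target model obtained through transfer learning."""

    def synthesize(self, text: str) -> List[float]:
        ...


class SilentStubModel:
    """Placeholder model (assumption): emits 0.1 s of silence per character."""

    sample_rate = 16000  # samples per second, chosen arbitrarily for the sketch

    def synthesize(self, text: str) -> List[float]:
        return [0.0] * (len(text) * self.sample_rate // 10)


def text_to_speech(text: str, model: TargetModel) -> List[float]:
    """Acquire text information and return the target speech produced by the model."""
    text = text.strip()
    if not text:
        raise ValueError("text information must not be empty")
    return model.synthesize(text)


samples = text_to_speech("hello", SilentStubModel())
```

Any object with a compatible `synthesize` method could stand in for the stub, which keeps the calling code independent of how the target model was trained.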

[0096] In a specific embodiment, the method for training a speech synthesis model involves a terminal, a server, and a data storage system.

[0097] The first training data is speech synthesis training data from at least two different people. Each person's speech synthesis training data includes multiple pieces of speech data and text information, with the speech data corresponding one-to-one to the text information; each piece of speech data is shorter than 10 s, and each person's speech data totals at least 10 h. Optionally, the speech data may be a recording file corresponding to the text information.
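The constraints on the first training data in [0097] can be checked mechanically before training. The following is a minimal sketch, not part of the patent: the `Utterance` record, its field names, and the `check_first_training_data` function are hypothetical, and durations are assumed to be known in seconds.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

MAX_UTTERANCE_SECONDS = 10.0              # each piece of speech data is shorter than 10 s
MIN_TOTAL_SECONDS_PER_SPEAKER = 10 * 3600  # each person's speech data totals at least 10 h
MIN_SPEAKERS = 2                           # at least two different people


@dataclass(frozen=True)
class Utterance:
    """One speech/text pair (hypothetical record layout)."""
    speaker: str
    text: str
    duration_s: float


def check_first_training_data(utterances: List[Utterance]) -> List[str]:
    """Return a list of human-readable constraint violations (empty if none)."""
    problems: List[str] = []
    totals: Dict[str, float] = defaultdict(float)
    for i, utt in enumerate(utterances):
        totals[utt.speaker] += utt.duration_s
        if not utt.text.strip():
            problems.append(f"utterance {i}: no text information for this speech data")
        if utt.duration_s >= MAX_UTTERANCE_SECONDS:
            problems.append(f"utterance {i}: {utt.duration_s:.1f} s is not shorter than 10 s")
    if len(totals) < MIN_SPEAKERS:
        problems.append(f"found {len(totals)} speaker(s), need at least {MIN_SPEAKERS}")
    for speaker, total in sorted(totals.items()):
        if total < MIN_TOTAL_SECONDS_PER_SPEAKER:
            problems.append(f"speaker {speaker}: {total / 3600:.2f} h of speech, need at least 10 h")
    return problems
```

Returning a list of problems instead of raising on the first one lets a data-preparation script report every violation in a single pass.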

[0098] The second training data is speech synthesis training data of customized timbre. The speech synthesis training data of custo...


Abstract

The invention relates to a speech synthesis model training method and device, equipment, a storage medium, and a product. The method comprises the following steps: acquiring first training data; training on the first training data to obtain a basic model, where the basic model comprises a timbre learning module that, during training on the first training data, distinguishes the differences between timbres and obtains model parameters corresponding to each timbre; acquiring second training data; and performing transfer learning on the trained basic model according to the second training data to obtain a target model, where the transfer learning modifies only the model parameters of the timbre learning module. With this method, a speech synthesis model can be trained without updating all parameters.
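The idea that transfer learning modifies only the timbre learning module's parameters can be illustrated with a minimal sketch. It is not the patent's implementation: the parameter names (`encoder.weight`, `decoder.weight`, `timbre.embedding`), the `timbre.` prefix convention, and the plain gradient-descent update are assumptions made for illustration, and the gradient values are made up.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def transfer_step(params: Params, grads: Params, lr: float,
                  trainable_prefix: str = "timbre.") -> Params:
    """Apply one gradient-descent step, updating only parameters under the prefix.

    Parameters outside the (hypothetical) timbre learning module are copied
    unchanged, even when a gradient is available for them.
    """
    updated: Params = {}
    for name, values in params.items():
        if name.startswith(trainable_prefix) and name in grads:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)
    return updated


# Toy "basic model" parameters after training on the first training data (made-up values).
basic_model: Params = {
    "encoder.weight": [0.5, -0.2],
    "decoder.weight": [1.0],
    "timbre.embedding": [0.1, 0.3],
}

# Toy gradients computed from the second training data (made-up values).
grads: Params = {
    "encoder.weight": [1.0, 1.0],
    "decoder.weight": [1.0],
    "timbre.embedding": [0.5, -0.5],
}

target_model = transfer_step(basic_model, grads, lr=0.1)
```

After the step, `encoder.weight` and `decoder.weight` in `target_model` are identical to those in `basic_model`; only `timbre.embedding` has moved. In a deep-learning framework the same effect is usually achieved by freezing the other parameters or by passing only the timbre module's parameters to the optimizer.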

Description

Technical field

[0001] The present application relates to the field of computer technology, in particular to a training method, device, equipment, storage medium, and product for a speech synthesis model.

Background

[0002] With the development of computer technology, speech synthesis technology has appeared.

[0003] Speech synthesis technology enables a trained model to synthesize speech data in a specified timbre, such as the timbre of Zhang San or Li Si. To synthesize speech data in a newly added specified timbre, a large amount of training speech data for that timbre must be collected in order to update all the parameters of the previously trained model.

[0004] Updating the specified timbre of a trained model requires updating all of its parameters, which is time-consuming and laborious and is not conducive to deployment.

Conte...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L13/10, G10L25/18, G10L25/30

CPC: G10L13/10, G10L25/30, G10L25/18

Inventor: 郑振鹏

Owner: 上海鱼尔网络科技有限公司
