Model training method, speech synthesis method, device and computer program product

A computer program and model training technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses low model training efficiency and the resulting loss of model conversion accuracy. It removes manual intervention, saves time, achieves full automation, and improves training efficiency.

Pending Publication Date: 2022-04-12

TENCENT MUSIC ENTERTAINMENT TECH SHENZHEN CO LTD

Problems solved by technology

[0004] To find the conversion errors of a text-to-phoneme conversion model and train the model to correct them, the traditional technology relies mainly on manual exhaustive enumeration. Corresponding text is first written by hand according to the target text, and the characters in that text are then manually labeled with phonemes to obtain model training samples, which are used to train the model. Because manually labeled training samples are scarce, this method tends to make model training inefficient, which in turn affects the accuracy of the model's conversion.

Embodiment Construction

[0032] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0033] The model training method and speech synthesis method provided in the embodiments of the present application can be applied to computer devices such as terminals and servers. Terminals can be, but are not limited to, personal computers, laptops, smartphones, tablet computers, IoT devices and portable wearable devices. IoT devices can be smart speakers, smart TVs, smart air conditioners, smart vehicle devices, etc. Portable wearable devices can be smart watches, smart bracelets, head-mounted devices, and the like. The server can be implemented b...

Abstract

The invention relates to the field of intelligent speech and provides a model training method, a speech synthesis method, a device and a computer program product that can improve the training efficiency of a text-to-phoneme conversion model. The method comprises the following steps: acquiring, from a training database, a text sequence containing target characters with multiple pronunciations together with the audio material of the pronunciation corresponding to that text sequence; extracting the phoneme sequence corresponding to the audio material through a speech recognition model and taking it as the phoneme sequence labeling result; inputting the text sequence into the text-to-phoneme conversion model to be trained to obtain the phoneme sequence prediction result it outputs; comparing the phoneme sequence labeling result with the phoneme sequence prediction result to obtain a phoneme sequence prediction deviation; and adjusting the parameters of the model to be trained according to that deviation. When a training completion condition is satisfied, the trained text-to-phoneme conversion model is obtained.
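The steps in the abstract can be sketched as a small training loop. This is only an illustrative toy under stated assumptions, not the patented implementation: the speech recognition model is replaced by a hypothetical lookup table (`FAKE_ASR_OUTPUT`), pinyin syllables stand in for phonemes, the model is a simple perceptron-style scorer (`ToyG2P`), and the completion condition is assumed to be "zero deviation over one pass, or a maximum number of epochs". None of these names or choices come from the patent text.

```python
from collections import defaultdict

# Hypothetical stand-in for the speech recognition model: maps an audio clip id
# to the phoneme sequence that a real recognizer would extract from the audio.
FAKE_ASR_OUTPUT = {
    "clip_1": ["yin1", "yue4"],   # audio of 音乐 (music)
    "clip_2": ["kuai4", "le4"],   # audio of 快乐 (happy)
}

def recognize_phonemes(audio_id):
    """Placeholder for running a speech recognition model on audio material."""
    return FAKE_ASR_OUTPUT[audio_id]

# Candidate readings per character; 乐 is the polyphonic target character.
CANDIDATES = {"音": ["yin1"], "乐": ["le4", "yue4"], "快": ["kuai4"]}

class ToyG2P:
    """Assumed toy text-to-phoneme model: scores (context feature, phoneme) pairs."""

    def __init__(self):
        self.weights = defaultdict(float)

    def _features(self, text, i):
        prev_char = text[i - 1] if i > 0 else "<s>"
        next_char = text[i + 1] if i + 1 < len(text) else "</s>"
        return [("char", text[i]), ("prev", prev_char, text[i]),
                ("next", text[i], next_char)]

    def predict(self, text):
        result = []
        for i, ch in enumerate(text):
            feats = self._features(text, i)
            # Pick the candidate reading with the highest summed weight.
            best = max(CANDIDATES[ch],
                       key=lambda p: sum(self.weights[(f, p)] for f in feats))
            result.append(best)
        return result

    def update(self, text, labels, predicted):
        # Perceptron update: reward the labeled phoneme, penalize the wrong one.
        for i, (gold, pred) in enumerate(zip(labels, predicted)):
            if gold != pred:
                for f in self._features(text, i):
                    self.weights[(f, gold)] += 1.0
                    self.weights[(f, pred)] -= 1.0

def train(model, database, max_epochs=10):
    for _ in range(max_epochs):
        deviation = 0
        for text, audio_id in database:
            labels = recognize_phonemes(audio_id)      # labeling result
            predicted = model.predict(text)            # prediction result
            deviation += sum(g != p for g, p in zip(labels, predicted))
            model.update(text, labels, predicted)      # adjust parameters
        if deviation == 0:                             # assumed completion condition
            break
    return model

TRAINING_DATABASE = [("音乐", "clip_1"), ("快乐", "clip_2")]
model = train(ToyG2P(), TRAINING_DATABASE)
print(model.predict("音乐"), model.predict("快乐"))
```

The point of the sketch is that the labels come from recognized audio rather than from manual annotation, so adding a new training sample only requires a text and its recording.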

Description

Technical field

[0001] The present application relates to the field of intelligent speech technology, in particular to a model training method, a speech synthesis method, a device and a computer program product.

Background

[0002] Speech recognition and speech synthesis are currently widely used intelligent speech technologies. Speech recognition converts human speech into text, and speech synthesis is the reverse process: a piece of text is input and the corresponding speech audio is generated.

[0003] For speech synthesis, when a text is entered, the speech synthesis system converts it into a sequence of phonemes and generates audio from that sequence. For polyphonic characters, although the text-to-phoneme conversion model in the speech synthesis system refers to word segmentation results and context information to determine the pronunciation of the text and obtain the corresponding phoneme, conversion errors still occur....
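To make the polyphonic-character problem concrete, here is a minimal sketch of how a conversion error can arise. It assumes a deliberately naive, context-free lookup (`DEFAULT_READING`, a hypothetical table with one reading per character) and uses pinyin syllables as stand-ins for phonemes; the conversion model described in the application uses word segmentation and context, so this shows only the failure mode, not that model.

```python
# Hypothetical default readings: one pronunciation per character, no context.
DEFAULT_READING = {"音": "yin1", "乐": "le4", "快": "kuai4"}

def naive_text_to_phonemes(text):
    """Context-free lookup: every character gets its single default reading."""
    return [DEFAULT_READING[ch] for ch in text]

def conversion_errors(text, reference):
    """Return (position, character, predicted, expected) for each mismatch."""
    predicted = naive_text_to_phonemes(text)
    return [(i, text[i], p, r)
            for i, (p, r) in enumerate(zip(predicted, reference))
            if p != r]

# 乐 is read "le4" in 快乐 but "yue4" in 音乐, so the lookup fails on the latter.
errors = conversion_errors("音乐", ["yin1", "yue4"])
print(errors)
```

Any phoneme sequence produced with such an error is passed on to audio generation, which is why the mispronunciation is audible in the synthesized speech.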

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L13/02, G10L13/08

Inventor: 谭志力

Owner: TENCENT MUSIC ENTERTAINMENT TECH SHENZHEN CO LTD
