Training method and device for prosody model used for speech synthesis

A speech synthesis and model training technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses prosodic pause prediction errors that make synthesized speech neither smooth nor natural and degrade the user experience, yielding a more complete prosody model, more accurate prosody prediction, and fluent, natural prosodic pauses.

Active Publication Date: 2015-08-26

BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Cites: 4; Cited by: 14


AI Technical Summary

Problems solved by technology

Incorrect prediction of prosodic pauses makes the synthesized sentence sound neither smooth nor natural, resulting in a poor user experience.


Embodiment Construction

[0020] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0021] The prosodic model training method and device for speech synthesis and the speech synthesis method and device according to the embodiments of the present invention will be described below with reference to the accompanying drawings.

[0022] Figure 1 is a flowchart of a prosodic model training method for speech synthesis according to an embodiment of the present invention.

[0023] As shown in Figure 1, the prosodic model training method for speech synthesis may include:

[0024] S1. Extract text features and tag f...


Abstract

The invention discloses a method and device for training a prosody model used for speech synthesis. The training method comprises the following steps: S1, extracting text features and marker features corresponding to the segmented words in a training corpus text; S2, generalizing the segmented words in the training corpus text on the basis of a Chinese thesaurus; S3, training the prosody model according to the text features, the marker features and the generalized segmented words. Training on the text features, the marker features and the generalized segmented words makes the prosody model more complete and thereby improves the accuracy of prosody prediction.
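Steps S1 to S3 can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and does not come from the patent: the toy thesaurus `TOY_THESAURUS`, the feature set (generalized word plus part-of-speech tag), the pause labels standing in for the marker features, and the majority-vote lookup table standing in for the prosody model, whose actual type this excerpt does not specify.

```python
from collections import Counter, defaultdict

# Hypothetical toy thesaurus: maps a word to a category code (assumption).
TOY_THESAURUS = {
    "apple": "FRUIT",
    "pear": "FRUIT",
    "beijing": "CITY",
    "shanghai": "CITY",
}


def generalize(word):
    """S2: replace a word with its thesaurus category; unknown words stay as-is."""
    return TOY_THESAURUS.get(word, word)


def featurize(words, pos_tags, i):
    """S1 + S2: a hypothetical feature key for the word at position i."""
    return (generalize(words[i]), pos_tags[i])


def train(corpus):
    """S3: count pause labels per feature key and keep the most frequent one.

    corpus is a list of (words, pos_tags, pause_labels) triples. The lookup
    table is a stand-in for a real prosody model.
    """
    counts = defaultdict(Counter)
    for words, pos_tags, labels in corpus:
        for i, label in enumerate(labels):
            counts[featurize(words, pos_tags, i)][label] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def predict(model, words, pos_tags, default="none"):
    """Predict a pause label after each word; unseen keys fall back to default."""
    return [
        model.get(featurize(words, pos_tags, i), default)
        for i in range(len(words))
    ]


corpus = [
    (["i", "like", "apple"], ["r", "v", "n"], ["none", "none", "pause"]),
    (["visit", "beijing", "today"], ["v", "n", "t"], ["none", "pause", "none"]),
]
model = train(corpus)

# "pear" never occurs in the corpus, but it shares the FRUIT category with "apple".
prediction = predict(model, ["i", "like", "pear"], ["r", "v", "n"])
print(prediction)
```

In this sketch the word "pear" is absent from the training corpus, yet it receives the label learned for "apple" because both map to the same category. Without the `generalize` step the key would be unseen and the prediction would fall back to the default label.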

Description

Technical field

[0001] The invention relates to the technical field of text-to-speech conversion, and in particular to a prosody model training method and device for speech synthesis.

Background

[0002] Speech synthesis, also known as text-to-speech technology, converts text information into speech and reads it aloud. As science and technology advance, speech synthesis is applied ever more widely, for example in news and information broadcasting and in audio novels. In daily life, text messages, emails and other information can also be converted to speech, giving users an additional way to obtain information.

[0003] In a speech synthesis system, prosody prediction is the basis of the whole system: if a prosodic pause is predicted wrongly, the quality of the synthesized speech suffers directly. For example: the synthesized text is "if a passer-by handed it an emp...


Application Information


IPC(8): G10L13/10

Inventor: 徐扬凯, 李秀林

Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
