Rhythm-level prediction model generation method and apparatus, and rhythm-level prediction method and apparatus

A prosodic level prediction model technology, applied to prosodic level prediction model generation and prosodic level prediction. It addresses the small scale of fine-labeled data sets, the resulting loss of prosodic level prediction accuracy, and the large investment that annotation requires, so as to expand the scale of the training data, improve prediction accuracy, and improve the effect of speech synthesis.

Active Publication Date: 2015-12-23

BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

7 Cites, 22 Cited by

AI Technical Summary

Problems solved by technology

Because a fine-labeled data set requires professional annotation, the labeling cycle is long and the required investment is large, so fine-labeled data sets are often small. A prosodic level prediction model trained on such a set is therefore inaccurate, which lowers the accuracy of prosodic level prediction and, in turn, degrades the effect of speech synthesis.

Embodiment Construction

[0025] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals denote the same or similar modules or modules having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.

[0026] Figure 1 is a schematic flowchart of a method for generating a prosodic level prediction model proposed by an embodiment of the present invention. The method includes:

[0027] S11: Generate an initial prosodic level prediction model based on the fine-labeled data set.

[0028] Specifically, Figure 2 is a schematic diagram of the training process and prediction proce...

Abstract

The invention provides a rhythm-level prediction model generation method and apparatus, and a rhythm-level prediction method and apparatus. The rhythm-level prediction model generation method comprises: generating an initial rhythm-level prediction model from a precisely marked data set; collecting speech and the corresponding text data, detecting feature information in the speech, and adding the feature information at the corresponding positions of the text data; using the initial rhythm-level prediction model to carry out rhythm-level prediction on the text data with the added feature information, obtaining an initial rhythm-level prediction result; and training on the precisely marked data set together with the initial rhythm-level prediction result to generate an updated rhythm-level prediction model, which is then applied to rhythm-level prediction for speech synthesis. The method improves the accuracy of the generated rhythm-level prediction model, and thus the accuracy of rhythm-level prediction and the effect of speech synthesis.
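The four steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the visible text does not say which model type, which speech features, or which label inventory is used. The count-based classifier, the use of detected pauses as the feature information, the integer levels 0 to 3, and all function names (`train`, `predict`, `add_pause_features`, `build_updated_model`) are hypothetical.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each level follows each (word, paused) feature."""
    model = defaultdict(Counter)
    for (word, paused), level in examples:
        model[(word, paused)][level] += 1
        model[(None, paused)][level] += 1  # back-off entry that ignores the word
    return model


def predict(model, word, paused):
    """Return the most frequent level, backing off to the pause flag alone."""
    for key in ((word, paused), (None, paused)):
        if key in model:
            return model[key].most_common(1)[0][0]
    return 0  # assumed default: no prosodic boundary


def add_pause_features(words, pause_positions):
    """Attach the detected-pause flag to the word at each position."""
    return [(word, i in pause_positions) for i, word in enumerate(words)]


def build_updated_model(fine_labeled, speech_corpus):
    """Train an initial model, label the speech-derived text, then retrain."""
    initial = train(fine_labeled)                       # initial model
    auto_labeled = []
    for words, pause_positions in speech_corpus:        # speech + text data
        for word, paused in add_pause_features(words, pause_positions):
            level = predict(initial, word, paused)      # initial prediction
            auto_labeled.append(((word, paused), level))
    updated = train(fine_labeled + auto_labeled)        # updated model
    return initial, updated


# Toy data (invented for illustration): features paired with assumed levels 0-3.
fine_labeled = [(("the", False), 0), (("is", False), 0), (("today", False), 1),
                (("weather", True), 3), (("nice", True), 3)]
# Each item: the words of an utterance and the indices where a pause was detected.
speech_corpus = [(["tomorrow", "rain", "is", "likely"], {1, 3})]

initial, updated = build_updated_model(fine_labeled, speech_corpus)
print(predict(updated, "rain", True))
```

In this sketch the updated model gains word-specific entries for words that never appeared in the small hand-labeled set, which is one way to read the abstract's claim that the training data is enlarged without further manual annotation.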

Description

Technical field

[0001] The invention relates to the technical field of speech processing, in particular to a method and device for generating a prosodic level prediction model and a method and device for prosodic level prediction.

Background

[0002] Speech synthesis, also known as text-to-speech (TTS) technology, can convert any text information into standard, fluent speech in real time, which is equivalent to installing an artificial mouth on the machine. A key step in speech synthesis is prosodic prediction, which can be subdivided into prosodic level prediction, duration prediction, and pitch prediction. In prosodic level prediction, the prosodic level prediction model generated in the training stage is used to predict the prosodic level of the input text after text processing, and the resulting prosodic level prediction affects the effect of speech synthesis.

[0003] In the prior art, the prosodic level prediction model is generated after training the tra...

Application Information

IPC(8): G10L13/10

Inventor: 李秀林, 张辉, 杨鹏, 徐扬凯, 白锦峰, 付晓寅

Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
