Rhythm-level prediction model generation method and apparatus, and rhythm-level prediction method and apparatus

A prosodic level prediction model technology, applied in the field of prosodic level prediction model generation and prosodic level prediction. It addresses the problems that fine-labeled data sets are small in scale and expensive to produce, which limits the accuracy of prosodic level prediction. The effect is to expand the scale of the training data, improve prediction accuracy, and improve the quality of speech synthesis.

Active Publication Date: 2015-12-23
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

However, since the fine-labeled data set requires professional annotation, the labeling cycle is long, and the investment required is large, the scale of the fine-labeled data set is of



Examples


Embodiment Construction

[0025] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals denote the same or similar modules or modules having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.

[0026] FIG. 1 is a schematic flowchart of a method for generating a prosodic level prediction model according to an embodiment of the present invention. The method includes:

[0027] S11: Generate an initial prosodic level prediction model based on the fine-labeled data set.

[0028] Specifically, FIG. 2 is a schematic diagram of the training process and prediction proce...



Abstract

The invention provides a rhythm-level (prosodic level) prediction model generation method and apparatus, and a rhythm-level prediction method and apparatus. The model generation method comprises: generating an initial rhythm-level prediction model from a fine-labeled data set; collecting speech and the corresponding text data, detecting feature information in the speech, and adding the feature information at the corresponding positions in the text data; using the initial rhythm-level prediction model to perform rhythm-level prediction on the text data with the added feature information, obtaining an initial rhythm-level prediction result; and training on the fine-labeled data set together with the initial rhythm-level prediction result to generate an updated rhythm-level prediction model, which is then used for rhythm-level prediction in speech synthesis. The method improves the accuracy of the generated rhythm-level prediction model, which in turn improves the accuracy of rhythm-level prediction and the quality of the synthesized speech.
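
The four steps in the abstract follow a self-training pattern, which can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the excerpt does not say which model or which speech feature is used, so the frequency-count model, the use of detected pauses as the feature information, and the label names `PW`, `PPH` and `IPH` are all hypothetical stand-ins. Audio processing is not shown; pause positions are supplied directly as if a detector had already produced them.

```python
from collections import Counter, defaultdict


def train(samples):
    """Toy model (assumption): count labels per (token, pause) pair and per pause flag."""
    by_feature = defaultdict(Counter)
    by_pause = defaultdict(Counter)
    overall = Counter()
    for sentence in samples:
        for token, pause, label in sentence:
            by_feature[(token, pause)][label] += 1
            by_pause[pause][label] += 1
            overall[label] += 1
    return by_feature, by_pause, overall


def predict(model, sentence):
    """Label each (token, pause) item, backing off from the pair to the pause flag."""
    by_feature, by_pause, overall = model
    labeled = []
    for token, pause in sentence:
        if (token, pause) in by_feature:
            counts = by_feature[(token, pause)]
        elif pause in by_pause:
            counts = by_pause[pause]
        else:
            counts = overall
        labeled.append((token, pause, counts.most_common(1)[0][0]))
    return labeled


def add_feature_info(tokens, pause_positions):
    """Attach the speech-derived feature (here: a pause flag) to each text position."""
    return [(tok, i in pause_positions) for i, tok in enumerate(tokens)]


def generate_updated_model(fine_data, collected):
    # Step 1: initial model from the fine-labeled data set.
    initial = train(fine_data)
    # Steps 2-3: add feature information to the collected text, then predict.
    auto = [predict(initial, add_feature_info(toks, pauses))
            for toks, pauses in collected]
    # Step 4: retrain on the fine-labeled data plus the initial predictions.
    updated = train(fine_data + auto)
    return initial, auto, updated


# Hypothetical data: (token, pause_detected, prosodic_level_label).
fine_data = [
    [("we", False, "PW"), ("like", False, "PW"), ("tea", True, "IPH")],
    [("they", False, "PW"), ("drink", True, "PPH"), ("tea", True, "IPH")],
]
# Collected text with pause positions assumed to come from the paired speech.
collected = [(["cats", "sleep"], {1})]

initial, auto, updated = generate_updated_model(fine_data, collected)
print(auto)
```

The point of the sketch is the data flow: the automatically labeled sentences enlarge the training set without further manual annotation, so the updated model has seen token and pause combinations that the initial model had not.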

Description

Technical field

[0001] The invention relates to the technical field of speech processing, and in particular to a method and apparatus for generating a prosodic level prediction model and a method and apparatus for prosodic level prediction.

Background

[0002] Speech synthesis, also known as text-to-speech (TTS) technology, converts arbitrary text into standard, fluent speech in real time, which is equivalent to giving the machine an artificial mouth. A key step in speech synthesis is prosodic prediction, which can be subdivided into prosodic level prediction, duration prediction, and pitch prediction. In prosodic level prediction, the prosodic level prediction model generated in the training stage is applied to the input text after text processing to obtain a prosodic level prediction result, and the quality of this result affects the quality of the synthesized speech.

[0003] In the prior art, the prosodic level prediction model is generated after training the tra...


Application Information

IPC(8): G10L13/10
Inventors: 李秀林, 张辉, 杨鹏, 徐扬凯, 白锦峰, 付晓寅
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD