Rhythmical pause information determination method and device

A method and device for determining prosodic pause information, applied in speech synthesis. It addresses the inconsistency between the prosody training data used by the acoustic model and that used by the prosody model, and aims to improve prosodic rhythm, synthesis fluency, and the naturalness of the synthesized speech.

Active Publication Date: 2016-01-06
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

This method resolves the inconsistency between the prosody training data used by the acoustic model and that used by the prosody model, which improves prosodic rhythm and synthesis fluency. Because it uses an adaptive prosody prediction model specific to each speaker, the synthesized speech also sounds more natural when switching between multiple speakers.

Embodiment Construction

[0022] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.

[0023] FIG. 1 is a flowchart of an embodiment of the method for determining prosodic pause information according to the present invention. As shown in FIG. 1, the method for determining the prosodic pause information may include:

[0024] Step 101, extracting the prosody prediction features of the text to be synthesized.

[0025] Specifically, extracting the prosodic ...

Abstract

The invention provides a method and a device for determining prosodic pause information. The method comprises the steps of extracting prosody prediction features of a text to be synthesized; selecting an adaptive prosody prediction model corresponding to a selected speaker; and inputting the prosody prediction features of the text into the adaptive prosody prediction model corresponding to the selected speaker so as to determine the prosodic pause information of the text. This technical scheme resolves the inconsistency between the prosody training data used for the acoustic model and the prosody training data used for the prosody model, and it improves prosodic rhythm and synthesis fluency. Moreover, because the adaptive prosody prediction model corresponding to the selected speaker is used, the synthesized speech sounds more natural when switching between multiple speakers.
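
The three steps named in the abstract can be illustrated with a small Python sketch. This is not the patent's implementation: the feature set (`BoundaryFeatures`), the pause labels (`"none"`, `"minor"`, `"major"`), the speaker names, and the rule-based stand-ins for trained adaptive models are all assumptions made for illustration, since the source does not specify them. The sketch only shows the control flow: extract features, look up the model for the selected speaker, then apply that model to the features.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BoundaryFeatures:
    """Hypothetical features for the boundary after one word.

    The source does not list the real prosody prediction features.
    """
    left_word: str
    left_len: int
    ends_with_punctuation: bool


# A "model" here is any callable that maps features to a pause label.
PauseModel = Callable[[BoundaryFeatures], str]


def extract_features(words: List[str]) -> List[BoundaryFeatures]:
    """Step 1: one feature record per boundary between adjacent words."""
    return [
        BoundaryFeatures(w, len(w), w[-1] in ",.;:")
        for w in words[:-1]  # the last word has no following boundary
    ]


def speaker_a_model(f: BoundaryFeatures) -> str:
    """Toy stand-in for a speaker who pauses often."""
    if f.ends_with_punctuation:
        return "major"
    return "minor" if f.left_len >= 6 else "none"


def speaker_b_model(f: BoundaryFeatures) -> str:
    """Toy stand-in for a speaker who pauses only at punctuation."""
    return "major" if f.ends_with_punctuation else "none"


# Hypothetical registry of per-speaker adaptive models.
SPEAKER_MODELS: Dict[str, PauseModel] = {
    "speaker_a": speaker_a_model,
    "speaker_b": speaker_b_model,
}


def determine_pauses(words: List[str], speaker: str) -> List[str]:
    """Steps 2 and 3: select the speaker's model, then predict pauses."""
    model = SPEAKER_MODELS[speaker]
    return [model(f) for f in extract_features(words)]


if __name__ == "__main__":
    text = ["Speech", "synthesis,", "in", "short"]
    for name in SPEAKER_MODELS:
        print(name, determine_pauses(text, name))
```

Keeping the models behind a per-speaker lookup means the same text can yield different pause patterns simply by changing the selected speaker, which is the behaviour the abstract attributes to the adaptive models.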

Description

Technical field

[0001] The invention relates to the technical field of speech synthesis, in particular to a method and device for determining prosodic pause information.

Background

[0002] The purpose of speech synthesis is to convert text into speech and play it to users, and the goal is to sound like a real person reading the text aloud. An important module in the speech synthesis pipeline predicts the prosodic pauses of the text to be synthesized; synthetic speech is then generated according to the predicted prosodic pauses.

[0003] Currently, prosody prediction in speech synthesis is implemented with statistical machine learning. The process includes preparing training data, training prosody prediction models, and performing prosody prediction with the trained models.

[0004] However, in the prior art, the prosodic pause pattern trained in the prosody prediction model does not match the prosodic pause pattern trained in the acoustic model. The r...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/10
Inventor: 康永国
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD