Rhythm prediction method and device, equipment and medium

A prosody prediction method, applied in the field of data processing, that addresses the loss of a word's semantic information, and the resulting drop in text prosody prediction accuracy, when English words are treated as letter sequences. Its stated effects are improved accuracy and recall and a reduced amount of annotated training data.

Active Publication Date: 2020-02-14
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

However, directly treating an English word as a sequence of letters loses the word's semantic information, which reduces the accuracy of text prosody prediction.

Examples

Example 1

[0061] Figure 1 is a flowchart of a prosody prediction method provided in the first embodiment of the present application. This embodiment is applicable to accurate prosody prediction for Chinese-English mixed text. The method can be executed by a prosody prediction device, which can be implemented in software and/or hardware. Referring to Figure 1, the prosody prediction method provided in this embodiment includes:

[0062] S110. Segment the Chinese-English mixed text to be predicted to obtain Chinese text and English text.

[0063] Here, the Chinese-English mixed text is a text that includes both Chinese and English.

[0064] The Chinese text is a text including only Chinese.

[0065] The English text is a text including English only.

[0066] There can be one, two or more Chinese texts, and likewise one, two or more English texts.
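
Step S110 can be illustrated with a minimal sketch. The source does not say how the segmentation is performed, so the regular expression below is an assumption: it treats a run of CJK ideographs (U+4E00 to U+9FFF) as Chinese text and a run of Latin letters, optionally joined by single spaces, apostrophes or hyphens, as English text. The function name segment_mixed_text is hypothetical.

```python
import re

# Assumption: Chinese = run of CJK ideographs; English = Latin-letter words,
# optionally joined by a single space, apostrophe or hyphen.
# Digits and punctuation match neither branch and are skipped in this sketch.
_SEGMENT = re.compile(r"[\u4e00-\u9fff]+|[A-Za-z]+(?:[ '\-][A-Za-z]+)*")


def segment_mixed_text(text):
    """Split mixed text into ("zh" | "en", segment) pairs in original order."""
    segments = []
    for match in _SEGMENT.finditer(text):
        piece = match.group(0)
        lang = "zh" if "\u4e00" <= piece[0] <= "\u9fff" else "en"
        segments.append((lang, piece))
    return segments


if __name__ == "__main__":
    for lang, piece in segment_mixed_text("我喜欢Python和machine learning课程"):
        print(lang, piece)
```

As paragraph [0066] notes, a single mixed text may yield several Chinese segments and several English segments; the sketch returns them in their original order so that later steps can preserve word order.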

[0067] S120. Determine character vectors of characters in the Chinese text and word vectors of words in the English text.

...

Example 2

[0078] Figure 2 is a flowchart of a prosody prediction method provided in the second embodiment of the present application. This embodiment is an optional solution proposed on the basis of the foregoing embodiments. Referring to Figure 2, the prosody prediction method provided in this embodiment includes:

[0079] S210. Segment the Chinese-English mixed text to be predicted to obtain Chinese text and English text.

[0080] S220. Determine character vectors of characters in the Chinese text and word vectors of words in the English text.

[0081] Here, determining the word vector of a word in the English text includes:

[0082] segmenting the word in the English text into a sequence of letters;

[0083] determining letter vectors for the letters in the sequence of letters;

[0084] extracting, according to the determined letter vectors, a word vector representing the semantics of the word.

[0085] Specifically, according to the determined letter vector, the word vector representi...
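
The three sub-steps in [0082] to [0084] can be sketched as follows. The text available here is cut off before it states how the word vector is extracted from the letter vectors, so this sketch swaps in simple mean pooling purely as a placeholder; it is not the method of the application. The letter embeddings are deterministic toy values derived from a hash, standing in for a learned embedding table. DIM, letter_vector and word_vector are hypothetical names.

```python
import hashlib

DIM = 4  # toy vector size; an assumption, not taken from the source


def letter_vector(letter):
    """Deterministic toy embedding for one letter (stand-in for a learned table)."""
    digest = hashlib.sha256(letter.lower().encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(DIM)]


def word_vector(word):
    """Letters -> letter vectors -> one word vector (mean pooling, an assumption)."""
    letters = [ch for ch in word if ch.isalpha()]  # [0082] split word into letters
    if not letters:
        raise ValueError("word contains no letters")
    vectors = [letter_vector(ch) for ch in letters]  # [0083] one vector per letter
    # [0084] placeholder combination step: average each dimension over the letters
    return [sum(column) / len(vectors) for column in zip(*vectors)]


if __name__ == "__main__":
    print(word_vector("prosody"))
```

Mean pooling ignores letter order, so "dog" and "god" receive the same vector here. A sequence-aware encoder trained on data would be needed to obtain a vector that actually reflects word semantics, which is the stated goal of [0084].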

Example 3

[0096] Figure 3 is a flowchart of a prosody prediction method provided in the third embodiment of the present application. This embodiment is an optional solution proposed on the basis of the foregoing embodiments. Referring to Figure 3, the prosody prediction method provided in this embodiment includes:

[0097] S310. Segment the Chinese-English mixed text to be predicted to obtain Chinese text and English text.

[0098] S320. Determine character vectors of characters in the Chinese text and word vectors of words in the English text.

[0099] S330. Input the determined character vectors and word vectors into the Chinese-English mixed prosody recognition model, and output the prosody prediction result of the Chinese-English mixed text.

[0100] Here, the Chinese-English mixed prosody recognition model is a model for prosody prediction of Chinese-English mixed text. The model is based on supervised learning and is pre-trained using labeled samples.

[0101] The Chinese-English mixed proso...
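
The overall flow of S310 to S330 can be sketched as a small pipeline. The model architecture is not described in the text available here, so the model is passed in as an opaque callable that maps a list of vectors to one label per vector. The names to_units and predict_prosody, the callables char_vec, word_vec and model, and the label values are all hypothetical.

```python
import re

# One unit per Chinese character and one unit per English word (an assumption
# about granularity, following the character-vector / word-vector split).
_UNIT = re.compile(r"[\u4e00-\u9fff]|[A-Za-z]+")


def to_units(text):
    """Return (kind, unit) pairs in original order; kind is "char" or "word"."""
    units = []
    for match in _UNIT.finditer(text):
        unit = match.group(0)
        kind = "char" if "\u4e00" <= unit[0] <= "\u9fff" else "word"
        units.append((kind, unit))
    return units


def predict_prosody(text, char_vec, word_vec, model):
    """Vectorize each unit, run the supplied model, pair units with labels."""
    units = to_units(text)
    vectors = [char_vec(u) if kind == "char" else word_vec(u) for kind, u in units]
    labels = model(vectors)  # hypothetical interface: one label per input vector
    if len(labels) != len(units):
        raise ValueError("model must return one label per unit")
    return [(unit, label) for (_, unit), label in zip(units, labels)]


if __name__ == "__main__":
    # Stub components for demonstration only; a real system would use trained ones.
    demo = predict_prosody(
        "打开WiFi开关",
        char_vec=lambda c: [1.0],
        word_vec=lambda w: [float(len(w))],
        model=lambda vs: ["B" if v[0] > 1 else "O" for v in vs],
    )
    print(demo)
```

Treating each English word as a single unit alongside individual Chinese characters keeps the word's vector intact at the model input, which is the point made against splitting English words into letters.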


Abstract

Embodiments of the invention disclose a rhythm (prosody) prediction method, device, equipment and medium, and relate to the field of data processing, in particular to speech synthesis technology. The method comprises the following steps: segmenting a Chinese-English mixed text to be predicted to obtain a Chinese text and an English text; determining character vectors of characters in the Chinese text and word vectors of words in the English text; and determining a rhythm prediction result for the Chinese-English mixed text according to the determined character vectors and word vectors. The rhythm prediction method, device, equipment and medium improve the rhythm prediction accuracy for Chinese-English mixed text.

Description

Technical field

[0001] The embodiments of the present application relate to the field of data processing, and in particular to speech synthesis technology. Specifically, the embodiments provide a prosody prediction method, device, equipment and medium.

Background

[0002] Before speech synthesis, the prosody of the text to be spoken must be predicted.

[0003] Existing prosody prediction methods include: applying a machine learning method with a pre-trained prediction model to the text content to be predicted, and obtaining the corresponding pause prediction result for the text content. The pause prediction result can include the pause position, the pause type (for example, long pause or short pause) and a probability value corresponding to the pause type.

[0004] The above scheme has the following defects:

[0005] The scheme does not distinguish between languages in the text content to be predicted. When the text content includes both Chinese and English, t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/10, G10L13/08, G06F40/289, G06N20/00
CPC: G10L13/10, G10L13/08, G06N20/00
Inventor: 高占杰, 聂志朋, 卞衍尧, 陈昌滨
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD