Sample marking method and device based on artificial intelligence prosody prediction

An artificial intelligence and sample labeling technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses the high cost of text labeling, the reduced performance of the prosodic labeling model, and poor speech synthesis quality. It improves labeling efficiency and accuracy, the performance of the prosodic labeling model, and the naturalness of synthesized speech.

Active Publication Date: 2017-04-26
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0006] However, for massive audio files, the above-mentioned labeling method is costly, error-prone, and inefficient for text labeling, and newly recorded audio files cannot be applied to prosody labeling model training in time. As a result, additional training samples cannot be provided, which reduces the performance of the prosodic labeling model and results in poor speech synthesis.

Examples

Embodiment Construction

[0042] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0043] The sample labeling method and device based on artificial intelligence prosody prediction according to the embodiments of the present invention will be described below with reference to the accompanying drawings.

[0044] In general, it is very difficult to directly predict the length of prosodic pauses. Therefore, traditional prosody prediction methods use the characteristics of human pronunciation pauses, and divide prosody into different prosody levels according to the length of pauses, thus converting the prosody predi...
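The idea of dividing prosody into discrete levels according to pause length can be sketched as a simple binning step. This is only an illustration: the boundary values and level names below are hypothetical, since the source gives neither, and the function `pause_to_level` is not part of the patent.

```python
from bisect import bisect_right

# Hypothetical pause-duration boundaries in seconds; the source gives no values.
PAUSE_BOUNDARIES = [0.05, 0.15, 0.40]

# Assumed level names, ordered from shortest to longest pause.
LEVELS = ["none", "prosodic_word", "prosodic_phrase", "intonation_phrase"]


def pause_to_level(pause_seconds: float) -> str:
    """Map a pause duration to a discrete prosody level by binning."""
    if pause_seconds < 0:
        raise ValueError("pause duration must be non-negative")
    # bisect_right counts how many boundaries are <= the pause duration,
    # which is exactly the index of the matching level.
    return LEVELS[bisect_right(PAUSE_BOUNDARIES, pause_seconds)]
```

With levels defined this way, the target for each position in the text is a category rather than a continuous duration.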

Abstract

The invention provides a sample marking method and device based on artificial intelligence prosody prediction. The method comprises the following steps: acquiring a first text sequence, not yet marked with prosody, corresponding to a first sample audio file; acquiring the text characteristics and pronunciation duration of each character of the first text sequence; applying a pre-trained prosody marking model to the text characteristics and pronunciation duration of each character to obtain an output mark for each character; and marking the prosodic hierarchy of the first text sequence according to the output mark of each character. The method reduces text marking cost, improves text marking efficiency and accuracy, provides more of the training samples required for prosodic hierarchy marking, improves prosody marking model performance, and makes synthesized speech sound more natural.
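The steps in the abstract can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the patented implementation: `CharInput`, `label_sample`, `render`, and the mark names `"B"` and `"O"` are hypothetical, and `toy_model` is a stand-in duration threshold used only so the example runs, not the pre-trained prosody marking model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class CharInput:
    """Per-character model input (hypothetical structure)."""
    char: str
    text_features: Sequence[float]  # placeholder for text characteristics
    duration: float                 # pronunciation duration in seconds


# A model maps the per-character inputs to one output mark per character.
Model = Callable[[List[CharInput]], List[str]]


def toy_model(chars: List[CharInput]) -> List[str]:
    """Stand-in for the pre-trained model: marks a boundary ("B") after
    any character held longer than an assumed 0.3 s, else "O"."""
    return ["B" if c.duration > 0.3 else "O" for c in chars]


def label_sample(chars: List[CharInput], model: Model) -> List[Tuple[str, str]]:
    """Apply the model and pair each character with its output mark."""
    marks = model(chars)
    if len(marks) != len(chars):
        raise ValueError("model must return one mark per character")
    return [(c.char, m) for c, m in zip(chars, marks)]


def render(labeled: List[Tuple[str, str]], boundary: str = "|") -> str:
    """Write the labeled sequence as text with boundary symbols inserted."""
    return " ".join(ch + (boundary if mark == "B" else "") for ch, mark in labeled)


sample = [
    CharInput("jin", [0.0], 0.12),
    CharInput("tian", [0.0], 0.35),
    CharInput("hao", [0.0], 0.10),
]
labeled = label_sample(sample, toy_model)
```

Passing the model in as a callable keeps the marking step independent of how the model was trained, so a real pre-trained model could replace `toy_model` without changing `label_sample`.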

Description

Technical field

[0001] The present invention relates to the technical field of speech synthesis, in particular to a sample labeling method and device based on artificial intelligence prosody prediction.

Background technique

[0002] Artificial intelligence, abbreviated AI, is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that responds in a manner similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing and expert systems.

[0003] At present, there is a big gap between speech synthesis technology, that is, converting text into speech and playing it to users, in terms of natural...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G10L13/10
CPC: G10L13/10
Inventor: 徐扬凯, 康永国, 彭一平
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD