Data labeling method and device for mixed Chinese-English text with tone labeling

A labeling and tone technology applied in speech analysis, speech synthesis, speech recognition, and related fields. It addresses problems such as the inability to train cadenced speech, and achieves the effects of facilitating online learning and increasing complexity.

Active Publication Date: 2022-04-05
北京太极华保科技股份有限公司

AI Technical Summary

Problems solved by technology

With labeled data of this kind, data preparation is too uniform, so it is essentially impossible to train cadenced speech.



Embodiment Construction

[0037] Provided there is no conflict, the technical solutions of the embodiments described in the present invention may be combined with one another.

[0038] The technical solutions of the present invention will be described in detail below with reference to the accompanying drawings.

[0039] Figure 1 is a schematic flowchart of the data labeling method for mixed Chinese-English text with tone labeling according to an embodiment of the present application. As shown in Figure 1, the method includes the following steps:

[0040] Step 101: capture a training text from a data source, where the training text covers Chinese and English characters.

[0041] In the embodiment of the present application, the training text may be obtained from the data training database. The data source may be various web pages on the network, such as text from Baidu Encyclopedia, and the data sour...
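The coverage requirement in Step 101 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names are hypothetical, and treating the basic CJK Unified Ideographs block as "Chinese characters" and ASCII letters as "English characters" is an assumption made only for illustration.

```python
import re

# Assumption: "Chinese characters" means the basic CJK Unified Ideographs block,
# and "English characters" means ASCII letters.
_CHINESE = re.compile(r"[\u4e00-\u9fff]")
_ENGLISH = re.compile(r"[A-Za-z]")


def covers_chinese_and_english(text: str) -> bool:
    """Return True if the text contains at least one Chinese and one English character."""
    return bool(_CHINESE.search(text)) and bool(_ENGLISH.search(text))


def select_training_texts(candidates):
    """Keep only the captured texts that mix Chinese and English (hypothetical helper)."""
    return [t for t in candidates if covers_chinese_and_english(t)]
```

For example, `select_training_texts(["你好", "hello", "打开 WiFi 设置"])` keeps only the third string, since it is the only one containing both scripts.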


Abstract

The embodiments of the present application disclose a data labeling method and device for mixed Chinese-English text with tone labeling, applied in deep learning speech synthesis algorithms. The method includes: capturing training text from a data source, where the training text covers Chinese and English characters; adding emotion tags to the captured training text, and recording audio of a speaker reading the training text as marked by the emotion tags, to serve as the audio files for training; checking whether the audio files for training are consistent with the emotion tags of the corresponding training text, and revising the audio files for any inconsistent parts; and mapping the training text to a text vector, and submitting the text vector together with the audio files read by the speaker to the deep learning engine of a neural network for training, so that it learns the pronunciation characteristics of text under various combinations of Chinese, English, and emotion tags.
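The tagging and text-to-vector steps in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the `LabeledSample` structure, the emotion tag name, the tokenization (one token per Chinese character, one per English word), and the integer-ID "text vector" are all hypothetical choices, since the source does not specify how the mapping is done.

```python
import re
from dataclasses import dataclass

# Assumption: one token per Chinese character, one token per English word.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z]+")


@dataclass
class LabeledSample:
    text: str        # mixed Chinese-English training text
    emotion: str     # hypothetical emotion tag, e.g. "happy"
    audio_path: str  # recording of the speaker reading the text with that emotion


def tokenize(sample: LabeledSample) -> list:
    """Prefix the emotion tag as its own token, then split the text into tokens."""
    tokens = [m.group(0).lower() for m in _TOKEN.finditer(sample.text)]
    return [f"<{sample.emotion}>"] + tokens


def build_vocab(samples) -> dict:
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for sample in samples:
        for tok in tokenize(sample):
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_vector(sample: LabeledSample, vocab: dict) -> list:
    """Map a labeled sample to a list of token IDs (a simple stand-in for a text vector)."""
    return [vocab[tok] for tok in tokenize(sample)]
```

Putting the emotion tag in the same ID sequence as the text is one simple way to let a model condition on it; a real system might instead feed the tag as a separate embedding.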

Description

Technical field

[0001] The embodiments of the present application relate to a data labeling method and device for mixed Chinese-English text with tone labeling.

Background technique

[0002] Current speech synthesis technology has greatly improved the quality of synthesized speech and can generate realistic speech directly from text, which can be applied to fields such as voice navigation, automatic broadcasting, and automatic queuing and calling services. However, in current text-based voice output technology, the pitch is often flattened during output. Although the result sounds smooth, it lacks emotional color and gives a very poor experience. In addition, traditional voice output technology cannot handle mixed Chinese and English input in a single model. For mixed Chinese-English pronunciation, two models are often called for processing, resulting in low processing efficiency and poor voice output. Th...


Application Information

Patent Type & Authority Patents (China)
IPC(8): G10L13/02, G10L13/033, G10L15/06, G10L15/26, G10L25/30
CPC G10L13/02, G10L13/033, G10L25/30, G10L15/063, G10L15/26
Inventor 戴健周伟东刘华刘凯喻凌
Owner 北京太极华保科技股份有限公司