Data labeling method and device for mixing Chinese and English and labeling the tone

A labeling and tone technology, applied in speech analysis, speech synthesis, speech recognition, etc., that addresses problems such as the inability to train cadenced speech, and achieves the effects of facilitating online learning and increasing complexity.

Active Publication Date: 2022-04-05

北京太极华保科技股份有限公司

8 Cites, 0 Cited by

AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

With this kind of labeled data, the uniformity of the prepared data makes it essentially impossible to train speech with natural cadence.

Embodiment Construction

[0037] Where no conflict arises, the technical solutions of the embodiments described in the present invention may be combined with one another.

[0038] The technical solutions of the present invention will be described in detail below with reference to the accompanying drawings.

[0039] Figure 1 is a schematic flowchart of the data labeling method for mixed Chinese and English text with tone labeling according to an embodiment of the present application. As shown in Figure 1, the method includes the following steps:

[0040] Step 101: capture training text from a data source, where the training text covers Chinese and English characters.

[0041] In the embodiment of the present application, the training text may be obtained from the data training database. The data source may be any of various web pages on the network, such as text from Baidu Encyclopedia, and the data sour...
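One way to read Step 101 is as a filter that keeps only captured sentences containing both scripts. The sketch below is an illustration under that reading, not the patent's implementation: the function names `is_mixed` and `select_training_text` are hypothetical, "Chinese characters" is assumed to mean the CJK Unified Ideographs block (U+4E00 to U+9FFF), and "English characters" is assumed to mean ASCII letters.

```python
import re

# Assumption: Chinese = CJK Unified Ideographs (U+4E00..U+9FFF),
# English = ASCII letters. The source does not define either range.
CHINESE = re.compile(r"[\u4e00-\u9fff]")
ENGLISH = re.compile(r"[A-Za-z]")


def is_mixed(text: str) -> bool:
    """Return True if the text contains both Chinese and English characters."""
    return bool(CHINESE.search(text)) and bool(ENGLISH.search(text))


def select_training_text(candidates):
    """Keep only candidate sentences that mix Chinese and English."""
    return [s for s in candidates if is_mixed(s)]


# Hypothetical captured sentences, standing in for scraped web text.
candidates = [
    "今天我们学习 Python 编程",  # mixed Chinese and English
    "今天天气很好",              # Chinese only
    "Hello world",              # English only
]
mixed = select_training_text(candidates)
```

If the intended meaning is instead that the corpus as a whole covers both languages, the same two patterns could be applied to the concatenated corpus rather than to each sentence.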

Abstract

The embodiment of the present application discloses a data labeling method and device for mixed Chinese and English text with tone labeling, applied in deep learning speech synthesis algorithms. The method includes: capturing training text from a data source, the training text covering Chinese and English characters; adding emotion tags to the captured training text, and recording a speaker reading the training text according to the emotion tags, the recording serving as the audio file for training; checking whether the audio file for training is consistent with the emotion tags of the corresponding training text, and revising the audio file for any inconsistent parts; and mapping the training text to a text vector, and submitting the text vector together with the speaker's audio file to a neural network deep learning engine for training, so that it learns the pronunciation characteristics of text under various combinations of Chinese, English, and emotion tags.
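The labeling steps in the abstract can be sketched as a small data structure plus two helpers. This is a minimal illustration under stated assumptions, not the patented method: the `Sample` record, the `EMOTIONS` tag set, the file paths, and the character-level vocabulary are all hypothetical, and the source does not say how the text vector is actually built or how the consistency check is performed.

```python
from dataclasses import dataclass

# Hypothetical emotion tag set; the source does not list the actual tags.
EMOTIONS = ("neutral", "happy", "sad", "angry")


@dataclass
class Sample:
    text: str           # training text (Chinese, English, or mixed)
    emotion: str        # emotion tag attached to the text
    audio_path: str     # speaker's recording, read according to the tag
    heard_emotion: str  # emotion a reviewer perceives in the recording


def needs_revision(samples):
    """Return samples whose recording does not match the text's emotion tag."""
    return [s for s in samples if s.heard_emotion != s.emotion]


def build_vocab(samples):
    """Assign an integer id to every emotion tag and every character seen."""
    vocab = {f"<{e}>": i for i, e in enumerate(EMOTIONS)}
    for s in samples:
        for ch in s.text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def to_vector(sample, vocab):
    """Map a sample to ids: the emotion token followed by its characters."""
    return [vocab[f"<{sample.emotion}>"]] + [vocab[ch] for ch in sample.text]


samples = [
    Sample("你好 AI", "happy", "rec/0001.wav", "happy"),
    Sample("OK 再见", "sad", "rec/0002.wav", "neutral"),  # mismatch
]
flagged = needs_revision(samples)       # recordings to revise
vocab = build_vocab(samples)
vector = to_vector(samples[0], vocab)   # emotion id, then character ids
```

Placing the emotion token in the same sequence as the characters is one simple way to let a single model see Chinese, English, and the emotion label together; a real system would more likely use phoneme or pinyin units and pair each vector with acoustic features extracted from the audio file.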

Description

Technical field

[0001] The embodiments of the present application relate to a data labeling method and device for mixed Chinese and English text with tone labeling.

Background

[0002] Current speech synthesis technology has greatly improved synthesis quality and can generate realistic speech directly from text, with applications in fields such as voice navigation, automatic broadcasting, and automatic queue-calling services. However, current text-based voice output tends to flatten the pitch: the speech sounds smooth but lacks emotional color, which makes for a poor listening experience. Traditional voice output technology also cannot handle mixed Chinese and English input with a single model. Mixed pronunciation is often processed by calling two models, resulting in low processing efficiency and poor output quality. Th...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L13/02, G10L13/033, G10L15/06, G10L15/26, G10L25/30

CPC: G10L13/02, G10L13/033, G10L25/30, G10L15/063, G10L15/26

Inventor: 戴健, 周伟东, 刘华, 刘凯, 喻凌

Owner: 北京太极华保科技股份有限公司
