Voice data automatic labeling method and system for voice recognition

A speech recognition and speech data technology, applied in speech recognition, speech analysis, instruments, etc., that addresses the high cost, low efficiency, and long cycle of manually tagging speech data, so as to reduce labor, improve tagging quality, and shorten the tagging cycle.

Status: Inactive. Publication Date: 2020-11-13

WEIFANG MEDICAL UNIV

11 Cites, 3 Cited by

AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

[0005] In order to overcome the above-mentioned defects of the prior art, an embodiment of the present invention provides a method and system for automatic voice data labeling for voice recognition. The technical problems to be solved by the present invention are the long cycle, high cost, and low efficiency of manual voice data labeling.

Examples

Embodiment 1

[0033] The present invention provides an automatic speech data labeling system for speech recognition, comprising a silence detection module 10, a volume screening module 20, a length screening module 30, a speech recognition module 40, a recognition result judging module 50, and a manual proofreading module 60.

[0034] The silence detection module 10 uses a silence detection algorithm to split each voice recording into a plurality of voice segments.

[0035] The volume screening module 20 applies a volume threshold to keep the speech that meets the requirements and remove the speech that does not.

[0036] The length screening module 30 applies a speech duration threshold to keep the speech that meets the requirements and remove the speech that does not.

[0037] The speech recognition module 40 uses a speech recognition engine to recognize the speech as its corresponding text; a later stage will add the newl...
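The silence-based splitting described in paragraph [0034] can be illustrated with a minimal sketch. The excerpt does not specify the silence detection algorithm, so this example assumes a simple frame-energy approach; the function names (`frame_rms`, `split_on_silence`), the frame length, the RMS threshold, and the minimum silence run are all hypothetical choices for illustration, not values taken from the patent.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS amplitude of each non-overlapping frame.

    A trailing partial frame (shorter than frame_len) is ignored.
    """
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return levels


def split_on_silence(samples, frame_len=160, silence_rms=0.01,
                     min_silence_frames=3):
    """Split samples into (start, end) sample ranges separated by silence.

    Assumption: a run of at least `min_silence_frames` consecutive frames
    whose RMS is below `silence_rms` closes the current segment.
    """
    segments = []
    seg_start = None    # frame index where the open segment began
    last_voiced = None  # most recent frame at or above the threshold
    quiet_run = 0       # consecutive quiet frames inside an open segment
    for i, level in enumerate(frame_rms(samples, frame_len)):
        if level >= silence_rms:
            if seg_start is None:
                seg_start = i
            last_voiced = i
            quiet_run = 0
        elif seg_start is not None:
            quiet_run += 1
            if quiet_run >= min_silence_frames:
                segments.append((seg_start * frame_len,
                                 (last_voiced + 1) * frame_len))
                seg_start = None
                quiet_run = 0
    if seg_start is not None:  # close a segment still open at the end
        segments.append((seg_start * frame_len,
                         (last_voiced + 1) * frame_len))
    return segments


def tone(n, freq=440.0, rate=16000, amp=0.5):
    """Synthetic sine tone standing in for speech (illustration only)."""
    return [amp * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


# Two voiced stretches separated by 800 samples of digital silence.
signal = tone(800) + [0.0] * 800 + tone(480)
segments = split_on_silence(signal)
print(segments)
```

A production system would more likely rely on a dedicated VAD component than on a fixed energy threshold, which is sensitive to background noise.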

Embodiment 2

[0050] The present invention provides an automatic speech data labeling system for speech recognition, comprising a silence detection module 10, a volume screening module 20, a length screening module 30, a speech recognition module 40, a recognition result judging module 50, and a manual proofreading module 60.

[0051] The silence detection module 10 uses a silence detection algorithm to split each voice recording into a plurality of voice segments.

[0052] The volume screening module 20 applies a volume threshold to keep the speech that meets the requirements and remove the speech that does not.

[0053] The length screening module 30 applies a speech duration threshold to keep the speech that meets the requirements and remove the speech that does not.

[0054] The speech recognition module 40 uses a speech recognition engine to recognize the speech as its corresponding text; a later stage will add the newl...
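The volume and length screening in paragraphs [0052] and [0053] can be sketched as two threshold filters applied to each segment. The excerpt gives no threshold values and does not say how volume is measured, so this example assumes RMS amplitude as the volume measure and a minimum and maximum duration; the `Segment` class, the function names, and every numeric default are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """A voice segment: mono samples scaled to [-1.0, 1.0] (assumed format)."""
    samples: list
    sample_rate: int = 16000

    @property
    def duration_s(self):
        return len(self.samples) / self.sample_rate

    @property
    def rms(self):
        if not self.samples:
            return 0.0
        return math.sqrt(sum(x * x for x in self.samples) / len(self.samples))


def passes_volume(seg, min_rms=0.02):
    """Volume screening: keep segments at or above an RMS threshold."""
    return seg.rms >= min_rms


def passes_length(seg, min_s=0.5, max_s=15.0):
    """Length screening: keep segments whose duration is within bounds."""
    return min_s <= seg.duration_s <= max_s


def screen(segments, min_rms=0.02, min_s=0.5, max_s=15.0):
    """Apply volume screening, then length screening, preserving order."""
    return [s for s in segments
            if passes_volume(s, min_rms) and passes_length(s, min_s, max_s)]


loud_ok = Segment([0.3] * 16000)      # 1.0 s, clearly audible
too_quiet = Segment([0.001] * 16000)  # 1.0 s, below the volume threshold
too_short = Segment([0.3] * 1600)     # 0.1 s, below the minimum duration
kept = screen([loud_ok, too_quiet, too_short])
print(len(kept))
```

Running the cheap filters before recognition means rejected segments never consume recognition engine time, which matches the module order given in the embodiment.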

Abstract

The invention discloses an automatic voice data labeling method and system for voice recognition, and particularly relates to the field of voice recognition. The system comprises a silence detection module, a volume screening module, a length screening module, a voice recognition module, a recognition result judgment module, and a manual proofreading module. The silence detection module splits each voice into a plurality of voice segments through a silence detection algorithm, and the volume screening module screens out the voices that meet the requirements through a volume threshold and removes those that do not. The invention combines multiple modules into one system: speech preprocessing and speech recognition are carried out by means of a public cloud mode, recognition result judgment and manual proofreading are performed, and the voice data annotation is constructed. After multiple iterations of this process, new corpora are continuously trained and high-quality corpus data is obtained. This reduces manpower, improves voice data annotation quality, and solves the problems of a long manual annotation cycle, high cost, and low efficiency.
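The flow from recognition through result judgment to manual proofreading can be sketched as a routing step. The abstract does not state how the recognition result judgment module decides which results need proofreading, so this example assumes a confidence score returned by the recognizer and a fixed acceptance threshold; `label_batch`, `fake_recognize`, the record fields, and the threshold value are all hypothetical and do not represent any real speech recognition API.

```python
def label_batch(segments, recognize, accept_threshold=0.9):
    """Route recognized segments to auto-labeled or manual-review lists.

    `segments` is an iterable of (segment_id, audio) pairs.
    `recognize` is any callable returning (text, confidence) for an audio
    item. Assumption: results at or above `accept_threshold` are accepted
    as labels; the rest go to manual proofreading.
    """
    auto_labeled, needs_review = [], []
    for seg_id, audio in segments:
        text, confidence = recognize(audio)
        record = {"id": seg_id, "text": text, "confidence": confidence}
        if confidence >= accept_threshold:
            auto_labeled.append(record)
        else:
            needs_review.append(record)
    return auto_labeled, needs_review


def fake_recognize(audio):
    """Stand-in for a recognition engine, with canned results."""
    canned = {
        "a.wav": ("hello world", 0.97),
        "b.wav": ("turn of the light", 0.55),
    }
    return canned[audio]


auto, review = label_batch(
    [("seg-1", "a.wav"), ("seg-2", "b.wav")], fake_recognize
)
print([r["id"] for r in auto], [r["id"] for r in review])
```

In an iterative setup like the one the abstract describes, the proofread records would be merged with the auto-labeled ones to form the corpus for the next round of training.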

Description

Technical field

[0001] The present invention relates to the technical field of speech recognition, and more specifically to a method and system for automatic labeling of speech data for speech recognition.

Background art

[0002] In speech data labeling, the performance and robustness of speech recognition depend largely on whether accurately labeled corpus data is available when the recognition model is built. Traditional voice data labeling is generally done manually, which consumes a lot of manpower and material resources. VAD (Voice Activity Detection) is a speech processing technology whose purpose is to detect the presence of voice signals; it is mainly used for voice coding and voice recognition.

[0003] With the popularization of various smart terminals and breakthroughs in artificial intelligence technology, voice, as an important part of human-computer interaction, is widely used in vario...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/06, G10L15/22, G10L15/26, G10L15/30

CPC: G10L15/06, G10L15/063, G10L15/22, G10L15/26, G10L15/30

Inventor: 于谦, 孙涛

Owner: WEIFANG MEDICAL UNIV
