Voice data automatic annotation quality evaluation method

A technique for automatically labeled voice data, applied in natural language data processing, voice analysis, voice recognition, and related fields, that improves the quality of data labeling.

Active Publication Date: 2021-03-02
KUNMING UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to remedy the defects of the above-mentioned prior art by providing a quality assessment method for automatic voice data labeling. The method addresses the following problem: first, it performs quality assessment on voice data labeled automatically by a machine and detects quality problems such as mislabeling and missing labels, so as to improve the quality of automatic data labeling.



Examples


Embodiment Construction

[0053] To make the purpose, technical solutions, and advantages of the present invention clearer, the technical solutions of the invention are described clearly and completely below. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0054] A quality inspection mechanism for automatically labeled data. Because humans and machines understand "labeling errors" differently, it is difficult to automatically check the quality of the large volume of data that computers label automatically, so the original manual method is still needed. The basic idea of the detection mechanism design is to establish a system of key indicators for quality evaluation, and to extract points that a...


Abstract

The invention provides a quality evaluation method for automatically annotated voice data. The method comprises: pre-constructing a quality rule base for automatically annotated voice data based on key quality indicators; reading the automatically annotated voice data to be checked and performing quality detection on it according to the key quality indicators to complete the quality measurement; updating the automatically annotated voice data set according to the result of the quality measurement; and converting the updated data set into a new rule and importing the new rule into the quality rule base. The method overcomes the shortcomings of traditional annotation quality evaluation methods when they are applied to data annotated automatically by a machine, and it provides positive support for the development of intelligent speech technology for minority languages.
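The four steps in the abstract (build a rule base, check incoming data, update the data set, derive a new rule) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `Annotation` schema, the two starter rules, the 30 characters-per-second threshold, and the "unseen characters" rule derivation are all assumptions introduced here, since the source does not specify the key quality indicators.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Annotation:
    """One automatically labeled utterance (hypothetical schema)."""
    utterance_id: str
    duration_s: float
    transcript: str


# A rule returns a defect message, or None when the annotation passes.
Rule = Callable[[Annotation], Optional[str]]


def missing_label(a: Annotation) -> Optional[str]:
    """Assumed indicator: an empty transcript counts as a missing label."""
    return "missing label" if not a.transcript.strip() else None


def implausible_rate(a: Annotation, max_chars_per_s: float = 30.0) -> Optional[str]:
    """Assumed indicator: too much text for the audio length suggests a mislabel."""
    if a.duration_s <= 0:
        return "invalid duration"
    if len(a.transcript) / a.duration_s > max_chars_per_s:
        return "suspected mislabel: transcript too long for audio"
    return None


@dataclass
class QualityRuleBase:
    """Step 1: a named collection of quality rules."""
    rules: Dict[str, Rule] = field(default_factory=dict)

    def add(self, name: str, rule: Rule) -> None:
        self.rules[name] = rule

    def check(self, a: Annotation) -> List[str]:
        messages = []
        for rule in self.rules.values():
            msg = rule(a)
            if msg is not None:
                messages.append(msg)
        return messages


def evaluate(
    dataset: List[Annotation], rule_base: QualityRuleBase
) -> Tuple[List[Annotation], Dict[str, List[str]]]:
    """Steps 2 and 3: measure quality and keep only annotations without defects."""
    passed: List[Annotation] = []
    defects: Dict[str, List[str]] = {}
    for a in dataset:
        problems = rule_base.check(a)
        if problems:
            defects[a.utterance_id] = problems
        else:
            passed.append(a)
    return passed, defects


def derive_rule_from(passed: List[Annotation]) -> Rule:
    """Step 4 (assumed form): flag characters never seen in accepted data."""
    known = {ch for a in passed for ch in a.transcript}

    def unseen_chars(a: Annotation) -> Optional[str]:
        extra = set(a.transcript) - known
        return f"unseen characters: {sorted(extra)}" if extra else None

    return unseen_chars


rule_base = QualityRuleBase()
rule_base.add("missing_label", missing_label)
rule_base.add("implausible_rate", implausible_rate)

batch = [
    Annotation("u1", 2.0, "hello world"),
    Annotation("u2", 1.5, ""),
    Annotation("u3", 0.1, "far too much text for a tenth of a second"),
]
passed, defects = evaluate(batch, rule_base)
rule_base.add("unseen_chars", derive_rule_from(passed))
```

In this sketch the rule base grows after each batch, so later batches are checked against rules learned from data that already passed. A real system would presumably derive richer rules than a character inventory; that choice is made here only to keep the example short.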

Description

Technical field

[0001] The invention relates to the technical field of language information processing, in particular to a quality assessment method for automatic labeling of voice data.

Background

[0002] In recent years, automatic data labeling has gradually become a key basic technology in the field of artificial intelligence. The hope is that machine labeling will replace manual labeling, and great progress has been made in automatic data labeling in fields such as images. The extreme scarcity of annotated speech data has become a key factor restricting speech recognition performance for minority languages in China. Because of the influence of original data quality, human error, and model limitations, data labeling errors are unavoidable, so it is very important to introduce effective quality assessment methods. However, data labeling standards are not uniform and labeling quality is uneven. To a large extent, this hinders the ap...


Application Information

Patent Type & Authority: Application (China)
IPC (8): G10L15/01, G10L15/26, G06F40/232
CPC: G10L15/01, G10L15/26, G06F40/232, G06F40/169, Y02P90/30
Inventor: 何俊, 张彩庆, 周义方, 申时凯, 岳为好
Owner: KUNMING UNIV