Audio annotation error detection method and device

An audio annotation error detection technology, applied in speech analysis, speech recognition, instruments, etc. It addresses the problem that speech annotation quality is affected by the annotators' fatigue and by their knowledge and cognitive level, which leads to low accuracy and poor recognition performance of the speech recognition model. The effect is improved annotation quality and, in turn, improved model accuracy and recognition performance.

Pending Publication Date: 2021-02-26
北京爱数智慧科技有限公司
0 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0005] The purpose of the embodiments of the present application is to provide an audio annotation error detection method and device. They address the technical problem that speech annotation quality is easily affected by the annotators' fatigue and knowledge level, which results in low accuracy and poor recognition performance of the speech recognition model.

Examples

Embodiment 1

[0064] Refer to Figure 1, which shows a schematic flow chart of an audio annotation error detection method provided by an embodiment of the present application. The audio annotation error detection method includes:

[0065] S101: Acquire audio data, and divide the audio data into multiple audio segments.

[0066] Optionally, a voice detection system is used to mark the speech in the audio data, and the audio data is segmented according to these marks.

[0067] Optionally, the audio data can be segmented according to a preset duration, such as 3 s. It is also possible to segment the audio data according to phoneme length, for example, 6 phoneme units.
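
The fixed-duration option in [0067] can be illustrated with a minimal sketch. The function name `split_by_duration`, its parameters, and the representation of audio as a flat sequence of samples are assumptions made for illustration; the application does not specify an implementation.

```python
from typing import List, Sequence


def split_by_duration(
    samples: Sequence[float],
    sample_rate: int,
    segment_seconds: float = 3.0,
) -> List[Sequence[float]]:
    """Split audio samples into consecutive segments of a preset duration.

    Hypothetical helper: the last segment may be shorter than the others.
    """
    if sample_rate <= 0 or segment_seconds <= 0:
        raise ValueError("sample_rate and segment_seconds must be positive")
    # Number of samples per segment, e.g. 16000 Hz * 3 s = 48000 samples.
    step = int(round(sample_rate * segment_seconds))
    if step == 0:
        raise ValueError("segment is shorter than one sample")
    return [samples[i:i + step] for i in range(0, len(samples), step)]
```

With a 16 kHz sample rate and the 3 s default, a 7 s recording yields two full segments and one 1 s remainder. Segmenting by phoneme length would additionally require a phoneme alignment, which this sketch does not cover.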

[0068] S102: Annotate the audio segments to obtain an initial annotation text.

[0069] An existing audio annotation method may be used for this step; details are not repeated here.

[0070] S103: Use the general text error detection model to perform error detection on the initial annotation text to obtain the first mar...

Embodiment 2

[0093] Refer to Figure 2, which shows a schematic flow chart of another audio annotation error detection method provided by an embodiment of the present application. The method includes:

[0094] S201: Acquire audio data, and divide the audio data into multiple audio segments.

[0095] S202: Annotate the audio segments to obtain an initial annotation text.

[0096] S203: Use the general text error detection model to locate the annotation errors.

[0097] S204: Obtain from the confusion dictionary a list of candidate items for replacing the erroneous annotation.

[0098] S205: Select a candidate item from the candidate list to replace the erroneous annotation.

[0099] Optionally, the candidate with the highest priority in the candidate list is selected for replacement.
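
Steps S204 and S205 can be sketched as a dictionary lookup followed by a substitution. Everything named here is an assumption for illustration: the dictionary contents, the `replace_at` helper, and the convention that each candidate list is already ordered from highest to lowest priority. The error position is taken as an input because it would come from the general text error detection model in S203.

```python
from typing import Dict, List, Sequence

# Hypothetical confusion dictionary: each suspicious token maps to
# replacement candidates, ordered from highest to lowest priority.
CONFUSION_DICT: Dict[str, List[str]] = {
    "wether": ["weather", "whether"],
    "their": ["there", "they're"],
}


def candidates_for(token: str, confusion_dict: Dict[str, List[str]]) -> List[str]:
    """S204: return the candidate list for a token (empty if none is known)."""
    return list(confusion_dict.get(token, []))


def replace_at(
    tokens: Sequence[str],
    position: int,
    confusion_dict: Dict[str, List[str]],
) -> List[str]:
    """S205: replace the token at `position` with its top-priority candidate.

    Returns an unchanged copy when the dictionary has no entry for the token.
    """
    fixed = list(tokens)
    candidates = candidates_for(fixed[position], confusion_dict)
    if candidates:
        fixed[position] = candidates[0]
    return fixed
```

For example, `replace_at(["the", "wether", "is", "nice"], 1, CONFUSION_DICT)` substitutes "weather" at position 1 and leaves the input list untouched.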

[0100] S206: Use the N-gram model to calculate the fluency and perplexity of the annotation text after replacement.

[0101] It should be understood that the higher the fluency and the lower the perplexity...
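
As an illustration of the perplexity part of S206, the sketch below scores token sequences with a bigram model using add-one smoothing. The bigram order, the smoothing scheme, the class `BigramModel`, and the helper `pick_most_fluent` are all assumptions; the excerpt does not state the N-gram order or how fluency is defined, so only perplexity is shown. Under the standard definition, a lower perplexity means the model finds the text more probable.

```python
import math
from collections import Counter
from typing import Iterable, List, Sequence


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, corpus: Iterable[Sequence[str]]) -> None:
        self.context_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        vocab = set()
        for sentence in corpus:
            tokens = ["<s>", *sentence, "</s>"]
            vocab.update(tokens[1:])                 # words that can be predicted
            self.context_counts.update(tokens[:-1])  # words that act as context
            self.bigram_counts.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(vocab)

    def log_prob(self, prev: str, word: str) -> float:
        """Smoothed log P(word | prev)."""
        numerator = self.bigram_counts[(prev, word)] + 1
        denominator = self.context_counts[prev] + self.vocab_size
        return math.log(numerator / denominator)

    def perplexity(self, sentence: Sequence[str]) -> float:
        """exp of the negative mean log-probability over all transitions."""
        tokens = ["<s>", *sentence, "</s>"]
        total = sum(self.log_prob(p, w) for p, w in zip(tokens, tokens[1:]))
        return math.exp(-total / (len(tokens) - 1))


def pick_most_fluent(
    model: BigramModel, candidates: List[Sequence[str]]
) -> Sequence[str]:
    """Return the candidate text with the lowest perplexity."""
    return min(candidates, key=model.perplexity)
```

In practice the model would be trained on a large in-domain corpus; a production system would more likely use a dedicated language-model toolkit than this hand-rolled counter.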

Embodiment 3

[0119] Refer to Figure 3, which shows a schematic structural diagram of an audio annotation error detection device provided in an embodiment of the present application. The error detection device 30 includes:

[0120] An acquisition module 301, configured to acquire audio data and divide the audio data into a plurality of audio segments;

[0121] An annotation module 302, configured to annotate the audio segments to obtain an initial annotation text;

[0122] A first error detection module 303, configured to use a general text error detection model to perform error detection on the initial annotation text to obtain a first annotation text;

[0123] A determining module 304, configured to determine the confusion dictionary of the general text error detection model;

[0124] A recognition module 305, configured to use a text classification model to recognize the domain category of the first annotation text;

[0125] A second error detection module 306, configured to perform error detec...

Abstract

The invention discloses an audio annotation error detection method comprising the following steps: obtaining audio data, and segmenting the audio data into a plurality of audio segments; annotating the audio segments to obtain an initial annotation text; performing error detection on the initial annotation text with a general text error detection model to obtain a first annotation text; determining a confusion dictionary of the general text error detection model; identifying the domain category of the first annotation text with a text classification model; performing error detection on the first annotation text with the domain text error detection model corresponding to that domain category to obtain a second annotation text; taking the confusion dictionary of the general text error detection model and the second annotation text of the domain text error detection model as the database of a fine adjustment model; and, according to the semantics of the second annotation text, performing fine adjustment on the second annotation text with the fine adjustment model to obtain a final third annotation text.
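
The order of the stages in the abstract can be sketched as a small pipeline in which every model is a pluggable callable. The class `AnnotationPipeline`, its field names, and the pass-through behavior for a domain with no dedicated model are assumptions for illustration. The models themselves are treated as opaque functions, and the construction of the fine adjustment model's database from the confusion dictionary is not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A text-correcting stage: takes annotation text, returns corrected text.
TextFixer = Callable[[str], str]


@dataclass
class AnnotationPipeline:
    """Hypothetical wiring of the stages listed in the abstract."""

    general_model: TextFixer                # general text error detection
    classify_domain: Callable[[str], str]   # text classification model
    domain_models: Dict[str, TextFixer]     # one error detection model per domain
    fine_adjust: TextFixer                  # fine adjustment model

    def run(self, initial_text: str) -> str:
        # Initial annotation text -> first annotation text.
        first = self.general_model(initial_text)
        # Route the first annotation text by its domain category.
        domain = self.classify_domain(first)
        # Assumption: text passes through unchanged for an unknown domain.
        domain_model = self.domain_models.get(domain, lambda text: text)
        # First annotation text -> second annotation text.
        second = domain_model(first)
        # Second annotation text -> final third annotation text.
        return self.fine_adjust(second)
```

Keeping each stage behind a plain callable makes it straightforward to swap a rule-based stand-in for a trained model when testing the control flow.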

Description

Technical field

[0001] The present application belongs to the field of speech recognition, and in particular relates to an audio annotation error detection method and device.

Background

[0002] With the development of speech recognition technology, it is gradually being applied in various fields, such as smart homes in daily life, smart applications in education, and smart robots in medical or financial scenarios. Current speech recognition technology relies on speech recognition models trained with deep learning to transcribe speech into text, and then performs subsequent processing on the text. An efficient, high-accuracy speech recognition model relies on a large amount of high-quality speech data.

[0003] However, in the process of implementing the present application, the inventors found that the speech data required for training the speech recognition model is usually obtained by manual annotation.

[0004]...

Application Information

IPC(8): G06F40/232, G06F40/242, G10L15/26
CPC: G10L15/26, G06F40/232, G06F40/242
Inventor: 张晴晴, 朱冬, 贾艳明, 何淑琳
Owner: 北京爱数智慧科技有限公司