Audio annotation error detection method and device
An audio annotation error detection technology, applied in speech analysis, speech recognition, instruments, etc. It addresses the problem that voice annotation quality is affected by the annotators' fatigue and by their knowledge and cognitive level, which in turn leads to low accuracy and poor recognition performance in speech recognition models. The effect is improved annotation quality and, with it, improved model accuracy and recognition performance.
Examples
Embodiment 1
[0064] Referring to Figure 1, which shows a schematic flow chart of an audio annotation error detection method provided by an embodiment of the present application, the audio annotation error detection method includes:
[0065] S101: Acquire audio data, and divide the audio data into multiple audio segments.
[0066] Optionally, a voice detection system is used to mark the speech in the audio data, and the audio data is segmented according to those marks.
[0067] Optionally, the audio data can be segmented by a preset duration, such as 3 s. The audio data can also be segmented by phoneme length, for example, every 6 phoneme units.
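The preset-duration option in S101 can be illustrated with a minimal sketch. The function name `split_by_duration`, the in-memory sample list, and the 16 kHz sample rate in the usage lines are assumptions made for illustration; the application does not specify how the audio is represented.

```python
from typing import List, Sequence


def split_by_duration(
    samples: Sequence[float],
    sample_rate: int,
    segment_seconds: float = 3.0,
) -> List[Sequence[float]]:
    """Split a mono sample sequence into consecutive fixed-length segments.

    Hypothetical helper: the last segment may be shorter than the others
    when the audio length is not a multiple of the segment length.
    """
    if sample_rate <= 0 or segment_seconds <= 0:
        raise ValueError("sample_rate and segment_seconds must be positive")
    step = int(round(sample_rate * segment_seconds))
    if step == 0:
        raise ValueError("segment is shorter than one sample")
    return [samples[i:i + step] for i in range(0, len(samples), step)]


# Usage: 7 s of silence at an assumed 16 kHz, cut into 3 s segments.
audio = [0.0] * (16000 * 7)
segments = split_by_duration(audio, sample_rate=16000, segment_seconds=3.0)
print([len(s) / 16000 for s in segments])
```

Segmenting by voice marks or by phoneme count would need a voice detector or a phoneme aligner, which this sketch does not attempt.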
[0068] S102: Annotate each audio segment to obtain an initial annotation text.
[0069] An existing audio annotation method may be used for this step; details are not repeated here.
[0070] S103: Use a general text error detection model to perform error detection processing on the initial annotation text to obtain the first mar...
Embodiment 2
[0093] Referring to Figure 2, which shows a schematic flow chart of another audio annotation error detection method provided by an embodiment of the present application, the audio annotation error detection method includes:
[0094] S201: Acquire audio data, and divide the audio data into multiple audio segments.
[0095] S202: Annotate each audio segment to obtain an initial annotation text.
[0096] S203: Use the general text error detection model to locate the annotation errors.
[0097] S204: Obtain, from the confusion dictionary, a list of candidate items for replacing the wrong annotation.
[0098] S205: Take a candidate item from the candidate list and use it to replace the wrong annotation.
[0099] Optionally, the candidate with the highest priority in the candidate list is selected for replacement.
[0100] S206: Use an N-gram model to calculate the fluency and perplexity of the annotation text after replacement.
[0101] It should be understood that the higher the fluency and the lower the perplexity...
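Steps S204 to S206 can be sketched as follows, under several assumptions that are not taken from the application: the confusion dictionary is a plain mapping from a wrong token to a priority-ordered candidate list, the N-gram model is an add-one smoothed bigram model (N = 2) trained on a toy corpus, and "fluency" is taken to be the mean log-probability per token. All names (`BigramModel`, `replace_with_top_candidate`, the sample corpus) are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

BOS, EOS = "<s>", "</s>"


class BigramModel:
    """Add-one smoothed bigram model, standing in for the N-gram model."""

    def __init__(self, corpus: Sequence[Sequence[str]]) -> None:
        self.context_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        vocab = {EOS}
        for sentence in corpus:
            tokens = [BOS, *sentence, EOS]
            vocab.update(sentence)
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(vocab)

    def log_prob(self, prev: str, word: str) -> float:
        # Add-one smoothing keeps unseen bigrams at a small non-zero probability.
        num = self.bigram_counts[(prev, word)] + 1
        den = self.context_counts[prev] + self.vocab_size
        return math.log(num / den)

    def fluency(self, sentence: Sequence[str]) -> float:
        """Assumed fluency score: mean log-probability per predicted token."""
        tokens = [BOS, *sentence, EOS]
        total = sum(self.log_prob(p, w) for p, w in zip(tokens, tokens[1:]))
        return total / (len(tokens) - 1)

    def perplexity(self, sentence: Sequence[str]) -> float:
        # Perplexity is the exponential of the negative mean log-probability.
        return math.exp(-self.fluency(sentence))


def replace_with_top_candidate(
    tokens: Sequence[str],
    error_pos: int,
    confusion: Dict[str, List[str]],
) -> List[str]:
    """S204/S205: swap the wrong token for its highest-priority candidate."""
    candidates = confusion.get(tokens[error_pos], [])
    replaced = list(tokens)
    if candidates:
        replaced[error_pos] = candidates[0]  # lists are assumed priority-ordered
    return replaced


# Usage with toy data; error_pos would come from the detection step (S203).
corpus = [["play", "the", "song"], ["play", "the", "music"], ["stop", "the", "song"]]
model = BigramModel(corpus)
confusion = {"sung": ["song", "sang"]}
original = ["play", "the", "sung"]
replaced = replace_with_top_candidate(original, 2, confusion)
print(replaced, model.perplexity(original), model.perplexity(replaced))
```

With this definition, higher fluency and lower perplexity are the same ordering, so either score can rank the text before and after replacement. How the application uses the scores after S206 is not shown in this excerpt, so the sketch stops at computing them.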
Embodiment 3
[0119] Referring to Figure 3, which shows a schematic structural diagram of an audio annotation error detection device provided in an embodiment of the present application, the error detection device 30 includes:
[0120] An acquisition module 301, configured to acquire audio data and divide the audio data into a plurality of audio segments;
[0121] An annotation module 302, configured to annotate the audio segments to obtain an initial annotation text;
[0122] A first error detection module 303, configured to use a general text error detection model to perform error detection processing on the initial annotation text to obtain the first annotation text;
[0123] A determining module 304, configured to determine the confusion dictionary of the general text error detection model;
[0124] A recognition module 305, configured to use a text classification model to recognize the domain category of the first annotation text;
[0125] A second error detection module 306, configured to perform error detec...