Training data screening method, system and device, and medium

A training data screening technology, applied to neural learning methods, instruments, biological neural network models, and similar fields. It addresses the instability of existing screening indicators, reducing resource consumption and cost while improving model recognition performance.

Pending Publication Date: 2022-01-07
北京百舸飞驰科技有限公司

AI Technical Summary

Problems solved by technology

Existing approaches screen data according to the MBR (minimum Bayes risk) value. The disadvantage is that the MBR value is computed from the results of the entire decoding pass, including the language model, so it is affected by the acoustic scale and by LM rescoring. Different decoding strategies yield different results, so screening by indicators such as the MBR value is unstable.
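To illustrate why such scores shift with decoding settings, the following minimal sketch computes hypothesis posteriors from a combined score of the form acoustic scale times acoustic log-likelihood plus language-model log-probability. The two hypotheses and their scores are made-up numbers, and `posteriors` is a hypothetical helper rather than part of any toolkit. MBR-style measures are built on posteriors like these, so they inherit the same sensitivity.

```python
import math
from typing import Dict, Tuple

# Hypothetical (acoustic log-likelihood, LM log-probability) pairs.
HYPOTHESES: Dict[str, Tuple[float, float]] = {
    "A": (-100.0, -12.0),
    "B": (-110.0, -8.0),
}


def posteriors(hyps: Dict[str, Tuple[float, float]],
               acoustic_scale: float) -> Dict[str, float]:
    """Normalize combined scores into posteriors over the hypotheses."""
    scores = {name: acoustic_scale * am + lm for name, (am, lm) in hyps.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def best(hyps: Dict[str, Tuple[float, float]], acoustic_scale: float) -> str:
    """Return the hypothesis with the highest posterior."""
    post = posteriors(hyps, acoustic_scale)
    return max(post, key=post.get)
```

With these illustrative numbers, hypothesis A is preferred at an acoustic scale of 1.0 and hypothesis B at 0.1, so any indicator derived from the posteriors changes with the decoding configuration.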

Examples

Embodiment 1

[0049] In order to make the object, technical solution and advantages of the present invention clearer, the implementation of the method of the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0050] The method is described here with reference to Figure 1, a flowchart of the main steps of one embodiment of the method of the present invention. The method is based mainly on semi-supervised learning: a speech recognition model trained on a small amount of manually labeled data, or a speech recognition model initialized with text labels, takes multiple unlabeled audio data from the data set directly as input, performs speech recognition, and outputs the corresponding decoding results. When one or more decoding results with high accuracy are screened out, the unlabeled audio data corresponding to those decoding results are directly marked with ...

Embodiment 2

[0070] In order to make the object, technical solution and advantages of the present invention clearer, the implementation of the system of the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0071] The system is described here with reference to Figure 2, a block diagram of the main structure of one embodiment of the system of the present invention. The system is likewise based mainly on semi-supervised learning: a speech recognition model trained on a small amount of manually labeled data, or a speech recognition model initialized with text labels, takes multiple unlabeled audio data from the data set directly as input, performs speech recognition, and outputs the corresponding decoding results. When one or more high-accuracy decoding results are selected, the unlabeled audio data corresponding to those decoding results can be direct...

Embodiment 3

[0091] An overall application scenario is described below in conjunction with Embodiments 1 and 2 to further illustrate the implementation process of the present invention:

[0092] Take speech recognition with the Kaldi toolkit as an example. A Kaldi speech recognition model is configured for training based on semi-supervised learning, so training can be carried out directly on unlabeled audio. The training combines automatic labeling, screening of the training data (audio), and retraining to obtain labeled training data in a simple, efficient, low-cost, and low-resource way. The approach scales to models in various scenarios, and training on the screened data improves model performance and recognition quality.
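The label, screen, and retrain cycle in [0092] can be sketched in a toolkit-agnostic way. This is a minimal illustration and not Kaldi code: `self_training_round`, the `train` callable, and the convention that a lower score means a more trustworthy decoding result are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Labeled = Tuple[str, str]                   # (audio id, transcript)
Model = Callable[[str], Tuple[str, float]]  # audio id -> (transcript, score)


def self_training_round(
    labeled: List[Labeled],
    unlabeled: Sequence[str],
    train: Callable[[List[Labeled]], Model],
    keep: int,
) -> List[Labeled]:
    """Run one round: train, decode unlabeled audio, screen, extend the set."""
    model = train(labeled)  # model trained on the current labeled data

    # Decode every unlabeled utterance to get a pseudo-label and its score.
    scored = []
    for audio in unlabeled:
        text, score = model(audio)
        scored.append((score, audio, text))

    # Assumption: lower score = more reliable; keep only the best `keep` items.
    scored.sort(key=lambda item: item[0])
    selected = [(audio, text) for _, audio, text in scored[:keep]]

    # The returned set is what the next round would retrain on.
    return labeled + selected
```

Calling the function repeatedly, each time passing the returned list back in as `labeled`, gives the iterative retraining described above.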

[0093] The speech recognition model of the Kaldi toolkit can combine acoustic models such as HMMs with language models to recognize the input audio. For example, the process is: the input audio is divided into frames and then its s...

Abstract

The invention relates to the technical field of speech recognition processing, and is particularly suited to obtaining training data for machine learning models used in speech recognition and transcription. Because different scenarios require large amounts of training data and manually labeled data, existing approaches suffer from high data acquisition cost, high resource consumption, poor data quality and accuracy, and poor screening of pseudo-label data. To address these defects, the invention provides a training data screening method, system, device, and medium. The objective is to solve the technical problem of how to screen high-quality training data for speech recognition, search, transcription, and similar models based on the accuracy of pseudo-labels from semi-supervised learning. To this end, the method sorts decoding results by the mean number of node links along the decoding path during decoding, and selects the top-ranked pseudo-labeled speech data as model training data. This improves screening efficiency and data quality while reducing cost and consumption.
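The ranking step in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `DecodingResult` type, the `keep_ratio` parameter, and the choice to rank a smaller mean first (treating fewer links per node as less decoding ambiguity) are hypothetical, since the abstract does not state the sort direction or the cut-off.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class DecodingResult:
    utt_id: str                # identifier of the unlabeled utterance
    text: str                  # decoded transcript used as the pseudo-label
    links_per_node: List[int]  # link count at each node on the decoding path


def mean_links(result: DecodingResult) -> float:
    """Mean number of links over the nodes of the decoding path."""
    return mean(result.links_per_node)


def screen_by_mean_links(results: List[DecodingResult],
                         keep_ratio: float = 0.5,
                         ascending: bool = True) -> List[DecodingResult]:
    """Sort decoding results by mean link count and keep the top fraction."""
    ranked = sorted(results, key=mean_links, reverse=not ascending)
    keep = max(1, int(len(ranked) * keep_ratio))  # keep at least one result
    return ranked[:keep]
```

Each result needs at least one node in `links_per_node`, because `statistics.mean` raises an error on an empty list. Setting `ascending=False` reverses the ranking if the opposite convention is intended.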

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition processing, and is particularly suited to acquiring training data for machine learning models used in speech recognition and transcription. More specifically, it relates to a training data screening method, system, device, and medium.

Background technique

[0002] Speech recognition, speech conversion, speech recognition search, and other processing in scenarios such as intelligent voice interaction and human-computer interaction can be realized through machine learning, for example with various speech/audio recognition models. As speech recognition technology matures, systems trained on large batches of data have been able to surpass humans in certain specific scenarios. However, training a speech model requires a large amount of manually labeled data to improve predictive recognition performance (prec...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045, G06F18/214
Inventor: 袁正鹏, 王强强
Owner: 北京百舸飞驰科技有限公司