Data labeling method and device based on self-learning algorithm

A data labeling technology based on a self-learning algorithm, applied in electrical digital data processing, special-purpose data processing applications, computing, and related fields. It addresses the low intelligence, long man-hours, and low efficiency of current data labeling, improving labeling efficiency and accuracy and ensuring labeling quality.

Active Publication Date: 2019-05-31
深圳平安综合金融服务有限公司

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to provide a data labeling method and device based on a self-learning algorithm, to solve the problems of low intelligence, long man-hours, low efficiency, and poor quality in current data labeling.


Embodiment Construction

[0050] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the invention, not to limit the invention.

[0051] The execution subject of the data labeling method provided by the embodiment of the present invention (hereinafter referred to as the system) may be the data labeling device provided by the embodiment of the present invention, or a terminal device (for example, a smartphone, a tablet computer, etc.) or a server integrated with the data labeling device. The system can be implemented in hardware or software.

[0052] Referring to Figure 1 and Figure 2, Figure 1 shows a flow chart of a data labeling method based on a self-learning algorithm according to an embodiment of the present invention...


Abstract

The invention relates to the field of speech signal processing, and in particular to a data labeling method and device based on a self-learning algorithm. The method comprises a speech recognition step, a text comparison step, a natural language processing algorithm evaluation step, a natural language processing algorithm prediction step, a data labeling step, a quality inspection step, and a self-learning step. The text comparison step compares a plurality of recognition texts, marks the parts where the texts differ, and performs sentence segmentation. The data labeling step labels an optimal pre-labeled text multiple times, with reference to the original recognition text and the prediction text of the differing parts, to form a plurality of groups of data labeling texts. The self-learning step inputs the optimal labeled text and the corresponding audio signal into a speech recognition engine, which is iteratively trained based on the self-learning algorithm. The labeling method and device greatly reduce data labeling time, effectively improve data labeling quality and efficiency, provide training support for various artificial intelligence products, and improve the performance of intelligent products.
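
The text comparison step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which comparison algorithm is used, so the standard library's `difflib.SequenceMatcher` is swapped in here as a stand-in, transcripts are assumed to be whitespace-tokenized, and the function names `mark_differences` and `compare_recognition_texts` are hypothetical. Sentence segmentation is left out.

```python
from __future__ import annotations

from difflib import SequenceMatcher


def mark_differences(reference: str, candidate: str) -> list[tuple[str, str, str]]:
    """Return (tag, reference_span, candidate_span) for each differing region.

    Tags follow difflib's opcodes: "replace", "delete", or "insert".
    Assumption: tokens are separated by whitespace.
    """
    ref_tokens = reference.split()
    cand_tokens = candidate.split()
    matcher = SequenceMatcher(a=ref_tokens, b=cand_tokens, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append(
                (tag, " ".join(ref_tokens[i1:i2]), " ".join(cand_tokens[j1:j2]))
            )
    return diffs


def compare_recognition_texts(
    texts: list[str],
) -> dict[int, list[tuple[str, str, str]]]:
    """Compare every later recognition text against the first one.

    Keys are indices into `texts`; values are the marked differences.
    Using the first text as the reference is an assumption of this sketch.
    """
    reference = texts[0]
    return {
        idx: mark_differences(reference, other)
        for idx, other in enumerate(texts[1:], start=1)
    }


if __name__ == "__main__":
    # Hypothetical transcripts of the same audio from three recognition passes.
    recognized = [
        "please transfer five hundred yuan today",
        "please transfer nine hundred yuan today",
        "please transfer five hundred yuan",
    ]
    for idx, diffs in compare_recognition_texts(recognized).items():
        print(idx, diffs)
```

In a pipeline like the one outlined in the abstract, the marked spans would be the parts passed on to the prediction and data labeling steps, so that annotators focus on the regions where the recognition texts disagree.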

Description

Technical field

[0001] The invention relates to the field of speech signal processing, and more specifically, to a data labeling method and device based on a self-learning algorithm.

Background

[0002] With the development of artificial intelligence (AI) technology, intelligent products based on ASR technology, such as voice robots, agent assistants, and voice quality inspection, have been widely promoted. ASR stands for Automated Speech Recognition, that is, automatic speech recognition, a technology that converts human speech into text. The accuracy of ASR directly affects the performance of smart products.

[0003] A self-learning algorithm requires large training and test sets. The data in the training set and test set must be labeled data that supports the algorithm. The process of converting the collected raw data into data available to the algorithm is called data labeling. That ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06, G06F17/27
Inventor: 余伟, 赵静芝, 任丽, 胡发泽, 徐旭东
Owner: 深圳平安综合金融服务有限公司