Data labeling system and method based on automatic verification

An automatic-verification data labeling technology, applied in the field of data processing. It addresses the difficulty of controlling the quality of labeled data and the heavy reliance on manual work, and it improves labeling efficiency and data quality while keeping the system functionally complete and the labeling process simple.

Status: Inactive. Publication Date: 2021-02-19
杭州知衣科技有限公司

AI Technical Summary

Problems solved by technology

This is a data labeling method based on quality control. Used together with the system, it handles the above-mentioned problems stably and reliably, and it overcomes the heavy manual dependence, low efficiency, and hard-to-control data quality of existing labeling methods.

Examples

Embodiment 1

[0077] Embodiment 1: A data labeling method based on automatic verification, as shown in Figure 1. The specific steps of the method are as follows:

[0078] S1. Establish a standard annotation data sample library;

[0079] S2. Publish the task of data to be labeled;

[0080] S3. The user selects the data to be labeled, and performs labeling processing;

[0081] S4. Respond to the user's labeling action, and save the user's labeling result;

[0082] S5. When the data is labeled by one person, that user's labeling result is the final labeling result. When the data is labeled by multiple people, determine whether their labeling results for the same data are consistent;

[0083] If they are consistent, select one of the labeling results as the final labeling result. If they are not, determine whether the differing labeling results occur the same number of times;

[0084] If they are not the same, the principle of large numbers is used to automatically select the largest numbe...
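
The decision flow of step S5 can be sketched roughly as below. This is a minimal illustration, not the patent's implementation: the function name `resolve_labels` is hypothetical, and because the text is cut off before it says what happens when the differing results are equally frequent, the sketch simply returns `None` in that case as a placeholder assumption.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def resolve_labels(labels: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the final label for one data item, or None if undecided.

    `labels` holds one labeling result per annotator for the same item.
    """
    if not labels:
        raise ValueError("at least one labeling result is required")

    # Single annotator: that user's result is the final result.
    if len(labels) == 1:
        return labels[0]

    counts = Counter(labels)

    # All annotators agree: any one of the identical results can be used.
    if len(counts) == 1:
        return labels[0]

    ranked = counts.most_common()
    top_label, top_count = ranked[0]

    # Assumption: a tie between the most frequent results is left undecided,
    # since the source text does not show how ties are handled.
    if ranked[1][1] == top_count:
        return None

    # Otherwise the most frequent result wins.
    return top_label
```

For example, `resolve_labels(["dress", "dress", "skirt"])` picks `"dress"`, while `resolve_labels(["dress", "skirt"])` is left undecided.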

Embodiment 2

[0086] Embodiment 2: A data labeling method based on automatic verification, as shown in Figure 1. The specific steps of the method are as follows:

[0087] S1. Establish a standard labeling data sample library: select labelers with a higher level of business expertise, have them label selected samples, store the labeling results as standard labeling samples, and so build the standard labeling data sample library;

[0088] S2. Publish the task of data to be labeled: the data to be labeled is released to users in batches for labeling.

[0089] S3. The user selects the data to be labeled and performs labeling processing. When the user selects data to be labeled, standard labeling samples of the corresponding category and quantity, taken as a certain proportion of the amount of data the user selected, are drawn from the standard labeling data sample library and randomly mixed with the data to be labeled. When the standard labeling samples...
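
The mixing described in step S3 might look like the following sketch. The data shapes, the function name `mix_with_standard_samples`, the default `ratio` of 0.1, and the rounding rule are all assumptions made for illustration; the text only says that a certain proportion of standard samples of the matching category is mixed in at random.

```python
import math
import random
from typing import Dict, List, Optional, Tuple

# (item_id, category); a hypothetical representation of one data item.
Item = Tuple[str, str]


def mix_with_standard_samples(
    to_label: List[Item],
    standard_library: Dict[str, List[str]],
    ratio: float = 0.1,
    rng: Optional[random.Random] = None,
) -> List[Tuple[str, str, bool]]:
    """Return a shuffled batch of (item_id, category, is_standard) tuples."""
    rng = rng or random.Random()

    # Count how many items of each category the user selected.
    per_category: Dict[str, int] = {}
    for _, category in to_label:
        per_category[category] = per_category.get(category, 0) + 1

    batch = [(item_id, category, False) for item_id, category in to_label]

    for category, count in per_category.items():
        pool = standard_library.get(category, [])
        # Assumption: round up, and use fewer samples if the library is short.
        k = min(len(pool), math.ceil(count * ratio))
        for sample_id in rng.sample(pool, k):
            batch.append((sample_id, category, True))

    # Random order, so standard samples are not recognizable by position.
    rng.shuffle(batch)
    return batch
```

The `is_standard` flag would stay on the server side; the labeling user sees only the items themselves.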

Embodiment 3

[0095] Embodiment 3: On the basis of Embodiment 2, the method further includes step S6: compare the user's labels on the mixed-in standard labeling samples with the standard results, and count the total number of standard labeling samples the user has labeled and the accuracy on those samples. When the total number of labeled standard labeling samples and the accuracy on them are lower than the set thresholds, the user is prohibited from performing data labeling operations. Once the user is prohibited, the data the user has labeled is also invalidated and stored in the to-be-reviewed database, where it can be labeled and reviewed again later.

[0096] Save the final labeling results and store them in the unified labeling sample database for the model to use.
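
The check in step S6 could be sketched as follows. The names `GoldStats`, `gold_stats`, and `should_block` and the threshold parameters are hypothetical. The sketch follows the literal wording of the text, blocking only when both the count and the accuracy are below their thresholds, and it assumes a user who has labeled no standard samples yet is not blocked.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GoldStats:
    total: int                  # standard samples the user has labeled
    accuracy: Optional[float]   # None when no standard sample was labeled


def gold_stats(user_labels: Dict[str, str],
               standard_labels: Dict[str, str]) -> GoldStats:
    """Compare a user's labels with the standard labels they overlap with."""
    graded = [item for item in user_labels if item in standard_labels]
    if not graded:
        return GoldStats(total=0, accuracy=None)
    correct = sum(user_labels[i] == standard_labels[i] for i in graded)
    return GoldStats(total=len(graded), accuracy=correct / len(graded))


def should_block(stats: GoldStats,
                 total_threshold: int,
                 accuracy_threshold: float) -> bool:
    """Literal reading of S6: block when both values are below threshold."""
    if stats.accuracy is None:
        # Assumption: nothing to judge yet, so the user is not blocked.
        return False
    return (stats.total < total_threshold
            and stats.accuracy < accuracy_threshold)
```

When `should_block` returns `True`, the user's labels would be marked invalid and moved to the to-be-reviewed store, as the embodiment describes.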

Abstract

The invention relates to the technical field of data processing, in particular to a data labeling system and method based on automatic verification. The method comprises the following steps: S1, establishing a standard annotation data sample library; S2, issuing a to-be-labeled data task; S3, enabling the user to select to-be-annotated data and carry out annotation processing; S4, responding to the user labeling action and storing the user labeling result; S5, judging whether the multi-person labeling results of the same data are consistent. By means of the principle of large numbers, manual qualitative judgment is converted into machine quantitative control, the verification of inconsistent labeling results and the decision on the final labeling result are completed automatically, and efficiency is improved while the quality of the labeled data is guaranteed.

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a data labeling system and method based on automatic verification.

Background technique

[0002] With the continuous development of Internet technology and computer science, artificial intelligence is attracting more and more attention, and its core technology is machine learning. Before training a machine learning model, it is usually necessary to prepare training data and to label a large amount of it.

[0003] There are two main methods of labeling. The first is purely manual, file-based labeling, which is inefficient. The second assists manual labeling with a simple visual labeling system. Compared with the first method, the second improves efficiency to some extent, but it still cannot control labeling quality. In addition, when labeling data, if the labeling content contains subjective or relatively vague qualitative i...

Application Information

IPC(8): G06Q10/10, G06Q10/06
CPC: G06Q10/06, G06Q10/10
Inventor: 温苗苗, 郑泽宇, 石磊
Owner: 杭州知衣科技有限公司