
Method and device for identifying mistakenly labeled data and medium

A method for identifying mislabeled data, applied in database models, relational databases, structured data retrieval, and related fields. It addresses the heavy workload and low efficiency of quality inspection of labeled data, and improves the efficiency of that inspection.

Pending Publication Date: 2021-05-04
北京中关村科金技术有限公司

AI Technical Summary

Problems solved by technology

[0003] At present, quality inspection of labeled data works by sampling: part of a batch of labeled data is extracted for review, and the accuracy rate of the labeled data is calculated after review. If the accuracy rate does not meet the standard, the batch is judged to have failed quality inspection, and annotators must re-label the batch until it passes. This increases workload and makes quality inspection of labeled data inefficient.
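The sampling-based inspection described in [0003] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `sample_inspection`, the `reviewed_labels` mapping, and the fixed random seed are all assumptions introduced here.

```python
import random


def sample_inspection(batch, reviewed_labels, sample_size, required_accuracy, seed=0):
    """Spot-check a batch: review a random sample and compare accuracy to a threshold.

    batch: list of (item_id, annotator_label) pairs.
    reviewed_labels: dict mapping item_id to the reviewer's label (hypothetical input).
    Returns (accuracy, passed).
    """
    if not batch or sample_size <= 0:
        raise ValueError("need a non-empty batch and a positive sample size")
    rng = random.Random(seed)  # fixed seed only to keep the sketch reproducible
    sample = rng.sample(batch, min(sample_size, len(batch)))
    # Count sampled items whose annotator label matches the reviewer's label.
    correct = sum(1 for item_id, label in sample if reviewed_labels[item_id] == label)
    accuracy = correct / len(sample)
    # A failing result rejects the whole batch, which is the workload problem in [0003].
    return accuracy, accuracy >= required_accuracy
```

Because the pass or fail decision applies to the whole batch, one bad sample sends every item back for re-labeling, including the items that were labeled correctly.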



Examples


Embodiment 1

[0028] According to this embodiment, a method for identifying mislabeled data is provided. Note that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions. Although the flowcharts show a logical order, in some cases the steps may be performed in a different order from the one shown or described here.

[0029] The method embodiments provided here can be executed on mobile terminals, computer terminals, servers, or similar computing devices. Figure 1 shows a hardware structural block diagram of a computing device for implementing the method for identifying mislabeled data. As shown in Figure 1, the computing device may include one or more processors (which may include, but are not limited to, processing devices such as a microprocessor (MCU) or a programmable logic device (FPGA)), memory for stori...

Embodiment 2

[0085] Figure 3 is a schematic diagram of an apparatus for identifying mislabeled data provided by an embodiment of the present disclosure. The apparatus 300 corresponds to the method for identifying mislabeled data according to Embodiment 1. As shown in Figure 3, the apparatus 300 includes:

[0086] A labeling data acquisition module 301, configured to acquire the current batch of labeling data to be reviewed;

[0087] An error data determination module 302, configured to determine the mislabeled data in the labeling data to be reviewed according to preset category statistical indicators and a preset labeling data review model, where the labeling data review model is obtained by update training on the labeled data of the batch preceding the current batch.

[0088] Optionally, the error data determination module 302 is specifically configured to:

[0089] Obtain the class labeling error rate of the data to be reviewed; where...
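One way the two inputs of module 302 might be combined is sketched below. The source text is truncated before it defines the class labeling error rate or the decision rule, so everything specific here is an assumption: the error rate is taken to be the share of a class's annotator labels that an earlier review overturned, and an item is flagged only when the review model disagrees with the annotator and that class's error rate reaches a threshold. The names `class_error_rates`, `find_suspect_labels`, and `rate_threshold` are hypothetical.

```python
from collections import Counter


def class_error_rates(history):
    """Per-class error rate from already reviewed data (assumed definition).

    history: list of (annotator_label, reviewed_label) pairs.
    Returns {class: fraction of that class's annotator labels the review overturned}.
    """
    total = Counter()
    wrong = Counter()
    for annotated, reviewed in history:
        total[annotated] += 1
        if annotated != reviewed:
            wrong[annotated] += 1
    return {cls: wrong[cls] / total[cls] for cls in total}


def find_suspect_labels(batch, review_model, error_rates, rate_threshold=0.1):
    """Flag likely mislabeled items in the batch under review (hypothetical rule).

    batch: list of (item_id, features, annotator_label).
    review_model: callable mapping features to a predicted label; a stand-in
        for the labeling data review model, whose form the source does not give.
    """
    suspects = []
    for item_id, features, label in batch:
        model_disagrees = review_model(features) != label
        class_is_error_prone = error_rates.get(label, 0.0) >= rate_threshold
        if model_disagrees and class_is_error_prone:
            suspects.append(item_id)
    return suspects
```

Flagging individual items, rather than rejecting the whole batch, is what would let reviewers look only at the suspect labels.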

Embodiment 3

[0112] Figure 4 is a schematic diagram of an apparatus for identifying mislabeled data provided by another embodiment of the present disclosure. The apparatus 400 corresponds to the method according to the first aspect of Embodiment 1. As shown in Figure 4, the apparatus 400 includes: a processor 410; and a memory 420, connected to the processor 410, for providing the processor 410 with instructions for the following processing steps:

[0113] Obtain the current batch of labeling data to be reviewed; and

[0114] Determine the mislabeled data in the labeling data to be reviewed according to the preset category statistical indicators and the preset labeling data review model, where the labeling data review model is obtained by update training on the labeled data of the batch preceding the current batch.

[0115] Optionally, determining the mislabeled data in the labeling data to be reviewed according to the preset category statistical indicators a...
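The batch-to-batch relationship in [0114], where the model that reviews the current batch was update-trained on the previous batch, can be illustrated with a toy loop. This is a sketch under stated assumptions, not the patent's model: `MajorityReviewModel` is a hypothetical stand-in that remembers the most common label per feature key, features are assumed hashable, and the model is updated with each batch's labels as given (a real pipeline would presumably use the corrected labels).

```python
from collections import Counter, defaultdict


class MajorityReviewModel:
    """Toy stand-in for the review model: most common label seen per feature key."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def update(self, labeled_batch):
        """Update training: fold one batch of (features, label) pairs into the counts."""
        for features, label in labeled_batch:
            self._counts[features][label] += 1

    def predict(self, features):
        """Return the most common label for these features, or None if unseen."""
        counts = self._counts.get(features)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


def review_batches(batches):
    """Review each batch with a model trained only on the batches before it.

    Returns, per batch, the indices of items whose label the model disputes.
    """
    model = MajorityReviewModel()
    flagged_per_batch = []
    for batch in batches:
        flagged = [
            index
            for index, (features, label) in enumerate(batch)
            if model.predict(features) not in (None, label)
        ]
        flagged_per_batch.append(flagged)
        # After review, this batch becomes training data for the next batch.
        model.update(batch)
    return flagged_per_batch
```

The first batch is never flagged in this sketch because the model has seen nothing yet; how the source handles the first batch is not stated in the visible text.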


Abstract

The invention discloses a method, a device, and a storage medium for identifying mislabeled data. The method comprises obtaining the current batch of labeling data to be reviewed, and determining the mislabeled data in that batch according to a preset category statistical indicator and a preset labeling data review model, where the labeling data review model is obtained by update training on the labeled data of the batch preceding the current batch. The embodiments of the invention can improve the efficiency of quality inspection of labeled data.

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, in particular to a method, device and medium for identifying mislabeled data.

Background

[0002] With the development of artificial intelligence technology, demand keeps growing for labeled data, known as the "food" of the artificial intelligence field. After data is labeled, professionals need to review it manually as a quality inspection step. Only when the quality inspection shows that the labeled data reaches the required accuracy rate can the labeled data be considered qualified.

[0003] At present, the quality inspection step for labeled data is to extract part of the labeled data from a batch for review, and calculate the accuracy rate of the labeled data after review. If the accuracy rate does not meet the standard, it is judged that the batch of labeled data has failed quality inspection....


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/21, G06F16/28
CPC: G06F16/21, G06F16/285
Inventor: 刘睿靳丁南罗欢权圣
Owner: 北京中关村科金技术有限公司