
Method and system for active data labeling based on maximum information triple screening network

A maximum information triplet technique, applied in the field of data labeling, that addresses three problems: per-sample screening is cumbersome, intra-class differences are not considered, and the classifier does not make full use of the screened data samples.

Active Publication Date: 2020-08-11
INST OF AUTOMATION CHINESE ACAD OF SCI
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Existing active labeling methods generally consider only the high inter-class uncertainty of the samples to be labeled, and computing the information entropy of every sample one by one before evaluating it is too cumbersome.
Although some methods screen representative samples for manual labeling, they do not consider intra-class differences when doing so.
In addition, classifier optimization does not take full advantage of the screened data samples.
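
For context, the per-sample entropy scoring that the passage above describes as cumbersome can be sketched as follows. This is a minimal illustration of generic entropy-based uncertainty ranking, not code from the patent; the function names `entropy` and `rank_by_entropy` and the sample probabilities are assumptions made for the example.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def rank_by_entropy(predictions):
    """Return sample indices ordered from most to least uncertain."""
    scores = [entropy(p) for p in predictions]  # one pass over every sample
    return sorted(range(len(predictions)), key=lambda i: scores[i], reverse=True)

# Hypothetical softmax outputs for three unlabeled samples.
predictions = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    [0.70, 0.20, 0.10],
]
order = rank_by_entropy(predictions)
```

Every unlabeled sample has to be scored before any ranking is possible, which is the cost the passage objects to.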



Embodiment Construction

[0054] Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art should understand that these embodiments only explain the technical principle of the present invention and are not intended to limit its scope of protection.

[0055] The core idea of the embodiment of the present invention is to use a Euclidean distance metric on deep features to actively select, as the samples with the most annotation value, those with the greatest inter-class uncertainty and those with the greatest intra-class difference. A maximum information triplet loss function is then constructed, and the data structure and network parameters are updated step by step to achieve high-precision classification, thereby ensuring labeling accuracy while reducing the workload of manual labeling.
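
The screening idea in [0055] can be sketched roughly as below. This is only an illustration under stated assumptions, since the excerpt does not give the patent's formulas: class centroids of labeled features stand in for each class, "inter-class uncertainty" is taken as a small gap between the two nearest centroids, "intra-class difference" as a large distance to the nearest centroid, and `triplet_loss` is the standard margin-based triplet loss rather than the patent's maximum information variant. All function names and data values are hypothetical.

```python
import math

def centroid(vectors):
    """Mean feature vector of one class (assumed class representative)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def screen(unlabeled, labeled_by_class, k=1):
    """Pick k inter-class-uncertain and k intra-class-different sample indices.

    Assumes at least two labeled classes.
    """
    centers = [centroid(v) for v in labeled_by_class.values()]
    gaps, spreads = [], []
    for i, feat in enumerate(unlabeled):
        d = sorted(math.dist(feat, c) for c in centers)
        gaps.append((d[1] - d[0], i))  # small gap: ambiguous between two classes
        spreads.append((d[0], i))      # large value: far from its nearest class
    uncertain = [i for _, i in sorted(gaps)[:k]]
    different = [i for _, i in sorted(spreads, reverse=True)[:k]]
    return uncertain, different

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on Euclidean distances."""
    return max(0.0, math.dist(anchor, positive) - math.dist(anchor, negative) + margin)

# Hypothetical 2-D deep features.
labeled = {"a": [[0.0, 0.0], [0.0, 2.0]], "b": [[10.0, 0.0], [10.0, 2.0]]}
unlabeled = [[1.0, 1.0], [5.0, 1.0], [0.0, 9.0]]
uncertain, different = screen(unlabeled, labeled, k=1)
```

In this toy data, the sample midway between the two centroids is selected as inter-class uncertain, and the sample far from both centroids is selected as intra-class different. Both would be sent for manual labeling.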

[0056] An embodiment of the data active labeling method based on the maximum inf...



Abstract

The invention relates to the field of data labeling, in particular to a method and system for actively labeling data based on a maximum information triplet screening network, with the purpose of reducing the workload of manual labeling while ensuring labeling accuracy. Based on the deep features of the samples, the invention selects the most valuable unlabeled samples for manual labeling and constructs a maximum information triplet loss function. It then updates the data structure and network parameters step by step, and with them the screening network model. Updating stops once the maximum intra-class difference between the unlabeled data and the labeled data is less than a preset second threshold and the minimum inter-class difference is greater than a preset first threshold. The remaining data can then be labeled by computer using the most recently updated screening network model. In this way, labeling accuracy is ensured while the workload of manual labeling is reduced.
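
The stopping rule in the abstract can be sketched as follows. This is a hedged illustration, not the patented procedure: it assumes each unlabeled sample carries a predicted class, measures "intra-class difference" as the Euclidean distance to labeled samples of that class and "inter-class difference" as the distance to labeled samples of other classes, and uses hypothetical names (`class_differences`, `should_stop`) and threshold values.

```python
import math

def class_differences(unlabeled, predicted, labeled_by_class):
    """Return (max intra-class distance, min inter-class distance)."""
    max_intra, min_inter = 0.0, math.inf
    for feat, cls in zip(unlabeled, predicted):
        for label, feats in labeled_by_class.items():
            for other in feats:
                d = math.dist(feat, other)
                if label == cls:
                    max_intra = max(max_intra, d)  # same class: track the worst spread
                else:
                    min_inter = min(min_inter, d)  # other class: track the closest approach
    return max_intra, min_inter

def should_stop(max_intra, min_inter, first_threshold, second_threshold):
    """Stop when classes are compact (second threshold) and well separated (first threshold)."""
    return max_intra < second_threshold and min_inter > first_threshold

# Hypothetical features and predictions.
labeled = {"a": [[0.0, 0.0]], "b": [[10.0, 0.0]]}
unlabeled = [[1.0, 0.0], [9.0, 0.0]]
predicted = ["a", "b"]
max_intra, min_inter = class_differences(unlabeled, predicted, labeled)
stop = should_stop(max_intra, min_inter, first_threshold=5.0, second_threshold=2.0)
```

Once `stop` is true, the remaining unlabeled data would be labeled by the current model instead of by hand.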

Description

Technical field

[0001] The invention relates to the field of data labeling, in particular to a method and system for actively labeling data based on a maximum information triplet screening network.

Background

[0002] With the advent of the big data era and the development of hardware technology, research on large-scale databases has advanced continuously, and deep learning has shown clear advantages in computer vision.

[0003] However, the dominance of deep learning depends on massive amounts of labeled data. As data volumes keep growing, manual labeling of massive data is costly, time-consuming, and labor-intensive, so the annotation of massive data has long been a concern in the field of image annotation. Although deep learning technology has achieved remarkable success in computer vision, due to the small amount of labeled data in the image label itself, deep lea...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G06N3/08
CPC: G06N3/08, G06F18/217, G06F18/24133, G06F18/214
Inventor: 赵鑫, 黄凯奇, 张靖, 康运锋
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI