Active learning classification method and system

An active learning classification method and system in the technical field of machine learning classification. It addresses the problems that existing methods do not consider redundancy among the selected samples, which makes labeling time-consuming and labor-intensive and reduces classification efficiency. The effect is to reduce labeling time and workload and to improve classification efficiency.

Inactive Publication Date: 2014-03-05
SUZHOU UNIV
Cites: 5; Cited by: 41

AI Technical Summary

Problems solved by technology

[0004] The above method considers only the uncertainty and representativeness of the samples, not the redundancy among the selected samples, so the selected samples are highly redundant. Labeling redundant samples is time-consuming and laborious, and because the information they contain is redundant, it does not help train the classifier. This high redundancy increases labeling time and cost and thereby reduces classification efficiency.


Examples


Embodiment 1

[0048] Embodiment 1 of the present invention discloses an active learning classification method. Referring to Figure 1, the method includes:

[0049] S1: Obtain, from the original unlabeled sample set, a most-uncertain sample set containing at least one sample. Each sample in the most-uncertain sample set corresponds to a first parameter that characterizes its degree of uncertainty relative to the preset X object categories, and the value of the first parameter satisfies a preset condition indicating that the sample's uncertainty is high, where X is a natural number greater than 1.
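Step S1 can be illustrated with a minimal sketch. The excerpt does not say how the first parameter is computed or what the preset condition is, so this example assumes Shannon entropy over the predicted class probabilities as the uncertainty measure and a fixed threshold as the condition. The names `entropy`, `most_uncertain`, `toy_proba`, and `threshold` are hypothetical and do not come from the patent.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector.
    Assumed stand-in for the patent's 'first parameter'."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(samples, predict_proba, threshold):
    """Keep samples whose entropy over the X categories is at least
    `threshold` (assumed stand-in for the 'preset condition')."""
    return [s for s in samples if entropy(predict_proba(s)) >= threshold]

def toy_proba(x):
    """Hypothetical two-category model (X = 2): a logistic curve on a scalar."""
    p = 1.0 / (1.0 + math.exp(-x))
    return [p, 1.0 - p]

# Samples near the decision boundary (x close to 0) have entropy near ln 2.
pool = [-4.0, -0.2, 0.0, 0.3, 5.0]
selected = most_uncertain(pool, toy_proba, threshold=0.6)
```

In this toy pool only the three samples close to the decision boundary pass the threshold. Other uncertainty measures, such as the margin between the two most probable categories, would fit the same interface.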

[0050] This embodiment considers both the uncertainty and the representativeness of the samples: samples with higher uncertainty and higher representativeness are treated as samples with higher information content, that is, the most valuable samples for the preset X object categories. In an actual implementation, the preset X ...

Embodiment 2

[0097] Embodiment 2 of the present invention discloses an active learning classification system, which corresponds to the active learning classification method disclosed in Embodiment 1. Referring to Figure 4, the system includes a first sampling module 100, a clustering module 200, a second sampling module 300, a labeling module 400, a training module 500 and a classification module 600.
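One way to picture how the six modules fit together is the sketch below, which wires them as interchangeable callables. Only the module names and reference numerals come from the text. The class `ActiveLearningSystem`, its field names, and every toy stand-in function are hypothetical and are chosen only to make the data flow visible.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

Labeled = List[Tuple[Any, Any]]

@dataclass
class ActiveLearningSystem:
    first_sampling: Callable[[Sequence[Any]], List[Any]]    # module 100
    clustering: Callable[[List[Any]], List[List[Any]]]      # module 200
    second_sampling: Callable[[List[List[Any]]], List[Any]] # module 300
    labeling: Callable[[List[Any]], Labeled]                # module 400
    training: Callable[[Labeled], Callable[[Any], Any]]     # module 500

    def build_classifier(self, unlabeled):
        """Run modules 100 to 500; the result plays the role of module 600."""
        uncertain = self.first_sampling(unlabeled)
        clusters = self.clustering(uncertain)
        representative = self.second_sampling(clusters)
        labeled = self.labeling(representative)
        return self.training(labeled)

# Toy stand-ins on scalar samples (all assumptions, not the patent's methods).
def near_boundary(pool):
    return [x for x in pool if abs(x) < 3]

def split_by_sign(xs):
    return [[x for x in xs if x < 0], [x for x in xs if x >= 0]]

def pick_median(clusters):
    return [sorted(c)[len(c) // 2] for c in clusters if c]

def oracle(xs):
    # Stands in for manual labeling by a domain expert.
    return [(x, "neg" if x < 0 else "pos") for x in xs]

def nearest_neighbour(labeled):
    return lambda x: min(labeled, key=lambda pair: abs(pair[0] - x))[1]

system = ActiveLearningSystem(near_boundary, split_by_sign, pick_median,
                              oracle, nearest_neighbour)
classify = system.build_classifier([-9, -2, -1, 1, 2, 2.5, 9])
```

Here only two samples (one per cluster) are sent to the labeling step, yet the resulting classifier still separates the whole pool, which is the labeling saving the system aims for.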

[0098] The first sampling module 100 is configured to obtain, from the original unlabeled sample set, a most-uncertain sample set containing at least one sample. Each sample in the most-uncertain sample set corresponds to a first parameter that characterizes its degree of uncertainty relative to the preset X object categories, and the value of the first parameter satisfies a preset condition indicating that the sample's uncertainty is high, where X is a natural number greater than 1.

[0099] As shown in Figure 5, the ...



Abstract

The invention discloses an active learning classification method and system. The method first selects all samples with relatively high uncertainty from the original unlabeled sample set to obtain a most-uncertain sample set. It then divides the most-uncertain sample set into h different clusters according to the similarity among the samples, so that highly similar samples fall into the same cluster, and selects the most representative sample in each cluster to form a most-representative sample set. The most-representative sample set is then labeled, a classifier is trained on the labeled samples, and the trained classifier is used to classify target objects. Because clustering places similar, highly redundant samples in the same cluster and the selection is made on the basis of the clusters, redundancy among the samples finally selected for labeling is avoided, labeling time and workload are reduced, and classification efficiency is improved.
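The clustering and representative-selection steps in the abstract can be sketched as follows. The excerpt names neither the clustering algorithm nor the representativeness criterion, so this example assumes plain k-means (Lloyd's algorithm) with Euclidean distance and takes each cluster's medoid as its most representative sample. The function names and the toy points are hypothetical.

```python
import math

def kmeans(points, h, iters=20):
    """Plain Lloyd's k-means. Centers start at the first h points so the
    result is deterministic (an assumption made for this sketch)."""
    centers = [points[i] for i in range(h)]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(h)]
        for p in points:
            nearest = min(range(h), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster (keep it if empty).
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return clusters

def medoid(cluster):
    """Sample with the smallest total distance to the rest of its cluster."""
    return min(cluster, key=lambda p: sum(math.dist(p, q) for q in cluster))

def most_representative(points, h):
    """One representative per non-empty cluster."""
    return [medoid(cl) for cl in kmeans(points, h) if cl]

# Two tight groups of near-duplicate samples; h = 2.
uncertain = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.0), (5.1, 5.0), (0.2, 0.0), (5.2, 5.0)]
to_label = most_representative(uncertain, h=2)
```

Six near-duplicate samples collapse to two samples to label, one from each group. Choosing the medoid rather than the cluster mean guarantees that the representative is an actual sample that can be handed to an annotator.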

Description

Technical field

[0001] The invention belongs to the technical field of machine learning classification, and in particular relates to an active learning classification method and system.

Background technique

[0002] Information classification is an important problem in information processing and pattern recognition, and classification methods are an active research topic. For any classification method, the key to achieving classification is to train a classifier with high classification accuracy on labeled samples.

[0003] The classification accuracy of a classifier depends largely on the labeled sample set. In practice, obtaining labeled samples is costly because it requires manual labeling by domain experts. Therefore, in order to obtain high classification accuracy at the lowest possible labeling cost, it is necessary to start from the original unlabeled samples. The s...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/62
Inventor: 赵朋朋, 焦阳, 辛洁, 吴健, 崔志明
Owner: SUZHOU UNIV