Four-classifier cooperative training method combining active learning

A cooperative training technique combined with active learning, applied in fields such as instruments, character and pattern recognition, and computer parts. It addresses the waste of unlabeled samples and the inconsistent determination of unlabeled sample categories, improving accuracy while placing a high condition requirement on active learning.

Inactive Publication Date: 2012-01-18
XIDIAN UNIV
Cites: 0; Cited by: 30

AI Technical Summary

Problems solved by technology

On the other hand, when the two classifiers disagree about an unlabeled sample, the traditional cooperative training method discards the sample, which wastes a certain amount of unlabeled...



Examples


Embodiment 1

[0035] The present invention is a four-classifier cooperative training method combined with active learning, referred to as the CTA method. Taking the iris data set as an example, the CTA method is implemented as follows:

[0036] Input: an unlabeled data set D_u containing 96 samples, a labeled data set D_1 containing 24 samples, and a test set T containing 30 samples.

[0037] Output: Classification error rate on the test set T.
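The input and output above can be sketched in a few lines of Python. This is only an illustration: the source does not say how the samples are assigned to D_u, D_1 and T, so the random shuffle, the seed, and the helper names `split_dataset` and `error_rate` are assumptions. The total of 150 samples is simply 96 + 24 + 30.

```python
import random


def split_dataset(n_samples, n_unlabeled, n_labeled, n_test, seed=0):
    """Shuffle sample indices and cut them into D_u, D_1 and T.

    Assumption: a uniform random split; the source does not specify one.
    """
    if n_unlabeled + n_labeled + n_test != n_samples:
        raise ValueError("split sizes must add up to n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    d_u = indices[:n_unlabeled]
    d_1 = indices[n_unlabeled:n_unlabeled + n_labeled]
    t = indices[n_unlabeled + n_labeled:]
    return d_u, d_1, t


def error_rate(predicted, actual):
    """Fraction of test samples whose predicted label differs from the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


# Sizes taken from Embodiment 1: 96 unlabeled, 24 labeled, 30 test samples.
d_u, d_1, t = split_dataset(150, 96, 24, 30)
```

The three index lists are disjoint and together cover every sample, so the test set T is never seen during training.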

[0038] ① Select the naive Bayes algorithm L, which is sensitive to the data set;

[0039] ② Given the unlabeled data set D_u containing 96 samples, the labeled data set (initial training set) D_1 containing 24 samples, the test set T containing 30 samples, and the naive Bayes algorithm L, and referring to Figure 1, sample D_1 four times with the Bootstrap technique to obtain four training sets S_1, S_2, S_3, S_4, each with |D_1| samples; use the algorithm L to train the classifiers C_1, C_2, C_3, C_4; ...
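The Bootstrap step in ② can be sketched as follows. This is a minimal illustration, not the patent's implementation: the helper name `bootstrap_sets`, the seed, and the toy labeled set are hypothetical, and training the four classifiers with the algorithm L is left out.

```python
import random


def bootstrap_sets(labeled, n_sets=4, seed=0):
    """Draw n_sets bootstrap samples, each of size len(labeled).

    Sampling is with replacement, so a set may repeat some samples
    and omit others.
    """
    rng = random.Random(seed)
    size = len(labeled)
    return [
        [labeled[rng.randrange(size)] for _ in range(size)]
        for _ in range(n_sets)
    ]


# Hypothetical labeled set D_1: 24 (feature_vector, label) pairs.
d_1 = [((float(i), float(i % 3)), i % 3) for i in range(24)]
s_1, s_2, s_3, s_4 = bootstrap_sets(d_1)
```

Because each set is drawn independently, the four training sets generally differ, which is what gives the four classifiers different starting points even though they share one learning algorithm.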

Embodiment 2

[0050] The four-classifier cooperative training method combined with active learning is the same as in Embodiment 1. Taking the thyroid data set as an example (see Figure 3), the specific process is as follows:

[0051] Put 552 data items into the labeled data set, put 138 data items into the unlabeled data set, and put the remaining data into the test set. From the labeled data set, draw four training sample sets of size 552 with the Bootstrap method. Train the selected learning algorithm on each of the four training sample sets to obtain four classifiers, and use these four classifiers to judge the data in the unlabeled data set. For a classifier C, if the judgment results of the other three classifiers are the same, the data item is labeled with that judgment result and added to the training sample set corresponding to classifier C; if the judgment results of the other three classifiers are different from each other, t...
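The per-classifier decision described above might be sketched as below. The "add" branch follows the paragraph directly. The "query" branch relies on the abstract, which treats samples on which the three classifiers disagree as hard samples for active learning. The "skip" branch, for the case where exactly two of the other three classifiers agree, is an assumption of this sketch, since the source text is truncated at that point. The function name `route_sample` is hypothetical.

```python
def route_sample(predictions, i):
    """Decide what classifier i does with one unlabeled sample.

    predictions: the labels predicted by the four classifiers.
    i: index of the classifier whose training set is being updated.
    Returns an (action, label) pair.
    """
    others = [p for j, p in enumerate(predictions) if j != i]
    distinct = set(others)
    if len(distinct) == 1:
        # The other three agree: add the sample with their shared label.
        return ("add", others[0])
    if len(distinct) == len(others):
        # The other three all differ: treat as a hard sample and
        # ask for its true label (active learning).
        return ("query", None)
    # Assumption: a 2-versus-1 split is left unlabeled in this sketch.
    return ("skip", None)
```

For example, `route_sample(["a", "b", "b", "b"], 0)` returns `("add", "b")`, while `route_sample(["a", "a", "b", "c"], 0)` returns `("query", None)`.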

Embodiment 3

[0054] The four-classifier cooperative training method combined with active learning is the same as in Embodiments 1 and 2, taking the wine data set as an example. The specific implementation is as in Embodiment 1. Experiments verify the method: as the results on the wine data set in Figure 2 show, the CTA method learns better than the Co-Training and Tri-Training methods. Across the 10 CTA experiments, with 114 unlabeled samples, active learning was invoked 1.7 times on average, indicating that the invention obtains strong generalization ability with as few queries as possible. It is a semi-supervised learning method that is simple to implement and achieves a higher recognition rate.


Abstract

The invention discloses a four-classifier cooperative training method combined with active learning. It relates to cooperative training combined with active learning and belongs to the technical field of machine learning. Four classifiers and active learning are used to further improve the cooperative training method. The four classifiers are trained cooperatively, and samples on which three classifiers agree are added directly to a training set, which raises the confidence of the labels given to unlabeled samples while preventing excessive noise from being introduced. Because active learning is performed on the hard-to-distinguish samples, the learning effect is improved, and handling these samples properly also corrects the recognition function of each classifier. Because only samples on which the three classifiers disagree are treated as hard-to-distinguish samples, the condition required to trigger active learning is high, and the implementation is simple. The method is applicable to fields such as web page classification, image processing, face recognition and intrusion detection.

Description

Technical field

[0001] The invention belongs to the technical field of machine learning and relates to cooperative training combined with active learning, in particular to a four-classifier cooperative training method combined with active learning. It can be used to improve the utilization of unlabeled samples in semi-supervised learning and thereby further improve semi-supervised learning performance. The proposed method is suitable for applications such as web page classification, image processing, face recognition and intrusion detection.

Background

[0002] The standard cooperative training method was proposed by Blum and Mitchell in 1998. It rests on the following three basic assumptions: (1) the attribute set can be divided into two sets; (2) each subset of the attribute set is sufficient to train a classifier; (3) given the class label, these two attribute sets are independent of ea...

Claims


Application Information

IPC(8): G06K9/66
Inventor: 杨利英, 王轶初, 韩玉想, 盛立杰
Owner: XIDIAN UNIV