High-efficiency SVM active semi-supervised learning algorithm

Semi-supervised learning and active learning technology, applied in computing, computer components, instruments, etc. It addresses problems such as the lack of incremental learning ability, degraded active learning performance, and high computational complexity.

Inactive Publication Date: 2015-01-28
AIR FORCE UNIV PLA

AI Technical Summary

Problems solved by technology

For example, active learning based on error reduction must search the entire sample space before selecting samples. For a large unlabeled sample set, this selection strategy directly calculates the classifier's classification error on the test data set after each candidate sample is added; its computational complexity is so high that it is not feasible in practice.
[0028] 2. Sensitivity to label noise, unbalanced data distribution, etc., which makes it easy to sample repeated, similar, or meaningless samples.
For example, active learning based on uncertainty sampling may select isolated points, and it has difficulty distinguishing highly informative samples from abnormal samples.
[0029] 3. The influence of error propagation.
That is, if the learner trained in the initial stage of active learning is inaccurate, the samples selected during active learning may not be the "most beneficial" samples for training the learner, which will affect the performance of a...

Example Embodiment

[0107] SVM active learning generally selects the samples about which the current learner is most uncertain (lowest confidence) for labeling, while samples that are relatively certain or already well represented are not used for training. Semi-supervised learning methods, by contrast, can use the classifier to label these relatively certain, high-confidence samples, making fuller use of the information in unlabeled samples that is useful for classifier training. This can avoid the error propagation caused by the uncertainty of the initial classifier in SVM active learning, thereby improving SVM active learning performance. Based on this, the present invention provides an SVM active semi-supervised learning algorithm that integrates semi-supervised learning and active learning. With reference to Figure 1, the process of the efficient SVM active semi-supervised learning algorithm of the present invention is described in detail.
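The split between low-confidence and high-confidence samples described above can be sketched as follows. This is only an illustration, not the patented procedure: it assumes scikit-learn, uses the absolute SVM decision value as the confidence measure, and the function name `split_by_confidence`, the threshold of 1.0, and the synthetic data are all hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC


def split_by_confidence(clf, X_unlabeled, threshold=1.0):
    """Split unlabeled samples by |decision value| (an assumed confidence proxy).

    Returns (confident_idx, uncertain_idx): candidates for semi-supervised
    pseudo-labeling and for active-learning queries, respectively.
    """
    margin = np.abs(clf.decision_function(X_unlabeled))
    confident_idx = np.where(margin >= threshold)[0]
    uncertain_idx = np.where(margin < threshold)[0]
    return confident_idx, uncertain_idx


# Synthetic stand-in data: 30 labeled samples, 270 treated as unlabeled.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
clf = SVC(kernel="rbf", gamma="scale").fit(X[:30], y[:30])

confident_idx, uncertain_idx = split_by_confidence(clf, X[30:])
print(len(confident_idx), "confident;", len(uncertain_idx), "uncertain")
```

Samples in `uncertain_idx` lie close to the current decision boundary, which is where a human label is most likely to move it; samples in `confident_idx` are the ones a semi-supervised step could label automatically.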

[0108] In this example, the data uses the breast-cancer-wisconsin, ionosphere, house-vot...

Abstract

The invention discloses a high-efficiency SVM active semi-supervised learning algorithm. The algorithm comprises: (1) training an initial SVM classifier f_SVM^0; (2) determining whether f_SVM^0 satisfies the learning termination condition and, if not, proceeding to step (3); (3) using f_SVM^0 to predict labels for the unlabeled samples Us; (4) performing Tri-learning based semi-supervised learning on the samples in Us whose predicted-label confidence is greater than a threshold, performing QBC-based active learning on the samples whose confidence is smaller than the threshold, and adding the samples selected by the semi-supervised learning and the active learning to the labeled training sample set; (5) training f_SVM^k on the updated labeled training sample set; and (6) repeating from step (2) until the SVM classifier satisfies the termination condition of the active learning. The algorithm provided by the invention has the following advantages: during SVM training, the samples that most benefit classifier performance are selected autonomously as learning proceeds and used to train the classifier; after these samples are added to the training set, the accuracy with which semi-supervised learning classifies the unlabeled samples is improved as far as possible, and the SVM classification precision is enhanced.
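Steps (1) to (6) can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the patented implementation: it assumes scikit-learn; the termination condition is replaced by a fixed round limit; the Tri-learning step is approximated by requiring three bootstrap-trained SVMs to agree with the current prediction; the QBC step ranks samples by how many committee members disagree with it. All function names, the `oracle` callback, and the parameter values are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.utils import resample


def train_committee(X, y, n_members=3, seed=0):
    """Train SVMs on stratified bootstrap resamples (assumed committee design)."""
    members = []
    for i in range(n_members):
        Xb, yb = resample(X, y, random_state=seed + i, stratify=y)
        members.append(SVC(kernel="rbf", gamma="scale").fit(Xb, yb))
    return members


def active_semi_supervised_svm(X_lab, y_lab, X_pool, oracle,
                               threshold=1.0, n_query=5, max_rounds=10):
    pool = np.arange(len(X_pool))  # indices of samples still unlabeled
    clf = SVC(kernel="rbf", gamma="scale").fit(X_lab, y_lab)      # step (1)
    for k in range(max_rounds):    # step (2): assumed stop rule (round limit)
        if len(pool) == 0:
            break
        margin = np.abs(clf.decision_function(X_pool[pool]))      # step (3)
        pred = clf.predict(X_pool[pool])
        committee = train_committee(X_lab, y_lab, seed=k)
        votes = np.stack([m.predict(X_pool[pool]) for m in committee])
        disagree = (votes != pred).sum(axis=0)

        # Step (4), semi-supervised part: confident samples on which every
        # committee member agrees with the prediction receive a pseudo-label.
        pseudo = np.where((margin >= threshold) & (disagree == 0))[0]

        # Step (4), active part: among low-confidence samples, query those
        # with the most committee disagreement (ties: smallest margin first).
        low = np.where(margin < threshold)[0]
        order = np.lexsort((margin[low], -disagree[low]))
        query = low[order[:n_query]]

        chosen = np.concatenate([pseudo, query])
        if len(chosen) == 0:
            break
        new_y = np.concatenate([pred[pseudo], oracle(pool[query])])
        X_lab = np.vstack([X_lab, X_pool[pool[chosen]]])
        y_lab = np.concatenate([y_lab, new_y])
        pool = np.delete(pool, chosen)
        clf = SVC(kernel="rbf", gamma="scale").fit(X_lab, y_lab)  # step (5)
    return clf, X_lab, y_lab       # step (6) is the loop itself


# Synthetic stand-in data; the oracle simply looks up the held-back labels.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_pool, y_pool = X[20:300], y[20:300]
clf, X_new, y_new = active_semi_supervised_svm(
    X[:20], y[:20], X_pool, oracle=lambda idx: y_pool[idx])
score = clf.score(X[300:], y[300:])
print(len(y_new), "training samples; held-out accuracy:", score)
```

Only the queried samples cost a human label; the pseudo-labeled ones are free, which is the motivation the abstract gives for combining the two strategies. The agreement check is a deliberately conservative filter, since a wrong pseudo-label would otherwise propagate into later rounds.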

Description

Technical field

[0001] The invention relates to an algorithm, in particular to an efficient SVM active semi-supervised learning algorithm, and belongs to the technical field of machine learning algorithms.

Background

[0002] SVM (Support Vector Machine) is a new pattern recognition method developed on the basis of the VC dimension theory of statistical learning theory and the principle of structural risk minimization. Based on limited sample information, it seeks the best compromise between the complexity of the model (i.e., the learning accuracy for a specific training sample) and the learning ability (i.e., the ability to identify any sample without error), in order to obtain the best generalization ability. It largely solves problems found in traditional pattern recognition technology, such as model selection and over-learning, nonlinearity and the curse of dimensionality, and local minima. Many unique advantages have ...

Application Information

IPC(8): G06K9/62
CPC: G06F18/2411
Inventor: 徐海龙, 别晓峰, 龙光正, 冯卉, 吴天爱, 白东颖, 郭蓬松, 史向峰, 田野, 高歆
Owner: AIR FORCE UNIV PLA