Average error classification cost minimized classifier integrating method

A classifier and weak-classifier technology, applied in the fields of instruments, character and pattern recognition, and computer components, which can solve problems such as the difficulty of constructing ensemble learning methods.

Status: Inactive. Publication Date: 2011-09-14
CAS OF CHENGDU INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

In general, it is difficult to construct ensemble learning methods that minimize the average misclassification cost for multi-class cost-sensitive problems.

Method used

figure 1 is a flow chart of the classifier integration method for multi-class cost-sensitive learning provided by the present invention.


Examples


Embodiment 1

[0130] The specific process steps of the classifier integration method for multi-class cost-sensitive learning provided by the present invention are described below with reference to figure 1. The method includes the following steps:

[0131] S1. Obtain a training sample set S;

[0132] S2. Initialize the sample weights and assign initial values $w^{(0)}_{i,l} = c(y_i, l)/Z_0$, where $i=1,\dots,m$, $l=1,\dots,K$, $y_i \in \{1, 2, \dots, K\}$, $Z_0$ is the normalization factor of $c(y_i, l)$, $c(y_i, l)$ denotes the cost of a class-$y_i$ sample being misclassified into class $l$, and $m$ is the number of training samples;
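As a worked instance of this initialization (the 3-class cost row below is hypothetical, and the exact form of $Z_0$ is an assumption: that it normalizes the weights to sum to one over all samples and labels):

$$
w^{(0)}_{i,l} = \frac{c(y_i, l)}{Z_0}, \qquad Z_0 = \sum_{i=1}^{m} \sum_{l=1}^{K} c(y_i, l).
$$

For a single training sample with $y_1 = 1$ and cost row $c(1, \cdot) = (0, 1, 5)$, this gives $Z_0 = 6$ and the weight vector $(0, 1/6, 5/6)$: the weight attached to the expensive wrong label 3 is five times that attached to label 2, so the learner is pushed harder to avoid that misclassification.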

[0133] S3. Iterate T times, training to obtain T optimal weak classifiers, realized through steps S31-S33:

[0134] S31. Train a weak classifier on the training sample set S with the weights $w^{(t-1)}_{i,l}$, $t=1,\dots,T$, realized through steps S311-S313: S311, for the corresponding division of the sample set S, calculate ..., where $j=1,\dots,n_t$, $l$ represents a class in the multi-classification problem, and $x_i$ represents the i-th sample. The ...
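A minimal sketch in Python of the loop S1-S4 (the function name cost_sensitive_ensemble, the train_weak weak-learner hook, and the AdaBoost-style exponential weight update in S32 are illustrative assumptions; the patent's exact weak-classifier construction in S311-S313 is elided above):

```python
import numpy as np

def cost_sensitive_ensemble(X, y, cost, T, train_weak):
    """Sketch of steps S1-S4. cost[k, l] is c(k, l), the cost of a
    class-k sample being misclassified into class l; train_weak is an
    assumed routine fitting a weak classifier h (a callable mapping
    X -> predicted labels) on the weighted sample set."""
    m, K = len(y), cost.shape[0]
    # S2: each sample i carries K weights proportional to c(y_i, l).
    w = cost[y, :].astype(float)
    w /= w.sum()                          # division by Z_0
    ensemble = []
    for t in range(T):                    # S3: iterate T times
        h = train_weak(X, y, w)           # S31: train on weighted S
        pred = h(X)
        # S32 (assumed exponential update): grow the weight tied to the
        # wrong label a sample was sent to, shrink it otherwise.
        idx = np.arange(m)
        wrong = pred != y
        err = w[idx, pred][wrong].sum()
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        w[idx, pred] = w[idx, pred] * np.where(wrong, np.exp(alpha), np.exp(-alpha))
        w /= w.sum()                      # S33: renormalize, next round
        ensemble.append((alpha, h))

    def combined(Xq):                     # S4: combine the T weak classifiers
        votes = np.zeros((len(Xq), K))
        for alpha, h in ensemble:
            votes[np.arange(len(Xq)), h(Xq)] += alpha
        return votes.argmax(axis=1)

    return combined
```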

Embodiment 2

[0145] Using the classifier integration method for multi-class cost-sensitive learning of the present invention, a multi-class continuous AdaBoost ensemble learning method can be realized. Its similarities with Embodiment 1 are not repeated; the difference is:

[0146] The method for assigning initial values to the training samples in step S2 is: $w^{(0)}_{i,l} = c(y_i, l)/Z_0$, $i=1,\dots,m$, $l=1,\dots,K$, where $Z_0$ is a normalization factor, $c(i, i)=0$, and $c(i, j)=1$ when $i \neq j$. In this case the average misclassification cost simplifies to the training error rate, and the classifier ensemble method for multi-class cost-sensitive learning described in Embodiment 1 reduces to a new multi-class continuous AdaBoost ensemble learning method.
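Under the 0-1 costs of this embodiment, a usage sketch (reusing the hypothetical cost_sensitive_ensemble from the Embodiment 1 sketch; X_train, y_train, and my_weak_learner are placeholders, not names from the patent):

```python
import numpy as np

K = 3
# c(i, i) = 0 and c(i, j) = 1 when i != j, as in paragraph [0146]:
cost01 = np.ones((K, K)) - np.eye(K)

# With this matrix the average misclassification cost equals the training
# error rate, so the ensemble behaves as a multi-class continuous AdaBoost.
clf = cost_sensitive_ensemble(X_train, y_train, cost01, T=50,
                              train_weak=my_weak_learner)
```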

[0147] The method of the present invention introduces K weights for each sample, and when considering whether a target can be correctly classified, attention is paid to its opposite, equivalent to ... The l label su...

Embodiment 3

[0150] The classifier ensemble method for multi-class cost-sensitive learning proposed by the present invention and the derived multi-class continuous AdaBoost ensemble learning method are applied in practice below and compared with the existing multi-class continuous AdaBoost ensemble learning method based on Bayesian statistical inference.

[0151] The experiments use the wine data set from the UCI repository and a random data set (Random data). The wine data has 3 class labels; the random data set is randomly generated. The random data used in the experiment is produced with the random-matrix generation function rand(n) in MATLAB: an n×n matrix is generated, its first d columns are taken to obtain n samples with d attributes, and the samples are then divided into 3 categories, yielding a random 3-class data set. The representativeness of the random data set is determined by the insignificant difference between the classes and the indistinct i...
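A sketch of the random-data recipe in [0151], translated from MATLAB's rand(n) into Python/numpy (the rule for splitting samples into the 3 categories is not specified in the excerpt, so the uniform random labelling below is an assumption):

```python
import numpy as np

def make_random_dataset(n, d, n_classes=3, seed=0):
    """Mimic the experiment's recipe: generate an n-by-n uniform random
    matrix (MATLAB rand(n)), keep its first d columns to obtain n samples
    with d attributes, then split the samples into n_classes categories."""
    rng = np.random.default_rng(seed)
    M = rng.random((n, n))             # counterpart of rand(n)
    X = M[:, :d]                       # first d columns: n samples, d attrs
    y = rng.integers(0, n_classes, n)  # assumed: categories assigned at random
    return X, y

X, y = make_random_dataset(n=300, d=10)
```

Because the attributes are uniform noise, the classes differ only negligibly, matching the text's remark that the data set's representativeness comes from the insignificant difference between classes.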



Abstract

The invention discloses a classifier integration method that minimizes the average misclassification cost. The method comprises the following steps: 1, acquiring a training sample set; 2, initializing the sample weights and assigning initial values; 3, iterating T times and training to obtain T optimal weak classifiers, wherein step 3 comprises the following sub-steps: 31, training a weak classifier on the weighted training sample set S; 32, adjusting the sample weights according to the results of step 31; 33, judging whether t is smaller than T, and if so, setting t to t+1 and returning to step 31, otherwise entering step 4; and 4, combining the T optimal weak classifiers to obtain the optimal combined classifier. Compared with the prior art, the method has the advantages that classification results can be gathered into a class with low misclassification cost in a real sense, and that, without directly requiring the classifiers to be independent of one another, the training error rate decreases as the number of trained classifiers increases; the problem that conventional cost-sensitive learning methods can only gather classification results into the class with the lowest total misclassification cost is thereby solved.

Description

Technical field

[0001] The invention relates to machine learning and pattern recognition methods, in particular to a classifier integration method that minimizes the average misclassification cost, and specifically to a classifier integration method for multi-class cost-sensitive learning and a classifier integration method for multi-label classification problems.

Background technique

[0002] Current classification methods generally pursue classification accuracy, that is, the minimum classification error rate, which presumes that all classes incur the same cost when misclassified. When the costs of misclassifying different classes are unequal, the problem of cost-sensitive classification arises; the designed classifier is then required to minimize the misclassification cost rather than the classification error rate. At present, there are many cost-sensitive learning methods. For example, Domingos et al. used the meta-co...
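For context, a standard formulation of the cost-sensitive goal described above (this is textbook Bayes decision theory, not a quotation from the patent): with misclassification costs $c(k, l)$, the classifier should minimize the expected cost rather than the error rate,

$$
h(x) = \arg\min_{l \in \{1,\dots,K\}} \sum_{k=1}^{K} P(k \mid x)\, c(k, l),
$$

which reduces to the usual maximum-posterior, minimum-error rule in the special case $c(k, l) = 1$ for $k \neq l$ and $c(k, k) = 0$.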


Application Information

IPC(8): G06K9/66
Inventor: 付忠良, 赵向辉, 姚宇, 李昕
Owner CAS OF CHENGDU INFORMATION TECH CO LTD