Multi-objective optimization-based high-dimensional data semi-supervised ensemble classification method

Multi-objective optimization and high-dimensional data technology, applied in the fields of instruments, character and pattern recognition, and computer components. It addresses two problems in existing methods: the subspaces are neither optimized nor selected, and the classification process in each subspace cannot be evaluated.

Active Publication Date: 2017-05-31
SOUTH CHINA UNIV OF TECH
Cites: 3; Cited by: 20

AI Technical Summary

Problems solved by technology

Third, some methods that use random subspaces simply sample attributes, without optimizing or selecting among the subspaces.
Fourth, ensemble learning methods generally combine the results of each subspace to obtain the final result, but current methods use only simple voting and cannot evaluate the classification process of each selected subspace.
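
The "simple voting" combination described above can be sketched as follows. This is a minimal illustration of plain majority voting, not the method of the invention. The function name `majority_vote` and the layout of `predictions` are assumptions made for this example.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-subspace predictions by plain majority voting.

    predictions[k][i] is the label that the classifier trained on
    subspace k (hypothetical layout) assigns to test sample i.
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions)
        # most_common(1) returns the label with the highest vote count
        combined.append(votes.most_common(1)[0][0])
    return combined

# Three subspace classifiers voting on three test samples
predictions = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
]
result = majority_vote(predictions)
```

Every subspace gets an equal vote here regardless of how reliable its classifier is, which is the limitation the text points out.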



Examples


Embodiment

[0121] Figure 1 is a flow chart of the multi-objective optimization-based semi-supervised ensemble classification method for high-dimensional data disclosed by the present invention. The steps of the invention are further described below with reference to Figure 1.

[0122] Step S1, input training data set;

[0123] Input a high-dimensional data set X to be classified, in which each row corresponds to a sample and each column corresponds to an attribute. The class labels of the training data are then used to divide the data into unlabeled data, accounting for 90%, and labeled data, accounting for 10%.
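
The 90%/10% division in step S1 could look like the sketch below. The text does not say how samples are assigned to each part, so the uniform random shuffle, the `seed` parameter, and the function name `split_labeled_unlabeled` are all assumptions made for illustration.

```python
import random

def split_labeled_unlabeled(labels, labeled_fraction=0.10, seed=0):
    """Return (labeled_idx, unlabeled_idx) for a random split of samples.

    Assumption: samples are chosen uniformly at random; the source only
    states the 10% labeled / 90% unlabeled proportions.
    """
    n = len(labels)
    indices = list(range(n))
    rng = random.Random(seed)
    rng.shuffle(indices)
    # Keep at least one labeled sample even for very small data sets
    n_labeled = max(1, round(labeled_fraction * n))
    labeled_idx = sorted(indices[:n_labeled])
    unlabeled_idx = sorted(indices[n_labeled:])
    return labeled_idx, unlabeled_idx

# 50 samples with two alternating classes (toy labels)
labels = [i % 2 for i in range(50)]
labeled_idx, unlabeled_idx = split_labeled_unlabeled(labels)
```

The labels of the samples in `unlabeled_idx` would then be hidden from the learner during training.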

[0124] Step S2, data normalization;

[0125] Perform data normalization on the input training data set. The specific process is as follows:

[0126] Obtain the maximum value $W(d)_{\max}$ and the minimum value $W(d)_{\min}$ of the data in column d, and convert the data in column d according to the following formula:

[0127] i...
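
The conversion formula is truncated in this text. Since step S2 is defined in terms of each column's maximum and minimum, one common choice consistent with that description is min-max scaling to [0, 1]. The sketch below assumes that form. It is not confirmed by the source, and the handling of constant columns is likewise an assumption.

```python
def min_max_normalize(X):
    """Scale each column of X (a list of rows) to the range [0, 1].

    Assumption: x' = (x - min_d) / (max_d - min_d) for column d. The
    patent's own formula is elided in this text.
    """
    n_cols = len(X[0])
    col_min = [min(row[d] for row in X) for d in range(n_cols)]
    col_max = [max(row[d] for row in X) for d in range(n_cols)]
    normalized = []
    for row in X:
        new_row = []
        for d, value in enumerate(row):
            span = col_max[d] - col_min[d]
            # Constant columns have zero span; map them to 0.0 (assumption)
            new_row.append((value - col_min[d]) / span if span else 0.0)
        normalized.append(new_row)
    return normalized

X = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
X_norm = min_max_normalize(X)
```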


Abstract

The invention discloses a multi-objective optimization-based semi-supervised ensemble classification method for high-dimensional data. It relates to the field of ensemble learning in artificial intelligence, and mainly addresses the prior art's shortcomings in sub-space optimization and selection and in the use of semi-supervised information. The method comprises the following steps: S1, inputting a training data set; S2, performing data normalization on the input training data set; S3, generating a random sub-space set; S4, performing multi-objective optimization selection on the sub-space set; S5, searching for an optimal semi-supervised classifier in the feature sub-spaces; S6, classifying test samples; and S7, calculating classification accuracy. The method uses random sub-spaces to address the difficulty of processing high-dimensional data, uses a multi-objective optimization solution to optimize the selection of sub-spaces and improve their robustness, and optimizes the classifier of each sub-space using both unlabeled and labeled information, which improves the generalization capability of the classifiers.
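
Step S3 (generating a random sub-space set) can be sketched as below. The abstract gives no sampling details, so drawing attribute indices uniformly without replacement, the fixed `subspace_size`, and the helper names `generate_random_subspaces` and `project` are assumptions made for this example. The multi-objective selection of step S4 is not shown.

```python
import random

def generate_random_subspaces(n_attributes, n_subspaces, subspace_size, seed=0):
    """Return a list of sub-spaces, each a sorted list of attribute indices.

    Assumption: each sub-space samples attributes uniformly at random
    without replacement, independently of the other sub-spaces.
    """
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(n_attributes), subspace_size))
        for _ in range(n_subspaces)
    ]

def project(X, subspace):
    """Keep only the columns of X (a list of rows) listed in the sub-space."""
    return [[row[d] for d in subspace] for row in X]

# 5 sub-spaces of 3 attributes each, drawn from 10 attributes
subspaces = generate_random_subspaces(n_attributes=10, n_subspaces=5, subspace_size=3)
X = [list(range(10)), list(range(10, 20))]
X_sub = project(X, subspaces[0])
```

Each projected data set would then be handed to its own classifier, so no single classifier has to handle the full attribute dimension.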

Description

Technical field

[0001] The invention relates to the field of computer artificial intelligence, in particular to a semi-supervised ensemble classification method for high-dimensional data based on multi-objective optimization.

Background technique

[0002] Acquiring labeled data requires a great deal of manpower and material resources, whereas unlabeled data is often easier to collect. For example, text mining involves a large amount of unlabeled webpage information, and the time and other costs of manually labeling each webpage one by one would be prohibitive. Because a model trained by supervised learning classifies relatively poorly when the number of training samples is insufficient, more and more researchers have turned their attention to semi-supervised classification, which uses both unlabeled and labeled data, in applications such as video annotation, image annotation, natural ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/2155, G06F18/24
Inventor: 余志文, 张乙东, 陈洁彦
Owner: SOUTH CHINA UNIV OF TECH