Data classification method

A data classification method, applied in the field of data processing, that addresses the problems of ever-increasing sample weights, difficult training, and incomplete training, with the aim of ensuring complete training and improving accuracy and performance.

Pending Publication Date: 2018-09-28
湖南湖大金科科技发展有限公司

AI Technical Summary

Problems solved by technology

[0005] (1) If the full sample set is used for iterative training, the number of samples grows exponentially after each iteration, which increases the difficulty of training;
[0006] (2) If random sampling is used to form the corresponding weight ratio, some samples will be missed, resulting in incomplete training;
[0007] (3) For samples that are misclassified repeatedly, the original algorithm keeps increasing their weight. If such a sample is an outlier, the subsequent classifiers over-train on the outlier and deviate from the actual data distribution.
[0008] At present, classification algorithms are mainly improved in two ways: improving the algorithm itself, or combining and stacking multiple algorithms. Improving the classification algorithm itself usually relies on characteristics of the algorithm, such as adding discriminant formulas, inte
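Problem (3) above can be illustrated with a small numeric sketch. It assumes the "original algorithm" is the textbook discrete AdaBoost weight update (the source does not give the formula), and it uses a hypothetical weak learner that misclassifies sample 0 (the outlier) plus one other sample in every round. The function name `adaboost_reweight` and the toy setup are illustrative only.

```python
import math

def adaboost_reweight(weights, missed):
    """One textbook discrete-AdaBoost weight update (assumed, not from the patent).

    weights: normalized sample weights; missed: set of misclassified indices.
    """
    eps = sum(weights[i] for i in missed)        # weighted error of this round
    alpha = 0.5 * math.log((1.0 - eps) / eps)    # weight of the weak classifier
    updated = [w * math.exp(alpha if i in missed else -alpha)
               for i, w in enumerate(weights)]
    total = sum(updated)
    return [w / total for w in updated]          # renormalize to sum to 1

n_samples = 10
weights = [1.0 / n_samples] * n_samples
outlier = 0                                      # hypothetical outlier index
history = [weights[outlier]]

for round_index in range(1, 4):
    # Hypothetical weak learner: always misses the outlier and one other sample.
    missed = {outlier, round_index}
    weights = adaboost_reweight(weights, missed)
    history.append(weights[outlier])

print([round(w, 3) for w in history])
```

In this toy run the outlier's share of the total weight rises every round, which is the behavior the text describes as pulling later classifiers toward the outlier.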

Example Embodiment

[0041] The following further describes the present invention with reference to the accompanying drawings of the specification and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0042] As shown in Figure 1, the data classification method of this embodiment includes the following steps:

[0043] S1. Obtain training set samples for training the classifier, and divide the obtained training set samples into equal parts according to the number of iterations required for training to obtain multiple training subset samples;
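Step S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_subsets`, the shuffle before splitting, and the handling of a remainder when the sample count is not divisible by the number of iterations are all assumptions, since the source only states that the training set is divided into equal parts.

```python
import random

def split_into_subsets(samples, n_iterations, seed=0):
    """Split samples into n_iterations near-equal, non-overlapping parts.

    Shuffling first is an assumption; the source only says "equal parts".
    """
    if n_iterations <= 0:
        raise ValueError("n_iterations must be positive")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    base, extra = divmod(len(shuffled), n_iterations)
    subsets, start = [], 0
    for i in range(n_iterations):
        # The first `extra` subsets take one additional sample.
        size = base + (1 if i < extra else 0)
        subsets.append(shuffled[start:start + size])
        start += size
    return subsets

subsets = split_into_subsets(list(range(10)), 3)
print([len(s) for s in subsets])
```

Because the parts do not overlap and together cover the whole set, every sample is seen by some weak classifier while no round has to process the full set.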

[0044] S2. Based on the Adaboost algorithm, multiple weak classifiers are used to train on the respective training subset samples. When each weak classifier is trained, part of the training subset sample is combined with part of the error samples obtained by the previous weak classifier to form the final training sample. After each weak classifier has been trained, the final ADB strong classifier is obtai...
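The sample composition in S2, followed by the classification step, can be sketched as below. Many details are assumptions because the source text is truncated: the fractions `subset_frac` and `error_frac`, the 1-D threshold stump used as the weak classifier, the unweighted error rate, and the textbook formula for the classifier weight `alpha` are all hypothetical choices made only to keep the example runnable.

```python
import math
import random

def predict_stump(stump, x):
    """1-D threshold stump: predicts `polarity` when x >= threshold."""
    threshold, polarity = stump
    return polarity if x >= threshold else -polarity

def train_stump(samples):
    """Pick the (threshold, polarity) with the fewest errors on samples."""
    best = None
    for threshold in sorted({x for x, _ in samples}):
        for polarity in (1, -1):
            stump = (threshold, polarity)
            errors = sum(1 for x, y in samples if predict_stump(stump, x) != y)
            if best is None or errors < best[0]:
                best = (errors, stump)
    return best[1]

def build_round_samples(subset, prev_errors, subset_frac, error_frac, rng):
    """Combine part of the subset with part of the previous round's errors."""
    k_subset = max(1, int(len(subset) * subset_frac))
    k_errors = math.ceil(len(prev_errors) * error_frac)
    return rng.sample(subset, k_subset) + rng.sample(prev_errors, k_errors)

def train_strong(subsets, subset_frac=0.8, error_frac=0.5, seed=0):
    """Train one weak classifier per subset; return [(alpha, stump), ...]."""
    rng = random.Random(seed)
    ensemble, prev_errors = [], []
    for subset in subsets:
        round_samples = build_round_samples(
            subset, prev_errors, subset_frac, error_frac, rng)
        stump = train_stump(round_samples)
        wrong = [(x, y) for x, y in round_samples
                 if predict_stump(stump, x) != y]
        # Clamp the error rate so the logarithm below stays finite.
        eps = min(max(len(wrong) / len(round_samples), 1e-9), 1 - 1e-9)
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, stump))
        prev_errors = wrong          # only the latest errors are carried over
    return ensemble

def classify(ensemble, x):
    """Weighted vote of the weak classifiers."""
    score = sum(alpha * predict_stump(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data: label +1 when x >= 15, with one mislabeled outlier at x = 4.
data = [(x, 1 if x >= 15 else -1) for x in range(30)]
data[4] = (4, 1)
subsets = [data[i::3] for i in range(3)]
ensemble = train_strong(subsets)
print(classify(ensemble, 25), classify(ensemble, 3))
```

Carrying over only a part of the previous round's errors, rather than reweighting the whole set, keeps each round's training sample bounded in size and limits how much a single repeatedly misclassified outlier can dominate later rounds.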

Abstract

The present invention discloses a data classification method. The method comprises the steps of: S1, obtaining training set samples for training a classifier, and dividing the obtained training set samples equally according to the number of iterations required for training to obtain a plurality of training subset samples; S2, based on the Adaboost algorithm, training on the training subset samples respectively by using a plurality of weak classifiers, where, when each weak classifier is trained, some of the training subset samples and some of the error samples obtained by the previous weak classifier are selected and combined to form the final training sample, and obtaining a final ADB strong classifier from the weak classifiers after training is completed; and S3, using the trained ADB strong classifier to classify the data to be classified, and outputting a classification result. According to the disclosed method, the data used during classification training is complete, the training data is prevented from multiplying and from over-fitting, and the method has the advantages of a simple implementation principle and high classification efficiency and precision.

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a data classification method.

Background technique

[0002] Data classification maps data to specified categories. Adaboost (Adaptive Boosting) is an adaptive data classification algorithm that trains different classifiers (weak classifiers) on the same training set and then assembles these weak classifiers into a stronger final classifier (strong classifier). It is adaptive in the following sense: the samples misclassified by the previous weak classifier are given more weight, and all of the reweighted samples are then used to train the next basic classifier. A new weak classifier is added in each round until a predetermined, sufficiently small error rate is reached or a pre-specified maximum number of iterations is reached. The Adaboost algorithm has a strong iterative learning ability and can better combine and strengthen we...
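For reference, the reweighting described in the background can be written in the common textbook form of binary AdaBoost. The notation is an assumption and does not come from the patent text: labels $y_i \in \{-1, +1\}$, weak classifier $h_t$ in round $t$, sample weights $w_i^{(t)}$ that sum to 1, and normalizer $Z_t$.

```latex
\begin{aligned}
\epsilon_t &= \sum_{i=1}^{N} w_i^{(t)} \,\mathbf{1}\!\left[h_t(x_i) \neq y_i\right]
  && \text{(weighted error of round } t\text{)} \\
\alpha_t &= \tfrac{1}{2} \ln\frac{1-\epsilon_t}{\epsilon_t}
  && \text{(weight of weak classifier } h_t\text{)} \\
w_i^{(t+1)} &= \frac{w_i^{(t)} \exp\!\left(-\alpha_t\, y_i\, h_t(x_i)\right)}{Z_t},
  \quad Z_t = \sum_{j=1}^{N} w_j^{(t)} \exp\!\left(-\alpha_t\, y_j\, h_t(x_j)\right)
  && \text{(reweighting)} \\
H(x) &= \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right)
  && \text{(strong classifier)}
\end{aligned}
```

A misclassified sample has $y_i h_t(x_i) = -1$, so its weight is multiplied by $e^{\alpha_t} > 1$ whenever $\epsilon_t < 1/2$, which is the "strengthening" of wrong samples that the paragraph refers to.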

Application Information

IPC(8): G06K9/62
CPC: G06F18/2413, G06F18/214
Inventor: 赵寒枫, 陈佐, 杨胜刚, 陈邦道, 梅雪松, 余湘军, 李浩之, 王芍
Owner: 湖南湖大金科科技发展有限公司