Method and system for optimizing classification of random forest based on weighted decision trees

A random forest classification technique based on weighted decision trees. It addresses problems that affect the stability of the model's classification ability, and improves generalization ability and efficiency.

Status: Inactive
Publication Date: 2018-03-06
HUAZHONG NORMAL UNIV
0 Cites, 36 Cited by

AI Technical Summary

Problems solved by technology

The traditional random forest model has the same voting weight for decision trees with different gener

Method used


Examples


[0060] The data sets used in this example are drawn from the UCI public database, which contains up to 383 data sets recording different individual characteristics. Each data set describes its samples in "attribute-value" form, where the "attribute" is the feature vector of a sample and the "value" is its label. The random forest algorithm takes the "attributes" and "values" of a large number of samples as input and outputs the mapping between "attributes" and "values", so that it can predict the "value" for a new "attribute". The specific implementation proceeds as follows:

[0061] 1. Using the "bootstrap method" to generate multiple training data sets

[0062] The "bootstrap method", that is, random sampling with replacement, is used to generate new training data sets. Each new data set contains the same number of samples, and each sample can be regarded as a vector. Repeat the "bootstrap method" process 100 times to ...
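The sampling step in [0062] can be illustrated with a minimal sketch. The function name `bootstrap_datasets`, the toy samples, and the choice to make each new set the same size as the original are assumptions for illustration; the source only states that sampling is with replacement, that every new set has the same number of samples, and that the process is repeated 100 times.

```python
import random

def bootstrap_datasets(samples, n_sets=100, seed=0):
    """Draw n_sets training sets by random sampling with replacement.

    Assumption: each new set has as many samples as the original set.
    """
    rng = random.Random(seed)
    n = len(samples)
    return [
        [samples[rng.randrange(n)] for _ in range(n)]
        for _ in range(n_sets)
    ]

# Hypothetical "attribute-value" samples: (feature vector, label).
toy = [
    ((5.1, 3.5), "a"),
    ((4.9, 3.0), "a"),
    ((6.2, 2.9), "b"),
    ((5.9, 3.0), "b"),
]
datasets = bootstrap_datasets(toy, n_sets=100)
```

Because sampling is with replacement, a given sample may appear several times in one set and not at all in another, which is what makes the resulting trees differ.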


Abstract

The present invention provides a method and a system for optimizing random forest classification based on weighted decision trees, belonging to the field of pattern recognition technology. The method comprises the steps of: employing a bootstrap method to generate a plurality of training data sets; randomly extracting one feature set for each training data set; training the decision trees and assigning a voting weight to each decision tree according to the statistical features of its feature set or the performance of the decision tree; and introducing a voting mechanism to accelerate the classification process of the random forest. By assigning voting weights from the statistical features of the feature sets or the performance of the decision trees, and by using the voting mechanism to accelerate the decision process, the method and system effectively improve the classification performance and the classification efficiency of the random forest.
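One possible reading of weighted voting with an accelerated decision is sketched below. It is not taken from the patent text, which does not give the weighting formula or the stopping rule here. The function `weighted_vote`, the example weights (imagined as each tree's held-out accuracy), and the early-stop criterion (stop once the unused weight can no longer overturn the leader) are all assumptions for illustration.

```python
from collections import defaultdict

def weighted_vote(tree_predictions, weights, early_stop=True):
    """Return (winning label, number of trees consulted) for one sample.

    tree_predictions: label predicted by each tree.
    weights: non-negative voting weight per tree (hypothetical values).
    """
    totals = defaultdict(float)
    remaining = sum(weights)  # weight of trees not yet consulted
    used = 0
    for label, w in zip(tree_predictions, weights):
        totals[label] += w
        remaining -= w
        used += 1
        if early_stop:
            ranked = sorted(totals.values(), reverse=True)
            runner_up = ranked[1] if len(ranked) > 1 else 0.0
            # Assumed rule: stop when no class can still catch the leader.
            if ranked[0] - runner_up > remaining:
                break
    winner = max(totals, key=totals.get)
    return winner, used

preds = ["a", "a", "b", "a", "b"]
weights = [0.9, 0.8, 0.6, 0.7, 0.5]
label, consulted = weighted_vote(preds, weights)
```

In this toy case the vote is decided after four of the five trees, and the result matches the full weighted vote; stopping early under this rule never changes the winner, it only skips trees whose votes could not matter.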

Description

Technical field

[0001] The invention belongs to the technical field of pattern recognition and data mining, and in particular relates to an optimized random forest classification method based on weighted decision trees.

Background

[0002] With the rapid development of information technology, the amount of data in various fields has grown explosively, and the world has entered the era of big data. To find the valuable information contained in massive data, data mining has become one of the most active research fields. Data mining generally refers to the process of using algorithms to search for information hidden in large amounts of data. It is usually associated with computer science and achieves this goal through methods such as statistics, online analytical processing, information retrieval, machine learning, expert systems (which rely on past rules of thumb), and pattern recognition.

[0003] Random forest...

Claims


Application Information

IPC(8): G06K9/62
CPC: G06F18/241, G06F18/214
Inventor: 陈靓影, 徐如意, 刘乐元, 张坤
Owner: HUAZHONG NORMAL UNIV