Visualized optimization processing method and device for random forest classification model

A random forest classification model technology, applied in the fields of electrical digital data processing, special data processing applications, and computer components. It addresses problems such as reduced prediction speed and increased storage space, with the effect of improving prediction speed and precision.

Inactive Publication Date: 2015-04-29
HUAWEI TECH CO LTD
4 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0004] During long-term research and development, the inventors of the present application found that the prediction performance of a random forest is significantly better than that of a single decision tree, but there are ...



Examples


Embodiment Construction

[0031] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.

[0032] Refer to Figure 1, which is a flowchart of an embodiment of the visualized optimization processing method for the random forest classification model of the present invention. The method includes:

[0033] Step S101: For the constructed random forest classification model, estimate the correlation between each decision tree of the random forest classification model through out-of-bag data.
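A minimal sketch of Step S101 follows. The text available here does not give the correlation formula, so this sketch assumes one plausible measure: the Pearson correlation of two trees' per-sample correctness indicators (1 if the tree classifies the sample correctly, 0 otherwise), computed only over samples that are out-of-bag for both trees. The names `pearson`, `oob_correlation`, and the toy data are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    if n == 0:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def oob_correlation(preds_i, preds_j, oob_i, oob_j, labels):
    """Assumed measure: correlation of two trees' correctness on shared out-of-bag samples."""
    shared = [n for n in range(len(labels)) if oob_i[n] and oob_j[n]]
    hit_i = [1.0 if preds_i[n] == labels[n] else 0.0 for n in shared]
    hit_j = [1.0 if preds_j[n] == labels[n] else 0.0 for n in shared]
    return pearson(hit_i, hit_j)

# Toy data: true labels, per-tree predictions, and per-tree out-of-bag masks.
labels  = [0, 1, 1, 0, 1, 0]
preds_a = [0, 1, 0, 0, 1, 1]   # correct on samples 0, 1, 3, 4
preds_c = [1, 0, 1, 1, 0, 0]   # correct exactly where tree a is wrong
oob_all = [True] * 6

print(oob_correlation(preds_a, preds_a, oob_all, oob_all, labels))
print(oob_correlation(preds_a, preds_c, oob_all, oob_all, labels))
```

Restricting the comparison to shared out-of-bag samples means neither tree is scored on data it was trained on, which is the usual reason for using out-of-bag data in estimates of this kind.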

[0034] In machine learning, a random forest classification model is a classifier that includes multiple decision trees, and its output class is determined by a majority vote over the classes output by the individual decision trees. Let the random forest be expressed as {h(X, θ_k), k = 1, 2, ..., K}, where h(X, θ_k) represents a decision tree and K is the number of decision trees included in the random forest. Here {θ_k, k = 1, 2, ..., K} is a sequenc...
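The way a forest {h(X, θ_k)} combines its trees can be sketched as a majority vote. This assumes the combined output is the most frequent class among the individual trees' outputs, and it represents each tree as a plain Python callable. The name `forest_predict` and the three toy threshold trees are hypothetical.

```python
from collections import Counter

def forest_predict(trees, x):
    """Return the class that receives the most votes from the K trees."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Three toy "trees", each a single threshold test on a 2-feature input.
trees = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[0] + x[1] > 1.2),
]

print(forest_predict(trees, (0.9, 0.2)))
print(forest_predict(trees, (0.9, 0.8)))
```

Using an odd number of trees avoids ties in the two-class case. With ties, `Counter.most_common` returns the tied class that was counted first.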


Abstract

Disclosed is a visualized optimization processing method for a random forest classification model. The method comprises: for a random forest classification model which has been constructed, estimating the degree of correlation between various decision trees of the random forest classification model via out-of-bag data; constructing a correlation matrix using the degree of correlation between various decision trees of the random forest classification model; according to the correlation matrix, by means of the dimension reduction technology, acquiring a visual pattern of the random forest classification model in a space with dimensions fewer than three; and according to the visualized pattern of the random forest classification model, conducting optimization processing on the random forest classification model, so that the upper limit of a second generalization error of the processed random forest classification model does not go beyond the upper limit of a first generalization error of the random forest classification model prior to processing. By means of the above-mentioned method, the present invention can reduce the number of decision trees in the random forest classification model and reduce the memory space required by the random forest classification model, and can also improve the prediction speed and accuracy at the same time.
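The step from correlation matrix to visual pattern can be sketched as follows. The abstract does not name the dimension reduction technique, so this sketch assumes classical multidimensional scaling (MDS) with the dissimilarity 1 - correlation, and uses NumPy. The function name `classical_mds` and the 3 x 3 example matrix are hypothetical.

```python
import numpy as np

def classical_mds(corr, dims=2):
    """Embed K trees in `dims` dimensions from a K x K correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    dist = 1.0 - corr                      # assumed dissimilarity: high correlation = close
    k = dist.shape[0]
    centering = np.eye(k) - np.ones((k, k)) / k
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:dims]             # keep the largest `dims`
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))  # negative eigenvalues -> 0
    return eigvecs[:, order] * scale

# Toy correlation matrix: trees 0 and 1 are highly correlated, tree 2 is not.
corr = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
coords = classical_mds(corr)   # one 2-D point per decision tree
print(coords)
```

In such an embedding, strongly correlated trees land close together, so clusters of near-duplicate trees become visible as candidates for removal.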

Description

Technical field

[0001] The invention relates to the technical field of data mining, and in particular to a visualized optimization processing method and device for a random forest classification model.

Background

[0002] Classification is one of the most fundamental tasks in statistics, data analysis, machine learning, and data mining research. The main goal of this task is to use training data to build a predictive model (i.e., a learning machine) with strong generalization ability, and ensemble learning has significant advantages in this regard. The basic idea of ensemble learning is to use multiple learning machines to solve the same problem. Two prerequisites determine the feasibility of ensemble learning: one is that a single basic learning machine is effective, that is, its accuracy should be greater than the probability of a correct random guess; the other is the difference between the ...
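The first prerequisite can be illustrated with a short calculation. It assumes K independent two-class base learners that are each correct with probability p. This is an idealization, since real learners are correlated. The helper name `majority_accuracy` is hypothetical.

```python
from math import comb

def majority_accuracy(p, k):
    """Probability that more than half of k independent learners are correct."""
    need = k // 2 + 1
    return sum(comb(k, m) * p**m * (1 - p)**(k - m) for m in range(need, k + 1))

# Compare learners that are worse than, equal to, and better than random guessing.
for p in (0.4, 0.5, 0.6):
    print(p, majority_accuracy(p, 21))
```

Under this idealization, a majority vote amplifies accuracy only when p exceeds 0.5; below 0.5 it makes the ensemble worse than a single learner.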


Application Information

IPC(8): G06F17/30
CPC: G06F18/24323
Inventors: 赫彩凤, 李俊杰, 郭向林
Owner: HUAWEI TECH CO LTD