
Method and system for optimizing random forest models

A random forest model optimization technology, applied in the field of data processing, that addresses the increased prediction time and reduced prediction speed of random forest models. The optimized model is smaller in scale and predicts faster, with improved prediction efficiency and accuracy.

Inactive Publication Date: 2015-05-20
SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI
Cites: 0; Cited by: 6

AI Technical Summary

Problems solved by technology

[0003] The embodiments of the present invention aim to provide a random forest model optimization method that addresses the reduced prediction speed and increased prediction time of existing random forest models.



Examples


Embodiment 1

[0019] Figure 1 shows the implementation flow of the random forest model optimization method provided by the first embodiment. The process is described in detail as follows:

[0020] In step S101, a heat distribution histogram of the random forest model and a distribution histogram of the decision trees with different prediction accuracies in the random forest model are created.

[0021] In this embodiment, the heat distribution histogram (heat map) of the random forest model represents the density distribution of the decision trees in the model. The random forest model is partitioned and a distribution grid is built; the number of decision trees falling into each grid cell is counted, and the counts are quantized to obtain a heat distribution histogram rendered in different colors. The heat distribution histogram clearly displays the distribution of similar decision trees, which facilitates the optimizatio...
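The counting behind step S101 can be sketched as follows. This is a minimal illustration, not the patented procedure: the excerpt does not say which coordinates place a decision tree on the grid, so the 2D points, the square cell size, and the function names `grid_counts` and `accuracy_histogram` are all assumptions made for the example.

```python
from collections import Counter


def grid_counts(points, cell_size):
    """Count how many trees fall into each square grid cell.

    points: iterable of (x, y) pairs, one per decision tree. What the
    two coordinates mean is an assumption; the source does not say.
    """
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


def accuracy_histogram(accuracies, n_bins=10):
    """Count trees per equal-width prediction-accuracy bin over [0, 1]."""
    bins = [0] * n_bins
    for acc in accuracies:
        # An accuracy of exactly 1.0 is placed in the last bin.
        idx = min(int(acc * n_bins), n_bins - 1)
        bins[idx] += 1
    return bins


if __name__ == "__main__":
    heat = grid_counts([(0.1, 0.1), (0.2, 0.3), (0.9, 0.9)], cell_size=0.5)
    print(dict(heat))
    print(accuracy_histogram([0.1, 0.6, 0.7, 1.0], n_bins=4))
```

The per-cell counts could then be mapped to colors to draw the heat map; that rendering step is left out here.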

Embodiment 2

[0040] Figure 2 shows the structure of the random forest model optimization system provided by the second embodiment of the present invention. For ease of description, only the parts related to this embodiment are shown.

[0041] The random forest model optimization system can be applied to various data processing terminals, such as pocket computers (Pocket Personal Computer, PPC), handheld computers, computers, notebook computers, and personal digital assistants (Personal Digital Assistant, PDA). It can be a software unit, a hardware unit, or a combination of software and hardware running in these terminals, or it can be integrated into these terminals as an independent plug-in or run in their application systems.

[0042] The random forest model optimization system includes a histogram creation unit 21, a similarity calculation unit 22, and an optimization unit 23. The specific functions of each u...


Abstract

The invention is applicable to the technical field of data processing and provides a method and a system for optimizing random forest models. The method includes: creating a heat distribution histogram of the random forest model and a distribution histogram of the decision trees with different prediction accuracies in the model; computing the similarity between decision trees from the proportion of identical attribute nodes they share, according to the heat distribution histogram and the accuracy distribution histogram; and deleting the decision trees with the lowest prediction accuracy according to the accuracy distribution histogram, and/or deleting the decision trees with the highest similarity according to the computed similarities. A random forest model optimized in this way is smaller in scale, with higher prediction accuracy and prediction speed, which effectively improves its prediction efficiency.
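The similarity and deletion steps in the abstract can be sketched as follows. The excerpt does not give the exact similarity formula or the deletion criteria, so this example makes several assumptions: each tree is reduced to the set of attributes used at its split nodes, similarity is the shared attributes divided by all attributes used by either tree (a Jaccard ratio), and the thresholds `min_accuracy` and `max_similarity` are hypothetical parameters.

```python
from itertools import combinations


def similarity(attrs_a, attrs_b):
    """Proportion of identical split attributes between two trees (assumed Jaccard ratio)."""
    a, b = set(attrs_a), set(attrs_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def prune(trees, min_accuracy, max_similarity):
    """Return the trees kept after accuracy-based and similarity-based deletion.

    trees: dict mapping a tree name to (accuracy, split_attributes).
    """
    # Step 1: delete trees whose prediction accuracy is too low.
    kept = {name: t for name, t in trees.items() if t[0] >= min_accuracy}

    # Step 2: while any pair is too similar, delete the less accurate tree
    # of that pair (on equal accuracy, the later name is dropped).
    changed = True
    while changed:
        changed = False
        for a, b in combinations(sorted(kept), 2):
            if similarity(kept[a][1], kept[b][1]) > max_similarity:
                drop = a if kept[a][0] < kept[b][0] else b
                del kept[drop]
                changed = True
                break
    return kept


if __name__ == "__main__":
    forest = {
        "t1": (0.90, ["age", "income"]),
        "t2": (0.80, ["age", "income"]),
        "t3": (0.50, ["zip"]),
        "t4": (0.85, ["height", "weight"]),
    }
    print(sorted(prune(forest, min_accuracy=0.6, max_similarity=0.9)))
```

Keeping the more accurate tree of a near-duplicate pair is one plausible design choice; the source only states that the most similar trees are deleted.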

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and in particular relates to a method and system for optimizing a random forest model.

Background art

[0002] Random forest is a supervised ensemble learning technique for classification. Its model is composed of a group of decision tree classifiers, and it classifies data by having the individual decision trees vote on the classification result to determine the final outcome. By injecting randomness into the training sample space and the attribute space, the independence and diversity of the decision trees are ensured, and the over-fitting problem of individual decision trees is largely overcome. The model is also robust to noise and outliers. Although the prediction performance of a random forest model is significantly better than that of a single decision tree, its prediction speed is significantly lower. As the number of decision trees increas...
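The collective voting described in the background can be sketched as follows. This is a minimal illustration in which each decision tree is modeled as a plain callable from a sample to a class label; the name `forest_predict` and the toy trees are assumptions for the example.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the class label that receives the most votes from the trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Three toy "trees", each a single threshold test on one feature.
    toy_trees = [
        lambda s: "high" if s["income"] > 50 else "low",
        lambda s: "high" if s["age"] > 40 else "low",
        lambda s: "high" if s["income"] > 80 else "low",
    ]
    print(forest_predict(toy_trees, {"income": 60, "age": 45}))
```

Every tree is evaluated for every prediction, so prediction time grows with the number of trees, which is the cost the optimization method aims to reduce.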


Application Information

IPC(8): G06Q10/04
CPC: G06Q10/04, G06F16/212, G06F16/285, G06F16/35
Inventor: 权奕铭, 李俊杰, 郭向林, 高琴, 吴胤旭
Owner SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI