A Random Forest Data Processing Method Based on Attribute Subspace Weighting

A random forest data processing technique, applied in the field of data processing, that improves modeling efficiency.

Active Publication Date: 2017-11-24

SHENZHEN INST OF ADVANCED TECH

Problems solved by technology

[0017] In view of this, the purpose of the present invention is to provide a random forest data processing method with attribute subspace weighting that solves the problem of effectively processing ultra-high-dimensional big data.

Embodiment Construction

[0044] To enable those skilled in the art to better understand the technical solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0045] The invention discloses a random forest data processing method with attribute subspace weighting to solve the problem of effectively processing ultra-high-dimensional big data. Its main parts include:

[0046] 1) When establishing a decision tree node, the method of attribute subspace weighting is used to improve the selection r...
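
The excerpt is cut off before any weighting formula appears. For orientation only, the textbook definition of information gain, the weighting criterion named in the abstract, is given here. The symbols D (samples at the node), A (an attribute), D_v (samples with value v for A) and p_k (fraction of class k in D) are notation chosen for this note, not taken from the patent, and the patent's exact weighting formula may differ.

```latex
H(D) = -\sum_{k=1}^{K} p_k \log_2 p_k,
\qquad
\mathrm{Gain}(D, A) = H(D) - \sum_{v \in \mathrm{Values}(A)} \frac{|D_v|}{|D|}\, H(D_v)
```

Under this reading, an attribute with a larger gain receives a larger weight at the node.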

Abstract

The invention discloses a random forest data processing method with attribute subspace weighting. The method includes: S1. from the data sample set to be trained, drawing N sample subsets by sampling with replacement, where N equals the number of decision trees to be built; S2. constructing an unpruned decision tree model for each sample subset, where, when constructing each node of the decision tree model, the information gain method is first used to weight all attributes participating in the node, and the M attributes with the highest weights are selected to take part in node construction; S3. merging the N constructed decision tree models into one large random forest model. Because the invention uses information gain for attribute subspace weighting, useful information can be extracted, thereby improving classification accuracy.
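
Steps S1 to S3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: attributes are treated as categorical, the split attribute is drawn uniformly at random from the M highest-weight candidates (the abstract does not say how the split is chosen among them), and the merged forest predicts by majority vote (also not stated in the abstract). All function and parameter names (`train_forest`, `n_trees`, `m`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Information gain of splitting on the categorical attribute at index attr."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def build_tree(rows, labels, attrs, m, rng):
    """S2: grow an unpruned tree; each node weights attributes by information gain."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority  # leaf: pure node or no attributes left
    weights = {a: info_gain(rows, labels, a) for a in attrs}
    # Keep the m highest-weight attributes, dropping any with zero gain.
    top = [a for a in sorted(attrs, key=lambda a: weights[a], reverse=True)[:m]
           if weights[a] > 0]
    if not top:
        return majority
    split = rng.choice(top)  # ASSUMPTION: uniform draw among the top-m candidates
    remaining = [a for a in attrs if a != split]
    children = {}
    for value in set(row[split] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[split] == value]
        children[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx], remaining, m, rng)
    return {"attr": split, "children": children, "default": majority}

def predict_tree(node, row):
    """Walk the tree; unseen attribute values fall back to the node's majority class."""
    while isinstance(node, dict):
        node = node["children"].get(row[node["attr"]], node["default"])
    return node

def train_forest(rows, labels, n_trees, m, seed=0):
    """S1 + S3: one bootstrap sample and one tree per forest member."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # S1: sampling with replacement
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 list(range(len(rows[0]))), m, rng))
    return forest

def predict_forest(forest, row):
    """ASSUMPTION: the merged model classifies by majority vote over its trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

if __name__ == "__main__":
    # Toy data: the label depends only on attribute 0; the others are noise.
    rows = [(i % 2, (i // 2) % 3, (i * 7) % 5) for i in range(60)]
    labels = ["pos" if r[0] == 1 else "neg" for r in rows]
    forest = train_forest(rows, labels, n_trees=7, m=2)
    print(predict_forest(forest, (1, 0, 0)))
```

Restricting each node to the highest-weight attributes is what distinguishes this from drawing a uniformly random attribute subspace: in very high-dimensional data, a uniform draw can easily contain no informative attribute at all.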

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to an attribute-subspace-weighted random forest data processing method.

Background technique

[0002] With the continuous development of computers, the Internet and information technology, and their widespread use in all walks of life, the data that people accumulate has become larger and more complex. For example, the attribute dimensions of biological information data, Internet text data, digital image data and other data can reach tens of thousands, and the amount of data is still increasing, making it difficult for traditional data mining classification algorithms to cope with ultra-high dimensionality and an ever-increasing computational load.

[0003] The random forest algorithm is an ensemble learning method for classification. It uses decision trees as sub-classifiers. Compared with other classification algorithms, it has t...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06F17/30, G06F9/38

CPC: G06F18/24323

Inventor: 赵鹤, 黄哲学, 姜青山, 吴胤旭, 陈会

Owner: SHENZHEN INST OF ADVANCED TECH
