
Multiple-GPU based random forest training method

A training method using random forest technology, applied in the field of multi-GPU random forest training. It addresses the problems that computing requirements change as trees deepen, that multi-GPU resources cannot be fully utilized, and that random forest training efficiency is low, and achieves the effect of improved training efficiency.

Active Publication Date: 2016-11-23
BEIJING DIANZAN TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The parallel scale of existing methods is limited by the mismatch between the number of decision trees and the number of GPUs; the number of GPUs is usually on the order of one thousand. When the random forest contains few decision trees, multi-GPU resources cannot be fully utilized, so random forest training efficiency is low. At the same time, when a single GPU or GPU group trains alone, leaves are continuously generated as the tree deepens; a leaf ends the computation for its samples, so as the depth increases the computing requirement changes (decreases). Because of this unbalanced computing demand, the GPUs in the group cannot always work at full capacity.
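
The falling demand described here can be illustrated with a toy sketch (the helper name and the one-GPU-per-sample depth model are illustrative assumptions, not from the patent): each GPU follows one sample down the tree and goes idle once that sample reaches a leaf.

```python
def active_gpus_per_depth(leaf_depths):
    """Number of GPUs still working at each depth level when one GPU
    follows one sample and stops at its leaf (no reassignment).

    leaf_depths[i] is the depth at which sample i's path ends in a leaf.
    """
    max_d = max(leaf_depths)
    # a GPU is still busy at depth `level` iff its sample's leaf is deeper
    return [sum(1 for d in leaf_depths if d >= level)
            for level in range(1, max_d + 1)]

# four samples whose paths end at depths 1, 2, 2 and 4:
# at depth 1 all four GPUs work, but by depth 3 only one remains busy
print(active_gpus_per_depth([1, 2, 2, 4]))  # → [4, 3, 1, 1]
```

The shrinking tail is exactly the unbalanced demand the invention targets: without reassignment, most GPUs sit idle while the deepest paths finish.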

Method used



Examples


Embodiment 1

[0024] As shown in Figure 1, assume there are N samples, each with d features, and the random forest RF contains M decision trees. Without loss of generality, in this embodiment the number of GPU units G is less than N. Using each sample's decision as a training task allows the multiple GPU units to be exploited in parallel to the maximum.

[0025] A) Control the multiple GPUs to compute the first decision tree, with each GPU unit computing the decision for one sample;

[0026] B) As the depth of the tree increases, computation stops wherever the decision tree reaches a leaf;

[0027] C) The GPU units at the leaf nodes are released;

[0028] D) The second decision tree is started, and the released GPU units compute and train it according to steps A-C;

[0029] E) Further GPU units released by the first decision tree join the computation of the second decision tree;

[0030] F) Similarly, the GPU units released by the leaves of the second decision tree start the computation...
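
The steps above amount to a greedy schedule: a GPU freed at a leaf immediately picks up a sample of the next tree. A minimal Python simulation (all names and the depth-as-time model are illustrative assumptions, not the patent's implementation) compares the makespan of this pipelined schedule against starting each tree only after the previous one fully finishes:

```python
import heapq

def makespan_naive(trees):
    """Without pipelining: the next tree starts only after the deepest
    sample path of the current tree finishes.

    trees[t][i] is the leaf depth of sample i in tree t; one depth
    level is taken as one unit of GPU time.
    """
    return sum(max(tree) for tree in trees)

def makespan_pipelined(trees, n_gpus):
    """Greedy schedule per steps A-F: a GPU freed at a leaf immediately
    takes the next pending sample, possibly from the next tree."""
    free_at = [0] * n_gpus          # time at which each GPU becomes free
    heapq.heapify(free_at)
    finish = 0
    for tree in trees:
        for depth in tree:
            start = heapq.heappop(free_at)  # earliest-free GPU
            end = start + depth
            finish = max(finish, end)
            heapq.heappush(free_at, end)
    return finish

# two trees, two samples each, leaf depths 3 and 1, two GPUs:
trees = [[3, 1], [3, 1]]
print(makespan_naive(trees))          # → 6 (3 + 3, trees run back to back)
print(makespan_pipelined(trees, 2))   # → 4 (the GPU freed at depth 1
                                      #      starts tree 2 early)
```

Here the GPU that reaches its leaf at depth 1 starts on the second tree while the deeper path of the first tree is still running, which is the source of the efficiency gain.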

Embodiment 2

[0033] As shown in Figure 2, similarly assume there are N samples, each with d features, and the random forest RF contains M decision trees. In this embodiment the number of GPU units G is greater than N. To keep the GPUs fully loaded, the GPUs are grouped; without loss of generality, take two groups as an example and assume G / 2 <= N.

[0034] A) Control the multiple GPUs in each group to compute one decision tree, with each GPU unit computing the decision for one sample; the groups proceed synchronously;

[0035] B) As the depth of the tree increases, the first group's decision tree reaches its leaves and stops computing;

[0036] C) The GPU units at the first group's leaf nodes are released;

[0037] D) Start the third d...
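
The grouping in this embodiment can be sketched as a simple partition (a hypothetical helper, assuming near-even group sizes; the patent only fixes the constraint that each group has at most N GPUs, e.g. G / 2 <= N for two groups):

```python
import math

def make_gpu_groups(n_gpus, n_samples):
    """Split G GPUs into the fewest near-even groups of at most
    n_samples GPUs, so each group can map one GPU per sample."""
    n_groups = math.ceil(n_gpus / n_samples)
    base, extra = divmod(n_gpus, n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)   # distribute the remainder
        groups.append(list(range(start, start + size)))
        start += size
    return groups

# G = 8 GPUs, N = 4 samples: two groups of four, matching the
# two-group example with G / 2 <= N
print(make_gpu_groups(8, 4))   # → [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Each group then runs its own tree with the pipelined leaf-release schedule of Embodiment 1, and freed GPUs from one group's finished tree can join the next pending tree.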



Abstract

The invention discloses a multiple-GPU based random forest training method. The method includes: controlling the multiple GPUs to compute one decision tree, wherein each GPU computes the decision of one sample; stopping computation when the decision tree reaches its leaves and releasing the GPUs at the leaf nodes; and starting computation of the next decision tree with the GPUs released at the leaves, continuing in the same manner until the computation is complete. Full GPU load during training can be ensured and training efficiency is improved.

Description

technical field [0001] The invention relates to a multi-GPU-based random forest training method, in particular to a multi-GPU-based random forest training method that improves training efficiency. Background technique [0002] In machine learning, a Random Forest (RF) is a classifier comprising multiple decision trees, whose output category is the mode of the categories output by the individual trees. Leo Breiman and Adele Cutler developed the algorithm for inferring random forests. For many kinds of data (input data or training samples), a random forest can produce a high-accuracy classifier by balancing the error across multiple trees. However, training multiple trees makes training complex and excessively time-consuming. A traditional single CPU can no longer meet random forest training needs in practical applications, where the data scale is usually huge, especially in the era of big data. A single CPU or multi-core CPU (due ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/62
Inventor 张京梅
Owner BEIJING DIANZAN TECH CO LTD