Pre-statistics of data for nodes of decision tree

A decision tree and node technology, applied in image data processing, special data processing applications, computing, etc., that addresses problems such as reduced processing efficiency and large processing delay.

Active Publication Date: 2018-07-27
MICROSOFT TECH LICENSING LLC

AI Technical Summary

Problems solved by technology

The generation of traditional decision trees requires a considerable number of visits to the training data. Such

Embodiment Construction

[0014] Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only and are not intended to limit the protection scope of the present disclosure.

[0015] The term "data sample" as used herein refers to data used to train a learning model (or process). Examples of data samples include, without limitation, documents in network (e.g., Web) search ranking, advertisements in ad click prediction, and the like.

[0016] The term "feature" as used herein refers to information on which a decision tree i...

Abstract

Embodiments of the disclosure relate to decision tree generation based on pre-statistics of data for nodes. A plurality of data samples for nodes of a decision tree are obtained, the data samples having respective feature values with respect to a first feature. A target range is determined from a plurality of predefined ranges of values such that the number of feature values falling within the target range exceeds a predetermined threshold number. The remaining feature values, other than those falling within the target range, are then assigned to their corresponding ranges of values, and the feature values falling into each of the ranges are counted based on the assignment of the remaining feature values, for use in allocating the plurality of data samples to child nodes of the nodes. In this way, the data processing speed and efficiency are significantly improved, and therefore the generation speed and efficiency of the decision tree are increased.
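The counting step described in the abstract can be illustrated with a minimal Python sketch. This is one reading of the abstract, not the patented implementation: the function names (`bin_index`, `pick_target_bin`, `histogram_with_target`), the half-open range convention, and the idea of picking the target range from a sample are all assumptions made for illustration. The sketch assigns only the values outside the target range to their ranges and obtains the target range's count by subtraction.

```python
from bisect import bisect_right


def bin_index(value, edges):
    """Index of the half-open range [edges[i], edges[i+1]) containing value.

    Assumes edges is sorted and edges[0] <= value < edges[-1].
    """
    return bisect_right(edges, value) - 1


def pick_target_bin(sample, edges, threshold):
    """Hypothetical helper: return the range whose count in `sample`
    exceeds `threshold`, or None if no range qualifies."""
    counts = [0] * (len(edges) - 1)
    for v in sample:
        counts[bin_index(v, edges)] += 1
    best = max(range(len(counts)), key=counts.__getitem__)
    return best if counts[best] > threshold else None


def histogram_with_target(values, edges, target):
    """Count values per range while skipping the search for values
    that fall within the target range."""
    counts = [0] * (len(edges) - 1)
    lo, hi = edges[target], edges[target + 1]
    assigned = 0
    for v in values:
        if lo <= v < hi:
            continue  # value is in the target range: no assignment needed
        counts[bin_index(v, edges)] += 1
        assigned += 1
    # The target range's count follows from the total by subtraction.
    counts[target] = len(values) - assigned
    return counts


# Illustrative data: a feature that is mostly zero.
edges = [0.0, 0.5, 1.0, 2.0, 4.0]
values = [0.0] * 7 + [0.7, 1.5, 3.0]
target = pick_target_bin(values, edges, threshold=5)
counts = histogram_with_target(values, edges, target)
```

When most feature values fall in one range, as with sparse features, the per-value work for that majority reduces to a single range check, which is where a saving of the kind the abstract describes would come from.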

Description

Background technique

[0001] Decision trees are a technique widely used in machine learning models or processes. With this technique, non-linear dependencies between data can be modeled and results can be interpreted without additional feature preprocessing such as normalization. When combined with different loss functions, decision trees can be used across various tasks such as classification, regression, and ranking. Moreover, when combined with different ensemble techniques such as bagging and boosting, a variety of decision tree algorithms can be derived, including random forests, gradient boosting decision trees (GBDT), and so on. As examples, by combining different loss functions and different ensemble techniques, decision trees have been widely used in network (e.g., Web) applications such as document ranking in Web search, click prediction for advertising targets, etc.

[0002] In the decision tree algorithm, the fitting of a single tree is realized by recu...

Application Information

IPC(8): G06N99/00, G06N20/00
CPC: G06N20/00, G06N5/01, G06T7/162, G06F16/9027, G06F18/24323
Inventor: 周虎成, 李翠
Owner: MICROSOFT TECH LICENSING LLC