Data flow classification method and device based on dynamic fast decision tree algorithm

A decision tree and data flow technology, applied in computing, complex mathematical operations, computer components, etc., that addresses problems such as the inability to adapt quickly to concept drift, low algorithm accuracy, and limited online decision tree learning capability.

Pending Publication Date: 2020-11-03
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0003] In a non-stationary environment, the distribution of the data may change over time; this is known as concept drift. When concept drift occurs, the optimal split attribute of a leaf node usually changes drastically, and it is then difficult to choose an appropriate split attribute. An online decision tree algorithm that uses a conservative mechanism therefore struggles to recover quickly from drift: the conservative mechanism limits the learning ability of the online decision tree, which cannot adapt quickly to concept drift, and the accuracy of the algorithm suffers.
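The claim that the optimal split attribute changes under concept drift can be illustrated with a small sketch. It assumes information gain as the heuristic metric (the text only says "heuristic metric") and uses two hypothetical hand-made batches of samples; the function names are illustrative and not taken from the patent.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def info_gain(rows, attr_index):
    """Information gain of splitting `rows` on one attribute.
    Each row is a tuple whose last element is the class label."""
    labels = [r[-1] for r in rows]
    groups = {}
    for r in rows:
        groups.setdefault(r[attr_index], []).append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, n_attrs):
    """Index of the attribute with the highest information gain."""
    return max(range(n_attrs), key=lambda i: info_gain(rows, i))

# Hypothetical data: before the drift the label follows attribute 0,
# after the drift it follows attribute 1.
before_drift = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
after_drift = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1)]

print(best_attribute(before_drift, 2))
print(best_attribute(after_drift, 2))
```

In this toy case the best attribute flips from index 0 to index 1 once the data distribution changes, which is the situation a tree that has already split on attribute 0 has to recover from.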



Embodiment Construction

[0063] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.

[0064] It should be noted that the terms "include" and "have", and any variations thereof, in the embodiments of the present invention and the drawings are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally further includes other steps or units inhe...


Abstract

The embodiment of the invention discloses a data flow classification method and device based on a dynamic fast decision tree algorithm. For a leaf node, the method compares the difference between the heuristic metric value of a first optimal attribute and the heuristic metric value of an empty attribute with a first current splitting threshold. Because the heuristic metric value of the empty attribute is smaller than that of a second optimal attribute, this difference exceeds the first current splitting threshold more readily, so the fast decision tree splits much faster than under a conservative mechanism and the learning ability of the online decision tree is brought into full play. For an internal node, the method compares the difference between the heuristic metric value of the second optimal attribute and the heuristic metric value of the node's current splitting attribute with a second current splitting threshold. When the original splitting attribute is judged to be no longer the optimal attribute, the node is split again using the second optimal attribute, so that the tree quickly adapts to concept drift and the algorithm accuracy improves.
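The two comparisons described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the heuristic metric values and thresholds are passed in as plain numbers (the abstract does not say how they are computed), the "conservative mechanism" is read here as comparing the best attribute with the runner-up, and all function names are hypothetical.

```python
def leaf_should_split(metrics, null_metric, threshold):
    """Leaf check from the abstract: best attribute vs. the empty attribute.
    `metrics` maps attribute name -> heuristic metric value."""
    best = max(metrics, key=metrics.get)
    return best, metrics[best] - null_metric > threshold

def conservative_should_split(metrics, threshold):
    """Assumed baseline for contrast: best attribute vs. the runner-up.
    Needs at least two candidate attributes."""
    ranked = sorted(metrics.values(), reverse=True)
    return ranked[0] - ranked[1] > threshold

def internal_should_resplit(metrics, current_attr, threshold):
    """Internal-node check: best attribute vs. the node's current split attribute."""
    best = max(metrics, key=metrics.get)
    if best == current_attr:
        return best, False  # current split attribute is still the best one
    return best, metrics[best] - metrics[current_attr] > threshold

# Hypothetical metric values for two candidate attributes.
metrics = {"a": 0.30, "b": 0.27}
print(leaf_should_split(metrics, null_metric=0.0, threshold=0.05))
print(conservative_should_split(metrics, threshold=0.05))
print(internal_should_resplit(metrics, current_attr="b", threshold=0.02))
```

With these made-up numbers the leaf check against the empty attribute passes, while the assumed conservative check does not, because the two candidate attributes score almost the same. The internal-node check replaces the split on "b" only once the gap to the new best attribute exceeds the second threshold.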

Description

Technical field

[0001] The invention relates to the technical field of decision trees, in particular to a method and device for classifying data streams based on a dynamic fast decision tree algorithm.

Background technique

[0002] The online decision tree is one of the most popular data stream classification algorithms. During learning, the online decision tree algorithm frequently attempts splits based on the selected best split attribute, so that leaf nodes split in time and the data stream is classified.

[0003] In a non-stationary environment, the distribution of the data may change over time; this is known as concept drift. When concept drift occurs, the optimal split attribute of a leaf node usually changes drastically, and it is then difficult to choose an appropriate split attribute, which makes it difficult for an online decision tree algorithm using a conservative mechanism to quickly recover from ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06F17/18
CPC: G06F17/18, G06F18/24323, G06F18/2415
Inventors: 赵曦滨, 万海, 孙剑, 贾宏宇
Owner: TSINGHUA UNIV