Method for classifying data streams under dynamic data environment

A data stream classification technology for dynamic data, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses concept drift, ineffective use of historical information, and the inability to deal effectively with dynamic data.

Status: Inactive
Publication Date: 2013-04-03
DALIAN UNIV OF TECH
1 Cites, 26 Cited by

AI Technical Summary

Problems solved by technology

Ensemble learning can solve the problem that instance-based methods do not make effective use of historical information. However, most existing methods adopt blind learning, which cannot deal effectively with dynamic data that changes suddenly, that is, so-called sudden concept drift.

Examples

Embodiment Construction

[0063] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.

[0064] Referring to Figure 1, the framework of the data stream classification method in a dynamic data environment of the present invention includes a data stream receiving module 102, a data stream dividing module 103, a kdq tree module 104, a classifier training module 105, a classifier-feature data pool 106, a concept drift detection module 107, a classifier selection module 108, and a classifier forgetting module 109.

[0065] The data stream receiving module 102 receives data from the data stream 101 in order. The data stream 101 may be any type of data stream known to those skilled in the art, in particular network intrusion detection data streams, network security monitoring data streams, sensor monitoring data streams, and power grid supply data streams. Data streams are usually transmitted at very high speeds, so calculation and storage of data s...
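The receiving and dividing modules can be pictured with a minimal sketch. It assumes fixed-size blocks, which the text available here does not specify; the function name `partition_stream` and the parameter `block_size` are hypothetical and used only for illustration. Each item is read exactly once, in arrival order, which matches the constraint that stream data can usually be processed only when it first arrives.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def partition_stream(stream: Iterable[T], block_size: int) -> Iterator[List[T]]:
    """Yield consecutive blocks of at most block_size items, in arrival order.

    Assumption: fixed-size blocks. The stream is consumed once and never rewound.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    iterator = iter(stream)
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return  # stream exhausted
        yield block


if __name__ == "__main__":
    for block in partition_stream(range(7), 3):
        print(block)
```

The last block may be shorter than `block_size` when the stream ends partway through a block.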

Abstract

The invention relates to the technical field of intelligent information processing and discloses a method for classifying data streams in a dynamic data environment. The method comprises the following steps: partitioning the data stream into blocks; establishing a separate classifier for each concept and storing it in a classifier-feature data pool; when a new data block arrives, using the Kullback-Leibler (KL) divergence to judge whether concept drift has occurred; if no drift has occurred, classifying with the classifier used at the previous moment; if drift has occurred, using the KL divergence to search the classifier-feature data pool for a suitable classifier and classifying with it; and if no matching classifier exists, training a new classifier, adding it to the classifier-feature data pool, and deleting outdated classifiers. The method can detect both steady (gradual) and sudden concept drift; when drift occurs, selecting a suitable existing classifier keeps the model efficient, and deleting outdated classifiers maintains the model's performance.
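The control flow in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the block distribution is approximated with a fixed-width 1-D histogram (a stand-in for the kdq tree partition), the classifier is a trivial majority-label model, the least recently used entry is treated as "outdated", and the names `ClassifierPool`, `threshold`, and `max_size` are hypothetical. It also assumes labels are available for a block when a new classifier has to be trained.

```python
import math
from collections import Counter


def histogram(values, bins, lo, hi, eps=1e-6):
    """Smoothed, normalized histogram of 1-D values over [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    total = len(values) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


class MajorityClassifier:
    """Placeholder classifier: always predicts the most common training label."""

    def __init__(self, labels):
        self.label = Counter(labels).most_common(1)[0][0]

    def predict(self, x):
        return self.label


class ClassifierPool:
    """Hypothetical pool of [feature histogram, classifier, last-used time] entries."""

    def __init__(self, threshold=0.5, max_size=3, bins=4, lo=0.0, hi=1.0):
        self.threshold, self.max_size = threshold, max_size
        self.bins, self.lo, self.hi = bins, lo, hi
        self.entries = []
        self.current = None
        self.clock = 0

    def process(self, xs, ys):
        """Handle one labelled block; return (event, classifier to use)."""
        self.clock += 1
        h = histogram(xs, self.bins, self.lo, self.hi)
        if self.current is not None and kl_divergence(h, self.current[0]) <= self.threshold:
            event = "no_drift"  # keep the classifier from the previous block
        else:
            # Drift (or first block): look for the closest stored concept.
            best = min(self.entries, key=lambda e: kl_divergence(h, e[0]), default=None)
            if best is not None and kl_divergence(h, best[0]) <= self.threshold:
                self.current, event = best, "reused"
            else:
                self.current, event = [h, MajorityClassifier(ys), self.clock], "trained"
                self.entries.append(self.current)
                if len(self.entries) > self.max_size:
                    # Forget the least recently used entry (assumed meaning of "outdated").
                    self.entries.remove(min(self.entries, key=lambda e: e[2]))
        self.current[2] = self.clock
        return event, self.current[1]
```

Feeding the pool a block from one concept, then a very different block, then the first concept again should produce a "trained", "trained", "reused" sequence, with "no_drift" whenever consecutive blocks have similar histograms. The threshold value and the direction of the divergence (new block against stored feature) are choices made for the sketch.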

Description

Technical field

[0001] The invention relates to the technical field of intelligent information processing, and in particular to a data stream classification method in a dynamic data environment, which is applicable to network intrusion detection, network security monitoring, sensor data monitoring, power grid supply, and the like.

Background technique

[0002] With the development of information technology, data streams, as a special kind of data, have attracted increasing attention from industry. A data stream is a huge sequence of data transmitted at high speed that can only be read in a predetermined order. In real-world applications, because a data stream is usually transmitted at very high speed, computing on and storing its data becomes very difficult. Usually the data can be processed only when it first arrives, and it is difficult to access it again later. In addition, in the process of data flow generation, the data i...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventors: 冯林, 姚远, 陈沣
Owner: DALIAN UNIV OF TECH