Data flow concept drift detection method based on historical model diversity

Concept drift detection and historical model technology, applied in the field of big data and machine learning, addresses the problem of unstable underlying data distributions and enables real-time monitoring and response.

Active Publication Date: 2021-02-05
HOHAI UNIV CHANGZHOU

AI Technical Summary

Problems solved by technology

Especially in the industrial Internet of Things (IIoT) environment, the underlying distribution may be unstable in practical applications, because the environment in which the data is generated may change over time.


Examples

Embodiment 1

[0037] An online learning model built on data streams extracted from an intelligent production line, based on industrial big data, realizes a dynamic response mechanism for fault diagnosis and supports real-time monitoring and detection of gradual faults and sudden faults.

[0038] In this embodiment, the flow chart of the data flow concept drift detection method based on the diversity of historical models is shown in Figure 1. The method includes the following steps:

[0039] S1: process the data stream online using the online bagging method;

[0040] S2: build base trees that retain historical diversity, for constructing the random forest;

[0041] S3: identify concept drift areas and perform noise reduction;

[0042] S4: use ensemble methods for concept drift detection;

[0043] S5: remove the maximum-difference model and keep the detection system updated.
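Steps S4 and S5 can be illustrated with a minimal sketch. The source text does not give the voting rule or the difference measure, so everything concrete here is an assumption made for illustration: the pairwise disagreement rate as the "difference" between models, a majority vote over per-model window error rates as the ensemble decision, and the hypothetical names `disagreement`, `most_divergent_model`, `ensemble_drift_vote`, and `error_threshold`.

```python
from itertools import combinations


def disagreement(preds_a, preds_b):
    """Fraction of positions where two equal-length sequences differ."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def most_divergent_model(window_preds):
    """Index of the model with the highest total disagreement with the others.

    Assumption: 'maximum difference' is measured by pairwise disagreement
    on a recent window; the patent text may define it differently.
    """
    n = len(window_preds)
    totals = [0.0] * n
    for i, j in combinations(range(n), 2):
        d = disagreement(window_preds[i], window_preds[j])
        totals[i] += d
        totals[j] += d
    return max(range(n), key=lambda i: totals[i])


def ensemble_drift_vote(window_preds, labels, error_threshold=0.5):
    """Signal drift when more than half of the models exceed the error threshold.

    Assumption: each model votes 'drift' if its error rate on the window is
    above error_threshold; the threshold value is illustrative only.
    """
    flags = [disagreement(p, labels) > error_threshold for p in window_preds]
    return sum(flags) > len(flags) / 2


# Three models' predictions on a four-instance window.
preds = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1]]
drift = ensemble_drift_vote(preds, labels=[1, 1, 1, 1])
to_remove = most_divergent_model(preds)
```

In this toy window, two of the three models misclassify most instances, so the vote signals drift, and the third model is the one that disagrees most with the rest of the ensemble.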

[0044] In step S1, the online bagging algorithm adopted is as follows. Given a training data set d=(x,...
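Because the description of the algorithm is cut off here, the following is only a sketch of the commonly used form of online bagging, in which each incoming instance is shown to each base learner k times with k drawn from a Poisson(1) distribution. It is not claimed to match the patent's exact formulation. The class names `OnlineBagging` and `MajorityClassLearner` are hypothetical, and the majority-class learner is a stand-in for the base trees mentioned in step S2.

```python
import math
import random
from collections import Counter


def poisson1(rng):
    """Draw k ~ Poisson(1) using Knuth's multiplication method."""
    threshold = math.exp(-1.0)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


class MajorityClassLearner:
    """Stand-in base learner (assumption): predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y):
        self.counts[y] += 1

    def predict_one(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class OnlineBagging:
    """Each arriving instance updates each model k ~ Poisson(1) times."""

    def __init__(self, n_models=10, seed=0):
        self.rng = random.Random(seed)
        self.models = [MajorityClassLearner() for _ in range(n_models)]

    def learn_one(self, x, y):
        for model in self.models:
            for _ in range(poisson1(self.rng)):
                model.learn_one(x, y)

    def predict_one(self, x):
        # Majority vote over the models that have seen at least one instance.
        votes = Counter(m.predict_one(x) for m in self.models)
        votes.pop(None, None)
        return votes.most_common(1)[0][0] if votes else None
```

The Poisson(1) weighting approximates bootstrap resampling without storing past instances, which is why it suits a data stream that is processed one instance at a time.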


Abstract

The invention discloses a data flow concept drift detection method based on historical model diversity, comprising the following steps: extracting the data stream online using an online bagging method; constructing the base trees of a random forest according to a diversity criterion; detecting changes in the feature space of the data stream and issuing early warnings of possible concept drift; identifying the concept drift area; identifying and handling noise by combining the drift warning with the drift area; and finally detecting concept drift with the ensemble method of the random forest. The method addresses concept drift detection with a random forest and proposes several new strategies for storing historical models, thereby addressing the problems of instance storage and of how to use the stored models to improve future concept drift detection.

Description

Technical field

[0001] The invention relates to a data flow concept drift detection method based on the diversity of historical models, and belongs to the technical field of big data and machine learning.

Background technique

[0002] In traditional concept drift detection methods, the learning machine (model) is updated as large amounts of new training data are processed, and the update is performed without accessing, storing, or reprocessing the previous data. In data stream processing, although different data blocks can be used independently as training data, knowledge gained in one subtask can also help solve future subtasks. Especially in the IIoT environment, the underlying distribution may be unstable in practical applications, because the environment in which the data is generated may change over time. Therefore, the present invention aims at the industrial Internet of Things environment, correspondingly retains those historical...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N20/00, G06K9/62
CPC: G06N20/00, G06F18/24323
Inventors: 刘立, 叶根, 张杰, 韩光洁
Owner HOHAI UNIV CHANGZHOU