Data stream classification algorithm based on AAE-DWMIL-LearnNSE (Automatic Assisted Engineering-Discrete Wavelength Multiple Input Multiple Output-LearnNSE)

A classification algorithm and data stream technology, applied to database clustering/classification, database retrieval, and related areas. It addresses problems such as reduced classification accuracy and distribution, and achieves accurate classification features, savings in storage space, and shorter model construction time.

Status: Inactive. Publication Date: 2020-01-03
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Algorithms such as IS3RS and IDS-ELM do not select training samples selectively, which greatly reduces classification accuracy.
Algorithms such as SEA and CELM predict much more accurately than a single model, but they leave open problems such as how to select, update, and assign base classifiers.

Examples

Embodiment Construction

[0042] Specific implementation

[0043] The present invention will be described in further detail below through examples of implementation.

[0044] This implementation case uses six data sets: moving Gaussian, ocean, hyperplane, chessboard, electric power, and weather data, each containing 50,000 samples. From each of the six data sets, 40,000 samples are drawn by random sampling as the training set, and the remaining 10,000 are used as the test set.
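The random 40,000/10,000 split described above could be sketched as follows. This is a minimal illustration, not the patent's procedure: the function name `random_split`, the fixed seed, and the choice to split by index are all assumptions made for the example.

```python
import random


def random_split(n_samples=50_000, n_train=40_000, seed=0):
    """Return (train_indices, test_indices) for one data set.

    Hypothetical helper: shuffles the sample indices once and cuts the
    shuffled list, so the two parts are disjoint and cover every sample.
    """
    if not 0 < n_train < n_samples:
        raise ValueError("n_train must be between 0 and n_samples")
    rng = random.Random(seed)          # seeded for a repeatable split
    indices = list(range(n_samples))
    rng.shuffle(indices)               # random sampling without replacement
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = random_split()
```

The same call would be repeated for each of the six data sets, so that every data set contributes its own training and test indices.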

[0045] The overall flow of the data stream classification algorithm provided by the present invention is shown in Figure 1. The specific steps are as follows:

[0046] (1) Update the weight of each classifier

[0047] First, calculate the classification error rate ε of each base classifier, and set the initial value of the weight ω of each base classifier according to the classification error rate, as shown in the following formula:

[0048]

[0049] In the formula, j=...
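The weight formula itself is not reproduced in this text, so the following is only a sketch of the general idea in step (1): measure each base classifier's error rate ε and derive an initial weight ω from it. The log-odds mapping used here is an assumption borrowed from common weighted-ensemble practice, not the patent's formula, and the names `error_rate` and `initial_weight` are hypothetical.

```python
import math


def error_rate(predictions, labels):
    """Fraction of samples a base classifier labels incorrectly (epsilon)."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)


def initial_weight(eps, floor=1e-6):
    """ASSUMED mapping from error rate to weight: log((1 - eps) / eps).

    Lower error gives a larger weight; eps = 0.5 gives weight 0.
    eps is clipped to [floor, 1 - floor] to avoid division by zero.
    """
    eps = min(max(eps, floor), 1.0 - floor)
    return math.log((1.0 - eps) / eps)


eps = error_rate([0, 1, 1, 0], [0, 1, 0, 0])   # one of four predictions is wrong
omega = initial_weight(eps)
```

Any monotonically decreasing function of ε would serve the same illustrative purpose; the source's own formula should be preferred wherever it is available.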

Abstract

The invention relates to a data stream classification algorithm based on AAE-DWMIL-LearnNSE. It is a method for classifying data streams in the field of data mining and machine learning, and comprises the following steps: (1) updating the weight of each classifier; (2) weighting the base classifiers; (3) combining the weighted base classifiers into an ensemble classifier; (4) calculating the classification error rate on the new data set; (5) creating a comprehensive prediction model; and (6) determining the data stream classification function and classifying. The method addresses the loss of classification accuracy caused by the lack of selective choice of training samples. It omits the step of storing old data samples, which frees most of the storage space for new data, while making full use of the old classification models to obtain high classification accuracy. It also effectively solves the problem of measuring the similarity between labeled training samples in the source domain and unlabeled test samples in the target domain.
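Steps (3) and (4) of the abstract, combining weighted base classifiers and measuring the error rate on new data, can be illustrated with a generic weighted majority vote. This is a sketch under assumptions: the abstract does not give the combination rule, and `weighted_vote` and `ensemble_error` are hypothetical names introduced for the example.

```python
from collections import defaultdict


def weighted_vote(votes, weights):
    """Return the label with the largest total weight.

    votes[k] is the label predicted by base classifier k,
    weights[k] is that classifier's weight.
    """
    scores = defaultdict(float)
    for label, weight in zip(votes, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


def ensemble_error(per_classifier_preds, weights, labels):
    """Error rate of the weighted ensemble on a new batch.

    per_classifier_preds[k][i] is classifier k's prediction for sample i.
    """
    wrong = 0
    for i, true_label in enumerate(labels):
        votes = [preds[i] for preds in per_classifier_preds]
        if weighted_vote(votes, weights) != true_label:
            wrong += 1
    return wrong / len(labels)


preds = [[0, 1, 1], [1, 1, 0], [1, 0, 0]]   # three classifiers, three samples
err = ensemble_error(preds, [2.0, 0.5, 0.5], [0, 1, 0])
```

In this toy batch the heavily weighted first classifier overrides the other two on the last sample, so the ensemble misclassifies one of the three samples.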

Description

Technical field

[0001] The invention relates to the fields of data mining and machine learning, and mainly to a method for identifying and classifying data streams.

Background

[0002] At present, data streams are mainly classified with traditional or improved data mining algorithms. The general processing flow requires a large number of labeled samples to train the classification model. Furthermore, because of some inherent characteristics of data streams, traditional data stream classification methods are prone to difficult sample labeling and frequent concept drift. Traditional incremental and ensemble data stream classification methods achieve good results in some fields, but their classification performance gradually degrades as the amount of data increases, and the sharp growth in the number of base classifiers causes a parameter explosion. In addition, the d...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/906
CPC: G06F16/906
Inventors: 赵兴昊, 王松, 胡燕祝
Owner: BEIJING UNIV OF POSTS & TELECOMM