Bayes classifier based on pattern discovery in data flow

A Bayesian classifier and Bayesian classification technology, applied in electrical digital data processing, character and pattern recognition, special data processing applications, etc.

Inactive Publication Date: 2017-01-25
XINYANG NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

Unlike classification models built on static data sets, pattern-based data stream classifiers need to deal with the following problems:
(1) A data stream algorithm can only access one segment of the stream at any time, so it cannot scan all the data multiple times.
(2) Compared with algorithms for static data sets, it is difficult for algorithms on data streams to determine the completeness of the frequent patterns mined.
(3) Because the data stream is high-speed and unbounded, the algorithm must complete its processing within limited time and memory.
(4) The distribution of data in the stream may change, and the algorithm must be able to adapt quickly to such changes.



Embodiment Construction

[0038] The technical solutions of the present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0039] PDSB is a semi-lazy classifier: during the training phase it builds a dense representation of the data in the form of items, and when a classification request arrives it builds a specific classification model for the instance to be classified.

[0040] (1) Create a product approximation in the data stream

[0041] PDSB uses frequent patterns extracted in the data stream to estimate Bayesian probabilities, and builds product approximations according to conditional independence models under the assumption of attribute independence.

[0042] 1) The extracted itemset determines the structure of the product approximation.

[0043] For example, a given data stream DS contains attributes A_1, A_2, A_3, A_4, A_5 and class attribute C. c_i is any class attribute value, T = {a_1, a_2, ..., a_5} is the instance to be c...
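The idea of a product approximation whose structure is determined by the chosen itemsets can be illustrated with a minimal sketch. It is not the patent's procedure: the excerpt does not show how PDSB selects or orders itemsets, so the itemset list is supplied by hand, the counts come from a small in-memory list instead of mined supports, and the names `count`, `product_approximation`, and `data` are hypothetical. Each itemset contributes one factor, the probability of its not-yet-covered items given its already-covered items and the class.

```python
def count(data, items, label):
    """Count records of class `label` whose item set contains all of `items`."""
    return sum(1 for inst, c in data if c == label and items <= inst)


def product_approximation(data, itemsets, label):
    """Estimate P(T, label) as P(label) times one conditional factor per itemset.

    Assumption: `itemsets` is an ordered list of itemsets that together cover
    the instance T; how such a list is chosen is outside this sketch.
    """
    n_label = count(data, frozenset(), label)
    if n_label == 0:
        return 0.0
    score = n_label / len(data)  # P(label)
    covered = set()
    for itemset in itemsets:
        overlap = frozenset(itemset & covered)  # items already accounted for
        denom = count(data, overlap, label)
        if denom == 0:
            return 0.0
        # P(new items | overlapping items, label), estimated from counts
        score *= count(data, frozenset(itemset), label) / denom
        covered |= itemset
    return score


# Toy labeled records (illustrative data, not from the patent).
data = [
    (frozenset({"a1", "a2", "a3"}), "c1"),
    (frozenset({"a1", "a2"}), "c1"),
    (frozenset({"a1", "a3"}), "c1"),
    (frozenset({"a2", "a3"}), "c2"),
]
itemsets = [{"a1", "a2"}, {"a1", "a3"}]  # covers T = {a1, a2, a3}
scores = {c: product_approximation(data, itemsets, c) for c in ("c1", "c2")}
predicted = max(scores, key=scores.get)
```

With these itemsets the estimate for a class c is P(c) · P(a1, a2 | c) · P(a3 | a1, c); a different itemset list would yield a different factorization of the same joint probability.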


Abstract

The invention belongs to the technical field of data mining and provides a Bayes classifier based on pattern discovery in data streams. The method consists mainly of a pattern discovery stage and a classifier construction stage. To build a pattern-based Bayes classification model on a data stream, a single-pass algorithm, FFI, is provided for mining frequent itemsets over a continuous data stream using a sliding window model. The method performs well in both running time and classification accuracy, and adapts better to the dynamic data stream environment.
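The sliding window model mentioned in the abstract can be sketched as follows. This is not the FFI algorithm, whose details are not given in this excerpt; it is a naive counter, written under the assumption that itemsets are enumerated exhaustively up to a small length, and the class name `SlidingWindowItemsets` and its parameters are hypothetical. It only shows the single-pass property: each transaction is processed once on arrival and once on expiry, and counts always reflect the most recent window.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowItemsets:
    """Naive itemset counter over the most recent `window_size` transactions."""

    def __init__(self, window_size, min_support, max_len=2):
        self.window_size = window_size
        self.min_support = min_support  # fraction of the current window
        self.max_len = max_len          # longest itemset that is tracked
        self.window = deque()
        self.counts = Counter()

    def _itemsets(self, transaction):
        """Yield every itemset of length 1..max_len in the transaction."""
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for combo in combinations(items, k):
                yield frozenset(combo)

    def add(self, transaction):
        """Process one arriving transaction; expire the oldest if needed."""
        self.window.append(transaction)
        self.counts.update(self._itemsets(transaction))
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(self._itemsets(expired))
            # Remove entries that dropped to zero so memory stays bounded.
            for itemset in self._itemsets(expired):
                if self.counts[itemset] <= 0:
                    del self.counts[itemset]

    def frequent(self):
        """Return itemsets whose count meets the support threshold."""
        threshold = self.min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c >= threshold}
```

Exhaustive enumeration grows quickly with transaction length, which is why the sketch caps itemset size with `max_len`; a practical stream miner would need a more compact structure.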

Description

Technical field

[0001] The invention belongs to the technical field of data mining, in particular to a Bayesian classifier based on pattern discovery in data streams.

Background technique

[0002] Most current pattern-based Bayesian classification models target static data sets and usually cannot adapt to high-speed, dynamically changing, unbounded data stream environments. To address this, a Bayesian classification learning model based on pattern discovery in the data stream environment is proposed. Classification builds a model from existing data that maps the data records in a database to one of a given set of categories, so that the model can be used for prediction. The Bayesian classifier is a classifier that has been studied extensively. One of the difficulties in constructing a Bayesian classification model is the calculation of the joint probability in Bayesian theory, which usually requir...


Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/2465, G06F16/254, G06F18/24155
Inventor: 孙艳歌, 邵罕, 李艳灵, 李刚, 李然, 郭华平
Owner: XINYANG NORMAL UNIVERSITY