Information gain computation based mass data abnormality detecting method

A massive-data and information-gain technology, applied in computing, digital data processing, special data processing applications, etc., that addresses the problems of memory overflow and processing time overhead and achieves good detection results.

Status: Inactive. Publication Date: 2012-07-11
EAST CHINA NORMAL UNIV
1 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

[0004] The present invention overcomes the memory overflow and excessive processing time overhead that traditional prior-art algorithms incur when analyzing massive-scale data, and proposes a massive data anomaly detection method based on information gain calculation.
The invention processes massive data in two distinct stages: an offline stage and an online stage. Given that the data is large in scale while the memory of the computer system is relatively insufficient, this approach solves problems such as memory overflow and excessive processing time overhead that traditional algorithms cause when analyzing massive-scale data, thereby improving analysis performance.


Examples

Embodiment Construction

[0025] The present invention is described in further detail below with reference to specific examples and the accompanying drawings. The protection content of the present invention is not limited to the following examples. Changes and advantages conceivable by those skilled in the art that do not depart from the spirit and scope of the inventive concept are all included in the present invention, and the scope of protection is defined by the appended claims.

[0026] The massive data anomaly detection method based on information gain calculation of the present invention is carried out by a computer system using a hash table data structure, and comprises offline stage processing and online stage processing.

[0027] The online stage processing of the present invention, as shown in Figure 2, is an online data storage strategy based on a hash table. The basic data structure used is a hash table, and each item in the hash table has the structure (src, attrValue, FCount, SCount), wh...
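The hash table item structure named in [0027] can be sketched roughly as follows. The source text is truncated before it defines FCount and SCount, so this sketch only assumes they are two counters kept per (src, attrValue) pair, incremented according to which of two groups a record falls into. The function name `build_table`, the boolean `in_first_group` flag, and the sample records are all hypothetical and do not come from the patent.

```python
from collections import defaultdict


def build_table(records):
    """Build the hash table in a single pass over the records.

    Each record is (src, attr_value, in_first_group). The meaning of the
    flag is an ASSUMPTION: the source does not define FCount and SCount.
    """
    # Key: (src, attrValue); value: [FCount, SCount]
    table = defaultdict(lambda: [0, 0])
    for src, attr_value, in_first_group in records:
        counts = table[(src, attr_value)]
        if in_first_group:
            counts[0] += 1  # assumed role of FCount
        else:
            counts[1] += 1  # assumed role of SCount
    return table


# Hypothetical sample data, for illustration only
records = [
    ("10.0.0.1", "port=80", True),
    ("10.0.0.1", "port=80", True),
    ("10.0.0.1", "port=80", False),
    ("10.0.0.2", "port=22", False),
]
table = build_table(records)
```

Because every record touches exactly one table entry, memory use grows with the number of distinct (src, attrValue) pairs rather than with the number of records, which is consistent with the memory concern stated in [0004].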

Abstract

The invention discloses a massive data anomaly detection method based on information gain computation, which comprises an offline processing step and an online processing step. The offline processing step generates statistical information, and the online processing step uses that statistical information to quickly generate analysis results in an approximate manner. Accordingly, problems such as memory overflow and long, costly processing time that arise when conventional algorithms analyze massive data are solved, and analysis performance is improved.
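For orientation, information gain can be computed from precomputed counts alone, without revisiting the raw data. The sketch below uses the standard textbook definition (entropy of the class distribution minus the weighted entropy after splitting on an attribute); it is not the patent's approximate procedure, which this excerpt does not show. The names `entropy`, `information_gain`, and the two-class layout of `stats` are assumptions made for illustration.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(stats):
    """Standard information gain of an attribute over two classes.

    stats maps each attribute value to a pair of counts, one per class
    (an assumed layout for the offline statistics).
    """
    class_totals = [sum(pair[i] for pair in stats.values()) for i in range(2)]
    n = sum(class_totals)
    if n == 0:
        return 0.0
    # Weighted entropy of the class distribution within each attribute value
    conditional = sum((sum(pair) / n) * entropy(pair) for pair in stats.values())
    return entropy(class_totals) - conditional


# An attribute that separates the two classes perfectly
separating = information_gain({"a": (4, 0), "b": (0, 4)})
# An attribute that carries no information about the class
uninformative = information_gain({"a": (2, 2), "b": (2, 2)})
```

In this toy data the separating attribute yields a gain of one bit and the uninformative attribute yields zero, which is the behavior an anomaly detector ranking attributes by gain would rely on.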

Description

Technical field

[0001] The invention relates to a method for solving key classifications by traversing a database only once, and belongs to the technical field of data mining and knowledge discovery.

Background technique

[0002] Data mining technology obtains useful knowledge from various data collections. Since the mid-1990s, data mining technology has been applied in depth in many fields, such as finance, logistics, transportation, and scientific research. Typical data mining algorithms include classification, clustering, association rules, regression analysis, etc. Since the beginning of the 21st century, the scale of the data to be processed in many fields has grown ever larger, and the resulting problems often cannot be solved by directly applying traditional data mining algorithms. It is necessary to develop new algorithms and to improve certain key steps in order to solve them.

[0003] The present invention researches an anomaly detection technology for massive data, and its ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 金澈清, 张敬伟, 周傲英
Owner: EAST CHINA NORMAL UNIV