Spectral clustering algorithm parallelization method in abnormal data detection and system

A technology for abnormal data detection using a spectral clustering algorithm, applied in the fields of digital data information retrieval, file systems, and file system types. It addresses the low execution efficiency of the spectral clustering algorithm and the inability of stand-alone storage systems to meet the storage requirements of massive data, achieving low latency, reduced difficulty, and improved computing efficiency.

Pending Publication Date: 2021-06-18
WUHAN UNIV
3 Cites, 8 Cited by
AI Technical Summary

Problems solved by technology

[0005] A stand-alone storage system cannot meet the storage requirements of massive data, and the spectral clustering algorithm executes inefficiently on massive data. To solve these problems, the present invention proposes a parallelization method and system for the spectral clustering algorithm in abnormal data detection.

Examples

Embodiment Construction

[0033] To facilitate understanding, the present invention is described below with reference to the accompanying drawings and examples. The embodiments described herein are intended to illustrate and explain the present invention.

[0034] Referring to Figure 1, the present invention provides a method for parallelizing the spectral clustering algorithm in abnormal data detection, comprising the following steps:

[0035] Step 1: Store the data set samples to be clustered in a distributed manner;

[0036] In this embodiment, the data set samples to be clustered are divided into several data blocks, and the data blocks are abstracted into RDD objects. The RDDs are assigned to several worker nodes in the Spark cluster and stored in the open-source distributed file system HDFS.
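The block division and node assignment in paragraph [0036] can be illustrated with a minimal sketch in plain Python. It is not the Spark or HDFS API, and it is not taken from the patent: the function names `split_into_blocks` and `assign_blocks`, the fixed `block_size`, the round-robin placement, and the worker names are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_blocks(samples: Sequence[T], block_size: int) -> List[List[T]]:
    """Divide the sample set into consecutive blocks of at most block_size items."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [list(samples[i:i + block_size])
            for i in range(0, len(samples), block_size)]


def assign_blocks(blocks: List[List[T]], workers: Sequence[str]) -> Dict[str, List[List[T]]]:
    """Assign blocks to workers round-robin (an assumed placement policy)."""
    if not workers:
        raise ValueError("at least one worker is required")
    placement: Dict[str, List[List[T]]] = {w: [] for w in workers}
    for index, block in enumerate(blocks):
        placement[workers[index % len(workers)]].append(block)
    return placement


# Hypothetical data: ten sample ids spread over two hypothetical worker nodes.
samples = list(range(10))
blocks = split_into_blocks(samples, block_size=3)
placement = assign_blocks(blocks, ["worker-0", "worker-1"])
```

In a real Spark deployment the framework decides partitioning and placement itself; the sketch only makes the "divide into blocks, then distribute" idea concrete.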

[0037] Figure 2 shows the detailed process of distributed data storage. HDFS contains a NameNode and several DataNodes (data nodes)...

Abstract

The invention discloses a spectral clustering algorithm parallelization method and system. The method first stores the data set samples to be clustered in a distributed manner, then constructs the similarity matrix of the samples in parallel, computes the Laplacian matrix of the similarity matrix in parallel, and computes the eigenvectors of the Laplacian matrix in parallel to obtain an eigenvector matrix of dimension n*d. Finally, it executes the K-means clustering algorithm in parallel. Experimental results show that when the method is used for clustering analysis of massive log data, the execution efficiency of the algorithm is remarkably improved while a good clustering effect is ensured.
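The first two computational steps of the abstract, building the similarity matrix in row blocks and deriving a Laplacian from it, can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: the Gaussian kernel, the `sigma` parameter, the unnormalized Laplacian L = D - W, and the use of a thread pool in place of Spark workers are all assumptions, since the source does not specify them here. The threads show only how the work decomposes by row block, and are not expected to give a real speed-up in CPython.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = List[List[float]]


def similarity_rows(points: Sequence[Vector], start: int, stop: int, sigma: float) -> Matrix:
    """Compute rows start..stop-1 of a Gaussian similarity matrix (assumed kernel)."""
    rows: Matrix = []
    for i in range(start, stop):
        row = []
        for j, q in enumerate(points):
            if i == j:
                row.append(0.0)  # no self-similarity on the diagonal
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], q))
                row.append(math.exp(-d2 / (2.0 * sigma ** 2)))
        rows.append(row)
    return rows


def similarity_matrix(points: Sequence[Vector], sigma: float = 1.0, n_blocks: int = 2) -> Matrix:
    """Build the full matrix by computing independent row blocks concurrently."""
    if n_blocks <= 0:
        raise ValueError("n_blocks must be positive")
    n = len(points)
    step = max(1, math.ceil(n / n_blocks))
    ranges: List[Tuple[int, int]] = [(s, min(s + step, n)) for s in range(0, n, step)]
    with ThreadPoolExecutor() as pool:
        # map() yields results in submission order, so row order is preserved.
        parts = pool.map(lambda r: similarity_rows(points, r[0], r[1], sigma), ranges)
        return [row for part in parts for row in part]


def laplacian(w: Matrix) -> Matrix:
    """Unnormalized graph Laplacian L = D - W, where D holds the row sums of W."""
    n = len(w)
    degrees = [sum(row) for row in w]
    return [[(degrees[i] if i == j else 0.0) - w[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: two nearby points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
w = similarity_matrix(points, sigma=1.0, n_blocks=2)
lap = laplacian(w)
```

Each row block depends only on the input points, which is what makes this step easy to distribute. The eigenvector and K-means steps of the abstract are not covered by this sketch.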

Description

Technical field

[0001] The present invention belongs to the technical field of computer software. It relates to a parallelization method and system for the spectral clustering algorithm, and specifically to an abnormal data detection method and system.

Background technique

[0002] Software in large systems produces very large volumes of data during actual operation, sometimes reaching the TB or even PB scale. With so much data to be processed and generated, failures in large systems have become inevitable. The fault log records the relevant information about system failures. As the system scale gradually expands, the size of the log grows exponentially and the types of log become increasingly complex. Once the computer system has a performance failure, the fault must be fixed. Otherwise, it will affect normal social life and cause huge economic losses, and in severe cases it may also affect social stability.

[0003] When the large computer system fails, how to disti...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/18, G06F16/182, G06F16/172, G06K9/62
CPC: G06F16/1815, G06F16/182, G06F16/172, G06F18/213, G06F18/23213, G06F18/22
Inventor: 应时, 周慧敏, 成海龙, 段晓宇
Owner: WUHAN UNIV