A Log Clustering Method Based on Graph Structure

A clustering method and graph structure technology, applied in the field of text clustering, that addresses three problems: the number of log categories cannot be identified automatically, the amount of computation is large, and clustering cannot guarantee the true number of categories.

Active Publication Date: 2019-11-19

NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT

Cites: 6; Cited by: 0

AI Technical Summary

Problems solved by technology

Traditional clustering algorithms cannot meet the needs of massive log clustering.

For example, the traditional K-Means and K-Medoid clustering algorithms require the number of clusters to be specified in advance and cannot automatically identify the appropriate number of categories for logs.

To obtain a good clustering result, the traditional Denclue clustering algorithm needs repeated experiments to arrive at the appropriate number of clusters. Its parameters are difficult to control, the amount of computation is too large, and the clustering cannot guarantee the real number of categories.

Embodiment Construction

[0043] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

[0044] As shown in Figure 1, a log clustering method based on a graph structure includes: clustering logs based on text word segmentation, vector similarity, and the maximum connected subgraph to obtain a feature library; and labeling massive logs by category according to the category features in the feature library.
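The clustering idea in [0044] can be sketched as follows. This is a minimal illustration, not the patented procedure: the tokenizer, the bag-of-words vectors, the cosine measure, the `threshold` value, and all function names are assumptions, since the visible text does not specify them. Logs whose similarity reaches the threshold are joined by an edge, and each connected group of the resulting graph becomes one cluster, so the number of clusters is never supplied by hand.

```python
import math
import re
from collections import Counter


def tokenize(message):
    # Assumption: keep lowercase alphabetic words only, so variable parts
    # such as IP addresses and counters do not separate similar logs.
    return re.findall(r"[a-z]+", message.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words vectors (Counter objects).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_logs(messages, threshold=0.6):
    # Returns clusters as sorted lists of message indices.
    vectors = [Counter(tokenize(m)) for m in messages]
    parent = list(range(len(messages)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add an edge (merge two groups) whenever two logs are similar enough.
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    # Each union-find root identifies one connected group of the graph.
    groups = {}
    for i in range(len(messages)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

The pairwise comparison is quadratic in the number of logs, so a sketch like this would only be practical on a sample; how the source keeps the computation manageable on massive logs is not visible in this excerpt.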

[0045] 1. Obtaining the feature library includes the following steps:

[0046] (1) Structure the original log to generate structured log data. This includes inputting the original log, structuring the semi-structured original log by column, and outputting the structured log data.

[0047] For example, Linux syslog logs take the form shown in Table 1.1 and are structured by column into fields such as Timestamp, Level, Source, and Message. The original syslog becomes the format in Ta...
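As a rough illustration of the structuring step, the sketch below splits one log line into the four fields named above. Table 1.1 is not visible in this excerpt, so the line layout (`<timestamp> <level> <source>: <message>`), the regular expression, and the name `structure_line` are all hypothetical.

```python
import re

# Hypothetical layout: "<Mon DD HH:MM:SS> <LEVEL> <source>: <message>".
# The real column layout is defined in Table 1.1, which is not shown here.
LINE_PATTERN = re.compile(
    r"^(?P<Timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s+"
    r"(?P<Level>[A-Z]+)\s+"
    r"(?P<Source>[^:]+):\s*"
    r"(?P<Message>.*)$"
)


def structure_line(line):
    # Returns a dict with the four fields, or None if the line does not match.
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None
```

Returning `None` for unmatched lines lets a caller count or set aside records that do not fit the assumed layout instead of silently mis-parsing them.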

Abstract

The invention provides a log clustering method based on a graph structure. The method comprises the following steps of clustering logs based on text segmentation, vector similarity and a maximum connected sub-graph in order to obtain a feature library; and carrying out class labelling on the massive logs according to the class features in the feature library. The method can automatically recognize the most appropriate class number in the massive logs without manually assigning the clustering number; in addition, the method can classify the logs precisely to lay a foundation for mining of massive log data.
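The labeling step in the abstract can be pictured with the sketch below. It assumes the feature library is a mapping from a category name to a set of characteristic words and that a log receives the category whose features it covers best; the source excerpt does not state how features are stored or matched, so the data shape, the coverage score, `min_score`, and the names `tokenize_set` and `label_log` are all illustrative.

```python
import re


def tokenize_set(message):
    # Assumption: lowercase alphabetic words, as a set.
    return set(re.findall(r"[a-z]+", message.lower()))


def label_log(message, feature_library, min_score=0.5):
    # feature_library: dict mapping category name -> set of feature words
    # (hypothetical shape). Returns the best category, or None when no
    # category covers at least min_score of its features.
    tokens = tokenize_set(message)
    best_label, best_score = None, 0.0
    for label, features in feature_library.items():
        if not features:
            continue
        # Fraction of the category's feature words present in the log.
        score = len(tokens & features) / len(features)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_score else None
```

Because each new log is compared only against the category features, the cost of labeling grows with the number of categories rather than with the number of logs already seen.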

Description

Technical Field

[0001] The invention relates to the field of text clustering, in particular to a log clustering method based on a graph structure.

Background Technique

[0002] With the rapid development of information technology and the continuous expansion of cluster scale, massive log data is generated, but there is no effective analysis and mining of log data. Log data records the operating information of the system, and mining log data is of great significance. For example, by analyzing log data, we can build an intelligent operation and maintenance system to complete functions such as fault location and fault warning. Accurate category labeling of logs is an important direction of log data mining. Based on this, we automatically identify the appropriate number of categories for logs by clustering massive logs. By extracting the features of each category, a log category feature library is generated, and new logs are marked according to the category of the feature libr...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06F16/35, G06F16/36

CPC: G06F16/355, G06F16/36

Inventor: 吕雁飞, 王树鹏, 张鸿, 丁煜, 樊冬进, 肖东方, 郑亚松, 周晓阳, 何慧虹, 史亮

Owner: NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT
