Log event extraction method and system based on log tree and parse tree

An extraction method and system, applied in the field of log analysis and event extraction, that addresses the problems of ignored internal structure information, static fields wrongly extracted as parameters, and overly coarse event granularity. Its effects are improved log preprocessing efficiency, reduced manual workload, and reduced matching time.

Pending Publication Date: 2021-11-09
NANJING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

However, existing event extraction methods formulate heuristic rules from global structural information, such as the number of log fields, or from local content information, such as log field categories, and ignore the hidden internal structural information between fields.
After a log is matched to a log cluster, these methods compare the fields of the log and the event, replace the fields whose values differ with wildcards, and extract those values as parameters to update the event. This approach also ignores the internal structure information of the original log, so static fields are easily extracted as parameters by mistake and the event granularity becomes too coarse.



Examples


Embodiment Construction

[0018] As shown in Figure 1, the log event extraction method based on a log tree and a parse tree of the present invention comprises the following steps:

[0019] Step 1: Automatic identification of the log format based on the rule base. In this step, a small portion of the logs is sampled, and the heuristic rules and regular expressions in the rule base are applied to it to generate a log format automatically.

[0020] (1) Use spaces to split fields and perform an alignment operation on the split fields;

[0021] (2) According to rule 1, use the regular expressions in the rule base to replace identifiable fields. According to rule 2, compute the column complexity of each column whose number of fields equals the number of rows, traversing the fields of the column. The formula is as follows:

[0022]

[0023] In the formula, when every field in a column has the same value, the column complexity is 0; otherwise, the column complexity is the number of different values the column takes.

[0024] (3) ...
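
Sub-steps (1) and (2) above can be illustrated with a minimal Python sketch. It is not the patented implementation: the rule-base patterns, the placeholder names, padding short rows with `None` as the "alignment operation", and every function name are assumptions made for illustration. The column complexity follows the verbal definition in [0023], and a column is scored only when it is populated in every row, which is one reading of "the number of column fields is equal to the number of rows".

```python
import re

# Hypothetical rule base: (placeholder, pattern) pairs. The actual rules are
# not given in this text, so these patterns are assumptions.
RULE_BASE = [
    ("<IP>", re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")),
    ("<TIME>", re.compile(r"^\d{2}:\d{2}:\d{2}$")),
    ("<NUM>", re.compile(r"^\d+$")),
]


def split_and_align(lines):
    """Sub-step (1): split on whitespace, then pad rows to a common width."""
    rows = [line.split() for line in lines]
    width = max((len(r) for r in rows), default=0)
    return [r + [None] * (width - len(r)) for r in rows]


def mask_fields(rows):
    """Rule 1: replace fields matched by a rule-base regex with a placeholder."""
    def mask(field):
        if field is None:
            return None
        for placeholder, pattern in RULE_BASE:
            if pattern.match(field):
                return placeholder
        return field
    return [[mask(f) for f in row] for row in rows]


def column_complexity(rows):
    """Rule 2: 0 for a constant column, else the number of distinct values.

    Columns that are not populated in every row are reported as None.
    """
    result = []
    for column in zip(*rows):
        if any(f is None for f in column):
            result.append(None)
            continue
        distinct = len(set(column))
        result.append(0 if distinct == 1 else distinct)
    return result


sample = [
    "10:15:02 INFO 10.0.0.1 connected",
    "10:15:07 WARN 10.0.0.2 connected",
    "10:16:11 INFO 10.0.0.3 closed early",
]
rows = mask_fields(split_and_align(sample))
complexities = column_complexity(rows)
print(complexities)  # [0, 2, 0, 2, None]
```

In this sample the timestamp and IP columns collapse to a single placeholder and so get complexity 0, while the level and status columns each take two values. The fifth column exists in only one row and is left unscored.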


Abstract

The invention discloses a log event extraction method and system based on a log tree and a parse tree. The method consists of two stages, preprocessing and log content parsing. Specifically, it provides and maintains a rule base composed of regular expressions and heuristic rules, and samples a small portion of the logs to generate a log format automatically; splits each log online into a log header and log content based on that format; searches the parse tree and computes the similarity of the static fields and of the dynamic parameters between the log tree and the event tree, using the longest common substring and the longest common subvector respectively; and matches the log tree against the event tree with a clustering technique to extract events and their corresponding parameters. To cope with the complexity of log content, the method improves the preprocessing and log content parsing steps of online event extraction. It reduces the manual work of identifying log formats, addresses the difficulty existing methods have in identifying events that contain an uncertain number of parameters, and extracts log events more accurately.
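
As a rough illustration of the longest-common-substring similarity mentioned in the abstract, the Python sketch below compares two token sequences. It is only a sketch under stated assumptions: treating a log's static fields as a list of tokens, normalising by the longer sequence, and the function names are all hypothetical, and the "longest common subvector" measure is not defined in this text and is not covered.

```python
def longest_common_substring(a, b):
    """Length of the longest contiguous run shared by sequences a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def similarity(a, b):
    """Assumed normalisation: shared run length over the longer length."""
    longest = max(len(a), len(b))
    return longest_common_substring(a, b) / longest if longest else 1.0


log_tokens = "Connection from <IP> closed".split()
event_tokens = "Connection from <IP> opened".split()
score = similarity(log_tokens, event_tokens)
print(score)  # 0.75
```

The two sequences share the three-token run `Connection from <IP>` out of four tokens, giving 0.75. A clustering step could compare such a score with a threshold to decide whether a log matches an existing event. The threshold is not specified here.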

Description

Technical field

[0001] The invention belongs to the field of log analysis and event extraction, and in particular relates to a method and system for extracting log events based on log trees and parse trees.

Background

[0002] With the rise of Internet technology, the scale of computing and communication infrastructure has expanded, and large-scale distributed systems have emerged. A log is text data generated by the print statements embedded in a program, recording the current operating status and behavior of the system. During system development and maintenance, domain experts analyze logs to carry out real-time monitoring, anomaly detection, and fault prediction and diagnosis. The scale of logs is expanding rapidly, and manual log analysis alone makes it difficult to identify new anomalies efficiently in a rapidly updated system and to eliminate system failures effectively. Therefore, log analysis has gradually changed from offli...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/18, G06F16/14
CPC: G06F16/1815, G06F16/148
Inventors: 傅媛媛, 徐建
Owner: NANJING UNIV OF SCI & TECH