Word segmentation algorithm-based log parsing method and word segmentation algorithm-based log parsing system

A log parsing technology based on a word segmentation algorithm, applied in semantic analysis, computing, special data processing applications, etc. It addresses problems such as the low performance of regular expression chain matching and mismatches between regular expressions, reducing the difficulty and complexity of log parsing and improving efficiency.

Active Publication Date: 2015-03-04
HANGZHOU ANHENG INFORMATION TECH CO LTD
5 Cites, 32 Cited by

AI Technical Summary

Problems solved by technology

Approaches based on regular expressions tend to become a burden over time: a large number of regular expressions must be maintained, expressions can match logs they were not written for, and matching each log against a chain of regular expressions performs poorly.
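
To make the mismatch and chain-matching problems concrete, here is a minimal Python sketch of a regular expression chain. The patterns, rule names, and the `chain_match` helper are hypothetical and are not taken from the patent; they only illustrate how a broad pattern placed early in the chain can shadow a more specific one, and how the number of patterns tried grows with the length of the chain.

```python
import re

# Hypothetical rule chain; patterns and names are illustrative only.
REGEX_CHAIN = [
    ("generic", re.compile(r"(?P<time>\S+) (?P<msg>.+)")),
    ("login", re.compile(
        r"(?P<time>\S+) (?P<ip>\d+\.\d+\.\d+\.\d+) login (?P<port>\d+)")),
]

def chain_match(log):
    """Try each pattern in order and return the first one that matches.

    Returns (rule name, number of patterns tried, extracted fields).
    """
    for tried, (name, pattern) in enumerate(REGEX_CHAIN, start=1):
        m = pattern.fullmatch(log)
        if m:
            return name, tried, m.groupdict()
    return None, len(REGEX_CHAIN), {}

name, tried, fields = chain_match("12:30:01 10.0.0.5 login 22")
print(name, tried, fields)
```

In this sketch the login line is claimed by the `generic` rule, so the IP address and port are never extracted. Avoiding that requires careful ordering of every rule in the chain, which is the maintenance burden described above.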


Examples

Embodiment Construction

[0030] First, it should be noted that the present invention relates to log auditing and security management technology, and is an application of computer technology in the field of information security. Implementing the invention involves several software function modules. The applicant believes that a person skilled in the art who has carefully read the application documents and accurately understood the principle and purpose of the invention can fully implement it by combining existing known techniques with ordinary software programming skills. These software function modules include, but are not limited to, the word segmentation module, the word sense analysis module, the word sense filtering module, and the word order feature extraction module. All mentioned in the application documents of the present invention belong to this category, an...

Abstract

The invention relates to the technical field of log auditing and security management, and aims to provide a log parsing method and a log parsing system based on a word segmentation algorithm. The method comprises the following steps: segmenting a log into words; performing word sense analysis on the segmentation results; performing word sense filtering on the sense-tagged segmentation results; extracting features from the filtered, sense-tagged segmentation results; performing feature matching on the resulting word sense order feature codes; and performing semantic parsing with the semantic parsing rules obtained. The system comprises a segmentation module, a word sense analysis module, a word sense filtering module, a word order feature extraction module, a feature matching module, and a semantic parsing module. The method and system greatly reduce the difficulty and complexity of log parsing, which makes developing parsing rules for logs more efficient, and they adapt better to certain changes in log format.
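
The six steps named in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the sense tag set (`IP`, `TIME`, `NUM`, `WORD`, `OTHER`), whitespace segmentation, the `-`-joined feature code, and the `RULES` table are all hypothetical choices made for the example, since the excerpt does not specify them.

```python
import re

# Hypothetical word sense patterns; the actual tag set is an assumption.
SENSE_PATTERNS = [
    ("IP", re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")),
    ("TIME", re.compile(r"\d{2}:\d{2}:\d{2}")),
    ("NUM", re.compile(r"\d+")),
    ("WORD", re.compile(r"[A-Za-z_]+")),
]

# Hypothetical rule table: feature code -> field name for each kept token.
RULES = {
    "TIME-IP-WORD-NUM": ["time", "src_ip", "action", "port"],
}

def segment(log):
    """Step 1: split the log line into tokens (here: on whitespace)."""
    return log.split()

def tag_senses(tokens):
    """Step 2: attach a word sense tag to each token."""
    tagged = []
    for tok in tokens:
        sense = "OTHER"
        for name, pattern in SENSE_PATTERNS:
            if pattern.fullmatch(tok):
                sense = name
                break
        tagged.append((tok, sense))
    return tagged

def filter_senses(tagged):
    """Step 3: drop tokens whose sense carries no structure (OTHER)."""
    return [(tok, sense) for tok, sense in tagged if sense != "OTHER"]

def feature_code(tagged):
    """Step 4: the ordered sequence of sense tags forms the feature code."""
    return "-".join(sense for _, sense in tagged)

def parse(log):
    tagged = filter_senses(tag_senses(segment(log)))
    rule = RULES.get(feature_code(tagged))  # Step 5: feature matching
    if rule is None:
        return None
    # Step 6: semantic parsing maps each kept token to a named field.
    return {name: tok for name, (tok, _) in zip(rule, tagged)}

print(parse("12:30:01 10.0.0.5 -> login 22"))
print(parse("12:30:01 | 10.0.0.5 | login | 22"))
```

Both calls resolve to the same feature code because the separator tokens are filtered out before feature extraction. That is one plausible reading of how such a pipeline could tolerate certain changes in log format, as the abstract claims.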

Description

Technical field

[0001] The invention relates to the technical field of log auditing and security management, in particular to a log parsing method and system based on a word segmentation algorithm.

Background

[0002] Any program in a computer system may output logs: the operating system kernel, various application servers, and so on. Logs contain a great deal of information of interest to people (mainly security managers, operation and maintenance personnel, and business analysts), such as the visitor's IP, access time, source address, and the client information used by the visitor, which can be used to analyze user behavior characteristics, among other things.

[0003] Although these logs are so useful, analyzing them is not a simple problem. Logs contain thousands of possible formats and data, and "analysis" is even more difficult to define. It may be simple calculation of statistical values or complex data mining algorithms. Of course, there are countless ready-made tools that c...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/951, G06F40/30
Inventor: 谈修竹, 范渊
Owner: HANGZHOU ANHENG INFORMATION TECH CO LTD