Non-artificial access log filtering method and device

A log filtering method, applied to the field of filtering non-human access logs, that addresses the interference non-human access causes in user behavior analysis despite its low analysis value, and improves the efficiency of log mining.

Inactive Publication Date: 2017-12-15
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT

AI Technical Summary

Problems solved by technology

Non-human access has low analysis value but occurs at a huge scale, so it may cause unnecessary interference with normal network log mining and user behavior analysis.


Examples


Example 1

[0053] Figure 1 is a flowchart of this embodiment. As shown in Figure 1, the first embodiment of the present invention provides a method for filtering non-human access logs, the method comprising:

[0054] S1: Filter out the access logs that meet the preset conditions to obtain the first standard log;

[0055] S2: Filter out access logs within a predetermined time period based on the first standard log to obtain a second standard log;

[0056] S3: Obtain the URL prefix in the log from the first standard log to obtain a prefix set;

[0057] S4: Filter the second standard log according to the prefix set to obtain a filter result log.
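Steps S1 to S4 can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation. The text above does not define the preset conditions, the predetermined time period, or the URL prefix, so the following are all assumptions made for illustration: the preset condition is "the URL path ends with a static-resource suffix", the predetermined period is 02:00 to 04:59, the prefix is the host plus the directory part of the path, and the prefix set in S3 is collected from those entries of the first standard log that fall inside the predetermined period. The names `LogEntry`, `STATIC_SUFFIXES`, `QUIET_HOURS`, `url_prefix` and `filter_non_human` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import urlsplit


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical, simplified access log record."""
    timestamp: datetime
    client: str
    url: str


# Assumed preset condition: requests for static resources.
STATIC_SUFFIXES = (".js", ".css", ".png", ".jpg", ".gif", ".ico")
# Assumed predetermined time period: 02:00 to 04:59.
QUIET_HOURS = range(2, 5)


def url_prefix(url: str) -> str:
    """Assumed prefix definition: host plus the directory part of the path."""
    parts = urlsplit(url)
    directory = parts.path.rsplit("/", 1)[0]
    return f"{parts.netloc}{directory}/"


def meets_preset_condition(entry: LogEntry) -> bool:
    return urlsplit(entry.url).path.lower().endswith(STATIC_SUFFIXES)


def in_period(entry: LogEntry) -> bool:
    return entry.timestamp.hour in QUIET_HOURS


def filter_non_human(logs):
    # S1: drop entries that meet the preset condition -> first standard log.
    first = [e for e in logs if not meets_preset_condition(e)]
    # S2: drop entries inside the predetermined period -> second standard log.
    second = [e for e in first if not in_period(e)]
    # S3 (assumption): collect prefixes from first-standard-log entries
    # that fall inside the predetermined period.
    prefixes = {url_prefix(e.url) for e in first if in_period(e)}
    # S4: drop second-standard-log entries whose prefix is in the set.
    return [e for e in second if url_prefix(e.url) not in prefixes]
```

Under these assumptions, a URL prefix that is requested during the quiet hours is treated as non-human, and daytime requests sharing that prefix are removed from the result log.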

[0058] In this regard, high-frequency non-human access can be quickly and effectively filtered, which is of great significance for improving log mining efficiency, analyzing user behavior, and even detecting internal security threats.

[0059] Specifically, a method for filtering non-human access logs provided in the first embodim...

Example 2

[0097] Figure 2 is a flowchart of this embodiment. As shown in Figure 2, the second embodiment of the present invention provides a device for filtering non-human access logs, the device comprising:

[0098] The first filtering module is used to filter out the access logs meeting the preset conditions to obtain the first standard log;

[0099] A second filtering module, configured to filter out access logs within a predetermined period of time based on the first standard log to obtain a second standard log;

[0100] A collection module, configured to obtain URL prefixes in logs from the first standard log to obtain a prefix collection;

[0101] The third filtering module is configured to filter the second standard log according to the prefix set to obtain a filtered result log.
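The module structure above could be modeled as one class with a method per module. The sketch below is an assumption-laden illustration, not the device described in the patent: the class name `NonHumanLogFilterDevice`, the `Record` type (hour of day plus URL), and the three injected callables are hypothetical, and the collection module is assumed to gather prefixes only from first-standard-log records that fall inside the predetermined period.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical simplified record: (hour of day, URL without scheme).
Record = Tuple[int, str]


class NonHumanLogFilterDevice:
    """Illustrative composition of the four modules; all names are assumed."""

    def __init__(
        self,
        preset_condition: Callable[[Record], bool],
        in_period: Callable[[Record], bool],
        prefix_of: Callable[[Record], str],
    ) -> None:
        self.preset_condition = preset_condition
        self.in_period = in_period
        self.prefix_of = prefix_of

    def first_filtering_module(self, logs: Iterable[Record]) -> List[Record]:
        # Drop records that meet the preset condition.
        return [r for r in logs if not self.preset_condition(r)]

    def second_filtering_module(self, first: List[Record]) -> List[Record]:
        # Drop records inside the predetermined period.
        return [r for r in first if not self.in_period(r)]

    def collection_module(self, first: List[Record]) -> Set[str]:
        # Assumption: prefixes come from in-period records of the first log.
        return {self.prefix_of(r) for r in first if self.in_period(r)}

    def third_filtering_module(
        self, second: List[Record], prefixes: Set[str]
    ) -> List[Record]:
        # Drop records whose prefix appears in the prefix set.
        return [r for r in second if self.prefix_of(r) not in prefixes]

    def run(self, logs: Iterable[Record]) -> List[Record]:
        first = self.first_filtering_module(logs)
        second = self.second_filtering_module(first)
        prefixes = self.collection_module(first)
        return self.third_filtering_module(second, prefixes)
```

Passing the conditions in as callables keeps the sketch neutral about what the preset conditions and the predetermined period actually are, since the text above does not specify them.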

[0102] Optionally, the third filter module includes:

[0103] Use a hash table to traverse the URL prefix in each log of the prefix set, and use a classic linked list de...

Example 3

[0111] A computer device includes a processor and a memory; the memory is used to store computer instructions, and the processor is used to run the computer instructions stored in the memory to implement the above-mentioned method for filtering non-human access logs.

[0112] The method includes:

[0113] Filter out the access logs that meet the preset conditions to obtain the first standard log;

[0114] Filter out access logs within a predetermined period of time based on the first standard log to obtain a second standard log;

[0115] Obtain the URL prefix in the log from the first standard log to obtain a prefix set;

[0116] The second standard log is filtered according to the prefix set to obtain a filter result log.

[0117] Optionally, the filtering the second standard log according to the prefix set to obtain the filtered result log includes: using a hash table to traverse the URL prefix in each log of the prefix set, and using a classic linked list method Solve...
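The paragraph above is truncated in the source, so the exact role of the linked list is not stated. Assuming it refers to separate chaining, where each hash bucket holds a linked list of colliding keys, a prefix set with membership lookup might be sketched as follows. The class names `_Node` and `ChainedPrefixSet` and the default bucket count are hypothetical, and Python's built-in `hash` stands in for whatever hash function the patent uses.

```python
class _Node:
    """Singly linked list node holding one URL prefix."""
    __slots__ = ("key", "next")

    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class ChainedPrefixSet:
    """Hash set using separate chaining (assumed reading of the source)."""

    def __init__(self, bucket_count=64):
        self._buckets = [None] * bucket_count
        self._size = 0

    def _index(self, key):
        return hash(key) % len(self._buckets)

    def add(self, key):
        """Insert key; return False if it was already present."""
        i = self._index(key)
        node = self._buckets[i]
        while node is not None:
            if node.key == key:
                return False
            node = node.next
        # Prepend to the bucket's chain; colliding keys share one list.
        self._buckets[i] = _Node(key, self._buckets[i])
        self._size += 1
        return True

    def __contains__(self, key):
        node = self._buckets[self._index(key)]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def __len__(self):
        return self._size
```

With such a set, checking whether a log's URL prefix belongs to the prefix set takes roughly constant time on average, provided the chains stay short.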


Abstract

The invention discloses a method and device for filtering non-human access logs. The method comprises the following steps: access logs matching a preset condition are filtered out to obtain first standard logs; based on the first standard logs, access logs within a preset time period are filtered out to obtain second standard logs; URL prefixes are obtained from the first standard logs to form a prefix set; and the second standard logs are filtered according to the prefix set to obtain a filtering result log. With this method and device, high-frequency non-human access can be filtered quickly and effectively, which is of great significance for improving log mining efficiency, analyzing user behavior, and detecting internal security threats.

Description

Technical field

[0001] The invention relates to the field of access log filtering, in particular to a method and device for filtering non-human access logs.

Background art

[0002] The gateway, located between the client and the server, has rich logs covering many users and web pages. Its position means that its logs record a large number of users and the web pages they visit. It therefore provides very convenient conditions for studying large-scale user access behavior, web page classification tasks, and mining user click data.

[0003] However, gateway logs are complex and changeable and cover a large number of web pages, so it is necessary to distinguish between human visits and non-human visits. Non-human access refers to access behavior that is not actively generated by a person, such as automatic software updates and automatic advertisement pop-ups. Because non-human ...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F17/30
CPC: G06F16/1815, G06F16/9535, G06F16/955
Inventor: 李鹏霄, 杜翠兰, 任彦, 刘晓辉, 查奇文, 易立, 柳毅, 李睿, 程光
Owner: NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT