
A method and device for analyzing web log URLs

An analysis method and device, applied in the field of data processing, that addresses the complexity of URL regular-expression matching and its large amount of computation and high computation cost, with the effect of lowering computation cost and using fewer resources.

Active Publication Date: 2017-05-31
TAOBAO CHINA SOFTWARE

AI Technical Summary

Problems solved by technology

[0008] The problem with the above prior art is that URL regular-expression matching is relatively complex and a large-scale Internet Web log contains a massive number of records. When multiple matching rules are applied in sequence to each of these URLs, one by one, the amount of computation is very large and the computation cost is high.
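The cost described in [0008] can be illustrated with a minimal sketch of the rule-by-rule approach. The rules, the URLs, and the name `match_all` are hypothetical and chosen only for illustration; they do not come from the patent text. The sketch counts one evaluation for every (record, rule) pair, so repeated URLs are matched again each time they appear.

```python
import re

# Hypothetical rules and URLs, used only to illustrate the cost.
RULES = [
    re.compile(r"/item/\d+"),
    re.compile(r"/search\?"),
    re.compile(r"/cart"),
]

def match_all(urls, rules):
    """Match every URL record against every rule in turn.

    Returns the matched rule numbers per record and the number of
    regular-expression evaluations that were performed.
    """
    evaluations = 0
    results = []
    for url in urls:
        matched = []
        for number, rule in enumerate(rules, start=1):
            evaluations += 1
            if rule.search(url):
                matched.append(number)
        results.append(matched)
    return results, evaluations

# Six log records, but only two distinct URLs.
urls = ["http://example.com/item/1"] * 4 + ["http://example.com/cart"] * 2
results, evaluations = match_all(urls, RULES)
print(evaluations)  # 6 records x 3 rules = 18 evaluations
```

With six records and three rules the loop performs 18 evaluations even though only two distinct URLs are present, which is the redundancy the application sets out to remove.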



Embodiment Construction

[0041] To make the above objects, features, and advantages of the present application clearer and easier to understand, the application is described in further detail below with reference to the accompanying drawings and specific embodiments.

[0042] Referring to Figure 1, which shows a flow chart of an embodiment of the web log URL analysis method of the present application, the method may include the following steps:

[0043] Step 101: extract the URLs from the web log.

[0044] A web log is a file ending in .log that records raw information such as the requests a web server receives and processes and its runtime errors; strictly speaking, it is a server log. The web log contains the URL of each web address that a visitor requests.
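Step 101 can be sketched as follows. The log line layout, the sample records, and the names `URL_PATTERN` and `extract_urls` are assumptions made for illustration, since the text does not specify a log format; the sketch simply takes the first `http://` or `https://` URL found on each line and skips lines that contain none.

```python
import re

# Assumption: each request record contains one full URL starting with
# http:// or https://, ending at whitespace or a double quote.
URL_PATTERN = re.compile(r"https?://[^\s\"]+")

def extract_urls(log_lines):
    """Return the first URL found on each log line, skipping lines without one."""
    urls = []
    for line in log_lines:
        found = URL_PATTERN.search(line)
        if found:
            urls.append(found.group(0))
    return urls

# Hypothetical log records, including one runtime message with no URL.
sample_log = [
    '203.0.113.7 [31/May/2017:10:00:00] "GET http://example.com/item/42?from=home" 200',
    '203.0.113.8 [31/May/2017:10:00:01] "GET http://example.com/cart" 200',
    "server restarted",
]

extracted = extract_urls(sample_log)
print(extracted)
```

A real server log would need a parser that matches its actual format; the regular expression here is only a stand-in for that step.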

[0045] The URL consists of three parts: protocol, domain name, and request address. A complete URL uniquely determines a requested resource, which can be a page, content module,...


Abstract

The invention provides a method and device for analyzing the URLs (Uniform Resource Locators) of a web log. The method comprises: extracting the URLs from the web log; deduplicating the URLs; matching each deduplicated URL against a plurality of preset regular expressions in turn and extracting the serial numbers of the regular expressions that match it; copying the serial numbers obtained for each deduplicated URL to the identical URLs as they were before deduplication, as their corresponding regular expression serial numbers; and compiling statistics on the regular expression serial numbers corresponding to the URLs before deduplication. The method and device reduce the amount of regular-expression matching computation and lower the computation cost.
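The steps in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the rules, the sample URLs, and the name `analyze` are hypothetical, serial numbers are taken to be 1-based positions in the rule list, and the statistics step is assumed to be a simple count per serial number.

```python
import re
from collections import Counter

# Hypothetical preset rules; the serial number is the 1-based list position.
RULES = [
    re.compile(r"/item/\d+"),
    re.compile(r"/search\?"),
    re.compile(r"/cart"),
]

def analyze(urls, rules):
    """Deduplicate, match distinct URLs, copy results back, then count."""
    # Deduplicate while keeping first-seen order.
    unique_urls = list(dict.fromkeys(urls))

    # Match each distinct URL against the rules in turn.
    evaluations = 0
    numbers_by_url = {}
    for url in unique_urls:
        matched = []
        for number, rule in enumerate(rules, start=1):
            evaluations += 1
            if rule.search(url):
                matched.append(number)
        numbers_by_url[url] = matched

    # Copy the serial numbers to every original (pre-deduplication) record.
    per_record = [numbers_by_url[url] for url in urls]

    # Count how many original records correspond to each serial number.
    counts = Counter(n for numbers in per_record for n in numbers)
    return per_record, counts, evaluations

# Six records, two distinct URLs.
urls = ["http://example.com/item/1"] * 4 + ["http://example.com/cart"] * 2
per_record, counts, evaluations = analyze(urls, RULES)
print(evaluations)   # 2 distinct URLs x 3 rules = 6 evaluations
print(dict(counts))  # {1: 4, 3: 2}
```

In this example the regular expressions run 6 times instead of once per record and rule (18 times), while the per-record results and the counts are the same as matching every record directly. The saving grows with the share of repeated URLs in the log.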

Description

Technical field

[0001] The present application relates to the technical field of data processing, and in particular to a method and device for analyzing web log URLs.

Background

[0002] In business analysis, various analysis and mining processes are often performed on massive Web logs (network logs). The URLs in a Web log carry important information about visitor behavior, and regular expressions are usually needed to match them. The regular expressions used here belong to the domain of business analysis.

[0003] In the prior art, the URL processing of an entire Web log is divided into three steps:

[0004] 1. Collect massive Web logs and store the raw data;

[0005] 2. Match the URLs with regular expressions, where each URL may be subject to multiple rules (usually 1 to 10);

[0006] 3. According to the business classification corresponding to the regular-expression rules, the follow-up data index analysis of the busines...


Application Information

IPC(8): G06F17/30
Inventor: 张清
Owner: TAOBAO CHINA SOFTWARE