A method for preprocessing URLs in access logs

A preprocessing technique for access logs, applied in the field of website analysis. It addresses the problem that the raw REFERER and REQUEST values are too detailed, which hinders statistical analysis and the extraction of access paths.

Active Publication Date: 2017-05-24
FOCUS TECH

AI Technical Summary

Problems solved by technology

Path analysis based on the raw REFERER and REQUEST values recorded in the access log runs into a problem: the values are too detailed, which hinders subsequent statistical analysis and the extraction of access paths.
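A small illustration of why overly detailed URLs hinder statistics. The REQUEST values below are hypothetical placeholders, and `page_type` is a deliberately crude stand-in (it keeps only the first path segment). It is not the rule-table method described in this patent. It only shows how counts become usable once URLs are grouped more coarsely.

```python
from collections import Counter

# Hypothetical REQUEST values; host, paths and parameters are placeholders.
RAW_REQUESTS = [
    "www.example.com/products/led-lamp?id=42&src=home",
    "www.example.com/products/led-lamp?id=42&src=search",
    "www.example.com/products/desk-fan?id=77",
    "www.example.com/search?q=lamp&page=1",
    "www.example.com/search?q=lamp&page=2",
]


def page_type(url: str) -> str:
    """Crude generalization (an assumption): keep only the first path segment."""
    path = url.split("?", 1)[0]      # drop the query string
    segments = path.split("/")[1:]   # drop the host
    return "/" + segments[0] if segments and segments[0] else "/"


# Counting raw URLs: every visit looks like a different page.
RAW_COUNTS = Counter(RAW_REQUESTS)

# Counting generalized URLs: visits collapse into a few page types.
TYPE_COUNTS = Counter(page_type(u) for u in RAW_REQUESTS)
```

With the raw values every URL is counted once, so no page stands out. After grouping, the five visits fall into two page types.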




Embodiment Construction

[0039] Detailed description of the embodiments

[0040] The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The described examples are only some of the examples of the present invention, not all of them. Modifications or equivalent variations made on the basis of the embodiments of the present application and the technical essence of the claims of the present invention still fall within the protection scope of the present application.

[0041] As shown in Figure 1, the implementation steps of the application are as follows:

[0042] S11: Website URL collection, that is, sorting out and summarizing the website's URL address system. At the initial stage, collection can be done manually: the URLs of the main or important pages of the website are gathered by hand, and the basic information of each URL is confirmed, including its URL identification rule, URL name, etc. Wherein...


Abstract

A method for preprocessing website access log URLs, comprising:
S11: collecting website URLs, that is, sorting out and summarizing the website's URL address system;
S12: configuring and storing URLs, that is, storing the website URLs collected in S11 in a URL rule storage table, which includes the following fields: URL unique code, URL identification rule, URL name, URL matching order;
S13: reading the information in the URL rule storage table obtained in S12 and sorting it by "URL matching order", so that a parent URL is ranked before its child URLs;
S14: obtaining the access log records, each including the visitor's IP, access time, REFERER and REQUEST information;
S15: matching the REFERER and REQUEST of each access log record from S14 against the URL identification rules of the URL rule storage table, in the order obtained in S13;
S16: encoding as -1 or null any REFERER or REQUEST for which S15 finds no match.
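Steps S12 to S16 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. It assumes that identification rules are regular expressions, that the first matching rule in S13 order wins, and that a matched REFERER or REQUEST is replaced by the rule's URL unique code. All hosts, paths, codes and record layouts are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

NO_MATCH = -1  # S16: code for a REFERER/REQUEST that matches no rule


@dataclass(frozen=True)
class UrlRule:
    code: int         # URL unique code
    pattern: str      # URL identification rule (assumed to be a regex)
    name: str         # URL name
    match_order: int  # URL matching order


# S12: a hypothetical rule table; hosts and paths are placeholders.
RULE_TABLE = [
    UrlRule(20, r"^www\.example\.com/products/[^/?]+", "Product detail", 2),
    UrlRule(10, r"^www\.example\.com/?(\?.*)?$", "Home page", 1),
    UrlRule(30, r"^www\.example\.com/search\b", "Search results", 3),
]


def sorted_rules(table):
    """S13: order rules by 'URL matching order' (ascending)."""
    return sorted(table, key=lambda rule: rule.match_order)


def encode_url(url: Optional[str], rules) -> int:
    """S15/S16: code of the first matching rule (assumption), else NO_MATCH."""
    if not url:
        return NO_MATCH
    for rule in rules:
        if re.search(rule.pattern, url):
            return rule.code
    return NO_MATCH


def preprocess(records, table):
    """S14-S16: replace REFERER and REQUEST in each record with rule codes."""
    rules = sorted_rules(table)
    return [
        {**rec,
         "referer": encode_url(rec.get("referer"), rules),
         "request": encode_url(rec.get("request"), rules)}
        for rec in records
    ]


# S14: hypothetical, already-parsed access log records.
LOG = [
    {"ip": "203.0.113.7", "time": "2017-05-24T10:00:00",
     "referer": "www.example.com/",
     "request": "www.example.com/products/led-lamp?id=42"},
    {"ip": "203.0.113.7", "time": "2017-05-24T10:00:09",
     "referer": "www.example.com/products/led-lamp?id=42",
     "request": "www.example.com/unknown"},
]

ENCODED = preprocess(LOG, RULE_TABLE)
```

In this sketch the first record becomes the pair (10, 20), home page to product detail, and the unknown REQUEST of the second record is encoded as -1. The visitor's IP and access time are carried through unchanged.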

Description

Technical field

[0001] The invention relates to the field of website analysis, in particular to a method and device for preprocessing website access log URLs.

Background

[0002] Website access path analysis provides important data support and guidance for optimizing website structure and page layout and for understanding visitors' behavioral preferences. The basic data for path analysis comes from the website's access log, which records information such as the visitor's IP, the access time, the REFERER (the previously visited page) and the REQUEST (the currently visited page). Of these, REFERER and REQUEST are the key information for constructing the set of visited pages and the access path.

[0003] The REFERER and REQUEST recorded in the access log are both URL addresses. For example, the URL (uniform resource locator, that is, the address of a WWW page) of the home page of Made in China (hereinafter referred to as MIC) is

[0004] "www.made-in...


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F17/30
CPC: G06F16/90344, G06F16/958
Inventors: 陈静, 房鹏展
Owner: FOCUS TECH