Key Cookies identification method for Web session merging

An identification method for key Cookies, classified under electrical components and transmission systems, that addresses problems such as incomplete user sessions.

Inactive Publication Date: 2014-07-23
DONGHUA UNIV +1

AI Technical Summary

Problems solved by technology

Sometimes, relying on the URL (Uniform Resource Locator) alone to identify user sessions is not comprehensive.



Embodiment Construction

[0030] In order to make the present invention more comprehensible, preferred embodiments are described below in detail with accompanying drawings.

[0031] The present invention provides a Web session merging method based on key Cookies identification. Figure 1 is a flowchart of identifying the key Cookies of a website according to an exemplary embodiment of the present invention. The method includes the following steps:

[0032] Step 101: Obtain a Web log file from a network provider. The data record format is shown in Table 1, including 8 fields. A Map task then extracts the site name (Site) of each record from its URL field;

[0033]-[0034] Table 1. Data record format

Field       Description
UUID        Browser unique identifier
sessionID   Session ID
sourceIP    Source IP
ADSL        ADSL device number
Timestamp   Timestamp
URL         URL address
Referer     URL of the previous redirect page
UserAgent   User agent, including device, OS, browser and other information
destIP      Target IP
cookie      Cookie information
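The Map step of Step 101 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the tab delimiter, the field order (taken from the order in which Table 1 lists the fields), the use of the full host name as Site, and the helper names `parse_record`, `extract_site` and `map_record` are all hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: fields appear in the order listed in Table 1, separated by tabs.
FIELDS = ["UUID", "sessionID", "sourceIP", "ADSL", "Timestamp",
          "URL", "Referer", "UserAgent", "destIP", "cookie"]


def parse_record(line, sep="\t"):
    """Split one log line into a dict keyed by the Table 1 field names."""
    values = line.rstrip("\n").split(sep)
    return dict(zip(FIELDS, values))


def extract_site(url):
    """Return the lowercased host name of a URL, or None if there is none.

    Assumption: the site name is the full host name; a real system might
    reduce it to a registered domain instead.
    """
    if "://" not in url:
        url = "//" + url  # make urlsplit treat a bare "host/path" as a host
    try:
        host = urlsplit(url).hostname
    except ValueError:  # malformed URL, e.g. unbalanced IPv6 brackets
        return None
    return host.lower() if host else None


def map_record(line):
    """Map step: emit (Site, record) pairs, skipping lines with no usable URL."""
    record = parse_record(line)
    site = extract_site(record.get("URL", ""))
    if site is not None:
        yield site, record
```

Keying the output by Site lets a later reduce step see all records, and therefore all Cookies, of one website together.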

...


Abstract

The invention relates to a key Cookies identification method for Web session merging. Merging sessions by means of Cookie identification is a key step of Web log preprocessing and directly affects subsequent Web log mining. The method is proposed to address the low efficiency and low accuracy of traditional session merging. It identifies user-Cookies, which are associated with users, and terminal-Cookies, which are associated with the terminal from which a user browses websites. User-Cookies are identified with the previously proposed CookiePicker system, and terminal-Cookies are identified with a top-k approach. Finally, the user-Cookies and the terminal-Cookies are combined to form the final key-Cookies. The method quickly identifies user-associated Cookie information and is therefore well suited to session merging.
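The final combination step described in the abstract can be sketched in Python as follows. This excerpt does not state how the top-k ranking scores a Cookie, so the sketch assumes, purely for illustration, that a Cookie name is scored by the number of log records of a site that carry it. The user-Cookies are taken as given (in the patent they come from CookiePicker), and all function names are hypothetical.

```python
from collections import Counter


def parse_cookie_names(cookie_header):
    """Return the set of cookie names in a 'name=value; name2=value2' string."""
    names = set()
    for part in cookie_header.split(";"):
        name = part.split("=", 1)[0].strip()
        if name:
            names.add(name)
    return names


def top_k_terminal_cookies(cookie_headers, k):
    """Keep the k cookie names that occur in the most records.

    Assumption: record frequency is the ranking score; the source excerpt
    does not specify the actual criterion.
    """
    counts = Counter()
    for header in cookie_headers:
        counts.update(parse_cookie_names(header))
    # Sort by descending count, then by name so ties are resolved predictably.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def key_cookies(user_cookies, cookie_headers, k):
    """Combine user-Cookies with the top-k terminal-Cookies into key-Cookies."""
    return set(user_cookies) | set(top_k_terminal_cookies(cookie_headers, k))
```

For example, `key_cookies({"uid"}, ["sid=1; dev=a", "dev=b"], 1)` keeps `uid` as a user-Cookie and adds `dev`, the name that occurs in the most records.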

Description

Technical field

[0001] The invention relates to a method for identifying key Cookies that can be used for Web session merging, and belongs to the field of Web log preprocessing.

Background art

[0002] Web log mining refers to applying mining techniques such as association rules, cluster analysis, and prediction to Web server log files in order to discover hidden patterns in how users access Web pages. Web log preprocessing is the process of cleaning, filtering and recombining Web logs before Web log mining. The accuracy of the preprocessing results directly affects the efficiency and accuracy of Web log mining.

[0003] Identifying user sessions is the most important part of Web log preprocessing. Sometimes, relying on the URL (Uniform Resource Locator) alone to identify user sessions is not comprehensive. In that case, the information in the Cookie is needed to determine whether several incomplete user sessions...


Application Information

IPC(8): H04L29/06
Inventor: 陈德华, 沈昌干, 潘乔, 罗昕
Owner: DONGHUA UNIV