Method and system for user behavior analysis in an HTTPS environment

A user behavior analysis technology applied in the Internet field. It addresses the technical limitations, scenario limitations, and inability to analyze user behavior across the entire network found in existing approaches, and reduces development and maintenance costs.

Inactive Publication Date: 2017-06-20
JIUYUAN QIANCHANG BEIJING TECH SERVICE CO LTD

AI Technical Summary

Problems solved by technology

Hijacking HTTPS requests achieves the same analysis results as with HTTP, but it cannot analyze user behavior across the entire network. It is better suited to deploying a proxy at the egress of an enterprise network to track and analyze user behavior within the enterprise. This approach therefore has technical limitations, scenario limitations, and cost limitations arising from proxy server deployment.

Examples

Embodiment 1

[0055] As shown in Figure 1, a method for analyzing user behavior in an HTTPS environment includes:

[0056] Step 1) Establish a content feature library for the Internet resource pages to be analyzed. The feature library is composed of multiple feature codes. Each feature code includes one or a combination of: the Host of the HTTPS request URL, the page size, the resource content contained in the page, the size of the resource content, dynamic resource information, and the embedded URLs and their number;
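One way to picture the feature library of step 1) is as a collection of small records indexed by Host. The following is a minimal sketch only: the class names `FeatureCode` and `FeatureLibrary`, the field names, and the choice to index by Host are illustrative assumptions, not structures defined in the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FeatureCode:
    """One feature code (fingerprint) for a page. All field names are hypothetical."""
    page_id: str                      # label of the page this code describes
    host: str                         # Host of the HTTPS request URL
    page_size: int                    # page size in bytes
    resource_sizes: Tuple[int, ...]   # sizes of the resources contained in the page
    embedded_url_count: int           # number of embedded URLs
    has_dynamic_resources: bool = False  # simplified stand-in for dynamic resource information


class FeatureLibrary:
    """Content feature library: a set of feature codes, indexed by Host (an assumption)."""

    def __init__(self) -> None:
        self._by_host: Dict[str, List[FeatureCode]] = {}

    def add(self, code: FeatureCode) -> None:
        # Group codes by Host so later lookups only scan pages of the same site.
        self._by_host.setdefault(code.host, []).append(code)

    def candidates(self, host: str) -> List[FeatureCode]:
        # Return a copy so callers cannot mutate the library by accident.
        return list(self._by_host.get(host, []))
```

Indexing by Host is a design choice made for this sketch: it narrows the candidate set cheaply before any size-based comparison takes place.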

[0057] Step 2) Analyze, one by one, the HTTPS message data generated when users access Internet resources, extract feature information, and match it against the feature codes;

[0058] Step 3) Match the log of extracted feature information against the feature codes in the content feature library, restore the user's real access behavior, and perform further analysis and statistics.
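The matching in step 3) can be sketched as a lookup of each extracted record against the library. The example below is a hedged illustration, not the patent's algorithm: the library contents, the record fields (`host`, `page_size`, `resource_count`), and the 5% size tolerance are all assumptions chosen for the example. A tolerance is used here because sizes observed on an encrypted channel are unlikely to equal crawled sizes exactly.

```python
from typing import Dict, List, Optional

# Hypothetical feature library: page label -> feature code.
FEATURE_LIBRARY: Dict[str, dict] = {
    "shop/home": {"host": "shop.example.com", "page_size": 48_000, "resource_count": 12},
    "shop/checkout": {"host": "shop.example.com", "page_size": 21_500, "resource_count": 5},
    "news/front": {"host": "news.example.com", "page_size": 48_200, "resource_count": 30},
}


def match_record(record: dict,
                 library: Dict[str, dict] = FEATURE_LIBRARY,
                 size_tolerance: float = 0.05) -> Optional[str]:
    """Return the label of the best-matching page, or None if nothing matches.

    A feature code matches when Host and resource count are equal and the
    relative page-size difference is within size_tolerance (an assumed rule).
    """
    best_label: Optional[str] = None
    best_error: Optional[float] = None
    for label, code in library.items():
        if code["host"] != record["host"]:
            continue
        if code["resource_count"] != record["resource_count"]:
            continue
        error = abs(code["page_size"] - record["page_size"]) / code["page_size"]
        if error <= size_tolerance and (best_error is None or error < best_error):
            best_label, best_error = label, error
    return best_label


def restore_access_log(records: List[dict]) -> List[Optional[str]]:
    """Map each extracted record to a page label (None where no code matches)."""
    return [match_record(r) for r in records]
```

For instance, a record with Host `shop.example.com`, 12 resources, and an observed size of 47,500 bytes falls within 5% of the `shop/home` code and is attributed to that page; a record from an unknown Host yields `None`.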

[0059] The present invention realizes user Internet behavior analysis under https by establishing an Internet content...

Embodiment 2

[0061] The present invention is further described with reference to the embodiments. Preferably, in step 2), the extracted feature information is selected from the following:

[0062] The Host / domain of the accessed URL;

[0063] The total length of the uncached part of the page requested over HTTPS;

[0064] The number of uncached image or CSS resources loaded by the page requested over HTTPS;

[0065] The size of each resource object loaded by the page;

[0066] The time at which the HTTPS request occurred.

[0067] Preferably, in step 2), one or more feature fingerprints are formed from a combination of one or more of the above items of feature information, and the user's access path is determined from the user's HTTPS requests within a certain time range.
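The idea of combining feature information into fingerprints and grouping requests by time range can be sketched as follows. This is an assumption-laden illustration: the `Observation` fields, the tuple-shaped fingerprint, and the fixed 30-second window are hypothetical choices; the source only says "a certain time range".

```python
from typing import List, NamedTuple, Optional, Tuple


class Observation(NamedTuple):
    """Feature information extracted from one HTTPS request (field names are hypothetical)."""
    timestamp: float       # time at which the HTTPS request occurred, in seconds
    host: str              # Host / domain of the accessed URL
    uncached_length: int   # total length of the uncached part of the page
    resource_count: int    # number of uncached resources loaded


Fingerprint = Tuple[str, int, int]


def fingerprint(obs: Observation) -> Fingerprint:
    # Combine several items of feature information into one fingerprint.
    return (obs.host, obs.uncached_length, obs.resource_count)


def access_paths(observations: List[Observation],
                 window_seconds: float = 30.0) -> List[List[Fingerprint]]:
    """Group one user's requests into access paths.

    Requests falling within window_seconds of the first request in a group
    are treated as one path (a fixed window is an assumption of this sketch).
    """
    paths: List[List[Fingerprint]] = []
    current: List[Fingerprint] = []
    window_start: Optional[float] = None
    for obs in sorted(observations, key=lambda o: o.timestamp):
        if window_start is None or obs.timestamp - window_start > window_seconds:
            if current:
                paths.append(current)
            current = []
            window_start = obs.timestamp
        current.append(fingerprint(obs))
    if current:
        paths.append(current)
    return paths
```

Each resulting path is an ordered list of fingerprints that can then be compared with the feature codes in the content feature library.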

[0068] Preferably, in step 3), matching the log of extracted feature information against the feature codes in the content feature library is performed using a method selected from the following:

[0069] Unique matching...

Embodiment 3

[0073] In one embodiment, the present invention mainly comprises the following steps:

[0074] 1. Establish a content feature library for the Internet resource pages that need to be analyzed

[0075] Use crawler technology to crawl each web page to be analyzed on the target website, and build a web page feature library from the crawled page data. The feature library is composed of multiple feature codes (fingerprints). The feature codes include, but are not limited to, the Host of the HTTPS request URL, the page size, the resource content contained in the page, the size of the resource content, dynamic resource information, and the embedded URLs and their number.
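Deriving a feature code from a crawled page could look roughly like the sketch below. It assumes the crawler has already fetched the HTML and works only on that string, using Python's standard `html.parser` and `urllib.parse` modules. The function name `build_feature_code`, the output keys, and the rule for what counts as a "resource" versus an "embedded URL" are assumptions made for illustration. Resource sizes are not computed here, since that would require fetching each resource.

```python
from html.parser import HTMLParser
from typing import Dict, List
from urllib.parse import urlparse


class _ResourceCollector(HTMLParser):
    """Collects resource references and embedded links from one HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.resources: List[str] = []  # images, scripts, stylesheets
        self.links: List[str] = []      # URLs embedded via <a href=...>

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag in ("img", "script") and attr_map.get("src"):
            self.resources.append(attr_map["src"])
        elif tag == "link" and attr_map.get("href"):
            self.resources.append(attr_map["href"])
        elif tag == "a" and attr_map.get("href"):
            self.links.append(attr_map["href"])


def build_feature_code(url: str, html: str) -> Dict[str, object]:
    """Build one feature code (fingerprint) from a crawled page.

    The keys below are hypothetical; they mirror the features listed in the
    text (Host, page size, contained resources, embedded URLs and number).
    """
    collector = _ResourceCollector()
    collector.feed(html)
    return {
        "host": urlparse(url).hostname,
        "page_size": len(html.encode("utf-8")),
        "resource_urls": collector.resources,
        "embedded_url_count": len(collector.links),
    }
```

A crawler would call `build_feature_code` once per page and store the results as the feature library for that website.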

[0076]

[0077] 2. Analyze, one by one, the HTTPS message data of users accessing Internet resources, extract feature information, and match it against the feature codes / fingerprints

[0078] When HTTPS encrypts the communication channel through TLS / SSL, the following information can still be obtained after capt...

Abstract

The invention discloses a method and a system for user behavior analysis in an HTTPS environment. The method comprises the following steps: step 1) a content feature library is established for the Internet resource pages to be analyzed, the feature library being composed of multiple feature codes, each of which includes one or a combination of the Host of the HTTPS request URL, the page size, the resource content contained in the page, the size of the resource content, dynamic resource information, and the embedded URLs and their number; step 2) the HTTPS message data of users accessing Internet resources is analyzed one by one, and feature information is extracted for matching against the feature codes; and step 3) the log of extracted feature information is matched against the feature codes in the content feature library, the user's real access behavior is restored, and further analysis and statistics are performed.

Description

Technical field

[0001] The invention belongs to the field of the Internet, and relates to a method for analyzing user behavior in an HTTPS environment.

Background

[0002] HTTP request message data from user access has long been the main data source for user behavior analysis on the Internet. HTTP messages make it possible to effectively track the path, content, and frequency of user behavior, and thus to analyze the habits of Internet users and predict their behavior. This gives enterprises, investors, and others a strong basis for decision-making, so that they can formulate and implement detailed, effective strategies for different users.

[0003] As the Internet continues to develop, it carries more and more services. The growth of services such as terminal payment and financial management in particular has steadily raised the security requirements for data transmission. Therefore, many applications / Web services gradually switch from http...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/08, H04L29/06, G06F17/30
CPC: H04L67/025, G06F16/951, G06F16/9535, G06F16/955, G06F16/9574, H04L63/10
Inventor: 白晟, 张伟
Owner: JIUYUAN QIANCHANG BEIJING TECH SERVICE CO LTD