Log-based user behavior data processing method, medium, equipment and device

A data processing device and data processing technology, applied in the Internet field, that addresses problems such as inapplicability, long processing time, and increased calculation time, and that improves clustering accuracy while saving calculation time and memory.

Active Publication Date: 2019-01-04
BEIJING SHU AN XINYUN TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] First, it is difficult to use such logs to classify user interests when the correspondence between a Uniform Resource Identifier (URI) and its content, or between the content and its content category, is unknown.
[0004] Second, the features extracted from web server logs are mainly statistical, such as counts, averages, and standard deviations. These describe access behavior but not access targets, so users within a resulting cluster may have inconsistent access targets, which leads to misjudgment.
Taking KMeans, the most common clustering method, as an example, the space complexity is O(n*m), where n is the number of data items and m is the number of features. As n and m grow, clustering consumes a large amount of system memory.
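To make the O(n*m) figure concrete, the sketch below estimates the memory needed just to hold a dense n-by-m feature matrix. The 8-byte value size and the example counts are illustrative assumptions, not figures from the source, and the function name is hypothetical.

```python
def feature_matrix_bytes(n_items, n_features, bytes_per_value=8):
    """Bytes needed to store a dense n x m matrix of fixed-size values.

    bytes_per_value=8 assumes 64-bit floats (an assumption for illustration).
    """
    return n_items * n_features * bytes_per_value


# Hypothetical workload: one million users described by 200 features.
baseline = feature_matrix_bytes(1_000_000, 200)
# Doubling both n and m quadruples the storage requirement.
doubled = feature_matrix_bytes(2_000_000, 400)

print(baseline / 1e9, "GB")
print(doubled / 1e9, "GB")
```

Because the cost is the product of n and m, growth in either dimension multiplies the memory requirement rather than adding to it.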

Examples

Embodiment Construction

[0051] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative work fall within the protection scope of the present invention. It should be noted that, provided there is no conflict, the embodiments in this application and the features in them can be combined with each other arbitrarily.

[0052] The log-based user behavior data processing method in the embodiment of the present invention includes:

[0053] Step 101: Collect log information;

[0054] Step 102:...

Abstract

The invention discloses a log-based user behavior data processing method, medium, equipment and device. The method comprises: step 1, collecting log information; step 2, determining a plurality of access features and extracting, from the log information, the access behavior values of different users for the different access features; step 3, dividing the access features into N groups and determining the number of clusters for each group, where N is an integer greater than or equal to 1; step 4, clustering each group according to its number of clusters to obtain clustering results. By restructuring high-dimensional data into low-dimensional groups of similar or related features and clustering each group, the invention addresses the poor performance of clustering in high dimensions. Clustering these low-dimensional groups separately also reduces the number of clusters in each sub-clustering process, saving computation time and memory.
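Steps 3 and 4 of the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the feature names, the grouping, the per-group cluster counts, and the sample values are all hypothetical, and a plain Lloyd's k-means stands in for whatever clustering algorithm the full description specifies. How the number of clusters is determined is not shown in this excerpt, so the sketch takes it as an input.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def cluster_by_feature_group(values, groups, cluster_counts):
    """Cluster users once per feature group.

    values: {user: {feature: value}}
    groups: {group_name: [feature, ...]}   (the N groups of step 3)
    cluster_counts: {group_name: k}        (assumed to be given)
    Returns {group_name: {user: label}}.
    """
    users = sorted(values)
    results = {}
    for name, features in groups.items():
        # Each group sees only its own low-dimensional slice of the data.
        points = [tuple(float(values[u][f]) for f in features) for u in users]
        labels = kmeans(points, cluster_counts[name])
        results[name] = dict(zip(users, labels))
    return results


# Hypothetical access behavior values for four users.
values = {
    "u1": {"requests_per_day": 5, "mean_gap_seconds": 600, "mean_response_bytes": 1200},
    "u2": {"requests_per_day": 6, "mean_gap_seconds": 580, "mean_response_bytes": 90000},
    "u3": {"requests_per_day": 400, "mean_gap_seconds": 3, "mean_response_bytes": 1500},
    "u4": {"requests_per_day": 420, "mean_gap_seconds": 2, "mean_response_bytes": 88000},
}
groups = {
    "frequency": ["requests_per_day", "mean_gap_seconds"],
    "volume": ["mean_response_bytes"],
}
cluster_counts = {"frequency": 2, "volume": 2}

results = cluster_by_feature_group(values, groups, cluster_counts)
for name, assignment in results.items():
    print(name, assignment)
```

In this toy data, u1 and u2 fall together in the "frequency" group while u1 and u3 fall together in the "volume" group, so each group yields its own partition of the users. The sketch does not scale features before clustering; real data would likely need that.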

Description

Technical field

[0001] The present invention relates to the field of Internet technology, and in particular to a log-based user behavior data processing method, medium, equipment and device.

Background technique

[0002] With the development of Internet services, user access generates massive volumes of web server system logs every day. A web server system log mainly includes the client IP address, client user name, access time, request URI, request status, file size, page link source, client browser, and other information. In the prior art, web server system logs are used to classify user behavior, but existing classification methods face the following main difficulties:

[0003] First, when the correspondence between a Uniform Resource Identifier (URI) and its content, and between the content and its content category, cannot be known, it is difficult to use such logs to classify user interests.

[0004] Second, the f...
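The log fields listed in paragraph [0002] can be pulled out of a raw log line with a regular expression. The following is a minimal sketch that assumes the logs use the widely used NCSA combined log format, which the source does not state; the pattern, the function name, and the sample line are illustrative only.

```python
import re

# Assumed layout (combined log format):
# ip ident user [time] "method uri protocol" status size "referer" "agent"
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A "-" size means no body was sent; treat it as zero bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical log line for illustration.
sample = ('203.0.113.7 - alice [04/Jan/2019:10:15:32 +0800] '
          '"GET /news/item?id=42 HTTP/1.1" 200 5120 '
          '"https://example.com/index" "Mozilla/5.0"')
record = parse_line(sample)
print(record)
```

The parsed record exposes the client IP, user name, access time, request URI, status, file size, page link source (referer), and client browser (user agent) as separate keys, which is the raw material for the feature extraction the method describes.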

Application Information

IPC(8): G06K9/62, G06F16/9535
CPC: G06F18/23213, G06F18/24
Inventor: 刘鑫琪, 丛磊
Owner: BEIJING SHU AN XINYUN TECH CO LTD