User behavior log density peak clustering method capable of automatically determining clustering center

A user-log clustering technique with automatic cluster-center determination, applied in the fields of instruments, character and pattern recognition, and computer components. It addresses high resource consumption, unbalanced user partitioning, and long partitioning times, and achieves high accuracy.

Inactive Publication Date: 2019-09-24
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

To group users with the same preferences and similar behavior trajectories into one category, uncover potential connections among them, and cluster users into different groups, this invention proposes an algorithm that automatically determines the cluster centers. It can not only divide unlabeled users into different groups based on browsing log data, but also address unbalanced user partitioning, long partitioning times, and high resource consumption.
[0003] In the current era of data explosion, traditional cluster analysis algorithms can no longer meet these requirements. How to cluster data quickly, efficiently, and with low resource consumption during data mining has become a focus for many researchers.

Examples

Embodiment Construction

[0039] To make the process of the present invention easier to understand, a user behavior data set is taken as an example and described in detail below with reference to the flow chart in Figure 1.

[0040] The user behavior density peak clustering method for automatically determining cluster centers of the present invention comprises the following steps:

[0041] Step 1. Read the data set from the user log data file and calculate the similarity between each pair of users. Assuming the data set contains 1000 users (user behavior records), the similarity between user i and user j (i, j <= 1000) is calculated as the Euclidean distance between the two corresponding data points. Similarity refers to how alike two users are in behavior: the more similar their online trajectories and behavior, the higher the similarity, and two such users can be classified into the same group.
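The pairwise distance computation described in Step 1 can be sketched as follows. This is a minimal illustration, assuming each user has already been reduced to a numeric feature vector; the feature layout and the helper names `euclidean` and `distance_matrix` are hypothetical and do not come from the patent text.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(users):
    """Symmetric matrix d[i][j] of pairwise distances between users."""
    n = len(users)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Distance is symmetric, so each pair is computed once.
            d[i][j] = d[j][i] = euclidean(users[i], users[j])
    return d

# Hypothetical feature vectors, e.g. (pages viewed, minutes on site).
users = [(3.0, 4.0), (0.0, 0.0), (3.0, 0.0)]
d = distance_matrix(users)
```

A smaller distance corresponds to a higher behavioral similarity. For 1000 users this yields about 500,000 distinct pairs, so the quadratic loop is still practical at that scale.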

[0042] Step 2. Calculate the sim...

Abstract

The invention discloses a user log density peak clustering method capable of automatically determining cluster centers. The method comprises the following steps: step 1, reading the data set in a user log data file; step 2, calculating the similarity between each pair of users; step 3, calculating an initial truncation distance dc; step 4, calculating the density ρ_i of user i, where the density value reflects the similarity index of each user; step 5, calculating the distance δ_i of user point i; step 6, calculating the normalized ρ* and δ* of user i; step 7, finding data points with relatively high density; step 8, finding data points with large distances; step 9, determining the correct category center points; step 10, removing outlier users; step 11, classifying user behaviors according to the center points found, assigning each non-center point to the category of its corresponding center user; and step 12, outputting the clustering result. The method reduces the influence of subjective factors on the clustering result and achieves high accuracy.
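The steps listed in the abstract follow the general shape of density peak clustering, which can be sketched as below. This is an illustration under stated assumptions, not the patented procedure: the excerpt does not give the formulas, so the sketch assumes a cutoff-kernel density (count of neighbors closer than dc), min-max normalization, and two hypothetical thresholds `rho_min` and `delta_min` for picking centers. The truncation distance `dc` is passed in directly because its initialization rule is not shown here, and outlier removal (step 10) is omitted for the same reason.

```python
import math

def density_peaks(points, dc, rho_min=0.5, delta_min=0.5):
    """Sketch of density peak clustering; thresholds are assumptions."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]

    # Step 4 (assumed cutoff kernel): number of neighbors closer than dc.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < dc)
           for i in range(n)]

    # Step 5: distance to the nearest point of higher density.
    # Ties in density are broken by index through the stable sort.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(d[i])  # densest point: use its farthest distance
            continue
        j = min(order[:rank], key=lambda k: d[i][k])
        delta[i], parent[i] = d[i][j], j

    # Step 6: min-max normalization to [0, 1].
    def norm(v):
        lo, hi = min(v), max(v)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in v]
    rho_n, delta_n = norm(rho), norm(delta)

    # Steps 7-9: centers are points that are both dense and far apart.
    centers = [i for i in range(n)
               if rho_n[i] >= rho_min and delta_n[i] >= delta_min]

    # Step 11: each remaining point inherits the label of its nearest
    # higher-density neighbor, visited in order of decreasing density.
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    for i in order:
        if labels[i] == -1 and parent[i] != -1:
            labels[i] = labels[parent[i]]
    return centers, labels

# Two hypothetical, well-separated groups of users in a 2-D feature space.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
centers, labels = density_peaks(points, dc=1.0)
```

On this toy input the two middle points (indices 4 and 9) are selected as centers and each group of five points receives its own label. Requiring both normalized values to exceed a threshold is one simple way to avoid picking centers by eye from a decision graph; the patent's own selection rule may differ.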

Description

Technical field

[0001] The invention relates to a method for clustering website log data, in particular to a method that automatically determines cluster centers and performs density peak clustering on website user behavior log data.

Background technique

[0002] In today's era of rapid information development, a large amount of log data is generated every day by users browsing websites. Recommending content that matches users' preferences, deciding whether the client's service needs improvement, identifying the habits users generally show when browsing the web, finding patterns, improving service module settings, and making recommendations based on user preferences have become hot topics that all information portals need to pay attention to. To group users with the same preferences and similar behavior trajectories into one category, uncover potential connections among them, and cluster users into different groups, this invention proposes an algorithm that can automa...

Application Information

IPC(8): G06K9/62
CPC: G06F18/23, G06F18/22
Inventor: 吴菲, 王万良, 吕闯
Owner: ZHEJIANG UNIV OF TECH