Clustering method for network behavior habits based on two-way verification of K-means and LDA (Latent Dirichlet Allocation)

A two-way verification clustering technique, applied in text database clustering/classification, character and pattern recognition, instruments, etc., that addresses problems such as low efficiency and inaccurate results
CN106202480A · Active · Publication Date: 2016-12-07 · HUAIYIN INSTITUTE OF TECHNOLOGY

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
HUAIYIN INSTITUTE OF TECHNOLOGY
Publication Date
2016-12-07

Abstract

The invention discloses a method for clustering network behavior habits based on two-way verification between K-means and LDA (Latent Dirichlet Allocation). The method takes the webpage attributes, keywords and frequencies found in people's internet browsing records and combines a K-means algorithm, an LDA document topic extraction model and an annealing algorithm. It comprises the following steps: first, K-means clustering is performed on a person-label-frequency set, and an LDA document topic extraction model is generated from a person browsing record-person-keyword set; second, the intermediate results are stored and processed, and the annealing algorithm is used to carry out two-way verification between the K-means and LDA results; finally, a globally best topic-classification label sequence is calculated and used as a reference to optimize the network behavior habit clustering result. The two-way verification improves sensitivity to person-classification labels, and the annealing algorithm improves the efficiency of optimizing the clustering result, which in turn improves clustering accuracy.
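The verification step can be illustrated with a minimal sketch. It assumes that K-means has already assigned each person a cluster label and that LDA has already assigned each person a dominant topic, and it uses simulated annealing to search for the topic-to-label correspondence on which the two results agree most often. The function names, the agreement-count objective and the cooling parameters are all illustrative assumptions; the excerpt does not specify the patent's actual objective function or annealing schedule.

```python
import math
import random


def agreement(kmeans_labels, lda_topics, mapping):
    """Count persons whose LDA topic, mapped to a label, matches their K-means label."""
    return sum(1 for k, t in zip(kmeans_labels, lda_topics) if mapping[t] == k)


def anneal_mapping(kmeans_labels, lda_topics, n_groups,
                   t_start=1.0, t_end=1e-3, cooling=0.95,
                   steps_per_temp=50, seed=0):
    """Search for the topic -> label mapping with the highest agreement.

    Hypothetical helper: the objective and schedule are assumptions,
    not taken from the patent text.
    """
    rng = random.Random(seed)
    mapping = list(range(n_groups))      # mapping[topic] = cluster label
    rng.shuffle(mapping)
    score = agreement(kmeans_labels, lda_topics, mapping)
    best, best_score = mapping[:], score
    if n_groups < 2:                     # nothing to swap
        return best, best_score

    temp = t_start
    while temp > t_end:
        for _ in range(steps_per_temp):
            # Propose a neighbour by swapping the labels of two topics.
            i, j = rng.sample(range(n_groups), 2)
            mapping[i], mapping[j] = mapping[j], mapping[i]
            new_score = agreement(kmeans_labels, lda_topics, mapping)
            delta = new_score - score
            # Always accept improvements; accept worse moves with
            # probability exp(delta / temp), which shrinks as temp cools.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                score = new_score
                if score > best_score:
                    best, best_score = mapping[:], score
            else:
                mapping[i], mapping[j] = mapping[j], mapping[i]  # undo swap
        temp *= cooling
    return best, best_score


def disagreements(kmeans_labels, lda_topics, mapping):
    """Indices of persons on whom the two methods still disagree."""
    return [p for p, (k, t) in enumerate(zip(kmeans_labels, lda_topics))
            if mapping[t] != k]


# Toy data (invented for illustration): nine persons, three groups.
kmeans_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
lda_topics = [2, 2, 2, 0, 0, 1, 1, 1, 1]
best, best_score = anneal_mapping(kmeans_labels, lda_topics, n_groups=3)
print(best, best_score, disagreements(kmeans_labels, lda_topics, best))
```

In this sketch, the persons returned by `disagreements` are the ones a later optimization pass would re-examine, using the best mapping as the reference sequence. With only a handful of groups an exhaustive search over permutations would also work; annealing becomes relevant when the number of topics makes enumeration impractical.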

Description

Technical field

[0001] The invention belongs to the field of clustering analysis and optimization algorithms, and particularly relates to a network behavior habit clustering method based on two-way verification of K-means and LDA. The method is used to optimize clustering results, thereby improving clustering accuracy and increasing the practical value of the information recorded when people go online.

Background technique

[0002] A sound method for clustering network behavior habit data is important for studying people's internet usage habits. As the Internet becomes ever more widespread, more and more people choose to obtain information of interest through it. The amount of information that people browse online is huge, and manual analysis of this data is both inefficient and inaccurate. Cluster analysis, coupled with two-way verification against a second clustering method, can improve the efficiency and accuracy of the analysis. Gener...
