User clustering and short text clustering method based on social network short text stream

A user clustering technology for social networks, applied in the computer field. It addresses the problems that existing methods do not consider users' expression habits, cannot truly capture users' topic characteristics in social networks, and are not applicable to short text data streams.

Active Publication Date: 2017-05-10
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

The LDA model discussed in this invention is not suitable for short text data streams in social networks, for three reasons:
1. The time factor is not considered.
2. Social factors are not considered.
3. Users' expression habits are not considered.
Therefore, it cannot truly capture the topic characteristics of users in social networks.



Examples


Detailed Description of the Embodiments

[0033] The content of the present invention is further elaborated below in conjunction with the accompanying drawings and a specific embodiment. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it. It should also be noted that, for convenience of description, only the parts related to the invention are shown in the drawings.

[0034] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0035] The invention addresses the problems of "word meaning drift" (semantic drift) and short text sparsity in the short text data streams that users publish in social networks, and proposes a new text topic modeling method that obtains the text topics and user topics in each time period.
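As a rough illustration of what "each time period" could mean in practice, the sketch below buckets a timestamped stream of short texts into fixed-length periods, so that a topic model could then be fitted per period. This is only a sketch under stated assumptions: the `(user, timestamp, text)` record layout, the function name `slice_stream`, the sample posts, and the one-week period length are all hypothetical and do not come from the patent text.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def slice_stream(posts, start, period):
    """Group (user, timestamp, text) records by time-period index.

    Period 0 covers [start, start + period), period 1 the next
    interval, and so on. Posts dated before `start` receive a
    negative index.
    """
    slices = defaultdict(list)
    for user, timestamp, text in posts:
        # timedelta // timedelta yields an integer number of whole periods
        index = (timestamp - start) // period
        slices[index].append((user, text))
    return dict(slices)


# Hypothetical sample stream (not data from the patent).
posts = [
    ("u1", datetime(2017, 1, 2), "new phone battery lasts long"),
    ("u2", datetime(2017, 1, 5), "battery drains fast after update"),
    ("u1", datetime(2017, 1, 10), "great match last night"),
]
slices = slice_stream(posts, start=datetime(2017, 1, 1), period=timedelta(days=7))
```

Here the first two posts fall into period 0 and the third into period 1. Fixed-length periods are one simple choice; the source does not say how the periods are delimited.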

[0036] ...


Abstract

The invention addresses the problem that semantics-based user clustering and short text clustering methods do not take into account social factors, "semantic drift", and the sparsity of short texts. It provides a user clustering and short text clustering method based on topic modeling of social network short text streams. The method comprises six steps: first, acquiring a corpus; second, preprocessing the corpus; third, performing topic modeling on the short text data stream in the social network; fourth, inference and sampling; fifth, clustering users; sixth, clustering short texts. The method fully considers the three factors that can influence topic modeling, namely "semantic drift", "sparsity of short texts", and "the social network". It thereby solves the lack of social semantic information when clustering users and texts from social network short text streams, and thus sharply improves precision over existing clustering algorithms.
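The fifth and sixth steps can be pictured with a minimal sketch. It assumes that inference has already produced a topic distribution for every user and every short text in one time period, and it assigns each item to its most probable topic. The abstract does not state how clusters are derived from the inferred topics, so this hard-assignment rule, the function name `cluster_by_dominant_topic`, and the sample distributions are all illustrative assumptions rather than the patented procedure.

```python
from collections import defaultdict


def cluster_by_dominant_topic(topic_dists):
    """Group item ids by the index of their highest-probability topic."""
    clusters = defaultdict(list)
    for item_id, dist in topic_dists.items():
        # Index of the largest probability; ties resolve to the lowest index
        dominant = max(range(len(dist)), key=dist.__getitem__)
        clusters[dominant].append(item_id)
    return dict(clusters)


# Hypothetical inferred distributions over 3 topics (not patent data).
user_topics = {
    "alice": [0.7, 0.2, 0.1],
    "bob": [0.1, 0.8, 0.1],
    "carol": [0.6, 0.3, 0.1],
}
text_topics = {
    "post-1": [0.2, 0.1, 0.7],
    "post-2": [0.9, 0.05, 0.05],
}

user_clusters = cluster_by_dominant_topic(user_topics)  # step five
text_clusters = cluster_by_dominant_topic(text_topics)  # step six
```

The same function serves both steps because users and short texts are each represented by a distribution over the same topics. A soft alternative, such as running a distance-based clustering algorithm over the distributions, would fit the abstract equally well.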

Description

Technical field

[0001] The present invention relates to the field of computer technology, and more specifically to a method and system for clustering users and short texts based on social network short text streams.

Background

[0002] With the popularization of the mobile Internet and the rapid development of social networks, data from hundreds of millions of users has accumulated on social networks, and how to analyze the short texts these users publish in order to cluster users and short texts has become a very important topic. However, existing approaches offer no effective way to cluster users dynamically based on the latent semantic information of short text data streams in social networks. The present invention therefore proposes an effective dynamic clustering method that analyzes the latent semantics of short text data streams in social networks to solve the problems of user clustering and short text clustering.

[0003] The present ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/35
Inventors: 沈鸿, 邱章成
Owner: SUN YAT SEN UNIV