A method for large-scale micro-blog user interest group discovery

A user interest discovery method applied to large-scale microblog user interest group discovery. It addresses the problem that the K-means algorithm requires the value of k to be set manually, with the stated effect of improving quality.

Status: Inactive
Publication date: 2019-03-08
JIANGSU UNIV
Cites: 4; Cited by: 2

AI Technical Summary

Problems solved by technology

Existing approaches have two shortcomings. First, they do not solve the problem that the K-means algorithm requires the value of k to be set manually. Second, as the microblog user data set to be clustered keeps growing, execution of the K-means algorithm is limited by the memory actually available on the system.
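The memory limitation can be illustrated with a generic sequential (online) K-means update, which reads one sample at a time and keeps only the cluster centers and their counts in memory. This is a minimal sketch and not the SSLOK-Means algorithm named in the abstract, whose details are not given here; the function name `sequential_kmeans` and the toy data are assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple


def sequential_kmeans(
    stream: Iterable[Sequence[float]],
    initial_centers: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[int]]:
    """Update centers one sample at a time (hypothetical helper).

    Only the centers and per-cluster counts are held in memory, so the
    full data set never has to be loaded at once.
    """
    centers = [list(map(float, c)) for c in initial_centers]
    counts = [0] * len(centers)
    for point in stream:
        # Find the nearest center by squared Euclidean distance.
        j = min(
            range(len(centers)),
            key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
        )
        counts[j] += 1
        # Running-mean update: move the center toward the new sample.
        rate = 1.0 / counts[j]
        centers[j] = [c + rate * (p - c) for p, c in zip(point, centers[j])]
    return centers, counts


def toy_stream() -> Iterator[Tuple[float, float]]:
    """Stand-in for reading samples from disk in a single pass."""
    yield from [(0, 0), (0, 2), (10, 10), (10, 12)]


centers, counts = sequential_kmeans(toy_stream(), [(0, 1), (10, 11)])
print(centers, counts)
```

Because each center is a running mean of the samples assigned to it, memory use depends on k and the number of features, not on the number of users. The number of centers still has to be chosen beforehand, which is the first shortcoming described above.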

Examples

Embodiment 1

[0071] Cluster analysis is performed on big data from Sina Weibo users to discover Weibo user interest groups, so that personalized services can be provided to different interest groups. This supports the optimization of Weibo personalized services and the improvement of marketing revenue.

[0072] 1. Use the "Octopus Collector" to capture the Sina Weibo information of 627 ordinary users. Filtering the collected information eliminates 60 "silent users" and retains 567 valid microblog users, which completes the user filtering process. In this embodiment, the Sina Weibo data is preprocessed as shown in Table 1; the first 5 rows of data in Table 1 serve as an example.
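The user filtering step can be sketched as follows. The excerpt does not say how a "silent user" is defined, so the criterion used here (fewer than `min_posts` posts) and the field names `user_id` and `post_count` are assumptions for illustration only.

```python
from typing import Dict, List


def drop_silent_users(users: List[Dict], min_posts: int = 1) -> List[Dict]:
    """Keep users with at least `min_posts` posts.

    Assumption: a "silent user" is one with too few posts. The source
    text does not define the actual criterion.
    """
    return [u for u in users if u.get("post_count", 0) >= min_posts]


# Hypothetical records standing in for the collected Weibo information.
sample = [
    {"user_id": "u1", "post_count": 42},
    {"user_id": "u2", "post_count": 0},
    {"user_id": "u3", "post_count": 7},
]
active = drop_silent_users(sample)
print(len(sample) - len(active), "silent,", len(active), "kept")
```

In the embodiment, the same kind of filter reduces 627 collected users to 567 valid users.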

[0073] Table 1 Preference vector of user interests

[0074]

[0075]

[0076] 2. Sample data inspection:

[0077] 1) Before performing cluster analysis, it is necessary to check whether the sample data can represent the whole; otherwise, clustering is meaningless. A...

Abstract

The invention discloses a method for discovering interest groups among large-scale micro-blog users, belonging to the technical field of data mining. The method comprises the following steps: (1) acquiring data as the data source for micro-blog user interest group discovery; (2) checking and preprocessing the data; (3) standardizing the data and converting the unstructured data into structured data for clustering analysis; (4) carrying out the improved SSLOK-Means clustering analysis based on Calinski-Harabasz; (5) using the CH validity discriminant function to determine the optimal number of clusters k, and constructing the interest group discovery model. The invention makes it possible to perform cluster analysis on micro-blog user big data within limited memory while automatically determining the number of clusters, and it supports the optimization of micro-blog personalized services and the improvement of marketing revenue.
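Step (5) can be illustrated with a small sketch that scores several candidate values of k with the Calinski-Harabasz (CH) index, CH = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster and W the within-cluster sum of squares, and keeps the k with the highest score. Plain K-means with a deterministic farthest-point initialization stands in for SSLOK-Means, whose details are not given in this excerpt; the input is assumed to be standardized already, and all function names and the toy data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: Sequence[Point]) -> List[float]:
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points: Sequence[Point], k: int, iters: int = 50):
    """Plain K-means (a stand-in, not SSLOK-Means)."""
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean(members)
    return labels, centers


def ch_index(points: Sequence[Point], labels: List[int], centers) -> float:
    """Calinski-Harabasz index for one clustering result."""
    n, k = len(points), len(centers)
    if not 2 <= k < n:
        raise ValueError("CH index needs 2 <= k < n")
    overall = mean(points)
    between = sum(labels.count(j) * sq_dist(centers[j], overall) for j in range(k))
    within = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    if within == 0:
        return math.inf
    return (between / (k - 1)) / (within / (n - k))


def choose_k(points: Sequence[Point], k_max: int) -> Tuple[int, Dict[int, float]]:
    """Score k = 2..k_max and return the k with the largest CH index."""
    scores = {}
    for k in range(2, k_max + 1):
        labels, centers = kmeans(points, k)
        scores[k] = ch_index(points, labels, centers)
    return max(scores, key=scores.get), scores


# Toy data: three well-separated groups of four points each.
blobs = [
    (cx + dx, cy + dy)
    for cx, cy in [(0, 0), (20, 0), (0, 20)]
    for dx in (-1, 1)
    for dy in (-1, 1)
]
best_k, scores = choose_k(blobs, 5)
print(best_k, scores)
```

A higher CH value means compact clusters that are far apart, so taking the maximum over the candidate range replaces the manual choice of k. The candidate range itself (here 2 to 5) is still an input that has to be supplied.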

Description

Technical field

[0001] The invention belongs to the technical field of data mining, and in particular relates to a method for discovering interest groups of large-scale microblog users.

Background art

[0002] As an open network platform, Weibo provides users with a broad space for sharing and communication. With its real-time, concise, and open features, Weibo has a huge number of users. According to the 2017 Sina Weibo User Development Report, as of September 30, 2017, the monthly active users of Sina Weibo reached a new high of 376 million, an increase of 27% over the same period in 2016. Faced with ever-growing user groups, how microblog operators can provide users with more accurate and personalized services is a major problem that urgently needs to be solved. The massive data generated by Weibo users on the platform contains a wealth of user behavior information. Through the analysis and research of user data, it is found that user groups with similar intere...

Application Information

IPC(8): G06Q50/00, G06K9/62
CPC: G06Q50/01, G06F18/23
Inventor: 申彦
Owner: JIANGSU UNIV