Emergent topic detecting method and system facing text streams of micro-blog platform

A topic detection technology for microblogs, applied in the field of Internet information management. It addresses problems such as inflexible topic model parameter settings and noise in the text stream, helping ensure that the detected topics are genuinely emergent.

Inactive Publication Date: 2013-09-04
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

However, topic models suffer from problems such as inflexible parameter settings, noise in real-time text streams, and insufficient statistical information, and the topics they discover are not necessarily emergent topics or events.

Examples

Embodiment Construction

[0044] Figure 1 is a flow chart of the method of the present invention for detecting emergent topics in the text streams of a microblog platform. As shown in Figure 1, the method includes:

[0045] Step 1, collect user data and user-generated message data from the microblog platform in real time, and extract the message text and accompanying pictures from the user data and user-generated message data;

[0046] Step 2, set a time window and divide the message text accordingly to obtain a real-time data stream and historical data;

[0047] Step 3, select features from the historical data, and use a classification method on the message text to train a popularity evaluation model and a long-microblog extraction model;

[0048] Step 4, use the popularity evaluation model to evaluate the popularity of the real-time data stream, use the long microblog extraction model to extract long microblogs, and put the messages evaluated as popular into the popular message ...
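Steps 2 and 4 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Message` fields, the `reposts` feature, and the function names are all hypothetical, and the trained models of Step 3 are stood in for by plain predicates passed in by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Message:
    text: str
    timestamp: float   # seconds since epoch (assumed representation)
    reposts: int = 0   # hypothetical feature, not named in the source


def split_by_window(messages: List[Message], now: float,
                    window_seconds: float) -> Tuple[List[Message], List[Message]]:
    """Step 2 (sketch): messages inside the most recent window form the
    real-time stream; everything older is treated as historical data."""
    cutoff = now - window_seconds
    realtime = [m for m in messages if m.timestamp >= cutoff]
    history = [m for m in messages if m.timestamp < cutoff]
    return realtime, history


def route_stream(realtime: List[Message],
                 is_popular: Callable[[Message], bool],
                 is_long: Callable[[Message], bool]
                 ) -> Tuple[List[Message], List[Message]]:
    """Step 4 (sketch): apply the two models, here plain predicates,
    and collect the messages each one selects."""
    popular_set = [m for m in realtime if is_popular(m)]
    long_set = [m for m in realtime if is_long(m)]
    return popular_set, long_set
```

In a fuller version the two predicates would wrap classifiers trained on features selected from the historical data, as Step 3 describes; the thresholds a caller might use here (for example a repost count or a character length) are placeholders only.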

Abstract

The invention provides a method and system for detecting emergent topics in the text streams of a microblog platform. The method comprises the following steps: (1) user data and user-generated message data of the microblog platform are collected in real time, and message text and images are extracted; (2) a time window is set and the message text is divided accordingly, yielding a real-time data stream and historical data; (3) features are selected, and a popularity evaluation model and a long-microblog extraction model are trained; (4) the popularity evaluation model is used to evaluate the popularity of the real-time data stream and the long-microblog extraction model is used to extract long microblogs from it, the messages evaluated as popular are put into a popular message set, and the extracted long-microblog content is put into a long-microblog set; (5) it is judged whether the sizes of the popular message set and the long-microblog set have reached preset thresholds; if so, topic extraction is performed with an LDA model or by weighted summation, and emergent topics are extracted from the data in the popular message set and the long-microblog set; if not, the method returns to step (1).
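Step (5) can be sketched in Python as follows, under several assumptions that the abstract does not settle: both sets must reach their thresholds before extraction runs, terms are obtained by whitespace splitting (Chinese text would need a word segmenter), and "weighted summation" is read as a per-term sum of set-specific weights. The function name, parameter names, and weight values are hypothetical. The LDA alternative is not shown.

```python
from collections import Counter
from typing import List, Optional


def extract_topics_weighted(popular_texts: List[str], long_texts: List[str],
                            min_popular: int, min_long: int,
                            popular_weight: float = 1.0,
                            long_weight: float = 2.0,
                            top_k: int = 5) -> Optional[List[str]]:
    """Return the top_k terms by weighted frequency, or None if either set
    is still below its threshold (the caller would then keep collecting)."""
    if len(popular_texts) < min_popular or len(long_texts) < min_long:
        return None

    scores: Counter = Counter()
    # Each occurrence of a term adds the weight of the set it came from.
    for text in popular_texts:
        for term in text.lower().split():
            scores[term] += popular_weight
    for text in long_texts:
        for term in text.lower().split():
            scores[term] += long_weight

    return [term for term, _ in scores.most_common(top_k)]
```

A `None` result corresponds to the "if not, return to step (1)" branch: the caller goes on collecting messages and calls the function again once more data has arrived.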

Description

Technical field

[0001] The invention relates to the field of Internet information management, and in particular to a method for detecting emergent topics in the text streams of a microblog platform.

Background

[0002] With the rapid development of the Internet, and of Web 2.0 in particular, social networking services such as Facebook, Myspace and Twitter have become indispensable communication tools for network users. These services give users updates from friends, updates from people or groups they follow, and information on what is currently popular, and these functions are gradually changing how users of social networking services obtain information. As a new type of social network, the microblog, represented by Twitter abroad and Sina Weibo in China, is quite different from Facebook and other virtual communities based on traditional communities. On the real-time nature of the news....

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 程学旗, 李静远, 房伟伟, 王元卓, 刘悦
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI