
A microblog burst topic detection method and device

A microblog topic detection technology, applied in network data retrieval, other database retrieval, digital data information retrieval, and related fields, that addresses the difficulty of identifying burst topics in microblogs and improves detection accuracy.

Active Publication Date: 2019-10-29
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT

AI Technical Summary

Problems solved by technology

[0007] The invention provides a microblog burst topic detection method and device to solve the problem that microblog burst topics are currently difficult to identify.



Examples


Embodiment 1

[0026] This embodiment provides a microblog burst topic detection method, which is used to identify and acquire microblog burst topics. As shown in Figure 1, the method includes the following steps:

[0027] Step 101: Extract feature items in the specified microblog data set, where feature items are language units containing specific semantics;

[0028] In this step, extracting the feature items in the specified microblog data set includes: extracting the repeated character strings in the specified microblog set; extracting the words located before each repeated string in the text where the repeated string occurs, to obtain the first adjacency set; extracting the words located after the repeated string in the text where the repeated string occurs, to obtain the second adjacency set; and determining the number of elements in the first adjacency set and in the second adjacency set. When the number of elements in both the first adjacency set and the second adjacency set is greater than the preset value, the current repeated character string is determined to be a feature item.
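The adjacency test in this step can be illustrated with a minimal sketch. It assumes whitespace-tokenized text (Chinese microblogs would first need word segmentation), treats word n-grams as candidate repeated strings, and uses hypothetical parameters `max_len`, `min_repeats`, and `min_neighbors` that do not appear in the source.

```python
from collections import defaultdict

def extract_feature_items(texts, max_len=3, min_repeats=2, min_neighbors=2):
    """Return repeated word sequences whose preceding-word and
    following-word adjacency sets each hold more than `min_neighbors`
    distinct words. All parameter names are illustrative assumptions."""
    counts = defaultdict(int)
    left = defaultdict(set)    # first adjacency set: words before the string
    right = defaultdict(set)   # second adjacency set: words after the string
    for text in texts:
        words = text.split()   # assumption: whitespace tokenization
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = tuple(words[i:i + n])
                counts[gram] += 1
                if i > 0:
                    left[gram].add(words[i - 1])
                if i + n < len(words):
                    right[gram].add(words[i + n])
    return sorted(
        " ".join(gram)
        for gram, count in counts.items()
        if count >= min_repeats
        and len(left[gram]) > min_neighbors
        and len(right[gram]) > min_neighbors
    )

texts = [
    "a big earthquake hit",
    "the big earthquake struck",
    "one big earthquake today",
]
print(extract_feature_items(texts))
```

In this toy input, "big" alone is rejected because it is always followed by the same word, while "big earthquake" appears in varied contexts on both sides and is kept.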

[0029] Step 102: Dete...

Embodiment 2

[0047] In order to solve the above technical problems, this embodiment discloses further technical details, with reference to Figure 2, to illustrate the microblog burst topic discovery method of the above embodiment.

[0048] Step 1: Dynamically extract meaningful strings from the microblog information flow in the specified time window, to serve as dynamic features of the local microblog information. The extraction exploits the repetitive nature of microblog information, combined with contextual adjacency analysis of strings.

[0049] Treat microblog information as a text stream over a time series and set an observation time window T. The microblog information within the time window T forms a document collection D={D1, D2, D3,...}; the meaningful strings extracted from D form the feature space S of microblog information in the window T, and the feature space S will change dy...
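Selecting the document collection D for one observation window can be sketched as follows. This is a minimal illustration, assuming posts arrive as hypothetical `(timestamp, text)` pairs and that the window is half-open; neither detail is specified in the source.

```python
from datetime import datetime, timedelta

def window_documents(posts, window_start, window_length):
    """Return the texts of posts whose timestamp falls in the half-open
    window [window_start, window_start + window_length)."""
    window_end = window_start + window_length
    return [text for ts, text in posts if window_start <= ts < window_end]

t0 = datetime(2019, 1, 1, 0, 0)
posts = [
    (t0, "post one"),
    (t0 + timedelta(minutes=30), "post two"),
    (t0 + timedelta(hours=1), "post three"),
]
# Documents in the first one-hour window
print(window_documents(posts, t0, timedelta(hours=1)))
```

Re-running the string extraction on each successive window is what would make the feature space change over time.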

Embodiment 3

[0077] This embodiment provides a microblog burst topic detection device, which is used to implement the microblog burst topic detection method provided in Embodiment 1 and Embodiment 2 above. As shown in Figure 3, the device 20 includes the following components:

[0078] The extraction module 21 is used to extract the feature items in the specified microblog data set, and the feature items are language units containing specific semantics;

[0079] The determination module 22 is used to determine the circulation of the feature item in the text of the microblog data set and the current popularity of the feature item;

[0080] The modeling module 23 is used to perform dynamics modeling on the feature item, with the circulation as the mass parameter and the popularity as the position parameter, so as to obtain the current energy and acceleration of the feature item;
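The formulas used by the modeling module are not given in this excerpt. The sketch below is one possible reading, assuming popularity is sampled once per time window, velocity and acceleration are finite differences of popularity, and energy is the kinetic energy 0.5 * m * v^2 with circulation as the mass m. The function names and thresholds are hypothetical.

```python
def energy_and_acceleration(circulation, popularity_series):
    """Estimate current energy and acceleration of a feature item.

    Assumptions (not from the source): position = popularity per window,
    velocity = first difference, acceleration = second difference,
    energy = 0.5 * mass * velocity**2 with mass = circulation.
    """
    if len(popularity_series) < 3:
        raise ValueError("need at least three popularity samples")
    x0, x1, x2 = popularity_series[-3:]
    velocity = x2 - x1
    acceleration = (x2 - x1) - (x1 - x0)
    energy = 0.5 * circulation * velocity ** 2
    return energy, acceleration

def is_burst(energy, acceleration, energy_threshold, acceleration_threshold):
    """A feature item is flagged as bursting when both values exceed
    their respective preset values."""
    return energy > energy_threshold and acceleration > acceleration_threshold

energy, acceleration = energy_and_acceleration(2.0, [1.0, 2.0, 5.0])
print(energy, acceleration, is_burst(energy, acceleration, 5.0, 1.0))
```

Requiring both conditions means a feature item that is popular but stable (high position, near-zero velocity and acceleration) would not be flagged under this reading.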



Abstract

The invention provides a microblog burst topic detection method and device, which can solve the problem that microblog burst topics are difficult to recognize at present. The microblog burst topic detection method includes: extracting feature items from a designated microblog data set, wherein a feature item is a language unit carrying specific semantics; determining the circulation of the feature item in the text of the microblog data set and the current popularity of the feature item; performing dynamics modeling on the feature item, with the circulation as the mass parameter and the popularity as the position parameter, to obtain the current energy and the current acceleration of the feature item; detecting burst feature items when the obtained energy and the obtained acceleration are greater than a first preset value and a second preset value respectively; calculating mutual information among the burst feature items according to how often they occur together in the same microblog; and merging the burst feature items when the mutual information is greater than a third threshold to obtain a burst topic. The microblog burst topic detection method and device can improve the accuracy of burst topic detection.
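The final merging step can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: it uses pointwise mutual information computed from co-occurrence counts as the mutual information measure, represents each microblog as a set of feature items, and groups items with a union-find structure. The function name and input shape are hypothetical.

```python
import math
from itertools import combinations

def merge_burst_items(posts, burst_items, threshold):
    """Group burst feature items into topics.

    posts: list of sets, each holding the feature items of one microblog.
    Two items are merged when their pointwise mutual information,
    log(p(a, b) / (p(a) * p(b))), exceeds `threshold` (an assumption).
    """
    n = len(posts)
    items = sorted(burst_items)
    count = {a: sum(1 for p in posts if a in p) for a in items}
    parent = {a: a for a in items}

    def find(a):
        # Union-find lookup with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in combinations(items, 2):
        joint = sum(1 for p in posts if a in p and b in p)
        if joint == 0:
            continue  # never co-occur, so never merged
        pmi = math.log((joint * n) / (count[a] * count[b]))
        if pmi > threshold:
            parent[find(a)] = find(b)

    topics = {}
    for a in items:
        topics.setdefault(find(a), set()).add(a)
    return sorted(sorted(topic) for topic in topics.values())

posts = [
    {"quake", "rescue"},
    {"quake", "rescue"},
    {"concert"},
    {"concert", "quake"},
]
print(merge_burst_items(posts, {"quake", "rescue", "concert"}, 0.0))
```

Here "quake" and "rescue" co-occur more often than chance and end up in one topic, while "concert" stays on its own.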

Description

Technical field

[0001] The invention relates to the field of network information mining, in particular to a microblog burst topic detection method and device.

Background

[0002] Weibo is a Web 2.0 new medium that has emerged in recent years. Users can post text of up to 140 characters, along with pictures, audio, video, and other multimedia content, to their personal Weibo via mobile phones, instant messaging tools, email, the Web, and other channels, to show their latest personal updates and share real-time information around them. A huge amount of information is generated on the microblog platform every day. By the end of 2013, the total number of microblog users in China had exceeded 1.3 billion, and the average daily posting volume exceeded 200 million. Moreover, because Weibo is connected to various media and makes publishing and forwarding information very convenient, it has become the medium with the fastest speed of information dissemination...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/332, G06F16/953
Inventor: 贺敏, 王丽宏, 周勇林, 云晓春, 程学旗, 包秀国, 马宏远, 丁丽, 刘玮, 刘悦, 赵立永, 杨建武
Owner NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT