A microblog burst topic detection method and device
A microblog topic detection technology, applied in network data retrieval, other database retrieval, digital data information retrieval, and related fields, that addresses the difficulty of identifying burst topics in microblogs and improves the accuracy rate
Examples
Embodiment 1
[0026] This embodiment provides a microblog burst topic detection method for identifying and acquiring microblog burst topics. As shown in Figure 1, the method includes the following steps:
[0027] Step 101: Extract feature items in the specified microblog data set, where feature items are language units containing specific semantics;
[0028] In this step, extracting the feature items in the specified microblog data set includes: extracting the repeated character strings in the specified microblog data set; for each repeated string, extracting the words located before it in the text where it occurs to obtain the first adjacency set, and extracting the words located after it to obtain the second adjacency set; determining the number of elements in the first adjacency set and in the second adjacency set; and, when the number of elements in both the first adjacency set and the second adjacency set is greater than the preset value, determining that the current repeated string is a feature item.
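The adjacency-set test in this step can be illustrated with a minimal Python sketch. It assumes the posts are already tokenized into words and that a "repeated string" is a word n-gram occurring at least `min_count` times; the function name `extract_feature_items` and all of its parameters are hypothetical and do not come from the patent text.

```python
from collections import defaultdict

def extract_feature_items(posts, min_len=2, max_len=4, min_count=2, preset=2):
    """Return repeated n-grams whose adjacency sets both exceed `preset`.

    posts: list of token lists (tokenization is assumed to be done upstream).
    All parameter names and defaults are illustrative assumptions.
    """
    counts = defaultdict(int)   # occurrences of each candidate string
    before = defaultdict(set)   # first adjacency set: words preceding the string
    after = defaultdict(set)    # second adjacency set: words following the string

    for tokens in posts:
        n = len(tokens)
        for size in range(min_len, max_len + 1):
            for i in range(n - size + 1):
                gram = tuple(tokens[i:i + size])
                counts[gram] += 1
                if i > 0:
                    before[gram].add(tokens[i - 1])
                if i + size < n:
                    after[gram].add(tokens[i + size])

    # A repeated string is kept only if both adjacency sets are larger
    # than the preset value.
    return {
        gram for gram, count in counts.items()
        if count >= min_count
        and len(before[gram]) > preset
        and len(after[gram]) > preset
    }

posts = [
    "a big storm hits".split(),
    "the big storm grows".split(),
    "this big storm ends".split(),
]
features = extract_feature_items(posts, min_len=2, max_len=2, preset=2)
```

In this toy input, "big storm" is the only repeated string, and it appears with three distinct preceding words and three distinct following words, so it passes the threshold of 2. Varied context on both sides suggests the string is a self-contained unit and not a fragment of a longer fixed phrase.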
[0029] Step 102: Dete...
Embodiment 2
[0047] To solve the above technical problems, this embodiment discloses further technical details, with reference to Figure 2, to further illustrate the microblog burst topic detection method of the above embodiment.
[0048] Step 1: Dynamically extract meaningful strings from the microblog information flow in the specified time window, to serve as the dynamic features of the local microblog information. The extraction exploits the repetitive characteristics of microblog information, combined with contextual adjacency analysis of strings.
[0049] Treat the microblog information as a text stream over a time series and set an observation time window T. The microblog information within the time window T is taken as a document collection D={D1, D2, D3,...}. The meaningful strings extracted from D form the feature space S of the microblog information in the window T, and the feature space S will change dy...
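As a rough illustration of the windowing idea, the following Python sketch builds the document collection D for a window T and derives a feature space S from it. The helper names `window_documents` and `feature_space` are hypothetical, and the feature extractor is a deliberately simple stand-in (words that occur in at least two documents), not the meaningful-string extraction described in the patent.

```python
from collections import Counter
from datetime import datetime, timedelta

def window_documents(posts, start, length):
    """Return the document collection D for the window [start, start + length)."""
    end = start + length
    return [text for timestamp, text in posts if start <= timestamp < end]

def feature_space(documents, min_docs=2):
    """Stand-in for S: words appearing in at least `min_docs` documents.

    This is an assumption for illustration only; the source extracts
    meaningful strings via repetition and adjacency analysis instead.
    """
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc.split()))
    return {word for word, freq in doc_freq.items() if freq >= min_docs}

t0 = datetime(2024, 1, 1, 12, 0)
T = timedelta(hours=1)
posts = [
    (t0, "quake hits city"),
    (t0 + timedelta(minutes=5), "quake felt downtown"),
    (t0 + timedelta(minutes=70), "concert tickets sold"),
    (t0 + timedelta(minutes=80), "concert tickets gone"),
]

S_first = feature_space(window_documents(posts, t0, T))
S_second = feature_space(window_documents(posts, t0 + T, T))
```

Sliding the window by one hour replaces the feature space entirely in this toy data, showing that S depends on which posts fall inside T.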
Embodiment 3
[0077] This embodiment provides a microblog burst topic detection device for implementing the microblog burst topic detection method provided in Embodiment 1 and Embodiment 2 above. As shown in Figure 3, the device 20 includes the following components:
[0078] The extraction module 21 is used to extract the feature items in the specified microblog data set, and the feature items are language units containing specific semantics;
[0079] The determination module 22 is used to determine the circulation of the feature item in the text of the microblog data set and the current popularity of the feature item;
[0080] The modeling module 23 is used to dynamically model the feature item, with the circulation as the mass (quality) parameter item and the popularity (heat) as the position parameter item, so as to obtain the current energy and acceleration of the feature item;
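The excerpt does not give the formulas used by the modeling module, so the following Python sketch is only one plausible reading, borrowed from classical mechanics. It assumes popularity is sampled once per time window and treated as position, circulation is treated as mass, velocity and acceleration are backward finite differences, and energy is kinetic energy, ½·m·v². The function name `energy_and_acceleration` and the parameter `dt` are hypothetical.

```python
def energy_and_acceleration(popularity, circulation, dt=1.0):
    """Estimate current energy and acceleration of one feature item.

    popularity: sequence of popularity values, one per time window (position).
    circulation: circulation of the feature item (mass-like parameter).
    dt: spacing between windows. All modeling choices here are assumptions;
    the source text does not specify the formulas.
    """
    if len(popularity) < 3:
        raise ValueError("need at least three popularity samples")
    x0, x1, x2 = popularity[-3:]
    v_prev = (x1 - x0) / dt               # velocity one window ago
    v_now = (x2 - x1) / dt                # current velocity
    energy = 0.5 * circulation * v_now ** 2
    acceleration = (v_now - v_prev) / dt
    return energy, acceleration

energy, acceleration = energy_and_acceleration([1.0, 2.0, 5.0], circulation=4.0)
```

Under these assumptions, a feature item whose popularity rises faster in each successive window has positive acceleration, and a widely circulated item carries more energy for the same rate of change.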