A microblog emergency detection method and detection device based on heap optimization

A detection method for microblog emergencies, applied in database retrieval, digital data information retrieval, and database clustering/classification. It addresses problems such as sparse data and the difficulty of obtaining good detection results.

Active Publication Date: 2019-01-29
TIANJIN UNIV
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Because short texts contain few words and their data is sparse, ordinary text-centric methods struggle to achieve good detection results.


Examples

Embodiment 1

[0042] A microblog emergency detection method based on heap optimization (see Figure 1) includes the following steps:

[0043] 101: Perform noise reduction and word segmentation preprocessing on Weibo text;

[0044] Microblog text contains a large amount of invalid data such as emoticons, web links, and user comment content. These characters are matched in the microblog text and deleted. The cleaned text is then segmented with word segmentation software.
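The noise-reduction step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the three regular expressions (`URL_RE`, `EMOTICON_RE`, `REPOST_RE`) and the function name `denoise` are assumptions, since the source only names the categories of noise (emoticons, links, comment content) and not the matching rules.

```python
import re

# Hypothetical patterns: the source names the noise categories but not the rules.
URL_RE = re.compile(r"https?://[A-Za-z0-9./?=&%_#\-]+")  # ASCII-only web links
EMOTICON_RE = re.compile(r"\[[^\[\]]{1,8}\]")            # bracketed emoticon codes such as [泪]
REPOST_RE = re.compile(r"//@.*$")                        # trailing repost/comment chain


def denoise(text: str) -> str:
    """Delete comment chains, links, and emoticon codes, then tidy whitespace."""
    text = REPOST_RE.sub("", text)    # drop everything from the first "//@" onward
    text = URL_RE.sub("", text)       # drop web links
    text = EMOTICON_RE.sub("", text)  # drop bracketed emoticon codes
    return " ".join(text.split())     # collapse leftover whitespace


print(denoise("地震了[泪] http://t.cn/abc //@user: 真的吗"))
```

The cleaned string would then be passed to the word segmentation tool.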

[0045] 102: Group the preprocessed microblog data by time window, and respectively calculate the word weights of the microblog text in the group;
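Step 102 can be sketched as below. The weight formula is not visible in this excerpt, so the sketch uses a stand-in: each word's weight in a window is the sum of the influence scores of the posts that contain it. The tuple layout `(timestamp, tokens, influence)` and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_window(posts, window_seconds):
    """Group posts into fixed-width time windows.

    posts: iterable of (timestamp_seconds, tokens, influence) tuples (assumed layout).
    Returns {window_index: [(tokens, influence), ...]}.
    """
    windows = defaultdict(list)
    for ts, tokens, influence in posts:
        windows[int(ts // window_seconds)].append((tokens, influence))
    return dict(windows)


def word_weights(window_posts):
    """Stand-in weight: sum of author influence over posts containing the word."""
    weights = defaultdict(float)
    for tokens, influence in window_posts:
        for word in set(tokens):  # count each word once per post
            weights[word] += influence
    return dict(weights)


posts = [(0, ["quake", "city"], 1.0), (30, ["quake"], 2.0), (70, ["rain"], 1.0)]
windows = group_by_window(posts, window_seconds=60)
print(word_weights(windows[0]))
```

Any other per-window weight (for example a TF-IDF variant) could be substituted in `word_weights` without changing the grouping logic.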

[0046] 103: Use the word weights to calculate the burst degree of each word within the time window, and extract the burst word set;
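As a rough illustration of step 103, the sketch below scores each word by a smoothed ratio of its current-window weight to its mean weight over earlier windows, and keeps the words above a threshold. This ratio is a substitute chosen for simplicity, not the burst-degree formula of the patent (which is not shown in this excerpt); `smoothing` and `threshold` are hypothetical parameters.

```python
def burst_degree(current, history, smoothing=1.0):
    """Smoothed ratio of the current weight to the mean historical weight."""
    mean_hist = sum(history) / len(history) if history else 0.0
    return (current + smoothing) / (mean_hist + smoothing)


def burst_words(current_weights, history_weights, threshold=2.0):
    """Return the set of words whose burst degree reaches the threshold.

    current_weights: {word: weight} for the current time window.
    history_weights: list of {word: weight} dicts for earlier windows.
    """
    result = set()
    for word, weight in current_weights.items():
        hist = [h.get(word, 0.0) for h in history_weights]
        if burst_degree(weight, hist) >= threshold:
            result.add(word)
    return result


current = {"quake": 9.0, "rain": 1.0}
history = [{"quake": 1.0, "rain": 1.0}, {"rain": 1.0}]
print(burst_words(current, history))
```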

[0047] User influence is affected by factors such as the number of followers, the number of microblogs posted, whether the user is a VIP user, and the user's activity level. Weibo content published by users...

Embodiment 2

[0062] The technical solution of Embodiment 1 is described in detail below with specific calculation formulas and examples:

[0063] 201: In the microblog emergency detection process, the noise reduction processing of the microblog text must be performed first. During this process, junk characters such as emoji, web page links, and comment content in the microblog text need to be deleted.

[0064] 202: Perform word segmentation processing on the microblog text after noise reduction through the IKAnalyzer word segmentation tool;

[0065] During word segmentation, an extended vocabulary and a stop-word vocabulary must be added to improve the segmentation quality. The segmentation step yields the word segmentation result of the microblog text. IKAnalyzer is an open-source, lightweight Chinese word segmentation toolkit written in Java, which is well known to tho...

Embodiment 3

[0093] The feasibility of the schemes in Embodiments 1 and 2 is verified below with a specific example and Figure 3:

[0094] The purpose of this embodiment of the present invention is to optimize the clustering algorithm of the original method and thereby improve the detection efficiency of the microblog emergency detection method. The method reduces the time complexity of the original algorithm from O(N³) to O(N² log N), and it achieved the expected result in the experiment.
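One common way to obtain this kind of speed-up is sketched below: agglomerative clustering in which candidate pairs sit in a max-heap, so the best pair is popped in O(log N) instead of rescanning all pairs (O(N²) per merge, O(N³) overall). The patent's actual heap layout and similarity measure are not shown in this excerpt, so the single-link update, the lazy deletion of stale heap entries, and the names `heap_cluster`, `sim`, and `threshold` are all assumptions.

```python
import heapq


def heap_cluster(sim, threshold):
    """Single-link agglomerative clustering driven by a max-heap.

    sim: symmetric N x N list of pairwise similarities.
    Merges the most similar pair until the best similarity is below threshold.
    Returns the clusters as sorted lists of item indices.
    """
    n = len(sim)
    members = {i: [i] for i in range(n)}  # active cluster id -> items
    link = {i: {j: sim[i][j] for j in range(n) if j != i} for i in range(n)}
    # heapq is a min-heap, so similarities are stored negated.
    heap = [(-sim[i][j], i, j) for i in range(n) for j in range(i + 1, n)]
    heapq.heapify(heap)  # O(N^2) initial build
    next_id = n
    while heap:
        neg_s, a, b = heapq.heappop(heap)
        if a not in members or b not in members:
            continue  # stale entry: one side was already merged away
        if -neg_s < threshold:
            break  # every remaining pair is at most this similar
        merged = members.pop(a) + members.pop(b)
        row_a, row_b = link.pop(a), link.pop(b)
        new_row = {}
        for k in members:  # O(N) pushes per merge, O(log N) each
            s = max(row_a[k], row_b[k])  # single-link similarity to the merged cluster
            new_row[k] = s
            del link[k][a], link[k][b]
            link[k][next_id] = s
            heapq.heappush(heap, (-s, k, next_id))
        members[next_id] = merged
        link[next_id] = new_row
        next_id += 1
    return sorted(sorted(m) for m in members.values())


sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.15, 0.1],
    [0.1, 0.15, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
print(heap_cluster(sim, threshold=0.5))  # [[0, 1], [2, 3]]
```

Cluster ids are never reused, so a heap entry is valid exactly when both of its ids are still active. At most N - 1 merges occur and each pushes O(N) entries, so the heap holds O(N²) entries in total and the overall cost is O(N² log N).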

[0095] In the comparison experiment, the actual running times of this method and the original method are compared with the number of burst words set to 100, 200, 400, 800, and 1600. The running times of the two methods are compared by keeping the input data consistent, keeping the data preprocessing consistent, and taking the average value of multiple exp...

Abstract

The present invention discloses a heap-optimization-based microblog emergency event detection method and a detection apparatus thereof. The detection method comprises the following steps: grouping preprocessed microblog data by time window, and separately calculating the word weights of the microblog text in each group; obtaining the burst degree of each word within the time window from the word weights, and extracting a burst word set; clustering the burst word set, with heap optimization accelerating the clustering process; and processing the clustering result to extract effective events. The detection apparatus comprises a calculation module, a first extraction module, a clustering module, and a second extraction module. By taking into account factors such as user influence, the method and apparatus detect emergency events in large volumes of short microblog texts, meeting users' demand for obtaining emergency events and the requirements of practical applications.

Description

Technical field

[0001] The invention relates to the field of microblog emergency event detection in short text streams, in particular to a heap-optimization-based microblog emergency event detection method and a detection device thereof.

Background art

[0002] TDT (Topic Detection and Tracking) technology emerged in 1996, and its earliest goal was to identify and track topics in online news texts. With the development of the Internet, short text applications such as Weibo and Twitter have appeared, and the demand for topic detection in short texts has become increasingly prominent, so TDT technology for short texts has also continued to develop. At present, there are two main research approaches to microblog emergencies at home and abroad: the text-centered method and the burst-feature-centered method.

[0003] The text-centered method first extracts the subject words of the text, expresses the content of the text through the subject words, and...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/9535, G06F16/906
CPC: G06F16/9535
Inventor: 于瑞国, 林榆旺, 喻梅, 王建荣, 于健, 赵满坤
Owner: TIANJIN UNIV