A microblog emergency detection method and detection device based on heap optimization
The invention relates to emergency detection technology for microblogs, applied in database retrieval, digital data information retrieval, and database clustering/classification. It addresses problems such as sparse data and difficulty in obtaining detection results.
Examples
Embodiment 1
[0042] A microblog emergency detection method based on heap optimization. Referring to Figure 1, the detection method includes the following steps:
[0043] 101: Perform noise reduction and word segmentation preprocessing on Weibo text;
[0044] Microblog text contains a large amount of invalid data such as emoticons, web address links, and user comment content. These characters are matched in the microblog text and deleted. The cleaned microblog text is then segmented into words by word segmentation software.
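The noise reduction in step 101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source only states that emoticons, links, and comment content are matched and deleted, so the three regular expressions and the function name `denoise` are assumptions (bracketed emoticons such as `[sad]`, `http(s)` links, and a trailing `//@user:` comment chain).

```python
import re

# Hypothetical patterns; the source does not specify how junk characters are matched.
COMMENT_RE = re.compile(r"//@.*$", re.S)                   # quoted comment chain after "//@"
URL_RE = re.compile(r"https?://[A-Za-z0-9./?=&%_#-]+")     # web address links
EMOTICON_RE = re.compile(r"\[[^\[\]]{1,8}\]")              # bracketed emoticon codes


def denoise(text: str) -> str:
    """Delete comment content, links, and emoticon codes, then tidy whitespace."""
    text = COMMENT_RE.sub("", text)
    text = URL_RE.sub("", text)
    text = EMOTICON_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


sample = "Earthquake reported downtown [sad] http://t.cn/abc123 //@user1: stay safe"
cleaned = denoise(sample)
```

The cleaned string would then be passed to the word segmentation software.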
[0045] 102: Group the preprocessed microblog data by time window, and respectively calculate the word weights of the microblog text in the group;
[0046] 103: Calculate the burst degree of each word in the time window from its word weight, and extract the burst word set;
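Steps 102 and 103 can be illustrated with a short sketch. The weight and burst-degree formulas are not given in this excerpt, so both are assumptions here: the word weight is taken to be the raw term frequency within a window, and the burst degree is taken to be the current weight divided by one plus the mean weight in earlier windows. The names `group_by_window`, `burst_words`, and `min_ratio` are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_window(posts, window_seconds):
    """Group (timestamp_seconds, tokens) pairs into per-window word counts."""
    windows = defaultdict(Counter)
    for ts, tokens in posts:
        windows[int(ts // window_seconds)].update(tokens)
    return windows


def burst_words(windows, current, min_ratio=2.0):
    """Return {word: burst degree} for words that spike in window `current`.

    Assumed burst degree: weight now / (1 + mean weight in earlier windows).
    """
    earlier = [w for idx, w in windows.items() if idx < current]
    result = {}
    for word, weight in windows[current].items():
        mean_prev = sum(w[word] for w in earlier) / len(earlier) if earlier else 0.0
        burst = weight / (mean_prev + 1.0)
        if burst >= min_ratio:
            result[word] = burst
    return result


posts = [
    (0, ["weather", "lunch"]),
    (30, ["weather"]),
    (60, ["quake", "quake", "weather"]),
    (90, ["quake", "quake"]),
]
windows = group_by_window(posts, window_seconds=60)
bursts = burst_words(windows, current=1)
```

In this toy input, "quake" appears four times in the second window and never before, so it is the only word that passes the assumed threshold.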
[0047] User influence depends on factors such as the number of followers, the number of microblogs posted, whether the user is a VIP user, and the user's activity level. Weibo content published by users...
Embodiment 2
[0062] The technical solution of Embodiment 1 is described in detail below with specific calculation formulas and examples:
[0063] 201: In microblog emergency detection, noise reduction of the microblog text must be performed first. In this step, junk characters such as emoji, web page links, and comment content are deleted from the microblog text.
[0064] 202: Segment the noise-reduced microblog text with the IKAnalyzer word segmentation tool;
[0065] During word segmentation, an extension dictionary and a stop-word dictionary need to be added to improve the segmentation quality. Word segmentation yields the segmentation result of the microblog text. IKAnalyzer is an open-source, lightweight Chinese word segmentation toolkit developed in the Java language, which is well known to tho...
Embodiment 3
[0093] The feasibility of the schemes in Embodiments 1 and 2 is verified below with a specific example and with reference to Figure 3:
[0094] The purpose of this embodiment of the present invention is to optimize the clustering algorithm of the original method and thereby improve the efficiency of the microblog emergency detection method. The method reduces the time complexity of the original algorithm from O(N^3) to O(N^2 log N), and it achieved the expected result in the experiment.
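The complexity reduction can be illustrated with a generic heap-based agglomerative clustering sketch. This excerpt does not show the patent's actual clustering procedure, so the following is an assumption: single-linkage merging over a distance matrix, with a binary heap of candidate pairs and lazy deletion of stale entries. A naive version rescans all O(N^2) pairs on each of up to N merges, giving O(N^3). Here each merge pushes at most N new entries, so about N^2 heap operations are performed in total, each costing O(log N). The function name `heap_cluster` and the `threshold` parameter are hypothetical.

```python
import heapq
from itertools import combinations


def heap_cluster(dist, threshold):
    """Single-linkage agglomerative clustering driven by a min-heap.

    dist: symmetric N x N distance matrix (list of lists).
    Merging stops once the closest live pair is farther apart than `threshold`.
    """
    n = len(dist)
    members = {i: [i] for i in range(n)}  # live cluster id -> item indices
    d = {i: {j: dist[i][j] for j in range(n) if j != i} for i in range(n)}
    heap = [(dist[i][j], i, j) for i, j in combinations(range(n), 2)]
    heapq.heapify(heap)
    next_id = n
    while heap:
        w, a, b = heapq.heappop(heap)
        if a not in members or b not in members:
            continue  # stale entry: one side was already merged away
        if w > threshold:
            break  # the closest live pair is too far apart
        merged = members.pop(a) + members.pop(b)
        da, db = d.pop(a), d.pop(b)
        new, next_id = next_id, next_id + 1
        d[new] = {}
        for c in members:  # remaining live clusters
            w_c = min(da[c], db[c])  # single-linkage distance update
            del d[c][a], d[c][b]
            d[c][new] = d[new][c] = w_c
            heapq.heappush(heap, (w_c, c, new))
        members[new] = merged
    return sorted(sorted(m) for m in members.values())


points = [0, 1, 2, 10, 11, 50]
dist = [[abs(p - q) for q in points] for p in points]
clusters = heap_cluster(dist, threshold=1.5)
```

Merged clusters receive fresh ids, so any heap entry that mentions a retired id can be discarded when popped, and no entry ever needs to be updated in place.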
[0095] The comparison experiment measures the actual running time of this method and of the original method with the number of burst words set to 100, 200, 400, 800, and 1600. The running times of the two methods are compared while keeping the input data and the data preprocessing identical, and taking the average value of multiple exp...