Topic-oriented multi-microblog time sequence abstracting method

A microblog time-series summarization technique, applied in fields such as special data processing applications, instruments, and electrical digital data processing, that addresses problems such as information redundancy.

Active Publication Date: 2016-07-06
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

However, because microblogs have a huge number of users and an open publishing model, the information on them is highly redundant. In order t



Examples


Embodiment Construction

[0058] The technical solution of the present invention will be described in detail below in conjunction with specific embodiments.

[0059] Taking four real Twitter data sets based on name A and ipad as examples, an implementation of the topic-oriented multi-microblog time-series summarization method of the present invention is given. The overall algorithm flow of the system is shown in Figure 1 and comprises six steps: microblog dataset input; popularity (heat) signal modeling; important time point selection; modeling of instantaneous microblog time-series features and user authority, together with design of the T2ST microblog ranking model; MMR-based microblog summary selection; and output of the summary result.
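The popularity signal and important-time-point steps listed above can be illustrated with a minimal sketch. It assumes posts arrive as timestamps in seconds, approximates the update speed by the number of posts per fixed-width time bin, and skips the wavelet denoising stage entirely. The function names `popularity_signal` and `important_time_points`, the bin width, and the sample timestamps are hypothetical and are not taken from the patent.

```python
from collections import Counter


def popularity_signal(timestamps, bin_seconds):
    """Count posts per time bin; the count approximates the update speed."""
    if not timestamps:
        return []
    start = min(timestamps)
    counts = Counter(int((t - start) // bin_seconds) for t in timestamps)
    n_bins = max(counts) + 1
    return [counts.get(i, 0) for i in range(n_bins)]


def important_time_points(signal, top_k):
    """Return indices of local maxima, ranked by signal value (highest first)."""
    peaks = []
    for i, value in enumerate(signal):
        left = signal[i - 1] if i > 0 else float("-inf")
        right = signal[i + 1] if i + 1 < len(signal) else float("-inf")
        # Strictly above the left neighbour, at least as high as the right one.
        if value > left and value >= right:
            peaks.append(i)
    peaks.sort(key=lambda i: signal[i], reverse=True)
    return peaks[:top_k]


# Hypothetical post timestamps, in seconds since the first post.
timestamps = [0, 5, 8, 61, 62, 63, 64, 65, 130, 185, 186, 187, 250]
signal = popularity_signal(timestamps, bin_seconds=60)
peaks = important_time_points(signal, top_k=2)
print(signal, peaks)
```

In a fuller pipeline the signal would be denoised before the maxima are taken, so that short-lived spikes are not mistaken for important time points.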

[0060] Specific steps are as follows:

[0061] 1) Microblog dataset input

[0062] As shown in Table 1, the initial input to the system consists of three real Twitter corpus datasets retrieved with the topic keywords name A, ipad and microsoft, respectively. The numbers are 221364, 143887 and...



Abstract

The invention discloses a topic-oriented multi-microblog time-series summarization method. The method comprises the following steps: 1) modeling the popularity signal of a topic-oriented microblog text stream, with time points on the horizontal axis and the microblog update speed at each time point on the vertical axis; 2) denoising the initial signal from step 1) by wavelet denoising, selecting the local maximum points of the signal at a given time granularity, and ranking them by their corresponding update speed to detect important time points; 3) establishing a text ranking model, T2ST, that reflects the importance of a microblog by fusing the instantaneous time-series features of the microblog stream popularity signal with the social authority of users in the social network; and 4) selecting summary sentences using the maximal marginal relevance (MMR) technique and establishing an MMR microblog summary sentence selection model. The method detects the important time points in a microblog sequence under a specific topic by wavelet denoising and, on that basis, summarizes multiple microblogs using an improved graph-based random walk algorithm, so that the output result is highly accurate.
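Step 4) can be illustrated with a minimal sketch of greedy MMR selection. It assumes each microblog already has a relevance score (in the method this would come from the T2ST ranking model, which is not reproduced here) and uses Jaccard word overlap as a stand-in similarity measure. The names `jaccard` and `mmr_select`, the trade-off weight `lam`, and the sample posts and scores are hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity between two texts, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def mmr_select(posts, relevance, k, lam=0.7):
    """Greedily pick k post indices, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(posts)))
    while candidates and len(selected) < k:

        def mmr_score(i):
            # Redundancy is the highest similarity to any already chosen post.
            redundancy = max(
                (jaccard(posts[i], posts[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical posts and relevance scores (stand-ins for ranking model output).
posts = [
    "new ipad released today",
    "new ipad released today wow",
    "ipad battery life review",
]
relevance = [0.9, 0.85, 0.6]
chosen = mmr_select(posts, relevance, k=2, lam=0.5)
print([posts[i] for i in chosen])
```

With these inputs the second post is skipped despite its high relevance, because it nearly duplicates the first one already selected; this is the redundancy-reducing behaviour that MMR is meant to provide.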

Description

Technical field

[0001] The invention relates to the technical field of data mining for online microblogs, and in particular to a topic-oriented multi-microblog time-series summarization method.

Background art

[0002] With the rapid development of Internet technology, and especially the emergence of microblogs, the way people obtain information has changed. However, because microblogs have a huge number of users and an open publishing model, the information on them is highly redundant. Research on and design of summarization algorithms therefore has important practical significance for helping users obtain the information they need from microblogs accurately and quickly.

[0003] A summary is a short text that clearly and concisely describes the important content of a document; its length is generally less than 15% of the length of the original document. Obtaining information in this way can greatly shorten the time required to obtain information without distorti...


Application Information

IPC(8): G06F17/30
CPC: G06F16/9535
Inventors: 贺瑞芳, 于广川, 党建武, 胡清华
Owner: TIANJIN UNIV