An Event Evolution Analysis Method for Short Text Data

The method belongs to the field of event evolution analysis for short text data. It addresses two problems with existing approaches: they are not applicable to short text data, and they cannot track the evolution of events dynamically in real time.

Active Publication Date: 2015-07-29
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

Existing tools such as GAC-INCR analyze the data only statically and cannot track the evolution of events dynamically in real time.
In addition, the clustering method used by GAC-INCR is not suitable for short text data.

Embodiment Construction

[0042] To make the purpose, technical solution, and advantages of the present invention clearer, the event evolution analysis method for short text data proposed by the present invention is described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here are used only to explain the present invention, not to limit it.

[0043] According to an embodiment of the present invention, a method for analyzing event evolution of short text data is provided. Specifically, the method includes the following steps:

[0044] The first step is to obtain the events of the first period and the number of articles associated with each event.

[0045] First, in the initial period, given a thesaurus of fixed size D (that is, containing D terms) and the N short text data items (also called N documents or articles) input in real time during the initial period, the document-term matrix of th...
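
The construction of an N x D document-term matrix from a fixed thesaurus can be sketched as follows. This is only an illustration under stated assumptions: the source text is truncated and does not say how terms are tokenized or weighted, so whitespace tokenization and raw term counts are assumed here, and the function name `build_document_term_matrix` and the sample data are hypothetical. Chinese short texts such as Weibo posts would need a word segmenter in place of `split()`.

```python
import numpy as np

def build_document_term_matrix(documents, vocabulary):
    """Return an N x D count matrix for N documents over a fixed vocabulary of D terms.

    Assumption: whitespace tokenization and raw counts; the patent text
    does not specify either choice.
    """
    index = {term: j for j, term in enumerate(vocabulary)}
    matrix = np.zeros((len(documents), len(vocabulary)), dtype=float)
    for i, doc in enumerate(documents):
        for token in doc.lower().split():
            j = index.get(token)
            if j is not None:  # tokens outside the thesaurus are ignored
                matrix[i, j] += 1.0
    return matrix

# Hypothetical thesaurus (D = 4) and short texts (N = 3) for one time period.
vocabulary = ["flood", "rescue", "election", "vote"]
documents = [
    "Flood rescue teams arrive",
    "election vote count",
    "flood flood warning",
]
X = build_document_term_matrix(documents, vocabulary)
print(X)
```

Because the thesaurus size D is fixed, matrices from different time periods share the same columns, which is what allows event-term vectors from consecutive periods to be compared directly.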

Abstract

The invention provides an event evolution analysis method for short text data. The method comprises: constructing a document-term matrix for the current time period from a lexicon and the short text data input during that period; performing non-negative matrix factorization on the document-term matrix to obtain a document-event matrix and an event-term matrix; calculating, from the event-term matrix, the similarity between the events of the current time period and the events of the previous time period; constructing an event relation graph for the current time period from the similarity, the events of the current time period, and the residual graph of the previous time period; dividing the event relation graph into one or more subgraphs and classifying the subgraphs to obtain a set of newly generated events and a set of evolving events; calculating the number of documents associated with each event from the document-event matrix; and analyzing and predicting the trend of the evolving events from those document counts, with the prediction serving as a constraint on the non-negative matrix factorization of the next time period. The method is suitable for dynamically tracking the event evolution process in short text data.
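
A rough sketch of the core of this pipeline is given below. It is not the patented implementation. The assumptions are: plain unconstrained NMF with standard multiplicative updates stands in for the constrained factorization; cosine similarity with a hypothetical threshold of 0.5 is used as the similarity measure; subgraphs are taken to be connected components; and each document is assigned to its highest-weight event. The residual graph of the previous period and the trend prediction are omitted because the abstract does not define them. All function names are hypothetical.

```python
import numpy as np

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Factor X (N x D) into W (N x k, document-event) and H (k x D, event-term).

    Uses standard multiplicative updates for the Frobenius objective.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cosine_similarity(A, B, eps=1e-12):
    """Pairwise cosine similarity between the rows of A and the rows of B."""
    A_n = A / (np.linalg.norm(A, axis=1, keepdims=True) + eps)
    B_n = B / (np.linalg.norm(B, axis=1, keepdims=True) + eps)
    return A_n @ B_n.T

def connected_components(n_nodes, edges):
    """Split an undirected graph into connected components using union-find."""
    parent = list(range(n_nodes))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for node in range(n_nodes):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())

def classify_events(H_cur, H_prev, threshold=0.5):
    """Return (new_events, evolving_events) as indices of current-period events.

    Nodes 0..k_cur-1 are current events; nodes k_cur.. are previous events.
    A current event linked to any previous event is treated as evolving.
    """
    k_cur, k_prev = H_cur.shape[0], H_prev.shape[0]
    S = cosine_similarity(H_cur, H_prev)
    edges = [(i, k_cur + j) for i in range(k_cur) for j in range(k_prev)
             if S[i, j] >= threshold]
    new_events, evolving_events = [], []
    for component in connected_components(k_cur + k_prev, edges):
        current = [n for n in component if n < k_cur]
        has_previous = any(n >= k_cur for n in component)
        (evolving_events if has_previous else new_events).extend(current)
    return sorted(new_events), sorted(evolving_events)

def documents_per_event(W):
    """Count documents per event, assigning each document to its strongest event."""
    return np.bincount(W.argmax(axis=1), minlength=W.shape[1])

# Hypothetical document-term matrix with two clearly separated events.
X = np.array([[1, 1, 0, 0], [2, 2, 0, 0], [0, 0, 1, 1], [0, 0, 3, 3]], dtype=float)
W, H = nmf(X, k=2)
print(documents_per_event(W))
```

In a streaming setting, `classify_events` would be called once per time period with the current event-term matrix and the one retained from the previous period, and the per-event document counts would feed whatever trend model is used.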

Description

Technical field

[0001] The invention relates to the field of data mining, and in particular to an event evolution analysis method for short text data.

Background

[0002] With the emergence of Web 2.0 technology, users are increasingly involved in network applications. Weibo, currently a very popular type of network application, is a platform for sharing, disseminating, and acquiring information based on user relationships. Users can transmit information, post comments, and so on through short text data on Weibo. How to process such short text data has attracted growing attention. Data mining on short text data usually has three requirements: first, to discover newly generated events (or topics) from short text data in a timely manner; second, tracking; third, to build system models that can handle large-scale network data.

[0003] However, data mining for short text data such as Weibo is quite difficult. The reasons a...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30, G06F17/27
Inventor: 程学旗, 刘盛华, 李福鑫, 王元卓, 刘悦
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI