News event evolution analysis method based on time sequence distribution information and topic model

A technique that combines time-series distribution information with topic models, applied in the field of text analysis, to address the difficulty of finding new changes in events in subsequent time slices

Active Publication Date: 2014-08-13
TONGJI UNIV

AI Technical Summary

Problems solved by technology

For a corpus that only contains specific news events, it is difficul



Examples


Embodiment Construction

[0038] To make the object, technical solution, and advantages of the present invention clearer, the method according to the embodiments of the present invention is described in further detail below with reference to the accompanying drawings. The specific embodiments described here serve only to explain the present invention and are not intended to limit it. That is, the protection scope of the present invention is not limited to the following embodiments. On the contrary, those skilled in the art can make appropriate changes according to the inventive concept of the present invention, and these changes can fall within the scope of the invention defined by the claims.

[0039] As shown in Figure 1, the basic framework diagram of the present invention, the news event evolution analysis based on the time series distribution information and the t...


Abstract

The invention discloses a news event evolution analysis method based on time-series distribution information and a topic model, and relates to the field of text analysis. The method comprises the following steps. First, the distribution of news reports over time is analyzed, and a K-Means clustering algorithm is used to divide the corpus into several sub-corpora by time. Second, a topic model is fitted to each sub-corpus in turn, with the model learned by Gibbs sampling, to obtain the topic distribution of each sub-corpus. Finally, the Jensen-Shannon distance is computed between every pair of topics in adjacent sub-corpora, and the topics with the minimum distance are linked in series. The linked topics form the main topic of the event, while the remaining auxiliary topics in each sub-corpus represent the concerns and new developments of the event at each stage. The method better describes the main line of an event's development in news reports and the new concerns that emerge at each stage.
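The time slicing and topic linking steps in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the text does not confirm: each report is represented only by its publication day (an integer), K-Means runs on that single feature, and the Jensen-Shannon distance is taken as the square root of the Jensen-Shannon divergence with base-2 logarithms. The function names `kmeans_1d`, `js_distance`, and `link_closest_topics` are hypothetical, and the topic distributions are hand-written toy values, not output of a fitted model.

```python
import math

def kmeans_1d(values, k, iters=100):
    """Lloyd's K-Means on scalars; returns one cluster label per value."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids spread evenly over the observed range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                      for v in values]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = sum(members) / len(members)
    return labels

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (assumed definition)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(0.0, jsd))  # clamp tiny negative rounding errors

def link_closest_topics(topics_a, topics_b):
    """Return (i, j, distance) for the closest topic pair across two slices."""
    pairs = ((i, j, js_distance(p, q))
             for i, p in enumerate(topics_a)
             for j, q in enumerate(topics_b))
    return min(pairs, key=lambda t: t[2])

# Toy data: publication days of eight reports, split into three time slices.
days = [1, 2, 3, 20, 21, 22, 40, 41]
slice_labels = kmeans_1d(days, k=3)

# Toy topic-word distributions for two adjacent slices (three-word vocabulary).
slice_1_topics = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
slice_2_topics = [[0.3, 0.4, 0.3], [0.7, 0.2, 0.1]]
main_link = link_closest_topics(slice_1_topics, slice_2_topics)
```

Because the square root is monotone, choosing the minimum-distance pair gives the same link whether the distance or the underlying divergence is used. Topics in each slice that are not part of the linked chain would play the role of the auxiliary topics.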

Description

Technical field

[0001] The invention relates to the field of text analysis, in particular to a method for analyzing news event topic evolution.

Background

[0002] In the Internet age, information is growing at an explosive rate, but it is becoming more and more difficult to find the information we really need. Therefore, we need new methods to help us organize and understand this huge amount of information. As a method that can automatically organize, understand, search, and summarize large-scale electronic documents, the topic model can be used to mine the topic information hidden in a document collection, then label documents according to their topics, and finally organize, summarize, and search the text according to those labels.

[0003] The basic idea of topic models is that a document is a mixture of multiple topics, and a topic is a probability distribution over the vocabulary. The topic model is a generative model. In order to generate a document, the probabil...
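The generative view of a document as a mixture of topics, with each topic a distribution over the vocabulary, can be sketched as follows. This is an illustrative toy only, not the model described in the patent: the function name `generate_document`, the vocabulary, and all probabilities are hypothetical, and the document-topic and topic-word distributions are fixed by hand rather than drawn from priors or learned.

```python
import random

def generate_document(doc_topic, topic_word, vocab, length, rng):
    """Generate `length` words: pick a topic per word, then a word from it."""
    words = []
    for _ in range(length):
        # Choose a topic index according to the document's topic mixture.
        z = rng.choices(range(len(doc_topic)), weights=doc_topic)[0]
        # Choose a word according to that topic's distribution over the vocabulary.
        w = rng.choices(vocab, weights=topic_word[z])[0]
        words.append(w)
    return words

vocab = ["flood", "rescue", "budget", "vote"]
topic_word = [
    [0.5, 0.5, 0.0, 0.0],  # hypothetical "disaster" topic
    [0.0, 0.0, 0.5, 0.5],  # hypothetical "politics" topic
]
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# A document that mixes both topics, and one drawn from a single topic.
mixed_doc = generate_document([0.7, 0.3], topic_word, vocab, 10, rng)
single_topic_doc = generate_document([1.0, 0.0], topic_word, vocab, 10, rng)
```

Fitting a topic model reverses this process: given only the observed words, it infers the topic mixtures and the topic-word distributions.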


Application Information

IPC(8): G06F17/27
Inventor: 王俊丽, 王志成, 赵卫东, 王坚
Owner: TONGJI UNIV