Topic mining based event cluster acquisition method

A topic-mining-based method for acquiring event clusters, applied in the field of computer text processing and mining. It addresses the inability to directly measure the degree of correlation between topics, and makes the calculated event association degree more reasonable.

Active Publication Date: 2016-03-09
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

Blei, David M. and Lafferty, John D. pointed out in "A correlated topic model of Science" (CTM) [5] that one of the defects of LDA is that it cannot directly measure the degree of correlation between topics.
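For context, the contrast between the two models can be written out as below. This is the standard formulation from the topic-modelling literature rather than text from the patent; the symbols \alpha, \mu, \eta, \theta and K (the number of topics) are notation assumed here, and \Sigma is the covariance matrix that the abstract calls sigma.

```latex
\begin{aligned}
\text{LDA:}\quad & \theta \sim \mathrm{Dirichlet}(\alpha) \\
\text{CTM:}\quad & \eta \sim \mathcal{N}(\mu, \Sigma), \qquad
\theta_k = \frac{\exp(\eta_k)}{\sum_{j=1}^{K} \exp(\eta_j)}
\end{aligned}
```

The Dirichlet prior has no parameter for the covariance of a given pair of topics, whereas in CTM the off-diagonal entry \Sigma_{ij} expresses how strongly topics i and j tend to appear together.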



Embodiment Construction

[0025] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the figures are exemplary, serve only to explain the present invention, and should not be construed as limiting it.

[0026] In describing the present invention, it should be understood that the orientations or positional relationships indicated by the terms "center", "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings, and are used only for the convenience of describing the present invention and simplifying the description, rather than indicating or implying that the device or element refe...


Abstract

The present invention discloses a topic mining based event cluster acquisition method. The method comprises the following steps:
S1: collecting a text data set C;
S2: preprocessing the text data set C to remove meaningless words;
S3: setting the number n of topics and a parameter, and running CTM to obtain a CTM model;
S4: searching, with a backtracking algorithm, for all maximum clusters in the covariance matrix sigma, which indicates the association degree between topics of the CTM model; the maximum clusters are the topic clusters;
S5: for each topic in each topic cluster, selecting the most corresponding article from the text data set C, and clustering the events corresponding to those articles to form an event cluster.
The method has the following advantage: association degree information at the topic level is used during association degree analysis. Compared with the traditional technology, in which association degree information is mined and used at the word level, this makes the calculated event association degree more reasonable.
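Steps S4 and S5 can be sketched roughly as follows. This is an illustrative reading, not the patent's implementation, and it rests on several assumptions: the "maximum clusters" are taken to be maximal cliques of a graph in which two topics are linked when their covariance exceeds a cutoff (`threshold` is a hypothetical parameter), the backtracking search is the basic Bron-Kerbosch procedure, and the "most corresponding article" for a topic is the document with the largest weight for that topic in a hypothetical document-topic matrix `theta`. The function names are invented for this sketch.

```python
from typing import List, Set

import numpy as np


def topic_clusters(sigma: np.ndarray, threshold: float) -> List[Set[int]]:
    """S4 sketch: maximal cliques of the thresholded covariance graph.

    Assumption: topics i and j are linked when sigma[i, j] > threshold.
    """
    n = sigma.shape[0]
    # Neighbour sets; the diagonal (a topic's own variance) is ignored.
    adj = [{j for j in range(n) if j != i and sigma[i, j] > threshold}
           for i in range(n)]
    cliques: List[Set[int]] = []

    def backtrack(current: Set[int], candidates: Set[int],
                  excluded: Set[int]) -> None:
        # Nothing left to add and nothing already covered: maximal clique.
        if not candidates and not excluded:
            cliques.append(set(current))
            return
        for v in sorted(candidates):
            backtrack(current | {v}, candidates & adj[v], excluded & adj[v])
            # Backtrack: v has been fully explored, move it to excluded.
            candidates = candidates - {v}
            excluded = excluded | {v}

    backtrack(set(), set(range(n)), set())
    return cliques


def event_clusters(theta: np.ndarray,
                   clusters: List[Set[int]]) -> List[List[int]]:
    """S5 sketch: for every topic in a cluster, take the document with the
    highest weight for that topic; the chosen documents form one cluster.

    Assumption: theta[d, k] is the weight of topic k in document d.
    """
    best_doc = theta.argmax(axis=0)  # index of the best document per topic
    return [sorted({int(best_doc[k]) for k in cluster})
            for cluster in clusters]


# Toy data (made up for illustration): 4 topics, 3 documents.
sigma = np.array([
    [1.0, 0.8, 0.7, 0.0],
    [0.8, 1.0, 0.6, 0.0],
    [0.7, 0.6, 1.0, 0.1],
    [0.0, 0.0, 0.1, 1.0],
])
theta = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])
clusters = topic_clusters(sigma, threshold=0.5)
events = event_clusters(theta, clusters)
```

With this toy input, topics 0, 1 and 2 are pairwise linked and form one topic cluster, while topic 3 stands alone. Each document index returned by `event_clusters` stands in for the event that document reports. How the covariance matrix is actually turned into a graph, and how an article is matched to a topic, is not stated in the abstract, so both rules here should be treated as placeholders.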

Description

Technical field

[0001] The invention belongs to the field of computer text processing and mining, and relates to hierarchical topic model mining technology, in particular to a method for acquiring event clusters based on topic mining.

Background art

[0002] With the rapid development of science and technology, the way information is disseminated has changed dramatically. In particular, the popularization of Internet technology and the growing influence of the Internet have made online information the main means by which people obtain information. With more and more text on the Internet, mining useful event information from it has become a major challenge. This practical requirement calls for a technology that can extract events automatically, accurately and in real time. Existing machine learning techniques can solve the problem of mining events from text, and most of them use topic models to mine i...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/285, G06F16/35
Inventors: 靳晓明, 张宇婷
Owner: TSINGHUA UNIV