Online topic detection method and system for text stream

An online topic detection method and system, applied in the Internet field, that addresses the poor interpretability, high time and space complexity, and weak detection quality of earlier approaches, and that keeps memory usage low while meeting strict real-time requirements.

Inactive Publication Date: 2018-07-03
上海神计信息系统工程有限公司
5 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

Early research on online topic detection mainly focused on the selection and fusion of clustering methods, including unilateral clustering algorithms, agglomerative h



Embodiment Construction

[0041] The above-mentioned features and advantages of the present invention can be better understood after reading the detailed description of the embodiments of the present disclosure in conjunction with the following drawings. In the drawings, components are not necessarily drawn to scale, and components with similar related properties or characteristics may have the same or similar reference numerals.

[0042] Figure 1 shows a flowchart of an embodiment of the online topic detection method for text streams of the present invention. Referring to Figure 1, the implementation steps of the online topic detection method of this embodiment are described in detail below.

[0043] Step S1: Construct the ODT-LTF algorithm framework.

[0044] The entire OTD-LTF algorithm proceeds as follows: at the start, set up the global tensor array variable, read the corpus arriving at the current moment, and initialize all parameters, including the number of documents, the numbe...
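The idea of a global tensor array that grows as each batch of the stream arrives can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `TopicTensor`, the method names, and the use of cosine similarity between topic word distributions to fill each topic-topic slice are all hypothetical choices, since the excerpt above does not say how the tensor entries are computed.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

class TopicTensor:
    """Hypothetical container: one K x K topic-topic slice per time step."""

    def __init__(self):
        # Plays the role of the "global tensor array"; grows as the stream advances.
        self.slices = []

    def add_time_step(self, topic_word):
        """Append a slice built from the topics found at the current moment.

        topic_word: K rows, each row one topic's distribution over the vocabulary.
        Assumption: entry (i, j) is the cosine similarity of topics i and j.
        """
        k = len(topic_word)
        sim = [[cosine(topic_word[i], topic_word[j]) for j in range(k)]
               for i in range(k)]
        self.slices.append(sim)

    def shape(self):
        """Return (K, K, T): topics x topics x time steps seen so far."""
        k = len(self.slices[0]) if self.slices else 0
        return (k, k, len(self.slices))

tensor = TopicTensor()
tensor.add_time_step([[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]])  # batch at time 1
tensor.add_time_step([[0.6, 0.3, 0.1], [0.6, 0.3, 0.1]])  # batch at time 2
print(tensor.shape())  # (2, 2, 2)
```

Appending one slice per time step means earlier batches never need to be re-read, which is one plausible way to keep memory use low in a streaming setting.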


Abstract

The invention discloses an online topic detection method and system for a text stream that lets a user quickly find topics of interest in complex text. According to the technical scheme, the method comprises the steps of: building an ODT-LTF algorithm framework; extracting topics with an LDA Bayesian network structure model; inferring the latent parameters of the LDA Bayesian network structure model with a Gibbs sampling algorithm; building a third-order topic-topic-time tensor through an incremental construction method, so that the time dimension is fused into the topic tensor; decomposing the third-order topic tensor; and clustering similar topics to obtain the topics, a hierarchical structure over the topics, and the change of the topics over time, thereby completing online topic detection.
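As an illustration of the Gibbs sampling step named in the abstract, the following is a minimal sketch of a standard collapsed Gibbs sampler for LDA. It is a textbook formulation rather than the patent's own implementation; the function name `lda_gibbs`, the hyperparameter defaults, and the toy corpus are assumptions made for the example.

```python
import random

def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative; hyperparameters are assumed).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (theta, phi): document-topic and topic-word distributions.
    """
    rng = random.Random(seed)
    ndk = [[0] * num_topics for _ in docs]               # document-topic counts
    nkw = [[0] * vocab_size for _ in range(num_topics)]  # topic-word counts
    nk = [0] * num_topics                                # tokens per topic
    z = []                                               # topic of every token

    # Random initial topic assignment for each token.
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(num_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1
        z.append(zd)

    topics = range(num_topics)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][w] -= 1
                nk[k] -= 1
                # Conditional probability of each topic given all other assignments.
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + vocab_size * beta) for t in topics]
                k = rng.choices(topics, weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][w] += 1
                nk[k] += 1

    theta = [[(ndk[d][t] + alpha) / (len(doc) + num_topics * alpha) for t in topics]
             for d, doc in enumerate(docs)]
    phi = [[(nkw[t][w] + beta) / (nk[t] + vocab_size * beta) for w in range(vocab_size)]
           for t in topics]
    return theta, phi

# Toy corpus (hypothetical): word ids 0-1 and 2-3 tend to appear together.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
theta, phi = lda_gibbs(docs, num_topics=2, vocab_size=4)
```

The rows of `phi` are the per-topic word distributions; in a pipeline like the one the abstract describes, distributions of this kind would be the input from which the topic tensor is built.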

Description

Technical field

[0001] The invention relates to a text stream processing method in the technical field of the Internet, in particular to an online topic detection method for text streams.

Background

[0002] With the rapid development of computer and Internet technology, Web 2.0 applications represented by blogs, Wikipedia, and Twitter have become widespread, enabling people to upload user-generated data anytime and anywhere. People's ability to create data has grown far beyond their ability to absorb information, and data of all kinds has exploded. Information is an important means by which people understand, communicate about, and express their views on objective things. Its carriers include text, graphics, images, animation, audio, video, and so on. Among all data types, text is the most common. Knowledge dissemination and information exchange still use text as the main information medium, and it has the characteristics of small capacity...


Application Information

IPC(8): G06F17/27, G06N99/00
CPC: G06N20/00, G06F40/30
Inventors: 向阳, 涂笑, 陈千, 姚莉萍, 吕冬冬
Owner: 上海神计信息系统工程有限公司