A method for online news topic detection

A topic detection technology for news, applicable to network data retrieval, other database retrieval, unstructured text data retrieval, etc. It addresses problems such as topic drift and its negative impact on clustering results, and improves the quality and accuracy of topic detection.
CN104715014B · Active · Publication Date: 2017-10-10 · SUN YAT SEN UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: SUN YAT SEN UNIV
Publication Date: 2017-10-10


Abstract

The invention discloses an online news topic detection method and belongs to the field of computer science and technology. It proposes a more efficient topic detection method for web texts on the internet that require topic detection. A cluster buffer is established, and the texts that arrive up to a certain number or within a certain period are first clustered with the X-means algorithm. A dual-threshold idea (a topic aggregation threshold and a topic centroid update threshold) is introduced to effectively control topic drift and improve the clustering result. The method outperforms the classic Single-Pass algorithm on all evaluation metrics and identifies the topics to be detected more accurately.
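The two thresholds named in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: it assumes that a text joins its most similar topic when the cosine similarity reaches a lower threshold, and that the topic centroid is only recomputed when the similarity also reaches a higher threshold, so that loosely related texts cannot pull the centroid away. The names `Topic`, `assign`, `join_t` and `update_t`, the threshold values, and the term-weight dictionaries are all hypothetical. The cluster buffer and the X-means initial clustering are not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity of two sparse term-weight vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class Topic:
    """Hypothetical topic record: a centroid plus two document counters."""

    def __init__(self, vec):
        self.centroid = dict(vec)
        self.n_core = 1    # documents that contributed to the centroid
        self.members = 1   # all documents assigned to the topic

    def absorb(self, vec, update_centroid):
        self.members += 1
        if update_centroid:
            # Running mean over the documents that shaped the centroid.
            n = self.n_core
            terms = set(self.centroid) | set(vec)
            self.centroid = {
                t: (self.centroid.get(t, 0.0) * n + vec.get(t, 0.0)) / (n + 1)
                for t in terms
            }
            self.n_core = n + 1


def assign(topics, vec, join_t=0.3, update_t=0.6):
    """Assign one text vector to a topic and return the topic index.

    Assumed rule (not taken from the patent text):
      sim <  join_t             -> start a new topic
      join_t <= sim < update_t  -> join the topic, leave its centroid unchanged
      sim >= update_t           -> join the topic and update its centroid
    """
    best, best_sim = None, 0.0
    for topic in topics:
        s = cosine(topic.centroid, vec)
        if s > best_sim:
            best, best_sim = topic, s
    if best is None or best_sim < join_t:
        topics.append(Topic(vec))
        return len(topics) - 1
    best.absorb(vec, update_centroid=best_sim >= update_t)
    return topics.index(best)


if __name__ == "__main__":
    topics = []
    stream = [
        {"quake": 1.0, "rescue": 1.0},
        {"quake": 1.0, "rescue": 1.0, "aid": 0.2},
        {"quake": 1.0, "market": 1.0, "stocks": 1.0},
        {"election": 1.0, "vote": 1.0},
    ]
    for doc in stream:
        print(assign(topics, doc), len(topics))
```

In this sketch the third text is similar enough to join the earthquake topic but not similar enough to change its centroid, which is one plausible way a second threshold could limit drift. A single-threshold scheme would update the centroid on every assignment.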

Description

Technical field

[0001] The invention relates to the field of computer science and technology, and more specifically to a method for online topic detection in network news.

Background art

[0002] Topic Detection (TD) is one of the five basic research tasks in Topic Detection and Tracking (TDT); its main purpose is to detect and organize topics that are not known to the system in advance. TDT is a project funded by the US Defense Advanced Research Projects Agency (DARPA), with joint participation from the University of Massachusetts, Carnegie Mellon University and Dragon Systems. The project aims to automatically analyze continuous streams of news media information, detect the topics in them, and track the detected topics. Research on topic detection has been carried out against the background of the TDT project. For the topic detection task, the Single-Pass algorithm is widely used. Single-Pass is an...
