Method and device for detecting network hot topics based on maximal clique discovery

A technology for topic detection and network hotspots, applied in special data processing applications, instruments, electrical digital data processing, etc.

Inactive Publication Date: 2012-02-08
BEIJING UNIV OF POSTS & TELECOMM
0 Cites, 55 Cited by

AI Technical Summary

Problems solved by technology

Clustering methods do not perform well on this type of data.
[0006] The second type of topic detection method directly counts the occurrences of words or repeated strings and uses frequent word sets to express hot topics. This type of method is therefore insensitive to text length, but its accuracy needs improvement.
These methods are less effective for Internet information of varying lengths.



Examples


Embodiment Construction

[0084] Specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0085] Figure 1 is a flowchart of an embodiment of the present invention, comprising the following steps:

[0086] Step S1: Data collection. Data is collected in real time from online news websites, forums, blogs, and microblogs.

[0087] Step S2: Building the hot word pair set. The collected data is processed to build the set of hot word pairs.
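
The text available here does not spell out how hot word pairs are selected, so the following is only a minimal sketch under stated assumptions: whitespace splitting stands in for real word segmentation, a word counts as "hot" if it appears in at least `min_word_count` documents, and a pair counts as "hot" if both words co-occur in at least `min_pair_count` documents. The function name and both thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_hot_word_pairs(documents, min_word_count=2, min_pair_count=2):
    """Return the set of word pairs that co-occur in enough documents.

    Assumption: document frequency thresholds decide what is "hot";
    the patent may use a different criterion.
    """
    # Whitespace tokenisation is a stand-in for proper word segmentation.
    token_sets = [set(doc.lower().split()) for doc in documents]

    # Document frequency of every word.
    word_counts = Counter(w for tokens in token_sets for w in tokens)
    hot_words = {w for w, c in word_counts.items() if c >= min_word_count}

    # Count co-occurrences of hot words within the same document.
    pair_counts = Counter()
    for tokens in token_sets:
        candidates = sorted(tokens & hot_words)  # sorted so (a, b) is canonical
        pair_counts.update(combinations(candidates, 2))

    return {pair for pair, c in pair_counts.items() if c >= min_pair_count}


docs = [
    "earthquake relief fund",
    "earthquake relief volunteers",
    "football final tonight",
]
hot_pairs = build_hot_word_pairs(docs)
```

Sorting the candidate words before pairing keeps each pair in one canonical order, so the same two words are never counted as two different pairs.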

[0088] Step S3: Numbering the hot words. Each hot word is represented by a unique number.
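
One simple way to realise the numbering is a pair of lookup tables, sketched below. The function name `number_hot_words` and the choice of numbering words in sorted order are assumptions made for illustration; the source only requires that each hot word gets a unique number.

```python
def number_hot_words(hot_word_pairs):
    """Assign a unique integer to every hot word and rewrite pairs as edges.

    Returns (word_to_id, id_to_word, edges), where edges is a set of
    (id, id) tuples. Sorted order is an arbitrary but reproducible choice.
    """
    words = sorted({w for pair in hot_word_pairs for w in pair})
    word_to_id = {w: i for i, w in enumerate(words)}
    id_to_word = {i: w for w, i in word_to_id.items()}
    edges = {(word_to_id[a], word_to_id[b]) for a, b in hot_word_pairs}
    return word_to_id, id_to_word, edges


word_to_id, id_to_word, edges = number_hot_words(
    {("earthquake", "relief"), ("relief", "fund")}
)
```

Keeping the reverse table `id_to_word` makes it cheap to turn numbered vertices back into words later.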

[0089] Step S4: Maximal clique mining. The set of hot word pairs is treated as an undirected graph in which each vertex is the number of the corresponding hot word and each hot word pair is an edge. The graph is mined to obtain all maximal cliques.
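
The text shown here does not name the mining algorithm. As an illustration only, the sketch below enumerates maximal cliques with the classic Bron-Kerbosch algorithm with pivoting, which is a swapped-in standard technique and not necessarily the one the patent uses. The function name `maximal_cliques` and the edge-list input format are assumptions.

```python
from collections import defaultdict


def maximal_cliques(edges):
    """Enumerate all maximal cliques of an undirected graph.

    edges: iterable of (u, v) vertex-number pairs.
    Returns a sorted list of cliques, each a sorted list of vertices.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices.
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended: it is maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    if adj:
        expand(set(), set(adj), set())
    return sorted(cliques)


# A triangle 0-1-2 with a pendant edge 2-3.
found = maximal_cliques([(0, 1), (0, 2), (1, 2), (2, 3)])
```

Pivoting skips candidates adjacent to the pivot, which avoids re-exploring branches that can only lead to cliques already reachable through the pivot.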

[0090] Step S5: Hot topic representation. The vertex numbers of each maximal clique are replaced by the corresponding hot words, and each maximal clique is transformed into a wo...
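
Mapping vertex numbers back to hot words can be sketched as a simple lookup. The function name `cliques_to_topics` and the `min_size` filter are assumptions added for illustration; the source does not state whether small cliques are discarded.

```python
def cliques_to_topics(cliques, id_to_word, min_size=2):
    """Turn each maximal clique of vertex numbers into a word combination.

    min_size is a hypothetical filter that drops cliques too small to
    describe a topic; set it to 1 to keep every clique.
    """
    topics = []
    for clique in cliques:
        if len(clique) >= min_size:
            topics.append([id_to_word[v] for v in sorted(clique)])
    return topics


id_to_word = {0: "earthquake", 1: "relief", 2: "fund", 3: "football"}
topics = cliques_to_topics([[0, 1, 2], [3]], id_to_word)
```

Each resulting word list stands for one candidate hot topic.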



Abstract

The embodiment of the invention discloses a method and device for detecting network hot topics based on maximal clique discovery. The method comprises the following steps: collecting data from online news websites, forums, blogs, and microblogs in real time; performing word segmentation, word frequency statistics, and other processing on the collected data to find all hot word pairs and build a hot word pair set; representing each hot word by a unique number; treating the hot word pair set as an undirected graph and mining the graph to obtain all maximal cliques; and transforming each maximal clique into a word combination that expresses one hot topic. The invention also discloses a device for detecting network hot topics. According to the embodiment of the invention, hot topics on the network can be found accurately and in real time, the speed and precision of hot topic detection are improved, and the method has high practical value.

Description

Technical field

[0001] The invention relates to network information analysis and data mining technology in the field of text information processing, in particular to a hot topic detection method and device based on maximal clique discovery.

Background technique

[0002] The Internet has gradually become the main place for the generation and dissemination of public opinion, and many people actively express their views and opinions on the Internet. Because the network itself has the characteristics of virtuality, concealment, permeability and randomness, the social influence of network public opinion is increasing, and it may even affect major national decisions. Therefore, the governments and militaries of various countries pay close attention to the research of Internet public opinion in order to respond to hot spots, focal points and sensitive topics in a timely manner.

[0003] Network hot topic discovery is the primary problem that needs to be solved in network public op...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventors: 肖波, 蔺志青, 郭军
Owner: BEIJING UNIV OF POSTS & TELECOMM