Correlation mining method for Internet hot topics

A correlation mining method in the field of Internet technology, applicable to special data processing applications, instruments, and electrical digital data processing. It addresses the improper handling of infrequent keywords and the low performance of earlier approaches, and aims to meet performance requirements.

Inactive Publication Date: 2008-04-09
ZHEJIANG UNIV
0 Cites, 22 Cited by

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to provide a correlation mining method for Internet hot topics, which uses conditional probability to overcome



Examples


Detailed Description of the Embodiments

[0025] Figure 1 shows the system framework for Internet hot topic correlation mining. First, hot topic keywords are extracted from popular queries to form a hot topic keyword dictionary. The data source is then scanned: for each record, the hot topic keyword pairs it contains are identified, the corresponding frequencies in the sparse matrix are updated, and the frequency of each hot topic keyword is updated at the same time. Next, a correlation score is calculated for each pair of hot topic keywords, and the pairs are sorted by score. When a user issues a query, the hot topic keywords related to the queried keyword are returned in descending order of score.
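The pipeline in [0025] can be illustrated with a minimal sketch. Everything named here is an assumption for illustration rather than the patented implementation: the function names `scan_records` and `related_keywords` are hypothetical, a keyword is treated as present when it occurs as a substring of a record, a Python dictionary keyed by keyword pair stands in for the hash-based sparse matrix, and the score is taken to be the estimated conditional probability P(other | query) = count(query, other) / count(query), since the source mentions conditional probability without giving the exact formula.

```python
from collections import defaultdict
from itertools import combinations


def scan_records(records, dictionary):
    """Single pass over the data source, counting keywords and keyword pairs."""
    keyword_freq = defaultdict(int)  # keyword -> number of records containing it
    pair_freq = defaultdict(int)     # (kw_a, kw_b), kw_a < kw_b -> co-occurrence count
    for record in records:
        # Assumption: a keyword "occurs" if it is a substring of the record.
        found = sorted(kw for kw in dictionary if kw in record)
        for kw in found:
            keyword_freq[kw] += 1
        # Only pairs that actually co-occur get an entry, so the table stays sparse.
        for a, b in combinations(found, 2):
            pair_freq[(a, b)] += 1
    return keyword_freq, pair_freq


def related_keywords(query, keyword_freq, pair_freq):
    """Rank keywords by the estimated P(other | query), highest first."""
    base = keyword_freq.get(query, 0)
    if base == 0:
        return []
    scores = {}
    for (a, b), count in pair_freq.items():
        if a == query:
            scores[b] = count / base
        elif b == query:
            scores[a] = count / base
    # Sort by descending score; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Calling `scan_records` once on a list of blog titles and then `related_keywords("Andy Lau", ...)` returns the co-occurring keywords in descending order of score. A production system would likely index `pair_freq` by keyword so that a query does not have to walk every pair; the linear walk is kept here for brevity.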

[0026] The specific implementation process is shown in Figure 2, and the important steps are:

[0027] 1. Load the original hot topic keyword dictionary. The content of the dictionary should be the hot topic keywords that most users care about. Load the original hot topic keyword dictionary file, use the Se...


Abstract

The invention discloses a method for mining the correlation of Internet hot topics. Hot topic keywords are extracted from search engine query logs, and the degree of correlation among them is modeled and analyzed. An efficient hashing method is used to build a sparse matrix, which improves the efficiency of the algorithm. The method can incrementally process newly added hot topic keywords and data, and lends itself to distributed processing. The algorithm scans the data source only once, updating the corresponding region of the sparse matrix, and finally sorts the results to obtain a ranking of the correlation among hot topic keywords. The invention can extract the correlations among hot topics accurately and quickly, overcoming the low performance of prior algorithms, in particular their poor handling of newly added hot topic keywords, and thereby better meets the performance requirements of large-scale hot topic recommendation on the Internet.
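One way to read the incremental and distributed claims is that the frequency tables are plain counts, so tables built from separate batches of records can simply be added together. The sketch below illustrates that reading only; the names `count_shard` and `merge_counts` are hypothetical, substring matching is an assumed way of detecting keywords, and the source does not describe how the patented method actually merges partial results or how it handles keywords added to the dictionary after a scan.

```python
from collections import Counter
from itertools import combinations


def count_shard(records, dictionary):
    """Count keyword and keyword-pair frequencies for one batch of records."""
    counts = Counter()
    for record in records:
        # Assumption: a keyword "occurs" if it is a substring of the record.
        found = sorted(kw for kw in dictionary if kw in record)
        counts.update((kw,) for kw in found)   # 1-tuples: single keywords
        counts.update(combinations(found, 2))  # 2-tuples: co-occurring pairs
    return counts


def merge_counts(*partials):
    """Add up partial count tables from several batches or worker nodes."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Under this reading, newly arrived data is handled by counting only the new batch and merging it into the existing table, and separate workers can each count their own slice of the data source before a final merge. The merged table equals the one obtained by scanning all records in a single pass.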

Description

Technical field

[0001] The invention belongs to the field of association rule mining, and in particular relates to a method for mining the correlation of Internet hot topics.

Background

[0002] With the growing popularity of the Internet, blogs are used by more and more people. As data volumes expand rapidly, correlating hot topics is a necessary and effective way to organize massive amounts of information. The purpose of hot topic correlation mining is to quickly and accurately extract hot topic keywords that are inherently related from massive data, and to recommend them when users search. For example, the system takes more than 30,000 popular keywords currently searched by users as the topics to be mined, and scans and analyzes users' blog titles as the data source to produce the result files. When a user then searches for Andy Lau, the system recommends Mo Gong, Movies, Fan Bingbing, Mozi and other keywor...


Application Information

IPC(8): G06F17/30
Inventors: 寿黎但, 陈刚, 胡天磊, 陈珂, 汪源
Owner: ZHEJIANG UNIV