Method and device for discovering hot words

Hot-word and word-frequency technology, applied in the field of computer clustering. It addresses the problems of increased clustering complexity, short duration, and missing time information, and thereby meets real-time requirements while reducing clustering complexity and processing time.

Active Publication Date: 2013-07-24
SHENZHEN TENCENT COMP SYST CO LTD

AI Technical Summary

Problems solved by technology

[0005] As can be seen from the above, the existing improved document-based method for mining hot words effectively mitigates the loss of event-related time information that results from representing documents statically. However, the words being clustered still include a large number of words unrelated to hot events, which increases the complexity of clustering. Furthermore, the words belonging to a hot event in a document must be identified manually, and use the existing clustering ...


Examples


Detailed Description of the Embodiments

[0078] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0079] Hot words are important indicators of hot social events within a given period of time. Therefore, in the embodiment of the present invention, a hot word library is preset, a corresponding hot word weight is set for each hot word in the library, and the library is dynamically maintained. Documents are represented by the hot words in the hot word library and are then clustered into document classes using the hot word mining method of the embodiment of the present invention. Within each document class, the hot words that describe the same hot event in the social network within a time period are aggregated and filtered, and finally the aggregated and fi...
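The hot word library and the weighted document representation described in paragraph [0079] can be illustrated with a minimal sketch. It is not the patented implementation: the library contents, the weight values, the function names `represent` and `update_weight`, and the choice of multiplying word frequency by hot word weight are all assumptions made for illustration, since the text only states that both quantities are used.

```python
from collections import Counter

# Hypothetical hot word library: hot word -> hot word weight.
# The words and weights are illustrative only.
HOT_WORD_WEIGHTS = {"earthquake": 2.0, "rescue": 1.5, "concert": 1.0}


def represent(document_tokens, weights):
    """Represent a tokenized document over the hot word library.

    Assumption: each component is (word frequency) * (hot word weight).
    Words that are not in the library are ignored, which keeps the
    vector as short as the library instead of the full vocabulary.
    """
    counts = Counter(document_tokens)
    return [counts[word] * weights[word] for word in sorted(weights)]


def update_weight(weights, word, new_weight):
    """Dynamic maintenance sketch: add a hot word or change its weight."""
    updated = dict(weights)
    updated[word] = new_weight
    return updated


tokens = ["earthquake", "rescue", "rescue", "weather"]
vector = represent(tokens, HOT_WORD_WEIGHTS)
```

Because the vector has one component per hot word in the library, its length does not grow with the number of distinct words in the documents, which is the source of the reduced clustering complexity the text refers to.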


Abstract

The invention discloses a method and a device for discovering hot words. The method comprises the following steps: presetting a hot word library and setting a corresponding hot word weight for each hot word in the library; representing each document by the hot words in the library, according to the frequency of each hot word in the document and the weight set for it in the library; clustering the documents so represented into a preset number of document classes; ranking the preset number of document classes by focus value and filtering out the document classes whose focus value is smaller than a preset focus threshold; and selecting hot words from the remaining document classes according to a preset hot word selection strategy. Applying the method and the device reduces clustering complexity and improves the efficiency of discovering hot topics in social networks.
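The clustering, focus filtering, and hot word selection steps of the abstract can be sketched as follows. This is a hedged illustration under several assumptions that the abstract does not specify: the clustering is a plain k-means with a deterministic start, the focus value of a class is taken to be the sum of all vector components in that class, and the selection strategy is "top N hot words by summed component". All function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Cluster vectors into k classes (assumed algorithm: plain k-means).

    Deterministic initialization: the first k vectors are the centroids.
    Returns one class index per input vector.
    """
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a class becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def focus_value(docs):
    """Assumed focus measure: total weighted hot word mass in the class."""
    return sum(sum(v) for v in docs)


def select_hot_words(docs, vocabulary, top_n):
    """Assumed strategy: the top_n hot words by summed component."""
    totals = [sum(col) for col in zip(*docs)]
    order = sorted(range(len(vocabulary)), key=lambda i: totals[i], reverse=True)
    return [vocabulary[i] for i in order[:top_n] if totals[i] > 0]


def discover_hot_words(vectors, vocabulary, k, focus_threshold, top_n):
    """Cluster, rank classes by focus value, filter, then select hot words."""
    classes = {}
    for vector, class_id in zip(vectors, kmeans(vectors, k)):
        classes.setdefault(class_id, []).append(vector)
    ranked = sorted(classes, key=lambda c: focus_value(classes[c]), reverse=True)
    return {
        c: select_hot_words(classes[c], vocabulary, top_n)
        for c in ranked
        if focus_value(classes[c]) >= focus_threshold
    }


vocabulary = ["concert", "earthquake", "rescue"]
vectors = [[0.0, 2.0, 3.0], [3.0, 0.0, 0.0], [0.0, 4.0, 1.5], [1.0, 0.0, 0.0]]
result = discover_hot_words(vectors, vocabulary, k=2, focus_threshold=5.0, top_n=2)
```

In this toy input, the two earthquake-related documents form one class with a focus value of 10.5 and the two concert-related documents form another with a focus value of 4.0, so only the first class passes the threshold of 5.0.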

Description

Technical field

[0001] The invention relates to computer clustering technology, and in particular to a method and device for mining hot words.

Background

[0002] With the development of computer communication technology, especially 3G networks and smart mobile terminals, users' online lives are becoming increasingly rich. Activities such as chatting on social networks, browsing news, watching movies, playing games, searching, shopping, and publishing information have increasingly become part of online life. How to enable users to effectively find valuable information in online communities has become an important research topic in the information field.

[0003] At present, hot words are mined from the massive network information in the various fields of a community using document-based methods, in which the documents in the network are represented as feature vectors composed of words by means of the vector space model (VSM). The ...


Application Information

IPC(8): G06F17/30
Inventor 邸楠
Owner SHENZHEN TENCENT COMP SYST CO LTD