Method and system of automatically discovering hot news theme on the internet

An automatic discovery technology for the Internet, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses problems such as damage to corporate image and threats to social stability and unity.

Inactive Publication Date: 2012-09-12
上海引跑信息科技有限公司

AI Technical Summary

Problems solved by technology

In the past, Internet hot spots were discovered manually, which often lagged behind events. Public-opinion hot spots could not be detected promptly, so countermeasures could not be taken in time, and the development of an event could not be controlled or correctly guided.


Embodiment Construction

[0006] The technical solution of the present invention is described in further detail below with reference to the accompanying Figure 1.

[0007] Fig. 1 is a block diagram of the modules involved in the method for automatically discovering Internet hot news topics. It comprises three parts: a data preprocessor, a cluster analyzer, and an automatic category parser. The data preprocessor has two parts: news content collection and word frequency matrix generation. News content collection fetches news webpages from the Internet and extracts their text. Word frequency matrix generation produces a word frequency vector for each text; together, all word frequency vectors form the word frequency matrix. The cluster analyzer, one of the core parts of the method, groups the articles into categories. The automatic category parser automatically interprets each category obtained by clustering.
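
The word frequency matrix generation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `tokenize` is a hypothetical stand-in for a real word segmenter (it only splits out lowercase alphanumeric runs, so it would not segment Chinese text), and all function and variable names are invented for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical stand-in for a real word segmenter (assumption):
    # returns lowercase alphanumeric tokens only.
    return re.findall(r"[a-z0-9]+", text.lower())


def word_frequency_matrix(documents):
    """Return (vocabulary, matrix), where matrix[i][k] is the number of
    times vocabulary[k] occurs in documents[i]."""
    # One Counter per document acts as that document's sparse
    # word frequency vector.
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The shared vocabulary fixes the column order for every row.
    vocabulary = sorted(set().union(*counts))
    # Counter returns 0 for words a document does not contain.
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix


if __name__ == "__main__":
    vocab, matrix = word_frequency_matrix(
        ["Flood hits city", "City flood warning issued"]
    )
    print(vocab)
    for row in matrix:
        print(row)
```

Each row of the result is one text's word frequency vector, and stacking the rows gives the word frequency matrix that the cluster analyzer consumes.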

[0008] The main steps of this web page content ana...


Abstract

The invention provides a method of automatically discovering hot news themes (or hot events) on the Internet. It automatically discovers news events on the Internet and gives a brief automatic explanation of each. The method comprises the following steps: downloading webpages of recent news, blogs, microblogs and the like from the Internet; extracting the titles and texts from them; performing word segmentation on the extracted titles and texts to obtain a word frequency vector for each text, and combining the vectors of all texts into a word frequency matrix; performing cluster analysis on the word frequency matrix with a clustering algorithm, so that texts with the same theme are gathered together and a cluster is obtained for each theme; and extracting the title of the central text and the keywords of each cluster to explain the cluster, i.e., to explain the news theme.
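
The clustering and explanation steps can be sketched as below. The source does not name a specific clustering algorithm, so this example swaps in a simple single-pass clustering with cosine similarity purely for illustration. The `threshold` value, the choice of "central text" as the member closest to the cluster centroid, and every function name are assumptions.

```python
import math


def cosine(a, b):
    # Cosine similarity between two word frequency vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(rows):
    # Column-wise mean of the given vectors.
    return [sum(col) / len(rows) for col in zip(*rows)]


def cluster(matrix, threshold=0.5):
    """Single-pass clustering (illustrative choice): each text joins the
    most similar existing cluster if similarity >= threshold, otherwise
    it starts a new cluster. Returns lists of row indices."""
    clusters = []
    for i, row in enumerate(matrix):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(row, centroid([matrix[j] for j in members]))
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


def explain(members, matrix, vocabulary, titles, top_k=3):
    """Describe a cluster by the title of its central text and its
    most frequent words."""
    center = centroid([matrix[j] for j in members])
    # Central text: the member most similar to the centroid (assumption).
    central = max(members, key=lambda j: cosine(matrix[j], center))
    # Keywords: highest average frequency first, ties broken alphabetically.
    ranked = sorted(range(len(vocabulary)),
                    key=lambda k: (-center[k], vocabulary[k]))
    keywords = [vocabulary[k] for k in ranked[:top_k] if center[k] > 0]
    return titles[central], keywords


if __name__ == "__main__":
    vocabulary = ["city", "flood", "match", "team", "warning"]
    titles = ["Flood hits city", "City flood warning", "Team wins match"]
    matrix = [[1, 1, 0, 0, 0], [1, 1, 0, 0, 1], [0, 0, 1, 1, 0]]
    for members in cluster(matrix):
        print(explain(members, matrix, vocabulary, titles))
```

Single-pass clustering is used here only because it needs no preset number of themes; any clustering algorithm that operates on the rows of the word frequency matrix could take its place.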

Description

Technical field

[0001] The invention relates to the automatic discovery of hot news in Internet public opinion analysis.

Background art

[0002] With the rapid development of the Internet, online media has gained great influence in guiding public opinion and shaping audiences. Public-opinion hot spots formed on the Internet frequently become social hot spots and can even cause major public opinion crises. In the past, Internet hot spots were discovered manually, which often lagged behind events: hot spots could not be detected promptly, so countermeasures could not be taken in time, and the development of an event could not be controlled or correctly guided. If the situation deteriorates further, it can affect the stability and unity of society as a whole and, for an enterprise, damage its image.

Contents of the invention

[0003] The purpose of the present...


Application Information

IPC(8): G06F17/30, G06F17/27
Inventor: name not disclosed (不公告发明人)
Owner: 上海引跑信息科技有限公司