Subject-oriented customized news information extraction system

An extraction system using subject-oriented technology, applied in special data processing applications, instruments, and electrical digital data processing. It supports category retrieval and search-query browsing and saves time.

Active Publication Date: 2012-12-19
JIANGSU R & D CENTER FOR INTERNET OF THINGS
Cites: 4; Cited by: 29

AI Technical Summary

Problems solved by technology

(1) At present, every piece of retrieved information is judged and collected manually, one item at a time, which is very inefficient.
[0006] (2) A single online news report can rarely explain an incident fully, and new developments and problems may emerge as the incident progresses.
[0007] (3) Search engines cannot guarantee the timeliness or authority of the information they retrieve, which is a serious, even fatal, weakness for intelligence extraction.

Examples

Embodiment Construction

[0022] The present invention is further described below with reference to the drawings and embodiments.

[0023] As shown in Figure 1, the topic-oriented customized news information extraction system of the present invention comprises three subsystems: a news collection subsystem 1, a text processing subsystem 2, and a human-computer interaction subsystem 3.

[0024] The news collection subsystem 1 searches for news related to the topics customized by the user and extracts the news text. It includes a focused crawler unit 102, a web page database 104, a text extraction unit 106, and a text library 108.
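The hand-off from the web page database 104 to the text library 108 through the text extraction unit 106 might look roughly like the sketch below. It is only an illustration: the names `TextExtractor`, `extract_text`, and `collect` are hypothetical, the two stores are modelled as plain dictionaries, and simple tag stripping stands in for whatever extraction method the patent actually uses, which this excerpt does not describe.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content.

    Hypothetical stand-in for the text extraction unit 106.
    """

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of one HTML page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def collect(web_page_db):
    """Map each stored page (url -> html) to its extracted text.

    The input dict plays the role of the web page database 104 and the
    returned dict the role of the text library 108 (both assumptions).
    """
    return {url: extract_text(html) for url, html in web_page_db.items()}
```

A real extractor would also need to separate the article body from navigation and advertising, which this sketch does not attempt.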

[0025] The focused crawler unit 102 receives the topic customized by the user and, following the crawling strategy, visits the pages, sites, or randomly generated URL list specified by the user one by one, judging the relevance of each crawled page to determine w...
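As a rough illustration of relevance-driven crawling, the sketch below visits URLs best-first and keeps only pages whose relevance reaches a threshold. Everything specific here is an assumption rather than the patent's method, since the paragraph above is truncated: the term-overlap `relevance` measure, the `threshold` value, the `focused_crawl` name, and the caller-supplied `fetch` function that stands in for real HTTP access.

```python
import heapq
import re


def relevance(text, topic_terms):
    """Fraction of topic terms that occur in the page text (assumed measure)."""
    if not topic_terms:
        return 0.0
    words = set(re.findall(r"\w+", text.lower()))
    hits = sum(1 for term in topic_terms if term.lower() in words)
    return hits / len(topic_terms)


def focused_crawl(seed_urls, fetch, topic_terms, threshold=0.5, max_pages=100):
    """Visit URLs best-first and keep pages that meet the relevance threshold.

    fetch(url) -> (text, outlinks) is a hypothetical stand-in for page download.
    Returns a dict mapping each kept URL to its relevance score.
    """
    # heapq is a min-heap, so priorities are negated to pop the best URL first.
    frontier = [(-1.0, url) for url in seed_urls]
    heapq.heapify(frontier)
    seen = set(seed_urls)
    kept = {}
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        visited += 1
        text, outlinks = fetch(url)
        score = relevance(text, topic_terms)
        if score < threshold:
            continue  # irrelevant page: neither stored nor expanded
        kept[url] = score
        for link in outlinks:
            if link not in seen:
                seen.add(link)
                # Links inherit the relevance of the page that cites them.
                heapq.heappush(frontier, (-score, link))
    return kept
```

Not expanding irrelevant pages keeps the crawl on topic at the cost of missing relevant pages that are reachable only through irrelevant ones; a production crawler would likely soften that rule.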

Abstract

The invention discloses a subject-oriented customized news information extraction system comprising a news collection subsystem, a text processing subsystem, and a human-computer interaction subsystem. The news collection subsystem searches for news related to the subjects customized by a user and extracts the news text. The text processing subsystem divides the texts into categories, detects and tracks topics in the text content on that basis, automatically generates abstracts, and builds the corresponding indexes. The human-computer interaction subsystem first analyzes the topics and calculates their hot degree, then presents the hot topics to the user grouped by subject and ordered by hot degree, while also providing topic retrieval; the user can manually screen the retrieved content and store the intelligence manually extracted from the screened information in an intelligence library. With the system, news on the Internet can be collected comprehensively and promptly and can be automatically detected, classified, and tracked, and the intelligence the user needs can be extracted from the vast volume of online news by drawing on the user's own ability to recognize intelligence.
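The grouping and ordering of hot topics described in the abstract can be sketched as follows. The abstract does not say how the hot degree is calculated, so the formula here is purely an assumption: each report contributes a weight that halves every `half_life_hours`, and the weights are summed. The names `hot_degree` and `rank_topics` are hypothetical.

```python
from collections import defaultdict


def hot_degree(report_ages_hours, half_life_hours=24.0):
    """Sum of exponentially decayed report weights (an assumed formula)."""
    return sum(0.5 ** (age / half_life_hours) for age in report_ages_hours)


def rank_topics(topics):
    """Group topics by subject and sort each group by descending hot degree.

    topics: iterable of (subject, topic_name, report_ages_hours) tuples.
    Returns a dict mapping subject -> list of (topic_name, hot_degree).
    """
    grouped = defaultdict(list)
    for subject, name, ages in topics:
        grouped[subject].append((name, hot_degree(ages)))
    return {
        subject: sorted(items, key=lambda item: item[1], reverse=True)
        for subject, items in grouped.items()
    }
```

For example, a topic with one report published now and one published 24 hours ago would score 1 + 0.5 = 1.5 under this assumed formula and would be listed ahead of a topic whose only report is 48 hours old (score 0.25).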

Description

Technical field

[0001] The invention relates to a news information extraction system, in particular to a topic-oriented, customized information extraction system that takes news as its object.

Background technique

[0002] Today, with the rapid development of the Internet, collecting intelligence in the fields of politics, military affairs, economy, and culture from public information release systems has become one of the important channels for obtaining intelligence. According to the definition used in information science, intelligence is information that is needed in real time within an effective period. At present, 90% of intelligence is obtained from public information release systems, and news of all kinds is undoubtedly the largest category of public information.

[0003] However, Internet news is a massive information source and an open, distributed information space. The following characteristics inherent in it have obviously ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 台宪青, 王艳军, 赵旦谱, 楚涌泉, 张伟娜
Owner: JIANGSU R & D CENTER FOR INTERNET OF THINGS