Network information monitoring and analyzing system

A network information monitoring and analysis system, applied in the field of network informatization, that addresses the following problems: the timeliness and authority of information cannot be guaranteed, retrieval speed and efficiency are low, and results contain a large amount of "garbage" information.

Status: Inactive
Publication Date: 2012-12-26
ZHANGJIAGANG KAINA INFORMATION TECH
Cites: 0, Cited by: 7

AI Technical Summary

Problems solved by technology

Existing search engines have improved the efficiency and speed of search to some extent, but significant limitations remain, most prominently in three respects. First, because they rely on full-text or keyword search, their literal-matching retrieval mechanism produces results that deviate from what the user actually needs: retrieval returns too little "useful" information and too much "garbage" information, a problem known as "Rich Data, Poor Information". Second, web search engines must cover a wide range of knowledge fields and lack sufficient background knowledge of any particular field, so they return a large number of irrelevant web pages and few highly relevant ones. Finally, retrieval speed and efficiency are too low, and the timeliness and authority of the information cannot be guaranteed. These shortcomings are serious, even fatal, weaknesses for information collection.

Examples

Embodiment 1

[0028] Embodiment 1: Network information collection and analysis for the automobile industry

[0029] (1) The "network information collection subsystem" collects the automobile channel URL, anchor text, and web pages of automobile industry websites or portal websites.

[0030] (2) Clean the collected web pages. First, eliminate the interference of noise content and take the main content of each page as the processing object, which improves the accuracy of the processing results; second, simplify the tag structure of the pages and reduce their size, which saves time and space in subsequent processing.

[0031] (3) The "Intelligent analysis and pre-categorization subsystem" classifies the web pages collected in the system and filters out useless information according to a threshold.

[0032] (4) "Automatic Summary and Retrieval Subsystem" completes the functions of in-station retrieval and autom...
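Steps (2) and (3) above can be illustrated with a minimal Python sketch. It is not the patented implementation: the set of "noise" tags, the keyword-ratio relevance score, and the names clean_page, relevance, and filter_pages are all assumptions made for illustration, since the source does not specify the cleaning rules or the classifier.

```python
from html.parser import HTMLParser

# Tags whose content is treated as noise. This list is an assumption of the
# sketch; the source does not say which parts of a page count as noise.
NOISE_TAGS = {"script", "style", "nav", "footer", "aside"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not nested inside a noise tag."""

    def __init__(self):
        super().__init__()
        self._noise_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self._noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self._noise_depth > 0:
            self._noise_depth -= 1

    def handle_data(self, data):
        if self._noise_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def clean_page(html):
    """Step (2): drop noise content and markup, keep the main text."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def relevance(text, topic_terms):
    """Hypothetical score: fraction of words that are topic terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.strip(".,;:!?") in topic_terms)
    return hits / len(words)


def filter_pages(pages, topic_terms, threshold):
    """Step (3): keep only pages whose score reaches the threshold."""
    kept = []
    for url, html in pages:
        text = clean_page(html)
        if relevance(text, topic_terms) >= threshold:
            kept.append((url, text))
    return kept
```

Calling filter_pages(pages, {"sedan", "engine"}, 0.3) on a list of (url, html) pairs would return the cleaned text of the pages that pass the threshold. A real classifier would likely use a trained model rather than a keyword ratio; the ratio only stands in for "some score compared against a threshold".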

Embodiment 2

[0033] Embodiment 2: Implementation process of the network information collection subsystem

[0034] To collect network information automatically, the entire processing flow of the network information collection subsystem is divided into four steps: initial URL selection, web page collection, web page preprocessing, and data storage. The main workflow is as follows: the Spider first collects web pages from the Web according to the selected initial URLs and the theme definition, then preprocesses the collected pages, and finally sends the results to the specified database for storage.
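The four-step flow described above can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the function names, the injected fetch callable, the whitespace-only preprocessing, and the SQLite table layout are hypothetical and do not come from the source, and the theme definition is left out for brevity.

```python
import sqlite3


def select_seeds(candidates):
    """Step 1: initial URL selection (here: de-duplicate a hand-written list)."""
    return list(dict.fromkeys(candidates))


def collect(urls, fetch):
    """Step 2: web page collection; pages that cannot be fetched are skipped."""
    pages = []
    for url in urls:
        try:
            pages.append((url, fetch(url)))
        except OSError:
            continue
    return pages


def preprocess(pages):
    """Step 3: preprocessing (here: only whitespace normalisation)."""
    return [(url, " ".join(body.split())) for url, body in pages]


def store(conn, pages):
    """Step 4: data storage in a hypothetical 'pages' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO pages VALUES (?, ?)", pages)
    conn.commit()


def run_pipeline(candidates, fetch, conn):
    """Runs the four steps in order and returns the number of stored pages."""
    seeds = select_seeds(candidates)
    pages = preprocess(collect(seeds, fetch))
    store(conn, pages)
    return len(pages)
```

Passing fetch in as a parameter keeps the sketch independent of any particular HTTP client; in practice it could wrap urllib.request.urlopen, whose URLError is a subclass of OSError and would therefore be skipped by collect.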

[0035] (1) Selection of initial URL

[0036] A general web page collection system starts from a set of seed URLs and expands to the required pages on the Web through the Web protocol. The information collection system needs to select high-quality theme URLs as the initial seed URLs. In this embodiment, the seed URL set is manually defined, and the...
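The general idea of starting from seed URLs and expanding along hyperlinks can be sketched as a breadth-first crawl. The code below is a generic illustration, not the embodiment's Spider: LinkExtractor, crawl, the injected fetch callable, and the max_pages limit are assumptions, and no topic filtering is applied.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects (href, anchor text) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first expansion from the seed URLs; returns {url: html}."""
    frontier = deque(seeds)
    seen = set(seeds)
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        try:
            html = fetch(url)
        except OSError:
            continue  # unreachable page: skip it and keep crawling
        pages[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        for href, _anchor in extractor.links:
            # Resolve relative links and drop #fragments before de-duplicating.
            target = urldefrag(urljoin(url, href)).url
            if target.startswith(("http://", "https://")) and target not in seen:
                seen.add(target)
                frontier.append(target)
    return pages
```

The anchor text is collected alongside each link because the collection subsystem is described as gathering URLs, anchor texts, and web pages; the sketch ignores it during expansion, but a topic-focused crawler could use it to decide which links are worth following.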

Abstract

The invention relates to a network information monitoring and analyzing system and belongs to the field of network informatization. Based on the characteristics of HTML (HyperText Markup Language) web pages, the most widely used form of web information, and on in-depth analysis of network information processing technologies such as information collection, preprocessing, and automatic classification, the invention designs and develops a domain-oriented network information monitoring and analyzing system that addresses the defects of current information collection technology. The system automatically collects useful, domain-specific information in real time from multiple portal websites and specialized websites over the Internet. The working process is as follows: (1) a network information collecting subsystem collects URLs, anchor texts, and web pages, and cleans the collected web pages; (2) an intelligent analyzing and sorting subsystem classifies the web pages in the system and filters out garbage information according to thresholds; and (3) an automatic summarizing and retrieving subsystem performs in-site search and automatic report generation.

Description

Technical field

[0001] The invention relates to a network information monitoring and analysis system, which belongs to the field of network informatization.

Background technique

[0002] Since its birth, the Internet has developed into a huge global information warehouse with nearly 100 million users and hundreds of millions of pages, and its information capacity is still growing exponentially. Obtaining information from the Internet has become the main way for individuals to acquire knowledge and an important way for enterprises to obtain intelligence. However, in the face of this vast amount of network information, traditional methods such as manual collection and processing are no longer adequate.

[0003] To this end, a great deal of research has been done on information search at home and abroad, and a variety of search engines have been developed, such as Baidu, Google, Yahoo, and Lycos. To some ex...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 庞兵
Owner: ZHANGJIAGANG KAINA INFORMATION TECH