Method and system for object-level knowledge mining

An object-level knowledge mining technology, applied in the field of computer networks, that addresses problems such as search results not forming useful knowledge

Active Publication Date: 2008-07-30
上海估家网络科技有限公司 +1

AI Technical Summary

Problems solved by technology

[0003] Current search engines perform only full-text search: they return a large amount of information but do not form it into useful knowledge, so users must rely on their own background knowledge to work out which search results are useful.

Examples

Embodiment Construction

[0019] As shown in FIG. 1, an object-level knowledge mining system includes a data collection module 100 for collecting information from the Internet, which includes a WEB grabber 110, a data adapter 120, and a data converter 130.

[0020] The WEB grabber 110 fetches the required web pages from a predefined URL list and then extracts the relevant information from those pages to form an object.

[0021] A general-purpose web crawler usually lets the user define only the list of pages to crawl, and then captures the raw source of the listed pages directly. Common web crawlers therefore have the following two problems: 1. The information needed by the user may be spread across multiple related web pages; 2. They cannot extract the relevant information from a web page to form the object content the user needs while discarding irrelevant information.
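
The two problems described above can be illustrated with a minimal sketch. It is not the patented implementation: the `data-field` attribute used to mark relevant content, the names `FieldExtractor` and `grab_object`, and the in-memory `PAGES` dictionary standing in for the web are all assumptions made for illustration. The sketch walks a predefined URL list, keeps only the marked fields from each page, and merges fields found on several related pages into a single object.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collects text inside elements that carry a (hypothetical) data-field attribute.

    Assumes flat markup: a marked element contains text only, no nested tags.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("data-field")
        if name:
            self._current = name
            self.fields[name] = ""

    def handle_data(self, data):
        # Text outside a marked element is treated as irrelevant and dropped.
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def grab_object(url_list, fetch):
    """Fetch every URL in the predefined list and merge the extracted fields."""
    obj = {}
    for url in url_list:
        parser = FieldExtractor()
        parser.feed(fetch(url))
        for name, text in parser.fields.items():
            obj.setdefault(name, text.strip())  # first page to supply a field wins
    return obj


# In-memory stand-in for two related web pages (hypothetical URLs and markup).
PAGES = {
    "http://example.com/item/1": '<h1 data-field="name">Sample item</h1><p>Advertisement</p>',
    "http://example.com/item/1/details": '<span data-field="price">120</span><div>Site footer</div>',
}

listing = grab_object(list(PAGES), PAGES.__getitem__)
print(listing)
```

Passing the fetch function as a parameter keeps the sketch runnable offline; a real crawler would replace `PAGES.__getitem__` with an HTTP client.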

[0022] The WEB grabber 110 according to the present invention classifies web pages...

Abstract

The invention discloses an object-level information mining system comprising a data collection module, a data cleaning module, a content preprocessing module and an object correlation search module. The data collection module, which collects data, comprises a WEB grabber. The data cleaning module, which processes structured data, comprises a data verification module and a deduplication module. The content preprocessing module, which preprocesses unstructured data, comprises a metadata management module and a content analyzer. The object correlation search module, which analyzes the degree of correlation of the content processed by the content preprocessing module, comprises a correlation degree analyzer. The invention also discloses an object-level information mining method comprising the following steps: information is collected from web pages; the collected structured data is cleaned; the collected unstructured data is preprocessed; and an object correlation search is carried out on the preprocessed content.
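
As a rough illustration of two of the steps named in the abstract, the sketch below shows verification and duplicate removal for structured records, followed by a correlation score between two pieces of text. The abstract does not say how validity, duplicates, or the degree of correlation are determined, so everything here is an assumption: the field names `name` and `price`, the function names, and in particular the use of Jaccard similarity over word sets as a stand-in correlation measure.

```python
def is_valid(record, required=("name", "price")):
    """Verification step (assumed rule): every required field must be non-empty."""
    return all(str(record.get(key, "")).strip() for key in required)


def remove_duplicates(records, key_fields=("name", "price")):
    """Keep the first record for each normalized key; drop later repeats."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(str(record[f]).strip().lower() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def clean(records):
    """Data cleaning for structured records: verify first, then deduplicate."""
    return remove_duplicates([r for r in records if is_valid(r)])


def correlation_degree(text_a, text_b):
    """Jaccard similarity of word sets; an assumed stand-in, not the patent's measure."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


records = [
    {"name": "Item A", "price": "100"},
    {"name": "item a ", "price": "100"},  # duplicate after normalization
    {"name": "", "price": "90"},          # fails verification
    {"name": "Item B", "price": "120"},
]
cleaned = clean(records)
score = correlation_degree("red wool scarf with fringe", "blue wool scarf with pocket")
print(cleaned)
print(round(score, 3))
```

Verification runs before deduplication so that records with missing key fields never reach the key-building step.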

Description

Technical field

[0001] The invention relates to computer network technology, in particular to a method and system for object-level knowledge mining based on Internet information.

Background art

[0002] With the development of the Internet, information of all kinds is growing explosively, and obtaining useful information by manual means is becoming very difficult. How to extract the required content from this mass of information and present it as useful knowledge is therefore becoming a key issue if users are not to be submerged in the information explosion.

[0003] Current search engines perform only full-text search: they return a large amount of information but do not form it into useful knowledge, so users must rely on their own background knowledge to work out which search results are useful. In order to have a deeper understanding of the relevance of a certain piece of information, users also need to analyze it them...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 张效海, 虞继恩
Owner: 上海估家网络科技有限公司