Method and system of directionally acquiring Internet resources

An Internet resource acquisition technology, applied in the field of directional acquisition of Internet resources, that addresses problems such as disorganized resources, invalid webpage snapshots, and spam. It achieves localized permanent archiving, reduces spam and useless information, and avoids duplicated resources.

Inactive Publication Date: 2010-03-24
BEIJING LEISU TECH
0 Cites, 30 Cited by

AI Technical Summary

Problems solved by technology

[0009] The purpose of the present invention is to provide a method and system for directional acquisition of Internet resources, which can solve the problems of a ...



Examples


Embodiment

[0043] Figure 1 shows the flow chart of the method for directional acquisition of Internet resources in the present invention. The method includes the following steps:

[0044] s101. Determine the required basic information. This basic information includes the scope of the websites to be crawled, the resource information to be obtained, and the resource category to which it belongs. Generally, commonly used websites are chosen as the crawl targets from which information is downloaded. The resource information to be obtained refers to the type determined by the search; for example, if badminton information is wanted, the category it belongs to is sports;

[0045] s102. According to the resource category, obtain, through human-computer interaction, the effective webpages corresponding to that category on each crawled website. The effective webpages mentioned here are those that have a relatively high degree of correlation with the resource category to be obtained or a...
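As an illustration only, the information fixed in steps s101 and s102 could be held in a small record like the one below. This is a minimal sketch, not the patent's implementation: the class name `AcquisitionTask`, its fields, the `confirm_page` helper, and the example.com hosts are all assumptions introduced here. The human-computer interaction of s102 is modeled simply as an operator calling `confirm_page` for each page judged relevant.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class AcquisitionTask:
    """Basic information fixed in step s101 (all names here are hypothetical)."""
    crawl_sites: list[str]   # scope of the websites to crawl
    resource_info: str       # resource information to obtain, e.g. "badminton"
    category: str            # resource category it belongs to, e.g. "sports"
    # Effective pages per site, filled in during step s102.
    effective_pages: dict[str, list[str]] = field(default_factory=dict)

    def confirm_page(self, url: str) -> bool:
        """Record a page that a human operator judged relevant to the category.

        Pages whose host is outside the crawl scope are rejected (returns False).
        """
        host = urlparse(url).netloc
        if host not in self.crawl_sites:
            return False
        self.effective_pages.setdefault(host, []).append(url)
        return True


task = AcquisitionTask(
    crawl_sites=["news.example.com", "sports.example.org"],
    resource_info="badminton",
    category="sports",
)
task.confirm_page("http://sports.example.org/badminton/index.html")
task.confirm_page("http://unrelated.example.net/ads.html")  # outside scope, rejected
```

Keeping the crawl scope and the operator-confirmed pages in one record makes it straightforward for later steps to work only from pages that were explicitly approved.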


Abstract

The invention relates to a method of directionally acquiring Internet resources, comprising the following steps: determining the scope of the websites from which webpages are to be captured, the information of the resources to be acquired, and the types of the resources; acquiring, through human-computer interaction, the effective webpages which correspond to the types of the resources on each website; generating configuration information for the resources to be acquired according to the uniform resource locators of the websites and the effective webpages, the webpage structures, and the information of the resources to be acquired; capturing the text information on the websites which matches the configuration information, and saving the text information; deeply indexing the captured text information through human-computer interaction; and creating indexes for the deeply indexed text information so as to facilitate retrieval of the text information by the user. The system of directionally acquiring Internet resources comprises an Internet resource directional acquisition unit and a text information deep indexing unit. By having the search engine acquire Internet resources directionally, the invention solves the problems caused by the commonly used method of acquiring Internet resources with a search engine: the generation of a large amount of junk information, the duplication of resources, the disorganization of resources, and the invalidation of webpage snapshots.
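The capture and index steps of the abstract can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not the patented method: the configuration keys `url_prefix` and `content_pattern`, the functions `capture` and `build_index`, and the in-memory `PAGES` dictionary (a stand-in for downloaded webpages) are all hypothetical, and the human-assisted deep indexing step is left out. A regular expression stands in for whatever description of webpage structure the real configuration would carry.

```python
import re
from collections import defaultdict

# Hypothetical configuration: which URLs to capture and where the wanted
# text sits in the page structure.
CONFIG = {
    "url_prefix": "http://sports.example.org/badminton/",
    "content_pattern": re.compile(r'<div class="article">(.*?)</div>', re.S),
}

# Stand-in for pages already downloaded from the crawled websites.
PAGES = {
    "http://sports.example.org/badminton/1.html":
        '<html><div class="nav">ads</div>'
        '<div class="article">Badminton final tonight</div></html>',
    "http://sports.example.org/football/2.html":
        '<html><div class="article">Football cup draw</div></html>',
}


def capture(pages, config):
    """Save only the text that matches the configuration; skip everything else."""
    saved = {}
    for url, html in pages.items():
        if not url.startswith(config["url_prefix"]):
            continue  # page is outside the configured scope
        match = config["content_pattern"].search(html)
        if match:
            saved[url] = match.group(1).strip()
    return saved


def build_index(saved):
    """Build a simple inverted index: word -> set of URLs containing it."""
    index = defaultdict(set)
    for url, text in saved.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    return index


saved = capture(PAGES, CONFIG)
index = build_index(saved)
```

Because only text matching the configuration is stored, navigation and advertising fragments never reach the index, which is the effect the abstract attributes to directional acquisition.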

Description

Technical field

[0001] The invention relates to the field of Internet search engines, in particular to a method and system for directional acquisition of Internet resources.

Background technique

[0002] A search engine is a computer system that collects information on the Internet according to a certain strategy and, after organizing and processing that information, provides users with network information services. Its main function is to help users quickly and efficiently obtain high-quality information that exists in the Internet information environment and meets their needs.

[0003] At present, a general search engine includes three parts: information collection, information arrangement, and user query. The information collection part is responsible for capturing information on the Internet and storing the captured information in the data server. The information arrangement part is responsible for sorting the captured information with the indexer, and then for users to use the q...


Application Information

IPC(8): G06F17/30
Inventor 刘锦山, 崔凤雷
Owner BEIJING LEISU TECH