Crawler capturing method and device thereof

Crawler and crawling-unit technology, applied in the field of Internet information search, addresses problems such as the poor timeliness of existing techniques and achieves better timeliness, a shorter crawling cycle, and a higher crawling frequency.

Inactive Publication Date: 2012-07-04
CHINA MOBILE COMM GRP CO LTD

AI Technical Summary

Problems solved by technology

[0006] The invention provides a method and device for optimizing crawler crawling, to solve the poor timeliness of existing crawler crawling technology.


Embodiment Construction

[0024] To improve the timeliness of the information captured by crawlers and to improve user satisfaction with search engines, the embodiment of the present invention proposes a crawler crawling method and a device thereof. The main implementation principles of the embodiment, the specific implementation process, and the corresponding beneficial effects are described in detail below in conjunction with the attached drawings.

[0025] In a search engine system based on a computer or computer network, the search results returned for a user query usually include a list of webpage links, and the webpages in the list are generally sorted from high to low according to the degree of correlation between the information in the webpage and the query keywords. To exploit this feature of the search results returned by the search engine, one embodiment of the present invention proposes a method of using the ranking of the web pages...

Abstract

The invention discloses a crawler capturing method and a device thereof, aiming to solve the poor timeliness of existing crawler capturing technology. The main technical scheme is as follows: the current weight number of a webpage is determined according to the rank of the webpage in the current search result and/or the order in which users click the webpage; the result weight number of the webpage is determined according to its current weight number and its history weight number; and when the result weight number is equal to a preset threshold, the information in the webpage is captured again. With this scheme, the rank of the webpage in the current search result and/or the order in which users click it influence the period at which the crawler captures the information in the webpage. The capture period for webpages that receive high user attention can therefore be shortened, which keeps the information in those webpages timely and improves the user experience.
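The weighting scheme in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the current weight number is computed or how it is combined with the history weight number, so everything concrete here is an assumption made for illustration: the names `PageState`, `current_weight`, and `update_and_check` are hypothetical, earlier positions are assumed to earn larger weights, the result weight is assumed to be the sum of history and current weights, the threshold is treated as met when the result weight equals or exceeds it, and the history weight is assumed to reset after a recrawl.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    """Per-page bookkeeping (hypothetical structure, not from the patent)."""
    history_weight: int = 0


def current_weight(rank=None, click_order=None, max_positions=10):
    """Assumed scoring: position 1 earns max_positions points, position 2
    earns one fewer, and so on. Rank and click order may be used together
    or separately, matching the "and/or" in the abstract."""
    weight = 0
    for position in (rank, click_order):
        if position is not None and 1 <= position <= max_positions:
            weight += max_positions - position + 1
    return weight


def update_and_check(state, rank=None, click_order=None, threshold=20):
    """Return True when the page should be captured again.

    Assumption: result weight = history weight + current weight, and the
    preset threshold counts as reached when the result equals or exceeds it.
    """
    result_weight = state.history_weight + current_weight(rank, click_order)
    if result_weight >= threshold:
        state.history_weight = 0  # assumed reset after a recrawl
        return True
    state.history_weight = result_weight
    return False
```

Under these assumptions, a page that repeatedly ranks first reaches the threshold after two searches, while a page at rank 10 needs twenty, so pages with high user attention are captured again far more often.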

Description

Technical field

[0001] The invention relates to the field of Internet information search, in particular to a crawler crawling method and a device thereof.

Background technique

[0002] The search engine is a technology widely used on the Internet today. By entering a few keywords into a search engine such as Baidu or Google, people can find a large amount of information related to those keywords.

[0003] Search engines draw on various sources of information. Some of it is paid for: an advertiser pays the search engine operator through bid advertising, and the operator publishes brief information about the advertisement and its links in its own search engine. Most of it is non-advertising information, such as news and academic information, which search engine operators need to find and capture themselves and add to the search engine. Facing the ma...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 孙宏伟, 胡珉, 罗治国
Owner: CHINA MOBILE COMM GRP CO LTD