Resource downloading system and method, and crawler downloading system

A resource downloading and crawling technology, applied to special data processing applications, network data retrieval, instruments, and similar fields. It addresses problems such as crawler function failure by reducing the probability that the crawler is recognized, which improves the crawler's functional stability.

Active Publication Date: 2017-07-25
BEIJING QIYI CENTURY SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] Prior-art crawlers are mainly divided into traditional crawlers and focused crawlers. In both kinds, the crawling frequency toward the target website is fixed. Some anti-crawler programs can easily identify a crawler by this fixed crawling frequency and then deny the crawler access or require robot verification, which causes the crawler to fail.



Embodiment Construction

[0036] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that persons of ordinary skill in the art obtain from these embodiments without creative effort fall within the protection scope of the present invention.

[0037] As shown in Figure 1, an embodiment of the present application provides a resource downloading system that supplies tokens to the crawler 100. The resource downloading system includes a database 200 and a random token generator 300, wherein:

[0038] The random token generator 300 generates a token after receiving a generation request and stores the token in the database 200. The value range of the timestamp increment value of...


Abstract

The invention discloses a resource downloading system and method, and a crawler downloading system. The resource downloading system limits the crawler's crawling frequency for a site by token-bucket rate limiting. The value range of the timestamp increment of each token generated by the random token generator is determined from the crawler's queries per second (QPS) for the site, and the token's timestamp is determined from the timestamp of the current moment plus that increment. Because the increment is a random value, the crawler's crawling frequency for the site is also random. An anti-crawler program therefore cannot identify the crawler by a fixed crawling frequency, which reduces the probability that the crawler is identified and improves the crawler's functional stability.
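The token generation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and function names are hypothetical, the list stands in for database 200, and the uniform range [0, 2/QPS] is an assumption chosen so that the average gap between requests is 1/QPS. The source only states that the range is derived from the QPS.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Token:
    timestamp: float  # earliest time, in seconds, at which the token may be used


@dataclass
class RandomTokenGenerator:
    """Hypothetical stand-in for random token generator 300."""
    qps: float                          # crawler's queries per second for one site
    clock: Callable[[], float]          # returns the current time in seconds
    rng: random.Random = field(default_factory=random.Random)
    database: List[Token] = field(default_factory=list)  # stand-in for database 200

    def increment_range(self) -> Tuple[float, float]:
        # ASSUMPTION: the source only says the range depends on the QPS.
        # A uniform range [0, 2/qps] gives a mean increment of 1/qps.
        return (0.0, 2.0 / self.qps)

    def generate(self) -> Token:
        # Token timestamp = current timestamp + random increment.
        low, high = self.increment_range()
        token = Token(timestamp=self.clock() + self.rng.uniform(low, high))
        self.database.append(token)     # store the token for the crawler
        return token


def seconds_until_usable(token: Token, now: float) -> float:
    """How long a crawler would wait before spending this token (hypothetical)."""
    return max(0.0, token.timestamp - now)
```

A crawler using this sketch would take a token from the database, wait `seconds_until_usable(token, now)` seconds, and then issue its request, so the gaps between requests vary randomly instead of following a fixed period.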

Description

Technical field

[0001] The present application relates to the field of computer application technology and, more specifically, to a resource downloading system and method, and a crawler downloading system.

Background

[0002] A crawler, also known as a web crawler, is a program that automatically retrieves web content. It is an important part of a search engine, so search engine optimization is largely optimization for crawlers.

[0003] Prior-art crawlers are mainly divided into traditional crawlers and focused crawlers. In both kinds, the crawling frequency toward the target website is fixed. Some anti-crawler programs can easily identify a crawler by this fixed crawling frequency and then deny the crawler access or require robot verification, which causes the crawler to fail.

Summary of the invention

[0004] In order to solve the above technical pro...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 帅伟良
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD