Web crawler crawling rule replacement method, scheduling terminal, and crawling terminal
A web crawler and scheduling terminal technology, applied in the field of web crawlers, that solves problems such as chaotic management when crawling nodes modify crawling rules and frequent restarts of crawling nodes, thereby avoiding management confusion.
Embodiment Construction
[0036] The present invention will be described in further detail below in conjunction with the accompanying drawings and specific embodiments.
[0037] Figure 1 shows a workflow diagram of a web crawler crawling rule replacement method of the present invention, which includes:
[0038] Step S101: sending a crawling task to a crawling terminal that crawls network information, the crawling task including a website to be crawled and the scheduling terminal version number of the scheduling terminal crawling rule file corresponding to the website to be crawled;
[0039] Step S102: receiving, from the crawling terminal, a request for a new rule file, the request including the website of the rule to be switched and the version number of the rule to be switched, and sending the rule file to be switched and the website of the rule to be switched to the crawling terminal. The rule file to be switched is a scheduling terminal crawling rule file that is stored in the rule file library and is jointly identified by the website of the rule to be...
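As an illustration only, the scheduling terminal side of steps S101 and S102 might be sketched in Python as follows. All class, method, and field names here are hypothetical and do not come from the patent. The sketch assumes that the rule file library is keyed by the pair (website, version number), that version numbers are integers, and that rule files are plain strings. The small `Crawler` class is likewise an assumption, included only to show why a request for a new rule file could arrive at the scheduling terminal.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CrawlTask:
    """Task sent in S101: the website plus the scheduler-side rule version."""
    website: str
    rule_version: int


@dataclass(frozen=True)
class RuleRequest:
    """Request received in S102: which rule to switch to, and its version."""
    website: str
    version: int


@dataclass
class Scheduler:
    # Assumed layout: rule files stored under (website, version) keys.
    rule_library: Dict[Tuple[str, int], str] = field(default_factory=dict)
    # Assumed: the scheduler tracks one current version per website.
    current_version: Dict[str, int] = field(default_factory=dict)

    def publish_rule(self, website: str, version: int, rule_file: str) -> None:
        """Store a rule file and mark it as the current version for the site."""
        self.rule_library[(website, version)] = rule_file
        self.current_version[website] = version

    def make_task(self, website: str) -> CrawlTask:
        """S101: build the task, attaching the scheduler's rule version."""
        return CrawlTask(website, self.current_version[website])

    def handle_rule_request(self, request: RuleRequest) -> Tuple[str, str]:
        """S102: look up the requested rule file and return it with its website."""
        rule_file = self.rule_library[(request.website, request.version)]
        return request.website, rule_file


@dataclass
class Crawler:
    # Assumed crawler-side state: locally held rule version per website.
    local_version: Dict[str, int] = field(default_factory=dict)
    local_rules: Dict[str, str] = field(default_factory=dict)

    def check_task(self, task: CrawlTask) -> Optional[RuleRequest]:
        """Return a request for a new rule file if the local version differs."""
        if self.local_version.get(task.website) != task.rule_version:
            return RuleRequest(task.website, task.rule_version)
        return None

    def install_rule(self, website: str, version: int, rule_file: str) -> None:
        """Swap in the new rule file without restarting the crawler."""
        self.local_rules[website] = rule_file
        self.local_version[website] = version


if __name__ == "__main__":
    scheduler = Scheduler()
    scheduler.publish_rule("example.com", 1, "rules-v1")
    crawler = Crawler()

    task = scheduler.make_task("example.com")            # S101
    request = crawler.check_task(task)
    if request is not None:
        site, rule_file = scheduler.handle_rule_request(request)  # S102
        crawler.install_rule(site, request.version, rule_file)
    print(crawler.local_rules)
```

Under these assumptions, the version number carried in each task is what lets a crawling terminal notice that its rule file is out of date, so rule changes stay centralized at the scheduling terminal and the crawling terminal can switch rules without a restart.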