An Object Storage Based Crawler Network Path Tracing Method
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- 广州探迹科技有限公司
- Publication Date
- 2020-06-09
Figure 1
Abstract
Description
Technical Field
[0001] The invention belongs to the field of path tracing in software engineering, and in particular relates to a crawler network path tracing method based on object storage.
Background Art
[0002] A web crawler is a program or script that automatically captures information on the World Wide Web according to certain rules. In current practice, most crawler network path tracing takes the crawler task as the basic unit. For example, the default action of the open source crawler framework pyspider is to store results in a database. If an external system needs to retrieve that data, there is no convenient retrieval method: it can only scan the database, and it must modify the status of the result data in the database so that already-processed data is excluded from the next round of processing. As a result, the data in the database has to be maintained by the two systems together, causing great uncer...