IP-restricted controlled source information capture method based on proxy pools
A technology for controlled information sources, applied in electrical digital data processing, special data processing applications, other database retrieval, etc. It addresses the problems of IP failure and of IPs that cannot be changed continually, with the effect of improving grabbing speed, enabling efficient grabbing, and overcoming high cost.
Examples
Embodiment 1
[0047] This embodiment describes the overall process of the proxy-pool-based method for grabbing information from an IP-restricted controlled source according to the present invention.
[0048] Figure 1 is a schematic flow diagram of the proxy-pool-based method for grabbing information from an IP-restricted controlled source according to the present invention. As the figure shows, the invention mainly comprises three parts: the proxy pool initialization module initializes the proxy pool, the available proxy test module tests for available proxies, and data is captured while the pool is dynamically maintained.
Embodiment 2
[0050] This embodiment describes the operation process of the proxy pool initialization module in the proxy-pool-based method for grabbing information from an IP-restricted controlled source according to the present invention.
[0051] Figure 2 is a schematic diagram of the operation of the proxy pool initialization module in the proxy-pool-based method for grabbing information from an IP-restricted controlled source according to the present invention.
Embodiment 3
[0054] This embodiment describes the operation process of the available proxy test module in the proxy-pool-based method for grabbing information from an IP-restricted controlled source according to the present invention.
[0055] Figure 3 illustrates the operation of the available proxy test module in the proxy-pool-based method for grabbing information from an IP-restricted controlled source according to the present invention. The specific implementation shown in the figure is:
[0056] Each proxy in proxy pool A is tested in turn. For each test, one proxy is taken from the pool, and the crawler system uses it as a proxy server to send N requests to the selected crawling source. By default, the crawling source is the root directory of the source website, and N >= 1. Whether a request succeeded is judged by the status code the server returns, and the corresponding operation is performed: if the first r...
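The testing loop in paragraph [0056] can be illustrated with a minimal Python sketch. It is not the patent's implementation: the source text is cut off before it states what happens after the status code is checked, so the rule used here (a proxy counts as available only if all N requests return a 2xx status) is an assumption, as are the names `send_via_proxy` and `check_proxies`, the timeout, and the split into "available" and "failed" lists.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Tuple


def send_via_proxy(proxy: str, url: str, timeout: float = 5.0) -> int:
    """Send one GET request through `proxy`; return the HTTP status code.

    Returns 0 when no HTTP response was obtained (connection error, timeout).
    """
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return 0


def check_proxies(
    pool: Iterable[str],
    source_url: str,
    n: int = 1,
    send: Callable[[str, str], int] = send_via_proxy,
) -> Tuple[List[str], List[str]]:
    """Test each proxy in turn by sending `n` requests to `source_url`.

    ASSUMPTION: a proxy is kept only if every request returns a 2xx status;
    the source text does not survive far enough to confirm the actual rule.
    """
    if n < 1:
        raise ValueError("N must be >= 1")
    available: List[str] = []
    failed: List[str] = []
    for proxy in pool:
        codes = [send(proxy, source_url) for _ in range(n)]
        if all(200 <= code < 300 for code in codes):
            available.append(proxy)
        else:
            failed.append(proxy)
    return available, failed
```

Passing the request function in as `send` keeps the loop testable without network access. In a real run, `source_url` would be the root URL of the crawling source, matching the default described in the paragraph.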