Method and device for dynamically scheduling IP agent pool in distributed environment and storage medium

A dynamic scheduling technology for distributed environments, applied to electrical components, transmission systems, and similar fields. It addresses the difficulty of keeping a data acquisition system running stably over the long term, and aims to avoid access failures, maintain acquisition efficiency, and prevent high-frequency access through a single proxy IP.

Active Publication Date: 2019-05-10
XIAMEN MEIYA PICO INFORMATION

AI Technical Summary

Problems solved by technology

Although an IP proxy pool is one of the best solutions, existing IP proxy pools perform poorly in availability and stability, which makes it difficult to ensure the long-term stable operation of the data acquisition system.

Examples

Embodiment Construction

[0031] The application is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here serve only to explain the invention, not to limit it. For convenience of description, the drawings show only the parts related to the invention.

[0032] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0033] Figure 1 shows a method of the present invention for dynamically scheduling an IP proxy pool in a distributed environment. The method comprises:

[0034] Construction step S101: scan proxy IP resources and build an IP proxy pool after initializing status identi...
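As a rough illustration of the construction step, the sketch below builds a pool from a list of scanned proxy addresses and gives every entry an initial status identifier. The `ProxyStatus` values, the `build_proxy_pool` name, and the `host:port` address format are assumptions made for this example; the source text does not specify how the status identifier is encoded.

```python
from enum import Enum


class ProxyStatus(Enum):
    """Hypothetical status identifiers; the patent text does not list the actual values."""
    UNCHECKED = "unchecked"
    AVAILABLE = "available"
    UNAVAILABLE = "unavailable"


def build_proxy_pool(scanned_addresses):
    """Build a pool mapping each scanned proxy address to an initial status.

    Blank entries and duplicates from the scan are dropped, and every
    remaining proxy starts as UNCHECKED until a detection pass updates it.
    """
    pool = {}
    for address in scanned_addresses:
        address = address.strip()
        if address and address not in pool:
            pool[address] = ProxyStatus.UNCHECKED
    return pool


if __name__ == "__main__":
    scanned = ["10.0.0.1:8080", " 10.0.0.1:8080 ", "", "10.0.0.2:3128"]
    for address, status in build_proxy_pool(scanned).items():
        print(address, status.name)
```

A plain dictionary keeps the example short; a distributed deployment would more plausibly keep the pool in a shared store so that every collection node sees the same status.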

Abstract

The invention provides a method, a device, and a storage medium for dynamically scheduling an IP proxy pool in a distributed environment. The method comprises: a construction step of scanning proxy IP resources and constructing an IP proxy pool after initializing a state identifier for each proxy IP obtained by scanning; a detection step of detecting the proxy IPs in the initial IP proxy pool and updating each proxy IP's state identifier according to the detection result; and a scheduling step of obtaining M proxy IPs from the IP proxy pool to generate a proxy IP queue for a download center to request. The invention keeps the number of available proxy IPs in the pool above a certain level. A locking mechanism and a caching mechanism allow requests to be answered in time order when a single proxy IP is scheduled by multiple threads, which prevents high-frequency access through any single proxy IP. Through the cooperation of the IP proxy pool, validity detection, and scheduling, access failures caused by high-frequency access are avoided while the acquisition efficiency of the distributed data acquisition system is maintained.
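The detection and scheduling steps described in the abstract can be sketched as follows. This is a minimal single-process illustration, not the patented implementation: the class and method names, the boolean availability flag, the `min_interval` cooldown, and the injected `probe` and `clock` callables are all assumptions, and an in-process `threading.Lock` stands in for whatever locking the distributed system actually uses. The caching mechanism is not modeled.

```python
import threading
import time
from dataclasses import dataclass


@dataclass
class ProxyEntry:
    address: str
    available: bool = True            # assumed boolean status identifier
    last_used: float = float("-inf")  # time this proxy was last handed out


class ProxyPool:
    """Hypothetical pool that hands out at most M proxies per call."""

    def __init__(self, addresses, min_interval=1.0, clock=time.monotonic):
        self._lock = threading.Lock()
        self._entries = {a: ProxyEntry(a) for a in addresses}
        self._min_interval = min_interval  # assumed per-proxy cooldown, in seconds
        self._clock = clock

    def detect(self, probe):
        """Detection step: probe each proxy and update its status identifier."""
        with self._lock:
            addresses = list(self._entries)
        # Probe outside the lock so slow network checks do not block scheduling.
        results = {a: bool(probe(a)) for a in addresses}
        with self._lock:
            for address, ok in results.items():
                if address in self._entries:
                    self._entries[address].available = ok

    def schedule(self, m):
        """Scheduling step: return up to m available proxies that have cooled down."""
        now = self._clock()
        with self._lock:
            ready = [
                e for e in self._entries.values()
                if e.available and now - e.last_used >= self._min_interval
            ]
            ready.sort(key=lambda e: e.last_used)  # least recently used first
            chosen = ready[:m]
            for entry in chosen:
                entry.last_used = now  # marked under the lock, so threads cannot double-book
            return [e.address for e in chosen]


if __name__ == "__main__":
    pool = ProxyPool(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"], min_interval=5.0)
    pool.detect(lambda address: not address.startswith("10.0.0.2"))
    print(pool.schedule(2))
```

Because selection and the `last_used` update happen inside one critical section, two threads calling `schedule` at the same moment cannot receive the same proxy within the cooldown window, which is one simple way to keep any single proxy IP from being used at high frequency.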

Description

Technical field

[0001] The invention relates to the technical field of network data processing, in particular to a method, device, and storage medium for dynamically scheduling an IP proxy pool in a distributed environment.

Background technique

[0002] As the Internet grows in scale, timeliness has gradually become a key problem in the field of data collection. In general, a data collector can use a distributed data collection system to visit multiple target websites at high frequency within a unit of time, and so collect data for many tasks efficiently. However, because the IP resources of the whole system are fixed and limited, this approach easily leads to access failures when a target website imposes a threshold on the request frequency of each accessing IP.

[0003] In the existing technology, there are mainly two ways to solve this kind of access failure: ① Use request frequency control to limit ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): H04L29/12
Inventor: 谢鹏达, 栾江霞, 李火泉, 徐晓文, 章正道
Owner: XIAMEN MEIYA PICO INFORMATION