
A method for using large-scale IP address resources in a targeted information capture scenario

This IP address and targeted-information technology, applied to transmission systems, electrical components, and similar fields, addresses problems such as poor availability, abnormal access, and the uncontrollable quality and stability of proxies, and improves the success rate.

Active Publication Date: 2019-04-30
INST OF INFORMATION ENG CHINESE ACAD OF SCI
Cites: 4 · Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] At the same time, in practice, to reduce crawling costs, a large share of the IP addresses used are usually public proxies (HTTP or SOCKS proxies) on the Internet. The quality and stability of these proxies are usually uncontrollable, and overall availability may even be poor. Using them indiscriminately during crawling causes a large number of unnecessary access exceptions.



Examples


Embodiment Construction

[0029] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention. The features and advantages of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0030] The method of the present invention for using large-scale IP address resources in a targeted information capture scenario includes an IP address allocation mechanism and an IP address availability evaluation mechanism built on that allocation mechanism.

[0031] IP address allocation mechanism: For a specific...


Abstract

The invention provides a method for using large-scale IP address resources in a targeted information capture scenario. The method includes the following steps. For a network resource with an access frequency restriction, a priority queue is established containing all IP addresses in the IP address set L. When an available IP address is allocated, the IP address with the highest priority is taken from the priority queue, and the time t at which that IP address can next access the network resource is updated; if the current time is greater than or equal to t, the current task can use the IP address immediately. For each IP address, the number of uses un and the number of access failures fn are maintained; when the highest-priority IP address is taken from the priority queue, it is selected with probability 1 - fn/un and discarded with probability fn/un. The method both makes full use of the access capacity of the IP addresses and improves the success rate of network information acquisition tasks.
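
The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `IPAllocator`, `min_interval`, `allocate`, and `report` are hypothetical, priority is assumed to mean "earliest next-access time t first", an IP address that has never been used is assumed to be always selected, and a discarded IP address is assumed to be re-queued with its next-access time pushed back.

```python
import heapq
import random


class IPAllocator:
    """Hypothetical sketch of the allocation and availability-evaluation steps."""

    def __init__(self, ips, min_interval, rng=None):
        self.min_interval = min_interval       # assumed seconds between uses of one IP
        self.rng = rng or random.Random()
        self.uses = {ip: 0 for ip in ips}      # un: number of uses per IP
        self.failures = {ip: 0 for ip in ips}  # fn: number of failed accesses per IP
        # Heap entries are (t, ip); the smallest t is treated as the
        # highest priority (an assumption, the abstract does not say).
        self.queue = [(0.0, ip) for ip in ips]
        heapq.heapify(self.queue)

    def allocate(self, now):
        """Return (ip, wait_seconds), or None if every candidate was discarded."""
        for _ in range(len(self.queue)):
            t, ip = heapq.heappop(self.queue)
            un, fn = self.uses[ip], self.failures[ip]
            # Never-used IPs get a failure ratio of 0 (assumption).
            fail_ratio = fn / un if un else 0.0
            # Update the time at which this IP may next access the resource.
            heapq.heappush(self.queue, (max(t, now) + self.min_interval, ip))
            # Select with probability 1 - fn/un, discard with probability fn/un.
            if self.rng.random() >= fail_ratio:
                # wait is 0 when now >= t, i.e. the IP is usable immediately.
                return ip, max(0.0, t - now)
        return None

    def report(self, ip, success):
        """Record the outcome of one access made through `ip`."""
        self.uses[ip] += 1
        if not success:
            self.failures[ip] += 1
```

In this sketch a task would call `allocate(now)`, wait for the returned number of seconds if it is non-zero, perform the access, and then call `report(ip, success)` so that later selections reflect the observed failure ratio.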

Description

Technical field

[0001] The present invention relates to the field of targeted network information acquisition, and in particular to a method for using large-scale IP address resources in a targeted information capture scenario. The method can efficiently use and allocate a large number of IP address resources when the frequency at which a single IP may access a specific network resource is limited and a large number of tasks execute concurrently.

Background technique

[0002] With the rapid development of the Internet, the data resources on the network are also expanding rapidly. In some scenarios that require centralized acquisition of network resources, such as a search engine crawler crawling the web pages of a website, the number of target pages is so large that executing the acquisition tasks serially in a single thread falls far short of the performance requirement. The more commonly used method at this time is to execute multiple information acquisition tas...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): H04L29/12
CPC: H04L61/5007, H04L61/5046
Inventor: 时金桥, 谭庆丰, 王学宾
Owner: INST OF INFORMATION ENG CHINESE ACAD OF SCI