
Quality evaluation method and device for network agent, storage medium and processor

A quality assessment method for network proxies, together with a corresponding quality assessment device, storage medium and processor. It addresses the problem that a crawler cannot efficiently crawl the network data it needs.

Active Publication Date: 2020-09-01
BEIJING GRIDSUM TECH CO LTD
Cites: 4; Cited by: 5

AI Technical Summary

Problems solved by technology

[0005] In view of the above problems, a quality assessment method, device, storage medium and processor for a network proxy are proposed, to solve the problem that crawlers cannot efficiently crawl the required network data.


Examples

Embodiment 1

[0079] Refer to Figure 1, which shows a flow chart of a method for evaluating the quality of a network proxy in Embodiment 1 of the present invention. The method may specifically include:

[0080] Step 101: obtain the proxy usage results produced when multiple target network proxies are used to crawl data from a target site. The target network proxies include network proxies with different priorities; among them, the high-priority proxies outnumber the low-priority proxies, and a higher priority indicates a higher-quality proxy.

[0081] In the embodiment of the present invention, a network proxy is a server or server cluster that crawls data from the network on behalf of a crawler program. For a crawler program, network proxies can be grouped by site, with each site corresponding to a list of network proxies. When a crawler program crawls data from a target site,...
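The per-site grouping and the "proxy usage results" of Step 101 can be pictured with a small bookkeeping structure. The following is a minimal sketch, not the patent's implementation: the names `ProxyUsage`, `usage_by_site` and `record_result`, and the choice to count successes and failures, are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ProxyUsage:
    """Usage results for one proxy on one site (hypothetical fields)."""
    successes: int = 0
    failures: int = 0

    def record(self, ok: bool) -> None:
        # Count one crawl attempt made through this proxy.
        if ok:
            self.successes += 1
        else:
            self.failures += 1


# Maps site domain -> proxy address -> usage results, so each site
# keeps its own list of proxies, as the paragraph above describes.
usage_by_site = defaultdict(lambda: defaultdict(ProxyUsage))


def record_result(site: str, proxy: str, ok: bool) -> None:
    """Store the outcome of one crawl of `site` through `proxy`."""
    usage_by_site[site][proxy].record(ok)
```

Keeping results per site rather than globally reflects that the same proxy may work well on one site and be blocked on another.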

Embodiment 2

[0090] Refer to Figure 2, which shows a flow chart of a method for evaluating the quality of a network proxy in Embodiment 2 of the present invention. The method may specifically include:

[0091] Step 201: select multiple network proxies with different priorities from different network proxy pools according to preset rules.

[0092] Step 201 can be combined with the steps of Embodiment 1 to form an embodiment. Building on Embodiment 1, this embodiment describes how to select network proxies from the network proxy pools, so that the selected proxies meet the corresponding priority requirements.

[0093] In the embodiment of the present invention, network proxies with different priorities are stored in different network proxy pools. For example, the proxy pools are divided by site, and for each site domain name a network proxy pool with high priority is maintained, along with a network proxy pool with low pri...
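One way to read Step 201 is as a draw from two pools that always takes more proxies from the high-priority pool than from the low-priority one. The sketch below assumes this reading; the function name `select_target_proxies` and the `high_share` parameter are hypothetical, since the text visible here does not spell out the preset rules.

```python
import math
import random


def select_target_proxies(high_pool, low_pool, total, high_share=0.75, rng=None):
    """Pick up to `total` proxies, with more high-priority than low-priority ones.

    `high_share` stands in for the patent's unspecified "preset rule"
    (an assumption); it must be above 0.5 so high-priority proxies dominate.
    """
    if not 0.5 < high_share <= 1.0:
        raise ValueError("high_share must be in (0.5, 1.0]")
    rng = rng or random.Random()
    # Take the planned share from the high-priority pool, limited by its size.
    n_high = min(len(high_pool), math.ceil(total * high_share))
    # Fill the rest from the low-priority pool, but never let low-priority
    # proxies reach or exceed the number of high-priority ones.
    n_low = max(0, min(len(low_pool), total - n_high, n_high - 1))
    return rng.sample(high_pool, n_high) + rng.sample(low_pool, n_low)
```

Including a few low-priority proxies in every draw keeps producing usage results for them, so a later evaluation still has data to work with.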

Embodiment 3

[0125] Refer to Figure 3, which shows a structural block diagram of an apparatus for evaluating the quality of a network proxy in Embodiment 3 of the present invention. The apparatus may specifically include:

[0126] A result obtaining module 301, configured to obtain the proxy usage results produced when multiple target network proxies are used to crawl data from a target site. The target network proxies include network proxies with different priorities; among them, the high-priority proxies outnumber the low-priority proxies, and a higher priority indicates a higher-quality proxy;

[0127] An evaluation module 302, configured to evaluate the quality of the target network proxies according to the proxy usage results.

[0128] In the embodiment of the present invention, optionally, network proxies with different priorities are stored in different network proxy pools, and the d...
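The evaluation module's job, rating proxies from their usage results, might look like the sketch below. The text available here does not state the metric, so the success-rate criterion, the `promote_at` threshold, the `min_requests` cutoff and the rating labels are all assumptions chosen for illustration.

```python
def evaluate_proxies(usage, promote_at=0.8, min_requests=10):
    """Rate each proxy from its usage results.

    `usage` maps a proxy address to a (successes, failures) pair.
    Success rate and both thresholds are hypothetical stand-ins for
    whatever criterion the patent actually uses.
    """
    ratings = {}
    for proxy, (ok, failed) in usage.items():
        attempts = ok + failed
        if attempts < min_requests:
            # Too few crawls to judge this proxy either way.
            ratings[proxy] = "unrated"
        elif ok / attempts >= promote_at:
            ratings[proxy] = "high"
        else:
            ratings[proxy] = "low"
    return ratings
```

A rating of "high" or "low" could then decide which priority pool the proxy is stored in for the next round of crawling.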


Abstract

The invention discloses a quality evaluation method and device for a network proxy. The method comprises the following steps: obtaining the proxy usage results when a plurality of target network proxies are used to crawl data from a target site; and evaluating the quality of the target network proxies according to the proxy usage results. The plurality of target network proxies comprises network proxies with different priorities, the number of high-priority proxies among them is greater than the number of low-priority proxies, and the higher a proxy's priority, the higher its quality. During data crawling, high-quality proxies are used more and low-quality proxies are used less. This reduces the probability that high-quality proxies are blacklisted because they are used too frequently, avoids the problem that a low-quality proxy whose quality has since improved cannot be discovered, and spreads usage more evenly across the proxies overall, which improves the efficiency of crawling network data.

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a network proxy quality assessment method, a network proxy quality assessment device, a storage medium and a processor.

Background

[0002] With the rapid development of network technology, the network has become the carrier of a large amount of information. Crawlers came into being to solve the problem of collecting web resources. A web crawler (also known as a web spider or web robot) is a program or script that automatically fetches information from the World Wide Web according to certain rules.

[0003] However, to limit the system load caused by web crawlers, many sites set access-frequency restrictions on visitors at the server, determine whether a visitor is a web crawler, and blacklist visitors judged to be web crawlers in order to prevent their frequent access.

[0004] In order to cope with anti-crawler technolo...

Application Information

IPC(8): G06F16/958, G06F16/951
Inventor 武玉博
Owner BEIJING GRIDSUM TECH CO LTD