A distributed agricultural network data collection method and collection system

A network data collection method, applied in the fields of network data indexing, network data retrieval, transmission systems, etc., that addresses problems such as crawlers being attacked by the collected website and failing to fetch data.

Active Publication Date: 2021-08-24
SOUTH CHINA AGRI UNIV
8 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] When collecting data from agricultural websites, even a crawler that complies with the Robots protocol in its interaction with the website may, after long-running and/or frequent normal crawling, be attacked in error by the website's anti-crawler mechanism, so that its fetching jobs can no longer be performed normally.

Examples

Embodiment Construction

[0045] The present invention will be described in detail below through specific embodiments in conjunction with the accompanying drawings. It should be noted that, under the condition of no conflict, the embodiments in the present invention and the features in the embodiments can be combined with each other, and the protection scope of the present invention is not limited to this.

[0046] See Figure 1, which shows a diagram of the implementation environment for the various embodiments of the present invention. The implementation environment includes a host 100, a slave 200 and the Internet 300.

[0047] The host 100 is the computer that issues the main commands. It may be a desktop computer, a portable computer, a tablet computer, or any other intelligent terminal that can issue the main commands. A Scrapy framework is provided in the host 100, and the Scrapy framework mainly includes an engine, a scheduler interacting with the engine through scheduling middleware, a d...

Abstract

The present invention relates to the technical field of network data acquisition, in particular to a distributed network data acquisition method and its acquisition system. The method includes deduplicating the links in the request queue through a scheduler and assigning the request queue to the corresponding slaves. When the network data collection behavior of a collection node is attacked by the collected website, the corresponding defense mechanism is triggered. The defense mechanism determines the attack type from the attack behavior and judges whether that attack type matches the preset defense type of the slave corresponding to the collection node. If it matches, the defense measure corresponding to the defense type is executed to eliminate the attack; if it does not match, the request queue is returned to the scheduler to wait for redistribution. This makes it possible to take timely countermeasures when normal network data collection work is attacked in error by the collected website.
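The flow in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the names `Slave`, `Scheduler`, `handle_attack`, the round-robin assignment, and the attack and defense labels are all hypothetical, and the sketch assumes that the request queue goes back to the scheduler when no preset defense type matches.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Slave:
    name: str
    # Hypothetical mapping: attack type -> name of the preset defense measure.
    defenses: dict = field(default_factory=dict)
    queue: deque = field(default_factory=deque)


class Scheduler:
    def __init__(self, slaves):
        self.slaves = slaves
        self.seen = set()        # links already scheduled (deduplication)
        self.pending = deque()   # links waiting for (re)distribution

    def submit(self, links):
        """Deduplicate links and queue the new ones for distribution."""
        for link in links:
            if link not in self.seen:
                self.seen.add(link)
                self.pending.append(link)

    def distribute(self):
        """Assign pending links to slaves round-robin (an assumed policy)."""
        i = 0
        while self.pending:
            slave = self.slaves[i % len(self.slaves)]
            slave.queue.append(self.pending.popleft())
            i += 1

    def handle_attack(self, slave, attack_type):
        """Return the matching defense measure, or requeue and return None."""
        measure = slave.defenses.get(attack_type)
        if measure is not None:
            return measure
        # No matching defense type: hand the slave's queue back to the
        # scheduler so it can be redistributed later.
        self.pending.extend(slave.queue)
        slave.queue.clear()
        return None
```

A caller would run `submit`, then `distribute`, and call `handle_attack` whenever a slave reports an attack; a `None` result signals that the slave's links are pending again and `distribute` should be called once more.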

Description

Technical field

[0001] The invention relates to the technical field of network data collection, in particular to a distributed agricultural network data collection method and a collection system thereof.

Background technique

[0002] Network data acquisition refers to the process of using Internet search engine technology to capture data in a targeted, industry-specific and precise manner, and of classifying the data according to corresponding rules to form database files.

[0003] The patent with publication number CN108121706A discloses a distributed crawler optimization method, whose specific steps are as follows: the scheduling center issues tasks; the crawler grabs webpage content according to the URL; the parser parses the webpage content; if the webpage has many updates, the content of the webpage is returned to the data warehouse; the parser parses the links in the webpage and uses a Bloom filter to deduplicate them locally; Hash the URL ...
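The local Bloom-filter deduplication mentioned in the prior-art description can be illustrated as follows. This is a generic sketch, not the procedure of CN108121706A: the class `BloomFilter`, the helper `dedupe`, the filter size, the number of hash functions, and the use of salted SHA-256 digests are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k bit positions from salted SHA-256 digests of the item.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def dedupe(urls, seen):
    """Return the URLs not yet recorded in `seen`, recording them as we go."""
    fresh = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh
```

A Bloom filter never misses a URL it has already recorded, but it can occasionally report a new URL as seen, so a crawler using it trades a small chance of skipping a page for a fixed, compact memory footprint.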

Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L29/06, G06F16/951
CPC: H04L63/1416, H04L63/145, H04L63/1466
Inventor: 王乐乐, 杨自尚, 韩宇星
Owner: SOUTH CHINA AGRI UNIV