Data acquisition method and device and storage medium

A data acquisition method and data acquisition node technology, applied in the field of communication. It addresses problems such as uncontrollable system resource consumption, unbalanced machine resource utilization, and resource waste, and it improves responsiveness and machine resource utilization.

Active Publication Date: 2020-05-01
PEKING UNIV FOUNDER GRP CO LTD +1

AI Technical Summary

Problems solved by technology

[0003] Current open source distributed collection frameworks require multiple collection nodes to be deployed in the same computer room, in order to reduce public network bandwidth usage and keep transmission efficient. Because of this deployment requirement, users with multiple computer rooms cannot conveniently exploit the bandwidth of those rooms or the advantage of having multiple IP addresses.
At the same time, if multiple collection programs are deployed on one machine, the system resources consumed by each program cannot be controlled; if only one collection program is deployed per machine, machine resource utilization becomes unbalanced and resources are wasted.

Examples

Embodiment Construction

[0046] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0047] The data acquisition method provided by the present invention is applicable to the distributed acquisition system shown in Figure 1. As shown in Figure 1, the distributed collection system includes a scheduling node, a master control node, and multiple data collection nodes. The scheduling node performs the core task scheduling for each collection product. This node needs to calculate the target webpages to be downloaded according to the collection task...

Abstract

The invention provides a data acquisition method, a device, and a storage medium. The scheduling node sends a scheduling task to the master control node. The master control node receives the operation state information sent by each data acquisition node and allocates the scheduling task to the data acquisition nodes according to a preset strategy, the pre-acquired processing capability information of each data acquisition node, and the operation state information of each data acquisition node, so that the data acquisition nodes execute the scheduling task. The master control node manages all data acquisition nodes in a unified way and balances the load across them, which improves the responsiveness of data acquisition and the utilization of machine resources. The data acquisition nodes can be distributed across different machine rooms, so the bandwidth and multiple IP addresses of those machine rooms are fully utilized, and nodes can be added or removed dynamically.
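The allocation step in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the names `NodeState`, `MasterNode`, `register`, `report_state`, and `allocate` are hypothetical, processing capability is assumed to be a maximum number of concurrent tasks, and the "preset strategy" is assumed to be "pick the online node with the largest share of free capacity".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeState:
    """Hypothetical record the master control node keeps per collection node."""
    node_id: str
    capacity: int          # pre-acquired processing capability (assumed: max concurrent tasks)
    running: int = 0       # reported operation state: tasks currently running
    online: bool = True    # reported operation state: node is reachable

    def free_ratio(self) -> float:
        # Share of capacity still unused; 0.0 for a node with no capacity.
        if self.capacity <= 0:
            return 0.0
        return (self.capacity - self.running) / self.capacity


class MasterNode:
    """Sketch of a master control node that balances load across nodes."""

    def __init__(self) -> None:
        self.nodes: Dict[str, NodeState] = {}

    def register(self, node_id: str, capacity: int) -> None:
        # Record a node's processing capability when it joins (dynamic expansion).
        self.nodes[node_id] = NodeState(node_id, capacity)

    def report_state(self, node_id: str, running: int, online: bool = True) -> None:
        # Update the operation state that a node periodically sends.
        node = self.nodes[node_id]
        node.running = running
        node.online = online

    def allocate(self, task: str) -> Optional[str]:
        # Assumed strategy: choose the online node with the most free capacity,
        # breaking ties by node id so the result is deterministic.
        candidates = [n for n in self.nodes.values()
                      if n.online and n.running < n.capacity]
        if not candidates:
            return None  # no node can accept the task right now
        best = min(candidates, key=lambda n: (-n.free_ratio(), n.node_id))
        best.running += 1
        return best.node_id
```

With nodes `a` (capacity 4, three tasks running) and `b` (capacity 2, idle), this sketch sends the next two tasks to `b` and the third to `a`, then returns `None` once both are full. Other strategies (weighted round-robin, per-machine-room quotas) would slot into `allocate` in the same way.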

Description

Technical field

[0001] The present invention relates to the field of communication technology, in particular to a data collection method, device and storage medium.

Background

[0002] The most important resources that data collection relies on are bandwidth, IP addresses, processors and memory. When hardware is relatively cheap, processors and memory do not become the bottleneck that limits the scale of a collection system; the real bottleneck is usually bandwidth and IP addresses. Large-scale downloading of web pages and other network content requires sufficient network bandwidth. Websites usually limit the number of times a given IP address can access them per unit time, so large-scale, time-sensitive collection from websites requires a sufficient number of IP addresses.

[0003] The current open source distributed collection framework needs to deploy multiple collection nodes in the same computer room bas...
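The IP-address constraint described in paragraph [0002] can be made concrete with a small calculation. The sketch below is illustrative only: the function name `min_ip_addresses` and the example figures are assumptions, and it models the website's limit as a fixed number of requests per IP address per minute.

```python
def min_ip_addresses(target_requests_per_min: int, per_ip_limit_per_min: int) -> int:
    """Smallest number of IP addresses needed to reach a target request rate
    when each address is capped at per_ip_limit_per_min requests per minute."""
    if target_requests_per_min < 0:
        raise ValueError("target rate must be non-negative")
    if per_ip_limit_per_min <= 0:
        raise ValueError("per-IP limit must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-target_requests_per_min // per_ip_limit_per_min)


# Hypothetical figures: 6000 requests per minute against a site
# that allows 60 requests per minute from each address.
print(min_ip_addresses(6000, 60))  # 100
```

Under these assumed figures, a single machine room with a handful of addresses could not sustain the target rate, which is why spreading collection nodes across several machine rooms is attractive.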

Application Information

IPC(8): H04L29/08
CPC: H04L67/1008, H04L67/12, H04L67/60
Inventors: 曹六一, 张丹
Owner: PEKING UNIV FOUNDER GRP CO LTD