Extensible distributed data acquisition method and system

A data acquisition system and distributed network technology, applied in the computer field, that addresses the conflict among ease of use, versatility, and manageability in existing data acquisition systems, as well as unreasonable task allocation in distributed deployments.

Active Publication Date: 2020-08-25
INST OF INFORMATION ENG CHINESE ACAD OF SCI
6 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

[0006] The present invention proposes a highly scalable distributed data acquisition framework and method that resolves the conflict among ease of use, versatility, and manageability in existing data acquisition systems, and addresses the problem of unreasonable task allocation in distributed deployments.



Examples


Embodiment Construction

[0079] In order to make the above-mentioned objects, features and advantages of the present invention more obvious and understandable, the present invention will be further described through specific embodiments and drawings below.

[0080] For the problem of unreasonable task allocation in distributed deployment, the present invention proposes a solution. The steps are as follows:

[0081] S1: The system consists of a master node, several working nodes, and an intermediate node that provides communication services between the two types of nodes. The nodes are deployed on servers as needed, in the following steps:

[0082] S101: Deploy the intermediate node in an environment that can be accessed by other nodes; the intermediate node includes programs such as message queues and databases, and the message queue includes task queues and data queues;

[0083] S102: Deploy the master node in a stable environment, which may be an internal network;

[0084] S103: Deploy t...
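
The topology described in S1 can be pictured with a small in-process model. This is only a sketch under stated assumptions: `IntermediateNode`, `Node`, `deploy`, and the field names are hypothetical, and Python's `queue.Queue` and a plain dict stand in for a real message queue and database, which the text does not name.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class IntermediateNode:
    """Hypothetical stand-in for the intermediate node of S101."""
    task_queue: queue.Queue = field(default_factory=queue.Queue)  # acquisition tasks
    data_queue: queue.Queue = field(default_factory=queue.Queue)  # collected data
    database: dict = field(default_factory=dict)  # placeholder for the database


@dataclass
class Node:
    name: str
    role: str  # "master" or "worker"
    intermediate: IntermediateNode  # every node talks only to the intermediate node


def deploy(worker_names):
    """Create one intermediate node, one master node, and the given working nodes."""
    hub = IntermediateNode()
    master = Node("master", "master", hub)
    workers = [Node(name, "worker", hub) for name in worker_names]
    return hub, master, workers


hub, master, workers = deploy(["worker-1", "worker-2"])

# A task travels through the task queue; its result comes back via the data queue.
hub.task_queue.put({"task_id": 1, "url": "https://example.com"})
task = hub.task_queue.get()
hub.data_queue.put({"task_id": task["task_id"], "payload": "<html>...</html>"})
```

In this model the master and working nodes never reference each other directly, which mirrors the role of the intermediate node as the only shared communication point.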


Abstract

The invention relates to an extensible distributed data acquisition method and system. The method comprises the following steps: a master node, working nodes and an intermediate node are deployed; the master node periodically generates acquisition tasks according to the timed tasks in a database and publishes them to the task queue of a message queue; each working node periodically reads acquisition tasks from the task queue and decides, according to the state of its local server, whether to apply to execute a task; the master node selects an optimal working node from among those applying for the same acquisition task to execute it, and removes the task from the task queue; the working node generates and executes an acquisition process according to the acquisition task and puts the acquired data into the data queue in the message queue of the intermediate node; and the working node monitors the running state of the acquisition process and records related data. The invention resolves the conflict among usability, universality and manageability in existing data acquisition systems and solves the problem of unreasonable task allocation in distributed deployment.
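
The apply-then-select allocation in the abstract can be sketched as follows. The abstract does not say how a working node judges its local state or how the master ranks applicants, so the `cpu_load` and `running_tasks` metrics, the 0.8 threshold, and the scoring weights are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkerState:
    name: str
    cpu_load: float     # 0.0 to 1.0; hypothetical local-server metric
    running_tasks: int  # hypothetical count of active acquisition processes

    def wants(self, max_load=0.8):
        # Worker-side decision: apply only if the local server has headroom.
        return self.cpu_load < max_load

    def score(self):
        # Lower is better; the 0.2 weight is an arbitrary illustrative choice.
        return self.cpu_load + 0.2 * self.running_tasks


def allocate(task_queue, workers):
    """Master-side step: give each queued task to the best applicant and dequeue it."""
    assignments = {}
    for task in list(task_queue):  # iterate over a copy because we remove items
        applicants = [w for w in workers if w.wants()]
        if not applicants:
            continue  # nobody applied; the task stays in the queue
        best = min(applicants, key=WorkerState.score)
        assignments[task] = best.name
        best.running_tasks += 1
        task_queue.remove(task)
    return assignments


workers = [
    WorkerState("worker-1", cpu_load=0.9, running_tasks=0),  # too busy to apply
    WorkerState("worker-2", cpu_load=0.3, running_tasks=1),
    WorkerState("worker-3", cpu_load=0.4, running_tasks=0),
]
tasks = ["task-a", "task-b"]
result = allocate(tasks, workers)
```

Because a node's score rises once it accepts a task, consecutive tasks tend to spread across the applicants rather than pile onto one server.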

Description

Technical field

[0001] The present invention relates to the field of computers, and in particular to a scalable distributed data collection method and system. Data collection, as used in the present invention, refers specifically to the collection of data publicly available on the Internet.

Background art

[0002] Network data collection refers to the process of obtaining public data resources from the Internet and saving them to a specified location as required. It is usually implemented with a web crawler, a computer program that automatically crawls Internet web page data according to certain rules. A crawler downloads web pages from a website, parses and filters them, and extracts the required data. With the development of artificial intelligence and big data technology, the demand for network data is increasing rapidly. A single research task can require tens of millions of data records, which has become very c...
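
The parse, filter, and extract stages of a crawler described in [0002] can be illustrated with Python's standard `html.parser` module. This is a minimal sketch, not the patent's crawler: `LinkExtractor`, `extract_links`, and the prefix filter are hypothetical, and the download step is replaced by an in-memory sample page so the example runs without network access.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Parse step: collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, keep_prefix="https://"):
    """Filter step: keep only links that start with the given prefix."""
    parser = LinkExtractor()
    parser.feed(html)
    return [link for link in parser.links if link.startswith(keep_prefix)]


# Stand-in for a downloaded page; a real crawler would fetch this over HTTP.
sample_page = """
<html><body>
<a href="https://example.com/a">A</a>
<a href="/relative">B</a>
<a href="https://example.com/c">C</a>
</body></html>
"""
links = extract_links(sample_page)
```

Here the relative link is dropped by the filter, leaving only the two absolute links.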


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F9/50, G06F9/54, G06F16/951
CPC: G06F9/5038, G06F9/546, G06F16/951
Inventor: 姜政伟, 江钧, 李仲举, 贺义通
Owner INST OF INFORMATION ENG CHINESE ACAD OF SCI