Airflow-based distributed asynchronous task construction and scheduling system and method

A technology for distributed and asynchronous tasks, applied in the field of distributed asynchronous task construction and scheduling systems. It addresses three problems: existing task scheduling platforms cannot meet the needs of business systems, the management and configuration of timed tasks are chaotic, and integrating machine resources is difficult. The stated effects are improved output stability, a simple structure, and improved task execution efficiency.

Active Publication Date: 2020-08-07
SHANGHAI DATATOM INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

However, as application complexity increases, the number of scheduled tasks grows and dependencies arise between tasks. The management and configuration of scheduled tasks then become chaotic, and integrating machine resources becomes extremely difficult. Existing task scheduling platforms can no longer meet the needs of business systems.


Examples


Embodiment 1

[0036] Example 1: Constructing distributed asynchronous tasks based on Airflow.

[0037] As shown in Figure 2, the workflow is as follows:

[0038] S1: The interface call module API Server receives an image transcoding task request sent by the user and parses the request parameters of the task creation request. The request parameters include: the image transcoding task template image-template; the scheduling period @once (i.e., execute once, immediately after the script is read); the task command parameters; the task queue image; a failed-retry count of 3; and a task weight of 3 (the larger the number, the higher the weight). The task command parameters are a JSON data object, from which the task command execution data set can be obtained through data operations such as serialization and deserialization;
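The request parameters listed in S1 can be pictured as a small record that is serialized to JSON and restored on the receiving side. The following is a minimal sketch only: the class name `TaskRequest`, its field names, and the contents of `command_params` are hypothetical, since the patent text does not give the actual wire format.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Any, Dict


@dataclass
class TaskRequest:
    """Request parameters from S1. All field names are hypothetical."""
    template: str            # task template, e.g. "image-template"
    schedule: str            # scheduling period, e.g. "@once"
    queue: str               # task queue name
    retries: int             # number of retries after a failure
    weight: int              # larger number means higher weight
    command_params: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        # Serialize the whole request, including the nested JSON object.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "TaskRequest":
        # Deserialize a payload produced by to_json().
        return cls(**json.loads(payload))


request = TaskRequest(
    template="image-template",
    schedule="@once",
    queue="image",
    retries=3,
    weight=3,
    # Made-up command parameters, for illustration only.
    command_params={"src": "input.png", "target_format": "jpeg"},
)
wire = request.to_json()
restored = TaskRequest.from_json(wire)
```

Round-tripping through `to_json` and `from_json` yields an equal object, which is the property the serialization and deserialization step relies on.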

[0039] S2: The request parameters obtained in S1 are passed to the task construction and distribution module Caster over the RPC transmission protocol. ...
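According to the abstract, Caster renders a task script from the received parameters. A rough sketch of such a rendering step is given below, assuming a simple text template. The template contents, the function `render_dag_script`, and the `dag_id` value are all hypothetical. The generated text uses the `DAG(dag_id=..., schedule_interval=..., default_args=...)` form from Airflow 1.10 and 2.x; newer Airflow releases rename `schedule_interval`, so treat the argument names as something to check against the installed version.

```python
from string import Template

# Hypothetical script template; the patent does not show the real one.
DAG_TEMPLATE = Template(
    "from airflow import DAG\n"
    "\n"
    "dag = DAG(\n"
    "    dag_id=$dag_id,\n"
    "    schedule_interval=$schedule,\n"
    "    default_args={\n"
    "        'retries': $retries,\n"
    "        'queue': $queue,\n"
    "        'priority_weight': $weight,\n"
    "    },\n"
    ")\n"
)


def render_dag_script(params: dict) -> str:
    """Fill the template; repr() turns strings into quoted Python literals."""
    return DAG_TEMPLATE.substitute(
        dag_id=repr(params["dag_id"]),
        schedule=repr(params["schedule"]),
        retries=int(params["retries"]),
        queue=repr(params["queue"]),
        weight=int(params["weight"]),
    )


script = render_dag_script(
    {
        "dag_id": "image_transcode_demo",  # made-up identifier
        "schedule": "@once",
        "retries": 3,
        "queue": "image",
        "weight": 3,
    }
)

# Syntax check only: this compiles the text but does not import airflow.
code_object = compile(script, "<rendered-dag>", "exec")
```

Compiling the rendered text catches template mistakes early without requiring Airflow to be installed on the machine that renders the script.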

Embodiment 2

[0046] As shown in Figure 3, the working process is as follows:

[0047] S1: The task scheduling module Scheduler periodically polls the metadata database to check whether any registered DAG tasks need to be executed;
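The polling step in S1 can be illustrated with a toy metadata store. This sketch assumes an in-memory SQLite table named `registered_dag` with a `next_run` timestamp column. Both names are hypothetical and do not reflect Airflow's actual metadata schema.

```python
import sqlite3
from typing import List


def create_metadata_db() -> sqlite3.Connection:
    """Create a toy metadata database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE registered_dag (dag_id TEXT PRIMARY KEY, next_run REAL)"
    )
    return conn


def poll_due_dags(conn: sqlite3.Connection, now: float) -> List[str]:
    """Return the ids of registered DAGs whose next run time has passed."""
    rows = conn.execute(
        "SELECT dag_id FROM registered_dag "
        "WHERE next_run <= ? ORDER BY next_run, dag_id",
        (now,),
    ).fetchall()
    return [row[0] for row in rows]


conn = create_metadata_db()
conn.executemany(
    "INSERT INTO registered_dag VALUES (?, ?)",
    [("image_transcode", 100.0), ("nightly_report", 500.0)],
)

# One polling pass at time 200: only the first DAG is due.
due = poll_due_dags(conn, now=200.0)
```

A scheduler would repeat `poll_due_dags` on a fixed interval; the loop and sleep are omitted here so the example stays deterministic.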

[0048] S2: Suppose the interface has been called to generate an image transcoding task. At this point, an image transcoding task script (DAG) exists in the task script directory. When the task scheduling module Scheduler reads the DAG script, it creates a DagRun instance and generates a DagID to associate with this instance. A DAG task is composed of multiple tasks with dependencies, and each task has a taskID. Here, the image transcoding task has three tasks: the transcoding task execute, the success callback success_callback, and the failure callback failure_callback. The functions executed by success_callback and failure_callback are both callback functions. Tasks will be sorted according to the task weight and e...
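The relationship between execute, success_callback, and failure_callback can be approximated in plain Python. This is an illustration under assumptions, not the patent's implementation: the source does not say how the three tasks are wired together inside Airflow, and the function `run_transcode_dag` and the retry loop are hypothetical (the retry count of 3 is borrowed from the request parameters in Embodiment 1).

```python
from typing import Callable, List


def run_transcode_dag(
    execute: Callable[[], None], retries: int, log: List[str]
) -> str:
    """Run `execute`, retrying on failure, then fire exactly one callback."""

    def success_callback() -> None:
        log.append("success_callback")

    def failure_callback() -> None:
        log.append("failure_callback")

    # One initial attempt plus `retries` further attempts.
    for attempt in range(retries + 1):
        try:
            execute()
        except Exception as exc:
            log.append(f"attempt {attempt + 1} failed: {exc}")
        else:
            success_callback()
            return "success"

    failure_callback()
    return "failed"


calls = {"n": 0}


def flaky_transcode() -> None:
    """Stand-in for the transcoding task: fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transcode error")


log: List[str] = []
state = run_transcode_dag(flaky_transcode, retries=3, log=log)
```

With the flaky stand-in, the third attempt succeeds, so only success_callback fires; a task that fails on every attempt would end with failure_callback instead.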



Abstract

The invention discloses an Airflow-based distributed asynchronous task scheduling system and its working method. The system comprises an interface calling module API Server, a task construction and distribution module Caster, an Airflow scheduling platform, and a data storage module DB. The interface calling module API Server is used for interface calls. The task construction and distribution module Caster is used for rendering task scripts. The Airflow scheduling platform comprises a task scheduling module Scheduler, a task execution unit Worker Node, a task execution unit management module Flower, a distributed task queue Celery, and a visual task scheduling management interface WebServer. The data storage module DB is connected with the task construction and distribution module Caster and the Airflow scheduling platform, and stores the data logs they generate during operation. The invention can effectively improve the availability, flexibility, and fault tolerance of the system and ensures its load balance.

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and in particular relates to an Airflow-based distributed asynchronous task construction and scheduling system and method.

Background

[0002] In recent years, with the development of distributed technology and the gradual evolution of microservice architecture, a large number of computer application systems have moved from a monolithic architecture to a distributed, microservice architecture. Large-scale and ultra-large-scale distributed applications have become mainstream, and as cloud computing has gradually entered public life, small and medium-scale distributed applications have also begun to appear in various fields. Internet applications and enterprise-level applications alike contain a large number of batch processing tasks. Task scheduling can be said to be an intermediate system that all systems must rely on; in the act...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/48, G06F9/50, G06F9/54
CPC: G06F9/4843, G06F9/5083, G06F9/546, Y02D10/00
Inventor: 李磊, 谢赟, 吴新野, 韩欣, 樊飞
Owner: SHANGHAI DATATOM INFORMATION TECH CO LTD