
Distributed job flow task management and scheduling system and method

A task management and scheduling technology applied in the field of business scheduling. It addresses the integration of scheduling with jobs, the lack of a unified management and scheduling scheme, and the high demands placed on data developers. Its effects are efficient use of computing resources, elimination of differences in the technical characteristics of computing platforms, and flexible control of the number of concurrent jobs.

Active Publication Date: 2021-05-28
SHANGHAI PUDONG DEVELOPMENT BANK
Cites: 7. Cited by: 0.

AI Technical Summary

Problems solved by technology

[0008] This document addresses the following problems in the prior art for task scheduling in comprehensive, complex scenarios: scheduling and jobs are integrated, there is no unified scheduling solution for managing different computing platforms, and the requirements placed on data developers are high.



Examples


Other embodiments

[0122] In other embodiments, the preset election criterion may instead select one of the online sub-scheduling nodes as the main scheduling node, either according to a given algorithm or at random.

[0123] The election criterion for the new main scheduling node can be set as required and is not limited herein.
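
As a rough illustration of such an election, the following Python sketch picks a new main scheduling node from the online sub-scheduling nodes, either by a deterministic rule or at random. The function name `elect_main_node`, the string node identifiers, and the "lowest identifier wins" rule are assumptions made for the example; the source does not specify the algorithm.

```python
import random


def elect_main_node(online_nodes, strategy="lowest_id", rng=None):
    """Pick one online sub-scheduling node to act as the main scheduling node.

    online_nodes: iterable of node identifiers (hypothetical string ids).
    strategy: "lowest_id" (an assumed deterministic rule) or "random".
    rng: optional random.Random instance, useful for reproducible tests.
    """
    candidates = sorted(online_nodes)
    if not candidates:
        raise RuntimeError("no online sub-scheduling node is available")
    if strategy == "lowest_id":
        # Deterministic rule: every node that runs the election agrees.
        return candidates[0]
    if strategy == "random":
        return (rng or random).choice(candidates)
    raise ValueError(f"unknown election strategy: {strategy!r}")
```

A deterministic rule has the practical advantage that every node evaluating the same list of online nodes reaches the same result without extra coordination, whereas a random choice needs a single decider or a shared seed.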

[0124] In an embodiment, so that jobs are discovered and responded to promptly, the main scheduling node 120 also scans for signal files at a predetermined interval (for example, once every 5 seconds). When a signal file is found, the node starts the job corresponding to that signal file and puts the started job into the pending job queue.
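
The periodic scan can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the `.sig` file suffix, the convention that the file name identifies the job, and the deletion of a signal file once it has been consumed are all assumptions introduced for the example.

```python
import time
from collections import deque
from pathlib import Path


def scan_signal_files(signal_dir, pending_queue, suffix=".sig"):
    """Run one scan: start a job for every signal file that is found.

    Assumes (hypothetically) that the file "<job>.sig" triggers job "<job>".
    Returns the names of the jobs started during this scan.
    """
    started = []
    for path in sorted(Path(signal_dir).glob(f"*{suffix}")):
        job_name = path.stem
        path.unlink()                    # consume the signal so it fires once
        pending_queue.append(job_name)   # started job joins the pending queue
        started.append(job_name)
    return started


def run_scanner(signal_dir, pending_queue, interval_s=5.0, max_scans=None):
    """Scan repeatedly at a fixed interval (5 seconds in the text's example)."""
    scans = 0
    while max_scans is None or scans < max_scans:
        scan_signal_files(signal_dir, pending_queue)
        scans += 1
        time.sleep(interval_s)
```

Calling `run_scanner("/path/to/signals", deque())` would poll indefinitely; `max_scans` exists only so the loop can be bounded in tests.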

[0125] In one embodiment of this document, in order to detect and eliminate faults in time, the main scheduling node 120 is also used to detect the number of jobs to be executed and the result of job distri...

Other embodiments

[0157] In other embodiments, determining the number of main scheduling nodes and sub-scheduling nodes to add according to the number of jobs to be processed and its growth rate includes: inputting the number of jobs to be processed and its growth rate into the first quantity calculation model to obtain the number of main scheduling nodes and sub-scheduling nodes to add.

[0158] Determining the number of main scheduling nodes and sub-scheduling nodes to remove according to the number of jobs to be processed and its growth rate includes: inputting the number of jobs to be processed and its growth rate into the second quantity calculation model to obtain the number of main scheduling nodes and sub-scheduling nodes to remove.

[0159] The first quantity calculation model and the second quantity calculation model are established in advance from historical records.
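
The source does not state the form of these models. Purely as an illustration, the Python sketch below assumes one possible form for the first (scale-up) model: project the load a few minutes ahead from the pending job count and its growth rate, then divide by an average "jobs absorbed per added node" figure estimated from historical records. The `HistoryRecord` fields, the five-minute horizon, and the jobs-per-minute unit are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoryRecord:
    pending_jobs: int    # jobs waiting when the record was taken
    growth_rate: float   # jobs per minute (assumed unit)
    nodes_added: int     # nodes actually added on that occasion


def fit_jobs_per_node(history, horizon_min=5.0):
    """Estimate, from history, the projected load one added node absorbed."""
    total_load = sum(r.pending_jobs + r.growth_rate * horizon_min
                     for r in history)
    total_nodes = sum(r.nodes_added for r in history)
    if total_nodes == 0:
        raise ValueError("history contains no scale-up events")
    return total_load / total_nodes


def nodes_to_add(pending_jobs, growth_rate, jobs_per_node, horizon_min=5.0):
    """Apply the fitted figure to the current queue and its growth rate."""
    projected = pending_jobs + growth_rate * horizon_min
    return max(0, math.ceil(projected / jobs_per_node))
```

Under this assumption the same fit would be run separately for main scheduling nodes and sub-scheduling nodes, and a second model of the same shape, fitted on scale-down events, would give the number of nodes to remove.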

[0160] During specific implementation, the first quantity calculation model and the seco...



Abstract

The invention provides a distributed job flow task management and scheduling system and method. The system comprises a database node, a main scheduling node, and sub-scheduling nodes. The database node stores job configuration information, which comprises the maximum job amount of each sub-scheduling node, the maximum job amount of each computing platform, and the job resource consumption. The main scheduling node is connected to the database node and the sub-scheduling nodes; it generates job distribution information from the job configuration information and the pending job queue and sends it to the sub-scheduling nodes. The job distribution information comprises the association between pending jobs, sub-scheduling nodes, and computing platforms. Each sub-scheduling node submits the pending jobs to the corresponding computing platforms for processing according to the job distribution information. The jobs are obtained by logically encapsulating the work to be processed by the computing platforms. In this way scheduling is separated from the jobs, differences in the technical characteristics of the computing platforms are eliminated, and unified scheduling is achieved.
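
To make the distribution step concrete, here is a minimal Python sketch of how a main scheduling node might build job distribution information from the configured limits. It is an assumption-laden illustration: the `Job` fields, the rule that a job's resource consumption counts against both the node limit and the platform limit, and the "least loaded node first" choice are not taken from the source.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    platform: str   # computing platform the job must run on
    cost: int = 1   # job resource consumption (assumed to be an integer)


def distribute(pending, node_capacity, platform_capacity):
    """Build job distribution information from the pending job queue.

    node_capacity: {sub-scheduling node id: maximum job amount}
    platform_capacity: {computing platform id: maximum job amount}
    Returns (assignments, deferred): assignments is a list of
    (job name, sub-scheduling node, platform) tuples; deferred holds the
    jobs that do not fit right now and stay pending.
    """
    node_free = dict(node_capacity)
    platform_free = dict(platform_capacity)
    assignments, deferred = [], deque()
    while pending:
        job = pending.popleft()
        # Assumed policy: offer the job to the node with the most free slots.
        node = max(node_free, key=lambda n: node_free[n], default=None)
        fits_node = node is not None and node_free[node] >= job.cost
        fits_platform = platform_free.get(job.platform, 0) >= job.cost
        if not (fits_node and fits_platform):
            deferred.append(job)
            continue
        node_free[node] -= job.cost
        platform_free[job.platform] -= job.cost
        assignments.append((job.name, node, job.platform))
    return assignments, deferred
```

Because the limits live in configuration rather than in the jobs themselves, changing the number of concurrent jobs on a node or platform only requires updating the capacity maps.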

Description

Technical field

[0001] This document relates to the field of business scheduling, in particular to a distributed job flow task management and scheduling system and method.

Background

[0002] In the era of big data, many open source or commercial big data computing engine systems run a large number of complex data analysis or computing tasks, and the job links are complex. To address such scenarios, a number of solutions have emerged in open source and commercial software, each with its own advantages and disadvantages. The current mainstream open source and commercial batch job scheduling systems mainly fall into the following categories:

[0003] Azkaban is mainly used in batch processing scenarios related to the open source big data ecosystem. After specific packaging, it can also be applied in the field of commercial computing engines. It has comprehensive functions and good scheduling support capabilities, but it has d...


Application Information

IPC(8): G06F9/48, G06F9/50
CPC: G06F9/4881, G06F9/5011, G06F9/5038, G06F9/5083, G06F9/505, G06F2209/484, G06F2209/5021
Inventor 张振洪, 周犇
Owner SHANGHAI PUDONG DEVELOPMENT BANK