Big data job scheduling system based on directed acyclic graph

A job scheduling technology based on a directed acyclic graph, applied in digital data processing, program startup/switching, program control design, etc. It addresses problems such as limited support for big data components, inconvenient task management and maintenance, and erroneous analysis results, and it achieves high availability.

Status: Inactive. Publication Date: 2017-06-09
上海轻维软件有限公司
5 Cites, 54 Cited by

AI Technical Summary

Problems solved by technology

The tasks of these components require different operating environments, and besides running on a schedule, tasks of different types depend on one another. Typically, the data tasks of each business are scheduled with Crontab, and dependencies between tasks are handled only by simple serial execution. This causes several problems: 1. Downstream tasks may still run when an upstream task has not finished or has failed, ultimately producing wrong analysis results. 2. Tasks cannot be executed concurrently, which lengthens the overall task execution window. 3. Task management and maintenance are very inconvenient, and it is hard to collect task execution times and run logs. 4. Timely and effective alarms are lacking.
[0004] However, both Oozie and Zeus have shortcomings. The disadvantages of Oozie are: 1. A workflow scheduled by Oozie can only be configured using XML files; 2. Scheduling can only be started from the command line; 3. Scheduling scripts cannot be debugged through the Oozie interface; 4. Oozie cannot debug scripts visually; 5. Few big data components are supported for scheduling.
The disadvantages of Zeus are: 1. It has long lacked maintainers; 2. It supports Hadoop 1.X; 3. Few big data components are supported for scheduling.
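
The first Crontab problem listed above (downstream tasks still running after an upstream task fails) is the kind of issue a dependency graph is meant to prevent. The following is a minimal sketch under stated assumptions, not the patented implementation: dependencies are assumed to be given as a mapping from task name to the set of upstream task names, and `run_dag`, `run_task`, and the task names are hypothetical. It relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def run_dag(deps, run_task):
    """Run tasks in dependency order and skip tasks whose upstream did not succeed.

    deps maps each task name to the set of task names it depends on (assumed format).
    run_task(name) is a hypothetical callable returning True on success, False on failure.
    """
    status = {}
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(deps).static_order():
        upstream = deps.get(name, ())
        if all(status[u] == "success" for u in upstream):
            status[name] = "success" if run_task(name) else "failed"
        else:
            # An upstream task failed or was skipped, so this task never runs.
            status[name] = "skipped"
    return status


# Hypothetical pipeline: "clean" fails, so "report" is skipped while "audit" still runs.
example_deps = {
    "extract": set(),
    "clean": {"extract"},
    "report": {"clean"},
    "audit": {"extract"},
}
example_status = run_dag(example_deps, lambda name: name != "clean")
```

With plain serial cron entries, "report" would have been started regardless of what happened to "clean"; here its status is recorded as skipped, which also gives a natural hook for raising an alarm.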

Examples

Embodiment Construction

[0024] In an embodiment of the present invention, the big data job scheduling system based on a directed acyclic graph supports task scheduling and monitoring across the components of the big data ecosystem, accommodates components added in the future, and has two sets of management modules, an active management module and a standby management module, to achieve high availability of the scheduling system.
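
The text visible here does not say how the active and standby management modules hand over control. As an illustration only, one common approach is heartbeat-based failover. Everything in the sketch below (`ManagerNode`, `pick_manager`, the 10-second timeout) is a hypothetical assumption, not something taken from the patent.

```python
class ManagerNode:
    """Hypothetical management module that reports periodic heartbeats."""

    def __init__(self, name):
        self.name = name
        self.last_heartbeat = None  # timestamp of the most recent heartbeat, if any

    def beat(self, now):
        self.last_heartbeat = now

    def is_alive(self, now, timeout):
        # A node counts as alive if it reported within the timeout window.
        return self.last_heartbeat is not None and now - self.last_heartbeat <= timeout


def pick_manager(active, standby, now, timeout=10.0):
    """Prefer the active module; fall back to the standby one if the active is silent."""
    if active.is_alive(now, timeout):
        return active
    if standby.is_alive(now, timeout):
        return standby
    return None  # neither module is reachable


# Both modules report at t=0; later only the standby keeps reporting.
active_node = ManagerNode("active")
standby_node = ManagerNode("standby")
active_node.beat(0.0)
standby_node.beat(0.0)
```

Timestamps are passed in explicitly rather than read from the clock so the selection logic stays deterministic and easy to test.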

[0025] In order to make the above objects, features and beneficial effects of the present invention more comprehensible, specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0026] Figure 1 is a module diagram of a big data job scheduling system based on a directed acyclic graph in an embodiment of the present invention.

[0027] As shown in Figure 1, the big data job scheduling system based on a directed acyclic graph provided by the present invention includes a database 11, which is used to store al...

Abstract

The invention discloses a big data job scheduling system based on a directed acyclic graph. The system comprises a database, which stores all the information about job tasks and service nodes. The system also comprises a database management module, a DAG algorithm module, a scheduling module, and a monitoring module. The monitoring module is connected to the scheduling module and to the job tasks; it monitors the information fed back by all running job tasks and passes that information to the scheduling module. The system supports task scheduling and monitoring across the components of the big data ecosystem, accommodates components added in the future, can easily be equipped with two sets of management modules (an active management module and a standby management module), and achieves high availability of the scheduling system.
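
To make the interplay of the DAG algorithm, scheduling, and monitoring modules more concrete, here is a rough sketch that assumes job dependencies are available as a mapping from task name to upstream task names. The names `schedule`, `run_task`, `feedback`, and the example tasks are hypothetical; the patent's actual algorithm is not visible in this excerpt. The sketch uses the standard-library `graphlib.TopologicalSorter` and `concurrent.futures.ThreadPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def schedule(deps, run_task, max_workers=4):
    """Run tasks batch by batch; tasks within a batch run concurrently.

    Returns a list of (task name, result) pairs, standing in for the
    information a monitoring module would pass back to the scheduler.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    feedback = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task returned here has all of its dependencies finished.
            ready = list(sorter.get_ready())
            results = list(pool.map(run_task, ready))
            for name, result in zip(ready, results):
                feedback.append((name, result))
                sorter.done(name)  # unblocks tasks that depend on this one
    return feedback


# Hypothetical job: two independent steps sit between "load" and "export".
example_jobs = {
    "load": set(),
    "hive_agg": {"load"},
    "spark_join": {"load"},
    "export": {"hive_agg", "spark_join"},
}
example_feedback = schedule(example_jobs, lambda name: f"{name}:ok")
```

Waiting for a whole batch before starting the next one keeps the sketch short. A production scheduler would more likely mark each task done as soon as it finishes so that newly unblocked tasks can start immediately.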

Description

Technical field

[0001] The invention relates to a big data job scheduling system, in particular to a big data job scheduling system based on a directed acyclic graph.

Background technique

[0002] As business indicators iterate and big data operations grow more complex, monitoring task runs and troubleshooting abnormal problems also become more complicated. The big data ecosystem has a rich set of components, so many different types of programs (tasks) run on big data platforms, such as MapReduce, Hive, Pig, Spark, Java, Shell, and Python. The tasks of these components require different operating environments, and besides running on a schedule, tasks of different types depend on one another. Typically, the data tasks of each business are scheduled with Crontab, and dependencies between tasks are handled only by simple serial execution. This causes several problems: 1. It is easy to cause the previous ta...

Application Information

IPC(8): G06F9/48
CPC: G06F9/4843
Inventor: 程永新, 宋辉, 温国祥
Owner: 上海轻维软件有限公司