Decentralized scheduling and execution method and device for a big data platform

A decentralized scheduling technique for big data platforms, applied in the field of electrical components and transmission systems. It addresses problems such as insufficient computing resources on the data platform, platform downtime, and task trigger and completion times that fail to meet business-side expectations.

Active Publication Date: 2021-05-18
鲸灵科技有限责任公司

AI Technical Summary

Problems solved by technology

However, in some actual use scenarios, there are still various problems, such as: due to insufficient computing resources of the data p



Examples


Embodiment 1

[0037] As shown in Figure 2, a decentralized scheduling and execution method for a big data platform includes the following steps:

[0038] (1) Each Trigger uses a Quartz cron expression to automatically trigger its task once; the definitions of all Trigger instances are persisted in MySQL by the trigger management module;

[0039] MySQL includes support for triggers. A trigger is a database object associated with table operations: when a specified event occurs on the table where the trigger is defined, the trigger is invoked.

[0040] (2) The trigger management module runs on a timer (every 5 minutes, for example). As shown in Figure 2, it writes to the ZooKeeper cluster the key information of every Trigger instance that is persisted in MySQL and whose Quartz cron expression is valid; each Trigger instance occupies one ZNode;

[0041] (3) Several scheduling trigger modules are distributed in the cluster to run simultaneously in un...
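
Steps (1) and (2) can be illustrated with a small Python sketch. It is a model under stated assumptions, not the patent's implementation: the `TriggerRow` fields, the `/triggers` root path, and the use of a plain dict in place of the ZooKeeper tree are all hypothetical. The validity check only counts fields (Quartz cron expressions have 6 or 7 space-separated fields); a real system would presumably delegate validation to Quartz itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerRow:
    """Hypothetical shape of one persisted Trigger definition (one MySQL row)."""
    trigger_id: str
    cron: str        # Quartz cron expression
    task_name: str


def looks_like_quartz_cron(expr: str) -> bool:
    """Coarse check only: a Quartz cron expression has 6 or 7 fields."""
    return len(expr.split()) in (6, 7)


def sync_triggers(rows, znodes, root="/triggers"):
    """Write the key info of every valid Trigger under `root`, one ZNode each.

    `znodes` is a plain dict (path -> data) standing in for the ZooKeeper
    tree. Returns the paths written during this pass.
    """
    written = []
    for row in rows:
        if not looks_like_quartz_cron(row.cron):
            continue  # definitions with an invalid expression are skipped
        path = f"{root}/{row.trigger_id}"
        znodes[path] = {"cron": row.cron, "task": row.task_name}
        written.append(path)
    return written


# One pass of the periodic sync (the source suggests every 5 minutes).
rows = [
    TriggerRow("t1", "0 0/5 * * * ?", "refresh_metrics"),
    TriggerRow("t2", "not a cron", "broken_task"),
]
tree = {}
paths = sync_triggers(rows, tree)
```

After this pass, `tree` holds a single entry for `t1`, while `t2` is left out because its expression fails the field-count check.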


Abstract

The invention discloses a decentralized method for triggering and executing scheduled tasks on a big data platform. Multiple scheduling trigger modules, each running as a separate process, are distributed across a ZooKeeper cluster and run simultaneously. Each scheduling trigger module writes the META information of the big data computing tasks it schedules into the ZooKeeper cluster. Task execution modules, also running as separate processes, are distributed across multiple ZNodes of the ZooKeeper cluster. Each task execution module traverses the ZooKeeper ZNode tree in sequence; when it finds a sub-process whose execution conditions are met, it locks the task, obtains the task information, executes the task, and writes the sub-process execution result and the task's state information into the task history in MySQL. A user connects to the back-end WEB module over the WebSocket protocol and can follow, in real time, the configuration and state information of all tasks recorded in the ZooKeeper tree.
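
The traverse, lock, execute, and record loop described in the abstract can be sketched as follows. This is an illustrative model only: `ZNodeTree` is an in-memory stand-in rather than a ZooKeeper client, the `ready` flag is an assumed representation of "execution conditions are met", and the `history` list stands in for the MySQL task history table.

```python
import threading


class ZNodeTree:
    """In-memory stand-in for the ZooKeeper tree (hypothetical, not a client)."""

    def __init__(self, tasks):
        self._tasks = dict(tasks)   # path -> task info
        self._locks = {}            # path -> name of the executor holding it
        self._mutex = threading.Lock()

    def children(self):
        return sorted(self._tasks)

    def get(self, path):
        return self._tasks[path]

    def try_lock(self, path, owner):
        """Atomically claim a task, mimicking 'create lock node if absent'."""
        with self._mutex:
            if path in self._locks:
                return False
            self._locks[path] = owner
            return True


def run_executor(name, tree, history, history_mutex):
    """Traverse the tree in order; lock, run, and record each ready task."""
    for path in tree.children():
        info = tree.get(path)
        if not info["ready"]:
            continue                      # execution condition not met
        if not tree.try_lock(path, name):
            continue                      # another executor already owns it
        result = f"ran {info['task']}"    # placeholder for the real work
        with history_mutex:
            history.append({"path": path, "executor": name,
                            "status": "SUCCESS", "result": result})


tree = ZNodeTree({
    "/tasks/a": {"task": "load_orders", "ready": True},
    "/tasks/b": {"task": "build_report", "ready": True},
    "/tasks/c": {"task": "export_csv", "ready": False},
})
history, history_mutex = [], threading.Lock()
workers = [threading.Thread(target=run_executor,
                            args=(f"exec-{i}", tree, history, history_mutex))
           for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
executed = sorted(entry["path"] for entry in history)
```

Because the lock is claimed atomically, each ready task appears in `history` exactly once regardless of which executor wins it, and no executor acts as a central coordinator.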

Description

Technical field

[0001] The invention relates to a decentralized scheduling and execution method and device for a big data platform.

Background art

[0002] Task scheduling and execution are important components of an operating system. For the unified scheduling of periodic tasks, the prior art mainly decides whether to schedule a task by checking whether the time elapsed between the task's last execution start time and the current time is greater than the task's execution interval. In data warehouse terms, a task is the smallest unit of data processing; in practice, data processing in a data warehouse is completed by thousands of processing tasks of various kinds. In currently popular data warehouse platforms, task scheduling generally implements concurrent scheduling based on task dependencies, and the concurrency and priority of task execution can be configured. The higher a task's priority, the earlier it is triggered. However, in some actu...
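
The interval-based check that the background attributes to the prior art can be written in a few lines. The sketch below is an assumption about how such a check might look; the function name `is_due` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def is_due(now: datetime, last_start: datetime, interval: timedelta) -> bool:
    """A task is due when the time since its last start exceeds its interval."""
    return now - last_start > interval


last_start = datetime(2021, 5, 18, 12, 0, 0)
interval = timedelta(minutes=5)

due_early = is_due(datetime(2021, 5, 18, 12, 4, 0), last_start, interval)
due_late = is_due(datetime(2021, 5, 18, 12, 6, 0), last_start, interval)
```

Here `due_early` is `False` (only 4 minutes have elapsed) and `due_late` is `True` (6 minutes have elapsed).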


Application Information

IPC(8): H04L29/08
CPC: H04L67/10, H04L67/02, H04L67/60, H04L67/62, Y02D10/00
Inventor: 王晟
Owner: 鲸灵科技有限责任公司