A work-slot-aware Storm platform job equal-share scheduling method

A scheduling method and work slot technology, applied in the fields of resource allocation, multiprogramming arrangements, and program initiation/switching. It addresses the problems that topology tasks are no longer evenly distributed, that load imbalance is aggravated, and that the Storm cluster is exposed to security threats. Its effects are improved cluster resource utilization, avoidance of homogeneous resource competition, and reduced communication traffic.

Active Publication Date: 2019-10-22
PEKING UNIV

AI Technical Summary

Problems solved by technology

1) Analysis of the sorting results shows that the sort-slots method has a limitation: some worker nodes are always sorted first, and others are always sorted last.
[0019] 2) Because of the limitation of the sort-slots method of the class A equal-share scheduler described in 1), when multiple topologies are scheduled, topology tasks are always assigned to a few of the top-ranked worker nodes.
3) While a Storm cluster is running, events such as a worker node failing (fail-fast), a worker process failing, a topology being killed by the user, or a new worker node being added cause work slot resources in the cluster to be released and reclaimed. As a result, topology tasks are no longer evenly distributed across the worker nodes, and the cluster resource load becomes unbalanced.
If malicious users in the cluster exploit this situation, it poses a security threat to the Storm cluster.
[0021] 4) In normal use of a Storm cluster, the class B equal-share scheduler can share cluster resources equally for both a single topology and multiple topologies, but it is still a node-level scheduling strategy.
In particular, if the situation described in 3) occurs in the cluster, neither the class A nor the class B equal-share scheduler that ships with Storm can resolve the load imbalance.
This node-level round-robin strategy still distributes the tasks of a new topology equally across the worker nodes, which aggravates the load imbalance.



Examples


Embodiment

[0051] 1) Scenario 1: In the normal operation scenario, Figure 2 shows the work slot allocation in the three clusters (the class A Storm cluster, the class B Storm cluster, and the F-Storm cluster) after each has scheduled topology 1, topology 2, and topology 3.

[0052] Initially, 3 worker nodes are started in each Storm cluster. The available work slot ports on each worker node are 6700, 6701, 6702, and 6703, so each worker node has 4 available work slots.

[0053] ① Class A Storm cluster.

[0054] Topology 1 (4 workers) is submitted first and occupies 4 slots. The set passed to the sort-slots method of the class A equal-share scheduler is: {[S1 6700][S1 6701][S1 6702][S1 6703][S2 6700][S2 6701][S2 6702][S2 6703][S3 6700][S3 6701][S3 6702][S3 6703]}. It is grouped by worker node number into three groups: {[[S1 6700][S1 6701][S1 6702][S1 6703]][[S2 6700][S2 6701][S2 6702][S2 6703]][[S3 6700]...
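
The grouping step described in [0054] can be sketched as follows. This is a minimal illustration in Python, not Storm's own code: the function name `group_slots_by_node` and the representation of a slot as a `(node, port)` tuple are assumptions made for the example. It covers only the grouping by worker node, because the rest of the sort-slots procedure is cut off in this excerpt.

```python
from collections import defaultdict


def group_slots_by_node(slots):
    """Group (node, port) pairs by node; hypothetical helper for illustration.

    Returns one list per node, ordered by node name, with ports ascending.
    """
    groups = defaultdict(list)
    for node, port in slots:
        groups[node].append((node, port))
    return [sorted(groups[node]) for node in sorted(groups)]


# The cluster from [0052]: 3 worker nodes, each with ports 6700 to 6703.
nodes = ["S1", "S2", "S3"]
ports = [6700, 6701, 6702, 6703]
free_slots = [(node, port) for node in nodes for port in ports]

groups = group_slots_by_node(free_slots)
for group in groups:
    print(group)
```

Each printed line corresponds to one of the three groups listed in [0054].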


Abstract

The invention discloses a work-slot-aware equal-share job scheduling method for the Storm platform. The method includes the following steps: 1) obtain the set of topologies in the cluster that need task scheduling and the set of idle, available work slots in the cluster; 2) divide the work slots in the work slot set that belong to the same worker node into the same group; 3) repeatedly take a work slot from the group that currently has the most remaining work slots, put it into a slot-aware sorted sequence, and update the number of work slots in each group, until the work slot set is empty; 4) for each topology in the topology set, determine the number Na of work slots that need to be assigned to the topology, according to the number of processes set for the topology, the total number of work slots, and the number of work slots already assigned; 5) take work slots from the sequence in order, ensuring at all times that, among the work slots assigned to the topology, the ratio of the number n of work slots belonging to the same worker node to the total number of the topology's work slots is less than or equal to a set ratio alpha, until Na work slots have been taken and assigned to the topology.
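
Steps 2), 3), and 5) of the abstract can be sketched as follows. This is a hedged illustration in Python, not the patented implementation: the function names `slot_aware_order` and `allocate`, the tie-breaking rule (smallest node name first), and the treatment of Na as both the number of slots to take and the topology's total slot count are assumptions, since the abstract does not specify them. Step 4) is not modeled; Na is passed in directly.

```python
from collections import defaultdict


def slot_aware_order(free_slots):
    """Steps 2-3: group slots by node, then repeatedly take one slot from
    the group with the most remaining slots (ties: smallest node name,
    which is an assumption of this sketch)."""
    groups = defaultdict(list)
    for node, port in free_slots:
        groups[node].append((node, port))
    for group in groups.values():
        group.sort()

    order = []
    while any(groups.values()):
        # max() returns the first maximum, so sorted() fixes the tie-break.
        node = max(sorted(groups), key=lambda name: len(groups[name]))
        order.append(groups[node].pop(0))
    return order


def allocate(order, n_a, alpha):
    """Step 5: take up to n_a slots from the sequence while keeping each
    node's share of the topology's slots at or below alpha.

    Assumption: the topology's total slot count equals n_a. Fewer than
    n_a slots are returned if the constraint cannot be met."""
    per_node = defaultdict(int)
    chosen = []
    limit = alpha * n_a + 1e-9  # small tolerance for float rounding
    for node, port in order:
        if len(chosen) == n_a:
            break
        if per_node[node] + 1 <= limit:
            chosen.append((node, port))
            per_node[node] += 1
    return chosen


# Hypothetical unbalanced cluster: S1 has 4 free slots, S2 has 2, S3 has 1.
free_slots = [("S1", 6700), ("S1", 6701), ("S1", 6702), ("S1", 6703),
              ("S2", 6700), ("S2", 6701), ("S3", 6700)]
order = slot_aware_order(free_slots)
chosen = allocate(order, n_a=4, alpha=0.5)
print(order)
print(chosen)
```

In this sketch the sorted sequence favors the node with the most free slots, while the alpha cap stops a single topology from concentrating on that node. When every node has the same number of free slots, the ordering reduces to a round-robin over the nodes.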

Description

Technical field

[0001] The invention relates to Storm, a mainstream big data real-time stream computing platform, and addresses the load balancing problem of job scheduling on the Storm platform. In particular, it concerns a fine-grained, work-slot-aware equal-share job scheduling method for the Storm platform.

Background technique

[0002] In the era of big data, extracting valuable information in real time from large-scale data sets that are generated rapidly and continuously is a challenge. Hadoop, a traditional distributed big data processing system, mainly performs batch computation. It is suited to offline batch processing of static data and requires the input data to be stored in a distributed file system beforehand. Storm, a real-time distributed big data stream computing platform, is suited to real-time stream processing of dynamic data. It does not require data to be stored beforehand and instead computes on the data directly in memory in real time. Since data streaming computing is d...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/50, G06F9/48
Inventor: 沈晴霓, 钱文君, 杨雅辉, 吴中海
Owner: PEKING UNIV