
A Two-Level Adaptive Scheduling Method for Massively Parallel Data Processing Tasks

A data processing task scheduling technology, applied in the fields of electrical digital data processing, multiprogramming arrangements, and program control design, that achieves efficient use of system resources, high flexibility, and improved task processing efficiency.

Active Publication Date: 2020-03-24
中国航天系统科学与工程研究院
Cites: 8; Cited by: 0

AI Technical Summary

Problems solved by technology

[0005] The technical problem solved by the present invention is to overcome the deficiencies of the prior art by providing a two-level adaptive scheduling method suitable for large-scale parallel data processing tasks. The method schedules work at two levels, task and subtask, to increase the degree of parallelism. This effectively solves the parallel scheduling difficulty caused by complex dependencies among tasks, achieves orderly and efficient parallel processing of large-scale data processing tasks, and reduces the overall execution time.



Examples


Embodiment Construction

[0053] To execute large-scale data processing tasks in parallel in an orderly and efficient way, the method of the present invention schedules tasks at two levels, aiming to maximize parallelism and reduce task waiting and execution time. At the first level, the task level, each task declares the predecessor tasks it depends on, and the scheduler builds a topology from these declarations so that tasks execute in dependency order, while tasks with no dependencies on one another can execute in parallel. At the second level, the subtask level, each task is divided into a series of actions or functions. The data and resources required by each subtask have already been loaded at the first (task) level. The purpose of dividing tasks into subtasks is to further increase the degree of parallelism: subtasks that have no resource conflicts and no required execution order are assigned to multiple threads and executed at the same time.
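The two levels described in [0053] can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `run_two_level`, the shape of the `tasks` and `deps` arguments, and the use of the standard-library `graphlib.TopologicalSorter` and `ThreadPoolExecutor` are all assumptions made for illustration. The sketch also assumes that the subtasks of one task are independent of each other, and it processes ready tasks in waves, which is simpler than a fully event-driven scheduler.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def run_two_level(tasks, deps, max_workers=4):
    """Run tasks in dependency order, with subtasks spread over threads.

    tasks: dict mapping a task name to a list of zero-argument callables
           (hypothetical representation of that task's subtasks).
    deps:  dict mapping a task name to the set of its predecessor tasks.
    Returns the task names in the order they were marked finished.
    """
    # Level 1: build the task topology from the declared predecessors.
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies

    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task returned here has all of its predecessors done,
            # so the tasks in one wave may run in parallel.
            ready = sorter.get_ready()
            # Level 2: submit each ready task's subtasks to the thread pool.
            futures = {
                name: [pool.submit(sub) for sub in tasks[name]]
                for name in ready
            }
            for name, subtask_futures in futures.items():
                for future in subtask_futures:
                    future.result()  # re-raises any subtask exception
                sorter.done(name)  # unlocks the successors of this task
                finished.append(name)
    return finished
```

For example, with `deps = {"clean": {"load"}, "report": {"clean"}}`, the subtasks of `load` all complete before any subtask of `clean` starts, while the subtasks inside each task run concurrently on the pool.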

[0054] Data processing tasks refer to tasks such as data collection, data p...


Abstract

The invention discloses a two-level adaptive scheduling method suitable for large-scale parallel data processing tasks. Tasks are scheduled at two levels: the task level and the subtask level. The method effectively solves the parallel scheduling difficulty caused by complex dependencies among tasks, increases the degree of parallelism, achieves orderly and efficient parallel processing of large-scale data processing tasks, reduces task waiting and execution time, and shortens the overall execution time. In addition, the method feeds executor run-time statistics back to the scheduler, which adaptively adjusts the executor pool size and task types and continuously optimizes scheduling, so that system resources are used more efficiently.
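The feedback loop mentioned in the abstract, in which executor statistics drive the pool size, can be sketched as follows. The abstract does not state which statistics are collected or what adjustment rule is used, so everything here is a hypothetical illustration: the `ExecutorStats` fields, the `next_pool_size` function, the utilization thresholds, and the double-or-halve rule are assumptions, not the patent's policy. Adjustment of task types is not covered.

```python
from dataclasses import dataclass


@dataclass
class ExecutorStats:
    """Hypothetical statistics an executor pool might report back."""
    busy_workers: int   # workers currently running a subtask
    pool_size: int      # current number of workers (must be >= 1)
    queued_tasks: int   # tasks waiting for a free worker


def next_pool_size(stats, min_size=1, max_size=32, high=0.8, low=0.3):
    """Suggest a new pool size from the latest statistics.

    Assumed rule: grow when the pool is nearly saturated and work is
    waiting, shrink when it is mostly idle with nothing queued,
    otherwise keep the current size.
    """
    utilization = stats.busy_workers / stats.pool_size
    if utilization >= high and stats.queued_tasks > 0:
        target = stats.pool_size * 2
    elif utilization <= low and stats.queued_tasks == 0:
        target = stats.pool_size // 2
    else:
        target = stats.pool_size
    # Clamp the suggestion to the configured bounds.
    return max(min_size, min(max_size, target))
```

A scheduler could call `next_pool_size` periodically and resize its executor pool to the returned value; the bounds keep the pool from collapsing to zero or growing without limit.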

Description

Technical field

[0001] The invention relates to a two-level adaptive scheduling method suitable for large-scale parallel data processing tasks, and belongs to the field of data processing task scheduling.

Background

[0002] With the continuous development of Internet technology, the demand for large-scale, massive data storage and processing continues to grow in many fields, and the requirements on efficiency and processing cost are rising as well. How to allocate large-scale data processing tasks sensibly across multi-processor systems, improve execution efficiency, and minimize the overall execution time has become an urgent problem.

[0003] Traditional general-purpose task scheduling algorithms, such as first-come-first-served scheduling, highest-priority-first scheduling, and round-robin time-slice scheduling, have certain limitations and are not suitable for large-scale data processing task...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/48
CPC: G06F9/4881
Inventor: 顾升高刘瑞齐俊鹏胡泉杨越孙毅方
Owner: 中国航天系统科学与工程研究院