Two-stage self-adaptive scheduling method suitable for large-scale parallel data processing tasks

A data processing task scheduling technology, applied in the fields of electrical digital data processing, program initiation/switching, and program control design, that uses system resources efficiently, offers high flexibility, and improves task processing efficiency.

Active Publication Date: 2018-11-30

中国航天系统科学与工程研究院



Problems solved by technology

[0005] The technical problem solved by the present invention is to overcome the deficiencies of the prior art by providing a two-level adaptive scheduling method suitable for large-scale parallel data processing tasks. The method schedules work at two levels, task and subtask, to increase the degree of parallelism. This effectively solves the parallel scheduling difficulty caused by complex dependencies among tasks, achieves orderly and efficient parallel processing of large-scale data processing tasks, and reduces the overall execution time.


Embodiment Construction

[0053] In order to execute large-scale data processing tasks in an orderly and efficient manner, the method of the present invention schedules tasks at two levels, aiming to maximize parallelism and reduce task waiting and execution time. At the first level, the task level, each task declares the predecessor tasks it depends on, and the scheduler builds a topology from these declarations to ensure that tasks are executed in dependency order, while tasks without dependencies on one another can be executed in parallel. At the second level, the subtask level, each task is divided into a series of actions or functions (subtasks). The data and resources required by each subtask have already been loaded at the first-level task layer. The purpose of dividing tasks into subtasks is to further increase the degree of parallelism: subtasks that have no resource conflicts and no execution-order constraints are assigned to multiple threads and executed simultaneously.
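The two levels described in [0053] can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name run_two_level, the task dictionary layout, and the use of two separate thread pools are all hypothetical choices made for the example. Subtasks are assumed to be free of resource conflicts and ordering constraints.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


def run_two_level(tasks, max_workers=4):
    """Run tasks in dependency order, with each task's subtasks in parallel.

    tasks maps a task name to {"deps": [names], "subtasks": [callables]}.
    This layout is a hypothetical one chosen for the sketch.
    Returns (task completion order, subtask results per task).
    """
    # First level: build the dependency topology from declared predecessors.
    sorter = TopologicalSorter({name: spec["deps"] for name, spec in tasks.items()})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies

    finished_order = []
    results = {}
    # Separate pools so tasks waiting on subtasks cannot starve them of threads.
    with ThreadPoolExecutor(max_workers) as task_pool, \
            ThreadPoolExecutor(max_workers) as subtask_pool:

        def run_task(name):
            # Second level: submit all subtasks of this task at once
            # (assumed independent of each other).
            futures = [subtask_pool.submit(fn) for fn in tasks[name]["subtasks"]]
            return [f.result() for f in futures]

        pending = {}  # future -> task name
        while sorter.is_active():
            # Every task whose predecessors are all done can start now.
            for name in sorter.get_ready():
                pending[task_pool.submit(run_task, name)] = name
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                name = pending.pop(fut)
                results[name] = fut.result()
                finished_order.append(name)
                sorter.done(name)  # unlocks successor tasks
    return finished_order, results
```

With a graph in which "clean" and "stats" both depend on "load" and "report" depends on both, "load" always completes first and "report" last, while "clean" and "stats" may run concurrently in either order.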

[0054] Data processing tasks refer to tasks such as data collection, da...


Abstract

The invention discloses a two-level adaptive scheduling method suitable for large-scale parallel data processing tasks. Tasks are scheduled at two levels: the task level and the subtask level. The method effectively solves the parallel scheduling difficulty caused by complex dependencies among tasks, increases the degree of parallelism, achieves orderly and efficient parallel processing of large-scale data processing tasks, reduces task waiting and execution time, and shortens the overall execution time. In addition, executor runtime statistics can be fed back to the scheduler, so that the executor pool size and task types are adjusted adaptively and scheduling is continuously optimized, which improves the efficiency of system resource use.
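The feedback loop mentioned in the abstract could look roughly like the sketch below. The abstract does not say which statistics are collected or how the pool size is recalculated, so the ExecutorStats fields, the utilization thresholds, and the grow/shrink steps here are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExecutorStats:
    """Hypothetical statistics an executor pool reports back to the scheduler."""
    busy_seconds: float   # total time executors spent running tasks
    idle_seconds: float   # total time executors spent waiting for work
    queued_tasks: int     # tasks still waiting for a free executor


def next_pool_size(current, stats, min_size=1, max_size=32, high=0.85, low=0.30):
    """Suggest a new executor pool size from the latest statistics.

    The thresholds and step sizes are illustrative assumptions, not values
    taken from the patent.
    """
    total = stats.busy_seconds + stats.idle_seconds
    utilization = stats.busy_seconds / total if total > 0 else 0.0

    if utilization >= high and stats.queued_tasks > 0:
        # Executors are saturated and work is waiting: grow by about half.
        target = current + max(1, current // 2)
    elif utilization <= low and stats.queued_tasks == 0:
        # Executors are mostly idle with nothing queued: shrink by about a quarter.
        target = current - max(1, current // 4)
    else:
        target = current
    # Keep the suggestion within the configured bounds.
    return max(min_size, min(max_size, target))
```

A scheduler would call next_pool_size periodically with fresh statistics and resize the pool to the returned value; the same feedback could drive other adjustments, such as which task types a pool accepts.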

Description

Technical field

[0001] The invention relates to a two-level adaptive scheduling method suitable for large-scale parallel data processing tasks, and belongs to the field of data processing task scheduling.

Background art

[0002] With the continuous development of Internet technology, the demand for storing and processing massive data in various fields keeps growing, and so do the requirements on processing efficiency and cost. How to allocate large-scale data processing tasks reasonably across multi-processor systems, improve execution efficiency, and minimize the overall execution time has become a problem that urgently needs to be solved.

[0003] Traditional general-purpose task scheduling algorithms, such as first-come-first-served scheduling, highest-priority-first scheduling, and time-slice round-robin scheduling, have certain limitations and are not suitable for large-scale data processing task scheduli...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F9/48

CPC: G06F9/4881

Inventor: 顾升高刘瑞齐俊鹏胡泉杨越孙毅方

Owner: 中国航天系统科学与工程研究院
