Spark task scheduling method and system based on heterogeneous resources

A task scheduling and resource technology, applied in the computer field, that addresses problems such as abnormal operation, loss of high-efficiency performance, and unsupported resource manager status updates. It avoids wasting system resources, addresses performance optimization problems, and reduces repeated computing operations.

Pending Publication Date: 2022-07-29
HUNAN UNIV

AI Technical Summary

Problems solved by technology

The purpose is to solve the following technical problems of existing task scheduling systems: the resource manager only collects the number of CPU cores, which wastes system resources; the Spark task scheduling process involves a large number of RDD data reorganization operations whose performance is poorly optimized, which greatly reduces operating efficiency; no efficient storage data structure is selected during task node execution, which loses high-efficiency performance; and the running results obtained during Spark task scheduling are not updated to the resource manager in real time, so running exceptions cannot be detected.

Examples

Embodiment Construction

[0057] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as there is no conflict with each other.

[0058] As shown in Figure 4, the present invention provides a Spark task scheduling method based on heterogeneous resources, which specifically includes the following steps:

[0059] (1) The server obtains the resource information required by the system using Linux commands and submits it to the resource manager to create a cluster manager and complete initialization;

[0060] Specifi...
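Step (1) can be illustrated with a minimal sketch of collecting resource information through Linux commands. The commands used here (`nproc`, `cat /proc/meminfo`, `nvidia-smi -L`), the helper names, and the choice of memory and GPU count as the additional resources are all assumptions made for illustration; the text above does not state which commands or resource types the invention uses.

```python
import os
import shutil
import subprocess


def run(cmd):
    """Run a command and return its stdout, or '' if it is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return ""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=5, check=False)
        return result.stdout
    except (OSError, subprocess.SubprocessError):
        return ""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def collect_resources():
    """Gather a (hypothetical) heterogeneous resource report for one node."""
    cores_text = run(["nproc"]).strip()
    cpu_cores = int(cores_text) if cores_text.isdigit() else (os.cpu_count() or 1)
    mem = parse_meminfo(run(["cat", "/proc/meminfo"]))
    # `nvidia-smi -L` prints one line per GPU; no output means no GPU was found.
    gpu_lines = [l for l in run(["nvidia-smi", "-L"]).splitlines()
                 if l.startswith("GPU ")]
    return {
        "cpu_cores": cpu_cores,
        "mem_total_kb": mem.get("MemTotal", 0),
        "gpu_count": len(gpu_lines),
    }


if __name__ == "__main__":
    # A real system would submit this report to the resource manager.
    print(collect_resources())
```

Reporting memory and accelerators alongside the core count is one way a resource manager could avoid the CPU-only view that the document identifies as a source of wasted resources.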

Abstract

The invention discloses a Spark task scheduling method based on heterogeneous resources, comprising the following steps: a server acquires the resource information required by the system based on Linux commands and submits it to a resource manager to create a cluster manager and complete initialization; the server receives a task job submitted by a client, creates the task job, and submits it to the created cluster manager so that the task job is converted into a plurality of RDDs; all the obtained RDDs are analyzed to obtain an RDD graph representing the dependency relationships among the RDDs; the server generates a DAG of scheduling stages according to the dependency relationships among all the RDDs in the RDD graph; and the server divides all the RDDs in the DAG into a first task stage, a second task stage and a third task stage according to their dependencies. The task scheduling system addresses performance optimization from several aspects and solves the technical problem of wasted system resources caused by the resource manager of existing task scheduling systems collecting only the number of CPU cores.
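The step from an RDD dependency graph to task stages can be sketched as follows. This is a simplified illustration, not the invention's algorithm: it assumes the usual Spark convention that a wide dependency marks a stage boundary while narrow dependencies stay in the same stage, and it groups RDDs only by how many wide dependencies precede them. The graph contents and the names `RDD_GRAPH`, `assign_stages`, and `group_by_stage` are hypothetical.

```python
from collections import defaultdict

# Hypothetical RDD graph: each RDD maps to its (parent, dependency kind) pairs.
RDD_GRAPH = {
    "lines": [],
    "words": [("lines", "narrow")],
    "pairs": [("words", "narrow")],
    "counts": [("pairs", "wide")],
    "sorted": [("counts", "wide")],
}


def assign_stages(graph):
    """Return {rdd: stage index}; each wide dependency starts a later stage."""
    stage = {}

    def visit(rdd):
        if rdd in stage:
            return stage[rdd]
        level = 0
        for parent, kind in graph[rdd]:
            parent_level = visit(parent)
            # A narrow dependency keeps the parent's stage; a wide one adds 1.
            level = max(level, parent_level + (1 if kind == "wide" else 0))
        stage[rdd] = level
        return level

    for rdd in graph:
        visit(rdd)
    return stage


def group_by_stage(stage):
    """Turn {rdd: stage index} into a list of stages, earliest first."""
    groups = defaultdict(list)
    for rdd, index in stage.items():
        groups[index].append(rdd)
    return [sorted(groups[i]) for i in sorted(groups)]


if __name__ == "__main__":
    for number, rdds in enumerate(group_by_stage(assign_stages(RDD_GRAPH)), 1):
        print(f"task stage {number}: {rdds}")
```

With this example graph the RDDs fall into three groups, which loosely mirrors the first, second and third task stages named in the abstract; the criteria the invention actually uses for that division are not given in this excerpt.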

Description

Technical field

[0001] The invention belongs to the field of computer technology, and more particularly relates to a Spark task scheduling method and system based on heterogeneous resources.

Background technique

[0002] Task scheduling is an important part of the operating system. For real-time operating systems, task scheduling directly affects real-time performance. The task scheduling system is a core component of a data platform. In daily data processing, it is very common to run some business jobs on a regular schedule, such as regularly importing new data from the database to the data platform, and exporting the data processed by the data platform to the database or file system. In this sense, the task scheduling system is similar to the commander of an army, directing the operation of each component on the data platform and supervising the operation of tasks at all times.

[0003] However, the traditional task scheduling system has some deficiencies that cannot be...

Application Information

IPC(8): G06F9/48, G06F9/50
CPC: G06F9/5066, G06F9/5077, G06F9/4843, G06F9/5027, G06F9/5011
Inventor: 唐卓, 伍晨, 李肯立, 向婷, 李虹宇, 王啸, 罗文明, 程欣威
Owner: HUNAN UNIV