Task priority control implementation method and device for Spark JDBC

A task priority control method and device applied to Spark JDBC. It addresses problems such as the inability to schedule resources, the inability to control the priority of retrieval SQL, and the resulting impact on business applications, thereby improving usability and offering broad application prospects.

Inactive Publication Date: 2019-06-07
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT

AI Technical Summary

Problems solved by technology

However, with the continuous increase of data volume and the continuous development of big data technology, SparkJdbc's native architecture c


Examples

Example Embodiment

[0039] The invention provides a method for implementing task priority control for Spark JDBC. The method includes: establishing multiple task priority queues in Spark JDBC; mapping the retrieval SQL submitted by a user to a task priority queue waiting to run in Spark JDBC; setting an execution limit for each task priority queue and rejecting retrieval SQL that exceeds the quota; scheduling hardware resources between task priority queues based on preset priorities and weights; and, within a single task priority queue, scheduling hardware resources using either a "first in, first out" strategy or a "fair" strategy. The embodiments of the invention allow hardware resource usage to be controlled flexibly through the JDBC interface in actual business use, and they meet the execution-order requirements of urgent, general, and low-priority tasks.
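
The queue behaviour described in [0039] can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and method names (`TaskQueue`, `Scheduler`, `share_between_queues`, `share_within_queue`) are hypothetical, and the exact way priority and weight are combined is an assumption, since the text only says that both are used. Here, slots are split between non-empty queues in proportion to weight, and any slots left over after rounding go to the most urgent queues first.

```python
from collections import deque
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a queue already holds its maximum number of queries."""


@dataclass
class TaskQueue:
    name: str
    priority: int        # assumption: lower number = more urgent
    weight: int          # relative share of resources between queues
    limit: int           # maximum number of queries accepted at once
    mode: str = "FIFO"   # in-queue strategy: "FIFO" or "FAIR"
    pending: deque = field(default_factory=deque)


class Scheduler:
    def __init__(self, queues):
        self.queues = {q.name: q for q in queues}

    def submit(self, queue_name, sql):
        """Map a retrieval SQL to a named queue, rejecting it if over quota."""
        queue = self.queues[queue_name]
        if len(queue.pending) >= queue.limit:
            raise QuotaExceeded(f"queue {queue_name!r} is full")
        queue.pending.append(sql)

    def share_between_queues(self, total_slots):
        """Split slots between non-empty queues by weight, then by priority."""
        active = [q for q in self.queues.values() if q.pending]
        if not active:
            return {}
        total_weight = sum(q.weight for q in active)
        shares = {q.name: total_slots * q.weight // total_weight for q in active}
        leftover = total_slots - sum(shares.values())
        # Rounding leftovers go to the most urgent queues first (assumption).
        for q in sorted(active, key=lambda q: q.priority):
            if leftover == 0:
                break
            shares[q.name] += 1
            leftover -= 1
        return shares

    def share_within_queue(self, queue_name, slots):
        """Split one queue's slots between its pending queries."""
        queue = self.queues[queue_name]
        jobs = list(queue.pending)
        if not jobs:
            return []
        if queue.mode == "FIFO":
            # Simplification: the oldest query takes every slot.
            return [(jobs[0], slots)]
        # FAIR: divide slots as evenly as possible between queries.
        base, extra = divmod(slots, len(jobs))
        return [(job, base + (1 if i < extra else 0)) for i, job in enumerate(jobs)]
```

In this sketch the quota is checked at submission time, so an over-quota query is rejected before it consumes any planning or scheduling work. A real scheduler would also track running tasks and per-query demand, which are omitted here.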

[0040] Hereinafter, exemplary embodimen...

Abstract

The invention discloses a task priority control implementation method and device for Spark JDBC. The method comprises the following steps: when the SparkJdbc service starts, it establishes a plurality of task priority queues according to a pre-written priority queue description XML file; the service receives a command, issued by a user through the JDBC interface, that specifies a priority queue, and sets the priority at the JDBC session level; the service receives a retrieval SQL submitted by the user, generates a Spark Task set after the SQL statement passes through several parsing and planning stages, and adds the Spark Task set to the target priority queue of the corresponding name; and a resource scheduler schedules and allocates hardware resources according to the resource allocation strategy between priority queues and the resource allocation strategy within each queue, and assigns Spark Tasks to Task executors on the computing nodes for execution.
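
The first three steps of the abstract (loading queues from XML, choosing a queue per session, and routing generated tasks to the named queue) can be sketched as follows. Everything specific here is an assumption for illustration: the XML element names, the `SET priority_queue=...` command syntax, and the `plan_tasks` stand-in are hypothetical, because the source does not show the actual file format, command, or planner.

```python
import xml.etree.ElementTree as ET

# Hypothetical queue description file; the real format is not given in the source.
QUEUE_XML = """
<queues>
  <queue name="urgent"><priority>0</priority><weight>3</weight><mode>FIFO</mode></queue>
  <queue name="normal"><priority>1</priority><weight>2</weight><mode>FAIR</mode></queue>
  <queue name="low"><priority>2</priority><weight>1</weight><mode>FIFO</mode></queue>
</queues>
"""


def load_queues(xml_text):
    """Build the priority queues once, at service start-up."""
    root = ET.fromstring(xml_text)
    queues = {}
    for node in root.findall("queue"):
        queues[node.get("name")] = {
            "priority": int(node.findtext("priority")),
            "weight": int(node.findtext("weight")),
            "mode": node.findtext("mode"),
            "tasks": [],
        }
    return queues


def plan_tasks(sql):
    """Stand-in for parsing and planning; always yields two placeholder tasks."""
    return [f"{sql} [task {i}]" for i in range(2)]


class Session:
    """One JDBC session; the chosen queue applies to all later statements."""

    def __init__(self, queues, default="normal"):
        self.queues = queues
        self.current = default

    def execute(self, statement):
        text = statement.strip().rstrip(";")
        if text.upper().startswith("SET PRIORITY_QUEUE="):
            name = text.split("=", 1)[1].strip()
            if name not in self.queues:
                raise KeyError(f"unknown priority queue: {name}")
            self.current = name
            return f"queue set to {name}"
        # Anything else is treated as a retrieval SQL for the session's queue.
        tasks = plan_tasks(text)
        self.queues[self.current]["tasks"].extend(tasks)
        return f"{len(tasks)} tasks queued in {self.current}"
```

A session would first issue `SET priority_queue=urgent;` and then submit its queries; every task generated afterwards lands in the `urgent` queue until the session changes the setting again.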

Description

Technical field

[0001] The invention relates to the field of big data processing, and in particular to a method and device for implementing task priority control for Spark JDBC.

Background technique

[0002] With the continuous development of computer technology and the steady progress of informatization, the amount of data has grown rapidly, and the storage and application of massive data have flourished. In massive data retrieval applications, SparkJdbc, a distributed retrieval framework from the Apache Foundation, provides a Hive-compatible HiveQL interface; it is efficient and highly available and is widely used in this field.

[0003] After the user submits a SQL retrieval request to SparkJdbc, the SQL statement is parsed to generate an execution plan, and then a Spark RDD is generated. The Spark RDD is converted into a DAG to generate Spark Stages, and finally each Stage generates a Spark Task set. Spark Task is a task structure generated in Spark that can perf...

Application Information

IPC(8): G06F9/48, G06F9/50, G06F16/242, G06F16/25
Inventor: 刘欣然, 张鸿, 惠榛, 吕雁飞, 马秉楠, 李斌斌, 王振宇, 黄航, 王树鹏
Owner NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT