
An automatic tuning method for Spark parameters based on a cost model

A cost model and parameter tuning technology, applied in the fields of program initiation/switching, electrical digital data processing, multiprogramming arrangements, etc. It addresses the problems that the performance model is strongly affected by the system operating state, that the model cannot be established accurately, and that it cannot be determined whether a better configuration exists, with the effects of a reduced workload and a reduced impact.

Active Publication Date: 2022-07-01
BEIHANG UNIV
Cites: 3, Cited by: 0

AI Technical Summary

Problems solved by technology

[0009] When only a few parameters need to be adjusted, experienced operation and maintenance personnel can greatly improve task performance by adjusting the configuration parameters, but this requires deep familiarity with the system, and it is impossible to determine whether a better configuration exists.
As the number of parameters increases, manual tuning of the configuration parameters becomes difficult even for experienced operation and maintenance personnel.
[0010] To sum up, current methods have difficulty dynamically updating the performance model during configuration parameter tuning; the generated performance model is strongly affected by the system operating state, so the model cannot be established accurately; and manual tuning cannot cope with a large parameter space.


Examples


Embodiment Construction

[0083] To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below. It should be understood that the specific embodiments described herein serve only to explain the present invention, not to limit it. In addition, the technical features involved in the various embodiments described below can be combined with each other as long as they do not conflict.

[0084] The basic idea of the present invention is to obtain the optimal parameters of Spark tasks by establishing a cost-based performance model combined with a parameter space search algorithm, and to provide reference values of optimization parameters for unknown tasks by judging the similarity of tasks.
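The combination of a cost-based performance model and a parameter space search described in [0084] can be illustrated with a minimal sketch. The text available here does not specify the form of the cost model or the search algorithm, so everything below is an assumption made for illustration: the cost curve `a / p + b * p` (with `p` the total number of cores), the least-squares fit, the exhaustive grid search, the synthetic history, and the helper names `fit_cost_model`, `predict_cost`, and `search_best_config`. Only the two Spark property names are real configuration keys.

```python
from itertools import product

# Synthetic history for illustration only (not measurements):
# (spark.executor.instances, spark.executor.cores, runtime in seconds).
HISTORY = [
    (2, 1, 205.0),
    (2, 2, 110.0),
    (4, 2, 70.0),
    (4, 4, 65.0),
    (8, 4, 92.5),
]

# Hypothetical parameter space to search.
PARAM_SPACE = {
    "spark.executor.instances": [2, 4, 8],
    "spark.executor.cores": [1, 2, 3],
}


def fit_cost_model(history):
    """Fit cost = a / p + b * p by least squares (assumed model form).

    p is the total core count; a models parallelizable work and
    b models per-core coordination overhead.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for instances, cores, runtime in history:
        p = instances * cores
        x1, x2 = 1.0 / p, float(p)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * runtime
        t2 += x2 * runtime
    # Solve the 2x2 normal equations in closed form.
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


def predict_cost(model, config):
    """Predict the runtime of a configuration under the fitted model."""
    a, b = model
    p = config["spark.executor.instances"] * config["spark.executor.cores"]
    return a / p + b * p


def search_best_config(model, space):
    """Exhaustively search the parameter space for the lowest predicted cost."""
    keys = list(space)
    candidates = (
        dict(zip(keys, values))
        for values in product(*(space[k] for k in keys))
    )
    return min(candidates, key=lambda c: predict_cost(model, c))
```

Calling `search_best_config(fit_cost_model(HISTORY), PARAM_SPACE)` on this synthetic history selects 4 executors with 3 cores each, because the history was generated from the curve with a = 400 and b = 2.5, whose minimum lies near 12 total cores. A real parameter space is far larger, which is why an exhaustive grid would normally give way to a smarter search strategy.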

[0085] Figure 1 is a schematic diagram of the system architecture for realizing the automatic tuning method of Spark parameters based on ...


Abstract

The present invention proposes a cost-model-based method for automatically tuning Spark parameters, which includes the following steps. Step 1: a cost-based performance model is constructed from task execution configurations and the corresponding cost information, and an optimized configuration is obtained within a given parameter space. Step 2: a task of unknown type is run once with the default parameters, and reference values for its optimized configuration are provided by judging task similarity. The cost-based performance model addresses the problems that may exist in current configuration parameter tuning. It is generated by analyzing historical Spark tasks, and the optimized parameters are obtained through a parameter space search algorithm. The model is updated and adjusted to increase its accuracy. For a task of unknown type, parameter reference values are provided after it has been run once.
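Step 2 of the abstract can be sketched as follows. The text available here does not say how task similarity is judged, so this is an assumption-laden illustration: each task is summarized by a hypothetical three-value profile collected from its single default-parameter run, cosine similarity is used as the similarity measure, and a threshold decides whether any known task is close enough to borrow from. The task names, profiles, configuration values, and the helper `reference_config` are all invented for the example. The Spark property names are real configuration keys.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical library of already-tuned tasks. Each profile is an assumed
# (cpu fraction, shuffle fraction, gc fraction) of the default-parameter run.
KNOWN_TASKS = {
    "cpu-heavy": {
        "profile": (0.7, 0.2, 0.1),
        "config": {"spark.executor.cores": 4, "spark.sql.shuffle.partitions": 200},
    },
    "shuffle-heavy": {
        "profile": (0.2, 0.7, 0.1),
        "config": {"spark.executor.cores": 2, "spark.sql.shuffle.partitions": 800},
    },
}


def reference_config(profile, known_tasks, min_similarity=0.9):
    """Return (name, similarity, config) of the most similar known task.

    Returns None when no known task reaches min_similarity, in which case
    the unknown task would need to be tuned from scratch.
    """
    best_name, best_sim = None, -1.0
    for name, entry in known_tasks.items():
        sim = cosine_similarity(profile, entry["profile"])
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_name is None or best_sim < min_similarity:
        return None
    return best_name, best_sim, known_tasks[best_name]["config"]
```

For example, `reference_config((0.65, 0.25, 0.1), KNOWN_TASKS)` matches the hypothetical "cpu-heavy" task and returns its stored configuration as the reference values, while a profile unlike any known task, such as `(0.0, 0.0, 1.0)`, falls below the threshold and yields `None`.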

Description

Technical field

[0001] The invention relates to Spark task performance model establishment, big data system configuration parameter space search, and task similarity judgment.

Background art

[0002] With the continuous development of science and technology, devices ranging from mobile phones and tablets to astronomical telescopes and the Large Hadron Collider all generate massive amounts of data. How to store, process, and analyze this data has become a practical problem. Since Google published the Google File System paper in 2003, a number of distributed computing frameworks have emerged, such as Hadoop and Spark ("Spark: Cluster Computing with Working Sets"). These frameworks provide a basis for the storage and processing of big data and play an important role in various application scenarios. Spark was created in response to the inefficiency of Hadoop in iterative processing. Although Spark has a good perform...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/48
CPC: G06F9/4881
Inventor: 杨海龙 (Yang Hailong), 马群 (Ma Qun), 李云春 (Li Yunchun)
Owner: BEIHANG UNIV