Automatic optimization method for performance of Spark platform

An automatic performance optimization technology, applied in resource allocation, multiprogramming arrangements, data processing applications, etc., that addresses the high entry threshold, high cost, and low efficiency of existing tuning methods

Active Publication Date: 2016-08-17
UNIVERSITY OF CHINESE ACADEMY OF SCIENCES
Cites: 2; Cited by: 48

AI Technical Summary

Problems solved by technology

[0011] To address the defects of the prior art, the purpose of the present invention is to provide a method for automatically optimizing the performance of the Spark platform, thereby solving the problems of existing methods: high cost, low efficiency, a high entry threshold, and added system complexity and instability.



Examples


Detailed Description of Embodiments

[0081] The present invention is further illustrated below with reference to the accompanying drawings and specific implementation cases. These implementation cases are only used to illustrate the present invention and are not intended to limit its scope. After reading the present invention, modifications in equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of this application.

[0082] As shown in Figure 1, the present invention is based on an overhead performance model and is divided into four parts: performance data collection, performance analysis and prediction, front-end display and interaction, and automatic parameter optimization.

[0083] First, performance data is collected: the performance data of the Spark platform while running the user application, including the cluster runtime environment, hardware con...


Abstract

The present invention discloses an automatic optimization method for performance of a Spark platform. The method comprises: 1) creating a Spark application performance model according to an executing mechanism of a Spark platform; 2) for a set Spark application, selecting some data of the Spark application to be loaded and run on the Spark platform, and acquiring performance data when the Spark application is run; 3) inputting the acquired performance data into the Spark application performance model, so as to obtain a value of each parameter in the Spark application performance model when the Spark application is run; and 4) assigning the value that is of each parameter of the performance model and that is obtained in step 3) to the Spark application performance model, calculating performance (total execution time of the application) of the Spark platform when configuration parameters are combined in different ways, and then outputting a configuration parameter combination when the performance of the Spark platform is optimum. The method disclosed by the present invention has the advantages of a low threshold, easy extension, a low cost and high efficiency and the like.
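Steps 3) and 4) of the abstract can be illustrated with a minimal sketch. The source does not give the actual performance model, so the wave-based formula below is a stand-in assumption, and every name here (`ModelParams`, `fit_params`, `predict_seconds`, `best_config`) is hypothetical. The two knobs being searched correspond to the Spark properties `spark.executor.instances` and `spark.executor.cores`.

```python
from dataclasses import dataclass
from itertools import product
from math import ceil


@dataclass(frozen=True)
class ModelParams:
    """Hypothetical model parameters obtained from a sampled run."""
    task_seconds: float     # average duration of one task
    startup_seconds: float  # fixed per-application overhead
    num_tasks: int          # estimated task count for the full input


def fit_params(sample_task_times, sample_fraction, startup_seconds):
    """Step 3: turn performance data from a partial run into model parameters."""
    avg = sum(sample_task_times) / len(sample_task_times)
    # Assumption: the number of tasks grows linearly with input size.
    num_tasks = ceil(len(sample_task_times) / sample_fraction)
    return ModelParams(avg, startup_seconds, num_tasks)


def predict_seconds(params, executors, cores):
    """Toy model (an assumption): tasks run in waves over all core slots."""
    slots = executors * cores
    waves = ceil(params.num_tasks / slots)
    return params.startup_seconds + waves * params.task_seconds


def best_config(params, executor_options, core_options, max_total_cores):
    """Step 4: evaluate every allowed combination and return the fastest.

    Ties are broken in favour of the combination using fewer total cores.
    """
    candidates = [
        (e, c)
        for e, c in product(executor_options, core_options)
        if e * c <= max_total_cores
    ]
    return min(
        candidates,
        key=lambda ec: (predict_seconds(params, *ec), ec[0] * ec[1]),
    )


if __name__ == "__main__":
    # Task times measured on a 25% sample of the input (made-up numbers).
    params = fit_params([2.0, 2.0, 4.0, 4.0], 0.25, startup_seconds=5.0)
    executors, cores = best_config(params, [1, 2, 4], [1, 2, 4], 16)
    print(executors, cores, predict_seconds(params, executors, cores))
```

Because the search only evaluates the model, not the cluster, trying all combinations stays cheap as long as the candidate grid is small. A real model would account for more than task waves, which this sketch does not attempt.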

Description

Technical field

[0001] The invention relates to the field of performance optimization of big data processing platforms, and in particular to a method for automatically optimizing the performance of a Spark platform.

Background

[0002] With the advent of the big data era, new big data processing technologies continue to develop, and a variety of big data processing platforms have emerged, the most prominent of which is Apache Spark.

[0003] Spark is a distributed, parallel big data processing platform based on in-memory computing. It integrates batch processing, real-time stream processing, interactive query, and graph computing, avoiding the resource waste caused by deploying separate clusters for different computing scenarios.

[0004] Spark's in-memory computing model gives it an inherent advantage for iterative computation, and it is especially suitable for iterative algorithms in machine learning. Compared with Hadoop's MapRed...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50, G06Q10/04
CPC: G06F9/50, G06Q10/04
Inventors: 王国路, 徐俊刚, 刘仁峰
Owner: UNIVERSITY OF CHINESE ACADEMY OF SCIENCES