
Performance optimization and parameter configuration method based on memory computing framework Spark

A parameter configuration method for in-memory computing, applied in computing, computer components, resource allocation, and related fields. It addresses the problem that performance is strongly affected by configuration parameters, and achieves the effect of satisfying accuracy requirements.

Active Publication Date: 2020-05-19
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to propose a performance optimization and parameter configuration method based on the memory computing framework Spark. It addresses the following problems of the existing distributed computing framework Spark: the number of configuration parameters is large, performance is strongly affected by those parameters, and application programs have different characteristics.



Examples


Detailed Description of Embodiments

[0023] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0024] As shown in Figure 1, a performance optimization and parameter configuration method based on the memory computing framework Spark includes the following four steps:

[0025] Step 1. As shown in Figure 3, the Spark resource scheduling process consists of the following three steps:

[0026] (1) The Driver runs the main() function of the Spark Application and creates the SparkContext (the Spark context object). SparkContext is responsible for communicating with the Cluster Manager for resource a...
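To make the driver's resource request more concrete, the sketch below assembles a `spark-submit` command line in which executor resources are passed as `--conf` settings. The configuration keys are standard Spark properties, but the values, the `build_spark_submit` helper, and the `app.py` file name are illustrative assumptions and do not come from the patent text.

```python
from typing import Dict, List


def build_spark_submit(app_path: str, master: str, conf: Dict[str, str]) -> List[str]:
    """Assemble a spark-submit argument list (hypothetical helper)."""
    args = ["spark-submit", "--master", master]
    # Sort keys so the generated command is deterministic.
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(app_path)
    return args


# Illustrative values only; the patent excerpt does not list concrete settings.
example_conf = {
    "spark.executor.memory": "4g",
    "spark.executor.cores": "2",
    "spark.executor.instances": "3",
}

command = build_spark_submit("app.py", "yarn", example_conf)
print(" ".join(command))
```

Keeping the settings in a dictionary means the same helper can launch the application under any candidate configuration that a tuning method proposes.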


Abstract

The invention discloses a performance optimization and parameter configuration method based on the memory computing framework Spark. The method comprises the following steps: first, determining the type of the Spark application and the Spark performance parameters that affect each application type; then randomly combining the configuration parameters to obtain a training set, building a configuration parameter model on the training set with the LightGBM algorithm, and searching for the optimal hyperparameter combination of the LightGBM algorithm with a Bayesian optimization algorithm, so that the configuration model can select the optimal configuration parameters. With this method, the optimal configuration parameters for different types of applications running in different cluster environments can be found without requiring the user to understand Spark's execution mechanism, the meaning and value ranges of the parameters, the characteristics of the application type, or the input data set. The method is simpler, clearer, and more convenient than previous parameter configuration methods.
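The sampling and selection steps of the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the search space values are invented, `sample_configs` and `select_best` are hypothetical helpers, and `toy_runtime` is a hand-written stand-in for the trained LightGBM model (no LightGBM training or Bayesian hyperparameter search is performed here).

```python
import random
from typing import Callable, Dict, List, Sequence

Config = Dict[str, object]

# Hypothetical search space: the keys are real Spark properties,
# but the candidate values are illustrative assumptions.
SEARCH_SPACE: Dict[str, Sequence[object]] = {
    "spark.executor.memory": ["2g", "4g", "8g"],
    "spark.executor.cores": [1, 2, 4],
    "spark.default.parallelism": [16, 32, 64, 128],
    "spark.shuffle.compress": ["true", "false"],
}


def sample_configs(space: Dict[str, Sequence[object]], n: int, seed: int = 0) -> List[Config]:
    """Randomly combine parameter values into n candidate configurations."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(list(values)) for name, values in space.items()}
        for _ in range(n)
    ]


def select_best(candidates: List[Config], predict_runtime: Callable[[Config], float]) -> Config:
    """Return the candidate with the lowest predicted runtime."""
    return min(candidates, key=predict_runtime)


def toy_runtime(config: Config) -> float:
    """Stand-in for a trained performance model (assumption, not LightGBM)."""
    cores = int(config["spark.executor.cores"])
    parallelism = int(config["spark.default.parallelism"])
    return 100.0 / cores + abs(parallelism - 64) * 0.5


training_configs = sample_configs(SEARCH_SPACE, n=20, seed=42)
best = select_best(training_configs, toy_runtime)
print(best)
```

In the method described by the abstract, each sampled configuration would be run on the cluster to record its measured performance, and the resulting pairs would train the LightGBM model that replaces `toy_runtime`.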

Description

Technical field

[0001] The invention belongs to the technical fields of big data, cloud computing, and distributed systems, and specifically relates to a performance optimization and parameter configuration method based on the memory computing framework Spark.

Background

[0002] The distributed memory computing framework Spark is a big data parallel computing framework based on in-memory computing. The massive data volumes and real-time processing requirements of big data conflict sharply with the traditional computation-centered model, making it difficult for traditional computing models to handle data processing in today's big data environment. Overall, processing has shifted from being computation-centered to being data-centered. As a result, data processing speed has become an increasingly prominent problem, and real-time performance is weak. The characteristics of big data, such as fast incrementa...


Application Information

IPC(8): G06F9/50, G06F9/445, G06K9/62
CPC: G06F9/5016, G06F9/5083, G06F9/4451, G06F18/24155, G06F18/214
Inventor: 范天文, 龙昭华, 沈励芝, 余快, 崔永明
Owner: CHONGQING UNIV OF POSTS & TELECOMM