
Docker container-oriented Spark big data application program performance modeling method and device and storage device

A Docker container technology, applied in the areas of program control design, multi-programming devices, program control devices, etc., which can solve problems such as the unstable performance of big data applications and the inability to fully exploit the advantages of cloud computing systems.

Pending Publication Date: 2019-11-08
SHENZHEN INST OF ADVANCED TECH

AI Technical Summary

Problems solved by technology

However, due to the complex relationship between Docker resource allocation and application performance, the performance of big data applications (such as Spark) running in Docker containers is unstable, and the advantages of cloud computing systems cannot be fully utilized.



Examples


Embodiment 1

[0045] Embodiment 1. Referring to Figure 1, the Docker container-oriented Spark big data application performance modeling method of the present invention comprises:

[0046] 101. Obtain key parameters that affect the performance of Docker containers and Spark big data applications, and collect corresponding experimental data;

[0047] Obtain the key parameters that affect the performance of Docker containers and Spark big data applications, and collect the corresponding experimental data. Specifically, the CPU, memory, and I/O resource allocations of a Spark big data application on a typical Docker container can be adjusted jointly to determine the key Docker container resource allocation parameters and Spark resource allocation parameters that affect the performance of Spark big data applications on Docker containers, and the corresponding experimental data are collected by running Spark big data applications. Specifically:
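The joint adjustment described above amounts to sweeping every combination of container-side and Spark-side resource settings and measuring each one. A minimal sketch of such a sweep follows; the parameter names and value ranges are illustrative assumptions, not values from the patent:

```python
# Enumerate candidate experiments: each tuple is one joint
# (Docker, Spark) resource configuration whose application
# runtime would then be measured. Value ranges are hypothetical.
from itertools import product

docker_cpus      = [1, 2, 4]        # container CPU limit (--cpus)
docker_memory_gb = [2, 4, 8]        # container memory limit (--memory)
spark_exec_cores = [1, 2]           # spark.executor.cores
spark_exec_mem   = ["1g", "2g"]     # spark.executor.memory

experiments = list(product(docker_cpus, docker_memory_gb,
                           spark_exec_cores, spark_exec_mem))
print(len(experiments))  # 3 * 3 * 2 * 2 = 36 configurations to test
```

Each element of `experiments` would correspond to one run of the benchmark application, with the measured runtime recorded as a training sample.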

[0048] In order to study the impact of different resource ...

Embodiment 2

[0064] Embodiment 2. Referring to Figure 2, the Docker container-oriented Spark big data application performance modeling method of the present invention comprises:

[0065] 201. Obtain the key parameters of the Docker container;

[0066] Obtain the key parameters of the Docker container. Specifically, different parameter options are inserted into the command that starts the Docker container to limit its resource usage, and the key parameters that affect performance are obtained, mainly those related to CPU, memory, and disk.
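The parameter options mentioned here correspond to standard resource flags on `docker run`. A hedged sketch of building such a start command follows; the image name and flag values are assumptions for illustration:

```python
# Build a `docker run` command whose flags cap the container's
# resources, matching step 201's "insert parameter options into
# the command to start the Docker container".
def docker_run_command(image, cpus, memory, blkio_weight):
    """Return a start command limiting CPU, memory, and disk-I/O weight."""
    return (
        f"docker run -d "
        f"--cpus={cpus} "                   # number of CPUs the container may use
        f"--memory={memory} "               # memory cap, e.g. "4g"
        f"--blkio-weight={blkio_weight} "   # relative block-I/O weight (10-1000)
        f"{image}"
    )

cmd = docker_run_command("spark-worker:latest", cpus=2,
                         memory="4g", blkio_weight=500)
print(cmd)
```

Varying these three flags across runs yields the container-side key parameters (CPU, memory, disk) referenced in the embodiment.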

[0067] 202. Obtain key parameters that affect the execution performance of Spark big data applications;

[0068] While limiting the resource usage of a Docker container, adjust the resource allocation of the Spark big data application and obtain the key parameters that affect its execution performance.
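On the Spark side, the resource allocation parameters being adjusted are standard Spark configuration properties. A small sketch of assembling them for a `spark-submit` invocation follows; the property names are real Spark configuration keys, while the values are illustrative:

```python
# Assemble the Spark-side resource settings tuned in step 202
# as --conf arguments for spark-submit.
def spark_submit_args(executor_memory, executor_cores, num_executors):
    conf = {
        "spark.executor.memory": executor_memory,     # heap per executor
        "spark.executor.cores": str(executor_cores),  # cores per executor
        "spark.executor.instances": str(num_executors),
    }
    return " ".join(f"--conf {k}={v}" for k, v in sorted(conf.items()))

args = spark_submit_args("2g", 2, 4)
print(args)
```

Under a fixed container limit, these are the knobs whose settings must be reconciled with the Docker flags: for example, `spark.executor.memory` cannot usefully exceed the container's `--memory` cap.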

[0069] 203. Collect experimental data;

[0070] When obtainin...

Embodiment 3

[0081] Embodiment 3. The present invention is described below through a specific application example:

[0082] Step 1, parameter adjustment, that is to obtain key parameters that affect the performance of Docker containers and Spark big data applications, and collect corresponding experimental data, specifically:

[0083] A Docker-container-based big data cluster must be deployed; the values of the Docker container resource allocation parameters and of the corresponding Spark big data application resource allocation parameters are set, and an experimental test is conducted. The Docker container resource allocation parameters are detailed in Table 1, and the Spark big data application resource allocation parameters in Table 2; they are not described further here.

[0084] In this application example, we choose the HDFS file system and YARN resource manager from the Hadoop ecosystem, together with the Spark distributed computing framework. Using Doc...



Abstract

The embodiment of the invention discloses a Docker container-oriented Spark big data application performance modeling method, together with a corresponding device and storage device. The method comprises the steps of: acquiring the key parameters, and the corresponding experimental data, that influence the performance of a Docker container and a Spark big data application; feeding the experimental data into machine learning for model training to obtain a corresponding resource allocation model; and obtaining an optimal resource allocation model from the resource allocation model and the input test data. The method jointly tunes the resource parameters of the Docker container and the Spark big data application, finds the correspondence between the resource parameters of the Spark big data application and those of the Docker container, and sets the optimal resource allocation parameter values of the Spark big data application according to the resource parameters of the Docker container, so that the Docker-container-based Spark big data application runs more stably.
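The abstract leaves the machine-learning algorithm open. As a hedged sketch of the modeling-and-selection step, the fragment below fits a trivial 1-nearest-neighbour predictor mapping resource configurations to runtime and then picks the configuration with the lowest predicted runtime; the training samples are fabricated for illustration only and do not come from the patent:

```python
# Stand-in for the trained resource allocation model: predict the
# runtime of a configuration as the runtime of the closest measured
# configuration (1-NN regression), then choose the best candidate.
def predict_runtime(train, query):
    """Return the runtime of the nearest training configuration."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best_row = min(train, key=lambda row: dist(row[0], query))
    return best_row[1]

# (docker_cpus, docker_mem_gb, spark_cores) -> measured runtime in seconds
training_data = [
    ((1, 2, 1), 420.0),
    ((2, 4, 2), 250.0),
    ((4, 8, 2), 190.0),
]

candidates = [(1, 2, 1), (2, 4, 2), (4, 8, 2)]
best = min(candidates, key=lambda c: predict_runtime(training_data, c))
print(best)  # -> (4, 8, 2), the lowest predicted runtime
```

Any regression technique (regression trees, neural networks, etc.) could replace the 1-NN stand-in; the essential structure — train on measured (configuration, runtime) pairs, then minimize predicted runtime over candidate allocations — is what the abstract describes.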

Description

Technical Field

[0001] The invention relates to the field of cloud computing and container technology, in particular to a Docker container-oriented Spark big data application performance modeling method, device, and storage device.

Background Technique

[0002] With the continuous development of cloud computing technology, more and more enterprises migrate complex IT applications to the cloud. The cloud platform uses virtualization technology to manage and elastically scale large-scale underlying physical resources. For a long time, virtual machines have served as the backbone of the cloud platform infrastructure layer, providing isolation and control of physical resources. However, the virtual machine's additional virtualization control layer imposes extra performance loss on the cloud platform, and the traditional mode of using the virtual machine as the minimum resource scheduling unit has a series of problems ...

Claims


Application Information

IPC(8): G06F9/50; G06F9/455; G06N20/00
CPC: G06F9/5005; G06F9/45533; G06F2009/45562
Inventor: 扣彦敏, 叶可江, 须成忠
Owner SHENZHEN INST OF ADVANCED TECH