Container-based distributed computing method and device

A container-based distributed computing technique that addresses three problems: current frameworks handle Shuffle-intensive computing tasks poorly, containers cannot recognize persistent memory devices on the host, and Shuffle read and write performance lags far behind that of memory.

Active Publication Date: 2020-10-30
INSPUR SUZHOU INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0002] In the Shuffle stage of current mainstream distributed in-memory computing frameworks, operations such as sorting, joining, and grouping generate a large amount of Shuffle data. On the other hand, Shuffle data is stored on disk (HDD, SSD, etc.), and HDD read and write performance is poor; even SSDs, which perform relatively well, fall far behind memory. As a result, the Shuffle stage consumes a lot of time, which is very unfriendly to Shuffle-intensive computing tasks.
Persistent memory offers read and write performance close to that of DRAM and a capacity that ordinary DRAM cannot match. However, there are technical barriers to interaction between a container and the host's persistent memory device, and the container cannot recognize the persistent storage device on the host.
[0003] The existing technology currently offers no effective solution to the problems of excessive Shuffle data volume and slow read and write caching inside the container.

Embodiment Construction

[0042] In order to make the object, technical solution and advantages of the present invention clearer, the embodiments of the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0043] It should be noted that the expressions "first" and "second" in the embodiments of the present invention are used only to distinguish two entities, or two parameters, that share the same name but are not the same. "First" and "second" are used for convenience of expression only and should not be construed as limiting the embodiments of the present invention; this will not be repeated in the subsequent embodiments.

[0044] Based on the above purpose, the first aspect of the embodiments of the present invention proposes an embodiment of a container-based distributed computing method that can reduce the processing amount of Shuffle data and increase the speed of r...

Abstract

The invention discloses a container-based distributed computing method and device. In the Shuffle stage, the method performs the following steps:

1. Call the underlying driver to initialize the persistent memory attached to the host, determine the device usage mode for the persistent memory, and create a region and a namespace on the host.
2. Create a data volume with a file system for the persistent memory based on the region and the namespace, and mount the data volume on the host so that the container can access the file system through a container storage interface.
3. Monitor the Shuffle management interface, determine the Shuffle data from the dependency relationships between the resilient distributed datasets (RDDs) output by the management interface, and access the data volume from the container through the container storage interface so as to overwrite and/or cache the Shuffle data in the file system.

The method reduces the amount of Shuffle data to be processed and increases the speed at which Shuffle data is read, written, and cached in the container, thereby improving the Shuffle efficiency of distributed computing.
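The third step, determining Shuffle data from the dependencies between RDDs, can be illustrated with a minimal sketch. It is not the patented implementation: the `Rdd` class, the `"narrow"`/`"shuffle"` labels, and the mount point `/mnt/pmem-shuffle` are all hypothetical names chosen for illustration, and the lineage graph is a toy stand-in for whatever the Shuffle management interface actually reports.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Hypothetical mount point of the persistent-memory data volume in the container.
PMEM_MOUNT = PurePosixPath("/mnt/pmem-shuffle")


@dataclass
class Rdd:
    """Toy stand-in for a resilient distributed dataset in a lineage graph."""
    name: str
    # Each dependency is a (parent, kind) pair; kind is "narrow" or "shuffle".
    deps: list = field(default_factory=list)


def shuffle_outputs(final_rdd):
    """Walk the lineage and list parents whose output crosses a shuffle boundary."""
    found, seen, stack = [], set(), [final_rdd]
    while stack:
        rdd = stack.pop()
        if id(rdd) in seen:
            continue
        seen.add(id(rdd))
        for parent, kind in rdd.deps:
            if kind == "shuffle" and parent.name not in found:
                found.append(parent.name)
            stack.append(parent)
    return found


def shuffle_path(rdd_name):
    """Map a shuffle-producing RDD to a directory on the mounted volume."""
    return str(PMEM_MOUNT / rdd_name)


# Example lineage: source -> mapped (narrow) -> grouped (shuffle) -> sorted (shuffle)
source = Rdd("source")
mapped = Rdd("mapped", [(source, "narrow")])
grouped = Rdd("grouped", [(mapped, "shuffle")])
sorted_rdd = Rdd("sorted", [(grouped, "shuffle")])

targets = [shuffle_path(name) for name in shuffle_outputs(sorted_rdd)]
print(targets)
```

Only the outputs of `mapped` and `grouped` are flagged, because only they feed a shuffle dependency; the narrow dependency on `source` needs no Shuffle data written to the volume.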

Description

Technical field

[0001] The present invention relates to the field of distributed computing, and more specifically, to a container-based distributed computing method and device.

Background

[0002] In the Shuffle stage of current mainstream distributed in-memory computing frameworks, operations such as sorting, joining, and grouping generate a large amount of Shuffle data. On the other hand, Shuffle data is stored on disk (HDD, SSD, etc.), and HDD read and write performance is poor; even SSDs, which perform relatively well, fall far behind memory. As a result, the Shuffle stage consumes a lot of time, which is very unfriendly to Shuffle-intensive computing tasks. Persistent memory offers read and write performance close to that of DRAM and a capacity that ordinary DRAM cannot match. However, there are technical barriers to interaction between a container and the host's persistent memory device, and the conta...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/172, G06F9/455, G06F16/182
CPC: G06F16/172, G06F16/182, G06F9/45558
Inventor: 宋奇秦朝阳
Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD