
Dynamic computing node grouping method and system for large-scale parallel processing

A parallel processing and computing node technology, applied in computing, electrical digital data processing, special data processing applications, etc., can solve problems such as insufficient use of computing nodes, inefficient use of suboptimal logic plans, and excessive use

Active Publication Date: 2021-09-07
HUAWEI TECH CO LTD
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

This static allocation of memory to compute nodes can leave some compute nodes underutilized and others overutilized.
In addition, a particular compute node may also be used inefficiently by a suboptimal logical plan for retrieving a response to a query, rather than by a logical plan that uses memory and compute nodes efficiently.




Detailed Description of Embodiments

[0028] The technology generally relates to dynamic compute node grouping, which decouples storage from compute in a massively parallel processing (MPP) shared-nothing relational database management system (RDBMS). The technology provides the flexibility to support higher inter-partition processing parallelism through data redistribution among MPP nodes. Dynamic compute node grouping also adds another dimension for a query or plan optimizer to consider when constructing an optimal logical plan that provides a response to a query.
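The decoupling described above can be sketched as follows. This is an illustrative sketch only; the `NodeGroup`, `Partition`, and `assign` names are hypothetical and are not the patented design — the point is simply that storage partitions stay put while the compute group serving them can be resized per query.

```python
# Illustrative sketch of decoupling storage from compute: memory partitions
# stay in place while the compute node group assigned to them is regrouped
# per query. All class and function names are hypothetical.

class NodeGroup:
    def __init__(self, compute_nodes):
        self.compute_nodes = list(compute_nodes)

class Partition:
    def __init__(self, name):
        self.name = name

def assign(partitions, group):
    """Round-robin the memory partitions over the group's compute nodes."""
    n = len(group.compute_nodes)
    return {p.name: group.compute_nodes[i % n]
            for i, p in enumerate(partitions)}

partitions = [Partition(f"p{i}") for i in range(6)]
small = NodeGroup(["node1", "node2"])          # initial static grouping
wide = NodeGroup(["node1", "node2", "node3"])  # regrouped for a heavy query

# The same partitions can be served by either grouping without moving storage.
mapping_small = assign(partitions, small)
mapping_wide = assign(partitions, wide)
```

Under this sketch, widening the group raises inter-partition parallelism for a single query without any change to where the data physically resides.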

[0029] A data-skew-aware cost model can be used to select the optimal set of compute nodes at the correct stage of the query processing pipeline. The data-skew-aware cost model enables the plan optimizer to analyze and compare the estimated cost of redistributing data across the network with the cost of parallel reduc...
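The comparison the paragraph describes can be illustrated with a toy cost model. All formulas and parameter names here are assumptions for illustration, not the patent's actual model: runtime is approximated by the busiest node (so skew inflates it), and hash redistribution is assumed to even out skew at the price of one network transfer per row.

```python
# Hypothetical data-skew-aware cost comparison; the cost formulas are
# illustrative assumptions, not the patented model.

def skew_factor(partition_sizes):
    """Ratio of the largest partition to the mean; 1.0 means no skew."""
    mean = sum(partition_sizes) / len(partition_sizes)
    return max(partition_sizes) / mean if mean else 1.0

def estimated_cost(rows, num_nodes, skew, net_cost_per_row=0.0):
    # Runtime is bounded by the busiest node; skew inflates that bound.
    per_node = rows / num_nodes
    return per_node * skew + rows * net_cost_per_row

def choose_plan(rows, cur_nodes, cur_sizes, wider_nodes, redist_cost_per_row):
    stay = estimated_cost(rows, cur_nodes, skew_factor(cur_sizes))
    # Hash redistribution is assumed to remove the skew (skew ~ 1.0)
    # at the price of shipping every row across the network once.
    redistribute = estimated_cost(rows, wider_nodes, 1.0, redist_cost_per_row)
    return "redistribute" if redistribute < stay else "stay"
```

With heavy skew, redistributing onto a wider node group wins despite the network cost; with evenly distributed data and an expensive network, staying on the current grouping wins.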



Abstract

A massively parallel processing shared-nothing relational database management system includes a plurality of memories allocated to a plurality of compute nodes. The system includes a non-transitory memory storing instructions and one or more processors in communication with the memory. The one or more processors execute the instructions to: store a first data set in a first set of memories of the plurality of memories; hash the first data set into a repartitioned data set; reallocate the first set of memories to a second set of compute nodes of the plurality of compute nodes; distribute the repartitioned data set to the second set of compute nodes; and perform a database operation on the repartitioned data set.
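The hash-repartitioning step in the abstract can be sketched as below. The distribution key, node counts, and `hash_partition` helper are illustrative assumptions; a real MPP system would hash a declared distribution column and ship each bucket to its target node.

```python
# Minimal sketch of hash-repartitioning a data set from one node group to a
# wider one; the key and node counts are illustrative assumptions.

def hash_partition(rows, key, num_nodes):
    """Assign each row to a node bucket by hashing its distribution key."""
    buckets = [[] for _ in range(num_nodes)]
    for row in rows:
        buckets[hash(row[key]) % num_nodes].append(row)
    return buckets

# A table stored on a 2-node group...
rows = [{"id": i, "val": i * 10} for i in range(8)]
old_group = hash_partition(rows, "id", 2)

# ...is repartitioned onto a 4-node group before a parallel database
# operation (e.g. a join or aggregate) runs on the wider group.
new_group = hash_partition(rows, "id", 4)
assert sum(len(b) for b in new_group) == len(rows)  # no rows lost
```

Because the same hash function decides placement for every row, rows with equal keys always land on the same node, which is what lets joins and aggregates on the key run without further data movement.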

Description

Background [0001] A massively parallel processing (MPP) shared-nothing relational database management system (RDBMS) usually includes a plurality of shared-nothing nodes. A shared-nothing node may include at least one memory coupled to at least one compute node. Typically, in an MPP shared-nothing RDBMS, each memory is statically allocated to the compute nodes in a particular shared-nothing node. [0002] When processing a query to an MPP shared-nothing RDBMS, data may need to be repartitioned and transferred from one shared-nothing node to another shared-nothing node, where the other shared-nothing node may store additional data required to respond to the query. This static allocation of memory and compute nodes can cause some compute nodes to be underutilized or overutilized. Furthermore, a particular compute node may also be used inefficiently by a sub-optimal logical plan for retrieving a response to the query.

Claims


Application Information

Patent Timeline: no application
Patent Type & Authority: Patent (China)
IPC(8): G06F16/2453
CPC: G06F16/24542; G06F16/24532; G06F16/24544; G06F16/2255; G06F16/24545; G06F16/24554
Inventors: 张立杰, 森·扬·孙, 丁永华
Owner: HUAWEI TECH CO LTD