Method, device, server and storage medium for determining data redistribution mode

A method for determining the data redistribution mode, applied in the database field, that addresses the high resource consumption of data redistribution and improves execution efficiency.

Active Publication Date: 2021-06-08

SHANGHAI DAMENG DATABASE

4 Cites, 0 Cited by

Problems solved by technology

[0004] Embodiments of the present invention provide a method, device, server, and storage medium for determining a data redistribution mode, so as to solve the problem of serious resource consumption during data redistribution in the prior art.

Examples

Embodiment 1

[0030] Figure 1 is a flowchart of a method for determining a data redistribution mode provided by Embodiment 1 of the present invention. This embodiment is applicable to determining the data redistribution mode of nodes according to their distribution attributes in a massively parallel processing environment, so that the nodes perform data redistribution based on the determined mode. The method can be executed by a device for determining a data redistribution mode, which can be implemented in software and/or hardware and is integrated in a server. Specifically, the method includes the following steps:

[0031] S110. Traverse the execution binary tree, and determine leaf nodes, intermediate nodes, and level information of the leaf nodes and intermediate nodes of the execution binary tree.

[0032] The execution binary tree is generated by analyzing the query statement input by the user, and the interm...
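Step S110 can be illustrated with a minimal Python sketch. The `PlanNode` class, the `classify_nodes` function, the breadth-first order, and the convention that the root sits at level 0 are all assumptions made for illustration; the source text does not specify them. The sketch also assumes that every node with at least one child counts as an intermediate node.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanNode:
    """Hypothetical node of an execution binary tree (illustrative names)."""
    name: str
    left: Optional["PlanNode"] = None
    right: Optional["PlanNode"] = None


def classify_nodes(root):
    """Traverse breadth-first and return (leaves, intermediates, levels).

    levels maps node name -> depth, with the root at level 0
    (an assumed convention, not taken from the source).
    """
    leaves, intermediates, levels = [], [], {}
    queue = deque([(root, 0)])
    while queue:
        node, level = queue.popleft()
        levels[node.name] = level
        children = [c for c in (node.left, node.right) if c is not None]
        if children:
            intermediates.append(node.name)  # assumed: any non-leaf node
        else:
            leaves.append(node.name)
        for child in children:
            queue.append((child, level + 1))
    return leaves, intermediates, levels


# Example: a hash inner join (HI) node over two table scans A and B.
tree = PlanNode("HI", PlanNode("A"), PlanNode("B"))
leaves, intermediates, levels = classify_nodes(tree)
```

The level information makes it possible to process nodes deepest-first in a later step, which is one way to realize a bottom-up order.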

Embodiment 2

[0046] Figure 4 is a flowchart of a method for determining a data redistribution mode provided by Embodiment 2 of the present invention. This embodiment takes the join operation as an example, that is, the query statement is a join query statement, and elaborates on the basis of the above embodiment. Specifically, the method includes:

[0047] S210. Traverse the execution binary tree, and determine leaf nodes, intermediate nodes, and level information of the leaf nodes and intermediate nodes of the execution binary tree.

[0048] S220. Determine the distribution attributes of the leaf nodes and the intermediate nodes in sequence from bottom to top.

[0049] The distribution attributes of leaf nodes and intermediate nodes are determined in different ways, as follows:

[0050] S2201. Search the data dictionary to obtain the distribution attributes of the leaf nodes.

[0051] The data dictionary is used to store distribution attributes of each lea...
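The bottom-up order of S220 and the dictionary lookup of S2201 can be sketched as a post-order walk. Everything named here (`PlanNode`, `assign_distribution`, the string labels for distribution attributes, and the `combine` callback) is hypothetical. In particular, the rule for intermediate nodes is not visible in this excerpt, so the sketch takes it as a caller-supplied function, and the `placeholder_rule` shown is only a stand-in, not the patent's rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanNode:
    """Hypothetical node of an execution binary tree (illustrative names)."""
    name: str
    left: Optional["PlanNode"] = None
    right: Optional["PlanNode"] = None


def assign_distribution(node, data_dictionary, combine, result=None):
    """Post-order walk: children are resolved before their parent."""
    if result is None:
        result = {}
    children = [c for c in (node.left, node.right) if c is not None]
    if not children:
        # Leaf node: look up its distribution attribute in the data dictionary.
        result[node.name] = data_dictionary[node.name]
    else:
        for child in children:
            assign_distribution(child, data_dictionary, combine, result)
        # Intermediate node: the rule is supplied by the caller (assumption).
        result[node.name] = combine(node, [result[c.name] for c in children])
    return result


def placeholder_rule(node, child_attrs):
    """Stand-in rule, NOT from the source: inherit only if children agree."""
    if all(attr == child_attrs[0] for attr in child_attrs):
        return child_attrs[0]
    return "undetermined"


tree = PlanNode("HI", PlanNode("A"), PlanNode("B"))
dictionary = {"A": "hash(c1)", "B": "random"}  # hypothetical dictionary content
attrs = assign_distribution(tree, dictionary, placeholder_rule)
```

Passing the intermediate-node rule as a parameter keeps the traversal order separate from the rule itself, so the real rule can be substituted without touching the walk.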

Example 1

[0108] Data table A and data table B perform a hash inner join (HI) with the join condition A.c1=B.d1 and A.c2=B.d2; the execution binary tree is shown in Figure 2. The HI node has two optional data redistribution methods: the distribution redistribution method and the collection redistribution method.

[0109] (1) Select the distribution redistribution method

[0110] According to the join condition, the redistribution item is determined as (c1, d1), (c2, d2) or {(c1, c2), (d1, d2)}, and the distribution attribute of the HI node is consistent with the redistribution item. Depending on the distribution attributes of data table A and data table B, the following cases arise:

[0111] 1) Neither data table A nor data table B is hash-distributed (for example, they use random distribution or copy distribution). In this case both data table A and data table B need to perform distribution redistribution, and the redistribution items are (c1,d1), (c2 ,d2) or {(c1,c2),(...
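The candidate redistribution items listed in [0110] and case 1) of [0111] can be reproduced with a short sketch. The function names and the string labels for distribution kinds are hypothetical, and only case 1) is covered because the remaining cases are truncated in this excerpt.

```python
def candidate_redistribution_items(equalities):
    """Build candidate redistribution items from join equalities.

    equalities: list of (left_column, right_column) pairs taken from the
    join condition, e.g. A.c1=B.d1 becomes ("c1", "d1").
    """
    candidates = list(equalities)  # one candidate per single column pair
    if len(equalities) > 1:
        left_cols = tuple(left for left, _ in equalities)
        right_cols = tuple(right for _, right in equalities)
        candidates.append((left_cols, right_cols))  # all columns combined
    return candidates


def tables_to_redistribute_case1(distributions):
    """Case 1) only: if no table is hash-distributed, all are redistributed.

    distributions maps table name -> distribution kind (assumed labels).
    """
    if all(kind != "hash" for kind in distributions.values()):
        return sorted(distributions)
    raise NotImplementedError("cases other than 1) are not covered here")


items = candidate_redistribution_items([("c1", "d1"), ("c2", "d2")])
tables = tables_to_redistribute_case1({"A": "random", "B": "copy"})
```

Here `items` contains the two single-column pairs followed by the combined pair, matching the three options named in the text, and `tables` lists both A and B.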

Abstract

The embodiments of the invention disclose a method, a device, a server and a storage medium for determining a data redistribution mode. The method includes: traversing the execution binary tree to determine its leaf nodes, intermediate nodes, and the level information of those nodes; determining the distribution attributes of the leaf nodes and the intermediate nodes in order from bottom to top; and determining the dynamic data redistribution modes of the leaf nodes and intermediate nodes according to their distribution attributes. Compared with the prior art, determining the distribution attributes bottom-up and then selecting the corresponding data redistribution mode from those attributes addresses the high resource consumption of data redistribution in the prior art and improves the execution efficiency of the system.

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of databases, and in particular to a method, device, server, and storage medium for determining a data redistribution mode.

Background

[0002] MPP stands for Massively Parallel Processing. In a shared-nothing database cluster, each node has an independent disk storage system and memory system. The nodes are connected to one another through the network and coordinate their computations. Put simply, MPP distributes tasks across multiple servers and nodes in parallel; after each node completes its computation, the partial results are aggregated to obtain the final execution result.

[0003] When operations such as joins are performed on data tables and the operation involves non-distribution columns, each node needs data from other nodes to complete its computation. At this time, the data...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06F16/27, G06F16/242, G06F16/22

CPC: G06F16/2246, G06F16/2433, G06F16/27

Inventor: 张钦, 朱仲颖

Owner: SHANGHAI DAMENG DATABASE
