
Method, device, server and storage medium for determining data redistribution mode

A method for determining a data redistribution mode, applied in the database field, that addresses the high resource consumption of data redistribution and improves execution efficiency.

Active Publication Date: 2021-06-08
SHANGHAI DAMENG DATABASE

AI Technical Summary

Problems solved by technology

[0004] Embodiments of the present invention provide a method, device, server, and storage medium for determining a data redistribution mode, so as to solve the problem of serious resource consumption during data redistribution in the prior art.


Examples

Embodiment 1

[0030] Figure 1 is a flowchart of a method for determining a data redistribution mode provided by Embodiment 1 of the present invention. This embodiment is applicable to determining the data redistribution mode of nodes according to their distribution attributes in a massively parallel processing environment, so that the nodes can then perform data redistribution based on the determined mode. The method can be executed by a device for determining the data redistribution mode; the device can be implemented in software and/or hardware and is integrated in a server. Specifically, the method includes the following steps:

[0031] S110. Traverse the execution binary tree, and determine leaf nodes, intermediate nodes, and level information of the leaf nodes and intermediate nodes of the execution binary tree.

[0032] The execution binary tree is generated by analyzing the query statement input by the user, and the interm...
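Step S110 can be illustrated with a minimal Python sketch. The `PlanNode` class, the `classify_nodes` function, and the convention that the root sits at level 0 are assumptions made for illustration; the source does not specify a data structure, and treating every node that has children (including the root) as an intermediate node is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PlanNode:
    """Hypothetical node of an execution binary tree (not from the source)."""
    name: str
    left: Optional["PlanNode"] = None
    right: Optional["PlanNode"] = None


def classify_nodes(root: PlanNode) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int]]]:
    """Return (leaves, intermediates) as lists of (node name, level).

    Assumption: the root is level 0 and levels grow toward the leaves.
    """
    leaves: List[Tuple[str, int]] = []
    intermediates: List[Tuple[str, int]] = []
    stack = [(root, 0)]
    while stack:
        node, level = stack.pop()
        children = [c for c in (node.left, node.right) if c is not None]
        if children:
            intermediates.append((node.name, level))
        else:
            leaves.append((node.name, level))
        # Push the right child first so the left child is visited first.
        for child in reversed(children):
            stack.append((child, level + 1))
    return leaves, intermediates


# A two-table join: one intermediate node over two leaf (table) nodes.
tree = PlanNode("HI", PlanNode("A"), PlanNode("B"))
leaves, intermediates = classify_nodes(tree)
print(leaves)         # [('A', 1), ('B', 1)]
print(intermediates)  # [('HI', 0)]
```

The recorded levels are what a later bottom-up pass would need in order to visit the deepest nodes first.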

Embodiment 2

[0046] Figure 4 is a flowchart of a method for determining a data redistribution mode provided by Embodiment 2 of the present invention. This embodiment takes the join operation as an example, that is, the query statement is a join query statement, and it elaborates on the above embodiment. Specifically, the method includes:

[0047] S210. Traverse the execution binary tree, and determine leaf nodes, intermediate nodes, and level information of the leaf nodes and intermediate nodes of the execution binary tree.

[0048] S220. Determine the distribution attributes of the leaf nodes and the intermediate nodes in sequence from bottom to top.

[0049] The distribution attributes of leaf nodes and intermediate nodes are determined in different ways, as follows:

[0050] S2201. Search the data dictionary to obtain the distribution attributes of the leaf nodes.

[0051] The data dictionary is used to store distribution attributes of each lea...
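The bottom-up order of S220 and the dictionary lookup of S2201 can be sketched as follows. This is only an illustration under stated assumptions: `DATA_DICTIONARY`, `PLAN`, `determine_attributes`, and the attribute strings are hypothetical, and `placeholder_rule` is a stand-in, because the source text is truncated before it describes how the attribute of an intermediate node is derived.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical data dictionary: leaf (table) name -> distribution attribute.
DATA_DICTIONARY: Dict[str, str] = {"A": "hash(c1)", "B": "random"}

# Hypothetical plan: node name -> (level, child names); level 0 is the root.
PLAN: Dict[str, Tuple[int, List[str]]] = {
    "HI": (0, ["A", "B"]),
    "A": (1, []),
    "B": (1, []),
}


def determine_attributes(
    plan: Dict[str, Tuple[int, List[str]]],
    dictionary: Dict[str, str],
    derive: Callable[[str, List[str]], str],
) -> Dict[str, str]:
    """Assign a distribution attribute to every node, deepest level first."""
    attrs: Dict[str, str] = {}
    for name in sorted(plan, key=lambda n: -plan[n][0]):
        _level, children = plan[name]
        if not children:
            # Leaf node: look the attribute up in the data dictionary (S2201).
            attrs[name] = dictionary[name]
        else:
            # Intermediate node: children were processed already, so their
            # attributes are available to whatever rule is plugged in.
            attrs[name] = derive(name, [attrs[c] for c in children])
    return attrs


def placeholder_rule(name: str, child_attrs: List[str]) -> str:
    # Placeholder only; the real rule is not visible in the truncated source.
    return "derived from " + " and ".join(child_attrs)


attrs = determine_attributes(PLAN, DATA_DICTIONARY, placeholder_rule)
print(attrs["A"], "|", attrs["B"], "|", attrs["HI"])
```

Sorting by descending level guarantees that every child is resolved before its parent, which is the only property the bottom-up pass relies on here.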

Example 1

[0108] Data table A and data table B are joined by a hash inner join (HI) with the join condition HI (A.c1=B.d1 and A.c2=B.d2); the execution binary tree is shown in Figure 2. The HI node has two candidate data redistribution methods: the distribution redistribution method and the collection redistribution method.

[0109] (1) Select the distribution redistribution method

[0110] According to the join condition, the redistribution items are determined as (c1, d1), (c2, d2) or {(c1, c2), (d1, d2)}, and the distribution attribute of the HI node is consistent with the redistribution items. Depending on the distribution attributes of data table A and data table B, the following situations arise:

[0111] 1) Neither data table A nor data table B is hash-distributed (for example, they are randomly distributed or copy-distributed). In this case both data table A and data table B need to perform distribution redistribution, and the redistribution items are (c1, d1), (c2, d2) or {(c1, c2), (...
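The composite redistribution item {(c1, c2), (d1, d2)} and the test for situation 1) can be sketched in Python. The function names and the string labels `"hash"`, `"random"`, and `"copy"` are assumptions for illustration; the sketch covers only situation 1), since the remaining situations are truncated in the source.

```python
from typing import List, Tuple


def redistribution_items(
    join_pairs: List[Tuple[str, str]],
) -> Tuple[Tuple[str, ...], Tuple[str, ...]]:
    """Split equality pairs [(a_col, b_col), ...] into per-table column tuples."""
    a_cols = tuple(a for a, _ in join_pairs)
    b_cols = tuple(b for _, b in join_pairs)
    return a_cols, b_cols


def is_situation_one(kind_a: str, kind_b: str) -> bool:
    """True when neither table is hash-distributed, i.e. both must redistribute.

    The kind labels are hypothetical; other situations are not modeled here.
    """
    return kind_a != "hash" and kind_b != "hash"


# Join condition from the example: A.c1 = B.d1 and A.c2 = B.d2.
pairs = [("c1", "d1"), ("c2", "d2")]
a_cols, b_cols = redistribution_items(pairs)
print(a_cols, b_cols)                      # ('c1', 'c2') ('d1', 'd2')
print(is_situation_one("random", "copy"))  # True
print(is_situation_one("hash", "random"))  # False
```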


Abstract

The embodiment of the invention discloses a method, a device, a server and a storage medium for determining a data redistribution mode. The method includes: traversing the execution binary tree to determine its leaf nodes, its intermediate nodes, and the level information of the leaf nodes and intermediate nodes; determining the distribution attributes of the leaf nodes and the intermediate nodes in sequence from bottom to top; and determining the data redistribution modes of the leaf nodes and intermediate nodes respectively according to their distribution attributes. Compared with the prior art, the embodiment of the present invention determines the distribution attributes of the leaf nodes and intermediate nodes in bottom-up order and then determines the corresponding data redistribution mode according to those attributes, which solves the problem of high resource consumption of data redistribution in the prior art and improves the execution efficiency of the system.

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of databases, and in particular, to a method, device, server, and storage medium for determining a data redistribution mode.

Background technique

[0002] MPP (Massively Parallel Processing) refers to massively parallel processing. In a shared-nothing database cluster, each node has an independent disk storage system and memory system. The nodes are connected to each other through the network and cooperate to complete computations. Put simply, MPP distributes tasks across multiple servers and nodes in parallel; after each node finishes its computation, the partial results are aggregated to obtain the final execution result.

[0003] When data tables are joined or other operations are performed, if the operation involves non-distribution columns, each node needs data from nodes other than itself to complete its computation. At this time, the data...
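The dependence on other nodes described in [0003] can be illustrated with a toy Python sketch. Everything here is an assumption for illustration: two nodes, integer columns, placement by `value % NODE_COUNT` as a stand-in for hash distribution, and the helper names `place` and `local_matches`. It shows a join on non-distribution columns finding no matches locally until both tables are re-placed on the join column.

```python
from typing import List, Tuple

NODE_COUNT = 2  # assumed cluster size for the illustration

Row = Tuple[int, int]


def place(rows: List[Row], key_index: int) -> List[List[Row]]:
    """Assign each row to a node by (key column value) % NODE_COUNT."""
    nodes: List[List[Row]] = [[] for _ in range(NODE_COUNT)]
    for row in rows:
        nodes[row[key_index] % NODE_COUNT].append(row)
    return nodes


def local_matches(a_nodes: List[List[Row]], b_nodes: List[List[Row]]) -> int:
    """Count join matches on the second column using only node-local rows."""
    return sum(
        1
        for a_part, b_part in zip(a_nodes, b_nodes)
        for a in a_part
        for b in b_part
        if a[1] == b[1]
    )


table_a = [(1, 10), (2, 21)]  # columns (c1, c2), distributed on c1
table_b = [(2, 10), (1, 21)]  # columns (d1, d2), distributed on d1

# Join on A.c2 = B.d2 while the tables are still placed on c1 and d1.
before = local_matches(place(table_a, 0), place(table_b, 0))

# Re-place both tables on the join columns, then join locally again.
after = local_matches(place(table_a, 1), place(table_b, 1))

print(before, after)  # 0 2
```

Moving rows between nodes is the cost that choosing a redistribution mode tries to keep small.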


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/27, G06F16/242, G06F16/22
CPC: G06F16/2246, G06F16/2433, G06F16/27
Inventor 张钦朱仲颖
Owner SHANGHAI DAMENG DATABASE