
Data skew processing method and system for join operations, and computer equipment

A data processing technology, applied in database design/maintenance, database indexing, structured data retrieval, etc., that addresses the lack of a general method for processing join operations with data skew.

Active Publication Date: 2020-10-13
中邮消费金融有限公司
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0008] Accordingly, there is a need for a data skew processing method, system, and computer equipment for join operations, to solve the problem that the prior art offers no general method for joining two large tables when the data is skewed.



Examples


Detailed Description of the Embodiments

[0044] The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments, so that those skilled in the art can better understand and implement it; the examples given are not intended to limit the present invention.

[0045] As shown in Figure 1, an embodiment of the present invention provides a data skew processing method for a join operation, comprising the following steps:

[0046] S1. In the SQL statement, annotate the join object (the associated object) that has data skew; the annotation contains the join object and the skewed values;

[0047] S2. Parse the SQL execution plan and, according to the annotation, identify the join execution plan (the associated execution plan) that corresponds to the join object;

[0048] S3. Split the join execution plan into two sub-plans according to the skewed values,...
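Steps S1 and S2 can be illustrated with a minimal Python sketch. The excerpt does not show the annotation syntax or the execution plan format, so everything here is an assumption made for illustration: the `/*+ SKEW(table.column: v1, v2) */` comment form, the `SkewHint` and `JoinNode` classes, and the flat list standing in for a parsed plan are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical annotation syntax (not taken from the patent text), e.g.
#   /*+ SKEW(orders.customer_id: 0, -1) */
HINT_RE = re.compile(r"/\*\+\s*SKEW\(\s*([\w.]+)\s*:\s*([^)]*)\)\s*\*/")


@dataclass
class SkewHint:
    join_object: str      # annotated join object, written as table.column
    skewed_values: list   # values declared as skewed (kept as strings)


@dataclass
class JoinNode:
    """Toy stand-in for one join node of a parsed SQL execution plan."""
    left: str
    right: str
    key: str              # join column, written as table.column


def parse_skew_hints(sql):
    """S1 (sketch): collect every skew annotation found in the SQL text."""
    hints = []
    for obj, values in HINT_RE.findall(sql):
        vals = [v.strip() for v in values.split(",") if v.strip()]
        hints.append(SkewHint(obj, vals))
    return hints


def find_annotated_joins(plan_joins, hints):
    """S2 (sketch): pair each join node with the hint that names its key."""
    by_object = {h.join_object: h for h in hints}
    return [(j, by_object[j.key]) for j in plan_joins if j.key in by_object]


sql = """
SELECT /*+ SKEW(orders.customer_id: 0, -1) */ *
FROM orders JOIN customers ON orders.customer_id = customers.id
"""
hints = parse_skew_hints(sql)
plan = [JoinNode("orders", "customers", "orders.customer_id")]
matches = find_annotated_joins(plan, hints)
```

In this sketch `matches` holds the join nodes that S3 would go on to split; a real engine would walk its own plan tree rather than a flat list.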



Abstract

The invention relates to a data skew processing method for join operations. The method comprises the following steps: annotating, in an SQL statement, the join object (associated object) that has data skew, the annotation containing the join object and the skewed values; parsing the SQL execution plan and identifying, according to the annotation, the join execution plan corresponding to the join object; splitting the join execution plan into two sub-plans according to the skewed values, one sub-plan handling the rows whose join values are the skewed values and the other handling the rows whose join values are not, and taking the union of the sub-plans' results to obtain a replacement execution plan; and replacing the join execution plan with the replacement execution plan to obtain the SQL execution plan after data skew processing. The method addresses data skew when joining two tables without requiring custom-developed code.
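The split-and-union idea in the abstract can be sketched on in-memory data. This is a minimal illustration under stated assumptions, not the patented implementation: `hash_join` and `split_skew_join` are hypothetical helper names, tables are plain lists of dicts, and the sketch only demonstrates that the union of the two sub-plans returns the same rows as the original inner join. How each sub-plan is actually executed is not specified in this excerpt.

```python
def hash_join(left, right, key_l, key_r):
    """Plain inner join of two lists of dicts (stands in for the original join plan)."""
    index = {}
    for r in right:
        index.setdefault(r[key_r], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key_l], [])]


def split_skew_join(left, right, key_l, key_r, skewed):
    """Replacement plan (sketch): join skewed and non-skewed keys separately, then union."""
    skewed = set(skewed)
    # Sub-plan 1: only rows whose join key is one of the declared skewed values.
    hot = hash_join([l for l in left if l[key_l] in skewed],
                    [r for r in right if r[key_r] in skewed], key_l, key_r)
    # Sub-plan 2: only rows whose join key is not a skewed value.
    cold = hash_join([l for l in left if l[key_l] not in skewed],
                     [r for r in right if r[key_r] not in skewed], key_l, key_r)
    # The two key sets are disjoint, so concatenation acts as the union of results.
    return hot + cold


# Illustrative data: customer 0 is the skewed value.
orders = [{"order_id": i, "cust": 0} for i in range(5)]
orders += [{"order_id": 5, "cust": 7}, {"order_id": 6, "cust": 8}]
customers = [{"cid": 0, "name": "anon"}, {"cid": 7, "name": "a"}, {"cid": 9, "name": "b"}]

baseline = hash_join(orders, customers, "cust", "cid")
rewritten = split_skew_join(orders, customers, "cust", "cid", skewed=[0])


def canon(rows):
    """Order-independent form of a result set, used to compare the two plans."""
    return sorted(sorted(d.items()) for d in rows)
```

Because a row with a given join key can only match rows carrying the same key, no matching pair straddles the two sub-plans, which is why the rewrite preserves the result of an inner join.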

Description

Technical field

[0001] The invention relates to the technical field of data skew processing, in particular to a method, system, and computer equipment for data skew processing in join operations.

Background

[0002] At present, the basic idea of data processing in the big data field is to partition a large data set according to specific rules and then use multiple executors to process the data in each partition in parallel. These executors can be located on different machines or in different processes on the same machine. If a process is complex, it must be composed of multiple computing tasks, and data repartitioning (a shuffle) is usually required between computing tasks in different stages.

[0003] If the data volume in each partition is relatively balanced during the calculation, each executor can process the data in its respective partition within a rela...
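To make the partitioning background concrete, the following sketch counts how many rows land in each partition when rows are assigned by hashing the join key. The function name `partition_sizes` and the sample key distribution are assumptions chosen for illustration; real engines use their own partitioners.

```python
def partition_sizes(keys, num_partitions):
    """Count rows per partition when each row goes to hash(key) % num_partitions."""
    sizes = [0] * num_partitions
    for k in keys:
        sizes[hash(k) % num_partitions] += 1
    return sizes


# Illustrative distribution: key 0 appears 9000 times, keys 1..1000 once each.
keys = [0] * 9000 + list(range(1, 1001))
sizes = partition_sizes(keys, 4)
# One partition receives every row with the skewed key, so its executor
# does far more work than the others and determines the stage's run time.
imbalance = max(sizes) / (sum(sizes) / len(sizes))
```

With integer keys the assignment is deterministic in CPython, so the partition holding key 0 ends up with the bulk of the rows while the remaining partitions share the rest evenly.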


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/21, G06F16/22
CPC: G06F16/217, G06F16/2282
Inventor 范灿升廖健祝大裕杨思吉梁伟雄
Owner 中邮消费金融有限公司