A Greenplum-based quick sort query method and system

A quick sort query method applied in the field of big data query. It addresses the problems of scarce documentation, complex structure, and a large amount of source code, and achieves the effect of reducing partition data reading and improving performance.

Active Publication Date: 2021-07-02
南威北方科技集团有限责任公司

AI Technical Summary

Problems solved by technology

[0009] Difficulty: the source code is large (more than 2.6 million lines in total), the architecture is complex, and documentation is scarce.


Examples

Embodiment

[0086] Assume that there is a partitioned table named test with 2 columns, id and create_date. The table uses create_date as the partition key and is divided into 365 partitions by date, covering January 1, 2017 through December 31, 2017. 30 million records are inserted into the table and distributed across the 365 partitions; the partition for January 1, 2017 holds 10,000 records.
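The partition layout described in [0086] can be illustrated with a small Python sketch. This is only a model of the date ranges, not Greenplum DDL; the helper name `daily_partitions` and the half-open `[lower, upper)` bounds are assumptions made for this example.

```python
from datetime import date, timedelta


def daily_partitions(start: date, end: date) -> list[tuple[date, date]]:
    """Return one [lower, upper) date range per day from start to end inclusive.

    Hypothetical helper: it only models the partition boundaries.
    """
    days = (end - start).days + 1
    return [
        (start + timedelta(days=d), start + timedelta(days=d + 1))
        for d in range(days)
    ]


# One partition per day of 2017, keyed on create_date.
partitions = daily_partitions(date(2017, 1, 1), date(2017, 12, 31))
print(len(partitions))   # number of daily partitions
print(partitions[0])     # bounds of the January 1, 2017 partition
```

Because 2017 is not a leap year, the range yields exactly the 365 partitions mentioned in the text.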

[0087] When the present invention executes the following SQL statement, the optimized query process replaces the original process:

[0088] select * from test order by create_date limit 10;

[0089] The two processing flows are compared below. As shown in Figure 1, the present invention discloses a fast query method for Order by on the Greenplum partition key value combined with Limit, which includes the following steps:

[0090] Step 1. The user submits a SQL request to the Master node, and the Master node generates an executio...
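The remaining steps are truncated in this excerpt. Based on the abstract's description (sort the partitions by partition key, scan them in order, return once the Limit is reached), the difference between the two flows can be sketched in Python as follows. This is a toy in-memory model, not Greenplum code: `build_table`, `baseline_query`, and `partition_ordered_query` are hypothetical names, and the table is scaled down to 100 rows per partition.

```python
from datetime import date, timedelta


def build_table(rows_per_partition: int = 100) -> dict:
    """Toy stand-in for the `test` table: one list of (id, create_date) rows per day."""
    start = date(2017, 1, 1)
    table = {}
    next_id = 0
    for d in range(365):
        day = start + timedelta(days=d)
        table[day] = [(next_id + i, day) for i in range(rows_per_partition)]
        next_id += rows_per_partition
    return table


def baseline_query(table: dict, limit: int):
    """Assumed original flow: read every partition, sort everything, then apply the limit."""
    rows = [row for part in table.values() for row in part]
    rows_read = len(rows)
    rows.sort(key=lambda r: r[1])
    return rows[:limit], rows_read


def partition_ordered_query(table: dict, limit: int):
    """Assumed optimized flow: visit partitions in key order, stop once `limit` rows are found."""
    result, rows_read = [], 0
    for key in sorted(table):              # partitions ordered by partition key value
        part = table[key]
        rows_read += len(part)
        result.extend(sorted(part, key=lambda r: r[1]))
        if len(result) >= limit:           # Limit satisfied: skip the remaining partitions
            break
    return result[:limit], rows_read


table = build_table()
slow_rows, slow_read = baseline_query(table, 10)
fast_rows, fast_read = partition_ordered_query(table, 10)
print(slow_rows == fast_rows, slow_read, fast_read)
```

The early exit is valid only because the partitions cover disjoint, ordered ranges of the sort key; in this model both functions return the same rows, while the second reads a single partition.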

Abstract

The invention belongs to the technical field of big data query and discloses a Greenplum-based quick sort query method and system. Greenplum receives and parses the SQL request, generates the corresponding abstract syntax tree, and generates a query execution plan tree from the syntax tree. The method modifies the execution plan tree by adding a LimitNode, so that the Limit operation is sent to the Segments for execution; sorts the partitions by partition key value; scans the records of the partitions in order; and returns directly once the number of records specified by the Limit is reached. Because the partitions are sorted in advance and the query starts from adjacent partitions, only the few partitions that need it are sorted and queried, which avoids a large amount of unnecessary partition data reading and greatly improves the performance of conditional queries.
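The idea of sending the Limit operation to the Segments can be sketched in Python as below. The abstract does not spell out how the per-Segment results are combined, so the merge on the Master shown here is an assumption, and `segment_top_n` and `master_merge` are hypothetical names; plain integers stand in for sort-key values.

```python
import heapq
from itertools import islice


def segment_top_n(rows: list, limit: int) -> list:
    """Work done on one Segment: return at most `limit` rows in ascending key order."""
    return heapq.nsmallest(limit, rows)


def master_merge(segment_results: list, limit: int) -> list:
    """Assumed work on the Master: merge the sorted per-Segment results, keep `limit` rows."""
    return list(islice(heapq.merge(*segment_results), limit))


# Three Segments, each holding an unsorted slice of the data.
segments = [[5, 1, 9, 13], [2, 8, 6], [7, 3, 4, 10]]
partials = [segment_top_n(rows, 3) for rows in segments]
top3 = master_merge(partials, 3)
print(top3)
```

With the limit applied on each Segment, the Master receives at most `limit` rows per Segment rather than every row.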

Description

Technical field

[0001] The invention belongs to the technical field of big data query, and in particular relates to a Greenplum-based quick sort query method and system.

Background technique

[0002] At present, the existing technologies commonly used in the industry are as follows:

[0003] Greenplum is a relational database cluster; in effect, it is a logical database composed of multiple independent database services. Unlike the Shared-Everything architecture of Oracle RAC, Greenplum adopts a Shared-Nothing architecture. The entire cluster consists of multiple data nodes (Segment Host) and control nodes (Master Host). Each data node can run multiple databases, also called instances.

[0004] The control node (Master Host) receives and parses the SQL request and generates a corresponding abstract syntax tree, generates an execution plan tree based on the syntax tree, and sends the execution plan tree to each data node (Segment Host), and each data...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/28, G06F16/2455
Inventor: 洪灿榕
Owner: 南威北方科技集团有限责任公司