Method and device for aggregate query in distributed databases

A method and device in the field of distributed database query technology. It addresses the problem that aggregate queries are no longer supported once data has been split across databases and tables, and it reduces application development time.

Active Publication Date: 2014-10-29
BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD +1
Cites: 4. Cited by: 50.

AI Technical Summary

Problems solved by technology

[0003] However, once data has been distributed across databases on multiple machines by splitting databases and tables, one or more columns of data originally stored in a single data table are hash-distributed across multiple data tables, or even multiple databases, and aggregate queries are no longer supported.
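To make the problem concrete, the following minimal sketch shows why an aggregate such as an average cannot simply be computed on each shard and then combined. The two lists of ages are hypothetical data, not taken from the patent. Averaging the per-shard averages gives the wrong answer when shards hold different numbers of rows, while merging per-shard sums and counts reproduces the single-table result.

```python
# Hypothetical ages stored on two shards of the same logical table.
shard_a = [20, 30]          # 2 rows
shard_b = [40, 50, 60, 70]  # 4 rows

# Reference answer: the average over all rows, as a single table would give.
all_rows = shard_a + shard_b
true_avg = sum(all_rows) / len(all_rows)

# Naive merge: average of the per-shard averages (wrong for unequal shard sizes).
naive_avg = (sum(shard_a) / len(shard_a) + sum(shard_b) / len(shard_b)) / 2

# Merge from decomposable partial results: one (sum, count) pair per shard.
partials = [(sum(rows), len(rows)) for rows in (shard_a, shard_b)]
merged_avg = sum(p[0] for p in partials) / sum(p[1] for p in partials)

print(true_avg, naive_avg, merged_avg)
```

Here the naive merge differs from the reference average, whereas the sum-and-count merge matches it.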


Examples

Embodiment 1

[0025] Figure 1 is a flow chart of a method for aggregate query in a distributed database provided by Embodiment 1 of the present invention. This embodiment is applicable to implementing aggregate queries in a distributed database. The distributed database includes an SQL (Structured Query Language) node and at least one data node: the SQL node receives query commands and performs calculations according to them, and each data node stores data. The method can be executed by the SQL node and includes the following steps:

[0026] Step 110, receiving the original SQL query statement sent by the client.

[0027] The original SQL statement is the original query command entered manually on the client. The client sends the query command to the SQL node, and the SQL node receives the original SQL statement sent by the client. The original SQL statement includes the SELECT keyword, query column,...

Embodiment 2

[0064] Embodiment 2 of the present invention provides a method for aggregate query in a distributed database, and specifically applies the method provided in Embodiment 1. As a specific example, suppose there is a data table acid containing the columns id, name, sex, age, city, and mobile, where id is the primary key.

[0065] The original query is: SELECT city, avg(age) FROM acid GROUP BY city HAVING(count(id)>100000).

[0066] Meaning of the original SQL query statement: from the acid data table, for each city that satisfies the condition count(id)>100000, return the city and the value of avg(age).

[0067] The original SQL query statement includes two aggregate functions: avg(age), which computes the average age, and count(id), which counts the number of ids.

[0068] After the SQL node receives the original SQL query statement, it parses the original SQL query statement to generate a syntax tree (s...
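The example in this embodiment can be sketched end to end as follows. This is an illustration under stated assumptions, not the patent's implementation: the rewritten per-shard statement (replacing avg(age) with sum(age) and count(age), and adding count(id) as a column) is an assumed form, since the source text is truncated before it shows the new statement. In-memory SQLite databases stand in for the data nodes, the rows and city names are made up, and the HAVING threshold is lowered from 100000 to a hypothetical 2 so the demo data can pass it. The helper names `make_shard` and `aggregate_query` are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Assumed rewritten statement sent to every data node: avg(age) is replaced
# by sum(age) and count(age), and count(id) from HAVING is added as a column.
SHARD_QUERY = (
    "SELECT city, sum(age), count(age), count(id) "
    "FROM acid GROUP BY city"
)

def make_shard(rows):
    """Create an in-memory SQLite database standing in for one data node."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE acid (id INTEGER PRIMARY KEY, name TEXT, sex TEXT, "
        "age INTEGER, city TEXT, mobile TEXT)"
    )
    conn.executemany("INSERT INTO acid VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn

def aggregate_query(shards, min_count):
    """Merge per-shard partials, then apply HAVING and compute avg(age)."""
    totals = defaultdict(lambda: [0, 0, 0])  # city -> [sum_age, count_age, count_id]
    for conn in shards:
        for city, sum_age, count_age, count_id in conn.execute(SHARD_QUERY):
            entry = totals[city]
            entry[0] += sum_age or 0  # sum(age) is NULL if every age is NULL
            entry[1] += count_age
            entry[2] += count_id
    return {
        city: sum_age / count_age
        for city, (sum_age, count_age, count_id) in totals.items()
        if count_id > min_count and count_age > 0
    }

# Hypothetical rows split across two data nodes.
shards = [
    make_shard([
        (1, "a", "m", 20, "Beijing", "000"),
        (2, "b", "f", 30, "Beijing", "001"),
        (3, "c", "m", 40, "Shanghai", "002"),
    ]),
    make_shard([
        (4, "d", "f", 40, "Beijing", "003"),
        (5, "e", "m", 50, "Shanghai", "004"),
    ]),
]
result = aggregate_query(shards, min_count=2)
print(result)  # {'Beijing': 30.0}
```

Only Beijing has more than two rows across both shards, so it is the only city returned, and its average is computed from the merged sum and count rather than from per-shard averages.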

Embodiment 3

[0075] Figure 3 is a schematic diagram of a device for aggregate query in a distributed database provided in Embodiment 3 of the present invention. The device provided in this embodiment is used to implement the method for aggregate query in a distributed database provided in Embodiment 1. As shown in Figure 3, the device includes: a receiving module 310, an acquiring module 320, a transforming module 330, an updating module 340, a distribution receiving module 350, and a computing module 360.

[0076] The receiving module 310 is used to receive the original SQL query statement sent by the client;

[0077] The acquiring module 320 is used to obtain the query column of the original SQL query statement and the aggregate function in the conditional subquery;

[0078] The transforming module 330 is used t...

Abstract

The invention discloses a method and device for aggregate query in distributed databases. The method comprises the steps of: receiving an original SQL query statement sent by a client; obtaining the query column of the original SQL query statement and the aggregate functions in its conditional subquery; among these aggregate functions, transforming each complex aggregate function that requires multi-table computation into simple aggregate functions; updating the original SQL query statement into a new SQL query statement according to the simple aggregate functions; sending the new SQL query statement to two or more databases for querying and receiving the returned query results for the query column; and calculating the result of the original SQL query statement from the returned query results and the new SQL query statement. The method and device implement aggregate queries across databases on multiple machines without requiring aggregation logic to be written in the application program, which shortens application development time.
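The transformation step in the abstract can be illustrated with a small sketch. The source only states that a complex aggregate function is transformed into simple aggregate functions; treating avg(x) as the complex case and sum(x) plus count(x) as its simple replacements is an assumption based on the standard decomposition. The regular expression, the function `simple_aggregates`, and the `MERGE` table are hypothetical names, and the pattern only handles single-column arguments.

```python
import re

# Matches aggregate calls with a single column argument, e.g. avg(age).
AGG = re.compile(r"\b(avg|sum|count|min|max)\(\s*(\w+)\s*\)", re.IGNORECASE)

def simple_aggregates(sql):
    """Return the simple aggregates to request from each database, in first-seen order."""
    needed = []
    for func, col in AGG.findall(sql):
        func = func.lower()
        # Assumption: avg is the "complex" case and splits into sum and count.
        parts = [f"sum({col})", f"count({col})"] if func == "avg" else [f"{func}({col})"]
        for part in parts:
            if part not in needed:
                needed.append(part)
    return needed

# How per-database values of each simple aggregate can be combined:
# partial sums and partial counts are both added; min and max are taken again.
MERGE = {"sum": sum, "count": sum, "min": min, "max": max}

original = "SELECT city, avg(age) FROM acid GROUP BY city HAVING(count(id)>100000)"
print(simple_aggregates(original))  # ['sum(age)', 'count(age)', 'count(id)']
```

A production rewriter would work on the parsed syntax tree rather than on the raw text; the regular expression is used here only to keep the sketch short.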

Description

Technical field

[0001] Embodiments of the present invention relate to database query technology, and in particular to a method and device for aggregate query in a distributed database.

Background

[0002] Databases generally store data in data tables, but the amount of data is not necessarily controllable. As time passes and the business grows, the database accumulates more and more data tables, and the amount of data in each table also grows. Correspondingly, the overhead of data operations on the tables, such as insertion, deletion, modification, and query, also increases. In addition, the hardware resources (CPU, disk, memory, I/O, etc.) of a single server are limited, so the data volume and processing capacity that the database can carry eventually hit a hardware bottleneck. It is therefore necessary to split databases and tables. That is, the data originally stored in one host database is stored in b...
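As a rough illustration of splitting one table across several databases, the sketch below routes each row to a shard by hashing its primary key. The routing rule, the function name `shard_for`, and the sample rows are assumptions for illustration; the source does not specify which hash function or sharding key is used.

```python
import zlib
from collections import defaultdict

def shard_for(key, num_shards):
    """Pick a shard index from a stable hash of the key (CRC32 is an assumed choice)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_shards

# Hypothetical (id, city) rows distributed over three shards by primary key.
rows = [(1, "Beijing"), (2, "Beijing"), (3, "Shanghai"), (4, "Beijing"), (5, "Shanghai")]
placement = defaultdict(list)
for row_id, city in rows:
    placement[shard_for(row_id, 3)].append((row_id, city))

for shard_index in sorted(placement):
    print(shard_index, placement[shard_index])
```

Because rows are placed by id rather than by city, rows for the same city can end up on different shards, which is why a GROUP BY city aggregate cannot be answered by any single database alone.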

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/244, G06F16/27
Inventor: 唐超马丽伟秦波王锋赵晓平
Owner: BEIJING JINGDONG SHANGKE INFORMATION TECH CO LTD