
A big data query optimization method based on Presto and Elasticsearch

A query optimization technology and method, applied in the fields of electronic digital data processing, digital data information retrieval, and special data processing applications. It achieves the effects of reducing redundant reads and speeding up queries.

Active Publication Date: 2022-06-10
LINEWELL SOFTWARE

AI Technical Summary

Problems solved by technology

Elasticsearch lacks traditional SQL syntax support, which makes it difficult for developers to use and complicates data migration and integration for systems built on relational databases.
Presto can provide basic SQL syntax support for Elasticsearch, but its memory-based query mechanism needs to pre-read almost all of the target data into cluster memory, even though the end user typically needs only dozens of records or even just a few. Reading and filtering the redundant data consumes a lot of server resources and time, which significantly reduces query efficiency.



Embodiment Construction

[0023] The invention discloses a big data query optimization method based on Presto and Elasticsearch. As shown in Figure 1, the required systems include a Presto cluster and an Elasticsearch cluster. Users can submit SQL query requests to the Presto cluster through command-line tools, JDBC clients, or graphical development tools. The Presto cluster receives and parses the SQL request, then sends a query request to the Elasticsearch cluster and reads the data.
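As a rough illustration of the submission path in [0023], the sketch below builds (but does not send) the HTTP request that a client could use to hand a SQL statement to a Presto coordinator through its REST statement endpoint. The host, port, user name, catalog name `elasticsearch`, schema, and the table `logs` are all assumptions made for this example and do not come from the patent text.

```python
import urllib.request


def build_presto_request(sql, host="localhost", port=8080, user="analyst",
                         catalog="elasticsearch", schema="default"):
    """Build, without sending, a POST request that submits `sql` to Presto.

    All default values are hypothetical placeholders for this sketch.
    """
    url = f"http://{host}:{port}/v1/statement"
    headers = {
        "X-Presto-User": user,        # identifies the submitting user
        "X-Presto-Catalog": catalog,  # assumed name of the Elasticsearch catalog
        "X-Presto-Schema": schema,
    }
    # The SQL text travels as the request body.
    return urllib.request.Request(url, data=sql.encode("utf-8"),
                                  headers=headers, method="POST")


request = build_presto_request("SELECT name FROM logs WHERE level = 'ERROR'")
```

Command-line and JDBC clients wrap this kind of exchange; a real client would also follow the paging links returned by the coordinator, which this sketch leaves out.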

[0024] In the big data query optimization method based on Presto and Elasticsearch of the present invention, the data to be queried is stored in an index of the Elasticsearch cluster. As shown in Figure 2, when a user submits a SQL query request to the Presto cluster, the following steps are performed:

[0025] Step 1. The Presto cluster receives and parses the SQL query request, generates the corresponding abstract syntax tree, and generates a query execution plan tree from the syntax tree. The exec...
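To make the idea of an abstract syntax tree in Step 1 concrete, here is a toy sketch that parses a very small SQL subset (a single table, with `WHERE` predicates joined by `AND`) into a tree of dataclass nodes. It is not Presto's parser, which handles the full SQL grammar; the node classes, the `parse` function, and the sample table and columns are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Comparison:          # leaf node: column <op> literal
    column: str
    op: str
    value: Union[int, float, str]


@dataclass
class And:                 # conjunction of comparison nodes
    children: List[Comparison]


@dataclass
class Select:              # root node of the toy syntax tree
    columns: List[str]
    table: str
    where: Optional[And]


_QUERY = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
_PREDICATE = re.compile(r"^\s*(\w+)\s*(>=|<=|=|>|<)\s*(.+?)\s*$")


def _literal(text):
    """Turn a quoted string or a number into a Python value."""
    if text.startswith("'") and text.endswith("'"):
        return text[1:-1]
    try:
        return int(text)
    except ValueError:
        return float(text)


def parse(sql):
    """Parse the toy subset; string literals containing ' AND ' are not supported."""
    match = _QUERY.match(sql)
    if match is None:
        raise ValueError("unsupported query shape")
    columns = [c.strip() for c in match.group("cols").split(",")]
    where = None
    if match.group("where"):
        predicates = []
        for part in re.split(r"\s+AND\s+", match.group("where"), flags=re.IGNORECASE):
            pm = _PREDICATE.match(part)
            if pm is None:
                raise ValueError(f"unsupported predicate: {part!r}")
            predicates.append(Comparison(pm.group(1), pm.group(2), _literal(pm.group(3))))
        where = And(predicates)
    return Select(columns, match.group("table"), where)


tree = parse("SELECT name, age FROM users WHERE city = 'Xiamen' AND age >= 30")
```

The `where` subtree is the part a planner would inspect when deciding which conditions can be evaluated by the data source.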


Abstract

The present invention relates to a big data query optimization method based on Presto and Elasticsearch. All data to be queried is stored in the Elasticsearch cluster. The Presto cluster receives and parses SQL requests to generate the corresponding abstract syntax trees and execution plans, converts the query conditions into an Elasticsearch query request, and sends it to Elasticsearch for pre-query. By combining Presto's SQL parsing with Elasticsearch's fast queries, the invention pre-filters as much data as possible before retrieval, greatly reducing reads of redundant data and thereby improving the performance of conditional queries.
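The conversion of query conditions into an Elasticsearch request might look roughly like the following sketch, which maps simple `AND`-joined comparisons onto a `bool` query with `term` and `range` filters from the Elasticsearch query DSL. This is one plausible mapping under stated assumptions, not the patent's actual conversion rules: the function name `to_es_query`, the tuple representation of predicates, and the sample fields are hypothetical, and `term` is assumed to target exact-value (keyword or numeric) fields.

```python
import json

# SQL comparison operators mapped to Elasticsearch range-query keys.
_RANGE_OPS = {">": "gt", ">=": "gte", "<": "lt", "<=": "lte"}


def to_es_query(predicates):
    """Convert AND-joined (column, op, value) tuples into a query DSL body.

    Only '=' and the four range operators are handled in this sketch;
    anything else is rejected so it can be filtered on the Presto side.
    """
    filters = []
    for column, op, value in predicates:
        if op == "=":
            filters.append({"term": {column: value}})
        elif op in _RANGE_OPS:
            filters.append({"range": {column: {_RANGE_OPS[op]: value}}})
        else:
            raise ValueError(f"operator not pushed down in this sketch: {op}")
    # Filter context skips relevance scoring, which suits yes/no SQL conditions.
    return {"query": {"bool": {"filter": filters}}}


body = to_es_query([("city", "=", "Xiamen"), ("age", ">=", 30)])
print(json.dumps(body, indent=2))
```

Sending such a body to the index's search endpoint lets Elasticsearch return only matching documents, so Presto reads far fewer rows than it would with a full scan.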

Description

Technical field

[0001] The invention relates to a fast query method for big data, and in particular provides a big data query optimization method based on Presto and Elasticsearch.

Background

[0002] Elasticsearch is a distributed full-text search engine with real-time analytics, built on Apache Lucene™. It uses Lucene as its core to implement all indexing and search functions, so that the content of each document can be indexed, searched, sorted, and filtered. However, Elasticsearch lacks traditional SQL syntax support, which makes it difficult for developers to use and complicates data migration and integration for systems built on relational databases. Presto can provide basic SQL syntax support for Elasticsearch, but its memory-based query mechanism needs to pre-read almost the full amount of target data into cluster memory, even though the end user requires only a few dozen records out of the massive dataset. In the process, a lot of ser...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2453
Inventor: 洪灿榕
Owner: LINEWELL SOFTWARE