OLAP query engine dynamic cost evaluation method and device

A query engine and evaluation device technology, applied in special data processing applications, instruments, and electrical digital data processing. It addresses the low accuracy of existing cost evaluation, which ignores factors that strongly affect query performance, and its inability to learn dynamically from query records, and it achieves high evaluation accuracy.

Pending Publication Date: 2022-04-29
CHINA UNITECHS

AI Technical Summary

Problems solved by technology

[0003] 1. Low accuracy: table data volume and field cardinality have a great impact on query performance, but the current rule-based cost evaluation engines cannot use this information to evaluate the cost of a SQL query optimally.
[0004] 2. No dynamic improvement: the engines have no self-learning ability, so they cannot learn from query records to continuously improve the accuracy of cost evaluation.

Examples

Embodiment

[0096] Step 1: Collect the data volume and field cardinality of the two tables t_sys_member and t_member_order_detail_d in the data warehouse. t_sys_member contains 2.05 million rows with 30 fields, and t_member_order_detail_d contains 16.72 million rows with 46 fields.
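The statistics gathered in Step 1 can be illustrated with a minimal sketch. It assumes that "field cardinality" means the number of distinct values in a field, which is the usual database meaning but is not spelled out in the source. The function name `collect_table_stats` and the three sample rows are hypothetical; a real warehouse would normally obtain the same numbers with `COUNT(*)` and `COUNT(DISTINCT field)` queries rather than by scanning rows in memory.

```python
def collect_table_stats(rows):
    """Return (row_count, {field: distinct_value_count}) for an in-memory table.

    Assumption: field cardinality = number of distinct values in that field.
    """
    distinct = {}
    for row in rows:
        for field, value in row.items():
            distinct.setdefault(field, set()).add(value)
    return len(rows), {field: len(values) for field, values in distinct.items()}


# Hypothetical toy rows standing in for a table such as t_sys_member.
sample_rows = [
    {"member_id": 1, "sex": "F"},
    {"member_id": 2, "sex": "M"},
    {"member_id": 3, "sex": "F"},
]

row_count, cardinality = collect_table_stats(sample_rows)
print(row_count, cardinality)
```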

[0097] Step 2: Perform feature extraction on the two tables, discretize the execution time, and label it. The query on table t_sys_member takes 1.2 s and is assigned label 1; the query on table t_member_order_detail_d takes 2.3 s and is assigned label 2.
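One way to discretize execution time into class labels is to bucket it against fixed edges. The sketch below is only an illustration: the source does not state the bucket boundaries, so `BUCKET_EDGES` is a hypothetical choice made so that the two example times above map to labels 1 and 2, and `time_to_label` is an assumed helper name.

```python
from bisect import bisect_right

# Hypothetical bucket edges in seconds; the source does not specify them.
BUCKET_EDGES = [1.0, 2.0, 5.0, 10.0]


def time_to_label(seconds: float) -> int:
    """Map an execution time to a discrete label (0 is the fastest bucket)."""
    return bisect_right(BUCKET_EDGES, seconds)


labels = {
    "t_sys_member": time_to_label(1.2),
    "t_member_order_detail_d": time_to_label(2.3),
}
print(labels)  # {'t_sys_member': 1, 't_member_order_detail_d': 2}
```

Turning the continuous time into a small set of labels is what lets the later training step be treated as classification rather than regression.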

[0098] Step 3: For table t_sys_member, generate a feature code for each of the following: the table information, the filter field member_id, the analysis field sex, the time partition stastis_date, the table data volume of 2.05 million, and the field cardinality of 30. For table t_member_order_detail_d, do the same for the table information, the filter field member_id, the analysis field city, time par...
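The feature coding in Step 3 could look roughly like the sketch below. The encoding scheme is an assumption, since the source does not describe it: categorical items (table, filter field, analysis field, time partition) get integer codes from a hypothetical `Vocabulary` helper, and the two numeric items are log-scaled. Only t_sys_member is encoded, because the values for the second table are cut off in the text.

```python
import math


class Vocabulary:
    """Assigns a stable integer code to each distinct string it sees."""

    def __init__(self):
        self._codes = {}

    def encode(self, token: str) -> int:
        # The first unseen token gets 0, the next 1, and so on.
        return self._codes.setdefault(token, len(self._codes))


CATEGORICAL = ("table", "filter_field", "analysis_field", "time_partition")


def encode_query(query: dict, vocabs: dict) -> list:
    """Build a flat feature vector: categorical codes, then log-scaled numbers."""
    vector = [float(vocabs[name].encode(query[name])) for name in CATEGORICAL]
    vector.append(math.log10(query["row_count"]))
    vector.append(math.log10(query["cardinality"]))
    return vector


vocabs = {name: Vocabulary() for name in CATEGORICAL}
t_sys_member_query = {
    "table": "t_sys_member",
    "filter_field": "member_id",
    "analysis_field": "sex",
    "time_partition": "stastis_date",
    "row_count": 2_050_000,  # 2.05 million, from the embodiment
    "cardinality": 30,       # from the embodiment
}
features = encode_query(t_sys_member_query, vocabs)
print(features)
```

Log scaling is used here only so that row counts in the millions do not dominate the small categorical codes; any other normalization would serve the same illustrative purpose.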

Abstract

The invention discloses a dynamic cost evaluation method and device for an OLAP query engine. The method comprises the following steps: collecting the data volume and field cardinality of the data tables in a data warehouse, while recording historical SQL queries and their corresponding query times; discretizing the SQL execution time to generate a label; generating feature codes from each data table's table information, filter field, analysis field, time partition, table data volume, and analysis field cardinality; training a classification algorithm on the existing query samples, using the extracted features of the data table and an existing machine learning model; and converting the execution plan of the OLAP query engine into a feature vector and feeding that vector into the machine learning model for cardinality evaluation, thereby obtaining the estimated query time of the data table. With this method and device, the OLAP query engine can quickly and accurately evaluate the cost of an input SQL query and generate the optimal execution plan.
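The train-then-predict flow described in the abstract can be sketched end to end with a deliberately simple classifier. The source says only that "an existing machine learning model" is used, so the nearest-centroid classifier below is a stand-in, not the patented model. The training samples are made-up two-element feature vectors (log-scaled row count and cardinality), and the function names are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: mean vector}."""
    sums = {}
    counts = defaultdict(int)
    for vector, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vector)
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict_label(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Hypothetical historical samples: (features, discretized time label).
history = [
    ([6.3, 1.5], 1),
    ([6.1, 1.4], 1),
    ([7.2, 1.7], 2),
    ([7.4, 1.6], 2),
]
model = train_centroids(history)

# Feature vector derived from a new execution plan (made-up values).
print(predict_label(model, [7.1, 1.6]))  # 2
```

Because new query records can simply be appended to `history` and the centroids recomputed, even this toy model shows the dynamic-learning property that the rule-based approach lacks.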

Description

Technical field

[0001] The invention relates to the field of OLAP query engines, in particular to an OLAP query engine dynamic cost evaluation method and device.

Background

[0002] Currently, the mainstream open source OLAP (On-Line Analytical Processing) query engines on the market, including Hive, Spark SQL, Presto, Kylin, Impala, Druid, ClickHouse, Greenplum, etc., use a rule-based approach to cost evaluation, as shown in Figure 1. This approach has the following problems:

[0003] 1. Low accuracy: table data volume and field cardinality have a great impact on query performance, but the current rule-based cost evaluation engines cannot use this information to evaluate the cost of a SQL query optimally.

[0004] 2. No dynamic improvement: the engines have no self-learning ability, so they cannot learn from query records to continuously improve the accuracy of cost evaluation.

Contents of the invention

[0005] In order to solve the above-mentioned prob...

Application Information

IPC(8): G06F16/2453, G06F16/28, G06N20/00
CPC: G06F16/24542, G06F16/285, G06N20/00
Inventor: 毛春阳, 闫一帅
Owner: CHINA UNITECHS