
Feature determination method and device for predicting occupied amount of query resources

A feature determination technique for predicting query resource occupancy, applied in the field of data processing. It addresses the problem that too few input feature values lead to an inaccurate predicted memory footprint, which in turn affects the execution of query tasks. Its effects are optimized resource allocation, improved utilization, and more reasonable query execution.

Active Publication Date: 2020-03-06
BEIJING GRIDSUM TECH CO LTD
14 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] For an Impala query, a supervised learning algorithm is used to predict the memory space the query will use. The only feature supplied to the algorithm is the total number of files to be processed. Because so few feature values are input, the predicted memory footprint is inaccurate, which affects the execution of subsequent query tasks.



Examples


Embodiment Construction

[0053] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0054] An embodiment of the present invention provides a feature determination method for predicting query resource usage, where the query resource usage may be a memory footprint.

[0055] Referring to Figure 1, the method may include:

[0056] S11. Obtain the data to be queried;

[0057] Here, the data to be queried is the data input when a query operation is performed;

[0058] Specifically, the data to be queried is the data that the u...



Abstract

The invention discloses a feature determination method and device for predicting the occupied amount of query resources. The method comprises: obtaining the data to be queried; determining the query plan data corresponding to the data to be queried; generating a tree structure corresponding to the query plan data; and determining, based on the tree structure, a feature dimension value for each preset feature dimension. The preset feature dimensions are used to determine the resources occupied by the query operation, and there are multiple preset feature dimensions. The method therefore yields a plurality of feature dimension values. Compared with supplying a single feature dimension value, inputting multiple values into the supervised learning algorithm makes the predicted query resource occupancy more accurate.
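The steps in the abstract (query plan, tree structure, multiple feature dimension values) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the indented plan text format, the `PlanNode` class, the helper names `build_tree` and `extract_features`, and the five feature dimensions. The source does not state which dimensions the patent uses or what Impala's plan output looks like.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlanNode:
    """One operator in a query plan tree (hypothetical structure)."""
    label: str
    children: List["PlanNode"] = field(default_factory=list)


def build_tree(plan_text: str, indent: int = 2) -> PlanNode:
    """Build a tree from an indented plan listing.

    Assumed format: one operator per line, children indented by `indent`
    spaces relative to their parent, and a single root at depth 0.
    """
    stack: List[PlanNode] = []  # stack[d] is the latest node seen at depth d
    root = None
    for line in plan_text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        node = PlanNode(line.strip())
        if depth == 0:
            root = node
        else:
            stack[depth - 1].children.append(node)
        del stack[depth:]  # drop nodes at this depth or deeper
        stack.append(node)
    if root is None:
        raise ValueError("empty plan")
    return root


def extract_features(root: PlanNode) -> Dict[str, int]:
    """Walk the tree and compute several illustrative feature dimensions."""
    features = {
        "node_count": 0,
        "max_depth": 0,
        "scan_nodes": 0,
        "join_nodes": 0,
        "total_files": 0,
    }
    todo = [(root, 1)]
    while todo:
        node, depth = todo.pop()
        features["node_count"] += 1
        features["max_depth"] = max(features["max_depth"], depth)
        if node.label.startswith("SCAN"):
            features["scan_nodes"] += 1
        if "JOIN" in node.label:
            features["join_nodes"] += 1
        for token in node.label.split():
            if token.startswith("files="):  # hypothetical annotation
                features["total_files"] += int(token[len("files="):])
        todo.extend((child, depth + 1) for child in node.children)
    return features


# Hypothetical plan text, not real Impala output.
EXAMPLE_PLAN = "\n".join([
    "AGGREGATE",
    "  HASH JOIN",
    "    SCAN HDFS files=12",
    "    SCAN HDFS files=3",
])

features = extract_features(build_tree(EXAMPLE_PLAN))
print(features)
```

The resulting dictionary could be flattened into a feature vector for whichever supervised learning algorithm is used. The single-feature baseline criticized in the background corresponds to using only `total_files`.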

Description

Technical field

[0001] The present invention relates to the field of data processing and, more specifically, to a feature determination method and device for predicting query resource occupancy.

Background

[0002] Impala is a new type of query system that provides SQL (Structured Query Language) semantics and can query petabyte-scale (PB) big data stored in HDFS (Hadoop Distributed File System) and HBase.

[0003] For an Impala query, a supervised learning algorithm is used to predict the memory space the query will use. The only feature supplied to the algorithm is the total number of files to be processed. Because so few feature values are input, the predicted memory footprint is inaccurate, which affects the execution of subsequent query t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2453, G06F9/50, G06F11/34
CPC: G06F9/5016, G06F11/3442, G06F2201/80
Inventor: 张双燕
Owner: BEIJING GRIDSUM TECH CO LTD