Collection Method of Training Samples for Database Load Response Time Prediction Model

A training-sample collection technique for prediction models, applied in special data processing applications, electrical digital data processing, instruments, etc. It addresses the high cost of model establishment, long model training time, and the failure to consider the interaction between loads, with the effect of reducing the number of samples.

Inactive Publication Date: 2017-10-27
TAIYUAN UNIV OF TECH
2 Cites, 0 Cited by
AI Technical Summary

Problems solved by technology

[0010] However, the sampling methods used with the above statistical models do not consider the interaction between loads; they obtain samples only through specific sampling or random sampling of the entire sample space. As the amount of data in the database grows, load running time increases. If the training samples are not selectively chosen, model training takes longer and the cost of establishing the model becomes very high.

Examples

Embodiment Construction

[0030] Example: Let five load types be q1, q2, q3, q4, q5. The MPL level is 4, meaning that four workloads can run in the database at the same time, and the current sample is s0(q1, q2, q3, q4). The loads q1, q2, q3, q4, q5 are generated from the five query templates Cq1, Cq2, Cq3, Cq4, Cq5, respectively. The database system is IBM DB2, version 9.5.
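To make the size of the sample space in this example concrete, the following minimal sketch enumerates the candidate mixes for five load types at MPL 4. The excerpt does not say whether a load type may appear more than once in a mix, so both readings are shown; the names `TEMPLATES`, `distinct_mixes`, and `repeated_mixes` are illustrative and not taken from the patent.

```python
from itertools import combinations, combinations_with_replacement

TEMPLATES = ["q1", "q2", "q3", "q4", "q5"]  # the five load types from the example
MPL = 4  # number of loads running in the database at the same time

# Assumption A: each load type appears at most once in a mix.
distinct_mixes = list(combinations(TEMPLATES, MPL))

# Assumption B: a load type may appear several times in a mix.
repeated_mixes = list(combinations_with_replacement(TEMPLATES, MPL))

# The current sample s0 from the example is a member of both spaces.
s0 = ("q1", "q2", "q3", "q4")
print(len(distinct_mixes), len(repeated_mixes), s0 in distinct_mixes)
```

Under either reading the space grows quickly with the number of load types and the MPL, which is why running every mix to collect training data becomes expensive.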

[0031] 1. Obtain the response data of each load when it runs alone. The response data comprises the response time, CPU time, number of logical reads, and BAL value Tq.

[0032] Run the loads q1, q2, q3, q4, q5 alone and record the response time, CPU time, number of logical reads, and BAL value of each individual run. The data is obtained through the DB2 snapshot monitoring command: "db2 get snapshot for dynamic sql on database".
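As a rough illustration of this step, the sketch below builds the snapshot command quoted above and defines a record for the four response-data fields. The class `IsolatedRun`, the functions `snapshot_command` and `take_snapshot`, and the alias `SAMPLE` are hypothetical names introduced here. Parsing the snapshot text is deliberately left out because the excerpt does not describe the output layout, and the call only works where the DB2 command line processor is installed and configured.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class IsolatedRun:
    """Response data of one load run alone (fields listed in step 1)."""
    load: str
    response_time: float
    cpu_time: float
    logical_reads: int
    bal: float


def snapshot_command(database: str) -> list[str]:
    """Return the argv for the snapshot command quoted in the text."""
    return ["db2", "get", "snapshot", "for", "dynamic", "sql", "on", database]


def take_snapshot(database: str) -> str:
    """Run the snapshot command and return its raw text output."""
    result = subprocess.run(
        snapshot_command(database), capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical database alias, used only to show the command that would run.
print(" ".join(snapshot_command("SAMPLE")))
```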

[0033] 2. Obtain the response data when the loads run in pairs; take q1, q2, q3, q4, q5 and perform permutations and combi...

Abstract

The method for collecting training samples for a database load response time prediction model is a clustering-based sample collection method. It comprises: (1) obtaining the response data of each database load when it runs alone; (2) obtaining the response data of the database loads when they run in pairs; (3) calculating the change in average page read time; (4) clustering the full sample space according to the change in average page read time; (5) filling in the sample selection table; (6) generating the training samples. The invention reduces the number of samples needed by the statistical model while maintaining model accuracy, and so lowers the cost of establishing the model.
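The flow of steps (3), (4), and (6) can be sketched as follows. This is not the patent's algorithm: the excerpt defines neither the exact formula for the change in average page read time nor the sample selection table of step (5). The sketch assumes the change for a load is its paired page read time minus its isolated one, averages that over all ordered pairs in a mix, clusters the resulting scalars with a plain one-dimensional k-means, and keeps the mix nearest each centroid. All function names and all numbers are hypothetical.

```python
from itertools import combinations, permutations
from statistics import mean


def mix_feature(mix, isolated, paired):
    """Assumed feature: mean page read time change over ordered pairs in a mix."""
    return mean(paired[(a, b)] - isolated[a] for a, b in permutations(mix, 2))


def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on scalars; returns (labels, centroids)."""
    ordered = sorted(values)
    # Spread the initial centroids evenly across the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are final
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids


def pick_representatives(mixes, features, labels, centroids):
    """Keep, for each cluster, the mix whose feature is closest to the centroid."""
    best = {}
    for mix, feat, lab in zip(mixes, features, labels):
        dist = abs(feat - centroids[lab])
        if lab not in best or dist < best[lab][0]:
            best[lab] = (dist, mix)
    return [best[lab][1] for lab in sorted(best)]


# Synthetic numbers for illustration only, not measurements.
loads = ["q1", "q2", "q3", "q4", "q5"]
isolated = {q: 1.0 for q in loads}
pressure = {"q1": 0.0, "q2": 0.1, "q3": 0.2, "q4": 0.8, "q5": 0.9}
paired = {(a, b): isolated[a] * (1 + pressure[b]) for a, b in permutations(loads, 2)}

mixes = list(combinations(loads, 4))
features = [mix_feature(m, isolated, paired) for m in mixes]
labels, centroids = kmeans_1d(features, k=2)
selected = pick_representatives(mixes, features, labels, centroids)
print(labels, selected)
```

The point of the design is that only the isolated and pairwise runs have to be measured up front; the clustering then decides which full mixes are worth running as training samples, instead of running every mix in the sample space.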

Description

Technical field

[0001] The invention is a clustering-based sample collection method, applied to collecting training samples for a database load response time prediction model.

Background

[0002] In current parallel database systems, predicting load response time is very important: it helps the database administrator adjust database parameters and schedule parallel loads sensibly.

[0003] However, because the interaction mechanism between parallel database loads is complex, traditional analytical models are complicated to build and predict poorly. The existing literature therefore mainly builds statistical models to predict load response time. A statistical model is established in three steps: sample collection, model training (regression), and model evaluation. The literature in this area mainly includes [1] Duggan J, Cetintemel U, Papaemmanoui...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 牛保宁, 张锦文
Owner: TAIYUAN UNIV OF TECH