Database configuration tuning method, device, equipment and readable storage medium

By receiving client requests, collecting configuration samples, and analyzing them using tuning models, the problems of low efficiency and high cost in NoSQL database tuning were solved, achieving efficient and adaptive database performance optimization.

CN116150128B · Active · Publication Date: 2026-06-12 · SUN YAT SEN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SUN YAT SEN UNIV
Filing Date
2023-03-17
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing NoSQL database performance tuning methods require significant manpower and rarely achieve optimal performance. Machine learning methods have high training costs and cannot quickly adapt to changing workloads and hardware environments, resulting in high tuning time costs.

Method used

The application provides a database configuration tuning method. On receiving a tuning request from a client, it collects a configuration sample set and analyzes it using a preset tuning model. The method checks the resulting configuration and repeats the analysis until the client's requirements are met. The tuning model is adapted to different environments using a covariate shift correction model and a reinforcement learning algorithm.

Benefits of technology

It improves tuning efficiency, reduces time costs, can be adaptively applied to NoSQL databases, recommends optimal configuration results, and is suitable for various database configuration tuning tasks.

✦ Generated by Eureka AI based on patent content.

[Figure: CN116150128B_ABST]

Abstract

The application provides a database configuration optimization method, device, equipment, and readable storage medium. When a database needs to be optimized, the method receives an optimization request from a target client and collects a target configuration sample set corresponding to the target database. The target configuration sample set is input into an optimization model for analysis to obtain a corresponding configuration result, and the method judges whether that result meets the optimization requirement. If not, the target configuration sample set is input into the optimization model again until the configuration result meets the optimization requirement of the target client; if so, the configuration result is fed back to the target client. The method is efficient, reduces the time cost of optimization, can be applied adaptively to NoSQL databases, and can recommend an optimal configuration for the database to be optimized. It is broadly applicable and can be adapted to different database configuration optimization tasks.

Description

Technical Field

[0001] This application relates to the field of database performance configuration technology, and in particular to a database configuration tuning method, apparatus, device and readable storage medium.

Background Technology

[0002] In today's big data era, massive amounts of data are generated every second. Network service providers must possess reliable database infrastructure to meet users' ever-increasing demands for massive amounts of data. To help users store and manage such vast amounts of data, most cloud service providers offer NoSQL database services. Typically, each NoSQL database has a large number of performance-related configuration parameters derived from the database and operating system kernel. However, current NoSQL databases only offer a default configuration suitable for general use cases; therefore, users need to adjust configuration parameters to improve the performance of the NoSQL database.

[0003] In practical applications, users typically perform optimization through experience-based expertise or by utilizing machine learning methods. However, manual optimization requires significant human resources and struggles to achieve optimal database performance. Existing machine learning optimization methods are costly to train and cannot quickly adapt to changing workloads and hardware environments, resulting in high time costs for optimization.

Summary of the Invention

[0004] This application aims to at least solve one of the aforementioned technical defects. In view of this, this application provides a database configuration tuning method, apparatus, device, and readable storage medium to solve the technical defect in the prior art that makes it difficult to efficiently tune the performance of a database.

[0005] A database configuration tuning method includes:

[0006] Receive a tuning request from the target client, wherein the tuning request from the target client includes tuning requirements and the target database that needs to be tuned;

[0007] Based on the tuning request from the target client, collect the target configuration sample set corresponding to the target database;

[0008] The target configuration sample set is input into a preset target tuning model for analysis to obtain the configuration results corresponding to the target database.

[0009] Determine whether the configuration result corresponding to the target database meets the tuning requirements of the target client;

[0010] If the configuration result corresponding to the target database does not meet the tuning requirements of the target client, then return to the operation of inputting the target configuration sample set into the preset target tuning model for analysis, until the configuration result corresponding to the target database output by the target tuning model meets the tuning requirements of the target client;

[0011] If the configuration result corresponding to the target database meets the tuning requirements of the target client, then the configuration result is fed back to the target client.

[0012] Preferably, before inputting the target configuration sample set into a preset target tuning model for analysis, the method further includes:

[0013] The target configuration sample set is matched with a preset tuning model to obtain a target tuning model that matches the target database.

[0014] Preferably, the step of matching the target configuration sample set with a preset tuning model to obtain a target tuning model that matches the target database includes:

[0015] The target configuration sample set is input into the preset tuning model to extract the configuration environment features in the target configuration sample set;

[0016] The configuration environment features of the target configuration sample set and the data in the preset experience replay pool are input into the preset covariate shift correction model to obtain the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model. The preset covariate shift correction model is trained by using the configuration environment features of the training configuration sample set and the data in the preset experience replay pool as training samples and the deviation information between the current environment corresponding to the training configuration sample set and the configuration environment corresponding to the preset tuning model as sample labels.

[0017] Based on the deviation information, the preset tuning model is adjusted using the covariate shift correction model to obtain a target tuning model that matches the target database.

[0018] Preferably, the training process of the target tuning model includes:

[0019] Collect target training samples and store them in a pre-defined experience replay pool;

[0020] The input parameter dimensions of the target tuning model are determined, and the network parameters of the target tuning model are set. The network parameters of the target tuning model include the agent of the target tuning model, the tuning environment of the target tuning model, the state of the target tuning model, the action of the target tuning model, the incentive parameters of the target tuning model, and the tuning strategy of the target tuning model.

[0021] The target training samples are input into the target tuning model according to the input parameter dimensions of the target tuning model, and the model is repeatedly trained to obtain the target tuning model of the database corresponding to the target training samples.
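As an illustration of how the action and the incentive (reward) parameters listed above might be represented, the following is a minimal sketch and not the implementation disclosed in this application. It assumes, hypothetically, that the agent emits an action vector in [-1, 1] that is scaled to configuration parameter ranges, and that the reward is the sum of the relative throughput gain and the relative latency reduction against a baseline. The function names, parameter names, and the reward formula are all assumptions.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical type: parameter name -> (lowest allowed value, highest allowed value).
Bounds = Dict[str, Tuple[float, float]]


def action_to_config(action: Sequence[float], bounds: Bounds) -> Dict[str, float]:
    """Map an action vector in [-1, 1] onto concrete configuration values.

    Assumption: one action component per tunable parameter, in the
    iteration order of `bounds`.
    """
    if len(action) != len(bounds):
        raise ValueError("action length must equal the number of parameters")
    config: Dict[str, float] = {}
    for value, (name, (low, high)) in zip(action, bounds.items()):
        clipped = max(-1.0, min(1.0, value))  # keep the action inside [-1, 1]
        config[name] = low + (clipped + 1.0) / 2.0 * (high - low)
    return config


def reward(throughput: float, latency: float,
           base_throughput: float, base_latency: float) -> float:
    """Hypothetical reward: relative throughput gain plus relative latency reduction."""
    gain = (throughput - base_throughput) / base_throughput
    reduction = (base_latency - latency) / base_latency
    return gain + reduction
```

In this sketch the state would be whatever metrics the tuning environment reports after a configuration is applied; that part is omitted because its form is not given in the text above.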

[0022] A database configuration tuning device, comprising:

[0023] The receiving unit is used to receive the tuning request from the target client, wherein the tuning request from the target client includes tuning requirements and the target database that needs to be tuned;

[0024] The collection unit is used to collect the target configuration sample set corresponding to the target database based on the tuning request of the target client.

[0025] The analysis unit is used to input the target configuration sample set into a preset target tuning model for analysis, and obtain the configuration results corresponding to the target database.

[0026] The judgment unit is used to determine whether the configuration result corresponding to the target database meets the tuning requirements of the target client; if the configuration result corresponding to the target database does not meet the tuning requirements of the target client, then return to the operation of inputting the target configuration sample set into the preset target tuning model for analysis, until the configuration result corresponding to the target database output by the target tuning model meets the tuning requirements of the target client;

[0027] The feedback unit is used to feed back the configuration result to the target client when the execution result of the judgment unit determines that the configuration result corresponding to the target database meets the tuning requirements of the target client.

[0028] Preferably, the device further includes:

[0029] The matching unit is used to match the target configuration sample set with a preset tuning model to obtain a target tuning model that matches the target database.

[0030] Preferably, the matching unit includes:

[0031] The first matching subunit is used to input the target configuration sample set into the preset tuning model to extract the configuration environment features in the target configuration sample set;

[0032] The second matching subunit is used to input the configuration environment features of the target configuration sample set and the data in the preset experience replay pool into a preset covariate shift correction model to obtain the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model. The preset covariate shift correction model is trained by using the configuration environment features of the training configuration sample set and the data in the preset experience replay pool as training samples and the deviation information between the current environment corresponding to the training configuration sample set and the configuration environment corresponding to the preset tuning model as sample labels.

[0033] The third matching subunit is used to adjust the preset tuning model based on the deviation information using the covariate shift correction model, so as to obtain a target tuning model that matches the target database.

[0034] Preferably, the training process of the target tuning model includes:

[0035] Collect target training samples and store them in a pre-defined experience replay pool;

[0036] The input parameter dimensions of the target tuning model are determined, and the network parameters of the target tuning model are set. The network parameters of the target tuning model include the agent of the target tuning model, the tuning environment of the target tuning model, the state of the target tuning model, the action of the target tuning model, the incentive parameters of the target tuning model, and the tuning strategy of the target tuning model.

[0037] The target training samples are input into the target tuning model according to the input parameter dimensions of the target tuning model, and the model is repeatedly trained to obtain the target tuning model of the database corresponding to the target training samples.

[0038] A database configuration tuning device includes: one or more processors, and memory;

[0039] The memory stores computer-readable instructions, which, when executed by the one or more processors, implement the steps of any of the database configuration tuning methods described above.

[0040] A readable storage medium storing computer-readable instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of any of the database configuration tuning methods described above.

[0041] As can be seen from the technical solutions described above, when the performance of a database needs to be tuned, the method provided in this application embodiment receives a tuning request from a target client and tunes the requested database on that basis. The tuning request includes the tuning requirements and the target database to be tuned. Based on the request, a target configuration sample set corresponding to the target database is collected, which helps improve the accuracy of the configuration result for the target database. The collected sample set is then input into a preset target tuning model for analysis to obtain the configuration result corresponding to the target database. Because this result does not necessarily meet the tuning requirements of the target client, the method next determines whether it does. If it does not, the method returns to the operation of inputting the target configuration sample set into the preset target tuning model for analysis, and repeats until the configuration result output by the target tuning model meets the tuning requirements of the target client. If it does, the configuration result is fed back to the target client.

[0042] Therefore, when it is necessary to optimize the configuration of a database, the optimization method provided in this application embodiment is highly efficient and effectively reduces the optimization time cost. It can also be adaptively applied to the optimization of NoSQL databases and can effectively recommend the optimal configuration result for the database that needs to be optimized. The method provided in this application embodiment has strong applicability and can be adapted to different database configuration optimization tasks.

Attached Figure Description

[0043] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0044] Figure 1 is a flowchart of a database configuration tuning method provided in an embodiment of this application;

[0045] Figure 2 is a schematic diagram of the framework of a tuning model provided in an embodiment of this application;

[0046] Figure 3 is a schematic diagram comparing the best relative throughput and latency of TD3, SAC, and DDPG, with and without fused causal reasoning, on a Redis database, provided in an embodiment of this application;

[0047] Figure 4 is a schematic diagram comparing the best relative throughput and latency of Meta TD3 and TD3 on a MongoDB database, provided in an embodiment of this application;

[0048] Figure 5 is a schematic diagram of the relationship between stimulus parameters and Q-values after Redis offline training, provided in an embodiment of this application;

[0049] Figure 6 is a schematic structural diagram of a database configuration tuning device provided in an embodiment of this application;

[0050] Figure 7 is a hardware structure block diagram of a database configuration tuning device disclosed in an embodiment of this application.

Detailed Implementation

[0051] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0052] While using expertise to fine-tune database performance is an effective method, it requires significant human resources and is unlikely to achieve optimal database performance.

[0053] To address this problem, machine learning methods have been introduced. They can be broadly categorized into two main types: Bayesian optimization-based methods and reinforcement learning-based methods.

[0054] Among them, the Bayesian optimization-based method treats the relationship between configuration and performance as a black box, and can quickly find the optimal configuration by using a heuristic search strategy in the absence of prior knowledge.

[0055] Reinforcement learning-based methods recommend optimal configurations by training a policy model. These methods typically use deep neural networks as the policy model. Neural networks can accept high-dimensional configuration parameters as input, making them well-suited for high-dimensional configuration spaces.

[0056] However, most machine learning methods focus only on the configuration of specific NoSQL databases, ignoring the impact of the operating system on database performance, which reduces the potential for performance improvements. Furthermore, NoSQL databases have a large number of configuration parameters, making it difficult for neural networks to converge and resulting in high training costs for existing machine learning methods.

[0057] Existing tuning methods, such as Hunter, QTune, and OtterTune, do not consider the impact of operating system configuration parameters on database performance.

[0058] However, operating system factors such as the CPU scheduler have a significant impact on database performance. Furthermore, NoSQL databases have a large number of configuration parameters. As the number of configuration parameters increases, the training cost of the tuning model also increases. Therefore, how to enable the tuning model to converge quickly and reduce the training cost is a key issue.

[0059] Furthermore, existing tuning methods cannot quickly adapt to constantly changing workloads and hardware environments, resulting in high tuning time costs. If the operating environments for online tuning and offline training differ, fine-tuning the model may be necessary to adapt to the unfamiliar environment. Fine-tuning the model requires significant time, and this high time cost is unacceptable for online tuning of NoSQL databases. Therefore, it is necessary to consider how to improve the model's generalization ability and reduce tuning time.

[0060] Given that most current database configuration tuning solutions are difficult to adapt to complex and ever-changing business needs, the applicant has researched a database configuration tuning solution. When it is necessary to tune the database configuration, the tuning method provided in this application is highly efficient and effectively reduces tuning time costs. It can also be adaptively applied to the tuning of NoSQL databases and can effectively recommend the optimal configuration result for the database that needs tuning. The method provided in this application has strong applicability and can be adapted to different database configuration tuning tasks.

[0061] The methods provided in this application can be used in a variety of general-purpose or special-purpose computing device environments or configurations. For example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor devices, distributed computing environments including any of the above devices, etc.

[0062] This application provides a database configuration optimization method, which can be applied to various database management systems, as well as various computer terminals or smart terminals. The executing entity can be the processor or server of the computer terminal or smart terminal.

[0063] The process of the database configuration tuning method provided in the embodiments of this application is described below with reference to Figure 1. As shown in Figure 1, the process may include the following steps:

[0064] Step S101: Receive the tuning request from the target client. The tuning request from the target client includes tuning requirements and the target database that needs to be tuned.

[0065] Specifically, in practical applications, when it is necessary to optimize the configuration or performance of a database, the optimization requirements can be determined first.

[0066] Therefore, when it is necessary to optimize the database requested by the target client, the method provided in this application embodiment can receive the optimization request from the target client. The optimization request from the target client includes optimization requirements and the target database that needs to be optimized, so that the target database can be optimized according to the optimization request from the target client.

[0067] Step S102: Based on the tuning request from the target client, collect the target configuration sample set corresponding to the target database.

[0068] Specifically, in practical applications, in order to improve the tuning effect on the database, some configuration samples of the database need to be analyzed to determine the most suitable configuration scheme for the database to be tuned.

[0069] Therefore, in order to quickly complete the optimization of the target database, a target configuration sample set corresponding to the target database can be collected based on the optimization request of the target client.

[0070] The target configuration sample set may include at least one configuration sample from the target database. Each configuration sample in the target configuration sample set may be a configuration scheme set according to the tuning request of the target client.

[0071] Step S103: Input the target configuration sample set into the preset target tuning model for analysis to obtain the configuration results corresponding to the target database.

[0072] Specifically, as described above, the method provided in this application embodiment can collect the target configuration sample set corresponding to the target database based on the tuning request of the target client.

[0073] The target configuration sample set includes at least one configuration sample created based on the tuning request of the target client.

[0074] To obtain the configuration result most suitable for the target database, the target configuration sample set can be input into a preset target tuning model for analysis to obtain the configuration result corresponding to the target database.

[0075] Step S104: Determine whether the configuration result corresponding to the target database meets the tuning requirements of the target client.

[0076] Specifically, in practical applications, the configuration results of the target database obtained by the target tuning model may not meet the tuning requirements of the target client. If the configuration results corresponding to the target database do not meet the tuning requirements of the target client, it means that the currently obtained configuration results do not meet the requirements, and the target configuration sample needs to be re-analyzed in order to obtain configuration results that can meet the tuning requirements of the target client.

[0077] Therefore, after determining that the configuration result corresponding to the target database does not meet the tuning requirements of the target client, the method can return to the operation of inputting the target configuration sample set into the preset target tuning model for analysis, until the configuration result corresponding to the target database output by the target tuning model meets the tuning requirements of the target client.

[0078] If the configuration result corresponding to the target database meets the tuning requirements of the target client, it means that the current output configuration result of the target tuning model can be fed back to the target client, and then step S105 can be executed.

[0079] Step S105: Feed back the configuration result to the target client.

[0080] Specifically, as described above, after obtaining the configuration result of the target database, the method provided in this application embodiment can further determine whether the configuration result corresponding to the target database meets the tuning requirements of the target client.

[0081] If the configuration result corresponding to the target database meets the tuning requirements of the target client, it means that the current output configuration result of the target tuning model can be fed back to the target client. In this case, the configuration result can be directly fed back to the target client so that the target client can complete the configuration of the target database and achieve better performance of the target database.
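Steps S101 to S105 can be sketched as a simple control loop. The following is a minimal illustration under stated assumptions, not the disclosed implementation: the tuning requirement is assumed to be a minimum score, `collect_samples`, `model`, and `evaluate` are hypothetical callables supplied by the caller, and the `max_rounds` bound is an added safeguard that the text above does not mention.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Config = Dict[str, float]  # hypothetical: parameter name -> value


@dataclass
class TuningRequest:
    """S101: a request names the target database and the tuning requirement."""
    database: str
    min_score: float  # assumed form of the requirement, e.g. a throughput floor


def tune(request: TuningRequest,
         collect_samples: Callable[[TuningRequest], List[Config]],
         model: Callable[[List[Config]], Config],
         evaluate: Callable[[Config], float],
         max_rounds: int = 50) -> Optional[Config]:
    """Repeat the model analysis until the result meets the requirement."""
    samples = collect_samples(request)              # S102: collect the sample set
    for _ in range(max_rounds):
        config = model(samples)                     # S103: analyze with the tuning model
        if evaluate(config) >= request.min_score:   # S104: check the requirement
            return config                           # S105: feed back to the client
    return None  # assumption: give up after max_rounds instead of looping forever
```

Returning `None` when the bound is reached lets the caller distinguish an unmet requirement from a recommended configuration.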

[0082] As can be seen from the above-described technical solutions, when it is necessary to adjust the performance of the database, the method provided in this application embodiment has high tuning efficiency and effectively reduces the tuning time cost. It can also be adaptively applied to the tuning of NoSQL databases and can effectively recommend the optimal configuration result of the database that needs to be tuned. The method provided in this application embodiment has strong applicability and can be adapted to different database configuration tuning work, which can enable the database to achieve better performance.

[0083] In practical applications, in order to obtain a database tuning model that is more suitable for the target client's tuning request, the method provided in this application embodiment can further fine-tune the preset tuning model before inputting the target configuration sample set into the preset target tuning model for analysis. The target configuration sample set is then matched with the preset tuning model to obtain a database tuning model that is more suitable for the target client's tuning request. This process is described below and may include the following steps:

[0084] Step S201: Input the target configuration sample set into the preset tuning model to extract the configuration environment features in the target configuration sample set.

[0085] Specifically, in practical applications, the operating system environment and the current configuration environment may both affect the tuning effect of the tuning model.

[0086] Therefore, to obtain a tuning model better suited to the database in the target client's request, before inputting the target configuration sample set into the preset target tuning model for analysis, the method provided in this application embodiment can use the preset tuning model to extract the configuration environment features in the target configuration sample set, so that the tuning model can be fine-tuned according to those features.

[0087] Step S202: Input the configuration environment features of the target configuration sample set and the data in the preset experience replay pool into the preset covariate shift correction model to obtain the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model.

[0088] Specifically, as described above, before inputting the target configuration sample set into the preset target tuning model for analysis, the method provided in this application embodiment can input the target configuration sample set into the preset tuning model to extract the configuration environment features in the target configuration sample set.

[0089] The current configuration environment, as reflected in the configuration environment features of the target configuration sample set, may differ from the training configuration environment corresponding to the preset tuning model, and this difference may affect the tuning analysis of the tuning model. Therefore, after extracting the configuration environment features in the target configuration sample set, the configuration environment features in the target configuration sample set and the data in the preset experience replay pool can be input into the preset covariate shift correction model to obtain the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model.

[0090] The preset covariate shift correction model can be trained using the configuration environment features in the training configuration sample set and the data in the preset experience replay pool as training samples, and the deviation information between the current environment corresponding to the training configuration sample set and the configuration environment corresponding to the preset tuning model as sample labels.

[0091] Step S203: Based on the deviation information, adjust the preset tuning model using the covariate shift correction model to obtain a target tuning model that matches the target database.

[0092] Specifically, as described above, the method provided in this application embodiment can utilize the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model. Determining the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model can help to correct the tuning model.

[0093] Therefore, after determining the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model, the preset tuning model can be adjusted using the covariate shift correction model based on the deviation information to obtain a target tuning model that matches the target database.

[0094] As can be seen from the above-described technical solutions, when the performance of the database needs to be tuned, the method provided in this application embodiment can determine the configuration environment features of the target configuration sample set and then the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model. The preset tuning model is then adjusted according to the deviation information using the covariate shift correction model to obtain a target tuning model that matches the target database, so that the target tuning model performs better.

[0095] As described above, the method provided in this application embodiment can analyze the target sample set using the target tuning model to obtain the configuration results of the database requested for tuning by the target client. The training process of the target tuning model will be described next, which may include the following steps:

[0096] Step S301: Collect target training samples and store them in a preset experience replay pool.

[0097] Specifically, in practical applications, in order to improve the training efficiency of the tuning model and the utilization rate of the collected data samples, the method provided in this application embodiment can collect target training samples and store them in a preset experience replay pool before training the tuning model.

[0098] Wherein,

[0099] The number of samples in the preset experience replay pool is no less than 200.

[0100] Storing the target training samples in the preset experience replay pool can facilitate the reuse of the target training samples to train and optimize the model.

[0101] Step S302: Determine the input parameter dimensions of the target tuning model and set the network parameters of the target tuning model.

[0102] Specifically, in general, the parameter space of a database is very large.

[0103] For example, NoSQL has a very large parameter space.

[0104] To improve the tuning speed of the tuning model, the dimensionality of the input parameters of the tuning model can be reduced.

[0105] Therefore, after collecting the target training samples, the input parameter dimensions of the target tuning model can be further determined, and the network parameters of the target tuning model can be set.

[0106] The network parameters of the target tuning model may include the agent of the target tuning model, the tuning environment of the target tuning model, the state of the target tuning model, the action of the target tuning model, the reward parameters of the target tuning model, and the tuning policy of the target tuning model.

[0107] Determining the input parameter dimensions of the target tuning model and setting the network parameters of the target tuning model can help reduce the training cost and improve the tuning efficiency of the target tuning model.

[0108] Step S303: Input the target training samples into the target tuning model according to the input parameter dimensions of the target tuning model and train repeatedly to obtain the target tuning model of the database corresponding to the target training samples.

[0109] Specifically, as described above, the method provided in this application embodiment can determine the input parameter dimension of the target tuning model and set the network parameters of the target tuning model.

[0110] After setting the input parameter dimensions and network parameters of the target tuning model, the target training samples can be input into the target tuning model according to those input parameter dimensions for repeated training, to obtain the target tuning model of the database corresponding to the target training samples.

[0111] This yields a target tuning model that produces better configuration results.

[0112] As can be seen from the technical solutions described above, the method provided in this application can collect target training samples and store them in a preset experience replay pool. After setting the input parameter dimensions and network parameters of the target tuning model, the target training samples can be input into the target tuning model according to those input parameter dimensions for repeated training, to obtain the target tuning model of the database corresponding to the target training samples. This yields a target tuning model that produces better configuration results, effectively improves the utilization rate of the training samples, and enhances the prediction quality and prediction speed of the tuning model.

[0113] Next, the implementation process of the database configuration tuning method provided in the embodiments of this application is described with reference to Figures 2-5:

[0114] Figure 2 shows an example framework diagram of the tuning model. As shown in Figure 2, the implementation process of the method provided in this embodiment can include two parts: offline model training and online tuning. The implementation process of these two parts will be described in detail below.

[0115] 1. Offline training process of the model:

[0116] In the offline model training process, a certain number of high-quality configuration samples are first collected using a Bayesian optimization algorithm. After a sufficient number of high-quality configuration samples have been obtained, a random forest algorithm is applied to them to identify the important configuration parameters, thereby reducing the dimensionality of the configuration parameters. The high-quality configuration samples are then stored in a priority experience replay pool. Finally, the reinforcement learning model is trained and optimized using a method based on meta-learning and causal inference.

[0117] The detailed steps of offline model training may include the following:

[0118] (1) Collect high-quality samples

[0119] In practical applications, Bayesian optimization (BO) algorithms are mainly used to solve computationally expensive black-box optimization problems.

[0120] BO is a method that uses Bayes' theorem to guide the search to find the minimum or maximum value of an objective function.

[0121] Specifically, the main approach is to utilize previously observed prior knowledge to perform the next optimization during each iteration.

[0122] BO's two key components are the surrogate model and the acquisition function. BO is well-suited for solving NoSQL parameter optimization problems where the objective function is unknown and computationally expensive to evaluate.

[0123] In practical applications, the cold start problem is particularly prominent when tuning reinforcement learning models. High-quality configuration samples in the early stages of training can solve the startup problem encountered in reinforcement learning and effectively accelerate the training speed of the model.

[0124] Therefore, before training and optimizing the model, BO can be used to generate high-quality configuration samples, thereby warming up the reinforcement learning model.

[0125] Considering that traditional Bayesian optimization (BO) cannot handle high-dimensional input spaces, the HeSBO tool can be used to reduce the input dimension of the model during training and tuning. The HeSBO tool is a variant of random linear projection.

[0126] Assuming the original D-dimensional search space is H and the target dimension is d, the HeSBO tool can define a d-dimensional search space $L = [-1, 1]^d$.

[0127] According to previous research, the best results are achieved when d is 10%-20% of D.

[0128] After defining the d-dimensional search space, a random projection matrix can be generated as follows:

[0129] $M \in \mathbb{R}^{D \times d}$

[0130] In matrix M, each row contains exactly one non-zero element, set to ±1. The column index and the sign of that element are sampled independently and uniformly at random for each row. Essentially, M provides a one-to-many mapping: each original parameter in H is controlled by one composite parameter in L, and each composite parameter can control multiple original parameters.
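The construction of M described above can be sketched as follows. This is a minimal illustration, not the HeSBO tool itself: the function names, the seed handling, and the example sizes D = 20 and d = 3 are assumptions introduced here.

```python
import random

def make_projection(D, d, seed=0):
    """For each of the D original parameters, draw the controlling column
    (uniform over d) and a sign (uniform over -1/+1), independently."""
    rng = random.Random(seed)
    cols = [rng.randrange(d) for _ in range(D)]
    signs = [rng.choice((-1, 1)) for _ in range(D)]
    return cols, signs

def to_dense(cols, signs, d):
    """Expand the sparse description into the D x d matrix M."""
    return [[signs[i] if j == cols[i] else 0 for j in range(d)]
            for i in range(len(cols))]

def project_up(y, cols, signs):
    """Map a point y in L = [-1, 1]^d to x = M y in the original space H."""
    return [signs[i] * y[cols[i]] for i in range(len(cols))]

# Illustrative sizes only (hypothetical values).
D, d = 20, 3
cols, signs = make_projection(D, d)
M = to_dense(cols, signs, d)
y = [0.5, -1.0, 0.25]          # a point proposed by BO in the low-dim space
x = project_up(y, cols, signs)  # the corresponding full configuration
```

Because every row of M has a single ±1 entry, every coordinate of x stays inside [-1, 1], so the projected point never needs clipping before it is rescaled to real parameter ranges.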

[0131] After transforming the input dimensions of the optimized model, it can be processed using a standard Bayesian optimization algorithm.

[0132] To obtain a better tuning model, a Gaussian process can be chosen as a surrogate model for BO in practical applications.

[0133] In practical applications, EI acquisition functions are generally used to recommend the next better configuration.

[0134] Database systems can typically return the corresponding throughput and 95th-percentile latency.

[0135] Let $K = \{k_1, k_2, \ldots, k_n\}$ represent the configurations recommended in each iteration;

[0136] $M = \{M_1, M_2, \ldots, M_n\}$ represents the feedback obtained from deploying each configuration;

[0137] Since the Bayesian optimization algorithm can only optimize one objective, the following formulas (1) and (2) can be used to combine the two indicators in $M_t$ into the reward parameter $R_t$ of the tuning model:

[0138] Here, $M_t$ represents the feedback obtained from deploying the configuration in the t-th iteration, where the feedback may include throughput and 95th-percentile latency;

[0139] $R_t$ represents the reward parameter in the t-th iteration.

[0140]

[0141]

[0142] Here, E(x) is an amplification function, which aims to expand the range of the reward parameter, thereby making it easier for the tuning model to obtain evaluation information for the sample;

[0143] $T_{cur}$ and $T_{def}$ represent the throughput obtained with the current configuration and the default configuration, respectively.

[0144] $L_{cur}$ and $L_{def}$ represent the 95th-percentile latency obtained with the current configuration and the default configuration, respectively.

[0145] β∈[0,1] is the weight ratio of throughput and latency. The larger β is, the more the reinforcement learning algorithm will favor throughput; the smaller β is, the more it will favor 95th-percentile latency. The value of β can be set by the user.

[0146] Let $S_t = \{k_t, R_t\}$ be the sample added to the observation set in the t-th iteration.

[0147] In a Gaussian process, each point in the continuous input space is associated with a normally distributed random variable.

[0148] The distribution of a Gaussian process is the joint distribution of an infinite number of random variables. For this reason, a Gaussian process is the distribution of a function over a continuous domain.

[0149] When the observation set grows from $S_t$ to $S_{t+1}$, the Gaussian process model can be easily updated.

[0150] The EI acquisition function is then used to recommend the next optimal configuration, and this process is repeated until the maximum number of iterations is reached or performance does not improve for an extended period. With the help of BO, high-quality training samples can be obtained quickly, which can help to rapidly optimize and tune the model.
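As an illustration of the recommendation step, the sketch below evaluates the standard closed-form expected improvement (EI) for a maximization objective, given the Gaussian process posterior mean and standard deviation at each candidate. The function names, the `xi` exploration margin, and the candidate tuples are assumptions made for this example; the source does not specify how EI is computed.

```python
import math

def expected_improvement(mu, sigma, best, xi=0.0):
    """Closed-form EI for maximization.

    mu, sigma: GP posterior mean and standard deviation at the candidate.
    best: best reward observed so far. xi: optional margin (assumption).
    """
    improvement = mu - best - xi
    if sigma <= 0.0:
        # No posterior uncertainty: EI reduces to the positive part.
        return max(improvement, 0.0)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + sigma * pdf

def recommend(candidates, best):
    """candidates: list of (config, mu, sigma). Return the config with max EI."""
    return max(candidates,
               key=lambda c: expected_improvement(c[1], c[2], best))[0]

# Hypothetical candidates: "b" has a lower mean but far more uncertainty.
choice = recommend([("a", 0.4, 0.01), ("b", 0.3, 0.5)], best=0.5)
```

In this toy case the uncertain candidate is chosen, which shows how EI trades off exploiting a high predicted mean against exploring poorly understood regions.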

[0151] (2) Reduce parameter dimensionality

[0152] Generally, NoSQL databases have very large parameter spaces. To reduce the input dimensionality of the model, a random forest algorithm can be used to compute the feature importance in the parameter space, based on the high-quality samples generated by HeSBO.

[0153] Random forest is a supervised machine learning method that can be used to handle classification and regression problems. A random forest is a classifier that contains multiple decision trees, and the class it outputs can be determined by the mode of the classes of each tree.

[0154] During model training, classification and regression trees (CARTs) can be selected to construct decision trees. The random forest algorithm consists of 500 CARTs.

[0155] First, input G features, where each feature is a subset of the configuration and the label is its corresponding performance.

[0156] Secondly, a training dataset can be obtained by randomly sampling n times with replacement from the 200 training samples obtained from BO. After obtaining the training dataset, the unsampled samples can be used as the test dataset to evaluate its error.

[0157] For each node, G features are randomly selected, and the decision for each node in the decision tree can be determined based on these features.

[0158] Based on these G features, CART constructs a tree structure from the root node using Gini impurity, and the optimal features (i.e., configuration parameters) can split from the node to generate new leaf nodes. Each CART grows fully without being pruned.

[0159] Based on the obtained CARTs, the average reduction in Gini impurity of 500 CARTs can be used to determine the importance of each configuration parameter.

[0160] Finally, the input configuration parameters can be filtered based on these values to obtain a parameter importance ranking list. After obtaining the parameter importance ranking list, the top P configuration parameters can be selected as the input space for tuning the reinforcement learning model. The size of P can be set by the user.
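The importance computation and top-P selection can be sketched as follows. This is a simplified illustration rather than a full random forest: it shows the Gini impurity, the impurity decrease of a single split, the averaging of per-tree importances, and the final ranking. All parameter names and importance values are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity reduction achieved by splitting parent into left and right."""
    n = len(parent)
    return (gini_impurity(parent)
            - len(left) / n * gini_impurity(left)
            - len(right) / n * gini_impurity(right))

def mean_importance(per_tree):
    """Average each parameter's impurity decrease over all trees."""
    names = {name for tree in per_tree for name in tree}
    return {name: sum(tree.get(name, 0.0) for tree in per_tree) / len(per_tree)
            for name in names}

def top_p_parameters(importance, p):
    """Rank parameters by importance and keep the top P."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:p]]

# Hypothetical per-tree importances for three made-up parameters.
trees = [
    {"cache_size": 0.5, "max_connections": 0.1, "compaction_threads": 0.3},
    {"cache_size": 0.3, "max_connections": 0.1, "compaction_threads": 0.3},
]
selected = top_p_parameters(mean_importance(trees), p=2)
```

In the described setup the averaging would run over 500 trees, and P is a user-chosen setting.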

[0161] (3) Store configuration sample set

[0162] After obtaining the training sample set, the method provided in this application embodiment designs a multi-task priority experience replay pool for experience replay.

[0163] Samples of the form $\{E, P_T, S_T, S_{T+1}, a_T, r_T, S_{N:T}, a_{N:T}, r_{N:T}\}$ can be stored in the experience replay pool.

[0164] Where E is the name of the tuning environment;

[0165] $S_T$ is the current state;

[0166] $a_T$ is the action performed by the Actor based on the current state and the features of the current environment;

[0167] $S_{T+1}$ and $r_T$ are the state of the environment and the reward parameter after performing action $a_T$;

[0168] $S_{N:T}$, $a_{N:T}$, and $r_{N:T}$ are the states, actions, and reward parameters of environment E collected from time N to time T;

[0169] $P_T$ is the priority of the sample, calculated using the following formulas:

[0170] $\delta_T = |Q_1 - Q_2|$

[0171] $P_T = \delta_T + |r_T|$

[0172] Where $Q_1$ is the current Q function and $Q_2$ is the target Q function.

[0173] $\delta_T$ is the temporal difference error. A large temporal difference error indicates that the current Q-function is still far from the target Q-function, and more of these samples should be used for training. Therefore, the temporal difference error can be used to measure the value of a sample.

[0174] In actual training, in addition to focusing on samples with large temporal difference errors, samples with very large or very small reward parameters also need to be considered. Such samples are sparse, but they have a significant impact on the target Q-value and can provide more policy update information.

[0175] Therefore, during training, both the temporal difference error and the reward parameter can be considered simultaneously, allowing samples with large temporal difference errors and very large or very small reward parameters to be used multiple times. This enables the model to fully learn the knowledge from these high-value samples.
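A minimal sketch of such a pool is shown below. The priority follows the formulas given above (temporal difference error plus the absolute reward). Sampling with probability proportional to priority, the class name `PriorityReplayPool`, and the small epsilon are assumptions for illustration; the source does not state the exact sampling rule.

```python
import random

def priority(q_current, q_target, reward):
    """P_T = delta_T + |r_T|, with delta_T = |Q1 - Q2|."""
    return abs(q_current - q_target) + abs(reward)

class PriorityReplayPool:
    """Toy replay pool that replays high-priority samples more often."""

    def __init__(self, seed=0):
        self.samples = []
        self.priorities = []
        self.rng = random.Random(seed)

    def add(self, sample, q_current, q_target, reward):
        self.samples.append(sample)
        self.priorities.append(priority(q_current, q_target, reward))

    def sample(self, k):
        # Assumption: draw with replacement, proportional to priority.
        # The epsilon keeps zero-priority samples drawable.
        weights = [p + 1e-6 for p in self.priorities]
        return self.rng.choices(self.samples, weights=weights, k=k)

pool = PriorityReplayPool(seed=1)
pool.add("low", q_current=0.1, q_target=0.1, reward=0.1)    # priority 0.1
pool.add("high", q_current=3.0, q_target=1.0, reward=-3.0)  # priority 5.0
draws = pool.sample(1000)
```

Note that a strongly negative reward raises the priority just as a strongly positive one does, which matches the intent of replaying both very large and very small rewards.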

[0176] (4) Optimize the reinforcement learning model

[0177] Most existing reinforcement learning-based database tuning methods employ the DDPG algorithm. DDPG is able to learn more effectively in a continuous action space.

[0178] DDPG combines DQN and Actor-Critic, and is a model-free, off-policy reinforcement learning method.

[0179] DDPG's Critic module is prone to overestimating the Q-value, which can be fatal in practice.

[0180] Therefore, the TD3 algorithm can be used in actual training.

[0181] Compared to DDPG, TD3 has three main advantages.

[0182] (1) TD3 can use two Critics to estimate different Q values. By selecting the smallest Q value as the target Q value, the possibility of overestimating the Q value can be reduced.

[0183] (2) TD3 can delay the Actor's updates so that the Actor is updated less frequently than the Critic, making the training of the Actor more stable.

[0184] (3) TD3 adds random noise to the action output by the Actor target network to increase the stability of the algorithm.
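The three mechanisms above can be sketched in a few lines. This is a generic TD3 target computation under stated assumptions, not the patent's implementation: the networks are passed in as plain callables, and the hyperparameter defaults (`gamma`, `noise_std`, `noise_clip`, `policy_delay`) are illustrative values chosen here.

```python
import random

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def td3_target(reward, next_state, actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               a_low=-1.0, a_high=1.0, rng=random):
    """Target Q-value using clipped noise and the smaller of two Critics."""
    # (3) Add clipped random noise to the target Actor's action.
    noisy_action = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             a_low, a_high)
        for a in actor_target(next_state)
    ]
    # (1) Evaluate both target Critics and keep the smaller estimate.
    q1 = critic1_target(next_state, noisy_action)
    q2 = critic2_target(next_state, noisy_action)
    return reward + gamma * min(q1, q2)

def should_update_actor(step, policy_delay=2):
    """(2) Update the Actor only every policy_delay Critic updates."""
    return step % policy_delay == 0

# Toy stand-ins for the target networks (hypothetical).
y = td3_target(
    reward=1.0, next_state=[0.0],
    actor_target=lambda s: [0.5],
    critic1_target=lambda s, a: 2.0,
    critic2_target=lambda s, a: 1.0,
    gamma=0.5, noise_std=0.0,
)
```

With the toy Critics returning 2.0 and 1.0, the target uses 1.0, which is how the minimum counteracts overestimation.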

[0185] Meta-reinforcement learning methods can adapt to new tasks by aggregating historical experience into the latent representation upon which the policy is based. The method provided in this application can solve the tuning problem using offline meta-reinforcement learning.

[0186] By fusing meta-learning and TD3, neural networks can be used to transform features from different workloads or hardware environments into context, which can then be used as input to the Actor.

[0187] The six elements involved in the optimization model provided in this application embodiment may include the following:

[0188] (a) Agent: The Agent is a TD3 model consisting of an Actor and a Critic. The Actor's input is the current state and the features of the current environment, and its output is an action. The Critic's input is the current state, the current action, and the features of the current tuning environment, and its output is the Q-value. The Actor updates based on feedback from the Critic, and the Critic updates based on the current reward parameter. The Agent's goal is to adjust its actions according to the tuning environment to maximize the reward parameter, thereby meeting the database performance requirements.

[0189] (b) Tuning Environment: This can include NoSQL environments and operating system environments. Meta-training can be performed offline using several NoSQL tuning environments. Different workloads, hardware, and NoSQL databases constitute different environments. In the implementation, the environment can be decoupled from the tuning algorithm.

[0190] (c) State: State is the current observation of the tuning environment by the tuning system. State can be divided into database state and operating system state, encompassing categories such as CPU, memory, network, hard disk, and file system status. The method for obtaining state differs for each database. For example, MongoDB uses the "serverStatus" and "mongostat" commands to obtain relevant variables, while Cassandra uses the "nodetool" command. The operating system uses the "vmstat" command built into Linux to obtain its state.

[0191] (d) Action: Actions are the configuration of a NoSQL database and can include important configuration parameters filtered by the random forest.

[0192] (e) Reward Parameters: After the action is converted into a configuration and deployed to the actual environment, the current database throughput and 95% latency are obtained through stress testing. Then, the current database throughput and 95% latency are substituted into the above formulas (1) and (2) to obtain the reward parameters of the tuning model, which describe the difference between the current performance and the default performance.

[0193] (f) Tuning policy: The neural network needs to learn the optimal policy function, which generates actions based on environmental features and states. Its goal is to maximize the reward parameter.

[0194] By combining meta-learning and reinforcement learning, this application creates TD3-Context, TD3-Actor, and TD3-Critic. TD3-Context uses a two-layer recurrent neural network GRU.

[0195] By comparison, an LSTM has more parameters and higher training accuracy.

[0196] Compared to LSTM, GRU has a simpler structure, fewer parameters, and is easier to converge. TD3-Actor and TD3-Critic can be implemented using simple neural networks.

[0197] Assume the distribution of the tuning environments is p(E):

[0198] Each task is a Markov Decision Process (MDP), consisting of a set of states, actions, transition functions, and reward functions.

[0199] The reward function is pre-defined, while the transition function is learned through training.

[0200] In actual training, a set of different tuning environments p(E) can be given. The meta-training process can first transform the historical context c of a certain environment into the features of the environment, so that the policy can quickly adapt to the unfamiliar environment.

[0201] Let c denote the context of environment E at time t.

[0202] Here, the context is the experience of environment E collected from time N to time T.

[0203] Inputting this context into TD3-Context outputs the features of that environment.

[0204] Secondly, TD3-Actor can learn and optimize policies through policy functions.

[0205] TD3-Critic uses the Q function to evaluate the quality of the policy function.

[0206] To accelerate model convergence, this application may use causal inference as the exploitation strategy of the reinforcement learning model. The main purpose of causal inference is to infer causal relationships between variables from observed data.

[0207] Correlation-based methods cannot uncover the essential relationship between parameters and performance, while causal reasoning can solve this problem.

[0208] Causal inference provides two functions:

[0209] 1) Intervention: assessing the change in effect caused by a given intervention;

[0210] 2) Explanation: explaining why a certain configuration can lead to high performance.

[0211] Causal reasoning comprises two phases: the information phase and the query phase. The information phase uncovers the causal structure. Typical algorithms based on conditional independence constraints include PC and FCI. The PC algorithm operates under the condition of no confounding factors, while FCI can produce near-correct results even in the presence of confounding factors.

[0212] During offline training, it may be impossible to observe all state factors in the actual system being tuned. Furthermore, the more accurate the causal structure, the more accurate the subsequent intervention recommendations will be. Therefore, during actual training, the FCI algorithm can be executed based on samples from the experience replay pool for causal discovery.

[0213] Next, a directed acyclic mixed graph (DAG) can be constructed based on the discovered causal relationships. During the query phase, interventions can be applied to specific parameters based on the discovered causal structure to fully utilize high-quality samples, thereby accelerating model convergence. In the actual training process, configurations can be generated based on the DAG.

[0214] The model provided in this application embodiment can use causal effects to determine the next NoSQL configuration, that is, to select the parameter that has the greatest impact on the target performance for modification, thereby realizing intervention in the tuning system.

[0215] Assuming that all confounding factors can be identified through observational data, the model provided in this application embodiment can use backdoor adjustment to calculate the causal effect between configuration and target performance. The parameter with the largest causal effect is selected for adjustment, while the other parameters keep the values that provide the best performance under the current environment.
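For reference, the standard backdoor adjustment formula is given below. The symbols are assumptions introduced here and do not appear in the source: X is the configuration parameter being intervened on, Y is the target performance, and Z is a set of observed variables that satisfies the backdoor criterion relative to (X, Y).

```latex
P\big(Y \mid \mathrm{do}(X = x)\big) \;=\; \sum_{z} P\big(Y \mid X = x,\, Z = z\big)\, P(Z = z)
```

Under this formula, the causal effect of changing a parameter from $x$ to $x'$ can be expressed as $\mathbb{E}[Y \mid \mathrm{do}(X = x')] - \mathbb{E}[Y \mid \mathrm{do}(X = x)]$, and the parameter with the largest such effect would be the one selected for adjustment.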

[0216] Finally, the model provided in this application embodiment can embed causal reasoning into the Meta-Reinforcement Learning method (MetaTD3).

[0217] The action selection strategy is briefly described below: A small sample size leads to low accuracy in causal discovery; therefore, causal inference is only enabled when the sample size in the experience replay pool exceeds 200. Once enabled, causal inference is used 30% of the time in action selection. The probability of using causal inference decreases as the number of model training iterations increases.
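The action selection strategy can be sketched as follows. The threshold of 200 samples and the initial 30% probability come from the description above. The exponential decay schedule, the `decay` value, and the function names are assumptions, since the source only states that the probability decreases with the number of training iterations.

```python
import random

def causal_probability(iteration, base=0.3, decay=0.99):
    """Probability of using causal inference at a given training iteration.

    Assumption: exponential decay from the initial 30%.
    """
    return base * (decay ** iteration)

def select_action(pool_size, iteration, policy_action, causal_action,
                  rng=random, min_samples=200):
    """Choose between the causal-inference action and the policy's action."""
    # Causal inference is enabled only once the pool exceeds 200 samples.
    if pool_size > min_samples and rng.random() < causal_probability(iteration):
        return causal_action()
    return policy_action()

# Hypothetical usage with placeholder action generators.
action = select_action(
    pool_size=150, iteration=0,
    policy_action=lambda: "actor_output",
    causal_action=lambda: "causal_intervention",
)
```

With only 150 samples in the pool, the call above always falls back to the Actor's output.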

[0218] 2. Online optimization

[0219] In online tuning, after receiving a tuning request from the client, the server can collect configuration samples from the client's database. These collected client database configuration samples, along with the configuration samples from offline training, are then input into the tuning reinforcement learning model. After fine-tuning the model, a tuning model adapted to the client's database is obtained. After the model outputs a configuration, it can be filtered according to rules. If the configuration does not meet the tuning requirements, the model continues to output new configurations until the optimal configuration is obtained. Finally, the configuration is output to the client, thus completing the database tuning process.
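The request-handling loop described above can be outlined as follows. This is a schematic sketch only: `model` and `meets_requirements` are hypothetical callables standing in for the adapted tuning model and the requirement check, and the `max_rounds` cap is an assumption added so that the loop always terminates.

```python
def tune(sample_set, model, meets_requirements, max_rounds=50):
    """Ask the model for configurations until one meets the requirements.

    Returns the accepted configuration, or None if max_rounds is exhausted.
    """
    for _ in range(max_rounds):
        config = model(sample_set)
        if meets_requirements(config):
            return config  # fed back to the client
    return None

# Hypothetical model that proposes increasingly large cache sizes.
proposals = iter([{"cache_mb": 64}, {"cache_mb": 128}, {"cache_mb": 256}])
result = tune(
    sample_set=[],
    model=lambda samples: next(proposals),
    meets_requirements=lambda cfg: cfg["cache_mb"] >= 128,
)
```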

[0220] The online optimization process can include the following steps:

[0221] (1) Collect configuration sample set

[0222] During the online tuning phase, this application can use the trained model for tuning. In practical applications, the tuning environment and the training environment are not exactly the same, and may even differ significantly. Therefore, it is necessary to quickly adapt to the new environment. Before adapting to the new environment, it may be necessary to collect a configuration sample set. This application can use Bayesian optimization to quickly collect a high-quality configuration sample set, and the specific process is similar to the sample collection process in the offline training of the model described above.

[0223] (2) Model Adaptation

[0224] After collecting the configuration sample set, this application can input the collected configuration samples into TD3-Context to extract environmental features. A covariate offset correction model is then used, taking these features and data from the experience replay pool as input, to output the deviation information between the current environment and the training environment. The covariate offset correction model is implemented using a logistic regression algorithm. This model is then used to adjust and optimize the reinforcement learning model, resulting in a model adapted to the current environment.
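One common way to estimate the deviation between two environments with logistic regression is to train a classifier that separates training-environment features from current-environment features and convert its output into importance weights. The sketch below illustrates that general technique in pure Python. It is an assumption about how such a correction could work, not the patent's stated procedure, and all names and data are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_domain_classifier(train_feats, current_feats, lr=0.1, epochs=500):
    """Logistic regression with labels 0 = training env, 1 = current env."""
    data = [(x, 0.0) for x in train_feats] + [(x, 1.0) for x in current_feats]
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                gw[j] += err * x[j]
            gb += err
        # Batch gradient descent step on the mean log-loss gradient.
        w = [wi - lr * g / len(data) for wi, g in zip(w, gw)]
        b -= lr * gb / len(data)
    return w, b

def importance_weight(x, w, b, n_train, n_current):
    """Estimated density ratio p_current(x) / p_train(x)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    p = min(max(p, 1e-6), 1.0 - 1e-6)
    return (p / (1.0 - p)) * (n_train / n_current)

# Hypothetical one-dimensional environment features.
train_env = [[0.0], [0.2], [-0.1], [0.1]]
current_env = [[2.0], [1.8], [2.2], [1.9]]
w, b = fit_domain_classifier(train_env, current_env)
weight_near_current = importance_weight([2.0], w, b, 4, 4)
weight_near_train = importance_weight([0.0], w, b, 4, 4)
```

Replay-pool samples that resemble the current environment receive larger weights, so they could be emphasized when the model is fine-tuned.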

[0225] (3) Rule Filtering

[0226] For user-defined rules, this application can integrate the user-defined rules to adjust the recommended configuration range, and modify the numerical range of the recommended parameters to meet the user's custom rule requirements.
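A simple form of rule filtering is to clamp each recommended value into the user-defined range, as sketched below. The rule format (a mapping from parameter name to a low/high pair) and the parameter names are assumptions for illustration.

```python
def apply_rules(config, rules):
    """Clamp recommended values into user-defined [low, high] ranges.

    Parameters without a rule are passed through unchanged.
    """
    filtered = dict(config)
    for name, (low, high) in rules.items():
        if name in filtered:
            filtered[name] = max(low, min(high, filtered[name]))
    return filtered

# Hypothetical recommendation and user rule.
recommended = {"cache_mb": 4096, "threads": 8}
user_rules = {"cache_mb": (64, 1024)}
final_config = apply_rules(recommended, user_rules)
```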

[0227] (4) Configuration Recommendation

[0228] In reinforcement learning, the Critic evaluates the actions generated by the Actor. To save deployment time and achieve rapid tuning, this application uses the Critic to filter suboptimal configurations during online tuning, thereby enabling the tuning reinforcement learning model to recommend the optimal configuration more quickly.
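The Critic-based filtering step can be sketched as follows: candidate actions are scored by the Critic and only the highest-scoring ones are deployed. The function name, the `keep` parameter, and the toy Critic are assumptions; in practice the Critic would be the trained TD3-Critic network.

```python
def filter_by_critic(state, env_features, candidates, critic, keep=1):
    """Rank candidate actions by Q-value and keep the best `keep` of them."""
    ranked = sorted(candidates,
                    key=lambda action: critic(state, action, env_features),
                    reverse=True)
    return ranked[:keep]

# Toy Critic (hypothetical): prefers actions whose first value is near 0.3.
toy_critic = lambda state, action, env: -abs(action[0] - 0.3)
to_deploy = filter_by_critic(
    state=[0.0], env_features=[0.0],
    candidates=[[0.0], [0.25], [0.9]],
    critic=toy_critic,
)
```

Only the surviving candidates need to be deployed and stress-tested, which is where the deployment time is saved.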

[0229] Next, we will use an example to illustrate the optimization effect of the model provided in the embodiments of this application:

[0230] The evaluation was conducted on a three-node cluster on a local area network. Each node has 16GB of memory and 8 cores, and all nodes run Ubuntu 18.04. One node uses a different CPU from the others. Tuning experiments were performed on three NoSQL databases: MongoDB, Redis, and Cassandra.

[0231] Table 1. Number of NoSQL Configuration Parameters and Status Indicators

[0232]

[0233]

[0234] To mitigate the impact of performance variability caused by workload, network, and other hardware, all experiments can be repeated three times to ensure the accuracy of the conclusions.

[0235] In the experiment, the standard workloads of the YCSB load testing tool can be used for offline training and online tuning: workloads a (50% read, 50% update), b (95% read, 5% update), c (100% read), d (95% read, 5% insert), e (95% scan, 5% insert), and f (50% read, 50% insert).

[0236] To increase the richness of the workload, a custom workload g (40% read, 30% update, 30% insert) can be defined.

[0237] In the experiments, the weight β for throughput and latency can be set to 0.5. To further explain how this application achieves optimal performance for NoSQL databases, ablation experiments are conducted below on the reinforcement learning module, the meta-learning module, and the module in which the Critic identifies suboptimal configurations. Currently, most RL-based database tuning methods use DDPG.

[0238] To demonstrate the correctness of the methods provided in this application, experiments were conducted on TD3, SAC, and DDPG with and without fused causal inference. Each method was trained at the same time cost, and these offline-trained models were then fine-tuned online.

[0239] Figure 3 illustrates a comparison of the best relative throughput and latency of TD3, SAC, and DDPG with and without fused causal inference in a Redis database. Figure 4 illustrates a comparison of the best relative throughput and latency of Meta TD3 and TD3 in a MongoDB database. Figure 5 illustrates the relationship between the reward parameters and Q-values after Redis offline training.

[0240] As can be seen from Figure 3, the methods incorporating causal inference significantly improve tuning performance compared to the methods without it. This demonstrates that the causal inference method of this invention can quickly find causal relationships between configuration parameters, thereby fully utilizing high-quality tuning configuration samples and improving training efficiency. Furthermore, for the reinforcement learning module, DDPG typically overestimates the Q-value of the state significantly in the early stages of training, while TD3 and SAC use two critics to reduce the problem of excessively high Q-values. This shows that this invention can effectively improve the tuning efficiency of the model.

[0241] In the experiments, this application tested the meta-learning module in MongoDB and compared the methods of Meta TD3 and TD3, both of which incorporate causal inference. Workloads a through e were used as the offline training environment in the experiments. TD3 and Meta_TD3 were tuned on workload f, while TD3_hardware and Meta_TD3_hardware were tuned on workload f and running on different CPU nodes.

[0242] As can be seen from Figure 4, both methods perform poorly in initial tuning when encountering unfamiliar environments. Meta_TD3, however, can first adapt the model to the new environment. After adaptation, Meta_TD3 exhibits better performance than TD3. This demonstrates that the present invention has better generalization ability and stronger applicability.

[0243] In the experiment, the Critic was trained offline on Redis and then used during online tuning. As can be seen from Figure 5, the trends of the Q-value and the reward parameter are almost identical. This indicates that the method provided in this application embodiment can effectively identify suboptimal configurations during the online tuning phase.

[0244] The database configuration tuning device provided in the embodiments of this application is described below. The database configuration tuning device described below and the database configuration tuning method described above may be referred to in correspondence with each other.

[0245] Referring to Figure 6, Figure 6 is a schematic diagram of a database configuration tuning device disclosed in an embodiment of this application.

[0246] As shown in Figure 6, the database configuration tuning device may include:

[0247] The receiving unit 101 is used to receive the tuning request from the target client, wherein the tuning request from the target client includes tuning requirements and the target database that needs to be tuned.

[0248] The collection unit 102 is used to collect the target configuration sample set corresponding to the target database based on the tuning request of the target client;

[0249] Analysis unit 103 is used to input the target configuration sample set into a preset target tuning model for analysis, and obtain the configuration results corresponding to the target database;

[0250] The judgment unit 104 is used to determine whether the configuration result corresponding to the target database meets the tuning requirements of the target client; if the configuration result corresponding to the target database does not meet the tuning requirements of the target client, then return to the operation of inputting the target configuration sample set into the preset target tuning model for analysis, until the configuration result corresponding to the target database output by the target tuning model meets the tuning requirements of the target client.

[0251] The feedback unit 105 is used to feed back the configuration result to the target client when the execution result of the judgment unit determines that the configuration result corresponding to the target database meets the tuning requirements of the target client.
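
The cooperation of units 101 to 105 can be sketched as a simple loop. The Python sketch below is illustrative only: the names `TuningRequest`, `tune`, and `ToyModel`, the `cache_mb` key, and the `max_rounds` cap are assumptions introduced here and do not come from this application. The toy model merely stands in for the preset target tuning model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Config = Dict[str, float]


@dataclass
class TuningRequest:
    """Hypothetical request: the target database plus the tuning requirement."""
    target_database: str
    meets_requirement: Callable[[Config], bool]


class ToyModel:
    """Stand-in for the target tuning model: proposes a larger value each call."""

    def __init__(self) -> None:
        self.step = 0

    def __call__(self, samples: List[Config]) -> Config:
        self.step += 1
        return {"cache_mb": 64.0 * self.step}


def tune(
    request: TuningRequest,
    collect_samples: Callable[[str], List[Config]],
    tuning_model: Callable[[List[Config]], Config],
    max_rounds: int = 50,
) -> Optional[Config]:
    # Collection unit: gather the configuration sample set for the target database.
    samples = collect_samples(request.target_database)
    for _ in range(max_rounds):
        # Analysis unit: ask the tuning model for a configuration result.
        config = tuning_model(samples)
        # Judgment unit: check the result against the client's requirement.
        if request.meets_requirement(config):
            # Feedback unit: return the accepted result to the client.
            return config
    # Assumption: give up after max_rounds instead of looping indefinitely.
    return None
```

The text above describes repeating the analysis until the requirement is met. The `max_rounds` cap is an added safeguard in this sketch so that an unsatisfiable requirement does not loop forever.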

[0252] As can be seen from the technical solutions described above, when the performance of a database needs to be tuned, the device provided in this application embodiment can receive a tuning request from a target client and tune the database named in that request. The tuning request includes the tuning requirements and the target database to be tuned. Based on the request, a target configuration sample set corresponding to the target database can be collected, which helps improve the accuracy of the configuration result obtained for the target database. The collected sample set can then be input into a preset target tuning model for analysis to obtain the configuration result corresponding to the target database. Because the configuration result produced by the target tuning model does not necessarily meet the tuning requirements of the target client, the device can further determine whether it does. If it does not, the device returns to the operation of inputting the target configuration sample set into the preset target tuning model for analysis, and repeats this until the configuration result output by the target tuning model meets the tuning requirements of the target client. If it does, the configuration result can be fed back to the target client.

[0253] Therefore, when the configuration of a database needs to be tuned, the tuning device provided in this application embodiment offers high tuning efficiency and effectively reduces the time cost of tuning. It can also be applied adaptively to the tuning of NoSQL databases and can effectively recommend the optimal configuration result for the database to be tuned. The device has strong applicability and can be adapted to different database configuration tuning tasks.

[0254] Further optionally, the device may also include:

[0255] The matching unit is used to match the target configuration sample set with a preset tuning model to obtain a target tuning model that matches the target database.

[0256] Further optionally, the matching unit may include:

[0257] The first matching subunit is used to input the target configuration sample set into the preset tuning model to extract the configuration environment features in the target configuration sample set;

[0258] The second matching subunit is used to input the configuration environment features of the target configuration sample set and the data in the preset experience replay pool into a preset covariate shift correction model to obtain the deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model. The preset covariate shift correction model is trained by using the configuration environment features of the training configuration sample set and the data in the preset experience replay pool as training samples and the deviation information between the current environment corresponding to the training configuration sample set and the configuration environment corresponding to the preset tuning model as sample labels.

[0259] The third matching subunit is used to adjust the preset tuning model based on the deviation information using the covariate shift correction model, so as to obtain a target tuning model that matches the target database.
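
To make the idea of deviation information concrete, the following Python sketch compares configuration environment features from the current environment with those stored in the experience replay pool. It is a deliberately simplified stand-in: this application describes a trained covariate shift correction model, whereas the sketch uses a fixed statistic (the per-feature standardized mean difference). The function names `deviation_info` and `needs_adjustment` and the `threshold` value are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def deviation_info(
    current: Sequence[Sequence[float]],
    replay: Sequence[Sequence[float]],
) -> List[float]:
    """Per-feature shift of the current environment relative to the replay pool.

    Each row is one feature vector; the result has one entry per feature,
    expressed in units of the replay pool's standard deviation.
    """
    n_features = len(replay[0])
    deviations = []
    for j in range(n_features):
        cur = [row[j] for row in current]
        ref = [row[j] for row in replay]
        # Fall back to 1.0 when the reference feature is constant.
        scale = pstdev(ref) or 1.0
        deviations.append((mean(cur) - mean(ref)) / scale)
    return deviations


def needs_adjustment(deviations: Sequence[float], threshold: float = 1.0) -> bool:
    """Assumed rule: adjust the preset model if any feature shifts past the threshold."""
    return any(abs(d) > threshold for d in deviations)
```

In this sketch a large deviation would signal that the preset tuning model was trained under a noticeably different environment and should be adjusted before use. How the adjustment itself is performed is not shown here.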

[0260] Further optionally, the training process of the target tuning model may include:

[0261] Collect target training samples and store them in a pre-defined experience replay pool;

[0262] The input parameter dimensions of the target tuning model are determined, and the network parameters of the target tuning model are set. The network parameters of the target tuning model include the agent of the target tuning model, the tuning environment of the target tuning model, the state of the target tuning model, the action of the target tuning model, the incentive parameters of the target tuning model, and the tuning strategy of the target tuning model.

[0263] The target training samples are input into the target tuning model according to the input parameter dimensions of the target tuning model, and the model is trained repeatedly to obtain the target tuning model of the database corresponding to the target training samples.
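
The experience replay pool used in this training process can be sketched as a bounded buffer of transitions. The Python code below is a minimal illustration under stated assumptions: the class names `Transition` and `ExperienceReplayPool`, the field layout (state, action, incentive value, next state), and the drop-oldest eviction policy are choices made for this sketch and are not specified by this application.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional, Tuple


@dataclass(frozen=True)
class Transition:
    """One training sample: state, action, incentive (reward), next state."""
    state: Tuple[float, ...]
    action: Tuple[float, ...]
    reward: float
    next_state: Tuple[float, ...]


class ExperienceReplayPool:
    """Fixed-capacity pool; the oldest samples are dropped when it is full."""

    def __init__(self, capacity: int) -> None:
        self._buffer: Deque[Transition] = deque(maxlen=capacity)

    def store(self, transition: Transition) -> None:
        self._buffer.append(transition)

    def sample(
        self, batch_size: int, rng: Optional[random.Random] = None
    ) -> List[Transition]:
        # Draw without replacement; never ask for more than is stored.
        source = rng if rng is not None else random
        k = min(batch_size, len(self._buffer))
        return source.sample(list(self._buffer), k)

    def __len__(self) -> int:
        return len(self._buffer)
```

Repeated training would then draw batches from the pool with `sample` and feed them to the model according to its input parameter dimensions. The update rule of the model itself is outside the scope of this sketch.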

[0264] For the specific processing flow of each unit included in the aforementioned database configuration tuning device, refer to the corresponding description of the database configuration tuning method above, which is not repeated here.

[0265] The database configuration tuning device provided in this application embodiment can be applied to database configuration tuning equipment, such as a terminal: a mobile phone, a computer, etc. Optionally, Figure 7 shows a hardware structure diagram of the database configuration tuning equipment. Referring to Figure 7, the hardware structure of the database configuration tuning equipment may include: at least one processor 1, at least one communication interface 2, at least one memory 3, and at least one communication bus 4.

[0266] In this embodiment, there is at least one each of processor 1, communication interface 2, memory 3, and communication bus 4, and processor 1, communication interface 2, and memory 3 communicate with each other through communication bus 4.

[0267] Processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of this application.

[0268] Memory 3 may include high-speed RAM, and may also include non-volatile memory, such as at least one disk storage device.

[0269] The memory stores a program, which the processor can call. The program is used to implement the various processing flows of the aforementioned terminal in the database configuration tuning scheme.

[0270] This application embodiment also provides a readable storage medium that can store a program suitable for execution by a processor, the program being used to implement the various processing flows of the aforementioned terminal in the database configuration tuning scheme.

[0271] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0272] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.

[0273] The above description of the disclosed embodiments enables those skilled in the art to make or use this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Various embodiments can be combined with each other. Therefore, this application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A database configuration tuning method, characterized in that it comprises:
receiving a tuning request from a target client, wherein the tuning request from the target client includes tuning requirements and the target database that needs to be tuned;
collecting, based on the tuning request from the target client, a target configuration sample set corresponding to the target database;
inputting the target configuration sample set into a preset target tuning model for analysis to obtain a configuration result corresponding to the target database;
determining whether the configuration result corresponding to the target database meets the tuning requirements of the target client;
if the configuration result corresponding to the target database does not meet the tuning requirements of the target client, returning to the operation of inputting the target configuration sample set into the preset target tuning model for analysis, until the configuration result corresponding to the target database output by the target tuning model meets the tuning requirements of the target client;
if the configuration result corresponding to the target database meets the tuning requirements of the target client, feeding back the configuration result to the target client;
wherein, before inputting the target configuration sample set into the preset target tuning model for analysis, the method further comprises: matching the target configuration sample set with a preset tuning model to obtain a target tuning model that matches the target database;
wherein matching the target configuration sample set with the preset tuning model to obtain the target tuning model that matches the target database comprises:
inputting the target configuration sample set into the preset tuning model to extract the configuration environment features in the target configuration sample set;
inputting the configuration environment features of the target configuration sample set and the data in a preset experience replay pool into a preset covariate shift correction model to obtain deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model, wherein the preset covariate shift correction model is trained by using the configuration environment features of a training configuration sample set and the data in the preset experience replay pool as training samples, and the deviation information between the current environment corresponding to the training configuration sample set and the configuration environment corresponding to the preset tuning model as sample labels; and
adjusting, based on the deviation information, the preset tuning model using the covariate shift correction model to obtain the target tuning model that matches the target database.

2. The method according to claim 1, characterized in that the training process of the target tuning model comprises:
collecting target training samples and storing them in a preset experience replay pool;
determining the input parameter dimensions of the target tuning model and setting the network parameters of the target tuning model, wherein the network parameters of the target tuning model include the agent of the target tuning model, the tuning environment of the target tuning model, the state of the target tuning model, the action of the target tuning model, the incentive parameters of the target tuning model, and the tuning strategy of the target tuning model; and
inputting the target training samples into the target tuning model according to the input parameter dimensions of the target tuning model, and training the model repeatedly to obtain the target tuning model of the database corresponding to the target training samples.

3. A database configuration tuning device, characterized in that it comprises:
a receiving unit, used to receive a tuning request from a target client, wherein the tuning request from the target client includes tuning requirements and the target database that needs to be tuned;
a collection unit, used to collect a target configuration sample set corresponding to the target database based on the tuning request of the target client;
an analysis unit, used to input the target configuration sample set into a preset target tuning model for analysis and obtain a configuration result corresponding to the target database;
a judgment unit, used to determine whether the configuration result corresponding to the target database meets the tuning requirements of the target client, and, if it is determined that the configuration result corresponding to the target database does not meet the tuning requirements of the target client, to return to the operation of inputting the target configuration sample set into the preset target tuning model for analysis, until the configuration result corresponding to the target database output by the target tuning model meets the tuning requirements of the target client;
a feedback unit, used to feed back the configuration result to the target client when the execution result of the judgment unit determines that the configuration result corresponding to the target database meets the tuning requirements of the target client; and
a matching unit, used to match the target configuration sample set with a preset tuning model to obtain a target tuning model that matches the target database;
wherein the matching unit comprises:
a first matching subunit, used to input the target configuration sample set into the preset tuning model to extract the configuration environment features in the target configuration sample set;
a second matching subunit, used to input the configuration environment features of the target configuration sample set and the data in a preset experience replay pool into a preset covariate shift correction model to obtain deviation information between the current configuration environment corresponding to the target configuration sample set and the training configuration environment corresponding to the preset tuning model, wherein the preset covariate shift correction model is trained by using the configuration environment features of a training configuration sample set and the data in the preset experience replay pool as training samples, and the deviation information between the current environment corresponding to the training configuration sample set and the configuration environment corresponding to the preset tuning model as sample labels; and
a third matching subunit, used to adjust the preset tuning model based on the deviation information using the covariate shift correction model, so as to obtain the target tuning model that matches the target database.

4. The device according to claim 3, characterized in that the training process of the target tuning model comprises:
collecting target training samples and storing them in a preset experience replay pool;
determining the input parameter dimensions of the target tuning model and setting the network parameters of the target tuning model, wherein the network parameters of the target tuning model include the agent of the target tuning model, the tuning environment of the target tuning model, the state of the target tuning model, the action of the target tuning model, the incentive parameters of the target tuning model, and the tuning strategy of the target tuning model; and
inputting the target training samples into the target tuning model according to the input parameter dimensions of the target tuning model, and training the model repeatedly to obtain the target tuning model of the database corresponding to the target training samples.

5. Database configuration tuning equipment, characterized in that it comprises: one or more processors, and a memory; wherein the memory stores computer-readable instructions which, when executed by the one or more processors, implement the steps of the database configuration tuning method according to any one of claims 1 to 2.

6. A readable storage medium, characterized in that the readable storage medium stores computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the database configuration tuning method according to any one of claims 1 to 2.