Data migration method for heterogeneous databases based on reinforcement learning
By introducing forward-looking factors and multi-level feedback mechanisms into heterogeneous databases, combined with cross-domain knowledge transfer, and optimizing data migration strategies, the problems of insufficient robustness and adaptability in existing technologies are solved, and more efficient data migration results are achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- LOGISTICAL ENGINEERING UNIVERSITY OF PLA
- Filing Date
- 2025-03-04
- Publication Date
- 2026-06-19
AI Technical Summary
Existing reinforcement learning-based data migration methods have poor robustness and adaptability in heterogeneous databases, are prone to getting trapped in local optima, and have high computational resource requirements.
A forward-looking migration strategy optimization mechanism based on deep reinforcement learning is adopted, combined with cross-domain knowledge transfer. By predicting future needs and environmental changes, the data migration process is optimized. Forward-looking factors, multi-level feedback mechanisms, task complexity factors, and task priority factors are introduced to optimize the migration strategy.
It improves the robustness and adaptability of data migration, solves the problem of local optima, optimizes the accuracy and efficiency of migration strategies, reduces manual configuration and intervention, and improves work efficiency.
Figure CN120179629B_ABST (abstract drawing)
Description
Technical Field
[0001] This invention belongs to the field of data migration technology, specifically relating to a data migration method, system, device, and computer-readable storage medium for heterogeneous databases based on reinforcement learning.
Background Technology
[0002] In the information age, data migration is a common task, especially when migrating data between heterogeneous databases. Traditional data migration methods often rely on manual configuration and fixed rules, which are not only time-consuming and labor-intensive, but also prone to errors and difficult to adapt to constantly changing environments and needs, resulting in poor migration outcomes.
[0003] With the development of big data and artificial intelligence technologies, reinforcement learning (RL), as an effective automatic optimization method, is increasingly being applied to data migration tasks. Reinforcement learning methods automatically optimize migration strategies through interaction with the environment, improving migration performance and adapting flexibly to different environments and tasks. Compared to traditional rule-based methods, reinforcement learning can continuously adjust and optimize migration strategies through trial and error, and therefore adapts better to complex and ever-changing environments.
[0004] However, existing reinforcement learning-based data migration methods still have some shortcomings. First, these methods primarily focus on optimizing the current task, lacking the ability to predict future needs and environmental changes. For example, during data migration, future changes in data volume, data structure, and business requirements can significantly impact the migration strategy, and existing methods often cannot anticipate these changes, resulting in insufficient robustness and adaptability in the migration process. Second, the training process of reinforcement learning models typically requires substantial data and computational resources, which poses a challenge for resource-constrained environments. Furthermore, reinforcement learning methods are prone to getting trapped in local optima when handling large-scale, high-dimensional data migration tasks, affecting the overall migration performance.
[0005] Based on this, the present invention proposes a data migration method for heterogeneous databases based on reinforcement learning. It uses a forward-looking migration strategy optimization mechanism based on deep reinforcement learning and combines cross-domain knowledge transfer to optimize the data migration process by predicting future needs and environmental changes.
Summary of the Invention
[0006] To address the aforementioned problems in existing technologies, namely, the poor robustness and adaptability of existing reinforcement learning-based data migration methods and their tendency to get trapped in local optima, this invention, in its first aspect, proposes a data migration method for heterogeneous databases based on reinforcement learning. This method includes:
[0007] S10: Extract the data to be migrated from each heterogeneous database as source data, and take the migration task corresponding to each source data as a source task;
[0008] S20: Determine the migration task corresponding to the target database as the target task; calculate the migration path similarity between the target task and each source task, sort the similarities in descending order, and take the source tasks corresponding to the top k migration path similarities as mini-batch tasks;
[0009] S30: Calculate the forward-looking factor and the task complexity factor corresponding to each mini-batch task;
[0010] S40: Calculate the policy gradient by combining the forward-looking factor and the task complexity factor; update the policy parameters based on the policy gradient, and then update the policy network;
[0011] S50: Calculate the meta-gradient based on the updated policy network, and correct the meta-gradient by combining it with the task relevance matrix; update the global parameters based on the corrected meta-gradient;
[0012] S60: Initialize the policy network based on the global parameters, generate a migration policy through the initialized policy network, and then migrate the source data.
[0013] In some preferred embodiments, the similarity of the migration paths between the target task and each source task is calculated using the following method:
[0014] The feature vectors of the target task and the source task are extracted respectively, and used as the first feature vector and the second feature vector;
[0015] The first feature vector and the second feature vector are standardized respectively.
[0016] The similarity between the standardized first feature vector and the standardized second feature vector is calculated using a similarity calculation method, and this similarity is used as the migration path similarity.
[0017] In some preferred embodiments, the forward-looking factor is calculated as follows:
[0018]
[0019] Where $p_i$ represents the forward-looking factor corresponding to the i-th mini-batch task, which represents the influence of the current task on future tasks; $t_0$ represents the task start time; $t_f$ represents the last point in time within a future period; $\mathrm{Demand}(t)$ represents the migration demand data at time $t$, which is predicted by a deep learning model based on historical data; and $\lambda$ represents the decay rate.
[0020] In some preferred embodiments, the task complexity factor is calculated as follows:
[0021] Extract the sub-feature vectors corresponding to the i-th mini-batch task;
[0022] Calculate the weights corresponding to each sub-feature vector, and weight the corresponding sub-feature vectors using the weights; sum the weighted sub-feature vectors to obtain the weighted feature sum of the i-th mini-batch task, which is used as the first summation result;
[0023] The ratio of the first summation result to the second summation result is calculated and used as the task complexity factor; the second summation result is the maximum value of the weighted feature summation in all mini-batch tasks.
[0024] In some preferred embodiments, the policy parameters are updated based on the policy gradient, and the method is as follows:
[0025]
[0026] Where $\theta'$ represents the updated policy parameters; $\theta$ represents the policy parameters before the update; $g_i$ represents the policy gradient; $S_{i,\mathrm{target}}$ represents the migration path similarity; $\gamma_t = e^{-\lambda t}$ represents the time decay factor; $\nabla_\theta J(\pi_\theta)$ represents the performance gradient of the i-th mini-batch task; $J(\pi_\theta)$ represents the expected reward; and $\pi_\theta$ represents the policy corresponding to the policy parameters $\theta$. The formula also uses the task complexity factor and the trajectory generated by the policy $\pi_\theta$ through interaction with the environment.
[0027] In some preferred embodiments, the meta-gradient is corrected by incorporating the task relevance matrix, and the method is as follows:
[0028]
[0029] Where $G'_i$ represents the corrected meta-gradient; $G_i$ represents the uncorrected meta-gradient; $C_i$ represents the consistency index value, which is the ratio of the number of migrated data entries that match the source data to the total number of data entries; $V_i$ represents the integrity index value after migration, which is the ratio of the number of data entries after migration to the number of data entries in the source; $U_i$ represents the migration speed index value, which is the ratio of the amount of data migrated to the time taken for migration; $R_i$ represents the data utilization efficiency index value after migration, which is the ratio of the average response time in the target database to the baseline response time; the $ij$-th entry of the task relevance matrix represents the correlation between mini-batch task $T_i$ and mini-batch task $T_j$; $Q_i$ represents the priority factor of mini-batch task $T_i$; $\mathrm{Priority}(T_i)$ represents the priority of the i-th mini-batch task; and $\pi_{\theta'}$ represents the policy obtained from the updated policy parameters. The formula also uses the trajectory generated by the policy through interaction with the environment.
[0030] In some preferred embodiments, the global parameters are updated based on the corrected meta-gradient, as follows:
[0031] Calculate the arithmetic mean of the corrected meta-gradients for k mini-batch tasks;
[0032] The policy parameters before the update are summed with the arithmetic mean, and the result is used as the global parameters.
[0033] In a second aspect, the present invention proposes a data migration system for heterogeneous databases based on reinforcement learning, the system comprising:
[0034] The data extraction module is configured to extract the data to be migrated from various heterogeneous databases as source data, and to use the migration tasks corresponding to each source data as source tasks.
[0035] The similarity calculation module is configured to determine the migration task corresponding to the target database as the target task; calculate the migration path similarity between the target task and each source task and sort them in descending order; and take the source tasks corresponding to the top k migration path similarities as mini-batch tasks.
[0036] The factor calculation module is configured to calculate the forward-looking factor and task complexity factor corresponding to each mini-batch task.
[0037] The policy update module is configured to calculate the policy gradient by combining the forward-looking factor and the task complexity factor, update the policy parameters based on the policy gradient, and then update the policy network.
[0038] The gradient correction module is configured to calculate the meta-gradient based on the updated policy network, and correct the meta-gradient in conjunction with the task relevance matrix; and update the global parameters based on the corrected meta-gradient.
[0039] The data migration module is configured to initialize a policy network based on the global parameters, generate a migration policy through the initialized policy network, and then migrate the source data.
[0040] In a third aspect, the present invention proposes a data migration device for heterogeneous databases based on reinforcement learning, the device comprising:
[0041] At least one processor, and a memory communicatively connected to at least one of the processors;
[0042] The memory stores instructions that can be executed by the processor to implement the aforementioned data migration method for heterogeneous databases based on reinforcement learning.
[0043] In a fourth aspect, the present invention provides a computer-readable storage medium storing computer instructions for execution by a computer to implement the above-described data migration method for heterogeneous databases based on reinforcement learning.
[0044] The beneficial effects of this invention are:
[0045] This invention improves the robustness and adaptability of data migration and solves the problem of local optima.
[0046] 1) This invention introduces a forward-looking factor, using time series analysis and deep learning models to predict future needs and environmental changes, which optimizes migration strategies and improves migration quality and efficiency. It uses cross-domain knowledge transfer technology to apply knowledge and experience from other domains to the current task, improving the generalization ability and adaptability of the migration strategy. Combined with a multi-level feedback mechanism, it comprehensively considers task similarity, time decay, the forward-looking factor, the task complexity factor, and the task priority factor to optimize the policy gradient, improving the accuracy and effectiveness of the migration strategy. Through the task complexity factor and the task priority factor, it assesses the difficulty and importance of tasks more accurately, optimizes task selection and scheduling, and improves migration efficiency. It thereby addresses the problem that traditional data migration methods adapt poorly to constantly changing environments and needs, which leads to poor migration results and a tendency to get trapped in local optima.
[0047] 2) This invention automates and optimizes migration strategies, reducing manual configuration and intervention, and improving work efficiency.
Attached Figure Description
[0048] Other features, objects, and advantages of this application will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings.
[0049] Figure 1 is a flowchart illustrating a data migration method for heterogeneous databases based on reinforcement learning, according to one embodiment of the present invention.
Detailed Implementation
[0050] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions in the embodiments of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this invention, not all embodiments. Based on the embodiments of this invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this invention.
[0051] The present application will now be described in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the invention. Furthermore, it should be noted that, for ease of description, only the parts relevant to the invention are shown in the accompanying drawings.
[0052] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other.
[0053] A data migration method for heterogeneous databases based on reinforcement learning according to the first embodiment of the present invention, as shown in Figure 1, includes the following steps:
[0054] S10: Extract the data to be migrated from each heterogeneous database as source data, and take the migration task corresponding to each source data as a source task;
[0055] S20: Determine the migration task corresponding to the target database as the target task; calculate the migration path similarity between the target task and each source task, sort the similarities in descending order, and take the source tasks corresponding to the top k migration path similarities as mini-batch tasks;
[0056] S30: Calculate the forward-looking factor and the task complexity factor corresponding to each mini-batch task;
[0057] S40: Calculate the policy gradient by combining the forward-looking factor and the task complexity factor; update the policy parameters based on the policy gradient, and then update the policy network;
[0058] S50: Calculate the meta-gradient based on the updated policy network, and correct the meta-gradient by combining it with the task relevance matrix; update the global parameters based on the corrected meta-gradient;
[0059] S60: Initialize the policy network based on the global parameters, generate a migration policy through the initialized policy network, and then migrate the source data.
[0060] To more clearly illustrate the data migration method for heterogeneous databases based on reinforcement learning according to the present invention, the steps of one embodiment of the method are described in detail below with reference to the accompanying drawings.
[0061] In this invention, when faced with multiple similar but not identical migration tasks, cross-domain knowledge transfer technology is introduced. It allows general knowledge and strategies to be learned from completed migration tasks and applied to new migration tasks, thereby adapting quickly to new tasks, accelerating learning on new tasks, and improving the migration success rate.
[0062] Cross-domain knowledge transfer can be applied in reinforcement learning in a variety of ways, including transfer learning and meta-learning (also known as "learning how to learn," which refers to the ability of a model to quickly adapt to new tasks by learning from multiple tasks).
[0063] The first embodiment of this invention provides a data migration method for heterogeneous databases based on reinforcement learning. It employs a meta-learning method: each data migration task is treated as a separate task, and the goal of cross-domain knowledge transfer is to learn from previous tasks in order to complete new tasks faster and better. Furthermore, it optimizes the migration strategy and improves migration quality and efficiency by introducing a forward-looking factor, a multi-level feedback mechanism, a task complexity factor, and a task priority factor. Details are as follows:
[0064] S10: Extract the data to be migrated from each heterogeneous database as source data, and take the migration task corresponding to each source data as the source task.
[0065] In this embodiment, the data to be migrated is first extracted from various existing heterogeneous databases. Taking the materials and equipment management system as an example, the source data includes personnel data, financial data, and materials data. Information such as material name, quantity, storage location, and supplier is extracted from the materials management system; information such as personnel name, position, affiliated unit, and contact information is extracted from the personnel management system; and information such as budget, expenditure, income, and audit records is extracted from the financial management system.
[0066] Define source tasks based on the type and characteristics of the source data, and clarify the scope of each source task to facilitate subsequent task selection and strategy optimization. For example, T1: personnel data migration; T2: financial data migration.
[0067] S20: Determine the migration task corresponding to the target database as the target task; calculate the migration path similarity between the target task and each source task, sort the similarities in descending order, and take the source tasks corresponding to the top k migration path similarities as mini-batch tasks.
[0068] The goal of data migration is to integrate data from multiple heterogeneous databases into a unified target database to improve the efficiency of data management and use. To ensure successful data migration, the target tasks need to be clearly defined based on the requirements of the target database.
[0069] In this embodiment, the structure, functions, and business requirements of the target database are analyzed first. Structural requirements: the table structure, field definitions, primary keys, and foreign key relationships of the target database. Functional requirements: the functions the target database needs to support, such as data querying, report generation, and data analysis. Business requirements: the business processes the target database needs to support, such as materials management, personnel management, and financial management.
[0070] The requirements of the target database are broken down into specific migration tasks, i.e., target tasks. Each task corresponds to a specific data migration operation, and the mapping relationship between the source data and the target data is determined to ensure the consistency and integrity of the data during the migration process. For example, when migrating data from the personnel management system to the personnel management table in the target database, fields such as "personnel name," "position," "affiliated unit," and "contact information" in the personnel management system are mapped to the corresponding fields in the target database; similarly, when migrating data from the financial management system to the financial management table in the target database, fields such as "budget," "expenditure," "income," and "audit records" in the financial management system are mapped to the corresponding fields in the target database.
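The field mapping described above can be illustrated with a minimal sketch. The source only names the source-side fields, so the target column names (`person_name`, `unit`, `contact`) and the helper `map_record` are hypothetical and used for illustration only.

```python
# Hypothetical target column names; the source lists only the source-side fields.
PERSONNEL_FIELD_MAP = {
    "personnel name": "person_name",
    "position": "position",
    "affiliated unit": "unit",
    "contact information": "contact",
}


def map_record(record, field_map):
    """Rename the fields of one source record according to field_map."""
    missing = [field for field in field_map if field not in record]
    if missing:
        # Failing early keeps incomplete records out of the target table.
        raise KeyError(f"source record is missing fields: {missing}")
    return {target: record[source] for source, target in field_map.items()}


source_record = {
    "personnel name": "A. Example",
    "position": "clerk",
    "affiliated unit": "Unit 1",
    "contact information": "000-0000",
}
target_record = map_record(source_record, PERSONNEL_FIELD_MAP)
```

Rejecting records with missing fields is one simple way to protect the consistency and integrity that the mapping is meant to ensure; a real migration might instead apply defaults or log the record.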
[0071] Then, initialize the parameters θ of the policy network and the parameters of the value network; from the source task set, select the k source tasks with the highest migration path similarity to the target task as mini-batch tasks.
[0072] The method for calculating migration path similarity is as follows:
[0073] The feature vectors of the target task and the source task are extracted respectively, and used as the first feature vector and the second feature vector; the first feature vector and the second feature vector are then standardized.
[0074] The similarity between the standardized first feature vector and the standardized second feature vector is calculated using a similarity calculation method and serves as the migration path similarity. In this invention, the cosine similarity formula is preferred, but a neural network model, such as an LSTM model, can also be used. In this invention, the LSTM model is constructed from a sequentially connected input layer, convolutional layer (for feature extraction), Transformer encoding layer, gated recurrent unit layer, fully connected layer, and output layer. The output of the Transformer encoding layer is residually connected to the output of the gated recurrent unit layer, and the result serves as the input to the fully connected layer. The Transformer encoding layer dynamically encodes the feature vector output by the preceding layer to obtain an encoded vector. The encoded vector is weighted by a global attention mechanism and by a local attention mechanism (the local attention mechanism used in this invention is an existing technology, which will not be described in detail here) to obtain a global vector and a local vector, respectively. The global attention mechanism computes the global vector as $X + \alpha_2 \cdot \mathrm{Attention}(Q, K, V)$, with $\alpha_2 = \alpha_1 + \mathrm{Linear}(C)$, $\alpha_1 = \mathrm{LayerNorm}(\mathrm{ReLU}(\mathrm{Linear}(X))) + X$, and $C = \mathrm{Condition}(X)$, where $X$ represents the input feature vector, Linear represents a linear transformation, LayerNorm represents layer normalization, ReLU represents the activation function, $\mathrm{Attention}(Q, K, V)$ is the standard self-attention mechanism, and Condition represents the conditional variable. The global vector and the local vector are then fused as the output.
[0075] S30: Calculate the forward-looking factor and the task complexity factor corresponding to each mini-batch task.
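The preferred cosine-similarity variant of the mini-batch selection can be sketched as follows. This is an illustrative sketch under stated assumptions: "standardized" is read here as a z-score per feature dimension across the target and source vectors, and the three example features (data volume, table count, field-type count) and task ids are hypothetical.

```python
import math


def standardize(vectors):
    """Z-score each feature dimension across all vectors (assumed reading of 'standardized')."""
    n, dims = len(vectors), len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    # A zero standard deviation is replaced by 1.0 to avoid division by zero.
    stds = [
        math.sqrt(sum((v[d] - means[d]) ** 2 for v in vectors) / n) or 1.0
        for d in range(dims)
    ]
    return [[(v[d] - means[d]) / stds[d] for d in range(dims)] for v in vectors]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_mini_batch(target_vec, source_vecs, k):
    """Return the ids of the k source tasks most similar to the target, most similar first."""
    names = list(source_vecs)
    standardized = standardize([target_vec] + [source_vecs[n] for n in names])
    target_std, sources_std = standardized[0], standardized[1:]
    scored = [(cosine_similarity(target_std, s), n) for n, s in zip(names, sources_std)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical features: [data volume (millions of rows), table count, field-type count]
target = [1.0, 10.0, 5.0]
sources = {
    "T1": [1.1, 11.0, 5.0],
    "T2": [5.0, 40.0, 12.0],
    "T3": [0.9, 9.0, 4.0],
}
mini_batch = select_mini_batch(target, sources, k=2)
```

In this example the two small tasks T1 and T3 are selected and the much larger T2 is left out, which matches the intent of sorting by similarity in descending order and keeping the top k.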
[0076] In heterogeneous database data migration, future needs and environments may change. Therefore, migration strategies need to be forward-looking, capable of predicting and adapting to future changes.
[0077] In this embodiment, a forward-looking migration strategy optimization mechanism based on deep reinforcement learning is introduced. By using time series analysis and deep learning models to predict future needs, the adaptability and robustness of the migration strategy are improved.
[0078] The forward-looking factor (i.e., the factor representing the influence of the current task on future tasks) is calculated as follows:
[0079]
[0080] Where $p_i$ represents the forward-looking factor corresponding to the i-th mini-batch task, which indicates the influence of the current task on future tasks; $t_0$ represents the task start time; $t - t_0$ represents the time difference, used to capture changes in task requirements over a future period; $t_f$ represents the last point in time within a future period; $\mathrm{Demand}(t)$ represents the migration demand data at time $t$, which is predicted by a deep learning model (existing technology, which will not be elaborated here) based on historical data (i.e., migration task records from a past period, including migration time, source data, target data, migration strategy, migration results, etc.); and $\lambda$ represents the decay rate.
[0081] By leveraging time differences, forward-looking factors can capture changes in task requirements over a future period, thus better reflecting the impact of current tasks on future tasks. This approach is particularly useful in data migration because it helps optimize migration strategies and ensures the migration process adapts to future business needs. In data management, this method of calculating forward-looking factors can significantly improve the effectiveness and efficiency of data migration.
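The exact formula for the forward-looking factor is not reproduced in this text. As a hedged illustration only, the sketch below assumes one form consistent with the listed symbols: a discrete sum of the predicted demand over the period from the task start time to the last future time point, weighted by an exponential decay in the time difference. The function name, the discrete time steps, and the demand values are assumptions, and the predicted demand is supplied as a plain lookup rather than a deep learning model.

```python
import math


def forward_looking_factor(demand, t0, tf, decay_rate):
    """Assumed form: sum over t in [t0, tf] of exp(-decay_rate * (t - t0)) * demand(t)."""
    return sum(
        math.exp(-decay_rate * (t - t0)) * demand(t)
        for t in range(t0, tf + 1)
    )


# Hypothetical predicted migration demand per time step (stand-in for a model's output).
predicted_demand = {0: 100.0, 1: 120.0, 2: 90.0}

p_no_decay = forward_looking_factor(lambda t: predicted_demand[t], t0=0, tf=2, decay_rate=0.0)
p_decayed = forward_looking_factor(lambda t: predicted_demand[t], t0=0, tf=2, decay_rate=0.5)
```

With a decay rate of zero every future time step counts equally; a larger decay rate makes demand further in the future contribute less to the factor.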
[0082] The task complexity factor needs to accurately assess the complexity of a task, taking into account factors such as data size, data type, and migration path. This invention uses a multi-factor comprehensive evaluation model, combining deep learning and expert knowledge, to calculate the task complexity factor, as follows:
[0083] Each mini-batch task $T_i$ has multiple features $f$ (i.e., sub-feature vectors), which can include data size, data type diversity, migration path length, data dependency, migration frequency, etc. In this invention, the sub-feature vectors corresponding to the i-th mini-batch task are first extracted; the weights corresponding to each sub-feature vector are calculated, and the corresponding sub-feature vectors are weighted using these weights; the weighted sub-feature vectors are summed to obtain the weighted feature sum of the i-th mini-batch task, which is used as the first summation result; the ratio of the first summation result to the second summation result is calculated as the task complexity factor; the second summation result is the maximum value of the weighted feature sums across all mini-batch tasks. This is shown in the following formula:
[0084] $\mathrm{Complexity}(T_i) = \dfrac{\sum_{f \in F} \omega_f \cdot \mathrm{Feature}_f(T_i)}{\max_{T_j} \sum_{f \in F} \omega_f \cdot \mathrm{Feature}_f(T_j)}$
[0085] Where $\mathrm{Complexity}(T_i)$ represents the task complexity factor; $\sum_{f \in F} \omega_f \cdot \mathrm{Feature}_f(T_i)$ represents the weighted feature sum of mini-batch task $T_i$; $F$ represents the feature set; $\omega_f$ represents the weight of sub-feature vector $f$; $\mathrm{Feature}_f(T_i)$ represents the value of sub-feature vector $f$ in mini-batch task $T_i$; $T_j$ ranges over the set of all mini-batch tasks; and $\max$ takes the maximum value of the weighted feature sum over all mini-batch tasks.
[0086] S40: Calculate the policy gradient by combining the forward-looking factor and the task complexity factor; update the policy parameters based on the policy gradient, and then update the policy network.
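The ratio described above (weighted feature sum of a task divided by the largest weighted feature sum among all mini-batch tasks) can be sketched as follows. The feature names, the fixed weights, and the assumption that features are already scaled to comparable ranges are illustrative; the source computes the weights with a model that combines deep learning and expert knowledge.

```python
def weighted_feature_sum(features, weights):
    """Sum of weight * feature value over the feature set."""
    return sum(weights[name] * value for name, value in features.items())


def task_complexity_factors(tasks, weights):
    """Divide each task's weighted feature sum by the maximum over all mini-batch tasks."""
    sums = {task_id: weighted_feature_sum(f, weights) for task_id, f in tasks.items()}
    largest = max(sums.values())
    if largest == 0:
        return {task_id: 0.0 for task_id in sums}
    return {task_id: s / largest for task_id, s in sums.items()}


# Hypothetical weights and pre-scaled feature values.
weights = {"data_size": 0.5, "type_diversity": 0.3, "path_length": 0.2}
tasks = {
    "T1": {"data_size": 0.2, "type_diversity": 0.5, "path_length": 0.5},
    "T2": {"data_size": 0.8, "type_diversity": 1.0, "path_length": 0.5},
}
factors = task_complexity_factors(tasks, weights)
```

Because of the normalization by the maximum, the most complex task in the mini-batch always receives a factor of 1 and every other task receives a value between 0 and 1 (for non-negative features and weights).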
[0087] In this embodiment, for each mini-batch task, the policy is used to interact with the environment and collect a certain number of trajectories; then, the forward-looking factor and the task complexity factor are calculated.
[0088] The policy gradient is calculated by combining the forward-looking factor and the task complexity factor, and the policy parameters are updated based on the policy gradient. The specific process is as follows:
[0089]
[0090] Where $\theta'$ represents the updated policy parameters; $\theta$ represents the policy parameters before the update; $g_i$ represents the policy gradient; $S_{i,\mathrm{target}}$ represents the migration path similarity; $\gamma_t = e^{-\lambda t}$ represents the time decay factor (during data migration, early migration paths and strategies may be less effective than later ones, so a time decay factor can be introduced to reduce the impact of early tasks); $\nabla_\theta J(\pi_\theta)$ represents the performance gradient of the i-th mini-batch task; $J(\pi_\theta)$ represents the expected reward; and $\pi_\theta$ represents the policy corresponding to the policy parameters $\theta$. The formula also uses the task complexity factor and the trajectory generated by the policy $\pi_\theta$ through interaction with the environment.
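The update formula itself is not reproduced in this text, so the following is only one possible reading. It assumes that the migration path similarity, the time decay factor, the forward-looking factor, and the task complexity factor scale the performance gradient multiplicatively, and that the parameters move along the resulting policy gradient with a learning rate. The multiplicative form, the learning rate, and all function names are assumptions for illustration.

```python
import math


def weighted_policy_gradient(perf_grad, similarity, t, decay_rate,
                             forward_factor, complexity):
    """Assumed form: g_i = S_i,target * exp(-decay_rate * t) * p_i * complexity * grad J."""
    time_decay = math.exp(-decay_rate * t)
    scale = similarity * time_decay * forward_factor * complexity
    return [scale * g for g in perf_grad]


def update_policy_parameters(theta, policy_grad, learning_rate):
    """Gradient ascent step; the learning rate is an assumption not named in the source."""
    return [p + learning_rate * g for p, g in zip(theta, policy_grad)]


g_i = weighted_policy_gradient(
    perf_grad=[1.0, -2.0], similarity=0.5, t=0, decay_rate=0.1,
    forward_factor=2.0, complexity=1.0,
)
theta_new = update_policy_parameters([0.0, 0.0], g_i, learning_rate=0.1)
```

Under this reading, tasks that are more similar to the target, more recent, more influential on future demand, or more complex contribute a larger update to the policy parameters.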
[0091] In reinforcement learning, the performance gradient is used to guide the update of policy parameters to maximize the expected reward. The performance gradient reflects the sensitivity of the policy parameters to the expected reward. In this invention, the calculation process of the performance gradient is as follows:
[0092] The policy $\pi_\theta$ is used to interact with the environment and generate one or more trajectories (each trajectory includes states $s$, actions $a$, and rewards $r$). For each time step in each trajectory, the log probability $\log \pi_\theta(a_t \mid s_t)$ of action $a_t$ in state $s_t$ is calculated.
[0093] For each time step $t$ in each trajectory, the cumulative reward from $t$ to the end of the trajectory is calculated, where $\gamma$ is the discount factor, set according to the actual situation. For each time step in each trajectory, the gradient term is then calculated. The performance gradient of a trajectory is obtained by summing the gradient terms of all time steps. If there are multiple trajectories, the gradients of all trajectories are summed to obtain the final performance gradient.
[0094] S50: Calculate the meta-gradient based on the updated policy network, and correct the meta-gradient by combining it with the task relevance matrix; update the global parameters based on the corrected meta-gradient.
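The procedure above resembles the standard REINFORCE estimator. The sketch below illustrates it under stated assumptions: the cumulative reward is taken as the discounted return from each time step to the end of the trajectory, the gradient term is the gradient of the log probability multiplied by that return, and the policy is a small softmax policy with linear logits chosen only so the gradient can be written in closed form. None of these specifics are confirmed by the source.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma):
    """Return, for each step t, the sum over k >= t of gamma**(k - t) * r_k."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def performance_gradient(theta, trajectories, gamma):
    """Sum of grad log pi(a_t | s_t) * return_t over all steps of all trajectories.

    theta[b][j] is the weight of feature j in the logit of action b (assumed policy form).
    """
    n_actions, n_features = len(theta), len(theta[0])
    grad = [[0.0] * n_features for _ in range(n_actions)]
    for states, actions, rewards in trajectories:
        returns = discounted_returns(rewards, gamma)
        for s, a, ret in zip(states, actions, returns):
            probs = softmax([sum(w * x for w, x in zip(row, s)) for row in theta])
            for b in range(n_actions):
                # d log pi(a|s) / d theta[b][j] = (1[a == b] - pi(b|s)) * s[j]
                coeff = ((1.0 if b == a else 0.0) - probs[b]) * ret
                for j in range(n_features):
                    grad[b][j] += coeff * s[j]
    return grad


example_returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
example_grad = performance_gradient(
    theta=[[0.0], [0.0]],
    trajectories=[([[1.0]], [0], [2.0])],
    gamma=0.9,
)
```

In the single-step example the two actions start equally likely, so the chosen action's weight receives a positive gradient and the other action's weight receives an equal negative gradient.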
[0095] In this embodiment, the updated policy is used to interact with the environment again on the mini-batch tasks to collect new trajectories. A multi-level feedback mechanism is then introduced, because a single migration effect feedback may not be sufficient to fully evaluate the quality of the migration. The feedback includes: the consistency index value of the migrated data, obtained by comparing the migrated data with the source data one by one and taking the ratio of the number of matching data entries to the total number of data entries (i.e., the amount of data after migration); the integrity index value, i.e., the ratio of the number of data entries after migration to the number of data entries in the source; the migration speed index value, i.e., the ratio of the amount of data migrated to the time taken for migration; and the data utilization efficiency index value after migration, i.e., the ratio of the average response time in the target database to the baseline response time (set according to specific circumstances).
[0096] An attention mechanism is used to calculate the task relevance matrix, and the meta-gradient is calculated based on the updated policy network. The meta-gradient is then corrected by combining the multi-level feedback with the task relevance matrix, and the global parameters are updated based on the corrected meta-gradient. The specific process is as follows:
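The four feedback index values can be computed directly from their definitions. In this sketch, rows are held in dictionaries keyed by a primary key, and the function name, argument names, and units (seconds, milliseconds) are assumptions made for illustration.

```python
def feedback_metrics(source_rows, migrated_rows, seconds_taken,
                     avg_response_ms, baseline_response_ms):
    """Compute the consistency, integrity, speed, and utilization index values."""
    if not source_rows or not migrated_rows:
        raise ValueError("source_rows and migrated_rows must be non-empty")
    if seconds_taken <= 0 or baseline_response_ms <= 0:
        raise ValueError("time values must be positive")
    # Compare migrated rows with source rows one by one, matched by primary key.
    matching = sum(
        1 for key, row in migrated_rows.items() if source_rows.get(key) == row
    )
    return {
        "consistency": matching / len(migrated_rows),
        "integrity": len(migrated_rows) / len(source_rows),
        "speed": len(migrated_rows) / seconds_taken,
        "utilization": avg_response_ms / baseline_response_ms,
    }


# Hypothetical data: four source rows, three migrated rows, one of which was altered.
source_rows = {1: ("A", "clerk"), 2: ("B", "engineer"), 3: ("C", "auditor"), 4: ("D", "driver")}
migrated_rows = {1: ("A", "clerk"), 2: ("B", "engineer"), 3: ("C", "AUDITOR")}
metrics = feedback_metrics(source_rows, migrated_rows, seconds_taken=1.5,
                           avg_response_ms=80.0, baseline_response_ms=100.0)
```

Here two of the three migrated rows match the source, three of the four source rows arrived, and the target database answers in 80 percent of the baseline response time.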
[0097]
[0098] where $G'_i$ represents the corrected meta-gradient; $G_i$ represents the uncorrected meta-gradient; $C_i$ represents the consistency index value, which is the ratio of the number of data entries that match the source data after migration to the total number of data entries; the integrity index value after migration is the ratio of the number of data records after migration to the number of data records in the source; $V_i$ represents the migration speed index value, which is the ratio of the amount of data migrated to the time taken for migration; $U_i$ represents the post-migration data utilization efficiency index value, specifically the ratio of the average response time in the target database to the baseline response time; $R_{ij}$ represents the correlation between mini-batch task $T_i$ and mini-batch task $T_j$ (the correlation between different tasks can further affect the calculation of the meta-gradient, and introducing a task relevance matrix allows the meta-gradient to be adjusted more precisely); $Q_i$ represents the priority factor of mini-batch task $T_i$; $\mathrm{Priority}(T_i)$ represents the priority of the i-th mini-batch task; $\pi_{\theta'}$ represents the policy corresponding to the updated policy parameters; and the trajectory is the one generated by the policy $\pi_\theta$ interacting with the environment.
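As an illustration of how an attention mechanism could produce a task relevance matrix, the following minimal sketch assumes each mini-batch task is summarized by a feature vector and applies scaled dot-product attention with a row-wise softmax. The source states only that attention is used; the function name, the feature vectors, and the specific attention form are assumptions.

```python
import math
from typing import List, Sequence


def task_relevance_matrix(task_features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return a k x k matrix whose entry [i][j] scores the relevance of task j to task i."""
    if not task_features:
        return []
    scale = math.sqrt(len(task_features[0]))
    matrix: List[List[float]] = []
    for query in task_features:
        # Scaled dot product between task i (query) and every task j (key).
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in task_features]
        # Numerically stable softmax so each row sums to 1.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        norm = sum(exps)
        matrix.append([e / norm for e in exps])
    return matrix
```

With this choice each row is a probability distribution over tasks, so tasks with similar feature vectors receive larger relevance weights than dissimilar ones.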
[0099] The method for updating global parameters based on the corrected meta-gradient is as follows:
[0100] Calculate the arithmetic mean of the corrected meta-gradients corresponding to the k mini-batch tasks; add this arithmetic mean to the policy parameters before the update, and use the result as the global parameters.
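The update in [0100] amounts to an element-wise mean followed by an element-wise sum. A minimal sketch, assuming the parameters and each corrected meta-gradient are flat lists of equal length (the function name is hypothetical):

```python
from typing import List, Sequence


def update_global_parameters(
    theta: Sequence[float],
    corrected_meta_gradients: Sequence[Sequence[float]],
) -> List[float]:
    """Add the arithmetic mean of the k corrected meta-gradients to theta."""
    k = len(corrected_meta_gradients)
    if k == 0:
        raise ValueError("at least one corrected meta-gradient is required")
    if any(len(g) != len(theta) for g in corrected_meta_gradients):
        raise ValueError("every meta-gradient must match the parameter length")
    # Element-wise arithmetic mean over the k mini-batch tasks.
    mean = [sum(g[j] for g in corrected_meta_gradients) / k for j in range(len(theta))]
    # As stated in the text, the mean is summed directly with the parameters.
    return [p + m for p, m in zip(theta, mean)]
```

Note that the text describes a plain sum with no separate step size, so any scaling would have to be carried by the corrected meta-gradients themselves.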
[0101] S60, initialize the policy network based on the global parameters, generate a migration policy through the initialized policy network, and then migrate the source data.
[0102] In this embodiment, the policy network is initialized using the meta-updated global parameters, and a migration policy is generated that covers data extraction, transformation, and loading; in the transformation step, the extracted data is converted as required by the generated migration policy so that its format meets the requirements of the target database. Data migration is then performed in the actual environment. Alternatively, after the policy network is initialized, a portion of data is extracted from the target database as a fine-tuning dataset, a migration policy is generated for data migration, and the policy network is fine-tuned based on the migration results and feedback information (such as consistency and integrity). After fine-tuning, data migration is performed in the actual environment.
[0103] In summary, this invention improves performance in heterogeneous database data migration by introducing a forward-looking factor, a multi-level feedback mechanism, a task complexity factor, and a task priority factor. This enables the model to adapt more flexibly to changes in tasks and exhibit higher migration quality and efficiency on new tasks.
[0104] A second embodiment of the present invention provides a data migration system for heterogeneous databases based on reinforcement learning, the system comprising:
[0105] The data extraction module is configured to extract the data to be migrated from various heterogeneous databases as source data, and to use the migration tasks corresponding to each source data as source tasks.
[0106] The similarity calculation module is configured to determine the migration task corresponding to the target database as the target task; calculate the migration path similarity between the target task and each source task and sort them in descending order; and take the source tasks corresponding to the top k migration path similarities as mini-batch tasks.
[0107] The factor calculation module is configured to calculate the forward-looking factor and task complexity factor corresponding to each mini-batch task.
[0108] The policy update module is configured to calculate the policy gradient by combining the forward-looking factor and the task complexity factor; and to update the policy parameters based on the policy gradient, and then update the policy network.
[0109] The gradient correction module is configured to calculate the meta-gradient based on the updated policy network, and correct the meta-gradient in conjunction with the task relevance matrix; and update the global parameters based on the corrected meta-gradient.
[0110] The data migration module is configured to initialize a policy network based on the global parameters, generate a migration policy through the initialized policy network, and then migrate the source data.
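The behavior of the similarity calculation module described above (score each source task against the target task, sort in descending order, keep the top k) can be sketched as follows. Cosine similarity and the task feature vectors are assumptions chosen for illustration; the text does not fix a particular similarity measure here, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_mini_batch_tasks(
    target_features: Sequence[float],
    source_features: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k source tasks with the highest migration path similarity."""
    scored = [
        (name, cosine_similarity(target_features, features))
        for name, features in source_features.items()
    ]
    # Sort by similarity in descending order, then keep the top k.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A call such as `select_mini_batch_tasks([1.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}, 2)` keeps tasks `a` and `c`, whose feature vectors point most nearly in the direction of the target task.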
[0111] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working process and related descriptions of the system described above can be found in the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0112] It should be noted that the data migration system for heterogeneous databases based on reinforcement learning provided in the above embodiments is only an example of the division of the above functional modules. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the modules or steps in the embodiments of the present invention can be further decomposed or combined. For example, the modules in the above embodiments can be merged into one module, or further divided into multiple sub-modules to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the various modules or steps and are not considered as an improper limitation of the present invention.
[0113] A data migration device for heterogeneous databases based on reinforcement learning according to a third embodiment of the present invention includes at least one processor and a memory communicatively connected to at least one of the processors; wherein the memory stores instructions executable by the processor, the instructions being executed by the processor to implement the above-described data migration method for heterogeneous databases based on reinforcement learning.
[0114] A fourth embodiment of the present invention provides a computer-readable storage medium storing computer instructions, which are executed by the computer to implement the above-described data migration method for heterogeneous databases based on reinforcement learning.
[0115] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working processes and related descriptions of the data migration device and computer-readable storage medium for heterogeneous databases based on reinforcement learning described above can be found in the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0116] Those skilled in the art will recognize that the modules and method steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. The programs corresponding to the software modules and method steps can be placed in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, removable disks, CD-ROMs, or any other form of storage medium known in the art. To clearly illustrate the interchangeability of electronic hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in electronic hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the invention.
[0117] The terms “first,” “second,” “third,” etc., are used to distinguish similar objects, not to describe or indicate a specific order or sequence.
[0118] The technical solution of the present invention has been described above with reference to the preferred embodiments shown in the accompanying drawings. However, it will be readily understood by those skilled in the art that the scope of protection of the present invention is obviously not limited to these specific embodiments. Without departing from the principles of the present invention, those skilled in the art can make equivalent changes or substitutions to the relevant technical features, and the technical solutions after these changes or substitutions will all fall within the scope of protection of the present invention.
Claims
1. A data migration method for heterogeneous databases based on reinforcement learning, characterized in that the method includes:
S10, extract the data to be migrated from each heterogeneous database as source data, and take the migration task corresponding to each source data as a source task;
S20, determine the migration task corresponding to the target database as the target task; calculate the migration path similarity between the target task and each source task and sort the similarities in descending order; take the source tasks corresponding to the top k migration path similarities as mini-batch tasks;
S30, calculate the forward-looking factor and task complexity factor corresponding to each mini-batch task;
S40, calculate the policy gradient by combining the forward-looking factor and the task complexity factor; update the policy parameters based on the policy gradient, and then update the policy network;
S50, calculate the meta-gradient based on the updated policy network, and correct the meta-gradient by combining it with the task relevance matrix; update the global parameters based on the corrected meta-gradient;
S60, initialize the policy network based on the global parameters, generate a migration policy through the initialized policy network, and then migrate the source data;
the forward-looking factor is calculated as follows: [formula not reproduced]; where the terms denote, in order: the forward-looking factor corresponding to the i-th mini-batch task, which represents the impact of the current task on future tasks; the start time of the task; the last time point within a future period; the migration demand data at the current time point, predicted from historical data using a deep learning model; and the attenuation rate;
the method for updating the policy parameters based on the policy gradient is as follows: [formulas not reproduced]; where the terms denote, in order: the updated policy parameters; the policy parameters before the update; the policy gradient; the migration path similarity; the time decay factor; the task complexity factor; the performance gradient of the i-th mini-batch task; the expected return; the policy corresponding to the policy parameters; and the trajectory generated by the policy interacting with the environment.
2. The data migration method for heterogeneous databases based on reinforcement learning according to claim 1, characterized in that, The method for calculating the similarity of the migration paths between the target task and each source task is as follows: The feature vectors of the target task and the source task are extracted respectively, and used as the first feature vector and the second feature vector; The first feature vector and the second feature vector are standardized respectively. The similarity between the standardized first feature vector and the standardized second feature vector is calculated using a similarity calculation method, and this similarity is used as the migration path similarity.
3. The data migration method for heterogeneous databases based on reinforcement learning according to claim 1, characterized in that the task complexity factor is calculated as follows: extract the sub-feature vectors corresponding to the i-th mini-batch task; calculate the weight corresponding to each sub-feature vector, and weight the corresponding sub-feature vector using that weight; sum the weighted sub-feature vectors to obtain the weighted feature sum of the i-th mini-batch task, which is used as the first summation result; calculate the ratio of the first summation result to the second summation result and use it as the task complexity factor, where the second summation result is the maximum weighted feature sum over all mini-batch tasks.
4. The data migration method for heterogeneous databases based on reinforcement learning according to claim 1, characterized in that the meta-gradient is corrected by combining it with the task relevance matrix as follows: [formulas not reproduced]; where $G'_i$ represents the corrected meta-gradient; $G_i$ represents the uncorrected meta-gradient; $C_i$ represents the consistency index value, which is the ratio of the number of data entries that match the source data after migration to the total number of data entries; the integrity index value after migration is the ratio of the number of data records after migration to the number of data records in the source database; $V_i$ represents the migration speed index value, which is the ratio of the amount of data migrated to the time taken for migration; $U_i$ represents the post-migration data utilization efficiency index value, specifically the ratio of the average response time in the target database to the baseline response time; $R_{ij}$ represents the correlation between mini-batch task $T_i$ and mini-batch task $T_j$; $Q_i$ represents the priority factor of mini-batch task $T_i$; $\mathrm{Priority}(T_i)$ represents the priority of the i-th mini-batch task; $\pi_{\theta'}$ represents the policy corresponding to the updated policy parameters; and the trajectory is the one generated by the policy interacting with the environment.
5. The data migration method for heterogeneous databases based on reinforcement learning according to claim 4, characterized in that the method for updating the global parameters based on the corrected meta-gradient is as follows: calculate the arithmetic mean of the corrected meta-gradients corresponding to the k mini-batch tasks; add this arithmetic mean to the policy parameters before the update, and use the result as the global parameters.
6. A data migration system for heterogeneous databases based on reinforcement learning, characterized in that the system includes:
a data extraction module, configured to extract the data to be migrated from each heterogeneous database as source data, and to take the migration task corresponding to each source data as a source task;
a similarity calculation module, configured to determine the migration task corresponding to the target database as the target task; calculate the migration path similarity between the target task and each source task and sort the similarities in descending order; and take the source tasks corresponding to the top k migration path similarities as mini-batch tasks;
a factor calculation module, configured to calculate the forward-looking factor and task complexity factor corresponding to each mini-batch task;
a policy update module, configured to calculate the policy gradient by combining the forward-looking factor and the task complexity factor; and to update the policy parameters based on the policy gradient, and then update the policy network;
a gradient correction module, configured to calculate the meta-gradient based on the updated policy network, and correct the meta-gradient by combining it with the task relevance matrix; and to update the global parameters based on the corrected meta-gradient;
a data migration module, configured to initialize the policy network based on the global parameters, generate a migration policy through the initialized policy network, and then migrate the source data;
the forward-looking factor is calculated as follows: [formula not reproduced]; where the terms denote, in order: the forward-looking factor corresponding to the i-th mini-batch task, which represents the impact of the current task on future tasks; the start time of the task; the last time point within a future period; the migration demand data at the current time point, predicted from historical data using a deep learning model; and the attenuation rate;
the method for updating the policy parameters based on the policy gradient is as follows: [formulas not reproduced]; where the terms denote, in order: the updated policy parameters; the policy parameters before the update; the policy gradient; the migration path similarity; the time decay factor; the task complexity factor; the performance gradient of the i-th mini-batch task; the expected return; the policy corresponding to the policy parameters; and the trajectory generated by the policy interacting with the environment.
7. An electronic device for data migration of heterogeneous databases based on reinforcement learning, characterized in that, The electronic device includes: At least one processor, and a memory communicatively connected to at least one of the processors; The memory stores instructions that can be executed by the processor to implement the data migration method for heterogeneous databases based on reinforcement learning as described in any one of claims 1-5.
8. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores computer instructions that are executed by a computer to implement the data migration method for heterogeneous databases based on reinforcement learning as described in any one of claims 1-5.