A method and apparatus for DDL synchronization in group execution
By categorizing transactions into ordinary groups and DDL groups during database synchronization, creating dependent database objects first, and then executing transactions in the DDL group, the problem of unmet dependencies during group execution is resolved, improving synchronization success rate and consistency, and reducing error risk.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- WUHAN DAMENG DATABASE
- Filing Date
- 2023-07-14
- Publication Date
- 2026-06-30
Smart Images

Figure CN116991939B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of database synchronization technology, and in particular to a method and apparatus for DDL synchronization performed in groups. Background Technology
[0002] Currently, in real-time database data synchronization based on log parsing, a grouped synchronization approach is often used to ensure database synchronization performance and improve parallelism. One method for implementing target-side data synchronization is to use a group execution strategy, splitting and classifying transactions by database object. Under this grouping approach, transactions between different database objects are no longer related, thus allowing for parallel execution using multithreading to improve synchronization efficiency. However, this group execution method cannot guarantee the order of parallel execution. If dependencies exist between different database objects, it may cause synchronization DDL (Data Definition Language) operations to fail because some DDL operations may not have yet created their dependent database objects when they are executed. These database objects include tables, indexes, views, charts, default values, rules, triggers, users, functions, etc. In this case, the DDL operation cannot be synchronized, the database reports a corresponding error, and subsequent database operations associated with this DDL operation also cannot be synchronized, resulting in the loss of synchronization objects.
[0003] Therefore, overcoming the shortcomings of the existing technology is an urgent problem to be solved in this technical field. Summary of the Invention
[0004] To address the aforementioned deficiencies or improvement needs of existing technologies, this invention provides a group-based DDL synchronization method and apparatus. Its purpose is to, when performing real-time database synchronization using a log-parsing-based group synchronization method, execute transactions in different groups in parallel on the target end based on the transaction's commit LSN at the source end. This ensures that DDL operations that do not find their dependent database objects are successfully synchronized, thus resolving the problem that unsuccessful synchronization of database object DDL operations may cause errors in the synchronization service.
[0005] The present invention adopts the following technical solution:
[0006] In a first aspect, the present invention provides a method for DDL synchronization executed in groups, the method comprising:
[0007] Receive log information, classify the log information into multiple transactions, and classify the transactions into at least one ordinary group or DDL group; wherein, the transaction classified into the ordinary group is the first transaction, and the transaction classified into the DDL group is the second transaction;
[0008] Retrieve the first committed but not yet entered into the database from each ordinary group, and enter the corresponding first transaction into the database;
[0009] Based on the first transaction data entry of each ordinary group, set the latest LSN for each ordinary group;
[0010] The second transaction that has been committed but not yet entered into the database is entered into the database based on the latest LSN of each ordinary group.
[0011] Further, receiving log information, classifying the log information into multiple transactions, and classifying the transactions into at least one ordinary group or DDL group includes:
[0012] The log information is parsed to obtain at least one database operation;
[0013] Determine whether each database operation includes table information;
[0014] If the table information is included, the database operation is categorized into the corresponding ordinary group according to the database object on which the database operation depends, wherein one ordinary group corresponds to one database object;
[0015] If the table information is not included, the database operation is categorized into the DDL group.
[0016] Furthermore, determining whether each database operation contains table information includes:
[0017] If, after top-level parsing, the database object on which the database operation depends is still not found, then the database operation does not include the table information.
[0018] If the database object that the database operation depends on has been found after top-level parsing, then the database operation includes the table information.
[0019] Furthermore, setting the latest LSN for each ordinary group based on the first transaction entry status of each ordinary group includes:
[0020] For a normal group, the first transaction corresponding to each group is entered into the database in the order of the LSNs committed.
[0021] Obtain the commit LSN of the first transaction that has been entered into the database, and use the commit LSN of the first transaction that has been entered into the database as the latest LSN;
[0022] When all the first transactions already stored in the ordinary group are completed and entered into the database, the latest LSN of the ordinary group is set to the first preset value.
[0023] Furthermore, for a regular group, the step of storing the corresponding first transaction into the database according to the order of LSN submission includes:
[0024] Multiple ordinary worker threads are created, each of which corresponds to an ordinary group, and the multiple ordinary worker threads are executed in parallel.
[0025] Each of the aforementioned ordinary worker threads sequentially inserts the first transaction in the ordinary group it is responsible for into the database.
[0026] Further, the step of adding the second transaction that has been committed but not yet entered into the database in the DDL group according to the latest LSN of each ordinary group includes:
[0027] Obtain the latest LSN for each of the aforementioned ordinary packets;
[0028] Sort the latest LSNs of each of the ordinary groups to obtain the minimum LSN;
[0029] Based on the minimum LSN, decide whether to import the second transaction in the DDL group into the database.
[0030] Further, the step of selecting whether to import the second transaction in the DDL group into the database based on the minimum LSN includes:
[0031] Based on the minimum LSN, iterate through all the commit LSNs corresponding to the second transactions in the DDL group, and determine whether there are any second transactions whose corresponding commit LSN is less than the minimum LSN.
[0032] If it exists, then the corresponding second transaction will be entered into the database;
[0033] If it does not exist, the second transaction in the DDL group will not be entered into the database.
[0034] Furthermore, the DDL synchronization method for group execution also includes:
[0035] Create a timed scheduling thread, determine a first preset time, and execute the timed scheduling thread at intervals of the first preset time.
[0036] The timed scheduling thread will add the second transaction that has been committed but not yet entered into the database in the DDL group according to the latest LSN of each ordinary group.
[0037] Furthermore, the DDL synchronization method for group execution also includes:
[0038] Record the first commit LSN and first commit number corresponding to the first transaction that has been entered into the database;
[0039] Record the second commit LSN and second commit number corresponding to the second transaction that has been entered into the database;
[0040] During fault recovery, the first transaction that has been completed and inserted into the database is filtered based on the first commit LSN and the first commit number; the second transaction that has been completed and inserted into the database is filtered based on the second commit LSN and the second commit number.
[0041] In a second aspect, the present invention also provides a DDL synchronization device for group execution, used to implement the DDL synchronization method for group execution described in the first aspect, wherein the DDL synchronization device for group execution includes:
[0042] At least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor for performing the DDL synchronization method of group execution described in the first aspect.
[0043] Thirdly, the present invention also provides a non-volatile computer storage medium storing computer-executable instructions that are executed by one or more processors to perform the DDL synchronization method of group execution described in the first aspect.
[0044] Unlike existing technologies, the present invention has at least the following beneficial effects:
[0045] This invention sets up DDL groups and at least one ordinary group. Transactions without found dependent database objects are categorized into DDL groups, while transactions with found dependent database objects are categorized into the corresponding ordinary groups. Transactions in ordinary groups are first entered into the database, while transactions in DDL groups are not entered until the database objects dependent on by the transactions in the DDL groups are determined based on the latest LSN of each ordinary group and are created. This avoids synchronization failures for such transactions and allows subsequent database synchronization to continue based on this synchronization. This solves the problem that unsuccessful synchronization of database object DDL operations may cause errors in the synchronization service, greatly reducing the error risk that may be caused by the group execution method for database synchronization and broadening the application scenarios of the group execution method. Attached Figure Description
[0046] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the embodiments of the present invention will be briefly described below. Obviously, the drawings described below are merely some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without any creative effort.
[0047] Figure 1 This is a flowchart illustrating a DDL synchronization method for group execution provided in an embodiment of the present invention;
[0048] Figure 2 This is a schematic diagram of the specific process of step 10 in an embodiment of the present invention;
[0049] Figure 3 This is a schematic diagram of the specific process of step 30 in an embodiment of the present invention;
[0050] Figure 4 This is a schematic diagram of the specific process of step 301 in an embodiment of the present invention;
[0051] Figure 5 This is a schematic diagram of the specific process of step 40 in an embodiment of the present invention;
[0052] Figure 6 This is a schematic diagram of the specific process of step 403 in an embodiment of the present invention;
[0053] Figure 7 This is a schematic diagram of the architecture of the DDL synchronization device for group execution provided in an embodiment of the present invention. Detailed Implementation
[0054] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention.
[0055] In the description of this invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are only for the convenience of describing this invention and do not require that this invention must be constructed and operated in a specific orientation. Therefore, they should not be construed as limiting this invention.
[0056] In this invention, the terms "first," "second," etc., are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, a feature defined with "first," "second," etc., may explicitly or implicitly include one or more of that feature. In the description of this application, unless otherwise stated, "a plurality of" means two or more.
[0057] Furthermore, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict with each other.
[0058] Example 1:
[0059] In the log parsing-based group synchronization method, the target database receives log information containing multiple transactions. Each transaction contains at least one database operation. Transactions are split and classified by database object and grouped into corresponding groups according to their dependent database objects. To improve synchronization efficiency, multi-threaded parallel execution of transactions in each group is used.
[0060] When the SQL (Structured Query Language) statements used in database operations do not contain information about the database objects they depend on, or when the syntax is too complex to easily resolve dependencies, the target end still cannot find the dependent database objects after top-level parsing. Therefore, it is impossible to split and categorize these operations by database object. Under current grouping strategies, these database operations will be grouped into different groups and executed in parallel but out of order, without guaranteeing the execution order. These types of database operations are all DDL operations, such as those involving views, functions, and triggers.
[0061] Top-level parsing refers to parsing the DDL-type SQL statements captured in the logs and extracting all second-level database object information involved in the parsed syntax tree. However, it does not perform secondary parsing on the definitions of these second-level database objects, thus obtaining the second-level database objects that the current database operation depends on. Since the definitions of database objects below the first level are all in the source database, if the target database synchronization service needs to perform parsing below the first level during log parsing, such as parsing the third-level database objects that the second-level database objects depend on, it needs to use the object name of the corresponding second-level database object to query the system table in the source database to obtain the SQL statement that creates the second-level database object. This SQL statement is then parsed, and all third-level database object information involved in the parsed syntax tree is extracted to obtain the definition of the second-level database object. In practical use cases, the number of database objects contained in complex SQL statements is unpredictable. Parsing each second-level database object incurs a high cost in interaction with the source database, and the cost of parsing all second-level database objects is uncontrollable. Therefore, usually only top-level parsing is performed on database operations to ensure the performance of the synchronization service.
[0062] For database operations that fail to find their dependent database objects, it's possible that the dependent database objects haven't been created when the operation is executed, leading to a series of database errors. This invention primarily uses the common scenario of tables as database objects for illustration. For cases where database objects are indexes, views, triggers, etc., those skilled in the art can refer to this invention for implementation.
[0063] To address the aforementioned problems, Embodiment 1 of the present invention provides a DDL synchronization method for group execution, such as... Figure 1 As shown, the DDL synchronization method for group execution includes:
[0064] Step 10: Receive log information, classify the log information into multiple transactions, and classify the transactions into at least one ordinary group or DDL group; wherein, the transaction classified into the ordinary group is the first transaction, and the transaction classified into the DDL group is the second transaction.
[0065] A transaction is a collection of database operations. Log information contains multiple database operations. The DDL synchronization method of this embodiment, after receiving log information at the target end, categorizes multiple log messages into multiple transactions based on the included database operations. If the database operation contained in the log information can be parsed by the target end to determine its dependent database object, it is classified into at least one corresponding first transaction based on the dependent database object, and then the corresponding first transaction is categorized into at least one ordinary group. If the database operation contained in the log information cannot be parsed by the target end to determine its dependent database object, it is uniformly classified into a second transaction, and then the second transaction is uniformly categorized into a DDL group. To maintain database consistency, the DDL synchronization method of this embodiment distinguishes the database operations to be synchronized based on their dependent database objects before synchronization. This ensures that during synchronization, the database operations that create the corresponding database object are executed first, followed by the database operations that depend on that database object. This avoids database errors that might occur when multiple database operations are executed in parallel because the dependent database objects have not yet been created.
[0066] Step 20: Retrieve the first committed transaction that has not yet been entered into the database from each ordinary group, and enter the corresponding first transaction into the database.
[0067] In this invention, "database insertion" involves executing a database operation on the target database to achieve database synchronization. Since the database operations in the DDL group, even after top-level parsing, still haven't had their dependent database objects parsed by the target, the DDL synchronization method in this embodiment first executes the corresponding first transaction in the ordinary group on the target, i.e., creating the corresponding database object on the target. Because the log information received by the target includes all log information generated by the source when executing database operations, it contains log information of database operations that are currently being executed but not yet committed. To maintain database consistency, synchronization is based on commits; the target only synchronizes database operations that have been committed on the source but not yet synchronized on the target.
[0068] Step 30: Based on the first transaction entry status of each ordinary group, set the latest LSN for each ordinary group.
[0069] In a normal group corresponding to a certain database object, after the first transaction is completed and inserted into the database, the commit LSN of the first transaction is used as the latest LSN of the normal group, so that the timing of the second transaction insertion can be determined based on the latest LSN.
[0070] Step 40: Based on the latest LSN of each ordinary group, add the second transaction that has been committed but not yet entered into the database in the DDL group.
[0071] In this embodiment of the invention, the DDL synchronization method executes the corresponding first transaction in a normal group on the target end, obtains the corresponding database object, and then executes the second transaction in the DDL group that depends on the database object. For example, the target end executes a first transaction in a normal group, creating a database object, specifically a table named 'test', on the target end. Only then can the second transaction in the DDL group, which depends on the table 'test', be executed.
[0072] To better illustrate the training method of the present invention, step 10 of the DDL synchronization method for group execution in the embodiment of the present invention will be further refined below. Specifically, as follows: Figure 2 As shown, step 10 includes:
[0073] Step 101: Parse the log information to obtain at least one database operation.
[0074] The log information originates from the source end. A database log parsing service is deployed at the source end. The source end generates log information based on the database operations executed at the source end. By parsing the SQL statements of the database operations, the source end obtains the database objects involved in the database operations and then uses the names of these database objects as table information to populate the corresponding database operation's log information. If the SQL statement of the database operation does not contain a database object, or if the SQL statement's syntax is complex and the database object it depends on is still not found after top-level parsing, then the table information is not populated. The log information is then sent to the target end.
[0075] Deploy a database data synchronization service on the target end and parse at least one SQL statement for database operation from the log information generated on the source end.
[0076] Step 102: Determine whether each database operation contains table information.
[0077] At the target end, step 102 includes:
[0078] If, after top-level parsing, the database object on which the database operation depends is still not found, then the database operation does not include the table information; if, after top-level parsing, the database object on which the database operation depends is found, then the database operation includes the table information.
[0079] Step 103: If the table information is included, the database operation is classified into the corresponding ordinary group according to the database object on which the database operation depends, wherein one ordinary group corresponds to one database object.
[0080] Because the source end divides database operations into multiple transactions based on data usage requirements, with each transaction involving multiple database objects, this division is maintained and sent to the target end. On the target end, using a single thread to execute database operations in the order they were executed on the source end avoids database errors, but is extremely inefficient and unsuitable for database synchronization in actual production. Using multi-threaded parallel execution, however, makes it impossible to determine the execution order of the different database objects each operation depends on, which can easily lead to database errors.
[0081] The DDL synchronization method using group execution in this embodiment of the invention requires that the database objects dependent on transactions within the same ordinary group be kept separate. Therefore, the received database operations need to be divided into multiple transactions on the target end based on the dependent database objects, so that they can be categorized into different groups for parallel execution and achieve database synchronization. Specifically, since the source end divides database operations into multiple source-end transactions based on data usage requirements, to preserve these usage requirements, based on the source-end transaction division, when a source-end transaction involves multiple database objects, the target end splits that source-end transaction into multiple source-end sub-transactions, using the dependent database objects as units. Each source-end sub-transaction is treated as a single target-end transaction and is not merged with other transactions that depend on the same database objects.
[0082] The target end distinguishes each transaction based on the database objects that the database operations depend on. If the database operations contained in the log information can be parsed by the target end to determine their dependent database objects, then the transaction is classified into at least one corresponding first transaction based on the dependent database objects. Within each first transaction, the database operations depend on the same database objects. First transactions with the same dependencies are then grouped into the same ordinary group.
[0083] Step 104: If the table information is not included, then classify the database operation into the DDL group.
[0084] If, after top-level analysis, the database object upon which the database operation depends is still not found, it is categorized into the second transaction. The second transactions are then grouped into DDL groups. Database operations that do not contain table information are generally DDL operations that do not involve multiple database objects and do not involve transaction splitting.
[0085] During database synchronization, DDL operations related to tables, indexes, and partitioned tables can be performed by following the regular groupings of the relevant database objects, i.e., executing them along with the relevant tables. This embodiment of the invention addresses DDL operations that do not contain table information by adding a separate DDL group on the target side. This group stores and executes such DDL operations uniformly, solving the problem in existing technologies where, when these DDL operations are categorized into different regular groups and executed in parallel without order, the uncertain execution order can lead to database errors, potentially resulting in dependent database objects not being created.
[0086] Since database synchronization generally involves DDL and DML (Data Manipulation Language) operations, and the database operations contained in the log information also include DML operations (insert, update, and delete data operations), all of which are database operations that can resolve dependent database objects, meaning the corresponding log information contains table information, in an optional embodiment, during database synchronization, the DML operations in the log information on the target end are categorized as the first transaction corresponding to the corresponding database object. This first transaction is then categorized into the ordinary group corresponding to the corresponding database object, and subsequent DDL operations in the ordinary group are executed together with the first transaction in the commit LSN order.
[0087] After the target end categorizes the database operations in the log information based on the dependent database objects, since the database objects dependent on the first transaction in the ordinary group are known, the data can be inserted into the database on the target end according to the execution order of the source end. After the first transaction begins insertion, in order to insert the corresponding second transaction in the DDL group into the database according to the insertion status of each ordinary group, the latest LSN of each ordinary group is set. Specifically, as follows: Figure 3 As shown, step 30 includes:
[0088] Step 301: For a normal group, the first transaction corresponding to it is entered into the database in the order of LSN submission.
[0089] Step 302: Obtain the commit LSN of the first transaction that has been entered into the database, and use the commit LSN of the first transaction that has been entered into the database as the latest LSN.
[0090] Step 303: When all the first transactions already stored in the ordinary group have been completed and entered into the database, set the latest LSN of the ordinary group to the first preset value.
[0091] The first preset value is a value that is unlikely to be involved in the database submission LSN, and is set by those skilled in the art based on the specific database. In an optional embodiment, the first preset value is set to infinity (MAX) so that all subsequent submitted LSN values are less than the first preset value.
[0092] In a normal group corresponding to a certain database object, after the first transaction is completed and inserted into the database, the commit LSN of the first transaction is used as the latest LSN of the normal group, so that the timing of the second transaction insertion can be determined based on the latest LSN.
[0093] According to the DDL synchronization method for group execution according to an embodiment of the present invention, the latest LSN of each ordinary group will be used to determine whether the corresponding database object has been created. Then, the second transaction that depends on the database object will be executed. When there are no committed but not yet entered transactions in an ordinary group, the latest LSN of the ordinary group will be set to infinity, indicating that all database objects in the ordinary group have been created. All database operations in the DDL group can be executed. During execution, there will be no dependence on the database objects in the ordinary group, but there may still be some database objects that have not been created or have changed.
[0094] like Figure 4 As shown, step 301 includes:
[0095] Step 3011: Create multiple ordinary worker threads, each of which corresponds to an ordinary group, and the multiple ordinary worker threads execute in parallel.
[0096] Step 3012: Each of the ordinary worker threads enters the first transaction in the ordinary group it is responsible for into the database in sequence.
[0097] On the target side, a multi-threaded, parallel execution strategy is adopted among ordinary groups. Only within the same ordinary group, different first transactions are executed in the order they were committed on the source side. Those skilled in the art can set up multiple data synchronization execution threads on the target side according to the usage requirements. These execution threads are responsible for polling all ordinary groups and retrieving the first transactions that have been committed on the source side but not yet entered into the database on the target side.
[0098] Multiple first transactions within different ordinary groups are executed in parallel. The first transaction within each ordinary group is executed on the target side according to its commit LSN order at the source. First transactions between different ordinary groups are executed out of order. Since step 10 categorizes the resolvable database operations contained in the log information into different ordinary groups based on their dependent database objects after receiving the log information, the first transactions within the same ordinary group depend on the same database objects. Therefore, synchronization on the target side follows the execution order at the source, preventing the execution of database operations before the dependent database objects have been created. Furthermore, the different first transactions within different ordinary groups depend on completely different database objects and are not related. When multiple ordinary threads are used to concurrently insert the first transactions from different groups into the database, the situation of executing database operations before the dependent database objects have been created will also not occur.
[0099] After the target end executes part of the first transaction in the normal group, it obtains the corresponding database object. At this time, executing the second transaction, which was committed later than a certain first transaction on the source end, can avoid database errors, regardless of whether the second transaction depends on the database object obtained by the first transaction. To better illustrate the training method of the present invention, step 40 of the DDL synchronization method for group execution in this embodiment of the present invention will be further refined below. Specifically, as follows... Figure 5 As shown, step 40 includes:
[0100] Step 401: Obtain the latest LSN for each of the aforementioned ordinary packets. The first preset time can be set by those skilled in the art according to the specific application scenario requirements, and is not specifically limited here.
[0101] Step 402: Sort the latest LSNs of each of the ordinary groups to obtain the minimum LSN.
[0102] Step 403: Based on the minimum LSN, select whether to import the second transaction in the DDL group into the database.
[0103] A timed scheduling thread is created, a first preset time is determined, and the timed scheduling thread is executed at intervals of the first preset time. The timed scheduling thread enters the second transaction that has been committed but not yet entered into the database in the DDL group according to the latest LSN of each ordinary group.
[0104] Under the grouped execution strategy on the target end, the execution order of the first and second transactions of different database objects is uncertain, depending on polling and execution status. However, some second transactions may depend on the completion of other first transactions. These dependencies may be very complex or involve nested calls, making it inconvenient to directly resolve dependencies. For example, second transaction 4 depends on first transactions 1 and 2. Only after these two first transactions are completed can second transaction 4 execute successfully. There may be cases where some first transactions depend on database objects created by second transactions. Since the DDL synchronization method of this embodiment executes sequentially according to the commit LSN at the source end, all ordinary groups of the first transaction depend on the database objects created by the second transaction. The commit LSNs of these two transactions generally differ significantly. Before the second transaction is entered into the database, these ordinary groups will not execute, avoiding database errors caused by missing dependencies. When executing in parallel with multiple threads, the first preset time of the timed scheduling thread needs to be set with reference to this influencing factor to improve the efficiency of database synchronization while ensuring successful execution.
[0105] This invention establishes a correlation between the execution status of each first transaction in a regular group and the execution status of the corresponding second transaction in a DDL group by setting a latest LSN for each regular group. Subsequently, by comparing the latest LSN and the commit LSN of the second transaction at the source, the execution order at the source can be determined. In the DDL synchronization method of this invention, under multi-threaded parallel execution, execution in this order will not cause synchronization errors. The specific steps for determining whether to insert the second transaction into the database based on the latest LSN are described below.
[0106] To better illustrate the training method of the present invention, step 403 of the DDL synchronization method for group execution in this embodiment will be further refined below, such as... Figure 6 As shown, step 403 includes:
[0107] Step 4031: Based on the minimum LSN, traverse all commit LSNs corresponding to the second transactions in the DDL group and determine whether there are any second transactions with a corresponding commit LSN less than the minimum LSN.
[0108] Step 4032: If it exists, then enter the corresponding second transaction into the database.
[0109] Step 4033: If it does not exist, then the second transaction in the DDL group will not be entered into the database.
[0110] Each time the scheduled thread executes, it obtains the current minimum LSN. Based on this minimum LSN, it iterates through all second transactions in the DDL group. If the commit LSN of a second transaction is less than the minimum LSN, it means that the execution order on the source side is: first, the database operations in this second transaction are executed, and then the database operations corresponding to the minimum LSN are executed. Therefore, the database objects that the database operations in this second transaction depend on have already been created or are in the same changed state as when the database operations corresponding to the minimum LSN are executed. Executing this second transaction at this time avoids database errors or even serious database operation errors when the database operations are synchronized on the target side because the dependent database objects have not yet been created. Step 403 is executed repeatedly until all second transactions in the DDL group have been completed and entered into the database, thus realizing the DDL synchronization method of group execution in this embodiment of the invention.
[0111] It is worth noting that when the first and second transactions are written to the database, the first commit LSN and first commit number corresponding to the first transaction that has been written to the database are recorded. The second commit LSN and second commit number corresponding to the second transaction that has been written to the database are also recorded.
[0112] During fault recovery, the first transaction that has been completed and inserted into the database is filtered based on the first commit LSN and the first commit number; the second transaction that has been completed and inserted into the database is filtered based on the second commit LSN and the second commit number.
[0113] The commit number represents the order in which the first or second transaction is written to the database on the target end. Each regular group and DDL group, when executing a transaction, also needs to record the commit LSNs of the first and second transactions already written to the database locally on the target end. In the event of a failure on the target end, the first and second transactions corresponding to the commit LSNs are determined based on the locally recorded commit LSNs. Following the order of the commit numbers, the first and second transactions already written to the database after the target end failure are filtered out, and recovery is performed on the correct first and second transactions to ensure data consistency between the source and target databases.
[0114] Example 2:
[0115] Based on Embodiment 1 above, this embodiment of the invention provides a specific example of a DDL synchronization method for group execution, in order to better understand the entire synchronization process. This embodiment of the invention will use the log information sequence shown in the table below as an example for illustration:
[0116]
[0117]
[0118] After the source end generates the above transactions, it sends the log information sequence containing the above transactions to the target end. At the same time, the target end receives the above log information sequence, classifies the database operations contained in the log information sequence into 5 first transactions and 2 second transactions, and the process of classifying the first transactions and second transactions is as follows:
[0119] S1: For the first transaction with transaction ID 1, classify it into the normal group T1 of table T1, and add it to the normal worker thread of table T1 for execution. The initial commit LSN of normal group T1 is 0.
[0120] S2: For the first transaction with transaction ID 2, classify it into the normal group T2 of table T2, and add it to the normal worker thread of table T2 for execution. The initial commit LSN of normal group T2 is 0.
[0121] S3: For the first transaction with transaction ID 3, classify it into the normal group T3 of table T3, and add it to the normal worker thread of table T3 for execution. The initial commit LSN of normal group T3 is 0.
[0122] S4: For the second transaction with transaction ID 4, classify the second transaction into the DDL group and add the second transaction to the DDL group to be executed by the scheduled thread.
[0123] S5: For the first transaction with transaction ID 5, classify it into the normal group T4 of table T4, and add it to the normal worker thread of table T4 for execution. The initial commit LSN of normal group T4 is 0.
[0124] S6: For the first transaction with transaction ID 6, classify the first transaction into the normal group T1 of table T1, and add the first transaction to the normal worker thread of table T1 for execution.
[0125] S7: For the second transaction with transaction ID 7, classify the second transaction into the DDL group and add the second transaction to the DDL group to be executed by the scheduled thread.
[0126] The execution order and log information reception status on the target side are shown in the table below:
[0127]
[0128] In this context, TRX 1 refers to the transaction with transaction ID 1, which belongs to the normal group T1; TRX 2 refers to the transaction with transaction ID 2, which belongs to the normal group T2; TRX 3 refers to the transaction with transaction ID 3, which belongs to the normal group T3; TRX 4 refers to the transaction with transaction ID 4, which belongs to the DDL group; TRX 5 refers to the transaction with transaction ID 5, which belongs to the normal group T4; and TRX 6 refers to the transaction with transaction ID 6, which belongs to the normal group T1.
[0129] Different ordinary groups are executed in parallel, and there is no execution order. The aforementioned time represents the order in which different ordinary groups are entered into the database in the specific instance provided in Embodiment 2 of this invention. For example, depending on the specific parallel execution order of the target end and the log information reception, ordinary group T2 is executed later than ordinary group T3.
[0130] The scheduled thread executes intermittently. When the scheduled thread executes, ordinary groups and DDL groups are executed in parallel. DDL groups begin to be inserted into the database after the conditions are met. For example, when executing TRX 4, the DDL group needs to check the latest LSN of the ordinary group containing TRX 3. Only after the conditions are met can the corresponding SQL statement be executed.
[0131] When the scheduled thread executes at regular intervals, the specific execution order of the above execution sequence is as follows:
[0132] S1: When the commit number 1 ends, all ordinary group T1 is completely stored in the database (at this time, TRX 6(32) has not yet been placed in ordinary group T1), and log information with commit LSN 24 is received. The commit number corresponding to commit LSN 22 is recorded locally as 1.
[0133] S2: When commit number 2 ends, all normal packets T3 are successfully stored in the database, and log information for commit LSN 28 is received. Locally, commit number 2 is recorded as the corresponding commit LSN 26.
[0134] At the end of time 2, the timed scheduling thread is executed. Ordinary group T2 has a first transaction waiting to be entered into the database, while ordinary groups T1 / T3 / T4 do not have a first transaction waiting to be entered into the database. Therefore, the latest LSNs of the ordinary groups are T1: MAX, T2: 0, T3: MAX, T4: MAX, and the minimum LSN is 0.
[0135] Multiple ordinary worker threads execute in parallel. In the example above, the first transaction in ordinary packet T3 with committed LSN 26 is received first, followed by the first transaction in ordinary packet T2 with committed LSN 24. The first transaction of the first received and completed packet is executed first. Since ordinary packets are divided according to the database objects they depend on (in this case, database tables), the target end executes the corresponding transactions in the order they are received, and database errors will not occur due to the database objects on which the executed database operations depend not being created.
[0136] Since the target continuously receives log information from the source for synchronization during the execution of the normal worker thread and the timed scheduling thread, after all transactions in the normal group are completed and the latest LSN of the normal group is set to MAX, if the first transaction exists in the normal group again, the latest LSN of the normal group will be updated after the first transaction is executed.
[0137] Traverse the DDL group and find that there is no second transaction in the DDL group whose committed LSN is less than the minimum LSN.
[0138] S3: When commit number 3 ends, all normal packets T2 are successfully stored in the database, and log information for commit LSN 30 is received. Locally, commit number 3 is recorded as the corresponding commit LSN 24.
[0139] When scheduling ends at time 3, ordinary group T4 has a first transaction waiting to be entered into the database, while ordinary groups T1 / T2 / T3 do not have a first transaction waiting to be entered into the database. Therefore, the latest LSNs of the ordinary groups are T1: MAX, T2: MAX, T3: MAX, T4: 0, and the minimum LSN is 0.
[0140] Traverse the DDL groups; there is no second transaction with a smaller LSN.
[0141] S4: When commit number 4 ends, receive the log information for commit LSN 32. Record locally that commit number 4 corresponds to commit LSN 30.
[0142] When scheduling ends at time 4, there is a first transaction to be entered into the database in ordinary group T1 (TRX 6(32) is put into ordinary group T1), and there is no first transaction to be entered into the database in ordinary groups T2 / T3 / T4. Therefore, the latest LSNs of ordinary groups are T1: 22, T2: MAX, T3: MAX, T4: MAX, and the minimum LSN is 22.
[0143] Traverse the DDL groups; there is no second transaction with a smaller LSN.
[0144] S5: When commit number 5 ends, receive log information for commit LSN 34. Record locally that commit number 5 corresponds to commit LSN 32.
[0145] When scheduling ends at time 5, there is no first transaction waiting to be entered into the database for ordinary groups T1 / T2 / T3 / T4. Therefore, the latest LSNs of ordinary groups are T1:MAX, T2:MAX, T3:MAX, T4:MAX, and the minimum LSN is MAX.
[0146] Traverse the DDL groups; the second transaction with a LSN less than the minimum is the second transaction (transaction 4) and the second transaction (transaction 7), and execute the write operation. On the local record, commit the commit number corresponding to LSN 28 as 6. On the local record, commit the commit number corresponding to LSN 34 as 7.
[0147] At this point, the first and second transactions in the four ordinary groups and the DDL group have been entered into the database.
[0148] After transaction 7 completes its data insertion, the target database experiences a failure. Following the failure, the target database only contains the result of the first transaction (ID 2). Based on the commit LSNs (22, 24, 26, 28, 30, 32, and 34) in the local records, the IDs of the first transactions corresponding to these commit LSNs are determined to be 1, 2, 3, 5, and 6, and the IDs of the second transactions are 4 and 7. Since the target database currently contains the result of the first transaction (ID 2), this transaction is filtered out. Transactions with IDs 1, 3, 5, and 6, and transactions with IDs 4 and 7 are then recovered and re-inserted into the database.
[0149] Example 3:
[0150] like Figure 7 The diagram shown is an architectural schematic of a DDL synchronization device for group execution according to an embodiment of the present invention. The DDL synchronization device for group execution in this embodiment includes one or more processors 31 and a memory 32. Figure 7 Take a processor 31 as an example.
[0151] Processor 31 and memory 32 can be connected via a bus or other means. Figure 7 Taking the example of a connection between China and Israel via a bus.
[0152] The memory 32, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs and non-volatile computer-executable programs, such as the grouped execution DDL synchronization method in Embodiment 1. The processor 31 executes the grouped execution DDL synchronization method by running the non-volatile software program and instructions stored in the memory 32.
[0153] Memory 32 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 32 may optionally include memory remotely located relative to processor 31, which can be connected to processor 31 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
[0154] The program instructions / modules are stored in the memory 32. When executed by one or more processors 31, they perform the DDL synchronization method of group execution described in Embodiment 1 above, for example, the method described above. Figures 1-6 The steps shown.
[0155] It is worth noting that the information interaction and execution process between the modules and units in the above-mentioned device and system are based on the same concept as the processing method embodiment of the present invention. For details, please refer to the description in the method embodiment of the present invention, and will not be repeated here.
[0156] Those skilled in the art will understand that all or part of the steps in the various methods of the embodiments can be implemented by a program instructing related hardware. The program can be stored in a computer-readable storage medium, which may include: read-only memory (ROM), random access memory (RAM), magnetic disk or optical disk, etc.
[0157] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A method for DDL synchronization in group execution, characterized in that, The DDL synchronization method for group execution includes: Receive log information, classify the log information into multiple transactions, and classify the transactions into at least one ordinary group or DDL group; wherein, the transaction classified into the ordinary group is the first transaction, and the transaction classified into the DDL group is the second transaction; Receiving log information and classifying it into multiple transactions, and classifying the transactions into at least one ordinary group or DDL group, includes: parsing the log information to obtain at least one database operation; determining whether each database operation contains table information; if it contains table information, classifying the database operation into the corresponding ordinary group according to the database object it depends on, wherein one ordinary group corresponds to one database object; if it does not contain table information, classifying the database operation into the DDL group. Retrieve the first committed but not yet entered into the database from each ordinary group, and enter the corresponding first transaction into the database; Based on the first transaction data entry of each ordinary group, set the latest LSN for each ordinary group; Obtain the latest LSN of each of the ordinary groups; sort the latest LSNs of each of the ordinary groups to obtain the minimum LSN; based on the minimum LSN, select whether to insert the second transaction in the DDL group into the database; The step of selecting whether to include the second transaction in the DDL group in the database based on the minimum LSN includes: according to the minimum LSN, traversing all the commit LSNs corresponding to the second transactions in the DDL group, and determining whether there is a second transaction whose corresponding commit LSN is less than the minimum LSN; if there is, then the corresponding second transaction is included in the database; if there is no, then the second transaction in the DDL group is not included in the database.
2. The DDL synchronization method for group execution according to claim 1, characterized in that, The determination of whether each database operation contains table information includes: If, after top-level parsing, the database object on which the database operation depends is still not found, then the database operation does not include the table information. If the database object that the database operation depends on has been found after top-level parsing, then the database operation includes the table information.
3. The DDL synchronization method for group execution according to claim 1, characterized in that, The step of setting the latest LSN for each ordinary group based on the first transaction entry status of each ordinary group includes: For a normal group, the first transaction corresponding to each group is entered into the database in the order of the LSNs committed. Obtain the commit LSN of the first transaction that has been entered into the database, and use the commit LSN of the first transaction that has been entered into the database as the latest LSN; When all the first transactions already stored in the ordinary group are completed and entered into the database, the latest LSN of the ordinary group is set to the first preset value.
4. The DDL synchronization method for group execution according to claim 3, characterized in that, For a regular group, the process of loading the corresponding first transaction into the database according to the order of LSN submission includes: Multiple ordinary worker threads are created, each of which corresponds to an ordinary group, and the multiple ordinary worker threads are executed in parallel. Each of the aforementioned ordinary worker threads sequentially inserts the first transaction in the ordinary group it is responsible for into the database.
5. The DDL synchronization method for group execution according to claim 1, characterized in that, The DDL synchronization method for group execution also includes: Create a timed scheduling thread, determine a first preset time, and execute the timed scheduling thread at intervals of the first preset time. The timed scheduling thread will add the second transaction that has been committed but not yet entered into the database in the DDL group according to the latest LSN of each ordinary group.
6. The DDL synchronization method for group execution according to any one of claims 1-5, characterized in that, The DDL synchronization method for group execution also includes: Record the first commit LSN and first commit number corresponding to the first transaction that has been entered into the database; Record the second commit LSN and second commit number corresponding to the second transaction that has been entered into the database; During fault recovery, the first transaction that has been completed and inserted into the database is filtered based on the first commit LSN and the first commit number; the second transaction that has been completed and inserted into the database is filtered based on the second commit LSN and the second commit number.
7. A DDL synchronization device for group execution, characterized in that, It includes at least one processor and a memory, which are connected via a data bus. The memory stores instructions that can be executed by the at least one processor. After being executed by the processor, the instructions are used to complete the DDL synchronization method of group execution as described in any one of claims 1-5.