A unified data change capture method and apparatus based on a database mesh layer

By adopting a unified data change capture method in the database mesh layer, the inconsistency problem of data change capture in heterogeneous database environments is solved, realizing transparent access to heterogeneous databases and accurate data change capture, thus ensuring data consistency and integrity.

CN122309600APending Publication Date: 2026-06-30SUZHOU BONA XUNDONG SOFTWARE CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SUZHOU BONA XUNDONG SOFTWARE CO LTD
Filing Date
2026-06-04
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing technologies struggle to achieve unified access and data change capture across databases in heterogeneous database environments. Furthermore, existing solutions cannot accurately obtain the set of primary keys affected by changes, resulting in incomplete or inaccurate CDC records that fail to meet data consistency requirements.

Method used

The database mesh layer provides a MySQL protocol access interface, uses adapters to connect to various heterogeneous databases, constructs an abstract syntax tree to extract the primary key set, and obtains data snapshots before and after the change operation to perform difference calculations, generating CDC records. The business data change operation and the CDC record are executed in the same database transaction.

Benefits of technology

It enables transparent access to heterogeneous databases and unified data change capture, accurately obtains data change content and field-level change details, ensures strong consistency between business data and change event records, and avoids data inconsistency issues.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122309600A_ABST
    Figure CN122309600A_ABST
Patent Text Reader

Abstract

This application discloses a unified data change capture method and apparatus based on a database mesh layer, relating to the field of database technology. It includes: defining an SQL interceptor interface containing pre-execution, post-execution, and exception interception methods, and making it inherit from a sorting interface with execution order; creating a context object with extended information fields for data transfer between interceptors; responding to SQL requests by calling the pre-execution interception method, parsing the SQL type and performing preprocessing, taking over transaction control and setting a transaction flag in the extended information field; issuing SQL execution and obtaining the result; upon success, calling the post-execution interception method, performing post-processing according to the SQL type, and committing the transaction according to the transaction flag; and in case of an exception, calling the exception interception method to roll back the transaction according to the transaction flag. This invention achieves unified and precise control over the entire SQL execution process, solving the problems of poor scalability and low data reliability in existing technologies.
Need to check novelty before this filing date? Find Prior Art

Description

Technical Field

[0001] This invention relates to the field of database technology, and more specifically to a unified data change capture method and apparatus based on a database mesh layer. Background Technology

[0002] As enterprise IT infrastructure development deepens, business systems often need to access multiple types of databases simultaneously, such as MySQL, PostgreSQL, Oracle, OceanBase, and TiDB. This environment is known as a heterogeneous database environment. To achieve unified access and management across databases, the industry has proposed various database proxy middleware solutions, such as MySQLProxy, Database Mesh, and Apache ShardingSphere. MySQL Proxy offers query monitoring and load balancing capabilities, but it only supports MySQL databases and cannot solve the access problem in heterogeneous database environments. Apache ShardingSphere, as a distributed database middleware, exposes MySQL or PostgreSQL protocols and supports database sharding and read / write separation. However, its core design is still primarily geared towards homogeneous database clusters, lacking a unified adaptation layer for differences in SQL dialects, data type mappings, and primary key generation mechanisms between different database types.

[0003] Regarding Data Change Capture (CDC) technology, existing solutions primarily rely on database logs such as MySQL's binlog and Oracle's redo log, or are based on triggers. While log-based approaches are less intrusive to business logic, they require the separate development and maintenance of log parsing modules for each database, and the significant differences in log formats and parsing methods across different databases make it difficult to establish a unified technical solution. Trigger-based approaches increase database maintenance costs and runtime overhead. Furthermore, existing SQL interception-based CDC solutions often fail to accurately extract or deduce the set of primary keys affected by changes, especially when the business SQL does not explicitly include primary key conditions. This makes it difficult to pinpoint the specific changed data rows, resulting in incomplete or inaccurate CDC records. In addition, existing solutions generally do not manage business data change operations and the generation and storage of CDC records within the same transaction boundary. This can lead to data inconsistencies in abnormal situations, such as business data being changed but CDC records being lost, or CDC records being generated but business data changes failing. These solutions cannot meet the high consistency requirements of scenarios such as data auditing and data synchronization. Summary of the Invention

[0004] The purpose of this invention is to provide a unified data change capture method and apparatus based on a database mesh layer. By providing a unified MySQL protocol access interface externally through the database mesh layer and connecting to various heterogeneous databases internally via adapters, it enables transparent access of business systems to backend heterogeneous databases. Developers do not need to learn the SQL dialects of multiple databases or modify business code to obtain unified data change capture capabilities. During SQL execution, by constructing an abstract syntax tree and extracting or deriving the set of primary keys affected by changes based on metadata information, even when the business SQL does not explicitly carry primary key conditions, the automatically generated pre-primary key query statement can accurately locate the data rows to be changed, solving the problem that traditional solutions cannot accurately obtain the set of affected primary keys. By taking data snapshots before and after the change operation and performing field-by-field difference calculations, the generated CDC records can accurately reflect the complete content and field-level change details of each data change. More importantly, by placing the business data change operation and the CDC record write operation in the same database transaction, strong consistency between business data and change event records is guaranteed, fundamentally avoiding various abnormal situations of data inconsistency.

[0005] To achieve the above objectives, the present invention provides the following technical solution: In a first aspect, this invention provides a unified data change capture method based on a database mesh layer. The database mesh layer provides a unified MySQL protocol access interface externally and connects to various heterogeneous databases internally via adapters. The method includes: Receive SQL requests sent by clients via the MySQL protocol, and perform syntax parsing on the SQL requests to generate an abstract syntax tree; The operation type of the SQL request is identified based on the abstract syntax tree. When the operation type is a data modification statement, the primary key information of the target table involved in the data modification statement is obtained through the abstract syntax tree and the pre-acquired metadata information. The primary key set of the data rows affected by the data modification statement is extracted or deduced based on the conditional expressions in the abstract syntax tree. Before performing business data change operations, query the target database based on the primary key set and obtain a snapshot of the data before the change; The adapter converts the data change statements into dialect SQL that matches the target database, and then sends the converted dialect SQL to the target database for execution to complete the business data change. After the business data change is successfully executed, the target database is queried again based on the primary key set to obtain a snapshot of the changed data; Perform difference calculations on the data snapshots before and after the change, and generate data change capture records based on the difference calculation results; The business data change operation and the write operation of the data change capture record are executed in the same database transaction. The database transaction is committed only if both the business data change operation and the write operation of the data change capture record are successful; if either operation fails, the database transaction is rolled back.

[0006] In some embodiments, the primary key set of data rows affected by data modification statements is extracted or derived based on conditional expressions in the abstract syntax tree, specifically including: When the data change statement is an insert statement, if the insert statement contains a primary key field, the primary key value is directly extracted from the abstract syntax tree as the primary key set; if the primary key is automatically generated by the target database, the generated primary key value is obtained through the adapter as the primary key set after the business data change is executed. When the data modification statement is an update statement or a delete statement, determine whether the conditional expression in the abstract syntax tree contains a condition for the primary key; If it is included, the primary key value is directly extracted by traversing the condition nodes related to the primary key in the abstract syntax tree to form a primary key set; If not included, a primary key query statement is automatically generated and executed based on the target table and conditional expression in the abstract syntax tree to retrieve the set of primary keys of the data rows affected by the data change statement from the target database.

[0007] In some embodiments, the primary key value is extracted directly by traversing the condition nodes related to the primary key in the abstract syntax tree, including: When the conditional expression contains a primary key equality condition, the literal value on the right side of the equals sign is extracted as the primary key value. When the conditional expression contains a primary key IN list condition, extract all values ​​from the IN list as the primary key value; When the conditional expression contains a primary key range condition, the set of primary key values ​​that satisfy the range condition is obtained by executing the preceding primary key query statement.

[0008] In some embodiments, a difference calculation is performed on the data snapshot before the change and the data snapshot after the change, and a data change capture record is generated based on the difference calculation result, specifically including: When the data change statement is a delete statement, the data snapshot after the change is empty. The data change capture record contains the operation type of delete, the complete data row content before the change, and the null value identifier after the change. When the data change statement is an update statement, the data snapshot before the change and the data snapshot after the change are compared field by field to identify the fields that have changed. The data change capture record contains the old and new values ​​of the fields that have changed and whose operation type is update.

[0009] In some embodiments, business data change operations and write operations of data change capture records are executed within the same database transaction, specifically including: Create or join a transaction context upon receiving a data change statement; Within the transaction context, execute the following steps in sequence: query the data snapshot before the change, execute the data change statement after dialect conversion, query the data snapshot after the change, generate a data change capture record, and write the data change capture record to the transaction log table or the outbox table. Commit or rollback operations simultaneously affect the business data change results in the target database and the data change capture records in the transaction log table or outbox table.

[0010] In some embodiments, the database mesh layer internally connects to multiple heterogeneous databases via adapters, including at least two of MySQL, PostgreSQL, Oracle, OceanBase, and TiDB; The adapter is used to handle SQL dialect conversion, data type mapping, pagination syntax differences, primary key return method differences, transaction control differences, and error code conversion between different databases.

[0011] In some embodiments, the method further includes: The generated data change capture records are saved to the CDC record table in the database, and the status field of each record is set to "pending push". Records in the CDC record table are asynchronously pushed to the message queue using an independent thread pool; If the push is successful, the status field of the corresponding record will be updated to "success"; if the push fails, the status field will remain "pending push".

[0012] In some embodiments, the method further includes: Periodically query the CDC record table for records whose status field is "Pending Push" and have reached the preset retry time; Re-execute the asynchronous push operation on the retrieved records; If the push is successful, the status field is updated to success; if the push fails, the number of retries is incremented and it is determined whether the number of retries exceeds a preset threshold. If the threshold is not exceeded, the next retry time is calculated and updated based on the number of retries, and the system waits for the next retry; if the threshold is exceeded, the status field is updated to final failure and retries are stopped.

[0013] Secondly, the present invention also provides a unified data change capture device based on a database mesh layer. The database mesh layer provides a unified MySQL protocol access interface externally and connects to various heterogeneous databases internally via adapters. The device includes: The syntax parsing module is used to receive SQL requests sent by the client through the MySQL protocol, and to parse the SQL requests to generate an abstract syntax tree; The collection acquisition module is used to identify the operation type of the SQL request based on the abstract syntax tree. When the operation type is a data modification statement, it obtains the primary key information of the target table involved in the data modification statement through the abstract syntax tree and the pre-acquired metadata information, and extracts or derives the primary key set of the data rows affected by the data modification statement based on the conditional expressions in the abstract syntax tree. The snapshot acquisition module is used to query and obtain a snapshot of the data before the change from the target database based on the primary key set before performing business data change operations; The data change module is used to convert data change statements into dialect SQL that matches the target database through an adapter, and then send the converted dialect SQL to the target database for execution to complete the business data change. The data query module is used to query the target database again based on the primary key set after the business data change is successfully executed, and obtain a snapshot of the changed data. The difference calculation module is used to perform difference calculations on the data snapshot before and after the change, and generate a data change capture record based on the difference calculation results. The transaction commit module is used to execute business data change operations and data change capture record write operations in the same database transaction. The database transaction is committed only if both the business data change operation and the data change capture record write operation are successful; if either operation fails, the database transaction is rolled back.

[0014] Thirdly, the present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the unified data change capture method based on the database Mesh layer provided in the first aspect.

[0015] Fourthly, the present invention also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the unified data change capture method based on a database mesh layer provided in the first aspect.

[0016] Fifthly, the present invention also provides a computer program product, including a computer program that, when executed by a processor, implements the unified data change capture method based on a database mesh layer provided in the first aspect.

[0017] The beneficial effects of this invention are as follows: The unified data change capture method based on the database mesh layer provides a unified MySQL protocol access interface externally through the database mesh layer and connects to various heterogeneous databases internally through adapters. This enables transparent access of business systems to backend heterogeneous databases. Developers do not need to learn the SQL dialects of various databases or modify business code to obtain unified data change capture capabilities. During SQL execution, by constructing an abstract syntax tree and extracting or deriving the set of primary keys affected by changes based on metadata information, even when the business SQL does not explicitly carry primary key conditions, the automatically generated pre-primary key query statement can accurately locate the data rows to be changed, solving the problem that traditional solutions cannot accurately obtain the set of affected primary keys. By taking data snapshots before and after executing the change operation and performing field-by-field difference calculations, the generated CDC records can accurately reflect the complete content and field-level change details of each data change. More importantly, by placing the business data change operation and the CDC record write operation in the same database transaction, strong consistency between business data and change event records is guaranteed, fundamentally avoiding various abnormal situations of data inconsistency.

[0018] The above description is merely an overview of the technical solution of the present invention. In order to better understand the technical means of the present invention and to implement it in accordance with the contents of the specification, the preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. Attached Figure Description

[0019] Figure 1 This is a flowchart illustrating a unified data change capture method based on a database mesh layer according to an embodiment of the present invention; Figure 2 This is a schematic diagram of a unified data change capture device based on a database mesh layer according to an embodiment of the present invention; Figure 3 This is a schematic diagram of an electronic device structure provided in an embodiment of this application. Detailed Implementation

[0020] The technical solution of the present invention will now be clearly and completely described with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0021] It should be noted that references to "an embodiment," "embodiment," "example embodiment," etc., in this specification refer to the described embodiment including specific features, structures, or characteristics; however, not every embodiment must include these specific features, structures, or characteristics. Furthermore, such expressions do not refer to the same embodiment. Moreover, when describing specific features, structures, or characteristics in conjunction with embodiments, whether or not explicitly described, it is indicated that incorporating such features, structures, or characteristics into other embodiments is within the knowledge of those skilled in the art.

[0022] Furthermore, the technical features involved in the different embodiments of the present invention described below can be combined with each other as long as they do not conflict with each other.

[0023] In some embodiments, such as Figure 1 The diagram illustrates a unified data change capture method based on a database mesh layer. The database mesh layer provides a unified MySQL protocol access interface externally and connects to various heterogeneous databases internally via adapters. Specific methods include: S101 receives SQL requests sent by the client via the MySQL protocol, and performs syntax parsing on the SQL requests to generate an abstract syntax tree.

[0024] This embodiment provides a unified data change capture method based on a database mesh layer. This method uses a database mesh layer as an intermediary bridge between the application system and various underlying databases, enabling unified capture and processing of data changes in a heterogeneous database environment. The database mesh layer provides a unified MySQL protocol access interface, allowing various applications, business intelligence analysis tools, or operation and maintenance clients using standard MySQL drivers to connect to the mesh layer in a way that accesses a standard MySQL database, without requiring any code modification or driver replacement. The MySQL protocol is a binary protocol specification defined by MySQL for communication between clients and database servers. The client driver encapsulates the SQL statement to be executed in a specific type of network data packet according to this protocol format and sends it to a designated port on the server. Upon startup, the mesh layer listens on the same port as the MySQL service and simulates the behavior of the MySQL server, correctly parsing MySQL protocol messages to extract the original SQL statement text, connection parameters, character sets, and other context information. Simultaneously, internally, the mesh layer connects to at least two of the various heterogeneous databases, such as MySQL, PostgreSQL, Oracle, OceanBase, and TiDB, through different database adapters. Database adapters are core functional components in the Mesh layer used to shield the differences between heterogeneous databases. Each adapter is specifically implemented for a particular type of database, encapsulating all the special logic required to interact with that database. Specifically, it is responsible for tasks such as SQL dialect conversion, data type mapping, pagination syntax adaptation, unified encapsulation of primary key return methods, coordination of transaction control mechanisms, standardized conversion of error codes, and unified format encapsulation of execution result sets. Through this unified external and adaptable internal architecture, business system developers can completely ignore the brand, version, and SQL dialect differences of the backend databases, using a unified MySQL syntax for development. Simultaneously, the system automatically gains unified access to heterogeneous databases and consistent data change capture capabilities.

[0025] During operation, the Mesh layer first receives SQL requests sent by the client via the MySQL protocol and performs syntax parsing on these requests to generate an Abstract Syntax Tree (AST). An AST is an intermediate representation of the source code's syntax structure using a tree-like structure. Each node corresponds to a syntactic unit in the SQL statement, such as a table name, field name, conditional expression, function call, subquery, or join. In this embodiment, the Mesh layer integrates the DruidSQL Parser as its syntax parsing engine. This parser can parse SQL text in MySQL syntax into a complete AST. For example, for the SQL statement `DELETE FROM orders WHERE status = CANCELLED`, the parser generates an AST with a root node representing a delete statement. This AST includes a table name node (`orders`), a conditional expression node (a binary expression node), a left child node representing the field `status`, an equals sign operator, and a right child node representing the string literal `CANCELLED`. By constructing an AST, the Mesh layer can accurately understand the structure and intent of SQL statements at the semantic level, avoiding the misjudgment or parsing errors that may occur when traditional string matching methods handle nested subqueries, UNION queries, and complex logical expressions, thus providing a reliable structured foundation for subsequent primary key extraction and data snapshot acquisition.

[0026] S102, based on the abstract syntax tree, identify the operation type of the SQL request. When the operation type is a data modification statement, obtain the primary key information of the target table involved in the data modification statement through the abstract syntax tree and the pre-acquired metadata information, and extract or deduce the primary key set of the data rows affected by the data modification statement based on the conditional expression in the abstract syntax tree.

[0027] After obtaining the Abstract Syntax Tree (AST), the Mesh layer identifies the operation type of the SQL request based on the AST. The operation type refers to the specific action the SQL statement performs on the data, mainly including four categories: query, insert, update, and delete. The identification process is completed by checking the specific type of the AST root node. For example, if the root node is a delete statement type, it corresponds to a delete operation; if it's an update statement type, it corresponds to an update operation; and if it's an insert statement type, it corresponds to an insert operation. When the operation type of the SQL request is identified as a data modification statement—that is, any one of an insert, update, or delete statement—the Mesh layer initiates the data change capture process.

[0028] First, the Mesh layer uses the Abstract Syntax Tree (AST) and pre-acquired metadata to obtain the primary key information of the target table involved in the data change statement. Based on the conditional expressions in the AST, it extracts or derives the set of primary keys for the data rows affected by the data change statement. Metadata information refers to descriptive data about the database table structure, including table name, field name, field type, field length, nullability, primary key definition, foreign key constraints, and index information. This information is collected in real-time from various backend databases and maintained uniformly by the metadata management module within the Mesh layer. The metadata management module periodically synchronizes the latest table structure definitions with each database to ensure the accuracy and timeliness of metadata such as primary key information. A primary key is a combination of one or more fields in a relational database table used to uniquely identify a row of data. In a CDC record, the primary key is the key identifier of the data state before and after the change. The primary key set is a list of primary key values ​​corresponding to all data rows that will be affected by a data change operation.

[0029] The method for obtaining the primary key set varies depending on the type of data modification statement and the specific details of the conditional expression. For insert statements, if the SQL statement itself explicitly includes the assignment of the primary key field, such as in the statement `INSERT INTO orders(order_id, user_id, amount) VALUES(12345, 1001, 99.00)` where `order_id` is the primary key field, the Mesh layer directly extracts the value 12345 corresponding to this primary key field from the AST as an element of the primary key set. If the primary key is automatically generated by the target database, such as an auto-incrementing primary key in MySQL or a sequence-generated primary key in Oracle, the Mesh layer cannot obtain the primary key value before the statement execution because the primary key value does not exist in the SQL text. In this case, the Mesh layer will retrieve the generated primary key value by calling the target database-specific primary key return interface through the database adapter after the business data modification is completed. For example, for MySQL databases, the adapter obtains the auto-incrementing primary key value through the JDBC `getGeneratedKeys` method; for Oracle databases, the adapter obtains the newly generated primary key value through the `RETURNING INTO` clause or a sequence query. The obtained primary key values ​​will be added to the primary key set.

[0030] For update and delete statements, the Mesh layer further determines whether the conditional expression in the AST contains a condition for the primary key. If the conditional expression contains a condition for the primary key, the Mesh layer directly extracts the primary key value by traversing the conditional nodes related to the primary key in the AST to form the primary key set. The specific extraction logic includes: when the conditional expression contains a primary key equality condition, such as WHERE order_id = 12345, the Mesh layer extracts the literal value on the right side of the equals sign as the primary key value; when the conditional expression contains a primary key IN list condition, such as WHERE order_id IN (12345, 12346,12347), the Mesh layer extracts all values ​​in the IN list as the primary key value set; when the conditional expression contains a primary key range condition, such as WHERE order_id BETWEEN 10000 AND 20000 or WHERE order_id greater than 10000, the Mesh layer obtains all primary key values ​​that satisfy the range condition by executing a pre-query statement. This pre-query statement is a SELECT primary key field FROM target table WHERE range condition statement dynamically generated based on the target table name and range condition in the AST. If the WHERE clause of an update or delete statement does not contain a primary key condition, such as `DELETE FROM orders WHERE status = CANCELLED`, the Mesh layer cannot directly extract the primary key value from the AST. In this case, the Mesh layer automatically generates a pre-query statement based on the target table name and the complete WHERE condition expression in the AST, such as `SELECT order_id FROM orders WHERE status = CANCELLED`, and sends this query to the target database for execution. This retrieves the set of primary keys of the data rows actually affected by the data change statement. This query operation occurs before the execution of the business data change statement, thus accurately capturing the identifiers of the data rows that are about to be modified or deleted. This primary key extraction and derivation mechanism based on AST parsing allows the Mesh layer to accurately locate each affected row of data even if the business SQL itself does not explicitly carry a primary key condition. This solves the technical problem of not being able to accurately locate changed data and generate data change capture records when the business SQL does not carry a primary key, while avoiding the performance overhead of a full table scan.

[0031] S103, Before performing the business data change operation, query the target database based on the primary key set and obtain a snapshot of the data before the change.

[0032] After obtaining the accurate primary key set, the Mesh layer queries the target database to retrieve a snapshot of the data before the change, based on this primary key set, before executing the business data change operation. A data snapshot refers to the complete field content and state of a data row at a specific point in time, typically including the names and corresponding values ​​of all fields in that row. The specific operation of querying the snapshot before the change is as follows: for each primary key value in the primary key set, construct a query statement: `SELECT all fields FROM target table WHERE primary key field equal to primary key value`, and send these queries to the target database in batches or row by row for execution, thereby obtaining the complete field content of each row of data before the change. For update statements, the snapshot before the change records all field values ​​of the data in that row before the update operation is executed; for delete statements, the snapshot before the change records the last state of the data in that row before it is permanently deleted; for insert statements, since the data did not exist before the insertion, the snapshot before the change is empty. The purpose of obtaining a snapshot before the change is to provide a benchmark for subsequent difference calculations. Especially in deletion operations, once data is deleted, if a snapshot is not saved in advance, the specific content of the deleted data will never be known. This will result in incomplete information in the data change capture record, which will not meet the needs of data auditing, data backtracking, and cross-system data synchronization.

[0033] S104 uses an adapter to convert data change statements into dialect SQL that matches the target database, and then sends the converted dialect SQL to the target database for execution to complete the business data change.

[0034] Subsequently, the Mesh layer uses an adapter to convert data modification statements into SQL in a dialect that matches the target database, and then sends the converted dialect SQL to the target database for execution to complete the business data changes. Since the Mesh layer uniformly receives SQL statements in MySQL syntax, while the target database may be a system with different SQL dialects such as PostgreSQL, Oracle, OceanBase, and TiDB, syntax conversion is necessary to ensure correct execution of the statements in the target database. The SQL dialect conversion module in the adapter is responsible for this task, and internally maintains a mapping rule between MySQL syntax and the syntax of each target database. For example, the LIMIT offset pagination syntax in MySQL needs to be converted to the ROWNUM pagination nested query syntax in Oracle or the OFFSET offset FETCH NEXT return row number ONLY syntax in SQL Server; the backticked identifier quoting method in MySQL needs to be converted to the double quote quoting method in PostgreSQL or the unquoted method in Oracle; the AUTO_INCREMENT auto-increment field attribute in MySQL needs to be mapped to a combination of SEQUENCE sequence and BEFORE INSERT trigger in Oracle; the REPLACEINTO statement in MySQL needs to be converted to the INSERT ON CONFLICT DO UPDATE statement in PostgreSQL. The converted SQL statement retains the business semantics of the original SQL, but the syntax format is fully compatible with the target database, ensuring error-free execution. After the conversion, the Mesh layer sends the dialect SQL statement to the target database for execution through standard database connection interfaces such as JDBC. The target database processes the statement according to the normal process, completing the actual insertion, update, or deletion operations on the business data.

[0035] S105: After the business data change is successfully executed, query the target database again based on the primary key set to obtain a snapshot of the changed data.

[0036] After the business data change is successfully executed, the Mesh layer queries the target database again using the same set of primary keys to obtain a snapshot of the changed data. Since primary key values ​​typically remain unchanged during update operations, the same set of primary keys can be used for queries. For delete statements, because the data has been physically removed, the changed query will return an empty result set, meaning the post-change snapshot is empty. For update statements, the post-change snapshot will reflect the modified new field values, where some field values ​​may differ from the pre-change values. For insert statements, since the pre-insert snapshot is empty, the post-change snapshot contains the complete content of the newly inserted data row, including the values ​​of all fields.

[0037] S106, perform difference calculation on the data snapshot before and after the change, and generate a data change capture record based on the difference calculation result.

[0038] Next, the Mesh layer performs difference calculations on the data snapshots before and after the change, and generates a Data Change Capture Record (CDC record) based on the results. A CDC record is a standardized data structure used to describe a data change event and can be used for downstream scenarios such as data synchronization, audit analysis, cache invalidation, and event-driven processing. The method of difference calculation varies depending on the operation type. When the data change statement is a delete statement, the post-change data snapshot is empty. Therefore, the operation type field in the generated CDC record is marked as delete, the pre-change field is filled with the complete data row content before deletion, the post-change field is a null value, and the difference field indicates that the data has been deleted. For example, a CDC record for deleting an order might contain the following information: database name is db_one, table name is orders, primary key field name is order_id with a value of 12345, operation type is DELETE, data before the change includes order_id as 12345, user_id as 1001, status as CANCELLED, and amount as 99.00, data after the change is null, the difference field includes deleted as true, transaction number as tx_001, and event time as April 22, 2026, 10:00:00. When the data change statement is an update statement, the Mesh layer compares the values ​​of each field in the snapshot before and after the change, identifying which fields have changed. For fields that have changed, the CDC record simultaneously records both the old value before the change and the new value after the change. For example, when the status of an order is updated from CREATED to PAID, the generated CDC record will have the operation type field marked as UPDATE, the primary key as order_id equal to 12345, the status field in the data before the change as CREATED, and the status field in the data after the change as PAID. The difference field will record the old value as CREATED and the new value as PAID for the status field. For INSERT statements, the generated CDC record will have the operation type field marked as INSERT, the data before the change as empty, and the data after the change as the content of the newly inserted complete data row. Through this field-by-field difference calculation method, the generated CDC record can accurately reflect the complete content and scope of each data change, providing fine-grained change information to downstream systems.

[0039] S107: The business data change operation and the write operation of the data change capture record are executed in the same database transaction. The database transaction is committed only if both the business data change operation and the write operation of the data change capture record are successful; if either operation fails, the database transaction is rolled back.

[0040] After generating the data change capture record (CDC), the Mesh layer executes the business data change operation and the write operation of the CDC within the same database transaction. A transaction is a logical unit in the execution process of a database management system, possessing four key properties: atomicity, consistency, isolation, and durability. Upon recognizing a data change statement, the Mesh layer creates or joins a transaction context. Within this transaction context, the Mesh layer performs the following operations sequentially: First, it queries the data snapshot before the change; this query is completed within the transaction, providing a view of the data consistency at the start of the current transaction. Next, it executes the dialect-translated business data change statement, performing the actual insert, update, or delete operations on the target table. Then, after the business data change is successful, it queries the data snapshot after the change; the query result reflects the latest state after the execution of the business data change statement. Afterward, it calculates the differences between the before and after snapshots, generating CDC records. Finally, it writes the generated CDC records to the transaction log table or the outbox table. The transaction log table or outbox table is a dedicated table created or utilized by the Mesh layer in the target database to reliably store CDC records. Its table structure includes at least all fields of the CDC record, as well as a status control field for recording the push status. All the above operations are performed within the protection scope of the same database transaction. The Mesh layer only commits the database transaction when both the business data change operation and the CDC record write operation are successfully completed, thus simultaneously persisting the business data changes and CDC records to the target database. If any stage fails—for example, if the business SQL execution fails, the snapshot query fails, the CDC record generation fails, or the write to the transaction log table fails—the Mesh layer will roll back the entire database transaction. The rollback operation restores the business data to its state before the transaction began, and no incomplete CDC records remain in the transaction log table. This mechanism, which manages business operations and CDC records within the same transaction boundary, fundamentally avoids data inconsistencies such as business data being changed but CDC records being lost, CDC records being generated but business data changes failing, and the inability to recover deleted data after a deletion operation. It ensures strict strong consistency between business data and change event records.

[0041] Furthermore, to improve the reliability of data change capture records (CDCs) when distributing them to downstream systems and to enhance overall system throughput, the Mesh layer implements an asynchronous push and retry guarantee mechanism for the generated CDC records. After successfully writing a CDC record to the transaction log table or outbox table, the Mesh layer sets a status field for each CDC record and marks it as pending push with its initial value. It also sets the retry count field to zero and calculates and sets a next retry time field based on the current system time, for example, the current time plus five seconds. Subsequently, the Mesh layer starts a thread pool independent of the main business thread. This thread pool is specifically responsible for batch reading records with a pending push status from the CDC record table, serializing these records into message bodies, and asynchronously sending them to an external message queue system through the message queue client interface. If the message queue returns a successful confirmation, the Mesh layer performs a database update operation, changing the status field of the CDC record from pending push to successful. If the transmission fails due to a momentary network failure, temporary unavailability of the message queue service, or other reasons, the Mesh layer only records an error log and does not make any changes to the status field of the CDC record in the database, keeping it in the pending push state.

[0042] In addition, a background cleanup and retry task is deployed within the Mesh layer on a scheduled basis. This task is triggered at fixed time intervals, such as every five minutes. Within each execution cycle, the task queries the CDC record table, filtering out all records that simultaneously meet the following two conditions: the current value of the status field is equal to the pending push, and the value of the next retry time field is less than or equal to the current system time. For each pending retry record returned by the query, the task re-executes the asynchronous push operation. If the push is successful, the status field of the record is updated to success. If the push fails again, the retry count field of the record is incremented by one. The task then determines whether the incremented retry count exceeds the system's preset maximum retry threshold, which can be configured according to business tolerance, for example, set to ten times. If the retry count has not exceeded the threshold, the value of the next retry time field is calculated and updated using an exponential backoff strategy based on the current retry count. For example, the calculation method is the current time plus two raised to the power of the current retry count, multiplied by the initial waiting time of five seconds. By gradually extending the retry interval, this strategy effectively avoids continuous impact on system resources when the target service is unavailable for an extended period. If the number of retries exceeds the preset maximum threshold, the task updates the status field of this record to "final failure" and terminates any subsequent automatic retry behavior for that record. Simultaneously, an alarm mechanism is triggered so that operations personnel can intervene for manual verification and compensation. Through this multi-layered protection system—first local persistence, then asynchronous push, and finally supplemented by scheduled retries—the Mesh layer improves the overall reliability of CDC data to nearly 100%. Meanwhile, the asynchronous push mechanism does not affect the execution efficiency of the main business process at all, ensuring low latency and high throughput performance of the business system in high-concurrency scenarios.

[0043] Based on the same inventive concept, this application also provides a database mesh layer-based unified data change capture device for implementing the aforementioned database mesh layer-based unified data change capture method. The solution provided by this device is similar to the implementation described in the above method. Therefore, the specific limitations of one or more database mesh layer-based unified data change capture device embodiments provided below can be found in the above-described limitations of the database mesh layer-based unified data change capture method, and will not be repeated here.

[0044] In one embodiment, such as Figure 2 As shown, a unified data change capture device based on a database mesh layer is provided. The database mesh layer provides a unified MySQL protocol access interface to the outside world, and internally connects to various heterogeneous databases through adapters. The device includes: The syntax parsing module 30 is used to receive SQL requests sent by the client through the MySQL protocol, and to perform syntax parsing on the SQL requests to generate an abstract syntax tree; The set acquisition module 31 is used to identify the operation type of the SQL request based on the abstract syntax tree. When the operation type is a data change statement, it obtains the primary key information of the target table involved in the data change statement through the abstract syntax tree and the pre-acquired metadata information, and extracts or derives the primary key set of the data rows affected by the data change statement based on the conditional expression in the abstract syntax tree. The snapshot acquisition module 32 is used to query and obtain a snapshot of the data before the change from the target database based on the primary key set before performing the business data change operation; The data change module 33 is used to convert data change statements into dialect SQL that matches the target database through an adapter, and then send the converted dialect SQL to the target database for execution to complete the business data change. The data query module 34 is used to query the target database again based on the primary key set after the business data change is successfully executed, and obtain the data snapshot after the change; The difference calculation module 35 is used to perform difference calculation on the data snapshot before the change and the data snapshot after the change, and generate a data change capture record based on the difference calculation result. The transaction commit module 36 is used to execute business data change operations and data change capture record write operations in the same database transaction. The database transaction is committed only if both the business data change operation and the data change capture record write operation are successful; if either operation fails, the database transaction is rolled back.

[0045] This application also provides an electronic device, in some embodiments, referring to... Figure 3 As shown, the electronic device 700 includes an input unit 710, a memory 720, a processor 730, and an output unit 740. The memory 720 stores program instructions that can be executed on the processor 730. The processor 730 can execute the unified data change capture method and / or technical solution based on the database mesh layer in the foregoing embodiments by calling the program instructions. The electronic device 700 can be a mobile terminal device such as a mobile phone or a computer.

[0046] Furthermore, embodiments of this application also provide a computer-readable storage medium for storing a computer program that executes a unified data change capture method based on a database mesh layer. For example, computer program instructions, when executed by a computer, can invoke or provide the methods and / or technical solutions according to this application through the operation of the computer. The program instructions that invoke the methods of this application may be stored in a fixed or removable storage medium, and / or transmitted via data streams in broadcast or other signal carrying media, and / or stored in a storage medium that operates according to the program instructions.

[0047] Obviously, those skilled in the art should understand that the modules or steps of this application described above can be implemented using general-purpose computing devices. They can be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they can be implemented using computer-executable program code, thereby storing them in a storage device for execution by a computing device, or fabricating them separately as individual integrated circuit modules, or fabricating multiple modules or steps as a single integrated circuit module. Thus, this application is not limited to any particular combination of hardware and software.

[0048] The technical features of the above embodiments can be arbitrarily integrated. For the sake of brevity, not all possible integrations of the technical features in the above embodiments are described. However, as long as the integration of these technical features does not contradict each other, they should be considered to be within the scope of this specification.

[0049] The above embodiments merely illustrate several implementation methods of the present invention, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the invention patent. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this invention patent should be determined by the appended claims.

Claims

1. A unified data change capture method based on database Mesh layer, characterized in that, The database mesh layer provides a unified MySQL protocol access interface to the outside world, and internally connects to various heterogeneous databases through adapters. The method includes: Receive SQL requests sent by clients via the MySQL protocol, and perform syntax parsing on the SQL requests to generate an abstract syntax tree; The operation type of the SQL request is identified based on the abstract syntax tree. When the operation type is a data modification statement, the primary key information of the target table involved in the data modification statement is obtained through the abstract syntax tree and the pre-acquired metadata information. The primary key set of the data rows affected by the data modification statement is extracted or derived based on the conditional expression in the abstract syntax tree. Before performing business data change operations, query the target database based on the primary key set and obtain a snapshot of the data before the change; The data change statement is converted into dialect SQL that matches the target database through an adapter, and the converted dialect SQL is sent to the target database for execution to complete the business data change; After the business data change is successfully executed, the target database is queried again based on the primary key set to obtain a snapshot of the changed data; Perform a difference calculation on the data snapshot before the change and the data snapshot after the change, and generate a data change capture record based on the difference calculation result; The business data change operation and the write operation of the data change capture record are executed in the same database transaction. The database transaction is committed only if both the business data change operation and the write operation of the data change capture record are successful; if either operation fails, the database transaction is rolled back.

2. The method of claim 1, wherein, The extraction or derivation of the primary key set of data rows affected by the data modification statement based on the conditional expression in the abstract syntax tree specifically includes: When the data change statement is an insert statement, if the insert statement contains a primary key field, the primary key value is directly extracted from the abstract syntax tree as the primary key set; if the primary key is automatically generated by the target database, the generated primary key value is obtained through the adapter as the primary key set after the business data change is executed. When the data change statement is an update statement or a delete statement, determine whether the conditional expression in the abstract syntax tree contains a condition for the primary key; If it is included, the primary key value is directly extracted by traversing the condition nodes related to the primary key in the abstract syntax tree to form a primary key set; If not included, a primary key query statement is automatically generated and executed based on the target table in the abstract syntax tree and the conditional expression to retrieve the set of primary keys of the data rows affected by the data change statement from the target database.

3. The method of claim 2, wherein, The step of directly extracting the primary key value by traversing the condition nodes related to the primary key in the abstract syntax tree includes: When the conditional expression contains a primary key equality condition, the literal value on the right side of the equals sign is extracted as the primary key value; When the conditional expression contains a primary key IN list condition, all values ​​in the IN list are extracted as the primary key values. When the conditional expression contains a primary key range condition, the set of primary key values ​​that satisfy the range condition is obtained by executing the preceding primary key query statement.

4. The method according to claim 1, characterized in that, The step of performing difference calculations on the data snapshot before the change and the data snapshot after the change, and generating a data change capture record based on the difference calculation results, specifically includes: When the data change statement is a delete statement, the data snapshot after the change is empty, and the data change capture record contains the operation type of delete, the complete data row content before the change, and the null value identifier after the change; When the data change statement is an update statement, the data snapshot before the change and the data snapshot after the change are compared field by field to identify the fields that have changed. The data change capture record contains the old and new values ​​of the fields that have changed and whose operation type is update.

5. The method according to claim 1, characterized in that, The step of placing the business data change operation and the write operation of the data change capture record within the same database transaction specifically includes: Upon receiving the data change statement, create or join a transaction context; Within the transaction context, the following steps are executed sequentially: query the data snapshot before the change, execute the data change statement after dialect conversion, query the data snapshot after the change, generate a data change capture record, and write the data change capture record to the transaction log table or the outbox table. The commit or rollback operation simultaneously affects the business data change results in the target database and the data change capture records in the transaction log table or outbox table.

6. The method according to claim 1, characterized in that, The database Mesh layer internally connects to various heterogeneous databases via adapters, including at least two of MySQL, PostgreSQL, Oracle, OceanBase, and TiDB. The adapter is used to handle SQL dialect conversion, data type mapping, pagination syntax differences, primary key return method differences, transaction control differences, and error code conversion between different databases.

7. The method according to any one of claims 1 to 6, characterized in that, The method further includes: The generated data change capture records are saved to the CDC record table in the database, and a status field is set to "pending push" for each record; Records in the CDC record table are asynchronously pushed to the message queue using an independent thread pool; If the push is successful, the status field of the corresponding record will be updated to "success"; if the push fails, the status field will remain "pending push".

8. The method according to claim 7, characterized in that, The method further includes: Periodically query the records in the CDC record table whose status field is "Pending Push" and have reached the preset retry time; Re-execute the asynchronous push operation on the retrieved records; If the push is successful, the status field is updated to success; if the push fails, the number of retries is incremented and it is determined whether the number of retries exceeds a preset threshold. If the threshold is not exceeded, the next retry time is calculated and updated based on the number of retries, and the system waits for the next retry; if the threshold is exceeded, the status field is updated to final failure and retries are stopped.

9. A unified data change capture device based on a database mesh layer, characterized in that, The database mesh layer provides a unified MySQL protocol access interface to the outside world, and internally connects to various heterogeneous databases through adapters. The device includes: The syntax parsing module is used to receive SQL requests sent by the client through the MySQL protocol, and to parse the SQL requests to generate an abstract syntax tree; The set acquisition module is used to identify the operation type of the SQL request based on the abstract syntax tree. When the operation type is a data modification statement, it obtains the primary key information of the target table involved in the data modification statement through the abstract syntax tree and the pre-acquired metadata information, and extracts or derives the primary key set of the data rows affected by the data modification statement based on the conditional expression in the abstract syntax tree. The snapshot acquisition module is used to query and obtain a snapshot of the data before the change from the target database based on the primary key set before performing the business data change operation; The data change module is used to convert the data change statement into dialect SQL that matches the target database through an adapter, and then send the converted dialect SQL to the target database for execution to complete the business data change. The data query module is used to query the target database again based on the primary key set after the business data change is successfully executed, and obtain the data snapshot after the change; The difference calculation module is used to perform difference calculation on the data snapshot before the change and the data snapshot after the change, and generate a data change capture record based on the difference calculation result. The transaction commit module is used to execute the business data change operation and the write operation of the data change capture record in the same database transaction. The database transaction is committed only when both the business data change operation and the write operation of the data change capture record are successful; if either operation fails, the database transaction is rolled back.

10. An electronic device comprising a memory, a processor, and a computer program stored in the memory and running on the processor, characterized in that, When the processor executes the computer program, it implements the unified data change capture method based on the database Mesh layer as described in any one of claims 1 to 8.