A multi-active data consistency method and system based on unique primary key

By using a globally unique primary key in a unified manner across multiple active-active data centers, the problems of data inconsistency and split-brain in the active-active mode are solved, and complete consistency guarantee between data centers is achieved, which is suitable for data consistency processing in distributed databases.

CN114647654B · Active · Publication Date: 2026-06-23 · CREDIT CENT OF THE PEOPLES BANK OF CHINA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CREDIT CENT OF THE PEOPLES BANK OF CHINA
Filing Date
2022-03-21
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing technologies cannot effectively solve data inconsistencies and split-brain phenomena between different data centers in a multi-active mode, especially when data synchronization is delayed, and there is a lack of effective distributed transaction processing solutions.

Method used

A multi-active data consistency method based on unique primary keys is adopted. By using a globally unique primary key as a constraint in different data centers, the process is transformed into transaction consistency processing between data centers. This includes steps such as allocating concurrent query requests, checking the identifier code, converting to unique primary key constraint transactions, and synchronously updating the identifier code, to ensure data consistency.

Benefits of technology

It achieves complete data consistency guarantee across data centers in a multi-active mode, avoiding inconsistencies and split-brain phenomena between data centers, and has good universal applicability, without relying on specific hardware or software.

✦ Generated by Eureka AI based on patent content.


Abstract

The application relates to a multi-active data consistency method and system based on a unique primary key. The method comprises the following steps: allocating a plurality of concurrent query requests for the same subject; checking whether an identification code corresponding to the subject exists; for a subject without an identification code, converting the query request into a unique primary key constraint transaction; synchronizing the unique primary key constraint transaction and updating the identification code of the subject according to it; and synchronizing the updated identification code and responding to the query request. A globally unique primary key used uniformly across different data centers serves as a constraint, so that the data consistency problem is converted into a transaction consistency problem between data centers. This provides a complete data consistency guarantee for the multi-active mode in which the data centers run the same business, and effectively avoids the inconsistency and split-brain phenomena that may arise between data centers under special business conditions.

Description

Technical Field

[0001] This invention relates to the field of data security technology, and in particular to a method and system for multi-active data consistency based on a unique primary key.

Background Technology

[0002] Active-active (or multi-active) technology is an effective disaster recovery solution for computer systems. It provides a high level of protection for business continuity by flexibly distributing computing, storage, and network resources among more than one data center.

[0003] There are currently three deployment models for multi-active data centers: First, different data centers operate different services; second, based on the scenario where business data is an independent collection in different data centers, different data centers provide the same service simultaneously, but the generated data does not need to be merged and processed; third, different data centers operate the same service, but due to issues such as multiple data centers simultaneously submitting new requests for the same data subject and data synchronization delays between data centers, the generated data may exhibit inconsistencies and split-brain phenomena in a distributed database.

[0004] Distributed transactions are a crucial factor affecting the consistency of data in distributed databases. Currently, there are four solutions in a single data center model: two-phase commit, compensating transactions, message tables, and MQ transactional messages. However, for multi-active models, existing technologies cannot support distributed transactions across different data centers, and no relevant solutions exist. Therefore, they cannot effectively handle the simultaneous creation of the same new data subject in different data centers, or query requests that arrive while data synchronization is incomplete.

Summary of the Invention

[0005] To address the shortcomings of existing technologies, this invention proposes a multi-active data consistency method and system based on a unique primary key. By using a globally unique primary key that is uniformly used across different data centers as a constraint, the data consistency problem is transformed into transaction consistency between data centers for processing. This provides complete data consistency assurance for multi-active modes in which each data center operates the same business, effectively addressing and avoiding inconsistencies and split-brain phenomena that may occur between data centers under special business conditions.

[0006] To achieve the above objectives, the technical solution adopted by the present invention includes:

[0007] A multi-active data consistency method based on a unique primary key, characterized by comprising:

[0008] Distribute multiple concurrent query requests targeting the same entity;

[0009] Check if there is an identification code for the corresponding entity;

[0010] For entities that do not have an identifier, convert the query request into a unique primary key constraint transaction;

[0011] Synchronize the unique primary key constraint transaction and update the identifier of the subject according to the unique primary key constraint transaction;

[0012] The updated identifier code is synchronized, and the query request is responded to.

[0013] Furthermore, the allocation of multiple concurrent query requests targeting the same subject includes:

[0014] Multiple concurrent query requests for the same subject can be allocated based on IP address, region, ratio, load, or randomly.
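The allocation rules listed above can be illustrated with a minimal Python sketch. The data center names, function names, and the hash-based IP rule are assumptions made for illustration only; the patent text names the allocation criteria but does not specify how they are implemented.

```python
import hashlib
import random
from typing import Dict, Sequence

DATA_CENTERS = ["dc1", "dc2"]  # hypothetical data center names


def allocate_by_ip(client_ip: str, centers: Sequence[str] = DATA_CENTERS) -> str:
    """Map a client IP to a data center with a stable hash (assumed rule)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return centers[int.from_bytes(digest[:8], "big") % len(centers)]


def allocate_by_region(region: str, region_map: Dict[str, str], default: str) -> str:
    """Look up the data center configured for a region, with a fallback."""
    return region_map.get(region, default)


def allocate_by_ratio(weights: Dict[str, float], rng: random.Random) -> str:
    """Pick a data center with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def allocate_by_load(loads: Dict[str, int]) -> str:
    """Pick the data center that currently reports the lowest load."""
    return min(loads, key=lambda name: loads[name])


def allocate_randomly(centers: Sequence[str], rng: random.Random) -> str:
    """Pick any data center uniformly at random."""
    return rng.choice(centers)
```

Whichever rule is used, two requests for the same subject may still land in different data centers, which is the situation the remaining steps of the method are designed to handle.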

[0015] Furthermore, the step of converting the query request into a unique primary key constraint transaction includes:

[0016] Set a unique primary key for the subject;

[0017] The query request is converted into an order number, which corresponds to a unique primary key;

[0018] Combine the globally unique primary key and the order number to form a unique primary key constraint transaction.

[0019] Furthermore, the unique primary key constraint transaction also includes an identifier bit, which includes a default value for indicating synchronization operations.
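As an illustration, a unique primary key constraint transaction can be modeled as a small immutable record. This is a sketch under assumptions: the class and field names are hypothetical, and the default flag value of -1 is taken from the credit-data embodiment described later in this document rather than from the general method.

```python
from dataclasses import dataclass

SYNC_PENDING = -1  # assumed default flag value meaning "synchronization required"


@dataclass(frozen=True)
class PKConstraintTransaction:
    """Unique primary key + order number + identifier bit (names are hypothetical)."""

    primary_key: str   # globally unique primary key of the subject
    order_number: str  # derived from the query request; plays the role of the transaction
    flag: int = SYNC_PENDING


def to_transaction(primary_key: str, order_number: str) -> PKConstraintTransaction:
    """Convert a query request for a subject without an identifier into a transaction."""
    return PKConstraintTransaction(primary_key=primary_key, order_number=order_number)
```

Making the record immutable is a design choice of this sketch; it keeps a transaction unchanged while it is passed between data centers.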

[0020] Furthermore, updating the identifier of the subject according to the unique primary key constraint transaction includes:

[0021] Generate a new identifier code;

[0022] Update the identification code using the order number.

[0023] Furthermore, for entities with identification codes, the query request is fed back based on the existing identification codes.

[0024] The present invention also relates to a multi-active data consistency system based on a unique primary key, characterized in that it includes a first data center and several second data centers that are interconnected and run the same application;

[0025] The first data center includes a first load balancing module, several first transaction conversion modules, a first primary key module, a first management service module, and a first distributed database;

[0026] The second data center includes a second load balancing module, several second transaction conversion modules, a second primary key module, a second management service module, and a second distributed database;

[0027] The first load balancing module and the second load balancing module are connected to form a load balancing cluster. The load balancing cluster will distribute multiple concurrent query requests for the same subject to the first transaction conversion module or the second transaction conversion module.

[0028] The first transaction conversion module and the second transaction conversion module check whether there is an identifier code for the corresponding subject according to the allocated query request and convert the query request corresponding to the subject that does not have an identifier code into a unique primary key constraint transaction.

[0029] The first primary key module and the second primary key module manage the unique primary key of the corresponding entity;

[0030] The first management service module and the second management service module are connected and send unique primary key constraint transactions converted by the first transaction conversion module or the second transaction conversion module to each other;

[0031] The first distributed database and the second distributed database are connected and synchronized through bidirectional data replication.

[0032] Furthermore, any second data center can switch to become a new first data center according to the operation command, while the original first data center can automatically switch to become a second data center.

[0033] Furthermore, the second data center checks whether there is an identifier code for the corresponding subject based on the allocated query request, and converts the query request corresponding to the subject for which there is no identifier code into a unique primary key constraint transaction and synchronizes it to the first data center;

[0034] The first data center updates the identifier of the subject based on the unique primary key constraint transaction synchronized by the second data center, and synchronizes the updated identifier to the second data center.

[0035] Furthermore, the first data center updating the identifier of the subject based on the unique primary key constraint transaction synchronized by the second data center includes:

[0036] If the first data center does not have a corresponding identifier for the subject, the first data center will generate a new identifier using the same unique primary key and update the distributed database data.

[0037] When the first data center has an identifier for the corresponding entity, the first data center will extract the data corresponding to the unique primary key in the distributed database to synchronize the identifier of the second data center.

[0038] The beneficial effects of this invention are as follows:

[0039] The multi-active data consistency method and system based on unique primary keys described in this invention uses a globally unique primary key used across different data centers as a constraint to transform the data consistency problem into transaction consistency between data centers. It does not rely on any specific hardware structure or additional software functions. It can provide complete data consistency guarantee for multi-active modes in which each data center operates the same business, and can effectively deal with and avoid inconsistencies and split-brain phenomena that may occur between data centers under special business conditions. It has good universal applicability.

Attached Figure Description

[0040] Figure 1 is a schematic diagram of the multi-active data consistency method based on a unique primary key according to the present invention.

[0041] Figure 2 is a schematic diagram of the multi-active data consistency system architecture based on a unique primary key according to the present invention.

Detailed Implementation

[0042] To better understand the content of this invention, a detailed description will be provided in conjunction with the accompanying drawings and embodiments.

[0043] Figure 1 shows a flowchart of the multi-active data consistency method based on a unique primary key according to the present invention, which includes the following steps:

[0044] Multiple concurrent query requests targeting the same entity can be allocated based on IP address, region, ratio, load, or randomly.

[0045] Check if there is an identification code for the corresponding entity;

[0046] For entities with existing identification codes, the query request is returned based on the existing identification codes;

[0047] For entities that do not have an identifier, a corresponding unique primary key is set for the entity, and the query request is converted into an order number. The order number corresponds to the unique primary key. The globally unique primary key, the order number, and the identifier used to indicate the synchronization operation are combined to form a unique primary key constrained transaction.

[0048] Synchronize the unique primary key constraint transaction and update the identifier of the subject according to it; in particular, generate a new identifier and update the identifier data through the order number. For example, when the default value of the identifier bit indicates that a synchronization operation is required, a new identifier is generated through the identifier generation service, and the relevant background data is updated and merged through the order number.

[0049] The updated identifier code is synchronized, and the query request is responded to.
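The steps above can be sketched end to end with two in-memory stand-ins for data centers. This is a simplified illustration, not the patented implementation: the class, the function names, the identifier format, and the -1 flag value are assumptions, and real synchronization between data centers is replaced by direct dictionary access.

```python
import itertools
from typing import Dict, Tuple

SYNC_PENDING = -1  # assumed default flag value meaning "synchronization required"
Transaction = Tuple[str, str, int]  # (unique primary key, order number, flag)


class DataCenter:
    """In-memory stand-in for one data center's identifier table."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.identifiers: Dict[str, str] = {}  # unique primary key -> identifier code


_counter = itertools.count(1)


def generate_identifier() -> str:
    """Hypothetical identifier generation service."""
    return f"ID-{next(_counter):06d}"


def apply_transaction(main: DataCenter, txn: Transaction) -> str:
    """Main center: create an identifier only if the primary key has none yet."""
    primary_key, _order_number, _flag = txn
    if primary_key not in main.identifiers:
        main.identifiers[primary_key] = generate_identifier()
    return main.identifiers[primary_key]


def handle_query(primary_key: str, order_number: str,
                 local: DataCenter, main: DataCenter) -> str:
    """Answer a query request that was allocated to the `local` data center."""
    existing = local.identifiers.get(primary_key)
    if existing is not None:
        return existing  # identifier exists: respond from existing data
    txn: Transaction = (primary_key, order_number, SYNC_PENDING)
    identifier = apply_transaction(main, txn)    # synchronize the transaction
    local.identifiers[primary_key] = identifier  # synchronize the identifier back
    return identifier
```

Because the identifier is keyed by the unique primary key and created in only one place, two concurrent queries for the same new subject resolve to the same identifier regardless of which data center received them.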

[0050] During the synchronization and updating of the identifier codes, a common list of order numbers is recorded for multiple data centers, and the information (identifier code data) corresponding to the order numbers is processed in a multi-threaded manner. First, the information corresponding to the order numbers is recorded in the database of the main data center (the logically set main center), and then the information corresponding to the order numbers is transmitted to the management and merging service application provided by the management service module for other data centers to synchronize and update.
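A minimal sketch of this multi-threaded processing is shown here, assuming a thread pool and in-memory stand-ins for the main data center database and the hand-off to the management and merging service. All names are hypothetical, and the lock is a simplification that keeps the two writes for one order number together.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock
from typing import Dict, List

main_db: Dict[str, str] = {}  # order number -> identifier data in the main data center
merge_queue: List[str] = []   # order numbers handed to the management and merging service
_lock = Lock()


def process_order(order_number: str, info: str) -> None:
    """Record the information in the main center first, then hand it over."""
    with _lock:
        main_db[order_number] = info
        merge_queue.append(order_number)


def process_shared_list(orders: Dict[str, str], workers: int = 4) -> None:
    """Process the shared list of order numbers with a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_order, number, info)
                   for number, info in orders.items()]
        for future in futures:
            future.result()  # re-raise any exception from a worker
```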

[0051] Specifically, for personal information, the management and merging service application checks if the information corresponding to the order number exists in the database. If it does, it retrieves the information from the database and further determines if an identification code exists. If the identification code does not exist, it creates a file, retrieves the corresponding service catalog, product catalog, etc., generates an identification code, queries corresponding ratings and other derivative products, and can also generate corresponding product reports as needed. For enterprise information, the management and merging service application similarly checks if the information corresponding to the order number exists in the database, but for existing information, it calls the query interface to further determine whether the enterprise information involves charges, whether billing functions need to be invoked, and finds key personnel and other derivative products, generating a product report.

[0052] The present invention also relates to a multi-active data consistency system based on a unique primary key, an embodiment of which is shown in Figure 2 and can be used to execute the method described above. The system consists of a first data center and several second data centers running the same application. Figure 2 is a schematic representation of an embodiment containing one second data center; technical solutions employing a larger number of second data centers have a similar structure. In the embodiment shown in Figure 2, each second data center forms a data connection with the first data center, meaning that the first data center can connect to multiple different second data centers simultaneously.

[0053] On the other hand, the first and second data centers in the system are only distinguished by logical execution. Based on actual needs, any second data center can switch to become the first data center at any time, while the original first data center automatically switches to become a second data center, ensuring that the system always maintains a structure with only one first data center. Therefore, the system shown in the embodiment always maintains a radial connection relationship with one first data center as the main center and the other second data centers as secondary centers. However, it also has connectivity capabilities between different second data centers to support any second data center switching to become the first data center while maintaining the overall radial connection relationship of the system.
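The role switch can be sketched as a small state holder in which exactly one data center is marked as first at any time. The class and method names are assumptions for illustration; the text does not describe how the switch is triggered or coordinated.

```python
from typing import List, Tuple


class Cluster:
    """Tracks which data center is logically the first (main) one."""

    def __init__(self, centers: List[str], first: str) -> None:
        if first not in centers:
            raise ValueError(f"unknown data center: {first}")
        self.centers = list(centers)
        self.first = first

    @property
    def seconds(self) -> List[str]:
        """Every data center other than the current first one."""
        return [c for c in self.centers if c != self.first]

    def promote(self, name: str) -> None:
        """Make `name` the first data center; the former first becomes a second."""
        if name not in self.centers:
            raise ValueError(f"unknown data center: {name}")
        self.first = name

    def links(self) -> List[Tuple[str, str]]:
        """Radial connections from the first data center to each second one."""
        return [(self.first, second) for second in self.seconds]
```

Storing only the name of the first data center, rather than a role per data center, makes the "only one first data center" property hold by construction.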

[0054] The first data center includes a first load balancing module, a first transaction conversion module, a first primary key module, a first management service module, and a first distributed database; the second data center includes a second load balancing module, a second transaction conversion module, a second primary key module, a second management service module, and a second distributed database. Preferably, the first and second load balancing modules are connected to form a load balancing cluster, which distributes multiple concurrent query requests for the same entity to either the first or second transaction conversion module. The first and second transaction conversion modules check for the existence of an identifier for the corresponding entity based on the allocated query requests and convert query requests for entities without identifiers into unique primary key constraint transactions. The first and second primary key modules manage the unique primary keys for the corresponding entities. The first and second management service modules are connected and mutually send unique primary key constraint transactions converted by either the first or second transaction conversion module. The first and second distributed databases are connected and synchronize data through bidirectional data replication.

[0055] The internal structures of the first and second data centers are consistent. The functional differences between components are only distinguished by the logical execution differences between the two data centers: The second data center checks if a corresponding entity's identifier exists for each assigned query request and converts query requests for entities without identifiers into unique primary key constraint transactions, which are then synchronized to the first data center. The first data center updates the entity's identifier based on the unique primary key constraint transactions synchronized in the second data center and synchronizes the updated identifier to the second data center. If the first data center does not have a corresponding entity's identifier, it generates a new identifier using the same unique primary key and updates the distributed database data. If the first data center has a corresponding entity's identifier, it extracts the data corresponding to the unique primary key from the distributed database for synchronizing the identifier with the second data center.

[0056] It follows that, in the embodiment shown in Figure 2, the first and second data centers operate as follows: first, both data centers run the same version of the application; second, database data is synchronized using data replication technology; third, globally unique primary keys maintain uniqueness in both data centers; and fourth, business requests are dynamically allocated through global load balancing. Therefore, the first and second data centers can guarantee complete data consistency in a multi-active mode during operation, and also support exchanging roles with each other.

[0057] In actual operation, when an access request occurs, it is allocated to a specific data center for response according to different rules. These rules include, but are not limited to, IP allocation, geographic allocation, proportional allocation, and random allocation. The dual-active data center mode means that the same type of service runs simultaneously in two data centers, responding to access requests and providing services to the outside world at the same time. A globally unique primary key database means that data generated by service requests in the dual-active data centers has a unique global primary key (PK) constraint in all databases of both the first and second data centers. The distributed database means that data is sharded and partitioned in each data center based on the global primary key. Each data center maintains a one-to-many mapping between global primary keys (PK) and transactions. This mapping abstractly records the data flow of data center services. Global data consistency in the dual-center database means that the corresponding global primary keys for the same user in both data centers cannot conflict. Transactions in the dual-center database mean that service requests from the same user in both data centers represent different processing flows, and the processes within the same data center have atomicity. The relationship between the global primary key (PK) and transactions is pushed to the management service nodes of each data center. The primary key-transaction relationship between the service nodes in the two data centers is synchronized with each other, mainly performing two functions: first, scheduling, which is responsible for generating merge tasks; second, calling the primary key-transaction merge module through an interface to perform transaction merging based on the primary key (MergeTransaction By Primary Key) and writing the result to the database. Data from the first data center is synchronized to the second data center through underlying replication technology. This operation is not dependent on any specific product or technology and has very good broad applicability.
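The one-to-many mapping between global primary keys and transactions, and the merge by primary key, can be illustrated as follows. This is a sketch under assumptions: the function name echoes the "MergeTransaction By Primary Key" operation named above, but its signature and the representation of a relation as a (primary key, order number) pair are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Relation = Tuple[str, str]  # (global primary key, order number acting as the transaction)


def merge_transactions_by_primary_key(
    *relation_sets: Iterable[Relation],
) -> Dict[str, List[str]]:
    """Merge the PK-transaction relations of several data centers.

    Returns a one-to-many mapping from each global primary key to the
    de-duplicated order numbers recorded for it, in first-seen order.
    """
    merged: Dict[str, List[str]] = defaultdict(list)
    for relations in relation_sets:
        for primary_key, order_number in relations:
            if order_number not in merged[primary_key]:
                merged[primary_key].append(order_number)
    return dict(merged)
```

For example, if one data center records ("pk-1", "ord-1") and another records ("pk-1", "ord-3"), the merged mapping holds both order numbers under the single key "pk-1", so the two processing flows are attached to one subject rather than two.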

[0058] The following section further illustrates the method and system of this invention through the management of personal / corporate credit information data.

[0059] The question addressed is the following: when a new internal code ID (identifier code) appears in the dual-center database during a single-interface query service for an individual or enterprise, how should the new identifier code and the archive data entries for the same credit subject be merged?

[0060] Scenarios in which newly added internal codes cause data inconsistencies and split-brain include: first, concurrent queries for a new credit subject in two data centers; second, a new credit subject being loaded into one data center and, before the data is synchronized to the other data center, a query for that same credit subject occurring in the other data center. The specific steps for establishing a dual-active data center using the method and system described in this invention to address these situations include:

[0061] Step 1: Different institutions simultaneously send queries for the same credit subject. Some institutions' query requests are assigned to the first data center, while others' query requests are assigned to the second data center.

[0062] Step 2: Query the credit subject internal code information in the file subject internal code database in the second data center.

[0063] Step 3: Determine whether internal code information exists for the individual's two identifiers; if it exists, return that internal code information; if it does not exist, proceed to Step 4.

[0064] Step 4: Treat the credit subject query request as a transaction and maintain a PK-transaction relationship consisting of the credit subject's two identifiers + order number + identifier, where the order number plays the role of the transaction.

[0065] Step 5: The credit subject PK-transaction relationship in the second data center is synchronized to the data merging node in the first data center via ORACLE ADG.

[0066] Step 6: The data merging node in the first data center merges and adds credit subject data according to the rules. The rules include: 1) If the two data centers have the same two identifiers and the identifier bit is -1 (the default value, indicating synchronization), a new internal code is generated through the internal code generation service, and the relevant data in the backend database is updated using the order number; 2) If the backend database of the first data center and the merging node have the same two identifiers and the identifier bit is -1, the data in the merging node is updated, via the order number, using the internal code that the backend database holds for those two identifiers.

[0067] Step 7: Synchronize the internal code for the two identifiers from the first data center to the database in the second data center using database replication technology.

[0068] Step 8: Process the other business records of the credit subject that are associated with the internal code for the two identifiers, such as billing records, query records, and order records, using the same solution.
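The two merging rules of Step 6 can be sketched as follows. This is an illustrative reading of the rules, not the patented implementation: the record layout, the internal code format, and the function names are assumptions. The text does not state what value the identifier bit takes after merging, so the sketch leaves it unchanged.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Optional

SYNC_PENDING = -1  # identifier bit value that indicates synchronization


@dataclass
class PKTransaction:
    two_identifiers: str  # the credit subject's two identifiers (format is hypothetical)
    order_number: str     # plays the role of the transaction
    flag: int = SYNC_PENDING
    internal_code: Optional[str] = None


_sequence = itertools.count(1)


def generate_internal_code() -> str:
    """Hypothetical internal code generation service."""
    return f"IC{next(_sequence):08d}"


def merge_credit_subjects(records: List[PKTransaction],
                          backend: Dict[str, str]) -> Dict[str, str]:
    """Apply the Step 6 rules; returns order number -> internal code."""
    assigned: Dict[str, str] = {}
    for record in records:
        if record.flag != SYNC_PENDING:
            continue  # only records flagged for synchronization are merged
        code = backend.get(record.two_identifiers)
        if code is None:
            # Rule 1: no internal code yet, so generate one and store it.
            code = generate_internal_code()
            backend[record.two_identifiers] = code
        # Rule 2: reuse the backend's internal code for the same two identifiers.
        record.internal_code = code
        assigned[record.order_number] = code
    return assigned
```

In this sketch, two records from different data centers that carry the same two identifiers receive the same internal code, and each order number ends up linked to it.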

[0069] The above description is merely a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.

Claims

1. A multi-active data consistency system based on a unique primary key, characterized in that it includes a first data center and several second data centers that are isomorphic to the first data center, the data centers being interconnected and running the same application; the first data center includes a first load balancing module, several first transaction conversion modules, a first primary key module, a first management service module, and a first distributed database; the second data center includes a second load balancing module, several second transaction conversion modules, a second primary key module, a second management service module, and a second distributed database; the first load balancing module and the second load balancing module are connected to form a load balancing cluster, and the load balancing cluster distributes multiple concurrent query requests for the same subject to the first transaction conversion module or the second transaction conversion module; the first transaction conversion module and the second transaction conversion module check, according to the allocated query request, whether an identifier code exists for the corresponding subject, and convert the query request corresponding to a subject that does not have an identifier code into a unique primary key constraint transaction; the first primary key module and the second primary key module manage the unique primary key of the corresponding subject; the first management service module and the second management service module are connected and send to each other the unique primary key constraint transactions converted by the first transaction conversion module or the second transaction conversion module; the first distributed database and the second distributed database are connected and synchronized through bidirectional data replication; allocating multiple concurrent query requests for the same subject includes: allocating the multiple concurrent query requests for the same subject based on IP address, region, ratio, load, or randomly; converting a query request into a unique primary key constraint transaction includes: setting a unique primary key for the subject; converting the query request into an order number, which corresponds to the unique primary key; and combining the globally unique primary key and the order number to form the unique primary key constraint transaction; the unique primary key constraint transaction also includes an identifier bit, which includes a default value for indicating a synchronization operation; and the unique primary key constraint transaction is synchronized, and the identifier code of the subject is updated according to the unique primary key constraint transaction.

2. The system as described in claim 1, characterized in that any of the second data centers can switch to become the new first data center according to an operation instruction, while the original first data center automatically switches to become a second data center.

3. The system as described in claim 2, characterized in that the second data center checks whether there is an identifier code for the corresponding subject based on the allocated query request, converts the query request corresponding to a subject for which there is no identifier code into a unique primary key constraint transaction, and synchronizes it to the first data center; the first data center updates the identifier of the subject based on the unique primary key constraint transaction synchronized by the second data center, and synchronizes the updated identifier to the second data center.

4. The system as described in claim 3, characterized in that the first data center updating the identifier of the subject based on the unique primary key constraint transaction synchronized by the second data center includes: if the first data center does not have an identifier for the corresponding subject, the first data center generates a new identifier using the same unique primary key and updates the distributed database data; if the first data center has an identifier for the corresponding subject, the first data center extracts the data corresponding to the unique primary key in the distributed database to synchronize the identifier to the second data center.