Incremental synchronization method and device based on materialized view log and computer device
By specifying attribute columns and metadata in the materialized view log to determine the incremental operation fields, incremental data can be extracted directly from the materialized view log, solving the problem of low efficiency in incremental data synchronization in Oracle databases and achieving more efficient data synchronization.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA TELECOM CLOUD TECH CO LTD
- Filing Date
- 2022-07-29
- Publication Date
- 2026-06-16
AI Technical Summary
In existing technologies, incremental data synchronization in Oracle databases based on materialized view log technology requires table lookups, resulting in low efficiency in incremental data extraction and impacting synchronization efficiency.
By specifying multiple attribute columns in the materialized view log and combining them with the metadata of the data source table to determine the fields corresponding to incremental operations, incremental data can be extracted from the materialized view log, reducing the number of table lookups and improving synchronization efficiency.
No need for table lookups, reducing the synchronization time for incremental data and improving the performance of incremental database synchronization.
Smart Images

Figure CN115470290B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of data synchronization technology, and specifically to an incremental synchronization method, apparatus, and computer device based on materialized view logs. Background Technology
[0002] Incremental synchronization of a database requires extracting incremental data from the source database, analyzing and transforming it, and then writing a database script file and operation language suitable for the target database to store the incremental data in the target database.
[0003] Oracle database is a commonly used database type and a database type that database synchronization components must support. Currently, for incremental data synchronization in Oracle databases, when using materialized view log technology, it is necessary to read the materialized view log and then perform a table lookup in the data source table based on the primary key in the materialized view log to retrieve the complete incremental data. However, each log entry in the materialized view log requires a table lookup. Due to network transmission and database performance limitations, the efficiency of incremental data extraction is low, thus affecting the synchronization efficiency of incremental data. Summary of the Invention
[0004] In view of this, embodiments of the present invention provide an incremental synchronization method, apparatus, and computer device based on materialized view logs to solve the problem of low incremental data extraction efficiency, which leads to low incremental data synchronization efficiency.
[0005] According to a first aspect, embodiments of the present invention provide an incremental synchronization method based on a materialized view log, comprising: obtaining a data source table in a source database; creating a materialized view log corresponding to the data source table, the materialized view log including multiple attribute columns, the attribute columns being used to determine the fields and update values involved in incremental operations of the data source table; when an incremental operation is generated for the data source table, determining the fields corresponding to the incremental operation based on the metadata of the data source table; when the field corresponding to the incremental operation is in the attribute columns, extracting the incremental data corresponding to the incremental operation from the materialized view log; and synchronizing the incremental data to a target database.
[0006] The incremental synchronization method based on materialized view logs provided in this invention specifies multiple attribute columns in the materialized view log corresponding to the data source table. When an incremental operation occurs, the field corresponding to the incremental operation can be determined by combining the multiple attribute columns and the metadata of the data source table. If the field is in the attribute column, the incremental data is extracted from the materialized view log. This eliminates the need for table lookups, reduces the number of table lookups, improves the synchronization efficiency of incremental data, and enhances the performance of database incremental synchronization.
[0007] In conjunction with the first aspect, in the first embodiment of the first aspect, the plurality of attribute columns include a field change vector column, an incremental operation type column, and a data column. The step of determining the field corresponding to the incremental operation when an incremental operation is generated for the data source table includes: updating the incremental operation in the incremental operation type column when an incremental operation is generated for the data source table; and determining the field corresponding to the incremental operation based on the incremental operation type column, the field change vector column, and the data column.
[0008] The incremental synchronization method based on materialized view logs provided in this invention updates the incremental operation in the incremental operation type column. This allows the field generating the data update to be determined based on the incremental operation type, the field change vector column, and the data column—that is, the field generating the incremental operation is identified from the materialized view log. This enables the confirmation of updated data based on the incremental operation, eliminating the need to query the data source table, reducing the time spent querying the incremental data, and improving data synchronization efficiency.
[0009] In conjunction with the first embodiment of the first aspect, in the second embodiment of the first aspect, determining the field corresponding to the incremental operation based on the incremental operation type column, the field change vector column, and the data column includes: determining incremental operations corresponding to the same primary key information based on the primary key information of the materialized view; when there are multiple incremental operations, merging the multiple incremental operations based on preset rules to obtain a target incremental operation; and determining the field of the target incremental operation in the materialized view log based on the field change vector column, the target incremental operation, and the data column.
[0010] The incremental synchronization method based on materialized view logs provided in this invention merges multiple incremental operations, making it easier to confirm the final state of incremental data, further reducing the number of table lookups and the size of incremental data, and facilitating subsequent operations.
[0011] In conjunction with the second embodiment of the first aspect, in the third embodiment of the first aspect, the types of incremental operations include insertion operations, update operations, and deletion operations. The step of merging multiple incremental operations based on preset rules to obtain a target incremental operation includes: when multiple incremental operations are sequentially insertion and update operations, merging multiple incremental operations based on the preset rules to determine the target incremental operation as an insertion operation; when multiple incremental operations are sequentially insertion and deletion operations, merging the incremental operations based on the preset rules to determine the target incremental operation as a deletion operation; when multiple incremental operations are all update operations, merging the incremental operations based on the preset rules to determine the target incremental operation as an update operation; when multiple incremental operations are sequentially update and deletion operations, merging the incremental operations based on the preset rules to determine the target incremental operation as a deletion operation; and when multiple incremental operations are sequentially deletion and insertion operations, merging the incremental operations based on the preset rules to determine the target incremental operation as an insertion operation.
[0012] The incremental synchronization method based on materialized view logs provided in this invention merges multiple incremental operations according to the order in which they are generated by preset rules, ensuring that multiple incremental operations for the same primary key can be merged, thereby minimizing the number of table lookups.
[0013] In conjunction with the first aspect, in the fourth embodiment of the first aspect, after determining the field corresponding to the incremental operation based on the metadata of the data source table, the method further includes: determining whether the field corresponding to the incremental operation is a big data field; when the field corresponding to the incremental operation is not the big data field, determining whether the field corresponding to the incremental operation is in the attribute column.
[0014] The incremental synchronization method based on materialized view logs provided in this invention reduces the number of table lookups for non-big data fields by determining the corresponding fields for incremental operations. This improves the performance of incremental data extraction for data source tables that do not contain big data fields, or for scenarios where big data fields are read more and written less.
[0015] In conjunction with the first aspect, in the fifth embodiment of the first aspect, the step of obtaining the data source table in the source database and creating the materialized view log corresponding to the data source table includes: obtaining the metadata of the data source table, wherein the metadata includes a data primary key, a data name, and a data type; and generating a materialized view log corresponding to the metadata based on the data primary key, data name, and data type of the metadata.
[0016] The incremental synchronization method based on materialized view logs provided in this invention constructs corresponding materialized view logs through the data primary key, data name, and data type corresponding to the metadata, so as to accurately locate the data that generated the incremental operation.
[0017] In conjunction with the first aspect, in the sixth embodiment of the first aspect, after synchronizing the incremental data to the target database, the method further includes: cleaning the materialized view log.
[0018] The incremental synchronization method based on materialized view logs provided in this embodiment of the invention cleans up the log data that has been synchronized from the materialized view log after the incremental data synchronization is completed, thereby releasing data space and saving data space.
[0019] According to a second aspect, embodiments of the present invention provide an incremental synchronization device based on materialized view logs, comprising: an acquisition module, configured to acquire a data source table in a source database and create a materialized view log corresponding to the data source table, the materialized view log including attribute columns for updating the metadata of the data source table; a first determination module, configured to determine the field corresponding to the incremental operation when an incremental operation is performed on the data source table; an extraction module, configured to extract incremental data corresponding to the incremental operation from the materialized view log when the field corresponding to the incremental operation is in the attribute columns; and a synchronization module, configured to synchronize the incremental data to a target database.
[0020] According to a third aspect, embodiments of the present invention provide a computer device, including: a memory and a processor, wherein the memory and the processor are communicatively connected to each other, the memory stores computer instructions, and the processor executes the computer instructions to perform the incremental synchronization method based on materialized view logs as described in the first aspect or any embodiment of the first aspect.
[0021] According to a fourth aspect, embodiments of the present invention provide a computer-readable storage medium storing computer instructions for causing a computer to perform the incremental synchronization method based on materialized view logs as described in the first aspect or any embodiment of the first aspect.
[0022] It should be noted that the beneficial effects of the incremental synchronization device, computer equipment, and computer-readable storage medium based on materialized view logs provided in the embodiments of the present invention can be found in the description of the corresponding content in the incremental synchronization method based on materialized view logs, and will not be repeated here. Attached Figure Description
[0023] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0024] Figure 1 An architecture diagram of the data synchronization system in an embodiment of the present invention is shown;
[0025] Figure 2 This is a flowchart of an incremental synchronization method based on materialized view logs according to an embodiment of the present invention;
[0026] Figure 3 This is another flowchart of the incremental synchronization method based on materialized view logs according to an embodiment of the present invention;
[0027] Figure 4 This is another flowchart of the incremental synchronization method based on materialized view logs according to an embodiment of the present invention;
[0028] Figure 5 This is a structural block diagram of the incremental synchronization device based on materialized view logs according to an embodiment of the present invention;
[0029] Figure 6 This is a schematic diagram of the hardware structure of a computer device provided in an embodiment of the present invention. Detailed Implementation
[0030] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0031] Currently, synchronizing incremental data in a database requires extracting the incremental data from the source database, analyzing and transforming it, and then writing a database script file and operation language suitable for the target database to store the incremental data in the target database. However, Oracle database is one of the commonly used database types, and Oracle database synchronization relies on materialized view logs. It performs a table lookup in the data source table based on the primary key in the materialized view log to retrieve the complete incremental data. However, each log entry in the materialized view log requires a table lookup. Due to network transmission and database performance limitations, the efficiency of incremental data extraction is low, thus affecting the synchronization efficiency of incremental data.
[0032] Based on this, this technical solution extracts incremental data from the materialized view log according to the operational characteristics between the materialized view log and the data source table. This eliminates the need for table lookups, reduces the number of table lookups, improves the synchronization efficiency of incremental data, and enhances the performance of incremental database synchronization.
[0033] This invention provides a data synchronization system, such as... Figure 1 As shown, the data synchronization system includes a source database, a target database, a management node cluster, a synchronization node cluster, and a coordinator cluster. During data synchronization, incremental data is extracted from the source database, analyzed and transformed by the management node cluster, synchronization node cluster, and coordinator cluster, and written into an operation language suitable for the target database. The incremental data is then stored in the target database.
[0034] The source database can be a distributed relational database, a distributed file system, or an unstructured database; no specific restrictions are placed on the type of database here.
[0035] The target database can be a distributed relational database, a distributed file system, or an unstructured database; no specific restrictions are placed on the type of database here.
[0036] The management node cluster is used for data verification and configuration, and pushes the data verification and configuration information to the synchronization node cluster. It also receives data synchronization status and progress information from the synchronization node cluster.
[0037] The synchronization node cluster is used to execute the specific data synchronization process.
[0038] The coordinator cluster is used to coordinate data synchronization from the source data to the target database.
[0039] According to an embodiment of the present invention, an embodiment of incremental synchronization based on materialized view logs is provided. It should be noted that the steps shown in the flowchart in the accompanying drawings can be executed in a computer system such as a set of computer-executable instructions. Furthermore, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be executed in a different order than that shown here.
[0040] This embodiment provides an incremental synchronization method based on materialized view logs, which can be used for data synchronization between the source database and the target database in the above data synchronization system. This embodiment is for the source database to be a relational database Oracle, and the target database to be any relational database, non-relational database, or message queue, etc. Figure 2 This is a flowchart of an incremental synchronization method based on materialized view logs according to an embodiment of the present invention, such as... Figure 2 As shown, the process includes the following steps:
[0041] S11, retrieve the data source table from the source database and create a materialized view log corresponding to the data source table. The materialized view log includes multiple attribute columns, which are used to determine the fields and update values involved in the incremental operations of the data source table.
[0042] The data source table is a data table in the source database used to record source data, while the materialized view log is a log table used to record update operations (i.e., additions, deletions, and modifications) on the data source table. The source database typically includes multiple data source tables, each corresponding to a different materialized view log; that is, each data source table has its own materialized view log, which occupies a certain amount of storage space.
[0043] Specifically, after obtaining the data source table, a corresponding materialized view log is created. When creating the materialized view log table corresponding to the data source table, a rowid is automatically generated for each log record. Simultaneously, during the creation of the materialized view log, attribute columns can be specified. These attribute columns characterize the updates to the source data in the data source table. These attribute columns can include filter columns, sequence, including new values, dmltype, oldnew, changevector, etc.
[0044] It should be noted that `filter columns` is used to record column values in the data source table, and can be any combination of fields in the data source table that are not of big data (LOB) type; `dmltype` indicates the incremental operation type, which can be INSERT (I), UPDATE (U), or DELETE (D); `including new values` indicates that the updated data is included; `oldnew` indicates whether the log row has a new or old value; `changevector` indicates the columns that have been updated, stored in `raw(255)`, with the updated columns marked as 1 in the corresponding binary bits; `sequence` indicates the order of incremental operations. Of course, other attributes or characteristics can be specified according to actual needs, which are not specifically limited here.
[0045] S12, when an incremental operation is performed on the data source table, determine the field corresponding to the incremental operation.
[0046] The data synchronization system can monitor in real time whether incremental operations have occurred on the data source table. When an incremental operation occurs on the data source table, the materialized view log will be refreshed in tandem with the incremental operation. By querying the metadata of the data source table, the field information of the data source table (including name, type, order in the data source table, etc.) can be obtained. The changevector column in the materialized view log can mark the fields that have been changed in the data source table. Here, by comparing the field information in the metadata with the changevector column, the field corresponding to the incremental operation can be determined. That is, by examining the changevector column in the materialized view log table, the field corresponding to the incremental operation in the materialized view log can be inferred.
[0047] S13, when the field corresponding to the incremental operation is in the attribute column, extract the incremental data corresponding to the incremental operation from the materialized view log.
[0048] The filter columns in the materialized view log record column values from the data source table. The field that generated the incremental operation is compared with the fields contained in the filter columns of the attribute column to determine if the field corresponding to the incremental operation is in any of the filter columns. If the field corresponding to the incremental operation is in any of the filter columns, it means that the incremental data corresponding to the incremental operation can be directly assembled and obtained from the materialized view log. In this case, there is no need to return to the data source table to query the incremental data; the incremental data corresponding to the incremental operation can be directly extracted from the materialized view log.
[0049] S14, synchronize incremental data to the target database.
[0050] After the database synchronization system extracts the incremental data corresponding to the incremental operation from the materialized view log, it writes the incremental data into an operation language suitable for the target database, so as to synchronize the incremental data to the target database. This allows for incremental updates to the data in the target database without having to perform another table lookup.
[0051] The incremental synchronization method based on materialized view logs provided in this embodiment specifies multiple attribute columns in the materialized view log corresponding to the data source table. When an incremental operation occurs, the field corresponding to the incremental operation can be determined by combining the multiple attribute columns and the metadata of the data source table. If the field is in the attribute column, the incremental data is extracted from the materialized view log. This eliminates the need for table lookups, reduces the number of table lookups, improves the synchronization efficiency of incremental data, and enhances the performance of database incremental synchronization.
[0052] This embodiment provides an incremental synchronization method based on materialized view logs, which can be used for data synchronization between a source database and a target database in the aforementioned data synchronization system. This implementation is applicable when the source database is a relational database (Oracle) and the target database is any relational database, non-relational database, or message queue, etc. Figure 3 This is a flowchart of an incremental synchronization method based on materialized view logs according to an embodiment of the present invention, such as... Figure 3 As shown, the process includes the following steps:
[0053] S21, retrieve the data source table from the source database and create a materialized view log corresponding to the data source table. The materialized view log includes multiple attribute columns, which are used to determine the fields and update values involved in the incremental operations of the data source table.
[0054] For detailed explanations, please refer to the relevant descriptions corresponding to the above embodiments, which will not be repeated here.
[0055] S22, When an incremental operation is performed on a data source table, determine the field corresponding to the incremental operation.
[0056] Specifically, the attribute columns include a field change vector column, an incremental operation type column, and a data column. The field change vector column represents the fields involved in the current incremental operation in the data source table. The incremental operation type column records the type of the current incremental operation, including insert, update, and delete operations. The data column records the names and latest values of non-large data fields in the data source table.
[0057] Accordingly, step S22 above may include:
[0058] S221, When an incremental operation is performed on a data source table, the incremental operation is updated in the incremental operation type column.
[0059] When an incremental operation is detected in the data source table, the materialized view log updates along with the source data update in the data source table, determines the type of incremental operation corresponding to the current source data update, such as inserting new data or updating the value of a row of data, and records the incremental operation in the incremental operation type column of the materialized view log.
[0060] S222, based on the incremental operation type column, field change vector column, and data column, determine the field corresponding to the incremental operation.
[0061] The field change vector column in the materialized view log indicates the location of data updates in the data source table. The incremental operation type column indicates the data update method, such as inserting new data, deleting data, or updating data values. By querying the incremental operation type column, field change vector column, and non-large data field columns in the materialized view log, the field where the incremental operation occurred can be determined.
[0062] This section uses a specific example, such as the data source table shown in Table 1, to create a materialized view log corresponding to Table 1. Then, the following incremental operations are performed on Table 1:
[0063] 1) Insert data with id=1;
[0064] 2) Modify the price field of id=2.
[0065] The materialized view logs after the above incremental operations are shown in Table 2. Here, dmltype indicates the type of incremental operation, i.e., the incremental operation type column; oldnew indicates whether the log row is a new value (N) or an old value (O); changvector indicates the data column where the field has changed, i.e., the field change vector column.
[0066] Table 1. Example of data source representation
[0067] id(number) name(varchar) price(number) des(clob) 1 book 10.0 text 2 pen 9 text
[0068] Table 2 Example of Materialized View Log
[0069] rowid id name price dmltype oldnew changevector row1 1 book 10.0 I N FEFF row2 2 pen 8 U O 0800 row2 2 pen 9 U N 0800
[0070] It should be noted that the field change vector column (changevector) uses binary. Each bit (0 or 1) indicates whether the corresponding column has been changed. For example, if the first column has been changed, it would be 00000010; if the second column has been changed, it would be 00000100; and if the third column has been changed, it would be 00001000. Furthermore, it uses little-endian mode and is converted to hexadecimal, resulting in 0200, 0400, and 0800 respectively. The changevector corresponding to an insert operation is FEFF. The changevector corresponding to a delete operation is 0000.
[0071] As an optional implementation, step S222 above may include:
[0072] (1) Based on the primary key information of the materialized view log, determine the incremental operation corresponding to the same primary key information.
[0073] In the relationship between the data source table and the materialized view log, the primary key information is used to represent the data record in the data source table corresponding to the materialized view log. That is, the primary key information is used to uniquely identify each data record.
[0074] The primary key information in the materialized view log can uniquely correspond to a data record in the data source table. Different primary key information corresponds to different data records, and the incremental operation generated for the data corresponding to each primary key information can be uniquely determined through the primary key information in the materialized view log.
[0075] (2) When there are multiple incremental operations, the multiple incremental operations are merged based on preset rules to obtain the target incremental operation.
[0076] The preset rules are pre-defined merging rules based on the order in which multiple incremental operations are generated. When multiple incremental operations are generated for data with the same primary key information, in order to facilitate the determination of the final incremental operation, the incremental operations can be merged according to the preset rules to obtain the target incremental operation after merging multiple incremental operations, and the target incremental operation is updated to the incremental operation type column dmltype.
[0077] Specifically, the types of incremental operations include insertion, update, and deletion operations. Taking two incremental operations as an example, step (2) above may include:
[0078] (21) When multiple incremental operations are successively insertion and update operations, multiple incremental operations are merged based on preset rules, and the target incremental operation is determined to be an insertion operation.
[0079] (22) When multiple incremental operations are insertion and deletion operations in sequence, the incremental operations are merged based on preset rules, and the target incremental operation is determined to be a deletion operation.
[0080] (23) When multiple incremental operations are all update operations, merge the incremental operations based on preset rules and determine the target incremental operation as an update operation.
[0081] (24) When multiple incremental operations are update operations and deletion operations in sequence, the incremental operations are merged based on preset rules, and the target incremental operation is determined to be a deletion operation.
[0082] (25) When multiple incremental operations are deletion and insertion operations in sequence, the incremental operations are merged based on preset rules, and the target incremental operation is determined to be an insertion operation.
[0083] The merging of the above incremental operations can be expressed as:
[0084] 1) INSERT + UPDATE = INSERT;
[0085] 2) INSERT + DELETE = DELETE;
[0086] 3) UPDATE + UPDATE = UPDATE;
[0087] 4) UPDATE + DELETE = DELETE;
[0088] 5) DELETE+INSERT = INSERT.
[0089] Specifically, when creating a materialized view log, the `sequence` attribute is specified to record the order of incremental operations. For merging two or more incremental operations, they can be merged pairwise sequentially to obtain the target incremental operation. Let's take three operations as an example. When the three incremental operations are, in order, a delete operation, an insert operation, and an update operation, the delete operation is first merged with the insert operation to obtain the insert operation; then, the insert operation is merged with the update operation to obtain the insert operation, meaning the target incremental operation is the insert operation.
[0090] (3) Based on the field change vector column, the target incremental operation and the data column, determine the field of the target incremental operation in the materialized view log.
[0091] The field change vector column indicates the location of the source data in the data source table where the data update occurred. Then, the target incremental operation reveals the final updated value of the source data. The target incremental operation is recorded in the incremental operation type column. Therefore, by querying the incremental operation type column, field change vector column, and non-large data field data columns in the materialized view log, the field where the incremental operation occurred can be determined.
[0092] Whether a table lookup is needed depends on the merged results of the incremental operations. Specifically, there are three types of operations after the incremental operations are merged:
[0093] 1) After merging, it is an insert operation. If the filter columns of the materialized view log contain fields from all data source tables, then there is no need to look up the table; otherwise, a table lookup is required.
[0094] 2) After merging, there is an update operation. Based on each update operation, the fields updated multiple times are merged in sequence. If the updated field is contained in the attribute column filter columns, there is no need to look up the table; otherwise, a table lookup is required.
[0095] 3) If the merged operation is followed by a deletion, the deletion can be performed based on the primary key information without needing to look up the table.
[0096] Here is a specific example to illustrate this. As shown in Table 1, after the materialized view log corresponding to Table 1 is created, multiple incremental operations are performed on the metadata in Table 1 as shown in Table 3.
[0097] Table 3 Examples of Multiple Incremental Operations
[0098]
[0099]
[0100] In a batch of materialized view log data, incremental operations are performed to merge log data with the same primary key information, resulting in the following: Figure 4 The materialized view log shown.
[0101] Table 4. Example of materialized view logs after incremental operations are merged.
[0102] rowid id name price dmltype oldnew changevector row2 2 pen 9 U O 0C row2 2 pencil 10 U N 0C row1 1 book 10.0 D O 00
[0103] S23, when the field corresponding to the incremental operation is in the attribute column, the incremental data corresponding to the incremental operation is extracted from the materialized view log.
[0104] For detailed explanations, please refer to the relevant descriptions corresponding to the above embodiments, which will not be repeated here.
[0105] S24, synchronize incremental data to the target database.
[0106] For detailed explanations, please refer to the relevant descriptions corresponding to the above embodiments, which will not be repeated here.
[0107] The incremental synchronization method based on materialized view logs provided in this embodiment updates the incremental operation in the incremental operation type column. This allows the field that caused the data update to be determined based on the incremental operation type, the field change vector column, and the data column—that is, the field that caused the incremental operation is identified from the materialized view log. This enables confirmation of updated data based on incremental operations without needing to query the data source table, reducing the time spent on table lookups for incremental data and improving data synchronization efficiency. By merging multiple incremental operations, it is easier to confirm the final state of the incremental data, minimizing the number of table lookups, reducing the size of the incremental data, and facilitating subsequent operations.
[0108] This embodiment provides an incremental synchronization method based on materialized view logs, which can be used for data synchronization between a source database and a target database in the aforementioned data synchronization system. This implementation is applicable when the source database is a relational database (Oracle) and the target database is any relational database, non-relational database, or message queue, etc. Figure 4 This is a flowchart of an incremental synchronization method based on materialized view logs according to an embodiment of the present invention, such as... Figure 4 As shown, the process includes the following steps:
[0109] S31, retrieve the data source table from the source database and create a materialized view log corresponding to the data source table. The materialized view log includes multiple attribute columns, which are used to determine the fields and update values involved in the incremental operations of the data source table.
[0110] As an optional implementation, the above steps may include:
[0111] S311, retrieve the metadata of the data source table, which includes the primary key, data name, and data type.
[0112] The data synchronization system reads the data source table to be synchronized from the source database and reads the metadata in the data source table, including the data primary key, data name and data type of the data source table, and may also include data location, and of course other information, which is not specifically limited here.
[0113] Among them, the primary key is used to identify the uniqueness of the data, such as the data identifier ID in Table 1 above; the data name represents the name of each column of data, such as name in Table 1 above; the data type represents the type of data, such as lob type, text type, etc., such as des(clob) in Table 1 above; and the data location represents the index position of the data table in the source database.
[0114] S312 generates a materialized view log corresponding to the metadata based on the data primary key, data name, and data type of the metadata.
[0115] Based on the primary key, data name, and data type of the metadata, the data and field information of the data source table described by the metadata are determined. Then, based on the data of the data source table corresponding to the metadata, the appropriate syntax is used to create a materialized view log, specifying multiple attribute columns for the materialized view log, including features such as dmltype, oldnew, changevector, sequence, and filter columns.
[0116] It should be noted that filter columns can contain multiple fields, such as data identifier (id), data name (name), and other data information.
[0117] S32 determines the field corresponding to the incremental operation when an incremental operation is performed on the data source table.
[0118] For detailed explanations, please refer to the relevant descriptions corresponding to the above embodiments, which will not be repeated here.
[0119] S33, determine whether the field corresponding to the incremental operation is a big data field.
[0120] The "Big Data" field indicates that the data corresponding to this incremental operation is of type LOB. Materialized view logs update data that is not LOB; however, when updating data in LOB fields, a table lookup is still required.
[0121] When an incremental operation is detected in the data source table, it is determined whether the field in which the incremental operation occurred is a big data field. If the field corresponding to the incremental operation is not a big data field, step S34 is executed; otherwise, a table lookup is performed.
[0122] S34, determine whether the field corresponding to the incremental operation is in the attribute column.
[0123] When the field corresponding to the incremental operation is not a large data field, the field corresponding to the incremental operation is compared with each field contained in the attribute column to determine whether the field corresponding to the incremental operation is in the filter columns of the attribute column. If the field corresponding to the incremental operation is in the filter columns of the attribute column, then proceed to the next step S35; otherwise, perform a table lookup.
[0124] Specifically, the data synchronization system can sequentially analyze each row of data in the materialized view log, determine whether the data that generated the incremental operation needs to be retrieved from the table, and perform statistics on the incremental data corresponding to the incremental operation. After the data aggregation is completed, the data that needs to be retrieved from the table is retrieved, while the data that does not need to be retrieved from the table is directly extracted from the materialized view log.
[0125] S35, when the field corresponding to the incremental operation is in the attribute column, the incremental data corresponding to the incremental operation is extracted from the materialized view log.
[0126] For detailed explanations, please refer to the relevant descriptions corresponding to the above embodiments, which will not be repeated here.
[0127] S36 synchronizes incremental data to the target database.
[0128] For detailed explanations, please refer to the relevant descriptions corresponding to the above embodiments, which will not be repeated here.
[0129] S37, perform data cleanup on the materialized view log.
[0130] After the incremental data is successfully imported into the target database, the materialized view logs that have already been synchronized are cleaned up, and the system continues to wait for log data generated by the next incremental operation. When new log data is found in the materialized view logs, the system returns to step S32 to execute the incremental data synchronization process.
[0131] The incremental synchronization method based on materialized view logs provided in this embodiment constructs corresponding materialized view logs using the primary key, data name, and data type corresponding to the metadata, in order to accurately locate the data that generated the incremental operation. For incremental operations on non-big data fields, the number of table lookups for non-big data fields is reduced by identifying the corresponding field of the incremental operation. This improves the performance of incremental data extraction for data source tables that do not contain big data fields, or for scenarios where big data fields are read more and written less. After the incremental data synchronization is completed, the log data that has been synchronized is cleaned up from the materialized view logs, releasing data space and saving data usage space.
[0132] This embodiment also provides an incremental synchronization device based on materialized view logs, which is used to implement the above embodiments and preferred embodiments; details already described will not be repeated. As used below, the term "module" can be a combination of software and / or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, hardware implementation, or a combination of software and hardware, is also possible and contemplated.
[0133] This embodiment provides an incremental synchronization device based on materialized view logs, such as... Figure 4 As shown, it includes:
[0134] The acquisition module 41 is used to acquire the data source table in the source database and create a materialized view log corresponding to the data source table. The materialized view log includes multiple attribute columns, which are used to determine the fields and update values involved in the incremental operations of the data source table.
[0135] The first determining module 42 is used to determine the field corresponding to the incremental operation when an incremental operation is generated for the data source table.
[0136] Extraction module 43 is used to extract incremental data corresponding to incremental operations from the materialized view log when the field corresponding to the incremental operation is in the attribute column.
[0137] Synchronization module 44 is used to synchronize incremental data to the target database.
[0138] As an optional implementation, the multiple attribute columns include a field change vector column, an incremental operation type column, and a data column. The first determining module 42 may include:
[0139] The update submodule is used to update the incremental operation type column when an incremental operation is generated for a data source table.
[0140] The first determination submodule is used to determine the field corresponding to the incremental operation based on the incremental operation type column, the field change vector column, and the data column.
[0141] As an optional implementation, the second determining submodule described above may include:
[0142] Retrieve sub-units to determine multiple incremental operations corresponding to the same primary key information based on the primary key information in the materialized view log.
[0143] The merging subunit is used to merge multiple incremental operations based on preset rules to obtain the target incremental operation when there are multiple incremental operations.
[0144] The sub-unit is used to determine the field in the materialized view log for the target incremental operation based on the field change vector column, the target incremental operation, and the data column.
[0145] Specifically, the types of incremental operations include insertion, update, and deletion operations, and the above-mentioned merging subunit is used for:
[0146] When multiple incremental operations are sequentially insertion and update operations, multiple incremental operations are merged based on preset rules, and the target incremental operation is determined to be an insertion operation.
[0147] When multiple incremental operations are successively insertion and deletion operations, the incremental operations are merged based on preset rules, and the target incremental operation is determined to be a deletion operation.
[0148] When multiple incremental operations are all update operations, the incremental operations are merged based on preset rules, and the target incremental operation is determined to be an update operation.
[0149] When multiple incremental operations are successively update and delete operations, the incremental operations are merged based on preset rules, and the target incremental operation is determined to be a delete operation.
[0150] When multiple incremental operations are successively deletion and insertion operations, the incremental operations are merged based on preset rules, and the target incremental operation is determined to be an insertion operation.
[0151] As an optional implementation, the acquisition module 41 described above may include:
[0152] The first submodule is used to retrieve the metadata of the data source table, which includes the primary key, data name, and data type.
[0153] The generation submodule is used to generate materialized view logs corresponding to the metadata, based on the data primary key, data name, and data type of the metadata.
[0154] As an optional implementation, the above-described incremental synchronization device based on materialized view logs may further include:
[0155] The first judgment module is used to determine whether the field corresponding to the incremental operation is a big data field.
[0156] The second judgment module is used to determine whether the field corresponding to the incremental operation is in the attribute column when the field corresponding to the incremental operation is not a big data field.
[0157] As an optional implementation, the above-described incremental synchronization device based on materialized view logs may further include:
[0158] The cleanup module is used to clean up data in the materialized view logs.
[0159] In this embodiment, the incremental synchronization device based on materialized view logs is presented in the form of functional units. Here, a unit refers to an ASIC circuit, a processor and memory that execute one or more software or fixed programs, and / or other devices that can provide the above functions.
[0160] Further functional descriptions of the above modules are the same as those in the corresponding embodiments described above, and will not be repeated here.
[0161] The incremental synchronization device based on materialized view logs provided in this invention specifies an attribute column in the materialized view log corresponding to the data source table. When an incremental operation occurs, the field corresponding to the incremental operation is determined. If the field is in the attribute column, the incremental data is extracted from the materialized view log. This eliminates the need for table lookups, reduces the number of table lookups, improves the synchronization efficiency of incremental data, and enhances the performance of database incremental synchronization.
[0162] This invention also provides a computer device, which includes a source database or a target database and has the above-described features. Figure 5 The incremental synchronization device based on materialized view logs is shown.
[0163] Please see Figure 6 , Figure 6 This is a schematic diagram of the structure of a computer device provided in an optional embodiment of the present invention, such as... Figure 6As shown, the computer device may include: at least one processor 501, such as a central processing unit (CPU), at least one communication interface 503, memory 504, and at least one communication bus 502. The communication bus 502 is used to enable communication between these components. The communication interface 503 may include a display screen and a keyboard; optionally, the communication interface 503 may also include a standard wired interface or a wireless interface. The memory 504 may be high-speed volatile random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory 504 may also be at least one storage device located remotely from the aforementioned processor 501. The processor 501 may be combined with... Figure 5 The described apparatus has an application program stored in memory 504, and a processor 501 calls the program code stored in memory 504 to perform any of the above method steps.
[0164] The communication bus 502 can be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc. The communication bus 502 can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, Figure 6 The bus is represented by a single thick line, but this does not mean that there is only one bus or one type of bus.
[0165] The memory 504 may include volatile memory, such as random-access memory (RAM); the memory may also include non-volatile memory, such as flash memory, hard disk drive (HDD) or solid-state drive (SSD); the memory 504 may also include a combination of the above types of memory.
[0166] The processor 501 can be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
[0167] The processor 501 may further include a hardware chip. This hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
[0168] Optionally, memory 504 is also used to store program instructions. Processor 501 can call the program instructions to implement the functions described in this application. Figures 2 to 4 The incremental synchronization method based on materialized view logs is shown in the embodiment.
[0169] This invention also provides a non-transitory computer storage medium storing computer-executable instructions that can execute the incremental synchronization method based on materialized view logs in any of the above method embodiments. The storage medium can be a magnetic disk, optical disk, read-only memory (ROM), random access memory (RAM), flash memory, hard disk drive (HDD), or solid-state drive (SSD), etc.; the storage medium may also include combinations of the above types of memory.
[0170] Although embodiments of the invention have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations all fall within the scope defined by the appended claims.
Claims
1. An incremental synchronization method based on materialized view logs, characterized in that, include: Obtain the data source table from the source database, and create a materialized view log corresponding to the data source table. The materialized view log includes multiple attribute columns, which are used to determine the fields and update values involved in the incremental operation of the data source table. When an incremental operation is performed on the data source table, the field corresponding to the incremental operation is determined. If the field corresponding to the incremental operation is in the attribute column, then the incremental data corresponding to the incremental operation is extracted from the materialized view log. Synchronize the incremental data to the target database; The step of obtaining the data source table in the source database and creating the materialized view log corresponding to the data source table includes: obtaining the metadata of the data source table, wherein the metadata includes the data primary key, data name and data type; and generating the materialized view log corresponding to the metadata based on the data primary key, data name and data type of the metadata. The multiple attribute columns include a field change vector column, an incremental operation type column, and a data column. When an incremental operation is generated for the data source table, determining the field corresponding to the incremental operation includes: updating the incremental operation in the incremental operation type column when an incremental operation is generated for the data source table; and determining the field corresponding to the incremental operation based on the incremental operation type column, the field change vector column, and the data column.
2. The method according to claim 1, characterized in that, The step of determining the field corresponding to the incremental operation based on the incremental operation type column, the field change vector column, and the data column includes: Based on the primary key information of the materialized view log, determine the incremental operations corresponding to the same primary key information; When there are multiple incremental operations, the multiple incremental operations are merged based on preset rules to obtain the target incremental operation; Based on the field change vector column, the target incremental operation, and the data column, the field of the target incremental operation in the materialized view log is determined.
3. The method according to claim 2, characterized in that, The incremental operations include insertion, update, and deletion operations. Merging multiple incremental operations based on preset rules to obtain the target incremental operation includes: When multiple incremental operations are sequentially insertion and update operations, the multiple incremental operations are merged based on the preset rules, and the target incremental operation is determined to be an insertion operation. When multiple incremental operations are sequentially insertion and deletion operations, the incremental operations are merged based on the preset rules to determine the target incremental operation as a deletion operation; When multiple incremental operations are all update operations, the incremental operations are merged based on the preset rules to determine the target incremental operation as an update operation; When multiple incremental operations are sequentially update and delete operations, the incremental operations are merged based on the preset rules to determine the target incremental operation as a delete operation; When multiple incremental operations are successively deletion and insertion operations, the incremental operations are merged based on the preset rules, and the target incremental operation is determined to be an insertion operation.
4. The method according to claim 1, characterized in that, After determining the field corresponding to the incremental operation based on the metadata of the data source table, the method further includes: Determine whether the field corresponding to the incremental operation is a big data field; When the field corresponding to the incremental operation is not the big data field, determine whether the field corresponding to the incremental operation is in the attribute column.
5. The method according to claim 1, characterized in that, After synchronizing the incremental data to the target database, the process also includes: Perform data cleanup on the materialized view log.
6. An incremental synchronization device based on materialized view logs, characterized in that, include: The acquisition module is used to acquire the data source table in the source database and create a materialized view log corresponding to the data source table. The materialized view log includes multiple attribute columns, which are used to determine the fields and update values involved in the incremental operation of the data source table. The first determining module is used to determine the field corresponding to the incremental operation when an incremental operation is generated for the data source table; The extraction module is used to extract incremental data corresponding to the incremental operation from the materialized view log when the field corresponding to the incremental operation is in the attribute column. The synchronization module is used to synchronize the incremental data to the target database; The acquisition module includes: a first acquisition submodule, used to acquire the metadata of the data source table, the metadata including the data primary key, data name and data type; and a generation submodule, used to generate a materialized view log corresponding to the metadata based on the data primary key, data name and data type of the metadata. The plurality of attribute columns include a field change vector column, an incremental operation type column, and a data column. The first determining module includes: an update submodule, used to update the incremental operation in the incremental operation type column when an incremental operation is generated for the data source table; and a first determining submodule, used to determine the field corresponding to the incremental operation based on the incremental operation type column, the field change vector column, and the data column.
7. A computer device, characterized in that, include: A memory and a processor are communicatively connected, the memory stores computer instructions, and the processor executes the computer instructions to perform the incremental synchronization method based on materialized view logs as described in any one of claims 1-5.
8. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores computer instructions for causing the computer to execute the incremental synchronization method based on materialized view logs as described in any one of claims 1-5.