Extension method and device for distributed database

An expansion method and device for distributed databases. The technology addresses the problems of existing expansion approaches, namely suspended services, compatibility issues, and difficult maintenance and adjustment. The number of data migration processes can be increased or reduced, so data migration can be accelerated when resources allow.

Active Publication Date: 2018-04-03
CHINA MOBILE GRP GUANGDONG CO LTD +1
13 Cites 8 Cited by

AI-Extracted Technical Summary

Problems solved by technology

[0009] The method of suspending business acceptance has the following problems: 1) the expansion may involve only some tables and some database hosts, yet every business that operates on the affected tables must be suspended, and quite possibly all businesses; 2) the duration of the business interruption depends on the amount of data to be migrated, so when a table holds a large number of records, the pause is longer.
[0010] Using the method of modifying the distribution strateg...

Method used

[0119] Batch data update or delete operations have processing conflicts with data migration operations. To solve operational conflicts and ensure data consistency, the backgrou...

Abstract

The embodiment of the invention discloses an extension method and device for a distributed database. The method comprises the following steps: when it is determined that a data host has been connected to a data access service, a new data distribution strategy is set for all data hosts according to an extension target; after it is determined that migration of the data corresponding to the new data distribution strategy has started, the state of the new distribution strategy is set to "in migration"; and after migration of the data corresponding to the new data distribution strategy is completed in a sub-table routing mode, the new data distribution strategy replaces the old data distribution strategy.

Application Domain

Database distribution/replication; Special data processing applications

Technology Topic

Extension method; Data mining +3

Example Embodiment

[0078] In order to understand the characteristics and technical content of the embodiments of the present invention in more detail, the implementation of the embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The attached drawings are for reference and explanation purposes only, and are not used to limit the embodiments of the present invention.
[0079] Figure 1 is a flowchart of a distributed database expansion method according to an embodiment of the present invention. As shown in Figure 1, the method for expanding a distributed database includes the following processing steps:
[0080] Step 101: When it is determined that a data host has been connected to the data access service, a new data distribution strategy is set for all data hosts according to the expansion target.
[0081] In the embodiment of the present invention, when a new data host is connected, the current data needs to be redistributed. Specifically, a new data distribution strategy is set for all data hosts according to the expansion target, and the current data is migrated according to the new data distribution strategy.
[0082] Step 102: After determining that the data corresponding to the new data distribution strategy starts to migrate, set the status of the new distribution strategy to being migrated.
[0083] Step 103: After the data corresponding to the new data distribution strategy is migrated in a sub-table routing manner, the old data distribution strategy is replaced with the new data distribution strategy.
[0084] The migration of the data corresponding to the new data distribution strategy in a sub-table routing manner includes:
[0085] generating a new table name corresponding to the new data distribution strategy according to the new data distribution strategy and the database table naming specification;
[0086] creating a table with the new table name in the corresponding database of each data host involved;
[0087] storing the data to be migrated in the created tables in a multi-process, multi-threaded manner.
[0088] Storing the data to be migrated in the created tables includes:
[0089] starting a cursor on each original table and reading the records under the cursor in turn;
[0090] deleting each record from the original table according to the primary key of the original table and the value of the record read, and inserting the value of the record read into the corresponding newly created table according to the distribution routing rule of the new data distribution strategy.
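The naming and routing steps above can be sketched as follows. This is only an illustration under stated assumptions: the source does not specify the routing rule or the table naming specification, so the hash-modulo rule in `route` and the `<base>_v<version>` convention in `new_table_name` are hypothetical, as are the host names.

```python
import zlib


def route(key: str, hosts: list[str]) -> str:
    """Pick a host for a key with a stable hash modulo the host count.

    Assumption: the patent does not define the distribution routing rule;
    hash-modulo is used here purely for illustration.
    """
    return hosts[zlib.crc32(key.encode("utf-8")) % len(hosts)]


def new_table_name(base: str, strategy_version: int) -> str:
    """Hypothetical naming convention: <base>_v<version>."""
    return f"{base}_v{strategy_version}"


old_hosts = ["db0", "db1"]          # hosts under the old strategy
new_hosts = ["db0", "db1", "db2"]   # one host added by the expansion
keys = [f"user{i}" for i in range(1000)]

# Keys whose host changes when the strategy switches from old to new.
moved = [k for k in keys if route(k, old_hosts) != route(k, new_hosts)]
print(new_table_name("orders", 2), len(moved), "keys change host")
```

With a plain modulo rule most keys change host when a host is added, which is one reason the migration described here runs gradually in the background rather than during a service pause.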
[0091] After the foregoing steps 101 to 103, the technical solution of the embodiment of the present invention further includes:
[0092] When inserting the read record value into the corresponding new table fails, the node where the original table is located and the node where the newly created table is located both roll back the transaction, returning to the state before the data to be migrated was stored. When the read record value is successfully inserted into the corresponding new table, the node where the original table is located and the node where the newly created table is located commit their transactions in turn.
[0093] When a single-record update or delete request is received during the migration process and it is determined that there is a data conflict between the single record and the migration operation: if the request arrives before the record is migrated, the updated data is migrated to the new table, or the data is deleted directly from the original table; if the request arrives after the record is migrated, the new table is updated directly, or the data is deleted from the new table.
[0094] When a single-record insertion request is received during the migration process, the insertion is performed according to the table name of the new table and the node where it is located. The transaction is committed when the insertion succeeds, and the failure reason is returned when the insertion fails.
[0095] When a batch record query request is received during the migration process, the query is performed on the original table according to the original data distribution strategy and on the new table according to the new data distribution strategy, and the query results are merged and returned together. When a batch data update or delete request is received during the migration process, data migration is suspended. After the data transactions currently being migrated have been committed, the data in the original table and the new table is updated or deleted according to the original data distribution strategy and the new data distribution strategy, respectively. When the update or delete fails, the transaction is rolled back. When the update or delete succeeds, the transactions of each node are committed in turn, and data migration continues.
[0096] The following uses specific examples to further clarify the essence of the technical solutions of the embodiments of the present invention.
[0097] The embodiment of the present invention mainly uses an independent background program to migrate data gradually. During data migration, the DDS data access layer performs sub-table routing according to both the old and the new data distribution strategies at the same time. When the independent background program completes the data relocation, DDS switches to the new data distribution strategy alone for sub-table routing.
[0098] Figure 2 is the overall implementation sequence diagram of the distributed database expansion method of the embodiment of the present invention. As shown in Figure 2, the newly added data host is first connected to the data access service, and the data distribution strategy is set according to the expansion target. After the data distribution strategy is set, the data access service sets the state of the new strategy to "data migration not started". After the background data migration program is started, it notifies the data access service, the state of the new distribution strategy is changed to "in migration", and data migration proceeds at the same time. For each SQL processing request submitted by an application, the data access service checks the distribution strategy state; if the state is "in migration", the request receives special handling. When the independent program finishes, the migration result is confirmed. If the conditions are met, the data access service is notified and the distribution strategy state is changed to "normal". The expansion is then complete.
[0099] Figure 3 is a flowchart of initialization processing when a distributed database is expanded according to an embodiment of the present invention. As shown in Figure 3, the initialization process first checks whether the tables that need to be redistributed are still undergoing data migration. This is done by checking a status flag. A flag bit is kept for each distribution table in the DDS. The flag bit is set when data migration starts and cleared after the migration completes, which prevents a new distribution strategy adjustment from being initiated while a previous adjustment is still in progress.
[0100] After the independent background data migration program is started, the new table name corresponding to the new strategy is generated according to the distribution strategy and the database table naming convention, and a table with the new table name is created on each database node involved. If table creation succeeds on all nodes involved, initialization succeeds and subsequent processing continues. Otherwise, initialization fails and the DDS is notified; the DDS clears the flag bit, restores the old distribution strategy, and continues to perform sub-table routing according to the old distribution strategy.
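The strategy states described in this paragraph can be modelled as a small state machine. The sketch below is illustrative only: the class and state names (`StrategyState`, `DistributionStrategy`, `NOT_STARTED`, `MIGRATING`, `NORMAL`) are hypothetical, and the source does not say how the data access service stores the state.

```python
from enum import Enum


class StrategyState(Enum):
    NOT_STARTED = "not_started"  # strategy set, migration not yet started
    MIGRATING = "migrating"      # background migration program running
    NORMAL = "normal"            # migration confirmed, new strategy in force


# Transitions taken from the sequence described in the text.
ALLOWED = {
    StrategyState.NOT_STARTED: {StrategyState.MIGRATING},
    StrategyState.MIGRATING: {StrategyState.NORMAL},
    StrategyState.NORMAL: set(),
}


class DistributionStrategy:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = StrategyState.NOT_STARTED

    def transition(self, target: StrategyState) -> None:
        """Move to the target state, rejecting transitions not in ALLOWED."""
        if target not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.name} -> {target.name} not allowed")
        self.state = target

    def needs_special_handling(self) -> bool:
        """SQL requests get special handling only while migration is running."""
        return self.state is StrategyState.MIGRATING


strategy = DistributionStrategy("orders_by_user")
strategy.transition(StrategyState.MIGRATING)  # migration program has started
print(strategy.needs_special_handling())
```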
[0101] After initialization succeeds, multi-process, multi-threaded data migration can start. Figure 4 is a data migration flowchart of an embodiment of the present invention. As shown in Figure 4, the overall processing flow of the background data migration program is as follows:
[0102] During data migration, a cursor is first started on each original table, and the records under the cursor are read in turn. Each time a record is read, the record is deleted from the old table according to the primary key of the old table and the value of the record read, and the record is then routed according to the new distribution routing rule and inserted into the corresponding new table.
[0103] If the insertion into the new table fails, both the node where the old table is located and the node where the new table is located roll back the transaction. If the insertion into the new table succeeds, the node where the old table is located and the node where the new table is located commit their transactions in turn. If, under the new distribution routing rule, the old and new tables are on the same node, only one transaction needs to be committed.
[0104] After a failed insertion, the transaction is rolled back and the affected record remains in the old table. After data migration completes, such abnormal data must be handled by maintenance staff according to the abnormal data handling procedure.
[0105] When all records of the old table have been processed, the entire migration process ends.
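The per-record loop described above can be illustrated with an in-memory simulation. This is a sketch under stated assumptions: plain dictionaries stand in for database tables, restoring the deleted record stands in for a transaction rollback on both nodes, and `migrate`, `route`, and `fail_keys` are hypothetical names introduced only for the example.

```python
def migrate(old_table, new_tables, route, fail_keys=frozenset()):
    """Move records one at a time from old_table into the routed new table.

    Returns the keys whose insertion failed; those records stay in the
    old table, as the text describes, for later manual handling.
    """
    failed = []
    for key in list(old_table):           # the snapshot plays the cursor role
        value = old_table[key]
        target = new_tables[route(key)]
        del old_table[key]                # delete from the old table
        try:
            if key in fail_keys:          # simulated insertion failure
                raise RuntimeError("simulated insert failure")
            target[key] = value           # insert into the new table
        except RuntimeError:
            target.pop(key, None)         # "roll back" the new-table side
            old_table[key] = value        # "roll back" the old-table side
            failed.append(key)
    return failed


old = {1: "a", 2: "b", 3: "c"}
new = {0: {}, 1: {}}                      # two new tables, keyed by node id
left_behind = migrate(old, new, route=lambda k: k % 2, fail_keys={2})
print(left_behind, old, new)
```

A real implementation would use database transactions on each node rather than in-memory restoration; the sketch only shows the control flow of delete, insert, and commit or rollback.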
[0106] To ensure that business is not interrupted, operations on the related tables are not suspended at any point during the migration, and the DDS does not stop processing requests for the related tables. Operations on the related tables fall into five cases: single-record query (query by primary key), single-record insert, single-record update or delete, batch result set query, and batch data update or delete.
[0107] Figure 5 is a flowchart of single-record query processing in an embodiment of the present invention. As shown in Figure 5, if the DDS data service layer receives a SQL request for a single-record query during the migration process, it handles the request as follows:
[0108] For a single-record query, the DDS data service layer first uses the primary key condition of the query and the new distribution routing strategy to search the corresponding node by the new table name. If the search succeeds, the record has already been migrated and the query result is returned directly. If the search fails, the record may not have been migrated yet, so the corresponding node is searched by the old table name according to the old distribution routing strategy. If that search succeeds, the query result is returned; otherwise an empty result set is returned.
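The new-table-first lookup described above can be sketched as follows. Dictionaries stand in for tables on each node, `None` stands in for an empty result set, and the function and parameter names are hypothetical.

```python
def lookup(key, old_tables, new_tables, old_route, new_route):
    """Single-record query during migration: try the new table, then the old."""
    new_table = new_tables[new_route(key)]
    if key in new_table:                  # record has already been migrated
        return new_table[key]
    # Record may not have been migrated yet: fall back to the old strategy.
    return old_tables[old_route(key)].get(key)


old_tables = {"db0": {"a": 1}, "db1": {}}
new_tables = {"db0": {}, "db1": {}, "db2": {"b": 2}}
old_route = lambda key: "db0"             # illustrative fixed routing
new_route = lambda key: "db2"

print(lookup("b", old_tables, new_tables, old_route, new_route))
print(lookup("a", old_tables, new_tables, old_route, new_route))
print(lookup("z", old_tables, new_tables, old_route, new_route))
```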
[0109] Figure 6 is a flowchart of single-record update or delete processing in the embodiment of the present invention. As shown in Figure 6, the DDS service layer handles a single-record update or delete as follows:
[0110] A single-record update or delete can conflict with the background data migration operation. If the operation happens before the record is relocated, the updated data needs to be relocated to the new table. If the operation happens after the record is migrated, the new table needs to be updated directly. If the update and the relocation occur at the same time, final data consistency must be ensured.
[0111] When an update or delete operation is performed before the record is migrated, the result of updating the old table is the same as the result in the new table after migration. If the operation happens after the record is migrated, the record in the new table is modified directly, and data consistency is guaranteed. If the migration and the operation conflict, data consistency is also guaranteed regardless of whether the update or the migration happens first.
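One way to obtain the order-independence described above is to serialize the update and the migration of the same record. The sketch below does this with a shared lock; the lock is an assumption made for illustration, since the source does not state which mechanism guarantees consistency. All names are hypothetical.

```python
import threading


def update_record(key, new_value, old_table, new_table, record_lock):
    """Apply a single-record update to whichever table currently holds the key."""
    with record_lock:
        if key in new_table:              # already migrated: update new table
            new_table[key] = new_value
            return "new"
        if key in old_table:              # not yet migrated: update old table
            old_table[key] = new_value
            return "old"
        return None                       # record does not exist


def migrate_record(key, old_table, new_table, record_lock):
    """Move one record; the lock keeps it from interleaving with an update."""
    with record_lock:
        if key in old_table:
            new_table[key] = old_table.pop(key)


lock = threading.Lock()
old, new = {"k": 1}, {}
update_record("k", 2, old, new, lock)     # update before migration
migrate_record("k", old, new, lock)       # the updated value is carried over
update_record("k", 3, old, new, lock)     # update after migration
print(old, new)
```

Whichever of the two calls acquires the lock first, the record ends up in the new table with the latest value.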
[0112] Figure 7 is a flowchart of single-record insertion processing in an embodiment of the present invention. As shown in Figure 7, the DDS data access service layer handles a single-record insertion as follows:
[0113] For a single-record insertion, the DDS data access service layer follows the new routing strategy and inserts the record directly according to the new table name and the node where the new table is located. If the insertion succeeds, the transaction is committed; if the insertion fails, the failure reason is returned directly. Because a single record is inserted directly into the new table according to the new routing distribution strategy, it cannot conflict with the background data migration.
[0114] Figure 8 is a flowchart of batch record query processing in the embodiment of the present invention. As shown in Figure 8, for a batch query, the DDS data service layer queries the old table according to the old distribution strategy and the new table according to the new distribution strategy, then merges the query results and returns them to the application together. Once data migration is complete, queries are performed according to the new distribution strategy only.
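The merge step of the batch query can be sketched as below. Dictionaries stand in for the old and new tables, and `batch_query` and `predicate` are hypothetical names. Letting the new table win on a duplicate key is an assumption; the source only says that the results are merged.

```python
def batch_query(predicate, old_tables, new_tables):
    """Query the old and new tables and merge the matching records by key."""
    merged = {}
    for table in old_tables:              # routed by the old strategy
        merged.update({k: v for k, v in table.items() if predicate(v)})
    for table in new_tables:              # routed by the new strategy
        merged.update({k: v for k, v in table.items() if predicate(v)})
    return merged


old_tables = [{1: 5, 2: 50}]              # records not yet migrated
new_tables = [{3: 70}, {4: 1}]            # records already migrated
print(batch_query(lambda v: v >= 50, old_tables, new_tables))
```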
[0115] Figure 9 is a flowchart of batch record update or delete processing in the embodiment of the present invention. For a batch data update or delete, the processing flow is as follows:
[0116] Before updating or deleting data in batches, the DDS data access service layer first notifies the background data migration program and requests that data migration be suspended. After the background data migration program has committed all data transactions currently being migrated, it reports back that data migration has been suspended. After receiving this feedback, the DDS data access service layer starts the data update or delete operation.
[0117] The operation is applied to the new table and the old table at the same time, according to the new and old distribution strategies respectively. When all operations return normally, the transactions of each data node are committed in turn. If the operation fails on any node, whether on the new table or the old table, the transactions of all data nodes are rolled back and the application's operation fails.
[0118] When the data operations on the new and old tables are complete and the transaction has been committed or rolled back, the DDS data access service layer notifies the background data migration program to resume the previously suspended data migration.
[0119] Batch data update or delete operations conflict with data migration operations. To resolve the conflict and ensure data consistency, background data migration is suspended and resumed after the batch operation completes, so data consistency is guaranteed.
[0120] The method and device for extending the distributed database in the embodiments of the present invention do not need to stop related services during data migration and can migrate services online. The distribution strategy can be readjusted even for tables with huge data volumes. New distribution strategies can be set freely and new hosts added, and the load can be effectively shared once data migration completes. The distribution strategy can be adjusted multiple times and optimized continuously according to the actual situation. Different data tables can be relocated separately. The resources used by data relocation can be adjusted flexibly: when business is busy, the number of data relocation processes can be reduced to free more computing resources for business processing; when business volume decreases, the number of data relocation processes can be increased to speed up completion of the data migration.
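The suspend, operate, and resume sequence for batch operations can be sketched with a lock that the migration program holds for each record transaction. This is a simplified in-process illustration: the source describes message exchange between the DDS and a separate background program, whereas here a `threading.Lock` (an assumption) plays that role, and `Migrator`, `gate`, and `batch_update` are hypothetical names.

```python
import threading


class Migrator:
    """Background migration that holds `gate` during each record transaction."""

    def __init__(self, old, new):
        self.old, self.new = old, new
        self.gate = threading.Lock()

    def run(self):
        while True:
            with self.gate:               # one record move = one transaction
                if not self.old:
                    return
                key, value = self.old.popitem()
                self.new[key] = value


def batch_update(migrator, fn):
    """Suspend migration, update both tables, roll back everything on failure."""
    with migrator.gate:                   # waits for the in-flight transaction
        old_backup, new_backup = dict(migrator.old), dict(migrator.new)
        try:
            for table in (migrator.old, migrator.new):
                for key in table:
                    table[key] = fn(table[key])
        except Exception:
            migrator.old.clear()
            migrator.old.update(old_backup)
            migrator.new.clear()
            migrator.new.update(new_backup)
            raise
    # Leaving the `with` block lets the migration continue.


m = Migrator({i: i for i in range(100)}, {})
worker = threading.Thread(target=m.run)
worker.start()
batch_update(m, lambda v: v + 1)          # runs while migration is in progress
worker.join()
print(len(m.old), len(m.new))
```

Because the batch operation touches both tables while migration is suspended, every record is updated exactly once regardless of how far the migration had progressed.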
[0121] Figure 10 is a schematic diagram of the composition structure of a distributed database expansion device according to an embodiment of the present invention. As shown in Figure 10, the apparatus for expanding a distributed database includes a first determining unit 1001, a first setting unit 1002, a second determining unit 1003, a second setting unit 1004, a migration unit 1005, and a replacement unit 1006, in which:
[0122] The first determining unit 1001 is configured to determine whether a data host has been connected to the data access service and, if so, to trigger the first setting unit 1002;
[0123] The first setting unit 1002 is used to set a new data distribution strategy for all data hosts according to the expansion target;
[0124] The second determining unit 1003 is configured to determine whether the data corresponding to the new data distribution strategy starts migration, and trigger the second setting unit 1004 after the migration is started;
[0125] The second setting unit 1004 is configured to set the state of the new distribution strategy to be in migration;
[0126] The migration unit 1005 is configured to complete the migration of the data corresponding to the new data distribution strategy in a sub-table routing manner;
[0127] The replacement unit 1006 is configured to replace the old data distribution strategy with the new data distribution strategy after the migration unit completes the migration.
[0128] In the embodiment of the present invention, the migration unit 1005 is also used for:
[0129] Generating a new table name corresponding to the new data distribution strategy according to the new data distribution strategy and the database table naming specification;
[0130] Create a table according to the new table name in the corresponding database of the data host involved;
[0131] In a multi-process and multi-threaded manner, the data to be migrated is stored in the created table.
[0132] In the embodiment of the present invention, the migration unit 1005 is also used for:
[0133] Start a cursor for each original table, and read the records corresponding to the cursor in turn;
[0134] Delete each record from the original table according to the primary key of the original table and the value of the record read, and insert the value of the record read into the corresponding newly created table according to the distribution routing rule of the new data distribution strategy.
[0135] On the basis of the expansion device for the distributed database shown in Figure 10, the expansion device for the distributed database in the embodiment of the present invention further includes:
[0136] A migration processing unit (not shown in the figure), configured so that when inserting the read record value into the corresponding new table fails, the node where the original table is located and the node where the newly created table is located both roll back the transaction, returning to the state before the data to be migrated was stored; and when the read record value is successfully inserted into the corresponding new table, the node where the original table is located and the node where the newly created table is located commit their transactions in turn.
[0137] On the basis of the expansion device of the distributed database shown in Figure 10, the expansion device of the distributed database of the embodiment of the present invention further includes a first receiving unit (not shown in the figure) and a first query unit (not shown in the figure), where:
[0138] The first receiving unit is configured to receive a single record query request during the migration process;
[0139] The first query unit is configured to search the corresponding node by the table name of the new table according to the primary key condition of the query and the new data distribution strategy, and to return the query result if the search succeeds. When the search fails, the corresponding node is searched by the table name of the original table according to the original data distribution strategy; the search result is returned when that search succeeds, and an empty result set is returned when it fails.
[0140] On the basis of the expansion device of the distributed database shown in Figure 10, the expansion device of the distributed database of the embodiment of the present invention further includes a second receiving unit (not shown in the figure), a third determining unit (not shown in the figure), and a first update unit (not shown in the figure), where:
[0141] The second receiving unit is used to receive a single record update or delete request during the migration process;
[0142] The third determining unit is configured to determine whether there is a data conflict between the single record and the migration operation, and to trigger the first update unit when there is a conflict;
[0143] The first update unit is configured, before the record is migrated, to migrate the updated data to the new table or delete the data directly from the original table, and, after the record is migrated, to update the new table directly or delete the data from the new table.
[0144] On the basis of the expansion device of the distributed database shown in Figure 10, the expansion device of the distributed database of the embodiment of the present invention further includes a third receiving unit (not shown in the figure) and an insertion unit (not shown in the figure), where:
[0145] The third receiving unit is configured to receive a single record insertion request during the migration process;
[0146] The insertion unit is configured to perform the insertion according to the table name of the new table and the node where it is located, to commit the transaction when the insertion succeeds, and to return the failure reason when the insertion fails.
[0147] On the basis of the expansion device of the distributed database shown in Figure 10, the expansion device of the distributed database of the embodiment of the present invention further includes a fourth receiving unit (not shown in the figure) and a second query unit (not shown in the figure), where:
[0148] The fourth receiving unit is used to receive query requests for batch records during the migration process;
[0149] The second query unit is configured to perform queries on the original table according to the original data distribution strategy, and perform queries on the new table according to the new data distribution strategy, and then the query results are combined and returned in a unified manner.
[0150] On the basis of the expansion device of the distributed database shown in Figure 10, the expansion device of the distributed database of the embodiment of the present invention further includes a fifth receiving unit (not shown in the figure) and a second update unit (not shown in the figure), where:
[0151] The fifth receiving unit is configured to receive data batch update or deletion requests during the migration process;
[0152] The second update unit is configured to suspend data migration processing and, after the data transactions currently being migrated have been committed, to update or delete the data in the original table and the new table according to the original data distribution strategy and the new data distribution strategy, respectively. When the update or delete fails, the transaction is rolled back; when the update or delete succeeds, the transactions of each node are committed in turn, and the migration unit is triggered to continue the data migration.
[0153] Those skilled in the art should understand that the function implemented by each unit in the expansion device of the distributed database shown in Figure 10 can be understood with reference to the description of the embodiments and application examples of the expansion method of the distributed database given above. The above-mentioned processing units can be implemented by a microprocessor, a field-programmable gate array (FPGA), or corresponding chips.
[0154] The technical solutions described in the embodiments of the present invention can be combined arbitrarily without conflict.
[0155] In the several embodiments provided by the present invention, it should be understood that the disclosed method and smart device can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
[0156] The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units; Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
[0157] In addition, the functional units in the embodiments of the present invention can be all integrated into a second processing unit, or each unit can be individually used as a unit, or two or more units can be integrated into one unit; The above-mentioned integrated unit can be realized in the form of hardware or in the form of hardware plus software functional unit.
[0158] The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited to them. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention.
