System and method for guaranteeing distributed data processing consistency

A distributed data consistency technology, applied to digital transmission systems, transmission systems, data exchange networks, etc. It addresses problems such as results that are inconsistent with expectations and abnormal data processing, and achieves the effects of reduced delay and guaranteed reliability.

Active Publication Date: 2014-03-19
SHANGHAI STOCK EXCHANGE
Cites: 5 · Cited by: 43

AI-Extracted Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to solve the consistency problems in distributed data processing in the existing trading system, and the technical problem that, when the log mechanism is used, abnormal data processing leads to th...

Abstract

The invention relates to the field of distributed data processing technology, and in particular to a system and method for guaranteeing distributed data processing consistency. The system comprises a computer trading system platform made up of multiple trading platforms responsible for transaction processing. Among the trading hosts, one host serves as the master node and the other nodes serve as standby trading machines. The master node processes all order tasks and is also responsible for logging, maintenance, and triggering basic data synchronization to the standby nodes; the standby nodes do not process orders in real time but keep their basic data memory consistent with the master node through synchronization. Compared with the prior art, the system adopts a log structure carrying message execution state, resolves the latency conflict between real-time reads and updates of the trading system's basic data, reduces the complexity of trading system replay, and improves data processing consistency across platforms and during master/standby switchover, as well as system performance and reliability.


Examples

  • Experimental program(3)

Example Embodiment

[0056] Example 1
[0057] As shown in figure 1, the basic data maintenance framework diagram of the distributed trading system of the present invention, the computer distributed trading system platform consists of several trading platforms responsible for transaction processing. Among the trading hosts, one host serves as the master node and the other nodes serve as standby trading machines. The master node is responsible for all order processing tasks, as well as for logging and maintenance and for triggering basic data synchronization to the standby nodes; the standby nodes do not perform real-time order processing, but keep their basic data memory consistent with the master node through synchronization. Inside each host there is a synchronization router, which obtains order messages and basic data update messages from the order generation software, the front-end data management software or other platforms. The synchronization router is connected to several application processes and to the shared message queue; the application processes are connected to the basic data memory, the order book memory and the shared message queue; the shared message queue is connected to the basic data management module, and the basic data management module is connected to the basic data memory. The basic data memory supports single-writer, multiple-reader access and supports cancellation of write operations. When processing the basic data log, the basic data management module adds execution status information and an index to each log record to support dynamic positioning: when a data processing request arrives, the original request message is written to the log record, and after the data processing completes, the execution status information and execution result in the log record are updated.

Example Embodiment

[0058] Example 2
[0059] The organization of the basic data memory is shown in figure 2. Each record consists of multiple versions. Besides the KEY value used for indexing, the record header structure also contains a TAG field recording the current version number. The TAG field is of type INT64, and reads and writes of the TAG field are atomic operations, so no conflict occurs. The basic data memory supports single-writer, multiple-reader access and supports cancellation of write operations. While the HFM (basic data management module) updates a basic data record, read operations by the applications are not affected.
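To make the record layout of Example 2 concrete, here is a minimal sketch in Go of a multi-version record: a KEY used for indexing and an INT64 TAG holding the current version number, read and written atomically. The two-slot version array, the Payload type and all field names are illustrative assumptions, not the patent's actual data layout.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Payload stands in for the business fields of a basic data record
// (hypothetical; the patent does not specify the record body).
type Payload struct {
	Price int64
}

// Record sketches the multi-version record of Example 2: a KEY used for
// indexing plus an INT64 TAG whose reads and writes are atomic. Two
// version slots are assumed so a writer can prepare the next version
// while readers keep using the current one.
type Record struct {
	Key      string
	tag      atomic.Int64 // current version number
	versions [2]Payload   // version slots selected by TAG
}

// Read loads TAG atomically and returns the version it points to;
// no lock is taken, matching the single-writer, multiple-reader design.
func (r *Record) Read() Payload {
	t := r.tag.Load()
	return r.versions[t%2]
}

func main() {
	rec := &Record{Key: "600000"}
	rec.versions[0] = Payload{Price: 100}
	fmt.Println(rec.Read()) // reads slot 0 while TAG is 0
}
```

A reader never locks the record: it loads TAG once and then reads the version slot that TAG points to, so a concurrent update cannot leave it looking at a half-written record.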

Example Embodiment

[0060] Example 3
[0061] A method for guaranteeing the consistency of distributed data processing. The request messages involved in data processing mainly include order requests and basic data update requests. Order messages come from the order generation software, and basic data update messages come from the front-end data management software or other platforms. The specific method is as follows:
[0062] a. After a request message arrives at the host, the synchronization router sends it to the shared message queue according to its message category and hands it to the corresponding process for processing (a routing sketch follows this list);
[0063] b. In order processing, the application process first connects to the basic data memory and verifies the order against the information recorded there. After the verification succeeds, it writes the order to the order book memory and updates the status information in the application log record. The application process writes the log only once while processing an order;
[0064] c. A data update request message is sent via the synchronization router to the shared message queue, which triggers the basic data management module. The basic data management module first writes the original request to the basic data log; after completing the update of the basic data memory, it updates the status information in the basic data log to the successful execution result and persists it. The basic data management module thus writes the log twice while processing the message, once when triggered and once after processing completes; the improved basic data log structure and the two-write mechanism guarantee the reliability of basic data changes;
[0065] d. For the case where the basic data management module and the application processes of steps b and c read and write the basic data memory at the same time, the basic data memory adopts a multi-version organization that supports simultaneous reading and writing by the basic data management module and multiple application processes, effectively reducing data processing delay.
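As a rough illustration of step a, the sketch below routes each incoming message to a shared queue chosen by message category. The category names, the channel-backed queues and the SyncRouter type are assumptions made only for this example; the patent does not specify the router's implementation.

```go
package main

import "fmt"

// Category values are assumed for illustration only.
type Category int

const (
	OrderRequest Category = iota
	BasicDataUpdate
)

type Message struct {
	Category Category
	Body     string
}

// SyncRouter sketches the synchronization router of step a: it hands each
// message to the shared queue of the corresponding processing component.
type SyncRouter struct {
	orderQueue chan Message // consumed by the application processes
	hfmQueue   chan Message // consumed by the basic data management module
}

func (r *SyncRouter) Route(m Message) {
	switch m.Category {
	case OrderRequest:
		r.orderQueue <- m
	case BasicDataUpdate:
		r.hfmQueue <- m
	}
}

func main() {
	r := &SyncRouter{
		orderQueue: make(chan Message, 16),
		hfmQueue:   make(chan Message, 16),
	}
	r.Route(Message{Category: OrderRequest, Body: "buy 100 @ 10.00"})
	r.Route(Message{Category: BasicDataUpdate, Body: "update instrument"})
	fmt.Println(len(r.orderQueue), len(r.hfmQueue)) // 1 1
}
```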
[0066] The methods for accessing the basic data memory are as follows:
[0067] (1). Method for an application to read the basic data memory:
[0068] a. After locating the memory record to be read by traversing on its KEY value, first read the record's TAG value;
[0069] b. According to the KEY value and the TAG value, locate the current latest version of record 1, namely version 1.1, and return the content of version 1.1 to the application process;
[0070] (2). Method for the basic data management module to update the basic data memory:
[0071] a. After locating the memory record to be updated by traversing on its KEY value, first read the record's TAG value;
[0072] b. According to the obtained TAG value, determine that the current latest version of record 1 is version 1.1, and write the modified record as the next version 1.2 according to the KEY value and TAG value. Since step a occurs before step c, an application process reading concurrently still reads version 1.1;
[0073] c. Update the record's TAG value to 1.2, completing the data update operation;
[0074] After step c, the version read by an application process is version 1.2;
[0075] (3). Method for the basic data management module to cancel the previous instruction:
[0076] a. After locating the memory record by traversing on its KEY value, first read the record's TAG value;
[0077] b. According to the obtained TAG value, determine that the current latest version of record 1 is version 1.2, and update the TAG value to the previous version number 1.1. Since step a occurs before step b, an application process reading concurrently still reads version 1.2;
[0078] After step b, the version read by an application process is version 1.1 (a code sketch of these three procedures follows).
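The following sketch, under the same assumptions as the record sketch in Example 2 (two version slots, an atomic INT64 TAG, illustrative names), shows how the read, update and cancel procedures above can be expressed: an update publishes a new version only by advancing TAG, and a cancel rolls TAG back, so concurrent readers always see a complete version.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type Payload struct{ Price int64 }

// Record repeats the two-slot, TAG-versioned layout assumed earlier.
type Record struct {
	Key      string
	tag      atomic.Int64
	versions [2]Payload
}

// Read: load TAG, then read the version it points to (procedure (1)).
func (r *Record) Read() Payload {
	return r.versions[r.tag.Load()%2]
}

// Update: write the next version into the unused slot (step b), then
// publish it by advancing TAG atomically (step c) -- procedure (2).
// Readers that loaded the old TAG keep reading the old version.
func (r *Record) Update(p Payload) {
	next := r.tag.Load() + 1
	r.versions[next%2] = p
	r.tag.Store(next)
}

// Cancel: roll TAG back to the previous version number, undoing the
// last update without touching the data itself -- procedure (3).
func (r *Record) Cancel() {
	r.tag.Store(r.tag.Load() - 1)
}

func main() {
	rec := &Record{Key: "600000"}
	rec.Update(Payload{Price: 100}) // version 1 ("record 1.1")
	rec.Update(Payload{Price: 101}) // version 2 ("record 1.2")
	fmt.Println(rec.Read())         // {101}
	rec.Cancel()                    // back to version 1
	fmt.Println(rec.Read())         // {100}
}
```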
[0079] When the basic data management module processes a basic data change request, it must update the log records twice, namely:
[0080] a. When the basic data management module receives a basic data change request, it first writes the original request record into the ORIGINAL_LOG (original log) area, adds a new record to the RESULT_LOG (result log) and sets its status information to "processing"; this completes the first log write to disk;
[0081] b. When the basic data change request has been processed, the basic data management module finds the record in the RESULT_LOG according to the TASKID (task identifier) of the original request, sets its execution status to "processed successfully" and packs the result message returned to the front end into the result area; this completes the second log write.
[0082] Improved reliability of the transaction logic: after an HFM (basic data management module) update completes, the basic data memory record read by an application is the latest version, and there is no possibility of reading dirty data. If the HFM fails while updating a basic data memory record, the value in the TAG field is still that of the record's previous version, so no additional data rollback mechanism is required.
[0083] Compared with an ordinary memory structure, the multi-version design of the basic data memory in the present invention abandons the lock mechanism so that read and write operations can proceed simultaneously. This improves system efficiency, reduces the system delay caused by reference data changes, increases the reliability of memory reads and writes, and adds the ability to cancel a basic data change.
[0084] As shown in image 3, the basic data log structure is divided into two parts, ORIGINAL_LOG (original log) and RESULT_LOG (result log). ORIGINAL_LOG records the original request; RESULT_LOG records the status and result of the original request after processing by the system. ORIGINAL_LOG and RESULT_LOG serve as the input and output of the system respectively, and their contents can be used as the data basis for system replay.
[0085] As stated above, when the basic data management module processes a basic data change request, it updates the log records twice, namely:
[0086] a. On receiving a basic data change request, the basic data management module first writes the original request record into the ORIGINAL_LOG area, adds a new record to the RESULT_LOG and sets its status information to "processing"; this completes the first log write to disk;
[0087] b. When the basic data change request has been processed, the basic data management module finds the record in the RESULT_LOG according to the TASKID (task identifier) of the original request, sets its execution status to "processed successfully" and packs the result message returned to the front end into the result area; this completes the second log write.
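A minimal sketch of this two-write protocol follows. In-memory slices and maps stand in for the persisted ORIGINAL_LOG and RESULT_LOG files, and a string stands in for the request body; the type and function names are hypothetical.

```go
package main

import "fmt"

type Status string

const (
	StatusProcessing Status = "processing"
	StatusSuccess    Status = "processed successfully"
)

// OriginalEntry mirrors an ORIGINAL_LOG record: the raw request keyed by TASKID.
type OriginalEntry struct {
	TaskID  int64
	Request string
}

// ResultEntry mirrors a RESULT_LOG record: execution status plus the
// result message returned to the front end.
type ResultEntry struct {
	TaskID int64
	Status Status
	Result string
}

// BasicDataLog stands in for the persisted log files (in memory here).
type BasicDataLog struct {
	Original []OriginalEntry
	Result   map[int64]*ResultEntry
}

// FirstWrite is log write one: persist the original request and add a
// RESULT_LOG record in the "processing" state before the memory update.
func (l *BasicDataLog) FirstWrite(taskID int64, req string) {
	l.Original = append(l.Original, OriginalEntry{taskID, req})
	l.Result[taskID] = &ResultEntry{TaskID: taskID, Status: StatusProcessing}
}

// SecondWrite is log write two: after the basic data memory has been
// updated, mark the task successful and store the front-end result.
func (l *BasicDataLog) SecondWrite(taskID int64, result string) {
	e := l.Result[taskID]
	e.Status = StatusSuccess
	e.Result = result
}

func main() {
	log := &BasicDataLog{Result: map[int64]*ResultEntry{}}
	log.FirstWrite(1, "change tick size of 600000")
	// ... the basic data memory update would happen here ...
	log.SecondWrite(1, "OK")
	fmt.Println(*log.Result[1]) // {1 processed successfully OK}
}
```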
[0088] Compared with an ordinary append-only log structure, the improved log structure of the present invention adds the message execution status and the message processing result. The execution status records the result of the HFM (basic data management module) executing the request command; during system replay, only original requests whose recorded status is successful are reprocessed, which guarantees data consistency before and after the replay. The message processing result records, on the back end, the execution result of the messages exchanged between front end and back end, which supports troubleshooting and the consistency of front-end and back-end messages.
[0089] The requests processed by the HFM include basic data change messages and basic data query messages. The requests originate from the shared message queue attached to the synchronization router. The HFM maintains a local message queue for storing the HFM messages sorted out from the shared message queue.
[0090] Processing a basic data query request is relatively simple for the HFM and involves no log operations. As shown in Figure 4, the HFM reads the original request from the shared message queue and puts it into its local TASK (task) message queue. The HFM reads the local queue sequentially and performs message type judgment and validity verification. After querying the basic data memory and obtaining the result, the HFM formats the response message and sends it to the shared message queue, from which the synchronization router forwards the result to the front-end software. Finally, the HFM marks the TASK message in the local queue as processed according to its TASKID, i.e. deletes it from the local queue. At this point the HFM has completed processing the basic data query request.
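The sketch below walks this query path once, with buffered channels standing in for the shared and local queues and a map standing in for the basic data memory; all names and the channel-based queues are assumptions for illustration only.

```go
package main

import "fmt"

// Task is a local-queue entry carrying a TASKID and the query key.
type Task struct {
	TaskID int64
	Key    string
}

func main() {
	sharedIn := make(chan Task, 8)    // requests arriving via the synchronization router
	localQueue := make(chan Task, 8)  // HFM local TASK queue
	sharedOut := make(chan string, 8) // responses back to the router

	basicData := map[string]string{"600000": "tick=0.01"} // stand-in memory

	// Step 1: the HFM moves the original request into its local TASK queue.
	sharedIn <- Task{TaskID: 7, Key: "600000"}
	localQueue <- <-sharedIn

	// Step 2: read the local queue sequentially, validate, query the memory,
	// format the response and send it to the shared queue; the router then
	// forwards it to the front-end software.
	task := <-localQueue
	if value, ok := basicData[task.Key]; ok {
		sharedOut <- fmt.Sprintf("TASK %d: %s -> %s", task.TaskID, task.Key, value)
	} else {
		sharedOut <- fmt.Sprintf("TASK %d: %s not found", task.TaskID, task.Key)
	}

	// Step 3: the task has been consumed from the local queue above,
	// completing the query with no log writes.
	fmt.Println(<-sharedOut)
}
```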
[0091] The HFM uses the basic data log mechanism when processing basic data change requests. Accordingly, the HFM processes a basic data modification TASK in two stages, as shown in Figure 5. In the first stage, the HFM obtains the basic data update request message from the shared message queue, records the original request in the ORIGINAL_LOG and updates the status of the corresponding RESULT_LOG record to "executing"; the HFM then generates a local TASK message and sends it to the local message queue to await processing. In the second stage, the HFM reads the TASK message from the local message queue, determines whether it is a basic data update message type, and reads the original message record in the ORIGINAL_LOG from the original request address recorded in the TASK message. The validity of the original request is verified before the memory update is performed. After the memory update completes, the TASK message status in the local message queue is set to executed successfully. Finally, the HFM generates a processing result message, performs format conversion, sends it to the shared message queue, and updates the result in the RESULT_LOG.
[0092] Replay of basic data changes is divided into HFM (basic data management module) process replay and host replay, which are used to recover from an HFM process FAILOVER and a host FAILOVER respectively:
[0093] 1) If FAILOVER occurs in the HFM process but the host has not crashed, only the HFM process needs to be restarted:
[0094] The flow chart of process replay is shown in Image 6. During replay the HFM performs the following processing: the HFM process traverses all records in the log to reconstruct the latest data state at the time of the crash or shutdown. From the execution status field in the previously persisted RESULT_LOG it determines which TASK records are still in the "processing" state. For each record in the "processing" state, the TASK message is resubmitted to the HFM local message queue and the HFM re-executes the instruction, so that the host can correctly process it and return the response message to the front end. Finally, the host sets the status field in the RESULT_LOG to "executed successfully", and the standby machine performs the update operation synchronously after seeing the request;
[0095] When the HFM process restarts it is not necessary to replay all the log records; it is sufficient to check the completeness of the RESULT_LOG and resubmit the uncompleted TASK requests, which reduces the time taken by an HFM restart and increases reliability. The standby machine is not affected while the HFM process fails over and restarts;
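A sketch of this selective replay follows, redeclaring the hypothetical log types from the earlier two-write sketch so the example stands alone: replay scans the RESULT_LOG for entries still marked "processing" and resubmits only the matching ORIGINAL_LOG requests to the local queue.

```go
package main

import "fmt"

type Status string

const (
	StatusProcessing Status = "processing"
	StatusSuccess    Status = "processed successfully"
)

type OriginalEntry struct {
	TaskID  int64
	Request string
}

type ResultEntry struct {
	TaskID int64
	Status Status
}

// ReplayPending returns the ORIGINAL_LOG requests whose RESULT_LOG status
// is still "processing"; on restart the HFM resubmits exactly these to its
// local queue instead of replaying the whole log.
func ReplayPending(original []OriginalEntry, result map[int64]*ResultEntry) []OriginalEntry {
	var pending []OriginalEntry
	for _, o := range original {
		if r, ok := result[o.TaskID]; ok && r.Status == StatusProcessing {
			pending = append(pending, o)
		}
	}
	return pending
}

func main() {
	original := []OriginalEntry{{1, "change A"}, {2, "change B"}}
	result := map[int64]*ResultEntry{
		1: {TaskID: 1, Status: StatusSuccess},
		2: {TaskID: 2, Status: StatusProcessing}, // crashed mid-task
	}
	localQueue := make(chan OriginalEntry, 8)
	for _, t := range ReplayPending(original, result) {
		localQueue <- t // resubmit for re-execution
	}
	fmt.Println(len(localQueue)) // 1: only task 2 is re-executed
}
```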
[0096] 2) The standby HFM takes over when a host FAILOVER occurs. If a FAILOVER occurs on the host where the HFM is located, a master/standby switchover takes place:
[0097] The basic data log library is shared between the host and the standby machine. When a record in the basic data log physical file is updated to "executed successfully", it triggers the standby machine's HFM to synchronously update the content of its basic data memory. Therefore, at the moment of a master/standby switchover, apart from TASKs for which the host has completed the memory write but has not yet updated the log status, the basic data memory of the standby machine remains consistent with that of the host. Referring to Figure 5, when the switchover occurs the original master HFM process may be in one of the following states:
[0098] a) The host crashes during steps 1 and 2 of the first stage, and the request has not yet been written to the log physical file. In this case the standby machine can process directly after taking over, because everything submitted earlier has already been executed, and the lost new request can be resubmitted by retransmission from the front-end software;
[0099] b) The host crashes during step 3 of the first stage; the new request has been written to the log file but has not yet gone through the second stage. In this case, after taking over, the standby machine must check the RESULT_LOG for records whose status is "processing" and resubmit the unfinished TASKs to the local queue for execution;
[0100] c) The host crashes during steps 1-5 of the second stage of a basic data change request. In this case the new request has been written to the log and the business logic of the second stage has been processed (the memory data has also been updated), but the final TASK status information has not been flushed to disk. After taking over, the standby machine resubmits the uncompleted TASKs in the ORIGINAL_LOG to the local message queue and re-executes the business process. Because the original master HFM process did not complete the second log update (setting the TASK execution status in the RESULT_LOG to "executed successfully"), the standby machine has not yet started this basic data update, so the standby machine's update of this TASK can still be regarded as the first update;
[0101] d) The host crashes after step 6 of the second stage is completed. In this case the standby machine can directly process subsequent new orders and carry out the business logic of the host.
[0102] Before the master/standby switchover, the standby machine's application processes do not process order messages in real time. When the switchover occurs, the standby machine performs replay according to the records in the application log. The order records in the application log already contain status and verification information, so during the standby machine's replay only processed orders are loaded and orders are not re-verified, which avoids the impact of basic data changes on the replay. After the switchover, the standby machine becomes the master and takes over all TASKs of the previous master.
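As a final illustration, the hypothetical sketch below shows the standby's application-log replay described above: because each order record already carries status and verification information, only processed orders are loaded and no re-verification is performed. The record fields are assumptions, not the patent's actual log layout.

```go
package main

import "fmt"

// OrderLogRecord stands in for an application-log entry; the fields are
// illustrative assumptions only.
type OrderLogRecord struct {
	OrderID   int64
	Processed bool // status recorded by the master after processing
	Verified  bool // verification result recorded by the master
}

// ReplayOrders rebuilds the order book memory on the standby: only orders
// already marked processed are loaded, and no re-verification is done,
// so the replay is unaffected by basic data changes.
func ReplayOrders(log []OrderLogRecord) []int64 {
	var loaded []int64
	for _, rec := range log {
		if rec.Processed {
			loaded = append(loaded, rec.OrderID)
		}
	}
	return loaded
}

func main() {
	log := []OrderLogRecord{
		{OrderID: 101, Processed: true, Verified: true},
		{OrderID: 102, Processed: false}, // in flight at switchover, not loaded
	}
	fmt.Println(ReplayOrders(log)) // [101]
}
```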


