System and method for persisting, double-writing and pushing data in a securities core trading system
By introducing a multi-engine collaborative approach into the securities trading system, the invention addresses data loss, missing filtering support, and inflexible expansion, achieving efficient persistence and push of massive amounts of data and meeting the high-availability and flexibility requirements of the securities trading system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUOTAI JUNAN SECURITIES CO LTD
- Filing Date
- 2022-12-28
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies in securities trading systems suffer from problems such as data loss and difficulty in recovery, lack of support for data filtering and flexible scaling, low data processing throughput, and inflexible expansion methods, which affect the flexibility and efficiency of data processing.
A system was designed that includes a file storage engine, a data parsing engine, a data routing engine, a data persistence engine, a data dual-write engine, and a data push engine. Through various routing strategies, a distributed database, and middleware queues, it can achieve the diversion, filtering, parallel import, and push of massive amounts of data, and support multi-node expansion and business collaboration.
It achieves high availability, low latency, ultra-fast massive in-memory data persistence and data push, supports multiple extended processing methods, overcomes the shortcomings of traditional methods, and ensures data integrity and processing efficiency.
Figure CN115982277B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of computer application technology, and in particular to business message data collaboration for a new generation of distributed core securities trading systems. Specifically, it refers to a system and method for achieving ultra-fast, massive in-memory data persistence, dual data writing, and data push in a core securities trading system.
Background Technology
[0002] Today, more and more enterprises, especially securities trading companies, face various data integration and system consolidation challenges. Traditional technical methods typically suffer from several drawbacks, including data loss and difficulty in recovery, lack of support for data filtering and elastic scaling, low data throughput, and inflexible scaling methods. The vulnerability to data loss and difficulty in recovery leads to incomplete business data, impacting business progress, and the inability to quickly recover from data errors affects data accuracy. The lack of support for elastic scaling of application instances limits data processing flexibility, causing excessive local performance pressure or preventing dynamic adjustments to the production deployment architecture. Low data throughput affects data timeliness. Inflexible scaling methods reduce data processing efficiency.
[0003] Therefore, there is an urgent need for a technical solution that can effectively address the problem of massive data collaboration in the next-generation low-latency core trading platform for securities.
Summary of the Invention
[0004] The purpose of this invention is to overcome the shortcomings of the prior art and provide a system and method for achieving ultra-fast massive memory data persistence, dual data writing, and data push in a core securities trading system, which supports breakpoint recovery, data routing, high flexibility, and good scalability.
[0005] To achieve the above objectives, the present invention provides a system and method for realizing ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push in a core securities trading system, as follows:
[0006] This system, designed for achieving ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push within a core securities trading system, is characterized by the following features:
[0007] The file storage engine is used to obtain large amounts of raw memory data from the securities trading message bus, divide it into multiple data blocks according to nodes and topics, and store them in corresponding data files and index files.
[0008] The data parsing engine, connected to the file storage engine, is used to load massive amounts of data from local files and process and parse the data through decoding methods.
[0009] The data routing engine, connected to the data parsing engine, is used to process data through various routing strategies and to perform message data splitting and filtering.
[0010] The data persistence engine, connected to the data routing engine, is used to import massive amounts of data into a distributed database in batches according to multiple shards;
[0011] The data dual-write engine, connected to the data routing engine, is used to quickly write massive amounts of data to external business systems for collaborative business processing; and
[0012] The data push engine, connected to the data routing engine, is used to access multiple middleware queues and push massive amounts of data to external systems for data collection and processing.
[0013] Furthermore, the system is equipped with multiple nodes, and the file storage engine, data parsing engine, data routing engine, data persistence engine, data dual-write engine, and data push engine are all set in the corresponding nodes.
[0014] Preferably, the data routing engine specifically performs the following processing:
[0015] Massive amounts of data are distributed and processed according to multiple routing strategies. Routing and filtering strategies are configured and processed to achieve multi-node, multi-dimensional processing and support the horizontal scaling of distributed nodes.
[0016] Furthermore, the data routing engine determines the filtering range of each instance node based on the filtering configuration. When data arrives at an instance node and falls within the filtering range of that instance, the instance discards the data. Otherwise, the engine matches a routing mode for the data according to the routing configuration rules. A specified routing mode for global messages and a default routing mode are also supported, and each instance processes the data according to the specified mode.
[0017] Preferably, the data persistence engine specifically performs the following processing:
[0018] Massive amounts of data are processed in batches according to multiple shards, and imported into distributed databases in parallel according to various ingestion modes. It supports the use of distributed databases such as Oracle, MySQL, GoldenDB, and OceanBase, and implements a unified scheduling and management mechanism through the aforementioned distributed databases, as well as supporting rapid scaling of business operations.
[0019] Preferably, the file storage engine performs the following processing:
[0020] A large amount of raw memory data is obtained from the securities trading message bus, divided into multiple data blocks according to nodes and topics, and stored locally in corresponding data files and index files. The offset of the data file is quickly located through the index file, supporting fast data reading and data recovery.
[0021] Preferably, the index file specifically includes a global index file and a partition index file, wherein,
[0022] The global index file is used to maintain global information including service event number, heartbeat timestamp, number of message topic partitions, number of topic messages, and topic partition information, and is also used to quickly obtain the overall message data of each topic partition;
[0023] The partition index file is used to maintain the message sequence number, message data offset, and message data length of messages under each topic partition, and the message data storage status under a certain topic partition can be quickly obtained through the partition index file.
[0024] Preferably, the data parsing engine performs the following processing:
[0025] Based on time windows and stream processing mode, massive amounts of data are loaded from the local file, and data parsing is supported through hard-coded parsing mode and message configuration mode, loading the data into the memory area in batches.
[0026] Preferably, the hard-coded parsing mode is specifically as follows: the data type and parsing order of each data segment in the message are defined by code, and the offset is changed sequentially according to the length of the data segment to finally complete the parsing of the entire message data;
[0027] The message configuration mode is as follows: the message version and the data types and parsing order of each data segment in the message are defined by the configuration template, and the offset is changed sequentially according to the length of the data segment to finally complete the parsing of the entire message data.
[0028] Preferably, the data persistence engine includes concurrent and sequential data persistence modes. After the in-memory data is parsed, the appropriate data persistence mode can be selected to import the data into the distributed database.
[0029] The concurrent data persistence mode is specifically as follows: the data is divided into multiple batches, and multiple threads are constructed to store the batches in parallel.
[0030] The sequential data persistence mode is as follows: a specified field of the batch data is hashed to establish a mapping between the hash value and the batch data, and the batch data is mapped to multiple thread tasks, which insert it into the database in parallel through insertion queues.
[0031] Preferably, the system uses various methods of listening to and subscribing to securities trading message bus events to enable specific message data transmitted upstream to enter the corresponding queue and trigger the consumption of message data. The specific message data includes order data, account data, and business data.
[0032] Preferably, each node in the system is configured as a cluster in a primary / backup manner, the cluster supports elastic scaling of nodes, and each node reports heartbeat information at a fixed frequency.
[0033] This method utilizes the aforementioned system to implement ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push in a core securities trading system. The method includes the following steps:
[0034] (1) After the upstream transmits massive amounts of message data to the designated queue through the securities trading message bus, the monitoring mechanism responds.
[0035] (2) Obtain a large amount of raw memory data from the securities trading message bus through the file storage engine, divide it into multiple data blocks according to nodes and topics, and store it in the corresponding data files and index files;
[0036] (3) The data parsing engine loads massive amounts of data from local files and loads them into the memory area through corresponding decoding processing;
[0037] (4) The data routing engine processes the data according to the corresponding routing strategy and filters and distributes the parsed memory data to different clusters and instance nodes;
[0038] (5) The data is split into multiple streams. The data persistence engine imports the data into the distributed database in parallel according to the concurrent or sequential mode, or the data dual-write engine quickly writes the data to the external business system for business collaboration, or the data push engine pushes the massive amount of data to the external system for data collection and processing by accessing multiple middleware queues.
[0039] The system and method of this invention for achieving ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push in a core securities trading system can store large amounts of in-memory data in a distributed manner while ensuring that no data is lost. They also support multiple extended processing methods, meeting the needs of distributed database persistence, dual writing, and data push in production environments, and offer high throughput and low latency under high availability. Furthermore, by supporting distributed data file storage, flexible configuration of parsing methods and of data routing and filtering modes, multiple data persistence modes, and multiple extended processing methods, the invention overcomes the shortcomings of traditional methods, such as data that is easily lost and difficult to recover, lack of support for data filtering and routing, low data processing throughput, and inflexible extended processing methods.
Attached Figure Description
[0040] Figure 1 is a schematic diagram of the overall framework of the system of the present invention for achieving ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push in a core securities trading system.
[0041] Figure 2 is a timing diagram of the method of the present invention for achieving ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push in a core securities trading system.
[0042] Figure 3 is a schematic diagram of the partition index file structure in the file storage engine of the present invention.
[0043] Figure 4 is a schematic diagram of the global index file structure in the file storage engine of the present invention.
Detailed Implementation
[0044] To more clearly describe the technical content of the present invention, the following description is provided in conjunction with specific embodiments.
[0045] Before describing the embodiments of the present invention in detail, it should be noted that, in the following, the terms “comprising,” “including,” or any other variations are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed or inherent to such process, method, article, or apparatus.
[0046] As shown in Figure 1, this system, designed for achieving ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push within a core securities trading system, includes:
[0047] The file storage engine is used to obtain large amounts of raw memory data from the securities trading message bus, divide it into multiple data blocks according to nodes and topics, and store them in corresponding data files and index files.
[0048] The data parsing engine, connected to the file storage engine, is used to load massive amounts of data from local files and process and parse the data through decoding methods.
[0049] The data routing engine, connected to the data parsing engine, is used to process data through various routing strategies and to perform message data splitting and filtering.
[0050] The data persistence engine, connected to the data routing engine, is used to import massive amounts of data into a distributed database in batches according to multiple shards;
[0051] The data dual-write engine, connected to the data routing engine, is used to quickly write massive amounts of data to external business systems for collaborative business processing; and
[0052] The data push engine, connected to the data routing engine, is used to access multiple middleware queues and push massive amounts of data to external systems for data collection and processing.
[0053] Furthermore, the system is equipped with multiple nodes, and the file storage engine, data parsing engine, data routing engine, data persistence engine, data dual-write engine, and data push engine are all set in the corresponding nodes.
[0054] In a preferred embodiment of the present invention, the data routing engine specifically performs the following processing:
[0055] Massive amounts of data are distributed and processed according to multiple routing strategies. Routing and filtering strategies are configured and processed to achieve multi-node, multi-dimensional processing and support the horizontal scaling of distributed nodes.
[0056] Furthermore, the data routing engine determines the filtering range of each instance node based on the filtering configuration. When data arrives at an instance node and falls within the filtering range of that instance, the instance discards the data. Otherwise, the engine matches a routing mode for the data according to the routing configuration rules. A specified routing mode for global messages and a default routing mode are also supported, and each instance processes the data according to the specified mode.
[0057] In a preferred embodiment of the present invention, the data persistence engine specifically performs the following processing:
[0058] Massive amounts of data are processed in batches according to multiple shards, and imported into distributed databases in parallel according to various ingestion modes. It supports the use of distributed databases such as Oracle, MySQL, GoldenDB, and OceanBase, and implements a unified scheduling and management mechanism through the aforementioned distributed databases, as well as supporting rapid scaling of business operations.
[0059] In a preferred embodiment of the present invention, the file storage engine specifically performs the following processing:
[0060] A large amount of raw memory data is obtained from the securities trading message bus, divided into multiple data blocks according to nodes and topics, and stored locally in corresponding data files and index files. The offset of the data file is quickly located through the index file, supporting fast data reading and data recovery.
[0061] In a preferred embodiment of the present invention, the index file specifically includes a global index file and a partition index file, wherein:
[0062] The global index file is used to maintain global information including service event number, heartbeat timestamp, number of message topic partitions, number of topic messages, and topic partition information, and is also used to quickly obtain the overall message data of each topic partition;
[0063] The partition index file is used to maintain the message sequence number, message data offset, and message data length of messages under each topic partition, and the message data storage status under a certain topic partition can be quickly obtained through the partition index file.
[0064] In a preferred embodiment of the present invention, the data parsing engine specifically performs the following processing:
[0065] Based on time windows and stream processing mode, massive amounts of data are loaded from the local file, and data parsing is supported through hard-coded parsing mode and message configuration mode, loading the data into the memory area in batches.
[0066] As a preferred embodiment of the present invention, the hard-coded parsing mode specifically involves: defining the data type and parsing order of each data segment in the message through code, and changing the offset sequentially according to the length of the data segment to finally complete the parsing of the entire message data;
[0067] The message configuration mode is as follows: the message version and the data types and parsing order of each data segment in the message are defined by the configuration template, and the offset is changed sequentially according to the length of the data segment to finally complete the parsing of the entire message data.
[0068] In a preferred embodiment of the present invention, the data persistence engine includes a concurrent data persistence mode and a sequential data persistence mode. After the in-memory data is parsed, the appropriate data persistence mode can be selected to import the data into the distributed database.
[0069] The concurrent data persistence mode is specifically as follows: the data is divided into multiple batches, and multiple threads are constructed to store the batches in parallel.
[0070] The sequential data persistence mode is as follows: a specified field of the batch data is hashed to establish a mapping between the hash value and the batch data, and the batch data is mapped to multiple thread tasks, which insert it into the database in parallel through insertion queues.
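The two persistence modes can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes that the purpose of hashing a field in the sequential mode is to keep rows with the same field value in one queue so that their relative order is preserved. The function names, the `account` field, the use of CRC32 as the hash, and the `fake_insert` stand-in for a database insert are all hypothetical.

```python
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor

def persist_concurrent(rows, batch_size, workers, insert_batch):
    """Concurrent mode: cut rows into fixed-size batches and insert them in parallel."""
    batches = [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(insert_batch, batches))  # list() surfaces worker exceptions

def shard_by_field(rows, field, workers):
    """Map each row to a queue by a stable hash of one field (assumed: CRC32)."""
    queues = [[] for _ in range(workers)]
    for row in rows:
        digest = zlib.crc32(str(row[field]).encode("utf-8"))
        queues[digest % workers].append(row)
    return queues

def persist_sequential(rows, field, workers, insert_batch):
    """Sequential mode: one queue per thread task, queues inserted in parallel."""
    queues = shard_by_field(rows, field, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(insert_batch, queues))

# Hypothetical stand-in for a real database insert.
written = []
lock = threading.Lock()

def fake_insert(batch):
    for row in batch:
        with lock:
            written.append(row)

rows = [{"account": f"A{i % 3}", "seq": i} for i in range(30)]
persist_sequential(rows, "account", 4, fake_insert)
```

Under this reading, the concurrent mode maximizes throughput when ordering does not matter, while the sequential mode trades some parallelism for per-key ordering.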
[0071] In a preferred embodiment of the present invention, the system enables specific message data transmitted upstream to enter the corresponding queue and triggers the consumption of message data by means of various monitoring and subscription methods for securities trading message bus events. The specific message data includes order data, account data and business data.
[0072] In a preferred embodiment of the present invention, each node in the system is configured as a cluster in a primary / backup manner. The cluster supports elastic scaling of nodes, and each node reports heartbeat information at a fixed frequency.
[0073] This method utilizes the aforementioned system to implement ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push in a core securities trading system. The method includes the following steps:
[0074] (1) After the upstream transmits massive amounts of message data to the designated queue through the securities trading message bus, the monitoring mechanism responds.
[0075] (2) Obtain a large amount of raw memory data from the securities trading message bus through the file storage engine, divide it into multiple data blocks according to nodes and topics, and store it in the corresponding data files and index files;
[0076] (3) The data parsing engine loads massive amounts of data from local files and loads them into the memory area through corresponding decoding processing;
[0077] (4) The data routing engine processes the data according to the corresponding routing strategy and filters and distributes the parsed memory data to different clusters and instance nodes;
[0078] (5) The data is split into multiple streams. The data persistence engine imports the data into the distributed database in parallel according to the concurrent or sequential mode, or the data dual-write engine quickly writes the data to the external business system for business collaboration, or the data push engine pushes the massive amount of data to the external system for data collection and processing by accessing multiple middleware queues.
[0079] In a preferred embodiment of the present invention, the file storage engine in the method includes: obtaining a large amount of raw memory data from the securities trading message bus, dividing it into multiple data blocks according to nodes and topics, storing it locally in corresponding data files and index files, quickly locating the data file offset through the index file, and supporting fast data reading and data recovery.
[0080] As a preferred embodiment of the present invention, the data parsing engine includes: loading massive amounts of data from local files based on time windows and stream processing modes, parsing the data through decoding methods such as configuration file parsing and hard-coded parsing, and loading the data into the memory area in batches.
[0081] In a preferred embodiment of the present invention, the data routing engine includes: processing massive data streams according to multiple routing strategies, wherein the routing strategies and filtering strategies are configurable, enabling multi-node, multi-dimensional processing, and supporting the horizontal expansion of distributed nodes.
[0082] In a preferred embodiment of the present invention, the data persistence engine includes: processing massive amounts of data in batches according to multiple shards, importing them in parallel into a distributed database according to multiple ingestion modes, and supporting multiple mainstream databases such as Oracle and MySQL, as well as domestic distributed databases such as GoldenDB and OceanBase. Through a unified scheduling and management mechanism for the distributed database, the complexity of database sharding operations can be reduced while ensuring transparency to upper-layer applications. Simultaneously, data redundancy is reduced, resource fragmentation is avoided, rapid scaling of business operations is supported, resource control and isolation requirements in different scenarios are met, and stronger multi-active disaster recovery capabilities are achieved, enabling high availability at low cost.
[0083] As a preferred embodiment of the present invention, the data dual-write engine includes: rapidly dual-writing massive amounts of data to an external business system to achieve business collaboration with the external business system.
[0084] As a preferred embodiment of the present invention, the data push engine includes: pushing massive amounts of data to external systems for data collection and processing by accessing multiple middleware queues such as Kafka.
[0085] In a preferred embodiment of the present invention, by means of various monitoring and subscription methods for securities trading message bus events, specific message data transmitted upstream enters the corresponding queue and triggers the consumption action of message data.
[0086] In a preferred embodiment of the present invention, the event listener uses the implemented event processing interface as the access point and distinguishes different event types transmitted from upstream through the event enumeration class, and triggers the corresponding response in the event handler.
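A minimal sketch of the listener pattern described above, in which an event enumeration distinguishes upstream event types and a handler implementing an event processing interface responds to each. The event names (`MESSAGE_ARRIVED`, `HEARTBEAT`) and the class names are assumptions chosen for illustration; the patent does not list the actual event types.

```python
from abc import ABC, abstractmethod
from enum import Enum, auto

class BusEvent(Enum):
    """Hypothetical event types delivered by the message bus."""
    MESSAGE_ARRIVED = auto()
    HEARTBEAT = auto()

class EventHandler(ABC):
    """Event processing interface that listeners implement as the access point."""
    @abstractmethod
    def on_event(self, event: BusEvent, payload: bytes) -> None:
        ...

class QueueingListener(EventHandler):
    """Puts arriving message data on a queue and counts heartbeats."""
    def __init__(self):
        self.queue = []
        self.heartbeats = 0

    def on_event(self, event: BusEvent, payload: bytes) -> None:
        if event is BusEvent.MESSAGE_ARRIVED:
            self.queue.append(payload)  # later consumed by the storage step
        elif event is BusEvent.HEARTBEAT:
            self.heartbeats += 1
        else:
            raise ValueError(f"unhandled event type: {event}")

listener = QueueingListener()
listener.on_event(BusEvent.MESSAGE_ARRIVED, b"order-1")
listener.on_event(BusEvent.HEARTBEAT, b"")
listener.on_event(BusEvent.MESSAGE_ARRIVED, b"account-7")
```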
[0087] In a preferred embodiment of the present invention, memory data is written to local files immediately on arrival, rather than only after the data volume reaches a set threshold. This reduces the probability of memory data loss and ensures the integrity of the original data. Nodes are configured as clusters in a primary-backup manner. Different clusters support connection to different message partitions. Each cluster includes multiple nodes and supports horizontal scaling, thereby ensuring high availability of the overall service.
[0088] In a preferred embodiment of the present invention, nodes report heartbeat information at a fixed frequency to ensure the maintenance and updating of the state of each node in the distributed system.
[0089] In a preferred embodiment of the present invention, the heartbeat logs of each node are collected and analyzed by the monitoring component at a certain frequency, and the node status is updated on the monitoring page at regular intervals.
[0090] In a preferred embodiment of the present invention, memory data is stored in local files in the order of the securities trading message bus, data content is stored in data files, and corresponding index files are updated synchronously. The index files include global index files and partition index files. Through the fast retrieval of the index files, the data files can be quickly located.
[0091] As a preferred embodiment of the present invention, the global index file maintains global information such as service event number, heartbeat timestamp, number of message topic partitions, number of topic messages, and topic partition information. The overall message data of each topic partition can be quickly obtained through the global index file. When the data is updated, the global index file will update each information field synchronously.
[0092] In a preferred embodiment of the present invention, the partition index file maintains the message sequence number, message data offset, message data length, etc. of messages under each topic partition. The message data storage status under a certain topic partition can be quickly obtained through the partition index file. When the data is updated, the partition index file will update each status field synchronously.
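As an illustration of how a partition index can locate message data, the sketch below stores one fixed-width index record per message (sequence number, data offset, data length) and uses it to read a message back from the data file. The 24-byte little-endian record layout and the names `IndexEntry`, `append_message`, and `read_message` are assumptions; the patent does not specify a binary format. In-memory buffers stand in for the local files.

```python
import io
import struct
from typing import NamedTuple

# Assumed record layout: three unsigned 64-bit integers, little-endian.
ENTRY = struct.Struct("<QQQ")

class IndexEntry(NamedTuple):
    seq: int      # message sequence number
    offset: int   # byte offset of the message in the data file
    length: int   # byte length of the message

def append_message(data_file, index_file, seq: int, payload: bytes) -> IndexEntry:
    """Append the payload to the data file and record its position in the index."""
    data_file.seek(0, io.SEEK_END)
    entry = IndexEntry(seq, data_file.tell(), len(payload))
    data_file.write(payload)
    index_file.seek(0, io.SEEK_END)
    index_file.write(ENTRY.pack(*entry))
    return entry

def read_message(data_file, index_file, position: int) -> bytes:
    """Read the message at a zero-based index position without scanning the data file."""
    index_file.seek(position * ENTRY.size)
    entry = IndexEntry(*ENTRY.unpack(index_file.read(ENTRY.size)))
    data_file.seek(entry.offset)
    return data_file.read(entry.length)

data, index = io.BytesIO(), io.BytesIO()
append_message(data, index, 1, b"order-1")
append_message(data, index, 2, b"account-22")
second = read_message(data, index, 1)
```

Because every index record has the same size, the position of a record is a simple multiplication, which is one way to obtain the fast lookup and recovery behavior the text describes.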
[0093] In a preferred embodiment of the present invention, the file data is loaded as binary data by the file reading thread and stored in the memory-mapped block, and then retrieved and parsed by the parsing thread.
[0094] In a preferred embodiment of the present invention, a mapping data block is generated according to a specified size during local file initialization. The data block size can be configured, and the mapping file position index value and offset pointer are initialized.
[0095] In a preferred embodiment of the present invention, when reading a local file, it is determined whether the current offset position and the data segment length exceed the storage range of the current mapped file block. If the range is exceeded, the mapped file block is dynamically expanded, supporting multiple expansion strategies.
[0096] In a preferred embodiment of the present invention, when writing a local file, the remaining space of the current mapped data block is checked. If the length of the data to be written exceeds the remaining space, the mapped file block is expanded and then the data is loaded.
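The write-side check and expansion can be sketched with Python's `mmap` module. This is an assumption-laden illustration: the class name `MappedBlock`, the doubling strategy, and the remap-on-grow approach (close the mapping, extend the file, map again) are choices made here for portability, and the patent only states that multiple expansion strategies are supported.

```python
import mmap
import os
import tempfile

class MappedBlock:
    """Append-only writer over a memory-mapped file that grows on demand."""

    def __init__(self, path: str, block_size: int = 4096):
        self.offset = 0                       # offset pointer, initialized to 0
        self.file = open(path, "w+b")
        self.file.truncate(block_size)        # pre-size the file to one block
        self.map = mmap.mmap(self.file.fileno(), block_size)

    def _expand(self, needed: int) -> None:
        new_size = len(self.map)
        while new_size - self.offset < needed:
            new_size *= 2                     # assumed strategy: double the block
        self.map.flush()
        self.map.close()
        self.file.truncate(new_size)
        self.map = mmap.mmap(self.file.fileno(), new_size)

    def write(self, payload: bytes) -> int:
        """Write the payload at the current offset, expanding first if it does not fit."""
        if len(payload) > len(self.map) - self.offset:
            self._expand(len(payload))
        start = self.offset
        self.map[start:start + len(payload)] = payload
        self.offset += len(payload)
        return start

    def close(self) -> None:
        self.map.flush()
        self.map.close()
        self.file.close()

path = os.path.join(tempfile.mkdtemp(), "topic-0.data")
block = MappedBlock(path, block_size=16)
first = block.write(b"0123456789")
second = block.write(b"abcdefghij")   # only 6 bytes remain, so the block expands
size_after = len(block.map)
content = bytes(block.map[:20])
block.close()
```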
[0097] In a preferred embodiment of the present invention, the data read from the file is parsed according to the message topic and message number, and the data information is read from the local data file according to the corresponding parsing method and loaded into memory.
[0098] In a preferred embodiment of the present invention, the parsing processor implements an abstract parsing interface and matches different message data under different message topics by concatenating the message topic and message number as key values.
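One way to picture the key-based matching is a registry of parser objects that share an abstract interface and are looked up by a key built from the message topic and message number. The `topic:number` key format, the topic names, and the parser classes below are hypothetical.

```python
from abc import ABC, abstractmethod

class MessageParser(ABC):
    """Abstract parsing interface implemented by each concrete parser."""
    @abstractmethod
    def parse(self, body: bytes) -> dict:
        ...

class OrderParser(MessageParser):
    def parse(self, body: bytes) -> dict:
        return {"kind": "order", "raw": body}

class AccountParser(MessageParser):
    def parse(self, body: bytes) -> dict:
        return {"kind": "account", "raw": body}

PARSERS = {}

def make_key(topic: str, message_no: int) -> str:
    # Assumed key format: topic and message number joined by a colon.
    return f"{topic}:{message_no}"

def register(topic: str, message_no: int, parser: MessageParser) -> None:
    PARSERS[make_key(topic, message_no)] = parser

def parse_message(topic: str, message_no: int, body: bytes) -> dict:
    parser = PARSERS.get(make_key(topic, message_no))
    if parser is None:
        raise KeyError(f"no parser registered for {make_key(topic, message_no)}")
    return parser.parse(body)

register("trade", 1001, OrderParser())
register("trade", 2001, AccountParser())
parsed = parse_message("trade", 2001, b"\x01\x02")
```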
[0099] In a preferred embodiment of the present invention, the message version is obtained and determined from the message data header during data parsing, and different parsing strategies are supported for different message versions.
[0100] In a preferred embodiment of the present invention, the message data consists of two parts: a message header and a message body. The message header contains common message attributes, and the message body contains message data fields.
[0101] In a preferred embodiment of the present invention, the message header is defined according to a fixed format, including common message attributes such as message version number, message options, message encoding, and message body length attribute, as well as the data type of each attribute.
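A fixed-format header of this kind can be decoded with a single `struct` definition before the body is handed to a version-specific parser. The field widths below (2-byte version, 1-byte options, 1-byte encoding, 4-byte body length, little-endian) are assumptions for illustration only; the patent names the attributes but not their sizes.

```python
import struct
from typing import NamedTuple, Tuple

# Assumed fixed header layout: version, options, encoding, body length.
HEADER = struct.Struct("<HBBI")

class Header(NamedTuple):
    version: int
    options: int
    encoding: int
    body_length: int

def split_message(buf: bytes) -> Tuple[Header, bytes]:
    """Decode the fixed header, then slice out exactly body_length bytes of body."""
    header = Header(*HEADER.unpack_from(buf, 0))
    body = buf[HEADER.size:HEADER.size + header.body_length]
    if len(body) != header.body_length:
        raise ValueError("truncated message body")
    return header, body

message = HEADER.pack(2, 0, 1, 5) + b"hello" + b"next-message-bytes"
header, body = split_message(message)
```

The decoded `header.version` is the value a parser could use to choose between parsing strategies for different message versions.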
[0102] In a preferred embodiment of the present invention, the message body is defined according to the actual data composition of different messages, the data fields are arranged in the order of composition, and the corresponding data space is allocated according to the data segment type.
[0103] As a preferred embodiment of the present invention, the hard-coded parsing mode defines the data type and parsing order of each data segment in the message through code, and changes the offset sequentially according to the length of the data segment, and finally completes the parsing of the entire message data.
[0104] In a preferred embodiment of the present invention, the message configuration mode defines the message version and the data type and parsing order of each data segment in the message through a configuration template, and changes the offset sequentially according to the length of the data segment to finally complete the parsing of the entire message data.
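The message configuration mode can be sketched as a template that lists each data segment's name and type in parse order, with the parser advancing an offset by the length of each segment. The template contents, the field names, and the use of `struct` format codes as the type notation are assumptions; a hard-coded parser would perform the same offset walk with the field list written directly in code.

```python
import struct

# Hypothetical configuration template for version 1 of an order message:
# (field name, struct format code), listed in parse order.
ORDER_TEMPLATE_V1 = [
    ("order_id", "<Q"),   # unsigned 64-bit integer
    ("price", "<d"),      # 64-bit float
    ("quantity", "<I"),   # unsigned 32-bit integer
    ("side", "<c"),       # single byte
]

def parse_with_template(buf: bytes, template, offset: int = 0):
    """Parse fields one by one, moving the offset forward by each field's length."""
    record = {}
    for name, fmt in template:
        (record[name],) = struct.unpack_from(fmt, buf, offset)
        offset += struct.calcsize(fmt)
    return record, offset

payload = struct.pack("<QdIc", 1001, 12.5, 300, b"B")
record, end = parse_with_template(payload, ORDER_TEMPLATE_V1)
```

Supporting a new message version under this scheme means adding a template rather than changing parsing code, which matches the flexibility the configuration mode is meant to provide.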
[0105] In a preferred embodiment of the present invention, the data routing engine determines the filtering range of each instance node according to the filtering configuration. When data arrives at an instance node and falls within the filtering range of that instance, the instance discards the data. Otherwise, the engine matches a routing mode for the data according to the routing configuration rules. Routing modes such as a specified routing mode for global messages and a default routing mode are supported, and each instance processes the data according to the specified mode.
[0106] In a preferred embodiment of the present invention, the data routing engine loads the cluster instance list and full routing configuration information in the initialization step, determines the default instance nodes and routing policies for each node according to the configuration, and distinguishes the detailed routing relationships of different policies according to the configured routing type and routing key value.
[0107] As a preferred embodiment of the present invention, the data routing engine supports specifying the default instance type through configuration. By querying the instance deployment list, the cluster to which the instance belongs and the instance type can be obtained. If the current instance type value matches the default instance type, then the instance is the default instance.
[0108] In a preferred embodiment of the present invention, the data routing engine defines the routing policy logic through the routing interface. The routing policy supports multiple methods such as global message specified routing, ordinary message routing, default routing, designated branch (business department) routing, and branch modulo routing, and the priorities of the different routing policies can be customized.
[0109] In a preferred embodiment of the present invention, the global message routing policy determines whether the current message data matches based on the loaded configuration information. The matching key is a concatenation string composed of a global message routing type enumeration value and a message number. If the current instance's filter list contains this message, the message is discarded; otherwise, if the current instance's routing list contains this message, it is processed by the current instance according to this policy.
[0110] In a preferred embodiment of the present invention, the ordinary message routing policy determines whether the current message data matches based on the loaded configuration information. The matching key is a concatenation string composed of an ordinary message routing type enumeration value and a message number. If the current instance's filter list contains this message, the message is discarded; otherwise, if the current instance's routing list contains this message, it is processed by the current instance according to this policy.
[0111] In a preferred embodiment of the present invention, the default routing policy determines whether the current message data matches based on the loaded configuration information. If the message routing field of the current message is empty, the decision to process the message is made based on whether the current instance is the default instance. If the current instance is the default instance, the message is processed by that instance; otherwise, it is discarded.
[0112] In a preferred embodiment of the present invention, the designated branch routing policy determines whether the current message data matches based on the loaded configuration information. The matching key is a concatenation string consisting of the designated branch routing type enumeration value and the message routing field (branch number). If the current instance's filter list contains this message, the message is discarded; otherwise, if the current instance's routing list contains this message, it is processed by the current instance according to this policy.
[0113] In a preferred embodiment of the present invention, the branch modulo routing policy determines whether the current message data matches based on the loaded configuration information. If the hash value field configured for this instance is valid, and the result of the modulo operation on the message routing field (branch number) does not equal that hash value, then the message is discarded; otherwise, it is processed by the current instance.
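The routing policies described above may be sketched as a single decision function. This is a minimal sketch under stated assumptions: the `InstanceConfig` fields, the policy names `GLOBAL`, `BRANCH`, `DEFAULT`, and `MODULO`, and the behaviour when a key appears in neither list are hypothetical. Only the key shape (enumeration value joined to a message number or branch number) and the filter-before-route order come from the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InstanceConfig:
    # All field names are hypothetical; the source does not define a schema.
    is_default: bool = False
    filter_keys: set = field(default_factory=set)
    route_keys: set = field(default_factory=set)
    modulo: Optional[int] = None      # assumed divisor for modulo routing
    hash_value: Optional[int] = None  # remainder owned by this instance


def should_process(cfg, policy, msg_no, branch_no=None) -> bool:
    """Return True if this instance processes the message, False if it discards it."""
    if policy in ("GLOBAL", "NORMAL"):
        key = f"{policy}_{msg_no}"
    elif policy == "BRANCH":
        key = f"{policy}_{branch_no}"
    elif policy == "DEFAULT":
        # Applies when the routing field is empty: only the default instance processes.
        # A non-empty routing field is treated as "no match" here (an assumption).
        return branch_no is None and cfg.is_default
    elif policy == "MODULO":
        if cfg.modulo and cfg.hash_value is not None:
            return branch_no % cfg.modulo == cfg.hash_value
        return True  # no valid hash value configured: the message is processed
    else:
        raise ValueError(f"unknown routing policy: {policy}")
    if key in cfg.filter_keys:  # the filter list is checked first
        return False
    return key in cfg.route_keys  # absent from both lists: discarded (an assumption)
```

Customizable policy priority could then be modelled by trying the policies in a configured order and stopping at the first that applies.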
[0114] In a preferred embodiment of the present invention, after message data is parsed and routed, it is stored in a generic class structure as a unit. This generic structure serves as a specific memory area, and the message data space is maintained by an object array, the size of which is defined by a specified configuration value. When data is imported, it is placed at the index position in the object array that corresponds to the sequence number of the message data.
[0115] In a preferred embodiment of the present invention, a separate thread retrieves message data from a specific memory area and submits it to the database entry thread pool to execute the database entry task. The data retrieval action supports configuration of batch quantity and timeout period.
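The memory area and the batch retrieval step may be illustrated as below. The class name `MessageBuffer`, the wrap-around rule (sequence number modulo capacity), and the polling interval are assumptions; the description states only that slots are indexed by sequence number and that retrieval is bounded by a configurable batch quantity and timeout. Synchronization between the producer and the retrieval thread is omitted for brevity.

```python
import time
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class MessageBuffer(Generic[T]):
    """Fixed-size slot array; the message with sequence number n lives at n % capacity."""

    def __init__(self, capacity: int) -> None:
        self._slots: List[Optional[T]] = [None] * capacity
        self._capacity = capacity
        self._next_read = 0  # sequence number of the next message to hand out

    def put(self, seq: int, msg: T) -> None:
        self._slots[seq % self._capacity] = msg

    def take_batch(self, batch_size: int, timeout_s: float) -> List[T]:
        """Collect up to batch_size consecutive messages, giving up after timeout_s."""
        deadline = time.monotonic() + timeout_s
        batch: List[T] = []
        while len(batch) < batch_size:
            idx = self._next_read % self._capacity
            msg = self._slots[idx]
            if msg is None:
                if time.monotonic() >= deadline:
                    break  # timeout reached: return what has been collected
                time.sleep(0.001)  # wait briefly for the next message to arrive
                continue
            self._slots[idx] = None  # free the slot for reuse
            self._next_read += 1
            batch.append(msg)
        return batch
```

A retrieval thread would call `take_batch` in a loop and submit each non-empty batch to the database entry thread pool.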
[0116] In a preferred embodiment of the present invention, message data entry supports filtering by specified message type, such as ErrorMessage and IgnoreMessage.
[0117] In a preferred embodiment of the present invention, the data persistence engine groups the batched messages by message number for database entry and establishes a mapping table between each message number and its corresponding message data.
[0118] In a preferred embodiment of the present invention, after traversal, the batch entry messages are stored in two mapping tables, a concurrent entry table and a sequential entry table, according to the entry mode.
[0119] As a preferred embodiment of the present invention, after memory data parsing and routing, multiple modes of data entry are supported. The concurrent data entry method can import data into the distributed database in a high-concurrency parallel manner according to the sharding and batching method, while the sequential data entry method can use various modulo methods such as hash modulo to map and allocate data to different data queues for import into the distributed database.
[0120] In a preferred embodiment of the present invention, after the data is processed by the parsing engine, it is divided into unordered data and ordered data according to the data identifier, and placed into unordered processing queues and ordered processing queues respectively. The unordered processing queue adopts a concurrent data storage mode, which divides the data into multiple batches and constructs multiple threads to store multiple batches of data in parallel. The ordered processing queue adopts a sequential data storage mode, which establishes a mapping relationship between the hash value and the batch data by hashing a specified field of the batch data, mapping the batch data to multiple thread tasks, and delivering them to the storage queue for parallel storage. Each ordered storage queue uniquely corresponds to one task thread, ensuring the storage order of each batch of data in each ordered queue. The validity and atomicity of batch data storage are controlled by a transaction mechanism, and the storage status of multiple batches of data is recorded in real time by a breakpoint mechanism. The unordered queue and the ordered queue each maintain independent breakpoint pointers. After each batch of data is stored, the breakpoint pointer will move forward by the corresponding offset. The global offset pointer will take the minimum of the breakpoint position of the ordered queue and the breakpoint position of the unordered queue, thereby ensuring that no data is lost when the breakpoint is restored.
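Two parts of this mechanism lend themselves to a short sketch: the hash mapping that assigns ordered data to storage queues, and the global offset taken as the minimum of the two breakpoints. The function names and the choice of CRC32 as the hash are assumptions. CRC32 is used here because Python's built-in `hash()` is randomized per process for strings, which would break the mapping across restarts.

```python
import zlib
from collections import defaultdict


def assign_ordered_queues(records, key_field, queue_count):
    """Map each record to a queue by hashing key_field.

    Records sharing a key always land in the same queue, in arrival order,
    so a single task thread per queue preserves their storage order.
    """
    queues = defaultdict(list)
    for rec in records:
        h = zlib.crc32(str(rec[key_field]).encode("utf-8"))
        queues[h % queue_count].append(rec)
    return dict(queues)


def global_breakpoint(ordered_breakpoint, unordered_breakpoint):
    """Recovery restarts from the smaller breakpoint so no stored-late data is skipped."""
    return min(ordered_breakpoint, unordered_breakpoint)
```

Taking the minimum means some data between the two breakpoints may be replayed after recovery, so the storage step would need to tolerate repeated batches; how the invention handles such replays is not stated in the description.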
[0121] In a preferred embodiment of the present invention, the concurrent data storage method prioritizes storing messages that are hard-coded and parsed, followed by storing messages that are configured and parsed.
[0122] In a preferred embodiment of the present invention, the message data is entered into the database based on the Boolean value of the partition entry flag. If the Boolean value is true, the message data is grouped by the partition field (business department number) and cut into multiple batches of data according to the batch entry quantity, and submitted for multi-task processing. If the Boolean value is false, the message data is directly cut into multiple batches of data according to the batch entry quantity and submitted for multi-task processing.
[0123] In a preferred embodiment of the present invention, the message data is configured to be partitioned for storage based on the Boolean value of the partition storage flag. If the Boolean value is true, the message data is grouped by the partition field (business department number) and cut into multiple batches of data according to the batch storage quantity, and submitted for multi-task processing. If the Boolean value is false, the message data is directly cut into multiple batches of data according to the batch storage quantity and submitted for multi-task processing.
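The effect of the partition flag on batching can be sketched as follows. The function name, the `branch_no` field name, and the dictionary message shape are hypothetical; the grouping-then-cutting behaviour follows the description.

```python
from itertools import groupby
from operator import itemgetter


def split_batches(messages, batch_size, partitioned, partition_field="branch_no"):
    """Cut messages into batches, optionally grouping by the partition field first."""

    def chunks(items):
        return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

    if not partitioned:
        # Flag is false: cut the stream directly by batch quantity.
        return chunks(list(messages))

    batches = []
    # sorted() is stable, so arrival order is kept within each partition.
    ordered = sorted(messages, key=itemgetter(partition_field))
    for _, group in groupby(ordered, key=itemgetter(partition_field)):
        batches.extend(chunks(list(group)))
    return batches
```

With the flag set, no batch mixes business department numbers, so each batch can be directed at a single database partition.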
[0124] In a preferred embodiment of the present invention, the submitting thread blocks while waiting for the execution result of each data entry task. If a task fails, the error information is added to the error log table, and the subsequent steps continue.
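A minimal sketch of this blocking wait, assuming a thread pool and a caller-supplied `write_batch` function (both hypothetical), is shown below. An in-memory list stands in for the error log table.

```python
from concurrent.futures import ThreadPoolExecutor


def run_entry_tasks(batches, write_batch, max_workers=4):
    """Submit every batch, block on each result, and record failures without stopping."""
    error_log = []  # stand-in for the error log table
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(write_batch, batch) for batch in batches]
        for index, future in enumerate(futures):
            try:
                future.result()  # blocks until this task has finished
            except Exception as exc:
                # Record the failure and carry on with the remaining tasks.
                error_log.append((index, str(exc)))
    return error_log
```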
[0125] In a preferred embodiment of the present invention, the data push engine loads the configuration table to obtain message configuration information such as message number, message definition, queue topic, and message type, and pushes it to the middleware queue in combination with the parsed message data.
[0126] The present invention discloses a method for ultra-fast, massive-scale in-memory data persistence, dual-write, and data push in a new generation of distributed core trading systems for securities, comprising the following steps:
[0127] (1) After the upstream transmits massive message data to the designated queue through the securities trading message bus, the listening mechanism responds and obtains a large amount of raw memory data from the securities trading message bus through the file storage engine. The data is divided into multiple data blocks according to nodes and topics and stored in the corresponding data files and index files.
[0128] (2) The data parsing engine loads massive amounts of data from local files and loads them into the memory area through various decoding methods;
[0129] (3) The data routing engine processes the parsed memory data according to various routing strategies and filters and distributes it to different clusters and instance nodes;
[0130] (4) The distributed data can then be passed to multiple extended processing engines. The data persistence engine can import the data into the distributed database in parallel according to the concurrent or serial mode. The data dual-write engine can quickly write the data to the external business system for business collaboration. The data push engine can push massive amounts of data to the external system for data collection and processing by accessing multiple middleware queues.
[0131] In a specific embodiment of the present invention:
[0132] Suppose the upstream new-generation distributed core securities trading system pushes business message 700001 to partition A and partition B of the topic queue MsgCreditTrd. The data fields defined in this message are: business serial number seq_id, credit type credit_kind, currency currency, fund account fund_id, and order quantity order_qty.
[0133] Each node in the cluster monitors the message push event, triggers the response of the file storage engine, retrieves the original data from the securities trading message bus, and stores it as two data file blocks and an index file block by topic partition: MsgCreditTrd_A and MsgCreditTrd_B.
[0134] The data parsing engine loads the message data from the local file. Suppose the configuration file decoding method is used, then the message structure is configured in the configuration file as follows.
[0135] <Message name="PktCreditTrd" pktno="700001">
[0136] <Field name="seq_id" primitiveType="SeqID_def" description="business serial number" />
[0137] <Field name="credit_kind" primitiveType="CreditKind_def" description="credit type" />
[0138] <Field name="currency" primitiveType="Currency_def" description="currency" />
[0139] <Field name="fund_id" primitiveType="FundID_def" description="fund account" />
[0140] <Field name="order_qty" primitiveType="Qty_def" description="order quantity" />
[0141] </Message>
[0142] Here, the name attribute gives the message name, PktCreditTrd, and the pktno attribute gives the message number, 700001. Each <Field> tag describes the name and type of a data field within the message; different types correspond to different data lengths. Using this configuration structure, the data parsing engine parses each field in the message body according to its order and length and loads it into memory.
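A configuration-driven parser for this message might look like the sketch below. The template repeats the structure given in the embodiment, but the `TYPE_FORMATS` table is entirely an assumption: the specification says only that each type has its own length, so the byte widths chosen here are illustrative.

```python
import struct
import xml.etree.ElementTree as ET

TEMPLATE = """
<Message name="PktCreditTrd" pktno="700001">
  <Field name="seq_id" primitiveType="SeqID_def"/>
  <Field name="credit_kind" primitiveType="CreditKind_def"/>
  <Field name="currency" primitiveType="Currency_def"/>
  <Field name="fund_id" primitiveType="FundID_def"/>
  <Field name="order_qty" primitiveType="Qty_def"/>
</Message>
""".strip()

# Assumed encodings per type; widths are not given in the source.
TYPE_FORMATS = {
    "SeqID_def": "q",       # 8-byte signed integer
    "CreditKind_def": "c",  # single byte
    "Currency_def": "3s",   # 3-byte code
    "FundID_def": "16s",    # 16-byte padded identifier
    "Qty_def": "q",         # 8-byte signed integer
}


def parse_body(template_xml: str, body: bytes):
    """Read fields in template order, advancing the offset by each field's length."""
    root = ET.fromstring(template_xml)
    offset = 0
    fields = {}
    for field_def in root.findall("Field"):
        fmt = "<" + TYPE_FORMATS[field_def.get("primitiveType")]
        (value,) = struct.unpack_from(fmt, body, offset)
        fields[field_def.get("name")] = value
        offset += struct.calcsize(fmt)
    return root.get("pktno"), fields
```

Adding a field to the message then requires only a template change, which is the flexibility the message configuration mode is meant to provide over hard-coded parsing.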
[0143] Assuming message 700001 is configured to use the ordinary message routing policy, the data routing engine concatenates the policy enumeration value and the message number into the key NORMAL_700001 and looks it up in the filter list and the routing list of the ordinary message routing policy. If the key is in the current instance's filter list, the message is discarded; otherwise, if the key is in the routing list, the message is processed by the current instance.
[0144] Assuming message 700001 is configured to use the sequential entry mode, the data persistence engine hashes the specified field fund_id of the batch message data to establish a mapping relationship between the hash value and the batch data. Different batches of data are mapped to different thread task queues, and each entry queue stores its data in order.
[0145] Any process or method description in the flowchart or otherwise herein can be understood as representing a module, segment, or portion of code comprising one or more executable instructions for implementing a particular logical function or process, and the scope of the preferred embodiments of the invention includes additional implementations in which functions may be performed not in the order shown or discussed, including substantially simultaneously or in reverse order depending on the functions involved, as will be understood by those skilled in the art to which embodiments of the invention pertain.
[0146] It should be understood that various parts of the present invention can be implemented using hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented using software or firmware stored in memory and executed by a suitable instruction execution device.
[0147] Those skilled in the art will understand that all or part of the steps of the methods in the above embodiments can be implemented by a program instructing related hardware. The program can be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
[0148] The storage media mentioned above can be read-only memory, disk, or optical disk, etc.
[0149] In the description of this specification, references to terms such as "an embodiment," "some embodiments," "example," "specific example," or "embodiment," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
[0150] Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.
[0151] The system and method described in this invention, designed for achieving ultra-fast, massive-scale in-memory data persistence, dual-write, and data push in a core securities trading system, can perform distributed storage of large amounts of in-memory data while ensuring no data loss. It also supports multiple extended processing methods, meeting the needs of distributed database persistence, dual-write, and data push in production environments, and offering high throughput and low latency under high availability. Furthermore, by supporting distributed data file storage, flexible configuration of parsing methods and data routing filtering modes, and support for multiple data entry modes and extended processing methods, it overcomes the shortcomings of traditional methods, such as easy data loss and difficulty in recovery, lack of support for data filtering and routing, low data processing throughput, and inflexible extended processing methods.
[0152] In this specification, the invention has been described with reference to specific embodiments thereof. However, it will be apparent that various modifications and variations can be made without departing from the spirit and scope of the invention. Therefore, the specification and drawings should be considered illustrative rather than restrictive.
Claims
1. A system for achieving ultra-fast, massive-scale in-memory data persistence, dual data writing, and data push in a core securities trading system, characterized in that the system includes:
a file storage engine, used to obtain large amounts of raw memory data from the securities trading message bus, divide it into multiple data blocks according to nodes and topics, and store them in corresponding data files and index files;
a data parsing engine, connected to the file storage engine, used to load massive amounts of data from local files and process and parse the data through a decoding method that combines hard coding and message configuration;
a data routing engine, connected to the data parsing engine, used to process data through various routing strategies and to perform message data splitting and filtering;
a data persistence engine, connected to the data routing engine, used to import massive amounts of data into a distributed database in batches according to multiple shards;
a data dual-write engine, connected to the data routing engine, used to quickly write massive amounts of data to external business systems for collaborative business processing; and
a data push engine, connected to the data routing engine, used to access multiple middleware queues and push massive amounts of data to external systems for data collection and processing;
wherein the system is equipped with multiple nodes, and the file storage engine, data parsing engine, data routing engine, data persistence engine, data dual-write engine, and data push engine are all set in the corresponding nodes;
the data routing engine specifically performs the following processing: massive amounts of data are distributed and processed according to multiple routing strategies, and routing and filtering strategies are configured and processed to achieve multi-node, multi-dimensional processing and support the horizontal scaling of distributed nodes;
and the data routing engine determines the filtering range of each instance node based on the filtering configuration; when data arrives at an instance node and falls within the filtering range of that instance node, the data is discarded by that instance node; otherwise, the data routing mode is matched according to the various routing configuration rules, which support the global message specified routing mode and the default routing mode, and each instance node processes the data according to the specified mode.
2. The system for achieving ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a core securities trading system according to claim 1, characterized in that: The data persistence engine performs the following processing: Massive amounts of data are processed in batches according to multiple shards, and imported into distributed databases in parallel according to various ingestion modes. It supports the use of distributed databases such as Oracle, MySQL, GoldenDB, and OceanBase, and implements a unified scheduling and management mechanism through the aforementioned distributed databases, as well as supporting rapid scaling of business operations.
3. The system for achieving ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a core securities trading system according to claim 2, characterized in that, The file storage engine specifically performs the following processing: A large amount of raw memory data is obtained from the securities trading message bus, divided into multiple data blocks according to nodes and topics, and stored locally in corresponding data files and index files. The offset of the data file is quickly located through the index file, supporting fast data reading and data recovery.
4. The system for achieving ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a core securities trading system according to claim 3, characterized in that, The index files specifically include a global index file and partition index files, wherein, The global index file is used to maintain global information including service event number, heartbeat timestamp, number of message topic partitions, number of topic messages, and topic partition information, and is also used to quickly obtain the overall message data of each topic partition; The partition index file is used to maintain the message sequence number, message data offset, and message data length of messages under each topic partition, and the message data storage status under a certain topic partition can be quickly obtained through the partition index file.
5. The system for achieving ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a core securities trading system according to claim 4, characterized in that, The data parsing engine specifically performs the following processing: Based on time windows and stream processing mode, massive amounts of data are loaded from the local file, and data parsing is supported through hard-coded parsing mode and message configuration mode, loading the data into the memory area in batches.
6. The system for achieving ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a core securities trading system according to claim 5, characterized in that, The hard-coded parsing mode is specifically defined as follows: the data type and parsing order of each data segment in the message are defined by code, and the offset is changed sequentially according to the length of the data segment to finally complete the parsing of the entire message data; The message configuration mode is as follows: the message version and the data types and parsing order of each data segment in the message are defined by the configuration template, and the offset is changed sequentially according to the length of the data segment to finally complete the parsing of the entire message data.
7. The system for achieving ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a core securities trading system according to claim 6, characterized in that, The data persistence engine includes concurrent and sequential data persistence modes. After the in-memory data is parsed, the appropriate data insertion mode can be selected to import the data into the distributed database. The concurrent data storage mode is specifically as follows: the data is divided into multiple batches, and multiple threads are constructed to store multiple batches of data in parallel. The sequential data insertion mode is as follows: by hashing a specified field of the batch data, a mapping relationship between the hash value and the batch data is established, and the batch data is mapped to multiple thread tasks and inserted into the database in parallel by the insertion queue.
8. The system for achieving ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a securities core trading system according to any one of claims 1 to 7, characterized in that, The system uses various methods of listening to and subscribing to securities trading message bus events to enable specific message data transmitted upstream to enter the corresponding queue and trigger the consumption of message data. The specific message data includes order data, account data, and business data.
9. The system for achieving ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a core securities trading system according to any one of claims 1 to 7, characterized in that, Each node in the system is configured as a cluster in a primary / backup manner. The cluster supports elastic scaling of nodes, and each node reports heartbeat information at a fixed frequency.
10. A method for implementing ultra-fast, massive-scale memory data persistence, dual data writing, and data push in a core securities trading system using the system described in any one of claims 1 to 7, characterized in that the method includes the following steps:
(1) After the upstream transmits massive amounts of message data to the designated queue through the securities trading message bus, the monitoring mechanism responds;
(2) A large amount of raw memory data is obtained from the securities trading message bus through the file storage engine, divided into multiple data blocks according to nodes and topics, and stored in the corresponding data files and index files;
(3) The data parsing engine loads massive amounts of data from local files and loads them into the memory area through corresponding decoding processing;
(4) The data routing engine processes the data according to the corresponding routing strategy and filters and distributes the parsed memory data to different clusters and instance nodes;
(5) The data is imported into the distributed database in parallel according to the concurrent or serial mode by the data persistence engine, or the data is quickly written to the external business system by the data dual-write engine for business collaboration, or the data push engine pushes the massive amount of data to the external system for data collection and processing by accessing a variety of middleware queues.