Data transmission method, apparatus, device, and storage medium
By acquiring database table metadata for traffic analysis, monitoring message queue links, and automatically creating idle queue links, the transmission bottleneck in large data volume transmission was resolved, thereby improving the efficiency and flexibility of data transmission.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INDUSTRIAL AND COMMERCIAL BANK OF CHINA
- Filing Date
- 2023-02-07
- Publication Date
- 2026-06-19
AI Technical Summary
In the process of transmitting large amounts of data, existing technologies have transmission bottlenecks and cannot achieve horizontal expansion of data, resulting in reduced data transmission efficiency and flexibility.
By responding to data transmission commands, the system obtains metadata from the database tables, performs traffic analysis, monitors the operation data of the message queue links, and automatically creates idle message queue links when conditions exceed preset limits. It then uses a hash algorithm to allocate target queue links for data transmission.
It improves the efficiency and flexibility of data transmission, solves the transmission bottleneck problem, and enables horizontal expansion of data transmission.
Description
Technical Field
[0001] This disclosure relates to the field of data processing technology or financial technology, and more specifically to a data transmission method, apparatus, device, storage medium, and program product.
Background Technology
[0002] When migrating DB2 (a relational database management system) data from a host system to a platform system, the process typically involves first retrieving the DB2 data and then sending it to the host system's message queue. A data channel is then established between the host's message queue and the platform's message queue to complete the data transmission. In implementing the concept of this invention, the inventors discovered the following problem in related technologies: when the amount of data being transmitted is large, transmission bottlenecks occur and data transmission cannot be scaled horizontally, which reduces the efficiency and flexibility of data transmission.
Summary of the Invention
[0003] In view of the above problems, this disclosure provides data transmission methods, apparatus, devices, storage media, and program products that improve the efficiency and intelligence of data transmission.
[0004] One aspect of this disclosure provides a data transmission method, comprising: responding to a data transmission instruction, obtaining a database table carried in the data transmission instruction through a preset program; extracting metadata from a data warehouse based on the database table; performing traffic analysis on the metadata to obtain the traffic analysis result; monitoring the operation data of a message queue link, and automatically creating at least one idle message queue link based on the traffic analysis result when the operation data exceeds a preset condition, wherein the message queue link is used for data transmission; and executing the data transmission instruction based on the at least one idle message queue link.
[0005] According to embodiments of this disclosure, the metadata includes an identifier of the database table; the method further includes: registering the at least one idle message queue link to a message queue routing table; and returning at least one idle message queue link from the message queue routing table using the identifier of the database table.
[0006] According to an embodiment of this disclosure, the method further includes: if m idle message queue links are returned, using a hash algorithm to determine n target message queue links from the m idle message queue links, where m > 1 and 1 ≤ n ≤ m.
[0007] According to an embodiment of this disclosure, determining n target message queue links from the m idle message queue links using a hash algorithm includes: extracting the primary key field from the database table; and performing a hash operation on the field value of the primary key field and the value of m to obtain the hash value as n.
[0008] According to embodiments of this disclosure, the method further includes: monitoring the operating parameters generated during the execution of the data transmission instruction and obtaining monitoring results; generating alarm information and performing a retry operation when the monitoring results indicate an execution abnormality; stopping execution and notifying relevant maintenance personnel when the number of retry operations exceeds a preset threshold.
[0009] According to embodiments of this disclosure, the metadata includes the byte length occupied by the database table; the step of performing traffic analysis on the metadata to obtain the traffic analysis result includes: recording the change in the byte length in response to operations on the database table; obtaining the flow rate of the database table by analyzing the change in the byte length over a period of time; and determining the traffic analysis result based on the flow rate of the database table.
[0010] According to embodiments of this disclosure, the runtime data includes the runtime memory of the message queue link, the transmission latency of the message queue link, and the runtime status parameters of the message queue link.
[0011] Another aspect of this disclosure provides a data transmission apparatus, comprising: an acquisition module, configured to acquire a database table carried in a data transmission instruction via a preset program in response to a data transmission instruction; an extraction module, configured to extract metadata from a data warehouse based on the database table; an analysis module, configured to perform traffic analysis on the metadata to obtain the traffic analysis result; a first monitoring module, configured to monitor the operation data of a message queue link, and automatically create at least one idle message queue link based on the traffic analysis result when the operation data exceeds a preset condition, wherein the message queue link is used for data transmission; and an execution module, configured to execute the data transmission instruction based on the at least one idle message queue link.
[0012] Another aspect of this disclosure provides an electronic device, including: one or more processors; and a storage device for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors perform the data transmission method described above.
[0013] Another aspect of this disclosure provides a computer-readable storage medium having executable instructions stored thereon, which, when executed by a processor, cause the processor to perform the data transmission method described above.
[0014] Another aspect of this disclosure provides a computer program product, including a computer program that, when executed by a processor, implements the data transmission method described above.
[0015] According to the data transmission method, apparatus, device, storage medium, and program product provided in this disclosure, in response to a data transmission command, a database table is obtained through a preset program; metadata is extracted from a data warehouse based on the database table; traffic analysis is performed on the metadata to obtain traffic analysis results; the operating data of message queue links is monitored, and if the operating data exceeds preset conditions, at least one idle message queue link is automatically created based on the traffic analysis results; and the data transmission command is executed using the idle message queue link. Because the data transmission process incorporates the traffic analysis results obtained from the traffic analysis of the database table, and automatically creates multiple message queue links based on the traffic analysis results, it at least partially overcomes the transmission bottlenecks and the inability to achieve horizontal data expansion in related technologies, thereby achieving the technical effect of improving the efficiency and flexibility of data transmission.
Description of the Drawings
[0016] The foregoing contents, as well as other objects, features, and advantages of this disclosure, will become clearer from the following description of embodiments with reference to the accompanying drawings, in which:
[0017] Figure 1 schematically illustrates an application scenario of the data transmission method, apparatus, device, storage medium, and program product according to embodiments of the present disclosure;
[0018] Figure 2 schematically illustrates a flowchart of a data transmission method according to an embodiment of the present disclosure;
[0019] Figure 3 schematically illustrates the architecture of a data transmission system based on related technologies;
[0020] Figure 4 schematically illustrates the architecture of a data transmission system according to an embodiment of the present disclosure;
[0021] Figure 5 schematically illustrates a flowchart of the data transmission method executed by the data transmission system of Figure 4;
[0022] Figure 6 schematically illustrates a block diagram of a data transmission apparatus according to an embodiment of the present disclosure; and
[0023] Figure 7 schematically illustrates a block diagram of an electronic device suitable for implementing a data transmission method according to an embodiment of the present disclosure.
Detailed Description
[0024] The embodiments of the present disclosure will now be described with reference to the accompanying drawings. However, it should be understood that these descriptions are exemplary only and are not intended to limit the scope of the disclosure. In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the embodiments of the present disclosure for ease of explanation. However, it will be apparent that one or more embodiments may be practiced without these specific details. Furthermore, descriptions of well-known structures and techniques are omitted in the following description to avoid unnecessarily obscuring the concepts of the present disclosure.
[0025] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit this disclosure. The terms “comprising,” “including,” etc., as used herein indicate the presence of the stated features, steps, operations, and / or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.
[0026] All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein are to be interpreted in a manner consistent with the context of this specification, and not in an idealized or overly rigid way.
[0027] When using expressions such as "at least one of A, B, and C", they should generally be interpreted in accordance with the meaning that is commonly understood by a person skilled in the art (e.g., "a system having at least one of A, B, and C" should include, but is not limited to, a system having A alone, a system having B alone, a system having C alone, a system having A and B, a system having A and C, a system having B and C, and / or a system having A, B, and C, etc.).
[0028] This disclosure provides a data transmission method, apparatus, device, storage medium, and program product to improve the efficiency and flexibility of data transmission and enhance user experience. Specifically, the method includes: responding to a data transmission command by obtaining a database table carried in the data transmission command through a preset program; extracting metadata from a data warehouse based on the database table; performing traffic analysis on the metadata to obtain traffic analysis results; monitoring the operational data of message queue links, and automatically creating at least one idle message queue link based on the traffic analysis results when the operational data exceeds preset conditions, wherein the message queue link is used for data transmission; and executing the data transmission command based on the at least one idle message queue link.
[0029] It should be noted that the data transmission method and apparatus determined in the embodiments of this disclosure can be used in the field of data processing technology or the field of financial technology, or in any field other than the field of data processing technology or the field of financial technology. The embodiments of this disclosure do not limit the application field of the determined data transmission method and apparatus.
[0030] In the technical solutions disclosed herein, the collection, storage, use, processing, transmission, provision, disclosure, and application of data (including but not limited to user personal information) comply with the provisions of relevant laws and regulations, necessary confidentiality measures have been taken, and they do not violate public order and good morals.
[0031] Figure 1 schematically illustrates an application scenario of the data transmission method, apparatus, device, storage medium, and program product according to embodiments of the present disclosure.
[0032] As shown in Figure 1, application scenario 100 according to this embodiment may include terminal devices 101, 102, and 103, network 104, and server 105. Network 104 serves as the medium that provides communication links between terminal devices 101, 102, and 103 and server 105. Network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
[0033] Users can use terminal devices 101, 102, and 103 to interact with server 105 via network 104 to receive or send data transmission commands, etc. Various communication client applications can be installed on terminal devices 101, 102, and 103, such as financial applications, shopping applications, web browser applications, search applications, instant messaging tools, email clients, social media platform software, etc. (for example only).
[0034] Terminal devices 101, 102, and 103 can be various electronic devices that support web browsing, including but not limited to smartphones, tablets, laptops, and desktop computers.
[0035] Server 105 can be a server providing various services, such as a background management server that processes and executes data transmission commands sent by users using terminal devices 101, 102, and 103 (this is just an example). The background management server can analyze and process the received data transmission commands and other data, and feed back the processing results (such as transmission links, web pages, information, or data obtained or generated based on the data transmission commands) to the terminal devices. For example, in response to a data transmission command, server 105 can obtain the database table carried in the data transmission command through a preset program; extract metadata from the data warehouse based on the database table; perform traffic analysis on the metadata to obtain traffic analysis results; monitor the operation data of the message queue links, and automatically create at least one idle message queue link based on the traffic analysis results when the operation data exceeds preset conditions, wherein the message queue link is used for data transmission; and execute the data transmission command based on at least one idle message queue link.
[0036] It should be noted that the data transmission method provided in this embodiment can generally be executed by server 105. Correspondingly, the data transmission device provided in this embodiment can generally be located in server 105. The data transmission method provided in this embodiment can also be executed by a server or server cluster that is different from server 105 and capable of communicating with terminal devices 101, 102, 103 and / or server 105. Correspondingly, the data transmission device provided in this embodiment can also be located in a server or server cluster that is different from server 105 and capable of communicating with terminal devices 101, 102, 103 and / or server 105.
[0037] It should be understood that the number of terminal devices, networks, and servers shown in Figure 1 is merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.
[0038] Based on the scenario described in Figure 1, the data transmission method of the embodiments of the present disclosure is described in detail below with reference to Figures 2-5.
[0039] Figure 2 schematically illustrates a flowchart of a data transmission method according to an embodiment of the present disclosure.
[0040] As shown in Figure 2, the data transmission method of this embodiment includes operations S201 to S205.
[0041] In operation S201, in response to the data transmission command, the database table carried in the data transmission command is obtained through a preset program.
[0042] In operation S202, metadata is extracted from the data warehouse based on database tables.
[0043] In operation S203, traffic analysis is performed on the metadata to obtain the traffic analysis results.
[0044] In operation S204, the running data of the message queue link is monitored, and if the running data exceeds the preset conditions, at least one idle message queue link is automatically created based on the traffic analysis results. The message queue link is used for data transmission.
[0045] In operation S205, a data transmission instruction is executed based on at least one idle message queue link.
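Operations S201 to S205 can be read as a simple pipeline. The following is a minimal sketch only: every name in it (`TransmissionSteps`, `handle_instruction`, and the callables) is hypothetical, and the choice to keep the existing links when no overload is detected is an assumption, since the source only describes the overloaded case.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TransmissionSteps:
    """Hypothetical callables standing in for operations S201 to S205."""
    get_table: Callable[[dict], str]                  # S201: table carried in the instruction
    extract_metadata: Callable[[str], dict]           # S202: metadata from the data warehouse
    analyze_traffic: Callable[[dict], float]          # S203: traffic analysis result
    is_overloaded: Callable[[], bool]                 # S204: operating data vs. preset conditions
    create_idle_links: Callable[[float], List[str]]   # S204: create at least one idle link
    send: Callable[[str, List[str]], None]            # S205: execute the instruction


def handle_instruction(instruction: dict,
                       steps: TransmissionSteps,
                       current_links: List[str]) -> List[str]:
    """Run S201 to S205 once and return the links used for transmission."""
    table = steps.get_table(instruction)
    metadata = steps.extract_metadata(table)
    flow = steps.analyze_traffic(metadata)
    links = list(current_links)  # assumption: reuse existing links if not overloaded
    if steps.is_overloaded():
        links = steps.create_idle_links(flow)
    steps.send(table, links)
    return links
```

Passing the steps in as callables keeps the sketch independent of any particular message queue product or capture program.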
[0046] According to embodiments of this disclosure, the data transmission instruction may be generated in response to some operations, such as an instruction to transmit transaction data generated in response to an operation involving transaction processing, or an instruction to transmit modified data generated when performing operations such as inserting, deleting, or modifying data in a database table.
[0047] According to embodiments of this disclosure, the preset program may include a Qrep Capture export program. The Qrep Capture export program differs from Qrep Capture. The Qrep Capture export program is a system-level technology that can send Qrep Capture records to other systems. Qrep Capture, on the other hand, is a DB2 data replication product for IBM mainframes, used for host-to-host and host-to-platform DB2 data replication; it sends data table records to the target system by reading DB2 logs.
[0048] According to embodiments of this disclosure, a database table may refer to a DB2 table. A DB2 database can be understood as a large relational data platform that allows users or applications to query data from different databases, or even different database management systems, using the same Structured Query Language. A DB2 table refers to a set of system tables created when a DB2 database is created. These system tables record information related to all database objects (tables, views, etc.), such as the name of the database object (including aliases for tables and views), program, program name, program instance name, dependent objects, names of dependent objects, and types of dependent objects. In embodiments of this disclosure, a DB2 table may include multiple data records, each of which may be obtained in response to a transaction. For example, depending on the transaction, multiple tables may be inserted or deleted, and multiple rows / columns of data may be inserted or deleted.
[0049] According to embodiments of this disclosure, a data warehouse can store metadata. Metadata can be understood as intermediary data, mainly describing data attributes and used to support functions such as indicating storage location, historical data, resource lookup, and file records. In embodiments of this disclosure, metadata may include the identifier of a database table and the byte length occupied by the database table. The identifier of a database table can be understood as the name of the database table. The byte length occupied by the database table can be obtained based on the bytes occupied by each data record in the database table.
[0050] According to embodiments of this disclosure, traffic analysis can be understood as computing real-time traffic from the metadata and analyzing, for each DB2 table, the flow rate and total byte length of the data records processed. The traffic analysis result can serve as a basis for subsequently determining whether a message queue link needs to be created.
[0051] According to embodiments of this disclosure, the operational data of a message queue link may include the running memory of the message queue link, the transmission delay of the message queue link, and the operational status parameters of the message queue link.
[0052] According to embodiments of this disclosure, the preset conditions can be constructed based on running memory, latency thresholds, and running status. The preset conditions can be adaptively set according to actual needs. The situation where running data exceeds the preset conditions can be understood as at least one of the following: running memory exceeds a preset memory value, transmission latency exceeds a preset latency threshold, or running status parameters reach preset overload state parameters, etc.
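As an illustration of the "at least one of the following" check described above, here is a minimal sketch. The class names, field names, and default threshold values are assumptions made for the example; the source does not specify concrete limits.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    """Operating data of one message queue link (field names are illustrative)."""
    memory_bytes: int   # running memory currently used by the link
    latency_ms: float   # observed transmission latency
    status: str         # running status parameter, e.g. "ok" or "overloaded"


@dataclass
class PresetLimits:
    """Preset conditions; the defaults are placeholders, not values from the source."""
    max_memory_bytes: int = 512 * 1024 * 1024
    max_latency_ms: float = 200.0
    overload_status: str = "overloaded"


def exceeds_preset(stats: LinkStats, limits: PresetLimits) -> bool:
    """Return True if at least one of the three preset conditions is exceeded."""
    return (
        stats.memory_bytes > limits.max_memory_bytes
        or stats.latency_ms > limits.max_latency_ms
        or stats.status == limits.overload_status
    )
```

Because the three conditions are combined with `or`, a link that is within its memory limit but too slow still triggers the creation of idle links.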
[0053] According to embodiments of this disclosure, when the running data exceeds preset conditions, multiple idle message queue links can be automatically created, and data transmission instructions can be executed using the idle message queue links.
[0054] According to the data transmission method, apparatus, device, storage medium, and program product provided in this disclosure, in response to a data transmission command, a database table is obtained through a preset program; metadata is extracted from a data warehouse based on the database table; traffic analysis is performed on the metadata to obtain traffic analysis results; the operating data of message queue links is monitored, and if the operating data exceeds preset conditions, at least one idle message queue link is automatically created based on the traffic analysis results; and the data transmission command is executed using the idle message queue link. Because the data transmission process incorporates the traffic analysis results obtained from the traffic analysis of the database table, and automatically creates multiple message queue links based on the traffic analysis results, it at least partially overcomes the transmission bottlenecks and the inability to achieve horizontal data expansion in related technologies, thereby achieving the technical effect of improving the efficiency and flexibility of data transmission.
[0055] According to an embodiment of this disclosure, operation S203 may further include the following operations: in response to an operation on the database table, recording the change in the byte length; obtaining the flow rate of the database table by analyzing the change in the byte length over a period of time; and determining the flow analysis result based on the flow rate of the database table.
[0056] According to embodiments of this disclosure, in a database table, the byte length of the database table changes with each data record processed. Based on the change in byte length over a certain period, the flow rate of the database table can be determined. Specifically, the flow rate can refer to how many bytes of data records are transmitted per minute; or how many bytes of data records are copied per minute; or how many rows of data records are copied per minute. The flow rate obtained through flow analysis can be used as the result of the flow analysis.
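The flow-rate calculation can be sketched as a sliding-window sum of byte-length changes. This is one possible reading, not the disclosed implementation: the class `TableFlowMeter`, the 60-second default window, and the decision to count deletions as traffic (via the absolute value) are all assumptions.

```python
from collections import deque
from typing import Deque, Tuple


class TableFlowMeter:
    """Tracks byte-length changes of one database table over a sliding window."""

    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        # Each sample is (timestamp in seconds, absolute byte-length change).
        self._samples: Deque[Tuple[float, int]] = deque()

    def record(self, timestamp: float, bytes_changed: int) -> None:
        """Record the byte-length change caused by one operation on the table.

        Assumption: shrinking the table (negative change) still counts as
        traffic, so the absolute value is stored.
        """
        self._samples.append((timestamp, abs(bytes_changed)))

    def bytes_per_minute(self, now: float) -> float:
        """Flow rate over the window ending at `now`, scaled to bytes per minute."""
        cutoff = now - self.window_seconds
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()  # drop samples older than the window
        total = sum(size for _, size in self._samples)
        return total * 60.0 / self.window_seconds
```

A rows-per-minute rate, which the text also mentions, could be tracked the same way by recording row counts instead of byte counts.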
[0057] The data transmission method provided in the embodiments of this disclosure can also monitor and control data traffic, thereby improving the flexibility of data transmission.
[0058] According to embodiments of this disclosure, the metadata may include the identifier of a database table, and the above method may further include the following operations: registering at least one idle message queue link to a message queue routing table; and returning at least one idle message queue link from the message queue routing table using the identifier of the database table.
[0059] According to embodiments of this disclosure, the message queue routing table may contain multiple message queue links and their names. Except for newly created idle links, the other message queue links in the routing table may be pre-configured so that when a data transmission command is detected, the corresponding message queue link can be found based on the attribute parameters in the data transmission command for transmission, thereby improving data transmission efficiency.
[0060] According to embodiments of this disclosure, when creating an idle message queue link, a link identifier that identifies the link can be configured for the idle link, and the link identifier and link address are registered together in the routing table. It is understood that the link identifier can be associated with the identifier of a database table. In subsequent operations, the link identifier associated with the database table identifier can be found in the message queue routing table, and the link address corresponding to that link identifier can be returned.
[0061] According to embodiments of this disclosure, when only one idle message queue link is returned, this single idle message queue link can be used as the target message queue link, and data transmission can be performed using this target message queue link. When m idle message queue links are returned (m > 1), a hash algorithm can be used to determine n target message queue links from the m idle message queue links, where 1 ≤ n ≤ m.
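The registration and lookup just described can be sketched with an in-memory mapping. The class `MQRoutingTable`, its method names, and the example addresses are hypothetical; a real routing table would likely be persistent and shared between components.

```python
from collections import defaultdict
from typing import Dict, List


class MQRoutingTable:
    """In-memory routing table: table identifier -> {link identifier: link address}."""

    def __init__(self) -> None:
        self._routes: Dict[str, Dict[str, str]] = defaultdict(dict)

    def register(self, table_id: str, link_id: str, address: str) -> None:
        """Register a link identifier and its address under a database table identifier."""
        self._routes[table_id][link_id] = address

    def lookup(self, table_id: str) -> List[str]:
        """Return the link addresses associated with the table, in registration order."""
        return list(self._routes.get(table_id, {}).values())


# Usage: register two newly created idle links for one table, then look them up.
routing = MQRoutingTable()
routing.register("ACCT_TXN", "link-1", "mq://host-a/q1")
routing.register("ACCT_TXN", "link-2", "mq://host-a/q2")
addresses = routing.lookup("ACCT_TXN")
```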
[0062] According to embodiments of this disclosure, the process of determining n target message queue links from m idle message queue links using a hash algorithm may include the following operations: extracting the primary key field from a database table; and using the hash value obtained by hashing the field value of the primary key field and the value of m as n.
[0063] According to embodiments of this disclosure, a primary key field can be one or more fields in a table. The value of the primary key field can be used to identify a specific data record in the table, such as a row of data or an entity. A primary key of a table can be composed of multiple keywords.
[0064] According to embodiments of this disclosure, if the message queue routing table returns m message queue links (it is understood that these m message queue links may include message queue links that still have running resources as well as newly created idle message queue links), then a hash value n is obtained by performing a hash operation on the field value of the primary key field in the data record and the value of m. The value of n lies in the range 1 ≤ n ≤ m. The nth address among the m message queue links returned by the message queue routing table is then the target message queue link to which the data record is sent.
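The source does not name a specific hash function. The sketch below assumes CRC32 followed by a modulo on m, which yields a value n with 1 ≤ n ≤ m; `target_index` and `pick_link` are hypothetical names. CRC32 is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would route the same record differently after a restart.

```python
import zlib
from typing import Sequence


def target_index(primary_key: str, m: int) -> int:
    """Map a primary key value to n in the range 1..m (assumed scheme: CRC32 mod m)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return zlib.crc32(primary_key.encode("utf-8")) % m + 1


def pick_link(primary_key: str, links: Sequence[str]) -> str:
    """Return the nth link address (1-based) for the record with this primary key."""
    n = target_index(primary_key, len(links))
    return links[n - 1]
```

For a composite primary key, the field values could be joined with a separator before hashing. Records with the same key always map to the same link as long as m is unchanged; when m changes, most keys are remapped, which is a known property of modulo-based hashing.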
[0065] According to embodiments of this disclosure, n target message queue links are determined from m idle message queue links using a hash algorithm. This allows for the distribution of large amounts of data in database tables. Each data record can be assigned a target message queue link, thus enabling a database table to be transmitted through multiple message queue links. This solves the problem of low data transmission efficiency caused by a database table being transmitted only through one message queue link in related technologies. Consequently, it achieves balanced distribution of large amounts of data in database tables, ensures the stability of each message queue link, and improves overall data synchronization.
[0066] According to embodiments of this disclosure, by dynamically monitoring the running data of the message queue link and the traffic information of the database table, when the message queue link experiences memory shortages or data latency exceeds a threshold, it can automatically scale horizontally from a single message queue link to multiple message queue links. This solves the problems of insufficient transmission capacity, low transmission efficiency, and low data synchronization when transmitting large amounts of database tables through a single message queue link. Furthermore, the process of transmitting large amounts of database tables is dynamic, real-time, and variable, requiring no manual intervention, effectively improving data transmission throughput efficiency.
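The source states that idle links are created based on the traffic analysis result but does not give a sizing rule. One plausible rule, shown here purely as an assumption, divides the measured flow rate by an assumed per-link capacity; `links_to_create` and `per_link_capacity` are hypothetical names.

```python
import math


def links_to_create(flow_bytes_per_min: float,
                    per_link_capacity: float,
                    existing_links: int) -> int:
    """Number of idle links to create once an overload has been detected.

    Assumed rule: enough links so that total capacity covers the flow rate,
    and never fewer than one, since the method creates at least one idle link.
    """
    if per_link_capacity <= 0:
        raise ValueError("per_link_capacity must be positive")
    needed = math.ceil(flow_bytes_per_min / per_link_capacity)
    return max(1, needed - existing_links)
```

In practice an upper bound on the number of links would probably also be needed so that a traffic spike cannot exhaust host resources.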
[0067] According to embodiments of this disclosure, during the execution of data transmission instructions, the operating parameters generated during the execution of data transmission instructions can be monitored to obtain monitoring results; if the monitoring results indicate an execution abnormality, an alarm message is generated and a retry operation is performed; if the number of retry operations exceeds a preset threshold, execution is stopped and relevant maintenance personnel are notified.
[0068] According to embodiments of this disclosure, the operating parameters can be parameters used to characterize the execution of data transmission instructions, such as transmission rate, transmission duration, and transmitted bytes. If the monitoring results indicate abnormal execution of the data transmission instructions, such as network failure, timeout, or packet loss, an alarm will be issued promptly and execution will be retried until normal operation is restored. If the number of retries exceeds a preset threshold, operation can be proactively stopped, and relevant maintenance personnel will be notified for troubleshooting and analysis. The preset threshold can be adaptively adjusted according to actual needs.
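The alarm, retry, and stop behavior can be sketched as a bounded retry loop. The function and exception names are hypothetical, and treating any raised exception as an "execution abnormality" is an assumption made to keep the example short.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class TransmissionAborted(Exception):
    """Raised when the retry count would exceed the preset threshold."""


def run_with_retries(send: Callable[[], T],
                     max_retries: int,
                     alarm: Callable[[str], None],
                     notify_maintainers: Callable[[str], None]) -> T:
    """Execute `send`, raising an alarm and retrying on each failure.

    After `max_retries` retries have failed, stop and notify maintenance staff.
    """
    retries = 0
    while True:
        try:
            return send()
        except Exception as exc:  # assumption: any exception is an abnormality
            alarm(f"transmission failed: {exc!r}")
            if retries >= max_retries:
                notify_maintainers(f"stopped after {retries} retries: {exc!r}")
                raise TransmissionAborted(str(exc)) from exc
            retries += 1
```

A production version would usually add a delay or backoff between retries; the source does not describe one, so it is omitted here.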
[0069] According to embodiments of this disclosure, by automatically monitoring the execution of data transmission instructions, issuing alarms and retrying execution when abnormalities occur, and stopping operation after a certain number of retries, the automation and intelligence of the data transmission process can be improved, the burden on maintenance personnel can be reduced, manpower consumption can be reduced, and the efficiency of data transmission can be improved.
[0070] Figure 3 schematically illustrates the architecture of a data transmission system based on related technologies.
[0071] As shown in Figure 3, the data transmission system 300 of the related technology may include a DB2 database 301, a Qrep Capture module 302, an MQ (Message Queue) link 303 of the host system, an MQ link 304 of the platform system, an application 305, and a target database 306. The message queue can be understood as a message middleware product for IBM mainframes and platforms, capable of storing different types of message data, decoupling upstream and downstream data, and smoothing out peak and valley loads. The application 305 can immediately read data from the MQ link 304 of the platform system and write it to the target database 306 on the platform side.
[0072] With continued reference to Figure 3, data transmission generally refers to data transfer between the host and the platform. The migration of DB2 data from an IBM host system to a SUSE platform system typically involves the host's IBM Qrep Capture module 302 retrieving DB2 data from the DB2 database 301 and sending it to the host system's MQ link 303. A data channel is then established between the host system's MQ link 303 and the platform system's MQ link 304. Finally, the application 305 retrieves the data from the platform system's MQ link 304 and writes it to the target database 306.
[0073] However, this technical solution has limitations. Qrep Capture can connect to only one host MQ, and all DB2 table data obtained by Qrep Capture can only be forwarded directly to that host MQ. When a batch program updates a large table in the DB2 database and the number of updates reaches the hundreds of millions, Qrep Capture sends data faster than the host MQ can forward it. This often causes the host MQ to run out of memory and delays data reception on the platform, so the business requirement of downloading batch data from the host to the platform in real time is not met, which affects the real-time experience of the business.
[0074] Furthermore, the related technologies have the following problems in the data transmission process: Qrep Capture has higher throughput than the host MQ, so the host MQ becomes a throughput bottleneck when large amounts of data are updated; Qrep Capture can send to only one host MQ and cannot scale transmission horizontally; Qrep Capture can only forward data directly and cannot control data flow; and although the host MQ is the optimal solution for synchronizing highly time-sensitive data to the platform, its memory capacity and sending speed still cannot meet the requirements of batch transmission involving hundreds of millions of updates. This affects the real-time experience of the business, degrades the user experience, and reduces the efficiency and flexibility of data transmission.
[0075] Figure 4 schematically illustrates an architectural diagram of a data transmission system according to an embodiment of the present disclosure.
[0076] As shown in Figure 4, the data transmission system 400 of this embodiment may include a DB2 database 401, an exit program 402, a record capture module 403, an MQ distribution module 404, a traffic analysis module 405, an intelligent monitoring module 406, an MQ monitoring module 407, an MQ link 408 of the host system, an MQ link 409 of the platform system, an application 410, and a target database 411. It is understood that the ellipses in Figure 4 can be used to represent multiple MQ links 408 of the host system or MQ links 409 of the platform system.
[0077] The record capture module 403 is mainly used to acquire data records of the DB2 table sent by the Qrep Capture exit program 402, extract table metadata, and perform data forwarding functions. For each data record of the DB2 table, this module extracts metadata such as the corresponding table name and the byte length of the data record, and sends it asynchronously to the traffic analysis module 405; then the DB2 table data record is forwarded to the MQ distribution module 404 for processing.
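A minimal sketch of this capture step is given below, assuming an in-process `queue.Queue` as the asynchronous hand-off to traffic analysis. The `Record` type, its field names, and the `capture` function are hypothetical names introduced for illustration; the disclosure does not specify the record layout or the transport between modules.

```python
import queue
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Record:
    table: str      # name of the DB2 table the record belongs to
    key: str        # primary key value of the record
    payload: bytes  # raw data record


def capture(record: Record, stats_queue: queue.Queue,
            forward: Callable[[Record], None]) -> None:
    """Extract table metadata, hand it to traffic analysis, forward the record."""
    meta = {"table": record.table, "length": len(record.payload)}
    # Non-blocking put: a separate consumer is assumed to drain this queue,
    # so capture never waits on traffic analysis.
    stats_queue.put_nowait(meta)
    # Pass the unchanged record on to MQ distribution.
    forward(record)
```

In this sketch the metadata and the record travel on separate paths, so a slow statistics consumer would not hold up record forwarding.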
[0078] The MQ distribution module 404 is primarily used to send data records from DB2 tables to specified host MQ addresses, implementing the MQ multi-link distribution function. For each DB2 table data record, the MQ routing table is first queried based on the table name corresponding to the data record. If the query returns a single MQ address, the data record is sent directly to that MQ address. If the query returns m MQ addresses, a hash operation is performed on the value of the record's primary key field and the value of m to obtain a hash value n, where 1 ≤ n ≤ m. The n-th address among the m MQ addresses returned by the MQ routing table is the target MQ address to which the data record needs to be sent, and the MQ distribution module 404 distributes the record to that target MQ address. If a sending anomaly occurs during multi-link distribution, an alarm is triggered promptly and the sending is retried until normal operation is restored. If the number of retries exceeds a threshold, the operation stops automatically, and maintenance personnel are required to investigate and analyze the issue.
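The routing decision can be illustrated with the sketch below. The disclosure does not name a specific hash function, so SHA-256 followed by a modulo is an assumption, as are the names `pick_index` and `route` and the dictionary used as the routing table.

```python
import hashlib
from typing import Dict, List


def pick_index(primary_key: str, m: int) -> int:
    """Map a primary key value to n with 1 <= n <= m, stable across runs."""
    digest = hashlib.sha256(primary_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % m + 1


def route(routing_table: Dict[str, List[str]], table: str,
          primary_key: str) -> str:
    """Return the target MQ address for one record of the given table."""
    addresses = routing_table[table]
    if len(addresses) == 1:
        # Single address registered: send directly.
        return addresses[0]
    n = pick_index(primary_key, len(addresses))
    # n is 1-based, list indices are 0-based.
    return addresses[n - 1]
```

A cryptographic digest is used here rather than Python's built-in `hash()` because the latter is randomized per process for strings, which would route the same key to different links after a restart.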
[0079] The traffic analysis module 405 is mainly used to perform real-time traffic calculation on the metadata of the data records of each table, computing and storing the flow rate and the total number of bytes of the data records sent for each table, which provides the data analysis basis for the intelligent monitoring module 406.
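One possible shape for these per-table statistics is sketched below. The class name `TrafficStats`, the explicit `now` timestamps, and the definition of flow rate as total bytes divided by the time between the first and last observation are assumptions for illustration; the disclosure does not fix how the rate is computed.

```python
from collections import defaultdict
from typing import Dict


class TrafficStats:
    """Accumulates total bytes and an average flow rate for each table."""

    def __init__(self) -> None:
        self.total_bytes: Dict[str, int] = defaultdict(int)
        self.first_seen: Dict[str, float] = {}
        self.last_seen: Dict[str, float] = {}

    def observe(self, table: str, length: int, now: float) -> None:
        """Record one data record of `length` bytes seen at time `now` (seconds)."""
        self.total_bytes[table] += length
        self.first_seen.setdefault(table, now)
        self.last_seen[table] = now

    def flow_rate(self, table: str) -> float:
        """Average bytes per second between the first and last observation."""
        elapsed = self.last_seen[table] - self.first_seen[table]
        return self.total_bytes[table] / elapsed if elapsed > 0 else 0.0
```

Timestamps are passed in explicitly so that the calculation is deterministic; a real deployment would more likely read a monotonic clock.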
[0080] The intelligent monitoring module 406 is primarily used to read the real-time traffic statistics from the traffic analysis module 405 and to dynamically track the flow rate and total byte length for each table. The intelligent monitoring module 406 also calls the MQ monitoring module 407 to obtain information such as the running status, memory capacity, and data latency of each MQ. If the memory usage or data latency of an MQ exceeds the warning threshold, the module uses the traffic statistics of the database tables to select a database table and another idle MQ node for that table, and then distributes the table's data records to this new idle MQ node. By dynamically adding new MQ links and distributing database table data records across them, multi-link distribution can be achieved for database tables with large data volumes. The intelligent monitoring module 406 also registers the new idle MQ address corresponding to the database table in the MQ routing table.
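The decision logic in this paragraph might look like the following sketch. Everything named here is hypothetical: the `MqStatus` fields, the `rebalance` function, the pre-provisioned `idle_pool`, the threshold values, and the policy of giving the new link to the busiest table on the overloaded MQ. The disclosure only states that the selection combines MQ status with table traffic statistics.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MqStatus:
    memory_used: float  # fraction of MQ memory in use, 0.0 to 1.0
    latency_s: float    # data latency in seconds


def rebalance(routing_table: Dict[str, List[str]],
              mq_status: Dict[str, MqStatus],
              table_rates: Dict[str, float],
              idle_pool: List[str],
              memory_limit: float = 0.8,
              latency_limit: float = 5.0) -> List[Tuple[str, str]]:
    """Register one idle MQ for the busiest table on each overloaded MQ."""
    added = []
    for address, status in mq_status.items():
        overloaded = (status.memory_used > memory_limit
                      or status.latency_s > latency_limit)
        if not overloaded or not idle_pool:
            continue
        # Tables currently routed through the overloaded MQ.
        candidates = [t for t, addrs in routing_table.items()
                      if address in addrs]
        if not candidates:
            continue
        busiest = max(candidates, key=lambda t: table_rates.get(t, 0.0))
        new_address = idle_pool.pop(0)
        # Registering the address makes it visible to MQ distribution.
        routing_table[busiest].append(new_address)
        added.append((busiest, new_address))
    return added
```

Because the routing table is the only shared state, MQ distribution starts using the new link as soon as the address is appended, without any change to the distribution code.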
[0081] The MQ monitoring module 407 is mainly used to monitor and obtain information such as the running status of each MQ, MQ memory capacity, and MQ data latency. It is also used for operations such as initializing and closing MQ connections to save MQ node operating resources.
[0082] The data transmission system provided in this disclosure not only maintains the original MQ transmission method, but also adds intelligent traffic awareness, automatic creation of multiple links, and intelligent data routing balancing algorithms, which solves the shortcomings of the original solution and ensures that when the amount of data to be transmitted is large, the data can be downloaded to the platform in real time.
[0083] According to embodiments of this disclosure, a method for transmitting large amounts of data based on a host MQ is provided. By setting up a multi-link data transmission system 400, when encountering problems such as insufficient MQ memory space and increased transmission latency when updating a database table with a large amount of data, multiple parallel MQ transmission links are automatically created through intelligent analysis. Furthermore, the traffic of the database table is evenly distributed to different MQ links through a hash routing algorithm, thereby solving the original problems of insufficient storage and high transmission latency of a single MQ.
[0084] According to embodiments of this disclosure, Qrep Capture provides an exit program for sending data records from a database table to other external systems. By implementing an MQ multi-link intelligent distribution system, the system captures data records provided by the Qrep Capture exit program and calculates the traffic information for each table in real time. When encountering situations where a large amount of data is updated in the database table due to the host running batch programs, and the Qrep Capture sending speed is greater than the host MQ sending speed, the data transmission system can detect the MQ memory and data latency in real time. When the MQ memory or data latency exceeds the threshold, it can dynamically create new MQ nodes and distribute the traffic from the database table evenly across multiple different MQ links, thereby avoiding problems such as memory space exhaustion and platform data reception delays that occur on a single MQ link.
[0085] Figure 5 schematically illustrates a flowchart of the data transmission method executed by applying the data transmission system shown in Figure 4.
[0086] As shown in Figure 5, the data transmission method of this embodiment includes operations S501 to S505.
[0087] In operation S501, the records provided by the Qrep Capture exit program are captured, the table metadata is extracted and sent to the traffic analysis module, and the data records are sent to the MQ distribution module.
[0088] In operation S502, the traffic analysis module performs real-time traffic calculation on the table metadata, computing and storing the flow rate and the total byte length of the data records sent for each table.
[0089] In operation S503, the MQ monitoring module monitors information such as the MQ running status, memory capacity, and data latency in real time.
[0090] In operation S504, the intelligent monitoring module simultaneously monitors the real-time traffic of the tables and the operation of the MQs. When the memory usage or data latency of an MQ exceeds the threshold, the intelligent monitoring module selects a database table based on the traffic situation, requests a new idle MQ node from the MQ monitoring module, and registers the new idle MQ node in the MQ routing table for that database table.
[0091] In operation S505, the MQ distribution module queries the MQ routing table to obtain the MQ address to which the record will be sent. If the query returns a single target MQ address, the record is sent to that MQ address; if the query returns multiple MQ addresses, the target MQ address corresponding to the record is calculated using a hash algorithm, and the record is then sent to that MQ address.
[0092] According to the embodiments of this disclosure, the related contents of operations S501 to S505 and operations S201 to S205 can be cross-referenced and will not be repeated here.
[0093] According to the data transmission method provided in this disclosure, by dynamically monitoring the traffic information of MQ nodes and tables, when MQ memory becomes scarce or data latency exceeds a threshold, it can automatically scale horizontally from a single MQ node to multiple MQ nodes, solving the problem of insufficient capacity for transmitting large database tables through a single MQ. The multi-link transmission of large database tables is dynamic, real-time, and variable, requiring no manual intervention, effectively improving data transmission throughput. Based on the dynamically adjusted number of MQ nodes, a hash algorithm can evenly distribute data from large database tables, ensuring stable performance of each MQ node and improving the overall data synchronization timeliness.
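The claim that a hash algorithm spreads a large table evenly across the available MQ nodes can be checked empirically with a small simulation. The sketch assumes SHA-256 modulo m as the hash (the disclosure does not specify one) and synthetic keys of the form `row-i`; `link_for` and `distribution` are hypothetical helper names.

```python
import hashlib
from collections import Counter


def link_for(key: str, m: int) -> int:
    """Return the 1-based link number for a primary key value."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % m + 1


def distribution(num_records: int, m: int) -> Counter:
    """Count how many synthetic records land on each of m links."""
    return Counter(link_for(f"row-{i}", m) for i in range(num_records))


counts = distribution(20000, 4)
for link in sorted(counts):
    print(f"link {link}: {counts[link]} records")
```

Each of the four links should receive close to 5,000 of the 20,000 records. Note that with plain modulo hashing, changing m reassigns most keys to different links, so this sketch says nothing about per-key ordering across a scale-out event.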
[0094] It should be noted that, unless an execution order between different operations is explicitly stated or is required by the technical implementation, the operations need not be executed in any particular order, and multiple operations may be executed simultaneously.
[0095] Based on the above data transmission method, this disclosure also provides a data transmission apparatus. The apparatus is described in detail below with reference to Figure 6.
[0096] Figure 6 schematically shows a block diagram of a data transmission apparatus according to an embodiment of the present disclosure.
[0097] As shown in Figure 6, the data transmission device 600 of this embodiment includes an acquisition module 610, an extraction module 620, an analysis module 630, a first monitoring module 640, and an execution module 650.
[0098] The acquisition module 610 is used to obtain, through a preset program and in response to a data transmission instruction, the database table carried in the data transmission instruction.
[0099] The extraction module 620 is used to extract metadata from the data warehouse based on the database table.
[0100] The analysis module 630 is used to perform traffic analysis on the metadata and obtain traffic analysis results.
[0101] The first monitoring module 640 is used to monitor the operation data of the message queue link, and automatically create at least one idle message queue link based on the traffic analysis results when the operation data exceeds the preset conditions. The message queue link is used for data transmission.
[0102] The execution module 650 is used to execute data transmission instructions based on at least one idle message queue link.
[0103] According to the data transmission method, apparatus, device, storage medium, and program product provided in this disclosure, in response to a data transmission command, a database table is obtained through a preset program; metadata is extracted from a data warehouse based on the database table; traffic analysis is performed on the metadata to obtain traffic analysis results; the operating data of message queue links is monitored, and if the operating data exceeds preset conditions, at least one idle message queue link is automatically created based on the traffic analysis results; and the data transmission command is executed using the idle message queue link. Because the data transmission process incorporates the traffic analysis results obtained from the traffic analysis of the database table, and automatically creates multiple message queue links based on the traffic analysis results, it at least partially overcomes the transmission bottlenecks and the inability to achieve horizontal data expansion in related technologies, thereby achieving the technical effect of improving the efficiency and flexibility of data transmission.
[0104] According to embodiments of this disclosure, the data transmission device may further include a registration module and a return module.
[0105] The registration module is used to register at least one idle message queue link into the message queue routing table.
[0106] The return module is used to return at least one idle message queue link from the message queue routing table based on the identifier of the database table.
[0107] According to embodiments of this disclosure, the data transmission apparatus may further include a determining module.
[0108] The determining module is used, in the case where m idle message queue links are returned, to determine the n-th message queue link among the m idle message queue links as the target message queue link using a hash algorithm, where m > 1 and 1 ≤ n ≤ m.
[0109] According to embodiments of this disclosure, the determining module may further include an extraction unit and a calculation unit.
[0110] The extraction unit is used to extract the primary key field from the database table.
[0111] The calculation unit is used to perform a hash operation on the field value of the primary key field and the value of m, and to use the resulting hash value as n.
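One way to read this hash operation, assuming a generic hash function $H$ that maps the primary key field value $k$ to a non-negative integer (the disclosure does not specify $H$), is:

```latex
n = \bigl(H(k) \bmod m\bigr) + 1, \qquad m > 1, \quad 1 \le n \le m
```

Since $H(k) \bmod m$ takes values in $\{0, \dots, m-1\}$, adding 1 yields an index in the stated range, and records with the same primary key value always map to the same link for a fixed m.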
[0112] According to embodiments of this disclosure, the data transmission device may further include a second monitoring module, a generation module, and a notification module.
[0113] The second monitoring module is used to monitor the operating parameters generated during the execution of data transmission instructions and obtain monitoring results.
[0114] The generation module is used to generate alarm information and perform retry operations when the monitoring results indicate that an execution anomaly has occurred.
[0115] The notification module is used to stop execution and notify relevant maintenance personnel when the number of retry operations exceeds a preset threshold.
[0116] According to embodiments of this disclosure, the analysis module may further include a recording unit, an analysis unit, and a determination unit.
[0117] The recording unit is used to record changes in byte length in response to operations on the database table.
[0118] The analysis unit is used to obtain the flow rate of the database table by analyzing the changes in byte length within the analysis period.
[0119] The determination unit is used to determine the traffic analysis results based on the flow rate of the database table.
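As an illustration of the analysis unit, suppose the recording unit logs a byte-length change $\Delta b_i$ at time $t_i$ for each operation on the table, and the analysis period is the interval $[t_0, t_0 + T)$. Under these assumptions (the symbols and the use of absolute values are introduced here and are not taken from the disclosure), the flow rate $r$ of the table could be written as:

```latex
r = \frac{1}{T} \sum_{i \,:\, t_0 \le t_i < t_0 + T} \lvert \Delta b_i \rvert
```

For example, if three operations within a 2-second period change the byte length by 100, 300, and 200 bytes, then $r = (100 + 300 + 200) / 2 = 300$ bytes per second.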
[0120] According to embodiments of this disclosure, any multiple modules among the acquisition module 610, extraction module 620, analysis module 630, first monitoring module 640, and execution module 650 can be combined into one module, or any one of these modules can be split into multiple modules. Alternatively, at least some of the functions of one or more of these modules can be combined with at least some of the functions of other modules and implemented in one module. According to embodiments of this disclosure, at least one of the acquisition module 610, extraction module 620, analysis module 630, first monitoring module 640, and execution module 650 can be at least partially implemented as hardware circuitry, such as a field-programmable gate array (FPGA), a programmable logic array (PLA), a system-on-a-chip, a system-on-a-substrate, a system-on-package, an application-specific integrated circuit (ASIC), or implemented in hardware or firmware by any other reasonable means of integrating or packaging the circuitry, or implemented in software, hardware, or firmware, or in any suitable combination of any of these three implementation methods. Alternatively, at least one of the acquisition module 610, extraction module 620, analysis module 630, first monitoring module 640, and execution module 650 may be implemented at least partially as a computer program module, which can perform corresponding functions when the computer program module is run.
[0121] It should be noted that the data transmission device part in the embodiments of this disclosure corresponds to the data transmission method part in the embodiments of this disclosure. For a specific description of the data transmission device part, refer to the data transmission method part; it will not be repeated here.
[0122] Figure 7 schematically illustrates a block diagram of an electronic device suitable for implementing a data transmission method according to an embodiment of the present disclosure.
[0123] As shown in Figure 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage section 708 into a random access memory (RAM) 703. The processor 701 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)). The processor 701 may also include onboard memory for caching purposes. The processor 701 may include a single processing unit or multiple processing units for performing different actions of the method flow according to an embodiment of the present disclosure.
[0124] RAM 703 stores various programs and data required for the operation of electronic device 700. Processor 701, ROM 702, and RAM 703 are interconnected via bus 704. Processor 701 performs various operations of the method flow according to embodiments of the present disclosure by executing programs in ROM 702 and/or RAM 703. It should be noted that the programs may also be stored in one or more memories other than ROM 702 and RAM 703. Processor 701 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in said one or more memories.
[0125] According to embodiments of this disclosure, the electronic device 700 may further include an input/output (I/O) interface 705, which is also connected to the bus 704. The electronic device 700 may also include one or more of the following components connected to the I/O interface 705: an input section 706 including a keyboard, mouse, etc.; an output section 707 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 708 including a hard disk, etc.; and a communication section 709 including a network interface card such as a LAN card, modem, etc. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on the drive 710 as needed so that computer programs read from it can be installed into the storage section 708 as needed.
[0126] This disclosure also provides a computer-readable storage medium, which may be included in the device/apparatus/system described in the above embodiments, or may exist independently without being assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs that, when executed, implement the method according to the embodiments of this disclosure.
[0127] According to embodiments of this disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, such as, but not limited to: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In this disclosure, the computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. For example, according to embodiments of this disclosure, the computer-readable storage medium may include ROM 702 and/or RAM 703 and/or one or more memories other than ROM 702 and RAM 703 described above.
[0128] Embodiments of this disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowchart. When the computer program product is run on a computer system, the program code enables the computer system to implement the data transmission method provided in the embodiments of this disclosure.
[0129] When the computer program is executed by the processor 701, it performs the functions defined in the system/apparatus of the embodiments of this disclosure. According to embodiments of this disclosure, the systems, apparatuses, modules, units, etc., described above can be implemented by computer program modules.
[0130] In one embodiment, the computer program may rely on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of signals over a network medium, and may be downloaded and installed via the communication section 709, and/or installed from a removable medium 711. The program code contained in the computer program can be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination thereof.
[0131] In such an embodiment, the computer program can be downloaded and installed from a network via the communication section 709, and/or installed from the removable medium 711. When the computer program is executed by the processor 701, it performs the functions defined in the system of the embodiments of this disclosure. According to embodiments of this disclosure, the systems, devices, apparatuses, modules, units, etc., described above can be implemented by computer program modules.
[0132] According to embodiments of this disclosure, program code for executing the computer programs provided in embodiments of this disclosure can be written in any combination of one or more programming languages. Specifically, these computer programs can be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code can execute entirely on the user's computing device, partially on the user's device, partially on a remote computing device, or entirely on a remote computing device or server. In cases involving remote computing devices, the remote computing device can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (e.g., via the Internet using an Internet service provider).
[0133] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.
[0134] Those skilled in the art will understand that the features described in the various embodiments and/or claims of this disclosure can be combined and/or integrated in various ways, even if such combinations or integrations are not explicitly described in this disclosure. In particular, the features described in the various embodiments and/or claims of this disclosure can be combined and/or integrated in various ways without departing from the spirit and teachings of this disclosure. All such combinations and/or integrations fall within the scope of this disclosure.
[0135] The embodiments of this disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of this disclosure. Although various embodiments have been described above, this does not mean that the measures in the various embodiments cannot be used advantageously in combination. The scope of this disclosure is defined by the appended claims and their equivalents. Various substitutions and modifications can be made by those skilled in the art without departing from the scope of this disclosure, and all such substitutions and modifications should fall within the scope of this disclosure.
Claims
1. A data transmission method, comprising:
in response to a data transmission instruction, obtaining, through a preset program, the database table carried in the data transmission instruction;
extracting metadata from the data warehouse based on the database table;
performing traffic analysis on the metadata to obtain traffic analysis results;
monitoring the operation data of the message queue link, and if the operation data exceeds a preset condition, automatically creating at least one idle message queue link based on the traffic analysis results, wherein the message queue link is used for data transmission; and
executing the data transmission instruction based on the at least one idle message queue link;
wherein the metadata includes the identifier of the database table;
the method further comprising:
registering the at least one idle message queue link in a message queue routing table;
returning at least one idle message queue link from the message queue routing table based on the identifier of the database table; and
in the case where m idle message queue links are returned, using a hash algorithm to determine the n-th message queue link among the m idle message queue links as the target message queue link, where m > 1 and 1 ≤ n ≤ m;
wherein using the hash algorithm to determine the n-th message queue link among the m idle message queue links as the target message queue link includes:
extracting the primary key field from the database table; and
performing a hash operation on the field value of the primary key field and the value of m, and using the resulting hash value as n.
2. The method according to claim 1, further comprising:
monitoring the operating parameters generated during the execution of the data transmission instruction to obtain monitoring results;
if the monitoring results indicate an execution anomaly, generating alarm information and performing a retry operation; and
if the number of retry operations exceeds a preset threshold, stopping execution and notifying relevant maintenance personnel.
3. The method of claim 1, wherein the metadata includes the byte length occupied by the database table;
and performing traffic analysis on the metadata to obtain the traffic analysis results includes:
in response to operations on the database table, recording changes in the byte length;
obtaining the flow rate of the database table by analyzing the changes in the byte length within an analysis period; and
determining the traffic analysis results based on the flow rate of the database table.
4. The method of claim 1, wherein the operation data includes the running memory of the message queue link, the transmission delay of the message queue link, and the operational status parameters of the message queue link.
5. A data transmission device, comprising:
an acquisition module, used to obtain, through a preset program and in response to a data transmission instruction, the database table carried in the data transmission instruction;
an extraction module, used to extract metadata from the data warehouse based on the database table, wherein the metadata includes the identifier of the database table;
an analysis module, used to perform traffic analysis on the metadata and obtain traffic analysis results;
a first monitoring module, used to monitor the operation data of the message queue link and, when the operation data exceeds a preset condition, automatically create at least one idle message queue link based on the traffic analysis results, wherein the message queue link is used for data transmission;
an execution module, used to execute the data transmission instruction based on the at least one idle message queue link;
a registration module, used to register the at least one idle message queue link in the message queue routing table;
a return module, used to return at least one idle message queue link from the message queue routing table based on the identifier of the database table; and
a determining module, used, in the case where m idle message queue links are returned, to determine the n-th message queue link among the m idle message queue links as the target message queue link using a hash algorithm, where m > 1 and 1 ≤ n ≤ m;
wherein the determining module includes an extraction unit and a calculation unit;
the extraction unit is used to extract the primary key field from the database table; and
the calculation unit is used to perform a hash operation on the field value of the primary key field and the value of m, and to use the resulting hash value as n.
6. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors perform the method according to any one of claims 1 to 4.
7. A computer-readable storage medium having executable instructions stored thereon, which, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 4.
8. A computer program product comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1 to 4.