A message processing method and related apparatus
By automatically generating an ISO message node path configuration table and parsing message content with W3C DOM (w3c.dom) technology, the method addresses the high computational load and large computing-space requirements of ISO message processing and makes message processing more efficient.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- AGRICULTURAL BANK OF CHINA
- Filing Date
- 2022-10-24
- Publication Date
- 2026-06-12
AI Technical Summary
Existing technologies for processing ISO messages involve large computational loads and high computational space requirements, and the method of recursively traversing adjacency lists to build data models is inefficient.
Insert statements for an ISO message node path configuration table are generated automatically from the ISO message format file. The generation runs independently of any database, which reduces computational complexity. Message content is then parsed using W3C DOM (w3c.dom) technology.
The method reduces computing space and computational load, shortens computation time, lowers subsequent maintenance costs, and improves processing efficiency.
Figure CN115643177B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and in particular to a message processing method and related apparatus.
Background Technology
[0002] The Society for Worldwide Interbank Financial Telecommunication (SWIFT) and its community, committed to creating real-time, frictionless payments, plan to migrate the Message Type (MT) payment and cash management message standards to the ISO 20022 standard.
[0003] Current technologies all handle MT messages. The format of International Organization for Standardization (ISO) messages is completely different from that of MT messages. MT messages are in FIN format and their data model is suitable for two-level paths, while ISO messages are based on XML format and their content typically involves three or more levels of paths.
[0004] If the method for processing MT messages, that is, building a data model by recursively traversing an adjacency list, is applied to ISO messages, the amount of computation increases significantly when the path hierarchy is deep, and the required computing space is also large.
Summary of the Invention
[0005] To address the aforementioned issues, this application provides a message processing method and related apparatus for processing ISO messages, which can reduce computational space and computational load, and shorten the computation time.
[0006] Based on this, the embodiments of this application disclose the following technical solutions:
[0007] On the one hand, embodiments of this application provide a message processing method, the method comprising:
[0008] Obtain the ISO message format file and Chinese name dictionary from the International Organization for Standardization. The ISO message format file includes the standard format of ISO messages, and the Chinese name dictionary includes the Chinese definitions of ISO messages.
[0009] Read the i-th line of data from the ISO message format file, where the initial value of i is 1;
[0010] Read the node removal attribute and the name attribute identifier of the i-th row of data;
[0011] If the node removal attribute indicates that the data in the i-th row is non-obsolete data, and the name attribute identifier indicates that the data in the i-th row is node rule information, then the path of the data in the i-th row is completed according to the type of the data in the i-th row to obtain the path information of the data in the i-th row. The type of the data in the i-th row includes a message header and a message body.
[0012] Merge the information of the i-th row of data to obtain format information and a duplicate information identifier;
[0013] If the duplicate information identifier indicates that the data in the i-th row can be repeated, obtain the maximum and minimum number of times it can be repeated;
[0014] The required node identifier of the i-th row of data is obtained based on the path information. The required node identifier is used to identify whether the node represented by the i-th row of data must appear in the ISO message.
[0015] Extract the regular expression, the format type of the i-th row of data, and the length information corresponding to the format type based on the format information;
[0016] The node number of the node represented by the i-th row of data is determined according to the type of the i-th row of data and the order of the nodes represented by the i-th row of data;
[0017] Complete the Chinese definition of the node represented by the i-th row of data according to the Chinese name dictionary to obtain the Chinese name and Chinese annotation of the node;
[0018] Increment the value of i by 1, execute the step of reading the i-th line of data in the ISO message format file and subsequent steps, until all data in the ISO message format file has been read;
[0019] Based on the path information, the duplicate information identifier, the required node identifier, the maximum number of times, the minimum number of times, the regular expression, the format type, the length information, the node Chinese name, and the node Chinese annotation, generate the insertion statement corresponding to the ISO message node path configuration table.
[0020] Optionally, after merging the information of the i-th row of data to obtain format information and a duplicate information identifier, and before obtaining the maximum and minimum allowed number of repetitions if the duplicate information identifier indicates that the i-th row of data can be repeated, the method further includes:
[0021] If the i-th row of data includes optional information, then an insert statement for the ISO message node option configuration table is generated based on the optional information;
[0022] Increment the value of i by 1, and execute the step of reading the i-th line of data in the ISO message format file and subsequent steps until all data in the ISO message format file has been read.
[0023] Optionally, determining the node number of the node represented by the i-th row of data based on the type of the i-th row of data and the order of the nodes represented by the i-th row of data includes:
[0024] Determine whether the node represented by the i-th row of data is the first node of the message header;
[0025] If the node represented by the i-th row of data is the first node in the message header, then the node number is determined to be AA000;
[0026] If the node represented by the i-th row of data is not the first node of the message header, then determine whether the node represented by the i-th row of data is the first node of the message body.
[0027] If the node represented by the i-th row of data is the first node of the message body, then the node number is determined to be DA000;
[0028] If the node represented by the i-th row of data is not the first node of the message body, the node number is determined by incrementing the preset step size.
[0029] Optionally, the method further includes:
[0030] Obtain ISO messages;
[0031] Read the message content, path information, and message type of the ISO message;
[0032] The message structure is obtained from the ISO message node path configuration table according to the message type;
[0033] The i-th leaf node in the message content is obtained according to the message structure, where i is 1;
[0034] Obtain the path information and node number of the i-th leaf node;
[0035] The node content of the i-th leaf node is parsed based on the path information of the i-th leaf node;
[0036] The node value of the i-th leaf node is determined based on the node number and the number of repetitions of the i-th leaf node.
[0037] Increment the value of i by 1, execute the step of obtaining the i-th leaf node in the message content according to the message structure and subsequent steps, until all the content of the message content is read.
[0038] The node content and the node value are stored in the ISO message node information table.
[0039] Optionally, before obtaining the message structure from the ISO message node path configuration table according to the message type, the method further includes:
[0040] Obtain the message header and message body content based on the message content;
[0041] The message header content and the message body content are verified according to the ISO format verification file.
[0042] If the verification passes, then the process of retrieving the message structure from the ISO message node path configuration table based on the message type is executed.
[0043] Optionally, the method further includes:
[0044] Obtain the ISO message node and transaction domain mapping table, which includes the node name and node path of the node to be converted;
[0045] The record of the node to be converted is obtained from the ISO message node information table according to the node path;
[0046] Obtain user conversion needs;
[0047] The records of the nodes to be converted are converted according to the conversion requirements to obtain the converted records of the nodes that meet the conversion requirements.
[0048] The user is shown the records of the converted node.
[0049] On the other hand, this application provides a message processing apparatus, the apparatus comprising:
[0050] The acquisition unit is used to acquire the ISO message format file and the Chinese name dictionary of the International Organization for Standardization. The ISO message format file includes the standard format of ISO messages, and the Chinese name dictionary includes the Chinese interpretation of ISO messages.
[0051] The reading unit is used to read the i-th line of data in the ISO message format file, where the initial value of i is 1;
[0052] The reading unit is also used to read the node removal attribute and name attribute identifier of the i-th row of data;
[0053] The completion unit is used to complete the path of the i-th row of data according to the type of the i-th row of data if the node removal attribute indicates that the i-th row of data is non-obsolete data and the node name attribute indicates that the i-th row of data is node rule information, thereby obtaining the path information of the i-th row of data. The type of the i-th row of data includes a message header and a message body.
[0054] The merging unit is used to merge the information of the i-th row of data to obtain format information and duplicate information identifier;
[0055] The acquisition unit is further configured to acquire the maximum and minimum number of times the data in the i-th row can be repeated if the duplicate information identifier indicates that the data in the i-th row can be repeated.
[0056] The acquisition unit is further configured to acquire the required node identifier of the i-th row of data according to the path information, wherein the required node identifier is used to identify whether the node represented by the i-th row of data must appear in the ISO message;
[0057] The extraction unit is used to extract the regular expression, the format type of the i-th row of data, and the length information corresponding to the format type based on the format information.
[0058] The determining unit is configured to determine the node number of the node represented by the i-th row of data based on the type of the i-th row of data and the order of the nodes represented by the i-th row of data.
[0059] The completion unit is also used to complete the Chinese definition of the node represented by the i-th row of data according to the Chinese name dictionary, so as to obtain the Chinese name and Chinese annotation of the node;
[0060] The loop unit is used to increment the value of i by 1, execute the step of reading the i-th line of data in the ISO message format file and subsequent steps, until all data in the ISO message format file has been read;
[0061] The generation unit is used to generate the insertion statement corresponding to the ISO message node path configuration table based on the path information, the duplicate information identifier, the required node identifier, the maximum number of times, the minimum number of times, the regular expression, the format type, the length information, the node Chinese name, and the node Chinese annotation.
[0062] On the other hand, this application provides a computer device, the device including a processor and a memory:
[0063] The memory is used to store program code and transmit the program code to the processor;
[0064] The processor is configured to execute the methods described above according to instructions in the program code.
[0065] On the other hand, this application provides a computer-readable storage medium for storing a computer program for performing the methods described above.
[0066] On the other hand, embodiments of this application provide a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the methods described above.
[0067] The advantages of the above-mentioned technical solution in this application are:
[0068] As can be seen from the above technical solution, the data model for XML-format ISO messages is built by automatically generating, from the ISO message format file, the insert statements for the ISO message node path configuration table. This process runs independently without relying on a database and has extremely low time complexity. It also requires no environment configuration: the code that builds the data model is packaged and released, and only input parameters need to be supplied. This removes the technical background requirements for subsequent maintenance personnel and reduces the cost of maintaining the ISO message data model. Moreover, with this data model there is no need to build a model recursively from adjacency lists or to parse message content by string truncation, which reduces computing space and computational load and shortens the computation time.
Attached Figure Description
[0069] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0070] Figure 1 is a flowchart of a message processing method provided in an embodiment of this application;
[0071] Figure 2 is a structural diagram of XML-structured text provided in an embodiment of this application;
[0072] Figure 3 is a schematic diagram of a message processing apparatus provided in an embodiment of this application;
[0073] Figure 4 is a structural diagram of a computer device provided in an embodiment of this application.
Detailed Implementation
[0074] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present application, and not all embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present application.
[0075] In related technologies, message processing methods for MT messages utilize multiple static tables to form a structure tree. Based on this structure tree, a data model is constructed by recursively traversing the adjacency list to extract the message content. The specific steps are as follows:
[0076] Step 1: Extract the format of the MT (FIN) message from the HTML webpage on the SWIFT website.
[0077] Step 2: Store the fields (first-level paths) of each type of MT (FIN) message into the message body table, then store the sub-lines (second-level paths) under each field into the message name table, and finally store the format, starting position identifier, length, and whether it is required for each sub-line into the message format table.
[0078] Step 3: Read the message body table and message name table from the database, and use the adjacency list data structure to build the message model.
[0079] Step 4: Recursively traverse the message model and extract the message content based on the identifier of the starting position of the sub-line.
[0080] Therefore, the processing scheme for MT (FIN) messages recursively reads static data from the database to construct a tree-structured model, which results in long initialization times. ISO messages are in XML format, and their content typically involves paths of three or more levels, whereas the data model for MT messages is suited to two-level paths. If the MT processing method is applied to ISO messages, recursively traversing the adjacency list to construct the data model significantly increases the computational load and requires more computing space when the path hierarchy is deep.
[0081] Based on this, embodiments of this application provide a message processing method that, by establishing a data model suitable for ISO messages, processes ISO messages, thereby reducing computational space and computational load, and shortening the computation time required.
[0082] The message processing method provided in this application can be applied to terminal devices or servers. Terminal devices include, but are not limited to, mobile phones, computers, and intelligent voice interaction devices; servers can be independent physical servers, server clusters or distributed systems composed of multiple physical servers, or cloud servers providing cloud computing services. Terminal devices and servers can be directly or indirectly connected via wired or wireless communication, and this application does not impose any restrictions on this connection.
[0083] The message processing method provided in this application is described below with reference to Figure 1, taking a server as the execution subject of the method as an example. Figure 1 is a flowchart of a message processing method provided in an embodiment of this application; the method may include S101-S112.
[0084] S101: Obtain the ISO message format file and Chinese name dictionary from the International Organization for Standardization.
[0085] In practical applications, users can directly import ISO message format files and Chinese name dictionaries, or they can input the identifiers of ISO message format files (such as file names) and Chinese name dictionaries (such as dictionary identifiers) into the server. The server will then retrieve the corresponding ISO message format files and Chinese name dictionaries based on the identifiers.
[0086] The ISO message format file includes the standard format of ISO messages, and the Chinese name dictionary includes the Chinese definitions of ISO messages, which are provided by branch business personnel according to the needs of the branch.
[0087] As one possible implementation, the standard format of ISO messages can be downloaded from the official website; the downloaded file is typically an Excel spreadsheet. The API of the POI technology can then be used to read the standard format file of the ISO message in Excel format.
[0088] S102: Read the i-th line of data from the ISO message format file, where the initial value of i is 1.
[0089] ISO messages are in XML format. XML is a markup language used to structure electronic documents. For example, as shown in Figure 2, text in an XML structure consists of tag names and tag content, and each tag corresponds to a node in this embodiment.
[0090] Different message types correspond to different message structures. Each message format table contains tens of thousands of entries, each recording path information and its corresponding format. Since there is a one-to-one correspondence between path information and format, this embodiment stores the path information and format in an ISO message node path configuration table. In subsequent ISO message format verification, it is only necessary to query the ISO message node path configuration table.
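To illustrate why one table keyed by path is enough for format verification, the following is a minimal sketch rather than the patented implementation. The `NodeRule` fields, the `check_value` function, the sample path `/Document/Msg/Id`, and its length limits and regular expression are all hypothetical names and values chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeRule:
    """Hypothetical in-memory form of one row of the node path configuration table."""
    path: str
    required: bool
    min_length: Optional[int]
    max_length: Optional[int]
    regex: Optional[str]


def check_value(value: Optional[str], rule: NodeRule) -> List[str]:
    """Return the list of violations; an empty list means the value passes."""
    if value is None:
        return [f"{rule.path}: required node is missing"] if rule.required else []
    problems = []
    if rule.min_length is not None and len(value) < rule.min_length:
        problems.append(f"{rule.path}: shorter than {rule.min_length}")
    if rule.max_length is not None and len(value) > rule.max_length:
        problems.append(f"{rule.path}: longer than {rule.max_length}")
    if rule.regex and re.fullmatch(rule.regex, value) is None:
        problems.append(f"{rule.path}: does not match {rule.regex}")
    return problems


# A single lookup keyed by path stands in for querying the configuration table.
RULES = {
    "/Document/Msg/Id": NodeRule("/Document/Msg/Id", True, 1, 35, r"[A-Za-z0-9-]+"),
}
```

Because each path maps to exactly one rule, verification is a dictionary lookup per node and never needs to rebuild or walk a tree of the message structure.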
[0091] Based on this, in order to correctly extract the content of each type of message, this application embodiment establishes a corresponding ISO message node path configuration table for each message. Specifically, the data model is generated fully automatically from the ISO message format file: the format and path information of the ISO message are extracted by traversing each line of data in the ISO message format file, and the insert statements for the ISO message node path configuration table are generated from them. The entire process does not rely on a database and can run independently with extremely low time complexity (for example, extracting one message type takes only about 10 seconds). Furthermore, for subsequent static data maintenance of ISO messages, only the new version of the ISO message format file needs to be input; running the message processing method provided in this application embodiment then updates the data model of the corresponding message type without configuring any other environment. The following describes in detail how to generate the insert statements for the ISO message node path configuration table from each ISO message format file.
[0092] It's important to note that ISO messages consist of a header and a body. Both the header and body typically contain at least one node. By traversing each line of data in the ISO message format file—starting from the first line—the location of the corresponding node for that type of ISO message can be determined. This allows for direct extraction of the corresponding node when receiving similar messages later. Compared to string truncation, this method avoids concerns about extracted content containing identifiers, and changes to node names and path information do not affect the extraction of the message content.
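The contrast with string truncation can be sketched with a DOM parser. The example below uses Python's `xml.dom.minidom` as a stand-in for the w3c.dom API named in this application; the function `values_at` and the sample element names (`Msg`, `Id`, `Tx`, `Ref`) are hypothetical, and the sketch assumes paths are absolute and written with local element names.

```python
from xml.dom import Node, minidom


def values_at(doc, path):
    """Return the text of every element reached by an absolute path such as /Document/Msg/Id."""
    current = [doc]  # start from the document node
    for name in path.strip("/").split("/"):
        matched = []
        for parent in current:
            for child in parent.childNodes:
                # Compare local names so a default namespace does not matter.
                if child.nodeType == Node.ELEMENT_NODE and (child.localName or child.tagName) == name:
                    matched.append(child)
        current = matched
    return [
        "".join(t.data for t in el.childNodes if t.nodeType == Node.TEXT_NODE).strip()
        for el in current
    ]


SAMPLE = (
    '<Document xmlns="urn:example:msg"><Msg><Id>CASE-1</Id>'
    "<Tx><Ref>A</Ref></Tx><Tx><Ref>see &lt;Id&gt;</Ref></Tx></Msg></Document>"
)
sample_doc = minidom.parseString(SAMPLE)
refs = values_at(sample_doc, "/Document/Msg/Tx/Ref")  # one entry per repeated Tx node
```

The second `Ref` value contains text that looks like a tag. A parser-based lookup returns it as ordinary content, which is the concern the paragraph above raises about identifier-based string truncation.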
[0093] As one possible implementation, before executing S102, a static path dictionary can be set to represent the missing secondary paths corresponding to certain nodes of each type of message.
[0094] S103: Read the node removal attribute and the name attribute identifier of the i-th row of data.
[0095] On the SWIFT website, the ISO message standard is stored in an Excel spreadsheet. The format of ISO messages is quite complex, generally categorized as follows:
[0096] (1) Node attributes: whether it is a leaf node.
[0097] (2) Node type: whether it is required or optional.
[0098] (3) Whether a node can be repeated: if yes, the maximum and minimum number of inputs.
[0099] (4) Node content format type: indicates whether the node content is in text, date, datetime or decimal format (maximum and minimum length of decimal places).
[0100] (5) Regular expressions for node content validation.
[0101] Because the ISO message format is quite complex, related technologies typically store it in multiple tables, with the format and message structure stored separately. Furthermore, the data model for MT messages requires numerous static data tables. Maintaining this static data necessitates interaction with different database tables, which is time-consuming, and data migration requires ensuring version consistency across multiple tables, making it prone to errors.
[0102] After research and analysis, seven formats were identified for ISO messages: whether they are required, whether they repeat, minimum number of occurrences, maximum number of occurrences, format type, minimum length, maximum length, and regular expressions for the format content. Each leaf node path and format is uniquely defined. Therefore, for later data maintenance and migration, these seven categories were integrated into a single table—the ISO message node path configuration table—to record the paths and their corresponding seven formats, reducing the probability of errors.
[0103] The following sections provide explanations of the ISO message node path configuration table.
[0104] The node removal attribute indicates whether a row of data was made obsolete during a version upgrade. For example, in the standard format of an ISO message in Excel spreadsheet form, the node removal attribute is typically represented by the IsRemoved data column. If the IsRemoved data column is "yes", the data in that row was made obsolete during the ISO version upgrade, so the current loop iteration can be skipped and the next row processed. If the IsRemoved data column is "no", the data in that row was not made obsolete during the ISO version upgrade.
[0105] The name attribute identifier is used to identify whether the data is node rule information. For example, in the standard format of ISO messages in Excel spreadsheet form, the name attribute identifier is generally identified by the data column "name". If the content of the data column "name" is "Algorithm", "CrossElementComplexRule", "CrossElementSimpleRule", or "Textual", it means that the data in that row does not describe node rule information, so the current loop can be skipped and the next row can be traversed. If the content of the data column "name" indicates that the data in that row is node rule information...
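The two skip conditions for S103 can be sketched as a single predicate. This is an illustrative sketch that assumes each spreadsheet row has already been read into a dictionary keyed by column name; the function name `should_process` is hypothetical, while the column names and the four skipped values come from the description above.

```python
# Values of the "name" column that, per the description, mark rows to skip.
SKIPPED_NAMES = {"Algorithm", "CrossElementComplexRule", "CrossElementSimpleRule", "Textual"}


def should_process(row: dict) -> bool:
    """Return True when a format-file row should go on to path completion (S104)."""
    if str(row.get("IsRemoved", "")).strip().lower() == "yes":
        return False  # row was made obsolete by a version upgrade
    if str(row.get("name", "")).strip() in SKIPPED_NAMES:
        return False  # row is not a node description
    return True
```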
[0106] S104: If the node removal attribute indicates that the data in the i-th row is non-obsolete data, and the node name attribute indicates that the data in the i-th row is node rule information, then complete the path of the data in the i-th row according to the type of the data in the i-th row to obtain the path information of the data in the i-th row.
[0107] The type of the data in the i-th row includes a message header and a message body, and missing path information is filled in according to the type. For example, in the standard format file of an ISO message in Excel spreadsheet form, if the node name contains "BusinessApplicationHeader", the corresponding message header root path is filled in as "/AppHdr"; if the node name contains "Document", the corresponding message body root path is filled in as "/Document" (because these two nodes lack path descriptions).
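A minimal sketch of this root-path completion follows. It assumes that a path is filled in only when the row's own path cell is empty and that stray spaces in node names and paths are extraction noise; the function name `complete_path` is hypothetical.

```python
# (marker in node name, root path to fill in), following the two cases described above.
ROOT_PATHS = (
    ("BusinessApplicationHeader", "/AppHdr"),
    ("Document", "/Document"),
)


def complete_path(node_name, path):
    """Return the row's path, filling in the header or body root when it is missing."""
    if path and path.strip():
        return path.strip()  # the row already carries a path description
    compact = str(node_name).replace(" ", "")
    for marker, root in ROOT_PATHS:
        if marker in compact:
            return root
    return ""  # nothing to complete for this row
```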
[0108] As one possible approach, the obtained path information can be saved to the path data column in the standard format file of the ISO message, which is in the form of an Excel spreadsheet.
[0109] S105: Merge the information of the i-th row of data to obtain format information and duplicate information identifier.
[0110] The duplicate information identifier is used to indicate whether a node can appear repeatedly. It should be noted that in the standard format file of ISO messages in Excel spreadsheet format, TypeOrCodeChange and MultiplicityChange are preferentially used as format information and duplicate information identifiers, respectively.
[0111] As one possible implementation, the path of the second level of the message body can be completed before executing S105.
[0112] As one possible implementation, after S105 and before S106, it can be determined whether the data in the i-th row includes optional information. If so, an insert statement for the ISO message node option configuration table is generated based on the optional information, and then S111 is executed. If the data in the i-th row does not include optional information, then S106 can be executed. The ISO message node option configuration table can be as shown in Table 1.
[0113] Table 1
[0114]
[0115] That is, it is determined whether this row of data describes optional values. Because some leaf nodes have default and optional values, the message format manual places the descriptions of the optional values immediately after such nodes. Specifically, if the values of the XmlTag and Multiplicity columns are both empty, an insert statement for the ISO message node option configuration table is generated for this row of data, the current loop iteration is skipped, and the next row of data is processed. Since leaf nodes with default and optional values are rare, they are handled by this supplementary static data configuration table. Therefore, in subsequent message format validation, only two tables need to be queried to construct the message's data model.
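The option-row check can be sketched as follows. The description states only that a row with empty XmlTag and Multiplicity columns describes an optional value and that such rows follow their node. Attaching each option to the most recent node row, and reading the option value from the "name" column and the node path from a "path" column, are assumptions made for this illustration.

```python
def is_blank(cell) -> bool:
    """Treat None and whitespace-only cells as empty."""
    return cell is None or str(cell).strip() == ""


def collect_options(rows):
    """Group optional values under the path of the node row that precedes them."""
    options = {}
    last_path = None
    for row in rows:
        if is_blank(row.get("XmlTag")) and is_blank(row.get("Multiplicity")):
            if last_path is not None:
                options.setdefault(last_path, []).append(row.get("name"))
            continue  # option rows are not processed further in this iteration
        last_path = row.get("path")
    return options


SAMPLE_ROWS = [
    {"XmlTag": "ChrgBr", "Multiplicity": "[0..1]", "path": "/Document/Msg/ChrgBr", "name": "ChargeBearer"},
    {"XmlTag": "", "Multiplicity": "", "name": "DEBT"},
    {"XmlTag": None, "Multiplicity": None, "name": "CRED"},
]
```

Each collected pair of node path and option value would then become one insert statement for the ISO message node option configuration table.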
[0116] S106: If the duplicate information identifier indicates that the data in the i-th row can be repeated, obtain the maximum and minimum number of times it is allowed to be repeated.
[0117] For example, in the standard format file of ISO messages in Excel spreadsheet form, the duplicate information of nodes can be extracted. When the data column multiplicity is empty or 1, it means that it will not be repeated; otherwise, it will be repeated. Thus, the maximum and minimum number of times a node appears can be obtained.
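A minimal sketch of reading the repetition bounds follows. The description does not give the cell syntax, so the sketch assumes the common `[min..max]` notation with `*` for an unbounded maximum. It also reads "empty or 1" as "the upper bound is 1", which is one possible interpretation. The names `Multiplicity` and `parse_multiplicity` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Multiplicity(NamedTuple):
    repeatable: bool
    min_occurs: Optional[int]
    max_occurs: Optional[int]  # None means unbounded, or not given for non-repeatable cells


# Assumed cell syntax: "[0..1]", "[1..*]", "1..5"; brackets are optional.
_RANGE = re.compile(r"\[?\s*(\d+)\s*\.\.\s*(\d+|\*)\s*\]?")


def parse_multiplicity(cell) -> Multiplicity:
    text = "" if cell is None else str(cell).strip()
    if text in ("", "1"):
        return Multiplicity(False, None, None)  # empty or 1: the node does not repeat
    match = _RANGE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised multiplicity: {text!r}")
    low = int(match.group(1))
    high = None if match.group(2) == "*" else int(match.group(2))
    return Multiplicity(high != 1, low, high)
```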
[0118] As one possible implementation, before executing S106, the node can be distinguished as either a node belonging to the message header or a node belonging to the message body based on the path information (path column). Specifically, if the path contains AppHdr, the node belongs to the message header; if the path contains Document, the node belongs to the message body.
[0119] S107: Obtain the required node identifier of the i-th row of data based on the path information.
[0120] The required node identifier indicates whether the node represented by the data in the i-th row must appear in the ISO message. For example, in the standard format file of an ISO message in Excel spreadsheet form, the node whose data column "path" contains "@" is a required node.
[0121] S108: Extract the regular expression, the format type of the i-th row of data, and the length information corresponding to the format type based on the format information.
[0122] For example, in the standard format file of an ISO message in Excel spreadsheet format, extract the node format information and regular expression. If TypeOrCode contains text, date, datetime, or decimal, it means that the corresponding format type is text, date, datetime, or decimal.
[0123] Specifically, it retrieves the corresponding length information based on the type. For example, if it is a text type, it retrieves the minimum and maximum length of the string; if it is a decimal type, it retrieves the minimum and maximum length of the decimal places.
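The format-type detection and the type-dependent length lookup can be sketched as below. The substring test follows the description. The facet keys (`minLength`, `maxLength`, `minFractionDigits`, `maxFractionDigits`) and the function names are hypothetical, since the description does not say which cells hold the length limits.

```python
# "datetime" is tested before "date" because the former contains the latter.
FORMAT_TYPES = ("datetime", "date", "decimal", "text")


def detect_format_type(type_or_code):
    """Return the format type named inside a TypeOrCode cell, or None if there is none."""
    lowered = str(type_or_code or "").lower()
    for name in FORMAT_TYPES:
        if name in lowered:
            return name
    return None


def length_info(format_type, facets):
    """Pick the (min, max) length pair that applies to the given format type."""
    if format_type == "text":
        return facets.get("minLength"), facets.get("maxLength")
    if format_type == "decimal":
        return facets.get("minFractionDigits"), facets.get("maxFractionDigits")
    return None, None  # date and datetime carry no length limits in this sketch
```

Checking the types in a fixed order matters here: a plain substring test for "date" would also match a datetime cell and misclassify it.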
[0124] S109: Determine the node number of the node represented by the i-th row of data based on the type of the i-th row of data and the order of the nodes represented by the i-th row of data.
[0125] There is a one-to-one correspondence between nodes and node numbers, so the corresponding node can be obtained based on the node number.
[0126] This application does not specifically limit the method for determining the node number. The following description uses one example, see A1-A5:
[0127] A1: Determine whether the node represented by the i-th row of data is the first node of the message header.
[0128] A2: If the node represented by the data in the i-th row is the first node in the message header, then the node number is determined to be AA000.
[0129] A3: If the node represented by the i-th row of data is not the first node of the message header, then determine whether the node represented by the i-th row of data is the first node of the message body.
[0130] A4: If the node represented by the data in the i-th row is the first node of the message body, then the node number is determined to be DA000.
[0131] A5: If the node represented by the i-th row of data is not the first node of the message body, the node number is determined by incrementing the preset step size.
[0132] This application does not specifically limit the preset step size increment method. For example, it can increment by (step size * 36) to reserve space for subsequent repeated nodes through a skip increment method. Thus, the nodes in the message header increment from AA000 and the nodes in the message body increment from DA000.
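One way to read the skip increment is to treat a node number such as AA000 as a five-digit base-36 value and add (step size * 36) to it, which leaves the last digit free for repeated nodes. The sketch below follows that reading; the class name `NodeNumbering` and the base-36 interpretation of the node number are assumptions made for illustration, since this application does not limit the increment method.

```java
import java.util.Locale;

/** Illustrative only: treats a node number such as AA000 as a five-digit base-36 value. */
final class NodeNumbering {

    static final String HEADER_START = "AA000"; // first node of the message header
    static final String BODY_START = "DA000";   // first node of the message body

    /** Returns the node number that follows {@code previous} when skipping by step * 36. */
    static String next(String previous, int step) {
        long value = Long.parseLong(previous, 36);   // accepts digits 0-9 and letters A-Z
        long increased = value + (long) step * 36;   // the last base-36 digit stays free for repeats
        return Long.toString(increased, 36).toUpperCase(Locale.ROOT);
    }
}
```

With a step size of 1, the header nodes would be numbered AA000, AA010, AA020 and so on, so the values between two consecutive numbers remain available for repeated occurrences of the same node.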
[0133] S110: Complete the Chinese definition of the node represented by the i-th row of data according to the Chinese name dictionary to obtain the Chinese name and Chinese annotation of the node.
[0134] S111: Increment the value of i by 1, execute the step of reading the i-th line of data in the ISO message format file and subsequent steps, until all data in the ISO message format file has been read.
[0135] This allows you to traverse each line of the ISO message format file and obtain the path information, duplicate information identifier, required node identifier, maximum number of occurrences, minimum number of occurrences, regular expression, format type, length information, node Chinese name, and node Chinese comment for each line of data.
[0136] As one possible implementation, take the "ISO 20022 Message Format Book - camt.029 Type" as an example. This ISO message format file is the standard format book for the camt.029 type of ISO 20022 messages. The format information can be extracted from the content of the "Full_View" tab. The information columns to be extracted are shown in Table 2.
[0137] Table 2
[0138]
[0139] S112: Generate the insert statement corresponding to the ISO message node path configuration table based on path information, duplicate information identifier, required node identifier, maximum number of times, minimum number of times, regular expression, format type, length information, node Chinese name, and node Chinese comment.
[0140] For example, the ISO message node path configuration table can be shown in Table 3.
[0141] Table 3
[0142]
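The generation of an insert statement in S112 can be sketched as plain string assembly, which is consistent with the statement being produced without a database connection. The class name `InsertStatementBuilder`, the table name, and the column names used in the usage note are hypothetical; the actual columns are those of the ISO message node path configuration table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: assembles one SQL INSERT statement as text, without a database. */
final class InsertStatementBuilder {

    /** Column order follows the iteration order of {@code columns}; null becomes SQL NULL. */
    static String build(String table, Map<String, String> columns) {
        List<String> values = new ArrayList<>();
        for (String value : columns.values()) {
            // Single quotes are doubled so that regular expressions and comments stay valid SQL.
            values.add(value == null ? "NULL" : "'" + value.replace("'", "''") + "'");
        }
        return "INSERT INTO " + table
                + " (" + String.join(", ", columns.keySet()) + ")"
                + " VALUES (" + String.join(", ", values) + ");";
    }
}
```

Passing a `LinkedHashMap` with, say, hypothetical columns NODE_NO and NODE_PATH keeps the columns in insertion order, and the resulting text can be written to a script file and executed later against whichever database stores the configuration table.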
[0143] As can be seen from the above technical solution, the data model for ISO messages in XML format is constructed by automatically generating the insert statements for the ISO message node path configuration table from the ISO message format file. This process runs independently without relying on a database and has extremely low time complexity. It also requires no environment configuration: the code for building the data model is packaged and released, and only input parameters need to be provided. This lowers the technical background required of subsequent maintenance personnel and reduces the cost of maintaining the ISO message data model. Moreover, using this data model eliminates the need to construct the model recursively from adjacency lists and to parse message content by string truncation, which reduces computational space and workload and shortens computation time.
[0144] Through S101-S112, the corresponding insert statements (such as SQL statements) are generated automatically for each type of ISO message format file, so that the message structure of each type of ISO message is stored in the database for subsequent use in parsing ISO messages. The parsing process is described below in S201-S209.
[0145] S201: Obtain ISO message.
[0146] S202: Read the message content, path information, and message type of the ISO message.
[0147] As one possible implementation, the dom4j API can be used to read the XML-formatted message content, corresponding path information, and message type.
[0148] S203: Obtain the message structure from the ISO message node path configuration table based on the message type.
[0149] As one possible implementation, B1-B3 can also be executed before S203 is executed.
[0150] B1: Obtain the message header and message body content based on the message content.
[0151] For example, in the standard format file of an ISO message in Excel spreadsheet format, the message header content and message body content can be obtained according to the AppHdr and Document tag names, respectively.
[0152] B2: Verify the message header and message body content separately according to the ISO format verification file.
[0153] For example, the schema file can be used to verify the content of the message header and the message body separately. The schema file is an officially provided format verification file, which is divided into two parts: the message header and the message body.
[0154] B3: If the verification passes, then execute S203.
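The verification in B2 can be sketched with the validation API that ships with the JDK, assuming the format verification file is an XML Schema (XSD) document. The class name `MessagePartValidator` is hypothetical, and the sketch takes both the message part and the schema as text so that it is self-contained; a real deployment would load the officially provided files instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/** Illustrative only: checks one message part (header or body) against one XSD. */
final class MessagePartValidator {

    /** Returns true when {@code xml} is valid against {@code xsd}; both are passed as text. */
    static boolean isValid(String xml, String xsd) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for a malformed schema as well as for an invalid document.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `isValid` once with the header content and the header schema, and once with the body content and the body schema, mirrors the separate verification of the two parts; S203 would then run only if both calls return true.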
[0155] S204: Obtain the i-th leaf node in the message content according to the message structure, where i is 1.
[0156] The message structure is obtained directly from the ISO message node path configuration table based on the message type, and the leaf node records in the message structure are then iterated over.
[0157] In related technologies, message parsing involves a recursive process of parent and child nodes, extracting data by truncating strings at specific positions and lengths. Reading message content layer by layer by truncating strings is not only time-consuming, but also requires frequent modifications to the truncation code when ISO message standards change, such as node path changes. Secondly, extracting data by truncating strings using start position identifiers cannot handle cases where the extracted content contains identifiers. Furthermore, ISO messages may contain nodes with the same name under different paths, and nodes may appear cyclically. The string truncation method cannot flexibly handle such situations.
[0158] Based on this, the embodiments of this application utilize the API of W3c.dom technology to parse the message content according to the path of the leaf node. The API of W3c.dom technology is chosen because it can remove the namespace of XML tags, etc.
[0159] S205: Obtain the path information and node number of the i-th leaf node.
[0160] S206: Parse the node content of the i-th leaf node based on the path information of the i-th leaf node.
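Parsing by leaf node path can be sketched with the `org.w3c.dom` classes bundled with the JDK. The sketch walks the tree by element name instead of cutting strings, ignores any namespace prefix, and returns one value per occurrence so that repeated nodes are preserved. The class name `LeafReader` and the element names in the usage note are illustrative assumptions, and the tree walk is one simple way to follow a path, not necessarily the one used in this application.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrative only: reads the text of every element reached by a slash-separated path. */
final class LeafReader {

    static List<String> read(String xml, String path) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            String[] names = path.replaceAll("^/", "").split("/");
            List<Element> current = new ArrayList<>();
            if (localName(root).equals(names[0])) {
                current.add(root);
            }
            // Descend one path segment at a time, keeping every matching element.
            for (int depth = 1; depth < names.length; depth++) {
                List<Element> next = new ArrayList<>();
                for (Element parent : current) {
                    NodeList children = parent.getChildNodes();
                    for (int k = 0; k < children.getLength(); k++) {
                        Node child = children.item(k);
                        if (child instanceof Element
                                && localName((Element) child).equals(names[depth])) {
                            next.add((Element) child);
                        }
                    }
                }
                current = next;
            }
            List<String> values = new ArrayList<>();
            for (Element leaf : current) {
                values.add(leaf.getTextContent().trim());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse message", e);
        }
    }

    /** Drops a namespace prefix such as "ns:" so that paths can be written without it. */
    private static String localName(Element element) {
        String name = element.getNodeName();
        return name.substring(name.indexOf(':') + 1);
    }
}
```

For a message in which a hypothetical path /Document/Grp/Tx/Id occurs twice, `read` returns both values in document order, and content that happens to contain tag-like characters is returned unchanged because nothing is located by searching for identifiers in the raw string.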
[0161] S207: Determine the node value of the i-th leaf node based on the node number and repetition count of the i-th leaf node.
[0162] For example, create a nodeIndex number using base-36 notation (0-9, A-Z). Each parsed piece of message content corresponds to one such number. Numbering is necessary because there are duplicate nodes with the same path. For duplicate nodes, the nodeIndex is incremented by one each time.
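The numbering of duplicate nodes can be sketched as base-36 addition. The sketch assumes that the first occurrence of a leaf keeps its node number and that each further occurrence adds one; the class name `NodeValues` and the zero-based occurrence counter are illustrative choices, not details given in this application.

```java
import java.util.Locale;

/** Illustrative only: derives a node value from a node number and a repetition count. */
final class NodeValues {

    /** Node value for the given occurrence (0 for the first) of the leaf with this node number. */
    static String of(String nodeNumber, int occurrence) {
        long value = Long.parseLong(nodeNumber, 36) + occurrence; // base-36 digits 0-9, A-Z
        return Long.toString(value, 36).toUpperCase(Locale.ROOT);
    }
}
```

Under this assumption, three occurrences of a leaf numbered AA010 receive the node values AA010, AA011, and AA012, so each stored piece of content has a distinct key even though all three share one path.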
[0163] S208: Increment the value of i by 1, execute the step of obtaining the i-th leaf node in the message content according to the message structure and subsequent steps, until all the content of the message content has been read.
[0164] The value of i is continuously increased, that is, S204-S208 is continuously executed until all the contents of the message are read.
[0165] S209: Store the node content and node value in the ISO message node information table.
[0166] The parsed content is saved to the ISO message node information table. This table stores the parsed content of each node in the message, with each node value corresponding to its content. This allows the node content to be retrieved from the ISO message node information table based on the node value.
[0167] As one possible implementation, the ISO message node information table can be as shown in Table 4.
[0168] Table 4
[0169]
[0170] As can be seen from the above technical solution, directly querying the leaf node path of the ISO message data model and using W3c.dom technology, the corresponding node content can be obtained based on the path information. This eliminates the need for recursively constructing the model using an adjacency list and parsing the message content by string truncation. This saves computational space and time, eliminates concerns about the extracted content containing identifiers, and ensures that each node corresponds to a unique value; changes to the node name and path do not affect content extraction.
[0171] S201-S209 extract the content of all leaf nodes, but data users may need only a portion of the message content or may wish to process the content, for example by converting data types. A flexibly configurable message content processing interface is therefore needed. Based on the data user's needs, the message content can be customized and assigned to classes of different types for return, thus decoupling message data extraction from data processing. See S301-S305 for details.
[0172] S301: Obtain the ISO message node and transaction domain mapping table.
[0173] The ISO message node and transaction domain mapping table includes the node name and path of the node to be converted, providing data users with a configuration table to extract content from a specified path. Based on the configuration information, Java reflection is used to assign values to the class variables storing the data. As one possible implementation, the ISO message node and transaction domain mapping table is shown in Table 5.
[0174] Table 5
[0175]
[0176] S302: Retrieve the record of the node to be converted from the ISO message node information table based on the node path.
[0177] The ISO message node information table includes information on multiple nodes. Each node's information is stored in a column or a row, and each column or row of data constitutes a record.
[0178] The node to be converted is the node that the user wants to convert.
[0179] S303: Obtain user conversion requirements.
[0180] For example, the conversion requirements specify which node's type is to be converted and what the target type is.
[0181] S304: Based on the transformation requirements, transform the records of the nodes to be transformed to obtain the transformed records of the nodes that meet the transformation requirements.
[0182] S305: Display the records of the converted nodes to the user.
[0183] For example, based on the data in the ISO message node and transaction domain mapping table, obtain the class name (ISO message node) and the node path mapped to each class variable provided by the data user. Based on the node path, retrieve the record from the ISO message node information table. Use Java's reflection mechanism to obtain the type of the class variable (for example, String or BigDecimal), which determines the conversion. Convert the string stored in the ISO message node information table to the corresponding type, such as String or BigDecimal, to obtain the record of the converted node. Because of Java's polymorphism, a single interface can return values of different types.
[0184] Therefore, by utilizing Java's reflection and polymorphism mechanisms, a flexibly configurable interface for message content processing is provided. Based on the data user's conversion needs, the message content is personalized and assigned to different types of classes for return, thereby achieving the goal of decoupling message data extraction and data processing.
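The reflection-based assignment described in S301-S305 can be sketched as follows. The class `RecordMapper`, the nested target class `Payment`, its field names, and the two maps standing in for the mapping table and the node information table are all hypothetical; only String and BigDecimal conversion is shown, and other types would need further branches.

```java
import java.lang.reflect.Field;
import java.math.BigDecimal;
import java.util.Map;

/** Illustrative only: fills a user-supplied class from stored node content via reflection. */
final class RecordMapper {

    /** Hypothetical target class provided by a data user. */
    static final class Payment {
        String reference;
        BigDecimal amount;
    }

    /** fieldToPath: class variable -> node path; pathToText: node path -> stored string. */
    static <T> T map(Class<T> type, Map<String, String> fieldToPath,
                     Map<String, String> pathToText) {
        try {
            T target = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, String> entry : fieldToPath.entrySet()) {
                String text = pathToText.get(entry.getValue());
                if (text == null) {
                    continue; // node absent from the message: leave the field unset
                }
                Field field = type.getDeclaredField(entry.getKey());
                field.setAccessible(true);
                // The declared field type decides which conversion is applied.
                field.set(target, convert(text, field.getType()));
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot map to " + type.getName(), e);
        }
    }

    private static Object convert(String text, Class<?> fieldType) {
        if (fieldType == BigDecimal.class) {
            return new BigDecimal(text);
        }
        if (fieldType == String.class) {
            return text;
        }
        throw new IllegalArgumentException("Unsupported field type: " + fieldType);
    }
}
```

Because the target class and the mapping are both passed in, a data user can add a new class or change a path in the mapping without touching the extraction code, which is the decoupling this part of the method aims for.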
[0185] In addition to the message processing method, this application also provides a message processing apparatus. As shown in Figure 3, the apparatus includes:
[0186] The acquisition unit 301 is used to acquire the ISO message format file and the Chinese name dictionary of the International Organization for Standardization. The ISO message format file includes the standard format of ISO messages, and the Chinese name dictionary includes the Chinese interpretation of ISO messages.
[0187] The reading unit 302 is used to read the i-th line of data in the ISO message format file, where the initial value of i is 1;
[0188] The reading unit 302 is also used to read the node removal attribute and name attribute identifier of the i-th row of data;
[0189] The completion unit 303 is used to complete the path of the i-th row of data according to the type of the i-th row of data if the node removal attribute indicates that the i-th row of data is non-obsolete data and the node name attribute indicates that the i-th row of data is node rule information, so as to obtain the path information of the i-th row of data. The type of the i-th row of data includes a message header and a message body.
[0190] Merging unit 304 is used to merge the information of the i-th row of data to obtain format information and duplicate information identifier;
[0191] The acquisition unit 301 is further configured to acquire the maximum and minimum number of times the data in the i-th row can be repeated if the duplicate information identifier indicates that the data in the i-th row can be repeated.
[0192] The acquisition unit 301 is further configured to acquire the required node identifier of the i-th row of data according to the path information, wherein the required node identifier is used to identify whether the node represented by the i-th row of data must appear in the ISO message;
[0193] Extraction unit 305 is used to extract a regular expression, the format type of the i-th row of data, and the length information corresponding to the format type according to the format information;
[0194] The determining unit 306 is used to determine the node number of the node represented by the i-th row of data according to the type of the i-th row of data and the order of the nodes represented by the i-th row of data;
[0195] The completion unit 303 is also used to complete the Chinese definition of the node represented by the i-th row of data according to the Chinese name dictionary, so as to obtain the Chinese name and Chinese annotation of the node;
[0196] The loop unit 307 is used to increment the value of i by 1, execute the step of reading the i-th line of data in the ISO message format file and subsequent steps, until all data in the ISO message format file has been read;
[0197] The generation unit 308 is used to generate an insertion statement corresponding to the ISO message node path configuration table based on the path information, the duplicate information identifier, the required node identifier, the maximum number of times, the minimum number of times, the regular expression, the format type, the length information, the node Chinese name, and the node Chinese annotation.
[0198] As one possible implementation, the generation unit 308 is further configured to:
[0199] After merging the information of the i-th row of data to obtain format information and duplicate information identifier, before obtaining the maximum and minimum number of allowed duplicate occurrences if the duplicate information identifier indicates that the i-th row of data can be repeated, if the i-th row of data includes optional information, then an insert statement for the ISO message node option configuration table is generated based on the optional information.
[0200] Increment the value of i by 1, and execute the step of reading the i-th line of data in the ISO message format file and subsequent steps until all data in the ISO message format file has been read.
[0201] As one possible implementation, the determining unit 306 is specifically used for:
[0202] Determine whether the node represented by the i-th row of data is the first node of the message header;
[0203] If the node represented by the i-th row of data is the first node in the message header, then the node number is determined to be AA000;
[0204] If the node represented by the i-th row of data is not the first node of the message header, then determine whether the node identified by the i-th row of data is the first node of the message body.
[0205] If the node represented by the i-th row of data is the first node of the message body, then the node number is determined to be DA000;
[0206] If the node represented by the i-th row of data is not the first node of the message body, the node number is determined by incrementing the preset step size.
[0207] As one possible implementation, the apparatus further includes a message parsing unit, used for:
[0208] Obtain ISO messages;
[0209] Read the message content, path information, and message type of the ISO message;
[0210] The message structure is obtained from the ISO message node path configuration table according to the message type;
[0211] The i-th leaf node in the message content is obtained according to the message structure, where i is 1;
[0212] Obtain the path information and node number of the i-th leaf node;
[0213] The node content of the i-th leaf node is parsed based on the path information of the i-th leaf node;
[0214] The node value of the i-th leaf node is determined based on the node number and the number of repetitions of the i-th leaf node.
[0215] Increment the value of i by 1, execute the step of obtaining the i-th leaf node in the message content according to the message structure and subsequent steps, until all the content of the message content is read.
[0216] The node content and the node value are stored in the ISO message node information table.
[0217] As one possible implementation, the apparatus further includes a message parsing unit, used for:
[0218] Obtain the message header and message body content based on the message content;
[0219] The message header content and the message body content are verified according to the ISO format verification file.
[0220] If the verification passes, then the process of retrieving the message structure from the ISO message node path configuration table based on the message type is executed.
[0221] As one possible implementation, the device further includes a conversion unit for:
[0222] Obtain the ISO message node and transaction domain mapping table, which includes the node name and node path of the node to be converted;
[0223] The record of the node to be converted is obtained from the ISO message node information table according to the node path;
[0224] Obtain user conversion needs;
[0225] The records of the nodes to be converted are converted according to the conversion requirements to obtain the converted records of the nodes that meet the conversion requirements.
[0226] The user is shown the records of the converted node.
[0227] As can be seen from the above technical solution, the data model for ISO messages in XML format is constructed by automatically generating the insert statements for the ISO message node path configuration table from the ISO message format file. This process runs independently without relying on a database and has extremely low time complexity. It also requires no environment configuration: the code for building the data model is packaged and released, and only input parameters need to be provided. This lowers the technical background required of subsequent maintenance personnel and reduces the cost of maintaining the ISO message data model. Moreover, using this data model eliminates the need to construct the model recursively from adjacency lists and to parse message content by string truncation, which reduces computational space and workload and shortens computation time.
[0228] This application also provides a computer device. Figure 4 is a structural diagram of a computer device provided in an embodiment of this application. As shown in Figure 4, the device includes a processor 410 and a memory 420:
[0229] The memory 420 is used to store program code and transmit the program code to the processor 410;
[0230] The processor 410 is used to execute any of the message processing methods provided in the above embodiments according to the instructions in the program code.
[0231] This application provides a computer-readable storage medium for storing a computer program that executes any of the message processing methods provided in the above embodiments.
[0232] This application also provides a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the message processing methods provided in the various optional implementations of the above aspects.
[0233] It should be noted that the various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. For the systems or apparatus disclosed in the embodiments, since they correspond to the methods disclosed in the embodiments, the descriptions are relatively simple, and relevant parts can be referred to the method section.
[0234] It should be understood that in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And / or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and / or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one (item) of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.
[0235] It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0236] The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein can be implemented directly by hardware, a software module executed by a processor, or a combination of both. The software module can be located in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.
[0237] The above description of the disclosed embodiments enables those skilled in the art to make or use this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims
1. A message processing method, characterized in that, The method includes: Obtain the International Organization for Standardization (ISO) message format file and the Chinese name dictionary. The ISO message format file includes the standard format of ISO messages, and the Chinese name dictionary includes the Chinese definitions of ISO messages. Read the i-th row of data from the ISO message format file, where the initial value of i is 1; Read the node removal attribute and name attribute identifier of the i-th row of data; If the node removal attribute indicates that the data in the i-th row is non-obsolete data, and the node name attribute indicates that the data in the i-th row is node rule information, then the path of the data in the i-th row is completed according to the type of the data in the i-th row to obtain the path information of the data in the i-th row. The type of the data in the i-th row includes a message header and a message body. Merge the information of the i-th row of data to obtain format information and a duplicate information identifier; If the duplicate information identifier indicates that the data in the i-th row can be repeated, obtain the maximum and minimum number of times it can be repeated; The required node identifier of the i-th row of data is obtained based on the path information. The required node identifier is used to identify whether the node represented by the i-th row of data must appear in the ISO message. 
Extract the regular expression, the format type of the i-th row of data, and the length information corresponding to the format type based on the format information; The node number of the node represented by the i-th row of data is determined according to the type of the i-th row of data and the order of the nodes represented by the i-th row of data; Complete the Chinese definition of the node represented by the i-th row of data according to the Chinese name dictionary to obtain the Chinese name and Chinese annotation of the node; Increment the value of i by 1, execute the step of reading the i-th line of data in the ISO message format file and subsequent steps, until all data in the ISO message format file has been read; Based on the path information, the duplicate information identifier, the required node identifier, the maximum number of times, the minimum number of times, the regular expression, the format type, the length information, the node Chinese name, and the node Chinese annotation, generate the insertion statement corresponding to the ISO message node path configuration table.
2. The method of claim 1, wherein, After merging the information of the i-th row of data to obtain format information and a repetition information identifier, and before obtaining the maximum and minimum allowed number of repetitions if the repetition information identifier indicates that the i-th row of data can be repeated, the method further includes: If the i-th row of data includes optional information, then an insert statement for the ISO message node option configuration table is generated based on the optional information; Increment the value of i by 1, and execute the step of reading the i-th line of data in the ISO message format file and subsequent steps until all data in the ISO message format file has been read.
3. The method of claim 1, wherein, The step of determining the node number of the node represented by the i-th row of data based on the type of the i-th row of data and the order of the nodes represented by the i-th row of data includes: Determine whether the node represented by the i-th row of data is the first node of the message header; If the node represented by the i-th row of data is the first node in the message header, then the node number is determined to be AA000; If the node represented by the i-th row of data is not the first node of the message header, then determine whether the node identified by the i-th row of data is the first node of the message body. If the node represented by the i-th row of data is the first node of the message body, then the node number is determined to be DA000; If the node represented by the i-th row of data is not the first node of the message body, the node number is determined by incrementing the preset step size.
4. The method of claim 1, wherein, The method further includes: Obtain ISO messages; Read the message content, path information, and message type of the ISO message; The message structure is obtained from the ISO message node path configuration table according to the message type; The i-th leaf node in the message content is obtained according to the message structure, where i is 1; Obtain the path information and node number of the i-th leaf node; The node content of the i-th leaf node is parsed based on the path information of the i-th leaf node; The node value of the i-th leaf node is determined based on the node number and the number of repetitions of the i-th leaf node. Increment the value of i by 1, execute the step of obtaining the i-th leaf node in the message content according to the message structure and subsequent steps, until all the content of the message content is read. The node content and the node value are stored in the ISO message node information table.
5. The method according to claim 4, characterized in that, Before obtaining the message structure from the ISO message node path configuration table according to the message type, the method further includes: Obtain the message header and message body content based on the message content; The message header content and the message body content are verified according to the ISO format verification file. If the verification passes, then the process of retrieving the message structure from the ISO message node path configuration table based on the message type is executed.
6. The method according to claim 1, characterized in that, The method further includes: Obtain the ISO message node and transaction domain mapping table, which includes the node name and node path of the node to be converted; The record of the node to be converted is obtained from the ISO message node information table according to the node path; Obtain user conversion needs; The records of the nodes to be converted are converted according to the conversion requirements to obtain the converted records of the nodes that meet the conversion requirements. The user is shown the records of the converted node.
7. A message processing apparatus, characterized in that, The device includes: The acquisition unit is used to acquire the ISO message format file and the Chinese name dictionary of the International Organization for Standardization. The ISO message format file includes the standard format of ISO messages, and the Chinese name dictionary includes the Chinese interpretation of ISO messages. The reading unit is used to read the i-th line of data in the ISO message format file, where the initial value of i is 1; The reading unit is also used to read the node removal attribute and name attribute identifier of the i-th row of data; The completion unit is used to complete the path of the i-th row of data according to the type of the i-th row of data if the node removal attribute indicates that the i-th row of data is non-obsolete data and the node name attribute indicates that the i-th row of data is node rule information, thereby obtaining the path information of the i-th row of data. The type of the i-th row of data includes a message header and a message body. The merging unit is used to merge the information of the i-th row of data to obtain format information and duplicate information identifier; The acquisition unit is further configured to acquire the maximum and minimum number of times the data in the i-th row can be repeated if the duplicate information identifier indicates that the data in the i-th row can be repeated. The acquisition unit is further configured to acquire the required node identifier of the i-th row of data according to the path information, wherein the required node identifier is used to identify whether the node represented by the i-th row of data must appear in the ISO message; The extraction unit is used to extract the regular expression, the format type of the i-th row of data, and the length information corresponding to the format type based on the format information. 
The determining unit is configured to determine the node number of the node represented by the i-th row of data based on the type of the i-th row of data and the order of the nodes represented by the i-th row of data. The completion unit is used to complete the Chinese definition of the node represented by the i-th row of data according to the Chinese name dictionary, so as to obtain the Chinese name and Chinese annotation of the node; The loop unit is used to increment the value of i by 1, execute the step of reading the i-th line of data in the ISO message format file and subsequent steps, until all data in the ISO message format file has been read; The generation unit is used to generate the insertion statement corresponding to the ISO message node path configuration table based on the path information, the duplicate information identifier, the required node identifier, the maximum number of times, the minimum number of times, the regular expression, the format type, the length information, the node Chinese name, and the node Chinese annotation.
8. A computer device, characterized in that, The device includes a processor and a memory: The memory is used to store program code and transmit the program code to the processor; The processor is configured to execute the method according to any one of claims 1-6 according to the instructions in the program code.
9. A computer-readable storage medium, characterized in that, The computer-readable storage medium is used to store a computer program for performing the method according to any one of claims 1-6.
10. A computer program product, characterized in that, Includes a computer program or instructions; when the computer program or instructions are executed by a processor, the method described in any one of claims 1-6 is performed.