Dynamic hybrid parsing and deterministic verification system and method for fuzzy bond instructions
By combining rule-based parsing with large-model parsing, the dynamic hybrid parsing and deterministic verification system for fuzzy bond trading instructions addresses the low parsing accuracy and high computational resource consumption of instruction parsing in bond market trading, achieving efficient and reliable parsing and verification of bond trading instructions.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CFETS FINANCIAL DATA CO LTD
- Filing Date
- 2026-03-23
- Publication Date
- 2026-06-19
Figure CN122240646A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of financial technology, specifically to a dynamic hybrid parsing and deterministic verification system and method for fuzzy bond instructions.
Background Technology
[0002] The interbank bond market primarily employs an over-the-counter (OTC) trading model, where communication and negotiation among traders heavily rely on instant messaging tools. This operational method generates massive amounts of unstructured text data daily, which not only includes core elements such as buy / sell directions, codes, and prices, but also contains a large amount of industry-specific jargon, abbreviations, and vague numerical descriptions such as "around" or "nearby."
[0003] Currently, there are two main technological applications in the field of instruction parsing. One is rule-based matching technology based on regular expressions or keyword templates. This technology extracts key information by predefined rigid text patterns and is often used to process standardized instructions with relatively fixed formats. The other is end-to-end parsing technology based on deep learning or generative pre-trained models, which has emerged in recent years. This technology utilizes the powerful semantic representation capabilities of neural networks to directly map natural language text into target data structures, aiming to solve the problem of semantic understanding in complex contexts.
[0004] The aforementioned rule-based solutions rely heavily on text structure and often struggle to recognize colloquial expressions with arbitrary word order, vague modifiers, or emerging jargon. Furthermore, maintaining large-scale rule bases is costly, and conflicts between rules are difficult to exhaustively resolve. While large-model-based solutions improve semantic understanding, their reasoning processes are computationally expensive and have high latency, making them unsuitable for the stringent millisecond-level timeliness requirements of trading systems. Moreover, generative models lack rigid constraints on business logic, making them prone to fabricating data or misinterpreting numerical units. Therefore, this invention provides a dynamic hybrid parsing and deterministic verification system and method for fuzzy bond instructions to address the shortcomings of existing technologies.
Summary of the Invention
[0005] To address the shortcomings of existing technologies, this invention provides a dynamic hybrid parsing and deterministic verification system and method for fuzzy bond instructions, which solves the problems of low accuracy in parsing natural language trading instructions, high computational resource consumption, and lack of logical self-consistency verification in existing fuzzy bond instruction parsing and verification technologies.
[0006] To achieve the above objectives, the present invention provides the following technical solution. The first aspect of the present invention provides a dynamic hybrid parsing and deterministic verification system for fuzzy bond instructions, comprising: an instruction preprocessing module, used to convert unstructured natural language bond trading instructions into structured preprocessed data containing lexical, syntactic, and entity features; a routing decision module, used to construct a multi-dimensional feature space based on the structured preprocessed data, calculate a complexity score, and generate path selection instructions according to a dual threshold strategy; a hybrid parsing module, used to respond to the path selection instructions and extract a set of candidate business elements through a rule parsing processing unit or a large model parsing processing unit; a verification fallback module, used to perform confidence-weighted integration and logical consistency detection on the candidate business element set, and to perform deterministic repair on abnormal elements to generate a compliant business element set; a mapping generation module, used to convert the set of compliant business elements into a structured query statement that conforms to the syntax of the target system; and a closed-loop optimization module, used to dynamically adjust system parameters based on user interaction behavior data.
[0007] Preferably, the instruction preprocessing module includes a financial field-specific word segmentation unit, a numerical time standardization unit, and an entity enhancement recognition unit; The financial field-specific word segmentation unit utilizes a pre-built bond field dictionary database and a bidirectional maximum matching algorithm to identify the boundaries of professional terms and the structure of compound terms in the input text, generating a word segmentation sequence. The numerical time standardization unit converts non-standard values in the word segmentation sequence into standard values based on the unit mapping table, and extrapolates the relative time description into specific date ranges based on the benchmark transaction calendar. The entity enhancement recognition unit uses a pre-trained sequence labeling model to calculate the entity label probability distribution of each word in the word segmentation sequence, and constructs structured preprocessed data containing a named entity list and semantic feature vectors.
[0008] Preferably, the routing decision module includes a multi-dimensional feature extraction unit and a complexity quantification calculation unit. The multi-dimensional feature extraction unit extracts structural features, semantic features, and business features from the structured preprocessed data: the structural features are determined based on the maximum depth of the dependency syntax tree, the semantic features on the density of fuzzy expression patterns, and the business features on the jargon professionalism index. The complexity quantification calculation unit adopts a linear weighted evaluation model, which calculates a complexity score representing the difficulty of instruction processing by weighting and summing the structural, semantic, and business features according to preset feature weight coefficients.
[0009] Preferably, the routing decision module further includes a dynamic threshold decision unit, which is configured with a low-complexity threshold and a high-complexity threshold. The dynamic threshold decision unit compares the complexity score with the low complexity threshold and the high complexity threshold respectively: when the score is lower than or equal to the low complexity threshold, it generates an instruction to activate the rule parsing path; when the score is higher than or equal to the high complexity threshold, it generates an instruction to activate the large model parsing path; if the score is between the two, it generates an instruction to activate the two paths in parallel. The dynamic threshold decision unit also collects and verifies the memory utilization rate of the system in real time. When the memory utilization rate is detected to continuously exceed the preset warning value, the values of the low complexity threshold and the high complexity threshold are adjusted upward by a preset step size to expand the judgment range of low complexity instructions.
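As a non-limiting illustration, the dual-threshold decision and the memory-driven threshold adjustment of paragraph [0009] might be sketched as follows. The class name, the default threshold values, the step size, the warning value, and the number of consecutive readings that counts as "continuously exceeding" are all assumptions made for this sketch and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class DualThresholdRouter:
    # All default values below are hypothetical placeholders.
    low: float = 0.3           # low-complexity threshold
    high: float = 0.7          # high-complexity threshold
    step: float = 0.05         # preset adjustment step size
    mem_warning: float = 0.85  # memory-utilization warning value
    patience: int = 3          # consecutive readings treated as "continuous"
    _over: int = 0             # internal counter of consecutive high readings

    def route(self, score: float) -> str:
        """Return 'rule', 'llm', or 'both' for a complexity score."""
        if score <= self.low:
            return "rule"
        if score >= self.high:
            return "llm"
        return "both"

    def observe_memory(self, utilization: float) -> None:
        """Raise both thresholds by one step after sustained memory pressure."""
        self._over = self._over + 1 if utilization > self.mem_warning else 0
        if self._over >= self.patience:
            self.low += self.step
            self.high += self.step
            self._over = 0
```

Raising both thresholds widens the band of scores that is routed to the rule path only, which is the behavior the paragraph describes for high-load conditions.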
[0010] Preferably, the rule parsing processing unit in the hybrid parsing module includes a regular expression pattern matching engine and a static template filling engine; The regular expression pattern matching engine scans the instruction text according to a hierarchical regular expression library sorted by business priority, and extracts substrings that conform to the syntactic structure based on the longest match priority strategy. The static template filling engine is based on the skeleton topology defined by the business intent slot template. It uses keyword anchoring technology to map the input text to the template with the highest matching degree, and fills the text fragments at the corresponding positions into the semantic slot sequence. It uses a financial terminology dictionary to map the extracted text fragments into standard codes to construct the candidate business element set.
[0011] Preferably, the large model parsing processing unit in the hybrid parsing module includes a prompt word construction subunit and a format constraint subunit; The prompt word construction subunit adopts a retrieval-enhanced few-shot learning strategy, uses a semantic embedding model to encode the current instruction into a query vector, retrieves the most similar historical instructions and standard answers from the historical instruction vector database, and constructs a structured input prompt word vector by combining the role definition of the verification system and the output format constraints. The format constraint subunit uses a grammar-constrained decoding algorithm to verify the generated text, calculates the parsing confidence based on the conditional probability of the valid key information tokens in the generated sequence, and encapsulates the parsing result into a set of candidate business elements containing parsing path identifiers and confidence scores.
[0012] Preferably, the verification fallback module includes a result integration and judgment unit and a business logic consistency detection unit; The result integration and judgment unit executes a priority-based weighted merging strategy: it prioritizes the adoption of the complete matching results of the rule parsing path; when there are null values in the rule parsing results, it only uses the large model parsing results for complementary filling when the confidence score of the large model parsing results is greater than the preset high confidence admission threshold. The business logic consistency detection unit, based on the bond trading business rule base, performs price range validity verification, term logic verification, and transaction direction mutual exclusion verification on the integrated set of elements, and generates an abnormal status identifier when a logical conflict or missing key element is detected.
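As a non-limiting illustration, the priority-based merging strategy of paragraph [0012] might be sketched as follows. The function name, the dictionary representation of an element set, and the default admission threshold of 0.9 are assumptions made for this sketch.

```python
def integrate(rule_result: dict, llm_result: dict,
              llm_confidence: float, admit_threshold: float = 0.9) -> dict:
    """Merge two candidate element sets, giving priority to the rule path.

    Values from the large-model result are used only to fill elements that
    the rule path left empty (None), and only when the large-model confidence
    exceeds the admission threshold (a hypothetical default here).
    """
    merged = dict(rule_result)
    if llm_confidence > admit_threshold:
        for key, value in llm_result.items():
            if merged.get(key) is None and value is not None:
                merged[key] = value
    return merged


rule = {"direction": "buy", "term": None, "price": 2.35}
llm = {"direction": "sell", "term": "3-4Y", "price": 2.40}
print(integrate(rule, llm, llm_confidence=0.95))
```

In this sketch a conflicting large-model value (here the direction) never overrides a value that the rule path has already matched.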
[0013] Preferably, the verification fallback module further includes a multi-level deterministic fallback unit, which, in response to the abnormal state identifier, sequentially activates the issuer set operation subunit and the edit distance fuzzy matching subunit. The issuer set operation subunit parses the logical structure of compound terms according to the logical operator mapping table, and performs union, difference, and intersection operations on atomic tag sets for terms representing parallel, exclusion, and restriction relationships, respectively, to reconstruct the target issuer ID set. The edit distance fuzzy matching subunit calculates the normalized Levenshtein distance between the input text fragment and the terms in the standard terminology library, and outputs the corrected standard value only when the similarity score is higher than the adaptive threshold set based on word length.
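As a non-limiting illustration, the edit distance fuzzy matching of paragraph [0013] might be sketched as follows. The normalization by the longer string length and the length-dependent threshold schedule in `adaptive_threshold` are assumptions made for this sketch; the disclosure does not specify either.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """1 minus the distance normalized by the longer length (assumed)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def adaptive_threshold(length: int) -> float:
    """Hypothetical schedule: a different cut-off for short and long fragments."""
    return 0.7 if length <= 4 else 0.8


def fuzzy_correct(fragment: str, lexicon: Iterable[str]) -> Optional[str]:
    """Return the closest standard term, or None if it is not similar enough."""
    best, best_score = None, 0.0
    for term in lexicon:
        score = similarity(fragment, term)
        if score > best_score:
            best, best_score = term, score
    if best is not None and best_score > adaptive_threshold(len(fragment)):
        return best
    return None
```

Returning `None` when no term clears the threshold keeps the repair deterministic: the fragment is either mapped to a known standard value or left flagged as abnormal.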
[0014] Preferably, the mapping generation module includes a structured query building unit, which uses abstract syntax tree construction technology to transform the set of compliant business elements into a standard query logic structure that includes a data projection clause, a data source location clause, a set of static constraints, a set of dynamic constraints, and a result sorting rule clause. The closed-loop optimization module includes a weight incremental update unit. The weight incremental update unit dynamically updates the matching confidence weight of the term dictionary using an exponentially weighted moving average algorithm based on the feedback signal generated by the user's confirmation or correction behavior of the parsing results. When the weight is lower than the elimination threshold, the corresponding mapping rule is marked as invalid.
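As a non-limiting illustration, the exponentially weighted moving average update of paragraph [0014] might be sketched as follows. The smoothing factor `alpha`, the encoding of a confirmation as 1.0 and a correction as 0.0, and the elimination threshold of 0.3 are assumptions made for this sketch.

```python
def update_weight(weight: float, confirmed: bool, alpha: float = 0.2) -> float:
    """EWMA step: blend the old weight with the latest feedback signal."""
    signal = 1.0 if confirmed else 0.0  # assumed encoding of user feedback
    return (1.0 - alpha) * weight + alpha * signal


def apply_feedback(weights: dict, term: str, confirmed: bool,
                   alpha: float = 0.2, elimination_threshold: float = 0.3) -> bool:
    """Update the weight of one term mapping in place.

    Returns True while the mapping remains valid and False once its weight
    has dropped below the (hypothetical) elimination threshold.
    """
    weights[term] = update_weight(weights[term], confirmed, alpha)
    return weights[term] >= elimination_threshold
```

With these placeholder values, a mapping that starts at weight 0.5 survives two consecutive corrections and is marked invalid on the third.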
[0015] The second aspect of this invention provides a dynamic hybrid parsing and deterministic verification method for fuzzy bond instructions, comprising the following steps: acquiring unstructured natural language bond trading instructions from users, and constructing structured preprocessed data containing lexical, syntactic, and entity features through financial-specific word segmentation and entity enhancement recognition processing; quantifying the multidimensional features of the instruction based on the structured preprocessed data, calculating a complexity score, and generating path selection instructions according to the dual threshold strategy; in response to the path selection instructions, executing a rule parsing step or a semantic reasoning parsing step to extract a set of candidate business elements, where the rule parsing step extracts elements based on regular pattern matching and static template filling, and the semantic reasoning parsing step extracts elements based on structured prompts and generative reasoning; performing confidence-weighted integration and logical consistency detection on the set of candidate business elements, and, when logical conflicts or missing key elements are detected, performing deterministic repair through term expansion mapping, issuer set operation, or edit distance matching to generate a set of compliant business elements; using abstract syntax tree (AST) construction technology to convert the set of compliant business elements into a structured query statement that conforms to the syntax of the target system, and driving the external transaction system to execute it; and collecting user feedback data on the parsing results, and dynamically adjusting the term mapping weights or routing decision thresholds based on the feedback data.
[0016] This invention provides a dynamic hybrid parsing and deterministic verification system and method for fuzzy bond instructions. It has the following beneficial effects: 1. This invention calculates complexity scores based on the structural depth, fuzziness density, and technical jargon of instructions, and then routes low-complexity instructions to the rule parsing path and highly ambiguous instructions to the large model parsing path. Combined with adaptive threshold adjustment logic based on memory utilization, the system can automatically adjust the decision threshold upwards under high-load scenarios, expanding the acceptance range of the rule path. This ensures the system's ability to understand long-tail complex instructions while reducing the response latency and hardware computational overhead of regular instructions, thus guaranteeing the core availability of the system under extreme concurrency environments.
[0017] 2. The hybrid parsing module of this invention combines the deterministic advantages of a rule engine with the generalization and understanding capabilities of a large model. The verification fallback module further integrates the multi-source output results with confidence weighting. For logical conflicts or missing elements that occur during the parsing process, the system uses issuer set operations and edit distance matching to forcibly repair abnormal data, ensuring that the final generated structured query statement strictly conforms to the business rules and database constraints of bond trading, and realizing the accurate conversion from unstructured fuzzy text to executable machine instructions.
[0018] 3. The closed-loop optimization module of this invention transforms user confirmation or correction operations into feedback signals, uses an exponentially weighted moving average algorithm to fine-tune the mapping weights of the terminology dictionary in real time, and dynamically adjusts the routing decision threshold based on periodic error rate statistics. This mechanism enables the system to automatically adapt to newly added industry jargon and user expression habits as business data accumulates, maintaining a high level of parsing accuracy and system robustness without frequent full-scale model training.
Attached Figure Description
[0019] Figure 1 is a system architecture diagram of the present invention; Figure 2 is a flowchart of the method steps of the present invention; Figure 3 is a performance comparison chart of a specific embodiment of the present invention; Figure 4 is a comparison chart of resource consumption and latency in a specific embodiment of the present invention.
[0020] Among them, 110 is the instruction preprocessing module; 120 is the routing decision module; 130 is the hybrid parsing module; 140 is the verification fallback module; 150 is the mapping generation module; and 160 is the closed-loop optimization module.
Detailed Implementation
[0021] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0022] Referring to Figure 1, which is a system architecture diagram according to an embodiment of the present invention, the present invention provides a dynamic hybrid parsing and deterministic verification system for fuzzy bond instructions, including an instruction preprocessing module 110, a routing decision module 120, a hybrid parsing module 130, a verification fallback module 140, a mapping generation module 150, and a closed-loop optimization module 160.
[0023] The instruction preprocessing module 110 is used to acquire unstructured natural language bond trading instructions input by the user. The instruction preprocessing module 110 constructs a sequence of lexical units for the instruction based on financial domain-specific word segmentation logic, determines the boundaries of professional terms, and converts non-standardized time and numerical descriptions into a standard format that the system can compute. The instruction preprocessing module 110 uses a domain entity dictionary to enhance the features of the text, marking bond type entities and issuer type entities, thereby constructing structured preprocessed data containing the original text, word segmentation sequence, entity tag list, and preliminary feature vectors.
[0024] The routing decision module 120, connected to the instruction preprocessing module 110, is used to construct a multi-dimensional feature space for instructions based on structured preprocessed data. This module 120 quantifies instruction attributes from three dimensions: structural features, semantic features, and business features, and calculates a complexity score representing the difficulty of instruction processing based on a preset weighting system. The routing decision module 120 generates path selection instructions based on a comparison between this complexity score and a preset dual-threshold strategy.
[0025] The hybrid parsing module 130, controlled by the path selection instructions generated by the routing decision module 120, integrates a rule parsing processing unit and a large model parsing processing unit. When a low-complexity path instruction is received, the hybrid parsing module 130 activates the rule parsing processing unit, directly extracting business elements based on a predefined regular expression pattern library and terminology mapping table. When a high-complexity path instruction is received, the module activates the large model parsing processing unit, constructing a reasoning context based on structured prompt word templates to extract business elements. When a medium-complexity path instruction is received, the module activates both processing units in parallel. Based on the above processing, the hybrid parsing module 130 constructs a set of candidate business elements including transaction direction, term, issuer, and price.
[0026] The verification fallback module 140 is used to perform quality assessment and logic repair on the candidate business element set output by the hybrid parsing module 130. The verification fallback module 140 calculates the confidence index of each candidate element and performs consistency detection based on bond trading business logic. When the confidence index is lower than the preset threshold or a business logic conflict is detected, the module starts the hierarchical fallback repair logic: by performing term expansion mapping, regular expression fallback matching, issuer set operations, and edit distance matching, it reconstructs or corrects the abnormal elements, thereby generating a compliant business element set that passes the consistency verification.
[0027] The mapping generation module 150 converts the set of compliant business elements output by the verification fallback module 140 into executable system instructions. Based on a pre-set standard dictionary for bond trading systems, the mapping generation module 150 maps the compliant business elements to corresponding standard parameter values, and fills in missing fields with default values and performs unit normalization. Using the standardized parameter set, the mapping generation module 150 dynamically assembles a structured query statement that conforms to the target system's syntax rules, and drives the external bond trading system to execute the query operation through a standardized interface.
[0028] The closed-loop optimization module 160 works in conjunction with the mapping generation module 150 to construct the system's adaptive evolution mechanism. The closed-loop optimization module 160 presents the parsing results on the interactive interface and captures the user's confirmation, correction, or evaluation operations on the parsing results. The closed-loop optimization module 160 converts the user's interactive behavior data into parameter update instructions, and adjusts the weight values in the terminology mapping table or corrects the rule thresholds in the verification fallback module 140 in real time, thereby realizing the dynamic updating and accuracy improvement of the system's knowledge base.
[0029] Referring to Figure 2, which is a flowchart of a method according to an embodiment of the present invention, the present invention provides a method for dynamic hybrid parsing and deterministic verification of fuzzy bond instructions, comprising the following steps:
S1, the instruction preprocessing module 110 obtains unstructured instructions, uses the maximum matching algorithm combined with a financial dictionary to perform word segmentation, identifies professional terminology boundaries and compound terms, and converts non-standard values into standard values;
S2, the routing decision module 120 quantifies the feature dimensions of the instruction, calculates the complexity score, and sends the instruction to the rule parsing path, the large model parsing path, or both paths simultaneously according to the preset dual threshold strategy;
S3, the hybrid parsing module 130 performs the parsing task: the rule parsing path uses regular patterns and term mapping tables to extract elements, while the large model parsing path uses structured prompt word templates and deep learning models to extract elements;
S4, the verification fallback module 140 calculates the confidence level of each path result, performs weighted integration, and detects the consistency of key business elements;
S5, if a key element is detected to be missing or the logic verification fails, the verification fallback module 140 executes deterministic fallback logic, including term expansion mapping, regular expression fallback, issuer set operation, and edit distance matching;
S6, the mapping generation module 150 converts intermediate elements into standard values that can be executed by the system, and handles missing values and unit conversions;
S7, the mapping generation module 150 dynamically generates structured query statements based on the mapped standard elements, and calls the external bond trading system through a standardized interface;
S8, the closed-loop optimization module 160 displays the parsed elements and confidence levels to the user, captures the user's confirmation or modification of the results, updates the term mapping weights in real time, and adjusts the rule parameters;
S9, the entire operational trajectory data is recorded for performance analysis, accuracy analysis, and business analysis.
[0030] The instruction preprocessing module 110 serves as the system's input data processing unit, receiving natural language text and converting it into serialized data containing lexical, syntactic, and semantic features. This instruction preprocessing module 110 includes a financial domain-specific word segmentation unit, a numerical time normalization unit, and an entity enhancement recognition unit.
[0031] The financial domain-specific word segmentation unit is configured with a word segmentation engine based on the bidirectional maximum matching algorithm. This engine is connected to a preset bond field dictionary database, which stores no fewer than 3,500 entries and includes "institutional abbreviations" (such as Nongfa, Jinchu), "bond variety abbreviations" (such as secondary perpetual, special national debt), and "trading jargon". When the word segmentation engine performs word segmentation, it loads the above dictionary and scans the input text stream. For character combinations such as "3-4Y" in the text, the engine identifies the combination as an indivisible single lexical unit through preset regular-expression protection rules; for compound terms such as "big banks' state-owned shares", the engine identifies the term as a compound phrase with parallel logic based on the hierarchical tags in the dictionary and retains its internal structure information.
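As a non-limiting illustration, bidirectional maximum matching as named in paragraph [0031] might be sketched as follows. The rule for choosing between the forward and backward results (fewer tokens first, then fewer single-character tokens, then the backward result) is a commonly used heuristic assumed for this sketch; the regular-expression protection rules and hierarchical tags are not modelled, and the Latin letters in the example merely stand in for dictionary characters.

```python
def forward_mm(text: str, vocab: set, max_len: int) -> list:
    """Greedy left-to-right scan, always taking the longest dictionary match."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:  # single characters always pass
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_mm(text: str, vocab: set, max_len: int) -> list:
    """Greedy right-to-left scan, always taking the longest dictionary match."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    return tokens[::-1]


def bidirectional_mm(text: str, vocab: set) -> list:
    """Run both scans and keep the result preferred by the assumed heuristic."""
    max_len = max(map(len, vocab), default=1)
    fwd = forward_mm(text, vocab, max_len)
    bwd = backward_mm(text, vocab, max_len)

    def cost(tokens):
        return (len(tokens), sum(len(t) == 1 for t in tokens))

    return fwd if cost(fwd) < cost(bwd) else bwd


print(bidirectional_mm("abcd", {"ab", "abc", "cd"}))
```

In the toy example the forward scan yields `abc` plus a stray single character, while the backward scan yields two dictionary words, so the backward result is kept.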
[0032] The numerical time normalization unit is used to normalize non-standard expressions in the word segmentation sequence. The numerical time normalization unit internally stores a unit mapping table and a reference trading calendar. For numerical expressions, the unit identifies numerical strings in the instruction that contain suffixes (such as "e", "kw") and calculates the standard numerical value using the following formula: $V_{std} = V_{base} \times K_{unit}$; in the formula, $V_{std}$ represents the converted standard numerical value; $V_{base}$ represents the basic digital part extracted from the instruction text; $K_{unit}$ is the magnification coefficient matched in the unit mapping table according to the suffix character, where the value is $10^{8}$ when the suffix is "e" or "hundred million", $10^{4}$ when the suffix is "w" or "ten thousand", and $10^{7}$ when the suffix is "kw" or "ten million".
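As a non-limiting illustration, the suffix-based conversion of paragraph [0032] (base number multiplied by the coefficient from the unit mapping table) might be sketched as follows. Only the Latin suffixes named in the text are handled; the function name, the regular expression, and the error handling are assumptions made for this sketch.

```python
import re

# Magnification coefficients from the unit mapping table in the text.
UNIT_MULTIPLIERS = {"e": 10**8, "kw": 10**7, "w": 10**4}

# "kw" is listed before "w" so that the longer suffix is tried first.
_AMOUNT = re.compile(r"^(\d+(?:\.\d+)?)(kw|e|w)$", re.IGNORECASE)


def normalize_amount(token: str) -> float:
    """Convert a token such as '5e' or '3kw' into a plain numeric value."""
    match = _AMOUNT.match(token.strip())
    if match is None:
        raise ValueError(f"not a suffixed amount: {token!r}")
    base, suffix = match.groups()
    return float(base) * UNIT_MULTIPLIERS[suffix.lower()]


print(normalize_amount("5e"))   # 500000000.0
print(normalize_amount("3kw"))  # 30000000.0
```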
[0033] For time expressions, the unit performs an absolute time deduction based on working-day logic. If the instruction contains relative time words such as "next March", "next Friday", or "T+1", the unit obtains the current reference working day of the system $T_{ref}$, combines it with the preset inter-bank market holiday data, and calculates the specific date range $[T_{start}, T_{end}]$. For example, when "T+1" is recognized and the current day is a Friday, the system automatically skips Saturday and Sunday and sets the target date to the following Monday.
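As a non-limiting illustration, the "T+n" deduction of paragraph [0033] might be sketched as follows. The function name and the representation of holiday data as a set of dates are assumptions made for this sketch, and weekend make-up working days are not modelled.

```python
from datetime import date, timedelta


def add_business_days(start: date, n: int, holidays=frozenset()) -> date:
    """Return the date n working days after start.

    Saturdays, Sundays, and any date in `holidays` (a hypothetical stand-in
    for the inter-bank market holiday data) are skipped.
    """
    current, remaining = start, n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


friday = date(2026, 3, 20)
print(add_business_days(friday, 1))  # 2026-03-23, the following Monday
```

Passing the Monday as a holiday moves the result to Tuesday, which mirrors how the reference trading calendar would shift the target date.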
[0034] The entity enhancement recognition unit is equipped with a pre-trained sequence labeling model. This model adopts a BERT-BiLSTM-CRF network architecture, taking a standardized word segmentation sequence as input and outputting the entity label probability distribution for each word. The model is fine-tuned with supervision on financial corpora, and its label set includes, but is not limited to, "issuer type," "bond type," "maturity characteristics," and "price characteristics." The entity enhancement recognition unit determines the entity label for each word based on the probability maximization principle.
[0035] The instruction preprocessing module 110 is also equipped with a feature vector construction unit for generating feature vectors that support complexity evaluation. The unit gathers the structural and semantic indicators of the instruction into a feature vector, as shown in the following formula: $F = [L_{norm}, R_{pro}, \rho_{fuzzy}, V_{sem}]^{T}$; in the formula, $L_{norm}$ indicates the total number of lexical units after word segmentation (normalized value); $T$ indicates the vector transpose sign; $R_{pro}$ is the professionalism ratio, calculated as the number of identified domain dictionary terms divided by the total number of lexical units, $R_{pro} = N_{term} / N_{token}$; $\rho_{fuzzy}$ is the fuzzy expression density, given by $\rho_{fuzzy} = N_{fuzzy} / N_{token}$, in which $N_{fuzzy}$ is the number of tokens in the instruction that match a preset fuzzy vocabulary (containing "around", "nearby", "approximately", "short duration"); $V_{sem}$ is the semantic feature vector, a fixed-dimensional vector (e.g., 768-dimensional) or a low-dimensional vector (e.g., 32-dimensional) obtained by average pooling the top-level hidden state vectors of BERT output by the entity enhancement recognition unit, and is used to characterize the semantic density of the entire instruction.
[0036] The instruction preprocessing module 110 outputs a structured data packet through a data interface. This data packet contains the original text string, the word segmentation and part-of-speech tagging sequences, a named entity list, and the feature vector calculated above.
[0037] The routing decision module 120 is connected to the data output of the instruction preprocessing module 110. It is configured to pre-evaluate the processing cost of instructions and dynamically allocate computational paths based on the evaluation results. This routing decision module 120 maps unstructured text features to a high-dimensional feature space and uses a weighted computation model to quantify the parsing difficulty of instructions, thereby establishing a dynamic balance between low-latency rule matching and high-precision model inference. The routing decision module 120 includes a multi-dimensional feature extraction unit, a complexity quantification computation unit, and a dynamic threshold decision unit.
[0038] The multidimensional feature extraction unit is used to extract feature vectors representing the difficulty of instruction processing from structured data packets. Specifically designed for bond trading instructions, this multidimensional feature extraction unit constructs a ternary feature space comprising structural feature dimensions, semantic feature dimensions, and business feature dimensions.
[0039] In terms of structural features, this multidimensional feature extraction unit calculates the normalized syntactic depth of the instruction, calls a lightweight dependency parser (e.g., a parser based on the Arc-Eager algorithm) to generate the dependency syntactic tree of the instruction, and calculates the maximum depth of the tree. This depth indicator reflects the nesting level of modification relationships (such as noun-head relationships and adverbial-head relationships) in instructions. The deeper the level, the more complex the logical structure.
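As a non-limiting illustration, the maximum depth of a dependency tree as used in paragraph [0039] might be computed as follows once a parser has produced head indices. The head-array representation, the convention that the root has depth 1, the benchmark value of 10, and the clamping of the normalized value to 1.0 are assumptions made for this sketch.

```python
def tree_depth(heads: list) -> int:
    """Maximum depth of a dependency tree given as a head-index array.

    heads[i] is the index of the head of token i, or -1 for the root.
    The root counts as depth 1 (an assumed convention); the input is
    assumed to be a valid tree without cycles.
    """
    memo = {}

    def depth(i: int) -> int:
        if i not in memo:
            memo[i] = 1 if heads[i] < 0 else depth(heads[i]) + 1
        return memo[i]

    return max((depth(i) for i in range(len(heads))), default=0)


def normalized_depth(heads: list, d_std: float = 10.0) -> float:
    """Depth divided by a benchmark depth, clamped to 1.0 (assumed)."""
    return min(tree_depth(heads) / d_std, 1.0)


# Token 1 is the root; tokens 0 and 2 attach to it; token 3 attaches to token 2.
print(tree_depth([1, -1, 1, 2]))  # 3
```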
[0040] In terms of semantic features, this multi-dimensional feature extraction unit calculates the density of fuzzy expression patterns. Internally, the unit stores a fuzzy lexicon specific to bond trading, containing no fewer than 200 lexical units that express uncertainty, including terms indicating approximate relationships such as "around" and "nearby," terms indicating interval relationships such as "above" and "below," and terms indicating non-rigid demand such as "seeing out" and "valuation." The multi-dimensional feature extraction unit traverses the word segmentation sequence and counts the total frequency of these fuzzy words.
[0041] In terms of business characteristics, this multi-dimensional feature extraction unit calculates a jargon professionalism index. This unit maintains a terminology hierarchy mapping table, classifying bond terms into different levels and assigning them weights. For example, the standardized term "21 Treasury Bond 01" is defined as a Level 1 term (weight 0.2), "State-owned Stock" as a Level 2 term (weight 0.5), "Eryong" and "T+0" as Level 3 terms (weight 0.8), and "Implicit Rating" as a Level 4 term (weight 1.0). This multi-dimensional feature extraction unit accumulates the weights of all terms in the instruction and performs normalization to generate the jargon professionalism index.
[0042] The complexity quantification computation unit is configured to fuse the aforementioned multi-dimensional features into a single complexity score. The unit employs a linear weighted evaluation model, which assigns each feature dimension a contribution proportion based on a pre-computed statistical distribution of feature importance. Its calculation logic is shown in the following equation: $C = w_1 \cdot \frac{d}{d_{ref}} + w_2 \cdot D_{fuzzy} + w_3 \cdot J$; in the formula, $C$ denotes the overall complexity score of the instruction, whose value ranges from 0 to 1; $d$ denotes the depth of the dependency syntax tree of the instruction; $d_{ref}$ denotes a preset standardized depth benchmark value, determined from the 95th quantile of the statistical distribution of historical bond instruction sets; in a preferred embodiment, $d_{ref}$ ranges from 8 to 12; $D_{fuzzy}$ denotes the density of fuzzy expression patterns, whose value is the ratio of the number of fuzzy words in the instruction to the total number of lexical units; $J$ denotes the jargon professionalism index, which is the normalized average weight of the terms; $w_1$, $w_2$, and $w_3$ are the weight coefficients for structural features, semantic features, and business features, respectively, and their values are determined by the contribution of each feature to the parsing failure rate; in a preferred embodiment, $w_1$ ranges from 0.20 to 0.25, $w_2$ ranges from 0.25 to 0.50, and $w_3$ ranges from 0.25 to 0.30, and the weights satisfy $w_1 + w_2 + w_3 = 1$.
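The linear weighted evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the embodiment's implementation: the function name, the default weights (0.25, 0.45, 0.30, chosen inside the stated ranges so that they sum to 1), and the clipping of the depth ratio at 1.0 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    # Hypothetical defaults picked inside the ranges quoted in the text.
    structure: float = 0.25
    fuzzy: float = 0.45
    jargon: float = 0.30


def complexity_score(depth, depth_ref, n_fuzzy, n_tokens, jargon_index,
                     w=Weights()):
    """Fuse structural, semantic and business features into one score."""
    if n_tokens <= 0 or depth_ref <= 0:
        raise ValueError("n_tokens and depth_ref must be positive")
    # Assumption: depths beyond the benchmark are clipped so the score
    # stays within [0, 1]; the source does not state how they are handled.
    depth_term = min(depth / depth_ref, 1.0)
    fuzzy_density = n_fuzzy / n_tokens
    return (w.structure * depth_term
            + w.fuzzy * fuzzy_density
            + w.jargon * jargon_index)
```

With a depth of 4 against a benchmark of 10, 4 fuzzy words out of 10 tokens, and a jargon index of 0.5, the sketch yields 0.25·0.4 + 0.45·0.4 + 0.30·0.5 = 0.43.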
[0043] The dynamic threshold decision unit is configured to generate a path selection instruction based on the complexity score $C$. This unit has a preset dual-threshold judgment logic, comprising a low-complexity threshold $\theta_{low}$ and a high-complexity threshold $\theta_{high}$. When the system is under standard load, the value range of $\theta_{low}$ is set to [value range missing], with a preferred value of 0.35; the value range of $\theta_{high}$ is set to [value range missing], with a preferred value of 0.65. The decision unit performs the following comparison logic: when $C < \theta_{low}$, the instruction is determined to be a standardized instruction and a first path instruction is generated, which controls the system to activate only the rule parsing processing unit in the hybrid parsing module 130 to achieve millisecond-level response. When $C > \theta_{high}$, the instruction is determined to be a highly ambiguous and complex instruction and a second path instruction is generated, which controls the system to activate only the large model parsing processing unit in the hybrid parsing module 130 to ensure accurate semantic understanding of complex logic. When $\theta_{low} \le C \le \theta_{high}$, the instruction is determined to be of medium complexity and a third path instruction is generated, which controls the parallel activation of the rule parsing processing unit and the large model parsing processing unit and initiates the subsequent result integration process.
[0044] Furthermore, the dynamic threshold decision unit is equipped with resource-aware adaptive adjustment logic. This unit collects system performance metrics in real time, including queries per second (QPS) and GPU memory utilization. When it detects that GPU memory utilization has stayed above a preset warning value (e.g., 85%) for longer than a preset time window (e.g., 10 seconds), the dynamic threshold decision unit automatically triggers a degradation protection mechanism, raising the low-complexity threshold and the high-complexity threshold by a preset step size (e.g., 0.05). Raising the thresholds widens the range judged to be low complexity, so that more borderline instructions are forcibly diverted to the rule parsing path with its lower computational overhead. This prevents service overload and outage in extreme concurrency scenarios and preserves the core availability of the system.
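The dual-threshold routing and the degradation protection mechanism can be sketched as a small Python class. The class and method names are hypothetical, the boundary handling (strict `<` and `>`) is an assumption, and the sketch raises the thresholds only once; the source does not describe how or when they are restored.

```python
from enum import Enum


class Path(Enum):
    RULE = "rule"          # first path: rule parsing only
    LLM = "llm"            # second path: large model parsing only
    PARALLEL = "parallel"  # third path: both, then result integration


class ThresholdRouter:
    def __init__(self, low=0.35, high=0.65, step=0.05,
                 warn_util=0.85, window_s=10.0):
        self.low, self.high, self.step = low, high, step
        self.warn_util, self.window_s = warn_util, window_s
        self._over_since = None   # time when utilization first exceeded warn_util
        self._degraded = False    # assumption: thresholds are raised only once

    def route(self, score):
        if score < self.low:
            return Path.RULE
        if score > self.high:
            return Path.LLM
        return Path.PARALLEL

    def observe_gpu(self, util, now):
        """Feed one GPU memory utilization sample taken at time `now` (seconds)."""
        if util <= self.warn_util:
            self._over_since = None
            return
        if self._over_since is None:
            self._over_since = now
        elif now - self._over_since > self.window_s and not self._degraded:
            self.low = min(self.low + self.step, 1.0)
            self.high = min(self.high + self.step, 1.0)
            self._degraded = True
```

After sustained pressure, an instruction scoring 0.37 that would previously have gone to the parallel path is routed to the rule path.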
[0045] The hybrid parsing module 130 is connected to the routing decision module 120 and performs the actual semantic extraction of instructions. The hybrid parsing module 130 adopts the technical principle of "deterministic rules first, generative models as a complement." It uses a rule engine to process high-frequency instructions with fixed structures, ensuring accuracy and low latency, and a large model engine to process semantically complex long-tail instructions, achieving generalized understanding. The hybrid parsing module 130 consists of a rule parsing processing unit and a large model parsing processing unit, which work independently or in parallel under the control of the routing instructions and ultimately output a set of candidate business elements with a unified data structure.
[0046] The rule parsing and processing unit handles standardized instructions deemed to be of low complexity. This unit integrates a regular expression pattern matching engine and a static template filling engine. The regular expression pattern matching engine stores a hierarchical library of regular expressions ordered by business priority. This library includes "trading direction patterns," "term expression patterns," "price constraint patterns," and "combination patterns." The engine scans the input instruction text, extracting substrings that conform to specific syntactic structures. To resolve conflicts arising from multiple rule matches, the regular expression pattern matching engine is configured with a "longest match preferred" strategy. This means that when the same text fragment simultaneously satisfies multiple regular expression patterns, the pattern with the most covered characters is prioritized for parsing.
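The "longest match preferred" conflict resolution can be illustrated with Python's `re` module. The four patterns below are simplified English stand-ins invented for this sketch; the actual pattern library is not given in the text. Candidates are sorted by covered length (ties broken by list order, standing in for business priority), and any candidate overlapping an already accepted span is discarded.

```python
import re

# Hypothetical patterns, listed in priority order.
PATTERNS = [
    ("direction", re.compile(r"\b(buy|sell)\b", re.I)),
    ("term", re.compile(r"\b\d+(\.\d+)?\s*(Y|M|D)\b", re.I)),
    ("price", re.compile(r"\b\d+(\.\d+)?\s*%")),
    ("term_range", re.compile(
        r"\b\d+(\.\d+)?\s*(Y|M|D)\s*-\s*\d+(\.\d+)?\s*(Y|M|D)\b", re.I)),
]


def extract(text):
    """Return (pattern_name, fragment) pairs, preferring the longest match."""
    candidates = []
    for priority, (name, pattern) in enumerate(PATTERNS):
        for m in pattern.finditer(text):
            candidates.append((m.start(), m.end(), priority, name, m.group(0)))
    # Longest span first; equal lengths fall back to pattern priority.
    candidates.sort(key=lambda c: (-(c[1] - c[0]), c[2]))
    chosen = []
    for start, end, _, name, fragment in candidates:
        if all(end <= s or start >= e for s, e, _, _ in chosen):
            chosen.append((start, end, name, fragment))
    chosen.sort()  # restore left-to-right order
    return [(name, fragment) for _, _, name, fragment in chosen]
```

For the input `"Buy 1Y-2Y bonds at 2.5%"`, the fragment `1Y-2Y` satisfies both the term pattern (twice) and the range pattern; the range pattern covers more characters and is therefore the one kept.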
[0047] The static template filling engine is configured with a set of business intent slot templates, which define the skeleton topology of standard instructions and the positions of the variables to be filled. In one specific implementation, the general inquiry template is defined as a sequence of semantic slots in a preset order, consisting of a "transaction direction slot", an "issuer attribute slot", a "term feature slot", and a "price constraint slot" in sequence. The engine uses keyword anchoring to map the input text to the template with the highest matching degree and fills the text fragments at the corresponding positions into the semantic slots defined above. The rule parsing processing unit calls a financial terminology dictionary to map the extracted text fragments (such as "major banks") to the standard codes within the system. This dictionary uses a hash table structure that stores industry jargon as keys and a structure containing the full name, standard code, and entity type as values, ensuring that the time complexity of each lookup in the mapping process is $O(1)$.
[0048] The large model parsing processing unit is used to process ambiguous instructions deemed to be highly complex. This unit is built upon an instruction-fine-tuned generative pre-trained Transformer model and includes a prompt construction subunit, an inference generation subunit, and a format constraint subunit. The prompt construction subunit is configured to encapsulate natural language instructions into a structured input prompt. It employs a retrieval-augmented few-shot learning strategy, and its prompt construction logic is shown in the following formula: $P = I_{sys} \oplus I_{task} \oplus E_{K}(Q) \oplus I_{fmt} \oplus Q$; in the formula, $\oplus$ denotes a concatenation operation on text strings or vectors; $I_{sys}$ denotes the system role definition instruction, which sets the model as a "Bond Trading Analysis Assistant" with the output constraint "no hallucination"; $I_{task}$ denotes the task description instruction, which specifies the element fields to be extracted (such as transaction direction, remaining maturity range, issuer and its attributes, yield range); $E_{K}(Q)$ denotes the output of the dynamic example retrieval function. This function first uses a lightweight semantic embedding model to encode the current user instruction $Q$ as a query vector $v_q$, and then retrieves from the preset historical instruction vector database the top $K$ historical instructions by cosine similarity, using them and their standard answers as contextual examples. The lightweight semantic embedding model employs a Sentence-BERT structure based on a Siamese network architecture; it includes two BERT encoders with shared parameters, configured to map variable-length natural language text to a fixed-dimensional dense vector space. In a preferred embodiment, $K$ takes a value between 3 and 5, and the retrieval threshold for cosine similarity is set to 0.75; $I_{fmt}$ denotes the output format constraint instruction, which forces the model to output only data that conforms to a predefined JSON Schema; $Q$ denotes the user's natural language instruction currently being parsed.
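The retrieval and concatenation steps can be sketched in plain Python, assuming the embedding vectors have already been computed by some sentence embedding model (the embedding call itself is omitted here). The function names, the tuple layout of the example store, and the prompt layout are illustrative assumptions only.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_examples(query_vec, store, k=3, min_sim=0.75):
    """store holds (vector, instruction_text, standard_answer) tuples."""
    scored = [(cosine(query_vec, vec), text, answer)
              for vec, text, answer in store]
    scored = [s for s in scored if s[0] >= min_sim]   # similarity threshold
    scored.sort(key=lambda s: s[0], reverse=True)     # most similar first
    return scored[:k]


def build_prompt(system, task, examples, fmt, user_instruction):
    """Concatenate the five prompt parts in the order described above."""
    shots = "\n".join(f"Instruction: {text}\nAnswer: {answer}"
                      for _, text, answer in examples)
    return "\n\n".join([system, task, shots, fmt, user_instruction])
```

Filtering by the 0.75 threshold before truncating to `k` means that a query with no sufficiently similar history simply produces a prompt without examples rather than misleading ones.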
[0049] The inference generation subunit receives the constructed prompt and performs inference. For instructions containing multiple levels of nested logic, the inference generation subunit injects a chain-of-thought trigger (e.g., "Please think step by step") into the prompt. This trigger causes the large model to generate intermediate reasoning steps before generating the final JSON. To make the output as deterministic as possible, the sampling temperature of the inference generation subunit is set to 0.0–0.1 and the nucleus sampling (top-p) parameter is set to 0.95 during inference, suppressing the randomness of the model.
[0050] The format constraint subunit is used to validate and clean the text generated by inference and to calculate the parsing confidence. This subunit loads a predefined JSON Schema description file, uses a grammar-constrained decoding algorithm to ensure the legality of the output structure, and calculates the parsing confidence from the probability distribution of the tokens generated by the model. The calculation formula is as follows: $Conf = \exp\left(\frac{1}{N}\sum_{i=1}^{N}\log P(t_i \mid x, t_{<i})\right)$; in the formula, $Conf$ denotes the confidence score of the parsing result of the instruction, with a value range of $[0, 1]$; $N$ denotes the total number of valid key information tokens (excluding structural characters such as curly braces and quotation marks) in the generated sequence; $t_i$ denotes the $i$-th generated token; $P(t_i \mid x, t_{<i})$ denotes the conditional probability that the model generates the current token $t_i$ given the input $x$ and the preceding tokens $t_{<i}$; $\exp$ denotes the exponential function; $\log$ denotes the logarithmic function; $t_{<i}$ denotes the preceding token sequence. By taking the arithmetic mean of the log-probabilities of the generated sequence and then exponentiating, the formula removes the influence of generation length on the confidence score.
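The length-normalized confidence described above is the geometric mean of the per-token probabilities. A minimal Python sketch follows; the set of structural tokens and the function name are assumptions, and the per-token log-probabilities are taken as given inputs rather than read from any particular model API.

```python
import math

# Assumed set of structural tokens to exclude from the average.
STRUCTURAL = {"{", "}", "[", "]", ":", ",", '"'}


def parsing_confidence(tokens, logprobs):
    """exp(mean(log p)) over the non-structural generated tokens."""
    kept = [lp for tok, lp in zip(tokens, logprobs)
            if tok.strip() not in STRUCTURAL]
    if not kept:
        return 0.0  # nothing informative was generated
    return math.exp(sum(kept) / len(kept))
```

For two content tokens with probabilities 0.5 and 0.125, the score is sqrt(0.5 × 0.125) = 0.25, regardless of how many braces or quotation marks surround them.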
[0051] Both the rule parsing processing unit and the large model parsing processing unit encapsulate their parsing results into a unified intermediate business object (IBO). This object contains four core fields: "original instruction ID," "parsing path identifier (Rule / LLM)," "extracted feature key-value pairs," and "parsing confidence." For the rule parsing path, the parsing confidence defaults to 1.0 if a match is successful and to 0.0 if no match is found. The IBO is then transmitted to the subsequent verification fallback module 140 via shared memory or a message queue.
[0052] The verification fallback module 140 is connected to the hybrid parsing module 130 and is used to transform the probabilistic intermediate business objects generated upstream into deterministic compliance instructions. Based on the technical principles of "confidence-weighted integration" and "logical constraint convergence," the verification fallback module 140 eliminates random errors from a single parsing path through cross-validation of multi-source results and uses a domain knowledge base to perform deterministic repair of abnormal data. The verification fallback module 140 includes a result integration judgment unit, a business logic consistency detection unit, and a multi-level deterministic fallback unit.
[0053] The result integration judgment unit receives and merges candidate elements from both the rule parsing path and the large model parsing path. This unit executes a priority-based weighted merging strategy. When the system receives input containing both the set of rule parsing results $R_{rule}$ and the set of large model parsing results $R_{llm}$, the result integration judgment unit first evaluates the matching status of $R_{rule}$.
[0054] If the matching status of $R_{rule}$ is marked as "complete match", the result integration judgment unit directly adopts $R_{rule}$ as the baseline result. If $R_{rule}$ contains fields with null values or its matching status is "partial match", the result integration judgment unit triggers the complementary filling logic: for a target field $f$, if $R_{rule}[f]$ is empty and $R_{llm}[f]$ is not empty, the unit reads the confidence score $Conf$ of the large model result; only when $Conf$ is greater than the preset high-confidence threshold $\tau_{high}$ is $R_{llm}[f]$ written into the final result. In a preferred embodiment, $\tau_{high}$ ranges from 0.80 to 0.90, with a preferred value of 0.85. This value is determined from the confidence quantile at which accuracy reached 99% on the historical test set.
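The complementary filling logic can be sketched as a small merge function. The dictionary representation of the two result sets, the status strings, and the function name are assumptions made for illustration.

```python
def integrate(rule_result, rule_status, llm_result, llm_conf, tau_high=0.85):
    """Merge rule and large-model results, with rule values taking priority."""
    if rule_status == "complete":
        return dict(rule_result)       # adopt the rule result as baseline
    merged = dict(rule_result)
    for field, value in llm_result.items():
        rule_value_missing = merged.get(field) is None
        # Fill a gap only with a non-empty, high-confidence model value.
        if rule_value_missing and value is not None and llm_conf > tau_high:
            merged[field] = value
    return merged
```

Note that a value already extracted by the rule path is never overwritten, and that a model result at or below the threshold leaves the gap in place for the downstream consistency check to flag.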
[0055] The business logic consistency detection unit is used to perform compliance verification on the integrated business elements. Internally, this unit stores a bond trading business rule base, which contains no fewer than 50 hard-constraint logic rules in the form of condition-action pairs. The unit iterates through the integrated element set and performs the following specific checks. First, it performs a price range validity check, verifying by inequality that the yield range is legal, in which a preset upper limit for abnormal yields (e.g., 50%) is used to filter out numerical anomalies caused by unit errors (e.g., mistaking BP for %). Second, it performs term logic verification, checking the matching relationship between bond type and remaining term. For example, when the bond type is identified as "ultra-short-term financing bond," an extracted term value greater than 270 days is determined to be a logical conflict. Third, it performs transaction direction mutual exclusion verification, checking whether semantically mutually exclusive action enumeration values coexist within the same instruction (such as "buy" and "sell" appearing together). When a logical conflict or a missing key element (such as issuer or price) is detected, this unit generates an abnormal state flag, driving the data flow into the multi-level deterministic fallback unit.
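The three checks and the missing-element test can be sketched as one function that returns a list of issue flags. The field names, the flag strings, and the choice of required fields are hypothetical; the exact inequality used by the rule base is not recoverable from the text, so the sketch assumes `min <= max <= cap`.

```python
def check_consistency(elements, yield_cap=0.50,
                      required=("issuer", "yield_max")):
    """Return a list of issue flags; an empty list means the set is consistent."""
    issues = []
    lo, hi = elements.get("yield_min"), elements.get("yield_max")
    # Assumed form of the range check: ordered bounds below the anomaly cap.
    if lo is not None and hi is not None and not (lo <= hi <= yield_cap):
        issues.append("yield_range")
    if (elements.get("bond_type") == "ultra-short-term financing bond"
            and (elements.get("term_days") or 0) > 270):
        issues.append("term_conflict")
    if {"buy", "sell"} <= set(elements.get("directions", ())):
        issues.append("direction_conflict")
    issues += [f"missing_{key}" for key in required
               if elements.get(key) is None]
    return issues
```

A yield of 2.45 stored where 0.0245 was meant (a percent value in a decimal field) trips the cap and is flagged rather than silently passed on.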
[0056] The multi-level deterministic fallback unit is used to execute a tiered repair strategy. This multi-level deterministic fallback unit activates the issuer set operation subunit and the edit distance fuzzy matching subunit in sequence according to the timing logic of "set operation first, fuzzy matching later".
[0057] The issuer set operation subunit is used to handle the logical parsing of composite entities. This subunit connects to a tagged issuer database, in which each issuer entity record is associated with a multi-dimensional set of atomic tags. Internally, this subunit maintains a logical operator mapping table that maps connectives in natural language to set operators (e.g., mapping "not" and "excluding" to the difference operator, and "and" and "both" to the union operator). When a composite term instruction is received, this subunit parses the logical structure according to the operator mapping table and performs set operations: for terms representing parallel relationships (such as "large state-owned shares"), the issuer set operation subunit performs a union operation: $S_{target} = S_A \cup S_B$; for terms representing exclusion relationships (such as "non-bank institutions"), it performs a difference operation: $S_{target} = S_A \setminus S_B$; for terms representing qualifying relationships (such as "AAA-rated state-owned enterprises"), it performs an intersection operation: $S_{target} = S_A \cap S_B$; in the formulas, $S_{target}$ denotes the set of target issuer IDs obtained from the calculation; $S_A$ and $S_B$ are both basic sets retrieved from the database based on atomic tags; $S_A$ denotes the basic set retrieved based on the first atomic tag; $S_B$ denotes the basic set retrieved based on the second atomic tag.
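The three set operations map directly onto Python's built-in `set` type. In the sketch below the tag names and issuer IDs are placeholder sample data, and the `"all"` tag standing in for the full issuer universe is an assumption used to express "non-bank" as a difference.

```python
# Placeholder tag index: atomic tag -> set of issuer IDs.
TAG_INDEX = {
    "large_state_owned_bank": {"B1", "B2"},
    "joint_stock_bank": {"B3"},
    "bank": {"B1", "B2", "B3"},
    "all": {"B1", "B2", "B3", "S1", "E1", "E2"},
    "aaa": {"B1", "E1"},
    "state_owned_enterprise": {"E1", "E2"},
}

OPERATORS = {
    "union": set.union,                # parallel relationship
    "difference": set.difference,      # exclusion relationship
    "intersection": set.intersection,  # qualifying relationship
}


def resolve(operator, first_tag, second_tag):
    """Apply a set operator to the base sets of two atomic tags."""
    return OPERATORS[operator](TAG_INDEX[first_tag], TAG_INDEX[second_tag])
```

For example, `resolve("difference", "all", "bank")` returns the non-bank issuers and `resolve("intersection", "aaa", "state_owned_enterprise")` returns the AAA-rated state-owned enterprises in the sample data.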
[0058] The edit distance fuzzy matching subunit is used to correct spelling errors or non-standard expressions. This subunit maps input text fragments to a standard terminology database based on a string similarity algorithm. To balance matching accuracy for both long and short words, the edit distance fuzzy matching subunit uses the normalized Levenshtein distance as its metric, calculated as follows: $Sim(s_1, s_2) = 1 - \frac{Lev(s_1, s_2)}{\max(|s_1|, |s_2|)}$; in the formula, $Sim(s_1, s_2)$ denotes the similarity score between the input string $s_1$ and the standard string $s_2$, with a value range of $[0, 1]$; $Lev(s_1, s_2)$ denotes the Levenshtein edit distance function, which calculates the minimum number of insertion, deletion, and substitution operations required to transform one string into the other; $|s_1|$ and $|s_2|$ denote the character lengths of the two strings, respectively.
[0059] This edit distance fuzzy matching subunit is configured with adaptive length threshold determination logic to decide whether a similarity score is acceptable. This logic sets a stepped fuzzy matching acceptance threshold according to word length. For short words (which are prone to ambiguity), whose length bound is [value missing], the threshold is set to 1.0 (i.e., forced exact match); for medium-length words and for long words, separate thresholds are set [values missing]. Only when the similarity score is no lower than the threshold applicable to the word's length does the edit distance fuzzy matching subunit output the corrected standard value; otherwise, it keeps the original input and marks it as unrecognizable, thus establishing a constraint boundary between error correction capability and misjudgment risk.
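The normalized edit distance and the stepped acceptance threshold can be sketched as follows. The Levenshtein routine is the standard dynamic-programming formulation; the length bounds (3 and 8) and the thresholds for medium and long words (0.85 and 0.80) are hypothetical values chosen for the example, since the concrete figures are not available in the text.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def acceptance_threshold(length):
    # Hypothetical step function; only the exact-match rule for short
    # words is stated in the source.
    if length <= 3:
        return 1.0
    if length <= 8:
        return 0.85
    return 0.80


def correct(term, vocabulary):
    """Return the best standard term, or None if below the threshold."""
    best = max(vocabulary, key=lambda v: similarity(term, v), default=None)
    if best is not None and similarity(term, best) >= acceptance_threshold(len(term)):
        return best
    return None
```

Under these assumed thresholds a one-character typo in a 14-character term is corrected, while a three-character code that differs by one character is left unrecognized.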
[0060] The mapping generation module 150 is connected to the verification fallback module 140 and is used to convert logically consistent intermediate business elements into machine instructions that can be directly executed by the bond trading system. Based on the technical principle of "semantic-schema alignment," the mapping generation module 150 converts entities in the natural language concept space into key-value constraints in the database schema space through deterministic mapping logic, thereby eliminating the heterogeneity between natural language and structured query language. The mapping generation module 150 includes an element standardization mapping unit, a data normalization and filling unit, and a structured query construction unit.
[0061] The element standardization mapping unit is configured to convert business elements in text form into standard codes within the system. This unit stores a multi-dimensional mapping dictionary, which is kept consistent with the underlying data of the external bond trading system through a periodic synchronization mechanism (e.g., daily T+1 synchronization). The dictionary includes a "bond subtype code table," an "institutional identity code table," a "rating symbol code table," and a "trading venue code table."
[0062] This element standardization mapping unit receives a set of compliant business elements and performs table lookup operations by traversing the enumerated fields in the set. Specifically, for the "bond subtype" field, this element standardization mapping unit uses the text value (such as "China Development Bank bond") as the key to retrieve and obtain the system-defined enumeration code; for the "credit rating" field, this element standardization mapping unit maps the text value to a numerical weight or standard character code used for sorting calculations.
[0063] To ensure retrieval performance under high concurrency scenarios, this element standardization mapping unit adopts a "two-level hash index" structure: the first-level hash uses the field category as the key, and the second-level hash uses the MD5 hash value of the element text value as the key. This structure ensures that the retrieval time complexity remains constant, $O(1)$, even with millions of terms.
[0064] The data normalization and imputation unit handles unit unification for numeric fields and context filling for missing fields. For numeric fields (such as yield and net price), this unit performs unit-based normalization. It reads the numerical value and its accompanying unit identifier, and uses the following formula to convert inputs in different units into the decimal format stored in the system database: $V_{std} = V_{raw} \times k + V_{base}$; in the formula, $V_{std}$ denotes the normalized standard value; $V_{raw}$ denotes the original numerical value extracted from the instruction; $k$ denotes the unit conversion factor, which is determined dynamically from the unit identifier: when the identified unit is "%", $k = 0.01$; when the identified unit is "BP", $k = 0.0001$; when the instruction does not state a unit and $V_{raw}$ satisfies a preset condition [condition missing], the value is judged to be a default percentage expression and $k = 0.01$; $V_{base}$ denotes the benchmark deviation value, which is 0 in regular queries; when a query involves a basis or an interest rate spread, this value is the current benchmark interest rate.
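The unit conversion can be sketched in Python. The factors 0.01 for percent and 0.0001 for basis points follow from the definitions of those units; treating every unit-less value as a percentage is a simplifying assumption of this sketch, because the condition the embodiment applies to unit-less values is not recoverable from the text.

```python
UNIT_FACTORS = {"%": 0.01, "BP": 0.0001}


def normalize(raw_value, unit=None, base=0.0):
    """Convert a quoted value to decimal form: raw * factor + base."""
    if unit is None:
        factor = UNIT_FACTORS["%"]  # assumption: unit-less means percent
    else:
        factor = UNIT_FACTORS[unit.upper()]
    return raw_value * factor + base
```

A quote of "2.5%" becomes 0.025, and a spread of "10 BP" over a benchmark of 0.025 becomes 0.026.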
[0065] For non-critical missing fields, this data normalization and imputation unit injects default values based on the spatiotemporal context. The unit reads the system's current timestamp $T_{now}$ and fills in the "Clearing Speed" field according to the preset trading session rules: if $T_{now}$ is earlier than the preset intraday clearing deadline (e.g., 16:30), T+0 (same-day clearing) is filled in; if $T_{now}$ is later than the deadline, T+1 (next-day clearing) is filled in. Additionally, if the instruction does not specify a trading market, the unit defaults to the main market code (e.g., the interbank market CIB) to ensure that the generated query conditions have a complete primary key index.
[0066] The structured query construction unit is used to dynamically generate executable code from the standardized element set. To guarantee the syntactic correctness of the generated query statements and to prevent injection attacks, this unit uses abstract syntax tree (AST) construction instead of direct string concatenation. The unit first instantiates a root query node, then traverses the element set, transforming each element into a child node of the tree. For numerical range elements (such as yield ranges), binary comparison nodes are constructed, containing operators (≥, ≤) and operands; for set elements (such as a list of issuers), set inclusion nodes are constructed; and for logical combination relationships, logical connection nodes are constructed. After the tree structure is complete, the interpreter is called to traverse the AST and serialize it into the target database dialect. The generated standard query logic structure follows the following formal definition: $Q_{sql} = C_{select} \oplus C_{from} \oplus C_{static} \oplus C_{dynamic} \oplus C_{order}$; in the formula, $Q_{sql}$ denotes the final generated structured query instruction string; ⊕ denotes the logical concatenation or serialization operation of query clauses; $C_{select}$ denotes the data projection clause, which defines the list of target fields to be returned in the query results (including but not limited to bond code, bond abbreviation, latest transaction price, estimated yield, etc.); $C_{from}$ denotes the data source locator clause, which specifies the underlying database table or view that the query targets; $C_{static}$ denotes the set of static constraints, comprising the system's default hard filtering rules, specifically the logical determinations "the bond's current status is valid" and "its trading market is valid"; $C_{dynamic}$ denotes the set of dynamic constraints, which are business logic conditions generated by serializing the abstract syntax tree (AST), such as a logical AND combination of "issuer ID belongs to a specific set" and "yield value is greater than a specific threshold"; $C_{order}$ denotes the clause that specifies the sorting rule for the results. The rule is generated dynamically from the extreme value terms in the instruction: if the instruction contains the semantics of "maximum value", a descending sorting rule is generated for the target field; if it contains the semantics of "minimum value", an ascending sorting rule is generated for the target field; if there is no extreme value semantics, a descending sorting rule based on "latest update time" is generated by default.
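The AST-based construction can be sketched with three node types and a recursive serializer that emits a parameterized query, so that user-derived values never enter the SQL text. The table name, column names, allow-lists, and the `?` placeholder style are assumptions for illustration; a real deployment would match its own schema and database driver.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Compare:          # binary comparison node
    field: str
    op: str
    value: Any


@dataclass
class InSet:            # set inclusion node
    field: str
    values: List[Any]


@dataclass
class And:              # logical connection node
    children: list


# Hypothetical allow-lists: only known columns and operators are serialized.
ALLOWED_FIELDS = {"bond_type", "remaining_years", "ytm", "issuer_id"}
ALLOWED_OPS = {">=", "<=", "="}


def serialize(node, params):
    """Turn an AST node into SQL text, appending values to `params`."""
    if isinstance(node, Compare):
        if node.field not in ALLOWED_FIELDS or node.op not in ALLOWED_OPS:
            raise ValueError("field or operator not allowed")
        params.append(node.value)
        return f"{node.field} {node.op} ?"
    if isinstance(node, InSet):
        if node.field not in ALLOWED_FIELDS or not node.values:
            raise ValueError("field not allowed or empty set")
        params.extend(node.values)
        return f"{node.field} IN ({', '.join('?' for _ in node.values)})"
    if isinstance(node, And):
        return "(" + " AND ".join(serialize(c, params) for c in node.children) + ")"
    raise TypeError(f"unknown node type: {type(node).__name__}")


def build_query(dynamic, order_by="update_time DESC"):
    """Concatenate projection, source, static, dynamic and order clauses."""
    params = []
    where = serialize(dynamic, params)
    sql = ("SELECT bond_code, latest_price FROM bond_quotes "
           "WHERE status = 'ACTIVE' AND " + where + " ORDER BY " + order_by)
    return sql, params
```

Because field names and operators are checked against allow-lists and values travel separately as parameters, a malformed element raises an error instead of producing a syntactically altered query.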
[0067] The structured query construction unit ultimately transmits the generated structured query instruction, through a standardized database connection interface (such as JDBC or ODBC) or a Web API interface, to the external bond trading system for execution and retrieves the result set.
[0068] The closed-loop optimization module 160 connects to the front-end interactive interface, the back-end log database, and the various processing modules of the system to establish an automatic correction loop for system parameters. Based on feedback control theory, the closed-loop optimization module 160 treats user interaction behavior as either an "error signal" or a "reward signal" for the system, with the goal of minimizing the parsing error function. It uses an incremental update algorithm to dynamically adjust the weight parameters of feature extraction and the threshold parameters of routing decisions, so that system performance converges gradually over discrete business time steps. The closed-loop optimization module 160 includes a multi-level feedback collection unit, a weight incremental update unit, and a global parameter dynamic adjustment unit.
[0069] The multi-level feedback collection unit is used to extract and quantify effective feedback data from the user behavior log database. This unit divides the data into two categories based on the explicitness of the interaction and assigns each a corresponding target correction value $y$.
[0070] For explicit feedback, this multi-level feedback collection unit identifies user editing behavior on the result confirmation page. When it detects that a user has modified and submitted fields automatically filled in by the system (such as "Issuer" and "Term"), the unit extracts the difference between the "Original Parsed Value" and the "User Corrected Value," marks the original parsed result as "Error," and sets the target correction value to 0.0.
[0071] For implicit feedback, this unit tracks the user's action path. If it detects that a user clicked the "Confirm Execution" button without modifying any fields, this unit marks the current parsing result as "Correct" and sets the target correction value to 1.0. If a user modifies the query criteria multiple times but ultimately does not submit the task, the unit marks the session as "invalid noise" and does not generate a feedback signal.
[0072] The weight incremental update unit is used to fine-tune the matching confidence of the terminology dictionary in the preprocessing module and the acceptance threshold of fuzzy matching in the verification module based on feedback signals. It employs an exponentially weighted moving average algorithm to update parameters, balancing historical prior knowledge against current observations. When a target correction value $y$ is received for a specific mapping rule (e.g., mapping "term A" to "code B"), the new weight is calculated using the following formula: $w_{t+1} = (1 - \alpha) \cdot w_t + \alpha \cdot y$; in the formula, $w_{t+1}$ denotes the updated weight parameter, whose value is strictly limited to the range $[0, 1]$; $w_t$ denotes the weight parameter at the current time step; $y$ denotes the target correction value (0.0 or 1.0) corresponding to this interaction; $\alpha$ denotes the learning rate coefficient, which controls the step size of parameter updates. To prevent malicious sample attacks or accidental misoperations from causing drastic fluctuations in system parameters, $\alpha$ is set to a small constant, whose preferred value range is [value range missing]. Through this update logic, if a certain obscure term mapping is repeatedly corrected by users (i.e., it continuously receives the signal $y = 0$), its associated weight $w$ decays exponentially. When $w$ falls below a preset elimination threshold (e.g., 0.15), the unit generates an instruction to mark the mapping rule as "invalid" and stops calling it in subsequent parsing.
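The exponentially weighted moving average update and the resulting decay toward retirement can be sketched as below. The default learning rate of 0.05 is a placeholder, since the preferred range is not available in the text; the helper that counts corrections is added purely to illustrate the decay.

```python
def update_weight(weight, target, alpha=0.05):
    """One EWMA step: blend the current weight with the feedback target."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    new_weight = (1.0 - alpha) * weight + alpha * target
    return min(max(new_weight, 0.0), 1.0)  # keep the weight in [0, 1]


def corrections_until_retired(weight, alpha=0.05, cutoff=0.15):
    """Count consecutive 'Error' signals (target 0.0) until weight < cutoff."""
    count = 0
    while weight >= cutoff:
        weight = update_weight(weight, 0.0, alpha)
        count += 1
    return count
```

Each correction multiplies the weight by (1 − alpha), so with alpha = 0.5 a rule starting at 1.0 passes through 0.5, 0.25, and 0.125 and is retired after three consecutive corrections; a smaller alpha makes the rule correspondingly harder to dislodge.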
[0073] The global parameter dynamic adjustment unit is used to adjust the control parameters of the routing decision module 120 based on system-level performance statistics. This unit has a defined statistical time window (e.g., 24 hours) and periodically calculates, within that window, the rule path error rate $E_{rule}$ and the proportion of resource consumption by large models $R_{llm}$. The unit stores adaptive bidirectional threshold adjustment logic to optimize the routing decision threshold $\theta_{low}$ (the low-complexity boundary). When the statistical results satisfy $E_{rule} > E_{max}$ (in which $E_{rule}$ is the error rate of the rule path within the statistical window and $E_{max}$ is the preset error rate tolerance limit, e.g., 5%), this indicates that a large number of complex instructions are being incorrectly routed to the rule parsing path. In this case, the global parameter dynamic adjustment unit performs a threshold reduction operation to decrease the number of instructions entering the rule path: $\theta_{low}^{new} = \theta_{low} - \Delta\theta$; in the formula, $\theta_{low}^{new}$ denotes the adjusted routing decision threshold (low-complexity boundary) and $\theta_{low}$ denotes the current routing decision threshold (low-complexity boundary).
[0074] When the statistical results satisfy $R_{llm} > R_{max}$ (in which $R_{llm}$ is the proportion of resource consumption by large models and $R_{max}$ is a preset upper limit for the computing resource budget, such as 40% of the total query volume) and a second condition [condition missing] also holds, this indicates that the system is wasting computational resources. In this case, the global parameter dynamic adjustment unit performs a threshold increase operation, forcing more borderline instructions into the low-cost rule path: $\theta_{low}^{new} = \theta_{low} + \Delta\theta$; in the formula, $\Delta\theta$ denotes the preset adjustment step size, whose value range is [value range missing]. After each adjustment, this unit writes the new $\theta_{low}^{new}$ value to the system configuration center and re-evaluates the indicators in the next statistical window, thus forming a closed-loop control.
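The bidirectional adjustment of the low-complexity boundary can be sketched as a single function evaluated once per statistical window. The step size (0.02), the clamping bounds (0.10 to 0.60), and the rule that the threshold is raised only when the error rate is within tolerance are assumptions of this sketch; the text does not preserve the step range or the second condition for the increase.

```python
def adjust_low_threshold(theta_low, rule_error_rate, llm_share,
                         error_tolerance=0.05, llm_budget=0.40,
                         step=0.02, bounds=(0.10, 0.60)):
    """Return the low-complexity boundary to use in the next window."""
    if rule_error_rate > error_tolerance:
        # Too many complex instructions reached the rule path: lower it.
        theta = theta_low - step
    elif llm_share > llm_budget:
        # Assumed guard: raise only while the rule path error is tolerable.
        theta = theta_low + step
    else:
        theta = theta_low
    low, high = bounds
    return min(max(theta, low), high)
```

Giving the error-rate branch priority means accuracy is protected first and resource savings are pursued only when parsing quality allows it.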
[0075] To more clearly illustrate the actual workflow of this invention, the following example uses a typical interbank market bond trading instruction. Assume the user inputs an unstructured instruction text: "Buy 30 million yuan of China Development Bank bonds with approximately 2 years remaining, yield around 2.50%, preferably new bonds."
[0076] First, the instruction preprocessing module 110 performs word segmentation and entity recognition on the text, marking "Buy" as the transaction direction and converting "30 million" into the standard value $3 \times 10^{7}$. The system identifies "China Development Bank" as a bond entity and marks "around 2 years" and "around 2.50" as fuzzy semantic fragments. Subsequently, the routing decision module 120 extracts the features of this instruction, calculating its structural depth to be 4, its fuzzy density to be 0.4, and its overall complexity score to be 0.55. This score falls between the preset low-complexity threshold (0.35) and high-complexity threshold (0.65), so the system determines that it is a medium-complexity instruction and generates a parallel path instruction.
[0077] The hybrid parsing module 130 activates both parsing paths simultaneously. The rule parsing unit quickly extracts the precise elements but cannot accurately match "around 2 years" and "new bonds". The large model parsing unit, through retrieval-augmented generation, infers that the remaining-maturity range corresponding to "around 2 years" is [1.8Y, 2.2Y] and that the yield range corresponding to "around 2.50" is [2.45%, 2.55%], and maps "new bonds" to a logical condition. The verification fallback module 140 integrates the two results, using the high-confidence results of the large model to fill the gaps in the rule parsing result, and confirms through consistency checks that there are no logical conflicts.
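The integration step can be sketched as a rule-first merge with a confidence gate, followed by a small consistency check. The field names, the admission threshold of 0.9, and the two checks shown are assumptions made for illustration; the business rule base of the embodiment is not reproduced here.

```python
ADMISSION_THRESHOLD = 0.9  # hypothetical high-confidence admission threshold


def integrate(rule_result, llm_result, llm_confidence,
              threshold=ADMISSION_THRESHOLD):
    """Keep every rule-path value; fill gaps from the large model only when
    its confidence exceeds the admission threshold."""
    merged = dict(rule_result)
    for key, value in llm_result.items():
        if merged.get(key) is None and llm_confidence > threshold:
            merged[key] = value
    return merged


def find_conflicts(elements):
    """Return a list of detected problems (empty when consistent)."""
    conflicts = []
    for key in ("maturity_years", "yield_pct"):
        bounds = elements.get(key)
        if bounds is not None and bounds[0] > bounds[1]:
            conflicts.append(f"{key}: lower bound exceeds upper bound")
    if elements.get("direction") not in ("buy", "sell"):
        conflicts.append("direction: missing or not one of buy/sell")
    return conflicts


# Values taken from the worked example; None marks what the rules missed.
rule_result = {"direction": "buy", "amount": 3e7,
               "bond_type": "China Development Bank bond",
               "maturity_years": None, "yield_pct": None}
llm_result = {"direction": "sell",  # deliberately wrong, to show rule priority
              "maturity_years": (1.8, 2.2), "yield_pct": (2.45, 2.55),
              "new_issue": True}

merged = integrate(rule_result, llm_result, llm_confidence=0.95)
```

Because the rule path is trusted for anything it matched, the deliberately wrong `direction` from the large model is ignored while the missing ranges are filled.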
[0078] Finally, the mapping generation module 150 transforms the aforementioned compliant elements into a standard structured query instruction. The instruction specifies retrieving the "bond code" and "latest price" fields from the market data source and sets the following joint filtering conditions: bond type equal to "China Development Bank bond", remaining maturity between 1.8 and 2.2 years, yield between 2.45% and 2.55%, and issuance date later than the current date minus 30 days (corresponding to the new-bond logic). It also specifies that the result set is sorted in descending order by "data update time". The query is sent to the trading system, which returns a list of bonds that meet the conditions, completing the conversion from fuzzy natural language to a precise machine instruction.
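The query described above can be sketched as a small tree of clause objects rendered into a parameterized statement. SQL with `?` placeholders is assumed as the target syntax, and the table name `market_data` and all column names are hypothetical; the actual target system syntax is not specified in the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Compare:
    column: str
    op: str
    value: object

    def render(self):
        return f"{self.column} {self.op} ?", [self.value]


@dataclass
class Between:
    column: str
    low: float
    high: float

    def render(self):
        return f"{self.column} BETWEEN ? AND ?", [self.low, self.high]


@dataclass
class Query:
    projection: list  # data projection clause
    source: str       # data source location clause
    filters: list     # static and dynamic constraints
    order_by: str     # result sorting rule

    def render(self):
        clauses, params = [], []
        for node in self.filters:
            sql, values = node.render()
            clauses.append(sql)
            params.extend(values)
        sql = (f"SELECT {', '.join(self.projection)} FROM {self.source} "
               f"WHERE {' AND '.join(clauses)} ORDER BY {self.order_by} DESC")
        return sql, params


def build_query(elements, today):
    filters = [
        Compare("bond_type", "=", elements["bond_type"]),
        Between("remaining_maturity_years", *elements["maturity_years"]),
        Between("yield_pct", *elements["yield_pct"]),
    ]
    if elements.get("new_issue"):
        # Dynamic constraint: issued within the last 30 days.
        filters.append(Compare("issue_date", ">", today - timedelta(days=30)))
    return Query(["bond_code", "latest_price"], "market_data",
                 filters, "update_time")


elements = {"bond_type": "China Development Bank bond",
            "maturity_years": (1.8, 2.2), "yield_pct": (2.45, 2.55),
            "new_issue": True}
sql, params = build_query(elements, today=date(2026, 3, 23)).render()
```

Keeping values out of the statement text and in a separate parameter list avoids quoting problems when the elements come from free-form instructions.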
[0079] Experimental verification and effect comparison: Under the same hardware environment (NVIDIA A100 GPU), three solutions were compared: Solution A (traditional pure rule engine), Solution B (pure large model end-to-end parsing), and the solution in this embodiment (hybrid parsing).
[0080] As shown in Figure 3 and Figure 4, the experimental results show that, in terms of parsing accuracy, the overall F1 score of this embodiment reaches 98.5%, higher than that of Solution A (72.4%) and on par with that of Solution B (98.8%). On instruction subsets containing complex logic, this invention effectively solves the hallucination problem present in pure large-model solutions through the repair mechanism of the verification fallback module 140, resulting in a 4.2% improvement in the final execution compliance rate.
[0081] In terms of system performance, the average response latency of this embodiment is kept within 120 ms, a reduction of approximately 85.8% compared with the 850 ms of Solution B. More importantly, thanks to the traffic-splitting mechanism of the routing decision module 120, approximately 65% of high-frequency simple instructions are processed only through the rule-based path, reducing average GPU memory utilization by more than 60% compared with Solution B. After two weeks of operation, the closed-loop optimization module 160, through adaptive adjustment of the dynamic threshold, further improved the system throughput (QPS) by 15% while maintaining the same accuracy. These experimental data demonstrate that this embodiment, while ensuring high accuracy, offers high practical engineering value and computational cost-effectiveness.
Claims
1. A dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions, characterized in that it comprises: an instruction preprocessing module, used to convert unstructured natural language bond trading instructions into structured preprocessed data containing lexical, syntactic, and entity features; a routing decision module, used to construct a multi-dimensional feature space based on the structured preprocessed data, calculate a complexity score, and generate a path selection instruction according to a dual threshold strategy; a hybrid parsing module, used to respond to the path selection instruction and extract a set of candidate business elements through a rule parsing processing unit or a large model parsing processing unit; a verification fallback module, used to perform confidence-weighted integration and logical consistency detection on the set of candidate business elements, and to perform deterministic repair on abnormal elements to generate a set of compliant business elements; a mapping generation module, used to convert the set of compliant business elements into a structured query statement that conforms to the syntax of the target system; and a closed-loop optimization module, used to dynamically adjust system parameters based on user interaction behavior data.
2. The dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions according to claim 1, characterized in that the instruction preprocessing module includes a financial field-specific word segmentation unit, a numerical time standardization unit, and an entity enhancement recognition unit; the financial field-specific word segmentation unit uses a pre-built bond field dictionary database and a bidirectional maximum matching algorithm to identify the boundaries of professional terms and the structure of compound terms in the input text, generating a word segmentation sequence; the numerical time standardization unit converts non-standard values in the word segmentation sequence into standard values based on a unit mapping table, and resolves relative time descriptions into specific date ranges based on a benchmark transaction calendar; the entity enhancement recognition unit uses a pre-trained sequence labeling model to calculate the entity label probability distribution of each word in the word segmentation sequence, and constructs structured preprocessed data containing a named entity list and semantic feature vectors.
3. The dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions according to claim 1, characterized in that the routing decision module includes a multi-dimensional feature extraction unit and a complexity quantification calculation unit; the multi-dimensional feature extraction unit is used to extract structural features, semantic features, and business features from the structured preprocessed data, wherein the structural features are determined based on the maximum depth of the dependency syntax tree, the semantic features are determined based on the density of fuzzy expression patterns, and the business features are determined based on the jargon professionalism index; the complexity quantification calculation unit adopts a linear weighted evaluation model, which calculates a complexity score representing the difficulty of instruction processing by weighting and summing the structural features, semantic features, and business features according to preset feature weight coefficients.
4. The dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions according to claim 3, characterized in that the routing decision module also includes a dynamic threshold decision unit, which is configured with a low-complexity threshold and a high-complexity threshold; the dynamic threshold decision unit compares the complexity score with the low-complexity threshold and the high-complexity threshold respectively: when the score is lower than or equal to the low-complexity threshold, it generates an instruction to activate the rule parsing path; when the score is higher than or equal to the high-complexity threshold, it generates an instruction to activate the large model parsing path; and when the score lies between the two, it generates an instruction to activate both paths in parallel; the dynamic threshold decision unit also collects and checks the memory utilization rate of the system in real time, and when the memory utilization rate is detected to continuously exceed a preset warning value, the values of the low-complexity threshold and the high-complexity threshold are adjusted upward by a preset step size to expand the judgment range of low-complexity instructions.
5. The dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions according to claim 1, characterized in that the rule parsing processing unit in the hybrid parsing module includes a regular expression pattern matching engine and a static template filling engine; the regular expression pattern matching engine scans the instruction text according to a hierarchical regular expression library sorted by business priority, and extracts substrings that conform to the syntactic structure based on a longest-match-first strategy; the static template filling engine, based on the skeleton topology defined by the business intent slot template, uses keyword anchoring technology to map the input text to the template with the highest matching degree, fills the text fragments at the corresponding positions into the semantic slot sequence, and uses a financial terminology dictionary to map the extracted text fragments into standard codes to construct the set of candidate business elements.
6. The dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions according to claim 1, characterized in that the large model parsing processing unit in the hybrid parsing module includes a prompt word construction subunit and a format constraint subunit; the prompt word construction subunit adopts a retrieval-enhanced few-shot learning strategy, uses a semantic embedding model to encode the current instruction into a query vector, retrieves the most similar historical instructions and standard answers from the historical instruction vector database, and constructs a structured input prompt word vector by combining the role definition of the verification system and the output format constraints; the format constraint subunit uses a grammar-constrained decoding algorithm to verify the generated text, calculates the parsing confidence based on the conditional probability of the valid key information tokens in the generated sequence, and encapsulates the parsing result into a set of candidate business elements containing parsing path identifiers and confidence scores.
7. The dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions according to claim 1, characterized in that the verification fallback module includes a result integration and judgment unit and a business logic consistency detection unit; the result integration and judgment unit executes a priority-based weighted merging strategy: it prioritizes the adoption of the complete matching results of the rule parsing path, and when there are null values in the rule parsing results, it uses the large model parsing results for complementary filling only when the confidence score of the large model parsing results is greater than the preset high-confidence admission threshold; the business logic consistency detection unit, based on the bond trading business rule base, performs price range validity verification, term logic verification, and transaction direction mutual exclusion verification on the integrated set of elements, and generates an abnormal status identifier when a logical conflict or a missing key element is detected.
8. The dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions according to claim 7, characterized in that the verification fallback module also includes a multi-level deterministic fallback unit, which, in response to the abnormal status identifier, sequentially activates the issuer set operation subunit and the edit distance fuzzy matching subunit; the issuer set operation subunit parses the logical structure of compound terms according to the logical operator mapping table, and performs union, difference, and intersection operations based on atomic tag sets for terms representing parallel, exclusion, and restriction relationships, respectively, to reconstruct the target issuer ID set; the edit distance fuzzy matching subunit calculates the normalized Levenshtein distance between the input text fragment and the terms in the standard terminology library, and outputs the corrected standard value only when the similarity score is higher than the adaptive length threshold set based on word length.
9. The dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions according to claim 1, characterized in that the mapping generation module includes a structured query building unit, which uses abstract syntax tree construction technology to transform the set of compliant business elements into a standard query logic structure that includes a data projection clause, a data source location clause, a set of static constraints, a set of dynamic constraints, and a result sorting rule clause; the closed-loop optimization module includes a weight incremental update unit, which dynamically updates the matching confidence weight of the term dictionary using an exponentially weighted moving average algorithm based on the feedback signal generated by the user's confirmation or correction of the parsing results, and when the weight is lower than the elimination threshold, the corresponding mapping rule is marked as invalid.
10. A dynamic hybrid parsing and deterministic verification method for bond fuzzy instructions, applied to the dynamic hybrid parsing and deterministic verification system for bond fuzzy instructions as described in any one of claims 1-9, characterized in that it comprises the following steps: acquiring unstructured natural language bond trading instructions from users, and constructing structured preprocessed data containing lexical, syntactic, and entity features through financial-specific word segmentation and entity enhancement recognition processing; quantifying the multidimensional features of the instruction based on the structured preprocessed data, calculating a complexity score, and generating a path selection instruction according to the dual threshold strategy; in response to the path selection instruction, executing a rule parsing step or a semantic reasoning parsing step to extract a set of candidate business elements, wherein the rule parsing step extracts elements based on regular pattern matching and static template filling, and the semantic reasoning parsing step extracts elements based on structured prompts and generative reasoning; performing confidence-weighted integration and logical consistency detection on the set of candidate business elements, and, when logical conflicts or missing key elements are detected, performing deterministic repair through term expansion mapping, issuer set operation, or edit distance matching to generate a set of compliant business elements; using abstract syntax tree (AST) construction technology to convert the set of compliant business elements into a structured query statement that conforms to the syntax of the target system, and driving the external transaction system to execute it; and collecting user feedback data on the parsing results, and dynamically adjusting the term mapping weight or the routing decision threshold based on the feedback data.
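The edit distance fuzzy matching used in the deterministic repair step can be sketched as follows. This is an illustrative sketch only: the length-dependent threshold schedule and the function names are assumptions, and the sample vocabulary is hypothetical. Similarity is taken as one minus the Levenshtein distance divided by the longer string length.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance using two rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def adaptive_threshold(length):
    """Hypothetical schedule: short terms must match more closely."""
    if length <= 4:
        return 0.9
    if length <= 8:
        return 0.8
    return 0.7


def correct_term(fragment, vocabulary):
    """Return the closest standard term, or None when no term is close enough."""
    best = max(vocabulary, key=lambda term: similarity(fragment, term),
               default=None)
    if best is not None and similarity(fragment, best) > adaptive_threshold(len(fragment)):
        return best
    return None


corrected = correct_term("tresury bond", ["treasury bond", "policy bank bond"])
```

A stricter threshold for short fragments reduces the risk that a one-character slip in a short code is silently mapped to a different instrument.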