A large model execution engine and method based on intention decoupling and dynamic planning

By constructing a large model execution engine based on intent decoupling and dynamic planning, the invention addresses weak intent processing, rigid execution processes, and high integration costs in enterprise business automation. It provides accurate decomposition, dynamic planning, and transparent execution of complex business instructions, improves system reliability and maintainability, and supports efficient integration of heterogeneous enterprise systems.

CN122220182A · Pending · Publication Date: 2026-06-16 · SHENZHEN ZHISOFT TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: SHENZHEN ZHISOFT TECH CO LTD
Filing Date: 2026-03-19
Publication Date: 2026-06-16


IPC: G06F11/30; G06F11/32; G06F11/34; G06F18/22

AI Tagging

Application Domain

Hardware monitoring




AI Technical Summary

Technical Problem

Existing technologies in the field of enterprise business automation suffer from problems such as weak intent processing capabilities, rigid execution processes, opaque execution processes, and high integration costs. They lack a closed-loop intelligent control mechanism that integrates intent understanding, dynamic planning, flexible execution, and transparent traceability, which limits the application of large models in complex enterprise environments.

Method used

A large model execution engine based on intent decoupling and dynamic planning is constructed. It comprises a multimodal input parsing module, an intent recognition and decoupling module, a dynamic planning and execution scheduling module, a tool invocation and resource adaptation module, and a white-box execution monitoring module, which together provide user input standardization, intent decoupling, dynamic planning, and transparent traceability.

Benefits of technology

It enables precise decomposition, dynamic planning, and transparent execution of complex business instructions, improving system reliability and maintainability, reducing integration costs, supporting efficient integration of heterogeneous enterprise systems, and promoting the large-scale deployment of large models in enterprise production environments.



Abstract

The application belongs to the interdisciplinary field of artificial intelligence and enterprise informatization, and discloses a large model execution engine and method based on intent decoupling and dynamic planning. The engine comprises: a multimodal input parsing module, which converts user input into a standardized request; an intent recognition and decoupling module, which identifies a composite business intent and decomposes it into a list of sub-intents with priorities and dependencies; a dynamic planning and execution scheduling module, which matches resources to the sub-intents and plans and dynamically adjusts an execution link; a tool invocation and resource adaptation module, which calls external business systems according to the link; and a white-box execution monitoring module, which records, tracks, and visualizes the entire execution process. The application addresses the weak intent processing capability, rigid execution process, non-transparent process, and high integration cost of the prior art, and provides accurate processing of complex business intents, dynamic optimization of the execution link, and transparent tracing of the full process.


Description

Technical Field

[0001] This invention belongs to the interdisciplinary field of artificial intelligence and enterprise information technology, and specifically relates to a large model execution engine and method based on intent decoupling and dynamic planning.

Background Technology

[0002] Currently, large-scale language models are increasingly widely used in enterprise business automation. However, existing technical solutions mostly rely on general or simple execution frameworks, and their core engines have significant shortcomings when handling complex and dynamic enterprise business: 1) Weak intent processing capabilities: Existing solutions typically only support simple recognition of single intents and cannot structurally decouple multi-layered composite instructions such as "query + validation + execution," leading to execution deviations or failures. 2) Rigid execution flow: Using preset fixed call chains, they lack the ability to perceive and dynamically adjust runtime states such as API availability, data freshness, and changes in business rules. 3) Opaque execution process: The intermediate reasoning and call logic from user input to the final action is not fully recorded, forming a "black box" that is difficult to audit, debug, and optimize. 4) High integration costs: When interfacing with heterogeneous enterprise systems (such as ERP, MES, HR), extensive customized development is required for different protocols and interfaces, lacking a unified adaptive mechanism.

[0003] The root cause of these problems is that existing technologies have not built a closed-loop intelligent control mechanism of "intent understanding - dynamic planning - elastic execution - transparent traceability" at the kernel level of the execution engine, which restricts the effective deployment of large models in enterprise production environments with high reliability requirements.

Summary of the Invention

[0004] The purpose of this invention is to provide a large model execution engine and method based on intent decoupling and dynamic planning to solve the above-mentioned technical problems.

[0005] The objective of this invention can be achieved through the following technical solution: a large model execution engine based on intent decoupling and dynamic planning, characterized in that it includes: a multimodal input parsing module, used to receive and process the user's multimodal input and convert it into a standardized structured request; an intent recognition and decoupling module, connected to the multimodal input parsing module and used to identify the composite business intent in the standardized request and decompose it into a structured list of sub-intents with priorities and dependencies; a dynamic planning and execution scheduling module, connected to the intent recognition and decoupling module and used to match resources to the sub-intent list, plan an execution link, and dynamically adjust the execution link based on runtime status; a tool invocation and resource adaptation module, connected to the dynamic planning and execution scheduling module and used to invoke external business systems according to the planned execution link; and a white-box execution monitoring module, connected to the multimodal input parsing module, the intent recognition and decoupling module, the dynamic planning and execution scheduling module, and the tool invocation and resource adaptation module, and used to record, track, and visualize the data and status of the entire execution process.

[0006] As a further description of the technical solution of the present invention, the intent recognition and decoupling module includes: an intent recognition unit, which performs multi-level intent classification on the standardized request based on a fine-tuned large language model; a business manual matching unit, used to perform vector similarity matching between user requests and pre-imported business manual content in order to supplement business rules and operation steps; and a sub-intent generation unit, connected to the intent recognition unit and the business manual matching unit, used to extract core actions based on the intent classification results and business rules, determine the priority and dependencies of each sub-intent, construct a directed acyclic graph to represent the dependencies, and output the structured sub-intent list.

[0007] As a further description of the technical solution of the present invention, the tool invocation and resource adaptation module includes: an OpenAPI call adapter, which supports multiple communication protocols and has built-in protocol templates for automatically filling in authentication information, performing parameter mapping, and parsing responses; a third-party system access unit, which adopts a plug-in architecture and interacts with B/S-architecture business systems through plug-ins that conform to standardized interface specifications; and a call control unit, used to implement timeout control, a retry strategy based on an exponential backoff algorithm, dynamic rate limiting based on a token bucket algorithm, and load balancing for interface calls.

[0008] As a further description of the technical solution of the present invention, the white-box execution monitoring module includes: an execution log recording unit, used to record the operation logs of each module in a standardized format; a link tracing unit, which, based on a distributed tracing protocol, generates a global tracing identifier to connect the operations of the entire process; an execution link visualization unit, used to generate interactive flowcharts based on execution logs and link tracing data and to dynamically display the execution status of each node through color coding; and a retrieval and analysis unit, which supports querying execution records by multiple conditions and provides statistical analysis of execution success rate and average execution time.

[0009] As a further description of the technical solution of the present invention, the business manual matching unit supports importing business manuals in Markdown, Word or Excel formats, and uses the BERT model to convert the user request text and business manual content into vectors, performs matching by calculating cosine similarity, and performs hierarchical processing on the matching results based on a preset similarity range threshold, wherein the range threshold includes at least two different similarity levels.

[0010] As a further description of the technical solution of the present invention, the plug-in of the third-party system access unit supports a multi-dimensional element positioning mechanism, dynamic element recognition and waiting, layout change adaptation, and anomaly recovery and retry mechanism to cope with the dynamic changes of the business system page.

[0011] A large model execution method based on intent decoupling and dynamic planning, the method comprising: receiving multimodal input from a user, parsing and semantically standardizing the multimodal input, and outputting a standardized request; performing intent recognition on the standardized request, matching it with relevant business manual rules, decomposing it into a list of sub-intents with priorities and dependencies, and constructing a dependency graph; matching the optimal resources to the sub-intent list, and planning and generating an initial execution link by combining the resource status collected in real time with business rule constraints; invoking the external business systems according to the execution link, and applying timeout, retry, and rate limiting controls during invocation; if resource anomalies or business rule changes are detected during execution, triggering a local replanning to adjust the execution path of the affected sub-intents and executing a degradation strategy; and recording execution data and status throughout the process, generating a visual execution link diagram and a statistical analysis report, and supporting querying and auditing.

[0012] The beneficial effects of this invention are: This invention fundamentally solves the core challenge of applying large models in complex enterprise business scenarios by constructing an intelligent execution engine that deeply integrates intent understanding, dynamic planning, and transparent monitoring. Its core benefit lies in achieving a leap from rigid execution to flexible adaptation: the system can deeply analyze complex business instructions containing multiple logics from users, accurately decompose them into structured atomic tasks, and dynamically plan and adjust the optimal execution path based on real-time awareness of external resource status and business rule changes, thereby maintaining high robustness and continuity in the event of resource anomalies or process changes. Simultaneously, this invention completely breaks the traditional "black box" model. Through end-to-end logging, link tracing, and visualization, every step from natural language input to the final business system call is clearly traceable and auditable, greatly improving the system's maintainability and reliability. At the integration level, its innovative plug-in architecture and unified protocol adaptation mechanism significantly simplify the process of interfacing with various heterogeneous enterprise systems, significantly reducing development and maintenance costs. Overall, this invention not only improves the accuracy and success rate of large models in handling complex tasks, but also provides a solid technical foundation for the safe, reliable, and efficient integration of large models into critical business production processes through its dynamic, transparent, and easy-to-integrate characteristics, thus promoting the transformation of enterprise intelligence from proof of concept to large-scale implementation.

[0013] Of course, any product implementing this invention does not necessarily need to achieve all of the advantages described above at the same time.

Attached Figure Description

[0014] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0015] Figure 1 is a schematic diagram of the engine of the present invention.

Detailed Implementation

[0016] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0017] Referring to Figure 1, the core execution engine of this invention adopts a modular, pipelined design, constructing a complete closed loop from user interaction to business system invocation, and then to result feedback and monitoring. The engine mainly comprises five core modules: a multimodal input parsing module, an intent recognition and decoupling module, a dynamic planning and execution scheduling module, a tool invocation and resource adaptation module, and a white-box execution monitoring module. The modules exchange data and synchronize state through clearly defined interfaces, collaboratively achieving intelligent control over the entire process of "input standardization - intent decoupling - dynamic planning - elastic execution - transparent traceability."

[0018] Module 1: The Multimodal Input Parsing Module serves as the engine's input portal, responsible for receiving and processing business requests initiated by users through different channels and formats. Its core objective is to transform heterogeneous, unstructured raw input into standardized, structured requests that the engine can understand and process.

[0019] Module 2: Intent Recognition and Decoupling. This module is responsible for deeply understanding the user's complex business intent and decomposing it into a sequence of atomic tasks that can be executed independently and have clear logical relationships.

[0020] Composition and working principle: Intent Recognition Unit: This unit is built on a large language model fine-tuned on a large corpus of enterprise business language (Llama-2-7B is used in this embodiment). It receives the standardized request text and context from module 101, extracts semantic features through the model's multi-layer Transformer encoder, and performs three-level hierarchical intent classification via a specially trained classification head.

Level 1 (business area): For example, finance, supply chain, human resources, manufacturing.

[0021] Level 2 (core action): For example, in the supply chain field, the actions are query, verification, creation, update, and deletion.

[0022] Level 3 (specific operation): For example, checking inventory, verifying order compliance, and creating a shipping order.

[0023] The model outputs the classification results and their confidence scores for each level. In this embodiment, when the highest confidence score is ≥0.8, the intent recognition is considered reliable, and a structured result (including intent type, confidence score, associated predefined business rule ID, etc.) is directly output. If the confidence score is <0.8, the business manual matching unit is triggered for auxiliary verification.
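
The confidence gate described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `route_intent`, the candidate dictionary, and the route labels are hypothetical; only the 0.8 threshold comes from the text.

```python
CONFIDENCE_THRESHOLD = 0.8  # value stated in this embodiment


def route_intent(candidates: dict[str, float],
                 threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Pick the top-scoring intent and decide whether it can be used directly.

    `candidates` maps a (hypothetical) intent label to its confidence score.
    """
    label, confidence = max(candidates.items(), key=lambda kv: kv[1])
    if confidence >= threshold:
        # Reliable: emit the structured result directly.
        route = "direct"
    else:
        # Unreliable: hand over to business manual matching for verification.
        route = "manual_match"
    return {"route": route, "intent": label, "confidence": confidence}
```

For example, `route_intent({"check_inventory": 0.91, "create_shipping_order": 0.06})` takes the direct route, while a top score of 0.55 is sent to business manual matching.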

[0024] Short-Term Memory (STM) Caching: This unit is implemented using the high-performance key-value database Redis. It maintains a cache area for each user session, storing key information from the last five rounds of dialogue, including user ID, dialogue sequence number, standardized intent for each round, execution status (success / failure), and key output data. The data has a time-to-live (TTL), defaulting to 30 minutes, and is automatically cleaned up after expiration. This cache provides cross-round dialogue memory capabilities for intent recognition and semantic standardization, forming the foundation for achieving coherent dialogue services.
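
A minimal in-memory stand-in for this cache is sketched below to show the two policies named in the text: keep the last five rounds and expire after 30 minutes. It does not use Redis; the class name, the injectable clock, and the choice to refresh the TTL on every write are assumptions made for illustration.

```python
import time
from collections import deque


class ShortTermMemory:
    """In-memory illustration of a per-session cache (not a Redis client)."""

    def __init__(self, max_rounds=5, ttl_seconds=30 * 60, clock=time.monotonic):
        self._max_rounds = max_rounds
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions = {}  # session_id -> (expires_at, deque of round records)

    def append(self, session_id, round_record):
        now = self._clock()
        expires_at, rounds = self._sessions.get(session_id, (0.0, None))
        if rounds is None or now >= expires_at:
            # New or expired session: start from an empty history.
            rounds = deque(maxlen=self._max_rounds)
        rounds.append(round_record)  # deque drops the oldest round beyond max_rounds
        # Assumption: each write refreshes the time-to-live.
        self._sessions[session_id] = (now + self._ttl, rounds)

    def history(self, session_id):
        entry = self._sessions.get(session_id)
        if entry is None:
            return []
        expires_at, rounds = entry
        if self._clock() >= expires_at:
            del self._sessions[session_id]  # automatic cleanup after expiry
            return []
        return list(rounds)
```

In a Redis-backed version, the same behavior would typically come from a capped list plus a key expiry, but the exact data layout is not specified in the source.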

[0025] Business Manual Matching Unit: This unit serves as a supplement to intent identification and a source of business rules. System administrators can import internal business manuals (SOPs) in Markdown, Word, or Excel formats. During import, the system automatically parses the document structure and extracts chapters, clauses, and operational steps.

[0026] Vectorization and Matching Unit: When matching is required (whether for low confidence in intent recognition or as a supplementary rule to the regular process), this unit uses the BERT model to convert the user request text and each rule / step text in the business manual into 64-dimensional semantic vectors.

[0027] Similarity calculation: The degree of semantic association between the user request vector and the content vector of each business manual is quantified by calculating the cosine similarity between the two.

[0028] Threshold Filtering: In this embodiment, the matching threshold is set to 0.7. All business manual entries with a similarity ≥ 0.7 are selected, and their associated operation steps, prerequisites, constraints, precautions, and other information are output as supplementary knowledge to the sub-intent generation unit. If there are no matching entries or the similarity is extremely low, a prompt "Business rules need to be supplemented" is generated, which can be fed back to the user or sent to the administrator.
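
The similarity calculation and threshold filtering can be sketched in plain Python as below. The vectors are assumed to have been produced already by an embedding model; the function names and the toy two-dimensional vectors in the usage note are hypothetical, and only the cosine measure and the 0.7 threshold come from the text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated
    return dot / (norm_a * norm_b)


def match_manual_entries(request_vec, manual_vecs, threshold=0.7):
    """Return (entry_id, similarity) pairs at or above the threshold, best first."""
    scored = [(entry_id, cosine_similarity(request_vec, vec))
              for entry_id, vec in manual_vecs.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

With `request_vec = [1, 0]` and entries `{"a": [1, 0], "b": [0, 1], "c": [1, 1]}`, entries `a` and `c` are kept and `b` is dropped. An empty result is the case in which the "Business rules need to be supplemented" prompt would be raised.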

[0029] Sub-intent generation unit: This unit is the core of intent decoupling. It receives intent classification results from the intent recognition unit and business rule information from the business manual matching unit, and performs the following steps: Core Action Extraction: Extract indivisible atomic actions from composite intents. For example, for the composite intent "Query the inventory of product A in the East China warehouse; if it is greater than 100, create a delivery order," extract the two core actions: "Query inventory" and "Create delivery order." "If it is greater than 100" is a conditional logic that will be reflected in subsequent dependencies.

[0030] Priority and dependency analysis: Determine the execution order of sub-intents based on business logic and rules.

[0031] Priority: Set to levels 1-5, with level 1 being the highest. For example, "validate data" usually has a higher priority than "perform data write".

[0032] Dependencies: This section analyzes the data flow and logical relationships between actions. For example, "Create delivery note" depends on the result of "Query inventory" as input. This unit identifies these relationships using dependency analysis algorithms.

[0033] Constructing a Directed Acyclic Graph (DAG): Based on the analysis results above, a DAG is constructed. Each node in the graph represents a sub-intent, and node attributes include: sub-intent ID, target description, core action, priority, and required resource tags (such as the name of the API to call). Directed edges represent dependencies, pointing from parent sub-intent nodes to child sub-intent nodes. Representing the dependencies as a DAG allows sub-intents that do not depend on each other to be executed in parallel, which improves efficiency.

[0034] Output a structured list: Finally, the DAG is serialized into a structured list of sub-intents and output to the dynamic planning module. Each entry in the list defines its task objective, priority, set of parent task IDs it depends on, and resource requirements.
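
As a rough sketch of how such a list could be turned into an execution order, the snippet below uses Python's standard `graphlib.TopologicalSorter` to group sub-intents into batches whose members have no unmet dependencies. The field names (`id`, `priority`, `depends_on`) and the sample sub-intents are assumptions for illustration, not the patent's serialization format.

```python
from graphlib import TopologicalSorter


def execution_batches(sub_intents):
    """Group sub-intents into batches that can run in parallel.

    Each sub-intent is a dict with hypothetical keys: id, priority (1 is
    highest), and depends_on (ids of parent sub-intents).
    """
    graph = {s["id"]: set(s["depends_on"]) for s in sub_intents}
    priority = {s["id"]: s["priority"] for s in sub_intents}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    batches = []
    while sorter.is_active():
        # Everything returned by get_ready() has all of its parents finished.
        ready = sorted(sorter.get_ready(), key=lambda i: (priority[i], i))
        batches.append(ready)
        sorter.done(*ready)
    return batches


sub_intents = [
    {"id": "query_inventory", "priority": 1, "depends_on": []},
    {"id": "check_supplier", "priority": 2, "depends_on": []},
    {"id": "create_delivery_order", "priority": 3,
     "depends_on": ["query_inventory", "check_supplier"]},
]
batches = execution_batches(sub_intents)
```

Here the two independent sub-intents land in the first batch and the dependent one in the second. The `check_supplier` node is added only to show parallelism and does not appear in the source example.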

[0035] Module 3: The Dynamic Planning and Execution Scheduling Module is responsible for allocating appropriate resources to the decoupled sub-intents and planning an optimal execution path that can adapt to changes in the system's runtime state.

[0036] Module 4: The Tool Invocation and Resource Adaptation Module is responsible for safely, reliably, and efficiently invoking various external business systems based on the execution link instructions issued by the dynamic planning module, and for processing the returned results.

[0037] Composition and working principle: OpenAPI Call Adapter: Responsible for connecting to standardized API interfaces. This adapter has built-in templates for more than 20 mainstream communication protocols, such as RESTful, SOAP, gRPC, JDBC, etc.

[0038] Workflow: Upon receiving the call instruction, the corresponding protocol template is loaded based on the API identifier in the instruction. The template defines the request structure, authentication method (such as OAuth2.0, APIKey), parameter mapping rules, and response parsing rules.

[0039] Automated processing: The adapter automatically retrieves a token or key from the engine's authentication center and fills it into the request header. It then converts the output parameters of the sub-intent into the format required by the API according to the mapping rules and sends the request. Upon receiving the response, it automatically parses the status code and return body, converts the business data into the engine's internal standard format, and maps unsuccessful status codes to a unified error type. This templated design greatly reduces the development workload for integrating new APIs.
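
The template-driven flow can be illustrated with a small sketch that builds a request description and normalizes a response without performing any network call. Everything in the template (URL, header name, parameter names, error labels) is a hypothetical example; the source does not define the template schema.

```python
# Hypothetical protocol template for one REST-style API.
INVENTORY_TEMPLATE = {
    "method": "GET",
    "url": "https://erp.example.invalid/inventory",
    "auth": {"type": "api_key", "header": "X-API-Key"},
    "param_map": {"material": "sku", "warehouse": "warehouse_code"},
    "error_map": {401: "AUTH_ERROR", 404: "NOT_FOUND", 429: "RATE_LIMITED"},
}


def build_request(template, params, credential):
    """Fill in authentication and map engine parameters to API parameters."""
    mapped = {template["param_map"][key]: value
              for key, value in params.items()
              if key in template["param_map"]}
    return {
        "method": template["method"],
        "url": template["url"],
        "headers": {template["auth"]["header"]: credential},
        "params": mapped,
    }


def parse_response(template, status_code, body):
    """Convert a raw response into a unified internal result."""
    if 200 <= status_code < 300:
        return {"ok": True, "data": body, "error": None}
    error = template["error_map"].get(status_code, "UPSTREAM_ERROR")
    return {"ok": False, "data": None, "error": error}
```

Adding a new API in this style means writing a new template rather than new adapter code, which is the reduction in integration work the paragraph describes.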

[0040] Third-party system access unit: For traditional business systems that do not have open APIs or only have B / S (browser / server) architecture pages (such as some older versions of ERP and MES), this unit adopts a plug-in architecture for connection.

[0041] Plugin Specification: Defines a standardized set of plugin interfaces. Plugin developers only need to implement the interaction logic with specific business systems according to this specification. A plugin is essentially a script or service that can simulate user operations on a browser, implementing functions such as login, form filling, button clicks, and data retrieval.

[0042] Intelligent fault tolerance: The plugin integrates advanced RPA (Robotic Process Automation) technology, possessing strong fault tolerance and adaptive capabilities. Multi-dimensional element positioning: It stores multiple positioning methods such as XPath, CSS selectors, ID, and text, and automatically tries another method when one fails.

[0043] Dynamic waiting: Built-in intelligent waiting mechanism, polling to check whether the target element has been loaded and whether it is operable before operation.

[0044] Adaptive layout: By recording the relative positions of elements, it can respond to minor adjustments to the page UI.

[0045] Anomaly recovery: Automatically handles common anomalies such as pop-ups and session timeouts, and resumes execution after recovery.
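
The first two fault-tolerance mechanisms above (falling back across locator strategies, and polling until an element appears) can be combined in one loop, sketched below. The `find` callable stands in for whatever browser-automation call a plug-in would use; its signature, the function name, and the default timings are assumptions.

```python
import time


def locate_with_fallback(locators, find, timeout=5.0, poll_interval=0.2,
                         clock=time.monotonic, sleep=time.sleep):
    """Try each (strategy, value) locator in order, polling until the timeout.

    `find(strategy, value)` is a hypothetical callable that returns the
    element, or None when it is not present or not yet operable.
    """
    deadline = clock() + timeout
    while True:
        for strategy, value in locators:
            element = find(strategy, value)
            if element is not None:
                return strategy, element  # first strategy that works wins
        if clock() >= deadline:
            raise TimeoutError("element not found with any locator strategy")
        sleep(poll_interval)  # dynamic waiting before the next attempt
```

A plug-in could call this with, for instance, an XPath locator first and a CSS selector second, so that a page change which breaks one locator does not break the step.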

[0046] Benefits: By adopting a plug-in approach, the system integration work that traditionally requires in-depth customization and development and takes more than 30 days can be shortened to less than 15 days. Furthermore, the plug-ins support hot-swapping, making maintenance and upgrades easier.

[0047] Call control unit: To ensure the stability of the execution process and avoid impacting downstream systems, this unit implements comprehensive call management.

[0048] Timeout control: Set a timeout for each API call (default 3 seconds, configurable) to prevent engine threads from being occupied for a long time due to downstream system blockage.

[0049] Intelligent retry: For brief failures caused by network jitter, an exponential backoff algorithm is used for retries. For example, the first retry waits 2 seconds, the second waits 4 seconds, and the third waits 8 seconds. This avoids blindly and frequently retrying, which would increase system load.
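
The retry schedule in this example (2, 4, then 8 seconds) corresponds to a base delay of 2 seconds doubled on each attempt. A minimal sketch follows; the function names, the set of exceptions treated as transient, and the injectable `sleep` are assumptions.

```python
import time


def backoff_delays(base_seconds=2.0, factor=2.0, max_retries=3):
    """Delay before each retry: base, base*factor, base*factor^2, ..."""
    return [base_seconds * factor ** attempt for attempt in range(max_retries)]


def call_with_retry(operation, max_retries=3, base_seconds=2.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run `operation`, retrying transient failures with exponential backoff."""
    delays = backoff_delays(base_seconds, 2.0, max_retries)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
```

Production implementations often add random jitter to the delays so that many clients do not retry at the same instant; the source does not mention jitter, so it is omitted here.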

[0050] Dynamic rate limiting: Limits call frequency based on the token bucket algorithm. Different token generation rates can be set for interfaces of different importance, and dynamically adjusted according to load feedback from downstream systems to achieve flexible flow control.
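
A token bucket can be sketched in a few lines. The class below is an illustration under assumptions: the names, the injectable clock, and the `set_rate` method (standing in for adjustment based on downstream load feedback) are not taken from the source.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` and a sustained `rate` of calls per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def _refill(self):
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now

    def set_rate(self, rate):
        """Change the token generation rate, e.g. after downstream feedback."""
        self._refill()  # settle tokens earned at the old rate first
        self.rate = float(rate)

    def try_acquire(self, tokens=1.0):
        """Return True and consume tokens if the call is allowed right now."""
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False
```

A more important interface would be given a bucket with a higher rate, and `set_rate` could be called with a lower value when the downstream system reports high load.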

[0051] Load balancing: When there are multiple instances of the same service, it supports algorithms such as round-robin and weighted round-robin (dynamically allocating weights based on the success rate and response time of the instance) to distribute requests to different instances, thereby improving the overall call success rate and performance.
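
One common way to implement weighted round-robin is the "smooth" variant, which spreads picks of the heavy instance evenly rather than in a burst. The sketch below takes the weights as given; how they would be derived from success rate and response time is not specified in the source, and the instance names are hypothetical.

```python
def smooth_weighted_round_robin(weights, n):
    """Return the first `n` picks for instances with the given integer weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every instance earns its weight, then the leader is chosen
        # and pays back the total so that others get their turn.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


picks = smooth_weighted_round_robin({"a": 5, "b": 1, "c": 1}, 7)
```

Over one full cycle of seven picks, instance `a` is chosen five times and `b` and `c` once each, with the picks of `a` interleaved rather than consecutive.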

[0052] Unified exception handling: After consecutive failed calls reach a threshold (e.g., 3 times), the call will not be retried. Instead, the exception information will be reported, and the link adjustment unit (1035) of the dynamic planning module will trigger degradation or replanning.
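
The consecutive-failure threshold can be tracked with a small counter per API, sketched below. The class and method names are hypothetical; the default of 3 is the example value from the text, and what happens after escalation (degradation or replanning) is left to the caller.

```python
class ConsecutiveFailureGuard:
    """Count consecutive failures per API and signal when to stop retrying."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self._failures = {}  # api_id -> current run of consecutive failures

    def record(self, api_id, success):
        """Record one call outcome; return True when escalation is due."""
        if success:
            self._failures[api_id] = 0  # any success resets the run
            return False
        self._failures[api_id] = self._failures.get(api_id, 0) + 1
        return self._failures[api_id] >= self.threshold
```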

[0053] Module 5: The white-box execution monitoring module runs through the entire execution process, enabling full-link observability, traceability, and auditability.

[0054] Composition and working principle: Execution Log Recording Unit: Each module and key unit within the engine sends standardized logs to this unit asynchronously and without blocking during operation. Each log entry contains fields such as: trace_id (global trace ID), span_id (current operation ID), parent_span_id (parent operation ID), module_name, unit_name, timestamp, input_data, output_data, duration (elapsed time), status (success / failure / in progress), and error_msg (on failure). Logs are persistently stored in Elasticsearch or Kafka, supporting sharding by time, module, and other dimensions, with configurable retention policies (e.g., default 30 days).
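
The log entry can be modeled as a small record type that serializes to JSON. The field names below follow the list in the text; the types, the status strings, and the validation step are assumptions for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import Any, Optional

ALLOWED_STATUS = {"success", "failure", "in_progress"}  # assumed spellings


@dataclass
class ExecutionLog:
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    module_name: str
    unit_name: str
    timestamp: float      # seconds since the epoch (assumed unit)
    input_data: Any
    output_data: Any
    duration: float       # elapsed seconds (assumed unit)
    status: str
    error_msg: Optional[str] = None

    def to_json(self) -> str:
        """Serialize to one JSON line suitable for shipping to a log store."""
        if self.status not in ALLOWED_STATUS:
            raise ValueError(f"unknown status: {self.status}")
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)
```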

[0055] The tracing unit is implemented based on distributed tracing standards such as OpenTelemetry. It generates a globally unique `trace_id` for each user request. Each processing step (a span, such as "intent recognition" or "calling XX API") along the request flow generates a `span_id`, recording its association with its parent span, start and end times, duration, and status. Using the `trace_id`, logs scattered across different modules and even different servers can be chained together to reconstruct the complete lifecycle of a request's call chain.
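
To show how a shared `trace_id` and parent links let scattered records be reassembled, the sketch below rebuilds an indented call tree from a flat list of spans. It uses plain dictionaries and the standard `uuid` module rather than the OpenTelemetry SDK; the span fields `name` and `start` and the helper names are assumptions.

```python
import uuid


def new_trace_id():
    """32 hex characters, the same width as an OpenTelemetry trace ID."""
    return uuid.uuid4().hex


def new_span_id():
    """16 hex characters, the same width as an OpenTelemetry span ID."""
    return uuid.uuid4().hex[:16]


def build_call_tree(spans, trace_id):
    """Return indented span names for one trace, ordered by start time."""
    children = {}
    for span in sorted((s for s in spans if s["trace_id"] == trace_id),
                       key=lambda s: s["start"]):
        children.setdefault(span["parent_span_id"], []).append(span)

    def render(parent_id, depth):
        lines = []
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span["name"])
            lines.extend(render(span["span_id"], depth + 1))
        return lines

    return render(None, 0)  # root spans have no parent
```

Spans from other traces are ignored, so records collected from different modules or servers can be stored together and still be separated per request.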

[0056] Execution Link Visualization Unit: This unit uses front-end visualization libraries such as D3.js to dynamically render link tracing data into an interactive execution flowchart.

[0057] Visual representation: The nodes in the diagram represent various execution tasks or system calls, and the arrows represent the execution order and dependencies.

[0058] Status visualization: The status is displayed intuitively through color coding: green (success), red (failure), yellow (in execution), and gray (not executed / skipped).

[0059] Interactive features: Users can zoom and pan the flowchart. Clicking on any node will bring up a panel to view the detailed input and output data, time taken, error messages, and execution basis for that step (for example, the execution basis for the "Call Shipping API" node is: "Because the inventory query result is 150 pieces, which is greater than the order quantity of 100 pieces, the business rule [Inventory > Order Quantity] is satisfied").

[0060] Search and analysis unit: Provides powerful post-event query and analysis capabilities.

[0061] Multi-condition combined query: Supports precise filtering by combining multiple conditions (such as time range, user ID, intent type, execution status, time interval, error code, etc.) through the web interface or API, with query response time optimized to less than 1 second.

[0062] Statistical analysis: Automatically aggregates and analyzes execution records to generate dashboards that display key indicators such as daily / monthly execution success rate, average time consumption trend, time consumption percentage of each module, and distribution of high-frequency error types, presented in the form of bar charts, line charts, pie charts, etc.

[0063] Report Export: Supports exporting query results or statistical analysis reports to PDF or Excel format, facilitating auditing, review, and performance optimization.

[0064] System Workflow Summary: Combining the modules described above, a complete workflow of this engine is as follows. Input and Parsing: Users submit requests via text, voice, images, or forms (e.g., "Please check how much material A is left in the Beijing warehouse. If it exceeds 500 units, place a purchase order for 300 units with supplier B"). The multimodal input parsing module converts this into a standardized structured request.

[0065] Intent Decoupling: The intent recognition and decoupling module identifies this as a composite intent containing "query inventory" and "create purchase order". The business manual matching unit supplements this with the rule "purchase must match qualified suppliers". The sub-intent generation unit decomposes it into two sub-intents: ① Query the inventory of material A in the Beijing warehouse (priority 1); ② If the inventory > 500, create a purchase order for 300 units of material A from supplier B (priority 2, dependent on the result of ①). A directed acyclic graph (DAG) is then constructed to represent this dependency.

Dynamic Planning: The dynamic planning and execution scheduling module matches sub-intent ① with the "Inventory Query API" and sub-intent ② with the "Purchase System Create Order API". The status awareness unit reports that both APIs are healthy. The execution chain is then planned and generated: because sub-intent ② depends on the result of sub-intent ①, sub-intent ① is executed first, and sub-intent ② is executed only after ① succeeds.
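The dependency graph and the resulting execution order for this example can be sketched with Python's standard `graphlib` module. This is a minimal sketch under stated assumptions: the `SubIntent` class, its field names, and the function `execution_batches` are hypothetical, and the patent does not state which algorithm the engine uses to order sub-intents. Sub-intents that become ready at the same time are grouped into one batch and could run in parallel; dependent sub-intents land in later batches.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SubIntent:
    # Hypothetical structure for one entry in the sub-intent list.
    id: str
    action: str
    priority: int
    depends_on: Tuple[str, ...] = ()
    condition: Optional[str] = None


def execution_batches(sub_intents: List[SubIntent]) -> List[List[str]]:
    """Group sub-intent ids into batches that respect the dependency DAG.

    Raises graphlib.CycleError if the dependencies are not acyclic.
    """
    by_id = {s.id: s for s in sub_intents}
    sorter = TopologicalSorter({s.id: set(s.depends_on) for s in sub_intents})
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything returned by get_ready() has all dependencies satisfied.
        ready = sorted(sorter.get_ready(), key=lambda i: (by_id[i].priority, i))
        batches.append(ready)
        sorter.done(*ready)
    return batches


plan = [
    SubIntent("s1", "query_inventory", priority=1),
    SubIntent("s2", "create_purchase_order", priority=2,
              depends_on=("s1",), condition="inventory > 500"),
]
print(execution_batches(plan))  # [['s1'], ['s2']]
```

In this example the two sub-intents form two single-element batches, which matches the serial order described above.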

[0066] Elastic Execution: The tool invocation and resource adaptation module calls the inventory query API as planned, and the API successfully returns an inventory quantity of 600 units. However, when the module prepares to call the procurement system, the call control unit detects that the procurement system API response has timed out (abnormal status).

[0067] Dynamic Adjustment: The status awareness unit immediately updates the procurement system API status to "unavailable." The link adjustment unit is triggered and initiates a partial replanning. Because the inventory condition on which sub-intent ② depends is met but the execution resource is abnormal, a degradation strategy is initiated: the backup simplified interface is called first (Level 1 degradation); after that call also fails, the complete data, including the "Create Purchase Order" request, inventory results, and supplier information, is asynchronously written to a pending task queue (Level 2 degradation), and a prompt is immediately returned to the user: "Purchase order request has been accepted and is being processed in the queue due to system overload, order number PO_******" (Level 3 degradation, user-friendly prompt). The entire process is completed within 80 milliseconds.
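The tiered degradation in this example can be sketched as a fallback chain. The sketch below is illustrative and rests on several assumptions: the function and exception names are hypothetical, the primary and backup interfaces are passed in as plain callables, the pending task queue is an in-memory `deque` rather than a durable queue, and the write is synchronous for simplicity. The order number from the prompt is omitted because the source elides it.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict


class ResourceUnavailable(Exception):
    """Hypothetical error raised when a target system is marked unavailable."""


def create_order_with_degradation(
    request: Dict[str, Any],
    primary: Callable[[Dict[str, Any]], Any],
    backup: Callable[[Dict[str, Any]], Any],
    pending_queue: Deque[Dict[str, Any]],
) -> Dict[str, Any]:
    # Normal path first, then Level 1 degradation: the backup simplified interface.
    for level, call in (("primary", primary), ("level_1_backup", backup)):
        try:
            return {"status": "completed", "path": level, "result": call(request)}
        except (TimeoutError, ResourceUnavailable):
            continue
    # Level 2 degradation: keep the full request for later processing.
    pending_queue.append(request)
    # Level 3 degradation: return a user-friendly prompt immediately.
    return {
        "status": "queued",
        "path": "level_2_queue",
        "message": "Purchase order request has been accepted "
                   "and is being processed in the queue.",
    }
```

Only timeouts and availability errors trigger a fallback in this sketch; other exceptions propagate, so a malformed request is not silently queued.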

[0068] Transparent Traceability: Throughout the process, the white-box execution monitoring module records logs and tracing information for all steps. Through the visual interface, operations personnel can clearly see the execution chain change from green to red at the "Create Purchase Order" node. Clicking on this node reveals the error details ("API call timeout") and shows the subsequent triggering of the degradation strategy and the generation of the pending task. The retrieval and analysis unit can calculate the daily failure rate of the procurement system API, providing data support for system optimization.

Summary: Through the close collaboration of the aforementioned five core modules, this invention constructs a large model intelligent execution engine with deep intent understanding, dynamic response to changes, stable and reliable execution, complete process transparency, and easy integration. It addresses the core pain points of existing technologies in handling complex and dynamic enterprise-level business scenarios, and provides a technical foundation for the reliable deployment of large models in critical production environments.

[0069] The above description is merely an example and illustration of the concept of the present invention. Those skilled in the art can make various modifications or additions to the specific embodiments described, or replace them with similar methods. As long as such changes do not deviate from the concept of the invention or exceed the scope defined in the claims, they all fall within the protection scope of the present invention.

Claims

1. A large model execution engine based on intent decoupling and dynamic planning, characterized in that it comprises:
a multimodal input parsing module, used to receive and process the user's multimodal input and convert it into a standardized structured request;
an intent recognition and decoupling module, connected to the multimodal input parsing module and used to identify the composite business intent in the standardized structured request and decompose it into a structured list of sub-intents with priorities and dependencies;
a dynamic planning and execution scheduling module, connected to the intent recognition and decoupling module and used to match resources to the sub-intent list, plan an execution chain, and dynamically adjust the execution chain based on runtime status;
a tool invocation and resource adaptation module, connected to the dynamic planning and execution scheduling module and used to invoke external business systems according to the planned execution chain; and
a white-box execution monitoring module, connected to the multimodal input parsing module, the intent recognition and decoupling module, the dynamic planning and execution scheduling module, and the tool invocation and resource adaptation module, and used to record, track, and visualize the data and status of the entire execution process.

2. The large model execution engine based on intent decoupling and dynamic planning according to claim 1, characterized in that the intent recognition and decoupling module comprises:
an intent recognition unit, used to perform multi-level intent classification on the standardized structured request based on a fine-tuned large language model;
a business manual matching unit, used to perform vector similarity matching between user requests and pre-imported business manual content to supplement business rules and operation steps; and
a sub-intent generation unit, connected to the intent recognition unit and the business manual matching unit and used to extract core actions based on the intent classification results and business rules, determine the priority and dependencies of each sub-intent, construct a directed acyclic graph to represent the dependencies, and output the structured sub-intent list.

3. The large model execution engine based on intent decoupling and dynamic planning according to claim 1, characterized in that the tool invocation and resource adaptation module comprises:
an OpenAPI call adapter, which supports multiple communication protocols and has built-in protocol templates for automatically filling in authentication information, performing parameter mapping, and parsing responses;
a third-party system access unit, which adopts a plug-in architecture and interacts with B/S architecture business systems through plug-ins that conform to standardized interface specifications; and
a call control unit, used to implement timeout control, a retry strategy based on an exponential backoff algorithm, dynamic rate limiting based on a token bucket algorithm, and load balancing for interface calls.

4. The large model execution engine based on intent decoupling and dynamic planning according to claim 1, characterized in that the white-box execution monitoring module comprises:
an execution log recording unit, used to record the operation logs of each module in a standardized format;
a link tracing unit, which, based on a distributed tracing protocol, generates a global tracing identifier to connect the operations of the entire process;
an execution link visualization unit, used to generate interactive flowcharts based on execution logs and link tracing data and to dynamically display the execution status of each node through color coding; and
a retrieval and analysis unit, which supports querying execution records based on multiple conditions and provides statistical analysis functions for execution success rate and average execution time.

5. The large model execution engine based on intent decoupling and dynamic planning according to claim 2, characterized in that the business manual matching unit supports importing business manuals in Markdown, Word, or Excel formats, uses a BERT model to convert the user request text and the business manual content into vectors, performs matching by calculating cosine similarity, and processes the matching results hierarchically based on preset similarity range thresholds, which define at least two different similarity levels.

6. The large model execution engine based on intent decoupling and dynamic planning according to claim 3, characterized in that the plug-ins of the third-party system access unit support multi-dimensional element positioning mechanisms, dynamic element recognition and waiting, layout change adaptation, and exception recovery and retry mechanisms, so as to cope with dynamic changes in business system pages.

7. A large model execution method based on intent decoupling and dynamic planning, characterized in that it is applied to the large model execution engine based on intent decoupling and dynamic planning according to any one of claims 1-6, the method comprising:
receiving multimodal input from a user, parsing and semantically standardizing the multimodal input, and outputting a standardized structured request;
performing intent recognition on the standardized structured request, matching it with relevant business manual rules, decomposing it into a list of sub-intents with priorities and dependencies, and constructing a dependency graph;
matching the optimal resources to the sub-intent list, and combining the resource status collected in real time with business rule constraints to plan and generate an initial execution chain;
invoking external business systems according to the execution chain, and applying timeout, retry, and rate limiting controls during invocation;
if a resource anomaly or a business rule change is detected during execution, triggering a local replanning to adjust the execution path of the affected sub-intents and executing a degradation strategy; and
recording execution data and status throughout the process, generating a visual execution chain diagram and a statistical analysis report, and supporting querying and auditing.