A multi-stage constraint code generation method and system for industrial scenarios
By employing a multi-stage constraint code generation method, a constraint logic tree and a directed acyclic graph are constructed. Combined with a large language model, controlled decoding and deviation determination are performed, solving the problem of multi-level constraint processing in industrial scenarios and achieving efficient and controllable code generation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUANGDONG UNIV OF TECH
- Filing Date
- 2026-03-19
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies struggle to effectively handle multi-level and multi-type constraints in code generation for industrial scenarios, leading to generated results that deviate from actual needs, high computational overhead, and a lack of fine-grained iterative repair mechanisms, thus limiting the engineering usability and reliability of code generation.
A multi-stage constraint code generation method is adopted, which constructs constraint logic trees and directed acyclic graphs, processes constraint sets in layers, combines large language models for controlled decoding, and performs deviation judgment and logic repair after generation to ensure that the generated code conforms to industry standards.
It improves the overall consistency and controllability of generated code, reduces computational overhead, minimizes syntax hallucinations and structural errors, and enhances the engineering adaptability of the generation process.
Figure CN122240084A_ABST
Description
Technical Field
[0001] This application relates to the field of computer technology, and in particular to a multi-stage constraint code generation method and system for industrial scenarios.
Background Technology
[0002] In recent years, artificial intelligence, especially code generation technology centered on Large Language Models (LLMs), has been profoundly impacting software engineering practices. By understanding requirements descriptions in natural language and directly generating executable code, LLM-driven code generation systems significantly lower the barrier to software development and accelerate system prototyping, feature iteration, and defect fixing processes. In general software domains, such as financial information systems, enterprise backend services, command-line tools, and data processing and analysis applications, this technology has demonstrated high practical value. However, compared to general application scenarios, specialized software systems, such as specific code for control, automation, and embedded environments (e.g., PLC programs, industrial control logic, equipment drivers, and real-time scheduling code), place much stricter demands on code determinism, verifiability, security, and standards compliance.
[0003] Traditional industrial code development relies heavily on human experience and domain knowledge. Developers must strictly adhere to established industry standards and specifications (such as IEC and ISO) and explicitly consider equipment constraints, real-time requirements, exception handling logic, and system-level security strategies during the design phase. Recent research has attempted to introduce LLMs into industrial code generation tasks, using pre-trained models fine-tuned with domain data and supplemented by prompt engineering and post-generation verification mechanisms to improve generation quality. However, existing methods remain insufficient when facing the highly complex and coupled constraints of real-world scenarios. As industrial requirement descriptions evolve from simple functional specifications into comprehensive specifications encompassing process semantics, control logic, timing constraints, and safety rules, systematically handling multi-level and multi-type constraints during code generation has become a key bottleneck restricting the practical deployment of LLMs.
[0004] In summary, the technical problems in the related art need to be addressed.
Summary of the Invention
[0005] The main objective of this application is to propose a multi-stage constraint code generation method and system for industrial scenarios, aiming to solve the above-mentioned problems.
[0006] To achieve the above objective, one aspect of this application proposes a multi-stage constraint code generation method for industrial scenarios, the method comprising:
obtaining natural language requirements, converting the natural language requirements into a set of constraint primitives, and constructing a constraint logic tree based on the set of constraint primitives;
traversing the nodes of the constraint logic tree and extracting feature vectors, applying a decision function to the feature vectors to divide the constraint primitive set into an early constraint set, and constructing a decoding intervention sequence for the early constraint set in combination with the dependency relationships of the constraint logic tree;
based on the decoding intervention sequence, triggering the large language model to decode the natural language requirements and generate lexical units, converting the early constraint set into a directed acyclic graph, and, during decoding, filtering the lexical units according to the state of the directed acyclic graph to generate a set of legal lexical units;
constructing a constraint code sequence based on the set of legal lexical units and, when the generated constraint code sequence reaches the preset termination condition, performing abstract syntax tree parsing on the constraint code sequence and outputting the successfully parsed constraint code.
[0007] In some embodiments, the method further includes: Applying a hierarchical decision function to the feature vectors, the set of constraint primitives is divided into a set of delayed constraints; Construct a verification execution sequence for the set of delay constraints.
[0008] In some embodiments, after outputting the successfully parsed constraint code, the method further includes: For the verification execution sequence in the set of delay constraints, a verification script is automatically synthesized; Based on the verification script and the generated constraint code, capture the running status of the constraint code; The deviation function is invoked to determine the deviation in the operating state; When the deviation determination result is greater than the threshold, the logic repair is triggered.
[0009] In some embodiments, triggering logic repair when the deviation determination result is greater than the threshold specifically includes: Locate the constraint code that violates the delayed constraint set and its corresponding constraint nodes; Construct repair suggestions by combining the natural language requirements with the context of the constraints in the code; Based on the repair suggestions, invoke the large language model to generate targeted code patches; Merge the patches into the constraint code and re-execute the relevant determinations until the constraint code satisfies the delayed constraint set or the maximum number of iterations is reached.
[0010] In some embodiments, constructing a constraint logic tree based on the set of constraint primitives specifically includes: The set of constraint primitives is converted into structured node objects; Call the dependency determination function to identify the semantic logical order of structured node objects; A directed graph is constructed based on the identified structure. Loop detection is performed on the directed graph and directed loops are removed to generate a directed acyclic graph. The directed acyclic graph is encapsulated as a constraint logic tree.
[0011] In some embodiments, constructing a decoding intervention sequence for the early constraint set specifically includes: Extract a subgraph from the directed acyclic graph that contains only nodes of the early constraint set; The subgraph is topologically sorted to generate a decoding intervention sequence, which includes a set of early constraint nodes and a set of dependencies between early constraint nodes.
[0012] In some embodiments, the running state of the constraint code, captured based on the verification script and the generated constraint code, includes a state identifier, the actually observed state vector, and the target state expected under the constraint.
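[0013] In some embodiments, a hierarchical decision function is applied to the feature vectors to divide the set of constraint primitives into a delayed constraint set, wherein the expression of the hierarchical decision function is:
$$\Phi(f_i)=\begin{cases}\mathrm{Early}, & \mathrm{type}_i=\mathrm{Syntax}\ \land\ \mathrm{target}_i\in\{\mathrm{Structure},\mathrm{Output}\}\\ \mathrm{Delay}, & \text{otherwise}\end{cases}$$
where $\Phi(f_i)$ is the layering result, $f_i$ is the feature vector of the constraint node, $\mathrm{type}_i$ indicates the constraint type, $\mathrm{target}_i$ indicates the objective of the constraint, $\mathrm{Syntax}$ denotes the syntactic constraint category, $\mathrm{Early}$ is the marker of the early constraint set, $\mathrm{Delay}$ is the marker of the delayed constraint set, $\mathrm{Structure}$ denotes a constraint on the structural level of the code, and $\mathrm{Output}$ denotes a constraint on the output level of the code.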
[0014] In some embodiments, in the process of calling the deviation function to determine the deviation of the running state, the deviation function is expressed in terms of the following quantities: the deviation value, the actual state vector observed during the runtime verification phase, the expected target state vector, an indicator function, and the deviation threshold.
[0015] To achieve the above objective, another aspect of this application proposes a multi-stage constraint code generation system for industrial scenarios, the system comprising:
a logic tree construction module, used to obtain natural language requirements, convert the natural language requirements into a set of constraint primitives, and construct a constraint logic tree based on the set of constraint primitives;
a constraint code layering module, used to traverse the nodes of the constraint logic tree and extract feature vectors, apply a decision function to the feature vectors, divide the constraint primitive set into an early constraint set, and construct a decoding intervention sequence for the early constraint set based on the dependency relationships of the constraint logic tree;
a controlled decoding module, used to trigger the large language model to decode the natural language requirements and generate lexical units based on the decoding intervention sequence, convert the early constraint set into a directed acyclic graph, and filter the lexical units according to the state of the directed acyclic graph during decoding to generate a set of legal lexical units;
a constraint code generation module, used to construct a constraint code sequence based on the set of legal lexical units and, when the generated constraint code sequence reaches a preset termination condition, perform abstract syntax tree parsing on the constraint code sequence and output the successfully parsed constraint code.
The embodiments of this application include at least the following beneficial effects: This application provides a multi-stage constraint code generation method and system for industrial scenarios. By introducing directed acyclic graph constraints in the early decoding stage, this application prunes the lexical probability space in real time, which keeps the generated constraint code sequence within legal syntax paths and thereby reduces syntax hallucinations and structural errors. This application also separates an early constraint set out of the constraint primitives and applies differentiated constraint strategies at different stages, which avoids introducing high-cost semantic or execution-level verification in the early decoding stage, reducing the overall computational overhead while maintaining generation quality and improving the controllability and engineering adaptability of the generation process.
Description of the Drawings
[0016] Figure 1 is a flowchart of a multi-stage constraint code generation method for industrial scenarios provided in an embodiment of this application;
Figure 2 is a flowchart illustrating the multi-stage constraint code method for industrial scenarios provided in an embodiment of this application;
Figure 3 is a flowchart of the logic repair process provided in an embodiment of this application;
Figure 4 is a schematic diagram of the generated code provided in an embodiment of this application;
Figure 5 is a block diagram of a multi-stage constraint code generation system for industrial scenarios provided in an embodiment of this application.
Detailed Implementation
[0017] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to limit it. In the following description, when referring to the accompanying drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with those of this application; they are merely examples of apparatuses and methods consistent with some aspects of the embodiments of this application as detailed in the appended claims.
[0018] It is understood that the terms "first," "second," etc., used in this application may describe various concepts, but unless otherwise stated, these concepts are not limited by these terms. These terms are only used to distinguish one concept from another. For example, without departing from the scope of the embodiments of this application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "at the time of," "when," or "in response to a determination."
[0019] As used in this application, regarding the terms "at least one," "multiple," "each," and "any": "at least one" includes one, two, or more; "multiple" includes two or more; "each" refers to every one of the corresponding multiple items; and "any" refers to any one of the multiple items.
[0020] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.
[0021] In related technologies, code generation techniques applied to industrial-specific code generation tasks exhibit fragmented, single-layered constraint modeling that is limited to type or structural constraints. They lack a unified representation and a collaborative execution mechanism for constraints at the syntactic, semantic, and behavioral levels, which hinders the systematic processing of multi-level and multi-type constraints and leaves the generated code with insufficient overall consistency and reliability. Furthermore, existing constraint processing methods rely heavily on manually predefined interfaces, templates, or static constraint directories and cannot automatically induce constraint logic and its dependencies from natural language prompts. This easily leads to incomplete constraint coverage and unidentified constraint conflicts, causing the generated results to deviate from actual requirements. In addition, existing generation processes lack a phased management mechanism and do not effectively distinguish structural enforcement in the early decoding stage from later semantic verification or execution testing, resulting in inefficient generation and increased computational cost. Existing methods also introduce complex decoding constraints or search strategies during the generation stage, leading to high computational overhead in long-sequence or large-scale prompt scenarios. Finally, constraint violations are usually handled by post-verification or wholesale rollback, without fine-grained iterative repair and dynamic adjustment mechanisms, which limits the engineering usability and reliability of the generated code.
Example 1
[0022] In view of this, referring to Figures 1-2, this embodiment of the application provides a multi-stage constraint code generation method for industrial scenarios, including:
S101: Obtain natural language requirements, convert the natural language requirements into a set of constraint primitives, and construct a constraint logic tree based on the set of constraint primitives;
S102: Traverse the nodes of the constraint logic tree and extract feature vectors, apply a decision function to the feature vectors, divide the constraint primitive set into an early constraint set, and construct a decoding intervention sequence for the early constraint set based on the dependency relationships of the constraint logic tree;
S103: Based on the decoding intervention sequence, trigger the large language model to decode the natural language requirements and generate lexical units, and convert the early constraint set into a directed acyclic graph; during decoding, filter the lexical units according to the state of the directed acyclic graph to generate a set of legal lexical units;
S104: Construct a constraint code sequence based on the set of legal lexical units; when the generated constraint code sequence reaches the preset termination condition, perform abstract syntax tree parsing on the constraint code sequence and output the successfully parsed constraint code.
[0023] Through steps S101-S104, directed acyclic graph constraints are introduced in the early decoding stage and the token-level probability space is pruned in real time, which keeps the generated constraint code sequence within legal grammatical paths and thereby reduces syntax hallucinations and structural errors. By constructing a constraint directed acyclic graph (DAG), the various constraints derived from the prompts are uniformly modeled as constraint nodes with explicit dependencies, and their execution and generation order is determined through topological sorting. This structured modeling approach enables constraints at different levels to work collaboratively within the same representation framework, avoiding constraint omissions or implicit conflicts. Furthermore, by explicitly defining the generation order, this invention prevents logical inversions such as "use first, define later" in complex business logic, improving the overall consistency and completeness of the generated results in scenarios with long logical chains.
[0024] By adopting differentiated constraint strategies at different stages, high-cost semantic or execution-level verification is avoided in the early decoding stage. This reduces the overall computational overhead while ensuring the quality of the generated product, and improves the controllability and engineering adaptability of the generation process.
[0025] Specifically, step S101 includes: Obtaining the original requirements: Receiving the original user input in natural language. The natural language requirements include functional descriptions, structural requirements, library constraints, and implicit logic.
[0026] Define extraction mapping: The natural language requirements are mapped to a set of constraint primitives using a constraint extraction mapping function, whose expression is:
$$\Psi(X)=\{\,c_i \mid c_i=(\phi_i(X),\ \tau_i,\ w_i),\ i=1,\dots,n\,\}$$
where $\Psi$ is the constraint extraction mapping function; $c_i$ is a constraint primitive; $\phi_i(X)$ is the $i$-th semantic parsing operator applied to the requirement text $X$, used to identify structural constraints, syntactic restrictions, or logical conditions; $\tau_i$ is the constraint type label; and $w_i$ is the weight obtained from the explicitness and default risk assessment. Through this semantic decoupling, $X$ is converted into the set of constraint primitives $C=\{c_1,\dots,c_n\}$.
[0027] Tuple property definition: For each primitive $c_i$, define the triplet properties $(s_i,\tau_i,w_i)$, which establish its text source, semantic tag, and mandatory weight, where $s_i=\phi_i(X)$ is the constraint text.
[0028] Instantiating constraint nodes: Each constraint primitive is transformed into a structured node object that explicitly defines its scope in the code space, i.e.:
$$n_i=(\mathrm{id}_i,\ \mathrm{scope}_i,\ \mathrm{type}_i,\ \mathrm{text}_i)$$
[0029] where $\mathrm{id}_i$ is the unique identifier of the constraint node; $\mathrm{scope}_i$ indicates the scope of the constraint in the generated code, which can be at the file level, class level, function level, or statement level; $\mathrm{type}_i$ indicates the category of the constraint, used to distinguish between syntactic constraints, semantic constraints, and behavioral constraints; and $\mathrm{text}_i$ is the text segment extracted from the original natural language requirement that corresponds to the constraint node.
[0030] Dependency determination: Call the dependency determination function $\mathrm{Dep}(n_i,n_j)$ to identify the semantic logical order between nodes.
[0031] The dependency determination function is as follows:
$$\mathrm{Dep}(n_i,n_j)=\begin{cases}1, & n_i\prec n_j\\ 0, & \text{otherwise}\end{cases}$$
The symbol "$\prec$" indicates a precedence relationship in semantic execution; for example, syntactic constraints take precedence over semantic constraints.
[0032] Directed graph modeling: When the determination result shows no logically conflicting dependencies, construct a directed graph $G=(V,E)$, in which each edge $e_{ij}\in E$ represents a structural or logical prerequisite dependency and each node in $V$ carries its feature vector $f_i$.
Topology verification and resolution: Perform a depth-first search (DFS) on the directed graph $G$ to detect cycles. If any cycle exists, i.e. $\mathrm{Cycle}(G)\neq\emptyset$, calculate the weight $w_i$ of each node and remove the incoming edges of the constraint node with the smallest weight in the cycle, repeating until no directed cycles remain in the graph, i.e. $\mathrm{Cycle}(G)=\emptyset$. Here $\mathrm{Cycle}(G)$ denotes the set of all directed cycles in the directed graph $G$; $\mathrm{Cycle}(G)=\emptyset$ indicates that the current constraint dependency graph contains no directed cycles, i.e., the graph is a directed acyclic graph (DAG).
[0033] Generating intermediate representations: The final directed acyclic graph is encapsulated into a constraint logic tree intermediate representation $T$, which completes the transformation from unstructured text to a computable structure. $T$ carries a metadata set $M$ associated with the constraint logic tree, used to record the execution stage marker, weight information, and hierarchical determination result of each constraint node, together with the feature vector $f_i$ of each node.
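The cycle detection and resolution step can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the names `find_cycle` and `break_cycles` are hypothetical, and it assumes that "incoming edges" means the incoming edges of the lowest-weight node that originate inside the detected cycle, a reading the text does not confirm.

```python
from typing import Dict, List, Optional, Set, Tuple

Edge = Tuple[str, str]  # (prerequisite node, dependent node)


def find_cycle(nodes: Set[str], edges: Set[Edge]) -> Optional[List[str]]:
    """Return the nodes of one directed cycle found by DFS, or None."""
    succ: Dict[str, List[str]] = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}
    stack: List[str] = []

    def dfs(u: str) -> Optional[List[str]]:
        color[u] = GRAY
        stack.append(u)
        for v in succ[u]:
            if color[v] == GRAY:  # back edge: v is on the current DFS path
                return stack[stack.index(v):]
            if color[v] == WHITE:
                found = dfs(v)
                if found:
                    return found
        stack.pop()
        color[u] = BLACK
        return None

    for n in sorted(nodes):
        if color[n] == WHITE:
            cycle = dfs(n)
            if cycle:
                return cycle
    return None


def break_cycles(nodes: Set[str], weights: Dict[str, float],
                 edges: Set[Edge]) -> Set[Edge]:
    """Remove edges until the dependency graph is acyclic."""
    remaining = set(edges)
    while (cycle := find_cycle(nodes, remaining)) is not None:
        weakest = min(cycle, key=lambda n: weights[n])
        # ASSUMPTION: drop only the weakest node's incoming edges that
        # come from other members of this cycle.
        remaining -= {(u, v) for (u, v) in remaining
                      if v == weakest and u in cycle}
    return remaining
```

Each pass removes at least the edge that closes the detected cycle, so the loop terminates; the recursive DFS is adequate for the small graphs a single requirement typically yields.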
[0034] Step S101 uses formal dependency modeling to explicitly encode the originally implicit "generation order" into structural constraints, fundamentally avoiding the semantic omissions and logical inversions common in traditional one-time generation.
[0035] In some embodiments, the mapping function for inducing constraints from natural language prompts is not limited to extraction methods based on large language models. For example, in scenarios such as SQL statement generation, protocol field conversion, or specific configuration language generation, parsers based on regular expressions, rule engines, or predefined XML or JSON templates can be used to perform structured parsing of input requirements, thereby directly generating a set of constraint primitives without calling a large language model.
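A rule-based extraction of this kind might look like the following Python sketch. The rule table, the type labels, and the weights are invented for illustration, and `ConstraintPrimitive` and `extract_primitives` are hypothetical names that do not come from the application.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConstraintPrimitive:
    text: str      # text segment taken from the requirement
    ctype: str     # "syntax", "semantic", or "behavior"
    weight: float  # mandatory weight


# ASSUMPTION: illustrative rules only; a real deployment would derive
# patterns and weights from the target domain.
RULES = [
    (re.compile(r"\b(function block|named|written in|declare)\b", re.I), "syntax", 1.0),
    (re.compile(r"\b(within|timeout|ms|seconds?)\b", re.I), "behavior", 0.8),
    (re.compile(r"\b(only if|unless|when)\b", re.I), "semantic", 0.6),
]


def extract_primitives(requirement: str) -> List[ConstraintPrimitive]:
    """Split the requirement into sentences and tag each with the first matching rule."""
    primitives = []
    for sentence in re.split(r"(?<=[.;])\s+", requirement.strip()):
        for pattern, ctype, weight in RULES:
            if pattern.search(sentence):
                primitives.append(
                    ConstraintPrimitive(sentence.rstrip(".;"), ctype, weight))
                break  # first matching rule wins
    return primitives
```

Sentences that match no rule are dropped, so in this sketch coverage depends entirely on the rule table.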
[0036] In some embodiments, the dependency determination function used to determine dependencies between constraints can also take different implementation forms. For example, a constraint graph can be directly generated by retrieving a pre-built programming domain knowledge graph and performing ontology-level analysis on classes, methods, library functions, and their calling relationships. This approach completes the derivation of constraint dependencies through explicit knowledge modeling, and is functionally equivalent to the dependency construction process based on inference or analysis in this invention.
[0037] Specifically, step S102 includes: Constructing feature vectors: Traverse the nodes of the constraint logic tree $T$ and extract a feature vector $f_i=(\mathrm{type}_i,\mathrm{scope}_i,\mathrm{target}_i)$ for each node, where $\mathrm{type}_i$ indicates the constraint type, $\mathrm{scope}_i$ indicates the code level or module scope on which the constraint acts, and $\mathrm{target}_i$ indicates the objective of the constraint. The constraint type takes one of three values: $\mathrm{Syntax}$ indicates syntactic constraints, $\mathrm{Semantic}$ indicates semantic constraints, and $\mathrm{Behavior}$ indicates behavioral constraints.
[0038] Execution timing classification: Apply the hierarchical decision function $\Phi$. According to whether $\mathrm{type}_i$ falls under the syntactic constraint category and whether $\mathrm{target}_i$ can be restricted during the decoding stage, the constraints are divided into the early constraint set $C_{early}$ and the delayed constraint set $C_{delay}$. The hierarchical decision function $\Phi$ can be defined as a rule function: if a constraint belongs to the syntax category and its target is the structural level or the output level of the code, so that it can be restricted during the decoding stage, it is classified into the early constraint set; otherwise, it is classified into the delayed constraint set. The expression of the hierarchical decision function is as follows:
$$\Phi(f_i)=\begin{cases}\mathrm{Early}, & \mathrm{type}_i=\mathrm{Syntax}\ \land\ \mathrm{target}_i\in\{\mathrm{Structure},\mathrm{Output}\}\\ \mathrm{Delay}, & \text{otherwise}\end{cases}$$
where $\Phi(f_i)$ is the layering result, $f_i$ is the feature vector of the constraint node, $\mathrm{type}_i$ indicates the constraint type, $\mathrm{target}_i$ indicates the objective of the constraint, $\mathrm{Syntax}$ denotes the syntactic constraint category, $\mathrm{Early}$ is the marker of the early constraint set, $\mathrm{Delay}$ is the marker of the delayed constraint set, $\mathrm{Structure}$ denotes a constraint on the structural level of the code, and $\mathrm{Output}$ denotes a constraint on the output level of the code.
[0039] Combining the dependencies of the constraint logic tree $T$, construct a decoding intervention sequence for $C_{early}$ and a verification execution sequence for $C_{delay}$, thereby achieving phased isolation of error types.
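The layering rule described above can be sketched in a few lines of Python. The string labels and the names `FeatureVector`, `classify`, and `partition` are assumptions made for illustration and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FeatureVector:
    ctype: str   # "syntax", "semantic", or "behavior"
    scope: str   # e.g. "file", "class", "function", "statement"
    target: str  # e.g. "structure", "output", "runtime"


EARLY, DELAYED = "early", "delayed"


def classify(f: FeatureVector) -> str:
    """Rule function: syntactic constraints on structure or output go early."""
    if f.ctype == "syntax" and f.target in {"structure", "output"}:
        return EARLY
    return DELAYED


def partition(features: Dict[str, FeatureVector]) -> Tuple[List[str], List[str]]:
    """Split node ids into (early set, delayed set), preserving input order."""
    early = [nid for nid, f in features.items() if classify(f) == EARLY]
    delayed = [nid for nid, f in features.items() if classify(f) == DELAYED]
    return early, delayed
```

Keeping the rule as a pure function of the feature vector makes it easy to audit why a given constraint was deferred to runtime verification.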
[0040] For the early constraint set $C_{early}$, the decoding intervention sequence is constructed from the dependency relationships in the constraint logic tree $T$. Specifically, first extract from $T$ the subgraph containing only the nodes of $C_{early}$, i.e. the subset of $T$ formed by the nodes corresponding to the early constraint set and the dependency edges between them, denoted $G_{early}$, where $V_{early}$ represents the set of early constraint nodes and $E_{early}$ represents the set of dependencies between early constraint nodes:
[0041] $$G_{early}=(V_{early},E_{early})$$
Then perform topological sorting on $G_{early}$ to obtain the decoding intervention sequence $S_{dec}$, which guides the order in which constraints are applied during model decoding:
$$S_{dec}=\langle c_1,c_2,\dots,c_K\rangle$$
where $c_k$ represents the $k$-th early constraint primitive in topological order. This sequence ensures that prerequisite constraints are satisfied before subsequent constraints are applied during generation; that is, if $c_j$ logically depends on $c_i$, then $c_i$ precedes $c_j$ in the sequence.
[0042] Based on this determination, the system divides the constraint set into the early constraint set $C_{early}$ and the delayed constraint set $C_{delay}$. This layered mechanism, combined with the dependencies of the logic tree, generates the corresponding execution sequences, so that structural and logical errors are handled independently at different stages, improving system controllability. For the early constraint set $C_{early}$, the decoding intervention sequence $S_{dec}$ is generated according to the topological order of its nodes in the logic tree $T$ and is used to progressively apply structural constraints during model decoding. For the delayed constraint set $C_{delay}$, a verification execution sequence $S_{ver}$ is generated, which is used to execute runtime verifications sequentially after code generation. The execution sequences specify the order in which constraints are applied, preventing verification conflicts and false positives.
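As a hedged illustration of the subgraph extraction and topological sorting described above, the sketch below uses the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later). The function name `decoding_intervention_sequence` and the edge-list representation are assumptions, not part of the application.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Set, Tuple


def decoding_intervention_sequence(
    edges: Iterable[Tuple[str, str]],  # (prerequisite, dependent) pairs of the full DAG
    early: Set[str],                   # ids of the early constraint nodes
) -> List[str]:
    """Topologically order the subgraph induced by the early constraint nodes."""
    # Map each early node to its early prerequisites; edges that touch a
    # delayed node are discarded, which yields the induced subgraph.
    preds: Dict[str, Set[str]] = {n: set() for n in early}
    for u, v in edges:
        if u in early and v in early:
            preds[v].add(u)
    # static_order() yields every prerequisite before the nodes that depend
    # on it and raises graphlib.CycleError if the subgraph is not acyclic.
    return list(TopologicalSorter(preds).static_order())
```

The relative order of independent constraints is not fixed by the sort, so a caller that needs reproducible sequences would have to add its own tie-breaking rule.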
[0043] Steps S101-S102 cover constraint induction from the natural language requirements, structured modeling, and planning of the execution sequences; they do not involve code generation or the decoding process of the large language model.
[0044] After the above steps are completed, the system has obtained an early constraint set for controlling the generation process and its corresponding decoding intervention sequence.
[0045] Specifically, step S103 includes: The rules in the early constraint set $C_{early}$ are compiled into a formal grammar $\mathcal{G}=(V,\Sigma,R,S)$, where $V$ represents the set of non-terminal symbols, $\Sigma$ represents the finite alphabet of symbols in a formal grammar or automaton, $R$ represents the set of production rules in the formal grammar, and $S$ represents the start symbol of the formal grammar. In a controlled decoding scenario based on a large language model, $\Sigma$ is the vocabulary, i.e. the set of all valid tokens used by the large language model, where each element $v\in\Sigma$ is the smallest semantic unit (token) in the set.
[0046] The above formal grammar $\mathcal{G}$ is converted into a directed acyclic graph $D=(Q,\delta,q_0)$, where $Q$ is the set of states, $\delta$ is the state transition function, and $q_0$ is the initial state.
[0047] Real-time state tracking: During decoding, the large language model (LLM) generates tokens progressively, each conditioned on the tokens already generated. The generated tokens are used to maintain the current state of the DFA in real time: with $q_t$ denoting the state at the start of decoding step $t$ and $x_t$ the token emitted at that step, $q_{t+1}=\delta(q_t,x_t)$.
[0048] Logits space mapping and masking: During controlled decoding, the set of valid tokens is calculated from the current state $q_t$ of the deterministic directed acyclic graph, where the set of valid tokens is determined by the reachable outgoing edges of the state. For tokens that do not belong to the valid set, a negative infinity mask is applied to their corresponding Logits values to correct the probability space of the model, yielding the corrected Logits vector $\tilde{z}_t$.
[0049] The specific methods for calculating valid tokens, constructing masks, and correcting Logits are as follows: In the controlled decoding process based on the deterministic directed acyclic graph $D=(Q,\delta,q_0)$, at each decoding step $t$ the present invention performs the following processing on the current generation state. First, based on the current state $q_t$ of the directed acyclic graph, the set of reachable states is calculated through the state transition function $\delta$:
$$Q_{next}(t)=\{\,\delta(q_t,v)\mid v\in\Sigma,\ \delta(q_t,v)\ \text{is defined}\,\}$$
where $Q_{next}(t)$ is the set of reachable states, $q_t$ represents the current state of the directed acyclic graph, and $v$ is a token in the vocabulary $\Sigma$.
[0050] Based on this, the set of legal lexical units is determined:
$$V_t=\{\,v\in\Sigma\mid \delta(q_t,v)\in Q_{next}(t)\,\}$$
where $V_t$ is the set of legal tokens.
[0051] Then, a mask vector $m_t$ is constructed based on the set of legal tokens:
$$m_t[v]=\begin{cases}0, & v\in V_t\\ -\infty, & v\notin V_t\end{cases}$$
where $m_t$ is the mask vector.
[0052] The mask vector is applied to the original Logits vector $z_t$ output by the large language model at step $t$ to obtain the corrected vector $\tilde{z}_t$:
$$\tilde{z}_t=z_t+m_t$$
The mask vector $m_t$ is the mathematical representation of the aforementioned negative infinity mask.
[0053] Specifically, the mask vector $m_t$ is a numerical sequence with the same dimension as the original vector $z_t$. For any token $v$ in the vocabulary, the corresponding component $m_t[v]$ is assigned according to the legality determination result: if $v$ is a valid token ($v\in V_t$), then $m_t[v]=0$; if $v$ is an invalid token ($v\notin V_t$), then $m_t[v]=-\infty$.
[0054] The above ensures that the generated tokens are always confined to a valid path that conforms to the formal syntax constraints.
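The masking procedure can be illustrated with a toy Python sketch. Everything concrete here is hypothetical: the five-token vocabulary, the transition table `DELTA`, and the greedy selection stand in for a real tokenizer, a compiled grammar, and the sampling strategy of an actual large language model.

```python
import math
from typing import Callable, Dict, List, Set, Tuple

# ASSUMPTION: a toy vocabulary and transition table for one IF statement.
VOCAB: List[str] = ["IF", "x", "THEN", "y", "END_IF"]
DELTA: Dict[Tuple[str, str], str] = {
    ("q0", "IF"): "q1",
    ("q1", "x"): "q2",
    ("q2", "THEN"): "q3",
    ("q3", "y"): "q4",
    ("q4", "END_IF"): "q5",
}


def legal_tokens(state: str) -> Set[str]:
    """Tokens labelling an outgoing edge of the current state."""
    return {tok for (s, tok) in DELTA if s == state}


def mask_logits(logits: List[float], state: str) -> List[float]:
    """Add 0 to legal tokens and -inf to all others."""
    legal = legal_tokens(state)
    return [z + (0.0 if tok in legal else -math.inf)
            for tok, z in zip(VOCAB, logits)]


def constrained_greedy_decode(
    logit_fn: Callable[[List[str]], List[float]],
    start: str = "q0",
    final: str = "q5",
    max_len: int = 10,
) -> List[str]:
    """Greedy decoding in which every step is restricted to legal tokens."""
    state, out = start, []
    while state != final and len(out) < max_len:
        masked = mask_logits(logit_fn(out), state)
        best = max(range(len(VOCAB)), key=lambda i: masked[i])
        if masked[best] == -math.inf:  # no legal continuation from this state
            break
        token = VOCAB[best]
        out.append(token)
        state = DELTA[(state, token)]  # real-time state tracking
    return out
```

With a stand-in model that always prefers `y`, the mask still forces the output along the only legal path, which is the behavior the paragraphs above describe.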
[0055] Specifically, step S104 includes: Based on the corrected Logits vector, the normalized probability distribution over the set of legal lexical units is calculated, and the constraint code sequence is constructed from the legal lexical units selected under this distribution until the generated constraint code sequence reaches the preset termination condition, such as generating a terminator or reaching the maximum length.
[0056] Furthermore, abstract syntax tree (AST) parsing is performed on the complete generated code sequence to verify whether it satisfies the structural constraints defined by the formal grammar.
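As a worked form of this normalization, and assuming the standard softmax is used (the text does not name the normalization), let $\tilde{z}_t$ be the corrected Logits vector at step $t$, $V_t$ the set of legal tokens, and $\Sigma$ the vocabulary; these symbols are introduced here for illustration.

```latex
p_t(v) \;=\; \frac{\exp\big(\tilde{z}_t[v]\big)}{\sum_{u \in \Sigma} \exp\big(\tilde{z}_t[u]\big)}
\;=\;
\begin{cases}
\dfrac{\exp\big(z_t[v]\big)}{\sum_{u \in V_t} \exp\big(z_t[u]\big)}, & v \in V_t,\\[2ex]
0, & v \notin V_t .
\end{cases}
```

Because $\exp(-\infty)=0$, masked tokens contribute nothing to the denominator, so the probability mass is renormalized over the legal tokens only.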
[0057] If the AST is successfully parsed, it is confirmed that the code is consistent with the constraints imposed during the decoding stage in terms of syntax and structure, and the code is output as the base code Code_base; if the parsing fails, it is considered as an exception and triggers regeneration or rollback processing.
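As a hedged illustration of the two operations in step S104, the sketch below normalizes masked logits and then checks a finished sequence with Python's standard `ast` module. The function names `masked_softmax` and `validate_with_ast` are hypothetical, and `ast.parse` only stands in for whatever formal syntax the target language defines.

```python
import ast
import math


def masked_softmax(corrected_logits):
    """Normalize over finite entries; -inf entries receive probability 0."""
    finite = [z for z in corrected_logits if z != -math.inf]
    if not finite:
        raise ValueError("no legal token at this step")
    peak = max(finite)  # subtract the maximum for numerical stability
    weights = [
        math.exp(z - peak) if z != -math.inf else 0.0
        for z in corrected_logits
    ]
    total = sum(weights)
    return [w / total for w in weights]


def validate_with_ast(code):
    """Return True when the code parses as Python, False on a syntax error."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


probs = masked_softmax([1.0, -math.inf, 1.0])
ok = validate_with_ast("def f(x: float) -> int:\n    return int(x)\n")
bad = validate_with_ast("def f(x: float -> int:\n    return\n")
```

A failed parse, as in the second call, would correspond to the exception branch that triggers regeneration or rollback.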
[0058] In some embodiments, the controlled decoding mechanism implemented through real-time Logits intervention in this invention during the code generation phase can also be replaced by a post-processing approach. For example, the system can first allow the generation of raw code, then use a syntax checker or abstract syntax tree parser to identify illegal structures, and perform targeted modifications on local code fragments using heuristic rules or specialized repair models, thereby ensuring that the generated result meets syntactic and structural constraints. Although this approach shifts constraint execution from the decoding phase, its goal remains to ensure that the generated code meets predetermined structural constraints.
[0059] Alternatively, a guided search strategy can be used during the decoding process. For example, a constraint scorer can be introduced into beam search to assign higher weights to token paths that meet the constraints, rather than directly masking out invalid tokens. In this way, the system still prioritizes compliant generation paths during the search, achieving a constraint-guided effect consistent with the controlled decoding concept of this invention.
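A minimal sketch of such constraint-guided beam search follows. The per-step log-probabilities and the scoring rule (a penalty for repeating the previous token) are assumptions made purely for illustration; penalizing violating paths is equivalent, in relative terms, to rewarding compliant ones.

```python
import math

# Hypothetical toy model: every step proposes the same log-probabilities.
STEP_LOGPROBS = {"a": math.log(0.6), "b": math.log(0.3), "c": math.log(0.1)}


def constraint_score(prefix, token):
    """Assumed soft scorer: penalize repeating the previous token."""
    return -2.0 if prefix and prefix[-1] == token else 0.0


def guided_beam_search(steps, beam_width):
    """Beam search whose path score is log-probability plus constraint score."""
    beams = [((), 0.0)]  # (token tuple, cumulative score)
    for _ in range(steps):
        candidates = [
            (prefix + (tok,), score + lp + constraint_score(prefix, tok))
            for prefix, score in beams
            for tok, lp in STEP_LOGPROBS.items()
        ]
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


best_sequence, best_score = guided_beam_search(steps=3, beam_width=2)[0]
```

Without the scorer the most probable path would repeat "a" three times; with it, the search prefers a path that alternates tokens, while no token is ever strictly forbidden.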
[0060] Furthermore, after step S104, the method also includes the following steps: S201: automatically synthesizing a verification script for the verification execution sequence of the delayed constraint set; S202: capturing the running state of the constraint code based on the verification script and the generated constraint code; S203: calling the deviation function to determine the deviation of the running state; S204: triggering logic repair when the deviation determination result is greater than the threshold.
[0061] Specifically, step S202 (capturing the running state of the constraint code based on the verification script and the generated constraint code) includes: executing the constraint code together with the verification script in a physically isolated sandbox environment and capturing the running state. The running state records the execution result of the code in the sandbox environment. It includes a status indicator showing whether execution succeeded, the observed state vector, and the target state expected under the constraints. These results serve as inputs for the subsequent deviation determination and repair decisions.
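The three parts of the running state can be sketched as a small record type. The names `RunState` and `run_and_capture` are hypothetical, and `exec` in a fresh namespace is used only to keep the example self-contained: it is not an isolated sandbox, and a practical system would run the code in a separate process, container, or virtual machine.

```python
from dataclasses import dataclass, field


@dataclass
class RunState:
    success: bool                                  # status indicator
    observed: dict = field(default_factory=dict)   # observed state
    expected: dict = field(default_factory=dict)   # target state under constraints


def run_and_capture(code, script, watch, expected):
    """Run code plus verification script and record the watched variables.

    NOTE: exec() is NOT a physically isolated sandbox; it is a stand-in
    used here for illustration only.
    """
    namespace = {}
    try:
        exec(code + "\n" + script, namespace)
    except Exception:
        return RunState(False, {}, expected)
    observed = {name: namespace.get(name) for name in watch}
    return RunState(True, observed, expected)


state = run_and_capture(
    code="def double(x):\n    return 2 * x\n",
    script="result = double(21)\n",
    watch=["result"],
    expected={"result": 42},
)
crashed = run_and_capture("raise RuntimeError('boom')", "", ["result"], {"result": 42})
```

The `crashed` record keeps the expected state but reports failure, so a later deviation check can tell an execution crash apart from a wrong value.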
[0062] Specifically, in step S203 the deviation function is called to determine the deviation of the running state. The deviation function is defined in terms of the deviation value, the actual state vector $V_{obs}$ observed during the runtime verification phase, the expected target state vector $V_{exp}$, an indicator function, and the deviation threshold.
[0063] The deviation function not only compares the Boolean values of the output results but also evaluates the similarity of the execution trace. The indicator function determines whether the output deviation exceeds the logic deviation threshold. The deviation value, which characterizes abnormal differences between the execution trajectory and the expected trajectory, can be calculated from the distance or similarity between the sequences of key state variables during execution, for example with a weighted distance function over the difference of the state vectors.
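One possible form of such a deviation check is sketched below. The weighted L1 distance, the treatment of a crash as infinite deviation, and all numeric values are assumptions for illustration; they are not the expression defined in this application.

```python
def weighted_distance(v_obs, v_exp, weights):
    """Weighted L1 distance between observed and expected state vectors."""
    return sum(w * abs(o - e) for o, e, w in zip(v_obs, v_exp, weights))


def deviation(success, v_obs, v_exp, weights, tau):
    """Return (deviation value, repair flag).

    Assumed form: a crash counts as infinite deviation; otherwise the
    flag is the indicator 1[distance > tau].
    """
    if not success:
        return float("inf"), 1
    d = weighted_distance(v_obs, v_exp, weights)
    return d, int(d > tau)


# Small numerical fluctuation: below the threshold, no repair.
jitter = deviation(True, [100.0, 0.01], [100.0, 0.0], [1.0, 1.0], tau=0.05)
# Actual logical violation: inventory is off by 20 units.
violation = deviation(True, [80.0, 0.0], [100.0, 0.0], [1.0, 1.0], tau=0.05)
```

With a threshold of 0.05, the fluctuation of 0.01 does not raise the repair flag, whereas the 20-unit mismatch does.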
[0064] Steps S201-S204 introduce a protective judgment mechanism based on the deviation function, which distinguishes execution crashes, numerical fluctuations, and actual logical violations. This avoids unnecessary full regeneration triggered by non-essential errors and thereby reduces invalid iterations. In the actual generation process, this strategy can reduce token consumption and iteration count in the repair phase while ensuring logical correctness, improving the system's operating efficiency in long-sequence and complex-constraint scenarios.
[0065] Referring to Figure 3, the logic repair in step S204 specifically includes: locating the constraint code that violates the delayed constraint set and the corresponding constraint nodes; constructing repair suggestions by combining the natural language requirements with the constraint context in the code; invoking the large language model, based on the repair suggestions, to generate targeted code patches; and merging the patches into the constraint code and re-executing the relevant determinations until the constraint code satisfies the delayed constraints or the maximum number of iterations is reached.
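The repair cycle can be sketched as a bounded loop. Here `check` and `make_patch` are hypothetical callables: the first stands in for sandbox verification plus deviation determination, the second for the large language model call that produces a patch. The toy code strings are likewise illustrative.

```python
def repair_loop(code, check, make_patch, max_iters):
    """Patch `code` until `check` passes or `max_iters` is exhausted.

    `check(code)` returns a list of violated constraint ids (empty = pass);
    `make_patch(code, violated)` stands in for the LLM patch call.
    Returns (code, iterations used, passed flag).
    """
    for iteration in range(max_iters):
        violated = check(code)
        if not violated:
            return code, iteration, True
        code = make_patch(code, violated)
    return code, max_iters, not check(code)


# Toy stand-ins, not from the source.
def check(code):
    return [] if "stock -= qty" in code else ["c8"]


def make_patch(code, violated):
    return code.replace("stock += qty  # undo", "stock -= qty  # undo")


buggy = "stock += qty\nstock += qty  # undo\n"
fixed, iters, ok = repair_loop(buggy, check, make_patch, max_iters=3)
```

The loop stops on whichever condition comes first, so a patch generator that never converges cannot iterate indefinitely.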
[0066] Furthermore, in some embodiments, during the closed-loop verification phase, the sandbox-based verification method of this invention can be replaced with formal verification methods. For example, formal methods such as Hoare logic, model checking, or SMT solvers can be used to mathematically verify whether the generated code satisfies given constraints. This alternative is particularly suitable for industrial control code generation scenarios with high requirements for security and determinism, and its technical purpose is consistent with the runtime verification mechanism in this invention.
[0067] Furthermore, the single-model repair mechanism used to correct the generated results can also be extended to a multi-model collaborative approach. For example, two or more models can be introduced to take the roles of generation and review, respectively, and potential defects can be continuously exposed and repaired through game-style or adversarial iteration until the generated results satisfy the entire delayed constraint set.
[0068] Referring to Figure 5, in another aspect of this application, a multi-stage constraint code generation system for industrial scenarios is proposed. The system includes: a logic tree construction module, used to obtain natural language requirements, convert the natural language requirements into a set of constraint primitives, and construct a constraint logic tree based on the set of constraint primitives; a constraint code layering module, used to traverse the nodes of the constraint logic tree and extract feature vectors, apply a decision function to the feature vectors to divide the constraint primitive set into an early constraint set, and construct a decoding intervention sequence for the early constraint set based on the dependency relationship of the constraint logic tree; a controlled decoding module, used to trigger the large language model, based on the decoding intervention sequence, to decode the natural language requirement and generate lexical units, convert the early constraint set into a directed acyclic graph, and filter the lexical units according to the state of the directed acyclic graph during the decoding process to generate a set of legal lexical units; and a constraint code generation module, used to construct a constraint code sequence based on the set of legal lexical units and, when the generated constraint code sequence reaches a preset termination condition, perform abstract syntax tree parsing on the constraint code sequence and output the successfully parsed constraint code.
Example 2
[0069] Building upon Example 1, this example uses a typical logic control module from an industrial automation scenario, a "MaterialScheduler with task rollback capability", to demonstrate how the system handles complex industrial constraints. The example simulates the core state monitoring, action command issuance, and anomaly-protection rollback logic of industrial control. Table 1. Constraint Summary Table
[0070] 1. Initial Input and Requirement Scenarios. The user's original requirement was: "Implement a Python class MaterialScheduler to manage the status of materials on the production line. It should include functions for material deposit, material withdrawal, current inventory query, operation log recording, and task cancellation."
[0071] 1.1 Constraint Induction and Logic Tree Construction. Referring to the constraint summary in Table 1 above, the constraints include maintaining a float-type balance and a list-type history; the cancellation logic (if the original task was an inbound task, reduce the inventory; if it was an outbound task, replenish the inventory; in both cases clean up the history stack); and the auto-increment logic for task sequence numbers. The system automatically identifies the logical depth of each node. For example, the transaction rollback node c8 is marked as dependent on the class structure, state variable, inbound instruction, and outbound instruction nodes. This dependency modeling ensures that, during generation, the system defines the underlying state variables before filling in complex business logic.
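The dependency ordering described here can be sketched with Python's standard `graphlib` module. The node names are hypothetical labels; only the rollback identifier c8 appears in this example, and the exact edges between the other nodes are assumed for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical node names; each key maps to the set of nodes it depends on.
dependencies = {
    "class_structure": set(),
    "state_variables": {"class_structure"},
    "inbound_command": {"state_variables"},
    "outbound_command": {"state_variables"},
    "c8_rollback": {
        "class_structure",
        "state_variables",
        "inbound_command",
        "outbound_command",
    },
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
```

The resulting order places the class structure first and the rollback node last. `TopologicalSorter` raises `graphlib.CycleError` when the graph contains a cycle, which is one way to implement the loop detection required before the graph is treated as acyclic.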
[0072] 1.2 Constraint Hierarchy Strategy Determination. Referring to Table 2, the system performs feature extraction and classification on the nodes of the logic tree. Table 2. Feature extraction and classification results
[0073] Early constraint set: constraints such as variable definition, dictionary format, type hints, and import restrictions are identified as "syntax / structure related" and enter the Logits intervention process.
[0074] Delayed constraint set: constraints such as global structure, complex undo semantics, and auto-incrementing IDs involve cross-method calls and dynamic runtime states; they are therefore deemed semantically related and enter the sandbox verification process.
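The split into the two sets can be sketched as a small decision function over a constraint's type and target. The rule used here (syntax-type constraints acting on the structural level are early, everything else is delayed) and the feature values assigned to each constraint are assumptions for illustration, not the decision function defined in this application.

```python
EARLY, DELAYED = "early", "delayed"


def layer(constraint_type, target):
    """Assumed rule: syntax constraints on code structure are early; others delayed."""
    if constraint_type == "syntax" and target == "structure":
        return EARLY
    return DELAYED


# Hypothetical (type, target) features for some constraints named in the example.
constraints = {
    "variable_definition": ("syntax", "structure"),
    "type_hint": ("syntax", "structure"),
    "import_restriction": ("syntax", "structure"),
    "undo_semantics": ("semantic", "output"),
    "auto_increment_id": ("semantic", "output"),
}
layers = {name: layer(*features) for name, features in constraints.items()}
early_set = sorted(name for name, tag in layers.items() if tag == EARLY)
```

Constraints tagged early would feed the Logits intervention process, while the remainder would be checked later in the sandbox.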
[0075] 1.3 Constraint Code Generation. During the LLM inference phase, the system loads the basic Python syntax and the DFA built from the early constraint set. For example, when the model predicts def collect_material(self, quantity), the DFA forces subsequent predictions to conform to the type hint format (e.g., float -> int:). If a candidate token does not conform, its Logit is set to negative infinity, so its probability becomes zero.
[0076] Referring to Figure 4, the initial output is a code block containing the initialization properties and the deposit / withdrawal logic framework. At this point, the code has achieved 100% format alignment in terms of structure.
[0077] 1.4 Closed-loop Verification and Repair. The system verifies the generated initial code. For the core constraint c8 (transaction rollback function), the verification module automatically synthesizes an industrial logic verification script that simulates executing the material entry instruction dispatch_material(100.0), then executes the task undo instruction undo_last_job(), and asserts that the real-time inventory stock has returned to its initial value of 0.0. The generated controlled code is merged with this verification script and run in a sandbox environment. In this case, the initially generated code logic is complete, and all eight delayed constraints pass verification. The deviation function finds the observed state vector $V_{obs}$ completely consistent with the expected state vector $V_{exp}$, so the incremental repair logic is not triggered.
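A verification script of the kind described for constraint c8 might look like the sketch below. The class body is a hypothetical minimal implementation written only so that the script can run; it is not the generated code shown in Figure 4. Only the names MaterialScheduler, dispatch_material, undo_last_job, and stock are taken from this example.

```python
class MaterialScheduler:
    """Hypothetical minimal scheduler; not the generated code of Figure 4."""

    def __init__(self) -> None:
        self.stock: float = 0.0
        self.history: list = []

    def dispatch_material(self, quantity: float) -> None:
        # Treated as the material entry (inbound) instruction, as in the text.
        self.stock += quantity
        self.history.append(("in", quantity))

    def undo_last_job(self) -> bool:
        """Revert the most recent job; return False when there is none."""
        if not self.history:
            return False
        kind, quantity = self.history.pop()
        # Undoing an inbound job reduces stock; an outbound job would restore it.
        self.stock += -quantity if kind == "in" else quantity
        return True


def verify_rollback() -> bool:
    """Verification script for c8: stock returns to 0.0 after an undo."""
    scheduler = MaterialScheduler()
    scheduler.dispatch_material(100.0)
    scheduler.undo_last_job()
    return scheduler.stock == 0.0
```

Calling `verify_rollback()` plays the role of the assertion on the real-time inventory; a False result would be passed on to the deviation determination.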
[0078] This application provides a multi-stage constraint code generation method for industrial scenarios, relating to the field of information technology. The method can be applied to a terminal, a server, or software running on either. In some embodiments, the terminal can be a smartphone, tablet, laptop, desktop computer, smart speaker, smartwatch, or in-vehicle terminal, but is not limited to these. The server can be configured as an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The server can also be a node server in a blockchain network. The software can be an application implementing the multi-stage constraint code generation method, but is not limited to the above forms.
[0079] This application can be used in a wide variety of general-purpose or special-purpose computer system environments or configurations. Examples include: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics devices, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices. This application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform specific tasks or implement specific abstract data types. This application can also be practiced in distributed computing environments where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.
[0080] The embodiments described in this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided by the embodiments of this application. As those skilled in the art will know, with the evolution of technology and the emergence of new application scenarios, the technical solutions provided by the embodiments of this application are also applicable to similar technical problems.
[0081] Those skilled in the art will understand that the technical solutions shown in the figures do not constitute a limitation on the embodiments of this application, and may include more or fewer steps than shown, or combine certain steps, or different steps.
[0082] Those skilled in the art will understand that all or some of the steps in the methods disclosed above, as well as the functional modules / units in the systems and devices, can be implemented as software, firmware, hardware, or suitable combinations thereof.
[0083] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units described above is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
[0084] The units described above as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0085] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.
[0086] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes multiple instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing programs, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0087] The preferred embodiments of the present application have been described above with reference to the accompanying drawings, but this does not limit the scope of the claims of the present application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and substance of the embodiments of the present application shall be within the scope of the claims of the present application.
Claims
1. A multi-stage constraint code generation method for industrial scenarios, characterized in that the method includes: obtaining natural language requirements, converting the natural language requirements into a set of constraint primitives, and constructing a constraint logic tree based on the set of constraint primitives; traversing the nodes of the constraint logic tree and extracting feature vectors, applying a decision function to the feature vectors to divide the constraint primitive set into an early constraint set, and constructing a decoding intervention sequence for the early constraint set in combination with the dependency relationship of the constraint logic tree; based on the decoding intervention sequence, triggering the large language model to decode the natural language requirement and generate lexical units, converting the early constraint set into a directed acyclic graph, and filtering the lexical units according to the state of the directed acyclic graph during the decoding process to generate a set of legal lexical units; and constructing a constraint code sequence based on the set of legal lexical units and, when the generated constraint code sequence reaches the preset termination condition, performing abstract syntax tree parsing on the constraint code sequence and outputting the successfully parsed constraint code.
2. The method of claim 1, wherein, The method further includes: Applying a hierarchical decision function to the feature vectors, the set of constraint primitives is divided into a set of delayed constraints; Construct a verification execution sequence for the set of delay constraints.
3. The method of claim 2, wherein, After outputting the successfully parsed constraint code, the following is also included: For the verification execution sequence in the set of delay constraints, a verification script is automatically synthesized; Based on the verification script and the generated constraint code, capture the running status of the constraint code; The deviation function is invoked to determine the deviation in the operating state; When the deviation determination result is greater than the threshold, the logic repair is triggered.
4. The method of claim 1, wherein, When the deviation determination result is greater than the threshold, the logic repair is triggered, specifically including: locating the constraint code that violates the delayed constraint set and the corresponding constraint nodes; constructing repair suggestions by combining the natural language requirements with the constraint context in the code; invoking the large language model, based on the repair suggestions, to generate targeted code patches; and merging the patches into the constraint code and re-executing the relevant determinations until the constraint code satisfies the delayed constraints or the maximum number of iterations is reached.
5. The method of claim 1, wherein, The construction of the constraint logic tree based on the set of constraint primitives specifically includes: The set of constraint primitives is converted into structured node objects; Call the dependency determination function to identify the semantic logical order of structured node objects; A directed graph is constructed based on the identified structure. Loop detection is performed on the directed graph and directed loops are removed to generate a directed acyclic graph. The directed acyclic graph is encapsulated as a constraint logic tree.
6. The method of claim 3, wherein, The construction of the decoding intervention sequence for the early constraint set specifically includes: Extract a subgraph from the directed acyclic graph that contains only nodes of the early constraint set; The subgraph is topologically sorted to generate a decoding intervention sequence, which includes a set of early constraint nodes and a set of dependencies between early constraint nodes.
7. The method of claim 3, wherein, The running state captured based on the verification script and the generated constraint code includes a state identifier, the actually observed state vector, and the target state expected based on the constraint.
8. The method of claim 2, wherein, The hierarchical decision function is applied to the feature vector to divide the set of constraint primitives into a delayed constraint set. The hierarchical decision function maps the feature vector of a constraint node, which comprises a constraint type and a constraint target, to a hierarchical result that is either an early constraint set flag or a delayed constraint set flag, distinguishing syntax category constraints, constraints that act on the structural level of the code, and constraints that act on the output level of the code.
9. The multi-stage constraint code generation method for industrial scenarios according to claim 3, characterized in that, In the process of calling the deviation function to determine the deviation in the running state, the deviation function is defined in terms of the deviation value, the actual state vector observed during the runtime verification phase, the expected target state vector, an indicator function, and the deviation threshold.
10. A multi-stage constraint code generation system for industrial scenarios, characterized in that, The system includes: A logic tree construction module is used to obtain natural language requirements, convert the natural language requirements into a set of constraint primitives, and construct a constraint logic tree based on the set of constraint primitives. The constraint code layering module is used to traverse the nodes of the constraint logic tree and extract feature vectors, apply a decision function to the feature vectors, divide the constraint primitive set into an early constraint set, and construct a decoding intervention sequence for the early constraint set based on the dependency relationship of the constraint logic tree. The controlled decoding module is used to trigger the large language model to decode the natural language requirement and generate lexical units based on the decoding intervention sequence, convert the early constraint set into a directed acyclic graph, and filter the lexical units according to the state of the directed acyclic graph during the decoding process to generate a set of legal lexical units. The constraint code generation module is used to construct a constraint code sequence based on a set of legal lexical units. When the generated constraint code sequence reaches a preset termination condition, the module performs abstract syntax tree parsing on the constraint code sequence and outputs the successfully parsed constraint code.