Intelligent unit test generation and enhancement method and system fusing large language models and fuzz testing

By combining multi-level static analysis, compiler feedback iterative correction, and AST analysis with fuzz testing techniques, high-quality test code is generated and solidified, solving the problems of lack of semantics and usability in test code, and achieving efficient test asset generation and self-enhancement.

CN122240476A · Pending · Publication Date: 2026-06-19 · TIANJIN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
TIANJIN UNIV
Filing Date
2026-03-10
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies lack semantics and maintainability when generating test code. The engineering usability of LLM-generated results is poor, and fuzz testing results are difficult to solidify, resulting in test code that cannot be directly integrated and executed, and it is difficult to effectively reach complex logical paths.

Method used

Initial test cases are generated through multi-level static analysis and a structured boundary condition knowledge base. They are then iteratively corrected based on compiler feedback. Static unit tests are transformed into fuzz test drivers using AST parsing and refactoring operators. Finally, fuzz test results are solidified into maintainable regression test cases through instrumentation monitoring and semantic decoding.

Benefits of technology

It significantly improves the readability and maintainability of test code, increases the compilation pass rate, thereby generating engineering assets that can be directly integrated and executed, enhances the self-enhancement capability of the test suite, improves coverage and the ability to discover deep logical defects, and reduces the maintenance cost of test cases.



Abstract

This invention discloses an intelligent unit test generation and enhancement method and system that integrates large language models (LLMs) and fuzzing. By statically parsing project code and combining it with a structured boundary knowledge base, it guides an LLM to generate initial test cases with semantic depth. An iterative correction loop based on compiler feedback is established to ensure that the generated code compiles. The compilable static tests are transformed into fuzzing drivers for deep path exploration. High-value inputs discovered in fuzzing are decoded and solidified into strongly typed regression test cases, which are then fed back into the test suite. This invention combines the semantic understanding of LLMs with the exploration breadth of fuzzing, significantly improving the quality, coverage, and usability of generated tests, and enabling the automatic evolution of test assets.

Description

Technical Field

[0001] This invention relates to the field of software quality assurance and automated testing, specifically to an intelligent generation and enhancement method and system for unit tests that combines the logical generation capability of large language models with the dynamic exploration capability of fuzz testing.

Background Technology

[0002] In the field of automated software testing, existing technologies mainly face the following bottlenecks. Limitations of traditional automated test generation tools, such as search-based software testing (SBST) tools: these tools (e.g., EvoSuite) can generate high-coverage test cases, but the generated test code often consists of random inputs lacking business semantics, resulting in poor readability and maintainability, a phenomenon known as test smells. More importantly, when faced with deep logical paths that require specific domain knowledge or complex combinations of conditions, their heuristic search algorithms are prone to path explosion or to exploring too little of the search space, making it difficult to reach the target effectively.

[0003] The shortcomings of test generation techniques based on large language models (LLMs): solutions like ChatTester leverage the semantic understanding capabilities of LLMs to generate test code that better matches how humans write tests. However, LLMs are inherently probabilistic models, and they are prone to "hallucinations": code containing undefined references, illegal object constructions, or missing environment dependencies. As a result, test cases fail to compile or execute in real-world engineering environments, and usability is low.

[0004] The shortcomings of traditional fuzzing techniques: Coverage-guided fuzzing techniques, represented by AFL and libFuzzer, are highly efficient in exploring deep program paths and discovering crashes, but the test inputs they generate are usually machine-readable byte sequences, lacking readable assertion logic and business semantics, making it difficult to directly integrate them into test suites as maintainable regression test cases.

[0005] The unit test defined in this invention is a composite concept: that is, a unit test is semantic test logic generated by a large language model (solving the "no semantics" problem) + executable code ensured by a compilation feedback loop (solving the "unusable" problem) + an evolvable asset formed by fuzz test-driven and feedback mechanisms (solving the "difficult to solidify" problem).

[0006] In summary, existing technologies suffer from a disconnect between three key aspects: the lack of semantics and maintainability in generated tests, poor usability of LLM-generated code, and the inability to convert dynamic test results into static assets. A key technical challenge is how to deeply integrate the LLM's semantic understanding, fuzz testing's broad dynamic exploration capability, and the generation of usable test assets into a closed loop.

Summary of the Invention

[0007] This invention aims to address three major problems in automated unit test generation: lack of semantics in test code, poor engineering usability of large model generation results, and difficulty in solidifying fuzzing results. It proposes an intelligent unit test generation and enhancement method and system that integrates large language models and fuzzing. Through static analysis based on AST, it extracts context from multiple dimensions, including project dependencies, file structure, and cross-file relationships, providing accurate knowledge background for the large model, suppressing fictitious references, and ensuring that the generated test code is highly aligned with the actual engineering environment, thereby improving the efficiency and practicality of test automation.

[0008] To achieve the above objectives, the present invention proposes the following technical solution. Firstly, this invention proposes an intelligent unit test generation and enhancement method that integrates large language models and fuzz testing, comprising the following steps:

S1, generating initial test cases based on multi-level static analysis and a structured boundary condition knowledge base: perform multi-level static analysis on the project under test to extract the code context; fuse the code context with the knowledge base, and adopt a chain-of-thought (CoT) prompting strategy to guide the large language model (LLM) through logic analysis and test planning, generating initial unit test cases that comprise assertion logic and input data;

S2, performing iterative self-correction based on compiler feedback: compile the initial unit test cases generated in S1; when compilation fails, capture the compiler error stack information and combine it with the current code context to construct an instruction that guides the LLM's iterative correction, until unit test code that compiles successfully is obtained;

S3, converting static unit tests into fuzz test drivers for dynamic exploration: take the compilable test code obtained in S2 as the static unit tests to be converted; parse the abstract syntax tree (AST) with the static tool tree-sitter, and use the AST parsing and reconstruction operator to locate the hard-coded constant parameters and replace them with a dynamic byte stream injection interface driven by the fuzzing engine, thereby converting each static unit test into a fuzz test driver;

S4, capturing fuzz test results and feeding them back to the static test suite: while the driver executes in the fuzz test engine, identify through instrumentation monitoring the incremental inputs that improve coverage or trigger anomalies; using the parameter type signatures of the function under test obtained from the AST, semantically decode the byte stream slices of each incremental input and map the result to a strongly typed parameter set that conforms to the semantics of the source code; then solidify this set into new regression test cases and integrate them into the original test suite.

[0009] In some implementations, the multi-level static parsing in step S1 is implemented using the Tree-sitter tool and specifically includes: project dependency parsing, which extracts external libraries and their version information from the build configuration file to form a project environment set; intra-file logic parsing, which parses class files to extract method signatures, attributes, and internal logic flow, forming a logic flow set; and cross-file parsing, which identifies inter-class dependencies and inheritance hierarchies through symbol references to construct a dependency matrix. A parsing function processes these three results, and their union yields the code context.
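The union of the three parsing results can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `CodeContext`, `parse_dependencies`, `build_context`, and `as_union`, as well as the `name==version` build-file format, are assumptions made for illustration, and the Tree-sitter parsing itself is replaced by already-extracted inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeContext:
    """Hypothetical container for the three parsing results."""
    dependencies: frozenset  # (library, version) pairs from the build file
    logic_flow: frozenset    # method signatures and attributes inside files
    cross_file: frozenset    # (class, depends_on) edges across files


def parse_dependencies(build_lines):
    """Extract (library, version) pairs from lines such as 'gtest==1.14.0'.

    The 'name==version' format is an assumption for this sketch.
    """
    pairs = set()
    for line in build_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, version = line.partition("==")
        pairs.add((name.strip(), version.strip()))
    return frozenset(pairs)


def build_context(build_lines, signatures, edges):
    """Combine the three analysis levels into one context object."""
    return CodeContext(
        dependencies=parse_dependencies(build_lines),
        logic_flow=frozenset(signatures),
        cross_file=frozenset(edges),
    )


def as_union(ctx):
    """Flatten the context into a single tagged set (the 'union' of the levels)."""
    return (
        {("dep", d) for d in ctx.dependencies}
        | {("sig", s) for s in ctx.logic_flow}
        | {("edge", e) for e in ctx.cross_file}
    )
```

Tagging each element with its level keeps the union unambiguous when, for example, a library name and a class name happen to coincide.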

[0010] In some implementations, in step S1, the typical error patterns solidified in the structured boundary condition knowledge base include at least: null references; overflow and underflow of numeric types; and, for collection data structures, empty collections, single-element collections, and boundary handling of very large collections. The initial unit test cases are generated by invoking the large model on the fusion of the knowledge base with the current code context.

[0011] In this generation relation, the fusion term denotes the feature fusion between the knowledge base and the current code context; the code context comprises the function body of the function under test, its parameter type signatures, and the function signatures and member variables of the structure or class to which it belongs; the generation term denotes the invocation of the large model; and the output is the generated assertion logic together with the set of input data.
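A minimal Python sketch of how a structured knowledge base might be fused with the code context into a chain-of-thought prompt is given below. The dictionary layout, the type-to-category heuristics in `categories_for`, and the prompt wording are all assumptions made for illustration; the patent text does not specify them, and the LLM call itself is omitted.

```python
# Hypothetical structured knowledge base: category -> boundary patterns.
BOUNDARY_KNOWLEDGE_BASE = {
    "reference": ["null reference"],
    "numeric": ["overflow", "underflow"],
    "collection": [
        "empty collection",
        "single-element collection",
        "very large collection",
    ],
}

NUMERIC_TYPES = {"int", "long", "short", "float", "double"}
COLLECTION_HINTS = ("vector", "list", "map", "[]")


def categories_for(param_type):
    """Map a parameter type string to knowledge-base categories (heuristic)."""
    t = param_type.strip().lower()
    cats = []
    if t.endswith("*") or t.startswith("optional"):
        cats.append("reference")
    if t in NUMERIC_TYPES:
        cats.append("numeric")
    if any(hint in t for hint in COLLECTION_HINTS):
        cats.append("collection")
    return cats


def build_prompt(function_source, param_types, context_summary):
    """Fuse the matching boundary patterns with the code context into a prompt."""
    patterns = []
    for param_type in param_types:
        for cat in categories_for(param_type):
            for pattern in BOUNDARY_KNOWLEDGE_BASE[cat]:
                if pattern not in patterns:
                    patterns.append(pattern)
    checklist = "\n".join(f"- {p}" for p in patterns) or "- (no matching pattern)"
    return (
        "Project context:\n" + context_summary + "\n\n"
        "Function under test:\n" + function_source + "\n\n"
        "Boundary conditions to consider:\n" + checklist + "\n\n"
        "Step 1: explain the function's logic and risks.\n"
        "Step 2: plan one test per boundary condition above.\n"
        "Step 3: output the assertions and input data for each test.\n"
    )
```

Selecting only the patterns relevant to the parameter types keeps the prompt short; the three numbered steps mirror the analyze, plan, then generate order described in the text.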

[0012] In some implementations, the iterative self-correction in step S2 follows this logic: the test code of each iteration round is checked by a compile-and-verify function. If compilation succeeds, that code is taken as the final unit test code; otherwise, provided the round count is below the preset maximum, the LLM is invoked on the compilation error text and the context of the function under test to produce the test code of the next round.

[0013] Here, the threshold is the preset maximum number of repair iterations; the round index identifies the current iteration; the unit test code is that of the given round; the error text is the compilation error output obtained from the compiler when compilation fails; and the context is the context information of the function under test.

[0014] In some implementations, step S3 further includes automated scheduling fusion of drivers: multiple fuzz test drivers targeting the same function under test are encapsulated as branches of a switch-case structure within a single main driver. In the conversion that produces each driver, the AST parsing and reconstruction operator is applied to the original compilable unit test code; it keeps the structure of that code unchanged and replaces specific internal elements, namely substituting the dynamic byte stream injection interface controlled by the fuzzing engine for each original hard-coded constant parameter. The leading byte B0 of the fuzz test input stream serves as the branch selector. Each test branch is assigned a weight for its test operators; the weight depends on the coverage information fed back from the previous round of fuzz testing and is recomputed by a weight update algorithm.

[0015] In some implementations, the semantic decoding in step S4 specifically involves: for the type signature of each parameter, applying the corresponding parsing rule to the matching byte stream slice to construct an integer, string, or complex object instance that conforms to the project's calling conventions, and finally combining these instances into a strongly typed parameter set.

[0016] Here, the semantic decoding function maps the slices of a high-value byte stream to typed values; the type signature of the j-th parameter selects the parsing rule for its slice; and j ranges over the total number of parameters.
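Type-directed decoding of a byte stream can be illustrated as follows. This is a sketch under stated assumptions: the byte layout (4-byte little-endian integers, one-byte booleans, one-byte length prefix for strings) and the function name `decode_params` are invented for the example, since the patent text does not give the actual parsing rules.

```python
import struct


def decode_params(data, type_signatures):
    """Decode a raw byte stream into typed values, one slice per signature.

    Assumed layout: 'int' = 4 bytes little-endian signed, 'bool' = 1 byte,
    'str' = 1 length byte followed by that many UTF-8 bytes.
    Missing bytes are treated as zero so truncated inputs still decode.
    """
    params = []
    offset = 0
    for sig in type_signatures:
        if sig == "int":
            chunk = data[offset:offset + 4].ljust(4, b"\x00")
            params.append(struct.unpack("<i", chunk)[0])
            offset += 4
        elif sig == "bool":
            chunk = data[offset:offset + 1].ljust(1, b"\x00")
            params.append(bool(chunk[0] & 1))
            offset += 1
        elif sig == "str":
            length = data[offset] if offset < len(data) else 0
            offset += 1
            raw = data[offset:offset + length]
            params.append(raw.decode("utf-8", errors="replace"))
            offset += length
        else:
            raise ValueError(f"unsupported type signature: {sig}")
    return params
```

For the decoded values to reproduce the fuzzing run, the decoder has to consume bytes in the same order and with the same widths as the driver did.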

[0017] Secondly, this invention also proposes an intelligent unit test generation and enhancement system integrating large language models and fuzz testing, used to implement the intelligent unit test generation and enhancement method described in any one of the claims. The system comprises:

The perception generation module, used to generate initial test cases based on multi-level static analysis and a structured boundary condition knowledge base. As shown in the knowledge analysis phase of Figure 1, multi-level static analysis is first performed on the project under test to extract the code context; the code context is fused with the knowledge base, and a chain-of-thought (CoT) prompting strategy produces the prompt shown in Figure 2, which guides the large language model (LLM) through logic analysis and test planning to generate initial unit test cases comprising assertion logic and input data.

The self-correcting module, used to compile the generated initial unit test cases; when compilation fails, it captures the compiler error stack information and combines it with the current code context to construct instruction templates that guide the LLM's iterative refinement, until unit test code that compiles successfully is obtained.

The driver conversion module, used to take the compilable test code as the static unit tests to be converted: the LLM is used to parse the abstract syntax tree (AST), and the AST parsing and reconstruction operator locates the hard-coded constant parameters and replaces them with a dynamic byte stream injection interface driven by the fuzzing engine, thereby converting each static unit test into a fuzz test driver.

The results module, used to execute the driver in the fuzz testing engine. During execution, incremental inputs that improve coverage or trigger anomalies are identified through instrumentation monitoring; using the parameter type signatures of the function under test obtained from the AST, the byte stream slices of each incremental input are semantically decoded and mapped to a strongly typed parameter set that conforms to the semantics of the source code; this set is then solidified into new regression test cases and integrated into the original test suite.
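The idea of keeping only incremental inputs can be sketched in a few lines of Python. Real fuzzing engines obtain coverage from compiler-inserted instrumentation; here, as a loudly simplified stand-in, the target function itself returns the set of branch labels it reached. The names `collect_incremental_inputs` and `toy_target` are hypothetical.

```python
def collect_incremental_inputs(inputs, run_fn):
    """Keep inputs that reach new coverage or raise an exception.

    run_fn(data) must return an iterable of covered branch labels; this
    replaces real instrumentation for the purpose of the sketch.
    """
    seen = set()
    kept = []
    for data in inputs:
        try:
            covered = set(run_fn(data))
            crashed = False
        except Exception:
            covered, crashed = set(), True
        if crashed:
            kept.append((data, "crash"))
        elif covered - seen:
            kept.append((data, "coverage"))
        seen |= covered
    return kept


def toy_target(data):
    """Toy function under test with a defect behind a two-byte prefix."""
    hit = {"entry"}
    if data[:1] == b"A":
        hit.add("branch_a")
        if data[1:2] == b"B":
            raise ValueError("deep defect")
    return hit
```

Inputs that neither crash nor add coverage are dropped, which is what keeps the solidified regression suite small.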

[0018] In some implementations, the static parsing function in the perception generation module is specifically implemented by an integrated Tree-sitter parsing engine, which is used to perform project dependency parsing, intra-file logic parsing, and cross-file association parsing.

[0019] In some implementations, the typical error patterns solidified in the structured boundary condition knowledge base include at least: null references; overflow and underflow of numeric types; and, for collection data structures, empty collections, single-element collections, and boundary handling of very large collections.

[0020] Compared with the prior art, the beneficial effects of the technical solution of the present invention are: 1) By providing precise code context through multi-level static parsing, combined with a structured boundary condition knowledge base and a chain-of-thought prompting strategy, the invention guides the large language model (LLM) to perform expert-level logical reasoning and test planning. This overcomes the shortcoming of traditional automated generation tools (such as SBST), whose test code lacks business semantics. The resulting initial unit test cases, comprising assertion logic and input data, have higher readability, maintainability, and coverage of complex business paths, significantly improving the quality and logical depth of unit test generation.

[0021] 2) By constructing an iterative self-correcting closed loop based on compiler feedback, the compilation error stack is fed back to the LLM in real time for targeted repair. This mechanism raises the compilation pass rate of LLM-generated code from an extremely low level (2.05%) to an average of over 84.48%, so that the generated test cases are no longer merely potentially correct text but engineering assets that can be directly integrated and executed, solving the code unusability problem caused by LLM hallucination.

[0022] 3) The validated static test cases are parsed and refactored via the abstract syntax tree (AST) and transformed into fuzz test drivers, allowing fuzz testing to perform deep mutation exploration on templates with a sound semantic foundation. This combination lets testing reach complex boundary paths that are difficult to cover with the LLM's static reasoning alone. Experimental data in the documentation show that this method improves line coverage from 41.36% to 82.14% and branch coverage from 29.25% to 53.28%, effectively uncovering deep logical defects.

4) Through instrumentation monitoring and semantic decoding, the system reverse-maps the byte streams of high-value incremental inputs discovered by fuzz testing into strongly typed regression test cases that conform to the semantics of the source code, and automatically adds them to the test suite. This achieves a complete closed loop from dynamic exploration and discovery to static asset accumulation, giving the test suite self-enhancing capabilities and significantly reducing the long-term cost of manually maintaining and supplementing test cases.

[0023] 5) Optimized test resource allocation improves exploration efficiency: through the driver scheduling fusion mechanism, the system dynamically adjusts the input weights of different test branches based on coverage feedback. This allows the computational power of fuzz testing to be focused on more complex logical regions with greater potential, thereby improving the overall efficiency of path exploration.

Description of the Drawings

[0024] Figure 1 is a schematic diagram of the overall process of the intelligent unit test generation and enhancement method integrating large language models and fuzz testing according to the present invention.

[0025] Figure 2 is a block diagram of the intelligent unit test generation and enhancement system integrating large language models and fuzz testing according to the present invention.

[0026] Figure 3 is a schematic diagram illustrating the technical route of a specific implementation of the present invention.

[0027] Figure 4 is an architecture diagram of the unit test case generation phase.

[0028] Figure 5 is a diagram of the prompt used to generate unit test cases.

[0029] Figure 6 is an architecture diagram of the unit test enhancement phase.

Detailed Implementation

[0030] To enable those skilled in the art to clearly understand and implement the present invention, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are only for explaining the present invention and are not intended to limit the scope of protection of the present invention.

[0031] Example 1: As shown in Figure 1, this invention provides an intelligent unit test generation and enhancement method that integrates large language models and fuzz testing; Figure 2 shows the technical route of the invention. The specific steps of the method are described below with reference to Figure 1 and Figure 2.

Step S1, generating initial test cases based on multi-level static analysis and a structured boundary condition knowledge base. Specifically, multi-level static analysis covering project dependencies, intra-file logic, and cross-file associations is performed on the project under test to extract the code context. The code context is then fused with a structured boundary condition knowledge base containing typical error patterns, and a chain-of-thought (CoT) prompting strategy guides the large language model (LLM) through logic analysis and test planning to generate initial unit test cases comprising assertion logic and input data.

This phase uses an integrated multi-layer static parsing engine to systematically scan and analyze the project under test. The parsing scope covers the entire context, from function-internal logic and class structure to inter-module dependencies and the overall project topology, extracting the key global information that constitutes the precise semantic background of the code. The system integrates this structured context with a predefined structured boundary condition knowledge base, which systematically captures the experience of testing experts with null values, overflows, and complex data structure boundaries as reusable test logic.
Based on this, a chain-of-thought (CoT) prompting strategy guides the large language model through progressive reasoning: first, it simulates a developer's analysis of the code's business logic and potential risks; then, based on the boundary conditions in the knowledge base, it systematically deduces possible anomalies, thereby constructing a test plan in advance. This process overcomes the logical discontinuity of traditional methods and ultimately generates high-quality initial unit test cases that are logically sound and able to cover complex business paths.

[0032] In the stage of building a structured boundary condition knowledge base based on expert experience and implementing deep reasoning planning, this step aims to transform the intuition and experience of senior test engineers into a computable and reusable structured knowledge base to give the model a more professional testing perspective. Its content covers various common and extreme boundary scenarios. By employing a chain-of-thought strategy, the system guides the large model to execute a rigorous "thinking loop" on the function under test: First, through multi-level static analysis, the core control flow and data flow in the code are extracted to identify potential risk points; then, simulating the analytical paradigm of senior test engineers, a systematic pre-analysis is performed on potential null value anomalies, overflow risks, and logical deviations in the code. This process is not merely a simple translation of the code, but a pre-conducted, structured test planning process designed to build seamless verification logic for each logical branch, thereby ensuring that the generated test plan has both project-level global breadth and code-level implementation depth.

[0033] Furthermore, the multi-level static parsing in step S1 is implemented using the Tree-sitter tool and specifically includes: project dependency parsing, which extracts external libraries and their version information from the build configuration file to form a project environment set; intra-file logic parsing, which parses class files to extract method signatures, attributes, and internal logic flow, forming a logic flow set; and cross-file parsing, which identifies inter-class dependencies and inheritance hierarchies through symbol references to construct a dependency matrix. A parsing function processes these three results, and their union yields the code context.

[0034] Here, the project environment set holds the project environment and the external library versions; the logic flow set holds the method signatures, attributes, and logic flow inside each class; and the dependency matrix holds the inheritance and dependency relations across files.

Step S2, performing iterative self-correction based on compiler feedback. Specifically, the initial unit test cases generated in S1 are statically scanned to fill in missing dependencies and then compiled; if compilation fails, the error stack information captured from the compiler is packaged together with the code context as a repair instruction and submitted to the LLM for iterative correction, until unit test code that compiles successfully is obtained.

Specifically, this invention designs an automated code repair closed loop with an iterative self-correction mechanism. To handle potential syntax errors or missing environment dependencies in the unit test cases initially generated by the large language model (LLM), this stage constructs an automated repair loop based on compilation feedback. In the preprocessing stage, static scanning automatically identifies and completes missing library references, mock objects, and context dependencies, laying the foundation for compilation. In the post-processing stage, if errors occur during compilation or execution, the system extracts the detailed error stacks, missing symbols, and type conflicts returned by the compiler, structures them, and feeds them back to the LLM to guide targeted iterative correction. Through the continuously running "compile-feedback-repair" loop, syntactically correct, engineering-compliant, and directly executable test code is generated.

Step S3, converting the static unit tests into fuzz test drivers for dynamic exploration. Specifically, the compilable test code obtained in S2 is used as the static unit tests to be converted.
The LLM is used to parse the abstract syntax tree (AST) of the test code, and the AST parsing and reconstruction operator locates the hard-coded constant parameters and replaces them with an interface fed by the byte stream that the fuzzing engine generates dynamically, thereby converting each static unit test into a fuzz test driver.

The hard-coded constant parameters are handled as follows. First, the unit test (UT) source code is parsed, and constants such as strings, numbers, and characters are extracted and assigned unique names; constant types are identified against a preset type list (integer, floating-point, string, and character types). Then, the identified variables are initialized through FuzzedDataProvider to supply the random data required for fuzzing. Assertion statements and character array declarations in the code are replaced to produce driver code in the libFuzzer format. Finally, the processed code and the corresponding constant data are packaged into a complete, directly executable fuzzing driver.

[0035] To transform static unit tests into dynamic drivers suitable for fuzzing, this step uses a Large Language Model (LLM) to parse the Abstract Syntax Tree (AST) of the unit test code, precisely locating the hard-coded constant parameters. Subsequently, by injecting a dedicated variable interface into the fuzzing framework, these static parameters are replaced with byte streams dynamically generated by the fuzzing engine. This transformation evolves the deterministic verification based on fixed inputs into a data-driven, dynamic testing process capable of extensive state space exploration. This transformation process can be summarized by the following formula:

[0036] Here, the operator term denotes the AST parsing and reconstruction operator; the constant term denotes the original hard-coded constant parameters; and the interface term denotes the dynamic byte stream injection interface controlled by the fuzzing engine.

[0037] S3 also involves the automatic scheduling fusion of drivers: multiple independent drivers targeting the same function under test are encapsulated as separate case branches of one switch-case structure. During scheduling, the leading byte of the fuzz test input stream serves as the branch selector, and the engine dynamically adjusts the weight of each case branch based on real-time coverage feedback, enabling adaptive optimization. Formally, the selected branch is a function of the leading byte, and each branch weight is a function of the coverage feedback.

[0038] Here, the selector is the leading selection byte of the fuzz test input stream; each test branch is assigned a weight for its test operators, which is dynamically adjusted according to the coverage information fed back from the previous round of fuzz testing; and the weight update algorithm computes the new weights from that feedback. The main driver does not use a fixed branch selection probability. Instead, it establishes a feedback model based on the code path discovery rate: by calculating the coverage gain of each sub-driver branch per unit of mutated data, it dynamically adjusts the interval of selection-byte values mapped to each branch of the switch-case structure, thereby allocating the fuzzing budget adaptively.

[0039] The specific algorithm for dynamically adjusting the operator allocation weight of each test branch based on coverage feedback is as follows. The main driver abandons the traditional fixed-probability branch selection strategy and instead constructs a real-time feedback model based on the code path discovery rate. This model monitors and calculates the coverage gain produced by each sub-driver branch (each case in the switch-case structure) per unit of mutated data, and uses it as the key indicator of branch exploration efficiency. The system then dynamically stretches or compresses the range of selection-byte values (the range mapped by the leading byte B0) assigned to each branch according to this indicator, tilting fuzzing resources (computation budget and number of mutations) towards the more efficient branches. This mechanism keeps testing resources focused on code regions with higher potential returns, improving the overall efficiency and depth of path discovery.
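One possible reading of this stretching and compressing of the B0 range is sketched below in Python. The patent text does not give the update formula, so the exponential smoothing with factor `alpha`, the minimum weight `floor`, and the function names are assumptions; the sketch also assumes far fewer than 256 branches.

```python
def update_weights(old, gains, mutations, alpha=0.5, floor=0.05):
    """Blend old weights with the normalized coverage gain per mutation.

    gains[i] / mutations[i] is the discovery rate of branch i. The floor keeps
    every branch reachable so a cold branch can still recover later.
    """
    rates = [g / m if m else 0.0 for g, m in zip(gains, mutations)]
    total = sum(rates)
    target = [r / total for r in rates] if total else list(old)
    mixed = [max(floor, (1 - alpha) * o + alpha * t) for o, t in zip(old, target)]
    norm = sum(mixed)
    return [m / norm for m in mixed]


def byte_intervals(weights):
    """Partition the 256 values of B0 into contiguous ranges per branch."""
    bounds = []
    start = 0
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if i == len(weights) - 1:
            end = 256  # last branch takes whatever remains
        else:
            end = max(start + 1, round(acc * 256))  # at least one value each
        bounds.append((start, end))
        start = end
    return bounds


def select_branch(b0, bounds):
    """Return the index of the case branch whose range contains B0."""
    for idx, (lo, hi) in enumerate(bounds):
        if lo <= b0 < hi:
            return idx
    raise ValueError("B0 outside all intervals")
```

Because the fuzzing engine mutates B0 like any other byte, widening a branch's interval raises the share of generated inputs that land in that branch without changing the engine itself.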

[0040] S3 also implements automated scheduling and fusion of drivers: specifically, the system encapsulates multiple independent drivers for the same function under test in the same switch-case branch structure.

[0041] Specifically, this step achieves enhanced test input based on fuzzing: After ensuring the usability of initial test cases, this step introduces fuzzing technology to further enhance code robustness. The system uses S2-verified unit tests as a base template and employs abstract syntax tree analysis to accurately identify variable input parameters in the test code. These parameters are then transformed into control interfaces for dynamic drivers and injected into a high-performance fuzzing engine. This engine automatically generates a large amount of randomized, non-normalized data streams to perform deep path exploration on the code under test. This not only overcomes the coverage bottleneck of manually written tests but also delves into code areas difficult to reach with conventional input, uncovering hidden deep-seated logical flaws and potential crash risks.

[0042] Step S4: Capture the fuzz test results and feed them back to the static test suite. During the unit test parsing in S3, the system stores the list of hard-coded constant parameters that were replaced. During decoding in S4, the system first reads this stored replacement list and determines the type signature $T_j$ of each parameter. It then applies the parsing rule for that type to the corresponding byte-stream slice to construct integer, string, or complex object instances that conform to the project's calling conventions, and finally combines them into a strongly typed parameter set. The driver is executed in the fuzz testing engine; during execution, incremental inputs that improve coverage or trigger anomalies are identified through instrumentation monitoring. Using the parameter type signatures of the function under test obtained from the AST, the byte-stream slices of each incremental input are semantically decoded and mapped to a strongly typed parameter set that conforms to the semantics of the source code. The result is solidified into new regression test cases and integrated into the original test suite. The strongly typed parameter set is expressed as:

$$P_{typed} = \mathrm{Decode}\left(B_{slice}, \{T_j\}_{j=1}^{n}\right)$$

[0043] where $\mathrm{Decode}$ denotes the semantic decoding function that maps an unstructured byte stream to a strongly typed parameter set conforming to the semantic specification; $B_{slice}$ denotes a slice of the high-value byte stream; $T_j$ denotes the type signature of the $j$-th parameter extracted from the AST (for example, basic types such as int and String); and $n$ denotes the total number of parameters. Specifically, this step establishes a key result capture and feedback mechanism that solidifies the value of regression testing. It consolidates the dynamic exploration results of fuzzing and achieves closed-loop enhancement of test assets. By embedding fine-grained instrumentation monitoring into the fuzzing engine, the system captures in real time the incremental inputs that significantly improve code coverage or trigger exceptions. Then, based on the type signatures extracted from the Abstract Syntax Tree (AST) of the original code, the system uses reverse parsing to remap the captured low-level byte sequences into readable, strongly typed test parameters that conform to the source code semantics. Finally, these parameters are automatically encapsulated and solidified into new regression test cases and integrated into the project's existing test suite. This mechanism closes the loop from automated machine exploration to maintainable knowledge assets, continuously strengthening the protective capability of the test suite and supporting the long-term evolution of software quality.
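The semantic decoding step can be sketched in Python as follows. This is an illustration under stated assumptions, not the decoding rules of the invention: the signature labels (`int32`, `uint8`, `double`, `string`), little-endian fixed-width layouts, zero padding of short inputs, and a one-byte length prefix for strings are all hypothetical choices.

```python
import struct
from typing import Any, List, Sequence

# Hypothetical fixed-width layouts keyed by type signature; a real system
# would derive these from the type signatures extracted from the AST.
_FIXED = {"int32": "<i", "uint8": "<B", "double": "<d"}


def decode(data: bytes, signatures: Sequence[str]) -> List[Any]:
    """Map a raw byte stream to a list of typed parameters."""
    params: List[Any] = []
    offset = 0
    for sig in signatures:
        if sig in _FIXED:
            fmt = _FIXED[sig]
            size = struct.calcsize(fmt)
            # Slice by the byte length of the type; pad short inputs with zeros.
            chunk = data[offset:offset + size].ljust(size, b"\x00")
            params.append(struct.unpack(fmt, chunk)[0])
            offset += size
        elif sig == "string":
            # Assumed encoding: one length byte followed by the payload.
            length = data[offset] if offset < len(data) else 0
            offset += 1
            raw = data[offset:offset + length]
            params.append(raw.decode("utf-8", errors="replace"))
            offset += length
        else:
            raise ValueError(f"unsupported type signature: {sig}")
    return params
```

For example, a captured input consisting of a packed 32-bit integer followed by a length-prefixed string decodes to one `int` and one `str`, which can then be written into a regression test as readable literals.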

[0044] Example 2: A system for intelligent generation and enhancement of unit tests integrating large language models and fuzz testing according to the present invention, used to implement the aforementioned method, comprises the following modules.
The perception generation module 100 is used to generate initial test cases based on multi-level static analysis and a structured boundary condition knowledge base. Multi-level static analysis covering project dependencies, intra-file logic, and cross-file associations is performed on the project under test to extract the code context. The code context is fused with a structured boundary condition knowledge base containing typical error patterns, and a chain-of-thought (CoT) prompt strategy guides the large language model (LLM) to perform logic analysis and test planning, generating initial unit test cases that contain assertion logic and input data.
The self-correcting module 200 is used to statically scan the initial unit test cases to complete missing dependencies and then compile them. If compilation fails, the error stack information captured by the compiler is packaged together with the code context as a repair instruction and submitted to the LLM for iterative correction, until unit test code that compiles successfully is obtained.
The driver conversion module 300 is used to take the compilable test code as the static unit test to be converted. The abstract syntax tree (AST) is parsed with the static tool tree-sitter, and the AST parsing and reconstruction operator locates the hard-coded constant parameters and replaces them with a dynamic byte-stream injection interface fed by the fuzzing engine, thereby converting the static unit test into a fuzz test driver.
The result module 400 is used to execute the driver in the fuzz testing engine. During execution, incremental inputs that improve coverage or trigger anomalies are identified through instrumentation monitoring. Using the parameter type signatures of the function under test obtained from the AST, the byte-stream slices of each incremental input are semantically decoded and mapped to a strongly typed parameter set that conforms to the semantics of the source code. The result is then solidified into new regression test cases and integrated into the original test suite.

[0045] Specifically, the typical error patterns solidified in the structured boundary condition knowledge base include at least: null references; overflow and underflow of numeric types; and, for collection data structures, empty collections, single-element collections, and the handling boundaries of very large collections.
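As a rough illustration of how such error patterns might be turned into concrete boundary inputs, the Python sketch below enumerates candidate values per parameter category. The category labels (`int32`, `pointer`, `collection`), the function name `boundary_inputs`, and the chosen values are assumptions for illustration and are not taken from the knowledge base itself.

```python
from typing import Any, List

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def boundary_inputs(kind: str, large_size: int = 10_000) -> List[Any]:
    """Return candidate boundary inputs for a parameter category."""
    if kind == "int32":
        # Values at and around the overflow and underflow limits.
        return [INT32_MIN, -1, 0, 1, INT32_MAX]
    if kind == "pointer":
        # Null reference, modeled here as None.
        return [None]
    if kind == "collection":
        # Empty, single-element, and very large collections.
        return [[], [0], list(range(large_size))]
    raise ValueError(f"no boundary rules for kind: {kind}")
```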

[0046] Specifically, the static parsing function in the context-aware generation module is implemented by the integrated Tree-sitter parsing engine, which is used to perform project dependency parsing, intra-file logic parsing, and cross-file association parsing.

[0047] Figure 3 illustrates the specific technical approach of this invention.

[0048] In a preferred embodiment of the present invention, the core of the system is a deep context-aware unit test generation framework, which provides a solid engineering context foundation for large language models (LLM) through multi-level, high-precision static code analysis, thereby significantly improving the accuracy and usability of the generated code.

[0049] Specifically, the system first uses the Tree-sitter parsing engine to perform a comprehensive static analysis of the target project, systematically building and outputting the code context.

[0050] The process consists of three progressively deeper analytical phases: 1. Project Dependency Resolution: By scanning the project's build configuration file, all external dependency libraries and their semantic versions are accurately identified and mapped. This step ensures at the underlying symbol table level that the test code generated by the LLM can achieve strong consistency with the current project environment in terms of parameter types, generic constraints, and method signatures when calling APIs, fundamentally avoiding compilation errors caused by missing dependencies or version incompatibility.

[0051] 2. File-level parsing: Deeply analyze the internal structure of a single class file and extract detailed metadata, including member variables, method implementations, and internal logic flow.

[0052] 3. Cross-file level resolution: Through the symbolic reference tracing mechanism, the inheritance relationship between classes, the implementation of interfaces, and the call chain to external utility classes or service components are analyzed to build a cross-file code relationship network.

[0053] The high-dimensional, structured metadata extracted by the three analysis phases above is organized into semantically rich prompts that serve as a retrieval-augmented context for LLM reasoning and generation. This design suppresses the "hallucination" problem that arises when the model lacks project-specific context, so that the generated test code is not only logically sound but also directly usable in engineering practice.

[0054] After completing high-precision static analysis and constructing a rich code context $C_{ctx}$, the system enters the knowledge-driven analysis phase. The core innovation of this phase lies in abandoning the simple "feeding" of raw code to the model and instead introducing a deeply integrated expert knowledge base of boundary conditions, $K_{b}$, which provides logical assistance and strategy enhancement. The workflow can be formally represented as:

$$T_{init} = \mathrm{LLM}\left(K_{b} \oplus C_{ctx}\right)$$

where $K_{b} \oplus C_{ctx}$ denotes the feature fusion of the knowledge base with the current code context; $C_{ctx}$ comprises the function body of the function under test, its parameter type signatures, and the function signatures and member variable information of the structure or class it belongs to; $\mathrm{LLM}(\cdot)$ denotes invoking the large model for generation; and $T_{init}$ denotes the generated assertion logic and input data, that is, a test suite containing multiple unit tests.

[0055] When static analysis identifies that the function under test involves a specific data structure (such as a set), the system guides the LLM to actively retrieve corresponding heuristic rules from the knowledge base. For example, for "set boundary handling," the model will be planned to prioritize scenarios such as empty sets, sets containing null pointer elements, and inputs with extreme lengths or illegal formats. Based on this, the LLM will pre-build a multi-dimensional test coverage matrix internally. This matrix clearly delineates the execution paths of normal business flows and various exception defense flows. Through this forward-looking logical modeling, it ensures that the final generated unit test cases can systematically and accurately cover those easily overlooked boundary areas and exception handling branches in the code, thereby greatly improving the completeness and robustness of the tests.

[0056] To address the robustness of automatically generated code, this invention constructs a closed-loop iterative strategy that combines proactive injection with passive repair. During the code synthesis phase, the generation module automatically injects missing import statements and initializes mock objects based on the dependency topology graph parsed earlier. Immediately after code generation, an automated compilation verification and feedback process begins. If the compiler reports errors such as type mismatches or undefined symbols, the system processes the original error stack information semantically and feeds it back to the LLM in real time. Given this feedback from the real project environment, the model locates the specific logical deviations or syntax defects and performs precise local refactoring of the initial code. The process runs for a bounded number of iterations until the generated test script compiles without errors and executes stably in the current project context, addressing the code errors common in traditional generation tools. The correction process is:

$$T_{k+1} = \begin{cases} \mathrm{LLM}\left(T_k, E_k, C_{ctx}\right), & \text{if } \mathrm{Compile}(T_k) \text{ fails and } k < N_{max} \\ T_k, & \text{otherwise} \end{cases}$$

[0057] where $N_{max}$ denotes the preset threshold for the maximum number of repair iterations; $k$ denotes the iteration round; $T_k$ denotes the unit test code of the $k$-th round; $\mathrm{Compile}(\cdot)$ denotes compilation verification; $E_k$ denotes the compilation error text obtained from the compiler when compilation fails; and $C_{ctx}$ denotes the context information of the function under test.
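The compile-and-repair loop can be sketched in Python as shown below. This is a minimal sketch under stated assumptions: `repair_loop`, `compile_fn`, and `repair_fn` are hypothetical names, and the two `fake_*` functions are toy stand-ins so the example runs without a real compiler or model.

```python
from typing import Callable, Optional, Tuple

CompileFn = Callable[[str], Optional[str]]   # returns error text, or None on success
RepairFn = Callable[[str, str, str], str]    # (code, error_text, context) -> new code


def repair_loop(code: str, context: str, compile_fn: CompileFn,
                repair_fn: RepairFn, max_rounds: int = 3) -> Tuple[str, bool, int]:
    """Compile, feed errors back for repair, and stop after max_rounds repairs.

    Returns (final_code, compiled_ok, repair_rounds_used).
    """
    for k in range(max_rounds + 1):
        error = compile_fn(code)
        if error is None:
            return code, True, k
        if k == max_rounds:
            break  # repair budget exhausted
        code = repair_fn(code, error, context)
    return code, False, max_rounds


# Toy stand-ins (assumptions) so the sketch runs on its own.
def fake_compile(code: str) -> Optional[str]:
    if "#include <vector>" not in code:
        return "error: 'vector' is not a member of 'std'"
    return None


def fake_repair(code: str, error: str, context: str) -> str:
    # A real repair step would send code, error, and context to the model.
    return "#include <vector>\n" + code
```

Bounding the loop with `max_rounds` keeps a test that cannot be repaired from consuming unbounded model calls; such a test is returned with `compiled_ok` set to `False` so it can be discarded or reported.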

[0058] After obtaining a set of high-confidence initial unit tests, this embodiment proceeds to the "input space enhancement based on fuzzing" step. The system uses static parsing tools to perform "template stripping" on the existing unit tests, which already contain basic logic. By identifying and extracting the key code used to construct object instances and trigger method calls, the system maps the originally hard-coded static literals (such as fixed values or hard-coded strings) to mutable input vectors. The system then mounts these logic fragments into the driver template of the fuzzing framework, generating a test driver directly controlled by a high-performance fuzzing engine. This approach lets fuzzing fully inherit the complex business initialization logic built by humans or LLMs in the initial test cases, while using the engine's mutation strategy to explore the multi-dimensional input space at scale, significantly improving the ability to reach deep logic branches.

[0059] To allocate operator resources and computing power more efficiently, this invention designs an adaptive scheduling fusion mechanism oriented towards multiple drivers. This mechanism integrates the mutation-driven logic corresponding to multiple associated methods under the same target class into a switch-case branch scheduling structure of a main driver. During dynamic execution, the fuzzing engine monitors and quantifies the incremental coverage feedback brought by each branch in real time. If it detects that the branch logic corresponding to a certain method has a large depth or shows higher coverage mining potential, the engine guides the input byte stream to preferentially flow to that branch through a dynamic weight allocation algorithm, implementing high-frequency and deep mutation exploration, thereby achieving intelligent focusing of test resources on complex logic regions.

[0060] In the final stage, the system performs "dynamic result capture and persistent integration." When the fuzzing engine identifies incremental input that can trigger new execution paths or abnormal states, the system immediately captures its corresponding raw byte stream. Using a symbolic signature-based reverse object reconstruction mechanism, the system accurately maps the byte stream to strongly typed object instances or parameters conforming to programming language specifications. Ultimately, these high-value input samples obtained through dynamic exploration are automatically encapsulated into standardized unit test templates and persistently integrated into the existing test suite as new test assets.

[0061] Through the complete technical loop from static generation and dynamic enhancement to static solidification, this invention realizes the automated evolution and continuous strengthening of unit test sets, thereby fundamentally improving the quality reliability and inherent robustness of complex software systems.

[0062] In experimental verification of code coverage, the proposed method shows a systematic improvement in both test depth and breadth. Compared with the baseline method, average line coverage increased from 41.36% to 82.14% (about 1.99 times) and average branch coverage from 29.25% to 53.28% (about 1.82 times), so both key indicators nearly doubled.

[0063] In typical projects, the method demonstrates strong adaptability. In the tinyxml2 project, line coverage reached 98.43% and branch coverage reached 85.59%, demonstrating accurate coverage of complex conditional logic. In the sqlite3 project, which has a more complex structure and contains a large number of low-level API calls, line coverage increased significantly from 35.65% to 81.24%, verifying the effectiveness and robustness of the method in real industrial scenarios.

[0064] In the longitudinal evaluation of iterative repair performance, the "compilation repair closed loop" built by this system markedly improved code usability. Across a total of 2093 test cases, the initial compilation pass rate of the test code generated by the large language model was generally low (for example, only 2.05% for the initial generation on the c_utils project), highlighting that traditional generation methods are prone to syntax and logic deviations in real-world engineering environments. With the closed-loop repair mechanism of this invention, the compilation pass rate rose sharply after only one round of repair (for example, from 39.37% to 69.69% for sqlite3). After three rounds of iterative repair, the average compilation pass rate of the entire test set reached 84.48%, and the average error reduction rate reached 72.44%. In typical projects such as json.cpp, the compilation pass rate after repair was close to fully usable (99.99%). This trend confirms that the "error stack feedback -> guided local reconstruction -> closed-loop verification" mechanism can effectively identify and correct structural errors in the model output, greatly improving the engineering usability of the generated code.

[0065] Therefore, by integrating the semantic understanding of large models with dynamic exploration through fuzz testing, this invention constructs an intelligent testing system capable of understanding code, correcting its own errors, and discovering paths on its own. The system effectively reduces the maintenance cost of test assets and significantly strengthens software quality assurance.

[0066] In summary, this invention provides a method and system for intelligent generation and enhancement of unit tests that integrates large language models and fuzz testing. Its core implementation includes the following collaboratively operating modules: Context-aware generation based on multi-layer static parsing: Using tools such as Tree-sitter, deep AST parsing is performed on the target project from three dimensions: project dependencies, file structure, and cross-file associations. This extracts accurate contextual information, providing a reliable knowledge background for subsequent large language models and suppressing false references.

[0067] Planned test generation combining a structured knowledge base with CoT: A structured boundary condition knowledge base that solidifies expert experience is integrated, and a chain-of-thought (CoT) prompt strategy guides the large language model to first perform logical modeling and test planning and then generate test case code with high maintainability and logical completeness.

[0068] Iterative self-correction based on compiler feedback: Dependencies are completed through static scanning and preprocessing, and when test code fails to compile, the accurate compiler error stack is fed back to the large language model for closed-loop iterative repair, which significantly improves the direct usability of the generated code.

[0069] Dynamic driver conversion for fuzz testing: By parsing the AST of the unit tests, input control points are automatically identified and reconstructed into drivers that accept fuzzed variable injection, so that statically generated test cases serve as templates that plug into the fuzz testing engine for deep path exploration.

[0070] Figure 4 shows the architecture of the unit test case generation phase. The structured boundary condition knowledge base is a case library containing common unit test boundary input values such as null, maximum, and minimum values. The diagram illustrates the core process of intelligent unit test generation in this invention. On the left is the knowledge analysis phase: the system takes the code under test and the preset boundary condition knowledge base as simultaneous inputs. Both enter the chain-of-thought (CoT) analysis, which performs multi-level static parsing and, combined with the knowledge base, analyzes in depth the normal flow and the potential boundary flows of the code. On the right is the generation and repair phase. Based on the preceding analysis, the system plans and executes three key actions in sequence: first, generating initial test cases; second, performing static dependency repair; and third, calling the large language model (LLM) for iterative repair, ultimately outputting high-quality, directly compilable test code. The architecture diagram thus shows the automated, closed-loop generation path from raw code to usable test assets.

[0071] Figure 5 illustrates the Prompt used to generate unit test cases, that is, the structured composition of the specific instructions that drive the large language model to generate test cases. The Prompt is divided into three parts. Structured boundary condition knowledge base: It lists rules in the form "boundary input -> potential problem", such as "null pointer input -> triggering nullptr error". This is a typical rule base that solidifies expert experience into a set of searchable and matchable production rules.

[0072] Context related to the function under test: This part defines the key information that needs to be parsed and filled in from the code under test, including function body, parameter types, class members, etc., providing an accurate code semantic environment for LLM.

[0073] Generate Test Strategy (CoT): This explicitly requires LLM to follow a four-step thought chain: "Generate framework -> Analysis flow -> Generate normal test cases -> Generate boundary test cases." This guides the model logic and ensures the completeness of the generated tests.
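The three-part Prompt structure described above can be sketched in Python as follows. The section headings, the `build_prompt` name, and the dictionary-based inputs are assumptions for illustration; only the rule format and the four step names come from the description.

```python
from typing import Dict, List

# The four chain-of-thought steps named in the description.
COT_STEPS: List[str] = [
    "Generate framework",
    "Analysis flow",
    "Generate normal test cases",
    "Generate boundary test cases",
]


def build_prompt(rules: Dict[str, str], context: Dict[str, str]) -> str:
    """Assemble knowledge-base rules, code context, and CoT steps into one prompt."""
    lines = ["## Boundary condition knowledge base"]
    lines += [f"- {boundary} -> {problem}" for boundary, problem in rules.items()]
    lines.append("## Context of the function under test")
    lines += [f"{name}:\n{text}" for name, text in context.items()]
    lines.append("## Test generation strategy (CoT)")
    lines += [f"{i}. {step}" for i, step in enumerate(COT_STEPS, start=1)]
    return "\n".join(lines)
```

Keeping the three parts in a fixed order means the model sees the boundary rules and the code context before it is asked to plan, which mirrors the "analyze first, generate second" intent of the strategy.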

[0074] Taken together, Figures 4 and 5 show how a knowledge base in the form of a case library, the context parsed from specific code, and a step-by-step CoT strategy are integrated into one concrete operational instruction.

[0075] Figure 6 is an architecture diagram of the unit test enhancement stage. It illustrates the core architecture of dynamic unit test enhancement and value feedback in step S4 of this invention.

[0076] Driver generation stage: This stage corresponds to the core operation of step S3. The process starts with the static unit test code on the left that has passed compilation verification, and performs the following transformations in sequence. Variable identification: The abstract syntax tree (AST) of the test code is parsed using the tree-sitter static tool to precisely locate the hard-coded input parameters.

[0077] Scheduling fusion: Encapsulates multiple independent test logics for the same function into a unified fuzz test driver framework, usually manifested as a switch-case branch scheduling structure.

[0078] Initialization resolution: Code refactoring is completed by replacing the static parameters with dynamic byte-stream injection interfaces, generating a fuzz test driver that can be directly loaded and executed by the fuzz testing engine.

[0079] Input amplification stage: This stage corresponds to the core operation of step S4 in the claims. The process unfolds from top to bottom. Fuzz test execution: A large number of randomized, non-normalized byte streams are injected into the fuzz test driver to conduct in-depth path exploration.

[0080] Results capture and feedback: The engine uses instrumentation monitoring to identify and capture, in real time, high-value "incremental inputs" that significantly improve code coverage or trigger exceptions. The system then calls the inverse decoding function `Decode`, which, based on the parameter type signatures extracted from the original AST, reconstructs the slices of the original byte stream into strongly typed test parameters that conform to the semantics of the source code. These parameters are then solidified into new regression test cases and integrated into the project's test suite.

[0081] Overall logic: This diagram clearly reveals how the present invention transforms modified and usable static test assets into a dynamic exploration driving engine, and reverse-engineers the results of dynamic exploration into maintainable static assets, thereby realizing the automated, intelligent, and continuous evolution of the test suite.

[0082] The embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts can be referred to interchangeably. The systems disclosed in the embodiments are described concisely because they correspond to already disclosed methods; for detailed information, please refer to the method section.

[0083] The above specific embodiments are intended to illustrate the principles and implementation process of the present invention through examples, so as to help understand the core ideas and methods of the present invention. Based on the concept of the present invention, those skilled in the art can make appropriate adjustments and modifications to the specific embodiments and application scope. Therefore, the content of this specification should not be construed as a limitation of the present invention. Any modifications, equivalent substitutions, improvements or refinements made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A method for intelligent generation and enhancement of unit tests that integrates large language models and fuzz testing, characterized in that it includes the following steps:
S1, generating initial test cases based on multi-level static analysis and a structured boundary condition knowledge base: as shown in the knowledge analysis stage in Figure 1, multi-level static analysis is first performed on the project under test to extract the code context; the code context is fused with the knowledge base, and a chain-of-thought (CoT) prompt strategy, with the Prompt shown in Figure 2, guides the large language model (LLM) to perform logic analysis and test planning, generating initial unit test cases containing assertion logic and input data;
S2, performing iterative self-correction based on compiler feedback: the initial unit test cases generated in S1 are compiled; when compilation fails, the compiler error stack information is captured and combined with the current code context to construct an instruction template that guides the large language model (LLM) through iterative correction, until unit test code that compiles successfully is obtained;
S3, converting static unit tests into a fuzz test driver for dynamic exploration: the compilable test code obtained in S2 is taken as the static unit test to be converted; the abstract syntax tree (AST) is parsed using the static tool tree-sitter, and the AST parsing and reconstruction operator locates the hard-coded constant parameters and replaces them with a dynamic byte-stream injection interface fed by the fuzzing engine, thereby converting the static unit test into a fuzz test driver;
S4, capturing fuzz test results and feeding them back to the static test suite: the driver is executed in the fuzz testing engine; during execution, incremental inputs that improve coverage or trigger anomalies are identified through instrumentation monitoring; using the parameter type signatures of the function under test obtained from the AST, the byte-stream slices of each incremental input are semantically decoded and mapped to a strongly typed parameter set that conforms to the semantics of the source code; the result is then solidified into new regression test cases and integrated into the original test suite.

2. The intelligent generation and enhancement method for unit tests integrating large language models and fuzz testing as described in claim 1, characterized in that the multi-level static analysis in step S1 is implemented using the Tree-sitter tool and specifically includes: project dependency resolution, which extracts external libraries and their version information from the build configuration file, forming a set of project environment variables $E_{dep}$; intra-file logic parsing, which parses class files to extract method signatures, attributes, and internal logic flow, forming a logic flow set $L_{file}$; and cross-file parsing, which identifies inter-class dependencies and inheritance hierarchies through symbolic references, constructing a dependency matrix $M_{cross}$; the parsing function $\mathrm{Parse}$ processes the union of $E_{dep}$, $L_{file}$, and $M_{cross}$ to obtain the code context: $C_{ctx} = \mathrm{Parse}\left(E_{dep} \cup L_{file} \cup M_{cross}\right)$.

3. The intelligent generation and enhancement method for unit tests integrating large language models and fuzz testing as described in claim 1, characterized in that, in step S1, the typical error patterns solidified in the structured boundary condition knowledge base $K_{b}$ include at least: null references, overflow and underflow of numeric types, and, for collection data structures, empty collections, single-element collections, and the handling boundaries of very large collections; the process of generating initial unit test cases satisfies: $T_{init} = \mathrm{LLM}\left(K_{b} \oplus C_{ctx}\right)$, where $K_{b} \oplus C_{ctx}$ denotes the feature fusion of the knowledge base with the current code context; $C_{ctx}$ comprises the function body of the function under test, its parameter type signatures, and the function signatures and member variable information of the structure or class it belongs to; $\mathrm{LLM}(\cdot)$ denotes invoking the large model for generation; and $T_{init}$ denotes the generated assertion logic and input data.

4. The method according to claim 1, characterized in that the iterative self-correction in step S2 follows this logic: let $T_k$ denote the test code of the $k$-th iteration and $\mathrm{Compile}(\cdot)$ the compilation verification function; the correction process is: $T_{k+1} = \mathrm{LLM}\left(T_k, E_k, C_{ctx}\right)$ if $\mathrm{Compile}(T_k)$ fails and $k < N_{max}$, and $T_{k+1} = T_k$ otherwise, where $N_{max}$ denotes the preset threshold for the maximum number of repair iterations; $k$ denotes the iteration round; $T_k$ denotes the unit test code of the $k$-th round; $E_k$ denotes the compilation error text obtained from the compiler when compilation fails; and $C_{ctx}$ denotes the context information of the function under test.

5. The intelligent generation and enhancement method for unit tests integrating large language models and fuzz testing as described in claim 1, characterized in that step S3 also includes the automatic scheduling fusion of drivers: multiple fuzz test drivers targeting the same function under test are encapsulated into a switch-case branch structure within the same main driver, each driver being obtained as $D_{fuzz} = \Phi\left(T_{ut}\left[V_{const} \leftarrow I_{fuzz}\right]\right)$, where $\Phi$ denotes the AST parsing and reconstruction operator; $V_{const}$ denotes the original hard-coded constant parameters; $I_{fuzz}$ denotes the dynamic byte-stream injection interface controlled by the fuzzing engine; $T_{ut}$ denotes the original, compilable unit test code; $T_{ut}[\cdot]$ denotes a replacement operation on specific internal elements that leaves the structure of $T_{ut}$ unchanged; and $V_{const} \leftarrow I_{fuzz}$ denotes injecting the dynamic byte stream at the hard-coded constant locations; the leading byte $B_0$ of the fuzz test input stream $S$ is used as the branch selector, and the range of $B_0$ values mapped to the $i$-th branch is allocated according to the weight $w_i = \mathrm{Update}\left(F_{cov}\right)$, where $B_0$ denotes the leading selection byte of the fuzz test input stream $S$; $w_i$ denotes the operator allocation weight of the $i$-th test branch; $F_{cov}$ denotes the coverage information fed back from the previous round of fuzz testing, on which the weight $w_i$ depends; and $\mathrm{Update}$ denotes the weight update algorithm.

6. The intelligent generation and enhancement method for unit tests integrating large language models and fuzz testing as described in claim 1, characterized in that the semantic decoding process in step S4 specifically involves: obtaining the list of replaced static variables parsed and stored in step S3; then, according to the type signature $T_j$ of each parameter, obtaining the byte length corresponding to its type; slicing the complete byte stream to obtain the byte-stream fragment corresponding to that type signature; decoding the fragment into a constant, such as an integer or a string, that conforms to the project's calling conventions and substituting it into fuzzDriver; and finally combining the constants into a strongly typed parameter set, as follows: $P_{typed} = \mathrm{Decode}\left(B_{slice}, \{T_j\}_{j=1}^{n}\right)$, where $\mathrm{Decode}$ denotes the semantic decoding function; $B_{slice}$ denotes a slice of the high-value byte stream; $T_j$ denotes the type signature of the $j$-th parameter; and $n$ denotes the total number of parameters.

7. A unit test intelligent generation and enhancement system integrating large language models and fuzz testing, characterized in that the system, which implements the method of any one of claims 1 to 6, comprises:
a perception generation module, used to generate initial test cases based on multi-level static analysis and a structured boundary condition knowledge base, as follows: as shown in the knowledge analysis stage in Figure 1, multi-level static analysis is first performed on the project under test to extract the code context; the code context is combined with the knowledge base through feature fusion, and a chain-of-thought (CoT) prompting strategy is adopted, in which the prompt shown in Figure 2 guides the large language model (LLM) through logic analysis and test planning to generate initial unit test cases containing assertion logic and input data;
a self-correcting module, used to compile the generated initial unit test cases and, when compilation fails, to capture the compiler error stack information and combine it with the current code context to construct instruction templates that guide the LLM through iterative correction, finally obtaining unit test code that compiles successfully;
a driver conversion module, used to take the compilable test code obtained above as static unit tests, parse its abstract syntax tree (AST) with the static analysis tool tree-sitter, use the AST parsing and reconstruction operators to locate the hard-coded constant parameters, and replace them with the dynamic byte stream injection interface dynamically generated by the fuzzing engine, thereby converting the static unit tests into fuzz test drivers; and
a results module, used to execute the drivers in the fuzz testing engine; during execution, incremental inputs that improve coverage or trigger anomalies are identified through instrumentation monitoring; using the parameter type signatures of the function under test obtained from the AST, the byte stream slices of the incremental inputs are semantically decoded and mapped to strongly typed parameter sets that conform to the semantics of the source code, which are then solidified into new regression test cases and integrated into the original test suite.
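
The driver conversion module's constant replacement can be sketched in a few lines. Note the substitution: this example uses Python's standard `ast` module in place of tree-sitter, purely so that it is self-contained. The injected interface `fdp.consume_int()` / `fdp.consume_str()` and the helper names are hypothetical, and only integer and string literals passed directly to the function under test are rewritten.

```python
import ast
from typing import List, Tuple


class ConstantToFuzzInput(ast.NodeTransformer):
    """Replace literal arguments of calls to `target` with calls to a
    (hypothetical) byte stream provider named `fdp`."""

    def __init__(self, target: str) -> None:
        self.target = target
        # Recorded (type name, original value) pairs, kept so that a
        # later decoding step knows the type of each replaced parameter.
        self.replaced: List[Tuple[str, object]] = []

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == self.target:
            node.args = [self._swap(arg) for arg in node.args]
        return node

    def _swap(self, arg: ast.expr) -> ast.expr:
        if isinstance(arg, ast.Constant) and type(arg.value) in (int, str):
            kind = type(arg.value).__name__
            self.replaced.append((kind, arg.value))
            provider = ast.Name(id="fdp", ctx=ast.Load())
            method = ast.Attribute(value=provider, attr=f"consume_{kind}",
                                   ctx=ast.Load())
            return ast.Call(func=method, args=[], keywords=[])
        return arg


def to_fuzz_driver(source: str, target: str) -> Tuple[str, List[Tuple[str, object]]]:
    """Return the rewritten source and the list of replaced constants."""
    transformer = ConstantToFuzzInput(target)
    tree = ast.fix_missing_locations(transformer.visit(ast.parse(source)))
    return ast.unparse(tree), transformer.replaced
```

The statement structure of the test is left unchanged and only the literal arguments are swapped. The original assertion is also left as it is in this sketch; a real driver would need an oracle that still holds for arbitrary inputs.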

8. The intelligent unit test generation and enhancement system integrating large language models and fuzz testing as described in claim 7, characterized in that the static parsing function in the perception generation module is implemented by an integrated tree-sitter parsing engine, which performs project dependency parsing, intra-file logic parsing, and cross-file association parsing.

9. The intelligent unit test generation and enhancement system integrating large language models and fuzz testing as described in claim 7, characterized in that the typical error patterns solidified in the structured boundary condition knowledge base include at least: null references; overflow and underflow of numeric types; and, for collection data structures, empty collections, single-element collections, and the handling boundaries of very large collections.
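
One way to picture such a knowledge base is as a table of boundary values keyed by parameter category. The sketch below is only an assumed representation: the category names, the choice of 32-bit integer limits, the size used for a "very large" collection, and the function `boundary_inputs` are all illustrative and do not come from the claims.

```python
from typing import Any, Dict, List

INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1
LARGE_COLLECTION_SIZE = 10_000  # assumed threshold for "very large"

# Assumed structure: one list of boundary inputs per parameter category.
BOUNDARY_KNOWLEDGE: Dict[str, List[Any]] = {
    # Null reference.
    "reference": [None],
    # Values at and around the limits where overflow and underflow occur.
    "int32": [INT32_MIN, -1, 0, 1, INT32_MAX],
    # Empty, single-element, and very large collections.
    "collection": [[], [0], list(range(LARGE_COLLECTION_SIZE))],
}


def boundary_inputs(category: str) -> List[Any]:
    """Return the boundary inputs recorded for a parameter category,
    or an empty list when the category is not in the knowledge base."""
    return list(BOUNDARY_KNOWLEDGE.get(category, []))
```

Entries like these could be serialized into the prompt alongside the code context so that the generated test cases cover each boundary explicitly.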