Test code generation method and related product

By integrating multimodal data and large models to generate test code, the problem of low efficiency and poor targeting in existing technologies has been solved, achieving efficient and accurate test code generation and improving user experience.

CN122309345A · Pending · Publication Date: 2026-06-30 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: HUAWEI TECH CO LTD
Filing Date: 2024-12-31
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

Existing technologies generate test code inefficiently and fail to meet user needs. The generated code lacks specificity, which makes it difficult to test complex software systems comprehensively.

Method used

The method fuses multimodal data about the software, including source code, description documents, and diagrams. Feature extraction is used to construct an intent representation vector, and a large model then generates test code that meets user needs.

Benefits of technology

It enables efficient and accurate generation of test code, improves the relevance of the testing process and the reliability of the system, and enhances the user experience.

✦ Generated by Eureka AI based on patent content.

Figure CN122309345A_ABST

Abstract

This application discloses a test code generation method and related products, applicable to the field of information technology. In this application, a code generation device is deployed with a first model or has the ability to invoke a first model. The code generation device can acquire multimodal data about a first software and obtain an intent representation vector based on the multimodal data. The code generation device constructs a first prompt word based on the intent representation vector and uses the first model to generate first test code based on the first prompt word. The first test code is used to test the first software. This application integrates multiple types of data related to the first software, effectively and fully utilizing information related to software testing. By using a model for code generation, it can efficiently and accurately generate test code that meets user needs, resulting in a good user experience.

Description

Technical Field

[0001] This application relates to the field of information technology, and in particular to test code generation methods and related products.

Background Technology

[0002] Software testing is an indispensable part of the software development lifecycle. It tests the functionality of software, ensuring the reliability and stability of the software product. With technological advancements, software systems are becoming increasingly complex, integrating more and more functions and accumulating a vast number of modules. For complex software systems, more comprehensive and in-depth testing is needed to ensure they function correctly under various conditions.

[0003] Most software vendors still rely on manual coding to test software systems. Testers manually write and run test code based on testing requirements to automate software testing. This method is inefficient, error-prone, and fails to meet user testing needs. Some solutions automatically generate test code by analyzing software source code. However, as software systems become increasingly complex and the source code becomes massive, the efficiency of code generation remains low, and the generated test code lacks specificity and fails to match user testing requirements.

Summary of the Invention

[0004] This application discloses a test code generation method and related products, which can efficiently and accurately generate test code that meets user needs and provides a good user experience.

[0005] In a first aspect, this application provides a test code generation method, which can be applied to a computing device, for example, implemented by a module within the computing device. This module may include software modules, hardware modules, or a combination of both. For ease of description, the following explanation uses a code generation device as the executing entity of the code generation method.

[0006] The test code generation method includes: a code generation device acquiring multimodal data related to the first software, obtaining an intent representation vector based on the multimodal data, constructing a first prompt word based at least on the intent representation vector, and generating first test code based on the first prompt word using a first model. This first test code is used to test the first software.

[0007] Here, "modality" refers to different forms or characteristics of data, and "multimodality" refers to data having multiple types or multiple information sources. Multimodal data includes at least two types of data; that is, multimodal data includes multiple data items, and at least two types of data exist among these multiple data items. As one possible implementation, the multimodal data includes the source code of the first software, and also includes the description document of the first software and / or an illustration of the first software. The intent representation vector, referred to below as the first intent representation vector, is a vector obtained by fusing features from multiple data items, and is capable of representing the features of the multimodal data.

[0008] In the above scheme, the code generation device can integrate multimodal data and analyze information from the source code of the first software and from other related data formats to construct a prompt that guides the model in generating test code. Other data formats related to the source code often contain information for understanding the functionality of the first software. Using this data, feature vectors can be extracted to analyze the user's testing intent. The intent representation vector obtained by fusing these feature representations integrates information from multiple modalities, accurately reflecting the testing requirements. This allows the test code to specifically test the functionality of the first software, and the testing process to better meet the user's testing needs.

[0009] Furthermore, this application adopts a "feature extraction + prompt words + model" architecture. It represents the input data as feature vectors (i.e., intent representation vectors) and uses these feature vectors to construct prompt words, thereby guiding the model to generate code. This architecture decouples the test code generation process. Feature extraction expresses the information in vector form, prompt word construction turns the feature representations into easily understandable prompts, and the model then generates the optimal test code for the current input under the guidance of the prompt words. This architecture has a clear purpose, separate functions, and interconnected modules, resulting in high efficiency and good performance in generating test code. Moreover, this architecture can be automated, ensuring high system reliability and complete functionality.
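The decoupling described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names MultimodalData, extract_intent_vector, build_prompt, and stub_model are hypothetical, the "features" are crude counts rather than learned encoder outputs, and the stub stands in for the first model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MultimodalData:
    source_code: str
    description: str = ""
    diagram_text: str = ""  # hypothetical: edges already extracted from a diagram


def extract_intent_vector(data: MultimodalData) -> List[float]:
    """Toy feature extraction: one crude number per modality (not a real encoder)."""
    return [
        float(data.source_code.count("def ")),  # functions found in the code
        float(len(data.description.split())),   # words in the description document
        float(data.diagram_text.count("->")),   # edges in the diagram
    ]


def build_prompt(intent_vector: List[float], data: MultimodalData) -> str:
    """Turn the vector and the raw inputs into a readable prompt."""
    n_funcs, n_words, n_edges = intent_vector
    return (
        "Write unit tests for the code below.\n"
        f"Functions detected: {int(n_funcs)}; description words: {int(n_words)}; "
        f"diagram edges: {int(n_edges)}.\n"
        f"Description: {data.description}\n"
        f"Code:\n{data.source_code}"
    )


def stub_model(prompt: str) -> str:
    """Stand-in for the first model; a real system would call a large model here."""
    return "# generated from a prompt of " + str(len(prompt)) + " characters\n"


def generate_test_code(data: MultimodalData, model: Callable[[str], str]) -> str:
    """Run the three decoupled stages in order."""
    vector = extract_intent_vector(data)
    prompt = build_prompt(vector, data)
    return model(prompt)
```

Because each stage only depends on the output of the previous one, any stage in this sketch can be swapped (for example, replacing stub_model with a call to a deployed model) without touching the others.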

[0010] In summary, this application integrates various types of data related to the first software, effectively and fully utilizes information related to software testing, and uses models for code generation. It can efficiently and accurately generate test code that meets user needs, resulting in a good user experience.

[0011] In the above scheme, the first model can be deployed in the code generation device, or the code generation device can have the ability to call the first model. For example, the code generation device has a communication connection with the device on which the first model is deployed. Alternatively, the first model has a corresponding calling interface, through which the code generation device can call the first model.

[0012] Optionally, the first model may include one or more models. In the case of multiple models, the deployment of the first model may further include: some models are deployed on the code generation device, and some models are deployed on other devices, with the code generation device able to call these models deployed on other devices.

[0013] In one possible implementation of the first aspect, the description document of the first software includes information about the first software, which includes at least one of the following: the preconditions of the first software, test case information of the software, the source code file path of the first software, the product line of the first software, or the development department of the first software, etc.

[0014] The above implementation provides information that may be included in the description document, which can help understand the testing intent of the first software and improve the relevance and accuracy of the generated test code.

[0015] The preconditions, also known as prerequisites, are the conditions that must be met for a software use case to execute or for the software to run.

[0016] A test case is a description of a software testing task. It typically consists of test inputs, execution conditions, and expected results designed for a specific test objective, used to verify whether the software meets functional requirements. Test case information includes one or more of the following: test case name, test case ID, test objective, test environment, input data, and test steps. In some cases, one or more of the test objective, test environment, input data, and test steps may be included in the test case name.
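As an illustration only, the test case information listed above could be held in a small record type before it is passed to a document encoder. The class name CaseInfo, its field names, and the describe method are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CaseInfo:
    """Test case information fields named in the text; field names are illustrative."""
    name: str
    case_id: str
    objective: str = ""
    environment: str = ""
    input_data: List[str] = field(default_factory=list)
    steps: List[str] = field(default_factory=list)

    def describe(self) -> str:
        """Render only the populated fields as a single line of text."""
        parts = [f"{self.case_id}: {self.name}"]
        if self.objective:
            parts.append(f"objective={self.objective}")
        if self.environment:
            parts.append(f"environment={self.environment}")
        if self.input_data:
            parts.append("inputs=" + ", ".join(self.input_data))
        if self.steps:
            parts.append("steps=" + " > ".join(self.steps))
        return " | ".join(parts)


case = CaseInfo(
    name="login_with_valid_password",
    case_id="TC-001",
    objective="verify successful login",
    steps=["open login page", "submit credentials"],
)
print(case.describe())
```

Optional fields default to empty values, which mirrors the statement that some of this information may be absent or folded into the test case name.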

[0017] In some cases, one or more test cases can form a test case group. In this case, the information of the first software also includes information about the test case group, or simply test case information. Of course, the name of this information is just an example; in a specific implementation, the name may be designed differently.

[0018] The source code file path of the first software is used to indicate the code project in which the first software is located and the location of the source code of the first software within the code project.

[0019] In yet another possible implementation of the first aspect, the illustration of the first software includes a flowchart of the first software and / or a functional module architecture diagram of the first software.

[0020] The above implementation provides possible forms of illustration. Illustrations contain graphics and symbols (including text) and can intuitively demonstrate the structure of the first software, which improves the accuracy of understanding user testing requirements and enhances the relevance and accuracy of the generated test code. Flowcharts, in particular, clearly illustrate the software's execution steps, decision-making logic, and the relationships between its components in a graphical way, simplifying and visualizing complex program logic and facilitating understanding of the software's overall architecture and operational flow. Architecture diagrams can display components (or modules) and the relationships between them; analyzing architecture diagrams allows for a quick understanding of the software's organizational structure.

[0021] In another possible implementation of the first aspect, the first model is a large model. A large model is a model with a large number of parameters and a complex structure. For example, a large model includes a large language model (LLM), which refers to a deep learning model trained with a large amount of data, capable of generating language text or understanding the meaning of language text. For example, the first model includes one or more of the following: a convolutional neural network (CNN) model, a recurrent neural network (RNN) model, or a Transformer-based model, and of course, the use of other models with different structures is not excluded.

[0022] In another possible implementation of the first aspect, the code generation device deploys a second model or has the ability to invoke a second model. The code generation device obtains a first intent representation vector based on multimodal data, including the following operation: the code generation device uses the second model to obtain the first intent representation vector based on the multimodal data.

[0023] In the above implementation, the operation of obtaining the first intent representation vector based on multimodal data can be implemented by a callable model, thus further decoupling the modules. The second model performs feature extraction and fusion in a centralized and specialized manner to obtain the intent representation vector, which improves the reliability and stability of the system and enhances the efficiency and quality of test code generation.

[0024] In another possible implementation of the first aspect, the second model includes an encoder and a neural network, wherein the encoder is used to extract features of each data item in the multimodal data to obtain features of the multimodal data, and the neural network is used to obtain a first intent representation vector based on the features of the multimodal data.

[0025] In one possible implementation, an encoder is used to extract features from each data item and obtain a feature vector for that data item. Then, a neural network is used to fuse the multiple features to obtain a first intent representation vector, thereby realizing information interaction and fusion between multimodal data and improving the comprehensiveness of the understanding of test requirements.

[0026] Optionally, the encoder includes multiple encoders, each corresponding to data of a specific modality. Further, optionally, the neural network is a multi-head attention neural network whose input consists of the features of multiple data items. The neural network can fuse the features based on the attention weights of the features of each data item to obtain a first intent representation vector.

[0027] In some cases, the attention weights of the features of each data item can be predefined or user-defined. In others, the attention weights are adjustable; for example, the neural network dynamically adjusts the contribution of each modality's feature vector in the fusion process based on a gating mechanism, and obtains the first intent representation vector based on the features of each data item and their respective contributions. In still other cases, the features of each data item are weighted by their attention weights and merged to obtain the first intent representation vector.
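The weighted fusion described above can be sketched as a softmax-weighted sum of per-modality feature vectors. This is a deliberately simplified stand-in, not multi-head attention: the function names softmax and fuse are hypothetical, and the scores are supplied by hand where a trained network or gating mechanism would compute them.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features: Dict[str, List[float]], scores: Dict[str, float]) -> List[float]:
    """Weighted sum of equally sized feature vectors, one per modality."""
    names = list(features)
    weights = softmax([scores[name] for name in names])
    dim = len(features[names[0]])
    fused = [0.0] * dim
    for name, weight in zip(names, weights):
        for i, value in enumerate(features[name]):
            fused[i] += weight * value
    return fused


# Hypothetical two-dimensional features for two modalities.
features = {"code": [1.0, 0.0], "document": [0.0, 1.0]}
equal = fuse(features, {"code": 0.0, "document": 0.0})       # equal contribution
code_heavy = fuse(features, {"code": 2.0, "document": 0.0})  # code dominates
```

Raising one modality's score increases its share of the fused vector, which is the effect a predefined, user-defined, or gated weight would have in the schemes listed above.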

[0028] In another possible implementation of the first aspect, the encoder includes at least one of a code encoder, a document encoder, a graph encoder, and a user feedback encoder. The code encoder is used to extract features from source code type data. Exemplarily, the code encoder includes one or more of an abstract syntax tree (AST), a convolutional neural network (CNN), and a recurrent neural network (RNN). In one possible design, the code encoder includes an AST, a CNN, and an RNN. The AST is used to parse the source code and extract structural information, such as one or more of functions, classes, and variables. The CNN is used to capture local features of the code, and the RNN is used to capture sequential dependencies in the code. Feature vectors of the source code are generated based on the features obtained from the CNN and RNN.
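For source code written in Python, the structural extraction step can be sketched with the standard-library ast module. This covers only the AST part of the code encoder, the CNN and RNN stages are omitted, and the function name extract_structure and the sample source are hypothetical.

```python
import ast
from typing import Dict, List


def extract_structure(source: str) -> Dict[str, List[str]]:
    """Collect the names of functions, classes, and assigned variables."""
    tree = ast.parse(source)
    info: Dict[str, List[str]] = {"functions": [], "classes": [], "variables": []}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            info["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            info["classes"].append(node.name)
        elif isinstance(node, ast.Assign):
            # Only simple "name = value" targets are recorded in this sketch.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    info["variables"].append(target.id)
    return {key: sorted(set(names)) for key, names in info.items()}


SAMPLE = """
LIMIT = 10

class Cart:
    def total(self, prices):
        subtotal = sum(prices)
        return min(subtotal, LIMIT)
"""

structure = extract_structure(SAMPLE)
```

The resulting dictionary is the kind of structural information that later stages could turn into feature vectors; other programming languages would need their own parsers.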

[0029] A document encoder is used to extract features from document-type data. For example, a document encoder includes a pre-trained model. In some cases, a document encoder is used to extract one or more features from a document, such as keyword features, phrase features, sentence-level features, and semantic features, and generates a feature vector for the document based on the extracted features.

[0030] Graph encoders are used to extract features from graph-type data. For example, a graph encoder converts a graph into graph-structured data, using a graph neural network (GNN) to capture features of nodes (e.g., steps and / or components) and edges (e.g., dependencies between steps or call relationships between modules) in the graph, generating a feature vector for the graph.
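A rough sketch of this idea follows, assuming a flowchart has already been converted into a list of edges. A single round of neighbor averaging followed by mean pooling stands in for a trained GNN; the function names and the hand-written node features are hypothetical.

```python
from typing import Dict, List, Tuple


def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Treat each flowchart edge as an undirected link between two nodes."""
    adjacency: Dict[str, List[str]] = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
        adjacency.setdefault(dst, []).append(src)
    return adjacency


def propagate(
    features: Dict[str, List[float]], adjacency: Dict[str, List[str]]
) -> Dict[str, List[float]]:
    """One message-passing round: average each node with its neighbors."""
    updated = {}
    for node, vector in features.items():
        group = [vector] + [features[n] for n in adjacency.get(node, [])]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated


def pool(features: Dict[str, List[float]]) -> List[float]:
    """Mean-pool the node vectors into one vector for the whole graph."""
    vectors = list(features.values())
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical three-step flowchart: login -> check -> done.
edges = [("login", "check"), ("check", "done")]
node_features = {"login": [1.0, 0.0], "check": [0.0, 1.0], "done": [0.0, 0.0]}
updated = propagate(node_features, build_adjacency(edges))
graph_vector = pool(updated)
```

After propagation each node's vector reflects its neighbors, so the pooled vector carries information about both the steps and the dependencies between them.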

[0031] Feedback encoders are used to extract features from feedback information.

[0032] In another possible implementation of the first aspect, the code generation device constructs a first prompt word based at least on a first intent representation vector, including the following steps: the code generation device constructs a first prompt word based on the first intent representation vector and context information of the first software, wherein the context information of the first software is used to indicate the association between the first software and the source code in a target code project, and the target code project is the code project in which the first software is located.

[0033] In the above embodiments, the code generation device can perceive contextual information and construct prompt words based on the code project where the first software is located, thereby improving the accuracy of understanding test requirements and enhancing the relevance and accuracy of the generated test code.

[0034] In yet another possible implementation of the first aspect, the code generation method further includes the following operation: the code generation device perceives context information of the first software based on the source code in the target project. The context information includes one or more of the following: dependency information of the first software on the source code in the target project, initialization variable information of the first software, and context step information of the first software.

[0035] In the above embodiments, the code generation device can perceive code context information, such as one or more of dependencies, initialization variables, or contextual step information. Utilizing context information can improve the accuracy of understanding test requirements.
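As a hedged illustration for a Python code project, dependency information and initialization variables could be collected with the standard-library ast module. The function name perceive_context, the in-memory project dictionary, and the file names are hypothetical, and context step information is not covered by this sketch.

```python
import ast
from typing import Dict, List


def perceive_context(project: Dict[str, str], target_file: str) -> Dict[str, List[str]]:
    """Find project-internal imports and module-level assignments of one file."""
    tree = ast.parse(project[target_file])
    # Module names available inside the project, derived from the file names.
    local_modules = {name[:-3] for name in project if name.endswith(".py")}
    dependencies: List[str] = []
    init_variables: List[str] = []
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Import):
            dependencies += [a.name for a in node.names if a.name in local_modules]
        elif isinstance(node, ast.ImportFrom) and node.module in local_modules:
            dependencies.append(node.module)
        elif isinstance(node, ast.Assign):
            init_variables += [t.id for t in node.targets if isinstance(t, ast.Name)]
    return {"dependencies": dependencies, "init_variables": init_variables}


# Hypothetical two-file project held in memory.
project = {
    "db.py": "def connect():\n    return 'connection'\n",
    "app.py": "import os\nfrom db import connect\nTIMEOUT = 5\n",
}
context = perceive_context(project, "app.py")
```

Imports of modules outside the project (such as os here) are ignored, so the result describes only how the target relates to other source code in the same project.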

[0036] In another possible implementation of the first aspect, the code generation device constructs a first prompt word based on a first intent representation vector and context information of the first software, including the following operations: the code generation device obtains a first intent structure based on the first intent representation vector and context information of the first software, and constructs the first prompt word using the first intent structure. The first intent structure includes multiple levels of content, each level of content being used to represent the test intent from at least one dimension.

[0037] Optionally, the context information of the first software can also be used when constructing the first prompt word. That is, the code generation device constructs the prompt word using the first intent structure and the context information of the first software.

[0038] In the above embodiments, the code generation device can generate a hierarchical intent structure. This hierarchical intent structure describes the test intent from multiple dimensions, comprehensively and in detail representing the test requirements. Using a hierarchical intent structure to construct prompt words can improve the accuracy of the test intent expressed by the prompt words, thereby enhancing the relevance and accuracy of the generated test code.

[0039] In some cases, the first intent structure includes basic intent, intermediate intent, and high-level intent. Basic intent includes representations of intents such as the dependencies of the first software. Intermediate intent includes representations of intents such as preconditions and test procedures. High-level intent includes representations of intents such as expected results and test characteristics.

[0040] Optionally, basic intent can be analyzed from the features of the multimodal data, while intermediate intent requires summarizing the information contained in the multimodal data and analyzing contextual information. High-level intent, on the other hand, is obtained by integrating information from the multimodal data, contextual information, basic intent, and other sources to infer the user's testing purpose and parse the user's final testing intent.

[0041] For example, the first intent structure includes one or more of the following: test objective, test point, test step, test feature, test dependency information, expected result, preconditions, test topic, etc.
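One way to picture such a structure is a record with three levels that is rendered into prompt text. This is a sketch under assumptions: the class IntentStructure, the function render_prompt, and the textual content are hypothetical, and in the application the structure is derived from vectors rather than written by hand.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class IntentStructure:
    """Three levels of intent, each mapping a dimension name to its content."""
    basic: Dict[str, List[str]] = field(default_factory=dict)         # e.g. dependencies
    intermediate: Dict[str, List[str]] = field(default_factory=dict)  # e.g. preconditions
    high_level: Dict[str, List[str]] = field(default_factory=dict)    # e.g. expected results

    def levels(self) -> List[Tuple[str, Dict[str, List[str]]]]:
        return [
            ("basic", self.basic),
            ("intermediate", self.intermediate),
            ("high-level", self.high_level),
        ]


def render_prompt(intent: IntentStructure) -> str:
    """Emit one prompt line per dimension, labelled with its level."""
    lines = ["Generate test code that satisfies the following intent:"]
    for level, content in intent.levels():
        for key, values in content.items():
            lines.append(f"[{level}] {key}: {'; '.join(values)}")
    return "\n".join(lines)


intent = IntentStructure(
    basic={"dependencies": ["db module"]},
    intermediate={"preconditions": ["user exists"], "test steps": ["log in", "log out"]},
    high_level={"expected result": ["session closed"]},
)
prompt = render_prompt(intent)
```

Keeping the levels separate makes it possible to change one dimension, such as the test steps, and rebuild the prompt without touching the rest.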

[0042] In another possible implementation of the first aspect, the code generation apparatus obtains a first intent structure based on a first intent representation vector, including the following operation: the code generation apparatus uses an intent parsing model to obtain the first intent structure based on the first intent representation vector.

[0043] In another possible implementation of the first aspect, the code generation apparatus obtains a first intent structure based on a first intent representation vector and context information of the first software, including the following operation: the code generation apparatus uses an intent parsing model to obtain the first intent structure based on the first intent representation vector and context information of the first software.

[0044] In both implementations described above, the code generation device utilizes an intent parsing model to parse the user's test intent and obtain a hierarchical intent structure. Centralized and dedicated intent parsing via the intent parsing model improves system reliability and stability, and enhances the efficiency and quality of test code generation.

[0045] In another possible implementation of the first aspect, the intent parsing model includes an intent recognition module, a context adjustment module, and a hierarchical construction module. The intent recognition module is used to obtain a preliminary vector based on a first intent representation vector. The context adjustment module is used to obtain an intermediate vector from the preliminary vector and context information of the first software. The hierarchical construction module is used to construct a first intent structure based on the preliminary vector and the intermediate vector.

[0046] As a possible example, the intent recognition module uses a Transformer-based encoder to encode the fused intent representation vector (i.e., the first intent representation vector), capturing the basic test objectives in the test requirements. Furthermore, the intent recognition module uses the Transformer's self-attention mechanism to identify the core intent and test topic of the test requirements. These basic test objectives, core intents, and test topics can be used to generate multi-level content within the intent structure. The result of the intent recognition stage is a preliminary vector that indicates the basic test objectives, core intents, and test topics in the user's test requirements.

[0047] As a possible example, the context adjustment module inputs the preliminary vector and the context information into a bidirectional encoding layer to capture the contextual dependencies in the test requirements. Through bidirectional encoding, the context adjustment module understands the contextual connections and logical relationships (such as contextual dependencies) in the test requirements, further refining the preliminary recognition results to obtain the intermediate vector.

[0048] As a possible example, the hierarchical construction module processes the preliminary and intermediate vectors in layers, constructing a hierarchical intent structure based on the logical structure of the test requirements. Optionally, during the construction process, the hierarchical construction module can adjust the associations and / or importance between intents at different levels to generate the intent structure. This intent structure can hierarchically and accurately reflect the test requirements.

[0049] In another possible implementation of the first aspect, after generating first test code based on the first prompt word using the first model, the code generation device further performs the following operations: the code generation device optimizes the first intent structure based on feedback information to obtain a second intent structure; constructs a second prompt word based on the second intent structure and the context information of the first software; and generates second test code based on the second prompt word using the first model. The second test code is used to test the first software. The feedback information includes user feedback information and / or the execution result of the first test code.

[0050] In the above implementation, the code generation device can sense the feedback from the environment to the output, optimize the intent structure, construct new prompts, and use the new prompts to guide the generation of new test code, i.e., the second test code. Through iterative updates, the test code is optimized, and the quality of the test code is improved.

[0051] In another possible implementation of the first aspect, the code generation device optimizes the first intent structure based on feedback information to obtain a second intent structure, comprising the following steps: the code generation device extracts features from the feedback information to obtain feature vectors of the feedback information; uses the feature vectors of the feedback information to identify problematic content and / or correct content in the first intent structure; and uses an optimizer to strengthen the representation of the problematic content in the first intent structure, and / or to maintain or weaken the representation of the correct content in the first intent structure, thereby obtaining the second intent structure. Optionally, the optimizer is trained based on a reinforcement learning algorithm.

[0052] As a possible design, taking feedback information that includes user feedback and the execution result of the first test code as an example, the above processing can be roughly divided into a feedback encoding phase, a feedback alignment phase, and an optimization phase. These phases are described in detail below:

[0053] During the feedback encoding phase, the code generation device encodes and extracts features from the execution results of the first test code to obtain an execution result vector. Optionally, this execution result vector may include (or indicate) key metrics such as test coverage and error rate. The code generation device also processes the user feedback information (e.g., using sentiment analysis and natural language processing) to extract user feedback features, which may include one or more of the following: test result satisfaction, test suggestions, or problem points.

[0054] During the feedback alignment phase, the code generation device aligns the feedback representation vector with the first intent structure, identifying the content associated with the feedback representation vector within the first intent structure. For example, the feedback representation vector may be associated with one or more of the test objectives, test points, and test steps. Through the feedback information and the first intent structure, the code generation device determines the problematic content within the intent structure, and optionally, it can also determine the correct content within the intent structure. In other words, it identifies the problems and deficiencies within the intent structure.
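The alignment step can be pictured as a similarity search between the feedback vector and vectors for each item of the intent structure. The application does not say how alignment is computed; cosine similarity with a fixed threshold is an assumption made here for illustration, and the function names and vectors are hypothetical.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(
    feedback: List[float], intent_items: Dict[str, List[float]], threshold: float = 0.5
) -> List[str]:
    """Return the intent items whose vectors are close to the feedback vector."""
    return [
        name
        for name, vector in intent_items.items()
        if cosine(feedback, vector) >= threshold
    ]


# Hypothetical item vectors for three parts of an intent structure.
intent_items = {
    "test objective": [1.0, 0.0, 0.0],
    "test step": [0.0, 1.0, 0.0],
    "test point": [0.9, 0.1, 0.0],
}
associated = align([1.0, 0.0, 0.0], intent_items)
```

The items returned are the candidates that later processing would classify as problematic or correct.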

[0055] During the optimization phase, the code generation device adjusts the intent structure based on feedback information using an optimizer. For example, for errors or deficiencies in the intent structure indicated in the feedback information, the code generation device enhances the representation of these problematic parts, such as enhancing the representation of the corresponding test objective, test point, or test step. Further, the code generation device maintains or appropriately weakens the representation of correct parts. Optionally, the optimizer is trained based on a reinforcement learning algorithm.
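To make "strengthen" and "weaken" concrete, the sketch below treats the intent structure as a set of importance weights and rescales them. This fixed rule is an assumption standing in for the optimizer, which the text says may be trained with reinforcement learning; the function name reweight and the boost and decay factors are hypothetical.

```python
from typing import Dict, Set


def reweight(
    weights: Dict[str, float],
    problematic: Set[str],
    boost: float = 1.5,
    decay: float = 0.9,
) -> Dict[str, float]:
    """Boost problematic items, slightly decay the rest, then renormalise to 1."""
    raw = {
        name: weight * (boost if name in problematic else decay)
        for name, weight in weights.items()
    }
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


first_structure = {"test objective": 0.5, "test step": 0.5}
second_structure = reweight(first_structure, problematic={"test step"})
```

In this example the problematic test step ends up with a larger share of the total weight than the test objective, so a prompt built from the second structure would emphasise it.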

[0056] In this way, the code generation device can correct errors or deficiencies that arise during the code generation process, thereby optimizing the test code and improving its quality.

[0057] In another possible implementation of the first aspect, the code generation device constructs a second prompt word based on the second intent structure and the context information of the first software, including the following steps: the code generation device constructs the second prompt word based on the second intent structure, the context information of the first software, and feedback information.

[0058] In the above implementation, feedback information is also used to optimize prompt words. This improves the quality of prompt words, optimizes test code, and enhances the overall quality of the test code.

[0059] In another possible implementation of the first aspect, after generating the first test code based on the first prompt word using the first model, the code generation device further performs the following operation: the code generation device optimizes the first prompt word to obtain a second prompt word based on feedback information, the feedback information including user feedback information and / or the execution result of the first test code.

[0060] In the above implementation, feedback information is used to optimize prompt words, thereby improving the quality of prompt words, optimizing test code, and enhancing the quality of test code.
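The feedback-driven prompt update can be sketched as a small loop. This is a simplification under stated assumptions: feedback is appended to the prompt as plain text, the names refine, stub_model, and run_tests are hypothetical, and the stub model and the add function exist only so that the example runs.

```python
from typing import Callable, List, Tuple


def refine(
    prompt: str,
    model: Callable[[str], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> List[str]:
    """Generate, execute, and regenerate until the code passes or rounds run out."""
    attempts: List[str] = []
    for _ in range(max_rounds):
        code = model(prompt)
        attempts.append(code)
        passed, log = run_tests(code)
        if passed:
            break
        # Build the next prompt word from the previous one plus the feedback.
        prompt += "\nFeedback on previous attempt: " + log
    return attempts


def stub_model(prompt: str) -> str:
    """Stand-in model: only produces a correct test once it has seen feedback."""
    if "Feedback" in prompt:
        return "assert add(1, 2) == 3"
    return "assert add(1, 2) == 4"


def run_tests(code: str) -> Tuple[bool, str]:
    """Execute the generated test against a toy implementation of add."""
    namespace = {"add": lambda a, b: a + b}
    try:
        exec(code, namespace)  # acceptable here only because the code is our own stub
    except AssertionError:
        return False, "assertion failed"
    return True, "ok"


attempts = refine("Write a test for add()", stub_model, run_tests)
```

Here the execution result serves as the feedback information; user feedback could be appended to the prompt in the same way. Generated code from a real model should be executed in an isolated environment rather than with exec.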

[0061] In another possible implementation of the first aspect, the first prompt word includes multiple items, each used to describe the test intent from at least one dimension.

[0062] The above implementation describes the content of the prompt words. The prompt words have multiple dimensions of content, which makes the prompt words high-quality and comprehensive, thus helping to improve the quality of test code.

[0063] In another possible implementation of the first aspect, the first prompt word includes one or more of the following: test characteristics, test steps, preconditions, expected results, test environment, and test behavior.

[0064] In another possible implementation of the first aspect, the multimodal data further includes feedback information, which includes user feedback on the previously generated test code and / or the execution result of the previously generated test code.

[0065] In another possible implementation of the first aspect, the code generation device acquires multimodal data related to the first software through the following operation: the code generation device receives multimodal data related to the first software input by a user.

[0066] In a second aspect, embodiments of this application provide a code generation apparatus. The code generation apparatus includes an acquisition module, a multimodal data fusion module, a prompt word construction module, and an inference module. The code generation apparatus also deploys a first model, or the code generation apparatus has the ability to invoke a first model. This code generation apparatus is used to implement the method described in the first aspect or any possible implementation thereof.

[0067] In one possible implementation of the second aspect, the acquisition module is used to acquire multimodal data related to the first software. The multimodal data fusion module is used to obtain a first intent representation vector based on the multimodal data, the first intent representation vector being obtained by fusing features from multiple data sources. The parsing and construction module is used to construct a first prompt word based at least on the first intent representation vector. The inference module is used to generate first test code based on the first prompt word using a first model, the first test code being used to test the first software.

[0068] In one possible implementation of the second aspect, the description document of the first software includes information about the first software, which includes at least one of the following: the preconditions of the first software, the use case information of the first software, the test case information of the first software, the source code file path of the first software, the product line of the first software, or the development department of the first software.

[0069] In another possible implementation of the second aspect, the illustration of the first software includes a flowchart of the first software and / or a functional module architecture diagram of the first software.

[0070] In another possible implementation of the second aspect, the first model is an LLM.

[0071] In another possible implementation of the second aspect, the code generation device is equipped with a second model or has the ability to invoke the second model. The multimodal data fusion module is used to utilize the second model to obtain a first intent representation vector based on multimodal data.

[0072] In another possible implementation of the second aspect, the second model includes an encoder and a neural network, wherein the encoder is used to extract features of each data item in the multimodal data to obtain features of the multimodal data, and the neural network is used to obtain a first intent representation vector based on the features of the multimodal data.

[0073] In another possible implementation of the second aspect, the encoder includes at least one of a code encoder, a document encoder, a graph encoder, and a feedback encoder. The code encoder is used to extract features from source code type data, the document encoder is used to extract features from document type data, the graph encoder is used to extract features from graph type data, and the feedback encoder is used to extract features from feedback information.

[0074] In another possible implementation of the second aspect, the parsing and construction module is used to construct a first prompt word based on a first intent representation vector and context information of the first software. The context information of the first software is used to indicate the association between the first software and the source code in a target code project, where the target code project is the code project containing the first software.

[0075] In another possible implementation of the second aspect, the code generation apparatus further includes a context-aware module, which is used to obtain context information of the first software based on the source code in the target project. In some cases, the context information includes one or more of the following: dependency information of the first software on the source code in the target project, initialization variable information of the first software, and context step information of the first software.

[0076] In another possible implementation of the second aspect, the parsing and construction module includes an intent parsing module and a prompt word construction module. The intent parsing module is used to obtain a first intent structure based on a first intent representation vector and context information of the first software. The first intent structure includes multiple levels of content, each level representing the test intent from at least one dimension. The prompt word construction module is used to construct a first prompt word based on the first intent structure and context information of the first software.

[0077] In another possible implementation of the second aspect, the intent parsing module is used to obtain a first intent structure based on a first intent representation vector and context information of the first software using an intent parsing model.

[0078] In another possible implementation of the second aspect, the intent parsing model includes an intent recognition module, a context adjustment module, and a hierarchical construction module. The intent recognition module is used to obtain a preliminary vector based on a first intent representation vector. The context adjustment module is used to obtain an intermediate vector based on the preliminary vector and context information of the first software. The hierarchical construction module is used to construct a first intent structure based on the preliminary vector and the intermediate vector.

[0079] In another possible implementation of the second aspect, the code generation apparatus further includes an intent optimization module. The intent optimization module is used to optimize the first intent structure based on feedback information to obtain a second intent structure, the feedback information including user feedback information and / or the execution result of the first test code. The prompt word construction module is further used to construct a second prompt word based on the second intent structure and context information of the first software. The inference module is further used to generate second test code based on the second prompt word using the first model, the second test code being used to test the first software.

[0080] In another possible implementation of the second aspect, the intent optimization module includes a feedback encoding module, a feedback alignment module, and an optimization execution module. The feedback encoding module extracts features from the feedback information to obtain a feature vector. The feedback alignment module uses the feature vector to identify the problematic and correct content in the first intent structure. The optimization execution module uses an optimizer to strengthen the representation of the problematic content in the first intent structure and maintain or weaken the representation of the correct content, thus obtaining a second intent structure.
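
The three sub-steps of intent optimization can be sketched as follows. This is a toy illustration under loud assumptions: the intent structure is reduced to a dictionary mapping item names to weights, the feedback encoding is a keyword set rather than a learned feature vector, and a fixed multiplicative update (the hypothetical `boost` and `decay` parameters) stands in for the optimizer.

```python
from typing import Dict, Set, Tuple


def encode_feedback(feedback: str) -> Set[str]:
    """Feedback encoding (toy): lower-cased word set instead of a feature vector."""
    return {word.strip(".,:;").lower() for word in feedback.split()}


def align_feedback(
    features: Set[str], intent: Dict[str, float]
) -> Tuple[Set[str], Set[str]]:
    """Feedback alignment (toy): an item is problematic if the feedback names it."""
    problematic = {name for name in intent if name in features}
    correct = set(intent) - problematic
    return problematic, correct


def optimize_intent(
    intent: Dict[str, float],
    problematic: Set[str],
    boost: float = 1.5,   # assumed strengthening factor
    decay: float = 0.9,   # assumed weakening factor
) -> Dict[str, float]:
    """Optimization execution (toy): strengthen problematic items, weaken the rest."""
    return {
        name: weight * (boost if name in problematic else decay)
        for name, weight in intent.items()
    }


first_intent = {"preconditions": 1.0, "steps": 1.0, "expected_results": 1.0}
features = encode_feedback("The preconditions are missing a login step.")
problematic, correct = align_feedback(features, first_intent)
second_intent = optimize_intent(first_intent, problematic)
```

In this example only the `preconditions` item is named in the feedback, so it is the only item whose weight grows in `second_intent`.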

[0081] In another possible implementation of the second aspect, the prompt word construction module is further configured to construct a second prompt word based on the second intent structure, the context information of the first software, and the feedback information.

[0082] In another possible implementation of the second aspect, the prompt word construction module is further configured to optimize the first prompt word to obtain a second prompt word based on feedback information, including user feedback information and / or the execution result of the first test code.

[0083] In another possible implementation of the second aspect, the first prompt word includes multiple items, each used to represent the testing intent from at least one dimension.

[0084] In another possible implementation of the second aspect, the first prompt word includes one or more of the following: test characteristics, test steps, preconditions, or expected results.

[0085] In another possible implementation of the second aspect, the multimodal data further includes feedback information, which includes user feedback on the previously generated test code and / or the execution result of the previously generated test code.

[0086] In another possible implementation of the second aspect, the acquisition module is further configured to receive multimodal data related to the first software input by the user.

[0087] In a third aspect, this application also provides a user equipment, which includes a communication module and an interaction module. The communication module is used to receive external data and / or provide data to external systems, while the interaction module is used to interact with a user. The interaction module is used to receive multimodal data input by the user regarding the first software. The communication module is used to provide the multimodal data to a code generation device.

[0088] Furthermore, the communication module is also used to receive test code of the first software provided by the code generation device, such as first test code or second test code.

[0089] In a fourth aspect, this application also provides a model training device for training one or more models. The trained models are deployed in a code generation device, or the trained models are deployed in an intermediate device for use by the code generation device.

[0090] In a fifth aspect, this application also provides a computing device, including at least one processor and a memory, the memory storing a computer program, and the at least one processor being configured to call the computer program stored in the memory to implement the method described in the first aspect or any possible implementation of the first aspect.

[0091] In a sixth aspect, this application provides a computing device cluster. The computing device cluster includes at least one computing device, each computing device including a processor and a memory. The processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, causing the computing device cluster to perform some or all of the methods described in the first aspect and any implementation thereof.

[0092] In a seventh aspect, this application also provides a user equipment, including at least one processor and a memory, the memory storing a computer program, and the at least one processor being configured to call the computer program stored in the memory to perform the following operations: receiving multimodal data input by a user about a first software, and providing the multimodal data to a code generation device.

[0093] Optionally, the processor also performs the following operation: receiving test code for the first software provided by the code generation device, such as first test code or second test code.

[0094] In an eighth aspect, this application also provides a code generation system, which includes a computing device and the user equipment described in the seventh aspect, the computing device being used to implement the method described in the first aspect or any possible implementation thereof.

[0095] In a ninth aspect, this application also provides a computer-readable storage medium storing computer instructions that, when executed by at least one processor, implement the method described in the first aspect or any possible implementation thereof.

[0096] In a tenth aspect, this application also provides a computer program product including computer instructions that, when executed by at least one processor, implement the method described in the first aspect or any possible implementation thereof.

[0097] Furthermore, the computer program product may be a software or program product containing instructions that can run on a computing device or be stored on any available medium.

[0098] For the beneficial effects of the second through tenth aspects of this application, refer to the beneficial effects of the solution in the first aspect.

Description of the Drawings

[0099] The accompanying drawings used in the embodiments of this application are described below.

[0100] Figure 1 is a schematic diagram of the architecture of a code generation system provided in an embodiment of this application;

[0101] Figure 2 is a schematic diagram of the software functional modules of a code generation system provided in this application;

[0102] Figure 3 is a schematic diagram illustrating a process for obtaining an intent representation vector based on multimodal data, as provided in this application;

[0103] Figure 4 is a schematic diagram illustrating a deployment method for a second model provided in this application;

[0104] Figure 5 is a schematic diagram of an intent parsing process provided in this application;

[0105] Figure 6 is a schematic diagram illustrating a deployment method for an intent parsing model provided in this application;

[0106] Figure 7 is a schematic diagram of an intent optimization process provided in this application;

[0107] Figure 8 is a schematic diagram of the entire process of the training and application phases of a possible code generation system provided in this application;

[0108] Figure 9 is a flowchart illustrating a code generation method provided in an embodiment of this application;

[0109] Figure 10 is a schematic diagram of a possible code segment and its pseudocode provided in an embodiment of this application;

[0110] Figure 11 is a schematic diagram of a possible software test document provided in an embodiment of this application;

[0111] Figure 12 is a schematic diagram of a possible software flowchart provided in an embodiment of this application;

[0112] Figure 13 is a schematic diagram of a possible prompt word provided in an embodiment of this application;

[0113] Figure 14 is a schematic diagram of the structure of a code generation device provided in an embodiment of this application;

[0114] Figure 15 is a schematic diagram of the structure of a computing device provided in an embodiment of this application;

[0115] Figure 16 is a schematic diagram of the structure of a computing device cluster provided in an embodiment of this application;

[0116] Figure 17 is a schematic diagram of another computing device cluster provided in the embodiments of this application.

Detailed Description

[0117] The embodiments of this application will now be described in detail with reference to the accompanying drawings. For ease of understanding, some terms that may be used in this application will be introduced first.

[0118] Modality refers to the form or characteristics of data. Multimodal data refers to data of multiple types, for example one or more of code, documents (e.g., text documents), diagrams, and feedback information. In some cases, multimodal data includes at least two types of data; that is, multimodal data includes multiple data items, and these data items cover at least two types of data. In other cases, multimodal data can also refer to data from different information sources. For example, code refers to the source code of software, while software documentation refers to documents that describe the software. Other modalities will not be discussed in detail here.

[0119] A model, also known as an artificial intelligence (AI) model or machine learning model, is a set of functions and parameters learned through training on training data. It is used to perform specific tasks, such as prediction, classification, or other data processing tasks. Model training is inseparable from machine learning techniques. Machine learning is the science of training computer programs or systems to perform tasks without explicit instructions. Classified by learning method, machine learning can be divided into supervised learning, unsupervised learning, and self-learning, among other categories. Deep learning is a subset of machine learning, based on deep neural network models and methods.

[0120] Large models are models with a large number of parameters and complex structures. For example, large models include large language models (LLMs), which are deep learning models trained on large amounts of data (such as text data) capable of generating or understanding the meaning of language text. In some cases, large models employ similar architectures and training objectives to small models. The difference between large and small models is that large models are much larger and require more training data and computational resources.

[0121] Convolutional neural networks (CNNs) are a class of feedforward neural networks that include convolutional computations and have a deep structure. They are one of the representative algorithms of deep learning.

[0122] Recurrent neural networks (RNNs) are a type of neural network that takes sequence data as input, recursively processes the data in the direction of the sequence, and connects nodes (i.e., recurrent units) in a chain-like manner. They are also one of the representative algorithms of deep learning. In some cases, RNNs include long short-term memory networks (LSTM).

[0123] Transformer is a model based on the attention mechanism.

[0124] An intelligent agent is a proxy capable of perceiving information and taking actions to achieve specific goals. Intelligent agents possess autonomy, adaptability, and interactivity. They perceive their environment (e.g., through sensors or data input), make judgments and decisions based on their learned knowledge and algorithms, and then execute actions to influence the environment or achieve predetermined goals.

[0125] It should be understood that the terms described above are explained in order to facilitate understanding of this solution. The background and technical implementations given in the explanations of the terms are merely examples, and as technology evolves, the above technologies may be implemented in other ways.

[0126] Software testing is an indispensable part of the software development lifecycle, ensuring that software products meet predetermined functional and performance requirements before release. With technological advancements, software systems are becoming increasingly complex, involving more functions and modules. Highly integrated software systems require more comprehensive and in-depth testing to ensure they function correctly under various conditions. Therefore, automated test code generation has become an important research direction in the field of software testing.

[0127] Currently, test code generation generally suffers from low accuracy and inaccurate intent recognition. In view of this, this application provides a test code generation method and related products that can efficiently and accurately generate test code that meets user needs, resulting in a better user experience. In this application, the code generation device (such as an intelligent agent) can utilize multimodal data to accurately analyze the user's testing intent and combine it with model inference to generate test code for the software. The test code is of high quality and can more accurately match the user's testing requirements. Moreover, it facilitates model-based code generation, resulting in high efficiency and reliability. Furthermore, the aforementioned model is a large model; this application utilizes a "prompt word + large model" architecture to generate test code, which can improve the quality of test code and enhance the user experience.

[0128] For ease of understanding, the architecture and business scenarios of the code generation system provided in the embodiments of this application are described below. It should be noted that the system architecture and business scenarios described in this application are for the purpose of more clearly illustrating the technical solutions of this application and do not constitute a limitation on the technical solutions provided in this application. It should be understood that as system architectures evolve and new business scenarios emerge, the technical solutions provided in this application are equally applicable to similar technical problems.

[0129] Referring to Figure 1, Figure 1 is a schematic diagram of the architecture of a code generation system provided in an embodiment of this application. The code generation system includes a computing device 10. Optionally, it may also include one or more of the following: a user device 20, a cloud 30, or a model training device 40.

[0130] The computing device 10 is a device with computing capabilities, which may include a single computing device or a cluster of multiple computing devices. The computing device 10 may include servers, such as central servers, edge servers, or local servers in a local data center, or terminal devices such as desktop computers, laptops, or smartphones.

[0131] In this embodiment, the computing device 10 can acquire multimodal data related to the first software and use a model to obtain test code based on the multimodal data. The model can run within the computing device 10 or outside of it. In the latter case, the computing device 10 calls the model deployed outside of the computing device 10 through a predefined module. The computing device 10 also has other functions, such as one or more of the following: feature extraction, prompt word construction, intent parsing, intent optimization, user feedback optimization, or prompt word optimization. Please refer to the description below for related information.

[0132] In some possible implementations, the computing device 10 provides an access interface through which external devices can use the services provided by the computing device 10. For example, the access interface is an application programming interface (API).

[0133] In some cases, the code generation system also includes a user device 20. User device 20 is a device capable of interacting with a user, and may include, for example, an input / output (I / O) module capable of receiving user input and / or outputting information to interact with the user. Exemplarily, user device 20 includes, but is not limited to, a personal computer (PC), tablet computer, or smartphone. A PC may include a desktop computer or laptop computer.

[0134] User device 20 can communicate with computing device 10. Exemplarily, user device 20 can provide multimodal data to computing device 10, which then obtains test code based on the multimodal data and returns it to user device 20, enabling the user of user device 20 to test the first software based on the test code. For example, user device 20 can provide multimodal data and receive test code from computing device 10 via an API provided by computing device 10. Optionally, user device 20 can also receive feedback information input by the user and provide the feedback information to computing device 10. Computing device 10 can further optimize the test code based on the feedback information and provide the optimized test code to user device 20.

[0135] In some possible implementations, the functions of providing multimodal data to the server and receiving test code returned by the server can be integrated into a test code generation tool. This test code generation tool can be a functional module on a local or online test code generator, a functional module (including tools or plugins) on a local or online integrated development environment (IDE), a locally running software tool or plugin, or an online web-based tool, etc. For example, the test code generation tool can be integrated into computer software (such as a client), and the computer software can be installed on user device 20, allowing the user device to generate test code using the computer software. As another example, the test code generation tool can be integrated into an application (APP), and the APP can be installed on user device 20, allowing the user to generate code through the APP. As yet another example, the test code generation tool can be integrated into a lightweight program (such as a mini-program or microservice). As yet another example, the test code generation tool can be a webpage, allowing the user to generate code through the webpage.

[0136] In some cases, the code generation system also includes a cloud 30, which includes devices with computing resources. The cloud 30 includes various models or tools, and the computing device 10 can perform corresponding functions by calling upon these models or tools in the cloud. The computing resources of the cloud 30 include physical computing resources or virtual computing resources.

[0137] As one possible implementation, the cloud 30 includes an AI hub, which comprises one or more models or tools in the field of AI that can assist the computing device 10 in completing tasks. For example, the cloud 30 includes one or more models such as a first model, an encoder, a neural network, an intent parsing model, or an intent optimizer.

[0138] Optionally, the cloud 30 and the computing device 10 can be integrated into the same physical device, or they can belong to different physical devices. For example, in the latter case, the cloud 30 and the computing device 10 belong to different cloud infrastructures. Of course, this application also applies to the case where the cloud 30 and the computing device 10 belong to the same cloud infrastructure.

[0139] In some cases, models deployed in the cloud 30 or the computing device 10 are pre-trained by model training device 40. Model training device 40 is a device with computing capabilities that can train models based on training data and / or algorithms.

[0140] Optionally, the cloud 30 and the model training device 40 can be integrated together, for example, in the same physical device. Alternatively, the cloud 30 and the model training device 40 can be located in different computing instances, for example, in different physical devices. Similarly, the model training device 40 and the computing device 10 can be integrated together, or located in different computing instances, which include one or more of the following: a single computing device (or host), a cluster of computing devices, a container, a virtual machine, a cloud, etc.

[0141] The above describes a hardware architecture for a code generation system provided in this application. The following describes some software architectures provided in this application. Referring to Figure 2, Figure 2 is a schematic diagram of the software functional modules of a code generation system provided in this application. The code generation system includes a data input module, a multimodal data fusion module, a parsing and construction module, and an inference module, wherein the parsing and construction module includes a prompt word construction module. Optionally, the code generation system also includes one or more of the following: a preprocessing module, a context-aware module, an intent parsing module, an intent optimization module, or a data output module. Their functions are described below.

[0142] The data input module is used to receive data input to the code generation system. In some cases, the data input module can also be called an acquisition module. For example, the data input module is used to receive one or more of the following input by the user: source code, related documentation of the source code, flowcharts of the source code, architecture diagrams, etc. As one possible implementation, the code generation system provides an API, through which the user device can provide multimodal data to the code generation system, and the data input module can receive the multimodal data provided by the user device. In this embodiment, the data input module is capable of acquiring multimodal data related to the first software, and the multimodal data includes multiple data items covering at least two types of data.
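
The requirement that the input cover at least two types of data can be checked as in the following sketch. The `DataItem` class, the modality labels, and the function name `is_multimodal` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataItem:
    modality: str   # assumed labels, e.g. "code", "document", "diagram", "feedback"
    content: str


def is_multimodal(items: List[DataItem]) -> bool:
    """Return True when the items cover at least two distinct modalities."""
    return len({item.modality for item in items}) >= 2


mixed = [
    DataItem("code", "def add(a, b): return a + b"),
    DataItem("document", "add returns the sum of its arguments"),
]
code_only = [DataItem("code", "x = 1"), DataItem("code", "y = 2")]
```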

[0143] The multimodal data fusion module, also known as the multimodal data processing module, is used to obtain an intent representation vector based on multimodal data. This intent representation vector is obtained by fusing features from multiple data sources; that is, the intent representation vector is a fused feature vector of multimodal data. For example, the multimodal data fusion module extracts features from data from multiple modalities such as source code, documents, and flowcharts, and then fuses the feature vectors from different modalities.

[0144] Optionally, with reference to Figure 3, the process of obtaining an intent representation vector based on multimodal data (i.e., multimodal data fusion) includes the following stages: modality-specific encoding, modality attention weight determination, and fusion. Each stage can be completed by one or more functional modules. In some cases, the latter two stages can also be considered as one stage. In the modality-specific encoding stage, the code generation system extracts features from each data item in the multimodal data based on the modality of the data to obtain features for multiple data items. In the modality attention weight determination stage, the code generation system calculates the attention weights (or contributions) of the feature vectors of different modalities. In the fusion stage, the code generation system obtains the first intent representation vector based on the features of multiple data items and the attention weights (or contributions) of each data item.

[0145] In some cases, the multimodal data fusion module can utilize a second model to obtain an intent representation vector based on the multimodal data. The second model can be a pre-trained model. In one possible implementation, the second model includes an encoder and a neural network. The encoder extracts features from each data item in the multimodal data to obtain features for the multimodal data, and the neural network obtains the intent representation vector based on the features of the multimodal data.

[0146] Referring to Figure 4, Figure 4 is a schematic diagram illustrating a deployment method for a second model. The multimodal data fusion module includes the second model, as shown in part (a) of Figure 4; alternatively, the multimodal data fusion module can call the second model, as shown in part (b) of Figure 4. The second model includes an encoder and a neural network. The encoder can be of various types, meaning the second model includes multiple encoders, each corresponding to a specific modality of data. For example, the encoder may include at least one of a code encoder, document encoder, graph encoder, and feedback encoder, each used to encode a corresponding type of data. Optionally, the neural network is a multi-head attention neural network, with input consisting of features of multiple data items. The neural network can fuse features based on the attention weights of the features of each data item to obtain a first intent representation vector. In some cases, the attention weights of the features of each data item can be predefined or user-defined. In some cases, the attention weights of the features of each data item are adjustable; for example, the neural network dynamically adjusts the contribution of each modality feature vector in the fusion process based on a gating mechanism, and the neural network obtains the intent representation vector based on the features of each data item and their respective contributions. In other cases, the features of each data item are weighted by their attention weights and merged to obtain the intent representation vector.
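
The weight determination and fusion stages can be illustrated with a deliberately simplified sketch. It assumes one feature vector per modality, all of equal length, and one scalar score per modality that is normalized with a softmax; this single-weight scheme is a stand-in and does not reproduce the multi-head attention network or the gating mechanism described above. The function names and the example scores are hypothetical.

```python
import math
from typing import Dict, List


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize per-modality scores into weights that sum to 1."""
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(value - peak) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def fuse(features: Dict[str, List[float]], scores: Dict[str, float]) -> List[float]:
    """Weighted sum of the modality feature vectors (the fusion stage)."""
    weights = softmax(scores)
    dim = len(next(iter(features.values())))
    return [
        sum(weights[name] * features[name][i] for name in features)
        for i in range(dim)
    ]


# Toy features standing in for the outputs of a code encoder and a document encoder.
features = {"code": [1.0, 0.0], "document": [0.0, 1.0]}

# Equal scores give equal weights, so the fused vector is the plain average.
uniform = fuse(features, {"code": 0.0, "document": 0.0})

# A higher score for "code" shifts the fused vector toward the code features.
code_heavy = fuse(features, {"code": 2.0, "document": 0.0})
```

The scores here are supplied by hand, which corresponds to the predefined or user-defined case; an adjustable scheme would compute them from the features themselves.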

[0147] Optionally, the input multimodal data is processed by a preprocessing module before being provided to the multimodal data fusion module. This preprocessing module is used to preprocess the input multimodal data. For example, the preprocessing module can perform one or more of the following operations depending on the data type: noise reduction, duplicate removal, useless data removal, or useless context removal, etc.
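
As a minimal sketch of two of these operations, the following function removes duplicate lines and, as an assumption about what counts as useless data, blank lines. The function name `preprocess_lines` is hypothetical, and real preprocessing would depend on the data type.

```python
from typing import List


def preprocess_lines(lines: List[str]) -> List[str]:
    """Drop blank lines and exact duplicates while keeping first-seen order."""
    seen = set()
    cleaned = []
    for line in lines:
        text = line.strip()
        if not text or text in seen:
            continue  # skip empty lines and lines already kept
        seen.add(text)
        cleaned.append(text)
    return cleaned


document_lines = ["Login requires a valid token.", "", "  Login requires a valid token.  ", "Logout clears the session."]
cleaned_lines = preprocess_lines(document_lines)
```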

[0148] The context-aware module is used to sense the context information of the source code of the first software and outputs the context information of the first software. The context information of the first software includes one or more of the following: dependencies, initialization variables, or context step information.
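
One way to collect dependencies and initialization variables is static analysis of the source code. The sketch below assumes, purely for illustration, that the first software is written in Python and uses the standard `ast` module; the function name `extract_context` and the returned dictionary keys are hypothetical, and context step information is not covered.

```python
import ast
from typing import Dict, List


def extract_context(source: str) -> Dict[str, List[str]]:
    """Collect imported modules and top-level assigned names from Python source."""
    tree = ast.parse(source)
    dependencies: List[str] = []
    init_variables: List[str] = []
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Import):
            dependencies.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"; record it as "".
            dependencies.append(node.module or "")
        elif isinstance(node, ast.Assign):
            init_variables.extend(
                target.id for target in node.targets if isinstance(target, ast.Name)
            )
    return {"dependencies": dependencies, "initialization_variables": init_variables}


sample = "import os\nfrom json import loads\nTIMEOUT = 30\n\ndef run():\n    return TIMEOUT\n"
context = extract_context(sample)
```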

[0149] The parsing and construction module is used to construct prompt words based on the input information. Optionally, the input to the parsing and construction module includes the output of the multimodal data fusion module. Optionally, when the code generation system includes a context-aware module, the input to the parsing and construction module may also include the output of the context-aware module.

[0150] The parsing and construction module includes a prompt word construction module, which constructs prompt words based on the input information. These prompt words guide the model in generating test code. Optionally, the prompt words include multiple elements, each representing the test intent from at least one dimension. For example, the prompt words may include one or more of the following: test characteristics, test steps, preconditions, expected results, test environment, and test behavior.
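
A prompt word with these elements could be represented and rendered as in the following sketch. The `PromptWord` class, its field names, and the text layout are assumptions made for illustration and do not reproduce the prompt word shown in Figure 13.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptWord:
    """Hypothetical container for the prompt word elements listed above."""
    test_characteristics: str
    test_steps: List[str]
    preconditions: List[str] = field(default_factory=list)
    expected_results: List[str] = field(default_factory=list)
    test_environment: str = ""
    test_behavior: str = ""

    def render(self) -> str:
        """Render the non-empty elements as plain text for the first model."""
        lines = [f"Test characteristics: {self.test_characteristics}"]
        if self.preconditions:
            lines.append("Preconditions: " + "; ".join(self.preconditions))
        lines.append("Test steps:")
        lines.extend(f"  {i}. {step}" for i, step in enumerate(self.test_steps, start=1))
        if self.expected_results:
            lines.append("Expected results: " + "; ".join(self.expected_results))
        if self.test_environment:
            lines.append(f"Test environment: {self.test_environment}")
        if self.test_behavior:
            lines.append(f"Test behavior: {self.test_behavior}")
        return "\n".join(lines)


prompt = PromptWord(
    test_characteristics="login with a valid account",
    test_steps=["call login(user, password)", "read the returned session"],
    preconditions=["the account exists"],
    expected_results=["a non-empty session token is returned"],
)
prompt_text = prompt.render()
```

Optional elements left empty are omitted from the rendered text, which matches the "one or more of the following" wording.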

[0151] In one possible scenario, the parsing module includes a prompt word construction module but does not include an intent parsing module. The information input to the prompt word construction module includes the output of the multimodal data fusion module. Optionally, when the code generation system includes a context-aware module, the input to the prompt word construction module may also include the output of the context-aware module.

[0152] In another possible scenario, the parsing module may also include an intent parsing module. This module derives an intent structure from the information input to it (i.e., the intent parsing process), and the intent structure represents the user's test intent. The prompt word construction module constructs prompt words based on the output of the intent parsing module, i.e., the intent structure. In this case, the information input to the intent parsing module includes the output of the multimodal data fusion module. Optionally, when the code generation system includes a context-aware module, the input to the intent parsing module may also include the output of the context-aware module.

[0153] As one possible implementation, please see Figure 5. Intent parsing comprises the following stages: intent recognition, context adjustment, and hierarchical construction. Each stage can be completed by one or more functional modules. In the intent recognition stage, the code generation system obtains a preliminary vector based on the first intent representation vector. In the context adjustment stage, the code generation system obtains an intermediate vector based on the preliminary vector and the context information of the first software. In the hierarchical construction stage, the code generation system constructs the first intent structure based on the preliminary vector and the intermediate vector.

[0154] In some cases, the intent parsing module can use an intent parsing model to construct an intent structure based on the input intent representation vector and the context information of the first software's source code. The intent parsing model can optionally be a pre-trained model. Optionally, the intent parsing model includes an intent recognition module, a context adjustment module, and a hierarchical construction module, which correspond to the functions of the three stages: intent recognition, context adjustment, and hierarchical construction.

[0155] Optionally, the intent parsing module includes the intent parsing model, as shown in part (a) of Figure 6, or the intent parsing module can invoke the intent parsing model, as shown in part (b) of Figure 6.

[0156] As a possible implementation example, in the intent recognition stage, the intent parsing model uses a Transformer-based encoder to encode the fused intent representation vector (i.e., the output of the multimodal data fusion module) to capture the basic test objectives in the test requirements. Further, the intent parsing model uses the Transformer's self-attention mechanism to identify the core intent and theme within the test requirements. The result of the intent recognition stage is a preliminary vector, which indicates the basic test objectives, core intent, and test subject in the user's test requirements. In the context adjustment stage, the intent parsing model inputs the preliminary vector and the context information into a bidirectional encoding layer to capture the contextual dependencies in the test requirements. Through bidirectional encoding, the model captures the contextual connections and logical relationships within the test requirements and refines the preliminary recognition result to obtain an intermediate vector. Because the intermediate vector incorporates the code context, it reflects the deeper intent of the user's test requirements. In the hierarchical construction stage, the intent parsing model processes the preliminary vector and the intermediate vector layer by layer and, based on the logical structure of the test requirements, constructs a hierarchical intent structure. For example, the constructed intent structure includes primary intents (or preliminary test intents), intermediate intents (or intermediate test intents), and advanced intents (or advanced test intents). Optionally, during the construction process, the hierarchical construction module can adjust the association and / or importance between intents at different levels to generate an intent structure that reflects the test requirements hierarchically and accurately.
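The hierarchical intent structure described above can be pictured with a small data structure. The sketch below is only an illustration: the class names `Intent` and `IntentStructure`, the `importance` field, and the example intents are all hypothetical, and the application does not prescribe any particular representation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Intent:
    description: str
    importance: float = 1.0  # hypothetical importance score, adjustable per level


@dataclass
class IntentStructure:
    primary: List[Intent] = field(default_factory=list)
    intermediate: List[Intent] = field(default_factory=list)
    advanced: List[Intent] = field(default_factory=list)

    def levels(self):
        """Return (level name, intents) pairs ordered from primary to advanced."""
        return [
            ("primary", self.primary),
            ("intermediate", self.intermediate),
            ("advanced", self.advanced),
        ]


structure = IntentStructure(
    primary=[Intent("verify that the push function returns a result")],
    intermediate=[Intent("verify behavior when a dependency is unavailable", importance=0.8)],
    advanced=[Intent("verify end-to-end delivery across modules", importance=0.5)],
)
for name, intents in structure.levels():
    for intent in intents:
        print(name, intent.importance, intent.description)
```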

[0157] In some possible implementations, the input to the parsing module may also include user feedback on previously generated test code. For example, test code generation may include multiple reasoning processes, each consisting of a prompt generation phase and a reasoning phase. In the prompt generation phase, the parsing module may optimize the prompts based on the output of the previous reasoning phase, or on user feedback on that output, thereby generating new prompts that serve as input for the current reasoning phase. For example, feedback information may be input to the prompt word construction module and / or the intent parsing module to optimize the output of these modules.
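The alternation between a prompt generation phase and a reasoning phase can be sketched as a simple loop. Everything named here is hypothetical: `build_prompt`, `generate_with_feedback`, the `max_rounds` limit, and the stand-in functions `fake_model` and `fake_feedback` exist only to make the example runnable and do not represent the first model or real user feedback.

```python
def build_prompt(intent, feedback=None):
    """Build a prompt; append feedback from the previous round when available."""
    prompt = f"Generate test code for: {intent}"
    if feedback:
        prompt += f"\nAddress this feedback on the previous output: {feedback}"
    return prompt


def generate_with_feedback(intent, model, get_feedback, max_rounds=3):
    """Alternate prompt generation and reasoning until no feedback remains."""
    feedback = None
    output = None
    for _ in range(max_rounds):
        prompt = build_prompt(intent, feedback)  # prompt generation phase
        output = model(prompt)                   # reasoning phase
        feedback = get_feedback(output)          # None means the output is accepted
        if feedback is None:
            break
    return output


# Stand-ins for the model and the user, for demonstration only.
def fake_model(prompt):
    return "test_v2" if "feedback" in prompt else "test_v1"


def fake_feedback(output):
    return "cover the error path" if output == "test_v1" else None


print(generate_with_feedback("push media copy information", fake_model, fake_feedback))
```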

[0158] As a possible example, the prompt building module constructs prompts based on the intent structure, contextual information from the first software, and feedback information, using a prompt template.

[0159] The reasoning module, or large model reasoning module, is used to generate test code based on prompts using the first model.

[0160] The feedback optimization module is used to optimize the intent structure and / or prompts based on feedback information. For example, the feedback information includes the execution results of test code and user feedback. Based on the execution results of the test code and the user feedback, the feedback optimization module identifies the parts of the intent structure and prompts that need optimization and updates the intent parsing and prompt construction strategies in real time.

[0161] As one possible implementation, see Figure 7. The code generation system optimizes the intent structure based on feedback information in the following stages: intent optimization, feedback alignment, and optimization execution. Each stage can be completed by one or more functional modules. In the intent optimization stage, the code generation system extracts features from the feedback information to obtain feature vectors. In the feedback alignment stage, the code generation system uses the feature vectors of the feedback information to identify the problematic content and the correct content in the first intent structure. In the optimization execution stage, the code generation system uses an optimizer to strengthen the representation of the problematic content in the first intent structure while maintaining or weakening the representation of the correct content, resulting in a new intent structure.

[0162] As a possible implementation example, the feedback information includes the execution results of the test code and user feedback. In the feedback encoding phase, the code generation system encodes and extracts features from the execution results of the first test code to obtain an execution result vector. Optionally, this execution result vector may include (or indicate) key metrics such as test coverage and error rate. The code generation system processes the user feedback information (e.g., through sentiment analysis and natural language processing) to extract user feedback features, which may include one or more of the following: test result satisfaction, test suggestions, or problem points. The execution result vector and the user feedback features can be integrated to obtain a feedback representation vector. In the feedback alignment phase, the code generation system aligns the feedback representation vector with the first intent structure and identifies the content in the first intent structure that is associated with the feedback representation vector, for example one or more of the test objectives, test points, and test steps. Using the feedback information and the first intent structure, the code generation system determines the problematic content in the intent structure, that is, its problems and deficiencies; optionally, it can also determine the correct content. In the optimization phase, the code generation system uses an optimizer to adjust the intent structure according to the feedback information. For example, for errors or deficiencies in the intent structure pointed out in the feedback information, the code generation system enhances the representation of these problematic parts, such as the corresponding test objectives, test points, or test steps. Furthermore, the code generation system maintains or appropriately reduces the representation of the correct parts.
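The optimization step can be illustrated with a deliberately simplified sketch in which the intent structure is reduced to a dictionary of weights per element. This representation, the function name `optimize_intent_weights`, and the multiplicative `boost` and `decay` factors are assumptions for illustration only; the application does not specify which optimizer is used or how representations are stored.

```python
def optimize_intent_weights(weights, problematic, boost=1.5, decay=0.9):
    """Return new weights: boost problematic items, keep or slightly reduce the rest."""
    updated = {}
    for item, weight in weights.items():
        if item in problematic:
            updated[item] = weight * boost  # strengthen problematic content
        else:
            updated[item] = weight * decay  # decay=1.0 maintains, <1.0 weakens correct content
    return updated


weights = {"test objective": 1.0, "test points": 1.0, "test steps": 1.0}
new_weights = optimize_intent_weights(weights, problematic={"test steps"})
print(new_weights)
```

Setting `decay=1.0` corresponds to maintaining the representation of the correct parts rather than reducing it.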

[0163] In some cases, the feedback optimization module can use a feedback optimization model to optimize the intent structure based on feedback information. The feedback optimization model can be a pre-trained model. Optionally, the feedback optimization model includes an intent optimization module, a feedback alignment module, and an optimization execution module, which correspond to the functions of the three stages: intent optimization, feedback alignment, and optimization execution. Optionally, the feedback optimization module includes the feedback optimization model, or the feedback optimization module can call the feedback optimization model.

[0164] In some cases, the code generation system shown in Figure 2 is used to implement the code generation method of this application. For details on the specific operations performed by each module, please refer to the description in the method embodiment.

[0165] In some implementations, the aforementioned data input module, preprocessing module, multimodal data fusion module, data perception module, parsing module, inference module, intent optimization module, and data output module can all be implemented in software or hardware. The implementation of the multimodal data fusion module is described below as an example; the implementation of the other modules can refer to that of the multimodal data fusion module.

[0166] As an example of a software functional unit, a multimodal data fusion module may include code running on a computing instance. This computing instance may include at least one of a physical host (computing device), a virtual machine, or a container. Furthermore, the aforementioned computing instance may be one or more. For example, the multimodal data fusion module may include code running on multiple hosts / virtual machines / containers. It should be noted that the multiple hosts / virtual machines / containers used to run the code may be distributed within the same region or in different regions. Further, the multiple hosts / virtual machines / containers used to run the code may be distributed within the same availability zone (AZ) or in different AZs, each AZ comprising one or more geographically proximate data centers. Typically, a region may include multiple AZs.

[0167] Similarly, multiple hosts / virtual machines / containers used to run this code can be distributed within the same Virtual Private Cloud (VPC) or across multiple VPCs. Typically, a VPC is set up within a region. Communication between two VPCs within the same region, as well as between VPCs in different regions, requires a communication gateway to be set up within each VPC to enable interconnection between VPCs.

[0168] As an example of a hardware functional unit, the multimodal data fusion module may include at least one computing device, such as a server. Alternatively, the multimodal data fusion module may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD can be implemented using a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.

[0169] The multiple computing devices included in the multimodal data fusion module can be distributed within the same region or in different regions, within the same availability zone (AZ) or in different AZs, and within the same virtual private cloud (VPC) or across multiple VPCs. These computing devices can be any combination of servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.

[0170] It should be noted that, in other embodiments, the multimodal data fusion module can be used to execute any step in the test code generation method, the parsing and construction module can be used to execute any step in the test code generation method, and the other modules can also be used to execute any step in the test code generation method. The steps implemented by each module can be specified as needed, and the entire functionality of the code generation system is achieved by implementing different steps in the test code generation method through the above modules.

[0171] The embodiment shown in Figure 2 above illustrates the modules of the code generation system and introduces several possible implementations. The following, in conjunction with Figure 8, describes the overall flow of the training and application phases of a possible code generation system.

[0172] Please see Figure 8. During the training phase, training data, such as software code repositories, software documentation, and software images, is used to train the various models. This training data includes multimodal data related to the software. The training process is as follows: the training data is input into a multimodal data fusion engine (which can be considered a second model) for feature extraction and fusion in order to train the engine. The output of the multimodal data fusion engine can be used as a fused feature vector dataset or added to one. The data in this fused feature vector dataset can then be used as training data for the intent parsing module.

[0173] Optionally, training data (such as externally input data and / or fused feature vector datasets) are also input into the context-aware module to train the context-aware module. In addition, the context-aware module can also perceive contextual information from the input data as training data for the intent parsing module.

[0174] Contextual information and a fused feature vector dataset are input into the intent parsing module to train the intent parser. Furthermore, the intent parsing module can generate a hierarchical intent structure based on the input information. This hierarchical intent structure can be used as training data to train the test code generation model. For example, the code generation system utilizes a base model to train a large generative test code model (which can be considered the first model) based on the hierarchical intent structure training data. This large model can be used for inference during the inference phase.

[0175] During the inference phase, the multimodal data of the first software is input into the multimodal data extractor (which can be considered the second model) to obtain a fused feature vector (i.e., an intent representation vector). A context-aware module perceives context information from the code project of the first software. The fused feature vector and the context information are input into the intent parser to obtain a hierarchical intent structure. A prompt word construction module constructs prompt words based on the hierarchical intent structure and the context information; these prompt words are context-enhanced. The prompt words output by the prompt word construction module are input into the large model for inference, thereby generating test code. Furthermore, the intent optimizer optimizes the output of the intent parser and / or the output of the prompt word construction module based on the test code's execution results and user feedback, thereby guiding the large model to generate new test code. This feedback optimization process can be executed multiple times, improving the quality of the output test code.

[0176] The method provided in the embodiments of this application is described below.

[0177] Please see Figure 9, which is a flowchart illustrating a code generation method provided in an embodiment of this application. Optionally, this method can be implemented by the code generation system shown in Figure 1, Figure 2, or Figure 8. Figure 9 is for illustrative purposes only; in actual implementation, the specific product that executes the method shown in Figure 9 may have other designs. The names of the devices, apparatuses, information, and parameters shown in Figure 9 are likewise illustrative; the names may be designed differently in actual implementations.

[0178] With reference to Figure 9, the code generation method includes one or more of steps S901 to S904. It should be understood that, for ease of description, the method is described in the order of S901 to S904; this is not intended to limit execution to this specific order. This application embodiment does not limit the order of execution, the execution time, or the number of executions of the above one or more steps. S901 to S904 are as follows:

[0179] S901, the code generation device acquires multimodal data related to the first software.

[0180] The code generation device is a device with computing capabilities. For example, the code generation device is a module deployed in the aforementioned computing device 10, such as a software module and / or a hardware module. Another example is the aforementioned code generation system. Yet another example is an intelligent agent.

[0181] Modality refers to different forms or characteristics of data, and multimodality refers to data having multiple types or multiple information sources. Multimodal data includes at least two types of data; that is, multimodal data includes multiple data items, and at least two types of data exist among these multiple data items. As one possible implementation, multimodal data includes the source code of a first software, and also includes the description document of the first software and / or diagrams of the first software. The following sections describe these three types of data:

[0182] The source code of the first software refers to the code segment under test, which implements a specific function. Please see Figure 10, which is a schematic diagram of a possible code segment and its pseudocode provided in an embodiment of this application. Part (a) of Figure 10 is the pseudocode for this code segment, and part (b) of Figure 10 is a schematic diagram of its source code. As can be seen, this code segment implements the function of pushing media copy information.

[0183] The description document of the first software includes information about the first software, including at least one of the following: preconditions, test case information, source code file path, product line, or development department. Preconditions, also known as prerequisites, refer to the conditions that the software test cases must meet to execute or run. A test case describes how the software responds to external input. A test case describes the task of testing the software, typically consisting of test inputs, execution conditions, and expected results designed for a specific test objective, used to verify whether the software meets functional requirements. Test case information includes one or more of the following: test case name, test case ID, test objective, test environment, input data, and test steps. In some cases, one or more of the test objective, test environment, input data, and test steps may be included in the test case name. In some cases, test case information also includes postconditions, which refer to the system's state or result after the test case is executed. The source code file path of the first software indicates the code project in which the first software resides and the location of its source code within that project. In some cases, one or more test cases can form a test case group. In this case, the information of the first software also includes information about the test case group, or simply test case information. Of course, the name of this information is just an example; in a specific implementation, the name may be designed differently.

[0184] Please see Figure 11, which is a schematic diagram of a possible software test document. The test document includes one or more of the following: product line name, department name, file path, test case name, test case ID, preconditions, author, date, remarks, and modification history.

[0185] The illustrations of the first software include one or more of the following: flowcharts, functional module architecture diagrams, etc. Flowcharts clearly and graphically represent the software's execution steps, decision-making logic, and the relationships between its parts, simplifying and visualizing complex program logic and facilitating understanding of the software's overall architecture and operational flow. Architecture diagrams can show components (or modules) and the relationships between them; analyzing architecture diagrams allows for a quick understanding of the software's organizational structure.

[0186] Please see Figure 12, which is a schematic diagram of a possible software flowchart provided in an embodiment of this application. The flowchart of the software under test illustrates the software's execution steps, decision logic, and the interrelationships between its various parts.

[0187] Optionally, the code generation device can obtain the multimodal data by receiving user input. This user input includes direct input on the code generation device, or input on the user's device that is then submitted to the code generation device. For example, with reference to Figure 1, the code generation device is deployed in the computing device 10. The user can input multimodal data on the user device 20, and the user device provides the multimodal data to the computing device 10 so that the code generation device can obtain the multimodal data.

[0188] S902, the code generation device obtains the first intent representation vector based on multimodal data.

[0189] The first intent representation vector represents the features of the multimodal data. The multimodal data includes data from multiple modalities, which contain the testing intent for the first software. The vector obtained by extracting the features of the multimodal data can represent the testing intent for the first software.

[0190] In one possible implementation, the multimodal data includes multiple data items. The code generation device extracts features from each of the multiple data items to obtain features corresponding to each data item, and then fuses the features of the multiple data items to obtain a first intent representation vector. For example, the multimodal data includes the source code of the first software, the description document of the first software, and illustrations of the first software. After feature extraction from these three data items, features corresponding to the source code, the description document, and the illustrations are obtained, respectively. In this case, the first intent representation vector is obtained by fusing the features corresponding to the source code, the description document, and the illustrations.

[0191] Optionally, the first intent representation vector is obtained by fusing features from multiple data sets. In one case, the features of the multiple data sets are processed to obtain the first intent representation vector. For example, this can be achieved through weighted fusion, or by using the features of the multiple data sets and the contribution of each feature to obtain the first intent representation vector. In another case, the first intent representation vector includes the features of the multiple data sets; that is, the features of the multiple data sets are not subjected to additional calculations or processing, and the code generation device uses the features of the multiple data sets as the first intent representation vector.
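The two cases above, fusing by computation versus using the features as they are, can be contrasted in a short sketch. The function names `weighted_fusion` and `concat_fusion`, the toy two-dimensional features, and the weights are hypothetical; in particular, treating "using the features as the vector" as concatenation is an assumption made for illustration.

```python
def weighted_fusion(features, weights):
    """Element-wise weighted sum of equal-length feature vectors."""
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")
    return [sum(w * f[i] for f, w in zip(features, weights)) for i in range(length)]


def concat_fusion(features):
    """Use the features themselves, concatenated, with no further computation."""
    return [value for f in features for value in f]


# Toy features for source code, description document, and diagram.
code_feat, doc_feat, diagram_feat = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
features = [code_feat, doc_feat, diagram_feat]
print(weighted_fusion(features, [0.5, 0.25, 0.25]))
print(concat_fusion(features))
```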

[0192] Optionally, the code generation device may use an encoder for feature extraction. Further, the encoder may include multiple encoders, each used to extract features from data of a specific modality. As one possible implementation, the code generation device uses multiple encoders to extract features from the data of each modality separately, and fuses the extracted features to obtain a first intent representation vector.

[0193] In some cases, an encoder includes at least one of the following: a code encoder, a document encoder, a graph encoder, and a user feedback encoder.

[0194] The following are exemplary descriptions of some of the encoders:

[0195] A code encoder is used to extract features from source code type data. A code encoder includes one or more of the following: Abstract Syntax Tree (AST), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN). In one possible design, the code encoder includes AST, CNN, and RNN. The AST is used to parse the source code and extract structural information, such as functions, classes, or variables. CNN is used to capture local features in the source code, and RNN is used to capture the sequential dependencies of the code. Feature vectors of the source code are generated based on the features obtained from CNN and RNN. For example, a source code feature extraction process can be represented by the following formula:

[0196] $f_{\mathrm{src}}(x) = \mathrm{CNN{+}RNN}(x)$

[0197] Here, $\mathrm{CNN{+}RNN}$ denotes feature extraction based on the CNN and the RNN, $f_{\mathrm{src}}$ is the code encoder, whose output is the features of the source code (or the feature vector of the source code), and $x$ is the input of the encoder, i.e., the source code.
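The structural-information step of the code encoder (parsing source code into an AST and extracting functions, classes, or variables) can be sketched as follows. The example assumes the source code is Python and uses the standard `ast` module; the function name `extract_structure` is hypothetical, and the CNN and RNN stages that would consume this information are not shown.

```python
import ast


def extract_structure(source: str) -> dict:
    """Walk the AST and collect names of functions, classes, and assigned variables."""
    info = {"functions": [], "classes": [], "variables": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            info["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            info["classes"].append(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            info["variables"].append(node.id)
    return info


sample_code = (
    "class Pusher:\n"
    "    def push(self, msg):\n"
    "        payload = msg.strip()\n"
    "        return payload\n"
)
structure_info = extract_structure(sample_code)
print(structure_info)
```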

[0198] A document encoder is used to extract features from document-type data. For example, a document encoder includes a pre-trained model. In some cases, a document encoder is used to extract one or more features from a document, such as keyword features, phrase features, sentence-level features, and semantic features, and generates a feature vector for the document based on the extracted features. Optionally, a document encoder may include one or more pre-trained models. For example, a document feature extraction process can be represented by the following formula:

[0199] $f_{\mathrm{doc}}(x) = \mathrm{BERT}(x)$

[0200] Here, Bidirectional Encoder Representations from Transformers (BERT) is a pre-trained language model that can extract feature vectors from the input text (i.e., the document), $f_{\mathrm{doc}}$ is the document encoder, whose output is the features of the document (or the feature vector of the document), and $x$ is the input of the encoder, i.e., the document.

[0201] Graph encoders are used to extract features from graph-type data. For example, a graph encoder converts a graph into graph-structured data and uses a graph neural network (GNN) to capture features of nodes (e.g., steps and / or components) and edges (e.g., dependencies between steps or call relationships between modules) in the graph, generating a feature vector for the graph. For example, a graph feature extraction process can be represented by the following formula:

[0202] $f_{\mathrm{flow}}(x) = \mathrm{GNN}(x)$

[0203] Here, $\mathrm{GNN}(x)$ denotes feature extraction from the input data $x$ using a GNN, $f_{\mathrm{flow}}$ is the graph encoder, whose output is the features of the graph (or the feature vector of the graph), and $x$ is the input of the encoder, i.e., the graph, such as a flowchart or architecture diagram.
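To make the idea of graph-structured data concrete, the sketch below represents a flowchart as nodes and directed edges and runs one round of mean aggregation, followed by mean pooling into a single graph vector. This is a greatly simplified stand-in for a GNN with no learned parameters; the function names, the toy node features, and the three-step flowchart are all hypothetical.

```python
def message_passing(node_features, edges):
    """One round of mean aggregation over each node and its incoming neighbors."""
    incoming = {node: [] for node in node_features}
    for src, dst in edges:
        incoming[dst].append(src)
    updated = {}
    for node, feat in node_features.items():
        group = [feat] + [node_features[n] for n in incoming[node]]
        updated[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated


def graph_vector(node_features):
    """Mean-pool node features into a single vector for the whole graph."""
    feats = list(node_features.values())
    return [sum(vals) / len(feats) for vals in zip(*feats)]


# Hypothetical three-step flowchart: start -> check -> send
nodes = {"start": [1.0, 0.0], "check": [0.0, 1.0], "send": [1.0, 1.0]}
edges = [("start", "check"), ("check", "send")]
updated_nodes = message_passing(nodes, edges)
print(updated_nodes)
print(graph_vector(updated_nodes))
```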

[0204] A feedback encoder is used to extract features from feedback information. For example, a feature extraction process for feedback information can be represented by the following formula:

[0205] $f_{\mathrm{fb}}(x) = \mathrm{NLP}(x)$

[0206] Here, $\mathrm{NLP}(x)$ denotes feature extraction from the input data $x$ using natural language processing (NLP), $f_{\mathrm{fb}}$ is the feedback encoder, whose output is the features of the feedback information (or the feature vector of the feedback information), and $x$ is the input of the encoder, i.e., the feedback information, such as the execution result of test code and / or user feedback. In some cases, $f_{\mathrm{fb}}$ is an encoder for user feedback, and a different encoder can be used for the execution results of test code.

[0207] The above describes the feature extraction process. The following section uses the example of obtaining the intent representation vector after processing the features of multiple data sets to illustrate the process of obtaining the intent representation vector.

[0208] In one scenario, a neural network is used to fuse features from multiple datasets; this neural network is, for example, a multi-head neural network based on an attention mechanism. The neural network can fuse features to obtain a first intent representation vector based on the attention weights of the features from each dataset and / or the contribution of the features from each dataset.

[0209] For example, the attention weights and / or contributions of features can be determined based on a gating mechanism. One gating mechanism is represented as follows:

[0210] $g = \sigma\left(W_g\,[f_{\mathrm{src}}(x_1);\ f_{\mathrm{doc}}(x_2);\ f_{\mathrm{flow}}(x_3);\ f_{\mathrm{fb}}(x_4)] + b_g\right)$

[0211] Here, $\sigma$ denotes the sigmoid function, $W_g$ and $b_g$ are learnable parameters, and the semicolon denotes vector concatenation.

[0212] For example, a process of fusing outputs is represented as follows:

[0213] $\mathrm{Fused\_Feature} = g \odot \mathrm{Updated\_Feature} + (1 - g) \odot [f_{\mathrm{src}}(x_1);\ f_{\mathrm{doc}}(x_2);\ f_{\mathrm{flow}}(x_3);\ f_{\mathrm{fb}}(x_4)]$

[0214] Here, $\mathrm{Updated\_Feature}$ is the feature updated using attention weights or contributions, and $\odot$ denotes element-wise multiplication.
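The gating and fusion formulas above can be traced with a small numeric sketch in plain Python. It assumes, for illustration, that the gate matrix is square so that the gate, the updated feature, and the concatenated encoder outputs all have the same length; the all-zero parameters and the toy one-dimensional encoder outputs are hypothetical values chosen so the arithmetic is easy to follow.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gate(concat, W_g, b_g):
    """g = sigmoid(W_g @ concat + b_g), computed row by row."""
    return [sigmoid(sum(w * x for w, x in zip(row, concat)) + b) for row, b in zip(W_g, b_g)]


def fuse(g, updated, concat):
    """Fused = g * updated + (1 - g) * concat, element-wise."""
    return [gi * u + (1.0 - gi) * c for gi, u, c in zip(g, updated, concat)]


# Toy one-dimensional encoder outputs for source code, document, diagram, feedback.
f_src, f_doc, f_flow, f_fb = [1.0], [2.0], [3.0], [4.0]
concat = f_src + f_doc + f_flow + f_fb   # vector concatenation
W_g = [[0.0] * 4 for _ in range(4)]      # hypothetical parameters (all zeros)
b_g = [0.0] * 4
updated = [0.0, 0.0, 0.0, 0.0]           # stand-in for Updated_Feature

g = gate(concat, W_g, b_g)               # every gate value is sigmoid(0) = 0.5
fused = fuse(g, updated, concat)
print(fused)
```

With a gate of 0.5 everywhere, each fused element is the average of the updated feature and the original concatenated feature; gate values closer to 1 would favor the attention-updated feature instead.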

[0215] In some possible implementations, the attention weights of the features of each data item can be predefined or user-defined. Similarly, the contribution of the features of each data item can also be predefined or user-defined. For example, a process for calculating attention weights using a multi-head attention network is as follows:

[0216] $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\dfrac{QK^{T}}{\sqrt{d_k}}\right)V$

[0217] Here, $\mathrm{Attention}(Q, K, V)$ is the computation by which the multi-head attention network calculates attention weights, softmax is an activation function, $Q$, $K$, and $V$ are the query, key, and value matrices, respectively, $d_k$ is the dimension of the key, and the superscript $T$ denotes the matrix transpose used in the computation.
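Assuming the attention computation referred to here is the standard scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, a single head can be sketched in plain Python as follows. The helper names and the toy matrices are hypothetical, and a multi-head network would run several such heads with separate learned projections, which are omitted here.

```python
import math


def softmax(row):
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for matrices given as lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out


# Identical keys give uniform attention weights, so each output row is the mean of V's rows.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 1.0], [1.0, 1.0]]
V = [[2.0, 0.0], [0.0, 2.0]]
result = attention(Q, K, V)
print(result)
```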

[0218] As one possible implementation, the attention weights and / or contributions of the features of each data item are adjustable. Taking contribution as an example, the code generation device dynamically adjusts the contribution of the feature vector of each data item in the fusion process based on a gating mechanism, and the neural network obtains a first intent representation vector based on the features of each data item and their respective contributions. Taking weight as an example, the attention weights of the features of each data item are adjustable, and the features of each data item are weighted and merged to obtain the first intent representation vector. For example, a process of updating the feature vector using the attention weights can be expressed by the following formula:

[0219] Updated_Feature = Attention(f src (x1), f doc (x2), f flow (x3), f fb (x4))

[0220] Here, Updated_Feature represents the updated feature vector, and x1, x2, x3, and x4 are used to represent the inputs of each encoder.

[0221] In some possible implementations, the aforementioned feature extraction and feature fusion operations can be integrated into a model, which is then referred to as a second model for easy distinction. In other embodiments, the second model can be named differently, such as a multimodal data fusion model or a multimodal data fusion processor. Optionally, the second model can be integrated into the code generation device, or it can be deployed in other devices that the code generation device can invoke.

[0222] As one possible implementation example, the code generation device utilizes a second model to obtain a first intent representation vector based on multimodal data. Further, the second model includes an encoder and a first neural network, whereby the encoder extracts features from each data item in the multimodal data to obtain features of the multimodal data, and the first neural network obtains the first intent representation vector based on the features of the multimodal data.

[0223] S903, the code generation device constructs a first prompt word based at least on a first intent representation vector.

[0224] The prompt words are used to guide the model in reasoning. Optionally, the first prompt word includes multiple items, each describing the test intent from at least one dimension. The first prompt word includes one or more of the following: test characteristics, test steps, preconditions, expected results, test environment, and test behavior.

[0225] For example, as shown in Figure 13, the first prompt word includes the task instruction, product line name, department name, test case name, test case ID, test characteristic, test steps, preconditions, and expected result. The task instruction indicates the task the model should perform. The test characteristic is a feature identified based on the intent representation vector, indicating the characteristics of the test requirement. For example, "routine maintenance" indicates that the test requirement is for routine maintenance testing. The test steps represent the steps the test code should execute. Preconditions indicate the state the system should be in during testing. The expected result indicates the outcome the test should achieve.
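As a minimal sketch of how such a prompt word might be assembled from its items, the following Python snippet renders a dictionary of fields as text. The field keys, their order, the text layout, and the function name build_prompt are assumptions made for illustration; this application does not prescribe a concrete prompt format.

```python
from typing import Dict, List, Union

# Hypothetical field order, following the items listed for the first prompt word.
FIELD_ORDER = [
    "task_instruction",
    "product_line_name",
    "department_name",
    "test_case_name",
    "test_case_id",
    "test_characteristic",
    "preconditions",
    "test_steps",
    "expected_result",
]


def build_prompt(fields: Dict[str, Union[str, List[str]]]) -> str:
    """Render the available fields as a plain-text prompt, one item per line."""
    lines: List[str] = []
    for key in FIELD_ORDER:
        if key not in fields:
            continue  # every item is optional
        label = key.replace("_", " ").capitalize()
        value = fields[key]
        if isinstance(value, list):
            lines.append(f"{label}:")
            lines.extend(f"  {i}. {item}" for i, item in enumerate(value, 1))
        else:
            lines.append(f"{label}: {value}")
    return "\n".join(lines)


prompt = build_prompt({
    "task_instruction": "Generate test code for the case below",
    "test_characteristic": "routine maintenance",
    "test_steps": ["Log in", "Run the health check"],
})
```

Keeping the items as structured fields until the last step makes it straightforward to replace a single item, for example the test steps, when the prompt word is later optimized.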

[0226] In some cases, contextual information of the first software may also be used when constructing the first prompt word to improve the accuracy of understanding the test requirements. Specifically, the contextual information of the first software indicates the association between the first software and the source code in the target code project, where the target code project is the code project containing the first software. Optionally, the contextual information includes one or more of the following: the first software's dependencies on the source code in the target code project, the first software's initialization variable information, and the first software's contextual step information, etc.

[0227] In one possible implementation, the code generation device perceives the context information of the first software based on the source code in the target project. The code project includes multiple source code files. When testing a source code file (such as the source code of the first software), the code generation device can look up the context information of that source code file from other source code files, thereby improving the accuracy of understanding the test requirements.
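One simple form of such context lookup, shown here only as a sketch, is to find which other source files of the project a given file depends on. The example below assumes the project is written in Python and is given as a mapping from module name to source text. It uses the standard-library ast module to collect import statements. The function name project_dependencies and the sample project are hypothetical.

```python
import ast
from typing import Dict, List


def project_dependencies(target: str, project: Dict[str, str]) -> List[str]:
    """Return the project modules imported by `target`, sorted by name."""
    tree = ast.parse(project[target])
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module.split(".")[0])
    # Keep only modules that belong to the project itself.
    return sorted(name for name in imported if name in project and name != target)


# Hypothetical three-file project; "billing" plays the role of the first software.
project = {
    "billing": "import os\nimport ledger\nfrom utils import fmt\n",
    "ledger": "BALANCE = 0\n",
    "utils": "def fmt(value):\n    return str(value)\n",
}
deps = project_dependencies("billing", project)
```

Standard-library imports such as os are filtered out because they are not part of the project, so only in-project dependencies are reported as context.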

[0228] In some cases, when constructing the first prompt word, the code generation device first generates an intent structure based on the first intent representation vector, and then constructs the prompt word using this intent structure. For ease of distinction, the following explanation uses the generation of the first intent structure as an example. The first intent structure includes multiple levels of content, each level describing the test intent from at least one dimension. For example, the first intent structure includes one or more of the following: test objective, test point, test step, test characteristic, test dependency information, expected result, preconditions, test topic, etc. This hierarchical intent structure describes the test intent from multiple dimensions and represents the test requirements comprehensively and in detail. Using a hierarchical intent structure to construct prompt words improves the accuracy of the test intent expressed by the prompt words, thereby enhancing the relevance and accuracy of the generated test code.
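A hierarchical intent structure of this kind can be pictured, for illustration only, as nested records that are flattened into prompt items. The two-level layout below (an objective containing test points, each with steps and an expected result) and the class names IntentPoint and IntentStructure are assumptions; the levels actually used may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IntentPoint:
    """Lower level: one test point with its steps and expected result."""
    name: str
    steps: List[str] = field(default_factory=list)
    expected_result: str = ""


@dataclass
class IntentStructure:
    """Upper level: test objective, preconditions, and the test points."""
    test_objective: str
    preconditions: List[str] = field(default_factory=list)
    test_points: List[IntentPoint] = field(default_factory=list)

    def to_prompt_items(self) -> List[str]:
        """Flatten the levels into ordered prompt items."""
        items = [f"Test objective: {self.test_objective}"]
        items += [f"Precondition: {p}" for p in self.preconditions]
        for point in self.test_points:
            items.append(f"Test point: {point.name}")
            items += [f"  Step {i}: {s}" for i, s in enumerate(point.steps, 1)]
            if point.expected_result:
                items.append(f"  Expected result: {point.expected_result}")
        return items


structure = IntentStructure(
    test_objective="Verify login",
    preconditions=["Server is running"],
    test_points=[
        IntentPoint("valid credentials", ["Open login page", "Submit credentials"], "Dashboard is shown"),
    ],
)
items = structure.to_prompt_items()
```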

[0229] Optionally, contextual information can also be used when generating the intent structure, and / or when constructing prompts based on the intent structure.

[0230] As one possible implementation, the code generation device obtains a first intent structure based on a first intent representation vector and context information of the first software, and uses the first intent structure to construct a first prompt word.

[0231] As another possible implementation, the code generation device obtains a first intent structure based on a first intent representation vector and context information of the first software, and constructs a first prompt word using the first intent structure and context information of the first software.

[0232] The construction of the intent structure was mentioned above. Below, an example of an intent structure generation process is described. The code generation device obtains a preliminary vector (also referred to as the initial vector) based on the first intent representation vector, obtains an intermediate vector (also referred to as the optimized vector) based on the preliminary vector and context information of the first software, and constructs the first intent structure based on the preliminary vector and the optimized vector.

[0233] For example, the code generation device obtains a preliminary vector based on the first intent representation vector, which can be expressed as follows:

[0234] h0 = TE([f src (x1); f doc (x2); f flow (x3); f fb (x4)])

[0235] Where TE is a Transformer-based encoder, [f src (x1); f doc (x2); f flow (x3); f fb (x4)] is the first intent representation vector, and h0 is the initial vector used to represent the intent of the test requirement.

[0236] When optimizing the intent, the code generation device uses the initial vector and context information from the first software to obtain the optimized vector. For example, the code generation device performs bidirectional encoding on the initial vector to obtain an encoded output, and optimizes the encoded output based on the context information to obtain the optimized vector.

[0237] The bidirectional encoding process is as follows:

[0238]

[0239]

[0240] Where LSTM() represents LSTM-based encoding, the two formulas give the output of the forward LSTM and the output of the backward LSTM, respectively, and t represents the parameter used in the encoding process.

[0241] In some cases, the aforementioned preliminary vectors and intermediate vectors are intermediate results in the calculation process, and they are represented here for ease of understanding.

[0242] S904, the code generation device uses the first model to generate the first test code based on the first prompt word.

[0243] The first test code is the test code for the first software. In some cases, when the first test code is run, it can test the functionality of the first software.

[0244] The model can learn from training data, analyze the relationships and patterns in the data, and perform tasks such as prediction or classification with high efficiency. Using the model to generate test code ensures both efficiency and quality. Furthermore, the first model can be one model or multiple models, some or all of which are pre-trained. For example, the first model includes an LLM, which is a deep learning model trained on a large amount of data, capable of generating or understanding the meaning of language text. As a further example, the first model includes one or more of the following: a convolutional neural network (CNN), a recurrent neural network (RNN), or a Transformer-based model; other model structures may also be used in the future.

[0245] It should be understood that the first model can be named in various ways, for example, the test code model or the large test code model (e.g., as shown in Figure 8).

[0246] Optionally, the first model is deployed within the code generation device. Alternatively, the first model is deployed outside the code generation device, but the code generation device can invoke the first model. Alternatively, the first model includes multiple functional modules, some of which are deployed within the code generation device, and some are deployed outside the code generation device. In short, the code generation device can use the first model.

[0247] In some cases, the first model can be called multiple times, generating test code based on the prompt words multiple times. When calling the model multiple times, the prompt words input to the model can be different, thus continuously optimizing the test code output by the first model.

[0248] Taking one output optimization as an example, the code generation device uses the first model to generate the first test code based on the first prompt word, then obtains feedback information, optimizes the first prompt word based on the feedback information, and uses the first model again to generate a new test code (referred to as the second test code for easy distinction) based on the optimized prompt word.
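The generate, feedback, optimize, and regenerate cycle described above can be sketched as a small loop. In the sketch below the first model, the feedback source, and the prompt optimizer are passed in as plain callables and replaced by trivial stand-ins. The function name refine_test_code, the stop condition (empty feedback), and the round limit are assumptions made for illustration.

```python
from typing import Callable, List


def refine_test_code(
    prompt: str,
    generate: Callable[[str], str],
    collect_feedback: Callable[[str], str],
    optimize_prompt: Callable[[str, str], str],
    max_rounds: int = 2,
) -> List[str]:
    """Return the test code produced in each round, first round first."""
    history: List[str] = []
    for _ in range(max_rounds):
        code = generate(prompt)            # call the first model
        history.append(code)
        feedback = collect_feedback(code)  # user feedback and / or execution result
        if not feedback:
            break                          # nothing left to improve
        prompt = optimize_prompt(prompt, feedback)
    return history


# Trivial stand-ins so the loop can be run without a real model.
def fake_model(prompt: str) -> str:
    return f"# test generated from: {prompt}"


def fake_feedback(code: str) -> str:
    return "" if "empty-input" in code else "cover the empty-input case"


def append_feedback(prompt: str, feedback: str) -> str:
    return f"{prompt} | feedback: {feedback}"


history = refine_test_code("Test login", fake_model, fake_feedback, append_feedback)
```

Here history[0] plays the role of the first test code and history[1] the role of the second test code, which is generated from the optimized prompt word.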

[0249] Optionally, the first prompt word can be optimized directly during the prompt word construction stage, or it can be optimized by optimizing the information input to the prompt word construction module. These two methods can also be combined. Below are some possible ways to optimize prompt words:

[0250] In Method 1, the code generation device optimizes the first intent structure based on feedback information to obtain a second intent structure, and then constructs a second prompt word based on the second intent structure and the context information of the first software. That is, the method of optimizing the prompt word is to optimize the intent structure, thereby constructing the prompt word based on the new intent structure, thus completing the prompt word optimization. Further, the code generation device uses the first model to generate second test code based on the second prompt word, and the second test code is used to test the first software. The feedback information includes user feedback information and / or the execution result of the first test code.

[0251] As one possible implementation, the code generation device extracts features from the feedback information to obtain feature vectors of the feedback information, uses the feature vectors of the feedback information to identify the problematic content and correct content in the first intent structure, and uses an optimizer to strengthen the representation of the problematic content in the first intent structure and maintain or weaken the representation of the correct content in the first intent structure to obtain a second intent structure.

[0252] The following example, using feedback information including user feedback and the execution result of the first test code, illustrates an optimization process:

[0253] Step 11: The code generation device extracts features from the feedback information to obtain the feature vector of the feedback information.

[0254] The code generation device extracts the feature vector, or execution result vector, of the execution result of the first test code. For example, an execution result encoding process can be represented by the following formula:

[0255] f exec (x) = Metrics_Extractor(x)

[0256] Where Metrics_Extractor represents the feature extractor, f exec is the execution result encoder, whose output is the feature of the execution result (or the feature vector of the execution result), and x is the input of the encoder, that is, the execution result of the test code.

[0257] The code generation device extracts the feature vector of user feedback information, i.e., user feedback features. For example, a user feedback encoding process can be represented by the following formula:

[0258] f fb (x) = NLP(x)

[0259] Where NLP(x) represents the feature extraction performed on the input data x by natural language processing (NLP), f fb is the feedback encoder, whose output is the feature of the feedback information (or the feature vector of the feedback information), and x is the input of the encoder, i.e., the feedback information, such as the execution result of test code and / or user feedback.

[0260] The code generation device merges the feature vector of the execution result of the first test code and the feature vector of the user feedback information to obtain the feature vector of the feedback information, i.e., the feedback representation vector. For example, one fusion process can be expressed as the following formula:

[0261] Feedback = [f exec (x1); f fb (x2)]

[0262] Here, Feedback is the feedback representation vector, and x1 and x2 represent the inputs of the respective encoders. Understandably, if the feedback information includes only one piece of information, such as only the execution result of the first test code or only user feedback, then this fusion process is unnecessary. Similarly, when the feedback information includes more content, features can be extracted from the other content and fused together in the above fusion process.

[0263] In some cases, this stage may be called the feedback encoding stage. During the feedback encoding stage, the code generation device encodes and extracts features from the execution results of the first test code, obtaining an execution result vector. Optionally, this execution result vector may include (or indicate) key metrics such as test coverage and error rate. The code generation device processes user feedback information (e.g., sentiment analysis and natural language processing) to extract user feedback features, which may include one or more of the following: test result satisfaction, test suggestions, or problem points.
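The feedback encoding stage can be illustrated with the following sketch, which extracts two metrics (test coverage and error rate) from an execution result, derives a simple keyword-count feature from the user feedback, and concatenates both into a feedback representation vector. The input dictionary keys, the keyword list, and the function names are hypothetical. The keyword counting is only a stand-in for real natural language processing such as sentiment analysis.

```python
from typing import Dict, List


def encode_execution(result: Dict[str, int]) -> List[float]:
    """Stand-in for f exec: return [test coverage, error rate]."""
    total = result["passed"] + result["failed"]
    error_rate = result["failed"] / total if total else 0.0
    lines = result["total_lines"]
    coverage = result["covered_lines"] / lines if lines else 0.0
    return [coverage, error_rate]


# Hypothetical vocabulary; a real encoder would use an NLP model instead.
KEYWORDS = ("missing", "wrong", "good")


def encode_user_feedback(text: str) -> List[float]:
    """Stand-in for f fb: count how often each keyword occurs."""
    words = text.lower().split()
    return [float(words.count(keyword)) for keyword in KEYWORDS]


def encode_feedback(result: Dict[str, int], text: str) -> List[float]:
    """Feedback = [f exec (x1); f fb (x2)], i.e. vector concatenation."""
    return encode_execution(result) + encode_user_feedback(text)


feedback_vector = encode_feedback(
    {"passed": 3, "failed": 1, "covered_lines": 80, "total_lines": 100},
    "Missing boundary case and wrong timeout",
)
```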

[0264] Step 12: The code generation device uses the feature vector of the feedback information to identify the problematic and correct content in the first intent structure. As one possible implementation, the code generation device determines the similarity between the features of the feedback information and the content in the first intent structure based on the feature vector of the feedback information and the first intent structure. Further, the similarity is used to determine which item in the intent structure the feedback information targets.

[0265] In some cases, this stage may be called the feedback alignment stage. In one possible implementation, the first intent structure includes multiple layers. The code generation device aligns the feature vector of the feedback information with one or more layers of content in the first intent structure and outputs the aligned content. The aligned content, that is, the content targeted by the feedback information, includes problematic content and / or correct content. In the feedback alignment stage, the code generation device aligns the feedback representation vector with the first intent structure, identifying the content associated with the feedback representation vector in the first intent structure. For example, the feedback representation vector may be associated with one or more of the test objective, test point, and test step. The code generation device determines the problematic content in the intent structure through the feedback information and the first intent structure, and optionally, it may also determine the correct content in the intent structure. That is, it identifies the problems and deficiencies in the intent structure.

[0266] For example, one alignment mechanism is as follows:

[0267] Alignment(Feedback,HI)=similarity(Feedback,h i )

[0268] Where similarity is the similarity calculation function, HI is the intent structure, and h i refers to a specific level of intent within the first intent structure.

[0269] As another example, the aligned output is as follows:

[0270] Aligned_Feedback={Feedback i |similarity(Feedback i , h i )>τ}

[0271] Here, τ is the similarity threshold, and Aligned_Feedback is the output content in the intent structure whose similarity with the feedback information is greater than the threshold. This content includes the problematic content and / or the correct content.
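A threshold-based alignment of this kind can be sketched as follows. The sketch assumes that each level of the intent structure is represented by a vector of the same dimension as the feedback representation vector and that cosine similarity is used as the similarity function. Both are assumptions, since this application does not fix a particular similarity measure. The level names and vectors are toy values.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(feedback: List[float], levels: Dict[str, List[float]], tau: float) -> Dict[str, float]:
    """Keep the intent levels whose similarity to the feedback exceeds tau."""
    aligned: Dict[str, float] = {}
    for name, h_i in levels.items():
        score = cosine(feedback, h_i)
        if score > tau:
            aligned[name] = score
    return aligned


# Toy vectors: the feedback points in the same direction as the test step level.
levels = {
    "test_objective": [0.0, 1.0],
    "test_point": [1.0, 1.0],
    "test_step": [1.0, 0.0],
}
aligned = align([1.0, 0.0], levels, tau=0.8)
```

With a threshold of 0.8 only the test step level is selected. Lowering the threshold to 0.7 would also select the test point level, whose similarity is about 0.707.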

[0272] Step 13: The code generation device uses an optimizer to enhance the representation of the problematic content in the first intent structure, and / or maintain or weaken the representation of the correct content in the first intent structure, to obtain a second intent structure. In some cases, the above process may be referred to as an optimization phase. In the optimization phase, the code generation device uses the optimizer to adjust the intent structure based on the feedback information. For example, for errors or deficiencies in the intent structure indicated in the feedback information, the code generation device enhances the representation of these problematic parts, such as enhancing the representation of the corresponding test objectives, test points, or test steps. Further, the code generation device maintains or appropriately weakens the representation of the correct parts.

[0273] For example, the code generation device adaptively optimizes the first intent structure using a loss function to obtain the second intent structure. One optimization process can be represented as follows:

[0274] L(θ)=-Σ i logP(Aligned_Feedback i |HI i ;θ)

[0275] Where θ represents the model parameters.
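The loss above is a negative log-likelihood summed over the aligned feedback items. As a minimal numeric sketch, assuming the model already supplies the probability P(Aligned_Feedback i | HI i ; θ) for each item, the loss can be computed as follows. The function name and the probability values are hypothetical.

```python
import math
from typing import Sequence


def negative_log_likelihood(probabilities: Sequence[float]) -> float:
    """L = -sum_i log P_i, where each P_i must lie in (0, 1]."""
    for p in probabilities:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return -sum(math.log(p) for p in probabilities)


# Hypothetical probabilities before and after one optimization step.
loss_before = negative_log_likelihood([0.5, 0.5])
loss_after = negative_log_likelihood([0.9, 0.8])
```

The loss decreases as the intent structure explains the aligned feedback better, which is the direction in which an optimizer would adjust θ.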

[0276] In Method 2, the code generation device optimizes the first intent structure based on feedback information to obtain a second intent structure, and constructs a second prompt word based on the second intent structure, the context information of the first software, and the feedback information. In this method, the feedback information is used both to optimize the intent structure and to construct the second prompt word.

[0277] For the optimization process of the intent structure, please refer to Method 1 mentioned above.

[0278] In Method 3, the code generation device fuses the feedback information with the multimodal data to obtain a new intent representation vector (referred to as the second intent representation vector for easy distinction), and constructs prompt words based on the new intent representation vector. For example, the code generation device performs intent parsing based on the second intent representation vector and the context information of the first software to obtain a new intent structure (referred to as the second intent structure), and then constructs new prompt words, i.e., the second prompt words, based on the new intent structure.

[0279] In other words, the code generation device updates the intent representation vector using feedback information, thereby updating the prompt words. For example, the code generation device generates a first intent representation vector using multimodal data. Upon receiving user feedback information, the code generation device can use the multimodal data and user feedback information to generate a second intent representation vector. This second intent representation vector participates in the subsequent prompt word construction process, thereby generating new prompt words to guide the model in generating new test code.

[0280] It should be understood that the above methods can be combined. For example, user feedback information can participate in both intent structure updates and prompt word updates. The combinations will not be explained in detail here.

[0281] In the embodiment illustrated in Figure 9, the code generation device integrates multimodal data, analyzes information from the source code of the first software and from other related data formats, and constructs prompt words to guide the model in generating test code. Other data formats related to the source code often contain information for understanding the functionality of the first software. Using this data, feature vectors can be extracted to analyze the user's testing intent. The intent representation vector obtained by fusing these feature representations integrates information from multiple modalities, accurately reflects testing requirements, and enables the test code to specifically test the functionality of the first software, making the testing process more aligned with the user's testing needs.

[0282] Moreover, the embodiment illustrated in Figure 9 employs a "feature extraction + prompt words + model" architecture. This architecture has a clear purpose, separate functions, and interconnected modules that work collaboratively, resulting in high efficiency and good performance in generating test code. Moreover, this architecture can be automated, ensuring high system reliability and complete functionality.

[0283] In some possible implementations, the generated test code can be converted into code in a specific language. For example, the first test code generated above is in Python. The code generation system can receive user input instructions and convert the test code into test code in other programming languages, such as C or Java. Furthermore, the user can also specify the style and format of the converted test code to better suit their needs.

[0284] Furthermore, the above code conversion process can be implemented using a model.

[0285] In some possible implementations, the aforementioned code generation system, such as the code generation system shown in Figure 2 or Figure 3, also includes a semantic alignment module, an encoding module, and a code generation module. The data input module receives conversion information, including the code to be converted and conversion requirement information. For example, the conversion requirement information may include one or more of the following: indications of the target programming language, the target style, and the target format. The semantic alignment module, also known as a cross-language semantic alignment module, maps lexical elements in the code to be converted to corresponding lexical elements in another programming language. The input to the cross-language semantic alignment module can be the code to be converted and / or the code to be converted after processing by the preprocessing module (i.e., the output of the preprocessing module). The encoding module, also known as a multi-level fusion encoding module, uses an encoder to obtain one or more of the following: syntactic structure features, semantic features, and style features of the code to be converted. These features serve as the feature representation of the code to be converted. The code generation module uses a pre-trained model to generate preliminary code based on the feature representation. Optionally, the code generation system also includes a prompt word dynamic optimization module, which provides feedback to optimize the output of the code generation model. As a possible example, the prompt word dynamic optimization module uses reinforcement learning to optimize the prompt word model: it dynamically adjusts the prompt words and uses them to guide the code generation model to generate high-quality code. Finally, the code generation module outputs the target code.

[0286] The methods of the embodiments of this application have been described in detail above. The apparatus of the embodiments of this application is provided below. It should be understood that the module division of the apparatus (which may include a system) provided in this application is merely an exemplary illustration of one way to divide the structure of the apparatus. In practical applications, the structure of the apparatus may have other division methods. This application is equally applicable to apparatuses with other division methods but the same function. Furthermore, the apparatus name provided in this application is an exemplary name and can be replaced in specific implementations.

[0287] Please refer to Figure 14, which is a schematic diagram of a code generation apparatus provided in an embodiment of this application. The code generation apparatus can be a standalone device (e.g., computing device 10) or a module within a standalone device, including hardware and / or software modules. The code generation apparatus 140 is used to implement the aforementioned test code generation method, for example, the method executed by the code generation device in the test code generation method shown in Figure 9.

[0288] In one possible design, the code generation device 140 includes an acquisition module 1401, a multimodal data fusion module 1402, a parsing and construction module 1403, and an inference module 1404. Optionally, the acquisition module 1401 is used to acquire input data and can also be called a data input module. Furthermore, the code generation device may also include other modules, such as a preprocessing module, a context-aware module, an intent optimization module, a data output module, etc. It should be understood that the above module division is exemplary; some modules can be divided into more modules, for example, the parsing and construction module 1403 can also be divided into an intent parsing module and a prompt word construction module. Alternatively, some modules can be merged into one module.

[0289] Optionally, for the functions of the modules in the code generation device 140, refer to the descriptions of the embodiment illustrated in Figure 2 and of the other embodiments. Optionally, the code generation device 140 is an intelligent agent.

[0290] This application also provides a computing device. This computing device is a device with computing capabilities, including servers or terminal devices (including user equipment). Figure 15 shows an exemplary schematic diagram of the structure of the computing device provided in this application. As shown in Figure 15, the computing device 300 includes a connection line 301, a processor 302, a memory 303, and a communication interface 304, wherein the processor 302, the memory 303, and the communication interface 304 communicate with each other via the connection line 301. It should be understood that this application does not limit the number of processors 302 and memories 303 in the computing device 300.

[0291] Connection line 301 is used to connect various modules in the device to enable data transmission between different modules. For example, connection line 301 includes a bus, such as a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. In some cases, connection lines can be divided into address connection lines, data connection lines, control connection lines, etc. For ease of representation, only one line is used in Figure 15, but this does not mean that the computing device 300 has only one connection line or only one type of connection line. The connection line 301 can include a path for transmitting information between various components of the computing device 300 (e.g., processor 302, memory 303, and communication interface 304).

[0292] Processor 302 may include any one or more processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).

[0293] Memory 303 may include volatile memory, such as random access memory (RAM). Memory 303 may also include non-volatile memory, such as flash memory, read-only memory (ROM), hard disk drive (HDD), or solid state drive (SSD).

[0294] The memory 303 stores executable code. The processor 302 executes the code stored in the memory 303 to implement the test code generation method, or the processor 302 is used to implement the functions of one or more of the aforementioned code generation apparatus 140, code generation system, etc. That is to say, the memory 303 stores instructions for executing the aforementioned test code generation method.

[0295] The communication interface 304 uses transceiver modules such as, but not limited to, network interface cards and transceivers to enable communication between the computing device 300 and other devices or communication networks. For example, the computing device 300 communicates with a user equipment through the communication interface 304.

[0296] This application also provides a computing device cluster. The computing device cluster includes at least one computing device, which may be a server, such as a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device in the computing device cluster may also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.

[0297] Figure 16 shows an exemplary schematic diagram of the computing device cluster provided in this application. As shown in Figure 16, the computing device cluster 400 includes at least one computing device 300. The memory 303 of one or more computing devices 300 in the computing device cluster 400 may store instructions for implementing the code generation system, the test code generation method, or the code generation apparatus 140 described above.

[0298] In some possible implementations, the memories 303 of one or more computing devices 300 in the computing device cluster 400 may each store part of the instructions for implementing the code generation system, the test code generation method, or the code generation apparatus 140. In other words, a combination of one or more computing devices 300 can jointly execute the instructions for the code generation system, the test code generation method, or the code generation apparatus 140. It should be noted that the memories 303 of different computing devices 300 in the computing device cluster 400 may store different instructions, each used to implement a portion of the functions of the code generation system, the test code generation method, or the code generation apparatus 140.

[0299] In some possible implementations, one or more computing devices 300 in the computing device cluster 400 can be connected via a network, which can be a wide area network or a local area network, etc. Figure 17 shows one possible implementation. As shown in Figure 17, computing device 300A and computing device 300B are connected via a network.

[0300] It should be understood that the functions of computing device 300A shown in Figure 17 can also be performed by multiple computing devices 300, and the functions of computing device 300B can also be performed by multiple computing devices 300. The specific deployment method can depend on business needs and the computing power of the computing devices.

[0301] Optionally, the computing device 10 shown in Figure 1 may include the computing device 300 shown in Figure 15 and / or the computing device cluster 400 shown in Figure 16 (or Figure 17).

[0302] This application also provides a computer program product containing instructions. The computer program product may be a software or program product containing instructions capable of running on a computing device or stored on any available medium. When the computer program product is run on a computing device, it causes the computing device to execute the test code generation method described above.

[0303] This application also provides a computer-readable storage medium. This computer-readable storage medium can be any usable medium capable of being stored by a computing device, or a data storage device such as a data center containing one or more usable media. The aforementioned usable medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive), etc.

[0304] The computer-readable storage medium includes instructions that instruct a computing device to execute the test code generation method described above.

[0305] In addition, a few additional points need to be made regarding this application:

[0306] 1. The above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit it. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the protection scope of the technical solutions of the embodiments of this application.

[0307] 2. Unless otherwise stated, “multiple” means two or more.

[0308] 3. Unless otherwise specified or in case of logical conflict, the terms and / or descriptions in different embodiments of this application are consistent and can be referenced by each other. Technical features in different embodiments can be combined to form new embodiments based on their inherent logical relationships.

[0309] 4. The various numerical designations used in this application are merely for descriptive convenience and are not intended to limit the scope of protection of this application. The magnitude of the serial numbers used in this application does not imply a sequential order of execution; the execution order of each process should be determined by its function and internal logic. For example, the terms "first," "second," "third," "fourth," and similar terms (if present) in the specification, claims, and drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. Objects distinguished in this way can be interchanged where appropriate so that the embodiments described herein can be implemented in a sequence other than that illustrated or described herein.

[0310] Furthermore, any embodiment or design described in this application as "exemplary" or "for example" should not be construed as being more preferred or advantageous than other embodiments or designs. Specifically, the use of terms such as "exemplary" or "for example" is intended to present the relevant concepts in a concrete manner for ease of understanding.

[0311] 5. The terms “comprising” and “having” and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that includes a series of steps or modules is not necessarily limited to those steps or modules that are expressly listed, but may include other steps or modules that are not expressly listed or that are inherent to such process, method, product or device.

[0312] 6. In this application, "for indicating" can be understood as "enabling". "Enabling" can include direct enabling and indirect enabling. Information described as enabling A may enable A directly or indirectly, and does not necessarily carry A.

[0313] The information that a piece of information enables is called the information to be enabled. In a specific implementation, there are many ways to enable the information to be enabled. For example, but not limited to, the information to be enabled can be enabled directly, by means of the information to be enabled itself or its index. It can also be enabled indirectly by enabling other information, where there is a relationship between the other information and the information to be enabled. It is also possible to enable only a part of the information to be enabled, while the other parts are known or pre-agreed upon. For example, specific information can be enabled by using a pre-agreed (e.g., protocol-defined) arrangement of various pieces of information, thereby reducing enabling overhead to some extent. In addition, the common parts of various pieces of information can be identified and enabled uniformly, to reduce the overhead caused by enabling the same information separately.

[0314] 7. In this application, "predefined" may include preconfiguration. For example, predefining certain information means that the information is calculated or received in advance, before an action that uses the information is performed. Predefinition can be implemented by pre-storing, in a device (e.g., a controller or vehicle), corresponding codes, tables, or other means that can be used to indicate the relevant information. This application does not limit the specific implementation method.

[0315] 8. The term "storage" or "preservation" in this application can refer to storage in one or more memory devices. These memory devices can be separately configured or integrated into an encoder, decoder, processor, or communication device. Alternatively, some memory devices can be separately configured, while others can be integrated into a decoder, processor, or communication device. The type of memory can be any form of storage medium, and this is not limited.

[0316] 9. The arrows or boxes indicated by dashed lines in the schematic diagrams in the accompanying drawings of this application represent optional steps or optional modules.

[0317] 10. Unless otherwise stated, " / " indicates that the objects before and after are in an "or" relationship. For example, A / B can mean A or B. In this application, "and / or" is merely a description of the relationship between the related objects, indicating that there can be three relationships. For example, A and / or B can mean: A exists alone, A and B exist simultaneously, and B exists alone. A and B can be singular or plural.

[0318] 11. Unless otherwise stated, the names of devices, systems, modules and other information in the embodiments of this application are merely examples. Devices, systems and modules are used to represent possible entities that implement a certain function, and the three terms can be used interchangeably.

Claims

1. A test code generation method, characterized in that the method is applied to a code generation apparatus, wherein the code generation apparatus is equipped with a first model or the code generation apparatus has the ability to invoke the first model, and the test code generation method includes: acquiring multimodal data related to a first software, wherein the multimodal data includes multiple data items of at least two types, the multimodal data includes the source code of the first software, and the multimodal data further includes a description document of the first software and / or an illustration of the first software, wherein the source code of the first software is of type code, the description document of the first software is of type document, and the illustration of the first software is of type image; obtaining a first intent representation vector based on the multimodal data, wherein the first intent representation vector is obtained by feature fusion of the multiple data items; constructing a first prompt word based at least on the first intent representation vector; and using the first model to generate first test code based on the first prompt word, wherein the first test code is used to test the first software.

2. The method according to claim 1, characterized in that the description document of the first software includes information about the first software, which includes at least one of the following: the preset conditions of the first software, the test case information of the first software, the source code file path of the first software, the product line of the first software, or the development department of the first software.

3. The method according to claim 1 or 2, characterized in that the illustration of the first software includes a flowchart of the first software and / or a functional module architecture diagram of the first software.

4. The method according to any one of claims 1-3, characterized in that the first model is a large language model (LLM).

5. The method according to any one of claims 1-4, characterized in that the code generation apparatus is equipped with a second model or has the ability to invoke the second model, and obtaining the first intent representation vector based on the multimodal data includes: using the second model to obtain the first intent representation vector based on the multimodal data.

6. The method according to claim 5, characterized in that the second model includes an encoder and a neural network, wherein the encoder is used to extract features from each data item in the multimodal data to obtain the features of the multimodal data, and the neural network is used to obtain the first intent representation vector based on the features of the multimodal data.

7. The method according to any one of claims 1-6, characterized in that constructing the first prompt word based at least on the first intent representation vector includes: constructing the first prompt word based on the first intent representation vector and context information of the first software, wherein the context information of the first software is used to indicate the association between the first software and the source code in a target code project, and the target code project is the code project where the first software is located.

8. The method according to claim 7, characterized in that the method further includes: obtaining the context information of the first software based on the source code in the target code project, wherein the context information includes one or more of the following: the dependency information of the first software on the source code in the target code project, the initialization variable information of the first software, and the context step information of the first software.

9. The method according to claim 7 or 8, characterized in that constructing the first prompt word based on the first intent representation vector and the context information of the first software includes: obtaining a first intent structure based on the first intent representation vector and the context information of the first software, wherein the first intent structure includes multiple layers of content, and each layer of content is used to represent the test intent from at least one dimension; and constructing the first prompt word based on the first intent structure and the context information of the first software.

10. The method according to claim 9, characterized in that obtaining the first intent structure based on the first intent representation vector and the context information of the first software includes: using an intent parsing model to obtain the first intent structure based on the first intent representation vector and the context information of the first software.

11. The method according to claim 10, characterized in that the intent parsing model includes an intent recognition module, a context adjustment module, and a hierarchical construction module; the intent recognition module is used to obtain an initial vector based on the first intent representation vector; the context adjustment module is used to obtain an intermediate vector based on the initial vector and the context information of the first software; and the hierarchical construction module is used to construct the first intent structure based on the initial vector and the intermediate vector.

12. The method according to any one of claims 9-11, characterized in that, after generating the first test code based on the first prompt word using the first model, the method further includes: optimizing the first intent structure based on feedback information to obtain a second intent structure, wherein the feedback information includes user feedback information and / or the execution result of the first test code; constructing a second prompt word based on the second intent structure and the context information of the first software; and using the first model to generate second test code based on the second prompt word, wherein the second test code is used to test the first software.

13. The method according to claim 12, characterized in that constructing the second prompt word based on the second intent structure and the context information of the first software includes: constructing the second prompt word based on the second intent structure, the context information of the first software, and the feedback information.

14. The method according to any one of claims 1-13, characterized in that the first prompt word includes one or more of the following: test characteristics, test steps, preconditions, expected results, test environment, and test behavior.

15. A code generation apparatus, characterized in that the code generation apparatus is equipped with a first model or has the ability to invoke the first model, and the code generation apparatus includes an acquisition module, a multimodal data fusion module, a parsing construction module, and an inference module, wherein: the acquisition module is used to acquire multimodal data related to a first software, wherein the multimodal data includes multiple data items of at least two types, the multimodal data includes the source code of the first software, the description document of the first software, and / or the illustration of the first software, and the source code of the first software is of type code, the description document of the first software is of type document, and the illustration of the first software is of type image; the multimodal data fusion module is used to obtain a first intent representation vector based on the multimodal data, wherein the first intent representation vector is obtained by feature fusion of the multiple data items; the parsing construction module is used to construct a first prompt word based at least on the first intent representation vector; and the inference module is used to generate first test code based on the first prompt word using the first model, wherein the first test code is used to test the first software.

16. The apparatus according to claim 15, characterized in that the description document of the first software includes information about the first software, which includes at least one of the following: the preset conditions of the first software, the use case information of the first software, the test case information of the first software, the source code file path of the first software, the product line of the first software, or the development department of the first software.

17. The apparatus according to claim 15 or 16, characterized in that the illustration of the first software includes a flowchart of the first software and / or a functional module architecture diagram of the first software.

18. The apparatus according to any one of claims 15-17, characterized in that the first model is a large language model (LLM).

19. The apparatus according to any one of claims 15-18, characterized in that the code generation apparatus is equipped with a second model or has the ability to invoke the second model; and the multimodal data fusion module is used to obtain the first intent representation vector based on the multimodal data using the second model.

20. The apparatus according to claim 19, characterized in that the second model includes an encoder and a neural network, wherein the encoder is used to extract features from each data item in the multimodal data to obtain the features of the multimodal data, and the neural network is used to obtain the first intent representation vector based on the features of the multimodal data.

21. The apparatus according to any one of claims 15-20, characterized in that the parsing construction module is used to construct the first prompt word based on the first intent representation vector and context information of the first software, wherein the context information of the first software is used to indicate the association between the first software and the source code in a target code project, and the target code project is the code project where the first software is located.

22. The apparatus according to claim 21, characterized in that the code generation apparatus also includes a context-aware module, wherein the context-aware module is used to obtain the context information of the first software based on the source code in the target code project, and the context information includes one or more of the following: the dependency information of the first software on the source code in the target code project, the initialization variable information of the first software, and the context step information of the first software.

23. The apparatus according to claim 21 or 22, characterized in that the parsing construction module includes an intent parsing module and a prompt word construction module, wherein the intent parsing module is used to obtain a first intent structure based on the first intent representation vector and the context information of the first software, the first intent structure includes multiple layers of content, and each layer of content is used to represent the test intent from at least one dimension; and the prompt word construction module is used to construct the first prompt word based on the first intent structure and the context information of the first software.

24. The apparatus according to claim 23, characterized in that the intent parsing module is used to obtain the first intent structure based on the first intent representation vector and the context information of the first software by using an intent parsing model.

25. The apparatus according to claim 24, characterized in that the intent parsing model includes an intent recognition module, a context adjustment module, and a hierarchical construction module; the intent recognition module is used to obtain an initial vector based on the first intent representation vector; the context adjustment module is used to obtain an intermediate vector based on the initial vector and the context information of the first software; and the hierarchical construction module is used to construct the first intent structure based on the initial vector and the intermediate vector.

26. The apparatus according to any one of claims 23-25, characterized in that the code generation apparatus also includes an intent optimization module, wherein the intent optimization module is used to optimize the first intent structure based on feedback information to obtain a second intent structure, and the feedback information includes user feedback information and / or the execution result of the first test code; the prompt word construction module is further used to construct a second prompt word based on the second intent structure and the context information of the first software; and the inference module is further used to generate second test code based on the second prompt word using the first model, wherein the second test code is used to test the first software.

27. The apparatus according to claim 26, characterized in that the prompt word construction module is further used to construct the second prompt word based on the second intent structure, the context information of the first software, and the feedback information.

28. The apparatus according to any one of claims 15-27, characterized in that the first prompt word includes one or more of the following: test characteristics, test steps, preconditions, expected results, test environment, and test behavior.

29. A computing device cluster, characterized in that the computing device cluster includes at least one computing device, each computing device including a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, to cause the computing device cluster to perform the method as described in any one of claims 1-14.

30. A computer-readable storage medium, characterized in that the computer-readable storage medium includes computer program instructions which, when executed by a computing device cluster, cause the computing device cluster to perform the method as described in any one of claims 1-14.

31. A computer program product containing instructions, characterized in that, when the instructions are executed by a computing device cluster, the computing device cluster performs the method as described in any one of claims 1-14.