A code optimization and self-learning method based on automated test feedback

By combining full-dimensional automated testing and pre-trained models, the problem of difficulty in evaluating code quality and locating root causes after code generation is solved. This achieves quantitative evaluation of code quality and reliability of root cause location, improving the efficiency and accuracy of code generation and defect repair, and adapting to personalized needs.

CN122240083A · Pending · Publication Date: 2026-06-19 · WUHAN ENYI INTERNET TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: WUHAN ENYI INTERNET TECH CO LTD
Filing Date: 2026-03-17
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing technologies make it difficult to comprehensively assess code quality after code generation, cannot achieve accurate end-to-end root cause localization, rely on human experience for defect repair, are costly, lack self-evolution capabilities, and cannot adapt to personalized development needs.

Method used

By collecting multi-dimensional test feedback data through full-dimensional automated testing, a multi-dimensional defect risk value is constructed. The root cause is located by combining it with a pre-trained root cause analysis model. The code optimization agent is used to generate a repair solution, and the code generation model and root cause analysis model are optimized through a pattern learning module.

Benefits of technology

It achieves reliable quantitative assessment of code quality and root cause localization, reduces human intervention, improves the efficiency and accuracy of code generation and defect repair, and adapts to the personalized needs of different projects.

✦ Generated by Eureka AI based on patent content.

Figure CN122240083A_ABST

Abstract

This invention discloses a code optimization and self-learning method based on automated test feedback, comprising the following steps: receiving a user's natural language requirement text; generating initial release code through a pre-trained code generation model; performing full-dimensional automated testing on the initial release code, collecting multi-dimensional test feedback data, and generating a structured multi-dimensional health check report; constructing a standardized problem-related feature matrix through a preset feature weight calculation formula; inputting the problem-related feature matrix into a pre-trained root cause analysis model to obtain the root cause classification probability output by the model and calculating the comprehensive confidence level; generating corresponding defect code repair solutions through a code optimization agent to generate optimized code; and applying the updated code generation model and root cause analysis model to the next automatic code generation and defect analysis process. This invention achieves automated process diversion through preset risk thresholds, avoiding ineffective root cause analysis and repair processes.

Description

Technical Field

[0001] This invention relates to the field of code generation technology, specifically to a code optimization and self-learning method based on automated test feedback.

Background Technology

[0002] Currently, code generation technology based on large language models is widely used in software development. It can quickly generate initial code based on users' natural language requirements, significantly lowering the development threshold and improving the efficiency of basic code development. However, existing technologies still have the following technical problems in the full-process management after code generation: First, traditional automated testing can only collect single functional results of test cases passing / failing, which cannot cover multi-dimensional data such as runtime performance, code coverage, security risks, and exception logs. It is difficult to comprehensively evaluate code quality and is very easy to miss non-functional defects such as performance bottlenecks and hidden security vulnerabilities. At the same time, the collected test data is mostly stored in isolation and is not bound to the code location or business scenario, which cannot provide effective support for subsequent defect analysis.

[0003] Secondly, existing solutions can only discover defects, but cannot achieve accurate end-to-end root cause localization. Defect remediation still heavily relies on the human experience of developers. Especially for complex issues such as performance bottlenecks in architecture-related classes, third-party dependency compatibility issues, and hidden logical defects, manual localization and remediation are time-consuming, labor-intensive, and prone to introducing secondary defects due to insufficient experience.

[0004] Third, existing code generation and defect repair solutions lack complete self-evolution capabilities. They cannot extract reusable and effective patterns from historical defect-repair processes, nor can they continuously optimize code generation models and root cause analysis models. As a result, code generation quality and defect repair capabilities cannot be iteratively improved, making it difficult to adapt to the personalized development needs of different projects.

[0005] To address these issues, we propose a code optimization and self-learning method based on automated test feedback.

Summary of the Invention

[0006] The present invention proposes a code optimization and self-learning method based on automated test feedback, which can at least solve one of the technical problems in the background art.

[0007] To achieve the above objectives, the present invention adopts the following technical solution: a code optimization and self-learning method based on automated test feedback, including the following steps:
S1. Receive the user's natural language requirement text and generate the initial code to be released through a pre-trained large code generation model.
S2. Perform full-dimensional automated testing on the initial code to be released, collect multi-dimensional test feedback data, calculate the multi-dimensional defect risk value of the corresponding code using a preset multi-dimensional defect risk value calculation formula, and generate a structured multi-dimensional health check report. When the multi-dimensional defect risk value is above the preset risk threshold, proceed to the next step; when it is below the threshold, directly output the initial code to be released.
S3. Perform cross-dimensional association mapping between the multi-dimensional health check report and the pre-stored code change history, system architecture map, and dependency library metadata; calculate the weight coefficient of each associated feature using a preset feature weight calculation formula; and construct a standardized problem-related feature matrix.
S4. Input the problem-related feature matrix into the pre-trained root cause analysis model to obtain the root cause classification probability output by the model. Combine this probability with the matching degree of similar historical defect cases and calculate the comprehensive confidence level using a preset root cause confidence calculation formula. When the comprehensive confidence level is above the preset confidence threshold, output the root cause localization results of the code defect and the corresponding root cause classification label; when it is below the threshold, return to step S3 to supplement the associated features and reconstruct the problem-related feature matrix.
S5. Based on the root cause localization results and root cause classification labels, generate corresponding defect code repair schemes through a code optimization agent, iteratively optimize the initial code to be released, and generate optimized code.
S6. Input the multi-dimensional defect risk values before and after optimization, the root cause localization results, the code repair schemes, and the code difference data before and after optimization into the pattern learning module, and calculate the effectiveness score of the defect-repair correlation pattern using the preset pattern effectiveness scoring formula. When the effectiveness score is above the preset validity threshold, extract the defect-repair correlation pattern and update the fine-tuning dataset of the large code generation model and the feature weights of the root cause analysis model.
S7. Apply the updated code generation model and root cause analysis model to the next automatic code generation and defect analysis process.

[0008] As a preferred embodiment of the code optimization and self-learning method based on automated test feedback described in this invention, in step S1, the initial code generation process based on natural language understanding specifically involves: receiving the user's natural language requirement text, performing word segmentation, entity extraction, intent recognition, and business constraint extraction on the requirement text through a natural language understanding module to generate a structured code generation requirement specification; inputting the code generation requirement specification into a pre-trained code generation model to generate initial release code that conforms to the requirement specification, syntax rules, and coding standards; and simultaneously using the business constraints as the basis for setting preset performance benchmark thresholds and coverage benchmark values.

[0009] As a preferred embodiment of the code optimization and self-learning method based on automated test feedback described in this invention, in step S2, the full-dimensional automated testing includes functional testing, performance testing, code coverage testing, and security scanning testing; the collected multi-dimensional test feedback data includes: the pass / fail results of the test cases corresponding to the functional tests, and the log error stacks bound to the failed test cases; the peak CPU / memory usage, average SQL query time, and API interface response latency corresponding to the performance tests; the line coverage and branch coverage data corresponding to the code coverage tests; and the vulnerability level and the code line location of the vulnerability corresponding to the security scanning report. The formula for calculating the multidimensional defect risk value is as follows:

[0010] In the formula, the five weighting coefficients are preset and satisfy the normalization constraint, and the five weighted terms are the functional defect score, the performance anomaly score, the coverage missing score, the security vulnerability score, and the error stack anomaly score, each taking a value in the range [0,1]. The functional defect score is the ratio of the number of failed test cases to the total number of test cases. The performance anomaly score is the ratio of the number of performance metrics that exceed the preset performance benchmark threshold to the total number of performance metrics. The coverage missing score is the ratio of the difference between the preset coverage benchmark value and the actual coverage value to the benchmark value, with a lower limit of 0. The security vulnerability score is the ratio of the number of vulnerabilities weighted by vulnerability level to the preset maximum allowed number of vulnerabilities, with an upper limit of 1. The error stack anomaly score is the ratio of the frequency of abnormal log occurrences to a preset frequency threshold, with an upper limit of 1. All collected multi-dimensional test feedback data are uniquely bound and associated through code line numbers and test case IDs to generate a structured multi-dimensional health check report.
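The displayed formula is not reproduced in this text. One reading that is consistent with the definitions in paragraph [0010] is a normalized weighted sum of the five scores. The symbols below (R, w_k, the S subscripts, the N, V, F and C names) are placeholders chosen for illustration and are not the patent's own notation.

```latex
R = w_1 S_{\mathrm{func}} + w_2 S_{\mathrm{perf}} + w_3 S_{\mathrm{cov}}
  + w_4 S_{\mathrm{sec}} + w_5 S_{\mathrm{err}},
\qquad \sum_{k=1}^{5} w_k = 1

S_{\mathrm{func}} = \frac{N_{\mathrm{fail}}}{N_{\mathrm{total}}}, \qquad
S_{\mathrm{perf}} = \frac{N_{\mathrm{exceed}}}{N_{\mathrm{metrics}}}, \qquad
S_{\mathrm{cov}} = \max\!\left(0,\ \frac{C_{\mathrm{base}} - C_{\mathrm{actual}}}{C_{\mathrm{base}}}\right)

S_{\mathrm{sec}} = \min\!\left(1,\ \frac{V_{\mathrm{weighted}}}{V_{\max}}\right), \qquad
S_{\mathrm{err}} = \min\!\left(1,\ \frac{F_{\mathrm{log}}}{F_{\mathrm{th}}}\right)
```

Because every score lies in [0,1] and the weights sum to 1, R under this reading also lies in [0,1], which makes a single preset risk threshold meaningful across projects.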

[0011] As a preferred embodiment of the code optimization and self-learning method based on automated test feedback described in this invention, in step S3, the feature weight calculation formula is:

[0012] In the formula, the quantities defined for each associated feature are its normalized weight, its degree of correlation with the defect location, and its historical root cause contribution; the remaining symbol is the total number of associated features. The specific process of the cross-dimensional association mapping is as follows. Based on the line number, the defect code location in the multi-dimensional health check report is associated with the corresponding code commit record and change context information in the code change history, and the correlation degree of the corresponding associated feature is taken as the overlap between changed lines of code and defective lines of code. Based on the service call chain and data flow path in the system architecture map, the API response latency and SQL query time data in the multi-dimensional health check report are associated with the architectural dependencies of the corresponding service nodes and database nodes, and the correlation degree of the corresponding associated feature is taken as the call density between defective nodes and related nodes. Based on the dependency library name and version number, the security vulnerabilities and runtime anomaly data in the multi-dimensional health check report are associated with the corresponding dependency library metadata and open-source known-vulnerability databases, and the correlation degree of the corresponding associated feature is taken as the matching degree between vulnerabilities and dependency libraries. Based on the above association results, multi-dimensional associated features are extracted and combined with the weight of each associated feature to construct a standardized problem-related feature matrix.

[0013] As a preferred embodiment of the code optimization and self-learning method based on automated test feedback described in this invention, in step S4, the root cause confidence calculation formula is:

[0014] In the formula, the two weighting coefficients are preset and satisfy the normalization constraint. In step S4, the pre-trained root cause analysis model is a multi-class deep neural network model based on an attention mechanism. Its pre-training uses a historical defect dataset labeled with root cause localization results and root cause classification labels for supervised training. The root cause classification labels include five categories: syntax errors, logical defects, performance bottlenecks, security vulnerabilities, and dependency compatibility issues. The root cause classification probability output by the model is the model's predicted probability for the target root cause classification label, with a value range of [0,1]. The output root cause localization results include the precise code line number where the defect is located, the root cause description, and the full-link impact range. In step S4, the matching degree of similar historical defect cases is calculated as follows: the constructed problem-related feature matrix is matched against the feature vectors in the pre-constructed historical defect case library using cosine similarity, the top-N similar historical defect cases are retrieved, and the highest cosine similarity value is taken as the matching degree, with a value range of [0,1]. The root cause localization results and remediation plans of the matched similar cases are used as auxiliary features and input into the root cause analysis model to optimize the root cause classification probability it outputs.
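The confidence formula itself is not reproduced in this text. Given two preset weights under a normalization constraint, a model probability, and a case matching degree, a natural reading is a convex combination. The symbols C, P, M, the weights alpha and beta, the query vector x, and the case vectors v_j are placeholders assumed here for illustration.

```latex
C = \alpha P + \beta M, \qquad \alpha + \beta = 1, \qquad P, M \in [0, 1]

M = \max_{j \in \text{top-}N} \frac{x \cdot v_j}{\lVert x \rVert \, \lVert v_j \rVert}
```

Under this reading C also lies in [0,1], so it can be compared directly against the preset confidence threshold.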

[0015] As a preferred embodiment of the code optimization and self-learning method based on automated test feedback described in this invention, in step S5, the code optimization agent includes a root cause analysis unit, a repair scheme generation unit, and a local verification unit, and executes as follows. The root cause analysis unit performs structured analysis on the root cause localization results and root cause classification labels, extracting the core constraints and repair targets of the defects. The repair scheme generation unit combines the system architecture map and dependency library metadata to generate repair code fragments that conform to coding standards and business requirements. The local verification unit performs targeted unit tests and regression tests on the repaired code fragments and calculates the local multi-dimensional defect risk value after repair; when this value satisfies the pass condition, verification passes and code iteration and optimization are complete; when it does not, verification fails and a repair scheme is regenerated until verification passes or the preset maximum number of iterations is reached. In step S5, when the root cause classification label is a dependency compatibility issue or a dependency library security vulnerability, the repair scheme generation unit of the code optimization agent also combines dependency library metadata and open-source vulnerability databases to generate a dependency library version upgrade or secure replacement scheme, optimizing code dependencies at the same time; it synchronously updates the dependency optimization results to the system architecture map and dependency library metadata, and updates the historical root cause contribution of the corresponding associated features.

[0016] As a preferred embodiment of the code optimization and self-learning method based on automated test feedback described in this invention, wherein: in step S6, the pattern effectiveness scoring formula is:

[0017] In the formula, the three weighting coefficients are preset and satisfy the normalization constraint; the remaining quantities are the multi-dimensional defect risk value before optimization, the multi-dimensional defect risk value after optimization, the accuracy of this root cause localization, the actual number of iterations for this code repair, and the preset maximum number of iterations. In step S6, the specific process by which the pattern learning module extracts the defect-repair correlation pattern is as follows. The multi-dimensional health check report, root cause localization results, code repair scheme, and code difference data before and after optimization corresponding to this process are normalized and standardized. The correlations between defect triggering scenarios, defect root cause types, repair code features, and post-repair effect verification data are extracted to form a standardized defect-repair correlation pattern. This pattern is then added to the fine-tuning dataset of the large code generation model, and the large code generation model is incrementally fine-tuned. Simultaneously, based on the accuracy of this root cause localization and the weights of the associated features, the attention weights of the root cause analysis model and the historical root cause contribution of the corresponding associated features are updated.

[0018] As a preferred embodiment of the code optimization and self-learning method based on automated test feedback described in this invention, step S7 further includes a self-updating process for the weight coefficients: after a preset number of complete code generation and optimization processes have been completed, the preset weighting coefficients are adaptively updated based on the average root cause localization accuracy and the average first-time pass rate of code repairs, and the updated weighting coefficients must satisfy the normalization constraint; the risk threshold, confidence threshold, and validity threshold are updated synchronously. This enables closed-loop self-optimization of thresholds and weights throughout the entire process.

[0019] In another aspect, the present invention also discloses a computer-readable storage medium storing a computer program, which, when executed by a processor, causes the processor to perform the steps of the method described above.

[0020] In another aspect, the present invention also discloses a computer device, including a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, the processor performs the steps of the method described above.

[0021] The beneficial effects of this invention are as follows. This invention collects multi-dimensional test feedback data through full-dimensional automated testing, constructs a quantitative formula for the multi-dimensional defect risk value, and normalizes and weights code quality data from multiple dimensions such as functionality, performance, coverage, security, and exception logs to achieve quantitative evaluation of code quality; it also achieves automated process diversion by setting risk thresholds, avoiding ineffective root cause analysis and repair processes. This invention introduces a pre-trained root cause analysis model and constructs a root cause confidence calculation formula by combining the root cause classification probability output by the model with the matching degree of historical cases. The validity of the root cause localization results is verified against a preset confidence threshold. For results with insufficient confidence, the associated features are automatically supplemented and re-analyzed, which ensures the reliability of the root cause localization results at the mechanism level and avoids secondary defects caused by erroneous repairs.

Attached Figure Description

[0022] Figure 1 is a flowchart illustrating the steps of the code optimization and self-learning method based on automated test feedback in this invention.

Detailed Implementation

[0023] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are some embodiments of the present invention, but not all embodiments.

[0024] As shown in Figure 1, a code optimization and self-learning method based on automated test feedback includes the following steps:
S1. Receive the user's natural language requirement text and generate the initial code to be released through a pre-trained large code generation model.
S2. Perform full-dimensional automated testing on the initial code to be released, collect multi-dimensional test feedback data, calculate the multi-dimensional defect risk value of the corresponding code using a preset multi-dimensional defect risk value calculation formula, and generate a structured multi-dimensional health check report. When the multi-dimensional defect risk value is above the preset risk threshold, proceed to the next step; when it is below the threshold, directly output the initial code to be released.
S3. Perform cross-dimensional association mapping between the multi-dimensional health check report and the pre-stored code change history, system architecture map, and dependency library metadata; calculate the weight coefficient of each associated feature using a preset feature weight calculation formula; and construct a standardized problem-related feature matrix.
S4. Input the problem-related feature matrix into the pre-trained root cause analysis model to obtain the root cause classification probability output by the model. Combine this probability with the matching degree of similar historical defect cases and calculate the comprehensive confidence level using a preset root cause confidence calculation formula. When the comprehensive confidence level is above the preset confidence threshold, output the root cause localization results of the code defect and the corresponding root cause classification label; when it is below the threshold, return to step S3 to supplement the associated features and reconstruct the problem-related feature matrix.
S5. Based on the root cause localization results and root cause classification labels, the code optimization agent generates corresponding defect code repair schemes, iteratively optimizes the initial code to be released, and generates optimized code.
S6. Input the multi-dimensional defect risk values before and after optimization, the root cause localization results, the code repair schemes, and the code difference data before and after optimization into the pattern learning module, and calculate the effectiveness score of the defect-repair correlation pattern using the preset pattern effectiveness scoring formula. When the effectiveness score is above the preset validity threshold, extract the defect-repair correlation pattern and update the fine-tuning dataset of the large code generation model and the feature weights of the root cause analysis model.
S7. Apply the updated code generation model and root cause analysis model to the next automatic code generation and defect analysis process.
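The branching in steps S2 to S5 can be sketched as a small control-flow function. This is a minimal illustration, not the patent's implementation: the names `Diagnosis` and `run_once`, the treatment of values exactly equal to a threshold, the cap `max_feature_rounds` on S3/S4 retries, and the "escalate" outcome are all assumptions added here so the loop terminates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Diagnosis:
    label: str         # root cause classification label from step S4
    confidence: float  # comprehensive confidence level from step S4


def run_once(
    risk: float,
    risk_threshold: float,
    diagnose: Callable[[int], Diagnosis],
    confidence_threshold: float,
    max_feature_rounds: int = 3,  # hypothetical cap, not stated in the source
) -> str:
    """Return which branch of S2-S5 a single run takes."""
    # S2 gate: low-risk code is released directly, skipping S3-S6.
    # Treating equality as "proceed" is an assumption.
    if risk < risk_threshold:
        return "release"
    # S3/S4 loop: each round stands for rebuilding the feature matrix
    # with supplemented features and re-running the root cause model.
    for round_index in range(max_feature_rounds):
        diagnosis = diagnose(round_index)
        if diagnosis.confidence >= confidence_threshold:
            # S5 would start here with the confident diagnosis.
            return f"repair:{diagnosis.label}"
    # Hypothetical fallback when confidence never reaches the threshold.
    return "escalate"
```

The `diagnose` callable receives the retry index so that a caller can model feature supplementation improving confidence from one round to the next.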

[0025] In step S1, the initial code generation process based on natural language understanding specifically involves: receiving the user's natural language requirement text, performing word segmentation, entity extraction, intent recognition, and business constraint extraction on the requirement text through the natural language understanding module to generate a structured code generation requirement specification; inputting the code generation requirement specification into a pre-trained code generation model to generate initial release code that conforms to the requirement specification, syntax rules, and coding standards; and using the business constraints as the basis for setting preset performance benchmark thresholds and coverage benchmark values.

[0026] In step S2, the full-dimensional automated testing includes functional testing, performance testing, code coverage testing, and security scanning testing. The collected multi-dimensional test feedback data includes: pass / fail results of test cases for functional tests, and log error stacks bound to failed test cases; peak CPU / memory usage, average SQL query time, and API response latency for performance tests; line coverage and branch coverage data for code coverage tests; and vulnerability levels and the line numbers of code containing vulnerabilities for security scanning reports. The formula for calculating the multidimensional defect risk value is:

[0027] In the formula, the five weighting coefficients are preset and satisfy the normalization constraint, and the five weighted terms are the functional defect score, the performance anomaly score, the coverage missing score, the security vulnerability score, and the error stack anomaly score, each taking a value in the range [0,1]. The functional defect score is the ratio of the number of failed test cases to the total number of test cases. The performance anomaly score is the ratio of the number of performance metrics that exceed the preset performance benchmark threshold to the total number of performance metrics. The coverage missing score is the ratio of the difference between the preset coverage benchmark value and the actual coverage value to the benchmark value, with a lower limit of 0. The security vulnerability score is the ratio of the number of vulnerabilities weighted by vulnerability level to the preset maximum allowed number of vulnerabilities, with an upper limit of 1. The error stack anomaly score is the ratio of the frequency of abnormal log occurrences to a preset frequency threshold, with an upper limit of 1. All collected multi-dimensional test feedback data are uniquely bound and associated through code line numbers and test case IDs to generate a structured multi-dimensional health check report.
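The five score definitions in paragraph [0027] translate directly into code. The sketch below assumes the risk value is the weighted sum of the five scores, which is the reading suggested by the normalization constraint on the weights; the function name, parameter names, and default weights are illustrative assumptions, not values from the patent.

```python
def defect_risk(
    failed_cases: int,
    total_cases: int,
    metrics_over_threshold: int,
    total_metrics: int,
    coverage: float,
    coverage_benchmark: float,
    weighted_vulnerabilities: float,
    max_allowed_vulnerabilities: float,
    abnormal_log_frequency: float,
    frequency_threshold: float,
    weights=(0.3, 0.2, 0.2, 0.2, 0.1),  # illustrative values only
) -> float:
    """Weighted sum of the five defect scores (assumed form)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    scores = (
        # Functional defect score: failed / total test cases.
        failed_cases / total_cases,
        # Performance anomaly score: share of metrics over their threshold.
        metrics_over_threshold / total_metrics,
        # Coverage missing score: relative shortfall, floored at 0.
        max(0.0, (coverage_benchmark - coverage) / coverage_benchmark),
        # Security vulnerability score: capped at 1.
        min(1.0, weighted_vulnerabilities / max_allowed_vulnerabilities),
        # Error stack anomaly score: capped at 1.
        min(1.0, abnormal_log_frequency / frequency_threshold),
    )
    return sum(w * s for w, s in zip(weights, scores))
```

For example, 2 failures in 10 cases, 1 of 4 metrics over threshold, coverage 0.6 against a 0.8 benchmark, 3 weighted vulnerabilities against a limit of 10, and 5 abnormal log entries against a threshold of 10 give roughly 0.27 with the illustrative weights.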

[0028] In step S3, the formula for calculating the feature weights is as follows:

[0029] In the formula, the quantities defined for each associated feature are its normalized weight, its degree of correlation with the defect location, and its historical root cause contribution; the remaining symbol is the total number of associated features. The specific process of cross-dimensional association mapping is as follows. Based on the line number, the location of the defective code in the multi-dimensional health check report is associated with the corresponding code commit record and change context information in the code change history, and the correlation degree of the corresponding associated feature is taken as the overlap between changed lines of code and defective lines of code. Based on the service call chain and data flow path in the system architecture map, the API response latency and SQL query time data in the multi-dimensional health check report are associated with the architectural dependencies of the corresponding service nodes and database nodes, and the correlation degree of the corresponding associated feature is taken as the call density between defective nodes and related nodes. Based on the dependency library name and version number, the security vulnerabilities and runtime anomaly data in the multi-dimensional health check report are associated with the corresponding dependency library metadata and open-source known-vulnerability databases, and the correlation degree of the corresponding associated feature is taken as the matching degree between vulnerabilities and dependency libraries. Based on the above association results, multi-dimensional associated features are extracted and combined with the weight of each associated feature to construct a standardized problem-related feature matrix.
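The feature weight formula is not reproduced in this text, so the sketch below is only one form that fits the stated inputs and output: each feature's correlation degree is multiplied by its historical root cause contribution and the products are normalized to sum to 1. The product form, the uniform fallback for all-zero inputs, and the `line_overlap` definition (share of defective lines touched by a change) are assumptions; other combinations would fit the definitions equally well.

```python
from typing import List, Sequence, Set


def line_overlap(changed_lines: Set[int], defective_lines: Set[int]) -> float:
    """Assumed overlap measure: share of defective lines touched by a change."""
    if not defective_lines:
        return 0.0
    return len(changed_lines & defective_lines) / len(defective_lines)


def feature_weights(
    correlation: Sequence[float], contribution: Sequence[float]
) -> List[float]:
    """Normalize correlation * contribution so the weights sum to 1 (assumed form)."""
    raw = [r * h for r, h in zip(correlation, contribution)]
    total = sum(raw)
    if total == 0:
        # Hypothetical fallback: no signal, so weight features uniformly.
        return [1.0 / len(raw)] * len(raw)
    return [value / total for value in raw]
```

A feature with both a high correlation degree and a strong historical record then dominates the matrix, while a feature with no historical contribution is suppressed regardless of its current correlation.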

[0030] Furthermore, in step S4, the formula for calculating the root cause confidence level is:

[0031] In the formula, the two weighting coefficients are preset and satisfy the normalization constraint. In step S4, the pre-trained root cause analysis model is a multi-class deep neural network model based on an attention mechanism. Its pre-training uses a historical defect dataset labeled with root cause localization results and root cause classification labels for supervised training. The root cause classification labels include five categories: syntax errors, logical defects, performance bottlenecks, security vulnerabilities, and dependency compatibility issues. The root cause classification probability output by the model is the model's predicted probability for the target root cause classification label, with a value range of [0,1]. The output root cause localization results include the precise code line number where the defect is located, the root cause description, and the full-link impact range. In step S4, the matching degree of similar historical defect cases is calculated as follows: the constructed problem-related feature matrix is matched against the feature vectors in the pre-constructed historical defect case library using cosine similarity, the top-N similar historical defect cases are retrieved, and the highest cosine similarity value is taken as the matching degree, with a value range of [0,1]. The root cause localization results and remediation plans of the matched similar cases are used as auxiliary features and input into the root cause analysis model to optimize the root cause classification probability it outputs.
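The case-matching step and the confidence combination can be sketched as follows. This assumes the feature matrix has already been flattened into a vector, that the confidence is a convex combination of model probability and matching degree, and that negative similarities are clamped to 0 to respect the stated [0,1] range; the function names and the default weight `alpha` are illustrative.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def case_match(
    query: Sequence[float], library: Sequence[Sequence[float]], top_n: int = 3
) -> float:
    """Retrieve the top-N most similar cases and return the highest similarity."""
    top = sorted((cosine(query, case) for case in library), reverse=True)[:top_n]
    # Clamp to 0 so the matching degree stays in [0, 1] (assumption).
    return max(0.0, top[0]) if top else 0.0


def root_cause_confidence(probability: float, match: float, alpha: float = 0.6) -> float:
    """Convex combination of model probability and case match (assumed form)."""
    return alpha * probability + (1.0 - alpha) * match
```

Retrieving the top-N cases rather than only the best one matters for the later step in which the matched cases' localization results and remediation plans are fed back to the model as auxiliary features.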

[0032] In step S5, the code optimization agent includes a root cause analysis unit, a repair scheme generation unit, and a local verification unit, and executes as follows. The root cause analysis unit performs structured analysis on the root cause localization results and root cause classification labels, extracting the core constraints and repair targets of the defects. The repair scheme generation unit combines the system architecture map and dependency library metadata to generate repair code fragments that conform to coding standards and business requirements. The local verification unit performs targeted unit tests and regression tests on the repaired code fragments and calculates the local multi-dimensional defect risk value after repair; when this value satisfies the pass condition, verification passes and code iteration and optimization are complete; when it does not, verification fails and a repair scheme is regenerated until verification passes or the preset maximum number of iterations is reached. In step S5, when the root cause classification label is a dependency compatibility issue or a dependency library security vulnerability, the repair scheme generation unit of the code optimization agent also combines dependency library metadata and open-source vulnerability databases to generate a dependency library version upgrade or secure replacement scheme, optimizing code dependencies at the same time; it synchronously updates the dependency optimization results to the system architecture map and dependency library metadata, and updates the historical root cause contribution of the corresponding associated features.
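The generate-verify-retry behavior of the code optimization agent can be sketched as a bounded loop. The pass condition is not recoverable from this text, so the sketch assumes that verification passes when the local risk value falls below a threshold; `repair_loop`, `generate_fix`, `local_risk`, and `pass_threshold` are hypothetical names.

```python
from typing import Callable, Optional, Tuple


def repair_loop(
    generate_fix: Callable[[int], str],
    local_risk: Callable[[str], float],
    pass_threshold: float,  # assumed pass condition: local risk below this value
    max_iterations: int,
) -> Tuple[Optional[str], int]:
    """Return (accepted fix or None, number of attempts used)."""
    for attempt in range(1, max_iterations + 1):
        # Repair scheme generation unit proposes a candidate fix.
        candidate = generate_fix(attempt)
        # Local verification unit scores the candidate after targeted tests.
        if local_risk(candidate) < pass_threshold:
            return candidate, attempt
    # Iteration budget exhausted without a passing fix.
    return None, max_iterations
```

Returning the attempt count alongside the fix keeps the number of iterations available for later scoring of the repair.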

[0033] Specifically, in step S6, the pattern effectiveness score $E$ is calculated as:

$$E = \gamma_1 \cdot \frac{R_{before} - R_{after}}{R_{before}} + \gamma_2 \cdot A + \gamma_3 \cdot \left(1 - \frac{k}{K_{max}}\right)$$

[0034] In the formula, $\gamma_1$, $\gamma_2$, and $\gamma_3$ are the preset weighting coefficients, and $\gamma_1 + \gamma_2 + \gamma_3 = 1$; $R_{before}$ is the multidimensional defect risk value before optimization; $R_{after}$ is the multidimensional defect risk value after optimization; $A$ is the accuracy of the root cause localization; $k$ is the actual number of iterations of this code repair; and $K_{max}$ is the preset maximum number of iterations. In step S6, the pattern learning module extracts defect-repair correlation patterns as follows: the multidimensional health check report, root cause localization results, code repair solution, and before-and-after code difference data for this process are normalized and standardized. The correlations among defect triggering scenarios, defect root cause types, repair code features, and post-repair verification data are then extracted to form standardized defect-repair correlation patterns. These patterns are added to the fine-tuning dataset of the large code generation model, which is then incrementally fine-tuned. At the same time, based on the root cause localization accuracy $A$ and the weights $\omega_i$ of the associated features, the attention weights of the root cause analysis model and the historical root cause contribution $h_i$ of the corresponding associated features are updated.

[0035] Step S7 also includes a self-updating process for the weighting coefficients: after every preset number $M$ of complete code generation and optimization processes, the weighting coefficients $w_1$ to $w_5$, $\alpha$, $\beta$, $\gamma_1$, $\gamma_2$, and $\gamma_3$ are adaptively updated based on the average root cause localization accuracy and the average first-time pass rate of code repairs, and the updated weighting coefficients must still satisfy their normalization constraints. The risk threshold $R_{th}$, the confidence threshold $C_{th}$, and the validity threshold $E_{th}$ are updated synchronously, achieving closed-loop self-optimization of thresholds and weights across the entire process.
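The text does not specify the rule by which the weighting coefficients are adjusted. The sketch below therefore only illustrates one way the normalization constraint could be re-imposed after some adjustment has been made; the function name `renormalize` and the example adjustment are assumptions.

```python
from typing import List, Sequence


def renormalize(weights: Sequence[float]) -> List[float]:
    """Rescale non-negative weights so that they sum to 1."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


# Hypothetical adjustment: the first weight was raised from 0.3 to 0.35,
# so the group no longer sums to 1 until it is rescaled.
adjusted = [0.35, 0.2, 0.1, 0.25, 0.15]
updated = renormalize(adjusted)
```

Each coefficient group would be rescaled separately, since each group carries its own sum-to-one constraint.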

[0036] A specific embodiment of this invention is as follows. The initial parameter values are preset as follows (those skilled in the art can adjust them to the actual business scenario): risk weighting coefficients $w_1 = 0.3$, $w_2 = 0.2$, $w_3 = 0.1$, $w_4 = 0.25$, $w_5 = 0.15$, satisfying $w_1 + w_2 + w_3 + w_4 + w_5 = 1$; risk threshold $R_{th} = 0.3$; confidence weighting coefficients $\alpha = 0.7$ and $\beta = 0.3$, satisfying $\alpha + \beta = 1$; confidence threshold $C_{th} = 0.85$; effectiveness weighting coefficients $\gamma_1 = 0.4$, $\gamma_2 = 0.35$, and $\gamma_3 = 0.25$, satisfying $\gamma_1 + \gamma_2 + \gamma_3 = 1$; validity threshold $E_{th} = 0.7$; maximum number of iterations $K_{max} = 5$; weight self-update cycle $M = 100$ complete processes. The historical defect case library stores more than 100,000 annotated historical code defect cases covering all five root cause classification labels.
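As an illustration, the preset values above could be held in a single immutable configuration object that checks the sum-to-one constraints on construction. The class name `PipelineParams` and its field names are hypothetical; only the numeric values come from the embodiment.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PipelineParams:
    """Preset parameters of the embodiment (field names are assumptions)."""
    risk_weights: Tuple[float, ...] = (0.3, 0.2, 0.1, 0.25, 0.15)
    risk_threshold: float = 0.3
    alpha: float = 0.7
    beta: float = 0.3
    confidence_threshold: float = 0.85
    gammas: Tuple[float, ...] = (0.4, 0.35, 0.25)
    validity_threshold: float = 0.7
    max_iterations: int = 5
    update_cycle: int = 100

    def __post_init__(self) -> None:
        groups = {
            "risk_weights": self.risk_weights,
            "alpha/beta": (self.alpha, self.beta),
            "gammas": self.gammas,
        }
        # Every weighting group must satisfy its normalization constraint.
        for name, group in groups.items():
            if not math.isclose(sum(group), 1.0, abs_tol=1e-9):
                raise ValueError(f"{name} must sum to 1")


params = PipelineParams()
```

Validating at construction time means a later weight self-update that breaks a constraint fails immediately instead of silently skewing the scores.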

[0037] The implementation steps are as follows. S1. Initial release code generation: receive the natural language requirement text entered by the user: "Generate backend interface code for a user management system, including user registration, login, and information query functions, using Java and the Spring Boot framework, and connecting to a MySQL database."

[0038] The natural language understanding module processes the requirement text in three stages: a BERT pre-trained model performs word segmentation and entity extraction, yielding core entities such as "Java", "Spring Boot", "MySQL", "user registration", "user login", and "user information query"; an intent recognition model identifies the user's core requirement as "user management system backend interface code generation"; and a constraint extraction module extracts business constraints such as development language, framework, database, and functional scope. The result is a structured code generation requirement specification.

[0039] The structured code generation requirement specification is input into a pre-trained code generation model (this embodiment uses the CodeLlama-70B pre-trained code model) to generate initial release code that conforms to the requirement specification, Java syntax rules, and Spring Boot coding standards. The generated code includes complete Controller, Service, and Mapper layers, together with the corresponding database table structures and SQL statements. The extracted business constraints also serve as the basis for the subsequent preset parameters: for example, the preset threshold for the API response latency of the user login interface is 200 ms, the threshold for SQL query time is 50 ms, and the benchmark value for code line coverage is 90%.

S2. Full-dimensional testing and quantitative defect risk assessment. Full-dimensional automated testing is performed on the initial release code and multi-dimensional test feedback data is collected, as follows:

Functional testing: using the JUnit testing framework, 20 unit and integration test cases are generated from the requirement specification. After execution, 4 failed cases are collected, each bound to its error log stack and defective code line numbers. The functional defect score is $S_{func} = 4 / 20 = 0.2$.

Performance testing: using the JMeter stress testing tool, a scenario of 100 concurrent users is simulated and five core performance indicators are collected. Two of them exceed the preset performance benchmark thresholds. The performance anomaly score is $S_{perf} = 2 / 5 = 0.4$.

Code coverage testing: using the JaCoCo coverage tool, the measured line coverage is 72%, below the preset benchmark of 90%. The coverage gap score is $S_{cov} = (0.9 - 0.72) / 0.9 = 0.2$.

Security scanning: using the SonarQube code security scanning tool, two high-risk SQL injection vulnerabilities are detected. The preset maximum allowed number of weighted vulnerabilities is 5, and the weighted number of detected vulnerabilities is 4. The security vulnerability score is $S_{sec} = 4 / 5 = 0.8$.

Error stack anomaly collection: using a log collection tool, 6 anomaly logs are collected during testing, against a preset frequency threshold of 10. The error stack anomaly score is $S_{err} = 6 / 10 = 0.6$.
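The five scores above, and the weighted risk value that the next paragraph derives from them, can be reproduced with a short Python sketch. The function names are hypothetical; the numbers and the caps (coverage gap not below 0, ratio scores not above 1) follow the embodiment and claim 3.

```python
from typing import Sequence


def capped_ratio(count: float, limit: float) -> float:
    """Ratio of a count to its reference value, capped at 1."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return min(1.0, count / limit)


def coverage_gap(benchmark: float, actual: float) -> float:
    """Relative shortfall against the coverage benchmark, floored at 0."""
    return max(0.0, (benchmark - actual) / benchmark)


def risk_value(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the per-dimension scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


scores = [
    capped_ratio(4, 20),        # functional: 4 of 20 test cases failed
    capped_ratio(2, 5),         # performance: 2 of 5 indicators over threshold
    coverage_gap(0.90, 0.72),   # coverage: 72% measured vs 90% benchmark
    capped_ratio(4, 5),         # security: weighted count 4, allowed maximum 5
    capped_ratio(6, 10),        # error stack: 6 logs vs frequency threshold 10
]
risk = risk_value(scores, [0.3, 0.2, 0.1, 0.25, 0.15])
# Risk at or above the 0.3 threshold routes the code to root cause analysis.
needs_root_cause_analysis = risk >= 0.3
```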

[0040] All collected multidimensional test feedback data is uniquely bound and associated through code line numbers and test case IDs to generate a structured multidimensional health check report. The scores are then substituted into the multidimensional defect risk value formula: $R = 0.3 \times 0.2 + 0.2 \times 0.4 + 0.1 \times 0.2 + 0.25 \times 0.8 + 0.15 \times 0.6 = 0.45$. Since $R = 0.45 \geq R_{th} = 0.3$, where $R_{th}$ is the risk threshold, the process proceeds to the subsequent root cause analysis steps.

S3. Cross-dimensional association mapping and feature matrix construction. The multidimensional health check report is cross-dimensionally associated and mapped with the pre-stored code change history, system architecture map, and dependency library metadata, as follows:

1. Code change history association: based on line numbers, defective code locations are associated with Git commit records. Of the 10 defective lines, 8 come from the most recent commit, so the correlation degree is $c_1 = 0.8$; the historical root cause contribution of this feature is $h_1 = 0.6$.

2. System architecture map association: based on the service call chain, API response latency data is associated with the corresponding service nodes. The call chain of the defective interface involves 3 associated service nodes, 1 of which is a direct dependency node, so the correlation degree is $c_2 \approx 0.33$; the historical root cause contribution of this feature is $h_2 = 0.7$.

3. Dependency library metadata association: based on the dependency library name and version number, security vulnerabilities are associated with the CVE open-source vulnerability database. Both vulnerabilities match the currently introduced MyBatis dependency version, so the correlation degree is $c_3 = 1.0$; the historical root cause contribution of this feature is $h_3 = 0.8$.

[0041] Three associated features are extracted and substituted into the feature weight formula to obtain the normalized weight of each feature: feature 1 weight $\omega_1 = (0.8 \times 0.6) / (0.8 \times 0.6 + 0.33 \times 0.7 + 1.0 \times 0.8) \approx 0.318$; feature 2 weight $\omega_2 = (0.33 \times 0.7) / (0.8 \times 0.6 + 0.33 \times 0.7 + 1.0 \times 0.8) \approx 0.153$; feature 3 weight $\omega_3 = (1.0 \times 0.8) / (0.8 \times 0.6 + 0.33 \times 0.7 + 1.0 \times 0.8) \approx 0.529$. Based on the calculated weights and the feature vector of each feature, a standardized problem-associated feature matrix is constructed.

S4. Root cause analysis and confidence verification. The problem-associated feature matrix is input into the pre-trained root cause analysis model, which outputs its highest predicted probability for the security vulnerability classification: $P = 0.92$.
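The feature weight normalization in this paragraph amounts to dividing each product of correlation degree and historical root cause contribution by the sum of all such products. A minimal sketch, with the hypothetical function name `feature_weights`:

```python
from typing import List, Sequence


def feature_weights(
    correlations: Sequence[float],
    contributions: Sequence[float],
) -> List[float]:
    """Normalize correlation x historical-contribution products to sum to 1."""
    if len(correlations) != len(contributions):
        raise ValueError("inputs must have the same length")
    products = [c * h for c, h in zip(correlations, contributions)]
    total = sum(products)
    if total == 0:
        raise ValueError("at least one product must be non-zero")
    return [p / total for p in products]


# Values from the embodiment: change history, architecture map, dependency metadata.
weights = feature_weights([0.8, 0.33, 1.0], [0.6, 0.7, 0.8])
```

With these inputs the three weights round to 0.318, 0.153, and 0.529, matching the figures in the text.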

[0042] At the same time, cosine similarity matching is performed between the problem-associated feature matrix and the feature vectors of the historical defect case library to retrieve the top-5 similar cases. The highest cosine similarity is 0.88, so the matching degree of similar historical defect cases is $D_{match} = 0.88$.

[0043] Substituting into the root cause confidence formula gives the comprehensive confidence $C = (0.92 \times 0.7 + 0.88 \times 0.3) / 1 = 0.908$. Since $C = 0.908 \geq C_{th} = 0.85$, where $C_{th}$ is the confidence threshold, the root cause localization result and root cause classification label are output. The root cause classification label is security vulnerability. The root cause localization result includes the exact line number of the defect (the line of SQL code for the user query in the Mapper layer), the root cause description (the SQL statement is not parameterized, leaving an SQL injection vulnerability), and the full-link impact range (both the user login interface and the information query interface are affected).
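The confidence check in this paragraph can be sketched as follows. The division by the sum of the two coefficients mirrors the "/ 1" in the worked example and is an assumption about the underlying formula; the function names are hypothetical.

```python
def root_cause_confidence(
    probability: float,
    match_degree: float,
    alpha: float = 0.7,
    beta: float = 0.3,
) -> float:
    """Weighted blend of model probability and historical matching degree."""
    if alpha + beta <= 0:
        raise ValueError("alpha + beta must be positive")
    # Assumption: normalize by alpha + beta, which equals 1 in the embodiment.
    return (alpha * probability + beta * match_degree) / (alpha + beta)


def next_step(confidence: float, threshold: float = 0.85) -> str:
    """Route the process according to the confidence threshold."""
    return "output root cause" if confidence >= threshold else "return to S3"


confidence = root_cause_confidence(0.92, 0.88)
decision = next_step(confidence)
```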

[0044] S5. Agent-driven automatic code repair. Based on the root cause localization result and classification label, the code optimization agent performs the code repair as follows:

1. Root cause analysis unit: performs structured analysis on the root cause result and extracts the repair objective as "fix the SQL injection vulnerability, keep the SQL query function working normally, conform to Java coding standards, and do not affect the performance of the original interface".

2. Repair solution generation unit: combining the system architecture map and dependency library metadata, generates repair code snippets that replace the original concatenated SQL statements with MyBatis #{} parameterized SQL statements, eliminating the SQL injection vulnerability.

3. Local verification unit: runs targeted unit tests and regression tests on the repaired code snippets and calculates the local multidimensional defect risk value after repair, $R_{local} = 0.18 < 0.3$. Verification passes, code iteration and optimization are complete, and the optimized code is generated.
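The embodiment fixes the vulnerability with MyBatis `#{}` parameter binding in Java. To keep the examples in one language, the following sketch shows the same principle (bound parameters instead of string concatenation) using Python's standard `sqlite3` module; the table, function names, and payload are illustrative assumptions, not part of the patent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)",
                 [("alice",), ("bob",)])


def find_user_unsafe(name: str) -> list:
    # Vulnerable: user input is concatenated directly into the SQL text.
    sql = "SELECT username FROM users WHERE username = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_user_safe(name: str) -> list:
    # Fixed: the value is passed as a bound parameter, never parsed as SQL.
    sql = "SELECT username FROM users WHERE username = ?"
    return conn.execute(sql, (name,)).fetchall()


payload = "x' OR '1'='1"
leaked = find_user_unsafe(payload)   # the injected OR clause matches every row
blocked = find_user_safe(payload)    # no user has this literal name
```

A regression test of the kind the local verification unit runs would assert that the injection payload returns no rows while a legitimate lookup still succeeds.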

[0045] S6. Pattern learning and model update. All data from this process is input into the pattern learning module and substituted into the pattern effectiveness scoring formula, where the risk value before optimization is $R_{before} = 0.45$, the risk value after optimization is $R_{after} = 0.18$, the root cause localization result was confirmed completely accurate by manual verification so the accuracy is $A = 1.0$, the number of repair iterations is $k = 1$, and the maximum number of iterations is $K_{max} = 5$: $E = 0.4 \times (0.45 - 0.18) / 0.45 + 0.35 \times 1.0 + 0.25 \times (1 - 1 / 5) = 0.79$. Since $E = 0.79 \geq E_{th} = 0.7$, where $E_{th}$ is the validity threshold, this defect-repair correlation pattern is extracted and added to the fine-tuning dataset of the large code generation model. The model is then incrementally fine-tuned to avoid generating the same defect in similar code later. At the same time, based on the accuracy of this root cause localization and the feature weights, the attention weights of the root cause analysis model and the historical root cause contribution of the corresponding associated features are updated.
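The effectiveness score computed in this paragraph combines three terms: relative risk reduction, localization accuracy, and iteration efficiency. A minimal sketch with the hypothetical function name `pattern_effectiveness`, using the embodiment's coefficients as defaults:

```python
from typing import Tuple


def pattern_effectiveness(
    risk_before: float,
    risk_after: float,
    accuracy: float,
    iterations: int,
    max_iterations: int,
    gammas: Tuple[float, float, float] = (0.4, 0.35, 0.25),
) -> float:
    """Score a defect-repair pattern from the three weighted terms."""
    if risk_before <= 0 or max_iterations <= 0:
        raise ValueError("risk_before and max_iterations must be positive")
    g1, g2, g3 = gammas
    risk_reduction = (risk_before - risk_after) / risk_before
    iteration_efficiency = 1 - iterations / max_iterations
    return g1 * risk_reduction + g2 * accuracy + g3 * iteration_efficiency


score = pattern_effectiveness(0.45, 0.18, 1.0, 1, 5)
# Patterns at or above the 0.7 validity threshold join the fine-tuning dataset.
keep_pattern = score >= 0.7
```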

[0046] S7. Self-evolving closed-loop construction. The updated code generation model and root cause analysis model are applied to the next automatic code generation and defect analysis process, forming a complete self-evolving closed loop. After every 100 complete processes, the weighting coefficients are adaptively updated based on the average root cause localization accuracy and the average code repair pass rate, and the thresholds are updated synchronously. This achieves closed-loop self-optimization of the parameters across the entire process and continuously improves how well the solution fits the target business scenario.

[0047] In another aspect, the present invention also discloses a computer-readable storage medium storing a computer program, which, when executed by a processor, causes the processor to perform the steps of the method described above.

[0048] In another aspect, the present invention also discloses a computer device, including a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, the processor performs the steps of the method described above.

[0049] In another embodiment provided in this application, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to execute any of the code optimization and self-learning methods based on automated test feedback in the above embodiments.

[0050] It is understood that the systems, devices, and storage media provided in the embodiments of the present invention correspond to the methods provided in those embodiments; for explanations, examples, and beneficial effects of the relevant content, refer to the corresponding parts of the methods above.

[0051] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state disk (SSD)).

[0052] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0053] The embodiments in this specification are described in a related manner. Parts that are the same or similar across embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, the system embodiments are largely similar to the method embodiments and are therefore described briefly; for relevant details, refer to the descriptions of the method embodiments.

[0054] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A code optimization and self-learning method based on automated test feedback, characterized in that it includes the following steps:
S1. Receive the user's natural language requirement text and generate the initial code to be released through a pre-trained large code generation model.
S2. Perform full-dimensional automated testing on the initial code to be released, collect multi-dimensional test feedback data, calculate the multidimensional defect risk value $R$ of the code using a preset multidimensional defect risk value formula, and generate a structured multidimensional health check report; when $R \geq R_{th}$, where $R_{th}$ is the preset risk threshold, proceed to the next step; when $R < R_{th}$, directly output the initial code to be released.
S3. Perform cross-dimensional association mapping between the multidimensional health check report and the pre-stored code change history, system architecture map, and dependency library metadata, and calculate the weight coefficient of each associated feature using a preset feature weight formula to construct a standardized problem-associated feature matrix.
S4. Input the problem-associated feature matrix into the pre-trained root cause analysis model to obtain the root cause classification probability $P$ output by the model, and, combined with the matching degree $D_{match}$ of similar historical defect cases, calculate the comprehensive confidence $C$ using a preset root cause confidence formula; when $C \geq C_{th}$, where $C_{th}$ is the preset confidence threshold, output the root cause localization result of the code defect and the corresponding root cause classification label; when $C < C_{th}$, return to step S3 to supplement the associated features and reconstruct the problem-associated feature matrix.
S5. Based on the root cause localization result and root cause classification label, generate the corresponding defect code repair solution through a code optimization agent, iteratively optimize the initial code to be released, and generate the optimized code.
S6. Input the multidimensional defect risk values before and after optimization, the root cause localization result, the code repair solution, and the before-and-after code difference data into the pattern learning module, and calculate the effectiveness score $E$ of the defect-repair correlation pattern using the preset pattern effectiveness scoring formula; when $E \geq E_{th}$, where $E_{th}$ is the preset validity threshold, extract the defect-repair correlation pattern and update the fine-tuning dataset of the large code generation model and the feature weights of the root cause analysis model.
S7. Apply the updated code generation model and root cause analysis model to the next automatic code generation and defect analysis process.

2. The code optimization and self-learning method based on automated test feedback according to claim 1, characterized in that: In step S1, the initial code generation process based on natural language understanding specifically involves: receiving the user's natural language requirement text, performing word segmentation, entity extraction, intent recognition, and business constraint extraction on the requirement text through the natural language understanding module to generate a structured code generation requirement specification; inputting the code generation requirement specification into a pre-trained code generation model to generate initial release code that conforms to the requirement specification, syntax rules, and coding standards; and using the business constraints as the basis for setting preset performance benchmark thresholds and coverage benchmark values.

3. The code optimization and self-learning method based on automated test feedback according to claim 1, characterized in that: in step S2, the full-dimensional automated testing includes functional testing, performance testing, code coverage testing, and security scanning. The collected multi-dimensional test feedback data includes: pass / fail results of functional test cases and error stacks of failed test cases; peak CPU / memory usage, average SQL query time, and API response latency from performance tests; line coverage and branch coverage data from code coverage tests; and vulnerability levels and vulnerability line locations from security scan reports. The multidimensional defect risk value is calculated as:

$$R = w_1 S_{func} + w_2 S_{perf} + w_3 S_{cov} + w_4 S_{sec} + w_5 S_{err}$$

In the formula, $w_1$ to $w_5$ are the preset weighting coefficients, and $w_1 + w_2 + w_3 + w_4 + w_5 = 1$; $S_{func}$ is the functional defect score, $S_{perf}$ is the performance anomaly score, $S_{cov}$ is the coverage gap score, $S_{sec}$ is the security vulnerability score, and $S_{err}$ is the error stack anomaly score, each with a value range of [0,1]. The functional defect score $S_{func}$ is the ratio of the number of failed test cases to the total number of test cases. The performance anomaly score $S_{perf}$ is the ratio of the number of performance metrics that exceed the preset performance benchmark thresholds to the total number of performance metrics. The coverage gap score $S_{cov}$ is the ratio of the difference between the preset coverage benchmark value and the actual coverage value to the benchmark value, with a lower limit of 0. The security vulnerability score $S_{sec}$ is the ratio of the number of vulnerabilities weighted by vulnerability level to the preset maximum allowed number of vulnerabilities, with an upper limit of 1. The error stack anomaly score $S_{err}$ is the ratio of the frequency of abnormal log occurrences to a preset frequency threshold, with an upper limit of 1. All collected multidimensional test feedback data is uniquely bound and associated through code line numbers and test case IDs to generate a structured multidimensional health check report.

4. The code optimization and self-learning method based on automated test feedback according to claim 3, characterized in that: in step S3, the feature weights are calculated as:

$$\omega_i = \frac{c_i \cdot h_i}{\sum_{j=1}^{n} c_j \cdot h_j}$$

In the formula, $\omega_i$ is the normalized weight of the $i$-th associated feature, $c_i$ is the correlation degree between the $i$-th associated feature and the defect location, $h_i$ is the historical root cause contribution of the $i$-th associated feature, and $n$ is the total number of associated features. The cross-dimensional association mapping proceeds as follows. Based on line numbers, the defective code locations in the multidimensional health check report are associated with the corresponding code commit records and change context information in the code change history; the correlation degree $c_i$ of the corresponding associated feature is the overlap ratio between changed code lines and defective code lines. Based on the service call chains and data flow paths in the system architecture map, the API response latency and SQL query time data in the multidimensional health check report are associated with the architectural dependencies of the corresponding service nodes and database nodes; the correlation degree $c_i$ of the corresponding associated feature is the link call density between the defective node and the associated nodes. Based on the dependency library name and version number, the security vulnerabilities and runtime anomaly data in the multidimensional health check report are associated with the metadata of the corresponding dependency library and the open-source known-vulnerability database; the correlation degree $c_i$ of the corresponding associated feature is the matching degree between the vulnerability and the dependency library. Based on these association results, multi-dimensional associated features are extracted and combined with the calculated weight $\omega_i$ of each associated feature to construct a standardized problem-associated feature matrix.

5. The code optimization and self-learning method based on automated test feedback according to claim 4, characterized in that: in step S4, the root cause confidence is calculated as:

$$C = \frac{\alpha \cdot P + \beta \cdot D_{match}}{\alpha + \beta}$$

In the formula, $\alpha$ and $\beta$ are the preset weighting coefficients, and $\alpha + \beta = 1$. In step S4, the pre-trained root cause analysis model is an attention-based multi-class deep neural network. It is pre-trained by supervised learning on a historical defect dataset labeled with root cause localization results and root cause classification labels. The root cause classification labels cover five categories: syntax errors, logical defects, performance bottlenecks, security vulnerabilities, and dependency compatibility issues. The root cause classification probability $P$ output by the model is the model's predicted probability for the target root cause classification label, with a value range of [0,1]. The output root cause localization result includes the exact code line number of the defect, a root cause description, and the full-link impact range. In step S4, the matching degree $D_{match}$ of the similar historical defect cases is calculated as follows: the constructed problem-associated feature matrix is matched against the feature vectors in a pre-built historical defect case library using cosine similarity, the top-N similar historical defect cases are retrieved, and the highest cosine similarity value is taken as $D_{match}$, with a value range of [0,1]. The root cause localization results and repair solutions of the matched similar cases are also fed into the root cause analysis model as auxiliary features to refine the root cause classification probability $P$ that it outputs.

6. The code optimization and self-learning method based on automated test feedback according to claim 4, characterized in that: in step S5, the code optimization agent includes a root cause analysis unit, a repair solution generation unit, and a local verification unit. It executes as follows: the root cause analysis unit performs structured analysis on the root cause localization results and root cause classification labels, extracting the core constraints and repair objectives of the defect; the repair solution generation unit combines the system architecture map and dependency library metadata to generate repair code snippets that conform to coding standards and business requirements; the local verification unit runs targeted unit tests and regression tests on the repaired code snippets and calculates the local multidimensional defect risk value $R_{local}$ after repair. When $R_{local} < R_{th}$, where $R_{th}$ is the preset risk threshold, verification passes and code iteration and optimization are complete; when $R_{local} \geq R_{th}$, verification fails and a repair solution is regenerated until verification passes or the preset maximum number of iterations $K_{max}$ is reached. In step S5, when the root cause classification label is a dependency compatibility issue or a dependency library security vulnerability, the repair solution generation unit also combines dependency library metadata with open-source vulnerability databases to generate a dependency library version upgrade or secure replacement solution, optimizing the code dependencies at the same time; it then synchronizes the dependency optimization results to the system architecture map and dependency library metadata and updates the historical root cause contribution $h_i$ of the corresponding associated features.

7. The code optimization and self-learning method based on automated test feedback according to claim 6, characterized in that: in step S6, the pattern effectiveness score is calculated as:

$$E = \gamma_1 \cdot \frac{R_{before} - R_{after}}{R_{before}} + \gamma_2 \cdot A + \gamma_3 \cdot \left(1 - \frac{k}{K_{max}}\right)$$

In the formula, $\gamma_1$, $\gamma_2$, and $\gamma_3$ are the preset weighting coefficients, and $\gamma_1 + \gamma_2 + \gamma_3 = 1$; $R_{before}$ is the multidimensional defect risk value before optimization; $R_{after}$ is the multidimensional defect risk value after optimization; $A$ is the accuracy of the root cause localization; $k$ is the actual number of iterations of this code repair; and $K_{max}$ is the preset maximum number of iterations. In step S6, the pattern learning module extracts the defect-repair correlation pattern as follows: the multidimensional health check report, root cause localization results, code repair solution, and before-and-after code difference data for this process are normalized and standardized. The correlations among defect triggering scenarios, defect root cause types, repair code features, and post-repair verification data are then extracted to form a standardized defect-repair correlation pattern. This pattern is added to the fine-tuning dataset of the large code generation model, which is then incrementally fine-tuned. At the same time, based on the root cause localization accuracy $A$ and the weights $\omega_i$ of the associated features, the attention weights of the root cause analysis model and the historical root cause contribution $h_i$ of the corresponding associated features are updated.

8. The code optimization and self-learning method based on automated test feedback according to claim 7, characterized in that: step S7 also includes a self-updating process for the weighting coefficients: after every preset number $M$ of complete code generation and optimization processes, the weighting coefficients $w_1$ to $w_5$, $\alpha$, $\beta$, $\gamma_1$, $\gamma_2$, and $\gamma_3$ are adaptively updated based on the average root cause localization accuracy and the average first-time pass rate of code repairs, and the updated weighting coefficients must still satisfy their normalization constraints. The risk threshold $R_{th}$, the confidence threshold $C_{th}$, and the validity threshold $E_{th}$ are updated synchronously, achieving closed-loop self-optimization of thresholds and weights across the entire process.

9. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it causes the processor to perform the steps of the method as described in any one of claims 1 to 8.

10. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the computer program is executed by the processor, it causes the processor to perform the steps of the method as described in any one of claims 1 to 8.