AI-based method for automatic repair and recovery of compilation errors in continuous integration (CI)
By monitoring and analyzing compilation errors in the CI system and using artificial intelligence to generate fix code, automatically submitting and restarting the process, the problems of long CI process interruption time and wasted human resources are solved, and rapid automated repair and recovery of CI processes are achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- DONGFENG MOTOR GRP
- Filing Date
- 2026-03-13
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies lack end-to-end solutions that can be deeply integrated with CI processes, enabling automatic identification and intelligent repair of compilation errors and automatic process recovery. This results in long CI process downtime, wasted human resources, and low efficiency.
The CI system monitors and analyzes compilation error information, uses a pre-trained artificial intelligence repair model to generate repair code, and automatically submits it to the code repository, triggering a new round of CI compilation process, forming an automated repair closed loop until compilation is successful or the maximum number of retry attempts is reached.
It achieves a fully automated closed loop for repairing and recovering compilation errors in the CI process, shortens the CI process interruption time, completes repairs within minutes, ensures the continuity and rapid feedback of the CI process, and solves the problems of R&D process stagnation and manpower waste caused by compilation errors.
Figure CN122309329A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of software engineering automation technology, and in particular to an AI-based method for automatic repair and recovery of compilation errors in continuous integration (CI).
Background Technology
[0002] In modern agile software development, Continuous Integration (CI) has become a core practice for ensuring code quality and accelerating delivery. Developers frequently submit code to a shared repository, and the CI system automatically performs a series of operations, including code retrieval, dependency installation, compilation, and automated testing. If an error occurs during the compilation phase, the entire CI process is interrupted, and subsequent testing, code quality analysis, and deployment steps cannot be performed.
[0003] Currently, handling compilation errors in CI mainly relies on the following two methods, but both have significant drawbacks:
[0004] Traditional manual fix process: After a CI compilation failure, the system notifies developers via email or instant messaging. Developers then need to manually log into the CI system to view unstructured error logs, analyze the cause in their local environment, modify the code, and resubmit to trigger a new CI process. This method is inefficient, with the fix process typically taking 1-4 hours or even longer, severely disrupting the "continuity" of CI and forcing developers to frequently interrupt their core work to handle repetitive, simple compilation errors, resulting in a waste of human resources.
[0005] Local AI-assisted code correction tools, such as GitHub Copilot plugins, can provide real-time error correction suggestions as developers write code. However, these tools only work in the local development phase and are completely disconnected from the CI process. They cannot detect and fix unique errors caused by dependency and configuration differences in the CI environment, and the generated fix suggestions still need to be manually confirmed, applied, and submitted by developers. This does not reduce human intervention and fails to achieve an automated closed loop from error occurrence to process recovery.
[0006] Therefore, existing technologies lack an end-to-end solution that can be deeply integrated with CI processes to achieve automatic identification and intelligent repair of compilation errors and automatic process recovery.
Summary of the Invention
[0007] In view of the technical defects and drawbacks existing in the prior art, embodiments of the present invention provide an AI-based method for automatic repair and recovery of compilation errors in continuous integration (CI) to overcome or at least partially solve the above problems. The specific solution is as follows:
[0008] As a first aspect of the present invention, an AI-based method for automatic repair and recovery of compilation errors in continuous integration (CI) is provided, which is executed in a CI system. The method includes:
[0009] Monitoring and Analysis: When a failure occurs during the compilation and execution phase of the CI process, the compilation error information is captured and analyzed to generate structured error description data;
[0010] Intelligent diagnosis and repair: The structured error description data is input into a pre-trained artificial intelligence repair model, which diagnoses the root cause of the error based on the data and generates corresponding repair code;
[0011] Fix code submission: Automatically submit the generated fix code to the code repository associated with the CI system;
[0012] Process Restart and Verification: In response to the submission of the fix code, a new round of CI compilation process is triggered and executed to compile and verify the fix code;
[0013] If the compilation fails during the process restart and verification, the steps from monitoring and analysis through process restart and verification are re-executed based on the new compilation failure, forming an automated repair loop, until the compilation is successfully verified in the process restart and verification step or the preset maximum number of retries is reached.
[0014] In some embodiments, the monitoring and analysis specifically include:
[0015] In response to the compilation failure, extract the compilation error information from the log output by the compilation failure;
[0016] Based on the extracted compilation error information, the error type is identified, and the line number information of the error location is extracted;
[0017] Based on the line number information, obtain the code snippet containing the erroneous line and its context from the source code;
[0018] Based on the error type, the line number information, and the code snippet, generate structured error description data that includes error type field, error line number field, related code field, and error reason field.
[0019] In some embodiments, the generated structured error description data is a data object in JSON format;
[0020] Furthermore, the identified error type is determined from a preset error type set, which includes syntax errors, dependency errors, and type mismatch errors.
[0021] In some embodiments, the intelligent diagnosis and repair step further includes a pre-verification step performed after generating the repair code and before executing the repair code submission step, the pre-verification step including:
[0022] Syntax verification: In an isolated environment consistent with the CI system compilation environment, the repair code is compiled or statically checked to verify its syntax correctness;
[0023] Compatibility analysis: In the isolated environment, a compatibility analysis is performed on the dependencies involved in the fix code;
[0024] If either the syntax verification or the compatibility analysis fails, the AI repair model is controlled to regenerate the repair code based on the failed result, and the pre-verification step is repeated until both the syntax verification and the compatibility analysis pass or the preset repair generation limit is reached.
[0025] In some embodiments, the step of automatically submitting the generated fix code to the code repository associated with the CI system specifically includes:
[0026] Based on the fix task corresponding to the current compilation failure, create a temporary fix branch in the code repository;
[0027] The repair code generated by the AI repair model in the intelligent diagnosis and repair step is added to the temporary repair branch;
[0028] Based on the changes to the fix code and the error message corresponding to the compilation failure, a commit record is generated, and the temporary fix branch containing the commit record is pushed to the remote code repository.
[0029] In some embodiments:
[0030] The temporary repair branch created has a branch name containing a unique error identifier generated based on the compilation failure;
[0031] And / or,
[0032] The generated commit record contains a specific prefix in its commit information to identify that the commit was generated by an automated fix.
[0033] In some embodiments, the process restart and verification step, in response to the submission of the fix code, triggering and executing a new round of CI compilation process is specifically implemented as follows:
[0034] In response to the event that the temporary fix branch is pushed to the code repository, a CI compilation and verification process for the temporary fix branch is initiated.
[0035] The CI compilation and verification process for the temporary repair branch includes:
[0036] (a) Pull the code for the temporary fix branch from the code repository;
[0037] (b) Compile the fetched code;
[0038] (c) Post-processing based on compilation results: If compilation is successful, a notification of successful repair is generated; if compilation fails, the monitoring and parsing to process restart and verification steps are re-executed based on the failure information generated from this compilation of the temporary repair branch.
[0039] In some embodiments, the artificial intelligence repair model used in the intelligent diagnosis and repair steps is obtained through the following process:
[0040] Obtain a training dataset, which contains multiple sets of historical compilation error information and corresponding fix code verified by humans;
[0041] The code generation large language model based on the Transformer architecture is used as the base model;
[0042] The base model is supervised and fine-tuned using the training dataset to obtain the AI-powered repair model that can output repair code based on the input compilation error information.
[0043] In some embodiments, the AI-based repair model supports iterative optimization based on feedback data generated during the operation of the CI system, wherein the iterative optimization includes:
[0044] Collect repair instance data that has been successfully compiled and verified, generated by the CI system during the execution of the AI-based continuous integration compilation error automatic repair and recovery method. The repair instance data includes the compilation error information that triggered the repair and the corresponding AI-generated and verified repair code.
[0045] The collected repair instance data is used as incremental training data to incrementally train the currently deployed artificial intelligence repair model to generate an optimized model version.
[0046] The optimized model version is deployed to the CI system for subsequent compilation error fixing.
[0047] In some embodiments, the method further includes environment consistency management to ensure that the CI compilation process is executed in a deterministic environment, the environment consistency management including:
[0048] Provide and maintain a standardized compilation environment definition for projects managed by the CI system;
[0049] When performing the compilation operation in the monitoring and analysis step or the process restart and verification step, based on the standardized compilation environment definition, a corresponding isolated compilation environment is created or reused to perform the compilation.
[0050] The standardized compilation environment definition includes a configuration file for building container images, and the isolated compilation environment is a container instance created from a container image built based on the configuration file.
[0051] The present invention has the following beneficial effects:
[0052] This invention achieves an end-to-end automated closed-loop repair and recovery of compilation errors in the Continuous Integration (CI) process. For the first time in the CI field, it constructs a fully automated, manual-intervention-free technical chain encompassing "error monitoring - intelligent diagnosis - automatic repair - compilation restart." This solution seamlessly connects previously fragmented, manually-dependent steps (log analysis, code modification, code submission, and process restart), and ensures robustness of the repair through a pre-defined retry mechanism. The direct technical effect is a significant reduction in CI process downtime, which can easily reach several hours under traditional manual processing, to minutes (e.g., reducing the repair time for a dependency conflict error from 1.5 hours to within 10 minutes). This greatly guarantees the realization of the core values of "continuity" and "rapid feedback" in the CI process, fundamentally solving the problems of development process stagnation, wasted manpower, and low efficiency caused by compilation errors.
Attached Figure Description
[0053] Figure 1 is a flowchart illustrating an AI-based method for automatic repair and recovery of compilation errors in continuous integration (CI) provided in an embodiment of the present invention;
[0054] Figure 2 is a schematic diagram of the monitoring and analysis process provided in an embodiment of the present invention;
[0055] Figure 3 is a schematic diagram illustrating the process of submitting repair code according to an embodiment of the present invention;
[0056] Figure 4 is a schematic diagram illustrating the training of the artificial intelligence repair model provided in an embodiment of the present invention;
[0057] Figure 5 is a structural block diagram of an electronic device provided in an embodiment of the present invention.
Detailed Implementation
[0058] To enable those skilled in the art to better understand the technical solutions of the present invention, exemplary embodiments of the present invention are described below in conjunction with the accompanying drawings, including various details of the embodiments of the present invention to aid understanding. These should be considered merely exemplary. Therefore, those skilled in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.
[0059] Where there is no conflict, the various embodiments of the present invention and the features thereof may be combined with each other.
[0060] As used herein, the term “and / or” includes any and all combinations of one or more related enumerated entries.
[0061] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, the singular forms “a” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the terms “comprising” and / or “made of”, when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and / or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and / or groups thereof. Terms such as “connected” or “linked” are not limited to physical or mechanical connections but can include electrical connections, whether direct or indirect.
[0062] Unless otherwise specified, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be interpreted as having the meaning consistent with their meaning in the context of the relevant art and the invention, and will not be interpreted as having an idealized or overly formal meaning unless expressly so defined herein.
[0063] In the technical solution of this invention, the collection, storage, use, processing, transmission, provision, and disclosure of user personal information all comply with relevant laws and regulations and do not violate public order and good morals. The use of user data in this technical solution follows relevant national laws and regulations (e.g., the "Information Security Technology - Personal Information Security Specification"). For example: appropriate measures are taken for personal information access control; restrictions are imposed on the display of personal information; the purpose of using personal information does not exceed the scope of direct or reasonable association; and explicit identity targeting is eliminated when using personal information to avoid precisely locating a specific individual.
[0064] To address at least one of the technical problems existing in the aforementioned related technologies, this invention provides an AI-based method for automatic repair and recovery of compilation errors in continuous integration (CI). Figure 1 is a flowchart illustrating the method provided in an embodiment of the present invention. The method includes:
[0065] S100, when a failure occurs in the compilation and execution phase of the CI process, the compilation error information is captured and parsed to generate structured error description data;
[0066] S200, the structured error description data is input into a pre-trained artificial intelligence repair model, which diagnoses the root cause of the error based on the data and generates corresponding repair code;
[0067] S300, the generated repair code is automatically submitted to the code repository associated with the CI system;
[0068] S400, in response to the submission of the fix code, triggers and executes a new round of CI compilation process to compile and verify the fix code;
[0069] If the compilation fails in S400, S100 to S400 will be re-executed based on the new round of compilation failure, forming an automated repair loop until the compilation is successfully verified in S400 or the preset maximum number of retries is reached.
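As a non-limiting illustration, the closed loop of steps S100 to S400 can be sketched in Python as follows. The names `BuildResult`, `repair_loop`, and the four callables are hypothetical placeholders introduced here for the parsing, model, submission, and rebuild modules; the default of three retries is likewise only an assumed value for the preset maximum.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BuildResult:
    """Outcome of one CI compilation run (hypothetical structure)."""
    success: bool
    log: str


def repair_loop(
    first_failure: BuildResult,
    parse_error: Callable[[str], dict],   # S100: log -> structured error data
    generate_fix: Callable[[dict], str],  # S200: structured data -> repair code
    submit_fix: Callable[[str], None],    # S300: push repair code to the repository
    rebuild: Callable[[], BuildResult],   # S400: run a new CI compilation
    max_retries: int = 3,                 # assumed preset maximum
) -> Optional[int]:
    """Return the attempt number that compiled successfully, or None."""
    failure = first_failure
    for attempt in range(1, max_retries + 1):
        error = parse_error(failure.log)
        fix = generate_fix(error)
        submit_fix(fix)
        result = rebuild()
        if result.success:
            return attempt
        # The new failure becomes the input of the next round.
        failure = result
    return None
```

Passing the modules in as callables keeps the loop independent of any particular CI server, model, or repository host; a `None` result is the point at which an alert for manual intervention could be raised.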
[0070] This invention achieves an end-to-end automated closed-loop repair and recovery of compilation errors in the Continuous Integration (CI) process. For the first time in the CI field, it constructs a fully automated, manual-intervention-free technical chain encompassing "error monitoring - intelligent diagnosis - automatic repair - compilation restart." This solution seamlessly connects previously fragmented, manually-dependent steps (log analysis, code modification, code submission, and process restart), and ensures robustness of the repair through a pre-defined retry mechanism. The direct technical effect is a significant reduction in CI process downtime, which can easily reach several hours under traditional manual processing, to minutes (e.g., reducing the repair time for a dependency conflict error from 1.5 hours to within 10 minutes). This greatly guarantees the realization of the core values of "continuity" and "rapid feedback" in the CI process, fundamentally solving the problems of development process stagnation, wasted manpower, and low efficiency caused by compilation errors.
[0071] As shown in Figure 2, in some embodiments, step S100 includes:
[0072] S110, in response to the compilation failure, extract compilation error information from the log output by the compilation failure;
[0073] S120, Based on the extracted compilation error information, identify the error type and extract the line number information of the location where the error occurred;
[0074] S130, based on the line number information, obtain the code segment containing the erroneous line and its context from the source code;
[0075] S140, based on the error type, the line number information, and the code snippet, generate structured error description data including an error type field, an error line number field, a related code field, and an error reason field.
[0076] Optional:
[0077] In step S140, the generated structured error description data is a data object in JSON format;
[0078] Furthermore, in step S120, the identified error type is determined from a preset error type set, which includes syntax errors, dependency errors, and type mismatch errors.
[0079] Consider a Java project that fails during CI compilation. After the compilation execution module executes the `mvn clean compile` command, the console outputs an unstructured error log. At this point, the method described in the above embodiment is triggered.
[0080] The specific implementation process of compilation error capture and log parsing includes:
[0081] Corresponding step S110: Extract compilation error information
[0082] Response conditions: After the system (error capture module) detects that the compilation and execution module returns a "failure" status code (non-zero), it immediately intercepts and captures the entire console output text stream from the start to the end of the compilation.
[0083] In practice, the system focuses on identifying and extracting clearly marked errors from this large text. For example, it will locate log lines that begin with "ERROR" or "[ERROR]", or contain typical error patterns (such as "error:", "Exception in thread") and their related context.
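By way of example only, the capture in step S110 might be sketched as follows. The function name `extract_error_lines`, the `context` parameter, and the exact regular expression are assumptions made for illustration; only the markers "ERROR", "[ERROR]", "error:", and "Exception in thread" come from the description above.

```python
import re
from typing import List

# Lines that start with "ERROR" / "[ERROR]" or contain a typical error pattern.
ERROR_PATTERN = re.compile(r"^\s*(\[ERROR\]|ERROR\b)|error:|Exception in thread")


def extract_error_lines(exit_code: int, console_output: str, context: int = 1) -> List[str]:
    """Return error lines plus `context` neighbouring lines; empty if the build passed."""
    if exit_code == 0:
        return []
    lines = console_output.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            # Keep the matching line together with its surrounding context.
            for j in range(max(0, i - context), min(len(lines), i + context + 1)):
                keep.add(j)
    return [lines[i] for i in sorted(keep)]
```

Checking the exit code first mirrors the response condition in the embodiment: the log is only scanned after the compilation module reports a non-zero status.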
[0084] Corresponding step S120: Identify error type and extract line number
[0085] Error type identification: The system analyzes the core error lines extracted in the previous step based on predefined keyword mapping rules. For example, if an error line contains "Syntax error on token," it is classified as a "syntax error"; if it contains "cannot find symbol" or "package ... does not exist," it may be classified as a "symbol not found error" or a "dependency error." In this embodiment, the system identifies the error type as "syntax error."
[0086] Extracting line number information: The system scans error messages using regular expressions, looking for patterns like "UserService.java:25" or "at line 25". In this example, the system determines that the error occurs on line 25 of the file UserService.java.
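A minimal sketch of step S120, assuming the keyword mapping rules are held in a simple ordered list, is given below. The rule table, the label "unknown", and the function names `classify` and `locate` are illustrative assumptions; the keywords and the two location patterns are those named in this embodiment.

```python
import re
from typing import Optional, Tuple

# Ordered keyword rules: the first matching pattern decides the error type.
RULES = [
    (re.compile(r"Syntax error on token"), "syntax error"),
    (re.compile(r"cannot find symbol"), "symbol not found error"),
    (re.compile(r"package .+ does not exist"), "dependency error"),
]

# Matches "UserService.java:25" or "at line 25".
LOCATION = re.compile(r"(?P<file>[\w./\\-]+\.java):(?P<line>\d+)|at line (?P<line2>\d+)")


def classify(error_line: str) -> str:
    """Map an error line to an error type, or "unknown" if no rule matches."""
    for pattern, label in RULES:
        if pattern.search(error_line):
            return label
    return "unknown"


def locate(error_line: str) -> Tuple[Optional[str], Optional[int]]:
    """Return (file, line number); either part is None when it cannot be found."""
    match = LOCATION.search(error_line)
    if match is None:
        return None, None
    if match.group("line") is not None:
        return match.group("file"), int(match.group("line"))
    return None, int(match.group("line2"))
```

An explicit "unknown" result lets a caller decide whether to hand an unclassified error to the repair model or escalate it directly.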
[0087] Corresponding step S130: Obtain the associated code snippet
[0088] Location and Read: Based on the file path (UserService.java) and line number (25) obtained in step S120, the system directly reads the file from the source code directory pulled in this CI process.
[0089] Extracting the context: The system does not read line 25 alone; it reads a code block centered on the error line. Specifically, it reads line 25 itself (the error line), the three lines preceding it (lines 22-24), and the three lines following it (lines 26-28). The result is a code snippet containing the error line and its context, for example a snippet that includes "String userName = usrName; // Assignment operation".
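The context window of step S130 can be illustrated with the following sketch, in which the function name `read_context` and the `radius` parameter are assumptions; the radius of three lines is the value used in this embodiment.

```python
from typing import List


def read_context(source_lines: List[str], error_line: int, radius: int = 3) -> List[str]:
    """Return the error line (1-based) with up to `radius` lines on each side."""
    if not 1 <= error_line <= len(source_lines):
        raise ValueError(f"line {error_line} is outside the file")
    # Clamp the window so it never runs past the start or end of the file.
    start = max(1, error_line - radius)
    end = min(len(source_lines), error_line + radius)
    return source_lines[start - 1:end]
```

For a file with at least 28 lines, `read_context(lines, 25)` returns lines 22 through 28; near the top or bottom of a file the window is simply shorter.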
[0090] Corresponding step S140: Generate structured error description data
[0091] Data Assembly: The system assembles the outputs of the preceding steps into a structured form. It creates a JSON object containing the following fields:
[0092] errorType: set to the "syntax error" type identified in step S120.
[0093] errorLine: set to the line number 25 extracted in step S120.
[0094] relevantCode: set to the code text with its context obtained in step S130, such as "String userName = usrName;".
[0095] reason: a brief description that the system generates from the error type and code snippet, such as "Variable name misspelled: 'usrName' should be 'userName'". (This description can be dynamically generated by combining a fixed template with extracted symbols.)
[0096] Output: Finally, the system generates a complete, machine-readable JSON object, for example: {"errorType": "syntax error", "errorLine": 25, "relevantCode": "String userName = usrName;", "reason": "Variable name misspelled: 'usrName' should be 'userName'"}.
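For illustration, the data assembly of step S140 might look like the sketch below. The four field names are those of this embodiment; the function name `build_error_description` and the constant `REASON_TEMPLATE` are hypothetical, the latter standing in for the fixed template combined with extracted symbols.

```python
import json

# Fixed template filled with symbols extracted from the error (assumed form).
REASON_TEMPLATE = "Variable name misspelled: '{found}' should be '{expected}'"


def build_error_description(error_type: str, error_line: int,
                            relevant_code: str, reason: str) -> str:
    """Serialize the structured error description data as a JSON object."""
    payload = {
        "errorType": error_type,
        "errorLine": error_line,
        "relevantCode": relevant_code,
        "reason": reason,
    }
    return json.dumps(payload, ensure_ascii=False, indent=2)


reason = REASON_TEMPLATE.format(found="usrName", expected="userName")
print(build_error_description("syntax error", 25, "String userName = usrName;", reason))
```

Keeping `errorLine` as an integer rather than a string lets downstream modules use it directly as an index into the source file.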
[0097] In some embodiments, step S200 further includes a pre-verification step performed after generating the fix code and before executing step S300, the pre-verification step including:
[0098] S210, In an isolated environment consistent with the compilation environment of the CI system, the repair code is compiled or statically checked to verify its syntax correctness;
[0099] S220, In the isolated environment, a compatibility analysis is performed on the dependencies involved in the fix code;
[0100] If either S210 or S220 fails verification or analysis, the AI repair model is controlled to regenerate repair code based on the failed result, and the pre-verification step is repeated until both S210 and S220 pass verification or the preset repair generation limit is reached.
[0101] Take a Maven-managed Java Web project as an example. A developer submits code containing errors to the main branch of the code repository, triggering the CI process. In step S200, an AI repair model (e.g., a model fine-tuned from CodeLlama-7B) has generated a fix for the compilation error.
[0102] The specific implementation process of the pre-validation step includes:
[0103] Triggers and Environment Preparation:
[0104] After the AI repair module generates the repair code, the system will not submit it immediately, but will instead proceed to the pre-verification sub-step.
[0105] The system automatically creates an isolated Docker container consistent with the production CI environment based on the pom.xml file in the project root directory and the JDK version (e.g., JDK 11) and Maven version (e.g., Maven 3.8) defined in the CI configuration. This container image is pre-built and contains the basic compilation environment required by the project.
[0106] Corresponding step S210: Syntax correctness verification:
[0107] The system places the AI-generated fix code (e.g., a correction to a method signature error in a class), together with the complete source file to which the fix has been applied, into the temporary working directory of the isolated container.
[0108] Inside the container, the system executes the command `mvn clean compile -DskipTests`. This command is designed to compile the project code to verify the syntax correctness of the fixed code.
[0109] Pass criteria: the compilation process completes successfully, the console output contains "BUILD SUCCESS", and no new compilation errors are generated. This shows that the fixed code is syntactically correct and is accepted by the Java compiler.
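One possible form of the step S210 check is sketched below. It assumes the code already runs inside the isolated container with Maven on the path; the function names `compile_passed` and `run_syntax_check` are hypothetical, while the command and the "BUILD SUCCESS" marker are those given above.

```python
import subprocess
from typing import Tuple

COMPILE_CMD = ["mvn", "clean", "compile", "-DskipTests"]


def compile_passed(exit_code: int, console_output: str) -> bool:
    """Pass criteria: zero exit status and a BUILD SUCCESS marker in the output."""
    return exit_code == 0 and "BUILD SUCCESS" in console_output


def run_syntax_check(workdir: str) -> Tuple[bool, str]:
    """Compile the patched sources in `workdir`; return (passed, full console log)."""
    proc = subprocess.run(COMPILE_CMD, cwd=workdir, capture_output=True, text=True)
    output = proc.stdout + proc.stderr
    return compile_passed(proc.returncode, output), output
```

Returning the full log alongside the verdict means a failed check can be forwarded to the repair model as feedback without running the build again.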
[0110] Corresponding step S220: Dependency compatibility analysis:
[0111] After successful compilation, the system further performs dependency analysis. It runs the command `mvn dependency:tree` to generate the complete dependency tree for the current project.
[0112] The AI-powered fix module or a dedicated validation program analyzes this dependency tree to check for any new dependency conflicts that might be introduced by the fix. For example, if the fix upgrades a library version in pom.xml, the analyzer will compare the transitive dependencies of all dependencies in the project to confirm whether the new version is compatible with the version requirements of other existing libraries.
[0113] Pass criteria: dependency tree resolution succeeds, and no version range overlap conflicts or known incompatible combinations are found (this information can be matched against a pre-defined list of known conflicts). This indicates that the fix will not compromise the project's dependency stability.
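As a simplified illustration of step S220, the sketch below parses `mvn dependency:tree` output and matches the resolved versions against a pre-defined list of known conflicts. It assumes each dependency line ends with a `groupId:artifactId:type:version:scope` coordinate (optionally with a classifier), and it represents a known conflict as a pair of coordinate and version-prefix entries; that representation, the function names, and the `com.example` coordinates in the usage note are all hypothetical.

```python
from typing import Dict, List, Tuple

# ((coordinate A, version prefix A), (coordinate B, version prefix B))
KnownConflict = Tuple[Tuple[str, str], Tuple[str, str]]


def parse_dependency_tree(tree_output: str) -> Dict[str, str]:
    """Map "groupId:artifactId" to the resolved version found in the tree output."""
    resolved: Dict[str, str] = {}
    for line in tree_output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        parts = tokens[-1].split(":")
        if len(parts) == 5:    # group:artifact:type:version:scope
            group, artifact, _type, version, _scope = parts
        elif len(parts) == 6:  # group:artifact:type:classifier:version:scope
            group, artifact, _type, _classifier, version, _scope = parts
        else:
            continue           # project root, log noise, summary lines
        resolved[f"{group}:{artifact}"] = version
    return resolved


def find_conflicts(resolved: Dict[str, str],
                   known_conflicts: List[KnownConflict]) -> List[Tuple[str, str]]:
    """Return the known incompatible pairs that are both present in the tree."""
    hits = []
    for (a, a_prefix), (b, b_prefix) in known_conflicts:
        if (a in resolved and b in resolved
                and resolved[a].startswith(a_prefix)
                and resolved[b].startswith(b_prefix)):
            hits.append((f"{a}:{resolved[a]}", f"{b}:{resolved[b]}"))
    return hits
```

A call such as `find_conflicts(parse_dependency_tree(output), [(("com.example:lib-a", "2."), ("com.example:lib-b", "1."))])` returns an empty list when the analysis passes. Version range overlap checks are not covered by this sketch.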
[0114] Result processing and iteration:
[0115] If both S210 and S220 pass: the pre-verification step is marked as "successful", and the fix code will be passed to the subsequent code submission module (S300 step).
[0116] If either S210 or S220 fails:
[0117] For example, suppose the AI-generated fix code incorrectly introduces a non-existent API call, causing S210 compilation to fail. The system will then send a compilation error log to the AI fix model, indicating "Compilation failed: Method 'XXX' not found".
[0118] Based on this new feedback, the AI repair model regenerates a fix for the API call issue.
[0119] The system then repeats the verification of S210 and S220 in the same or newly created isolated environment.
[0120] The "Generate-Verify" loop will run a maximum of 3 times (i.e., the "preset maximum number of repair generation attempts" is 3). If the repair solutions generated for 3 consecutive times fail the pre-verification, the system will determine that the automatic repair has failed, will exit the pre-verification step, and may send an alert to the developers requiring manual intervention, instead of submitting invalid code.
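The "Generate-Verify" loop described above can be sketched as follows, under the assumption that the model and the two checks are supplied as callables. The names `pre_verify`, `generate_fix`, `check_syntax`, and `check_dependencies` are hypothetical; the limit of three generations is the value used in this embodiment.

```python
from typing import Callable, Optional, Tuple

Check = Callable[[str], Tuple[bool, str]]  # repair code -> (passed, log)


def pre_verify(
    error: dict,
    generate_fix: Callable[[dict, Optional[str]], str],  # (error data, feedback) -> code
    check_syntax: Check,        # S210
    check_dependencies: Check,  # S220
    max_generations: int = 3,
) -> Optional[str]:
    """Return the first repair code that passes both checks, or None."""
    feedback: Optional[str] = None
    for _ in range(max_generations):
        fix = generate_fix(error, feedback)
        passed, log = check_syntax(fix)
        if passed:
            # Dependency analysis only runs after a successful compilation.
            passed, log = check_dependencies(fix)
        if passed:
            return fix
        # The failure log becomes the feedback for the next generation.
        feedback = log
    return None
```

A `None` result corresponds to the case in which the system exits the pre-verification step and may alert developers instead of submitting invalid code.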
[0121] As shown in Figure 3, in some embodiments, step S300, "automatically submitting the generated fix code to the code repository associated with the CI system," specifically includes:
[0122] S310, Based on the repair task corresponding to the current compilation failure, create a temporary repair branch in the code repository;
[0123] S320, add the repair code generated by the artificial intelligence repair model in step S200 to the temporary repair branch;
[0124] S330, based on the changes to the fix code and the error information corresponding to the compilation failure, generate a commit record and push the temporary fix branch containing the commit record to the remote code repository.
[0125] Following the above embodiments, the AI repair module has successfully generated repair code for the "variable name misspelling" error, and the repair code has passed pre-validation. The main branch of the current code repository (taking GitLab as an example) contains a commit that caused the compilation failure. The CI system authenticates with a Personal Access Token (PAT) that grants read and write access to this repository.
[0126] The specific implementation process of automatically submitting the generated fix code to the code repository associated with the CI system includes:
[0127] Corresponding step S310: Create a temporary repair branch
[0128] The system (through the code submission module) obtains a unique identifier for this compilation failure, for example, the error ID generated by the CI system is 20240520-001.
[0129] The system uses the GitPython library and PAT authentication to create a new branch in the remote code repository. The branch name strictly follows the format: ci-ai-fix-20240520-001. The prefix ci-ai-fix- clearly identifies the purpose of the branch, and the error ID following it ensures the branch's uniqueness and traceability.
[0130] Corresponding step S320: Add the fix code to the temporary branch.
[0131] The system does not directly manipulate the main branch. It first pulls the code from the newly created ci-ai-fix-20240520-001 branch to the local working directory.
[0132] Then, the AI-generated fix code (e.g., correcting the variable usrName to userName in line 25 of the file UserService.java) is applied to the corresponding file in the local workspace. This "application" action essentially modifies the file content in the local workspace to include the fix.
[0133] The system executes the `git add` command to stage these changes.
[0134] Corresponding step S330: Generate commit record and push.
[0135] The system prepares the commit message. Based on a predefined format, the commit message is constructed as: [CI-AI-FIX] Fix compilation error: Syntax error - Variable name spelling error. Where:
[0136] [CI-AI-FIX] is a specific identifier prefix used to quickly identify all fixes automatically performed by AI in the commit history.
[0137] "Fix compilation error:" is a fixed description.
[0138] "Syntax error - Variable name spelling error" comes directly from the error type and reason summary identified in the previous error parsing step, so that the commit message itself contains the error context.
[0139] The system executes `git commit -m "[CI-AI-FIX] Fix compilation error: Syntax error - Variable name spelling error"`, creating a commit record on the local branch.
[0140] Finally, the system executes `git push origin ci-ai-fix-20240520-001`, pushing the entire temporary branch containing this commit to the remote code repository (GitLab). This operation triggers a "push event" in the code repository.
[0141] The above embodiment clearly demonstrates how to securely and systematically integrate AI-repaired results into a version control system.
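Steps S310 to S330 can be sketched as a plan of git commands. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the helper names `build_branch_name`, `build_commit_message`, and `plan_fix_submission` are hypothetical, the sketch uses the git command line rather than GitPython, it creates the branch locally before pushing, and it only builds the command lists without running them, so PAT authentication is left out.

```python
from typing import List, Sequence

BRANCH_PREFIX = "ci-ai-fix-"    # branch prefix named in the text
COMMIT_PREFIX = "[CI-AI-FIX]"   # commit prefix named in the text


def build_branch_name(error_id: str) -> str:
    """S310: derive the temporary fix branch name from the error ID."""
    return f"{BRANCH_PREFIX}{error_id}"


def build_commit_message(error_type: str, reason: str) -> str:
    """S330: prefix + fixed description + error type and reason."""
    return f"{COMMIT_PREFIX} Fix compilation error: {error_type} - {reason}"


def plan_fix_submission(
    error_id: str,
    changed_files: Sequence[str],
    error_type: str,
    reason: str,
    remote: str = "origin",
) -> List[List[str]]:
    """Return the git commands for S310-S330 as argument lists (not executed)."""
    branch = build_branch_name(error_id)
    message = build_commit_message(error_type, reason)
    return [
        ["git", "checkout", "-b", branch],      # S310: create the fix branch
        ["git", "add", "--", *changed_files],   # S320: stage the patched files
        ["git", "commit", "-m", message],       # S330: record the commit
        ["git", "push", remote, branch],        # S330: push the branch
    ]


if __name__ == "__main__":
    for command in plan_fix_submission(
        "20240520-001",
        ["UserService.java"],
        "Syntax error",
        "Variable name spelling error",
    ):
        print(" ".join(command))
```

Returning argument lists instead of shell strings keeps the commit message intact when the commands are later passed to a process runner.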
[0142] In some embodiments:
[0143] In step S310, the temporary repair branch created has a branch name containing a unique error identifier generated based on the compilation failure;
[0144] And / or,
[0145] In step S330, the generated commit record contains a specific prefix that identifies the commit as being generated by automated repair.
[0146] In the above embodiments:
[0147] During step S310, "Create a temporary repair branch", the system needs to generate a "unique error identifier based on the compilation failure".
[0148] Specifically, when the compilation execution module detects a compilation failure in step S100, the error capture module, while generating structured error data, calls an ID generator to produce a globally unique string, such as "20250216-1130-5a8f2b1c". This ID combines a timestamp and a random hash to ensure its uniqueness. This ID is the "unique error identifier".
[0149] Branch naming convention: When creating a branch, the system embeds this unique error identifier into the branch name. The final remote branch name is: ci-ai-fix-20250216-1130-5a8f2b1c. This name clearly indicates:
[0150] This is an AI fix branch initiated by the CI system (ci-ai-fix-).
[0151] It corresponds to a specific compilation failure instance identified as 20250216-1130-5a8f2b1c.
[0152] Any developer, or the subsequent compilation recovery module (claim 7), can immediately tell the branch's origin and purpose from its name without checking the commit history or logs, which makes such branches easy to recognize and manage.
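The identifier and branch naming described above can be sketched as follows. The function names are hypothetical, and the random part is drawn from Python's `secrets` module as an assumption, since the text only says that a timestamp and a random hash are combined.

```python
import re
import secrets
from datetime import datetime
from typing import Optional

BRANCH_PREFIX = "ci-ai-fix-"  # prefix named in the text
# Assumed ID layout, matching the example "20250216-1130-5a8f2b1c".
ID_PATTERN = r"\d{8}-\d{4}-[0-9a-f]{8}"


def generate_error_id(now: Optional[datetime] = None) -> str:
    """Combine a minute-resolution timestamp with 8 random hex characters."""
    now = now or datetime.now()
    timestamp = now.strftime("%Y%m%d-%H%M")
    random_part = secrets.token_hex(4)  # 4 random bytes -> 8 hex characters
    return f"{timestamp}-{random_part}"


def branch_name_for(error_id: str) -> str:
    """Embed the unique error identifier in the branch name."""
    return f"{BRANCH_PREFIX}{error_id}"


def error_id_from_branch(branch: str) -> Optional[str]:
    """Recover the error ID from a fix branch name, or None for other branches."""
    match = re.fullmatch(re.escape(BRANCH_PREFIX) + f"({ID_PATTERN})", branch)
    return match.group(1) if match else None


if __name__ == "__main__":
    error_id = generate_error_id()
    print(branch_name_for(error_id))
```

Because the ID can be parsed back out of the branch name, a later module can map a pushed branch to the failure record without reading the commit history.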
[0153] In step S330, "Generate commit record", the system needs to include a specific identifier prefix in the commit information.
[0154] Specifically, the system predefines a constant string as an identifier prefix, such as "[CI-AI-FIX]". When constructing the commit message, this prefix is unconditionally placed at the very beginning of the commit message.
[0155] Commit message generation: Based on the error parsing results (error type "syntax error", reason "variable name spelling error"), the system generates the following complete commit message:
[0156] [CI-AI-FIX] Fix compilation error: Syntax error - Variable name spelling error
[0157] When developers see any commit record starting with "[CI-AI-FIX]" in the code repository's commit history, merge requests, or code comparison interface, they can instantly identify it as a change generated by an AI-automated system, rather than a manual commit. This greatly facilitates subsequent code review focus (quickly filtering out all AI changes for centralized review), change tracking (quickly locating whether an issue is related to a particular AI fix), and effect statistical analysis (easily calculating the number of AI-automated fixes and their success rate by searching for this prefix).
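The filtering and counting use described above can be illustrated with a short sketch. The function names are hypothetical, and the commit subjects are assumed to have been collected beforehand (for example from `git log --format=%s`).

```python
from typing import Iterable, List

AI_FIX_PREFIX = "[CI-AI-FIX]"  # identifier prefix named in the text


def is_ai_fix_commit(subject: str) -> bool:
    """True only when the prefix sits at the very start of the message."""
    return subject.startswith(AI_FIX_PREFIX)


def select_ai_fix_commits(subjects: Iterable[str]) -> List[str]:
    """Keep the commit subjects produced by the automated repair."""
    return [subject for subject in subjects if is_ai_fix_commit(subject)]


if __name__ == "__main__":
    history = [
        "[CI-AI-FIX] Fix compilation error: Syntax error - Variable name spelling error",
        "Add login endpoint",
        "Document the [CI-AI-FIX] prefix",
    ]
    ai_commits = select_ai_fix_commits(history)
    print(f"{len(ai_commits)} of {len(history)} commits were automated fixes")
```

Matching on the start of the message, rather than anywhere in it, avoids counting manual commits that merely mention the prefix.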
[0158] In some embodiments, step S400, "in response to the submission of the fix code, triggering and executing a new round of CI compilation process," is specifically implemented as follows:
[0159] In response to the event that the temporary fix branch is pushed to the code repository, a CI compilation and verification process for the temporary fix branch is initiated.
[0160] The CI compilation and verification process for the temporary repair branch includes:
[0161] (a) Pull the code for the temporary fix branch from the code repository;
[0162] (b) Compile the fetched code;
[0163] (c) Post-processing based on the compilation results: If the compilation is successful, a notification of successful repair is generated; if the compilation fails, steps S100 to S400 are re-executed based on the failure information generated during the compilation of the temporary repair branch.
[0164] Following the aforementioned embodiment, the system has already created a temporary fix branch named ci-ai-fix-20240520-001 and pushed a commit containing the fix code (commit message: [CI-AI-FIX] Fix compilation error: Syntax error - Variable name spelling error) to a remote code repository (e.g., GitLab).
[0165] The specific implementation process of the CI compilation and verification workflow for temporary fix branches includes:
[0166] Triggering event:
[0167] After the code submission module (step S300) completes the push, the compilation recovery module uses the commit event hook (Webhook) feature of the code repository (GitLab) to monitor push events for the ci-ai-fix-20240520-001 branch in real time. The event contains detailed information such as the commit ID and the branch name.
[0168] Start the verification process:
[0169] The compilation recovery module responds to this event and immediately triggers a new round of the CI compilation process, whose target is explicitly this temporary repair branch rather than the main branch.
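How the compilation recovery module might decide whether a push event concerns a temporary repair branch can be sketched as below. The payload fields `object_kind` and `ref` follow the shape of GitLab push webhook events as an assumption of this sketch; the function name is hypothetical, and receiving and authenticating the HTTP request is out of scope.

```python
from typing import Any, Dict, Optional

REF_PREFIX = "refs/heads/"     # git ref namespace for branches
BRANCH_PREFIX = "ci-ai-fix-"   # temporary repair branch prefix from the text


def fix_branch_from_push_event(payload: Dict[str, Any]) -> Optional[str]:
    """Return the repair branch name if the event is a push to one, else None."""
    if payload.get("object_kind") != "push":
        return None  # ignore tag pushes, merge requests, and other events
    ref = payload.get("ref")
    if not isinstance(ref, str) or not ref.startswith(REF_PREFIX):
        return None
    branch = ref[len(REF_PREFIX):]
    # Only repair branches trigger the verification build, never main.
    return branch if branch.startswith(BRANCH_PREFIX) else None


if __name__ == "__main__":
    event = {
        "object_kind": "push",
        "ref": "refs/heads/ci-ai-fix-20240520-001",
    }
    print(fix_branch_from_push_event(event))
```

Filtering on the branch prefix is what keeps the verification build aimed at the temporary branch instead of the main branch.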
[0170] Subprocess (a): Retrieve code:
[0171] The code fetch module receives instructions from the compilation recovery module, fetches the latest code from the ci-ai-fix-20240520-001 branch of the code repository using PAT authentication, and passes it to the compilation execution module.
[0172] Subprocess (b): Execute compilation:
[0173] The compilation execution module executes the complete compilation command (such as mvn clean compile) on the pulled temporary branch code in a standard compilation environment predefined for the project (e.g., a Docker container containing JDK 11 and Maven 3.8).
[0174] Subprocess (c): Result post-processing:
[0175] Scenario 1: Compilation successful
[0176] If the compilation command executes successfully (returns with a return code of 0 and outputs "BUILD SUCCESS"), the system determines that the AI repair is effective.
[0177] The system generates a successful fix notification and sends it to relevant developers or groups via the integrated WeChat / DingTalk chatbot. The notification content is, for example: "[CI-AI Automatic Fix Successful] Branch ci-ai-fix-20240520-001 has been successfully compiled. Fix details: Corrected a spelling error in a variable name (usrName -> userName). Commit history: [link]". At this point, the automated loop for this compilation error has successfully ended. The temporary branch can be retained for review or subsequent automatic cleanup.
[0178] Scenario 2: Compilation failed
[0179] If the compilation command fails (returns a non-zero code and the output contains new error information), it means that the AI-generated fix has failed to solve the problem and may even have introduced new errors.
[0180] At this point, the compilation recovery module will not simply give up. It will initiate a new round of repair loop based on the entirely new failure information generated from this compilation of the temporary branch. Specifically:
[0181] Re-execute S100: The error capture module intercepts and parses the new compilation failure log, generating new structured error description data. This error is independent of the initial error on the main branch.
[0182] Re-execute S200: New error data is input into the AI repair module, and the model attempts to diagnose this "new problem" and generate new repair code.
[0183] Re-execute S300: After the new fix code passes pre-validation, the code submission module creates a new commit on the same temporary branch ci-ai-fix-20240520-001, rather than creating a new branch, to maintain the continuity of the fix attempt. The commit message may become, for example: [CI-AI-FIX] Retry Fix: Unhandled NullPointerException.
[0184] Re-execute S400: The new submission triggers the verification process described in this embodiment again.
[0185] This process will be repeated until compilation is successful, or until the "preset retry limit" (e.g., 3 times) is reached as set in claim 1. If it still fails after 3 retries, the system will send a "repair failed" notification along with logs of all attempts, awaiting manual intervention.
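The verification and retry behaviour described above can be sketched as a bounded loop. This is an illustration under assumptions: `CompileResult`, `compilation_succeeded`, and `run_verification_loop` are hypothetical names, and the compile and repair steps are injected as callables so that the sketch stays independent of any particular CI system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CompileResult:
    return_code: int
    output: str


def compilation_succeeded(result: CompileResult) -> bool:
    """Scenario 1: return code 0 and "BUILD SUCCESS" in the output."""
    return result.return_code == 0 and "BUILD SUCCESS" in result.output


def run_verification_loop(
    compile_branch: Callable[[], CompileResult],
    repair_and_commit: Callable[[str], None],
    max_retries: int = 3,
) -> Tuple[bool, List[str]]:
    """Compile the repair branch; on failure, repair again up to max_retries times.

    Returns (success, logs of every failed attempt).
    """
    failure_logs: List[str] = []
    retries = 0
    while True:
        result = compile_branch()                 # sub-processes (a) and (b)
        if compilation_succeeded(result):
            return True, failure_logs             # send "fix successful" notice
        failure_logs.append(result.output)
        if retries >= max_retries:
            return False, failure_logs            # send "repair failed" + logs
        repair_and_commit(result.output)          # re-run S100 to S300
        retries += 1


if __name__ == "__main__":
    outcomes = iter([
        CompileResult(1, "ERROR: cannot find symbol"),
        CompileResult(0, "BUILD SUCCESS"),
    ])
    ok, logs = run_verification_loop(lambda: next(outcomes), lambda log: None)
    print(ok, len(logs))
```

With `max_retries=3` the loop compiles at most four times: the first verification build plus one build after each of the three retried repairs.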
[0186] As shown in Figure 4, in some embodiments, the artificial intelligence repair model used in step S200 is obtained through the following process:
[0187] S510. Obtain the training dataset, which contains multiple sets of historical compilation error information and the corresponding manually verified repair code;
[0188] S520. Use a code-generation large language model based on the Transformer architecture as the base model;
[0189] S530. Perform supervised fine-tuning of the base model using the training dataset to obtain the artificial intelligence repair model, which can output repair code based on the input compilation error information.
[0190] In the above embodiments, a model based on the Transformer architecture provides strong code understanding and generation capabilities, while fine-tuning on paired "compilation error - repair code" data is a targeted form of domain adaptation: it steers the model's general coding ability toward diagnosing errors and generating repairs, so that its output is not only syntactically correct but also addresses the root cause of the compilation error, improving the accuracy and relevance of the repairs.
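The paired training data of step S510 can be illustrated with a sketch that turns each "compilation error, repair code" pair into a prompt and completion record. The record layout, the prompt wording, and the field names (`error_type`, `line`, `code`, `reason`) are assumptions of this sketch; the text does not specify a serialization format.

```python
import json
from typing import Dict, Iterable, List


def to_training_record(error_info: Dict[str, object], fix_code: str) -> Dict[str, str]:
    """Pair one structured compilation error with its verified repair code."""
    prompt = (
        "Compilation error:\n"
        + json.dumps(error_info, sort_keys=True)
        + "\nRepair code:\n"
    )
    return {"prompt": prompt, "completion": fix_code}


def to_jsonl(records: Iterable[Dict[str, str]]) -> str:
    """Serialize records one JSON object per line."""
    lines: List[str] = [json.dumps(record) for record in records]
    return "\n".join(lines)


if __name__ == "__main__":
    record = to_training_record(
        {
            "error_type": "syntax error",
            "line": 25,
            "code": "String n = usrName;",
            "reason": "variable name spelling error",
        },
        "String n = userName;",
    )
    print(to_jsonl([record]))
```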
[0191] In some embodiments, the AI-based repair model supports iterative optimization based on feedback data generated during the operation of the CI system, wherein the iterative optimization includes:
[0192] Collect repair instance data that has been successfully compiled and verified, generated by the CI system during the execution of the AI-based continuous integration compilation error automatic repair and recovery method. The repair instance data includes the compilation error information that triggered the repair and the corresponding AI-generated and verified repair code.
[0193] The collected repair instance data is used as incremental training data to incrementally train the currently deployed artificial intelligence repair model to generate an optimized model version.
[0194] The optimized model version is deployed to the CI system for subsequent compilation error fixing.
[0195] The above-described embodiment gives the AI repair model the ability to optimize iteratively based on operational feedback, so that the automated repair system can keep improving over time. This forms a closed loop with "learning" characteristics: the system not only handles current errors but also accumulates experience from successful repairs. Specifically, successfully verified repair instances generated during operation (i.e., AI-generated code that ultimately compiles successfully) are collected as new training data and used to incrementally train the existing model, producing an optimized model version for deployment. This process allows the AI repair model to adapt continuously to the project's coding style, dependency libraries, common error patterns, and team conventions, achieving a "the more you use it, the more accurate it becomes" effect.
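The collection step of the feedback loop can be sketched as a filter over repair instances. The `RepairInstance` type and the function name are hypothetical, and the removal of duplicate pairs is an addition of this sketch rather than something the text requires.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class RepairInstance:
    error_info: str      # compilation error information that triggered the repair
    fix_code: str        # AI-generated repair code
    compiled_ok: bool    # True if the verification compilation succeeded


def select_incremental_training_data(
    instances: Iterable[RepairInstance],
) -> List[RepairInstance]:
    """Keep only verified repairs, dropping exact duplicate pairs."""
    seen: Set[Tuple[str, str]] = set()
    selected: List[RepairInstance] = []
    for instance in instances:
        key = (instance.error_info, instance.fix_code)
        if instance.compiled_ok and key not in seen:
            seen.add(key)
            selected.append(instance)
    return selected


if __name__ == "__main__":
    collected = [
        RepairInstance("cannot find symbol usrName", "userName", True),
        RepairInstance("cannot find symbol usrName", "usrname", False),
        RepairInstance("cannot find symbol usrName", "userName", True),
    ]
    print(len(select_incremental_training_data(collected)))
```

Only instances that passed the verification compilation are kept, so failed attempts never flow back into the incremental training data.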
[0196] In some embodiments, the method further includes environment consistency management to ensure that the CI compilation process is executed in a deterministic environment, the environment consistency management including:
[0197] Provide and maintain a standardized compilation environment definition for projects managed by the CI system;
[0198] When performing the compilation operation in step S100 or step S400, a corresponding isolated compilation environment is created or reused to perform compilation based on the standardized compilation environment definition.
[0199] The standardized compilation environment definition includes a configuration file for building container images, and the isolated compilation environment is a container instance created from a container image built based on the configuration file.
[0200] The above embodiments introduce an environment consistency management step, fundamentally ensuring the reproducibility and reliability of the compilation and repair process, and eliminating repair failures or uncertainty in results caused by environmental differences. This solution defines a standardized compilation environment (such as a container image definition) for the project, and creates or reuses a completely consistent isolated environment (such as a container instance) based on this definition for each compilation operation (whether it's an initial failed compilation or a post-repair verification compilation). This ensures that the environmental foundation for compilation behavior is deterministic and pure. Specifically, this solves the common problem in traditional CI / CD of "it works fine on my local machine," ensuring that the context on which the AI repair module diagnoses and generates repair code is completely consistent with the context of the final verification compilation, greatly improving the effectiveness of the repair solution. Simultaneously, the application of isolation technologies such as containerization avoids environmental pollution and conflicts between different compilation tasks, improving the overall stability and execution efficiency of the system.
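Running each compilation in a fresh container instance can be sketched as follows. The function name, the image tag `ci-build-env:jdk11-maven3.8`, and the `/workspace` mount point are hypothetical; the sketch only builds the `docker run` argument list and assumes the image has already been built from the project's configuration file.

```python
from typing import List, Sequence


def docker_compile_command(
    image: str,
    workspace: str,
    compile_cmd: Sequence[str] = ("mvn", "clean", "compile"),
) -> List[str]:
    """Build a `docker run` invocation that compiles the workspace in isolation."""
    return [
        "docker", "run",
        "--rm",                            # discard the container afterwards
        "-v", f"{workspace}:/workspace",   # mount the checked-out code
        "-w", "/workspace",                # compile from the mounted directory
        image,                             # image built from the standard definition
        *compile_cmd,
    ]


if __name__ == "__main__":
    command = docker_compile_command(
        "ci-build-env:jdk11-maven3.8",     # hypothetical image tag
        "/builds/ci-ai-fix-20240520-001",
    )
    print(" ".join(command))
```

Using the same image for the initial failed compilation and for the post-repair verification is what keeps the two builds comparable.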
[0201] In some embodiments, the AI-based CI continuous integration compilation error automatic repair and recovery method is executed by a continuous integration system with integrated artificial intelligence repair capabilities, the system comprising:
[0202] The code fetch module is configured to fetch code from a code repository;
[0203] The compilation execution module is configured to perform compilation operations on the pulled code;
[0204] The error capture module is configured to intercept and parse compilation error information when compilation fails, and generate structured error description data;
[0205] The AI repair module is configured to receive the structured error description data, diagnose the root cause of the error, and generate repair code.
[0206] The code submission module is configured to submit the fix code to the code repository;
[0207] The compilation recovery module is configured to trigger a new round of CI compilation process in response to the submission of the fix code;
[0208] The error capture module, the AI repair module, the code submission module, and the compilation recovery module are connected in sequence, and together with the code fetch module and the compilation execution module, they form an automated closed loop for repairing compilation errors.
[0209] The above-described embodiment defines, at the system architecture level, the modular components that execute the method and the interactions between them. This solution abstracts the method flow and maps it onto a concrete system architecture consisting of six core modules: code fetch, compilation execution, error capture, AI repair, code submission, and compilation recovery.
[0210] Based on the same inventive concept, embodiments of the present invention also provide an electronic device. Figure 5 is a structural block diagram of an electronic device provided in an embodiment of the present invention. As shown in Figure 5, an embodiment of the present invention provides an electronic device including: one or more processors 101, a memory 102, and one or more I / O interfaces 103. The memory 102 stores one or more programs, which, when executed by the one or more processors, enable the one or more processors to implement any of the AI-based CI continuous integration compilation error automatic repair and recovery methods described in the above embodiments; the one or more I / O interfaces 103 are connected between the processor and the memory, configured to enable information interaction between the processor and the memory.
[0211] The processor 101 is a device with data processing capabilities, including but not limited to a central processing unit (CPU); the memory 102 is a device with data storage capabilities, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the I / O interface (read / write interface) 103 is connected between the processor 101 and the memory 102, and can realize information interaction between the processor 101 and the memory 102, including but not limited to a data bus (Bus).
[0212] In some embodiments, the processor 101, memory 102, and I / O interface 103 are interconnected via bus 104, and thus connected to other components of the computing device.
[0213] In some embodiments, the one or more processors 101 include a field-programmable gate array.
[0214] This invention also provides a computer-readable medium. The computer-readable medium stores a computer program, which, when executed by a processor, implements the steps of any of the AI-based CI continuous integration compilation error automatic repair and recovery methods described in the above embodiments. The computer-readable storage medium can be volatile or non-volatile.
[0215] This invention also provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code. When the computer-readable code is run in the processor of an electronic device, the processor in the electronic device executes any of the above-mentioned AI-based CI continuous integration compilation error automatic repair and recovery methods.
[0216] Those skilled in the art will understand that all or some of the steps, systems, and apparatuses disclosed above, and their functional modules / units, can be implemented as software, firmware, hardware, or suitable combinations thereof. In hardware implementations, the division between functional modules / units mentioned above does not necessarily correspond to the division of physical components; for example, a physical component may have multiple functions, or a function or step may be performed collaboratively by several physical components. Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit (ASIC). Such software can be distributed on a computer-readable storage medium, which may include computer storage media (or non-transitory media) and communication media (or transient media).
[0217] As is known to those skilled in the art, computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable program instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), static random access memory (SRAM), flash memory or other memory technologies, portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical disc storage, magnetic cartridges, magnetic tape, disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and is accessible to a computer. Furthermore, it is known to those skilled in the art that communication media typically contain computer-readable program instructions, data structures, program modules, or other data in modulated data signals such as carrier waves or other transmission mechanisms, and may include any information delivery medium.
[0218] The computer-readable program instructions described herein can be downloaded from computer-readable storage media to various computing / processing devices, or downloaded via a network, such as the Internet, local area network, wide area network, and / or wireless network, to an external computer or external storage device. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them to the computer-readable storage media in the respective computing / processing device.
[0219] The computer program instructions used to perform the operations of this invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, etc., and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information from the computer-readable program instructions. This electronic circuitry can execute the computer-readable program instructions to implement various aspects of the invention.
[0220] The computer program product described herein can be implemented specifically through hardware, software, or a combination thereof. In one alternative embodiment, the computer program product is specifically embodied in a computer storage medium; in another alternative embodiment, the computer program product is specifically embodied in a software product, such as a software development kit (SDK), etc.
[0221] Various aspects of the present invention are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.
[0222] These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processor of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner; thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.
[0223] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions executed on the computer, other programmable data processing apparatus, or other device to perform the functions / actions specified in one or more boxes of a flowchart and / or block diagram.
[0224] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction, which contains one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than those shown in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.
[0225] Example embodiments have been disclosed herein, and while specific terminology has been used, it is for illustrative purposes only and should be construed as such, and is not intended to be limiting. In some instances, it will be apparent to those skilled in the art that features, characteristics, and / or elements described in conjunction with particular embodiments may be used alone, or in combination with features, characteristics, and / or elements described in conjunction with other embodiments, unless otherwise expressly indicated. Therefore, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the invention as set forth in the appended claims.
Claims
1. An AI-based method for automatic repair and recovery of compilation errors in continuous integration (CI), characterized in that the method, executed in a CI system, includes: Monitoring and analysis: when a failure occurs during the compilation and execution phase of the CI process, the compilation error information is captured and analyzed to generate structured error description data; Intelligent diagnosis and repair: the structured error description data is input into a pre-trained artificial intelligence repair model, which diagnoses the root cause of the error based on the data and generates corresponding repair code; Fix code submission: the generated fix code is automatically submitted to the code repository associated with the CI system; Process restart and verification: in response to the submission of the fix code, a new round of the CI compilation process is triggered and executed to compile and verify the fix code; If the compilation fails during the process restart and verification, the steps from monitoring and analysis through process restart and verification are re-executed based on the new compilation failure, forming an automated repair loop, until the compilation is successfully verified during the process restart and verification or the preset maximum number of retries is reached.
2. The method according to claim 1, characterized in that, The monitoring and analysis specifically include: In response to the compilation failure, extract the compilation error information from the log output by the compilation failure; Based on the extracted compilation error information, the error type is identified, and the line number information of the error location is extracted; Based on the line number information, obtain the code snippet containing the erroneous line and its context from the source code; Based on the error type, the line number information, and the code snippet, generate structured error description data that includes error type field, error line number field, related code field, and error reason field.
3. The method according to claim 2, characterized in that, The generated structured error description data is a data object in JSON format; Furthermore, the identified error type is determined from a preset error type set, which includes syntax errors, dependency errors, and type mismatch errors.
4. The method according to claim 1, characterized in that, The intelligent diagnosis and repair step also includes a pre-verification step performed after generating the repair code and before executing the repair code submission step. The pre-verification step includes: Syntax verification: In an isolated environment consistent with the CI system compilation environment, the repair code is compiled or statically checked to verify its syntax correctness; Compatibility analysis: In the isolated environment, a compatibility analysis is performed on the dependencies involved in the fix code; If either the syntax verification or the compatibility analysis fails, the AI repair model is controlled to regenerate the repair code based on the failed result, and the pre-verification step is repeated until both the syntax verification and the compatibility analysis pass or the preset repair generation limit is reached.
5. The method according to claim 1, characterized in that the fix code submission step, in which the generated fix code is automatically submitted to the code repository associated with the CI system, includes: Based on the fix task corresponding to the current compilation failure, create a temporary fix branch in the code repository; Add the repair code generated by the AI repair model in the intelligent diagnosis and repair step to the temporary fix branch; Based on the changes to the fix code and the error message corresponding to the compilation failure, generate a commit record, and push the temporary fix branch containing the commit record to the remote code repository.
6. The method according to claim 5, characterized in that: The temporary repair branch created has a branch name containing a unique error identifier generated based on the compilation failure; And / or, The generated commit record contains a specific prefix in its commit information to identify that the commit was generated by an automated fix.
7. The method according to claim 5, characterized in that, In the process restart and verification steps, the specific implementation of triggering and executing a new round of CI compilation process in response to the submission of the fix code is as follows: In response to the event that the temporary fix branch is pushed to the code repository, a CI compilation and verification process for the temporary fix branch is initiated. The CI compilation and verification process for the temporary repair branch includes: (a) Pull the code for the temporary fix branch from the code repository; (b) Compile the fetched code; (c) Post-processing based on compilation results: If compilation is successful, a notification of successful repair is generated; if compilation fails, the monitoring and parsing, intelligent diagnosis and repair, repair code submission and process restart and verification steps are re-executed based on the failure information generated by this compilation of the temporary repair branch.
8. The method according to claim 1, characterized in that the artificial intelligence repair model used in the intelligent diagnosis and repair step is obtained as follows: obtaining a training dataset containing multiple sets of historical compilation error information and the corresponding human-verified fix code; using a Transformer-based large language model for code generation as the base model; and performing supervised fine-tuning of the base model on the training dataset to obtain the artificial intelligence repair model, which outputs fix code based on input compilation error information.
9. The method according to claim 8, characterized in that the artificial intelligence repair model supports iterative optimization based on feedback data generated during operation of the CI system, the iterative optimization including: collecting repair instance data that has passed compilation verification, generated by the CI system while executing the AI-based continuous integration compilation error automatic repair and recovery method, wherein the repair instance data includes the compilation error information that triggered the repair and the corresponding AI-generated, verified fix code; using the collected repair instance data as incremental training data to incrementally train the currently deployed artificial intelligence repair model and produce an optimized model version; and deploying the optimized model version to the CI system for subsequent compilation error repair.
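The data collection part of claim 9 can be illustrated with a short sketch that filters repair instances down to those that passed compilation verification and converts them into training records. The record layout (`prompt` / `completion`), the JSON Lines serialization, the deduplication step, and all names below are assumptions for illustration; the claims do not prescribe a data format, and the fine-tuning itself is out of scope here.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairInstance:
    error_info: str    # compilation error information that triggered the repair
    fix_code: str      # fix code generated by the model
    compiled_ok: bool  # True only if the fix passed CI compilation verification


def to_training_records(instances: list[RepairInstance]) -> list[dict]:
    """Keep verified, non-duplicate instances as prompt/completion pairs."""
    seen = set()
    records = []
    for inst in instances:
        if not inst.compiled_ok:
            continue  # unverified fixes must not enter the training data
        key = (inst.error_info, inst.fix_code)
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        records.append({"prompt": inst.error_info, "completion": inst.fix_code})
    return records


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

The same record layout could also hold the human-verified pairs of claim 8, so the initial and incremental datasets stay interchangeable.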
10. The method according to claim 1, characterized in that the method further includes environment consistency management to ensure that the CI compilation process is executed in a deterministic environment, wherein the environment consistency management includes: providing and maintaining a standardized compilation environment definition for the projects managed by the CI system; and, when the compilation operation is performed in the monitoring and analysis step or the process restart and verification step, creating or reusing a corresponding isolated compilation environment based on the standardized compilation environment definition in order to perform the compilation; wherein the standardized compilation environment definition includes a configuration file for building a container image, and the isolated compilation environment is a container instance created from the container image built from that configuration file.
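The create-or-reuse behaviour of claim 10 can be sketched under the assumption that the container tooling is Docker and the configuration file is a Dockerfile; the claims name neither. The image is tagged with a hash of the configuration file content, so an unchanged definition maps to an already built image. The tag scheme, the `/workspace` mount point, and the class and function names are hypothetical. The sketch only plans the command lines; it does not invoke Docker.

```python
import hashlib


def image_tag(project: str, dockerfile_text: str) -> str:
    """Tag derived from the definition content; project must be lowercase."""
    digest = hashlib.sha256(dockerfile_text.encode("utf-8")).hexdigest()[:12]
    return f"{project}-build:{digest}"


def build_command(tag: str, dockerfile_path: str, context_dir: str = ".") -> list[str]:
    """Command that builds the container image from the configuration file."""
    return ["docker", "build", "-t", tag, "-f", dockerfile_path, context_dir]


def run_command(tag: str, source_dir: str, compile_cmd: str) -> list[str]:
    """Command that compiles inside a throwaway container instance."""
    # source_dir should be an absolute path so Docker treats it as a bind mount.
    return ["docker", "run", "--rm", "-v", f"{source_dir}:/workspace",
            "-w", "/workspace", tag, "sh", "-c", compile_cmd]


class EnvironmentCache:
    """Remembers which image tags were already planned for building."""

    def __init__(self) -> None:
        self._built: set[str] = set()

    def plan(self, project: str, dockerfile_text: str, dockerfile_path: str,
             source_dir: str, compile_cmd: str) -> list[list[str]]:
        """Return the commands needed to compile in the standardized environment."""
        tag = image_tag(project, dockerfile_text)
        commands = []
        if tag not in self._built:
            # Definition is new or changed: build the image first.
            commands.append(build_command(tag, dockerfile_path))
            self._built.add(tag)
        commands.append(run_command(tag, source_dir, compile_cmd))
        return commands
```

Because the tag changes whenever the configuration file changes, the initial compilation and every repair retry run in the same image as long as the definition stays the same.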