An agent evolution system, method, electronic device and storage medium
By constructing a collaborative architecture between the metacognitive layer and the agent execution layer, the reasoning process and execution results of the agent are captured natively, enabling the agent to perform autonomous optimization. This solves the problem of the agent's lack of self-optimization, improves the accuracy and efficiency of task execution, and adapts to complex software development scenarios.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CARBON HARMONY TECH (SHANGHAI) CO LTD
- Filing Date
- 2026-04-10
- Publication Date
- 2026-06-12
AI Technical Summary
In existing IDE and agent integration solutions, agents lack self-optimization mechanisms, LLM request interactions are cumbersome and responses are fragmented, making it difficult to improve the execution accuracy and efficiency of agents, and failing to meet the needs of efficient collaboration in complex software development scenarios.
A collaborative architecture is constructed between the metacognitive layer and the agent execution layer. The metacognitive layer natively captures the reasoning process and execution results of the agent. Through learning and reflecting on the results, the agent can autonomously evolve, including observation, reflection, and upgrade modules, to achieve autonomous optimization of the agent.
Simplify the LLM request interaction process, improve request processing efficiency, achieve efficient linkage between the agent and the IDE, accurately generate learning and reflection results, significantly improve the accuracy and efficiency of agent task execution, and adapt to complex software development scenarios.
Smart Images

Figure CN122197948A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of computer technology, and more specifically, to an intelligent agent evolution system, method, electronic device, and storage medium. Background Technology
[0002] With the deep integration of artificial intelligence and software development technologies, agents based on Large Language Models (LLMs) have been widely used in Integrated Development Environments (IDEs) to assist developers in writing, debugging, and decomposing tasks, significantly improving software development efficiency. In existing IDE and agent integration solutions, agents typically run as independent modules, interacting with the IDE and LLM through external interfaces to achieve basic task execution functions.
[0003] However, existing technologies have obvious drawbacks: agents lack effective evolutionary mechanisms and cannot achieve self-optimization based on their own execution behavior; the interaction mode of forwarding LLM requests is cumbersome, and the LLM response is isolated from the agent's execution process, making it impossible to achieve native data linkage; IDEs cannot directly capture the agent's reasoning process and execution behavior, and can only obtain post-event data through additional integration of monitoring SDKs or external platforms, which has latency and limitations.
[0004] Meanwhile, in existing systems, the IDE and agent execution layer are independent of each other and lack a collaborative interaction mechanism. The LLM request forwarding efficiency is low, the agent cannot make full use of the IDE context information to optimize the execution process, and the IDE cannot achieve accurate evolution of the agent based on the agent's execution data. As a result, it is difficult to improve the execution accuracy and efficiency of the agent, and it cannot meet the high-efficiency collaboration requirements in complex software development scenarios. A new type of system is urgently needed to solve the above problems. Summary of the Invention
[0005] In view of this, the purpose of this application is to provide an intelligent agent evolution system, method, electronic device and storage medium that effectively solves the problems of existing IDEs lacking in-depth observation and learning of intelligent agents' behavior and the lack of secure self-evolution of intelligent agents.
[0006] In a first aspect, embodiments of this application provide an agent evolution system, the evolution system comprising a metacognitive layer and an agent execution layer: The agent execution layer is used to control the agent to execute the plan to be executed; The metacognitive layer is used to natively capture and analyze the reasoning process, tool calls, and execution results generated by the agent in executing the plan to be executed, and to obtain learning and reflection results to evolve the agent from the target dimension.
[0007] In conjunction with the first aspect, this application provides a first possible implementation of the first aspect, wherein the metacognitive layer is used to natively capture and analyze the reasoning process, tool calls, and execution results generated by the agent executing the plan to be executed, to obtain learning and reflection results to evolve the agent from the target dimension, including at least one of the following features: The observation module is used to natively capture the inference process, tool calls, and execution results of the agent based on a preset execution mode; the preset execution mode includes at least one of the following: single agent mode and multi-agent topology mode; The reflection module is used to reflect on the reasoning process, tool calls, and execution results to identify success and failure patterns, and to generate learning reflection results. An upgrade module is used to upgrade the scripts of the agent based on the learning and reflection results; The verification module is used to verify the evolution of the agent after the upgrade script through a controlled variable experimental verification mechanism.
[0008] In conjunction with the first aspect, this application provides a second possible implementation of the first aspect, wherein the reflection module is used to reflect on success patterns and failure patterns from the reasoning process, tool calls, and execution results, and to form a learning reflection result, including at least one of the following features: A success pattern recognition submodule is used to extract best practices from the success patterns; The failure mode identification submodule is used to extract the failure reason from the failure mode; The efficiency analysis submodule is used to analyze the execution efficiency of the intelligent agent, identify redundant steps, and optimize opportunities. The learning outcome generation submodule is used to generate learning reflection results based on success patterns, failure patterns, and efficiency analysis results.
[0009] In conjunction with the first aspect, this application provides a third possible implementation of the first aspect, wherein the upgrade module is used to upgrade the script of the agent based on the learning reflection results, including at least one of the following features: The script change generation submodule is used to generate change suggestions for the agent script based on the learning and reflection results; The change review submodule is used to evaluate the impact of the proposed changes, including the risks and benefits, and to review the proposed changes. The version submission submodule is used to submit approved change suggestions to the version control module and create new agent script versions. The Change Notification submodule is used to generate change messages based on the content and effects of the changes, in order to provide change reminders.
[0010] In conjunction with the first aspect, this application provides a fourth possible implementation of the first aspect, wherein the observation module is used to natively capture the inference process, tool calls, and execution results of the agent based on a multi-agent topology, including at least one of the following features: The inference process capture submodule is used to natively capture the LLM invocation process of the agent; the invocation process includes at least one of the following: input message, output content, tool invocation information, and inference link; The execution behavior recording submodule is used to record the execution behavior of the intelligent agent in executing the plan to be executed; the execution behavior includes task decomposition, tool selection, parameter construction, execution results, and error handling; The context snapshot submodule is used to save context snapshots at key execution nodes; the context snapshots include the metacognitive layer state, agent state, and environment state. The performance metrics acquisition submodule is used to collect the performance metrics of the intelligent agent; the performance metrics include task completion rate, execution time, resource consumption, and user satisfaction.
[0011] In conjunction with the first aspect, this application provides a fifth possible implementation of the first aspect, wherein the agent execution layer, used to control the agent to execute the plan to be executed, includes at least one of the following features: The source identification module is used to identify the source to which the plan to be executed belongs, and generate an LLM request based on the source. The request encapsulation module is used to encapsulate the LLM request according to the target protocol format, and after encapsulation, send the LLM request to the metacognitive layer through the communication adaptation module; The response parsing module is used to receive and parse the LLM response based on the LLM request from the metacognitive layer. The control execution module is used to respond to the LLM response and execute the plan to be executed.
[0012] In conjunction with the first aspect, this application provides a sixth possible implementation of the first aspect, wherein the control execution module, configured to execute the plan to be executed in response to the LLM response, includes at least one of the following features: The audit request module is used to generate an audit confirmation request before performing key operations and to send the audit confirmation request to the metacognitive layer. The execution continuation module is used to receive real-time intervention instructions from the metacognitive layer and to report the execution status back to the metacognitive layer.
[0013] In conjunction with the first aspect, this application provides a seventh possible implementation of the first aspect, wherein the metacognitive layer further includes at least one of the following features: The review and confirmation module is used to review key operations based on the received review and confirmation requests according to the preset review strategy, and generate review results. The intervention generation module is used to generate real-time intervention instructions based on the audit results and all permissions and capabilities of the intelligent agent execution layer, or based on the anomalies identified by the observation module.
[0014] In conjunction with the first aspect, this application provides an eighth possible implementation of the first aspect, wherein the metacognitive layer further includes at least one of the following features: The request receiving module is used to receive the LLM request sent by the agent execution layer based on the communication adaptation module; The response generation module is used to generate an LLM response based on the LLM request and to feed the LLM response back to the agent execution layer.
[0015] In conjunction with the first aspect, this application provides a ninth possible implementation of the first aspect, wherein the response generation module is configured to generate an LLM response based on the LLM request, including at least one of the following features: The information extraction module is used to extract the agent context information in the LLM request and fuse the agent context information with the IDE context information obtained from the working area of the metacognitive layer. The response review module is used to integrate the obtained context information into the LLM request and review the LLM request after the input is completed to generate the corresponding LLM response.
[0016] Secondly, embodiments of this application provide an agent evolution method, the evolution method comprising: The agent execution layer controls the agent to execute the plan to be executed; The metacognitive layer natively captures and analyzes the reasoning process, tool calls, and execution results generated by the agent in executing the plan to be executed, and obtains learning and reflection results to evolve the agent from the target dimension.
[0017] Thirdly, embodiments of this application provide an electronic device, including: a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device is running, the processor communicates with the memory via the bus. When the machine-readable instructions are executed by the processor, the steps of the intelligent agent evolution method are performed.
[0018] Fourthly, embodiments of this application provide a computer-readable storage medium storing a computer program, which, when executed by a processor, performs the steps of the intelligent agent evolution method.
[0019] This application provides an intelligent agent evolution system, comprising a metacognitive layer and an intelligent agent execution layer. The intelligent agent execution layer controls the intelligent agent to execute a plan to be executed. The metacognitive layer natively captures and analyzes the reasoning process, tool calls, and execution results generated by the intelligent agent in executing the plan to be executed, obtaining learning and reflection results to evolve the intelligent agent from the target dimension. This application has significant beneficial effects: First, by constructing a collaborative architecture between the metacognitive layer and the intelligent agent execution layer, the interaction process of LLM requests is simplified, request processing efficiency is improved, and efficient linkage is achieved. Second, the metacognitive layer can natively capture the reasoning process, tool calls, and execution results generated by the intelligent agent execution layer without the need for additional monitoring tools, avoiding data delays and privacy leakage risks, accurately generating learning and reflection results, and realizing the autonomous evolution of the intelligent agent. Third, by establishing a collaborative processing mechanism between the metacognitive layer and the intelligent agent execution layer for the plan to be executed, the intelligent agent execution layer can optimize the execution logic based on the feedback from the metacognitive layer, significantly improving the accuracy and efficiency of intelligent agent task execution, adapting to complex software development scenarios, reducing the workload of developers, and promoting the upgrade of the intelligent agent. Attached Figure Description
[0020] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.
[0021] Figure 1 This paper shows a structural block diagram of an intelligent agent evolution system provided in an embodiment of this application; Figure 2 This paper illustrates an overall architecture diagram of an intelligent agent evolution system provided in an embodiment of this application. Figure 3 A schematic diagram of the architecture for agent upgrades provided in an embodiment of this application is shown; Figure 4 A flowchart illustrating an agent evolution method provided in an embodiment of this application is shown. Figure 5 A structural block diagram of an electronic device provided in an embodiment of this application is shown. Detailed Implementation
[0022] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. It should be understood that the accompanying drawings in this application are for illustrative and descriptive purposes only and are not intended to limit the scope of protection of this application. Furthermore, it should be understood that the schematic drawings are not drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of this application. It should be understood that the operations in the flowcharts may not be implemented in sequence, and steps without logical contextual relationships may be reversed or implemented simultaneously. In addition, those skilled in the art, guided by the content of this application, may add one or more other operations to the flowcharts, or remove one or more operations from the flowcharts.
[0023] Furthermore, the described embodiments are merely some, not all, of the embodiments of this application. The components of the embodiments of this application described and illustrated herein can typically be arranged and designed in various different configurations. Therefore, the following detailed description of the embodiments of this application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely to illustrate selected embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without inventive effort are within the scope of protection of this application.
[0024] It should be noted that the term "comprising" will be used in the embodiments of this application to indicate the presence of the features declared thereafter, but does not exclude the addition of other features.
[0025] With the widespread adoption of Large Language Models (LLMs) in software development, intelligent agents have been extensively embedded in Integrated Development Environments (IDEs). However, in existing solutions, the IDE and the agent's execution layer are independent of each other, LLM request forwarding is cumbersome, the IDE cannot natively capture the agent's reasoning process and needs to rely on third-party tools to obtain lagging data, the agent lacks an autonomous evolution mechanism, and it is difficult to meet the needs of efficient development.
[0026] Based on this, embodiments of this application provide an intelligent agent evolution system, method, electronic device, and storage medium, which are described below through embodiments.
[0027] Example 1 To facilitate understanding of this embodiment, a detailed description of an intelligent agent evolution system disclosed in this application embodiment will be provided first. For example... Figure 1 The diagram shows a structural block diagram of an intelligent agent evolution system, which can be executed on a terminal device. The intelligent agent evolution system provided in this application includes a metacognitive layer and an intelligent agent execution layer. The agent execution layer 101 is used to control the agent to execute the plan to be executed; The metacognitive layer 102 is used to natively capture and analyze the reasoning process, tool calls, and execution results generated by the agent in executing the plan to be executed, and to obtain learning and reflection results to evolve the agent from the target dimension.
[0028] like Figure 2 As shown, the agent execution layer includes a task parsing module. This module processes received tasks, including complex tasks such as code generation, code refactoring, debugging, and data analysis, and generates an execution plan based on the task. The execution plan contains an identifier of the task source. The task parsing module parses the execution plan to clarify the core usage requirements, execution goals, required resources, and inference support requirements. It also removes invalid information and supplements the agent context information related to the execution plan, such as dialogue history, tool usage records, and memory states. The execution plan is broken down into clear, step-by-step execution plans, ensuring that each execution plan has clear execution boundaries and inference requirements. Subsequently, based on the specific execution logic of each execution plan and the required inference support from the Large Language Model (LLM), such as code logic derivation, tool call suggestions, and exception handling schemes, agents such as OPEN CLAW and the Large Language Model (LLM) are controlled to execute multiple execution plans of the execution plan, thereby realizing the processing and execution of the execution plan. In addition, to protect data security, the intelligent agent execution layer is equipped with an external protection layer to prevent external data from intruding and causing leakage of claudecode, LLM, OPEN CLAW, etc., thereby ensuring the smooth execution of the execution plan.
[0029] The metacognitive layer natively captures the entire process of the agent's execution layer executing the plan to be executed. It categorizes, sorts, and deeply analyzes the captured reasoning process, tool calls, and execution results, accurately identifies the success and failure modes in the agent's execution process, summarizes the advantages and disadvantages, and forms standardized learning reflection results. Based on the learning, it upgrades from the preset target dimension, namely the script dimension. The script dimension is selected from the parameter dimension, architecture dimension, and large language model LLM dimension. The script dimension is chosen because script upgrades naturally support the control variable experimental verification mechanism set by the metacognitive layer, such as A / B experimental verification, which can accurately evaluate the effect of each upgrade. This enables targeted optimization and adjustment of the agent's reasoning process, execution logic, and tool call methods, achieving autonomous, controllable, and efficient evolution of the agent and helping the agent continuously adapt to the needs of IDE development scenarios. The reasoning process, such as the agent outputting "I need to analyze the current code structure first, and then determine the file that needs to be modified", or "I need to modify the main function in the app.py file and add error handling logic", includes the name of the tool called by the agent, the parameters, and the execution result. For example, the agent may call the "read_file" tool with the parameter "app.py" and return the file content.
[0030] The metacognitive layer is built upon an integrated development environment (IDE) and serves as the LLMProvider module for the agent execution layer, forming the source of LLM capabilities for agents within the execution layer. Unlike Agent Middleware, which requires agent framework support, the metacognitive layer is framework-independent. Furthermore, compared to Agent Observability, which is primarily used for post-event analysis, lacks real-time intervention capabilities, stores data on third-party platforms, and requires SDK or callback processor integration during use, the metacognitive layer requires no SDK integration, allows for real-time intervention during agent execution, and stores data locally within the IDE, offering better privacy. The IDE metacognitive layer possesses all permissions and capabilities of the agent execution layer, including at least one of the following: access to the agent execution layer's working directory, file system, and project resources; reading the agent execution layer's configuration information, environment variables, and runtime status; invoking tools and external resources accessible to the agent execution layer; modifying the agent execution layer's script files, configuration files, prompt word templates, system settings, and runtime parameters; and inserting, intercepting, enhancing, or terminating operations during the agent execution layer's execution.
[0031] In some embodiments, the agent execution layer is used to control the agent to execute the plan to be executed, and includes at least one of the following features: The source identification module is used to identify the source to which the plan to be executed belongs, and generate an LLM request based on the source. The request encapsulation module is used to encapsulate the LLM request according to the target protocol format, and after encapsulation, send the LLM request to the metacognitive layer through the communication adaptation module; The response parsing module is used to receive and parse the LLM response based on the LLM request from the metacognitive layer. The control execution module is used to respond to the LLM response and execute the plan to be executed.
[0032] In this embodiment, the source identification module of the agent execution layer identifies the source of the plan to be executed. The source is determined by an identifier, which can identify whether the source is input by the user through the human-computer interaction module of the agent execution layer or issued by the metacognitive layer. If the plan to be executed is input by the user through the human-computer interaction module of the agent execution layer according to usage needs, the agent execution layer generates an LLM request after generating an execution plan based on the plan to be executed. The LLM request adopts a structured format and includes a request ID, timestamp, description of the plan to be executed, agent context information, and tool call requirements, etc., and is encapsulated by the request encapsulation module according to the target protocol format. An LLM request is encapsulated and, after encapsulation, sent to the metacognitive layer via the communication adaptation module between the agent execution layer and the metacognitive layer. A response parsing module receives and parses the LLM response from the metacognitive layer based on the LLM request to obtain the text content and tool call information in the LLM response. That is, the metacognitive layer generates an LLM response after processing the LLM request and sends it back to the agent execution layer via the communication adaptation module. The agent execution layer then responds to the LLM response through the control execution module, executes the plan to be executed, and during execution, coordinates with the tool call module to process the plan to be executed. The communication adaptation module establishes a communication connection with the metacognitive layer, specifically supporting the following communication methods: synchronous communication, including function calls, HTTP requests, WebSocket connections, and remote procedure calls; asynchronous communication, including message queues, event buses, and publish-subscribe patterns; and persistent communication, including shared storage, file exchange, and database read / write. The communication adaptation module uses a shared storage module as a communication bridge between the agent execution layer and the metacognitive layer, achieving temporal decoupling of requests and responses and persistent storage. The shared storage module is implemented using an embedded database, supporting concurrent access and transaction processing, ensuring that requests are not lost and are traceable, and supporting multi-turn dialogues in high-concurrency scenarios.
[0033] In some embodiments, the metacognitive layer further includes at least one of the following features: The request receiving module is used to receive the LLM request sent by the agent execution layer based on the communication adaptation module; The response generation module is used to generate an LLM response based on the LLM request and to feed the LLM response back to the agent execution layer.
[0034] In this embodiment, specifically: the metacognitive layer first receives the LLM request sent by the agent execution layer based on the communication adaptation module based on the request receiving module, and performs deep parsing of the LLM request based on the response generation module to extract the description of the execution plan, agent context information and tool call requirements. Then, combined with the context information inherent in the integrated development environment (IDE) itself, including the content of currently opened files, code structure, project dependencies and developer preferences, it calls the built-in LLM-related processing capabilities to generate an LLM response that fits the task requirements and adapts to the IDE development scenario, and feeds back the generated LLM response to the agent execution layer through the communication adaptation module.
[0035] In some embodiments, the response generation module is configured to generate an LLM response based on the LLM request, including at least one of the following features: The information extraction module is used to extract the agent context information in the LLM request and fuse the agent context information with the IDE context information obtained from the working area of the metacognitive layer. The response review module is used to integrate the obtained context information into the LLM request and review the LLM request after the input is completed to generate the corresponding LLM response.
[0036] In this embodiment, the metacognitive layer includes an information extraction module and a response review module. The information extraction module extracts agent context information from the LLM request, including task description, dialogue history, tool call requirements, and required agents. Simultaneously, it retrieves the IDE context information of the current project from the local workspace of the metacognitive layer, including open files, code structure, project dependencies, compilation configuration, and developer preferences. Currently open files include file path, file content, cursor position, and selected code segment; code structure includes function definitions, class definitions, variable declarations, and import statements; project dependencies include dependency configuration files such as package.json, requirements.txt, and pom.xml; and developer preferences include code style, indentation settings, and shortcut key configuration. The two types of contexts are semantically aligned, deduplicated, and logically integrated. For example, if the agent context information already includes a file reference, the IDE context information will not repeat the content of that file, thus achieving deep fusion of the agent context information and the IDE context information. The fused context information is then injected back into the LLM. The request process generates an enhanced request with complete information and a suitable scenario. Finally, the injected LLM request undergoes compliance, completeness, and rationality audits, verifying context consistency, parameter standardization, and task feasibility. Upon approval, the LLM calling submodule invokes the LLM model interface to generate a matching LLM response. The LLM response, in a structured format, includes the audit results of the LLM request, the execution plan and tool invocation suggestions, the response ID, the corresponding request ID, a timestamp, generated content, and status information.
[0037] The metacognitive layer matches target instances from the large language model instance library maintained by the multi-instance management submodule based on the task type and complexity in the LLM request. This large language model instance library stores multiple different LLM instances, each corresponding to a different LLM model, such as a code-specific model or a general inference model, or different configuration parameters, such as response accuracy, response speed, and cost threshold. The metacognitive layer first categorizes the task type in the LLM request, such as code generation, code debugging, logical reasoning, and data analysis, while quantitatively evaluating task complexity, such as simple code completion, complex project refactoring, and multi-step inference. Then, based on preset matching rules, it precisely compares the task type and complexity with the suitable scenarios and performance parameters of each instance in the instance library, selecting the target instance with the best suitability. Once the matching is complete... Subsequently, based on the large language model corresponding to the target instance, the LLM request that has been injected with fused context information and passed the review is subjected to in-depth processing and review. The review includes the completeness of the context information in the request, the adaptability of the task requirements and model capabilities, the standardization of the request format, and various performance parameters of the required large language model LLM. If other aspects pass the review but the performance parameters do not match, the performance parameters or template parameters can be directly changed to match. After the review is passed, the LLM interface corresponding to the large language model LLM in the target instance is called. Combined with the fused context information, an LLM response that fits the task requirements, adapts to the IDE development scenario, has rigorous logic, and can be directly used for agent execution is generated. This provides accurate and efficient inference support for the agent execution layer to carry out subsequent execution plans based on the response.
[0038] In some embodiments, the control execution module, configured to execute the execution plan in response to the LLM response, includes at least one of the following features: The audit request module is used to generate an audit confirmation request before performing key operations and to send the audit confirmation request to the metacognitive layer. The execution continuation module is used to receive real-time intervention instructions from the metacognitive layer and to report the execution status back to the metacognitive layer.
[0039] In this embodiment, the intelligent agent execution layer, after parsing the LLM response based on the response parsing module, executes the execution plan. During execution, the audit request module generates an audit confirmation request before critical operations such as file modification, configuration change, parameter modification, and high-risk deletion operations, and sends the audit confirmation request to the metacognitive layer. Upon receiving a real-time intervention instruction from the metacognitive layer based on the audit confirmation request, the execution continuation module parses the real-time intervention instruction and continues to execute the execution plan based on the real-time intervention instruction, and feeds back the execution status generated based on the real-time intervention instruction to the metacognitive layer.
[0040] In some embodiments, the metacognitive layer further includes at least one of the following features: The review and confirmation module is used to review key operations based on the received review and confirmation requests according to the preset review strategy, and generate review results. The intervention generation module is used to generate real-time intervention instructions based on the audit results and all permissions and capabilities of the intelligent agent execution layer, or based on the anomalies identified by the observation module. In this embodiment, after receiving the review confirmation request, the metacognitive layer confirms the key operation to be performed by the agent. If the review confirmation request is approved, the review result is passed and fed back to the agent execution layer to continue executing the key operation; otherwise, it is rejected. Based on the review result and all permissions and capabilities of the agent execution layer possessed by the metacognitive layer, a real-time intervention instruction is generated. The metacognitive layer also identifies anomalies based on the intervention generation module's observation of the agent execution layer's execution process. For different types of anomalies, the intervention execution module generates corresponding intervention instructions. These intervention instructions include injecting supplementary information, correcting execution parameters, or terminating the current task. For example, for inference logic contradictions, an intervention instruction of "adjusting inference strategy and re-organizing decision-making basis" is generated; for excessive execution time, an intervention instruction of "simplifying redundant steps and calling efficient tools" is generated; for tool call errors, an intervention instruction of "correcting tool parameters and replacing adapted tools" is generated. The key operations or anomalies of the agent are intervened based on these intervention instructions.
[0041] In some embodiments, the metacognitive layer is used to natively capture and analyze the reasoning process, tool calls, and execution results generated by the agent executing the execution plan, to obtain learning and reflection results to evolve the agent from the target dimension, including at least one of the following features: The observation module is used to natively capture the inference process, tool calls, and execution results of the agent based on a preset execution mode; the preset execution mode includes at least one of the following: single agent mode and multi-agent topology mode; The reflection module is used to reflect on the reasoning process, tool calls, and execution results to identify success and failure patterns, and to generate learning reflection results. An upgrade module is used to upgrade the scripts of the agent based on the learning and reflection results; The verification module is used to verify the evolution of the agent after the upgrade script through a controlled variable experimental verification mechanism.
[0042] In this embodiment, the agent execution layer has a built-in preset execution mode, which includes at least one of the following: a single-agent mode and a multi-agent topology mode. That is, the agent execution layer supports single-agent execution of the execution plan, and can also execute the execution plan based on a multi-agent topology mode. Specifically, the multi-agent topology mode can be a parallel topology, a serial topology, a hierarchical topology, or a cooperative topology. In practical use, the agent execution layer can select one of the single-agent mode and the multi-agent topology mode to perform inference, tool invocation, and generate execution results. The metacognitive layer, as a unified inference service endpoint, supports serving multiple agent instances simultaneously, and through the observation module, it monitors the inference process, tool invocation, and execution results under the selected mode from the single-agent mode and the multi-agent topology mode. The execution results are centrally captured natively. Based on the reflection module, the captured data is categorized, sorted, and deeply analyzed to identify success and failure patterns in the reasoning process, tool calls, and execution results. This results in standardized learning and reflection outcomes. Success patterns include the agent's efficient reasoning path for completing tasks, reasonable tool call logic, decision-making methods adapted to the IDE context, and efficient step progression rhythm. Reusable execution experiences are also extracted. Failure patterns cover issues such as contradictory reasoning logic, incorrect tool call adaptation, redundant or missing steps, insufficient utilization of the IDE context, and inadequate exception handling. The scenarios, causes, and scope of impact of these problems are clearly identified, ensuring that the learning and reflection outcomes include both experience summaries and problem localization, providing accurate basis for agent script upgrades. Based on the learning and reflection results, the upgrade module performs targeted upgrades to the agent's scripts. It optimizes the script's inference strategy, execution logic, and tool invocation rules by combining experience extracted from successful patterns. For problems corresponding to failure patterns, it modifies vulnerabilities, redundancies, or unreasonable logic in the script, improves the exception handling mechanism and IDE context adaptation logic, ensuring that the upgraded script can avoid original defects and reuse efficient experience. After the script upgrade is completed, the metacognitive layer uses a pre-set control variable experiment verification mechanism in the verification module to perform multi-scenario verification on the upgraded agent. This control variable experiment verification mechanism includes an A / B test verification mechanism, simulating task scenarios of different complexities and types, allowing the upgraded agent to execute corresponding tasks, synchronously capturing its inference process data and execution behavior, comparing the execution effects before and after the upgrade, and quantitatively evaluating from preset target dimensions (including core dimensions such as task execution efficiency, execution accuracy, inference logic rationality, tool invocation adaptability, and exception handling capabilities) to verify the effectiveness of the upgraded script. This ensures that the upgraded agent achieves optimization and improvement in all target dimensions, truly realizing the agent's co-evolution and helping the agent continuously adapt to the needs of IDE development scenarios.
[0043] The A / B testing verification mechanism includes the following steps: designing an A / B test plan, including experimental group, control group, evaluation indicators, and experimental period; assigning user requests to the experimental group or control group to ensure the statistical validity of the experiment; collecting experimental data and evaluating the effect of the upgraded agent script; and deciding whether to officially release the upgraded agent script based on the effect evaluation results.
[0044] In some embodiments, the reflection module is used to reflect on success and failure patterns from the reasoning process, tool calls, and execution results, and to form a learning reflection result, including at least one of the following features: A success pattern recognition submodule is used to extract best practices from the success patterns; The failure mode identification submodule is used to extract the failure reason from the failure mode; The efficiency analysis submodule is used to analyze the execution efficiency of the intelligent agent, identify redundant steps, and optimize opportunities. The learning outcome generation submodule is used to generate learning reflection results based on success patterns, failure patterns, and efficiency analysis results.
[0045] In this embodiment, the metacognitive layer includes a success pattern recognition submodule and a failure pattern recognition submodule. These two submodules analyze a large number of historical execution cases based on natively captured inference process data, tool calls, and execution results. From complex inference step sequences, they accurately identify best practices and reasons for failure. Best practices refer to the optimal execution path exhibited by the agent when completing a specific type of execution plan, characterized by streamlined steps, logical coherence, reasonable tool calls, short execution time, and accurate results. Reasons for failure refer to execution paths exhibited by the agent during execution, such as redundant steps, logical detours, repeated attempts, tool call conflicts, or insufficient utilization of the IDE context. The efficiency analysis submodule analyzes the agent's execution efficiency, identifies redundant steps and optimization opportunities, and generates efficiency analysis results. The learning result generation submodule generates learning reflection results based on the extracted success patterns, failure patterns, and the generated efficiency analysis results.
[0046] Based on the identified best practices, this application's path recognition module standardizes and structures these practices, refining and generating optimal inference templates. These optimal inference templates not only solidify successful execution steps but also include key decision-making logic, tool invocation specifications, and the best application of the IDE context. When the agent's execution layer receives a future time plan similar to a historical execution plan, the path recommendation module automatically triggers a task matching and template recommendation mechanism. By comparing the features, type, and context of the future time plan, it identifies its similarity to historical execution plans. Once a match is successful, the module proactively recommends the optimal inference template to the agent's execution layer. After receiving and referencing the optimal inference template, the agent's execution layer does not execute mechanically but flexibly adapts to the specific context of the current task, generating personalized inference paths. This allows the agent to quickly enter a highly efficient state when facing similar tasks, significantly shortening inference time, reducing trial-and-error costs, and fundamentally improving overall execution efficiency and quality. In some embodiments, the upgrade module is used to upgrade the agent's script based on the learning reflection results, including at least one of the following features: The script change generation submodule is used to generate change suggestions for the agent script based on the learning and reflection results; The change review submodule is used to evaluate the impact of the proposed changes, including the risks and benefits, and to review the proposed changes. The version submission submodule is used to submit approved change suggestions to the version control module and create new agent script versions. The Change Notification submodule is used to generate change messages based on the content and effects of the changes, in order to provide change reminders.
[0047] In this embodiment, the metacognitive layer includes a script change generation submodule, a change review submodule, a version submission submodule, and a change notification submodule, such as... Figure 3As shown, the script change generation submodule generates script change suggestions for the agent based on the learning and reflection results. These suggestions cover multiple dimensions, including optimizing the reasoning process (e.g., adjusting the division of multi-step thinking steps), refactoring execution logic (e.g., simplifying redundant decision branches), updating tool calling rules (e.g., supplementing key parameter verification logic), and enhancing exception handling mechanisms (e.g., covering more boundary exception scenarios). The change review submodule evaluates the effectiveness of the suggested changes and conducts multi-dimensional reviews of the evaluation results. These evaluation dimensions cover several core indicators: from a task execution efficiency perspective, assessing whether the changed script can shorten reasoning steps, reduce tool calls, and lower execution time; from an execution accuracy perspective, assessing whether the change can improve task completion accuracy and reduce logical errors or result deviations; from a resource consumption perspective, assessing the impact of the changed script on the runtime resources of the metacognitive layer and model call costs; and from a scenario adaptability perspective, assessing whether the change can better fit the current integrated development environment (IDE) development context and developer habits. Through multi-dimensional quantitative scoring and comparative analysis, a clear evaluation result of the change effect is generated, clarifying the feasibility and expected benefits of the change proposal. This evaluation result undergoes multi-dimensional review, with each dimension including at least one of the following characteristics: compliance review, verifying whether the change proposal conforms to the development specifications, scripting standards, and model usage rules of the integrated development environment (IDE); rationality review, assessing the matching degree between the change proposal and the learning and reflection results, ensuring that the change direction targets core issues rather than secondary details; risk review, identifying potential chain risks brought about by the change, such as introducing new logical contradictions, reducing system stability, or incompatibility with existing tools; and benefit review, quantitatively evaluating the expected magnitude of the change's improvement on the overall performance of the agent, and selecting change solutions where the benefits outweigh the costs. During the review process, if any dimension fails, the metacognitive layer will backtrack to the reflection result stage to optimize and adjust the change proposal, or re-analyze execution data to supplement more precise improvement directions. If all multi-dimensional reviews pass, a new script version is automatically created through the version submission submodule, the original script version is retained as a historical backup, completing the upgrade operation of the agent script, and relevant parties are notified of the script change through the change notification submodule.
[0048] The metacognitive layer also has things that can evolve along with the intelligent agent, such as skills / mcp. Introducing these tools when the metacognitive layer is thinking enables evolution.
[0049] In some embodiments, the observation module is used to natively capture the inference process, tool calls, and execution results of the agent based on a multi-agent topology, including at least one of the following features: The inference process capture submodule is used to natively capture the LLM invocation process of the agent; the invocation process includes at least one of the following: input message, output content, tool invocation information, and inference link; The execution behavior recording submodule is used to record the execution behavior of the intelligent agent in performing the task to be executed; the execution behavior includes task decomposition, tool selection, parameter construction, execution result and error handling; The context snapshot submodule is used to save context snapshots at key execution nodes; the context snapshots include the metacognitive layer state, agent state, and environment state. The performance metrics acquisition submodule is used to collect the performance metrics of the intelligent agent; the performance metrics include task completion rate, execution time, resource consumption, and user satisfaction.
[0050] In this embodiment, the metacognitive layer natively captures the LLM call process of the agent when executing the plan to be executed through the inference process capture submodule and the execution behavior recording submodule. The call process includes at least one of the following: input message, output content, tool call information and inference chain, and records the execution behavior of the agent in executing the task to be executed. The execution behavior includes task decomposition, tool selection, parameter construction, execution result and error handling. The native capture means that the agent does not need to actively report, integrate monitoring SDK, or deploy an external monitoring platform. The context snapshot submodule saves a context snapshot at key execution nodes, such as when executing key operation nodes. The context snapshot includes the metacognitive layer state, agent state and environment state. The performance index collection submodule collects the performance index of the agent when executing the plan to be executed. The performance index includes task completion rate, execution time, resource consumption and user satisfaction. The performance of the agent on the above performance indexes can be evaluated based on the performance index collection submodule. If the performance is lower than the preset expected performance, the metacognitive layer can further optimize the agent.
[0051] The agent in the agent execution layer and the metacognitive layer of this application can also adopt the same large language model LLM, fully sharing the context window, thereby generating zero-cost data reuse, reducing complexity, enhancing co-evolution, and highlighting the native capture of the agent's reasoning process without any conversion or adaptation. At the same time, the response speed of the metacognitive layer in generating intervention commands is faster, without the need to switch between models.
[0052] Example 2 This application also provides a method for agent evolution, such as Figure 4 The diagram shows a flowchart of an agent evolution method. The effect achieved by executing this agent evolution method on a terminal device corresponds to the effect achieved by the aforementioned agent evolution system. The agent evolution method described in this application includes: S401, The agent execution layer controls the agent to execute the plan to be executed; S402. The metacognitive layer natively captures and analyzes the reasoning process, tool calls, and execution results generated by the agent in executing the plan to be executed, and obtains learning and reflection results to evolve the agent from the target dimension.
[0053] Example 3 This application also provides an electronic device, such as Figure 5 As shown, it includes: a processor 501, a memory 502, and a bus 503. The memory 502 stores machine-readable instructions that can be executed by the processor 501. When the electronic device is running, the processor 501 and the memory 502 communicate through the bus 503. When the machine-readable instructions are executed by the processor 501, the steps of any of the IDE-based intelligent agent evolution methods described above are executed.
[0054] Example 4 This application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, performs the steps of any of the IDE-based intelligent agent evolution methods described above.
[0055] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems and devices described above can be referred to the corresponding processes in the method embodiments, and will not be repeated here. In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. Furthermore, multiple modules or components can be combined or integrated into another system, or some features can be ignored or not executed. Another point is that the displayed or discussed mutual coupling or direct coupling or communication connection can be through some communication interfaces; the indirect coupling or communication connection of devices or modules can be electrical, mechanical, or other forms.
[0056] The modules described as separate components may or may not be physically separate. The components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0057] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0058] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a processor-executable, non-volatile, computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a platform server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, ROM, RAM, magnetic disks, or optical disks.
[0059] The above are merely specific embodiments of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. An intelligent agent evolution system, characterized in that, The evolutionary system comprises a metacognitive layer and an agent execution layer: The agent execution layer is used to control the agent to execute the plan to be executed; The metacognitive layer is used to natively capture and analyze the reasoning process, tool calls, and execution results generated by the agent in executing the plan to be executed, and to obtain learning and reflection results to evolve the agent from the target dimension.
2. The system according to claim 1, characterized in that, The metacognitive layer is used to natively capture and analyze the reasoning process, tool calls, and execution results generated by the agent in executing the plan to be executed, and to obtain learning and reflection results to evolve the agent from the target dimension, including at least one of the following features: The observation module is used to natively capture the inference process, tool calls, and execution results of the agent based on a preset execution mode; The preset execution mode includes at least one of the following: single agent mode and multi-agent topology mode; The reflection module is used to reflect on the reasoning process, tool calls, and execution results to identify success and failure patterns, and to generate learning reflection results. An upgrade module is used to upgrade the scripts of the agent based on the learning and reflection results; The verification module is used to verify the evolution of the agent after the upgrade script through a controlled variable experimental verification mechanism.
3. The system according to claim 2, characterized in that, The reflection module is used to reflect on the success and failure patterns from the reasoning process, tool calls, and execution results, and to form a learning reflection result, including at least one of the following features: A success pattern recognition submodule is used to extract best practices from the success patterns; The failure mode identification submodule is used to extract the failure reason from the failure mode; The efficiency analysis submodule is used to analyze the execution efficiency of the intelligent agent, identify redundant steps, and optimize opportunities. The learning outcome generation submodule is used to generate learning reflection results based on success patterns, failure patterns, and efficiency analysis results.
4. The system according to claim 2, characterized in that, The upgrade module is used to upgrade the script of the agent based on the learning reflection results, and includes at least one of the following features: The script change generation submodule is used to generate change suggestions for the agent script based on the learning and reflection results; The change review submodule is used to evaluate the impact of the proposed changes, including the risks and benefits, and to review the proposed changes. The version submission submodule is used to submit approved change suggestions to the version control module and create new agent script versions. The Change Notification submodule is used to generate change messages based on the content and effects of the changes, in order to provide change reminders.
5. The system according to claim 2, characterized in that, The observation module is used to natively capture the inference process, tool calls, and execution results of the agent based on a multi-agent topology, including at least one of the following features: The inference process capture submodule is used to natively capture the LLM call process of the agent; The invocation process includes at least one of the following: input message, output content, tool invocation information, and inference chain; The execution behavior recording submodule is used to record the execution behavior of the intelligent agent in executing the plan to be executed; the execution behavior includes task decomposition, tool selection, parameter construction, execution results, and error handling; The context snapshot submodule is used to save context snapshots at key execution nodes; the context snapshots include the metacognitive layer state, agent state, and environment state. The performance metrics acquisition submodule is used to collect the performance metrics of the intelligent agent; the performance metrics include task completion rate, execution time, resource consumption, and user satisfaction.
6. The system according to claim 1, characterized in that, The agent execution layer is used to control the agent to execute the plan to be executed, and includes at least one of the following features: The source identification module is used to identify the source to which the plan to be executed belongs, and generate an LLM request based on the source. The request encapsulation module is used to encapsulate the LLM request according to the target protocol format, and after encapsulation, send the LLM request to the metacognitive layer through the communication adaptation module; The response parsing module is used to receive and parse the LLM response based on the LLM request from the metacognitive layer. The control execution module is used to respond to the LLM response and execute the plan to be executed.
7. The system according to claim 6, characterized in that, The control execution module, used to respond to the LLM response and execute the plan to be executed, includes at least one of the following features: The audit request module is used to generate an audit confirmation request before performing key operations and to send the audit confirmation request to the metacognitive layer. The execution continuation module is used to receive real-time intervention instructions from the metacognitive layer and to report the execution status back to the metacognitive layer.
8. The system according to claim 6, characterized in that, The metacognitive layer also includes at least one of the following features: The review and confirmation module is used to review key operations based on the received review and confirmation requests according to the preset review strategy, and generate review results. The intervention generation module is used to generate real-time intervention instructions based on the audit results and all permissions and capabilities of the intelligent agent execution layer, or based on the anomalies identified by the observation module.
9. The system according to claim 1, characterized in that, The metacognitive layer also includes at least one of the following features: The request receiving module is used to receive the LLM request sent by the agent execution layer based on the communication adaptation module; The response generation module is used to generate an LLM response based on the LLM request and feed the LLM response back to the agent execution layer.
10. The system according to claim 9, characterized in that, The response generation module is used to generate an LLM response based on the LLM request, including at least one of the following features: The information extraction module is used to extract the agent context information in the LLM request and fuse the agent context information with the IDE context information obtained from the working area of the metacognitive layer. The response review module is used to integrate the obtained context information into the LLM request and review the LLM request after the input is completed to generate the corresponding LLM response.
11. A method for the evolution of an intelligent agent, characterized in that, The evolutionary method includes: The agent execution layer controls the agent to execute the plan to be executed; The metacognitive layer natively captures and analyzes the reasoning process, tool calls, and execution results generated by the agent in executing the plan to be executed, and obtains learning and reflection results to evolve the agent from the target dimension.
12. An electronic device, characterized in that, include: The device includes a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device is running, the processor communicates with the memory via the bus. When the machine-readable instructions are executed by the processor, they perform the steps of the intelligent agent evolution method as described in claim 11.
13. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program that, when executed by a processor, performs the steps of the agent evolution method as described in claim 11.