A method and device for multi-agent collaborative evolution and knowledge routing that integrates rational and mathematical approaches

A routing coordinator and a graph attention network dynamically mine the collaborative relationships between heterogeneous agents. Combined with a multi-granularity attention mechanism and mutual information maximization learning, this addresses the low collaboration efficiency and poor knowledge sharing caused by model heterogeneity in multi-agent systems, and achieves deep integration and efficient collaboration of rational and mathematical models.

CN122242634A · Pending · Publication Date: 2026-06-19 · TIANJIN DEV ZONE JINGNUOHANHAI DATA TECH CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
TIANJIN DEV ZONE JINGNUOHANHAI DATA TECH CO LTD
Filing Date
2026-05-18
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Multi-agent systems suffer from ineffective collaboration due to model heterogeneity, low knowledge sharing efficiency and a tendency to generate negative transfer, and difficulty in deeply integrating rational and mathematical approaches.

Method used

The system obtains the capability and semantic representations of each agent through a routing coordinator, dynamically mines the implicit associations between heterogeneous nodes using a graph attention network, calculates the cooperation weights, filters the cooperative neighbor set through a sparsity strategy, performs knowledge calibration and weighted aggregation by combining a multi-granularity attention mechanism, and introduces a mutual information maximization contrastive learning strategy to optimize the local model.

Benefits of technology

It achieves efficient collaboration and knowledge sharing among heterogeneous intelligent agents, improves knowledge sharing efficiency, avoids negative transfer, and realizes deep integration of rational and mathematical approaches.

✦ Generated by Eureka AI based on patent content.

Figure CN122242634A_ABST

Abstract

This application discloses a multi-agent collaborative evolution and knowledge routing method and apparatus for fusion of rational and mathematical concepts, belonging to the field of artificial intelligence technology. The method includes: a routing coordinator constructs composite feature representations for each agent based on their capability and semantic representations used for fusion of rational and mathematical concepts; it calculates the collaborative weights between nodes using a graph attention network; and after selecting a set of collaborative neighbors for each agent through a sparsity strategy, it distributes the results to each agent; each agent uses local features as query conditions, calibrates the semantic knowledge from collaborative neighbors in the channel dimension through an attention mechanism, and combines the collaborative weights to perform weighted aggregation of multiple calibrated neighbor features to generate a globally enhanced semantic context representation; then, by maximizing the mutual information between local features and the globally enhanced semantic context representation, it internalizes external consensus knowledge into the local model, and optimizes and updates the parameters of the local model based on the local task objective and mutual information constraints.

Description

Technical Field

[0001] This application relates to the field of artificial intelligence technology, and in particular to a method and apparatus for multi-agent collaborative evolution and knowledge routing that integrates rational and mathematical approaches.

Background Technology

[0002] In multi-agent systems, heterogeneity among agents is a key factor affecting collaborative efficiency. Different agents may be based on heterogeneous model architectures (such as large language models with different parameter scales), possess differentiated knowledge representation methods, and undertake diverse task roles, resulting in a lack of unified semantic alignment and knowledge interaction mechanisms among them. Traditional multi-agent collaborative frameworks typically assume that all agents share the same model structure or knowledge representation. However, in real-world scenarios, agents may differ significantly in terms of software models, computing power, or hardware storage space and configuration, leading to variations in the model structures they can support. On the one hand, model parameters with mismatched dimensions cannot be directly aggregated; on the other hand, how to achieve effective knowledge sharing while supporting personalized agents becomes a pressing contradiction to be resolved.

[0003] In multi-agent collaborative scenarios, the agents exhibit significant differences in computing power, storage, and network conditions, resulting in a high degree of heterogeneity in the deployable neural network model structures. Differences in the number of model layers, parameter scale, and computing power type render traditional collaborative methods that assume model homogeneity ineffective, and there is a lack of effective alignment and aggregation mechanisms between heterogeneous models. Existing methods have their own limitations in handling heterogeneous problems: some require model homogeneity, others rely on external public datasets, and still others employ coarse-grained knowledge-sharing mechanisms that fail to resolve semantic conflicts caused by misalignment in heterogeneous feature spaces, easily leading to knowledge conflicts and negative transfer.

[0004] Furthermore, existing methods often fail to effectively integrate knowledge from both the "theoretical" and "numerical" levels when dealing with multi-agent collaborative problems. "Theoretical" refers to the deterministic laws of the physical world and professional knowledge obtained from experiments; "numerical" refers to the uncertain data information obtained from statistical analysis of big data in industrial production and data-driven artificial intelligence (AI) models. The integration of theory and mathematics aims to deeply, dynamically, securely, and interpretably combine the deterministic laws of the physical world with the uncertain numbers learned from big data to solve the problem of autonomous, reliable, and adaptive evolution of industrial systems. How to deeply integrate theory and mathematics is a significant challenge currently facing multi-agent systems.

[0005] There are currently no effective solutions to the technical problems existing in the above-mentioned multi-agent systems, such as ineffective collaboration due to model heterogeneity, low efficiency of knowledge sharing and susceptibility to negative transfer, and difficulty in deep integration of rational and mathematical approaches.

Summary of the Invention

[0006] The embodiments of this disclosure provide a method and apparatus for multi-agent collaborative evolution and knowledge routing that integrates rational and mathematical models, so as to at least solve the technical problems existing in the prior art, such as the inability of multi-agent systems to effectively collaborate due to model heterogeneity, low efficiency of knowledge sharing and easy generation of negative transfer, and the difficulty of deep integration of rational and mathematical models.

[0007] According to one aspect of the present disclosure, a multi-agent collaborative evolution and knowledge routing method for fusion of rational and mathematical concepts is provided, comprising: a routing coordinator acquiring capability representations and semantic representations of each agent for fusion of rational and mathematical concepts, and constructing composite feature representations of each agent based on the capability representations and semantic representations; the routing coordinator using all agents for fusion of rational and mathematical concepts as nodes, dynamically mining implicit associations between heterogeneous nodes based on the composite feature representations using a graph attention network, calculating collaboration weights between nodes, and selecting a set of collaborative neighbors for each agent through a sparsity strategy, and distributing the set of collaborative neighbors and corresponding collaboration weights to each agent; each agent for fusion of rational and mathematical concepts receiving semantic knowledge from collaborative neighbors, using local features as query conditions, calibrating the received semantic knowledge in the channel dimension through an attention mechanism, and generating calibrated neighbor features; combining the collaboration weights to perform weighted aggregation of multiple calibrated neighbor features to generate a globally enhanced semantic context representation; each agent for fusion of rational and mathematical concepts internalizing external consensus knowledge into its local model by maximizing the mutual information between its local features and the globally enhanced semantic context representation, and jointly optimizing and updating the parameters of its local model based on local task objectives and mutual information constraints.

[0008] According to another aspect of the present disclosure, a storage medium is also provided, the storage medium including a stored program, wherein, when the program is executed, a processor performs any of the methods described above.

[0009] According to another aspect of the present disclosure, a multi-agent cooperative evolution and knowledge routing device for fusion of rationality and mathematics is also provided, comprising: multiple agents for fusion of rationality and mathematics and a routing coordinator, wherein the routing coordinator is communicatively connected to the multiple agents. The routing coordinator is used to acquire the capability and semantic representations of each agent used for fusion of rational and mathematical information, and to construct a composite feature representation for each agent based on the capability and semantic representations. The routing coordinator is also used to take all agents used for fusion of rational and mathematical information as nodes, and based on the composite feature representations, to dynamically mine implicit associations between heterogeneous nodes using a graph attention network, calculate the cooperation weights between nodes, select a set of cooperative neighbors for each agent through a sparsity strategy, and distribute the set of cooperative neighbors and their corresponding cooperation weights to each agent. Each agent used for fusion of rational and mathematical information receives semantic knowledge from cooperative neighbors, uses local features as query conditions, and calibrates the received semantic knowledge in the channel dimension through an attention mechanism to generate calibrated neighbor features. It then combines the cooperation weights to perform weighted aggregation of the calibrated neighbor features to generate a globally enhanced semantic context representation. Each agent used for fusion of rational and mathematical information internalizes external consensus knowledge into its local model by maximizing the mutual information between local features and the globally enhanced semantic context representation, and optimizes and updates the parameters of the local model based on the local task objective and mutual information constraints.

[0010] The beneficial effects of this application are as follows:

[0011] (1) By using a topology generation mechanism based on dynamic semantics, the physical resource information and actual data of the agents are digitized into capability representations and semantic representations. Graph attention networks are then used to mine implicit associations between heterogeneous nodes in the latent space, dynamically calculating the cooperation weights between nodes. A set of cooperative neighbors is selected for each agent through a sparsity strategy. The cooperative topology can be adaptively adjusted according to the agent's resource budget vector and semantic representation. Under the premise of semantic similarity, nodes with stronger capability representations and higher data quality are given greater cooperation weights, effectively avoiding the interference of small sample noise on the global topology. This enables heterogeneous agents to accurately locate the best cooperative partners, thereby achieving efficient collaboration.

[0012] (2) By using a fusion mechanism based on multi-granularity attention, local features are used as query conditions. The channel-level attention gating network is used to calibrate the channel dimension of the received semantic knowledge, generate channel response masks, and obtain calibrated neighbor features. Then, the calibrated multi-source neighbor features are weighted and aggregated by collaboration weights to generate a globally enhanced semantic context representation. This enables adaptive identification and weighting of key feature channels that are beneficial to the local task, while suppressing noisy channels that conflict with the semantics of the local task. This fundamentally solves the problem of misalignment in heterogeneous feature space and significantly improves the efficiency and quality of knowledge sharing.

[0013] (3) A mutual information maximization contrastive learning strategy is introduced. The mutual information between local features and the global enhanced semantic context representation is maximized through InfoNCE contrastive loss, and the mutual information loss and local task loss are jointly optimized. This drives the local model to learn in the space of global consensus knowledge while maintaining the heterogeneous structure. Gradient guidance enables the local model to learn to extract general semantic features consistent with global consensus, ensuring that the fused knowledge is aligned with the semantics of the local task, effectively avoiding negative transfer, and enhancing the generalization ability and robustness of the local model.

[0014] (4) A deep integration of "theory" and "number" is achieved: In this application, "theory" is reflected in the structural theoretical modeling and strategy constraints of the collaborative process, including the structural design of the capability encoder and semantic projection head, the theoretical modeling of collaborative relationships by the graph attention network, the normalization and sparsification strategies of collaborative weights, and the knowledge internalization mechanism based on maximizing mutual information; "number" is reflected in the dynamic representation and learning of the system's operating state and actual observation information, including the real-time detection data of each agent on local system resources, the semantic prototype vector calculated based on local private data, the learning process of attention coefficients in the graph attention mechanism, the parameter training of the channel-level gating network, and the parameter update process of the local model based on task loss and contrastive loss. The theory and number are deeply integrated in the "topology generation mechanism based on dynamic semantics" and the "fusion mechanism based on multi-granularity attention": On the one hand, the structured theoretical model guides the data-driven feature learning and topology construction, so that the generation of collaborative relationships follows the theoretical optimal strategy and can adapt to the actual data distribution; on the other hand, the data-driven parameter update and feature calibration feed back and realize the co-evolution and knowledge internalization goals preset by the theoretical model. The two work together within a unified framework to achieve efficient collaboration and knowledge internalization among heterogeneous intelligent agents, fully embodying the core idea of "the integration of theory and mathematics".

Attached Figure Description

[0015] The accompanying drawings, which are included to provide a further understanding of this disclosure and form part of this application, illustrate exemplary embodiments of this disclosure and are used to explain this disclosure, but do not constitute an undue limitation of this disclosure. In the drawings:

[0016] Figure 1 is a flowchart of the multi-agent cooperative evolution and knowledge routing method based on the fusion of rationality and mathematics as described in Embodiment 1 of this application;

[0017] Figure 2 is an overall framework diagram of the multi-agent collaborative evolution and knowledge routing method based on the fusion of rationality and mathematical data as described in Embodiment 1 of this application.

Detailed Implementation

[0018] To enable those skilled in the art to better understand the technical solutions of this disclosure, the technical solutions of the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are merely some embodiments of this disclosure, and not all embodiments. Based on the embodiments of this disclosure, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of this disclosure.

[0019] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this disclosure described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0020] Embodiment 1

[0021] According to the first aspect of this embodiment, a multi-agent collaborative evolution and knowledge routing method integrating rationality and mathematics is provided. Figure 1 shows a flowchart of the method. Referring to Figure 1, the method includes:

[0022] S102: The routing coordinator obtains the capability representation and semantic representation of each agent for rational-numerical fusion, and constructs a composite feature representation of each agent based on the capability representation and semantic representation;

[0023] S104: The routing coordinator takes all the agents used for the fusion of rational and mathematical data as nodes, uses the composite feature representation, dynamically mines the implicit associations between heterogeneous nodes using graph attention network, calculates the cooperation weights between nodes, and selects a set of cooperative neighbors for each agent through a sparsity strategy, and distributes the set of cooperative neighbors and the corresponding cooperation weights to each agent.

[0024] S106: Each agent used for the fusion of rational and mathematical information receives semantic knowledge from its cooperative neighbors, uses local features as query conditions, and performs channel dimension calibration on the received semantic knowledge through an attention mechanism to generate calibrated neighbor features; the calibrated neighbor features are then weighted and aggregated in combination with the cooperative weights to generate a global enhanced semantic context representation.

[0025] S108: Each agent used for rational-numerical fusion internalizes external consensus knowledge into the local model by maximizing the mutual information between local features and the global enhanced semantic context representation, and jointly optimizes and updates the parameters of the local model based on the local task objective and mutual information constraints.

[0026] In this embodiment of the invention, when multiple heterogeneous agents (e.g., robots, sensor nodes, or edge computing devices equipped with different scale language models) are required to collaboratively complete complex tasks (such as environmental perception, target recognition, or decision planning) in a distributed environment, the first step is to obtain the capability representations and semantic representations of each agent for theoretical-numerical fusion through a routing coordinator, and then construct composite feature representations for each agent based on these capability and semantic representations (corresponding to step S102). Specifically, each agent locally detects its own system resources (e.g., CPU computing power, memory capacity, communication bandwidth, etc.) and converts them into capability representations that reflect its own computing power and resource constraints. A capability encoder is used in the process of mapping system resources to capability representations. This mapping mechanism embodies the theoretical modeling of the relationship between resources and model complexity, which falls under the category of "theory". At the same time, each agent uses local private data to extract semantic representations that can express its knowledge features (e.g., prototype vectors formed after mapping heterogeneous features to a unified semantic space through a projection head). The process of mapping local heterogeneous features to a shared semantic space relies on a semantic projection head, which is a feature alignment mechanism based on theoretical assumptions, embodying the modeling idea of "theory". The routing coordinator then fuses the capability representation and semantic representation of each agent (e.g., through feature concatenation) to obtain a composite feature representation that can comprehensively describe the agent's capability level and knowledge preferences.

[0027] Then, the routing coordinator, using all agents as nodes, dynamically mines implicit associations between heterogeneous nodes based on the composite feature representation, calculates the cooperation weights between nodes, and selects a set of cooperative neighbors for each agent through a sparsity strategy. The cooperative neighbor set and its corresponding cooperation weights are then distributed to each agent (corresponding to step S104). Specifically, the routing coordinator uses the composite feature representations of all agents as input node features to the graph attention network, adaptively learns the correlation between each pair of agents through a multi-head attention mechanism, generates attention coefficients characterizing their cooperation potential, and obtains the final cooperation weights after normalization. Considering the actual communication bandwidth limitations, the routing coordinator sparsifies the fully connected topology, for example, retaining only the top K neighbors with the highest cooperation weights for each agent, forming its own exclusive set of cooperative neighbors, and distributing the identifiers of these neighbors and their corresponding cooperation weights to each agent. The routing coordinator's use of GAT to mine implicit associations between heterogeneous nodes is essentially a theoretical model of the cooperation relationships between agents, possessing a clear graph structure reasoning mechanism, and falling within the realm of "theory." Furthermore, through normalization and Top-K screening, it reflects the theoretical constraints on communication efficiency and collaboration quality, which is a "theoretical" design.
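The scoring, normalization, and Top-K screening described above can be illustrated with a minimal sketch. It is not the application's implementation: it assumes a single attention head instead of the multi-head mechanism, excludes each agent from its own neighbor set, and renormalizes the retained weights after screening. The function names `cooperation_weights` and `top_k_neighbors` and the parameters `W` and `a` are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def cooperation_weights(features, W, a):
    """Single-head GAT-style scores, softmax-normalized over the other agents.

    features: one composite feature vector per agent.
    W: shared linear projection; a: attention vector over [z_i || z_j].
    Both would be learned in practice; here they are passed in (assumption).
    """
    z = [matvec(W, h) for h in features]
    n = len(z)
    weights = []
    for i in range(n):
        # Attention coefficient for every candidate neighbor j != i.
        scores = {j: leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
                  for j in range(n) if j != i}
        m = max(scores.values())  # subtract the max for numerical stability
        exp = {j: math.exp(s - m) for j, s in scores.items()}
        total = sum(exp.values())
        weights.append({j: e / total for j, e in exp.items()})
    return weights

def top_k_neighbors(weights, k):
    """Keep the k highest-weight neighbors per agent and renormalize (assumption)."""
    result = []
    for row in weights:
        kept = sorted(row.items(), key=lambda kv: kv[1], reverse=True)[:k]
        total = sum(w for _, w in kept)
        result.append({j: w / total for j, w in kept})
    return result
```

Each entry of the returned list maps neighbor identifiers to cooperation weights, which is the information the routing coordinator would distribute to the corresponding agent.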

[0028] Thus, each agent can clearly identify which partners it should interact with in the current task, avoiding blind communication and ineffective collaboration.

[0029] Next, each agent receives semantic knowledge from its collaborating neighbors. Using local features as query conditions, it calibrates the received semantic knowledge in the channel dimension through an attention mechanism, generating calibrated neighbor features. The calibrated neighbor features are then weighted and aggregated using the collaborative weights to generate a globally enhanced semantic context representation (corresponding to step S106). Specifically, because model heterogeneity means that the same channel may encode different semantics in different agents, each agent, after receiving the semantic prototypes of its neighbors, performs channel-level calibration of each neighbor's semantic prototype using local features as query conditions. Through the attention mechanism, the agent learns an importance weight for each channel, thereby suppressing noisy channels that conflict with the local task and highlighting complementary key information, which yields the calibrated neighbor features. Subsequently, the agent uses the collaborative weights issued by the routing coordinator to compute a weighted sum of the calibrated features from multiple neighbors, forming a globally enhanced semantic context representation that integrates collective intelligence, preserving the integrity of the local semantics while absorbing beneficial components of external knowledge.
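The channel-level calibration and weighted aggregation can be sketched as follows. This is an illustration under stated assumptions, not the application's gating network: the gate is taken to be a single linear layer with a sigmoid over the concatenation of the local feature and the neighbor prototype, and neighbor prototypes are assumed to share the local feature dimension because they live in the shared semantic space. The names `calibrate`, `aggregate`, `gate_w`, and `gate_b` are hypothetical.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def calibrate(local, neighbor, gate_w, gate_b):
    """Build a channel response mask from [local || neighbor] and apply it
    element-wise to the neighbor prototype."""
    query = local + neighbor  # list concatenation
    mask = [sigmoid(sum(w * x for w, x in zip(row, query)) + b)
            for row, b in zip(gate_w, gate_b)]
    return [m * v for m, v in zip(mask, neighbor)]

def aggregate(local, neighbors, coop_weights, gate_w, gate_b):
    """Weighted sum of calibrated neighbor features.

    neighbors: {neighbor_id: prototype vector}
    coop_weights: {neighbor_id: cooperation weight from the coordinator}
    """
    context = [0.0] * len(local)
    for j, feat in neighbors.items():
        calibrated = calibrate(local, feat, gate_w, gate_b)
        for c, v in enumerate(calibrated):
            context[c] += coop_weights[j] * v
    return context
```

A mask value near 0 suppresses a channel of the neighbor prototype and a value near 1 passes it through, which is how a conflicting channel would be damped before aggregation.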

[0030] Finally, each agent internalizes external consensus knowledge into its local model by maximizing the mutual information between local features and the global enhanced semantic context representation, and optimizes and updates the parameters of the local model based on the local task objective and mutual information constraints (corresponding to step S108). Specifically, maximizing mutual information enables local features to converge towards the global enhanced semantic context representation in the semantic space, thereby achieving efficient fusion of heterogeneous knowledge without directly exchanging parameters, and promoting the alignment of local digital features towards the global enhanced semantic context representation that integrates multi-source mathematical knowledge. Furthermore, by maximizing the mutual information between local features and the global context, the model is driven to converge towards the consensus space, which is a theoretical guidance mechanism based on information theory and belongs to the category of "theory". On this basis, each agent uses the mutual information constraint as a regularization term and constructs an optimization objective function together with the local task objective (e.g., classification or regression loss), and iteratively updates the parameters of the local model through gradient backpropagation. After multiple rounds of such co-evolution, the local models of each agent can not only maintain good performance for their own private tasks, but also effectively absorb general semantic knowledge from heterogeneous partners, thereby improving the overall collaborative efficiency of the multi-agent system.
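A minimal sketch of the joint objective is given below. The mutual information term is written as an InfoNCE contrastive loss, whose minimization maximizes a lower bound on mutual information. How positives and negatives are formed is an assumption here: each local feature is paired with its own context representation as the positive, and the other contexts in the batch act as negatives. Cosine similarity, `temperature`, and the trade-off coefficient `lam` are likewise hypothetical choices.

```python
import math

def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def info_nce(local_feats, contexts, temperature=0.1):
    """InfoNCE over a batch: contexts[i] is the positive for local_feats[i],
    all other contexts are treated as negatives (assumption)."""
    losses = []
    for i, z in enumerate(local_feats):
        logits = [cosine(z, c) / temperature for c in contexts]
        m = max(logits)  # log-sum-exp with max subtraction for stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)

def joint_loss(task_loss, mi_loss, lam=0.5):
    """Local task loss plus the mutual information term as a regularizer."""
    return task_loss + lam * mi_loss
```

In a training loop, the value returned by `joint_loss` would be the quantity backpropagated through the local model; only features and prototypes are exchanged, never model parameters.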

[0031] It should be noted that Figure 2 shows a general framework diagram of the method described in this embodiment. Referring to Figure 2, the method mainly consists of two core mechanisms: a topology generation mechanism based on dynamic semantics (involving the collaborative interaction between agents and the routing coordinator) and a fusion mechanism based on multi-granularity attention (primarily operating on the agent side, but relying on the topology information issued by the routing coordinator). As shown in Figure 2, each agent first detects local system resources (such as CPU, memory, and bandwidth), generates a resource budget vector, and converts it into a capability representation through a capability encoder. This data comes from the actual operating environment and falls under the category of "numbers." Simultaneously, each agent uses a semantic projection head to map local heterogeneous features to a shared semantic space and calculates prototype vectors for each output semantic dimension based on local private data (these prototype vectors are statistically derived from the data, representing typical "number" modeling). These prototype vectors are then concatenated to obtain the semantic representation. Each agent uploads the generated capability representation and semantic representation to the routing coordinator.

[0032] After receiving the capability and semantic representations of each agent, the routing coordinator constructs a composite feature representation for each agent through feature concatenation. Then, using all agents as nodes, the routing coordinator takes these composite feature representations as input to a graph attention network, dynamically mining implicit associations between heterogeneous nodes and calculating cooperation weights between nodes through the graph attention mechanism. Based on this, the routing coordinator selects a set of cooperative neighbors for each agent using a sparsity strategy (e.g., retaining the top K neighbors with the highest cooperation weights), and distributes the set of cooperative neighbors and their corresponding cooperation weights to each agent.

[0033] After receiving semantic knowledge from their collaborating neighbors, each agent uses local features as query conditions. First, it refines the semantic knowledge of its neighbors through a channel-level attention mechanism, filtering out key feature channels relevant to its local task and generating calibrated neighbor features. Then, each agent, combining the collaboration weights issued by the routing coordinator, weights and aggregates multiple calibrated neighbor features to generate a globally enhanced semantic context representation that incorporates collective intelligence. Finally, each agent internalizes external consensus knowledge into its local model by maximizing the mutual information between its local features and the globally enhanced semantic context representation, and jointly optimizes and updates the parameters of its local model based on the local task objective and mutual information constraints. Through the collaborative work of these two mechanisms, efficient knowledge sharing and collaborative evolution among heterogeneous agents are achieved.

[0034] As described in the background section, in multi-agent collaborative scenarios, the agents exhibit significant differences in computing power, storage, and network conditions, resulting in a high degree of heterogeneity in the deployable neural network model structures. Differences in the number of model layers, parameter scale, and computing power type make traditional collaborative methods that assume model isomorphism ineffective, and there is a lack of effective alignment and aggregation mechanisms between heterogeneous models. Existing methods have their own limitations when dealing with heterogeneous problems: they either require model isomorphism, rely on external public datasets, or employ coarse-grained knowledge-sharing mechanisms that cannot resolve semantic conflicts caused by misalignment in heterogeneous feature spaces, easily leading to knowledge conflicts and negative transfer. Furthermore, existing methods often fail to effectively integrate knowledge at both the theoretical and mathematical levels when dealing with multi-agent collaborative problems.

[0035] In view of this, this application first obtains the capability and semantic representations of agents through a routing coordinator and constructs a composite feature representation, providing a unified data foundation for the association modeling of heterogeneous agents. Then, it utilizes a graph attention network to dynamically mine implicit associations and generate a sparse cooperative topology, enabling each agent to accurately locate the best cooperative partner, thus solving the problem of ineffective collaboration in heterogeneous environments. Next, through a fusion mechanism based on multi-granularity attention, it performs channel-level calibration and weighted aggregation of neighbor semantics, achieving refined fusion of heterogeneous knowledge and significantly improving knowledge sharing efficiency. Finally, it introduces a mutual information maximization contrastive learning strategy to ensure that external knowledge can be internalized into the local model without conflict, effectively avoiding negative transfer. Therefore, this application, through the collaborative design of a topology generation mechanism based on dynamic semantics and a fusion mechanism based on multi-granularity attention, on the one hand, guides data-driven feature learning and topology construction with a structured theoretical model, enabling the generation of collaborative relationships to both follow the theoretically optimal strategy and adapt to the actual data distribution; on the other hand, through data-driven parameter updates and feature calibration, it feeds back and achieves the co-evolution and knowledge internalization goals preset by the theoretical model. The two work synergistically within a unified framework, jointly realizing efficient collaboration and knowledge internalization among heterogeneous agents, fully embodying the core idea of "theory-mathematics fusion." This solves the technical problems existing in current technologies, such as the inability of multi-agent systems to effectively collaborate due to model heterogeneity, low efficiency of knowledge sharing and susceptibility to negative transfer, and the difficulty in deeply integrating theory and mathematics.

[0036] Optionally, each agent determines its own capability representation and semantic representation through the following steps: each agent detects local system resources, generates a resource budget vector including computing power, memory capacity, and communication bandwidth, and maps the resource budget vector into a capability embedding vector through a capability encoder, which serves as its own capability representation; each agent uses a semantic projection head to map local heterogeneous features to a shared semantic space, and calculates the prototype vector of each output semantic dimension based on local private data, and concatenates the prototype vectors of all output semantic dimensions to obtain a semantic feature vector, which serves as its own semantic representation.

[0037] Specifically, as shown in Figure 2, in multi-agent collaborative scenarios, traditional collaboration methods based on fully connected or random topologies struggle to capture the potential knowledge complementarity between heterogeneous agents, leading to ineffective collaboration or even negative transfer. To address this, a topology generation mechanism based on dynamic semantics is proposed. First, each agent detects its local system resources to form a resource budget vector $\mathbf{r}_i$, as shown in the following formula:

[0038] $\mathbf{r}_i = [\,r_i^{\mathrm{cpu}},\ r_i^{\mathrm{mem}},\ r_i^{\mathrm{bw}}\,]$;

[0039] where $r_i^{\mathrm{cpu}}$, $r_i^{\mathrm{mem}}$, and $r_i^{\mathrm{bw}}$ are the three components of the vector $\mathbf{r}_i$, representing the agent's CPU computing power, memory capacity, and communication bandwidth, respectively, and i is the agent's ID. The resource budget not only determines the structural complexity of the local heterogeneous model $M_i$ but also implies the node's knowledge confidence within the multi-agent system, since larger models typically have stronger feature extraction capabilities. To transform the physical resource values into high-dimensional features usable in computation, a lightweight capability encoder is introduced. The encoder is a two-layer multilayer perceptron (MLP). The capability embedding vector $\mathbf{c}_i$, which serves as the capability representation, is obtained through the following mapping:

[0040] $\mathbf{c}_i = \mathrm{MLP}(\mathbf{r}_i) \in \mathbb{R}^{d_c}$;

[0041] where $d_c$ is the dimension of the capability embedding vector.
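
The capability encoding step can be illustrated with a minimal sketch. It assumes a two-layer MLP with a ReLU between the layers and random weights; the activation, the layer sizes, the function names (`make_two_layer_mlp`, `encode_capability`) and the example resource values are illustrative assumptions, not details given in the text.

```python
import random

def make_two_layer_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Create random weights for a two-layer MLP (illustrative initialisation)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2

def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

def encode_capability(resource_budget, params):
    """Map the resource budget r_i = [cpu, mem, bw] to a capability embedding c_i."""
    w1, w2 = params
    # ReLU between the two layers is an assumption; the text only says "two MLP layers".
    hidden = [max(0.0, h) for h in matvec(w1, resource_budget)]
    return matvec(w2, hidden)

params = make_two_layer_mlp(in_dim=3, hidden_dim=8, out_dim=4)
r_i = [8.0, 16.0, 100.0]  # hypothetical values: CPU cores, memory (GB), bandwidth (Mbit/s)
c_i = encode_capability(r_i, params)  # capability embedding of dimension d_c = 4
```

In practice the encoder weights would be trained rather than fixed, and the raw resource values would likely be normalised first so that no single component dominates.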

[0042] Because the model structures $M_i$ of different agents vary and their feature vector dimensions are inconsistent, the agents cannot interact directly. Therefore, this application introduces a lightweight semantic projection head $P_i$ for each agent, which maps local heterogeneous features to a shared semantic space. A projection head typically consists of linear layers that align the output dimensions of different models. Based on its local private data $D_i$, the agent calculates a prototype vector $\mathbf{p}_i^k$ for each output semantic dimension k, which serves as a privacy-preserving semantic carrier, as shown in the following formula:

[0043] $\mathbf{p}_i^k = \dfrac{1}{|D_i^k|} \sum_{x \in D_i^k} P_i\big(f_i(x)\big)$;

[0044] where $D_i^k$ is the local private data of agent i belonging to the k-th output semantic dimension, and $f_i$ is the feature extraction part of the local heterogeneous model of agent i.

[0045] This prototype vector is a compact abstraction of the semantic information in the local data and does not expose the original data.

[0046] The prototype vectors of all output semantic dimensions are concatenated in order to obtain the complete semantic feature vector $\mathbf{s}_i$ of the agent, which serves as its semantic representation. If the agent has no data for a certain output semantic dimension, zeros are padded at the corresponding position, as shown in the following formula:

[0047] $\mathbf{s}_i = [\,\mathbf{p}_i^1 \,\|\, \mathbf{p}_i^2 \,\|\, \cdots \,\|\, \mathbf{p}_i^C\,] \in \mathbb{R}^{d_s}$;

[0048] where $d_s$ is the dimension of the semantic feature vector and $C$ denotes the number of output semantic dimensions.

[0049] Each agent then uploads its capability representation $\mathbf{c}_i$ and semantic representation $\mathbf{s}_i$ to the routing coordinator, providing the data foundation for the subsequent construction of the dynamic semantic topology and for knowledge routing.
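
The prototype computation and zero-padded concatenation can be sketched as follows. The sketch assumes that each prototype is the mean of the projected features of the samples in one output semantic dimension (class), and that the features have already been passed through the feature extractor and projection head; the function names and sample values are hypothetical.

```python
def class_prototypes(projected_features, labels, num_classes):
    """Average the projected features per class; classes with no data get zeros."""
    dim = len(projected_features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, k in zip(projected_features, labels):
        counts[k] += 1
        for d, value in enumerate(vec):
            sums[k][d] += value
    prototypes = []
    for k in range(num_classes):
        if counts[k] == 0:
            prototypes.append([0.0] * dim)  # zero padding for a missing class
        else:
            prototypes.append([v / counts[k] for v in sums[k]])
    return prototypes

def semantic_vector(prototypes):
    """Concatenate the per-class prototypes in class order."""
    return [value for proto in prototypes for value in proto]

# Three local samples already mapped into a 2-dimensional shared semantic space.
projected = [[1.0, 3.0], [3.0, 5.0], [0.0, 2.0]]
labels = [0, 0, 2]  # the agent holds no sample of class 1
s_i = semantic_vector(class_prototypes(projected, labels, num_classes=3))
```

Only the averaged prototypes leave the agent, not the individual samples, which is what the text refers to as a privacy-preserving semantic carrier.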

[0050] Optionally, the routing coordinator uses all agents as nodes and, based on the composite feature representations, dynamically mines implicit associations between heterogeneous nodes using a graph attention network to calculate the cooperation weights between nodes. This includes: using the composite feature representation of each agent as the input node feature of the graph attention network; transforming the composite feature representations with a learnable linear transformation matrix; and, for any two agent nodes i and j, concatenating the transformed features and calculating the attention coefficient $e_{ij}$ of node i to node j using the attention weight vector, with the calculation formula as follows:

[0051] $e_{ij} = \mathbf{a}^{\top}\,[\,\mathbf{W}\mathbf{h}_i \,\|\, \mathbf{W}\mathbf{h}_j\,]$;

[0052] where $\mathbf{h}_i$ is the composite feature representation of agent i, $\mathbf{h}_j$ is the composite feature representation of agent j, $\mathbf{W}$ is the learnable linear transformation matrix, $\mathbf{a}$ is the attention weight vector, and $\|$ denotes the concatenation operation. Because the attention coefficient is computed from a learnable linear transformation matrix and an attention weight vector and depends on feedback from the training data, it belongs to the "number" (data-driven) category.

[0053] The attention coefficients are normalized to obtain the final cooperation weight of node i to node j, calculated using the following formula:

[0054] $\alpha_{ij} = \dfrac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})}$;

[0055] where $\alpha_{ij}$ is the final cooperation weight of node i to node j, $\mathcal{N}_i$ is the set of all neighboring nodes connected to node i, and $e_{ik}$ is the attention coefficient of node i to node k.

[0056] Specifically, in the topology generation mechanism based on dynamic semantics, after the routing coordinator receives the capability representations and semantic representations uploaded by the agents, it first concatenates the capability representation $\mathbf{c}_i$ and the semantic representation $\mathbf{s}_i$ of each agent into the composite feature representation $\mathbf{h}_i$, as shown in the following formula:

[0057] $\mathbf{h}_i = [\,\mathbf{c}_i \,\|\, \mathbf{s}_i\,]$;

[0058] where $\mathbf{h}_i$ provides a complete description of the capabilities and knowledge preferences of agent i.

[0059] Then, the routing coordinator treats all agents as the node set $V = \{1, \dots, N\}$, where N is the total number of agents. A graph attention network (GAT) dynamically calculates the cooperation weights between nodes, and end-to-end learning is used to capture the collaborative relationships among the asymmetric model structures.

[0060] Let $\mathbf{W}$ be the learnable linear transformation matrix in the GAT layer and $\mathbf{a}$ be the weight vector of the attention mechanism. For any two nodes i and j, the attention coefficient $e_{ij}$ is calculated as shown in the following formula:

[0061] $e_{ij} = \mathbf{a}^{\top}\,[\,\mathbf{W}\mathbf{h}_i \,\|\, \mathbf{W}\mathbf{h}_j\,]$;

[0062] where the inputs $\mathbf{h}_i$ and $\mathbf{h}_j$ contain both the semantic representation and the resource (capability) embedding. Therefore, when the network computes $e_{ij}$, it considers the semantic similarity of the two agents and their resource confidence at the same time: the semantic components $\mathbf{s}_i$ and $\mathbf{s}_j$ drive the assessment of semantic similarity, and the capability components $\mathbf{c}_i$ and $\mathbf{c}_j$ drive the weighting by resource confidence.

[0063] To give the attention coefficients a probabilistic meaning and facilitate the subsequent weighted aggregation, the attention coefficients $e_{ik}$ of all neighbors of node i (including node i itself, since graph attention networks typically include self-attention) are normalized with Softmax to obtain the final dynamic cooperation weights $\alpha_{ij}$:

[0064] $\alpha_{ij} = \dfrac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})}$;

[0065] Next, the GAT outputs the attention matrix $\mathbf{A} = [\alpha_{ij}] \in \mathbb{R}^{N \times N}$, which constitutes the full semantic topology of the current round.
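
The attention computation on the coordinator can be sketched in a few lines. The sketch assumes a single attention head, a fully connected candidate graph that includes self-attention, and fixed example values for the transformation matrix `W` and attention vector `a`, which would be learned in practice. No activation is applied to the raw score because none is named in the text; the commonly used GAT formulation applies a LeakyReLU at that point.

```python
import math

def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

def cooperation_weights(h, W, a):
    """Return the row-normalised attention matrix A, where A[i][j] is alpha_ij."""
    z = [matvec(W, h_i) for h_i in h]  # transformed node features W h_i
    A = []
    for i in range(len(h)):
        # Raw score e_ij = a^T [W h_i || W h_j] for every candidate j (self included).
        scores = [sum(a_k * v for a_k, v in zip(a, z[i] + z[j])) for j in range(len(h))]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        A.append([e / total for e in exps])  # softmax over the neighbours of node i
    return A

# Three agents with 3-dimensional composite features (hypothetical values).
h = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 1.0, 0.0]]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # maps 3 -> 2 dimensions
a = [0.5, -0.2, 0.3, 0.8]               # length is twice the transformed dimension
A = cooperation_weights(h, W, a)
```

Each row of `A` sums to one, so the entries of row i can be read as the share of attention that agent i assigns to each candidate partner.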

[0066] Thus, through the dynamic computation of the graph attention network, the routing coordinator can adaptively mine implicit associations between heterogeneous agents and generate, for each agent, weight coefficients with respect to all other potential collaborators, laying the foundation for the subsequent cooperative neighbor selection and knowledge fusion. This mechanism enables the system to adjust cooperative relationships dynamically based on the capability and semantic information uploaded by the agents in real time, avoiding the ineffective collaboration and negative transfer caused by static topologies or random connections.

[0067] Optionally, the operation of selecting a set of cooperative neighbors for each agent using a sparsity strategy and distributing the set of cooperative neighbors and the corresponding cooperation weights to each agent includes: for each agent i, sorting all potential neighbor nodes in descending order of the final cooperation weights $\alpha_{ik}$; and retaining the top K neighbors with the highest cooperation weights to construct the cooperative neighbor set, calculated using the following formula:

[0068] $\mathcal{S}_i = \underset{k}{\operatorname{TopK}}\ \alpha_{ik}$;

[0069] where $\mathcal{S}_i$ is the cooperative neighbor set of agent i, consisting of the indices k of the K largest weights, and $\alpha_{ik}$ is the final cooperation weight of node i to node k;

[0070] the cooperative neighbor set $\mathcal{S}_i$ and the corresponding final cooperation weights $\alpha_{ik}$ are distributed to each agent.

[0071] Specifically, in the topology generation mechanism based on dynamic semantics, the routing coordinator calculates the full cooperation weight matrix A among all nodes through the graph attention network. Although this matrix fully characterizes the semantic cooperation relationships among all agents in the current round, in actual deployment scenarios, especially bandwidth-constrained cloud-edge environments, the overhead of each agent maintaining fully connected communication with all other agents is too high. Doing so not only consumes a large amount of network resources but may also introduce a large amount of low-value, interfering information. To address this, a sparsity strategy is introduced to reduce the communication load while preserving the quality of cooperation.

[0072] For each agent i, the routing coordinator sorts all potential neighbors by the weights $\alpha_{ik}$ and keeps only the top K neighbors with the highest weights to construct a sparse neighbor list $\mathcal{S}_i$:

[0073] $\mathcal{S}_i = \underset{k}{\operatorname{TopK}}\ \alpha_{ik}$;

[0074] where K is a preset positive integer, usually set according to the total system bandwidth, the number of agents, and the task requirements, for example K=3 or K=5.

[0075] Finally, the routing coordinator packages the sparse neighbor list $\mathcal{S}_i$ and the corresponding normalized weights $\alpha_{ij}$ and distributes them to each agent to guide the subsequent knowledge fusion. Note that during the model training phase this process evolves dynamically: each training round may generate a different cooperative topology (that is, select different sparse neighbor lists), adapting to changes in the relationships between agents.
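
A minimal sketch of the Top-K selection is given below. Two choices in it are assumptions rather than details from the text: an agent is excluded from its own neighbor list, and ties are broken by the lower agent index. The weights are passed on as computed, without renormalising them over the selected neighbors. The function name and matrix values are hypothetical.

```python
def top_k_neighbors(A, k):
    """For each agent i, keep the k partners with the largest cooperation weight."""
    routing = {}
    for i, row in enumerate(A):
        # Exclude the agent itself (assumption) and sort by descending weight.
        candidates = [(weight, j) for j, weight in enumerate(row) if j != i]
        candidates.sort(key=lambda item: (-item[0], item[1]))
        routing[i] = {j: weight for weight, j in candidates[:k]}
    return routing

# Hypothetical 4-agent cooperation weight matrix (rows sum to 1).
A = [
    [0.10, 0.50, 0.30, 0.10],
    [0.40, 0.20, 0.10, 0.30],
    [0.25, 0.25, 0.25, 0.25],
    [0.05, 0.15, 0.70, 0.10],
]
routing = top_k_neighbors(A, k=2)
# routing[i] maps each selected neighbor j to alpha_ij and is what the
# coordinator would send to agent i in this round.
```

With K fixed, each agent exchanges prototypes with at most K partners per round, so the number of messages per round grows linearly with the number of agents instead of quadratically.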

[0076] Thus, by employing a Top-K sparsity strategy, the routing coordinator significantly reduces communication overhead while preserving the most critical collaborative relationships, enabling this method to operate efficiently in real-world bandwidth-constrained environments. Each agent only needs to interact with a small number of high-value partners, reducing ineffective communication and avoiding interference from irrelevant knowledge, providing accurate and efficient input for subsequent fusion mechanisms based on multi-granularity attention.

[0077] Optionally, the operation in which each agent receives semantic knowledge from its cooperative neighbors, uses local features as query conditions, and performs channel-dimension calibration on the received semantic knowledge through an attention mechanism to generate calibrated neighbor features includes: for each cooperative neighbor j, concatenating the local features $\mathbf{z}_i$ of agent i and the semantic prototype $\mathbf{p}_j$ of cooperative neighbor j along the channel dimension to construct a joint feature vector $\mathbf{u}_{ij}$ containing semantic information from both parties, calculated as:

[0078] $\mathbf{u}_{ij} = [\,\mathbf{z}_i \,\|\, \mathbf{p}_j\,]$;

[0079] where $\|$ denotes the concatenation operation;

[0080] inputting the joint feature vector $\mathbf{u}_{ij}$ into a channel-level attention gating network, which learns the importance weights of the channels through two fully connected layers and uses the Sigmoid function to generate a channel response mask $\mathbf{m}_{ij}$, calculated as:

[0081] $\mathbf{m}_{ij} = \sigma\big(\mathbf{W}_2\,\delta(\mathbf{W}_1 \mathbf{u}_{ij})\big)$;

[0082] where $\mathbf{W}_1$ and $\mathbf{W}_2$ are learnable parameter matrices, $\delta$ is the ReLU function, and $\sigma$ is the Sigmoid function; each element of $\mathbf{m}_{ij}$ lies in [0,1] and represents the retention probability of the corresponding channel;

[0083] multiplying the generated channel response mask $\mathbf{m}_{ij}$ element-wise with the semantic prototype $\mathbf{p}_j$ of cooperative neighbor j to obtain the calibrated neighbor features $\tilde{\mathbf{p}}_{ij}$, calculated as:

[0084] $\tilde{\mathbf{p}}_{ij} = \mathbf{m}_{ij} \odot \mathbf{p}_j$;

[0085] where $\odot$ denotes element-wise multiplication.

[0086] Specifically, because the agents differ significantly in model structure and in what their feature extractors focus on, the received neighbor semantic features may still be misaligned in the channel dimension and in the latent space even after the best cooperative neighbors have been determined through the topology generation mechanism based on dynamic semantics, and direct aggregation can easily lead to feature interference and negative transfer. To address this, a fusion mechanism based on multi-granularity attention is proposed. First, the agent introduces a channel-level attention mechanism to finely filter the received neighbor semantic prototypes, adaptively identifying and weighting the key feature channels that benefit the local task, thereby correcting structural biases between heterogeneous models at a fine-grained level. Second, combining the topology weights issued by the routing coordinator, a neighbor-level attention mechanism weights and aggregates the channel-calibrated features to construct a globally enhanced semantic context containing rich complementary knowledge. To further ensure consistency between local features and external knowledge, a contrastive learning strategy that maximizes mutual information is introduced: it maximizes the lower bound of the mutual information between the local features and the globally enhanced semantic context representation in the manifold space, driving the local model toward high-quality neighbor semantics. Finally, the fused semantic deviation is incorporated as a regularization term into the optimization objective and drives the parameter updates of the local heterogeneous models through backpropagation, achieving efficient knowledge internalization without exchanging model parameters.

[0087] For agent i, although the semantic prototype $\mathbf{p}_j$ of its cooperative neighbor j has the same dimensionality as the local features $\mathbf{z}_i$, the k-th channel may encode completely different semantic features in the two models, such as texture versus contour. Adding them directly would destroy the semantic integrity of the local features. Therefore, a channel attention network is designed in which the local features $\mathbf{z}_i$ act as the query condition to actively filter the neighbor prototype $\mathbf{p}_j$, keeping the channels relevant to the local task and suppressing noisy channels, through the following steps:

[0088] First, the local features $\mathbf{z}_i$ and the neighbor prototype $\mathbf{p}_j$ are concatenated along the channel dimension to construct a joint feature vector $\mathbf{u}_{ij}$ containing semantic information from both parties, as shown in the following formula:

[0089] $\mathbf{u}_{ij} = [\,\mathbf{z}_i \,\|\, \mathbf{p}_j\,]$;

[0090] where $\|$ denotes the concatenation operation, which establishes an interaction channel between local needs and external knowledge.

[0091] Second, a gating network consisting of two fully connected layers is introduced to learn the importance weights of the channels. ReLU activation is used between the two fully connected layers, and the Sigmoid function finally generates the channel response mask $\mathbf{m}_{ij}$, as shown in the following formula:

[0092] $\mathbf{m}_{ij} = \sigma\big(\mathbf{W}_2\,\delta(\mathbf{W}_1 \mathbf{u}_{ij})\big)$;

[0093] where $\mathbf{W}_1$ and $\mathbf{W}_2$ are learnable parameter matrices, $\delta$ is the ReLU function, and $\sigma$ is the Sigmoid function. Each element of $\mathbf{m}_{ij}$ lies in [0,1] and represents the retention probability of the corresponding channel. The network parameters used to generate the channel response mask ($\mathbf{W}_1$, $\mathbf{W}_2$) are learned from data and belong to the "number" (data-driven) category.

[0094] The generated mask $\mathbf{m}_{ij}$ is multiplied element-wise with the neighbor prototype $\mathbf{p}_j$ to obtain the calibrated neighbor features $\tilde{\mathbf{p}}_{ij}$, as shown in the following formula:

[0095] $\tilde{\mathbf{p}}_{ij} = \mathbf{m}_{ij} \odot \mathbf{p}_j$;

[0096] This operation suppresses neighbor channels that conflict with the semantics of the local task (their mask values approach 0) and activates complementary channels (their mask values approach 1), thereby retaining the important semantics from the neighbor.
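
The three calibration steps (concatenate, gate, mask) can be sketched as below. The gate weights `W1` and `W2` are fixed example values standing in for learned parameters, and the vector sizes and the function name `calibrate_neighbor` are illustrative assumptions.

```python
import math

def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))

def calibrate_neighbor(z_i, p_j, W1, W2):
    """Gate the neighbor prototype p_j channel by channel, using z_i as the query."""
    u_ij = z_i + p_j                                      # joint vector [z_i || p_j]
    hidden = [max(0.0, v) for v in matvec(W1, u_ij)]      # first FC layer + ReLU
    mask = [sigmoid(v) for v in matvec(W2, hidden)]       # second FC layer + Sigmoid
    calibrated = [m * p for m, p in zip(mask, p_j)]       # element-wise product
    return calibrated, mask

z_i = [0.2, 0.9]    # local features of agent i (hypothetical)
p_j = [0.8, -0.4]   # semantic prototype received from neighbor j (hypothetical)
W1 = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]  # 4 -> 2
W2 = [[2.0, 0.0], [0.0, -2.0]]                     # 2 -> 2, one output per channel
calibrated, mask = calibrate_neighbor(z_i, p_j, W1, W2)
```

Because every mask value lies between 0 and 1, calibration can only attenuate a neighbor channel, never amplify it, which matches the "retention probability" reading in the text.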

[0097] Thus, through the aforementioned channel-level attention calibration mechanism, each agent can adaptively extract the most relevant and valuable information for its own task from the semantic knowledge of each cooperating neighbor while preserving the integrity of its local features. This provides clean and accurate input for subsequent weighted aggregation and knowledge internalization. This mechanism fundamentally solves the semantic conflict problem caused by the misalignment of feature spaces in heterogeneous models and avoids the negative transfer risk brought about by direct fusion.

[0098] Optionally, the operation of performing weighted aggregation of the multiple calibrated neighbor features in combination with the cooperation weights to generate a globally enhanced semantic context representation includes: obtaining the cooperative neighbor set $\mathcal{S}_i$ issued by the routing coordinator and the corresponding cooperation weights $\alpha_{ij}$; and, for all neighbors in the cooperative neighbor set, computing the weighted sum of the calibrated neighbor features $\tilde{\mathbf{p}}_{ij}$ with the corresponding cooperation weights $\alpha_{ij}$ to generate the globally enhanced semantic context representation, calculated as follows:

[0099] $\mathbf{g}_i = \sum_{j \in \mathcal{S}_i} \alpha_{ij}\, \tilde{\mathbf{p}}_{ij}$;

[0100] where $\mathbf{g}_i$ is the globally enhanced semantic context representation of agent i.

[0101] Specifically, in the fusion mechanism based on multi-granularity attention, after agent i performs channel-level calibration on the semantic prototype of each cooperative neighbor j, it obtains multiple calibrated neighbor features. At this point, agent i needs to aggregate these features from different neighbors to form a global representation that integrates multi-source knowledge. However, since the resource confidence and semantic relevance of different neighbors vary, a simple equal-weighted average cannot reflect these differences and may even introduce interference from low-quality information. Therefore, agent i uses the cooperative weights issued by the routing coordinator for weighted aggregation.

[0102] First, agent i retrieves from local storage the cooperative neighbor set $\mathcal{S}_i$ issued by the routing coordinator and the corresponding cooperation weights $\alpha_{ij}$. These cooperation weights are calculated by the graph attention network and sparsified in the topology generation mechanism based on dynamic semantics. They quantify the importance of each neighbor to agent i: the larger the weight, the better the neighbor's semantic knowledge matches the local needs of agent i, or the more reliable its resource capabilities.

[0103] Subsequently, for each neighbor j in the cooperative neighbor set, agent i multiplies the calibrated neighbor features $\tilde{\mathbf{p}}_{ij}$ by the corresponding cooperation weight $\alpha_{ij}$ to obtain the weighted neighbor contribution, and then sums all weighted contributions to generate the globally enhanced semantic context representation of agent i. This process can be represented by the following formula:

[0104] $\mathbf{g}_i = \sum_{j \in \mathcal{S}_i} \alpha_{ij}\, \tilde{\mathbf{p}}_{ij}$;

[0105] where $\mathbf{g}_i$ is the globally enhanced semantic context representation of agent i, obtained by weighted aggregation of the features from multiple neighbors. The aggregation result depends on the semantic input of the current round and belongs to the dynamic fusion on the "number" (data-driven) side.
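
The weighted aggregation amounts to a weighted sum over the selected neighbors, as in this minimal sketch. The dictionaries keyed by neighbor ID and the numeric values are hypothetical.

```python
def aggregate_context(calibrated_features, cooperation_weights):
    """Weighted sum of calibrated neighbor features: g_i = sum_j alpha_ij * p~_ij."""
    dim = len(next(iter(calibrated_features.values())))
    g_i = [0.0] * dim
    for j, feature in calibrated_features.items():
        for d, value in enumerate(feature):
            g_i[d] += cooperation_weights[j] * value
    return g_i

# Calibrated features from two selected neighbors and their weights alpha_ij.
calibrated_features = {1: [1.0, 0.0], 2: [0.0, 2.0]}
cooperation_weights = {1: 0.5, 2: 0.25}
g_i = aggregate_context(calibrated_features, cooperation_weights)
```

Neighbor 1 contributes more per unit of feature magnitude than neighbor 2 because its cooperation weight is twice as large, which is the difference that an equal-weighted average would discard.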

[0106] Through the aforementioned weighted aggregation, agent i not only retains the channel-calibrated semantic information from each neighbor but also reflects the credibility and relevance of knowledge from different sources through the cooperation weights. The resulting globally enhanced semantic context representation aligns with local semantic preferences while incorporating collective intelligence. For example, in collaborative perception scenarios involving intelligent robots, multiple robots may observe the same target from different angles. After channel calibration and weighted aggregation, the globally enhanced semantic context representation obtained by each robot can integrate features from the various perspectives while highlighting the information most relevant to its own task, thus improving perception accuracy.

[0107] Thus, through weighted aggregation operations, a fusion mechanism based on multi-granularity attention achieves efficient integration of multi-source heterogeneous knowledge, providing a high-quality global semantic context representation for subsequent knowledge internalization steps. This mechanism ensures that the aggregation result fully reflects the differences in contributions from each collaborating partner, avoiding the information dilution problem caused by equal-weighted averaging.

[0108] Optionally, the operation in which each agent internalizes external consensus knowledge into its local model by maximizing the mutual information between local features and the globally enhanced semantic context representation, and optimizes and updates the parameters of the local model based on the local task objective and the mutual information constraint, includes: agent i forms a positive sample pair from its local features $\mathbf{z}_i$ and the globally enhanced semantic context representation $\mathbf{g}_i$, and forms negative sample pairs from the local features $\mathbf{z}_i$ and the context representations $\mathbf{g}^-$ of other agents; agent i calculates the mutual information loss $\mathcal{L}_{\mathrm{MI}}$ using the InfoNCE contrastive loss function, calculated as:

[0109] $\mathcal{L}_{\mathrm{MI}} = -\log \dfrac{\exp\big(\operatorname{sim}(\mathbf{z}_i, \mathbf{g}_i)/\tau\big)}{\exp\big(\operatorname{sim}(\mathbf{z}_i, \mathbf{g}_i)/\tau\big) + \sum_{\mathbf{g}^- \in \mathcal{B}} \exp\big(\operatorname{sim}(\mathbf{z}_i, \mathbf{g}^-)/\tau\big)}$;

[0110] where $\operatorname{sim}(\cdot,\cdot)$ is the cosine similarity, $\tau$ is the temperature coefficient, and $\mathcal{B}$ is the set of negative samples.

[0111] Agent i constructs the overall optimization objective function of the local model and jointly optimizes the local task loss and the mutual information loss, calculated as follows:

[0112] $\mathcal{L} = \mathcal{L}_{\mathrm{task}}(y, \hat{y}) + \lambda\, \mathcal{L}_{\mathrm{MI}}$;

[0113] where $\mathcal{L}_{\mathrm{task}}$ is the loss function of the local task, taken as the cross-entropy loss, and represents the learning objective of the agent; $\lambda$ is the balancing hyperparameter; $y$ is the true label of the local task; and $\hat{y}$ is the prediction output of the local model for the input sample;

[0114] the parameters of the local model are updated using the backpropagation algorithm, so that the local features $\mathbf{z}_i$ learn toward the globally enhanced semantic context representation $\mathbf{g}_i$ in the shared semantic space while the heterogeneous structure is maintained, thereby internalizing the external consensus knowledge.

[0115] Specifically, in the fusion mechanism based on multi-granularity attention, agent i has generated the globally enhanced semantic context representation $\mathbf{g}_i$. However, $\mathbf{g}_i$ is still an external vector at this point. The key issue is how to make the local heterogeneous model truly absorb the knowledge it contains. A simple L2 distance (mean squared error) is insufficient to capture complex high-dimensional distribution relationships and may lead the model to fit feature values mechanically without capturing their semantic meaning. Therefore, this mechanism introduces mutual information maximization, which drives the distributions of the local features $\mathbf{z}_i$ and the globally enhanced semantic context representation $\mathbf{g}_i$ toward consistency, thereby achieving deep internalization of knowledge.

[0116] First, agent i treats its local features $\mathbf{z}_i$ and the corresponding globally enhanced semantic context representation $\mathbf{g}_i$ as a positive sample pair, indicating that they should be highly similar. At the same time, $\mathbf{z}_i$ and the context representations $\mathbf{g}^-$ of other agents (for example, the globally enhanced semantic context representations of other agents within the same batch) form negative sample pairs, indicating that they should have low similarity. Through contrastive learning, the model thus learns to distinguish "external knowledge that is semantically consistent with itself" from "irrelevant external knowledge".

[0117] Subsequently, agent i uses the InfoNCE contrastive loss function to calculate the mutual information loss $\mathcal{L}_{\mathrm{MI}}$. The numerator of this loss is computed from the cosine similarity of the positive sample pair (after scaling by the temperature coefficient $\tau$), and the denominator is the sum of the corresponding terms for the positive sample pair and all negative sample pairs, using the following formula:

[0118] $\mathcal{L}_{\mathrm{MI}} = -\log \dfrac{\exp\big(\operatorname{sim}(\mathbf{z}_i, \mathbf{g}_i)/\tau\big)}{\exp\big(\operatorname{sim}(\mathbf{z}_i, \mathbf{g}_i)/\tau\big) + \sum_{\mathbf{g}^- \in \mathcal{B}} \exp\big(\operatorname{sim}(\mathbf{z}_i, \mathbf{g}^-)/\tau\big)}$;

[0119] where $\operatorname{sim}(\cdot,\cdot)$ is the cosine similarity, $\tau$ is the temperature coefficient, and $\mathcal{B}$ is the set of negative samples.

[0120] Minimizing $\mathcal{L}_{\mathrm{MI}}$ encourages the model to make the similarity between $\mathbf{z}_i$ and $\mathbf{g}_i$ as high as possible while keeping the similarity between $\mathbf{z}_i$ and the negative samples $\mathbf{g}^-$ as low as possible, thereby maximizing the lower bound of the mutual information between $\mathbf{z}_i$ and $\mathbf{g}_i$. The temperature coefficient $\tau$ controls how much attention is given to hard negative samples: a smaller $\tau$ makes the model focus more on distinguishing negative samples that are highly similar to the positive sample.
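
The InfoNCE loss described above can be sketched for a single agent as follows. The vectors, the temperature value of 0.1, and the small constant added to the norm product to avoid division by zero are illustrative assumptions.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # small constant guards against zero vectors

def info_nce(z_i, g_pos, negatives, tau=0.1):
    """InfoNCE loss with one positive context g_pos and a set of negative contexts."""
    pos = math.exp(cosine_similarity(z_i, g_pos) / tau)
    neg = sum(math.exp(cosine_similarity(z_i, g) / tau) for g in negatives)
    return -math.log(pos / (pos + neg))

z_i = [1.0, 0.0]
# Case 1: the agent's own context is well aligned with its local features.
loss_aligned = info_nce(z_i, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# Case 2: the agent's own context is orthogonal, while a negative is well aligned.
loss_misaligned = info_nce(z_i, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

The loss is close to zero in the first case and large in the second, so gradient descent on it pulls the local features toward the agent's own aggregated context and away from the contexts of other agents.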

[0121] When constructing the overall optimization objective, agent i jointly optimizes the mutual information loss $\mathcal{L}_{\mathrm{MI}}$ and the local task loss $\mathcal{L}_{\mathrm{task}}$ (for example, the cross-entropy loss in classification tasks), with the balancing hyperparameter $\lambda$ adjusting the relative weight of the two, using the following formula:

[0122] $\mathcal{L} = \mathcal{L}_{\mathrm{task}}(y, \hat{y}) + \lambda\, \mathcal{L}_{\mathrm{MI}}$;

[0123] where $\mathcal{L}_{\mathrm{task}}$ is the loss function of the local task, taken as the cross-entropy loss, and represents the learning objective of the agent; $\lambda$ is the balancing hyperparameter; $y$ is the true label of the local task; and $\hat{y}$ is the prediction output of the local model for the input sample.

[0124] The local task loss ensures that the model does not forget its ability to fit local private tasks while absorbing external knowledge, while the mutual information loss guides the model's feature extractor to move closer to the global consensus knowledge space.

[0125] Finally, the gradients obtained by applying the backpropagation algorithm to the optimization objective $\mathcal{L}$ are used to update the parameters of the local model, so that the local features $\mathbf{z}_i$, while the heterogeneous model structure is maintained, gradually learn toward the globally enhanced semantic context representation $\mathbf{g}_i$ in the shared semantic space. These parameter updates of the local models are entirely data-driven and belong to the "number" category. Through multiple iterations, the local model of each agent can both perform its own task well and internalize common semantic knowledge from heterogeneous partners, achieving co-evolution.
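
The joint objective can be sketched as a plain function of the model's logits, the true label, and an already computed mutual information loss. The default value of the balancing hyperparameter and the example numbers are assumptions; in a real training loop an automatic differentiation framework would compute the gradients of this scalar.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of a softmax over the logits against an integer class label."""
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_sum_exp - logits[label]

def total_loss(logits, label, mi_loss, lam=0.5):
    """Overall objective: local task loss plus lambda times the mutual information loss."""
    return cross_entropy(logits, label) + lam * mi_loss

# With two equal logits the task loss is log(2); the MI term adds lam * mi_loss.
loss = total_loss(logits=[0.0, 0.0], label=0, mi_loss=1.0, lam=0.5)
```

Setting `lam` to zero recovers purely local training, while a large `lam` prioritises agreement with the aggregated neighbor context over the local task.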

[0126] Thus, by maximizing mutual information, the fusion mechanism based on multi-granularity attention achieves efficient internalization of external knowledge without parameter exchange, effectively avoiding the negative transfer problem and improving the generalization ability and robustness of the local model.

[0127] Furthermore, according to a second aspect of this embodiment, a storage medium is provided. The storage medium includes a stored program, wherein, when the program is executed, a processor performs the method described in any of the above embodiments.

[0128] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that the present invention is not limited to the described order of actions, because according to the present invention, some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily essential to the present invention.

[0129] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods according to the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods described in the various embodiments of the present invention.

[0130] Embodiment 2

[0131] This embodiment provides a multi-agent collaborative evolution and knowledge routing device for theory-mathematics fusion. The device corresponds to the method described in Embodiment 1 and includes multiple theory-mathematics fusion agents and a routing coordinator, wherein the routing coordinator is communicatively connected to the agents. The routing coordinator is used to acquire the capability representation and semantic representation of each agent and to construct a composite feature representation for each agent from them. The routing coordinator is also used to take all agents as nodes and, based on the composite feature representations, dynamically mine implicit associations between heterogeneous nodes using a graph attention network, calculate the cooperation weights between nodes, select a set of cooperative neighbors for each agent through a sparsity strategy, and distribute the set of cooperative neighbors and the corresponding cooperation weights to each agent. Each agent receives semantic knowledge from its cooperative neighbors, uses local features as query conditions, and calibrates the received semantic knowledge in the channel dimension through an attention mechanism to generate calibrated neighbor features; it then combines the cooperation weights to perform weighted aggregation of the calibrated neighbor features and generate a globally enhanced semantic context representation. Each agent internalizes external consensus knowledge into its local model by maximizing the mutual information between local features and the globally enhanced semantic context representation, and optimizes and updates the parameters of the local model based on the local task objective and the mutual information constraint.

[0132] Optionally, the intelligent agent includes: a capability mapping module, used to detect local system resources, generate a resource budget vector including computing power, memory capacity and communication bandwidth, and map the resource budget vector into a capability embedding vector through a capability encoder as its own capability representation; and a semantic projection module, used to map local heterogeneous features to a shared semantic space using a semantic projection head, calculate the prototype vector of each output semantic dimension based on local private data, and concatenate the prototype vectors of all output semantic dimensions to obtain a semantic feature vector as its own semantic representation.

[0133] It should be noted that the multi-agent collaborative evolution and knowledge routing device that integrates rational and mathematical principles provided in this embodiment can realize all the functions and steps in the above method embodiments, solve the same technical problems, and achieve the same technical effects. The similarities will not be repeated here.

[0134] In this embodiment, the routing coordinator first obtains the capability and semantic representations of the agents and constructs composite feature representations, providing a unified data foundation for modeling the associations among heterogeneous agents. A graph attention network then dynamically mines implicit associations to generate a sparse cooperative topology, so that each agent can accurately locate its best cooperative partners, which solves the problem of ineffective collaboration in heterogeneous environments. Next, a fusion mechanism based on multi-granularity attention performs channel-level calibration and weighted aggregation of neighbor semantics, achieving fine-grained fusion of heterogeneous knowledge and significantly improving knowledge sharing efficiency. Finally, a contrastive learning strategy that maximizes mutual information ensures that external knowledge can be internalized into the local model without conflict, effectively avoiding negative transfer. Through the joint design of a topology generation mechanism based on dynamic semantics and a fusion mechanism based on multi-granularity attention, this application on the one hand uses a structured theoretical model to guide data-driven feature learning and topology construction, so that the generated collaborative relationships both follow the theoretically optimal strategy and adapt to the actual data distribution; on the other hand, data-driven parameter updates and feature calibration feed back into, and achieve, the co-evolution and knowledge internalization goals preset by the theoretical model. The two work together within a unified framework to realize efficient collaboration and knowledge internalization among heterogeneous agents, embodying the core idea of "rational-mathematical fusion." This solves the technical problems of the prior art: multi-agent systems cannot collaborate effectively because of model heterogeneity, knowledge sharing is inefficient and prone to negative transfer, and rational and mathematical approaches are difficult to integrate deeply.

[0135] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0136] In the above embodiments of the present invention, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0137] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. A multi-agent collaborative evolution and knowledge routing method for rational-mathematical fusion, characterized in that it comprises: a routing coordinator acquiring the capability representation and semantic representation of each rational-mathematical fusion agent, and constructing a composite feature representation of each agent based on the capability representation and the semantic representation; the routing coordinator taking all agents as nodes and, based on the composite feature representations, using a graph attention network to dynamically mine implicit associations between heterogeneous nodes, calculating the cooperation weights between nodes, selecting a set of cooperative neighbors for each agent through a sparsity strategy, and distributing the set of cooperative neighbors and the corresponding cooperation weights to each agent; each agent receiving semantic knowledge from its cooperative neighbors, using its local features as the query condition, calibrating the received semantic knowledge along the channel dimension through an attention mechanism to generate calibrated neighbor features, and performing weighted aggregation of the calibrated neighbor features using the cooperation weights to generate a globally enhanced semantic context representation; and each agent internalizing external consensus knowledge into its local model by maximizing the mutual information between its local features and the globally enhanced semantic context representation, and jointly optimizing and updating the parameters of the local model based on the local task objective and the mutual information constraint.

2. The method according to claim 1, characterized in that each agent determines its own capability representation and semantic representation through the following steps: the agent detects local system resources, generates a resource budget vector covering computing power, memory capacity, and communication bandwidth, and maps the resource budget vector into a capability embedding vector through a capability encoder, the embedding serving as its capability representation; and the agent uses a semantic projection head to map local heterogeneous features into a shared semantic space, calculates a prototype vector for each output semantic dimension based on local private data, and concatenates the prototype vectors of all output semantic dimensions into a semantic feature vector that serves as its semantic representation.

3. The method according to claim 1, characterized in that the operation in which the routing coordinator takes all agents as nodes and, based on the composite feature representations, uses a graph attention network to dynamically mine implicit associations between heterogeneous nodes and calculate the cooperation weights between nodes includes: using the composite feature representation of each agent as the input node feature of the graph attention network, and transforming the composite feature representations with a learnable linear transformation matrix; for any two agent nodes i and j, concatenating the transformed features and calculating the attention coefficient $e_{ij}$ of node i to node j using the attention weight vector, according to the formula $e_{ij} = \mathbf{a}^{\top}\left[\mathbf{W}\mathbf{h}_i \,\|\, \mathbf{W}\mathbf{h}_j\right]$, where $\mathbf{h}_i$ is the composite feature representation of agent i, $\mathbf{h}_j$ is the composite feature representation of agent j, $\mathbf{W}$ is the learnable linear transformation matrix, $\mathbf{a}$ is the attention weight vector, and $\|$ denotes the concatenation operation; and normalizing the attention coefficients to obtain the final cooperation weight of node i to node j, according to the formula $\alpha_{ij} = \exp(e_{ij}) \,/\, \sum_{k \in \mathcal{N}_i} \exp(e_{ik})$, where $\alpha_{ij}$ is the final cooperation weight of node i to node j, $\mathcal{N}_i$ is the set of all neighboring nodes connected to node i, and $e_{ik}$ is the attention coefficient of node i to node k.
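
The two steps of this claim (a score computed from the concatenated transformed features, then normalization over the neighbors) can be sketched as follows. This is an illustrative sketch under stated assumptions: the normalization is taken to be a softmax, the function names and toy values are hypothetical, and no nonlinearity is applied to the score because the claim text does not name one (the original graph attention network formulation applies a LeakyReLU at this point).

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def attention_coefficient(h_i, h_j, W, a):
    """Score of node i towards node j: a^T [W h_i || W h_j]."""
    joint = matvec(W, h_i) + matvec(W, h_j)  # list '+' is concatenation
    return sum(a_k * z_k for a_k, z_k in zip(a, joint))

def cooperation_weights(i, neighbours, H, W, a):
    """Softmax of the scores e_ik over k in neighbours; returns {k: alpha_ik}."""
    scores = {k: attention_coefficient(H[i], H[k], W, a) for k in neighbours}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

# Toy composite features for three agents (hypothetical values).
H = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
W = [[1.0, 0.0], [0.0, 1.0]]   # identity transform, for readability
a = [0.0, 0.0, 1.0, 0.0]       # attends to the first transformed channel of the neighbour
alpha = cooperation_weights(0, [1, 2], H, W, a)
```

With these toy values agent 2 scores 1 and agent 1 scores 0, so agent 0 assigns the larger cooperation weight to agent 2, and the weights sum to 1.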

4. The method according to claim 3, characterized in that the operation of selecting a set of cooperative neighbors for each agent through a sparsity strategy and distributing the set of cooperative neighbors and the corresponding cooperation weights to each agent includes: for each agent i, sorting all potential neighbor nodes in descending order of the calculated final cooperation weights $\alpha_{ik}$; retaining the top K neighbors with the highest cooperation weights to construct the cooperative neighbor set, according to the formula $\mathcal{N}_i^{K} = \operatorname{TopK}_{k}\left(\alpha_{ik}\right)$, where $\mathcal{N}_i^{K}$ is the cooperative neighbor set and $\alpha_{ik}$ is the final cooperation weight of node i to node k; and distributing the cooperative neighbor set $\mathcal{N}_i^{K}$ and the corresponding final cooperation weights to each agent.
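
The sort-and-truncate step of this claim is small enough to show directly. The sketch below is illustrative only; the function name and the example weights are hypothetical, and the retained weights are passed on unchanged because the claim does not state whether they are renormalized after pruning.

```python
def select_cooperative_neighbours(weights, k):
    """Keep the k neighbours with the highest cooperation weight.
    `weights` maps neighbour id -> cooperation weight; ties keep input order."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])

# Hypothetical cooperation weights of one agent towards three candidates.
candidate_weights = {1: 0.1, 2: 0.6, 3: 0.3}
neighbour_set = select_cooperative_neighbours(candidate_weights, k=2)
```

The coordinator would send `neighbour_set` (ids and weights together) to the agent, so the agent only exchanges semantic knowledge with K partners instead of all other agents.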

5. The method according to claim 1, characterized in that the operation in which each agent receives semantic knowledge from its cooperative neighbors, uses its local features as the query condition, and calibrates the received semantic knowledge along the channel dimension through an attention mechanism to generate calibrated neighbor features includes: for each cooperative neighbor j, concatenating the local features $\mathbf{f}_i$ of agent i with the semantic prototype $\mathbf{p}_j$ of cooperative neighbor j along the channel dimension to construct a joint feature vector containing the semantic information of both parties, according to the formula $\mathbf{z}_{ij} = \left[\mathbf{f}_i \,\|\, \mathbf{p}_j\right]$, where $\|$ denotes the concatenation operation; inputting the joint feature vector $\mathbf{z}_{ij}$ into a channel-level attention gating network, which learns the importance weights of the channels through two fully connected layers and uses the Sigmoid function to generate a channel response mask, according to the formula $\mathbf{m}_{ij} = \sigma\left(\mathbf{W}_2\,\delta\left(\mathbf{W}_1 \mathbf{z}_{ij}\right)\right)$, where $\mathbf{W}_1$ and $\mathbf{W}_2$ are learnable parameter matrices, $\delta$ is the ReLU function, $\sigma$ is the Sigmoid function, and each element of $\mathbf{m}_{ij}$ lies in [0,1] and represents the retention probability of the corresponding channel; and multiplying the generated channel response mask $\mathbf{m}_{ij}$ element-wise with the semantic prototype $\mathbf{p}_j$ of cooperative neighbor j to obtain the calibrated neighbor feature, according to the formula $\tilde{\mathbf{p}}_{ij} = \mathbf{m}_{ij} \odot \mathbf{p}_j$, where $\odot$ denotes element-wise multiplication.
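
The gating steps of this claim (concatenate, two fully connected layers with ReLU and Sigmoid, element-wise masking) can be sketched as below. This is a sketch under stated assumptions rather than the claimed implementation: bias terms are omitted because the claim lists only two parameter matrices, and the function names, matrix shapes, and values are hypothetical.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def calibrate_neighbour(f_i, p_j, W1, W2):
    """Gate neighbour prototype p_j channel by channel, conditioned on local features f_i.
    W1: hidden x (len(f_i) + len(p_j)); W2: len(p_j) x hidden."""
    z = f_i + p_j                                   # joint feature vector [f_i || p_j]
    hidden = [max(0.0, v) for v in matvec(W1, z)]   # first FC layer + ReLU
    mask = [sigmoid(v) for v in matvec(W2, hidden)] # second FC layer + Sigmoid
    calibrated = [m * p for m, p in zip(mask, p_j)] # element-wise product
    return mask, calibrated

# Hypothetical two-channel example.
f_i = [1.0, 2.0]
p_j = [4.0, -2.0]
W1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
W2 = [[0.0, 0.0], [1.0, 0.0]]
mask, calibrated = calibrate_neighbour(f_i, p_j, W1, W2)
```

Because the mask depends on both `f_i` and `p_j`, the same neighbor prototype can be attenuated differently by different receiving agents, which is the sense in which the local features act as the query.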

6. The method according to claim 5, characterized in that the operation of performing weighted aggregation of the calibrated neighbor features using the cooperation weights to generate a globally enhanced semantic context representation includes: obtaining the cooperative neighbor set $\mathcal{N}_i^{K}$ issued by the routing coordinator and the corresponding cooperation weights $\alpha_{ij}$; and, for all neighbors in the cooperative neighbor set, performing a weighted summation of the calibrated neighbor features $\tilde{\mathbf{p}}_{ij}$ with the corresponding cooperation weights $\alpha_{ij}$ to generate the globally enhanced semantic context representation, according to the formula $\mathbf{c}_i = \sum_{j \in \mathcal{N}_i^{K}} \alpha_{ij}\,\tilde{\mathbf{p}}_{ij}$, where $\mathbf{c}_i$ is the globally enhanced semantic context representation of agent i.
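
The weighted summation of this claim can be illustrated with a few lines of code. The function name and the example values are hypothetical; the sketch assumes every neighbor listed in the weights has a calibrated feature of the same length.

```python
def aggregate_context(calibrated, weights):
    """Weighted sum of calibrated neighbour features.
    calibrated: {neighbour id: feature vector}; weights: {neighbour id: cooperation weight}."""
    dim = len(next(iter(calibrated.values())))
    context = [0.0] * dim
    for j, alpha in weights.items():
        for d, value in enumerate(calibrated[j]):
            context[d] += alpha * value
    return context

# Hypothetical calibrated features and cooperation weights for two neighbours.
calibrated_features = {2: [1.0, 0.0], 3: [0.0, 2.0]}
coop_weights = {2: 0.75, 3: 0.25}
context = aggregate_context(calibrated_features, coop_weights)
```

Here `context` evaluates to `[0.75, 0.5]`: each channel is the cooperation-weighted mix of what the retained neighbors contributed on that channel.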

7. The method according to claim 1, characterized in that the operation in which each agent internalizes external consensus knowledge into its local model by maximizing the mutual information between its local features and the globally enhanced semantic context representation, and jointly optimizes and updates the parameters of the local model based on the local task objective and the mutual information constraint, includes: agent i constructing a positive sample pair from its local features $\mathbf{f}_i$ and the globally enhanced semantic context representation $\mathbf{c}_i$, and constructing negative sample pairs from the local features $\mathbf{f}_i$ and the context representations of other agents; agent i calculating the mutual information loss using the InfoNCE contrastive loss function, according to the formula $\mathcal{L}_{MI} = -\log \dfrac{\exp\left(\mathrm{sim}(\mathbf{f}_i, \mathbf{c}_i)/\tau\right)}{\exp\left(\mathrm{sim}(\mathbf{f}_i, \mathbf{c}_i)/\tau\right) + \sum_{\mathbf{c}^{-} \in \mathcal{B}} \exp\left(\mathrm{sim}(\mathbf{f}_i, \mathbf{c}^{-})/\tau\right)}$, where $\mathrm{sim}(\cdot,\cdot)$ is the cosine similarity, $\tau$ is the temperature coefficient, and $\mathcal{B}$ is the set of negative samples; agent i constructing the overall optimization objective function of the local model and jointly optimizing the local task loss and the mutual information loss, according to the formula $\mathcal{L}_{total} = \mathcal{L}_{task}(y, \hat{y}) + \lambda\,\mathcal{L}_{MI}$, where $\mathcal{L}_{task}$ is the local task loss function, taken as the cross-entropy loss and representing the agent's own learning objective, $\lambda$ is the balancing hyperparameter, $y$ is the true label of the local task, and $\hat{y}$ is the prediction output of the local model for the input sample; and updating the parameters of the local model using the backpropagation algorithm, so that the local features $\mathbf{f}_i$, while maintaining the heterogeneous structure, learn toward the space of the globally enhanced semantic context representation $\mathbf{c}_i$, thereby internalizing external consensus knowledge.
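
The loss construction of this claim can be sketched numerically as follows. The sketch only evaluates the losses and does not perform backpropagation. It assumes the common InfoNCE form in which the positive pair also appears in the denominator; the function names, the default temperature, the default balancing weight, and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def info_nce(f_i, c_i, negatives, tau=0.1):
    """InfoNCE loss with (f_i, c_i) as the positive pair and the context
    representations of other agents as negatives."""
    pos = cosine(f_i, c_i) / tau
    logits = [pos] + [cosine(f_i, c) / tau for c in negatives]
    top = max(logits)  # log-sum-exp with the maximum subtracted for stability
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denominator - pos

def cross_entropy(probs, label):
    """Cross-entropy of predicted class probabilities against the true label."""
    return -math.log(probs[label])

def total_loss(task_loss, mi_loss, lam=0.5):
    """Joint objective: local task loss plus lam times the mutual information loss."""
    return task_loss + lam * mi_loss

# Toy example: the local feature is aligned with its own context and
# orthogonal to the single negative context.
mi = info_nce([1.0, 0.0], [1.0, 0.0], negatives=[[0.0, 1.0]], tau=1.0)
task = cross_entropy([0.5, 0.5], label=0)
loss = total_loss(task, mi, lam=0.5)
```

Minimizing the first term pulls the local feature toward its own aggregated context and away from the contexts of other agents, while the second term keeps the local task accuracy in the objective; the balancing weight trades one against the other.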

8. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program is executed, a processor performs the method according to any one of claims 1 to 7.

9. A multi-agent collaborative evolution and knowledge routing device for rational-mathematical fusion, characterized in that it comprises: multiple rational-mathematical fusion agents and a routing coordinator, wherein the routing coordinator is communicatively connected to the multiple agents; the routing coordinator is used to acquire the capability representation and semantic representation of each agent, and to construct a composite feature representation of each agent based on the capability representation and the semantic representation; the routing coordinator is also used to take all agents as nodes, dynamically mine implicit associations between heterogeneous nodes based on the composite feature representations, calculate the cooperation weights between nodes, select a set of cooperative neighbors for each agent through a sparsity strategy, and distribute the set of cooperative neighbors and the corresponding cooperation weights to each agent; each agent is used to receive semantic knowledge from its cooperative neighbors, use its local features as the query condition, calibrate the received semantic knowledge along the channel dimension through an attention mechanism to generate calibrated neighbor features, and perform weighted aggregation of the calibrated neighbor features using the cooperation weights to generate a globally enhanced semantic context representation; and each agent is also used to internalize external consensus knowledge into its local model by maximizing the mutual information between its local features and the globally enhanced semantic context representation, and to jointly optimize and update the parameters of the local model based on the local task objective and the mutual information constraint.

10. The device according to claim 9, characterized in that each agent includes: a capability mapping module, used to detect local system resources, generate a resource budget vector covering computing power, memory capacity, and communication bandwidth, and map the resource budget vector into a capability embedding vector through a capability encoder, the embedding serving as the agent's capability representation; and a semantic projection module, used to map local heterogeneous features into a shared semantic space using a semantic projection head, calculate a prototype vector for each output semantic dimension based on local private data, and concatenate the prototype vectors of all output semantic dimensions into a semantic feature vector that serves as the agent's semantic representation.