Radio map construction and non-cooperative radiated source positioning method based on agent interaction
By combining dynamic sampling by a UAV swarm with Gaussian process regression to generate a predicted distribution map, and adding multi-agent consensus perception and a task-driven semantic recovery module, the method addresses the accuracy and efficiency problems of radio map construction and radiation source localization in complex electromagnetic environments, achieving efficient collaborative perception and localization.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUILIN UNIV OF ELECTRONIC TECH
- Filing Date
- 2026-03-26
- Publication Date
- 2026-06-19
Figure CN122248526A_ABST
Description
Technical Field
[0001] This invention relates to the field of radio map construction and non-cooperative radiation source localization, and in particular to a method for radio map construction and non-cooperative radiation source localization based on agent interaction.
Background Technology
[0002] Methods for radiation source localization and radio map construction fall into two main types: model-driven and data-driven. Existing technologies in this area suffer from the following prominent problems.
Dual-task collaboration: radio map construction and radiation source localization are generally treated separately. Localization accuracy depends heavily on the quality of the constructed map, yet map construction receives no corrective feedback from the localization results, so information flows in one direction only and accuracy reaches a bottleneck. Some methods implement both tasks within the same framework, but they mostly execute them sequentially or optimize them independently and lack a collaborative optimization mechanism.
Use of physical priors: existing methods fail to integrate the physical laws of signal propagation into feature learning, and data-driven methods lack physical interpretability. Methods that introduce external semantic information only concatenate it at the input layer, so the physical prior does not continuously guide feature extraction. This leads to edge artifacts under sparse sampling and limited generalization ability.
Multi-agent collaboration: existing methods lack information interaction and consistent representation among agents. Most use a single UAV or a fixed sensor network and do not address multi-agent collaborative perception. A few methods can handle data from multiple receivers, but all data is fed centrally into the network, which does not model the interaction between agents. In scenarios with aliased multi-source signals, features extracted by different agents are often confused because physical constraints are missing, making it difficult to form a consistent representation of the radiation source target. Furthermore, feature selection strategies are mostly designed statically and cannot be adjusted dynamically according to real-time cognitive biases.
System evolution: existing methods lack closed-loop learning and adaptive optimization mechanisms. They are mostly trained offline and optimized statically, with fixed model parameters, so they cannot adjust their strategies autonomously in dynamic environments. Performance degrades sharply when the environment changes, and the system cannot continuously learn and accumulate experience from interaction results. No closed learning loop of interaction, feedback, and optimization has been formed, which hinders self-evolution and continuous improvement.
Balance of efficiency and accuracy: existing methods do not treat communication efficiency as an optimization objective. Transmitting large-scale spectrum data places a heavy burden on limited computing power and channel resources. Existing methods either focus only on reconstruction accuracy or reduce data volume through quantization without incorporating efficiency into the optimization objective. They lack task-driven semantic selection mechanisms, making it difficult to balance accuracy and efficiency dynamically in resource-constrained scenarios.
The common root cause of these problems is the lack of a unified multi-agent collaborative framework and a dual-task joint optimization mechanism. As a result, the accuracy, efficiency, and adaptability of spectrum cognition in complex electromagnetic environments cannot simultaneously meet the needs of practical applications.
Summary of the Invention
[0003] The purpose of this invention is to provide a method for radio map construction and non-cooperative radiation source localization based on intelligent agent interaction to address the shortcomings of existing technologies, thereby improving the accuracy of radio map construction and radiation source localization.
[0004] To achieve the above objective, embodiments of the present invention provide a method for constructing radio maps and locating non-cooperative radiation sources based on intelligent agent interaction, comprising the following steps:
Step 1: Dynamically sample spatial electromagnetic signals using a swarm of unmanned aerial vehicles (UAVs). Over a gridded region, apply Gaussian process regression to the signal propagation prior and the sparse observation data to model a predicted distribution map that represents the spatial existence probability of the radiation source target. Set up a two-dimensional position encoding that integrates the global geometric topology information of the radiation source target with the multi-agent consensus-perceived estimate of its spatial existence probability. Fuse the predicted distribution map with the two-dimensional position encoding by convolution, then apply nonlinear activation, convolutional mapping, and a normalization function to generate the spatial attention weight map.
Step 2: Weight the sparse observation data element-wise, then apply multi-layer convolutional projection and nonlinear mapping to generate an initial task-oriented representation. Incorporate the spatial attention weight map into the multi-head attention mechanism as a physical prior bias to generate a consistent representation of the radiation source target, and output the consistency representation after reconstruction through a residual connection and a feedforward network.
Step 3: Establish a feature map for the joint representation of multiple agents using adaptive convolutional kernels; mine deep semantic features by stacking convolutional kernels with a fixed, constant number of filters, combined with average pooling for semantic feature filtering and dimensionality reduction; dynamically adjust the number of constant filters and output the semantic feature map.
Step 4: Design a task-oriented semantic recovery module. The module introduces a geometric topology position encoder and concatenates its output with the semantic feature map; a masked multi-head self-attention network filters out features in the semantic feature map whose confidence is below a threshold; a transposed convolutional network reconstructs the semantic features; and parallel fully connected layers output the radio map and the radiation source localization result, which together form the semantic feature interaction results.
Step 5: Construct a joint loss function that integrates radio map construction and radiation source localization. Based on a multi-dimensional task-aware state vector and a semantically driven continuous action space, apply soft attention weighting to the semantic feature map and design a composite reward function that integrates dual-task accuracy and interaction efficiency, so as to jointly optimize the accuracy and the interaction efficiency of the two tasks.
[0005] Optionally, the two-dimensional position encoding is formed by concatenating the output of the global geometric topology position encoder with the output of the consensus-aware encoder.
[0006] Optionally, step 1 includes the following steps:
Step 1.1: Select the region of interest and divide it into a grid. The region, defined over the real numbers, is discretized along its two coordinate axes into a matrix grid, and the set of discretized grid cells is defined as the spatial set. Any two grid cells referred to together are two different cells belonging to this same set. The region of interest is the target region in which the UAV swarm dynamically acquires spatial electromagnetic signals.
Step 1.2: Based on the spatial set, the signal propagation prior, and the sparse observation data from multiple agents, perform Gaussian process regression to obtain a predicted distribution map of the continuous signal field, which characterizes the spatial existence probability of the radiation source target. The Gaussian process is specified by a signal mean function of the spatial sampling location and a signal covariance function of pairs of spatial sampling locations.
Step 1.3: Set up a two-dimensional position encoding that integrates the global geometric topology information of the radiation source target with the multi-agent consensus-perceived estimate of its spatial existence probability. The encoding combines a global geometric topology position encoder with a consensus-aware encoder; the latter constructs a probability estimate of the spatial location of the radiation source target.
Step 1.4: Fuse the predicted distribution map with the two-dimensional position encoding by convolution, using a feature concatenation operation, to obtain preliminary fused features concatenated along the channel dimension.
Step 1.5: Apply nonlinear activation, convolutional mapping, and a normalization function to the fused features to generate the spatial attention weight map.
Optionally, the signal mean function uses the logarithmic-distance path loss model as the signal propagation prior, together with a balancing weight. The signal covariance function is parameterized by the signal variance and by a distance scale that determines the signal correlation between any two points.
[0007] Optionally, step 2 includes:
Step 2.1: Weight the sparse observation data element-wise, then apply multi-layer convolutional projection and nonlinear mapping to generate the initial representation of the radiation source target. The projection uses a projection matrix sized by the position-encoding dimension, a batch normalization operation, a bias vector for each output channel, and an activation function.
Step 2.2: Apply layer normalization to each agent's initial representation of the task objective, and project the normalized features into a query matrix, a key matrix, and a value matrix through three independent learnable linear transformation matrices.
Step 2.3: Multiply the physical prior bias term derived from the spatial attention weight map with the dot product of the query matrix and the key matrix; scaling and normalization then yield the attention weights. Use these weights to form a weighted sum of the elements of the value matrix, which gives the output of each attention head. Concatenate the outputs of all heads and fuse them through a linear projection layer to obtain the consistency representation of the radiation source target.
Step 2.4: Join this consistency representation with the initial representation through a residual connection and feed the result into a feedforward network for semantic reconstruction, outputting the consistent representation after nonlinear mapping.
[0008] Optionally, in the attention head output of step 2.3, the query, key, and value matrices are those obtained, at a given layer of the self-attention mechanism, from the agents' initial representations of the task objective after layer normalization and linear transformation. The key matrix enters the dot product transposed. A dimension-adaptation function maps the spatial attention weight map to the dimensions of the attention matrix. The scaling factor is based on the dimension of the key matrix, which is the same for the matrices within one layer and is set by the total number of attention heads. The consistency representation of the radiation source target is obtained by applying a learnable output linear projection weight matrix to the concatenated head outputs.
[0009] Optionally, step 2.4 includes the following steps: the consistency representation from step 2.3 and the initial representation are joined by a residual connection and layer-normalized to obtain an intermediate representation. The intermediate representation is fed into the feedforward network for nonlinear feature mapping and deep semantic feature reconstruction. The output of the feedforward network and the intermediate representation are then joined by a second residual connection and layer-normalized, and the consistent representation is output. Before each residual connection, a linear mapping aligns the channel counts of the two inputs, which are then added element by element for cross-layer feature reuse.
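The prior-modulated attention of step 2.3 can be illustrated with a minimal NumPy sketch. The exact formula is not legible in this text, so the sketch follows the wording of step 2.3 literally: the prior is multiplied element-wise with the query-key dot product before scaling and softmax normalization. The function name `prior_modulated_attention`, the argument names, and the assumption that the prior has already been adapted to an n-by-n matrix are all hypothetical; layer normalization and the feedforward stage are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def split_heads(x, num_heads):
    """Reshape (n, d_model) into (num_heads, n, d_k)."""
    n, d_model = x.shape
    return x.reshape(n, num_heads, d_model // num_heads).transpose(1, 0, 2)

def prior_modulated_attention(features, prior, w_q, w_k, w_v, w_o, num_heads):
    """Multi-head attention whose scores are modulated by a spatial prior.

    features: (n, d_model) initial representations, one row per grid token.
    prior:    (n, n) spatial attention weights, assumed already adapted
              to the attention-matrix dimensions (hypothetical input).
    """
    n, d_model = features.shape
    d_k = d_model // num_heads
    q = split_heads(features @ w_q, num_heads)
    k = split_heads(features @ w_k, num_heads)
    v = split_heads(features @ w_v, num_heads)
    scores = q @ k.transpose(0, 2, 1)                 # (heads, n, n)
    # Assumption: multiplicative modulation, as step 2.3 is worded.
    scores = scores * prior[None, :, :] / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                               # (heads, n, d_k)
    concat = heads.transpose(1, 0, 2).reshape(n, d_model)
    return concat @ w_o                               # output projection
```

With a prior of all ones this reduces to ordinary scaled dot-product attention; with a prior of all zeros every token attends uniformly, so all output rows coincide.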
[0010] Optionally, step 3 includes:
Step 3.1: Using adaptive convolutional kernels combined with He-normal weight initialization and batch normalization, establish the feature map for the multi-agent joint representation, and introduce the LeakyReLU activation function to strengthen the selection of nonlinear features relevant to the dual-task process. In the feature mapping operation, each convolutional layer takes that layer's input features (for the first layer, the consistency representation output by step 2), applies a two-dimensional convolution with the layer's adaptive convolution weight matrix and adaptive bias term, then applies batch normalization, which scales and shifts the features along the channel dimension, and finally applies the LeakyReLU activation function with its negative-slope coefficient.
Step 3.2: Further strengthen semantic interaction by stacking convolutional kernels with a fixed, constant number of filters, followed by average pooling for semantic feature selection and dimensionality reduction.
Step 3.3: Based on the feature map scale, dynamically adjust the number of constant filters, and output a semantic feature map containing semantic information from both the radio map construction task and the radiation source localization task. In the defining formula, two local traversal indices run along the height and width directions within the average pooling window, and the channel capacity denotes the number of feature channels in the semantic feature map.
[0011] Optionally, step 3.2 includes the following steps: taking the semantic features output from step 3.1 as input, multiple groups of stacked convolutional modules are constructed in sequence for hierarchical, cascaded feature extraction. Within each group, filters of constant dimension are used for deep semantic feature mining, and average pooling is performed after the convolution operations. For a given layer of a given group, the output features are computed from the group's input features using that layer's fixed-dimension convolution weight matrix and bias matrix, followed by the LeakyReLU activation function with its negative-slope coefficient. The deep semantic features extracted from the last layer of the group are then average-pooled. In the pooling formula, the output is indexed by the relative position of the semantic feature in two-dimensional space after pooling and by the feature channel; discrete spatial indices run along the height and width of the feature map; each group has its own average pooling window size and sliding stride; and two local traversal indices run along the height and width directions within the pooling window.
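As an illustration of the activation and pooling operations named in steps 3.1 and 3.2, the following NumPy sketch applies LeakyReLU and window-and-stride average pooling to a feature map of shape (height, width, channels). The function names, the default negative slope of 0.01, and the choice to drop incomplete windows at the border are assumptions for illustration; the convolution and batch normalization stages are omitted.

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU: identity for positive inputs, small slope for negatives."""
    return np.where(x > 0, x, negative_slope * x)

def average_pool(feature, window, stride):
    """Average pooling over a (height, width, channels) feature map.

    Each output cell (i, j) is the mean of the window whose top-left
    corner sits at (i * stride, j * stride) in the input, computed
    independently for every channel. Incomplete border windows are dropped.
    """
    height, width, channels = feature.shape
    out_h = (height - window) // stride + 1
    out_w = (width - window) // stride + 1
    out = np.empty((out_h, out_w, channels))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = feature[top:top + window, left:left + window, :]
            out[i, j, :] = patch.mean(axis=(0, 1))
    return out

# Example: a 4x4 single-channel map pooled with a 2x2 window and stride 2.
demo = np.arange(16.0).reshape(4, 4, 1)
pooled = average_pool(leaky_relu(demo), window=2, stride=2)
```

The product `i * stride` is what locates the absolute position of the sliding window on the input feature map, while the indices inside `patch` play the role of the local traversal indices along height and width.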
[0012] Optionally, step 4 includes the following steps:
Step 4.1: Concatenate the output of the geometric topology position encoder, whose dimensions match the grid, with the semantic feature map.
Step 4.2: Use a masked multi-head self-attention network to filter out features in the semantic feature map whose confidence scores are below a threshold, generating a filtered feature map. Then use a transposed convolutional network to restore the spatial resolution step by step, upsampling the filtered feature map to the original image size. This reconstructs the spatial feature distribution, completes the task-related semantic feature recovery, and outputs a high-resolution feature map.
Step 4.3: Feed the high-resolution feature map output by the transposed convolutional network into parallel fully connected layers, which perform the two tasks in parallel: radio map construction and radiation source localization. The output is the semantic feature interaction results for the radio map construction and non-cooperative radiation source localization tasks.
[0013] The beneficial effects of the above-mentioned technical solution of the present invention are as follows: (1) The present invention obtains a predicted distribution map based on the Gaussian regression process and generates a spatial attention weight map by combining two-dimensional position encoding. This map integrates the predicted distribution of the signal field, geometric topology and the probability estimation of the spatial location of the radiation source target, providing guidance with physical interpretability for subsequent multi-agent consensus joint characterization.
[0014] (2) This invention proposes to convert the attention weight map into a bias term and integrate it into the attention mechanism. Selective feature extraction is performed on the sparse observation data of multiple agents, so that each agent is guided by a unified physical prior in the initial feature interaction stage, and a consistent joint representation of the radiation source target is generated. This method effectively captures the long-range correlation between deep feature information and radiation source target, improves the collaborative cognitive ability of multiple agents to multiple radiation source targets, enhances the perception robustness and scene adaptability of the system in sparse sampling and complex environment, and suppresses information interference.
[0015] (3) The present invention designs a task-driven semantic filtering and interaction strategy mechanism, which dynamically selects key features for transmission based on task requirements and continuously optimizes the interaction strategy in the reinforcement learning closed loop. This method improves the semantic filtering capability and system adaptability, and enhances the system's generalization capability and dual-task collaborative interaction efficiency in complex scenarios.
[0016] (4) The present invention embeds spatial topological information and uses a masked self-attention mechanism to suppress low-confidence features, thereby giving the feature recovery process clear spatial constraints and improving the semantic recovery quality of the system in complex electromagnetic environments.
Attached Figure Description
[0017] Figure 1 is a schematic flow diagram of the method of the present invention;
Figure 2 shows the training and validation loss for radio map construction in this invention;
Figure 3 shows the training and validation loss for radiation source localization in this invention;
Figure 4 shows the ground-truth radio map and the true locations of the radiation sources;
Figure 5 shows a sparse, non-uniformly sampled map;
Figure 6 shows the estimated radio map and radiation source locations;
Figure 7 shows the trend of radio map construction RMSE versus signal-to-noise ratio when the two tasks are completed jointly and separately;
Figure 8 shows the trend of radiation source localization error versus signal-to-noise ratio when the two tasks are completed jointly and separately;
Figure 9 compares the accuracy of different algorithms on the radio map construction task as a function of sampling rate;
Figure 10 compares the accuracy of different algorithms on the radio map construction task under varying signal-to-noise ratios.
Detailed Implementation
[0018] To make the technical problems, technical solutions and advantages of the present invention clearer, a detailed description will be given below in conjunction with the accompanying drawings and specific embodiments.
[0019] It should be understood that the phrase "one embodiment" or "an embodiment" throughout the specification means that a specific feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the invention. Therefore, "in one embodiment" or "in an embodiment" appearing throughout the specification do not necessarily refer to the same embodiment. Furthermore, these specific features, structures, or characteristics can be combined in any suitable manner in one or more embodiments.
[0020] In various embodiments of the present invention, it should be understood that the sequence number of each process described below does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
[0021] Example 1: As shown in Figure 1, this embodiment provides a method for constructing radio maps and locating non-cooperative radiation sources based on intelligent agent interaction, comprising the following steps:
Step 1: For complex electromagnetic environments, use a swarm of UAVs to dynamically sample spatial electromagnetic signals. Over a gridded region, apply Gaussian process regression to the signal propagation prior and the sparse observation data to model a predicted distribution map that represents the spatial existence probability of the radiation source target. Set up a two-dimensional position encoding that integrates the global geometric topology information of the radiation source target with the multi-agent consensus-perceived estimate of its spatial existence probability. Fuse the predicted distribution map with the two-dimensional position encoding by convolution, then apply nonlinear activation, convolutional mapping, and a normalization function to generate the spatial attention weight map. This map provides physically interpretable guidance for the subsequent multi-agent consensus joint representation. The agents in this embodiment are multi-rotor unmanned aerial vehicles (UAVs).
Step 2: Weight the sparse observation data element-wise, then apply multi-layer convolutional projection and nonlinear mapping to generate an initial task-oriented representation. Incorporate the spatial attention weight map into the multi-head attention mechanism as a physical prior bias to generate a consistent representation of the radiation source target, and output the consistency representation after reconstruction through a residual connection and a feedforward network. Through attention modulation guided by the physical prior, this module performs selective feature extraction based on the target prior. It captures the long-range dependencies of radiation source targets, which improves the agents' ability to mine deep feature associations for the target, strengthens perception robustness and scene adaptability under sparse sampling and in complex electromagnetic environments, and suppresses interference from irrelevant information.
Step 3: Starting from the consistency representation of the radiation source target obtained in Step 2, a task-driven semantic filtering and interaction strategy is designed to achieve efficient, selective semantic interaction among agents. The strategy uses adaptive convolutional kernels to establish the feature map for the multi-agent joint representation; mines deep semantic features by stacking convolutional kernels with a fixed, constant number of filters, combined with average pooling for semantic feature filtering and dimensionality reduction; and dynamically adjusts the number of constant filters to output the semantic feature map. Through this task-driven interaction mechanism, the design improves semantic filtering capability and system adaptability, and with them the system's generalization ability and the efficiency of dual-task collaborative interaction in complex scenarios.
Step 4: Design a task-oriented semantic recovery module. The module introduces a geometric topology position encoder and concatenates its output with the semantic feature map; a masked multi-head self-attention network filters out features in the semantic feature map whose confidence is below a threshold; a transposed convolutional network reconstructs the semantic features; and parallel fully connected layers output the radio map and the radiation source localization result, which form the preliminary semantic feature interaction results. Through multi-level semantic interaction and structured fusion, the module embeds spatial topological information and dynamically suppresses low-confidence information and noise, improving the semantic recovery quality, robustness, and dual-task collaborative efficiency of the system in complex electromagnetic environments.
Step 5: Based on the semantic feature interaction results output in Step 4, construct an embodied-intelligence closed feedback loop that seeks the optimal dual-task collaborative strategy. Construct a joint loss function that integrates radio map construction and radiation source localization. Based on a multi-dimensional task-aware state vector and a semantically driven continuous action space, apply soft attention weighting to the semantic feature map and design a composite reward function that integrates dual-task accuracy and interaction efficiency, so as to jointly optimize the accuracy and the interaction efficiency of the two tasks. This enables the system to balance the two tasks dynamically, suppress redundant information, continuously feed back and optimize its strategy, and autonomously update and store the optimal interaction strategy as long-term memory. Task accuracy and interaction efficiency in complex electromagnetic environments improve as a result, along with the system's adaptability and robustness.
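To make the idea of a composite reward in Step 5 concrete, here is a minimal Python sketch. The patent text does not give the reward formula, so everything in it is an assumption for illustration: the function name `composite_reward`, the linear combination, the default weights, and the definition of interaction efficiency as the fraction of semantic features that were not transmitted.

```python
def composite_reward(map_rmse, loc_error, transmitted, total,
                     w_map=1.0, w_loc=1.0, w_eff=0.5):
    """Hypothetical composite reward for the dual-task closed loop.

    map_rmse:    radio map reconstruction error (lower is better).
    loc_error:   radiation source localization error (lower is better).
    transmitted: number of semantic features actually exchanged.
    total:       number of semantic features available for exchange.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= transmitted <= total:
        raise ValueError("transmitted must lie between 0 and total")
    # Assumed efficiency term: share of features that did not need sending.
    efficiency = 1.0 - transmitted / total
    # Errors are penalized, efficiency is rewarded.
    return -w_map * map_rmse - w_loc * loc_error + w_eff * efficiency

# Same task errors, fewer transmitted features: the reward increases.
r_dense = composite_reward(2.0, 3.0, transmitted=100, total=100)
r_sparse = composite_reward(2.0, 3.0, transmitted=25, total=100)
```

A reward of this shape lets a reinforcement learning agent trade a small loss in accuracy for a large saving in transmitted features, which is the balance between accuracy and interaction efficiency that Step 5 describes.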
[0022] Optionally, the two-dimensional position encoding is formed by concatenating the output of the global geometric topology position encoder with the output of the consensus-aware encoder.
[0023] Optionally, step 1 includes the following steps:
Step 1.1: For multi-agent (multi-rotor UAV) collaborative perception scenarios in complex electromagnetic environments, and to facilitate preprocessing of the raw sampled data, divide the region of interest into a grid. The region, defined over the real numbers, is discretized along its two coordinate axes into a matrix grid, and the set of discretized grid cells is defined as the spatial set. Any two grid cells referred to together are two different cells belonging to this same set.
Step 1.2: Based on the spatial set, the signal propagation prior, and the sparse observation data from multiple agents, perform Gaussian process regression to obtain a predicted distribution map of the continuous signal field, which characterizes the spatial existence probability of the radiation source target (formula (1)). The Gaussian process is specified by the signal RSS mean function of the spatial sampling location, which characterizes the short-time statistical properties of signal strength in its spatial distribution, and by the signal covariance function of pairs of spatial sampling locations.
Step 1.3: Set up a two-dimensional position encoding that integrates the global geometric topology information of the radiation source target with the multi-agent consensus-perceived estimate of its spatial existence probability (formula (2)). The global geometric topology position encoder generates smooth, continuous spatial coordinate references that establish the geometric topology and distance dependence of the spatial distribution of radiation source targets between grid cells (formula (3)). In formula (3), the denominator base is preferably 1000. Its size depends on the region of interest: the larger the region, the larger the denominator base, and it must be a positive integer power of 10. The dimension of the position encoding should be a positive integer power of 2 and smaller than the dimension of the input data. The comma in the formula separates dimension indices, that is, the spatial and channel indices of the three-dimensional feature tensor, and is used to locate the absolute coordinates of the average pooling sliding window on the input feature map.
[0024] Regarding the value of the dimension index: because the position encoding is two-dimensional, the total encoding dimension is distributed evenly between the horizontal and vertical directions, so each direction receives half of the data bits. The position encoding is output in pairs, so each encoding calculation generates two data bits at once, and the number of calculations required in one direction is therefore half the number of data bits in that direction. The dimension index ranges over these calculations. Each calculation generates one encoding pair, the pairs are filled in sequence into the data bits of the corresponding direction, and the two directions are then concatenated to obtain the complete two-dimensional position encoding, which represents the spatial location of a grid cell.
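The layout described above can be sketched in NumPy as follows. Formula (3) is not legible in this text, so the sketch assumes the familiar sine and cosine form of positional encoding, with each pair sharing one frequency and the denominator base defaulting to the preferred value of 1000. The function name `position_encoding_2d` and the exact exponent are assumptions.

```python
import numpy as np

def position_encoding_2d(rows, cols, dim=16, base=1000.0):
    """Two-dimensional position encoding of shape (rows, cols, dim).

    The first dim/2 channels encode the row index and the last dim/2
    encode the column index. Each direction uses dim/4 (sin, cos) pairs.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2                      # data bits per direction
    pair_index = np.arange(half // 2)    # one index per (sin, cos) pair
    # Assumed frequency schedule, following the common sinusoidal form.
    freq = 1.0 / base ** (2.0 * pair_index / half)

    def encode(positions):
        angles = positions[:, None] * freq[None, :]
        out = np.empty((len(positions), half))
        out[:, 0::2] = np.sin(angles)    # first bit of each pair
        out[:, 1::2] = np.cos(angles)    # second bit of each pair
        return out

    row_code = encode(np.arange(rows, dtype=float))
    col_code = encode(np.arange(cols, dtype=float))
    encoding = np.empty((rows, cols, dim))
    encoding[:, :, :half] = row_code[:, None, :]   # broadcast over columns
    encoding[:, :, half:] = col_code[None, :, :]   # broadcast over rows
    return encoding
```

For a total dimension of 16, each direction gets 8 data bits produced by 4 pair calculations, which matches the counting argument in the paragraph above.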
[0025] The consensus-aware encoder constructs a probability estimate of the spatial location of the radiation source target. Specifically, it builds a global perception confidence level for the multiple agents regarding the spatial distribution of the radiation source target. Because the mean output by the Gaussian process regression is a single-channel scalar, a multilayer perceptron (MLP) is used to apply high-dimensional projection and nonlinear mapping to it (formula (4)). The variance activation mask used in formula (4) is defined in formula (5) by a confidence threshold on spatial correlation. The initial value of this threshold is set manually (for example, 0.5) and is then adjusted dynamically according to step 5. The threshold filters out grid pairs with sufficiently strong correlation: the mask is active at a position when the spatial correlation between the corresponding grid sampling data meets the confidence threshold.
[0026] Step 1.4: Fuse the predicted distribution map with the two-dimensional position encoding by convolution, using a feature concatenation operation, to obtain preliminary fused features concatenated along the channel dimension (formula (6)).
Step 1.5: Apply nonlinear activation, convolutional mapping, and a normalization function to the fused features to generate the spatial attention weight map (formula (7)).
Optionally, the signal RSS mean function of the spatial sampling location is given by formula (8). It uses the logarithmic-distance path loss model as the signal propagation prior and serves as a gradient constraint from the physical prior; it is not an explicit term added directly to the observed signal strength. It includes a balancing weight. The signal covariance function of the spatial sampling location is given by formula (9), parameterized by the signal variance and by the distance scale. The distance scale is a core hyperparameter of the Matérn 3/2 kernel: it determines the signal correlation between any two points and can be set empirically or from physical priors. The signal correlation between two points decreases as their distance increases; the smaller the distance scale, the faster the correlation decays, and the larger the distance scale, the slower it decays. If the distance scale is much smaller than the sampling interval, the model treats points that are even slightly apart as almost uncorrelated, producing overly jagged, noisy predictions. If the distance scale is much larger than the region, the model treats signals as highly correlated across the entire region, producing overly smooth predictions that lose local details and features. The distance scale is therefore usually set to between several times and ten times the average sampling interval. The distance argument of the kernel is the distance between any two different grid points in space.
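The role of the distance scale can be explored with a small Gaussian process regression sketch in NumPy. It uses the standard Matérn 3/2 covariance and the standard logarithmic-distance path loss formula. As a simplification that may differ from formulas (8) and (9), the path loss is used here as an explicit prior mean around a hypothetical source guess, and all parameter values (`p0`, `n`, `sigma2`, `length`, `noise`) are illustrative assumptions.

```python
import numpy as np

def path_loss_mean(points, source_xy, p0=-30.0, n=2.5, d0=1.0):
    """Log-distance path loss: P(d) = p0 - 10 * n * log10(d / d0), in dB."""
    d = np.linalg.norm(points - source_xy, axis=1)
    d = np.maximum(d, d0)  # clamp to the reference distance to avoid log(0)
    return p0 - 10.0 * n * np.log10(d / d0)

def matern32(a, b, sigma2, length):
    """Matern 3/2 covariance between point sets a (m, 2) and b (k, 2)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    r = np.sqrt(3.0) * d / length
    return sigma2 * (1.0 + r) * np.exp(-r)

def gp_predict(x_obs, y_obs, x_grid, source_guess,
               sigma2=16.0, length=20.0, noise=1.0):
    """Posterior mean and variance of received signal strength on a grid."""
    k_oo = matern32(x_obs, x_obs, sigma2, length) + noise * np.eye(len(x_obs))
    k_go = matern32(x_grid, x_obs, sigma2, length)
    resid = y_obs - path_loss_mean(x_obs, source_guess)
    chol = np.linalg.cholesky(k_oo)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, resid))
    mean = path_loss_mean(x_grid, source_guess) + k_go @ alpha
    v = np.linalg.solve(chol, k_go.T)
    var = sigma2 - np.sum(v ** 2, axis=0)  # prior variance minus explained part
    return mean, var
```

Re-running `gp_predict` with `length` far below or far above the average spacing of `x_obs` reproduces the two failure modes described above: a posterior that collapses to the prior away from the samples, or one that is smoothed across the whole region.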
[0027] In this embodiment, all radiation sources refer to non-cooperative radiation sources. Non-cooperative radiation source localization means that there is no pre-agreed cooperative relationship or signal interaction protocol between the localization system and the target radiation source. The UAV passively receives the signals naturally radiated by the radiation source, and the sampled data includes complex components such as the radiation source's transmitted signal, channel frequency response, noise, and multipath interference. Based on the spatial attenuation characteristics of the signal, Gaussian process regression combined with path loss physical priors is used to infer the predicted distribution of received signal strength at each grid location within the region, thereby inferring the approximate location of the radiation source. During subsequent model training, the accurate estimation of the radiation source location is continuously optimized without requiring any cooperative information from the radiation source.
[0028] This embodiment systematically integrates model-driven and data-driven approaches, as well as task collaboration and adaptive decision-making. First, the scheme uses Gaussian process regression to combine the physical priors of signal propagation with sparse observation data, generating a spatial attention weight map containing physical laws. This provides interpretable and continuous guidance for subsequent deep feature learning, effectively enhancing the model's robustness under sparse sampling and suppressing artifacts. Second, this physical prior is injected as a bias term into the multi-head attention mechanism, enabling dispersed observations from multiple agents to form a consistent joint representation of the radiation source target within a unified semantic space. This solves the feature confusion problem under multi-source data aliasing and improves the accuracy of collaborative cognition. Third, a task-driven semantic filtering strategy and a reinforcement learning-based closed-loop optimization system are designed, allowing the system to perceive the environment and task status in real time, dynamically balancing dual-task accuracy and communication efficiency, achieving adaptive trade-offs. Finally, the system possesses the ability to continuously learn, accumulate experience, and solidify the optimal strategy into long-term memory, forming a self-evolving "perception-decision-optimization" intelligent closed loop, significantly improving adaptability and overall performance in dynamic and complex electromagnetic environments.
[0029] Example 2: This example should be understood as including at least all the features of any of the foregoing examples, and as further improving upon them. Step 2 includes the following steps. Step 2.1: the sparse observation data are weighted element-wise and then passed through multi-layer convolutional projection and nonlinear mapping to generate the initial representation of the radiation source target, as given in equation (10). The quantities in equation (10) are a projection matrix, the dimension of the position encoding, a batch normalization operation, the bias vector of each output channel, and an activation function. The formula first applies a 3×3 convolution to the weighted input to extract spatial features, batch-normalizes the convolution result to standardize the feature distribution, then adjusts the number of channels by a convolution with the projection matrix and bias parameters, and finally obtains the output by applying the Swish activation function. Step 2.2: layer normalization is applied to each agent's initial representation of the radiation source target (the mission objective), and the normalized features are projected into a query matrix, a key matrix, and a value matrix by three independent learnable linear transformation matrices. Step 2.3: the spatial attention weight map, acting as a physical prior bias term, is multiplied with the dot product of the query matrix and the key matrix; the result is scaled and normalized to obtain the attention weights; the elements of the value matrix are weighted and summed with these weights to produce the output of each attention head; and the outputs of all heads are concatenated and fused by a linear projection layer to obtain the consistent representation of the radiation source target. Step 2.4: a residual connection is formed between this consistent representation and the initial representation, which preserves the initial local perceptual features and effectively alleviates the vanishing-gradient problem in deep networks; the result is then fed into a feedforward network for semantic reconstruction, and the final consistent representation is output after nonlinear mapping.
[0030] Optionally, the output of each attention head in step 2.3 is given by equation (11). In equation (11), the query matrix, key matrix, and value matrix are those obtained, in the given self-attention layer, by applying layer normalization and a linear transformation to the multiple agents' initial representations of the task objective; the transpose symbol denotes the matrix transpose; the adaptation function maps the spatial attention weight map to the dimensions of the attention matrix; the scaling uses the dimension of the key matrix, and within the same layer the query matrix and the key matrix have the same dimension; and the head count is the total number of attention heads. The consistent representation of the radiation source target is given by equation (12), in which the output weight matrix is a learnable linear projection.
[0031] Optionally, step 2.4 includes the following steps. The consistent representation obtained in step 2.3 and the initial representation are combined by a residual connection, and the intermediate representation is obtained through layer normalization, as given in equation (13). The intermediate representation is fed into the feedforward neural network, which performs nonlinear mapping of the features and deep semantic feature reconstruction. The output of the feedforward network and the intermediate representation are then combined by a residual connection followed by layer normalization, and the final consistent representation is output, as given in equation (14). Before each residual connection, a linear mapping aligns the number of channels of the two operands, and cross-layer feature reuse is then achieved by element-wise addition. The remaining operators in equations (13) and (14) denote the layer normalization operation and the feedforward network, respectively.
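Steps 2.2 to 2.4 can be sketched in NumPy as follows. This is a hedged illustration rather than the patented network: the adaptation function is assumed to yield a token-by-token matrix `prior` (here a hypothetical outer product of per-token weights), the feedforward activation is assumed to be ReLU, all weight matrices are random placeholders, and the channel counts are assumed to match already so that no alignment mapping is needed before the residual additions. Following the text, the prior is multiplied with the query-key dot product before scaling and softmax normalization.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token's feature vector to zero mean and unit scale."""
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def prior_attention(f0, prior, wq, wk, wv, wo, n_heads):
    """f0: (tokens, dim); prior: (tokens, tokens) adapted spatial attention weights."""
    x = layer_norm(f0)
    q, k, v = x @ wq, x @ wk, x @ wv
    dk = q.shape[-1] // n_heads
    heads = []
    for h in range(n_heads):
        s = slice(h * dk, (h + 1) * dk)
        # Prior multiplies the query-key dot product, then scaling and softmax.
        scores = prior * (q[:, s] @ k[:, s].T) / np.sqrt(dk)
        heads.append(softmax(scores) @ v[:, s])
    return np.concatenate(heads, axis=-1) @ wo   # concatenate heads, project

def encoder_block(f0, prior, params, n_heads=4):
    attn = prior_attention(f0, prior, *params["attn"], n_heads)
    mid = layer_norm(f0 + attn)                  # residual + layer norm
    w1, w2 = params["ffn"]
    ffn = np.maximum(mid @ w1, 0.0) @ w2         # feedforward network (ReLU assumed)
    return layer_norm(mid + ffn)                 # second residual + layer norm

rng = np.random.default_rng(0)
tokens, dim = 6, 16                              # e.g. one token per UAV (assumed)
f0 = rng.normal(size=(tokens, dim))
w = rng.uniform(0.1, 1.0, size=tokens)           # hypothetical per-token prior weights
prior = np.outer(w, w)
params = {
    "attn": [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(4)],
    "ffn": [rng.normal(scale=0.1, size=(dim, 4 * dim)),
            rng.normal(scale=0.1, size=(4 * dim, dim))],
}
out = encoder_block(f0, prior, params)
```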
[0032] This embodiment processes the spatial attention weight map through an adaptation function and multiplies it as a bias term with the query-key dot product, thereby dynamically and quantitatively modulating the attention distribution and ensuring that physical laws are deeply integrated into the feature interaction process. Simultaneously, the embodiment introduces standardized designs such as layer normalization, batch normalization, and residual connections to ensure the stability of deep attention network training, effectively alleviate gradient problems, and accelerate convergence. Residual connections enable cross-layer reuse of initial local features and deep global semantic information, preserving detail awareness while integrating contextual relationships, significantly enhancing the representation quality and information capacity of the final features. This modular and clear design not only provides a reliable engineering implementation path but also possesses good scalability due to its clear interface and functional cohesion, laying a solid foundation for building more complex collaborative perception systems.
[0033] Example 3: This example should be understood as including at least all the features of any of the foregoing examples, and as further improving upon them. Step 3 includes the following steps. Step 3.1: using the consistent representation of the radiation source target output by step 2, a task-driven selective semantic interaction strategy is designed. A convolutional mapping module for the multi-agent joint representation is established first: adaptive convolution kernels with He-normal weight initialization, followed by batch normalization and the LeakyReLU activation function, establish the feature mapping of the multi-agent joint representation, and the LeakyReLU activation enhances the selection of nonlinear features relevant to the dual tasks. The feature mapping operation of the multi-agent joint representation is defined in equation (15). In equation (15), the input features are defined for each convolutional layer, with a separate definition for the initial layer; the remaining symbols denote the two-dimensional convolution operation, the adaptive convolution weight matrix of the layer, and the adaptive convolution bias term of the layer. The initial values of the weights and biases are obtained by the He-normal initialization method. The batch normalization operation scales and shifts the features along the channel dimension, and the activation is the LeakyReLU function with a negative-slope penalty (where the penalty coefficient is ...). Step 3.2: semantic interaction is further enhanced by stacking convolution kernels with constant filters (that is, a fixed kernel size and a fixed number of channels), followed by average pooling (preferably with a stride of 2); this multi-level "extraction-aggregation" mechanism achieves the selection and dimensionality reduction of deep semantic features.
[0034] Step 3.3: according to the scale of the feature map obtained after the dimensionality reduction in step 3.2, the number of constant filters is adjusted dynamically, as given in equation (16), and a semantic feature map containing the semantic information of the radio map construction and radiation source localization tasks is output. In equation (16), the two local indices traverse the average pooling window along the height direction and the width direction, respectively. The feature channel capacity of the semantic feature map enables selective interaction of deep semantic features among the multiple agents; it is a parameter preset according to requirements.
[0035] Optionally, step 3.2 includes the following steps. Taking the semantic features output by step 3.1 as input, multiple groups of stacked convolution modules are constructed in sequence for hierarchical, cascaded feature extraction. In each group, constant-dimension filters are used to mine deep semantic features, and average pooling is performed after each convolution operation. For these groups, the output features of a given layer in a given group are defined by equation (17). In equation (17), the symbols denote the input features of the group, the output features of the layer within the group, the fixed-dimension convolution weight matrix of that layer, and the bias matrix of that layer; the activation is the LeakyReLU function with a negative-slope penalty (where the penalty coefficient is ...). The deep semantic features extracted by the last layer of the group are then average-pooled, as given in equation (18). In equation (18), the output is the feature value on each channel at the relative position of the semantic feature in two-dimensional space after pooling; the position is given by discrete spatial indices along the height and the width of the feature map; the pooling is parameterized by the average pooling window size and the sliding stride of the group; and the two local indices traverse the pooling window along the height direction and the width direction.
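The pooling stage of step 3.2 can be sketched as follows. The sketch covers only the LeakyReLU activation and windowed average pooling with local indices along height and width; the convolution layers themselves are omitted. The negative slope of 0.01 is an assumed placeholder, since the coefficient is not given in this text, and the function names are hypothetical.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    """LeakyReLU; the negative slope value is an assumption."""
    return np.where(x > 0, x, slope * x)

def avg_pool2d(x, window=2, stride=2):
    """Average pooling. x: (channels, H, W) -> (channels, H_out, W_out)."""
    c, h, w = x.shape
    h_out = (h - window) // stride + 1
    w_out = (w - window) // stride + 1
    out = np.zeros((c, h_out, w_out))
    for p in range(window):        # local index along the height direction
        for q in range(window):    # local index along the width direction
            out += x[:, p:p + stride * h_out:stride, q:q + stride * w_out:stride]
    return out / window**2

# A 4 x 4 single-channel map pooled with a 2 x 2 window and stride 2.
demo = np.arange(16, dtype=float).reshape(1, 4, 4)
pooled = avg_pool2d(leaky_relu(demo))
```

With a stride of 2, each pooling stage halves the height and width, so a 32 × 32 feature map would shrink to 16 × 16 after one stage, which is the source of the data reduction described for this step.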
[0036] This embodiment mines deep consistency representations through adaptive convolution and fixed filter stacking, combined with step-by-step average pooling. This process simulates key feature extraction and dimensionality reduction transmission in multi-agent collaboration to save communication bandwidth, significantly reducing data volume while preserving the core semantic information of the task. Its dynamic resource allocation mechanism can calculate the required number of final filters in real time based on the spatial scale of the feature map, achieving an adaptive match between computational complexity and feature information density, enabling the system to intelligently focus processing resources on the most critical information regions. Finally, this step outputs a semantic feature map that has been deeply filtered and compressed, which fundamentally reduces the data processing burden of subsequent modules (especially in simulating communication or decision-making stages), providing a direct and efficient data interface for achieving collaborative optimization of "high precision" and "low overhead" at the entire system level.
[0037] Example 4: This example should be understood as including at least all the features of any of the foregoing examples, and as further improving upon them. Step 4 includes the following steps. Step 4.1: the position encoding of the specified dimension, which carries the geometric topological information of the grid, is concatenated with the semantic feature map, which effectively supplements the precise relative spatial topology information between agents. Step 4.2: a masked multi-head self-attention network filters out the features in the semantic feature map whose confidence scores are below a threshold, generating a filtered feature map; a transposed convolutional network then gradually restores the spatial resolution by upsampling the filtered feature map to the original image size, thereby reconstructing the spatial feature distribution, completing the task-related semantic feature recovery, and outputting a high-resolution feature map. Step 4.3: the high-resolution feature map output by the transposed convolutional network is fed into parallel fully connected layers, which perform the two tasks of radio map construction and radiation source localization in parallel. The output is the semantic feature interaction result for the radio map construction and non-cooperative radiation source localization tasks.
[0038] This embodiment introduces geometric position encoding and concatenates it with the semantic feature map to supplement the network with accurate relative spatial topological information, correcting the spatial structural relationships that may be lost during feature compression and laying a geometric foundation for accurate reconstruction. Utilizing a masked multi-head self-attention network, the contribution of low-confidence locations in the semantic feature map is calculated and suppressed, performing active noise filtering before feature fusion, significantly enhancing the robustness of the reconstruction process against complex electromagnetic interference. Subsequently, a transposed convolutional network is responsible for progressively upsampling the filtered features, finely reconstructing a high-quality feature map with the original spatial resolution. Finally, parallel fully connected layers synchronously decode the reconstructed features, outputting the radio map and radiation source coordinates respectively, achieving efficient and high-quality parallel reconstruction of the dual-task results, ensuring the practicality and accuracy of the final output.
[0039] Example 5: This example should be understood as including at least all the features of any of the foregoing examples, and as further improving upon them. Step 5 includes the following steps. Step 5.1: construct the joint loss function, as given in equation (19), which combines the loss function for radio map construction and the loss function for non-cooperative radiation source localization.
[0040] The ground-truth quantities in equation (19) are the real radio map and the real radiation source location coordinates. In this embodiment, the real radio map is the one in the training set, and the real radiation source location coordinates are those in the training set.
[0041] The estimated quantities are the estimated radio map and the estimated radiation source location coordinates. The two computational weights, one for the radio map construction task and one for the non-cooperative radiation source localization task, are the model's relative confidence levels in the two tasks, and they sum to 1. This design allows the UAVs to allocate resources adaptively between the tasks under limited computing power, constrains the ratio of the gradient contributions of the two tasks, and prevents either task from dominating the optimization process.
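A weighted joint loss of this kind can be sketched as below. The specific loss forms are assumptions chosen for illustration (mean squared error over sampling points for the map task, mean squared Euclidean distance over sources for the localization task), as are the function names; the only property taken directly from the text is that the two task weights sum to 1.

```python
import numpy as np

def map_loss(rss_true, rss_est):
    """Mean squared error over sampling points (assumed form)."""
    return float(np.mean((rss_true - rss_est) ** 2))

def loc_loss(xy_true, xy_est):
    """Mean squared Euclidean distance over radiation sources (assumed form)."""
    return float(np.mean(np.sum((xy_true - xy_est) ** 2, axis=-1)))

def joint_loss(rss_true, rss_est, xy_true, xy_est, w_map=0.5):
    """Weighted sum of both task losses; the two weights sum to 1."""
    w_loc = 1.0 - w_map
    return w_map * map_loss(rss_true, rss_est) + w_loc * loc_loss(xy_true, xy_est)

rss_true = np.array([-60.0, -70.0])
rss_est = np.array([-62.0, -69.0])
xy_true = np.array([[10.0, 20.0]])
xy_est = np.array([[13.0, 24.0]])
total = joint_loss(rss_true, rss_est, xy_true, xy_est, w_map=0.5)
```

Because the weights are complementary, raising `w_map` shifts the gradient contribution toward map reconstruction and away from localization, which is the lever the reinforcement learning loop of step 5 is described as adjusting.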
[0042] The loss function for radio map construction is given by equation (20), in which the terms are the measured radio signal strength at each sampling point, the model's estimate of the radio signal strength at that sampling point, and the total number of sampling points; in this embodiment, the measured radio signal strength is that in the training set. The loss function for non-cooperative radiation source localization is given by equation (21), in which the terms are the true horizontal and vertical coordinates of each radiation source, the model's estimates of those coordinates, and the total number of radiation sources. Step 5.2: construct a multidimensional task-aware state vector at each time step that reflects the system state and guides the weight adjustment, as given in equation (22). Its components are: the radio map construction loss of the batch at the current time step; the non-cooperative radiation source localization loss of the batch at the current time step; the moving average of the task estimation accuracy of the batch relative to the previous time step, which smooths noise and reflects the mid-term training trend; the first difference of the loss function; the hyperparameters of the perception model and the semantic cognitive communication network mentioned in the steps above; and the computational weights of the radio map construction task and of the non-cooperative radiation source localization task. Step 5.3: apply semantics-driven soft attention weighting to the semantic feature map using the action taken at each time step from the action space. The action is a continuous vector whose dimension equals the number of channels of the semantic feature map. Each component is the soft attention weight of the corresponding channel; multiplying this weight by the semantic features of that channel at the current time step gives adaptive weighting before feature transmission. Step 5.4: design the composite reward function at each time step, as given in equation (23). The reward consists of a precision reward term and an efficiency reward term. Equation (23) involves the radio map construction loss and the non-cooperative radiation source localization loss of the batch at each time step, a weighting coefficient for each of the two losses, the sum of the components of the action vector, the total number of channels of the semantic feature map, and a hyperparameter that scales the efficiency reward and thereby adjusts the global trade-off between accuracy and efficiency. The precision reward is computed from the decrease in the radio map construction loss and in the non-cooperative radiation source localization loss at the next time step, with the two coefficients balancing the importance of the two tasks. The efficiency reward encourages the agent to compress the data it transmits: the compression ratio is computed as the ratio of the sum of the components of the current action vector to the total number of channels, and a logarithmic transformation is then applied to it. This provides a clear gradient incentive in the range of efficient compression while avoiding over-compression that would cause a loss of accuracy.
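The action weighting of step 5.3 and the composite reward of step 5.4 can be sketched as below. The exact form of equation (23) is not reproduced in this text, so the sign convention of the efficiency term, the small constant `eps`, and the coefficient values `alpha`, `beta`, and `lam` are assumptions; only the ingredients (loss decreases, the ratio of the action sum to the channel count, and a logarithm of that ratio) come from the description.

```python
import numpy as np

def apply_action(features, action):
    """Soft channel weighting. features: (C, H, W); action: (C,) weights in [0, 1]."""
    return features * action[:, None, None]

def composite_reward(loss_prev, loss_next, action,
                     alpha=0.5, beta=0.5, lam=0.1, eps=1e-8):
    """Precision reward plus efficiency reward (assumed functional form).

    loss_prev / loss_next: (map_loss, loc_loss) at the current and next step.
    """
    precision = (alpha * (loss_prev[0] - loss_next[0])
                 + beta * (loss_prev[1] - loss_next[1]))
    ratio = action.sum() / action.size           # share of channel capacity in use
    efficiency = -lam * np.log(ratio + eps)      # grows as fewer channels are kept
    return float(precision + efficiency)

features = np.ones((12, 4, 4))
full = np.ones(12)                               # transmit every channel unchanged
half = np.full(12, 0.5)                          # halve every channel weight
weighted = apply_action(features, half)
r_full = composite_reward((4.0, 30.0), (3.0, 20.0), full)
r_half = composite_reward((4.0, 30.0), (3.0, 20.0), half)
```

In this toy comparison the loss decrease is held fixed, so the more compressed action earns the higher reward; in training, heavier compression would normally reduce the precision term, which is the trade-off the scaling hyperparameter controls.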
[0043] Step 5.5: based on the state, action, and reward designed in steps 5.2 to 5.4, the feedback policy is optimized with the Proximal Policy Optimization (PPO) algorithm. At each time step, the agent selects an action according to the current state, executes it, and receives a reward. The resulting state-action-reward sequence is stored in real time in an experience replay buffer. Generalized advantage estimation is then used to search for the policy network parameters that minimize the accuracy loss in the dual tasks of radio map construction and radiation source localization.
[0044] This embodiment constructs a self-learning and continuously optimizing decision-making closed loop. It formalizes the multi-objective trade-off problem in dynamic environments as a standard reinforcement learning problem, explicitly defining a multi-dimensional state vector that quantifies system performance, a continuous action space that represents resource allocation strategies, and a composite reward function that integrates dual-task precision rewards and communication efficiency rewards. Using the Proximal Policy Optimization (PPO) algorithm, the system can learn autonomously, through real-time interaction with the environment, which optimization strategy (such as adjusting semantic feature weights or dual-task weights) to adopt under specific performance conditions, with the aim of maximizing long-term overall benefit. This enables fully automatic, real-time dynamic balancing and strategy tuning without external intervention. Furthermore, the system can store the optimal policy generated iteratively as long-term memory, so that it can accumulate experience, prevent catastrophic forgetting, and adapt quickly to similar environments. Ultimately, it forms a self-evolving cognitive system that becomes increasingly capable with use, addressing the performance degradation of traditional static models in dynamic and complex scenarios. Example 6: This example should be understood as including at least all the features of any of the foregoing examples, and as further improving upon them. In step 5.5, using generalized advantage estimation to search for the policy network parameters that minimize the accuracy loss in the dual tasks of radio map construction and radiation source localization includes the following steps. Step 5.5.1: using generalized advantage estimation (GAE), the advantage estimate of the current state-action pair at time t is computed by exponentially weighted smoothing of the multi-step temporal-difference (TD) errors. The advantage estimate evaluates how much better or worse an action is than the average policy. It depends on the smoothing control parameter of the generalized advantage estimation, the discount factor, the time-step offset, and the TD error at each offset time step. The TD error at time t is defined as the immediate reward at time t plus the discounted value network output at the next time step, minus the value network output at time t. The clipped objective function with a policy entropy constraint, whose minimizing parameters are sought, is given by equation (24). In equation (24), the expectation is taken over all time steps; the parameters are those of the policy network; and the probability ratio is the ratio of the action probability under the updated policy network to that under the policy network before the update. The ratio gives an intuitive measure of how much, under the same environmental conditions, the probability of taking the action (which represents the soft attention weights assigned to the channels of the semantic feature map) has changed under the new policy relative to the old policy. The clipping function limits the magnitude of each policy update so that the new policy does not differ too much from the old one, which keeps training stable.
[0045] The two policy probabilities in the ratio are the probabilities of taking the action in the state at time step t under the policy network after and before the update, respectively. The advantage estimate of the current state-action pair at time t evaluates the performance advantage of the action relative to the system's average policy; physically, it indicates whether the action taken performs better than the average policy. The clipping hyperparameter is set to 0.2 in practice, which confines the change between the old and new policies to the specified interval, prevents the agent from taking too large a learning step at once (which could destabilize the system), and ensures the stability of policy updates. The policy entropy represents the randomness of the system's decision-making; this term is introduced to encourage the agents to explore new interaction actions and to avoid becoming trapped in local optima. The entropy coefficient controls the exploration of new actions and is usually set empirically to 0.01. Step 5.5.2: obtain the state at time t, input it into a preset value network, and obtain the state value output by the value network. The TD error at time t is computed from this value together with the immediate reward, the discount factor, and the value of the state at the next time step. A loss function is constructed from the square of the TD error, as given in equation (25), and its gradient with respect to the value network parameters is computed. Based on this gradient and a preset learning rate, gradient descent iteratively updates the value network parameters. Step 5.5.2 thus continually minimizes the TD error, so that the state value output by the value network approaches the target value composed of the immediate reward and the discounted value of the next state. The system stores the optimal policy produced by the iterations as long-term memory, which is used to achieve dual-task collaborative optimization of high-precision radio map construction and radiation source localization in complex electromagnetic environments.
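Steps 5.5.1 and 5.5.2 follow the standard PPO recipe, which can be sketched in NumPy as below. The clipping value 0.2 and the entropy coefficient 0.01 are the values stated above; the discount factor `gamma`, the smoothing parameter `lam`, and the function names are assumptions, and the policy and value networks themselves are omitted, so the functions operate on precomputed rewards, values, probability ratios, and entropy.

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation.

    values has len(rewards) + 1 entries; the last one is the bootstrap value.
    Returns the advantages and the one-step TD errors.
    """
    deltas = rewards + gamma * values[1:] - values[:-1]   # TD error at each step
    adv = np.zeros_like(deltas)
    running = 0.0
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running       # exponentially weighted sum
        adv[t] = running
    return adv, deltas

def ppo_loss(ratio, adv, entropy, clip_eps=0.2, ent_coef=0.01):
    """Negative clipped surrogate with entropy bonus (to be minimized)."""
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    return float(-(np.minimum(unclipped, clipped).mean() + ent_coef * entropy))

def value_loss(deltas):
    """Mean squared TD error used to train the value network."""
    return float(np.mean(deltas ** 2))

rewards = np.array([1.0, 1.0, 1.0])
values = np.zeros(4)
adv, deltas = gae(rewards, values, gamma=1.0, lam=1.0)
```

The clipping removes any incentive to move the probability ratio outside the interval from 0.8 to 1.2: with a ratio of 1.5 and a positive advantage of 1.0, the surrogate is capped at 1.2, so pushing the ratio further yields no additional gain.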
[0046] This invention constructs a unified dual-task architecture. First, a UAV swarm dynamically samples electromagnetic signals. Through Gaussian regression modeling and two-dimensional position encoding, a predicted distribution map and physical interpretability guidance information are obtained. Then, a consensus-aware prior-guided weighted feature extraction and consistency representation module generates task-oriented initial features and extracts radiation source target consistency enhancement representations. A task-driven selective semantic interaction strategy outputs a semantic feature map. The task-oriented semantic recovery module obtains the dual-task semantic interaction results. Finally, a reinforcement learning-driven embodied intelligent decision-making closed-loop system is used to achieve dual-task adaptive collaborative optimization. Under complex electromagnetic environments and sparse sampling conditions, this invention simultaneously improves the accuracy of radio map construction, the accuracy of non-cooperative radiation source localization, and the efficiency of multi-agent interaction.
[0047] The following experimental verification demonstrates the beneficial effects of the radio map construction and non-cooperative radiation source localization method based on agent interaction provided in the embodiments of this disclosure. To verify the effectiveness of the system framework, an open-source dataset generated with the Wireless InSite software and based on the Gudmundson correlation model was used. In the simulation experiments, the target area was divided into a grid with a spacing of 3.125 m (the simulation used an ideal grid, which is difficult to achieve precisely in real-world environments because of positioning accuracy limitations), and the carrier frequency was set to 1.4 GHz. To enhance the reliability of the results, the locations and number of the radiation sources (at a height of 1.5 m) were chosen randomly and uniformly, and the sources transmitted in each channel at random power levels between 18 dBm and 23 dBm. A sensor cluster of six UAVs was deployed and performed sparse sampling along optimized random trajectories. The initial sampling rate was set to 5%. The training set contained 40,000 samples, and the training and test sets were divided in a 9:1 ratio. Training was completed offline in the simulation environment. Experiments showed that the average runtime of the framework was approximately 0.0249 seconds per iteration. Because the number of feature channels was strictly limited to fewer than 12, the total number of floating-point operations needed to process a single frame of data was held to about 0.055 G. The end-to-end inference time of the model on a general-purpose CPU was consistently within 20 milliseconds, well below the 100-millisecond threshold required by the real-time control system, which leaves sufficient real-time margin. With this controllable computational overhead, the system supported asymmetric optimization strategies at low sampling rates, demonstrating good practical engineering value. Compared with this simulation experiment, the spatial scale of an actual deployment scenario is much larger and the UAVs sample in a distributed manner; mainstream methods such as optimization-based, potential-field, and sense-and-avoid approaches can effectively prevent UAV collisions. The training hyperparameters were 100 training epochs, a batch size of 27, and an initial learning rate of 0.0004. Detailed simulation parameter settings are shown in Table 1.
[0048] Table 1: Simulation Parameter Settings. Under the conditions of a 100 m × 100 m area, 3.125 m resolution, a center frequency of 1.4 GHz, a transmit power of 18-23 dBm, and a sampling rate of 5%, the mean square error (MSE) used to measure radio map reconstruction should be as low as about 3.5 dB. The mean absolute error (MAE) is used to measure the accuracy of radiation source localization and should be less than 5 m. The system should converge within 50 training epochs, and the latency of the real-time control system should typically be below 100 milliseconds (10 Hz) to meet the dynamic coordination requirements of UAV swarms.
[0049] Figures 2-3 show how the system's training and validation losses change with the training epoch for the dual tasks of radio map construction and non-cooperative radiation source localization. The loss of the radio map construction task stabilizes and converges after approximately ... training epochs, and the loss of the radiation source localization task converges after approximately ... training epochs. The training and validation losses of both tasks decrease rapidly and eventually stabilize, indicating that the system converges well.
[0050] Figures 4, 5, and 6 compare the system's visualization results for the radio map construction and radiation source localization tasks. Figure 4 is the real radio map, in which a white × marks the actual location of the radiation source. Figure 5 is the sparse, non-uniform sampling map. The colors in the map represent the power values measured at different grid points: the warmer the color of a grid cell, the higher the power and the higher the probability that a radiation source is present there; the cooler the color, the lower the power. A white grid cell represents a measured value of 0. Figure 6 shows the reconstructed radio map and the radiation source location estimates, with a black × marking each estimated radiation source location. Even with only 5% sparse sampling, the system accurately recovers the overall structure and local details of the radio map, and the boundaries of the radiation source distribution are clear and free of obvious artifacts. Furthermore, the localization results closely match the actual locations, which verifies the system's ability to perform both tasks simultaneously at a low sampling rate.
[0051] As shown in Figures 7-8, as the signal-to-noise ratio increases from 10 dB to 30 dB under 5% sparse sampling, the dual-task model based on collaborative optimization significantly outperforms the single-task model in both radio map construction and radiation source localization. The experimental results show that, through the joint optimization mechanism, the system improves the accuracy of radio map construction and radiation source localization by 18.2% and 43.5%, respectively, which verifies the collaborative performance advantage of this framework in low-sampling, complex electromagnetic environments.
[0052] To further evaluate the performance advantages of the proposed system in the radio map construction task, this invention compares the proposed method with two advanced baseline algorithms: (1) the fully convolutional deep completion autoencoder, an encoder-decoder completion algorithm built entirely from convolutional layers; (2) semantic communication, a completion algorithm that extracts semantic features through a quantized semantic extraction strategy.
[0053] As shown in Figures 9-10, at different sampling rates (corresponding to different numbers of sampling points), the RMSE of the proposed method for radio map construction is lower than that of the two baseline methods. Taking a 5% sampling rate (51 sampling points) as an example, compared with the fully convolutional deep completion autoencoder and the semantic communication method, the RMSE of the proposed method is reduced by 32.55% and 43.31%, respectively.
[0054] As shown in Figure 10, under different signal-to-noise ratios, the RMSE of the proposed method for radio map construction remains lower than that of the two baseline methods. Taking 10 dB as an example, compared with the fully convolutional deep completion autoencoder and the semantic communication method, the RMSE of the proposed method is reduced by 10.6% and 15.7%, respectively.
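The percentage reductions quoted in these comparisons are presumably relative reductions in RMSE with respect to each baseline. A minimal sketch of both quantities follows; the function names and the numeric inputs are hypothetical and are not the reported experimental values.

```python
import math

def rmse(pred, truth):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def relative_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# Hypothetical numbers chosen only to illustrate the arithmetic.
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
example_gain = relative_reduction(baseline=5.0, proposed=4.0)
print(example_rmse, example_gain)
```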
[0055] In this embodiment of the invention, a module can be implemented in software so that it can be executed by various types of processors. For example, an identified module of executable code may include one or more physical or logical blocks of computer instructions, which may be organized as objects, procedures, or functions. Nevertheless, the executable code of an identified module does not need to be physically located together; it may comprise different instructions stored in different locations which, when logically combined, constitute the module and achieve the module's intended purpose.
[0056] In practice, an executable code module can be a single instruction or many instructions, and can even be distributed across multiple different code segments, different programs, and across multiple memory devices. Similarly, operational data can be identified within the module and can be implemented in any suitable form and organized within any suitable data structure. This operational data can be collected as a single dataset or distributed across different locations (including different storage devices), and can exist, at least in part, solely as electronic signals within the system or network.
[0057] Where a module can be implemented in software, those skilled in the art can, given the current level of hardware technology and leaving cost aside, also build corresponding hardware circuits to achieve the same functions. These hardware circuits include conventional very-large-scale integration (VLSI) circuits or gate arrays, as well as existing semiconductors such as logic chips and transistors, or other discrete components. Modules can also be implemented using programmable hardware devices, such as field-programmable gate arrays, programmable array logic, and programmable logic devices.
[0058] The exemplary embodiments above are described with reference to the accompanying drawings. Many different forms and embodiments are feasible without departing from the spirit and teachings of the invention. Therefore, the invention should not be construed as limited to the exemplary embodiments set forth herein. Rather, these exemplary embodiments are provided so that the disclosure is complete and conveys the scope of the invention to those skilled in the art. In the drawings, component dimensions and relative dimensions may be exaggerated for clarity. The terminology used herein is for the purpose of describing particular exemplary embodiments only and is not intended to be limiting. As used herein, unless clearly indicated otherwise, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well. It will be further understood that the terms “comprising” and/or “including”, when used in this specification, indicate the presence of the stated features, integers, steps, operations, components, and/or elements, but do not exclude the presence or addition of one or more other features, integers, steps, operations, components, elements, and/or groups thereof. Unless otherwise indicated, a stated range of values includes the upper and lower limits of the range and any subranges in between.
[0059] The above description represents the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principles of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.
Claims
1. A method for radio map construction and non-cooperative radiation source localization based on agent interaction, characterized in that it comprises the following steps:
Step 1: dynamically sample spatial electromagnetic signals using a swarm of unmanned aerial vehicles (UAVs), and, based on the gridded region, the signal propagation prior, and the sparse observation data, use Gaussian process regression to model a predicted distribution map representing the spatial existence probability of the radiation source target; set up a two-dimensional position encoding that integrates the global geometric topology information of the radiation source target with the multi-agent consensus-perceived probability estimate of the spatial existence of the radiation source target; perform convolutional fusion of the obtained predicted distribution map with the two-dimensional position encoding, followed by nonlinear activation, convolutional mapping, and ... function normalization, to generate a spatial attention weight map;
Step 2: apply element-wise weighting to the sparse observation data, followed by multi-layer convolutional projection and nonlinear mapping, to generate an initial representation of the radiation source target; incorporate the spatial attention weight map as a physical prior bias into a multi-head attention mechanism to generate a consistency representation of the radiation source target; and output the consistency representation after reconstruction via a residual connection and a feedforward network;
Step 3: establish a feature mapping for the joint representation of multiple agents using adaptive convolutional kernels; perform deep semantic feature mining by stacking convolutional kernels with fixed constant filters, combined with average pooling to achieve semantic feature selection and dimensionality reduction; and dynamically adjust the number of constant filters to output a semantic feature map;
Step 4: design a task-oriented semantic recovery module, which introduces a geometric topology information position encoder and concatenates it with the semantic feature map; filters out, through a masked multi-head self-attention network, features in the semantic feature map whose confidence is below a threshold; reconstructs the semantic features using a transposed convolutional network; and outputs, through parallel fully connected layers, the preliminary semantic feature interaction results related to the radio map construction and radiation source localization tasks;
Step 5: construct a joint loss function that integrates radio map construction and radiation source localization; based on the multi-dimensional task-aware state vector and the semantics-driven continuous action space, apply soft attention weighting to the semantic feature map, and design a composite reward function that integrates dual-task accuracy and interaction efficiency, so as to collaboratively optimize the accuracy and interaction efficiency of the dual tasks.
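As a rough illustration of the joint loss in Step 5, the sketch below combines a map reconstruction term and a localization term as a weighted sum. The weighted-sum form, the weights `w_map` and `w_loc`, and the use of mean squared error and Euclidean distance are assumptions; the claim does not state the exact form of the loss, and the reward function is not modeled here.

```python
import math

def map_mse(pred_map, true_map):
    """Mean squared error over all cells of two equally sized 2D maps."""
    diffs = [(p - t) ** 2
             for pred_row, true_row in zip(pred_map, true_map)
             for p, t in zip(pred_row, true_row)]
    return sum(diffs) / len(diffs)

def loc_error(pred_xy, true_xy):
    """Euclidean distance between estimated and true source positions."""
    return math.dist(pred_xy, true_xy)

def joint_loss(pred_map, true_map, pred_xy, true_xy, w_map=1.0, w_loc=1.0):
    """Weighted sum of the two task losses (weights are hypothetical)."""
    return w_map * map_mse(pred_map, true_map) + w_loc * loc_error(pred_xy, true_xy)

loss = joint_loss([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 6.0]],
                  (0.0, 0.0), (3.0, 4.0), w_map=1.0, w_loc=0.5)
print(loss)
```

Because both terms share one scalar objective, an error in either task produces a gradient signal for the shared features, which is one simple way to realize the feedback between the two tasks.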
2. The method as described in claim 1, characterized in that: the two-dimensional position encoding is formed by concatenating a position encoder based on global geometric topology information with a consensus-aware encoder.
3. The method as described in claim 2, characterized in that Step 1 includes the following steps:
Step 1.1: select the region of interest and divide it into a grid; the region is defined over the real numbers and is discretized along its two coordinate axes into a matrix grid; the spatial set of the discretized grid is defined accordingly, and any two specific grid cells referred to below are two different cells belonging to the same spatial set;
Step 1.2: based on the spatial set, the signal propagation prior, and the sparse observation data, perform Gaussian process regression modeling to obtain the predicted distribution map of the continuous signal field, which characterizes the spatial existence probability of the radiation source target; the Gaussian process is specified by a signal mean function and a signal covariance function, each a function of the spatial sampling location;
Step 1.3: set up the two-dimensional position encoding that integrates the global geometric topology information of the radiation source target with the multi-agent consensus-perceived estimate of the spatial existence probability of the radiation source target; the encoding is composed of the global geometric topology information position encoder and the consensus-aware encoder, the latter being used to construct the probability estimate of the spatial existence of the radiation source target;
Step 1.4: perform convolutional fusion of the obtained predicted distribution map with the two-dimensional position encoding, concatenating them along the channel dimension to obtain the preliminary fused features;
Step 1.5: apply nonlinear activation, convolutional mapping, and function normalization to generate the spatial attention weight map.
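Steps 1.4 and 1.5 can be sketched as follows. This is only an illustration under stated assumptions: the convolutions are taken to be 1×1, the nonlinear activation is ReLU, and the normalizing function is a sigmoid; the claim text does not name these choices, and the array shapes and random inputs are hypothetical.

```python
import numpy as np

def spatial_attention_map(prior_map, pos_enc, w1, w2):
    """Fuse a (H, W) prior map with a (C, H, W) position encoding.

    Channel-wise concatenation, 1x1 convolution, ReLU, 1x1 convolution,
    then sigmoid normalization (all of these are assumed choices).
    """
    fused = np.concatenate([prior_map[None, :, :], pos_enc], axis=0)  # (C+1, H, W)
    hidden = np.maximum(np.einsum("oc,chw->ohw", w1, fused), 0.0)     # ReLU
    logits = np.einsum("oc,chw->ohw", w2, hidden)[0]                  # (H, W)
    return 1.0 / (1.0 + np.exp(-logits))                              # sigmoid

rng = np.random.default_rng(0)
H = W = 32
prior_map = rng.random((H, W))                # stand-in for the GP prediction
pos_enc = rng.standard_normal((4, H, W))      # stand-in for the 2D position encoding
w1 = rng.standard_normal((8, 5)) * 0.1        # 1x1 conv: 5 -> 8 channels
w2 = rng.standard_normal((1, 8)) * 0.1        # 1x1 conv: 8 -> 1 channel
attn_map = spatial_attention_map(prior_map, pos_enc, w1, w2)
print(attn_map.shape)
```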
4. The method as described in claim 3, characterized in that: the signal mean function of the spatial sampling location is given by the logarithmic distance path loss model, which serves as the signal propagation prior, scaled by a balancing weight; the signal covariance function of the spatial sampling location is determined by the signal variance, a distance scale that determines the signal correlation between any two points, and the distance between any two different grid points in space.
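A minimal sketch of Gaussian process regression with a path-loss prior mean is given below. Several parts are assumptions made for illustration: the squared-exponential form of the covariance, the reference position `src_xy` used by the path loss prior, and all parameter values (`p0_dbm`, `n_exp`, `weight`, `sigma_f`, `length`, `noise`). The claim only states that the covariance depends on a signal variance, a distance scale, and the distance between grid points.

```python
import numpy as np

def path_loss_mean(xy, src_xy, p0_dbm=20.0, n_exp=2.0, d0=1.0, weight=1.0):
    """Log-distance path loss prior mean in dBm, scaled by a balancing weight."""
    d = np.maximum(np.linalg.norm(xy - src_xy, axis=1), d0)
    return weight * (p0_dbm - 10.0 * n_exp * np.log10(d / d0))

def se_kernel(a, b, sigma_f=4.0, length=10.0):
    """Squared-exponential covariance (assumed form)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f ** 2 * np.exp(-d2 / (2.0 * length ** 2))

def gp_predict(x_obs, y_obs, x_new, src_xy, noise=1e-2):
    """Posterior mean of the signal field at x_new given sparse observations."""
    K = se_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_star = se_kernel(x_new, x_obs)
    resid = y_obs - path_loss_mean(x_obs, src_xy)
    return path_loss_mean(x_new, src_xy) + K_star @ np.linalg.solve(K, resid)

src = np.array([50.0, 50.0])                                  # hypothetical prior reference
x_obs = np.array([[10.0, 10.0], [60.0, 20.0], [30.0, 80.0]])  # sampled positions (m)
y_obs = path_loss_mean(x_obs, src) + np.array([1.0, -2.0, 0.5])
pred_at_obs = gp_predict(x_obs, y_obs, x_obs, src)
x_far = np.array([[90.0, 90.0]])
pred_far = gp_predict(x_obs, y_obs, x_far, src)
print(pred_at_obs, pred_far)
```

Near a sampled position the prediction follows the measurement; far from every sample it falls back to the path loss prior, which is how the propagation prior fills in unsampled cells.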
5. The method as described in claim 4, characterized in that Step 2 includes:
Step 2.1: apply element-wise weighting to the sparse observation data, followed by multi-layer convolutional projection and nonlinear mapping, to generate the initial representation of the radiation source target; this mapping uses a projection matrix whose size depends on the dimension of the position encoding, a batch normalization operation, a bias vector for each output channel, and an activation function ...;
Step 2.2: apply layer normalization to each agent's initial representation of the task objective, and project the normalized features into a query matrix, a key matrix, and a value matrix through three independent learnable linear transformation matrices;
Step 2.3: multiply the spatial attention weight map, as the physical prior bias term, with the dot product of the query matrix and the key matrix; after scaling and normalization, obtain the attention weights; use these attention weights to compute a weighted sum of the elements of the value matrix, generating the output of each attention head; concatenate all head outputs and fuse them through a linear projection layer to obtain the consistency representation of the radiation source target;
Step 2.4: apply a residual connection between this representation and the initial representation, input the result into a feedforward network for semantic reconstruction, and output the consistency representation after nonlinear mapping.
6. The method as described in claim 5, characterized in that: the output of each attention head in Step 2.3 is computed from the query matrix, key matrix, and value matrix obtained, in the given layer of the self-attention mechanism, from the multiple agents' initial representations of the task objective after layer normalization and linear transformation; the computation uses the matrix transpose, a function that adapts the spatial attention weight map to the dimensions of the attention matrix, the dimension of the key matrix, and the total number of attention heads; the consistency representation of the radiation source target is obtained by applying a learnable output linear projection weight matrix to the concatenated head outputs.
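The attention head described here can be sketched as scaled dot-product attention in which the prior map multiplies the query-key scores before normalization. The softmax normalization, the adapter `adapt_prior` (which flattens the map and repeats it across queries), and all shapes are assumptions; an additive bias would be an alternative reading of "physical prior bias" and is not shown.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax with the usual max subtraction for stability."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def adapt_prior(attn_map):
    """Hypothetical adapter: flatten (H, W) to N tokens, repeat over queries -> (N, N)."""
    a = attn_map.reshape(-1)
    return np.tile(a, (a.size, 1))

def attention_weights(Q, K, prior):
    """Prior-modulated, scaled, softmax-normalized attention weights."""
    return softmax(prior * (Q @ K.T) / np.sqrt(K.shape[-1]))

def multi_head(X, Wq, Wk, Wv, Wo, prior):
    """Concatenate all head outputs and fuse them with an output projection."""
    heads = []
    for wq, wk, wv in zip(Wq, Wk, Wv):
        Q, K, V = X @ wq, X @ wk, X @ wv
        heads.append(attention_weights(Q, K, prior) @ V)
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(0)
N, d_model, n_heads, d_k = 16, 8, 2, 4          # 16 tokens from a 4 x 4 grid
X = rng.standard_normal((N, d_model))           # layer-normalized initial representation
Wq, Wk, Wv = (rng.standard_normal((n_heads, d_model, d_k)) for _ in range(3))
Wo = rng.standard_normal((n_heads * d_k, d_model))
prior = adapt_prior(rng.random((4, 4)))
Z = multi_head(X, Wq, Wk, Wv, Wo, prior)
weights = attention_weights(X @ Wq[0], X @ Wk[0], prior)
print(Z.shape)
```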
7. The method as described in claim 6, characterized in that Step 2.4 includes the following steps: apply a residual connection between the consistency representation of the radiation source target and the initial representation, and obtain an intermediate representation through layer normalization; input the intermediate representation into a feedforward network to perform nonlinear feature mapping and deep semantic feature reconstruction; apply a residual connection and layer normalization to the output of the feedforward network and the intermediate representation; and finally output the consistency representation; before the residual connection, a linear mapping aligns the number of channels of the two inputs, after which cross-layer feature reuse is performed by element-wise addition.
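The residual and feedforward reconstruction in Step 2.4 can be sketched as a post-normalization block. The two-layer ReLU feedforward network, the omission of learnable layer normalization gain and bias, and all shapes are assumptions made for illustration.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learnable gain/bias)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def reconstruct(attn_out, init_repr, W_align, W1, b1, W2, b2):
    """Residual + layer norm, feedforward network, residual + layer norm."""
    aligned = init_repr @ W_align                    # align channel counts first
    mid = layer_norm(attn_out + aligned)             # intermediate representation
    ffn = np.maximum(mid @ W1 + b1, 0.0) @ W2 + b2   # assumed 2-layer ReLU network
    return layer_norm(mid + ffn)

rng = np.random.default_rng(0)
N, d_in, d_model, d_ff = 16, 6, 8, 32
attn_out = rng.standard_normal((N, d_model))
init_repr = rng.standard_normal((N, d_in))
W_align = rng.standard_normal((d_in, d_model))
W1, b1 = rng.standard_normal((d_model, d_ff)), np.zeros(d_ff)
W2, b2 = rng.standard_normal((d_ff, d_model)), np.zeros(d_model)
out = reconstruct(attn_out, init_repr, W_align, W1, b1, W2, b2)
print(out.shape)
```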
8. The method as described in claim 7, characterized in that Step 3 includes:
Step 3.1: using adaptive convolutional kernels combined with He-normal weight initialization and batch normalization, establish a feature mapping for the multi-agent joint representation, and introduce the LeakyReLU activation function to enhance the selection of nonlinear features related to the dual tasks; the feature mapping operation of the multi-agent joint representation is defined in terms of the input features of each convolutional layer (with the input of the first layer specified separately), a two-dimensional convolution operation, the adaptive convolution weight matrix of the layer, the adaptive convolution bias term of the layer, a batch normalization operation that scales and shifts features along the channel dimension, and the LeakyReLU activation function with a negative-slope coefficient;
Step 3.2: further enhance semantic interaction by stacking convolutional kernels with fixed constant filters, followed by average pooling to achieve semantic feature selection and dimensionality reduction;
Step 3.3: based on the feature map scale, dynamically adjust the number of constant filters, and output the semantic feature map containing semantic information from the radio map construction and radiation source localization tasks; the associated definitions use the local traversal indices along the height and width directions within the average pooling window and the feature channel capacity of the semantic feature map.
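One layer of Step 3.1 can be sketched as convolution, normalization, and LeakyReLU with He-normal initialized weights. This is an illustration under assumptions: the convolution is written as the cross-correlation used by deep learning frameworks with zero padding, the normalization is applied per channel to a single sample without learnable scale and shift, and the slope 0.01 and all shapes are hypothetical.

```python
import numpy as np

def he_normal(rng, out_ch, in_ch, k):
    """He-normal initialization: std = sqrt(2 / fan_in)."""
    fan_in = in_ch * k * k
    return rng.standard_normal((out_ch, in_ch, k, k)) * np.sqrt(2.0 / fan_in)

def conv2d_same(x, w, b):
    """2D convolution of x (C_in, H, W) with w (C_out, C_in, k, k), odd k, zero padding."""
    c_out, _, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    H, W = x.shape[1:]
    out = np.zeros((c_out, H, W))
    for i in range(k):
        for j in range(k):
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + H, j:j + W])
    return out + b[:, None, None]

def channel_norm(x, eps=1e-5):
    """Per-channel normalization over spatial positions (batch of one)."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def leaky_relu(x, slope=0.01):
    """LeakyReLU: identity for positive inputs, small slope for negative ones."""
    return np.where(x > 0, x, slope * x)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8))
w, b = he_normal(rng, out_ch=4, in_ch=2, k=3), np.zeros(4)
y = leaky_relu(channel_norm(conv2d_same(x, w, b)))
print(y.shape)
```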
9. The method as described in claim 8, characterized in that Step 3.2 includes the following steps: using the semantic features output from Step 3.1 as input, sequentially construct multiple groups of stacked convolutional modules for hierarchical cascaded feature extraction; in each group of stacked convolutional modules, use constant-dimensional filters for deep semantic feature mining, and perform average pooling after each convolution operation; for these groups, the output features of each layer in a given group are computed from the input features of that group module, the fixed-dimensional convolution weight matrix of that layer, and the bias matrix of that layer; the deep semantic features extracted from the last layer of the group undergo an average pooling operation, whose output gives the feature value on each channel at each relative position of the semantic features in two-dimensional space after pooling; the operation is indexed by discrete spatial indices along the height and width of the features, parameterized by the average pooling window size and the average pooling window sliding stride of the group module, and averaged over the local traversal indices along the height and width directions within the average pooling window.
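The average pooling operation, with a window size and a sliding stride and with local indices that traverse the window along height and width, can be sketched for a single channel as follows. The function name and the input values are illustrative, and no padding is assumed.

```python
def avg_pool2d(x, k, s):
    """Average pooling of a 2D list x with window size k and stride s (no padding)."""
    height, width = len(x), len(x[0])
    out_h = (height - k) // s + 1
    out_w = (width - k) // s + 1
    return [
        [
            # u and v are the local traversal indices inside the pooling window
            sum(x[i * s + u][j * s + v] for u in range(k) for v in range(k)) / (k * k)
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

feature = [[float(4 * r + c) for c in range(4)] for r in range(4)]  # values 0..15
pooled = avg_pool2d(feature, k=2, s=2)
print(pooled)
```

For the 4 × 4 input with values 0 to 15, a 2 × 2 window with stride 2 yields `[[2.5, 4.5], [10.5, 12.5]]`, halving each spatial dimension.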
10. The method as described in claim 9, characterized in that Step 4 includes the following steps:
Step 4.1: concatenate the geometric topology information position encoder, whose dimension matches the grid, with the semantic feature map;
Step 4.2: use a masked multi-head self-attention network to filter out features in the semantic feature map whose confidence scores are below a threshold, generating a filtered feature map; use a transposed convolutional network to gradually restore the spatial resolution of the feature map, upsampling the filtered feature map to the original image size, thereby reconstructing the spatial feature distribution, completing the task-related semantic feature recovery, and outputting a high-resolution feature map;
Step 4.3: feed the high-resolution feature map output by the transposed convolutional network into parallel fully connected layers to perform two tasks in parallel, namely radio map construction and radiation source localization; the output is the semantic feature interaction results related to the radio map construction and non-cooperative radiation source localization tasks.
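Two pieces of Step 4.2 can be illustrated briefly: masking low-confidence features out of an attention normalization, and the spatial size produced by each transposed convolution during upsampling. The masking rule, the starting size of 4, and the kernel, stride, and padding values are assumptions; the size formula is the standard one for a transposed convolution without dilation.

```python
import math

def masked_softmax(scores, confidence, threshold):
    """Softmax over scores, with entries whose confidence is below threshold set to 0."""
    keep = [c >= threshold for c in confidence]
    if not any(keep):
        return [0.0] * len(scores)
    m = max(s for s, k in zip(scores, keep) if k)
    exps = [math.exp(s - m) if k else 0.0 for s, k in zip(scores, keep)]
    total = sum(exps)
    return [e / total for e in exps]

def tconv_out(size, kernel, stride, padding=0, output_padding=0):
    """Output size of a transposed convolution along one dimension (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

weights = masked_softmax([1.0, 1.0, 1.0], [0.9, 0.1, 0.8], threshold=0.5)

sizes = [4]                      # hypothetical bottleneck resolution
for _ in range(3):               # three upsampling layers
    sizes.append(tconv_out(sizes[-1], kernel=4, stride=2, padding=1))
print(weights, sizes)
```

With kernel 4, stride 2, and padding 1, each layer doubles the resolution, so three layers restore a 4 × 4 feature map to 32 × 32.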