A cross-modal oil and gas platform identification method based on meta-learning and domain discriminator
By employing a cross-modal oil and gas platform identification method based on meta-learning and domain discriminators, combined with multi-scale feature aggregation and dynamic gradient inversion layers, the stability problem of oil and gas platform identification under single-modal remote sensing data is solved, thereby improving cross-modal identification and generalization capabilities.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- 海南省航天技术创新中心
- Filing Date
- 2026-03-27
- Publication Date
- 2026-06-19
AI Technical Summary
Existing oil and gas platform identification methods mostly rely on single-modal remote sensing data, making it difficult to achieve stable identification under all-day and all-weather conditions, and the generalization ability of cross-modal identification models is insufficient.
A cross-modal oil and gas platform identification method based on meta-learning and a domain discriminator is adopted. By constructing a multi-scale feature aggregation submodule and a domain discriminator with a dynamically weighted gradient reversal layer, combined with the YOLOv13 model, cross-modal identification and domain shift mitigation are achieved, thereby improving identification performance.
Cross-modal recognition can be achieved without precise registration, significantly improving the recognition performance of small-scale oil and gas platforms, reducing false negative and false positive rates, and enhancing the model's recognition stability and generalization ability in complex environments.
Figure CN122244538A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of oil and gas platform target recognition technology, specifically to a cross-modal oil and gas platform recognition method based on meta-learning and domain discriminator, which is used to extract general feature representations of oil and gas platform targets in low-light, infrared and SAR images, and enhance the response of oil and gas platform target features in the channel dimension, so as to achieve accurate recognition of cross-modal oil and gas platforms.
Background Technology
[0002] With the increasing demands for marine resource development and maritime safety supervision, oil and gas platforms, as important targets at sea, have significant engineering application value for automated identification and continuous monitoring. In recent years, remote sensing technology has developed rapidly, and using satellite-acquired remote sensing images for oil and gas platform identification research has become a crucial technical means. Most existing oil and gas platform identification methods are based on remote sensing data from single satellite payloads, such as using only low-light, infrared, or SAR images to identify oil and gas platforms. However, oil and gas platform identification models trained on single-modal images have poor generalization ability and often only perform well under specific imaging conditions, making it difficult to adapt to complex and changing observation environments. For example, low-light payload imaging can significantly reveal the lighting information of oil and gas platforms at night, but it is easily affected by lighting conditions, and its image quality significantly decreases under complex weather conditions such as clouds and fog; infrared payloads have day and night observation capabilities, but target texture information is limited and easily affected by temperature changes of the target and sea surface; SAR payloads have strong penetration capabilities and all-weather imaging capabilities, but the images contain a lot of noise. Therefore, in order to address the problem that single-modal target identification methods cannot meet the requirements of all-day and all-weather identification, it is urgent to construct a cross-modal oil and gas platform identification method, which can make full use of the complementary information of multi-modal remote sensing data such as low-light, infrared and SAR, to achieve reliable observation and stable identification of oil and gas platform targets in complex scenarios.
[0003] In recent years, some scholars have attempted to research cross-modal oil and gas platform identification methods by fusing multimodal remote sensing data. These methods can mostly be categorized into feature information fusion and decision-level fusion. Specifically, feature information fusion methods construct multiple parallel feature extraction branches to extract unique information representations corresponding to different remote sensing imaging modalities, and then combine these feature information according to pre-defined fusion rules to generate a fused feature representation containing complementary information from each modality. Decision-level fusion methods, on the other hand, construct oil and gas platform target identification models for each modality based on different remote sensing images, and then comprehensively judge the identification results output by each modality model. Feature information fusion and decision-level fusion methods, to a certain extent, combine complementary feature information from different modalities, obtaining more comprehensive feature information of the monitoring scene, thereby improving the identification performance of oil and gas platform identification models in complex environments.
Summary of the Invention
[0004] To address the aforementioned issues, this invention presents a cross-modal oil and gas platform identification method based on meta-learning and a domain discriminator, which aims to alleviate the domain shift problem between the source and target domains caused by differences in imaging modes and imaging conditions. The technical solution adopted in this invention is a cross-modal oil and gas platform identification method based on meta-learning and a domain discriminator, which includes the following steps:
S1: Collect remote sensing images of different modalities, preprocess them, and construct a dataset. Randomly select the remote sensing images of one modality as the target domain and use the remaining remote sensing images as the source domain. Divide the source domain into a meta-training set and a meta-test set for subsequent model training, validation, and evaluation, and use the target domain for later inference.
S2: Based on the YOLOv13 target detection model, construct an oil and gas platform identification model that includes an initial source module and a domain discriminator. The initial source module includes a feature extraction submodule, a multi-scale feature aggregation submodule, a feature fusion submodule, and a detection head. The domain discriminator includes a dynamically weighted gradient reversal layer and a domain classifier. The domain classifier includes a convolutional layer, a pooling layer, and a fully connected layer.
S21: Input the data from the meta-training set into the initial source module. The feature extraction submodule extracts image features, which are input into the multi-scale feature aggregation submodule to obtain enhanced features. The enhanced features are then input into the feature fusion submodule to obtain multi-scale features.
S22: Input the multi-scale features into the detection head, perform feature decoding, output the target recognition result, and calculate the target recognition loss. Simultaneously input the multi-scale features into the domain discriminator for domain classification and calculate the domain classification loss. The target recognition loss and the domain classification loss are weighted and fused to obtain the total loss of the meta-training set.
S23: Calculate the gradient of the total loss of the meta-training set with respect to the parameters of the initial source module and update the parameters to obtain the inner update module.
S24: Input the meta-test set into the inner update module. The feature extraction submodule extracts inner layer image features, which are input into the multi-scale feature aggregation submodule to obtain inner layer enhanced features. The inner layer enhanced features are then input into the feature fusion submodule to obtain inner layer multi-scale features. Finally, the inner layer multi-scale features are input into the detection head for feature decoding to obtain the recognition result and the total loss of the meta-test set.
S3: Add the total loss of the meta-training set and the total loss of the meta-test set to obtain the total loss of the current iteration. Calculate the gradient of this total loss with respect to the original parameters of the initial source module and optimize the parameters to obtain the optimized oil and gas platform identification model.
S4: Input remote sensing images of different modalities into the optimized oil and gas platform identification model to obtain the identification results.
Preferably, in step S21, the image features are input into the multi-scale feature aggregation submodule to obtain enhanced features as follows: The image features are input into a 1×1 convolution that reduces the feature channel dimension, yielding dimensionality-reduced image features. These are input into 3×3 dilated convolutions to obtain local image features, and global average pooling is used to extract global features. The global features are upsampled and concatenated with the local features to obtain global semantic information. The channel response weight calculation module then calculates channel response weights from the global semantic information. The global semantic information and the channel response weights are multiplied channel by channel to obtain weighted and refined feature information. Finally, the weighted and refined feature information is input into a 1×1 convolutional layer to obtain the enhanced features.
[0005] Preferably, in step S22, the multi-scale features are input into the domain discriminator for domain classification, and the domain classification loss is calculated as follows: The multi-scale features are input into the dynamically weighted gradient reversal layer, which acts as an identity mapping, to obtain mapped multi-scale features. These mapped multi-scale features are input into a convolutional layer for feature enhancement to obtain enhanced multi-scale features. The enhanced multi-scale features are input into a pooling layer for spatial dimensionality reduction and aggregation, and then into a fully connected layer for dimensionality compression to obtain a two-dimensional vector. The domain classification loss is calculated by applying an activation function to this two-dimensional vector.
[0006] What are the technical advantages of this application? Cross-modal recognition capability without strict registration: Existing methods mostly rely on pixel-level or geographic information registration of multimodal images. This invention achieves cross-domain recognition between different modalities without the need for precise registration through domain-invariant feature learning and meta-learning domain generalization mechanism, reducing the difficulty of engineering implementation and improving the applicability of the system.
[0007] Explicitly modeling domain shift and improving cross-domain generalization ability: Traditional multimodal remote sensing image target recognition methods assume that the data are independent and identically distributed. This invention takes the difference between the source domain and the target domain as the optimization objective. Through the meta-training-meta-testing mechanism and the joint constraint of the domain discriminator, it effectively alleviates the model performance degradation caused by domain shift, thereby improving the recognition stability of the model under unknown modalities and unknown imaging conditions.
[0008] The performance of small-scale oil and gas platform identification is significantly improved: the multi-scale feature aggregation structure MFA-s is used to fuse features of different spatial scales with global semantic information, and the feature response of oil and gas platform targets in the channel dimension is enhanced to improve the saliency expression of small-scale oil and gas platforms in complex sea backgrounds, reduce the interference of sea clutter and irrelevant targets, and thus reduce the false negative and false positive rates.
[0009] An end-to-end system with dual innovations in training strategy and structure: Within the YOLOv13 framework, this invention simultaneously improves both the network structure and the training strategy. Specifically, it combines a meta-learning method based on dual gradient descent with a domain discriminator to apply adversarial constraints to feature representations, thereby filtering domain-related information from features and forcing the model to learn domain-independent feature representations required for cross-domain generalization. Furthermore, this invention designs a multi-scale feature fusion structure to enhance the feature response and fine-grained representation capabilities of oil and gas platform targets. These training strategies and structures form an end-to-end cross-modal oil and gas platform identification method, effectively improving the model's identification performance and cross-modal generalization ability.
[0010] This invention proposes a cross-modal oil and gas platform identification method that combines meta-learning with a domain discriminator to alleviate the domain shift problem caused by differences in imaging modalities and imaging conditions between the source and target domains. The method first constructs a meta-optimization process with two gradient updates during the training phase and designs a gradient reversal layer to guide the model in learning universal discriminative feature representations of oil and gas platform targets under different modalities. Then, a domain discriminator is introduced after the feature fusion submodule of the target identification model to avoid overfitting the model to the source domain data during the training phase. Furthermore, since oil and gas platform targets in remote sensing images are typically small-scale and susceptible to sea surface clutter interference, this invention further designs a multi-scale feature aggregation submodule to refine the discriminative features of oil and gas platform targets, thereby improving the model's identification performance for small-scale oil and gas platform targets.
Attached Figure Description
[0011] Figure 1 is a flowchart of the cross-modal oil and gas platform identification method based on meta-learning and a domain discriminator according to the present invention.
[0012] Figure 2 is a detailed structural diagram of the YOLOv13 basic components used in the method of the present invention.
[0013] Figure 3 is a flowchart of the training phase of the method of the present invention.
[0014] Figure 4 is a detailed structural diagram of the multi-scale feature aggregation submodule of the method of the present invention.
[0015] Figure 5 is a detailed structural diagram of the Channel Response Weight Calculation Module (CRWM) of the method of the present invention.
[0016] Figure 6 is a detailed structural diagram of the domain discriminator of the method of the present invention.
[0017] Figure 7 is a diagram of the meta-learning method based on dual gradient descent used in the method of the present invention.
[0018] Figure 8 compares the identification results of the method of the present invention with those of the benchmark method YOLOv13.
[0019] Figure 9 is a detailed network structure diagram of the method of the present invention.
Detailed Implementation
[0020] The implementation of the present invention is described in detail below with reference to the accompanying drawings. These descriptions do not limit the present invention and are merely examples; they are intended to make the advantages of the present invention clearer. All modifications that those skilled in the art can directly derive or conceive from the content disclosed in the present invention should be considered to be within the scope of protection of the present invention. Parts not described in detail in the embodiments are prior art. As shown in Figure 1, the present invention proposes a cross-modal oil and gas platform identification method based on meta-learning and a domain discriminator, which includes the following steps:
S1: Collect remote sensing images of different modalities, preprocess them, and construct a dataset. Randomly select the remote sensing images of one modality as the target domain and use the remaining remote sensing images as the source domain. Divide the source domain into a meta-training set and a meta-test set for subsequent model training, validation, and evaluation, and use the target domain for later inference.
Here, it is assumed that a dataset for model training already exists. Before model training begins, the data are divided into domains by modality, and a domain label is set for each domain so that the subsequent domain classifier can classify the source of the data. In each iteration of the training phase, this invention first selects all but one of the domains as the source domain and the remaining domain as the target domain. For example, in the application scenario of this invention, the dataset for oil and gas platform identification covers three modalities: infrared, low-light, and SAR. These three modalities are divided into three domains, and domain labels are set according to modality. Then, the data in the source domain are randomly divided into a meta-training set and a meta-test set.
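The domain split in S1 can be illustrated with a minimal sketch. The function name `split_domains`, the `meta_test_fraction` value, the fixed seed, and the choice to number only the source domains are all assumptions made for illustration; the text specifies only that one modality is held out as the target domain and that the source data are randomly divided into a meta-training set and a meta-test set.

```python
import random

def split_domains(samples_by_modality, meta_test_fraction=0.3, seed=0):
    """Hold out one modality as the target domain and split the rest.

    samples_by_modality maps a modality name to a list of samples.
    Returns (target_modality, meta_train, meta_test), where the two sets
    are lists of (sample, domain_label) pairs drawn from the source domains.
    """
    rng = random.Random(seed)
    modalities = sorted(samples_by_modality)
    target = rng.choice(modalities)  # randomly chosen target domain
    source_modalities = [m for m in modalities if m != target]
    # Assumption: domain labels 0..n-1 are assigned to the source domains only.
    domain_labels = {m: i for i, m in enumerate(source_modalities)}
    source = [(s, domain_labels[m])
              for m in source_modalities
              for s in samples_by_modality[m]]
    rng.shuffle(source)  # random division of the source data
    n_test = max(1, int(len(source) * meta_test_fraction))
    return target, source[n_test:], source[:n_test]

data = {
    "infrared": ["ir_0", "ir_1", "ir_2"],
    "low_light": ["ll_0", "ll_1", "ll_2"],
    "sar": ["sar_0", "sar_1", "sar_2"],
}
target, meta_train, meta_test = split_domains(data)
```

With three modalities, the two remaining source domains receive labels 0 and 1, which matches a domain classifier that outputs a two-dimensional vector.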
[0021] S2: Based on the YOLOv13 object detection model, as shown in Figure 2, construct an oil and gas platform identification model that includes an initial source module and a domain discriminator. The initial source module includes a feature extraction submodule, a multi-scale feature aggregation submodule, a feature fusion submodule, and a detection head. The domain discriminator includes a dynamically weighted gradient reversal layer and a domain classifier. The domain classifier includes a convolutional layer, a pooling layer, and a fully connected layer.
S21: Input the data from the meta-training set into the initial source module. The feature extraction submodule extracts image features, which are input into the multi-scale feature aggregation submodule to obtain enhanced features. The enhanced features are then input into the feature fusion submodule to obtain multi-scale features.
In this process, the data in the meta-training set first pass through a Conv convolutional layer that extracts basic edge and texture features, and a second convolutional layer further extracts local features. The result is fed into a depthwise separable convolutional module for downsampling and feature extraction, balancing efficiency and accuracy. The downsampled features are then refined, and a second depthwise separable convolutional module downsamples them further to improve feature abstraction. Lightweight feature extraction is then performed using depthwise separable convolution, followed by an attention enhancement module that strengthens key features. Lightweight feature extraction with depthwise separable convolution continues, and a second attention enhancement module further improves the expressive power of the deep features. Finally, the image features are output.
[0022] S22: Input the multi-scale features into the detection head, perform feature decoding, output the target recognition result, and calculate the target recognition loss. Simultaneously input the multi-scale features into the domain discriminator for domain classification and calculate the domain classification loss. The target recognition loss and the domain classification loss are weighted and fused to obtain the total loss of the meta-training set.
S23: Calculate the gradient of the total loss of the meta-training set with respect to the parameters of the initial source module and update the parameters to obtain the inner update module.
In each iteration of the model, as shown in Figure 3, this stage first inputs the meta-training set samples obtained in step S1 into the initial source model, whose feature extraction and feature fusion submodules complete image feature extraction and multi-scale feature construction. The constructed multi-scale features are then input into the detection head of the target recognition model for feature decoding, which outputs the recognition result, and the corresponding target recognition loss is calculated. Simultaneously, the same features are input into the domain discriminator to obtain domain labels, and the domain classification loss is calculated. The target recognition loss and the domain classification loss are then weighted and fused to obtain the total loss for the meta-training stage. Finally, the gradient of this loss is calculated with respect to the parameters of the initial source model, and the first gradient descent, i.e., inner gradient descent, is executed to obtain the updated inner-layer model parameters. This yields the inner update model.
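The weighted fusion in S22 can be sketched as follows. The text names neither the activation function applied to the two-dimensional vector nor the fusion weights, so this sketch assumes a softmax cross-entropy for the domain classification loss and a single hypothetical weight `domain_weight`; the recognition loss is passed in as a plain number.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def meta_train_total_loss(recognition_loss, domain_logits, domain_label,
                          domain_weight=0.1):
    """Weighted sum of the recognition loss and the domain classification loss.

    domain_weight is a hypothetical hyperparameter, not taken from the text.
    Returns (total_loss, domain_loss).
    """
    domain_loss = softmax_cross_entropy(domain_logits, domain_label)
    return recognition_loss + domain_weight * domain_loss, domain_loss

# An undecided domain classifier (equal logits) gives a domain loss of ln 2.
total, domain_loss = meta_train_total_loss(1.0, [0.0, 0.0], 1, domain_weight=0.5)
```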
[0023] The surroundings of oil and gas platform targets are mostly sea surface areas with strong scattering characteristics, so the targets are easily submerged by sea surface clutter, ripples, and bright spot noise in the feature space. Furthermore, after multi-layer convolution processing in the target recognition model, the feature information of small-scale oil and gas platforms is easily lost. Therefore, a multi-scale feature aggregation submodule (MFA-s) is designed between the feature extraction submodule and the feature fusion submodule. This structure adaptively enhances the discriminative features of oil and gas platform targets by using parallel dilated convolution branches with different dilation rates and combining them with global context information and a channel response weight modeling mechanism. The detailed structure of MFA-s is shown in Figure 4. The module first sets 3×3 dilated convolution branches with different dilation rates (1, 2, 3, and 5) to construct receptive fields covering different spatial scales without reducing feature resolution. This allows the model to effectively extract local features of oil and gas platforms and contextual information about the surrounding sea surface. Simultaneously, a global average pooling branch models the global information of the entire feature map and associates it with the local features through an upsampling operation, thus providing stable global semantic constraints for small-scale oil and gas platform targets. To reduce the number of parameters in the module, a 1×1 convolution is placed before the dilated convolutions to reduce the feature channel dimension. The features at each scale are then concatenated along the channel dimension, and the Channel Response Weight Calculation Module (CRWM) calculates the weights of the channel feature information. The detailed structure of this module is shown in Figure 5. It uses global average pooling, fully connected layers, and the sigmoid activation function to calculate the response weights of oil and gas platform targets on different channels, assigning larger weights to channels with larger feature responses. This helps the model focus on feature channels that are more discriminative for identifying oil and gas platforms and suppress redundant noise features. Finally, the concatenated features are multiplied by the channel response weights to obtain the feature information refined by the channel weights, and a 1×1 convolution adjusts the channel dimension before output.
Relying solely on the inner and outer gradient descent process is insufficient to guarantee that the feature maps learned by the feature extraction and feature fusion submodules are both discriminative and domain-independent. Furthermore, the lack of global supervision during training can easily lead to overfitting of the model to the source domain, weakening its cross-domain generalization ability. Therefore, inspired by domain adaptation methods, this invention places a domain discriminator after the feature fusion submodule of the model, as shown in Figure 6, to further guide the model, on top of the meta-learning strategy, toward learning domain-invariant features. The domain discriminator consists of a Dynamic Weighted Gradient Reversal Layer (DWGRL) and a domain classifier. The domain classifier is composed of convolutional layers, pooling layers, and fully connected layers, and is used to discriminate the data domain to which the features output by the feature fusion submodule belong.
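Two parts of the MFA-s description can be made concrete with a small sketch: the spatial extent covered by each dilated 3×3 branch, and the channel reweighting performed by the CRWM. The function names, the two-layer fully connected arrangement, the ReLU between the layers, and the weight matrices `fc1` and `fc2` are assumptions in the style of squeeze-and-excitation blocks; the text states only that global average pooling, fully connected layers, and a sigmoid are used.

```python
import math

def effective_kernel(kernel=3, dilation=1):
    """Spatial extent of a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)

def channel_response_weights(feature_map, fc1, fc2):
    """feature_map is a list of C channels, each an H x W nested list.

    fc1 and fc2 are hypothetical weight matrices given as lists of rows.
    """
    # Global average pooling: one scalar per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    # First fully connected layer followed by ReLU (the ReLU is an assumption).
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in fc1]
    # Second fully connected layer followed by sigmoid: weights in (0, 1).
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in fc2]

def reweight(feature_map, weights):
    """Multiply every value of a channel by that channel's response weight."""
    return [[[w * v for v in row] for row in ch]
            for ch, w in zip(feature_map, weights)]

# A 3x3 kernel with stride 1 and padding equal to the dilation rate
# keeps the feature map resolution unchanged.
extents = [effective_kernel(3, d) for d in (1, 2, 3, 5)]

features = [[[1.0, 1.0], [1.0, 1.0]],   # weakly responding channel
            [[3.0, 3.0], [3.0, 3.0]]]   # strongly responding channel
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = channel_response_weights(features, identity, identity)
refined = reweight(features, weights)
```

With identity matrices standing in for learned weights, the channel with the larger average response receives the larger weight, which is the behaviour the paragraph describes.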
[0024] The DWGRL described above improves on the traditional GRL by adding a dynamic weighting coefficient. During forward propagation, the DWGRL acts as an identity mapping, like the traditional GRL, and does not alter the input features. This process is represented as follows:
$$\mathrm{DWGRL}(F) = F$$
[0025] where $F$ represents the feature information output by the feature fusion submodule and $\mathrm{DWGRL}(F)$ represents the forward output features of the DWGRL. During backpropagation, the DWGRL reverses the sign of the gradient of the domain classification loss before propagating it backward, so that the feature extraction and feature fusion submodules minimize the target recognition loss while maximizing the domain classification loss. This drives the features toward a state in which the domain classifier cannot distinguish their source. The process is as follows:
$$\nabla_{F} L_{d} \;\leftarrow\; -\lambda \, \nabla_{F} L_{d}$$
[0026] where $\nabla_{F} L_{d}$ represents the gradient of the domain classification loss $L_{d}$ and $\lambda$ represents the dynamic weighting coefficient.
[0027] Traditional GRL layers set $\lambda$ to a constant of 1. However, in the initial stage of training, the domain classifier cannot provide accurate classification, which misleads the parameter updates of the model's feature extraction and feature fusion submodules. Therefore, unlike traditional GRL layers, the DWGRL proposed in this invention sets $\lambda$ to a small value in the initial training stage so that the feature extraction and feature fusion submodules can first learn discriminative features of oil and gas platform targets. As training progresses, $\lambda$ gradually increases, forcing the feature extraction and feature fusion submodules to acquire the ability to extract domain-independent features. $\lambda$ is scheduled as an increasing function of the training progress.
[0028] Here, the training progress is the ratio of the current iteration number to the total number of iterations, and the hyperparameter that controls the growth rate of $\lambda$ is set to 8 in this invention.
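The behaviour of the DWGRL can be illustrated with a toy sketch. The functional form of the schedule is an assumption: the code uses the sigmoid-shaped ramp 2 / (1 + exp(-g * p)) - 1 that is common in domain-adversarial training, where p is the ratio of the current iteration to the total number of iterations and g is the growth-rate hyperparameter (8, as stated above). The names `dwgrl_lambda` and `DWGRL`, and the use of flat Python lists in place of tensors, are likewise illustrative only.

```python
import math

def dwgrl_lambda(progress, growth_rate=8.0):
    """Assumed ramp: 0 at progress 0, approaching 1 as progress reaches 1."""
    return 2.0 / (1.0 + math.exp(-growth_rate * progress)) - 1.0

class DWGRL:
    """Toy gradient reversal layer operating on flat lists of floats."""

    def __init__(self, total_iters, growth_rate=8.0):
        self.total_iters = total_iters
        self.growth_rate = growth_rate
        self.lam = 0.0

    def forward(self, features, current_iter):
        # Identity mapping; only the coefficient used in backward is updated.
        progress = current_iter / self.total_iters
        self.lam = dwgrl_lambda(progress, self.growth_rate)
        return list(features)

    def backward(self, grad_from_domain_classifier):
        # Reverse the sign of the gradient and scale it by the coefficient.
        return [-self.lam * g for g in grad_from_domain_classifier]

layer = DWGRL(total_iters=100)
out = layer.forward([1.0, -2.0], current_iter=50)
reversed_grad = layer.backward([0.5, -1.0])
```

Early in training the reversed gradient is close to zero, so the feature extractor is driven almost entirely by the recognition loss; later the adversarial signal approaches full strength.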
[0029] By employing an adversarial learning mechanism between the feature extraction and feature fusion submodules and the domain discriminator, the model can effectively suppress domain-related information in the features while extracting discriminative features of oil and gas platforms, thereby enhancing the domain invariance of the features. Consequently, the oil and gas platform features extracted by the model can be used for target category identification and location positioning without relying on a specific data domain distribution, thus improving the model's cross-modal recognition performance.
[0030] S24: Input the meta-test set into the inner update module. The feature extraction submodule extracts inner layer image features, which are input into the multi-scale feature aggregation submodule to obtain inner layer enhanced features. The inner layer enhanced features are then input into the feature fusion submodule to obtain inner layer multi-scale features. Finally, the inner layer multi-scale features are input into the detection head for feature decoding to obtain the recognition result and the total loss of the meta-test set.
In each iteration of the model, the meta-test set samples from S1 are first input into the inner update model, whose feature extraction and feature fusion submodules perform feature extraction and multi-scale feature construction. The model's detection head then performs target recognition on the multi-scale features. Finally, the corresponding target recognition loss is calculated from the recognition results and used as the total loss of the meta-test set.
S3: Add the total loss of the meta-training set and the total loss of the meta-test set to obtain the total loss of the current iteration. Calculate the gradient of this total loss with respect to the original parameters of the initial source module and optimize the parameters to obtain the optimized oil and gas platform identification model.
[0031] The backpropagation of the total loss is shown in Figure 7. After completing the meta-training and meta-testing phases, this invention adds the total loss of the meta-training set to the total loss of the meta-test set to obtain the total loss of the current iteration. The gradient of this total loss is then calculated with respect to the original parameters of the initial source model, and a second gradient descent, namely outer gradient descent, is performed to obtain the updated model parameters. This yields the optimized oil and gas platform identification model.
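The two gradient descents can be traced on a toy problem with a single scalar parameter. Everything here is illustrative: the quadratic losses stand in for the meta-training and meta-test losses, the learning rates are hypothetical, and the sketch assumes that the meta-test gradient is propagated back through the inner update to the original parameter, which the text suggests but does not spell out.

```python
def meta_step(theta, a, b, inner_lr=0.1, outer_lr=0.1):
    """One iteration with toy losses L_train(t) = (t - a)^2, L_test(t) = (t - b)^2."""
    grad_train = 2.0 * (theta - a)
    # Inner gradient descent: parameters of the inner update model.
    theta_inner = theta - inner_lr * grad_train
    # Derivative of the inner parameters with respect to the original ones.
    d_inner = 1.0 - 2.0 * inner_lr
    # Meta-test gradient, mapped back to the original parameter by the chain rule.
    grad_test = 2.0 * (theta_inner - b) * d_inner
    # Outer gradient descent on the summed loss, applied to the original parameter.
    return theta - outer_lr * (grad_train + grad_test)

theta = 5.0
for _ in range(200):
    theta = meta_step(theta, a=0.0, b=1.0)
```

The parameter settles between the two optima (here at 0.8 / 1.64, about 0.488) rather than at the meta-training optimum alone, which is the sense in which the outer update favours parameters that still perform well on held-out data after an inner step.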
[0032] S4: Input remote sensing images of different modalities into the optimized oil and gas platform identification model to obtain the identification results.
Figure 8 compares the oil and gas platform identification results of the benchmark method YOLOv13 and the method proposed in this invention on infrared, low-light, and SAR remote sensing payload data. Under different imaging modalities, the recognition confidence of the method of this invention is generally higher than that of YOLOv13. This indicates that the training strategy proposed in this invention, which combines meta-learning with a domain discriminator, can effectively alleviate the domain shift problem between cross-modal data, thereby improving the recognition performance and cross-modal generalization ability for oil and gas platforms. Furthermore, Figure 8 shows that even in complex scenarios with varying target sizes or strong sea clutter backgrounds, the method of this invention still identifies oil and gas platform targets stably, with fewer missed detections. This indicates that the proposed multi-scale feature aggregation submodule can effectively refine and enhance the features of oil and gas platform targets, thereby improving the model's multi-scale feature representation capability.
[0033] The detailed network structure of the method proposed in this invention is shown in Figure 9. "Conv" denotes a convolutional block used to extract features from the input. "Concat" concatenates multiple input features along the channel dimension, and "Upsample" performs an upsampling operation on the feature map. "DSConv" denotes depthwise separable convolution, and "DS-C3k2" is composed of DSConv modules. "A2C2f" denotes a feature attention module built from the ABlock attention mechanism.
[0034] Key points and areas to be protected in this application. The key points of this application mainly include the following aspects: (1) Meta-learning-based dual gradient update training framework. For the cross-modal oil and gas platform identification task, this invention constructs a meta-learning training mechanism that includes an inner gradient update and an outer gradient update. Through joint optimization over the meta-training set and the meta-test set, the model explicitly learns domain-invariant feature representations across different domains (i.e., different modalities) during training, thereby alleviating the domain shift between the source domain and the target domain.
[0035] (2) Domain discriminator structure based on a dynamically weighted gradient reversal layer. This invention proposes a domain discriminator composed of a Dynamically Weighted Gradient Reversal Layer (DWGRL) and a domain classifier. Through adversarial learning between the domain discriminator and the model's feature extraction and feature fusion submodules, the domain-invariant feature representations learned during meta-learning are further optimized to avoid overfitting to the source domain data. The innovation of DWGRL lies in the introduction of a dynamic weighting factor, which lets the model focus on stable feature extraction in the early stages of training and gradually strengthen domain adversarial learning in the later stages, thereby improving the domain invariance and generalization ability of the features while preserving their discriminability.
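The behavior of the dynamic weighting factor can be illustrated with a minimal sketch. The text does not give the weighting formula, so the sigmoid-shaped schedule below (commonly used with gradient reversal layers in domain-adversarial training) is an assumption, as are the names `dwgrl_weight`, `DynamicGradientReversal`, `gamma`, and `max_weight`. The layer is an identity mapping in the forward pass and multiplies the incoming gradient by the negative of the current weight in the backward pass.

```python
import math


def dwgrl_weight(step, total_steps, gamma=10.0, max_weight=1.0):
    """Assumed schedule: close to 0 early in training, approaching max_weight late."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return max_weight * (2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0)


class DynamicGradientReversal:
    """Identity in the forward pass, sign-flipped and scaled gradient in the backward pass."""

    def __init__(self, total_steps, gamma=10.0):
        self.total_steps = total_steps
        self.gamma = gamma
        self.weight = 0.0

    def set_step(self, step):
        # Update the dynamic weight from the current training step.
        self.weight = dwgrl_weight(step, self.total_steps, self.gamma)

    def forward(self, features):
        # Features pass through unchanged to the domain classifier.
        return list(features)

    def backward(self, grad_output):
        # Gradients flowing back to the feature submodules are reversed and scaled.
        return [-self.weight * g for g in grad_output]
```

With this schedule, the domain loss has almost no influence on the feature submodules at the start of training, so they are driven mainly by the recognition loss. Later the reversed gradient grows and pushes the features toward domain invariance.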
[0036] (3) Multi-scale feature aggregation submodule MFA-s. This invention introduces a multi-scale feature aggregation submodule between the feature extraction submodule and the feature fusion submodule. Through dilated convolutions with different dilation rates, global context modeling, and channel response weight calculation, it adaptively enhances the discriminative features of small-scale oil and gas platform targets, thereby improving the identifiability of small-scale targets against complex sea surface backgrounds.
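The data flow of MFA-s, following the order of operations given in claim 2 (1×1 channel reduction, 3×3 dilated convolution, global average pooling, upsampling and concatenation, channel response weighting, 1×1 output convolution), can be sketched in NumPy as follows. This is a hedged illustration, not the patented implementation: the dilation rates (1, 2, 3), the channel sizes, the sigmoid-based channel weighting, the choice to pool the reduced features, and all function and parameter names are assumptions.

```python
import numpy as np


def conv1x1(x, weight):
    """1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def dilated_conv3x3(x, weight, dilation):
    """3x3 dilated convolution with zero padding, so H and W are preserved.

    x is (C_in, H, W); weight is (C_out, C_in, 3, 3).
    """
    _, height, width = x.shape
    padded = np.pad(x, ((0, 0), (dilation, dilation), (dilation, dilation)))
    out = np.zeros((weight.shape[0], height, width))
    for i in range(3):
        for j in range(3):
            rows = slice(i * dilation, i * dilation + height)
            cols = slice(j * dilation, j * dilation + width)
            out += np.einsum("oc,chw->ohw", weight[:, :, i, j], padded[:, rows, cols])
    return out


def init_params(c_in, c_red, n_branches, rng):
    """Random weights for the sketch (hypothetical layout, not from the source)."""
    c_ctx = c_red * (n_branches + 1)  # dilated branches plus the global branch
    return {
        "reduce": rng.standard_normal((c_red, c_in)) * 0.1,
        "dilated": [rng.standard_normal((c_red, c_red, 3, 3)) * 0.1
                    for _ in range(n_branches)],
        "channel_fc": rng.standard_normal((c_ctx, c_ctx)) * 0.1,
        "project": rng.standard_normal((c_in, c_ctx)) * 0.1,
    }


def mfa_s(x, params, dilations=(1, 2, 3)):
    """Assumed MFA-s forward pass for a single feature map of shape (C_in, H, W)."""
    reduced = conv1x1(x, params["reduce"])                   # channel reduction
    local = [dilated_conv3x3(reduced, w, d)                  # local features per rate
             for w, d in zip(params["dilated"], dilations)]
    global_vec = reduced.mean(axis=(1, 2), keepdims=True)    # global average pooling
    global_map = np.broadcast_to(global_vec, reduced.shape)  # upsample back to H x W
    context = np.concatenate(local + [global_map], axis=0)   # global semantic information
    pooled = context.mean(axis=(1, 2))
    weights = 1.0 / (1.0 + np.exp(-(params["channel_fc"] @ pooled)))  # channel response weights
    refined = context * weights[:, None, None]               # channel-wise reweighting
    return conv1x1(refined, params["project"])               # enhanced features


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))
enhanced = mfa_s(features, init_params(8, 4, 3, rng))
print(enhanced.shape)  # (8, 16, 16)
```

The output keeps the input's channel count and spatial size, so in this sketch the submodule can sit between the feature extraction and feature fusion submodules without changing their interfaces.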
[0037] (4) Integrated design of the cross-modal oil and gas platform identification process. Using YOLOv13 as the basic recognition framework, combined with a meta-learning framework based on dual gradient descent, a domain discriminator, and the multi-scale feature aggregation submodule MFA-s, an end-to-end cross-modal oil and gas platform recognition method is formed, thereby improving cross-modal recognition and generalization performance among the infrared, low-light, and SAR modalities.
[0038] The content to be protected in this application includes, but is not limited to, any combination of the above-mentioned key technical points and their specific implementation methods.
[0039] Although preferred embodiments of this application have been described, those skilled in the art, once they learn the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of this application. Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. Therefore, if these modifications and variations of this application fall within the scope of the claims of this application and their equivalents, this application also intends to include these modifications and variations.
Claims
1. A cross-modal oil and gas platform identification method based on meta-learning and domain discriminator, characterized in that it comprises:
S1: Collect remote sensing images of different modalities, preprocess them, and construct a dataset; randomly select the remote sensing images of one modality as the target domain and use the remaining remote sensing images as the source domain; divide the source domain into a meta-training set and a meta-test set for subsequent model training, validation, and evaluation, and use the target domain for later inference.
S2: Based on the YOLOv13 target detection model, construct an oil and gas platform identification model, which includes an initial source module and a domain discriminator. The initial source module includes a feature extraction submodule, a multi-scale feature aggregation submodule, a feature fusion submodule, and a detection head. The domain discriminator includes a dynamically weighted gradient reversal layer and a domain classifier. The domain classifier includes a convolutional layer, a pooling layer, and a fully connected layer.
S21: Input the data from the meta-training set into the initial source module. The feature extraction submodule extracts image features, which are input into the multi-scale feature aggregation submodule to obtain enhanced features. The enhanced features are then input into the feature fusion submodule to obtain multi-scale features.
S22: Input the multi-scale features into the detection head, perform feature decoding, output the target recognition result, and calculate the target recognition loss; simultaneously input the multi-scale features into the domain discriminator for domain classification and calculate the domain classification loss. The target recognition loss and the domain classification loss are weighted and fused to obtain the total loss of the meta-training set.
S23: Based on the parameters of the initial source module, calculate the gradient of the total loss of the meta-training set and update the parameters to obtain the inner update module.
S24: Input the meta-test set into the inner update module. The feature extraction submodule extracts features to obtain inner-layer image features. The inner-layer image features are input into the multi-scale feature aggregation submodule to obtain inner-layer enhanced features, which are then input into the feature fusion submodule to obtain inner-layer multi-scale features. Finally, the inner-layer multi-scale features are input into the detection head for feature decoding to obtain the recognition result and the total loss of the meta-test set.
S3: Add the total loss of the meta-training set and the total loss of the meta-test set to obtain the total loss of the current iteration; based on the original parameters of the initial source module, calculate the gradient of the total loss of the current iteration and optimize the parameters to obtain the optimized oil and gas platform identification model.
S4: Input remote sensing images of different modalities into the optimized oil and gas platform identification model to obtain accurate identification results.
2. The cross-modal oil and gas platform identification method based on meta-learning and domain discriminator according to claim 1, characterized in that, in step S21, inputting the image features into the multi-scale feature aggregation submodule to obtain enhanced features specifically comprises: The image features are input into a 1×1 convolution to reduce the number of feature channels, yielding dimensionality-reduced image features. The dimensionality-reduced image features are input into a 3×3 dilated convolution to obtain local image features, and global average pooling is used to extract global features. The global features are upsampled and concatenated with the local image features to obtain global semantic information. A channel response weight calculation module then computes channel response weights from the global semantic information. The global semantic information and the channel response weights are multiplied channel by channel to obtain weighted and refined feature information. Finally, the weighted and refined feature information is input into a 1×1 convolutional layer to obtain the enhanced features.
3. The cross-modal oil and gas platform identification method based on meta-learning and domain discriminator according to claim 1, characterized in that, in step S22, inputting the multi-scale features into the domain discriminator for domain classification and calculating the domain classification loss specifically comprises: The multi-scale features are input into the dynamically weighted gradient reversal layer for identity mapping to obtain mapped multi-scale features. The mapped multi-scale features are input into the convolutional layer for feature enhancement to obtain enhanced multi-scale features. The enhanced multi-scale features are input into the pooling layer for spatial dimensionality reduction and aggregation, and then into the fully connected layer for dimensionality compression to obtain a two-dimensional vector. An activation function is applied to this two-dimensional vector, and the domain classification loss is calculated from the result.