A method and system for fine-grained identification of marine debris based on UAV imagery
By adaptively adjusting the model structure across class incremental learning tasks, and by using a prior analyzer together with an attention mechanism to apply weighted modulation to features, the method addresses catastrophic forgetting and insufficient fine-grained discrimination in marine debris identification, achieving long-term stability and high accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HANGZHOU DIANZI UNIV
- Filing Date
- 2026-03-31
- Publication Date
- 2026-06-30
AI Technical Summary
Existing incremental target detection methods suffer from catastrophic forgetting and insufficient fine-grained discrimination in marine debris identification tasks. They struggle to maintain long-term stability in complex scenarios and do not dynamically model how similar, and how hard to distinguish, the different categories are.
The invention provides a fine-grained marine debris identification method based on UAV imagery. Class incremental learning tasks are constructed, a prior analyzer calculates a task similarity index, RoI fine-grained discrimination modulation is enabled or bypassed adaptively, and an attention mechanism applies weighted modulation to the features, so that the model structure is adjusted dynamically to alleviate forgetting.
Fine-grained discrimination capability is enhanced at stages where marine debris categories are harder to distinguish, and unnecessary structural disturbances are avoided elsewhere, which improves the long-term identification stability and accuracy for marine debris categories.
Figure CN121963002B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of computer vision and deep learning technology, specifically relating to a method and system for fine-grained identification of marine debris based on UAV imagery.
Background Technology
[0002] With the rapid development of UAV remote sensing technology, intelligent sensing, and deep learning technology, automatic detection and identification of marine debris based on UAV imagery has become an important technical means in marine environmental monitoring, pollution control, and emergency response. Compared with manual inspection methods, deep learning-based target detection models can significantly improve identification efficiency and coverage in large-scale, high-frequency marine monitoring tasks.
[0003] However, in real-world applications, the classification system for marine debris is not static. On the one hand, as monitoring time extends, new types of debris constantly emerge, such as new types of plastic products, composite material debris, and special debris generated by unforeseen events. On the other hand, the distribution of debris categories varies significantly across different regions and sea conditions. This necessitates that marine debris identification systems possess the ability to continuously learn new categories, rather than relying on fixed models trained offline only once.
[0004] To address the aforementioned issues, incremental object detection methods have been proposed. Their core objective is to enable the model to progressively learn new object categories based on existing categories without revisiting all historical training data. However, existing incremental object detection methods still face the following shortcomings in marine debris identification tasks:
[0005] First, in incremental class learning, models commonly suffer from catastrophic forgetting, meaning that the detection performance of already learned categories significantly decreases when learning new categories. Although existing methods mitigate this overall forgetting problem through knowledge distillation, sample replay, or regularization constraints, they primarily focus on maintaining category-level detection performance and are insufficient to adequately address complex, fine-grained scenarios. Second, marine debris identification tasks exhibit significant fine-grained recognition characteristics. Different debris categories are often highly similar in appearance, material texture, and local structure, such as different types of plastic buoys, foam products, or composite material debris. These fine-grained differences are typically distinguished by local texture, edge structure, or subtle morphological variations, placing higher demands on the model's feature representation capabilities. In incremental class learning, when learning new categories, models tend to prioritize adapting to new discrimination boundaries, gradually weakening their focus on fine-grained discriminative features between already learned categories. This leads to implicit forgetting of fine-grained discriminative knowledge between learned categories. This problem differs from traditional category-level forgetting; it often manifests as categories still being detectable, but misclassification between them significantly increases, severely impacting the long-term stability of the system. In addition, most existing methods use fixed thresholds or static strategies to determine whether to introduce additional modules, lacking dynamic modeling of class similarity and discrimination difficulty between different classes of incremental learning tasks. This makes it difficult to enhance fine-grained discrimination capabilities while ensuring model simplicity.
[0006] Therefore, there is an urgent need in this field for a marine debris identification method that can adaptively adjust the model structure within the framework of incremental target detection, combined with inter-task category similarity analysis, and effectively alleviate the problem of knowledge forgetting in fine-grained discrimination.
Summary of the Invention
[0007] Based on the aforementioned shortcomings and deficiencies in the prior art, one of the objectives of this invention is to at least solve one or more of the aforementioned problems in the prior art. In other words, one of the objectives of this invention is to provide a method and system for fine-grained identification of marine debris based on UAV imagery that meets one or more of the aforementioned requirements.
[0008] To achieve the above-mentioned objectives, the present invention adopts the following technical solution:
[0009] A method for fine-grained identification of marine debris based on UAV imagery includes the following steps:
[0010] S1. Construct a target detection dataset based on data collected by UAVs and divide it into N class incremental learning tasks; where the target detection dataset contains different categories of marine debris, and N is an integer greater than 1;
[0011] S2. Input the class incremental learning tasks into the target detection model sequentially for training to obtain the trained target detection model. During the training of the current (t+1)th class incremental learning task, samples are extracted from the data corresponding to the t historical class incremental learning tasks and the current (t+1)th class incremental learning task to form a prior analysis dataset. The prior analysis dataset is input into the prior analyzer for similarity analysis to obtain a task similarity index that characterizes the difficulty of distinguishing marine debris categories, where t∈[1, N-1]. Based on the task similarity index, it is determined whether to enable RoI fine-grained discrimination modulation during the training of the current (t+1)th class incremental learning task. If yes, the RoI features in the target detection model training process are weighted and modulated based on an attention mechanism before the remaining network structures continue their original processing. If no, the original network structure of the target detection model is maintained for training.
[0012] S3. Collect target detection data through the drone, input it into the trained target detection model, and output the target detection results.
[0013] As a preferred embodiment, step S1 specifically includes the following steps:
[0014] S11. Collect a target detection dataset containing multiple categories of marine debris using UAVs; each data sample includes at least the input image and the corresponding target bounding boxes and category labels.
[0015] S12. Divide the target detection dataset into multiple non-overlapping category subsets according to all marine debris categories;
[0016] S13. Each category subset corresponds to a class incremental learning task, and they are input into the object detection model for training in a predetermined order.
[0017] As a preferred embodiment, step S2, the process of inputting the prior analysis dataset into the prior analyzer for similarity analysis, includes the following steps:
[0018] S21. Input the prior analysis dataset into the target detection model that has completed the t-th class incremental learning task, and extract the feature vector corresponding to each target instance from the region-of-interest (RoI) feature layer of the model.
[0019] S22. Aggregate the feature vectors of target instances belonging to the same marine debris category to obtain the category feature prototype;
[0020] S23. Calculate the intra-task similarity between feature prototypes of each class in the current (t+1)th class incremental learning task.
[0021] Calculate the cross-task similarity between the prototype features of each class in the current (t+1)th class incremental learning task and the prototype features of each class in the historical (t)th class incremental learning tasks;
[0022] S24. Accumulate the intra-task similarity and cross-task similarity and normalize them to obtain the task similarity index.
[0023] As a preferred option, the task similarity index $S_{t+1}$ of the (t+1)th class incremental learning task is:
[0024] $S_{t+1} = \dfrac{S_{\text{intra}} + S_{\text{cross}}}{N_{\text{pair}}}$;
[0025] where $S_{\text{intra}}$ is the sum of the intra-task similarities, $S_{\text{cross}}$ is the sum of the cross-task similarities, and $N_{\text{pair}}$ is the total number of category feature prototype pairs involved in the similarity calculation.
[0026] As a preferred embodiment, step S2, determining whether RoI fine-grained discriminative modulation needs to be enabled during the training of the current (t+1)th class incremental learning task based on the task similarity index, includes the following steps:
[0027] S25. Determine if t is 1; if yes, the current class incremental learning task is the second class incremental learning task, and proceed to step S26; if no, proceed to step S27.
[0028] S26. Determine whether the task similarity index of the second class of incremental learning tasks is greater than the task similarity index of the first class of incremental learning tasks; if yes, enable RoI fine-grained discriminative modulation; if no, do not enable RoI fine-grained discriminative modulation.
[0029] S27. Calculate the distance between the task similarity index of the current class incremental learning task and the task similarity index of the historical t class incremental learning tasks, determine the target historical class incremental learning task that is closest to the current class incremental learning task, and determine whether the current class incremental learning task enables RoI fine-grained discrimination modulation during training based on the enabled status of RoI fine-grained discrimination modulation of the target historical class incremental learning task during training.
[0030] As a preferred embodiment, the distance between the task similarity index $S_{t+1}$ of the current class incremental learning task and the task similarity index $S_k$ of the k-th historical class incremental learning task is:
[0031] $d_k = \lvert S_{t+1} - S_k \rvert$;
[0032] where $k \in \{1, 2, \dots, t\}$.
[0033] As a preferred embodiment, the target historical class incremental learning task is the $k^{*}$-th class incremental learning task:
[0034] $k^{*} = \arg\min_{k \in \{1, \dots, t\}} d_k$.
[0035] As a preferred embodiment, the attention-based weighted modulation process includes:
[0036] First, global pooling is performed on the RoI features to obtain channel statistical vectors that reflect the strength of each channel response. Then, during the activation phase, attention branches are set for different types of incremental learning tasks. The attention branches of the corresponding tasks generate channel weight coefficients based on the channel statistical vectors and perform weighted modulation on the RoI features in a residual manner.
[0037] As a preferred option, the object detection model is the object detection framework Faster R-CNN.
[0038] This invention also provides a marine debris fine-grained identification system based on UAV imagery, applying the marine debris fine-grained identification method as described in any of the preceding solutions. The marine debris fine-grained identification system includes:
[0039] The construction module is used to construct a target detection dataset based on data collected by UAVs and divide it into N class incremental learning tasks; where the target detection dataset contains different categories of marine debris, and N is an integer greater than 1.
[0040] The training module is used to sequentially input incremental learning tasks into the object detection model for training, resulting in a trained object detection model. During the training of the (t+1)th incremental learning task, samples are extracted from the data corresponding to the historical (t)th incremental learning tasks and the current (t+1)th incremental learning task to form a prior analysis dataset. This prior analysis dataset is input into a prior analyzer for similarity analysis, yielding a task similarity index representing the difficulty of distinguishing marine debris categories; t∈[1, N-1]. Based on the task similarity index, it is determined whether to enable RoI fine-grained discrimination modulation during the training of the current (t+1)th incremental learning task. If yes, the RoI features in the object detection model training process are weighted and modulated based on an attention mechanism before continuing with the original processing of the remaining network structures. If no, the original network structure of the object detection model is maintained during training.
[0041] The detection module is used to collect target detection data through the drone, input it into the trained target detection model, and output the target detection results.
[0042] Compared with the prior art, the beneficial effects of this invention are:
[0043] This invention can enhance fine-grained discrimination ability in the incremental stage where the distinction between marine debris categories is more difficult, and avoid unnecessary structural disturbances in the stage where the distinction between marine debris categories is less difficult, thereby effectively alleviating the problem of fine-grained knowledge forgetting in the class incremental learning process and improving the long-term recognition stability of similar marine debris categories.
Attached Figure Description
[0044] Figure 1 is a schematic diagram illustrating the construction of the prior analysis dataset in Embodiment 1 of the present invention;
[0045] Figure 2 is a schematic diagram of the processing flow of the prior analyzer in Embodiment 1 of the present invention;
[0046] Figure 3 is a schematic diagram of the decision-making mechanism without a fixed threshold in Embodiment 1 of the present invention;
[0047] Figure 4 is a flowchart of the prior-decision-guided enable / bypass detection process for RoI fine-grained discrimination modulation in Embodiment 1 of the present invention.
Detailed Implementation
[0048] To more clearly illustrate the embodiments of the present invention, specific implementation methods will be described below with reference to the accompanying drawings. Obviously, the drawings described below are merely some embodiments of the present invention. For those skilled in the art, other drawings and other implementation methods can be obtained based on these drawings without any creative effort.
[0049] This invention is based on the object detection framework Faster R-CNN and includes a class incremental training data construction module, a prior similarity analysis module, a decision module without a fixed threshold, a RoI fine-grained discriminative modulation module, and an object detection output module. The prior similarity analysis module is used to characterize the difficulty of class discrimination between different class incremental learning tasks. The decision module without a fixed threshold is used to adaptively determine whether to enable or bypass the RoI fine-grained discriminative modulation module based on the difficulty, thereby achieving dynamic structure adjustment without introducing a fixed threshold.
[0050] First, marine debris images collected by UAVs are input into the target detection model, and multi-level feature representations are extracted through a backbone feature extraction network. Then, according to a predetermined category introduction order, the marine debris categories are divided into multiple consecutive class incremental learning tasks, and the model is trained stage by stage in task order. After the training of each incremental learning task is completed and before the training of the next one begins, a subset of samples is extracted from the data of the previously learned tasks and the data of the current task. Based on the region-of-interest (RoI) features of the target instances, a feature prototype is constructed for each category, and the similarity between category prototypes is calculated to obtain a task similarity index that characterizes the overall category discrimination difficulty of the current task. The task similarity index takes into account both the similarity among categories within the current task and the cross-task similarity between the current task and historical tasks, and can reflect the degree of potential fine-grained confusion in subsequent incremental training.
Based on this, the present invention employs a relative comparison decision mechanism without a fixed threshold, adaptively determining whether to enable the RoI fine-grained discriminative modulation module according to the task similarity index: when the decision result indicates that it should be enabled, the RoI fine-grained discriminative modulation module is enabled in the RoI feature processing stage of the target detection model, and the RoI features are weighted and modulated based on an attention mechanism to enhance the fine-grained discriminative feature response; when the decision result indicates that it should not be enabled, the RoI fine-grained discriminative modulation module is bypassed, keeping the rest of the target detection model unchanged to avoid unnecessary increases in model complexity and training perturbations. In addition, a task-level feature modulation module (i.e., weighted modulation based on an attention mechanism) can be set at the high-level feature output position of the backbone feature extraction network and kept enabled during training to enhance the stability of feature representation. Finally, after enabling or bypassing the RoI fine-grained discriminative modulation module, the current class incremental learning task continues to be trained, and the detection results containing target category and location information are output, thereby achieving continuous learning and stable identification of marine debris targets.
[0051] Example 1:
[0052] The fine-grained marine debris identification method based on UAV imagery in this embodiment specifically includes the following steps:
[0053] (1) Construct a training dataset for UAV marine debris target detection and divide the training data into N consecutive incremental learning tasks according to the predetermined category introduction order, so as to provide task sequence and data foundation for subsequent stage-by-stage continuous learning and training; where N is an integer greater than 1;
[0054] The Sea Trash dataset, a publicly available dataset for detecting floating marine debris, was used. Its training set consists of 6,194 images, while the test and validation sets contain 783 and 771 images, respectively. The images are 1200×800 pixels in size and cover a variety of marine debris types, including plastic buoys, foam buoys, plastic bottles, foam fragments, and metal debris.
[0055] First, a target detection dataset containing multiple target categories is acquired using drones, where each sample includes at least the input image and the corresponding target bounding box and category label;
[0056] Secondly, the dataset is divided, according to all target categories, into multiple non-overlapping category subsets.
[0057] Next, multiple consecutive class incremental learning tasks are constructed based on the category subset, so that each class incremental learning task only includes the target category newly added in the current stage, and does not include the target category in the subsequent stage;
[0058] Let the set of categories included in the t-th class incremental learning task be:
[0059] $C_t = \{c_1^{t}, c_2^{t}, \dots, c_{K_t}^{t}\}$;
[0060] where $K_t$ is the total number of categories included in the t-th class incremental learning task. For any $i \neq j$, the category sets satisfy:
[0061] $C_i \cap C_j = \varnothing$;
[0062] Finally, the dataset corresponding to each class incremental learning task is treated as an independent training stage and input into the object detection model in a predetermined order for training.
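The task construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category names, the `classes_per_task` parameter, and the annotation record layout (`image`, `bbox`, `label`) are hypothetical and are not specified by the embodiment, which only requires disjoint category subsets introduced in a predetermined order.

```python
from typing import List, Sequence


def split_into_tasks(categories: Sequence[str], classes_per_task: int) -> List[List[str]]:
    """Partition an ordered category list into disjoint, consecutive subsets.

    `classes_per_task` is a hypothetical parameter; the embodiment only
    requires that the subsets do not overlap and follow a fixed order.
    """
    if classes_per_task < 1:
        raise ValueError("classes_per_task must be positive")
    return [list(categories[i:i + classes_per_task])
            for i in range(0, len(categories), classes_per_task)]


def annotations_for_task(annotations: List[dict], task_classes: Sequence[str]) -> List[dict]:
    """Keep only the boxes whose label is newly introduced in this task."""
    allowed = set(task_classes)
    return [a for a in annotations if a["label"] in allowed]


# Hypothetical category introduction order.
categories = ["plastic_buoy", "foam_buoy", "plastic_bottle", "foam_fragment", "metal_debris"]
tasks = split_into_tasks(categories, classes_per_task=2)

# Hypothetical annotation records (image file, box, label).
annotations = [
    {"image": "img_001.jpg", "bbox": [10, 20, 50, 60], "label": "foam_buoy"},
    {"image": "img_001.jpg", "bbox": [70, 80, 120, 140], "label": "metal_debris"},
]
task1_annotations = annotations_for_task(annotations, tasks[0])
```

Each element of `tasks` then corresponds to one training stage, and boxes belonging to categories of later stages are left out of the current stage.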
[0063] (2) Input the incremental learning tasks into the target detection model sequentially for training;
[0064] After completing the training for the t-th class incremental learning task and before proceeding to the (t+1)-th class incremental learning task, a subset of samples is extracted from the historical task and current task data, and prior similarity analysis is performed to obtain prior analysis results representing the difficulty of class differentiation in the current task. The specific process is as follows:
[0065] As shown in Figure 1, the learned historical task dataset includes historical task data 1, historical task data 2, ..., historical task data t. The dataset for the current (t+1)th class incremental learning task is the current task data t+1. Future task data that has not yet been learned does not participate in the analysis at the current stage. The prior analysis dataset is formed by sampling from the historical task dataset and the current task data t+1. The sampling method can be random sampling or stratified sampling by category. The sampling ratio is determined by a preset parameter, preferably a fixed ratio of the total number of samples in each task dataset.
[0066] The aforementioned prior analysis dataset is used only for similarity analysis and does not participate in the model parameter update process during the training phase of the (t+1)th class incremental learning task; future task data will also not be included in the sampling and analysis in this phase.
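A minimal sketch of how such a prior analysis dataset could be assembled is given below, using stratified sampling by category at a fixed ratio. The `(image_id, label)` record layout, the 10% ratio, the seed, and the rule of keeping at least one sample per category are assumptions made for illustration; the embodiment leaves these to preset parameters.

```python
import random
from collections import defaultdict
from typing import List, Tuple

Sample = Tuple[str, str]  # (image_id, category label); hypothetical record layout


def stratified_sample(samples: List[Sample], ratio: float, rng: random.Random) -> List[Sample]:
    """Draw roughly `ratio` of the samples from every category (at least one each)."""
    by_class = defaultdict(list)
    for sample in samples:
        by_class[sample[1]].append(sample)
    picked: List[Sample] = []
    for label in sorted(by_class):
        group = by_class[label]
        k = max(1, round(len(group) * ratio))
        picked.extend(rng.sample(group, k))
    return picked


def build_prior_analysis_set(history: List[List[Sample]], current: List[Sample],
                             ratio: float = 0.1, seed: int = 0) -> List[Sample]:
    """Sample from every learned task and from the current task; future tasks are never touched."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must lie in (0, 1]")
    rng = random.Random(seed)
    prior: List[Sample] = []
    for task_samples in history + [current]:
        prior.extend(stratified_sample(task_samples, ratio, rng))
    return prior


# Hypothetical data: one historical task and the current task.
history = [[(f"h1_{i}", "plastic_buoy") for i in range(20)]]
current = ([(f"c_{i}", "foam_buoy") for i in range(10)]
           + [(f"c_{i}", "plastic_bottle") for i in range(10, 40)])
prior = build_prior_analysis_set(history, current, ratio=0.1, seed=0)
```

The returned list is intended for forward inference only; in line with the text above, it would not be fed into any parameter update.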
[0067] The aforementioned prior analysis dataset is input into the prior analyzer for similarity analysis, which then calls the already trained object detection model for forward inference. As shown in Figure 2, the feature vector corresponding to each target instance is extracted from the RoI features to form a set of target instance feature vectors. Let the feature vector of the i-th target instance of category c be denoted as $f_i^{c}$;
[0068] The feature vectors of target instances belonging to the same category are aggregated to obtain the feature prototype of that category; in this embodiment, mean aggregation is used to obtain the feature prototype of category c as follows:
[0069] $p_c = \dfrac{1}{N_c} \sum_{i=1}^{N_c} f_i^{c}$;
[0070] where $N_c$ represents the number of target instances of category c.
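The mean aggregation step can be illustrated with a short, dependency-free sketch. The `(label, vector)` input layout and the two-dimensional toy vectors are hypothetical; in practice the vectors would be RoI features taken from the detector.

```python
from typing import Dict, List, Sequence, Tuple


def class_prototypes(features: Sequence[Tuple[str, Sequence[float]]]) -> Dict[str, List[float]]:
    """Average the instance feature vectors of each category into one prototype."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for label, vec in features:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        if len(vec) != len(sums[label]):
            raise ValueError("all feature vectors must share one dimension")
        for d, value in enumerate(vec):
            sums[label][d] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


# Hypothetical instance features: (category label, feature vector).
instance_features = [
    ("foam_buoy", [1.0, 0.0]),
    ("foam_buoy", [3.0, 2.0]),
    ("plastic_bottle", [0.0, 4.0]),
]
prototypes = class_prototypes(instance_features)
```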
[0071] Then, calculate the internal similarity between feature prototypes of each category in the current task, and the cross-task similarity between feature prototypes of each category in the current task and feature prototypes of each category in historical tasks.
[0072] Specifically, cosine similarity is used to measure the similarity between any two class prototypes. The specific formula for calculating cosine similarity can be found in existing technologies and will not be elaborated here.
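For reference, the standard cosine similarity between two prototype vectors can be written as follows; the symbols $\mathbf{p}_a$ and $\mathbf{p}_b$ are notation chosen here for two category prototypes and are not taken from the original text.

```latex
\cos(\mathbf{p}_a, \mathbf{p}_b)
  = \frac{\mathbf{p}_a^{\top} \mathbf{p}_b}
         {\lVert \mathbf{p}_a \rVert_2 \, \lVert \mathbf{p}_b \rVert_2}
```

For example, with $\mathbf{p}_a = (1, 0)$ and $\mathbf{p}_b = (1, 1)$ the value is $1/\sqrt{2} \approx 0.707$, while identical directions give 1 and orthogonal prototypes give 0.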
[0073] Finally, the intra-task similarity and cross-task similarity are summed and normalized by the number of category feature prototype pairs involved in the similarity calculation to obtain the task similarity index for the current task:
[0074] $S_{t+1} = \dfrac{S_{\text{intra}} + S_{\text{cross}}}{N_{\text{pair}}}$;
[0075] where $S_{\text{intra}}$ is the sum of the intra-task similarities, $S_{\text{cross}}$ is the sum of the cross-task similarities, and $N_{\text{pair}}$ is the total number of category feature prototype pairs involved in the similarity calculation;
[0076] In this embodiment, the task similarity index is used to characterize the overall category differentiation difficulty of the current class incremental learning task.
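Putting the pieces together, the task similarity index could be computed roughly as in the sketch below: sum the cosine similarities of all prototype pairs inside the current task and of all current-versus-historical pairs, then divide by the number of pairs. The function names, the dictionary layout, and the handling of zero vectors and of an empty pair set are assumptions for illustration.

```python
import math
from itertools import combinations
from typing import Dict, Sequence

Vec = Sequence[float]


def cosine(a: Vec, b: Vec) -> float:
    """Cosine similarity; returns 0.0 for a zero vector (an assumption made here)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def task_similarity_index(current: Dict[str, Vec], history: Dict[str, Vec]) -> float:
    """(sum of intra-task + sum of cross-task similarities) / number of prototype pairs."""
    intra = [cosine(current[a], current[b]) for a, b in combinations(sorted(current), 2)]
    cross = [cosine(c, h) for c in current.values() for h in history.values()]
    n_pairs = len(intra) + len(cross)
    if n_pairs == 0:
        return 0.0
    return (sum(intra) + sum(cross)) / n_pairs


# Hypothetical prototypes: two current categories, one historical category.
current_protos = {"foam_buoy": [1.0, 0.0], "foam_fragment": [0.0, 1.0]}
history_protos = {"plastic_buoy": [1.0, 0.0]}
index = task_similarity_index(current_protos, history_protos)

# With no historical prototypes, only intra-task pairs contribute.
first_task_index = task_similarity_index(current_protos, {})
```

In the toy example there is one intra-task pair (similarity 0) and two cross-task pairs (similarities 1 and 0), so the index is 1/3. A higher value indicates prototypes that lie closer together and are therefore harder to tell apart.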
[0077] (3) Based on the prior analysis results, a relative comparison decision mechanism without a fixed threshold is adopted to determine whether RoI fine-grained discrimination modulation needs to be enabled before entering the training of the (t+1)th class incremental learning task;
[0078] After obtaining the similarity of the current task and the similarity of historical tasks, as shown in Figure 3, determining whether to enable RoI fine-grained discriminative modulation during the training of the (t+1)th class incremental learning task based on the task similarity index includes the following steps:
[0079] (a) Determine if t is 1; if yes, the current class incremental learning task is the second class incremental learning task, and proceed to step (b); if no, proceed to step (c).
[0080] (b) Determine whether the task similarity of the second class incremental learning task is greater than the task similarity of the first class incremental learning task. If yes, enable RoI fine-grained discrimination modulation; otherwise, do not enable RoI fine-grained discrimination modulation, i.e., bypass it.
[0081] Since there are no historical tasks during the training of the first class incremental learning task, the task similarity of the first class incremental learning task only considers the internal similarity between the feature prototypes of each class in the first class incremental learning task. That is, the internal similarity of the task is accumulated and normalized to obtain the task similarity of the first class incremental learning task.
[0082] (c) Calculate the distance between the task similarity index of the current class incremental learning task and the task similarity index of the historical t classes incremental learning tasks, determine the target historical class incremental learning task that is closest to the current class incremental learning task (i.e. the closest historical task); and determine whether the current class incremental learning task enables RoI fine-grained discrimination modulation during training based on the enabled status of RoI fine-grained discrimination modulation during training of the target historical class incremental learning task.
[0083] Specifically, the distance between the task similarity $S_{t+1}$ of the current class incremental learning task and the task similarity $S_k$ of the k-th historical class incremental learning task is:
[0084] $d_k = \lvert S_{t+1} - S_k \rvert$;
[0085] where $k \in \{1, 2, \dots, t\}$;
[0086] The above-mentioned target historical class incremental learning task is the $k^{*}$-th class incremental learning task:
[0087] $k^{*} = \arg\min_{k \in \{1, \dots, t\}} d_k$;
[0088] The enabled / bypass state of RoI fine-grained discriminative modulation corresponding to the $k^{*}$-th class incremental learning task is then read directly and used as the enabled / bypass state of the current task; the decision output is the switch control signal of RoI fine-grained discriminative modulation.
[0089] The above process does not rely on a preset fixed threshold, but rather achieves adaptive judgment based on the relative relationships of historical tasks.
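The threshold-free decision in steps (a) to (c) might be sketched as follows. Several details are assumptions rather than statements of the embodiment: the distance is taken as the absolute difference of the scalar indices, ties go to the earliest task, and the first task is recorded as "bypass" because it has no historical tasks to compare against.

```python
from typing import List


def decide_modulation(history_indices: List[float], history_enabled: List[bool],
                      current_index: float) -> bool:
    """Return True to enable RoI fine-grained modulation for the current task.

    history_indices[k] and history_enabled[k] describe historical task k+1.
    """
    if not history_indices or len(history_indices) != len(history_enabled):
        raise ValueError("need one index and one state per historical task")
    t = len(history_indices)
    if t == 1:
        # Second task: enable only if it looks harder than the first task.
        return current_index > history_indices[0]
    # Later tasks: copy the state of the historical task with the closest index.
    distances = [abs(current_index - s) for s in history_indices]
    nearest = min(range(t), key=distances.__getitem__)  # first minimum wins on ties
    return history_enabled[nearest]


# Hypothetical index values for illustration.
second_task = decide_modulation([0.30], [False], 0.45)
third_task_low = decide_modulation([0.30, 0.45], [False, True], 0.33)
third_task_high = decide_modulation([0.30, 0.45], [False, True], 0.50)
```

Here the second task is enabled because 0.45 exceeds 0.30; a third task with index 0.33 is closest to the first task and is bypassed, whereas one with index 0.50 is closest to the second task and inherits its enabled state.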
[0090] (4) Based on the decision results, enable or bypass the fine-grained discrimination modulation of RoI in the RoI feature processing stage of the target detection model, enhance the fine-grained discrimination capability when needed, and keep the rest of the model structure unchanged when not needed.
[0091] As shown in Figure 4, the dataset corresponding to the (t+1)th class incremental learning task is input into the object detection model Faster R-CNN. First, multi-level feature representations are extracted through the backbone feature extraction network. Then, candidate regions are generated and RoI features are extracted. In the RoI feature processing stage of the object detection model, a RoI fine-grained discriminative modulation enable / bypass switch is set. The control signal of the switch is the switch control signal output by the above decision. When the decision result indicates "enable", the switch is set to the enable path: the RoI features are weighted and modulated based on the attention mechanism and then input into the detection head of the original network to enhance the fine-grained discriminative feature response. When the decision result indicates "bypass", the switch is set to the bypass path, so that the RoI features are input directly into the detection head, thereby keeping the rest of the model structure unchanged and avoiding an unnecessary increase in complexity.
[0092] The attention-based weighted modulation process in this embodiment includes: first, performing global pooling on the RoI features to obtain channel statistical vectors reflecting the response strength of each channel; then, in the excitation phase, setting separate attention branches for different class incremental learning tasks; generating channel weight coefficients from the channel statistical vectors using the attention branch of the corresponding task; and applying weighted modulation to the RoI features in a residual manner. Unlike the conventional squeeze-and-excitation (SE) attention mechanism, this embodiment places the modulation in the RoI feature processing stage and uses task-specific branches. Combined with the enable / bypass control of incremental tasks, instance features are discriminatively enhanced only at stages with a high risk of fine-grained confusion, which reduces unnecessary perturbations and alleviates forgetting.
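The modulation path can be illustrated with a small framework-free sketch. It assumes an SE-style branch (average pooling, a bottleneck of two linear maps with ReLU and sigmoid) and a residual form `x + x * w`; the embodiment does not fix the layer sizes or the exact residual formula, and the class and function names below are hypothetical.

```python
import math
from typing import Dict, List

Feature = List[List[List[float]]]  # one RoI feature laid out as [channel][row][col]


def global_avg_pool(x: Feature) -> List[float]:
    """Channel statistics: mean response of every channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]


def linear(weights: List[List[float]], v: List[float]) -> List[float]:
    """Matrix-vector product without bias."""
    return [sum(w * vi for w, vi in zip(row, v)) for row in weights]


class TaskAttentionBranch:
    """Hypothetical task-specific branch: linear -> ReLU -> linear -> sigmoid."""

    def __init__(self, w_down: List[List[float]], w_up: List[List[float]]):
        self.w_down = w_down  # shape [hidden][channels]
        self.w_up = w_up      # shape [channels][hidden]

    def channel_weights(self, stats: List[float]) -> List[float]:
        hidden = [max(0.0, h) for h in linear(self.w_down, stats)]
        return [1.0 / (1.0 + math.exp(-z)) for z in linear(self.w_up, hidden)]


def modulate_roi(x: Feature, branches: Dict[int, TaskAttentionBranch],
                 task_id: int, enabled: bool) -> Feature:
    """Enable path: residual channel re-weighting; bypass path: features unchanged."""
    if not enabled:
        return x
    weights = branches[task_id].channel_weights(global_avg_pool(x))
    return [[[v + v * w for v in row] for row in ch] for ch, w in zip(x, weights)]


# Toy example: 2 channels of size 1x2, zero-initialised branch (all weights 0.5).
x = [[[1.0, 3.0]], [[2.0, 2.0]]]
branches = {2: TaskAttentionBranch(w_down=[[0.0, 0.0]], w_up=[[0.0], [0.0]])}
out = modulate_roi(x, branches, task_id=2, enabled=True)
bypassed = modulate_roi(x, branches, task_id=2, enabled=False)
```

With a zero-initialised branch every channel weight is 0.5, so the enabled path scales each value by 1.5, while the bypass path hands the RoI features to the detection head untouched. In a real detector the same logic would operate on batched tensors with learned weights.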
[0093] After completing the above enable / bypass settings, continue training the (t+1)th class incremental learning task and output the detection results. The detection results should include at least the target category and target location information, such as bounding boxes and category labels. Repeat the above steps in subsequent class incremental learning tasks to achieve continuous learning and stable recognition of marine debris targets, obtaining a trained target detection model, which can then be used for fine-grained recognition of marine debris.
[0094] (5) Collect target detection data through UAV, input it into the trained target detection model, and output the target detection results.
[0095] The following is a verification of the effectiveness of the marine debris fine-grained identification method in this embodiment:
[0096] In this embodiment, under the incremental target detection task setting, the same task division, training protocol, and evaluation criteria are used to compare the detection performance of the baseline scheme and the identification method of this embodiment. The evaluation metric is the average category accuracy over 11 marine debris categories (obtained by averaging the detection accuracy of the 11 target categories); the baseline scheme is trained using the existing experience replay strategy.
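[0097] Table 1 Comparison of Average Category Precision
[0098]
Method | Average category precision (11 categories)
Baseline (experience replay only) | 24.57%
Method of Example 1 | 51.3%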
[0099] Under the same incremental target detection task division, training process and evaluation criteria, the detection performance of the baseline method using only the experience replay strategy and the method of the present invention are compared. As shown in Table 1, the average class accuracy of the experience replay method is 24.57% for 11 classes, while the average class accuracy of the method in Example 1 is 51.3%. This shows that the present invention can significantly improve the overall detection and recognition effect of marine debris targets in the incremental learning scenario, especially in the ability to distinguish fine-grained easily confused categories.
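As a quick arithmetic check on the two figures reported above (no additional data is assumed), the absolute and relative gains work out as follows:

```latex
\Delta = 51.3\% - 24.57\% = 26.73 \text{ percentage points},
\qquad
\frac{51.3}{24.57} \approx 2.09
```

That is, the reported average is roughly twice that of the experience replay baseline under the same task division.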
[0100] Based on the above method for fine-grained identification of marine debris based on UAV imagery, the marine debris fine-grained identification system of this embodiment includes the following functional modules: a construction module, a training module, and a detection module.
[0101] The construction module is used to construct a target detection dataset from data collected by UAV and divide it into N class incremental learning tasks.
[0102] The training module is used to sequentially input the class incremental learning tasks into the object detection model for training, yielding a trained object detection model. During the training of the (t+1)th class incremental learning task, samples are extracted from the data corresponding to the t historical class incremental learning tasks and the current (t+1)th class incremental learning task to form a prior analysis dataset. This prior analysis dataset is input into the prior analyzer for similarity analysis, yielding a task similarity index that represents the difficulty of distinguishing marine debris categories, where t∈[1, N-1]. Based on the task similarity index, it is determined whether to enable RoI fine-grained discrimination modulation during the training of the current (t+1)th class incremental learning task. If yes, the RoI features in the object detection model training process are weighted and modulated based on an attention mechanism before the original processing of the remaining network structures continues. If no, the original network structure of the object detection model is maintained during training.
[0103] The aforementioned detection module is used to collect target detection data through a drone, input it into the trained target detection model, and output the target detection results.
[0104] The specific processing procedures of the above functional modules can be found in the detailed description of the above-mentioned fine-grained identification method for marine debris, and will not be repeated here.
[0105] The above description is merely a detailed explanation of preferred embodiments and principles of the present invention. For those skilled in the art, there may be changes in specific implementation methods based on the ideas provided by the present invention, and these changes should also be considered within the scope of protection of the present invention.
Claims
1. A method for fine-grained identification of marine debris based on UAV imagery, characterized in that it includes the following steps: S1. Construct a target detection dataset from data collected by UAVs and divide it into N class incremental learning tasks, where the target detection dataset contains different categories of marine debris and N is an integer greater than 1. S2. Input the class incremental learning tasks into the target detection model sequentially for training to obtain the trained target detection model. During the training of the (t+1)th class incremental learning task, samples are extracted from the data corresponding to the t historical class incremental learning tasks and the (t+1)th class incremental learning task to form a prior analysis dataset. The prior analysis dataset is input into the prior analyzer for similarity analysis to obtain a task similarity index that characterizes the difficulty of distinguishing marine debris categories, where t∈[1, N-1]. Based on the task similarity index, it is determined whether to enable RoI fine-grained discrimination modulation during the training of the (t+1)th class incremental learning task. If yes, the RoI features in the target detection model training process are weighted and modulated based on an attention mechanism before the processing of the remaining network structures continues. If no, the original network structure of the target detection model is maintained for training. Determining, based on the task similarity index, whether to enable RoI fine-grained discrimination modulation during the training of the (t+1)th class incremental learning task includes the following steps: S25. Determine whether t is 1; if yes, the current class incremental learning task is the second class incremental learning task, and proceed to step S26; if no, proceed to step S27. S26. Determine whether the task similarity index of the second class incremental learning task is greater than the task similarity index of the first class incremental learning task; if yes, enable RoI fine-grained discrimination modulation; if no, do not enable RoI fine-grained discrimination modulation. S27. Calculate the distance between the task similarity index of the current class incremental learning task and the task similarity index of each of the t historical class incremental learning tasks, determine the target historical class incremental learning task that is closest to the current class incremental learning task, and determine whether the current class incremental learning task enables RoI fine-grained discrimination modulation during training based on whether the target historical class incremental learning task enabled RoI fine-grained discrimination modulation during its training. S3. Collect target detection data with the UAV, input it into the trained target detection model, and output the target detection results.
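The enable decision of steps S25 to S27 can be sketched as a small function. This is an illustration under stated assumptions, not the claimed implementation: the distance is taken to be the absolute difference between two indices, the current task simply copies the enable status of the nearest historical task, ties go to the earliest task, and the first task is assumed to have a recorded status. The name `decide_modulation` and its arguments are hypothetical.

```python
def decide_modulation(indices, enabled_history):
    """Decide whether the newest task enables RoI modulation.

    indices: task similarity indices for tasks 1..t+1 (last entry is current).
    enabled_history: enable flags recorded for the t historical tasks.
    """
    t = len(indices) - 1  # number of historical tasks
    if t < 1 or len(enabled_history) != t:
        raise ValueError("need one recorded flag per historical task")
    current = indices[-1]

    # S25/S26: for the second task, compare directly with the first task.
    if t == 1:
        return current > indices[0]

    # S27: find the nearest historical task (absolute difference is an
    # assumption) and reuse its enable status (also an assumption).
    distances = [abs(current - past) for past in indices[:-1]]
    nearest = min(range(t), key=distances.__getitem__)  # earliest on ties
    return enabled_history[nearest]


second_task = decide_modulation([0.2, 0.5], [False])
third_task = decide_modulation([0.2, 0.5, 0.45], [False, second_task])
```

In this toy run the second task enables modulation because its index exceeds that of the first task, and the third task inherits that status because its index is closest to the second task's.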
2. The method of claim 1, wherein step S1 specifically includes the following steps: S11. Collect, using UAVs, a target detection dataset containing multiple categories of marine debris, where each data sample includes at least the input image and the corresponding target bounding boxes and category labels. S12. Divide the target detection dataset into multiple non-overlapping category subsets according to the marine debris categories. S13. Each category subset corresponds to one class incremental learning task, and the subsets are input into the object detection model for training in a predetermined order.
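The division of steps S12 and S13 can be sketched as follows. The claim does not say how many categories go into each subset, so the equal-sized chunking, the function name `split_into_tasks`, and the placeholder category names are all assumptions; only the count of 11 categories comes from the embodiment.

```python
def split_into_tasks(categories, classes_per_task):
    """Partition an ordered category list into disjoint, ordered subsets."""
    if classes_per_task < 1:
        raise ValueError("classes_per_task must be positive")
    ordered = list(categories)
    if len(set(ordered)) != len(ordered):
        raise ValueError("category names must be unique")
    # Consecutive chunks keep the predetermined training order.
    return [
        ordered[start:start + classes_per_task]
        for start in range(0, len(ordered), classes_per_task)
    ]


# Placeholder names; the embodiment evaluates 11 marine debris categories.
categories = [f"debris_class_{i}" for i in range(1, 12)]
tasks = split_into_tasks(categories, classes_per_task=4)
```

Each element of `tasks` then stands for one class incremental learning task, and no category appears in more than one task.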
3. The method of claim 1, wherein, in step S2, inputting the prior analysis dataset into the prior analyzer for similarity analysis includes the following steps: S21. Input the prior analysis dataset into the target detection model that has completed the t-th class incremental learning task, and extract the feature vector corresponding to each target instance from the region-of-interest (RoI) feature layer of the model. S22. Aggregate the feature vectors of target instances belonging to the same marine debris category to obtain the category feature prototype. S23. Calculate the intra-task similarity between the category feature prototypes within the current (t+1)th class incremental learning task, and calculate the cross-task similarity between the category feature prototypes of the current (t+1)th class incremental learning task and those of the t historical class incremental learning tasks. S24. Accumulate the intra-task similarity and cross-task similarity and normalize them to obtain the task similarity index.
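Steps S22 to S24 can be illustrated with a short sketch. The claim names neither the aggregation nor the similarity measure, so mean aggregation and cosine similarity are assumptions here, as is normalizing by the number of prototype pairs. All function names are hypothetical, and plain lists stand in for the RoI feature vectors extracted in S21.

```python
import math
from itertools import combinations


def class_prototypes(features, labels):
    """S22: average the feature vectors of each category (assumed mean)."""
    grouped = {}
    for vector, label in zip(features, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def cosine(a, b):
    """Cosine similarity (an assumed choice of similarity measure)."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm


def task_similarity_index(current, historical):
    """S23/S24: accumulate intra- and cross-task similarity, then normalize."""
    intra = [cosine(current[a], current[b])
             for a, b in combinations(sorted(current), 2)]
    cross = [cosine(c, h) for c in current.values() for h in historical.values()]
    pair_count = len(intra) + len(cross)
    if pair_count == 0:
        return 0.0
    return (sum(intra) + sum(cross)) / pair_count


current = class_prototypes([[1.0, 0.0], [0.0, 1.0]], ["a", "b"])
historical = class_prototypes([[1.0, 0.0], [3.0, 0.0]], ["c", "c"])
index = task_similarity_index(current, historical)
```

In this toy case one of the three prototype pairs is identical in direction and the other two are orthogonal, so the index is 1/3. Under these assumptions, a higher index means the categories are harder to tell apart.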
4. The method for fine-grained identification of marine debris according to claim 3, characterized in that the task similarity index $S_{t+1}$ of the (t+1)th class incremental learning task is: $S_{t+1} = \dfrac{S_{\mathrm{intra}} + S_{\mathrm{cross}}}{N_{\mathrm{pair}}}$; where $S_{\mathrm{intra}}$ is the sum of the intra-task similarities, $S_{\mathrm{cross}}$ is the sum of the cross-task similarities, and $N_{\mathrm{pair}}$ is the total number of category feature prototype pairs involved in the similarity calculation.
5. The method for fine-grained identification of marine debris according to claim 4, characterized in that the distance between the task similarity index of the current class incremental learning task and the task similarity index of each historical class incremental learning task is: [formula not recovered]; where [definition not recovered].
6. The method for fine-grained identification of marine debris according to claim 5, characterized in that the target historical class incremental learning task is the [index not recovered]-th class incremental learning task: [formula not recovered].
7. The method for fine-grained identification of marine debris according to any one of claims 1-6, characterized in that the attention-based weighted modulation process includes: first, global pooling is performed on the RoI features to obtain a channel statistical vector that reflects the response strength of each channel; then, in the enabled stage, separate attention branches are set for different class incremental learning tasks, and the attention branch of the corresponding task generates channel weight coefficients from the channel statistical vector and applies weighted modulation to the RoI features in a residual manner.
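The attention-based weighted modulation can be sketched in plain Python to make the data flow concrete. Several details are assumptions, since the claim does not fix them: average pooling as the global pooling, a two-layer branch with ReLU and sigmoid, and the residual form `x + w * x`. The names `global_average_pool`, `channel_weights`, `modulate`, and the `branches` dictionary keyed by task number are hypothetical. A real detector would use tensor operations instead of nested lists.

```python
import math


def global_average_pool(roi):
    """roi: C channels, each an H x W nested list. Returns C channel means."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in roi]


def channel_weights(stats, w1, w2):
    """Assumed branch: linear + ReLU, then linear + sigmoid, one weight per channel."""
    hidden = [max(0.0, sum(w * s for w, s in zip(row, stats))) for row in w1]
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2]


def modulate(roi, branch):
    """Residual modulation (assumed form): out = x + weight * x per channel."""
    weights = channel_weights(global_average_pool(roi), branch["w1"], branch["w2"])
    return [[[value * (1.0 + w) for value in row] for row in ch]
            for ch, w in zip(roi, weights)]


# One branch per class incremental learning task; weights here are toy values.
branches = {2: {"w1": [[0.0, 0.0]], "w2": [[0.0], [0.0]]}}
roi = [[[1.0, 3.0]], [[2.0, 2.0]]]  # C=2, H=1, W=2
out = modulate(roi, branches[2])
```

With all-zero branch weights the sigmoid yields 0.5 for every channel, so each feature is scaled by 1.5. The residual form keeps the original feature in the output even when a channel weight is small.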
8. The method for fine-grained identification of marine debris according to any one of claims 1-6, characterized in that, The object detection model is the object detection framework Faster R-CNN.
9. A fine-grained marine debris identification system based on UAV imagery, employing the fine-grained marine debris identification method as described in any one of claims 1-8, characterized in that the marine debris fine-grained identification system includes: a construction module, used to construct a target detection dataset from data collected by UAVs and divide it into N class incremental learning tasks, where the target detection dataset contains different categories of marine debris and N is an integer greater than 1; a training module, used to sequentially input the class incremental learning tasks into the object detection model for training, yielding a trained object detection model, wherein during the training of the (t+1)th class incremental learning task, samples are extracted from the data corresponding to the t historical class incremental learning tasks and the current (t+1)th class incremental learning task to form a prior analysis dataset; this prior analysis dataset is input into the prior analyzer for similarity analysis, yielding a task similarity index that represents the difficulty of distinguishing marine debris categories, where t∈[1, N-1]; based on the task similarity index, it is determined whether to enable RoI fine-grained discrimination modulation during the training of the current (t+1)th class incremental learning task; if yes, the RoI features in the object detection model training process are weighted and modulated based on an attention mechanism before the original processing of the remaining network structures continues; if no, the original network structure of the object detection model is maintained during training; and a detection module, used to collect target detection data with the UAV, input it into the trained target detection model, and output the target detection results.