A YOLOv8 network-based failed spacecraft component detection network and method

By improving the backbone and neck structure of the YOLOv8 network and combining HCE, CBAM and ECA mechanisms, the illumination and motion status problems in the detection of failed spacecraft components were solved, thereby improving the detection accuracy and precision.

CN120070965B · Active · Publication Date: 2026-06-16 · XIAN INST OF OPTICS & PRECISION MECHANICS CHINESE ACAD OF SCI

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
XIAN INST OF OPTICS & PRECISION MECHANICS CHINESE ACAD OF SCI
Filing Date
2025-01-24
Publication Date
2026-06-16


Abstract

The application relates to a target detection network and method, in particular to a failed spacecraft component detection network and method based on the YOLOv8 network. It addresses the problem that current methods for detecting failed spacecraft components in optical images miss or misdetect key components because of illumination problems and motion state problems. The network replaces C2f in the YOLOv8 backbone network with HCE, which combines the CBAM and ECA mechanisms to strengthen feature extraction from complex, noisy images, so that target features are extracted accurately and target information in the image is easier to recognize. It also adds R-GAM to the YOLOv8 neck network: the skip connections of the residual network ensure that information is transmitted efficiently between different layers, and the attention mechanism optimizes and updates the feature map so that more attention is focused on key component features.

Description

Technical Field

[0001] This invention relates to target detection networks and methods, specifically to a failed spacecraft component detection network and method based on the YOLOv8 (You Only Look Once version 8) network.

Background Technology

[0002] Artificial satellites in orbit may fail due to natural shocks, accidents, or fuel depletion. These failed satellites, as non-cooperative targets, not only waste orbital resources but may also disintegrate, posing a threat to space security. Therefore, capturing or maintaining failed satellites has become a critical task. The core of the capture mission lies in identifying the satellite's key components, such as solar panels and radar antennas; while the focus of on-orbit maintenance is identifying the satellite's overall structure and key interfaces, such as the satellite body and docking surfaces.

[0003] Optical imaging technology, due to its intuitiveness, high resolution, and rich information in target detection, has demonstrated superior applicability in the capture of failed spacecraft components. Spacecraft component detection based on optical images falls under the target detection direction of computer vision. Currently, these detection methods are mainly divided into two categories: traditional target detection methods and detection methods based on convolutional neural networks (CNNs).

[0004] Traditional object detection methods typically rely on feature fitting (such as points, lines, and circles), but these methods require adjustments to fitting parameters under different lighting conditions and object types, resulting in poor adaptability. Furthermore, they depend on complex image preprocessing procedures. With the development of deep learning technology, CNN-based detection methods have demonstrated significant effectiveness in local component detection, such as high-voltage line insulator detection and material surface defect detection.

[0005] Currently, the detection and identification of key spacecraft components face two major challenges:

[0006] (1) Illumination problem: Images taken against a deep space background often suffer from insufficient brightness, low contrast and high noise, which seriously affect image quality and visual effect, causing target information to be hidden in noise, thereby reducing the performance of the detector.

[0007] (2) Motion status issues: When photographing a moving spacecraft, shaking and blurring are likely to occur. These problems hinder the accurate identification of key components and significantly restrict the detection task.

[0008] The aforementioned problems often lead to missed or false detections of critical components in failed spacecraft. Therefore, developing efficient and robust methods for detecting spacecraft components is of great significance for coping with the complex space environment and improving detection accuracy.

Summary of the Invention

[0009] The purpose of this invention is to address the shortcomings of current methods for detecting failed spacecraft components in optical images, which may lead to missed or false detections of critical components due to illumination and motion issues. This invention provides a failed spacecraft component detection network and method based on the YOLOv8 network.

[0010] To address the shortcomings of the existing technology, the present invention provides the following technical solution:

[0011] A failed spacecraft component detection network based on the YOLOv8 network, characterized by including a backbone network, a neck network, and a head network;

[0012] The backbone network is used to extract features from the input image. It includes, connected in sequence from input to output: Conv (convolution), Conv, HCE (Hybrid CBAM and ECA, hybrid attention module), Conv, HCE, Conv, HCE, Conv, HCE, and SPPF (Spatial Pyramid Pooling-Fast, spatial pyramid pooling module).

[0013] The neck network is used to further process the feature map output by the backbone network. It includes, connected in sequence from input to output: Upsample, Concat, C2f (CSP Bottleneck with 2 Convolutions), Upsample, Concat, C2f, R-GAM (Residual Global Attention Mechanism), Conv, Concat, C2f, R-GAM, Conv, Concat, C2f, and R-GAM. The head network is used to classify and locate the target based on the feature map output by the neck network, and it includes three Detect modules.

[0014] The second HCE output of the backbone network is also connected to the second Concat input of the neck network, the third HCE output of the backbone network is also connected to the first Concat input of the neck network, and the SPPF output of the backbone network is connected to the first Upsample input and the fourth Concat input of the neck network, respectively. The first R-GAM output of the neck network is also connected to the first Detect, the second R-GAM output is also connected to the second Detect, and the third R-GAM output is connected to the third Detect.
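The wiring in paragraphs [0012] to [0014] can be sketched as plain data. This is only an illustrative reading: the 0-based indexing over the combined backbone and neck list is a convention chosen here, the helper `nth` is hypothetical, and the sketch assumes that the k-th R-GAM feeds the k-th Detect. Connections not stated in the text (for example, the second input of the third Concat) are left out.

```python
# Layer order follows paragraphs [0012]-[0013]; the 0-based indexing over the
# combined backbone + neck list is a convention assumed for this sketch.
BACKBONE = ["Conv", "Conv", "HCE", "Conv", "HCE", "Conv", "HCE", "Conv", "HCE", "SPPF"]
NECK = ["Upsample", "Concat", "C2f", "Upsample", "Concat", "C2f", "R-GAM",
        "Conv", "Concat", "C2f", "R-GAM", "Conv", "Concat", "C2f", "R-GAM"]
LAYERS = BACKBONE + NECK


def nth(kind, n, part):
    """Index in LAYERS of the n-th (1-based) layer of type `kind` in a part."""
    offset = 0 if part == "backbone" else len(BACKBONE)
    names = BACKBONE if part == "backbone" else NECK
    hits = [i for i, name in enumerate(names) if name == kind]
    return offset + hits[n - 1]


# Extra (non-sequential) connections stated in [0014]: (source, destination).
SKIPS = [
    (nth("HCE", 3, "backbone"), nth("Concat", 1, "neck")),
    (nth("HCE", 2, "backbone"), nth("Concat", 2, "neck")),
    (nth("SPPF", 1, "backbone"), nth("Concat", 4, "neck")),
]

# Assumption: the k-th R-GAM output feeds the k-th Detect.
DETECT_INPUTS = [nth("R-GAM", k, "neck") for k in (1, 2, 3)]

for src, dst in SKIPS:
    print(f"{LAYERS[src]}[{src}] -> {LAYERS[dst]}[{dst}]")
print("Detect inputs:", DETECT_INPUTS)
```

Keeping the topology as data makes it easy to check that every skip connection points from an earlier layer to a later Concat.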

[0015] Each HCE includes a Conv, a Split, n BottleneckCBAMs (Bottleneck with CBAM), a Concat, and a Conv connected in sequence, where n ≥ 2. The output of each BottleneckCBAM is connected to the input of the Concat via a BottleneckECA (Bottleneck with ECA). The output of the Split is also connected to the input of the Concat.

[0016] Each R-GAM includes an MLP (Multi-Layer Perceptron), a first Sigmoid function, a first multiplication module, a depthwise convolution module, a pointwise convolution module, a second Sigmoid function, a second multiplication module, and a residual connection module, connected in sequence from input to output. The input of the MLP is connected to the corresponding C2f output, which is also connected to the second input of the first multiplication module. The output of the first multiplication module is also connected to the second input of the second multiplication module. The MLP is used to transform the output of the corresponding C2f from C×W×H dimensions to W×H×C dimensions, where C is the number of channels, and W and H are the width and height of the feature map. The first Sigmoid function is used to generate a weight matrix from the output of the MLP. The first multiplication module performs element-wise multiplication on the output of the corresponding C2f and the weight matrix to generate a second feature map. The depthwise convolution module performs independent convolution operations on each channel of the second feature map to generate intermediate features. The pointwise convolution module transforms the intermediate features from W×H×C dimensions to W×H×C´ dimensions, where C´ is the number of output channels of the pointwise convolution module. The second Sigmoid function is used to generate a new weight matrix from the output of the pointwise convolution module. The second multiplication module performs element-wise multiplication on the second feature map and the new weight matrix to generate a third feature map. The residual connection module generates the output features of R-GAM from the third feature map.

[0017] Furthermore, the BottleneckCBAM includes Conv, Conv, and CBAM (Convolutional Block Attention Module) with their input and output connected in sequence; the first Conv is used to extract preliminary features from the input of the BottleneckCBAM and transform them into deeper features; the second Conv is used to extract higher-level features from the output of the first Conv; and the CBAM is used to generate a spatial attention map based on the output of the second Conv.

[0018] If shortcut=True, the output of BottleneckCBAM is the sum of the input of the first Conv and the spatial attention map; if shortcut=False, the output of BottleneckCBAM is the spatial attention map.

[0019] Furthermore, the BottleneckECA comprises Conv, Conv, and ECA (Efficient Channel Attention) connected sequentially between the input and output; the first Conv is used to extract preliminary features from the input of the BottleneckECA and transform them into features of lower dimension; the second Conv is used to extract further features from the output of the first Conv while maintaining the depth of the feature map; the ECA is used to calculate channel attention weights through local convolution operations and apply the channel attention weights to the output of the second Conv;

[0020] If shortcut=True, the output of BottleneckECA is the sum of the input of the first Conv and the output of ECA; if shortcut=False, the output of BottleneckECA is the output of ECA.
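As a minimal sketch of the BottleneckECA behaviour described in [0019] and [0020], the code below implements channel attention in the usual ECA style (global average pooling, a small 1-D convolution across neighbouring channels, a sigmoid, then channel re-weighting) and the `shortcut` switch. The function names, the fixed kernel values, and the use of nested lists in C×H×W order are assumptions for illustration; `conv1` and `conv2` are passed in as callables and stand in for the two Conv layers, whose parameters the text does not give.

```python
import math


def eca(x, kernel=(1 / 3, 1 / 3, 1 / 3)):
    """Efficient Channel Attention on a C x H x W nested list.

    kernel: 1-D convolution weights slid across the channel axis
    (learned in practice; fixed here purely for illustration).
    """
    # Global average pooling: one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    # Local 1-D convolution over neighbouring channels, then sigmoid.
    weights = []
    for i in range(len(x)):
        s = sum(k * padded[i + j] for j, k in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-s)))
    # Re-weight every channel by its attention weight.
    return [[[v * w for v in row] for row in ch] for ch, w in zip(x, weights)]


def add3(a, b):
    """Element-wise sum of two equally shaped C x H x W nested lists."""
    return [[[u + v for u, v in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


def bottleneck_eca(x, conv1, conv2, shortcut=True, kernel=(1 / 3, 1 / 3, 1 / 3)):
    """Conv -> Conv -> ECA, with an optional residual sum as in [0020]."""
    y = eca(conv2(conv1(x)), kernel)
    # The sum requires the block output to have the same shape as its input.
    return add3(x, y) if shortcut else y


identity = lambda t: t  # stand-in for a Conv layer
x = [[[1.0, 1.0]], [[2.0, 2.0]]]  # C=2, H=1, W=2
print(bottleneck_eca(x, identity, identity, shortcut=True))
```

The same `shortcut` pattern applies to BottleneckCBAM, with the CBAM output in place of the ECA output.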

[0021] Meanwhile, this invention also provides a method for detecting failed spacecraft components based on YOLOv8 networks, which is characterized by including the following steps:

[0022] Step 1: In the YOLOv8 backbone network, replace each Bottleneck of each C2f with BottleneckCBAM, and use a BottleneckECA to connect the output of each BottleneckCBAM to the input of Concat, that is, replace each C2f with HCE.

[0023] Within the neck network of YOLOv8, the outputs of the second and third C2f are each connected via an R-GAM to the corresponding Conv input in the neck network and the corresponding Detect input in the head network. The output of the fourth C2f is connected to the corresponding Detect input in the head network via an R-GAM, thus obtaining the initial YOLOv8-based failed spacecraft component detection network.

[0024] Step 2: Collect images of the failed spacecraft using the model of the failed spacecraft to be detected, perform data augmentation, and then divide the data into training set, validation set, and test set. After annotating the images in the training set and validation set, input them into the initial failed spacecraft component detection network obtained in Step 1 for training, and use the test set for evaluation to obtain the trained failed spacecraft component detection network based on YOLOv8 network.

[0025] Step 3: Input the real image of the failed spacecraft to be detected into the trained failed spacecraft component detection network to obtain the detection results of the failed spacecraft components, thus completing the failed spacecraft component detection.

[0026] Furthermore, in step 1, the working process of each HCE is as follows:

[0027] Step A1: The first Conv of HCE performs a convolution operation on the input of HCE to extract local features, which are input into Split;

[0028] Step A2: Split the output of the first Conv to obtain multiple segmented features, which are then input into the first BottleneckCBAM and Concat respectively.

[0029] Step A3: The first Conv of the i-th BottleneckCBAM extracts preliminary features from the segmented features and transforms them into deeper features. Then, the second Conv extracts higher-level features from the output of the first Conv. Finally, the CBAM generates a spatial attention map based on the output of the second Conv; i∈[1,n-1]; when shortcut=true, the input of the first Conv is added to the spatial attention map and used as the output of the i-th BottleneckCBAM, which is then input into the i-th BottleneckECA and the (i+1)-th BottleneckCBAM respectively; when shortcut=false, the spatial attention map is used as the output of the i-th BottleneckCBAM and input into the i-th BottleneckECA and the (i+1)-th BottleneckCBAM respectively.

[0030] The first Conv of the i-th BottleneckECA extracts preliminary features from the output of the i-th BottleneckCBAM and transforms them into lower-dimensional features. Then, the second Conv extracts more features from the output of the first Conv while maintaining the depth of the feature map. The ECA then calculates channel attention weights through local convolution operations and applies these weights to the output of the second Conv. When shortcut=true, the input of the first Conv is added to the output of the ECA to obtain the output of the i-th BottleneckECA, which is then input into Concat. When shortcut=false, the output of the ECA is used as the output of the i-th BottleneckECA, which is then input into Concat.

[0031] The first Conv of the nth BottleneckCBAM extracts preliminary features from the segmented features and transforms them into deeper features. Then, the second Conv extracts higher-level features from the output of the first Conv. Finally, the CBAM generates a spatial attention map based on the output of the second Conv. When shortcut=true, the input of the first Conv is added to the spatial attention map and used as the output of the nth BottleneckCBAM, which is then input into Concat and the nth BottleneckECA. When shortcut=false, the spatial attention map is used as the output of the nth BottleneckCBAM and input into Concat and the nth BottleneckECA.

[0032] Follow the above process until the output of the nth BottleneckECA is input into Concat, then execute step A4;

[0033] Step A4: Concat concatenates the segmented features output by Split in Step A2, the output of the nth BottleneckCBAM from Step A3, and the outputs of the n BottleneckECAs to obtain the concatenated features. The concatenated features are then input into the second Conv of HCE for convolution processing to generate the output of HCE.
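Steps A1 to A4 describe a dataflow that can be sketched independently of any particular layer implementation. In the sketch below every sub-module is passed in as a callable, and the demo uses toy stand-ins on flat lists; all names are hypothetical. Two points are assumptions rather than statements from the text: the last segment produced by Split is the one fed to the first BottleneckCBAM (as in the Ultralytics C2f implementation), and Concat receives the outputs of all n BottleneckECAs.

```python
def hce_forward(x, conv_in, split, cbams, ecas, concat, conv_out):
    """Dataflow of one HCE block following steps A1-A4."""
    assert len(cbams) == len(ecas) >= 2     # n >= 2, one ECA per CBAM
    segments = split(conv_in(x))            # A1 + A2
    y = segments[-1]                        # assumed input of the first BottleneckCBAM
    eca_outs = []
    for cbam, eca in zip(cbams, ecas):      # A3: chain of BottleneckCBAMs
        y = cbam(y)                         # feeds the next BottleneckCBAM
        eca_outs.append(eca(y))             # and reaches Concat via its BottleneckECA
    # A4: Split segments, the n-th BottleneckCBAM output, and the ECA outputs.
    branches = list(segments) + [y] + eca_outs
    return conv_out(concat(branches))


# Toy stand-ins so the dataflow can be traced by hand.
identity = lambda t: t
halves = lambda t: [t[: len(t) // 2], t[len(t) // 2:]]
double = lambda t: [2 * v for v in t]
plus_one = lambda t: [v + 1 for v in t]
flatten = lambda parts: [v for part in parts for v in part]

out = hce_forward([1.0, 2.0, 3.0, 4.0], identity, halves,
                  [double, double], [plus_one, plus_one], flatten, identity)
print(out)
```

With real layers, `concat` would join along the channel axis, so the second Conv of HCE must accept the summed channel count of all branches.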

[0034] Furthermore, in step 1, the working process of each R-GAM is as follows:

[0035] Step B1: Input the output of C2f into the MLP and the first multiplication module of R-GAM respectively;

[0036] Step B2: The MLP transforms the output of C2f from the C×W×H dimension to the W×H×C dimension, inputs it into the first Sigmoid, and then generates a weight matrix from the first Sigmoid and inputs it into the first multiplication module.

[0037] Step B3: The first multiplication module performs element-wise multiplication on the output of the corresponding C2f and the weight matrix to generate the second feature map, which is then input into the depthwise convolution module and the second multiplication module respectively.

[0038] Step B4: The depthwise convolution module performs independent convolution operations on each channel of the second feature map to generate intermediate features, which are input into the pointwise convolution module;

[0039] Step B5: The pointwise convolution module transforms the intermediate features from W×H×C dimensions to W×H×C´ dimensions and inputs them into the second Sigmoid, which generates a new weight matrix that is input into the second multiplication module.

[0040] Step B6: The second multiplication module performs element-wise multiplication on the second feature map and the new weight matrix to generate the third feature map, which is then input into the residual connection module. The residual connection module then generates the output of R-GAM.

[0041] Compared with the prior art, the beneficial effects of the present invention are:

[0042] (1) The present invention provides a failed spacecraft component detection network based on the YOLOv8 network. HCE is introduced in the YOLOv8 backbone network to replace C2f, that is, each Bottleneck of each C2f is replaced by BottleneckCBAM, and a BottleneckECA is used to connect the output of each BottleneckCBAM to the input of Concat. By combining the CBAM and ECA mechanisms, feature extraction from complex, noisy images is effectively enhanced, so that target features are extracted accurately and target information in the image is easier to recognize. This overcomes the negative impact on detection of image quality defects caused by illumination problems and ensures the accuracy and effectiveness of detection.

[0043] (2) This invention adds R-GAM to the YOLOv8 neck network. The skip connections of the residual network ensure that information can be efficiently transferred between different layers. Combined with the attention mechanism to optimize and update the feature map, it can effectively ignore irrelevant information caused by motion blur and focus more attention on the features of key components. At the same time, R-GAM strengthens the key features of the target from both channel and spatial dimensions, enhances the network's anti-interference ability, significantly improves the network's detection accuracy of targets in motion-blurred images, and reduces detection errors caused by the instability of satellite motion.

[0044] (3) The present invention provides a method for detecting failed spacecraft components based on the YOLOv8 network. Compared with the original YOLOv8 model, it can effectively improve the mAP (mean Average Precision) of failed spacecraft components (by 2.57%). In particular, the present invention has good detection performance when dealing with challenges such as image noise and jitter blur.

Attached Figure Description

[0045] Figure 1 is a schematic diagram of the structure of a YOLOv8 network;

[0046] Figure 2 is a schematic diagram of the structure of C2f in Figure 1;

[0047] Figure 3 is a schematic diagram of the initial failed spacecraft component detection network in step 1 of an embodiment of the failed spacecraft component detection method based on the YOLOv8 network of the present invention;

[0048] Figure 4 is a schematic diagram of the structure of HCE in Figure 3;

[0049] Figure 5 is a schematic diagram of the structure of BottleneckCBAM in Figure 4;

[0050] Figure 6 is a schematic diagram of the structure of BottleneckECA in Figure 4;

[0051] Figure 7 is a schematic diagram of the structure of R-GAM in Figure 3.

Detailed Implementation

[0052] The present invention will be further described below with reference to the accompanying drawings and exemplary embodiments.

[0053] Referring to Figures 1-2, the YOLOv8 network consists of a backbone, a neck, and a head. The backbone is responsible for extracting features from the input image and consists of Conv, Conv, C2f, Conv, C2f, Conv, C2f, Conv, C2f, and SPPF, connected in sequence from input to output. As shown in Figure 2, each C2f of the backbone network includes Conv, Split, n Bottlenecks (n≥2), Concat, and Conv, connected in sequence from input to output. The outputs of Split and of the n Bottlenecks are also connected to the input of Concat. C2f is a module based on the CSP (Cross Stage Partial) architecture, which refines and fuses features by repeating the Bottleneck module n times. The neck network is used to aggregate features from the backbone network. It includes Upsample, Concat, C2f, Upsample, Concat, C2f, Conv, Concat, C2f, Conv, Concat, and C2f. The head network receives the refined features from the neck network and outputs the detection results, including the prediction of the category and bounding box.

[0054] This invention discloses a method for detecting failed spacecraft components based on YOLOv8 networks, comprising the following steps:

[0055] Step 1: In the YOLOv8 backbone network shown in Figure 1, each C2f is replaced by an HCE. Specifically, each Bottleneck in each C2f is replaced by a BottleneckCBAM, and a BottleneckECA is used to connect the output of each BottleneckCBAM to the Concat input in the C2f.

[0056] Within the YOLOv8 neck network, the outputs of the second and third C2f are each connected via an R-GAM to the corresponding Conv input in the neck network and the corresponding Detect input in the head network. The output of the fourth C2f is also connected via an R-GAM to the corresponding Detect input in the head network, resulting in the initial YOLOv8-based failed spacecraft component detection network, as shown in Figure 3;

[0057] The failed spacecraft component detection network comprises a backbone network, a neck network, and a head network. The backbone network extracts features from the input image and consists of, connected in sequence from input to output: Conv, Conv, HCE, Conv, HCE, Conv, HCE, Conv, HCE, SPPF. The neck network further processes the feature map output from the backbone network and consists of, connected in sequence from input to output: Upsample, Concat, C2f, Upsample, Concat, C2f, R-GAM, Conv, Concat, C2f, R-GAM, Conv, Concat, C2f, R-GAM. The head network classifies and locates the target based on the feature map output from the neck network. The head network includes three Detect modules.

[0058] Referring to Figure 4, each HCE includes a Conv, a Split, n BottleneckCBAMs, a Concat, and a Conv, connected in sequence from input to output, where n ≥ 2; the output of each BottleneckCBAM is connected to the input of the Concat via a BottleneckECA; the output of the Split is also connected to the input of the Concat.

[0059] Referring to Figure 5, BottleneckCBAM consists of Conv, Conv, and CBAM, connected in sequence from input to output. The first Conv extracts preliminary features and transforms them into deeper features. The second Conv extracts even higher-level features from the output of the first Conv. CBAM generates a spatial attention map based on the output of the second Conv. If shortcut=True (residual connection), the output of BottleneckCBAM is the sum of the input of the first Conv and the spatial attention map. If shortcut=False (no residual connection), the output of BottleneckCBAM is the spatial attention map. Shortcut (shortcut connection) refers to a connection method that directly adds the input feature map to the output of subsequent layers.

[0060] Referring to Figure 6, BottleneckECA consists of Conv, Conv, and ECA, connected in sequence from input to output. The first Conv is used to extract preliminary features from the output of BottleneckCBAM and transform them into features of lower dimension. The second Conv is used to further extract features from the output of the first Conv while maintaining the depth of the feature map. ECA is used to calculate channel attention weights through local convolution operations and apply the channel attention weights to the output of the second Conv. If shortcut=True, the output of BottleneckECA is the sum of the input of the first Conv and the output of ECA. If shortcut=False, the output of BottleneckECA is the output of ECA.

[0061] The working process of each HCE is as follows:

[0062] Step A1: The first Conv of HCE performs a convolution operation on the input of HCE to extract local features, which are input into Split;

[0063] Step A2: Split the output of the first Conv to obtain multiple segmented features, which are then input into the first BottleneckCBAM and Concat respectively.

[0064] Step A3: The first Conv of the i-th BottleneckCBAM extracts preliminary features from the segmented features and transforms them into deeper features. Then, the second Conv extracts higher-level features from the output of the first Conv. Finally, the CBAM generates a spatial attention map based on the output of the second Conv; i∈[1,n-1]; when shortcut=true, the input of the first Conv is added to the spatial attention map and used as the output of the i-th BottleneckCBAM, which is then input into the i-th BottleneckECA and the (i+1)-th BottleneckCBAM respectively; when shortcut=false, the spatial attention map is used as the output of the i-th BottleneckCBAM and input into the i-th BottleneckECA and the (i+1)-th BottleneckCBAM respectively.

[0065] The first Conv of the i-th BottleneckECA extracts preliminary features from the output of the i-th BottleneckCBAM and transforms them into lower-dimensional features. Then, the second Conv extracts more features from the output of the first Conv while maintaining the depth of the feature map. The ECA then calculates channel attention weights through local convolution operations and applies these weights to the output of the second Conv. When shortcut=true, the input of the first Conv is added to the output of the ECA to obtain the output of the i-th BottleneckECA, which is then input into Concat. When shortcut=false, the output of the ECA is used as the output of the i-th BottleneckECA, which is then input into Concat.

[0066] The first Conv of the nth BottleneckCBAM extracts preliminary features from the segmented features and transforms them into deeper features. Then, the second Conv extracts higher-level features from the output of the first Conv. Finally, the CBAM generates a spatial attention map based on the output of the second Conv. When shortcut=true, the input of the first Conv is added to the spatial attention map and used as the output of the nth BottleneckCBAM, which is then input into Concat and the nth BottleneckECA. When shortcut=false, the spatial attention map is used as the output of the nth BottleneckCBAM and input into Concat and the nth BottleneckECA.

[0067] Follow the above process until the output of the nth BottleneckECA is input into Concat, then execute step A4;

[0068] Step A4: Concat concatenates the segmented features output by Split in Step A2, the output of the nth BottleneckCBAM from Step A3, and the outputs of the n BottleneckECAs to obtain the concatenated features. The concatenated features are then input into the second Conv of HCE for convolution processing to generate the output of HCE.

[0069] Referring to Figure 7, each R-GAM includes an MLP, a first Sigmoid, a first multiplication module, a depthwise convolution module, a pointwise convolution module, a second Sigmoid, a second multiplication module, and a residual connection module, connected in sequence from input to output. The input of the MLP is connected to the corresponding C2f output, which is also connected to the second input of the first multiplication module. The output of the first multiplication module is also connected to the second input of the second multiplication module.

[0070] The MLP transforms the output of the corresponding C2f from C×W×H dimensions to W×H×C dimensions (only changing the order of dimensions), where C is the number of channels, and W and H are the width and height of the feature map. The first Sigmoid generates a weight matrix from the MLP output. The first multiplication module performs element-wise multiplication on the output of the corresponding C2f and the weight matrix to generate the second feature map. The depthwise convolution module performs independent convolution operations on each channel of the second feature map to generate intermediate features. The pointwise convolution module transforms the intermediate features from W×H×C dimensions to W×H×C´ dimensions, where C´ is the number of output channels of the pointwise convolution module. The second Sigmoid generates a new weight matrix from the output of the pointwise convolution module. The second multiplication module performs element-wise multiplication on the second feature map and the new weight matrix to generate the third feature map. The residual connection module generates the output features of R-GAM from the third feature map.

[0071] The working process of each R-GAM is as follows:

[0072] Step B1: Input the output of C2f into the MLP and the first multiplication module of R-GAM respectively;

[0073] Step B2: The MLP transforms the output of C2f from the C×W×H dimension to the W×H×C dimension, inputs it into the first Sigmoid, and then generates a weight matrix from the first Sigmoid and inputs it into the first multiplication module.

[0074] Step B3: The first multiplication module performs element-wise multiplication on the output of the corresponding C2f and the weight matrix to generate the second feature map, which is then input into the depthwise convolution module and the second multiplication module respectively.

[0075] Step B4: The depthwise convolution module performs independent convolution operations on each channel of the second feature map to generate intermediate features, which are input into the pointwise convolution module;

[0076] Step B5: The pointwise convolution module transforms the intermediate features from W×H×C dimensions to W×H×C´ dimensions and inputs them into the second Sigmoid, which generates a new weight matrix that is input into the second multiplication module.

[0077] Step B6: The second multiplication module performs element-wise multiplication on the second feature map and the new weight matrix to generate the third feature map, which is then input into the residual connection module. The residual connection module then generates the output of R-GAM.
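Steps B1 to B6 can be traced on a tiny tensor with plain Python lists. The sketch below is one possible reading, not the patented implementation. It assumes a two-layer MLP with ReLU applied per pixel over the channel axis, C´ = C so that the second gate matches the second feature map, zero padding in the depthwise convolution, and a residual connection that adds the R-GAM input to the third feature map. All function names and weights are hypothetical.

```python
import math


def map3(f, a):
    """Apply f to every element of a C x H x W nested list."""
    return [[[f(v) for v in row] for row in ch] for ch in a]


def zip3(f, a, b):
    """Combine two equally shaped C x H x W nested lists element by element."""
    return [[[f(u, v) for u, v in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_mlp(x, w1, w2):
    """Two-layer per-pixel MLP over the channel axis (step B2)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for h in range(H):
        for w in range(W):
            pixel = [x[c][h][w] for c in range(C)]  # channels-last view of one pixel
            hidden = [max(0.0, sum(a * p for a, p in zip(row, pixel))) for row in w1]
            for c in range(C):
                out[c][h][w] = sum(a * v for a, v in zip(w2[c], hidden))
    return out


def depthwise_conv(x, kernels):
    """k x k convolution applied to each channel independently (step B4)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0])
    p = k // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for c in range(C):
        for h in range(H):
            for w in range(W):
                acc = 0.0
                for i in range(k):
                    for j in range(k):
                        hh, ww = h + i - p, w + j - p
                        if 0 <= hh < H and 0 <= ww < W:  # zero padding
                            acc += kernels[c][i][j] * x[c][hh][ww]
                out[c][h][w] = acc
    return out


def pointwise_conv(x, weight):
    """1 x 1 convolution mixing channels: C -> C' = len(weight) (step B5)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    return [[[sum(weight[o][c] * x[c][h][w] for c in range(C)) for w in range(W)]
             for h in range(H)] for o in range(len(weight))]


def r_gam(x, w1, w2, dw_kernels, pw_weight):
    gate1 = map3(sigmoid, channel_mlp(x, w1, w2))               # B2
    second = zip3(lambda u, g: u * g, x, gate1)                 # B3
    inter = depthwise_conv(second, dw_kernels)                  # B4
    gate2 = map3(sigmoid, pointwise_conv(inter, pw_weight))     # B5
    third = zip3(lambda u, g: u * g, second, gate2)             # B6
    return zip3(lambda u, v: u + v, x, third)                   # residual (assumed)
```

With all weights set to zero both gates equal 0.5 everywhere, so the output is 1.25 times the input, which gives a quick sanity check of the wiring.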

[0078] Step 2: Collect images of the failed spacecraft using the model of the failed spacecraft to be detected, perform data augmentation, and then divide the data into training set, validation set, and test set. After annotating the images in the training set and validation set, input them into the initial failed spacecraft component detection network obtained in Step 1 for training, and use the test set for evaluation to obtain a trained failed spacecraft component detection network based on the YOLOv8 network.

[0079] For details of step 2, please refer to Chinese patent CN119048869A;
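The split in step 2 can be sketched as follows. The text does not give split ratios or a shuffling strategy, so the 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are purely illustrative assumptions; the details are in the referenced patent and are not reproduced here.

```python
import random


def split_dataset(items, train=0.7, val=0.15, seed=0):
    """Shuffle and split items into training, validation, and test subsets.

    The ratios are hypothetical placeholders, not values from the source.
    """
    items = list(items)
    random.Random(seed).shuffle(items)   # reproducible shuffle
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```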

[0080] Step 3: Input the real image of the failed spacecraft to be detected into the trained failed spacecraft component detection network to obtain the detection results of the failed spacecraft components, thus completing the failed spacecraft component detection.

[0081] To evaluate the effectiveness of the embodiments of the present invention, ablation experiments were conducted on the backbone network and neck network of the YOLOv8 network, respectively.

[0082] Experiments were conducted with the modules "HCE", "2×HCE", "3×HCE", "4×HCE", "5×HCE", "2×HCE+R-GAM", and the embodiment of the present invention, "4×HCE+R-GAM". To better demonstrate the impact of the embodiments of the present invention on the detection accuracy of failed spacecraft components, the AP (Average Precision), mAP, and Recall for each failed spacecraft component were evaluated, and the results are shown in Table 1:

[0083] Table 1

[0084]

[0085] In Table 1, the added module HCE means replacing the first C2f of the YOLOv8 backbone network with an HCE; 2×HCE means replacing the first and second C2f of the YOLOv8 backbone network with HCEs; 3×HCE means replacing the first through third C2f of the YOLOv8 backbone network with HCEs; 4×HCE means replacing the first through fourth C2f of the YOLOv8 backbone network with HCEs; 5×HCE means replacing the first through fourth C2f of the YOLOv8 backbone network with HCEs and adding another HCE between the fourth HCE and SPPF; 2×HCE+R-GAM means replacing the first and second C2f of the YOLOv8 backbone network with HCEs and adding the same R-GAM as in the embodiment of the present invention to the YOLOv8 neck network.
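The backbone variants used in the ablation can be generated from the original layer list. The sketch reads "k×HCE" as replacing the first k C2f modules (at most four) with HCE and, for k = 5, inserting one more HCE directly before SPPF; the function name `backbone_variant` is hypothetical.

```python
BASE_BACKBONE = ["Conv", "Conv", "C2f", "Conv", "C2f", "Conv", "C2f", "Conv", "C2f", "SPPF"]


def backbone_variant(k):
    """Backbone layer list for the 'k x HCE' row of the ablation (0 <= k <= 5)."""
    layers, replaced = [], 0
    for name in BASE_BACKBONE:
        if name == "C2f" and replaced < min(k, 4):
            layers.append("HCE")          # replace the next C2f in order
            replaced += 1
        else:
            layers.append(name)
    if k == 5:
        # Fifth HCE sits between the fourth HCE and SPPF.
        layers.insert(layers.index("SPPF"), "HCE")
    return layers


for k in range(6):
    print(k, backbone_variant(k))
```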

[0086] As shown in Table 1, compared with the original YOLOv8 backbone network, adding the HCE, 2×HCE, 3×HCE, and 4×HCE modules increased mAP by 0.48%, 0.83%, 1.15%, and 1.30%, respectively, whereas adding the 5×HCE module decreased mAP by 0.43%. This indicates that adding too many HCE modules to the YOLOv8 backbone network may degrade performance. With the 2×HCE+R-GAM module, mAP was 0.56% higher than with the 2×HCE module alone and 1.39% higher than with the original YOLOv8 network. The embodiment of the present invention showed its greatest improvement in the AP for solar panels, which increased by 2.72%. In summary, adding HCE to the YOLOv8 backbone network and R-GAM to the neck network improves the overall detection accuracy of the detection model, as well as its accuracy on target images affected by noise and jitter blur. The improvement strategy proposed in this invention is therefore effective.
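
As a small consistency check on the figures quoted above, the sketch below treats the reported changes as percentage points (an assumption; the text writes them as "%") and confirms that the gain of 2×HCE over the baseline plus the additional gain from R-GAM reproduces the stated total. It also shows mAP as the mean of per-class AP values; the AP numbers and component names other than the solar panel are made-up placeholders, not results from Table 1.

```python
def mean_average_precision(ap_per_class):
    """mAP is the arithmetic mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)

# mAP changes relative to the original YOLOv8, as quoted in the text.
GAIN_VS_BASELINE = {"HCE": 0.48, "2xHCE": 0.83, "3xHCE": 1.15,
                    "4xHCE": 1.30, "5xHCE": -0.43}
GAIN_RGAM_OVER_2XHCE = 0.56  # 2xHCE+R-GAM compared with 2xHCE alone

# Gains relative to the same baseline add up.
total_gain = round(GAIN_VS_BASELINE["2xHCE"] + GAIN_RGAM_OVER_2XHCE, 2)
print(total_gain)

# Placeholder AP values, chosen only to illustrate the averaging.
example_ap = {"solar_panel": 0.9, "component_b": 0.7, "component_c": 0.5}
print(round(mean_average_precision(example_ap), 2))
```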

Claims

1. A failed spacecraft component detection network based on a YOLOv8 network, characterized in that: it comprises a backbone network, a neck network, and a head network. The backbone network is used to extract features from the input image, and it includes Conv, Conv, HCE, Conv, HCE, Conv, HCE, Conv, HCE, Conv, HCE, SPPF, which are connected sequentially between the input and output. The neck network is used to further process the feature map output by the backbone network, and it includes Upsample, Concat, C2f, Upsample, Concat, C2f, R-GAM, Conv, Concat, C2f, R-GAM, Conv, Concat, C2f, R-GAM, and R-GAM, which are connected in sequence between the input and output. The head network is used to classify and locate the target based on the feature map output by the neck network, and it includes three Detect modules. The second HCE output of the backbone network is also connected to the second Concat input of the neck network, the third HCE output of the backbone network is also connected to the first Concat input of the neck network, and the SPPF output of the backbone network is connected to the first Upsample input and the fourth Concat input of the neck network, respectively. The first R-GAM output of the neck network is also connected to the first Detect, the second R-GAM output is also connected to the second Detect, and the third R-GAM output is connected to the third Detect. Each HCE includes a Conv, a Split, n BottleneckCBAMs, a Concat, and a Conv connected in sequence between the input and output, where n ≥ 2. The output of each BottleneckCBAM is connected to the input of the Concat via a BottleneckECA. The output of the Split is also connected to the input of the Concat. Each R-GAM includes an MLP, a Sigmoid, a multiplication module, a depthwise convolution module, a pointwise convolution module, a Sigmoid, a multiplication module, and a residual connection module, connected in sequence between the input and output.
The input of the MLP is connected to the corresponding C2f output, which is also connected to the second input of the first multiplication module. The output of the first multiplication module is also connected to the second input of the second multiplication module. The MLP is used to transform the output of the corresponding C2f from a C×W×H dimension to a W×H×C dimension, where C is the number of channels, and W and H are the width and height of the feature map. The first Sigmoid is used to generate a weight matrix from the output of the MLP. The first multiplication module performs element-wise multiplication on the corresponding C2f output and the weight matrix to generate a second feature map. The depthwise convolution module performs independent convolution operations on each channel of the second feature map to generate intermediate features. The pointwise convolution module transforms the intermediate features from W×H×C dimensions to W×H×C´ dimensions, where C´ is the number of output channels of the pointwise convolution module. The second Sigmoid generates a new weight matrix from the output of the pointwise convolution module. The second multiplication module performs element-wise multiplication on the second feature map and the new weight matrix to generate a third feature map. The residual connection module generates the output features of R-GAM from the third feature map.

2. The failed spacecraft component detection network based on a YOLOv8 network according to claim 1, characterized in that: the BottleneckCBAM comprises Conv, Conv, and CBAM, connected in sequence between the input and output. The first Conv is used to extract preliminary features from the input of the BottleneckCBAM and transform them into deeper features. The second Conv is used to extract higher-level features from the output of the first Conv. The CBAM is used to generate a spatial attention map based on the output of the second Conv. If shortcut=True, the output of BottleneckCBAM is the sum of the input of the first Conv and the spatial attention map; if shortcut=False, the output of BottleneckCBAM is the spatial attention map.

3. The failed spacecraft component detection network based on a YOLOv8 network according to claim 1 or 2, characterized in that: the BottleneckECA comprises Conv, Conv, and ECA, connected in sequence between the input and output. The first Conv is used to extract preliminary features from the input of the BottleneckECA and transform them into features of a lower dimension. The second Conv is used to extract further features from the output of the first Conv while maintaining the depth of the feature map. The ECA is used to calculate channel attention weights through local convolution operations and apply the channel attention weights to the output of the second Conv. If shortcut=True, the output of BottleneckECA is the sum of the input of the first Conv and the output of ECA; if shortcut=False, the output of BottleneckECA is the output of ECA.
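
The channel attention and shortcut logic of the BottleneckECA can be sketched in plain Python on nested lists indexed as `x[channel][row][column]`. This is an illustration under stated assumptions: the two Conv stages are passed in as callables (`conv1`, `conv2`, hypothetical placeholders), the attention weights come from global average pooling followed by a zero-padded 1D convolution over neighbouring channels and a sigmoid, and the shortcut requires input and output shapes to match.

```python
import math

def eca(x, kernel):
    """Scale each channel of x[c][h][w] by a weight from a 1D conv over channels."""
    num_channels = len(x)
    # Global average pooling: one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    pad = len(kernel) // 2
    weights = []
    for c in range(num_channels):
        # Local 1D convolution across neighbouring channels (zero padded).
        s = sum(kernel[k] * pooled[c + k - pad]
                for k in range(len(kernel))
                if 0 <= c + k - pad < num_channels)
        weights.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid
    return [[[v * weights[c] for v in row] for row in x[c]]
            for c in range(num_channels)]

def bottleneck_eca(x, conv1, conv2, kernel, shortcut=True):
    """Conv -> Conv -> ECA, with an optional residual sum with the input."""
    y = eca(conv2(conv1(x)), kernel)
    if not shortcut:
        return y
    return [[[a + b for a, b in zip(row_x, row_y)]
             for row_x, row_y in zip(ch_x, ch_y)]
            for ch_x, ch_y in zip(x, y)]

identity = lambda feats: feats  # placeholder for a real convolution
features = [[[2.0, 4.0]], [[6.0, 8.0]]]  # 2 channels, 1 x 2 feature map
zero_kernel = [0.0, 0.0, 0.0]  # every weight becomes sigmoid(0) = 0.5
no_skip = bottleneck_eca(features, identity, identity, zero_kernel, shortcut=False)
with_skip = bottleneck_eca(features, identity, identity, zero_kernel, shortcut=True)
print(no_skip)
print(with_skip)
```

With the all-zero kernel every channel weight is 0.5, so the branch without the shortcut halves the input and the branch with the shortcut returns 1.5 times the input, which makes the effect of the shortcut flag easy to see.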

4. A method for detecting failed spacecraft components based on a YOLOv8 network, characterized in that it comprises the following steps: Step 1: In the YOLOv8 backbone network, replace each Bottleneck of each C2f with a BottleneckCBAM, and use a BottleneckECA to connect the output of each BottleneckCBAM to the input of Concat, that is, replace each C2f with an HCE. Within the neck network of YOLOv8, the outputs of the second and third C2f are connected to the corresponding Conv inputs in the neck network and the corresponding Detect inputs in the head network via an R-GAM, respectively. The output of the fourth C2f is connected to the corresponding Detect input in the head network via an R-GAM, thus obtaining the initial failed spacecraft component detection network based on the YOLOv8 network as described in claim 1. Step 2: Collect images of the failed spacecraft using the model of the failed spacecraft to be detected, perform data augmentation, and then divide the data into a training set, a validation set, and a test set. After annotating the images in the training set and validation set, input them into the initial failed spacecraft component detection network obtained in Step 1 for training, and evaluate the network using the test set to obtain the trained failed spacecraft component detection network based on the YOLOv8 network as described in claim 1. Step 3: Input the real image of the failed spacecraft to be detected into the trained failed spacecraft component detection network to obtain the detection results of the failed spacecraft components, thus completing the failed spacecraft component detection.

5. The method for detecting failed spacecraft components based on a YOLOv8 network according to claim 4, characterized in that: In step 1, the working process of each HCE is as follows: Step A1: Perform a convolution operation on the input of HCE using the first Conv of HCE to extract local features, and input the result into Split; Step A2: Split the output of the first Conv to obtain multiple segmented features, which are then input into the first BottleneckCBAM and Concat respectively. Step A3: The first Conv of the i-th BottleneckCBAM extracts preliminary features from the segmented features and transforms them into deeper features. Then, the second Conv extracts higher-level features from the output of the first Conv. Finally, the CBAM generates a spatial attention map based on the output of the second Conv; i∈[1,n-1]; when shortcut=True, the input of the first Conv is added to the spatial attention map and used as the output of the i-th BottleneckCBAM, which is then input into the i-th BottleneckECA and the (i+1)-th BottleneckCBAM respectively; when shortcut=False, the spatial attention map is used as the output of the i-th BottleneckCBAM and input into the i-th BottleneckECA and the (i+1)-th BottleneckCBAM respectively. The first Conv of the i-th BottleneckECA extracts preliminary features from the output of the i-th BottleneckCBAM and transforms them into lower-dimensional features. Then, the second Conv extracts further features from the output of the first Conv while maintaining the depth of the feature map. The ECA then calculates channel attention weights through local convolution operations and applies these weights to the output of the second Conv. When shortcut=True, the input of the first Conv is added to the output of the ECA to obtain the output of the i-th BottleneckECA, which is then input into Concat. When shortcut=False, the output of the ECA is used as the output of the i-th BottleneckECA, which is then input into Concat.
The first Conv of the n-th BottleneckCBAM extracts preliminary features from the segmented features and transforms them into deeper features. Then, the second Conv extracts higher-level features from the output of the first Conv. Finally, the CBAM generates a spatial attention map based on the output of the second Conv. When shortcut=True, the input of the first Conv is added to the spatial attention map and used as the output of the n-th BottleneckCBAM, which is then input to Concat and the n-th BottleneckECA respectively; when shortcut=False, the spatial attention map is used as the output of the n-th BottleneckCBAM, which is then input to Concat and the n-th BottleneckECA respectively. Follow the above process until the output of the n-th BottleneckECA is input into Concat, then execute step A4; Step A4: Concat concatenates the segmented features output by Split in Step A2, the output of the n-th BottleneckCBAM from Step A3, and the outputs of the n BottleneckECAs to obtain the concatenated features. The concatenated features are then input into the second Conv of HCE for convolution processing to generate the output of HCE.
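
The data flow of steps A1 to A4 can be traced with a small sketch in which a feature map is just a list of channel labels and every block is a callable. The assumptions here are loud ones: Split is taken to produce two equal chunks with only the last chunk feeding the BottleneckCBAM chain (as in the C2f module that HCE replaces), the Concat order is chosen arbitrarily, and `hce_forward` and the `tag` helper are hypothetical names for illustration only.

```python
def hce_forward(x, conv_in, cbam_blocks, eca_blocks, conv_out):
    """Trace the HCE wiring: Conv -> Split -> CBAM chain with ECA taps -> Concat -> Conv."""
    y = conv_in(x)                          # Step A1: first Conv
    half = len(y) // 2
    chunks = [y[:half], y[half:]]           # Step A2: Split along channels
    to_concat = list(chunks)                # Split outputs go straight to Concat
    current = chunks[-1]
    for cbam, eca_block in zip(cbam_blocks, eca_blocks):
        current = cbam(current)             # Step A3: i-th BottleneckCBAM
        to_concat.append(eca_block(current))  # its output reaches Concat via the i-th BottleneckECA
    to_concat.append(current)               # the n-th BottleneckCBAM output also feeds Concat
    merged = [ch for part in to_concat for ch in part]  # Step A4: Concat
    return conv_out(merged)                 # second Conv

identity = lambda feats: feats  # placeholder for the real Conv layers

def tag(label):
    """Build a stand-in block that records its name on every channel it touches."""
    return lambda feats: [f"{label}({ch})" for ch in feats]

trace = hce_forward(["c0", "c1", "c2", "c3"], identity,
                    [tag("cbam1"), tag("cbam2")],
                    [tag("eca1"), tag("eca2")], identity)
for channel in trace:
    print(channel)
```

With n = 2 and four input channels, Concat receives two chunks from Split, two BottleneckECA outputs, and the final BottleneckCBAM output, so the second Conv sees ten channels in this toy trace.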

6. The method for detecting failed spacecraft components based on a YOLOv8 network according to claim 4 or 5, characterized in that: In step 1, the working process of each R-GAM is as follows: Step B1: Input the output of C2f into the MLP and the first multiplication module of R-GAM respectively; Step B2: The MLP transforms the output of C2f from the C×W×H dimension to the W×H×C dimension and inputs it into the first Sigmoid, which generates a weight matrix and inputs it into the first multiplication module. Step B3: The first multiplication module performs element-wise multiplication on the output of the corresponding C2f and the weight matrix to generate the second feature map, which is then input into the depthwise convolution module and the second multiplication module respectively. Step B4: The depthwise convolution module performs independent convolution operations on each channel of the second feature map to generate the intermediate features, which are input into the pointwise convolution module; Step B5: The pointwise convolution module transforms the intermediate features from W×H×C dimensions to W×H×C´ dimensions and inputs them into the second Sigmoid, which generates a new weight matrix and inputs it into the second multiplication module. Step B6: The second multiplication module performs element-wise multiplication on the second feature map and the new weight matrix to generate the third feature map, which is then input into the residual connection module. The residual connection module then generates the output of R-GAM.
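
Steps B1 to B6 can be sketched in plain Python on a small tensor stored as nested lists `x[channel][row][column]`. Several details are assumptions not fixed by the text: the MLP is modelled as two linear layers with a ReLU applied over the channel axis at every pixel, the depthwise kernel is 3×3 with zero padding, C´ equals C so that the second multiplication is element-wise, and the residual connection adds the original input to the third feature map. The function name `r_gam` and all weights are illustrative only.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def r_gam(x, w1, w2, dw, pw):
    """x[c][h][w] -> output of the same shape (requires len(pw) == C)."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    # B2: MLP over the channel axis at every pixel, then the first Sigmoid.
    weight = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for i in range(H):
        for j in range(W):
            pixel = [x[c][i][j] for c in range(C)]
            hidden = [max(0.0, sum(a * p for a, p in zip(row, pixel))) for row in w1]
            for c in range(C):
                weight[c][i][j] = sigmoid(sum(a * h for a, h in zip(w2[c], hidden)))
    # B3: first element-wise multiplication -> second feature map.
    f2 = [[[x[c][i][j] * weight[c][i][j] for j in range(W)]
           for i in range(H)] for c in range(C)]
    # B4: depthwise 3x3 convolution, one kernel per channel, zero padding.
    mid = [[[sum(dw[c][di + 1][dj + 1] * f2[c][i + di][j + dj]
                 for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if 0 <= i + di < H and 0 <= j + dj < W)
             for j in range(W)] for i in range(H)] for c in range(C)]
    # B5: pointwise (1x1) convolution mixing channels, then the second Sigmoid.
    new_w = [[[sigmoid(sum(pw[o][c] * mid[c][i][j] for c in range(C)))
               for j in range(W)] for i in range(H)] for o in range(len(pw))]
    # B6: second multiplication -> third feature map, then residual sum with x.
    return [[[x[c][i][j] + f2[c][i][j] * new_w[c][i][j] for j in range(W)]
             for i in range(H)] for c in range(C)]

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

C, H, W, HIDDEN = 4, 3, 3, 2
x = [rand_matrix(H, W) for _ in range(C)]
w1, w2 = rand_matrix(HIDDEN, C), rand_matrix(C, HIDDEN)
dw = [rand_matrix(3, 3) for _ in range(C)]
pw = rand_matrix(C, C)
out = r_gam(x, w1, w2, dw, pw)
print(len(out), len(out[0]), len(out[0][0]))
```

Because both attention weights lie strictly between 0 and 1, the third feature map is a damped copy of the input, so under the assumed residual sum the output never differs from the input by more than the input's own magnitude.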