An aircraft skin defect detection and network model training method
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CIVIL AVIATION UNIV OF CHINA
- Filing Date
- 2023-09-27
- Publication Date
- 2026-06-26
Figure CN117252842B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of computer vision, specifically to a method for detecting defects in aircraft skin and training a network model.
Background Technology
[0002] Aircraft skin defect detection is a crucial aviation safety inspection task, aiming to detect and identify defects and damage to aircraft skin to ensure the structural integrity and safety of the aircraft. Traditional aircraft skin defect detection primarily relies on visual inspection by personnel, a method that is inefficient and heavily influenced by the subjective factors of the inspectors. In recent years, with the development of computer vision technology and deep learning, deep learning-based aircraft skin defect detection techniques have been widely researched and applied. However, most current mainstream methods are based on fully supervised training, which requires a large amount of labeled data to train the model. Collecting a large number of aircraft skin defect images is difficult, and labeling these images is also time-consuming. Therefore, it is necessary to develop a semi-supervised aircraft skin defect detection method that can train the model using unlabeled aircraft skin images.
[0003] The above content is only intended to aid understanding of the technical solution of the present invention and does not constitute an admission that it is prior art.
Summary of the Invention
[0004] The technical problem to be solved by this application is to provide a method for detecting defects in aircraft skin and training a network model, which has the characteristics of improving the detection capability of small defects and reducing labor costs.
[0005] In a first aspect, one embodiment provides a method for detecting defects in aircraft skin, comprising:
[0006] Acquire images of the aircraft skin as the object to be detected;
[0007] Input the object to be detected into the aircraft skin defect detection network for defect detection, including:
[0008] The object to be detected is processed by the YOLOv7 network to obtain three feature maps of different scales, including: the first feature map, the second feature map and the third feature map;
[0009] The first feature map, the second feature map, and the third feature map are subjected to dynamic attention processing respectively to obtain the first dynamic attention feature map, the second dynamic attention feature map, and the third dynamic attention feature map.
[0010] The first dynamic attention feature map, the second dynamic attention feature map, and the third dynamic attention feature map are respectively subjected to decoupling detection processing to obtain the corresponding first decoupling feature map, the second decoupling feature map, and the third decoupling feature map;
[0011] The first decoupled feature map, the second decoupled feature map and the third decoupled feature map are fused to obtain an image with a detection box and a defect label;
[0012] The image with the detection box and defect label is used as the result of aircraft skin defect detection;
[0013] The aircraft skin defect detection network is obtained by training the aircraft skin defect detection network model based on the constructed dynamic attention semi-supervised loss function.
[0014] In one embodiment, for any dynamic attention feature map, the decoupling detection process includes:
[0015] The input feature map passes through a 3×3 convolution and then splits into two branches. In the first branch, two 3×3 convolutions followed by a 1×1 convolution produce a feature map with classification scores. In the second branch, two 3×3 convolutions are followed by a further split into two sub-branches, each applying a 1×1 convolution, which produce the feature map with regression scores and the feature map with object scores, respectively.
[0016] The feature map with classification score, feature map with regression score, and feature map with object score are fused to obtain a decoupled feature map.
[0017] In one embodiment, the training method for the aircraft skin defect detection network model includes:
[0018] Data acquisition involves acquiring images of the aircraft skin, including: acquiring images of the aircraft skin surface under different lighting conditions and at different stages of damage;
[0019] Data classification: The collected image data are classified according to the type of defect, and the images that meet the defect clarity requirements in each category are selected to form the first image dataset, and the remaining images are formed into the second image dataset;
[0020] Image preprocessing involves preprocessing the images in the first image dataset, including filtering and denoising preprocessing.
[0021] Defect annotation involves annotating the preprocessed image with defects.
[0022] A semi-supervised defect detection network model based on dynamic decoupled attention is constructed, which includes a teacher model and a student model.
[0023] The network model is trained by taking images from the second image dataset and images from the first image dataset after defect annotation as input. Based on the teacher model and student model, the model is trained by combining the dynamic attention semi-supervised loss function to obtain a trained student model, which is then used as the aircraft skin defect detection network.
[0024] In one embodiment, the step of using images from the second image dataset and images from the first image dataset after defect annotation as input, and training the model based on the teacher model and the student model, combined with a dynamic attention semi-supervised loss function, includes:
[0025] The images in the second image dataset are subjected to the first strong data augmentation process;
[0026] The image after the first strong data augmentation and the exponential moving average obtained based on the student model are input into the teacher model for pseudo-label assignment.
[0027] Adaptive label assignment is performed on the feature map produced by the pseudo-label assignment processing to obtain reliable pseudo-labels and uncertain pseudo-labels.
[0028] The image after the first strong data augmentation process is then subjected to a second strong data augmentation process.
[0029] Weak data augmentation is performed on the images in the first image dataset after defect annotation.
[0030] The image after the second strong data augmentation process, the image after the weak data augmentation process, and the feedback loss value are input into the student model for exponential moving average processing and real label assignment processing, yielding the exponential moving average and the feature map with real labels, respectively.
[0031] The loss value is calculated based on the loss function, reliable pseudo-labels, feature parameters of the processed feature map with pseudo-labels, and feature parameters of the feature map with real labels. The calculated loss value is then fed back to the student model for training.
[0032] In one embodiment, the calculation of the loss value based on the loss function, reliable pseudo-labels, feature parameters of the feature map after pseudo-label processing, and feature parameters of the feature map with the real label includes:
[0033] $L = L_s + \lambda L_u$
[0034] Here $L$ is the total loss function, $L_s$ is the supervised loss function, $L_u$ is the semi-supervised loss function, and $\lambda$ is a hyperparameter that balances the supervised loss against the semi-supervised loss. CE is the cross-entropy loss function, and IoU is the regression loss function. The remaining symbols, whose notation is not reproduced in this text, denote in order: the classification score, the regression score, and the objectness score at the label position (h, w) on the feature map obtained by the student model; and the classification score, the regression score, and the objectness score at the pseudo-label position (h, w) on the feature map obtained by the teacher model.
[0035] The component terms are, respectively, the classification loss, the regression loss, and the objectness loss. The final three symbols denote the classification score, regression score, and objectness score obtained from adaptive pseudo-label assignment at the location (h, w) of a reliable pseudo-label on the feature map.
[0036] In one embodiment, the first strong data augmentation process includes Mixup data augmentation.
[0037] In one embodiment, the second strong data augmentation process includes Mosaic data augmentation.
[0038] In a second aspect, one embodiment provides a training method for an aircraft skin defect detection network model, comprising:
[0039] Data acquisition involves acquiring images of the aircraft skin, including: acquiring images of the aircraft skin surface under different lighting conditions and at different stages of damage;
[0040] Data classification: The collected image data are classified according to the type of defect, and the images that meet the defect clarity requirements in each category are selected to form the first image dataset, and the remaining images are formed into the second image dataset;
[0041] Image preprocessing involves preprocessing the images in the first image dataset, including filtering and denoising preprocessing.
[0042] Defect annotation involves annotating the preprocessed image with defects.
[0043] A semi-supervised defect detection network model based on dynamic decoupled attention is constructed, which includes a teacher model and a student model.
[0044] Images from the second image dataset and images from the first image dataset after defect annotation are used as input. Based on the teacher model and student model, the model is trained using a dynamic attention semi-supervised loss function to obtain a trained student model, which is then used as the aircraft skin defect detection network.
[0045] In one embodiment, the step of using images from the second image dataset and images from the first image dataset after defect annotation as input, and training the model based on the teacher model and the student model, combined with a dynamic attention semi-supervised loss function, includes:
[0046] The images in the second image dataset are subjected to the first strong data augmentation process;
[0047] The image after the first strong data augmentation and the exponential moving average obtained based on the student model are input into the teacher model for pseudo-label assignment.
[0048] Adaptive label assignment is performed on the feature map produced by the pseudo-label assignment processing to obtain reliable pseudo-labels and uncertain pseudo-labels.
[0049] The image after the first strong data augmentation process is then subjected to a second strong data augmentation process.
[0050] Weak data augmentation is performed on the images in the first image dataset after defect annotation.
[0051] The image after the second strong data augmentation process, the image after the weak data augmentation process, and the feedback loss value are input into the student model for exponential moving average processing and real label assignment processing, yielding the exponential moving average and the feature map with real labels, respectively.
[0052] The loss value is calculated based on the loss function, reliable pseudo-labels, feature parameters of the processed feature map with pseudo-labels, and feature parameters of the feature map with real labels. The calculated loss value is then fed back to the student model for training.
[0053] In one embodiment, the calculation of the loss value based on the loss function, reliable pseudo-labels, feature parameters of the feature map after pseudo-label processing, and feature parameters of the feature map with the real label includes:
[0054] $L = L_s + \lambda L_u$
[0055] Here $L$ is the total loss function, $L_s$ is the supervised loss function, $L_u$ is the semi-supervised loss function, and $\lambda$ is a hyperparameter that balances the supervised loss against the semi-supervised loss. CE is the cross-entropy loss function, and IoU is the regression loss function. The remaining symbols, whose notation is not reproduced in this text, denote in order: the classification score, the regression score, and the objectness score at the label position (h, w) on the feature map obtained by the student model; and the classification score, the regression score, and the objectness score at the pseudo-label position (h, w) on the feature map obtained by the teacher model.
[0056] [formula not reproduced in this text]
[0057] The component terms are, respectively, the classification loss, the regression loss, and the objectness loss. The final three symbols denote the classification score, regression score, and objectness score obtained from adaptive pseudo-label assignment at the location (h, w) of a reliable pseudo-label on the feature map.
[0058] The beneficial effects of this invention are:
[0059] The dynamic decoupling detection process enhances feature extraction capabilities and improves the detection accuracy of small defects. Furthermore, since the aircraft skin defect detection network upon which the detection method is based is trained using a constructed dynamic attention semi-supervised loss function, it can be trained using unlabeled image data, significantly reducing the cost of manually labeled data.
Description of the Drawings
[0060] Figure 1 is a schematic flowchart of an aircraft skin defect detection method according to an embodiment of this application;
[0061] Figure 2 is a schematic diagram of an aircraft skin defect detection network structure according to an embodiment of this application;
[0062] Figure 3 is a schematic diagram of the decoupled detection network structure in Figure 2 of this application;
[0063] Figure 4 is a schematic diagram of the defect detection process in which the object to be detected is input into an aircraft skin defect detection network according to one embodiment of this application;
[0064] Figure 5 is a schematic diagram of the teacher model structure in an aircraft skin defect detection network training model according to an embodiment of this application;
[0065] Figure 6 is a schematic diagram of the student model structure in an aircraft skin defect detection network training model according to an embodiment of this application;
[0066] Figure 7 is a schematic diagram of the structure of an aircraft skin defect detection network training model according to an embodiment of this application.
Detailed Description
[0067] The present invention will now be described in further detail with reference to specific embodiments and accompanying drawings. Similar elements in different embodiments are referred to by associated similar element reference numerals. In the following embodiments, many details are described to facilitate a better understanding of this application. However, those skilled in the art will readily recognize that some features may be omitted in different situations, or may be replaced by other elements, materials, or methods. In some cases, certain operations related to this application are not shown or described in the specification. This is to avoid obscuring the core parts of this application with excessive description. For those skilled in the art, detailed description of these related operations is not necessary; they can fully understand the related operations based on the description in the specification and general technical knowledge in the art.
[0068] Furthermore, the features, operations, or characteristics described in the specification can be combined in any suitable manner to form various embodiments. At the same time, the steps or actions in the method description can be rearranged or adjusted in a manner obvious to those skilled in the art. Therefore, the various orders in the specification and drawings are only for the clear description of a particular embodiment and do not imply a necessary order, unless otherwise stated that a particular order must be followed.
[0069] The serial numbers assigned to components in this article, such as "first" and "second", are used only to distinguish the objects being described and have no sequential or technical meaning.
[0070] To facilitate the explanation of the inventive concept of this application, the following is a brief description of the aircraft skin defect detection technology.
[0071] Traditional aircraft skin defect detection primarily relies on visual inspection by personnel. This method is inefficient and heavily influenced by the subjective factors of the inspectors. In recent years, with the development of computer vision technology and deep learning, deep learning-based aircraft skin defect detection technology has been widely researched and applied. However, most current mainstream methods are based on fully supervised training, which requires a large amount of labeled data to train the model. Collecting a large number of aircraft skin defect images is difficult, and labeling these images is also very time-consuming.
[0072] Based on the above, this application provides a method for detecting defects in aircraft skin. The aircraft skin defect detection network model on which this method is based can be trained using unlabeled image data, greatly reducing the cost of manually annotating data. Furthermore, the introduced dynamic attention can improve the detection capability for small defects. Referring to Figure 1, the defect detection method includes:
[0073] Step S10: Obtain an image of the aircraft skin as the object to be detected.
[0074] Step S20: Input the object to be detected into the aircraft skin defect detection network for defect detection. Referring to Figure 2, the defect detection process includes:
[0075] The object to be detected is processed by the YOLOv7 network, which includes a backbone feature extraction network and a detection head module, to obtain three feature maps of different scales: a first feature map, a second feature map, and a third feature map. In Figure 2, H and W represent the height and width of the feature map.
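As a rough illustration of the three scales, the sketch below computes the feature-map sizes for a given input size. The strides 8, 16, and 32 and the 640-pixel input are assumptions taken from the common YOLOv7 configuration; the patent text does not state them.

```python
def feature_map_sizes(height, width, strides=(8, 16, 32)):
    """Return the (H, W) of each feature map for the given input size.

    The strides are an assumption (typical YOLOv7 defaults), not values
    stated in the patent text.
    """
    sizes = []
    for s in strides:
        # Ceiling division: padded stride-2 convolutions round sizes up.
        sizes.append((-(-height // s), -(-width // s)))
    return sizes


sizes = feature_map_sizes(640, 640)
print(sizes)  # [(80, 80), (40, 40), (20, 20)]
```

Under these assumptions, the finest map (stride 8) keeps the most spatial detail, which is why small defects are usually detected at that scale.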
[0076] The first feature map, the second feature map, and the third feature map are subjected to dynamic attention processing respectively to obtain the first dynamic attention feature map, the second dynamic attention feature map, and the third dynamic attention feature map.
[0077] The first dynamic attention feature map, the second dynamic attention feature map, and the third dynamic attention feature map are decoupled and detected respectively to obtain the corresponding first decoupled feature map, second decoupled feature map, and third decoupled feature map.
[0078] In one embodiment, referring to Figure 3, the decoupling detection process for any dynamic attention feature map includes:
[0079] The input feature map passes through a 3×3 convolution and then splits into two branches. In the first branch, two 3×3 convolutions followed by a 1×1 convolution produce a feature map with classification scores. In the second branch, two 3×3 convolutions are followed by a further split into two sub-branches, each applying a 1×1 convolution, which produce the feature map with regression scores and the feature map with object scores, respectively.
[0080] This decoupled detection process combines scale-aware attention, spatial-aware attention, and task-aware attention to form a dynamic decoupled detection process, which enhances feature extraction capabilities and improves the detection accuracy of small defects.
[0081] The feature map with classification score, feature map with regression score and feature map with object score are fused to obtain a decoupled feature map. Correspondingly, the first decoupled feature map, the second decoupled feature map and the third decoupled feature map are fused to obtain an image with detection boxes and defect labels.
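The branch layout of the decoupled detection process can be traced as output shapes. This is only a sketch under stated assumptions: the convolutions are taken to preserve H and W, the regression branch is taken to output four box values, the object-score branch one value, and the fusion is taken to be channel-wise concatenation. None of these details is specified in the text, and the class count of 4 is a placeholder.

```python
def decoupled_head_shapes(height, width, num_classes):
    """Trace output shapes (channels, H, W) of one decoupled head.

    Assumes stride-1 convolutions with 'same' padding, so H and W are
    unchanged, and assumes fusion is channel-wise concatenation.
    """
    cls = (num_classes, height, width)  # classification branch, 1x1 conv
    reg = (4, height, width)            # regression branch: 4 box values
    obj = (1, height, width)            # object-score branch
    fused = (cls[0] + reg[0] + obj[0], height, width)
    return {"cls": cls, "reg": reg, "obj": obj, "fused": fused}


shapes = decoupled_head_shapes(80, 80, num_classes=4)
print(shapes["fused"])  # (9, 80, 80)
```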
[0082] Step S30: The image with the detection box and defect label is used as the result of aircraft skin defect detection.
[0083] The aircraft skin defect detection network is trained based on a constructed dynamic attention semi-supervised loss function.
[0084] Based on the aforementioned aircraft skin defect detection method, the dynamic decoupled detection process formed by combining scale-aware attention, spatial-aware attention, and task-aware attention enhances feature extraction capabilities and improves the detection accuracy of small defects. Furthermore, since the aircraft skin defect detection network upon which the detection method is based is trained using a constructed dynamic attention semi-supervised loss function, it can be trained using unlabeled image data, significantly reducing the cost of manually labeled data.
[0085] In one embodiment, a method for training an aircraft skin defect detection network model is provided. Referring to Figure 4, the method includes:
[0086] Step S201, data acquisition, acquiring images of the aircraft skin, including: acquiring images of the aircraft skin surface under different lighting conditions and at different stages of damage.
[0087] Because the images of aircraft skin surfaces at different stages of damage are acquired under different lighting conditions, the accuracy of the trained network model in detecting aircraft skin images under different lighting conditions can be improved.
[0088] Step S202: Data classification. The collected image data are classified according to the type of defect. The images that meet the defect clarity requirements in each category are selected to form the first image dataset, and the remaining images are selected to form the second image dataset.
[0089] Defects in aircraft skin can be categorized into scratches, rivet damage, paint peeling, and rust. Images containing defects are selected according to their respective categories. From these categories, the images with higher resolution are grouped into a first image dataset, and the remaining images form a second image dataset. Resolution requirements can be set according to actual needs or the ratio between the first and second image datasets.
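The split into the two datasets can be sketched as a simple threshold on pixel count. The record keys (`path`, `defect`, `width`, `height`) and the threshold value are hypothetical; the text only says that the resolution requirement is set according to actual needs.

```python
def split_by_resolution(images, min_pixels):
    """Split image records into (first_dataset, second_dataset).

    Each record is a dict with hypothetical keys 'path', 'defect',
    'width', and 'height'. Records at or above min_pixels go to the
    first dataset (to be annotated); the rest go to the second.
    """
    first, second = [], []
    for img in images:
        target = first if img["width"] * img["height"] >= min_pixels else second
        target.append(img)
    return first, second


records = [
    {"path": "a.jpg", "defect": "scratch", "width": 1920, "height": 1080},
    {"path": "b.jpg", "defect": "rust", "width": 640, "height": 480},
]
first, second = split_by_resolution(records, min_pixels=1_000_000)
print(len(first), len(second))  # 1 1
```

Raising or lowering `min_pixels` changes the ratio between the two datasets, which matches the option of setting the requirement from the desired ratio.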
[0090] Step S203, image preprocessing, preprocessing the images in the first image dataset, including filtering and denoising preprocessing.
[0091] By preprocessing such as filtering and denoising, an aircraft skin image that meets the requirements is obtained.
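The text does not name a specific filter. As one possible example, the sketch below applies a 3×3 median filter, a common choice for removing isolated noisy pixels; treating images as 2D lists of grayscale values is a simplification for illustration.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2D list of grayscale values.

    Border pixels use only the neighbours that exist (no padding).
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median(window)
    return out


noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
print(median_filter_3x3(noisy)[1][1])  # 10
```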
[0092] Step S204, Defect annotation: Annotate the preprocessed image for defects.
[0093] In one embodiment, the LabelImg image annotation tool can be used to annotate the aircraft skin surface images in the preprocessed first image dataset and store them in a .txt file. During the defect annotation process, bounding boxes (such as rectangles) can be used to select aircraft skin surface defects and labels can be added according to the type of aircraft skin surface defect.
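A stored annotation can be read back as shown below. The sketch assumes the .txt files use the YOLO export format of LabelImg, in which each line is `class cx cy w h` with coordinates normalized to the image size; the text only says that a .txt file is used, so the format is an assumption.

```python
def parse_yolo_label(line, img_width, img_height):
    """Convert one 'class cx cy w h' line (normalized) to a pixel box.

    Returns (class_id, x_min, y_min, x_max, y_max).
    """
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])
    x_min = (cx - w / 2) * img_width
    y_min = (cy - h / 2) * img_height
    x_max = (cx + w / 2) * img_width
    y_max = (cy + h / 2) * img_height
    return class_id, x_min, y_min, x_max, y_max


print(parse_yolo_label("0 0.5 0.5 0.25 0.5", 640, 480))
# (0, 240.0, 120.0, 400.0, 360.0)
```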
[0094] In one embodiment, the image after defect annotation is further enhanced using data augmentation methods. Data augmentation methods include one or more of flipping, rotating, scaling, Mosaic, and MixUp enhancements.
[0095] Step S205: Construct a semi-supervised defect detection network model based on dynamic decoupled attention. This semi-supervised defect detection network model includes a teacher model and a student model.
[0096] A semi-supervised defect detection network model based on dynamic decoupled attention was constructed, enabling the network model to utilize unlabeled image data. The introduced dynamic decoupled detector enhances the model's ability to detect small defects, further improving the model's detection accuracy.
[0097] Step S206, network model training: taking the images in the second image dataset and the images in the first image dataset after defect annotation as input, the model is trained based on the teacher model and student model, combined with the dynamic attention semi-supervised loss function, to obtain the trained student model, and the student model is used as the aircraft skin defect detection network.
[0098] In one embodiment, referring to Figure 5, Figure 6, and Figure 7, the specific method of step S206 may include:
[0099] The images in the second image dataset are subjected to a first strong data augmentation process. In one embodiment, the first strong data augmentation process includes a Mixup data augmentation process.
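Mixup blends two images with a mixing weight that is usually drawn from a Beta distribution. The following minimal sketch shows the pixel-wise blend on small 2D lists; `alpha=1.0` and the list-based image representation are assumptions for illustration, and because the second image dataset is unlabeled, only the pixels are mixed.

```python
import random


def mixup(image_a, image_b, alpha=1.0, lam=None):
    """Blend two equal-size images pixel by pixel.

    lam is drawn from Beta(alpha, alpha) unless given explicitly.
    Images are 2D lists of numbers; alpha=1.0 is an assumed default.
    """
    if lam is None:
        lam = random.betavariate(alpha, alpha)
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


mixed = mixup([[0, 100]], [[200, 100]], lam=0.25)
print(mixed)  # [[150.0, 100.0]]
```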
[0100] The image after the first strong data augmentation and the exponential moving average obtained based on the student model are input into the teacher model for pseudo-label assignment.
[0101] In one embodiment, referring to Figure 5, the teacher model's processing takes as input the images from the second image dataset after Mixup data augmentation and the exponential moving average of the student model parameters. This input is processed by a YOLOv7 network to obtain three feature maps of different scales: a first feature map, a second feature map, and a third feature map. Dynamic attention processing is then applied to the first, second, and third feature maps respectively, resulting in a first dynamic attention feature map, a second dynamic attention feature map, and a third dynamic attention feature map. Finally, decoupling detection processing is performed on each of these three dynamic attention feature maps.
[0102] For the decoupling detection process, refer to Figure 2. The input feature map passes through a 3×3 convolution and then splits into two branches. In the first branch, two 3×3 convolutions followed by a 1×1 convolution produce a feature map with classification scores. In the second branch, two 3×3 convolutions are followed by a further split into two sub-branches, each applying a 1×1 convolution, which produce the feature map with regression scores and the feature map with object scores, respectively.
[0103] The feature maps with classification scores obtained from each decoupled detection process are fused to obtain the classification score at the pseudo-label position (h, w) on the feature map. The feature maps with regression scores obtained from each decoupled detection process are fused to obtain the regression score at the pseudo-label position (h, w) on the feature map. The feature maps with object scores obtained from each decoupled detection process are fused to obtain the object score at the pseudo-label position (h, w) on the feature map.
[0104] Adaptive label assignment is performed on the feature map produced by the pseudo-label assignment processing to obtain reliable pseudo-labels and uncertain pseudo-labels. The adaptive label assignment method can be implemented using existing techniques. The reliable pseudo-labels are used to calculate the loss function. Specifically, three quantities are derived from them, each obtained by adaptive pseudo-label assignment sampling at the location (h, w) of a reliable pseudo-label on the feature map: the classification score, the regression score, and the objectness score.
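The text leaves the assignment method to existing techniques. One commonly used scheme, shown here purely as an assumption, partitions the teacher's predictions with two score thresholds: predictions above the upper threshold are treated as reliable, those between the two thresholds as uncertain, and the rest are discarded. The threshold values and the `score` key are hypothetical.

```python
def split_pseudo_labels(predictions, low=0.1, high=0.6):
    """Partition teacher predictions into reliable and uncertain sets.

    Each prediction is a dict with a 'score' key. Scores >= high are
    reliable, scores in [low, high) are uncertain, and the rest are
    dropped as background. Both thresholds are hypothetical values.
    """
    reliable = [p for p in predictions if p["score"] >= high]
    uncertain = [p for p in predictions if low <= p["score"] < high]
    return reliable, uncertain


preds = [{"score": 0.9}, {"score": 0.3}, {"score": 0.05}]
reliable, uncertain = split_pseudo_labels(preds)
print(len(reliable), len(uncertain))  # 1 1
```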
[0105] The image after the first strong data augmentation process is then subjected to a second strong data augmentation process. In one embodiment, the second strong data augmentation process includes Mosaic data augmentation.
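Mosaic augmentation combines four images into one. The sketch below is a deliberately simplified version that tiles four equal-size images into a 2×2 grid; practical implementations usually also apply a random centre point, scaling, and cropping, which are omitted here.

```python
def mosaic_2x2(top_left, top_right, bottom_left, bottom_right):
    """Tile four equal-size 2D images into one image twice as large."""
    top = [a + b for a, b in zip(top_left, top_right)]
    bottom = [a + b for a, b in zip(bottom_left, bottom_right)]
    return top + bottom


tile = mosaic_2x2([[1]], [[2]], [[3]], [[4]])
print(tile)  # [[1, 2], [3, 4]]
```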
[0106] The images in the first image dataset after defect annotation are subjected to weak data augmentation. In one embodiment, the weak data augmentation includes any one or more of flipping, rotating, and scaling.
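For labeled images, a geometric augmentation must transform the bounding boxes together with the pixels. The sketch below shows a horizontal flip under the assumption that boxes are stored as normalized `(class_id, cx, cy, w, h)` tuples, so only the centre x coordinate changes.

```python
def horizontal_flip(image, boxes):
    """Flip a 2D image left to right and mirror its boxes.

    boxes holds (class_id, cx, cy, w, h) tuples in normalized
    coordinates, so only the centre x changes.
    """
    flipped = [row[::-1] for row in image]
    new_boxes = [(c, 1.0 - cx, cy, w, h) for c, cx, cy, w, h in boxes]
    return flipped, new_boxes


img, boxes = horizontal_flip([[1, 2, 3]], [(0, 0.25, 0.5, 0.5, 0.5)])
print(img, boxes)  # [[3, 2, 1]] [(0, 0.75, 0.5, 0.5, 0.5)]
```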
[0107] The images after the second strong data augmentation, the images after the weak data augmentation, and the feedback loss value are input into the student model for exponential moving average processing and real label assignment processing to obtain the exponential moving average and feature maps with real labels respectively.
[0108] Referring to Figure 6, the student model's processing involves inputting the image after strong data augmentation, the image after weak data augmentation, and the feedback loss value into the student model. After processing by the YOLOv7 network, three feature maps of different scales are obtained: a first feature map, a second feature map, and a third feature map. Dynamic attention processing is then applied to the first, second, and third feature maps respectively, resulting in a first dynamic attention feature map, a second dynamic attention feature map, and a third dynamic attention feature map. Finally, decoupled detection processing is performed on the first, second, and third dynamic attention feature maps.
[0109] For the decoupling detection process, refer to Figure 2. The input feature map passes through a 3×3 convolution and then splits into two branches. In the first branch, two 3×3 convolutions followed by a 1×1 convolution produce a feature map with classification scores. In the second branch, two 3×3 convolutions are followed by a further split into two sub-branches, each applying a 1×1 convolution, which produce the feature map with regression scores and the feature map with object scores, respectively.
[0110] The feature maps with classification scores obtained from each decoupled detection process are fused to obtain the classification score at the label position (h, w) on the feature map. The feature maps with regression scores obtained from each decoupled detection process are fused to obtain the regression score at the label position (h, w) on the feature map. The feature maps with object scores obtained from each decoupled detection process are fused to obtain the object score at the label position (h, w) on the feature map. The scores obtained are used both to calculate the exponential moving average that updates the teacher model and, in combination with other parameters, to calculate the loss value.
[0111] Exponential moving average is a parameter smoothing technique used to reduce noise fluctuations in optimized parameters and make the parameters more likely to approach local minima. In one embodiment, the exponential moving average is calculated using existing techniques.
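A minimal sketch of one exponential moving average step follows: each teacher parameter moves a small fraction of the way toward the corresponding student parameter. Representing parameters as a dict of floats and the decay value of 0.999 are assumptions; the text does not give a decay value.

```python
def ema_update(teacher, student, decay=0.999):
    """Return teacher weights after one exponential-moving-average step.

    teacher and student map parameter names to floats. decay=0.999 is
    an assumed value; the patent does not specify one.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


teacher = {"w": 1.0}
student = {"w": 0.0}
teacher = ema_update(teacher, student, decay=0.5)
print(teacher)  # {'w': 0.5}
```

A decay close to 1 makes the teacher change slowly, which is what smooths out step-to-step noise in the student parameters.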
[0112] The loss value is calculated based on the loss function, reliable pseudo-labels, feature parameters of the processed feature map assigned to the pseudo-labels, and feature parameters of the feature map with the real labels. This calculated loss value is then fed back into the student model for training. The calculation of the loss value includes:
[0113] L = L s +λL u
[0114] Where $L$ represents the total loss function, $L_s$ represents the supervised loss function, $L_u$ represents the semi-supervised loss function, and $\lambda$ is the balance factor between the supervised loss function and the semi-supervised loss function and is a hyperparameter. CE is the cross-entropy loss function, and IoU is the regression loss function. The student-model terms are the classification score, the regression score, and the object score at the label position (h, w) on the feature map obtained by the student model. The teacher-model terms are the classification score at the label position (h, w), the regression score at the pseudo-label position (h, w), and the object score at the label position (h, w) on the feature map obtained by the teacher model.
[0115]
[0116] Where the three component terms are the classification loss, the regression loss, and the object loss, respectively; the remaining symbols represent the classification score, regression score, and object score obtained from the adaptive pseudo-label assignment at the location (h, w) of the reliable pseudo-label on the feature map.
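As a rough illustration of how these pieces could fit together, the sketch below combines a cross-entropy classification term, an IoU-based regression term, and an object term, and then forms the total loss as the supervised loss plus λ times the semi-supervised loss. Several details are assumptions rather than the patent's formulas: the three terms are summed without weights, binary cross-entropy is used for the object term, boxes are given as (x1, y1, x2, y2), and all function and key names are hypothetical.

```python
import math

def cross_entropy(probs, target_index, eps=1e-12):
    """CE for one location: negative log-probability of the target class."""
    return -math.log(max(probs[target_index], eps))

def binary_cross_entropy(p, target, eps=1e-12):
    """Assumed form of the object loss term."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def iou_loss(box_a, box_b):
    return 1.0 - iou(box_a, box_b)

def location_loss(pred, target):
    """Classification + regression + object loss at one (h, w) location.
    Unweighted sum is an assumption; the source formula is not available."""
    l_cls = cross_entropy(pred["cls_probs"], target["cls"])
    l_reg = iou_loss(pred["box"], target["box"])
    l_obj = binary_cross_entropy(pred["obj"], target["obj"])
    return l_cls + l_reg + l_obj

def mean_loss(pairs):
    """Average location loss over (prediction, target) pairs."""
    return sum(location_loss(p, t) for p, t in pairs) / len(pairs) if pairs else 0.0

def total_loss(l_s, l_u, lam):
    """L = L_s + lambda * L_u."""
    return l_s + lam * l_u

pred = {"cls_probs": [0.7, 0.3], "box": (0, 0, 2, 2), "obj": 0.9}
real = {"cls": 0, "box": (0, 0, 2, 2), "obj": 1.0}       # real label
pseudo = {"cls": 0, "box": (1, 1, 3, 3), "obj": 1.0}     # reliable pseudo-label
l_s = mean_loss([(pred, real)])
l_u = mean_loss([(pred, pseudo)])
print(total_loss(l_s, l_u, lam=0.5))
```

Here `l_s` is computed against real labels and `l_u` against reliable pseudo-labels from the teacher; λ controls how much the pseudo-labelled data contributes to the gradient.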
[0117] In the above process, the semi-supervised aircraft skin defect detection model framework based on dynamic attention takes unlabeled skin image data and labeled skin image data as simultaneous inputs. Different data augmentation methods are then applied to the two types of image data. The unlabeled skin image data undergoes Mixup data augmentation, and the augmented images are input directly into the teacher model (a dynamic decoupling detector based on dynamic attention); they are also subjected to further strong data augmentation. The labeled skin image data undergoes weak data augmentation and is then input into the student model (also a dynamic decoupling detector based on dynamic attention) together with the strongly augmented unlabeled skin image data. The teacher model assigns pseudo-labels to the unlabeled skin image data. The pseudo-label allocator then divides the generated pseudo-labels into reliable pseudo-labels and uncertain pseudo-labels. In the student model, the two types of pseudo-labels and the real labels of the labeled skin image data are used to calculate the loss, thereby optimizing the training of the student model. During training, the student model updates the parameters of the teacher model through an exponential moving average.
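Two steps of this flow lend themselves to a short sketch: Mixup augmentation and the division of pseudo-labels into reliable and uncertain sets. Mixup is shown in its usual form, a convex combination of two images with a mixing coefficient drawn from a Beta distribution. The division rule is purely hypothetical: the text does not state the allocator's criterion, so the two score thresholds `tau_high` and `tau_low`, and the `score` field, are assumptions for illustration only.

```python
import random

def sample_mix_coefficient(alpha=1.0, rng=random):
    """Draw the Mixup coefficient from Beta(alpha, alpha); alpha is assumed."""
    return rng.betavariate(alpha, alpha)

def mixup(img_a, img_b, lam):
    """Pixel-wise blend lam * img_a + (1 - lam) * img_b.
    Images are equally sized [H][W] lists of floats."""
    return [[lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]

def split_pseudo_labels(predictions, tau_high=0.7, tau_low=0.3):
    """Hypothetical allocator: confident predictions become reliable
    pseudo-labels, mid-confidence ones are kept as uncertain, and the
    rest are dropped. Thresholds are illustrative assumptions."""
    reliable, uncertain = [], []
    for pred in predictions:
        if pred["score"] >= tau_high:
            reliable.append(pred)
        elif pred["score"] >= tau_low:
            uncertain.append(pred)
    return reliable, uncertain

# Blend two tiny unlabeled "images" with a fixed coefficient.
mixed = mixup([[0.0, 1.0]], [[1.0, 1.0]], lam=0.25)
print(mixed)  # [[0.75, 1.0]]

# Partition teacher predictions by confidence.
teacher_predictions = [
    {"label": "crack", "score": 0.92},
    {"label": "dent", "score": 0.55},
    {"label": "scratch", "score": 0.10},
]
reliable, uncertain = split_pseudo_labels(teacher_predictions)
print(len(reliable), len(uncertain))  # 1 1
```

In a training loop, `sample_mix_coefficient` would supply `lam` for each pair of unlabeled images, and only the reliable set would be used as regression and classification targets under this assumed rule.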
[0118] The dynamic attention-based decoupling detector is primarily used to construct the teacher and student models in the semi-supervised aircraft skin defect detection model. The teacher model's network is based on the YOLOv7 model, incorporating dynamic attention and a decoupled detection head. The teacher model's input consists of data-augmented unlabeled skin image data and the exponential moving average of the student model's weights. The exponential moving average of the student model's weights is used to update the parameters of the teacher model, optimizing it. The unlabeled skin image data, after a series of feature extraction operations, is input to the decoupled detection head, whose output consists of feature maps with pseudo-labels at three different scales. The student model's network structure is the same as the teacher model's. The student model's input consists of unlabeled skin images after strong data augmentation, labeled skin images after weak data augmentation, and the gradient of the loss function. The gradient of the loss function is used to optimize the student model's parameters. The unlabeled and labeled skin images undergo a series of feature extraction operations before being input into the decoupled detection head, whose output likewise consists of feature maps at three different scales. Because the student model's input includes labeled skin images, the output feature maps contain defect category and location information. This category and location information is extracted and, together with the category and location information of the pseudo-labels from the teacher model, used to calculate four loss functions. The gradient of the loss function is then backpropagated to the student model to optimize it. Another output of the student model is the set of network parameter weights from its training process. These weights are averaged using an exponential moving average and passed to the teacher model to update its parameters and optimize it.
[0119] One embodiment of this application provides a computer-readable storage medium storing a program which, when loaded and executed by a processor, implements the method of any of the above embodiments.
[0120] Those skilled in the art will understand that all or part of the functions of the various methods in the above embodiments can be implemented by hardware or by computer programs. When all or part of the functions in the above embodiments are implemented by computer programs, the program can be stored in a computer-readable storage medium, which may include: read-only memory, random access memory, disk, optical disk, hard disk, etc., and the program is executed by a computer to achieve the above functions. For example, the program can be stored in the memory of a device, and when the program in the memory is executed by the processor, all or part of the above functions can be achieved. In addition, when all or part of the functions in the above embodiments are implemented by computer programs, the program can also be stored in a server, another computer, disk, optical disk, flash drive, or external hard drive, etc., and can be downloaded or copied to the memory of a local device, or the system of the local device can be updated. When the program in the memory is executed by the processor, all or part of the functions in the above embodiments can be achieved.
[0121] The above examples illustrate the present invention only to aid in understanding it and are not intended to limit the scope of the invention. Those skilled in the art can make various simple deductions, modifications, or substitutions based on the principles of this invention.
Claims
1. A method for detecting defects in aircraft skin, characterized in that it comprises:
Acquiring an image of the aircraft skin as the object to be detected;
Inputting the object to be detected into the aircraft skin defect detection network for defect detection, including:
The object to be detected is processed by the YOLOv7 network to obtain three feature maps of different scales: a first feature map, a second feature map, and a third feature map;
The first feature map, the second feature map, and the third feature map are each subjected to dynamic attention processing to obtain a first dynamic attention feature map, a second dynamic attention feature map, and a third dynamic attention feature map;
The first dynamic attention feature map, the second dynamic attention feature map, and the third dynamic attention feature map are each subjected to decoupled detection processing to obtain the corresponding first decoupled feature map, second decoupled feature map, and third decoupled feature map;
The first decoupled feature map, the second decoupled feature map, and the third decoupled feature map are fused to obtain an image with a detection box and a defect label;
The image with the detection box and defect label is used as the result of aircraft skin defect detection;
The aircraft skin defect detection network is obtained by training the aircraft skin defect detection network model based on the constructed dynamic attention semi-supervised loss function;
For any dynamic attention feature map, the decoupled detection processing includes:
The input feature map is processed by a 3×3 convolution and then split into two branches. In one branch, two 3×3 convolutions are followed by a 1×1 convolution to obtain a feature map with classification scores. In the other branch, two 3×3 convolutions are followed by two sub-branches, each applying a 1×1 convolution, to obtain the feature map with regression scores and the feature map with object scores, respectively;
The feature map with classification scores, the feature map with regression scores, and the feature map with object scores are fused to obtain a decoupled feature map;
The training method for the aircraft skin defect detection network model includes:
Data acquisition: acquiring images of the aircraft skin, including images of the aircraft skin surface under different lighting conditions and at different stages of damage;
Data classification: the collected image data are classified according to the type of defect, the images that meet the defect clarity requirements in each category are selected to form the first image dataset, and the remaining images form the second image dataset;
Image preprocessing: preprocessing the images in the first image dataset, including filtering and denoising;
Defect annotation: annotating the preprocessed images with defects;
A semi-supervised defect detection network model based on dynamic decoupled attention is constructed, which includes a teacher model and a student model;
Network model training: the images in the second image dataset and the defect-annotated images in the first image dataset are taken as input and, based on the teacher model and the student model, the model is trained in combination with the dynamic attention semi-supervised loss function to obtain the trained student model, which is used as the aircraft skin defect detection network;
The process of taking the images in the second image dataset and the defect-annotated images in the first image dataset as input, and training the model based on the teacher model and the student model in combination with the dynamic attention semi-supervised loss function, includes:
The images in the second image dataset are subjected to a first strong data augmentation process;
The images after the first strong data augmentation process and the exponential moving average obtained from the student model are input into the teacher model for pseudo-label assignment; the pseudo-labels are assigned to the processed feature map, and adaptive label assignment is performed to obtain reliable pseudo-labels and uncertain pseudo-labels;
The images after the first strong data augmentation process are then subjected to a second strong data augmentation process;
Weak data augmentation is performed on the defect-annotated images in the first image dataset;
The images after the second strong data augmentation process, the images after the weak data augmentation process, and the feedback loss value are input into the student model for exponential moving average processing and real label assignment processing, to obtain the exponential moving average and the feature map with real labels, respectively;
The loss value is calculated based on the loss function, the reliable pseudo-labels, the feature parameters of the processed feature map with pseudo-labels, and the feature parameters of the feature map with real labels, and the calculated loss value is then fed back to the student model for training.
2. The aircraft skin defect detection method as described in claim 1, characterized in that the calculation of the loss value based on the loss function, reliable pseudo-labels, feature parameters of the feature map after pseudo-labeling, and feature parameters of the feature map with real labels includes: $L = L_s + \lambda L_u$, where $L$ represents the total loss function, $L_s$ represents the supervised loss function, $L_u$ represents the semi-supervised loss function, and $\lambda$ is a balance factor between the supervised loss function and the semi-supervised loss function and is a hyperparameter; CE is the cross-entropy loss function, and IoU is the regression loss function; the student-model terms are the classification score, the regression score, and the object score at the label position (h, w) on the feature map obtained by the student model; the teacher-model terms are the classification score at the label position (h, w), the regression score at the pseudo-label position (h, w), and the object score at the label position (h, w) on the feature map obtained by the teacher model; the component losses are the classification loss, the regression loss, and the object loss; and the remaining symbols represent the classification score, regression score, and object score obtained from adaptive pseudo-label assignment sampling at the positions (h, w) of the reliable pseudo-labels on the feature map.
3. The aircraft skin defect detection method as described in claim 1, characterized in that the first strong data augmentation process includes Mixup data augmentation.
4. The aircraft skin defect detection method as described in claim 1, characterized in that the second strong data augmentation process includes Mosaic data augmentation.
5. A training method for an aircraft skin defect detection network model, characterized in that it comprises:
Data acquisition: acquiring images of the aircraft skin, including images of the aircraft skin surface under different lighting conditions and at different stages of damage;
Data classification: the collected image data are classified according to the type of defect, the images that meet the defect clarity requirements in each category are selected to form the first image dataset, and the remaining images form the second image dataset;
Image preprocessing: preprocessing the images in the first image dataset, including filtering and denoising;
Defect annotation: annotating the preprocessed images with defects;
A semi-supervised defect detection network model based on dynamic decoupled attention is constructed, which includes a teacher model and a student model, both of which adopt the aircraft skin defect detection network structure as described in claim 1;
The images in the second image dataset and the defect-annotated images in the first image dataset are taken as input and, based on the teacher model and the student model, the model is trained using a dynamic attention semi-supervised loss function to obtain a trained student model, which is then used as the aircraft skin defect detection network.
6. The training method as described in claim 5, characterized in that the process of taking the images in the second image dataset and the defect-annotated images in the first image dataset as input, and training the model based on the teacher model and the student model in combination with a dynamic attention semi-supervised loss function, includes:
The images in the second image dataset are subjected to a first strong data augmentation process;
The images after the first strong data augmentation process and the exponential moving average obtained from the student model are input into the teacher model for pseudo-label assignment; the pseudo-labels are assigned to the processed feature map, and adaptive label assignment is performed to obtain reliable pseudo-labels and uncertain pseudo-labels;
The images after the first strong data augmentation process are then subjected to a second strong data augmentation process;
Weak data augmentation is performed on the defect-annotated images in the first image dataset;
The images after the second strong data augmentation process, the images after the weak data augmentation process, and the feedback loss value are input into the student model for exponential moving average processing and real label assignment processing, to obtain the exponential moving average and the feature map with real labels, respectively;
The loss value is calculated based on the loss function, the reliable pseudo-labels, the feature parameters of the processed feature map with pseudo-labels, and the feature parameters of the feature map with real labels, and the calculated loss value is then fed back to the student model for training.
7. The training method as described in claim 6, characterized in that the calculation of the loss value based on the loss function, reliable pseudo-labels, feature parameters of the feature map after pseudo-labeling, and feature parameters of the feature map with real labels includes: $L = L_s + \lambda L_u$, where $L$ represents the total loss function, $L_s$ represents the supervised loss function, $L_u$ represents the semi-supervised loss function, and $\lambda$ is a balance factor between the supervised loss function and the semi-supervised loss function and is a hyperparameter; CE is the cross-entropy loss function, and IoU is the regression loss function; the student-model terms are the classification score, the regression score, and the object score at the label position (h, w) on the feature map obtained by the student model; the teacher-model terms are the classification score at the label position (h, w), the regression score at the pseudo-label position (h, w), and the object score at the label position (h, w) on the feature map obtained by the teacher model; the component losses are the classification loss, the regression loss, and the object loss; and the remaining symbols represent the classification score, regression score, and object score obtained from adaptive pseudo-label assignment sampling at the positions (h, w) of the reliable pseudo-labels on the feature map.