A method and system for detecting a bolt loosening mark on a vehicle chassis

Improved YOLOv11 and U-Net++ networks are used to detect bolt anti-loosening marks, addressing the problem of inaccurate detection results under different shooting angles, improving the accuracy of bolt loosening detection, and adapting to complex scenarios and various bolt types.

CN121883483B · Active · Publication Date: 2026-06-16 · HANGZHOU SHENHAO TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: HANGZHOU SHENHAO TECH
Filing Date: 2026-03-17
Publication Date: 2026-06-16

IPC: G06T7/00; G06T7/11; G06T5/50; G06T3/4038; G06N3/0464; G06N3/045; G06N3/048; G06N3/09

CPC: G06T7/0004; G06T7/11; G06T5/50; G06T3/4038; G06N3/0464; G06N3/045; G06N3/048; G06N3/09

AI Tagging

Application Domain

Image enhancement; Image analysis

AI Technical Summary

Technical Problem

Existing methods for detecting bolt anti-loosening marks give inaccurate results because the marks appear deviated in two-dimensional images taken from different shooting angles.

Method used

An improved YOLOv11 network is used for bolt target detection and localization, and an improved U-Net++ network is used for segmentation. The bolt looseness is determined by comparing the visible range and angle of the top and side of the bolt. A coordinate attention module and an adaptive weight method are embedded in the backbone network for feature fusion to improve detection accuracy.

Benefits of technology

It improves the accuracy of bolt anti-loosening mark detection, reduces computation and error, adapts to different types and sizes of bolts, improves feature recognition in complex scenarios, and reduces the false detection rate in dense scenarios.

✦ Generated by Eureka AI based on patent content.

Abstract

The present application relates to the technical field of intelligent detection of rail transit, and particularly relates to a method and system for detecting a loosening mark of a bolt on a car bottom, comprising the following steps: obtaining an original image; identifying and outputting a bolt image based on an improved YOLOv11 network; obtaining a segmentation mask image based on an improved U-Net++ network; determining a bolt region loosening line vector and a non-bolt region loosening line vector; if the visible range of the top of the bolt is greater than the visible range of the side of the bolt, then performing projection correction on the non-bolt region loosening line vector; and comparing the included angle of the vectors to determine whether the bolt is loose. When the visible range of the top of the bolt is greater than the visible range of the side of the bolt, the detection result is greatly affected by the shooting angle. Therefore, the projection correction is performed on the non-bolt region loosening line vector only when the visible range of the top of the bolt is greater than the visible range of the side of the bolt, so as to improve the accuracy of the detection result while avoiding an increase in the calculation amount and calculation error caused by excessive correction.

Description

Technical Field

[0001] This invention relates to the field of intelligent detection technology for rail transit, and in particular to a method and system for detecting anti-loosening marks on undercarriage bolts.

Background Technology

[0002] In the rail transit sector, numerous bolts connect the car body, bogies, chassis, and various structural components on the underside of vehicles. To ensure these bolts remain secure under long-term vibration, impact, and temperature variations, the industry standardly applies anti-loosening markings using high-visibility anti-loosening paint or markers in a standardized manner, such as double-line markings, after tightening. This marking serves not only as a quality control measure on the production line but also provides a visual basis for subsequent vehicle inspection and maintenance. The integrity of the markings directly reflects whether the bolts have rotated, loosened, or been damaged by external forces; therefore, effective inspection is a crucial step in ensuring the safe operation of vehicles.

[0003] Currently, in addition to manual visual inspection, fixed-position camera shooting can also be used. Although this saves the cost of manual on-site inspection, the fixed, single shooting angle creates a large number of blind spots, causing many parts to be missed. In addition, for the top imaging point of most nuts, the viewing angle is easily affected by the installation position, making it difficult to capture a clear image. A large number of images must still be checked manually one by one afterwards. The overall process has poor stability and cannot form a systematic, long-term traceable digital inspection archive.

[0004] With the development of intelligent operation and maintenance and automated inspection technologies, undercarriage inspection robots have begun to be used. These robots aim to automatically identify the position, posture, and anti-loosening marking status of bolts to achieve rapid defect localization, thereby reducing the cost and risk of manual operation and maintenance. In recent years, the maturity of deep learning technologies such as object detection, semantic segmentation, and multimodal processing has provided new solutions in this area. For example, the paper "Research on Visual Detection Algorithm for Loose Fasteners on Subway Undercarriages" (Journal of Railway Science and Engineering) proposes an improved technical approach combining YOLOv5 with DeepLabv3+ and multi-feature fusion judgment to alleviate the problem of traditional methods having excessively high requirements for shooting conditions and the clarity of anti-loosening markings. Furthermore, the paper "Research on Train Bolt Loosening State Detection Algorithm Based on Key Points" (Modern Electronics Technology) proposes an algorithm based on key point detection to address the problems of diverse bolt types and complex shooting environments. By improving the ResNet-18 model and integrating a Spatial Transformation Network (STN) module, the detected bolt corner points are topologically classified into a hexagonal structure, and then a Siamese network is used for loosening state classification.

[0005] However, in actual subway car undercarriage scenarios, there are many types of nuts and bolts, with varying shapes, sizes, marking methods, and large deviations in projection angles, resulting in significant differences in detection difficulty. Existing technologies cannot maintain reliable accuracy when dealing with nuts and bolts of various shapes and sizes, as well as different marking methods.

Summary of the Invention

[0006] The technical problem to be solved by the present invention is to overcome the problem that the existing bolt anti-loosening mark detection method has inaccurate detection results due to the deviation of the anti-loosening mark in two-dimensional images from different shooting angles.

[0007] Therefore, the first objective of this invention is to provide a method for detecting the anti-loosening mark of vehicle underbody bolts, comprising the following steps:

[0008] Obtain the original image containing the bolt to be inspected;

[0009] The original image is used to detect and locate bolt targets based on the improved YOLOv11 network, and the identified bolt images are output.

[0010] The bolts and anti-loosening lines in the bolt image are segmented using an improved U-Net++ network to obtain a segmentation mask image;

[0011] Extract the anti-loosening lines in the bolt area and the non-bolt area from the segmentation mask image and obtain the anti-loosening line vector for each area. Then compare the visible range of the top of the bolt with that of the side of the bolt: if the visible range of the top is greater than that of the side, perform projection correction on the anti-loosening line vector in the non-bolt area; otherwise, proceed directly to the next step.

[0012] Compare the angle between the anti-loosening line vectors in the bolt area and the anti-loosening line vectors in the non-bolt area. If the angle exceeds a preset threshold, it is determined that the bolt has become loose.

[0013] When the visible area at the top of the bolt is greater than the visible area on the side of the bolt, the detection results are greatly affected by the shooting angle. Therefore, the anti-loosening line vector in the non-bolt area is only projected and corrected when the visible area at the top of the bolt is greater than the visible area on the side of the bolt. This improves the accuracy of the detection results while avoiding excessive correction that would increase the amount of calculation and the calculation error.
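The final judgment step can be illustrated with a minimal sketch. It assumes each anti-loosening line has already been reduced to a 2D vector `(dx, dy)` in image coordinates. The function names and the 5-degree default threshold are illustrative assumptions, since the text only speaks of a "preset threshold".

```python
import math


def included_angle_deg(v1, v2):
    """Return the included angle, in degrees, between two 2D vectors."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(v1[0], v1[1]) * math.hypot(v2[0], v2[1])
    if norm == 0:
        raise ValueError("zero-length vector")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def is_loose(bolt_vec, non_bolt_vec, threshold_deg=5.0):
    """Judge looseness; threshold_deg is a hypothetical preset threshold."""
    return included_angle_deg(bolt_vec, non_bolt_vec) > threshold_deg
```

For example, `is_loose((1, 0), (0, 1))` compares two perpendicular line vectors and reports the bolt as loose under any threshold below 90 degrees.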

[0014] Preferably, the improved YOLOv11 network includes a preprocessing module, which is used to fuse the original image with a grayscale image generated from the original image to generate a dual-modal feature map, thereby achieving input feature fusion. The original image is an RGB three-channel image.

[0015] After preprocessing, a bimodal feature map is generated and input into the backbone network to improve feature recognition in complex scenes with uneven lighting.

[0016] Preferably, the improved YOLOv11 network includes a backbone network for extracting feature layers for small targets from the bimodal feature map. The backbone network includes four levels of C3k2 modules. The outputs of the first two levels of C3k2 modules are embedded with coordinate attention modules, and the output of the first level of C3k2 modules is introduced into the P2 output layer.

[0017] By embedding coordinate attention modules into the first two levels of the C3k2 module in the backbone network, key features such as texture of the target are prioritized to be enhanced during the multi-level feature extraction process, while irrelevant information is suppressed, thus solving the problem of feature sparsity for small targets and finally outputting a high-level feature map containing rich semantic information.

[0018] Preferably, the improved YOLOv11 network further includes a feature fusion module, which employs an adaptive weighting method for feature fusion. The adaptive weighting method includes:

[0019] Align the feature layers output by the backbone network;

[0020] The aligned feature layers are then subjected to global average pooling and input into a fully connected layer to reduce the dimensionality to one-dimensional weight values.

[0021] The one-dimensional weight values are activated using an activation function to obtain the original weights;

[0022] The original weights are normalized to obtain the final normalized weights used for feature fusion;

[0023] A multi-scale feature fusion method is used to multiply each aligned feature layer with its corresponding normalized weight element by element and then sum them to output a fused feature map.

[0024] The weights of features at different scales are dynamically adjusted to enhance the unique features of different types of bolts, covering extremely small bolts and adapting to bolts of more sizes and categories.
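The adaptive weighting steps above can be sketched as follows. This is a simplified illustration, not the patented network: each aligned feature layer is represented as a list of channels (each a flat list of activations), the activation function is assumed to be a sigmoid, and normalization is assumed to divide by the sum of the raw weights. The fully connected parameters `fc_params` are hypothetical stand-ins for learned values.

```python
import math


def global_average_pool(layer):
    """Average each channel of a layer down to a single value."""
    return [sum(channel) / len(channel) for channel in layer]


def raw_weight(pooled, fc_weights, fc_bias):
    """Fully connected layer reducing the pooled vector to one value, then sigmoid."""
    z = sum(p * w for p, w in zip(pooled, fc_weights)) + fc_bias
    return 1.0 / (1.0 + math.exp(-z))


def adaptive_fuse(layers, fc_params):
    """Fuse aligned layers with normalized, input-dependent scalar weights."""
    raw = [raw_weight(global_average_pool(layer), w, b)
           for layer, (w, b) in zip(layers, fc_params)]
    total = sum(raw)
    weights = [r / total for r in raw]  # assumed normalization: weights sum to 1
    n_channels, n_values = len(layers[0]), len(layers[0][0])
    fused = [[sum(wt * layer[c][i] for wt, layer in zip(weights, layers))
              for i in range(n_values)]
             for c in range(n_channels)]
    return fused, weights
```

With identical fully connected parameters the two raw weights coincide, so the fused map is the plain average of the inputs; different parameters shift the balance toward one scale.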

[0025] Preferably, the improved YOLOv11 network includes a detection head module. The anchors of the detection head module are generated by applying the K-Means clustering algorithm to the bolt annotation dataset of vehicle undercarriage images, replacing the default-sized native anchors, so that the generated anchors match the bolt sizes more closely.
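A minimal sketch of anchor generation by K-Means is given below. It assumes the annotations have been reduced to `(width, height)` pairs and uses Euclidean distance with a deterministic initialization. The distance measure and initialization are assumptions, as the text does not specify them (IoU-based distances are also common for anchor clustering).

```python
def kmeans_anchors(boxes, k, iters=50):
    """Cluster (width, height) pairs into k anchors, returned sorted by area."""
    pts = sorted(boxes, key=lambda b: b[0] * b[1])
    # Deterministic initialization: evenly spaced picks from the area-sorted boxes.
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in pts:
            nearest = min(
                range(k),
                key=lambda c: (w - centers[c][0]) ** 2 + (h - centers[c][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                new_centers.append((
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                ))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])
```

Called on a mix of small and large bolt boxes with `k=2`, it returns one anchor near each size group rather than the generic default sizes.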

[0026] Preferably, the detection head module is configured with independent regression and classification branches. The regression branch uses a small target convolution kernel for target localization and calculates the regression loss using the EPGIoU loss function. The classification branch adds a BatchNorm layer after the convolutional layer and is configured with the GELU activation function. The classification loss is calculated using the Focal Loss function.

[0027] Independent classification and regression branches can employ different loss function calculation methods that better suit the characteristics of bolts. Combined with bolt-specific anchors, this improves the positioning accuracy of small targets and the matching degree of bolt size.
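For the classification branch, the standard binary Focal Loss can be written in a few lines. The sketch below uses the commonly cited defaults `alpha=0.25` and `gamma=2.0`; the values actually used in this method are not stated, so they are assumptions. The EPGIoU regression loss is not sketched because its definition is not given in this section.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability p in (0, 1) and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing factor
    # (1 - p_t) ** gamma down-weights well-classified (easy) samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

A confident correct prediction such as `focal_loss(0.9, 1)` contributes far less than a confident wrong one such as `focal_loss(0.1, 1)`, which is the behaviour that lets hard samples dominate the gradient.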

[0028] Preferably, the improved YOLOv11 network further includes a post-processing module, which uses a Gaussian weighted method to calculate confidence scores to filter overlapping prediction boxes. The Gaussian weighted method includes:

[0029] The predicted boxes are sorted in descending order of their original confidence levels;

[0030] Calculate the IoU between the prediction box with the highest confidence and other prediction boxes, and filter the prediction boxes according to the preset threshold;

[0031] The original confidence scores of the selected prediction boxes are converted into linear confidence scores using the linearly weighted Soft-NMS method.

[0032] The linear confidence scores are converted to Gaussian confidence scores using the Gaussian weighted Soft-NMS method.

[0033] The confidence score calculated using the Gaussian weighting method is used to filter overlapping prediction boxes, which solves the problem of excessive suppression of overlapping dense bolt boxes and reduces the false detection rate in dense scenes.
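The idea of decaying, rather than discarding, overlapping boxes can be illustrated with the standard Gaussian Soft-NMS procedure. Note that this sketch is the textbook single-stage variant, not the two-stage linear-then-Gaussian conversion described above. The box format `(x1, y1, x2, y2)`, `sigma=0.5`, and `score_thresh` are assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def gaussian_soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Return (box, score) pairs; overlapping boxes are decayed instead of removed."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        remaining.sort(key=lambda pair: pair[1], reverse=True)
        best_box, best_score = remaining.pop(0)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in remaining:
            # Gaussian penalty: the larger the overlap, the stronger the decay.
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((box, score))
        remaining = decayed
    return kept
```

Two heavily overlapping bolt boxes both survive here, with the weaker one carrying a reduced score, whereas hard NMS would delete it outright.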

[0034] Preferably, the improved U-Net++ network includes a preprocessing module for preprocessing bolt images. This preprocessing module performs red enhancement on the bolt images to obtain a stitched image. The red enhancement method includes:

[0035] Generate HSV three-channel images based on bolt images;

[0036] The HSV three-channel image uses the H channel to filter the red area to obtain the red mask area;

[0037] The bolt image is stitched together with the red mask area to generate a stitched image.
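The red enhancement steps can be sketched with the Python standard library alone. The sketch assumes the image is a nested list of `(r, g, b)` tuples in the 0 to 255 range and that "stitching" means appending the mask as a fourth channel. The hue tolerance and the extra saturation and brightness floors are illustrative assumptions added to reject grey pixels; the text itself only mentions filtering on the H channel.

```python
import colorsys


def red_mask(image, hue_tol=10 / 360, min_sat=0.4, min_val=0.2):
    """Return a 0/255 mask marking pixels whose hue lies near red."""
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            # Red wraps around the hue circle, so test both ends of [0, 1).
            near_red = h <= hue_tol or h >= 1 - hue_tol
            mask_row.append(255 if near_red and s >= min_sat and v >= min_val else 0)
        mask.append(mask_row)
    return mask


def stitch(image, mask):
    """Append the mask to each pixel as a fourth channel."""
    return [[(r, g, b, m) for (r, g, b), m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]
```

The resulting four-channel image keeps the original colours and adds an explicit cue for where red paint is likely to be.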

[0038] Preferably, the improved U-Net++ network includes an encoder that introduces a CBAM attention mechanism. The CBAM attention mechanism uses channel attention to focus on red features and spatial attention to locate the anti-loosening line position. The encoder's convolutional layers use dilated convolution to gradually expand the receptive field.

[0039] Preferably, the improved U-Net++ network includes a decoder and a fine feature fusion branch module, which is used to fuse the low-level edge features of the encoder with the high-level semantic features of the decoder.

[0040] Preferably, the improved U-Net++ network employs a multi-loss joint optimization strategy to calculate the loss, the multi-loss joint optimization strategy including:

[0041] The Focal Dice Loss function is introduced to calculate the fine-structure segmentation loss;

[0042] The Laplacian operator is introduced to calculate the edge loss;

[0043] The total loss is calculated by weighting the fine-structure segmentation loss and the edge loss.
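The weighted combination of a segmentation loss and an edge loss can be sketched as follows. This is a simplified stand-in: a plain Dice loss replaces the Focal Dice Loss, whose exact form is not given here, and the edge loss is taken as the mean absolute difference between 4-neighbour Laplacian responses. The weights `w_seg` and `w_edge` are hypothetical.

```python
def dice_loss(pred, target, eps=1e-6):
    """Plain Dice loss on two same-sized 2D masks with values in [0, 1]."""
    inter = sum(p * t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def laplacian(img):
    """4-neighbour Laplacian response with zero padding."""
    h, w = len(img), len(img[0])

    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0.0

    return [[at(y - 1, x) + at(y + 1, x) + at(y, x - 1) + at(y, x + 1) - 4 * img[y][x]
             for x in range(w)] for y in range(h)]


def edge_loss(pred, target):
    """Mean absolute difference between the Laplacian edge maps."""
    lp, lt = laplacian(pred), laplacian(target)
    n = len(pred) * len(pred[0])
    return sum(abs(a - b) for ra, rb in zip(lp, lt) for a, b in zip(ra, rb)) / n


def total_loss(pred, target, w_seg=1.0, w_edge=0.5):
    """Weighted sum of the segmentation loss and the edge loss."""
    return w_seg * dice_loss(pred, target) + w_edge * edge_loss(pred, target)
```

A thin vertical line predicted one pixel off target yields a Dice term near 1 and a non-zero edge term, so both parts of the loss penalize the misplacement.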

[0044] Preferably, the projection correction method includes:

[0045] Extract the top mask area and the overall mask area of the bolt based on the segmented mask image;

[0046] The bolt tilt angle is determined based on the angle difference between the short side midline of the minimum bounding rectangle of the bolt top mask area and the short side midline of the minimum bounding rectangle of the overall bolt mask area.

[0047] The offset step size is determined based on the vector magnitude from the center of the minimum bounding rectangle of the top mask area of the bolt to the center of the minimum bounding rectangle of the overall mask area of the bolt.

[0048] The starting point of the anti-loosening line vector in the bolt area is used as the reference point. The reference point is then corrected based on the bolt tilt angle and offset step size, and used as the starting point of the corrected anti-loosening line vector in the non-bolt area.

[0049] The starting point of the anti-loosening line vector in the bolt area is projected and corrected according to the bolt tilt angle and offset step size. The data is directly extracted from the segmentation mask image, which is simple to calculate and highly accurate.
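The two quantities used by the projection correction can be computed as in the sketch below. The way they are combined in `corrected_start` is an assumption made only for illustration: the text states that the reference point is corrected according to the tilt angle and offset step but does not give the formula. Angles are assumed to be midline directions in degrees, and all function names are hypothetical.

```python
import math


def tilt_angle_deg(top_midline_deg, overall_midline_deg):
    """Signed difference between two midline directions, wrapped to (-90, 90]."""
    d = (top_midline_deg - overall_midline_deg) % 180.0
    return d - 180.0 if d > 90.0 else d


def offset_step(top_center, overall_center):
    """Magnitude of the vector from the top-rectangle center to the overall-rectangle center."""
    return math.hypot(overall_center[0] - top_center[0],
                      overall_center[1] - top_center[1])


def corrected_start(reference, tilt_deg, step):
    """ASSUMPTION: shift the reference point by `step` pixels along the tilt direction."""
    t = math.radians(tilt_deg)
    return (reference[0] + step * math.cos(t), reference[1] + step * math.sin(t))
```

Both inputs come straight from the minimum bounding rectangles of the segmentation masks, which is why the correction needs no camera calibration data.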

[0050] Preferably, if the visible range of the top of the bolt is greater than the visible range of the side of the bolt, then the point closest to the center of the smallest bounding rectangle of the mask area of the bolt area anti-loosening line is selected from the mask area of the non-bolt area anti-loosening line as the termination point of the non-bolt area anti-loosening line vector.

[0051] As a preferred method, the method for determining that the visible range of the bolt top is greater than the visible range of the bolt side is as follows: extract the bolt side mask area based on the segmentation mask image, calculate the area ratio of the bolt side and top based on the bolt top mask area and the bolt side mask area, and if the area ratio of the bolt top is greater than the area ratio of the bolt side, then it is determined that the visible range of the bolt top is greater than the visible range of the bolt side.
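The visibility test reduces to comparing two mask areas. A minimal sketch, assuming the top and side masks are 2D lists in which any non-zero value marks a mask pixel (function names are illustrative):

```python
def mask_area(mask):
    """Count the non-zero pixels in a 2D mask."""
    return sum(1 for row in mask for value in row if value)


def top_more_visible(top_mask, side_mask):
    """True when the bolt top occupies a larger share of the visible bolt than the side."""
    top, side = mask_area(top_mask), mask_area(side_mask)
    total = top + side
    if total == 0:
        raise ValueError("both masks are empty")
    return top / total > side / total
```

When this returns `True`, the projection correction is applied to the non-bolt area vector; otherwise the vectors are compared directly.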

[0052] The second objective of this invention is to provide a vehicle underbody bolt anti-loosening marking detection system, comprising:

[0053] The image acquisition module is configured to acquire a raw image containing the bolt to be inspected;

[0054] An improved YOLOv11 network is configured to detect and locate bolt targets in the original image and output the identified bolt images.

[0055] An improved U-Net++ network is configured to segment the bolts and anti-loosening lines in the bolt image to obtain a segmentation mask image;

[0056] The vector extraction module is configured to extract the anti-loosening lines in the bolt area and the non-bolt area based on the segmentation mask image, and obtain the vectors of the anti-loosening lines in the bolt area and the non-bolt area.

[0057] The pre-comparison module compares the visible range of the bolt top and the bolt side. If the visible range of the bolt top is greater than that of the bolt side, the anti-loosening line vector in the non-bolt area is projected and corrected; otherwise, no correction is made.

[0058] The anti-loosening judgment module compares the angle between the anti-loosening line vector of the bolt area and the anti-loosening line vector of the non-bolt area. If the angle exceeds a preset threshold, it is determined that the bolt has become loose.

[0059] A third objective of the present invention is to provide a computer device, including a memory and a processor, wherein the memory stores computer instructions, and the processor executes the computer instructions to perform the method described in the first objective of the invention.

[0060] A fourth objective of this invention is to provide a computer-readable storage medium storing computer instructions which, when executed by a computer, cause the computer to perform the method described in the first objective.

[0061] Compared with the prior art, the present invention has the following beneficial effects:

[0062] 1. When the visible range of the top of the bolt is greater than that of the side of the bolt, the detection results are greatly affected by the shooting angle. Therefore, the anti-loosening line vector in the non-bolt area is only projected and corrected when the visible range of the top of the bolt is greater than that of the side of the bolt. This improves the accuracy of the detection results while avoiding the increase in calculation load and calculation error caused by over-correction.

[0063] 2. By embedding coordinate attention modules into the first two levels of the C3k2 module in the backbone network, key features such as texture of the target are strengthened first during the multi-level feature extraction process, while irrelevant information is suppressed, thus solving the problem of sparse features of small targets and finally outputting a high-level feature map containing rich semantic information.

[0064] 3. Dynamically adjust the weights of features at different scales, and enhance the unique features of different types of bolts to cover extremely small bolts and adapt to more sizes and types of bolts.

[0065] 4. After preprocessing the original image, a bimodal feature map is generated and then input into the backbone network to improve the feature recognition in complex scenes with uneven lighting. Before the bolt image is input into the improved U-Net++ network, the bolt image is red-enhanced to obtain a stitched image, allowing the model to focus on red features in the early stage of training and reduce background interference.

Attached Figure Description

[0066] Figure 1 is a flowchart illustrating Embodiment 1 of the present invention;

[0067] Figure 2 is a diagram of the improved YOLOv11 network structure according to Embodiment 1 of the present invention;

[0068] Figure 3 is a partial bolt image output by the improved YOLOv11 network according to Embodiment 1 of the present invention;

[0069] Figure 4 is a comparative schematic diagram of the bolt image output by the improved YOLOv11 network and the segmentation mask image output by the improved U-Net++ network according to an embodiment of the present invention;

[0070] Figure 5 is a comparison of a low-resolution image that needs to be discarded in step S4 of Embodiment 1 of the present invention and a bolt image that meets the resolution requirements;

[0071] Figure 6 is a schematic diagram of step S4 of Embodiment 1 of the present invention, which outputs the anti-loosening line vectors for the bolt area and the non-bolt area based on the segmentation mask image.

Detailed Implementation

[0072] The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the described embodiments of the present invention without inventive effort are within the scope of protection of the present invention. Unless otherwise defined, the technical or scientific terms used herein should have the ordinary meaning understood by those skilled in the art to which this invention pertains.

[0073] To further understand this invention, some terms mentioned in this invention and its embodiments are first explained:

[0074] YOLO is a real-time object detection algorithm based on deep learning. It simplifies the object detection task into a single regression problem, predicting the category and location of all objects in an image through a single forward propagation. Its architecture mainly consists of three parts: Backbone, Neck (feature fusion module), and Head (detection head module). This embodiment is mainly based on YOLOv11.

[0075] P2 layer: This is the detection layer in the YOLO model, designed to improve the detection capability of small targets. Introducing the P2 layer can improve the performance of the YOLO model in small target detection. Traditional YOLO models usually use P3, P4 and P5 layers to handle targets of different scales, while the P2 layer is specifically optimized for small targets and can better capture the features of small targets.

[0076] Coordinate Attention (CA) is a lightweight attention mechanism designed to introduce precise positional information into channel attention to improve model performance in tasks such as classification, detection, and segmentation.

[0077] Multi-scale feature fusion is a widely used technique in computer vision that aims to improve model performance by integrating feature information from different scales. It excels in tasks such as object detection, image segmentation, and classification, effectively addressing the challenges posed by variations in object scale.

[0078] Anchor: also known as a prior box, is a template for predicting the bounding box of a target. It uses pre-set rectangles of different sizes and aspect ratios to help the model more accurately locate and detect targets of different sizes.

[0079] Regression loss: focuses on optimizing the positional and dimensional differences between the predicted bounding box and the true bounding box, ensuring that the model can accurately define the target location.

[0080] Classification loss: focuses on optimizing the model's prediction accuracy for the target category, ensuring that the model can correctly distinguish between different categories of targets.

[0081] BatchNorm, or Batch Normalization (BN), is a widely used technique in deep neural network training. It normalizes intermediate features to a mean of 0 and a variance of 1 by calculating the mean and variance of each mini-batch, and introduces learnable scaling parameters γ and offset parameters β to restore the model's expressive power. It significantly accelerates convergence, stabilizes training, and provides a degree of regularization.

[0082] U-Net++ is an improved image segmentation model based on U-Net. It addresses the coarse feature fusion and vanishing gradient problems of the original U-Net through nested dense skip connections and deep supervision, resulting in significantly better performance in fine-grained segmentation and small object segmentation tasks. The overall network structure includes: encoder, nested dense connection blocks, decoder, and deep-supervised segmentation head.

[0083] HSV three-channel image: an image represented in the HSV color space by the three components H (hue), S (saturation), and V (brightness). Extracting any one of the three components yields a grayscale image in which each pixel value represents the intensity of that channel.

[0084] CBAM: Convolutional Block Attention Module, is a lightweight yet efficient attention mechanism designed to enhance the feature representation capabilities of convolutional neural networks by simultaneously focusing on channel and spatial dimensions. It is widely used in computer vision tasks such as image classification, object detection, and semantic segmentation, and includes channel attention modules and spatial attention modules.

[0085] Edge loss is a loss function designed for image segmentation tasks, focusing specifically on the degree of matching between the predicted result and the ground truth label in the target boundary region. Unlike traditional region losses (such as cross-entropy and Dice loss), edge loss shifts the optimization focus from the statistics of the entire region to the geometric characteristics of the boundary, guiding model training by calculating the difference between the predicted boundary and the ground truth boundary.

[0086] Low-level edge features: These are the most basic and distinctive visual information in an image, typically corresponding to the edges, contours, lines, and other structures of a target.

[0087] High-level semantic features: In computer vision, high-level semantic features are abstract features with semantic information obtained by gradually abstracting, fusing and combining them through multi-layer networks and feature extraction algorithms. They no longer correspond to specific pixels or local structures of the image, but can represent high-level information such as the category, attributes, overall shape and contextual relationships of the target in the image.

[0088] Difficult sample weighting methods: Difficult sample weighting methods are mainly divided into two categories: static weighting and dynamic weighting. Static weighting predefines weights based on the sample distribution before training, mainly to solve the problem of class imbalance. Dynamic weighting dynamically calculates weights / selects samples based on the model's real-time prediction results during training. It can adapt to all difficult sample scenarios such as easily confused, low-quality, and small objects, and is also the most commonly used solution in computer vision (detection, segmentation, classification).

[0089] Dice-enhanced fine-structure segmentation: Dice enhancement is the core optimization strategy for solving fine-structure segmentation. By designing a loss function with Dice loss as the core and using targeted training techniques, the model focuses on pixel-level matching and contour continuity of fine structures, significantly improving the segmentation accuracy of fine structures (solving problems such as missed detections, breaks, and artifacts).

[0090] Difficult-to-classify sample weights: These are weight coefficients assigned to difficult-to-classify samples that are mispredicted by the model, have high loss values, or have low confidence. Their core function is to amplify the loss contribution of difficult-to-classify samples, allowing the model to prioritize learning these types of samples (for example, small targets and occluded targets in detection) during gradient updates.

[0091] Easy-to-classify sample weights: Weight coefficients are assigned to easy-to-classify samples that are correctly predicted by the model, have low loss values, and high confidence. Their core function is to suppress the loss contribution of easy-to-classify samples and prevent their gradients from drowning out the gradients of difficult-to-classify samples (large and clear targets in detection).

[0092] Minimum Bounding Rectangle (MBR): A commonly used concept in spatial image databases and Geographic Information Systems (GIS) to describe the bounding rectangle of a spatial object. A minimum bounding rectangle is a simple rectangle that tightly encloses one or more spatial objects (such as points, lines, polygons, etc.) such that the boundary of this rectangle represents the minimum and maximum coordinate values of the object in each dimension.
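The axis-aligned minimum bounding rectangle defined above can be computed directly from coordinate extremes. The sketch below is illustrative only and assumes the mask pixels are given as `(x, y)` points. Steps that rely on the orientation of the rectangle's short side would instead need a rotated minimum-area rectangle, which is not shown here.

```python
def minimum_bounding_rectangle(points):
    """Return (x_min, y_min, x_max, y_max) enclosing all (x, y) points."""
    if not points:
        raise ValueError("no points given")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def rectangle_center(rect):
    """Center of a rectangle given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = rect
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)
```

For the points `(2, 3)`, `(5, 1)`, and `(4, 7)`, the rectangle is `(2, 1, 5, 7)` and its center is `(3.5, 4.0)`.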

[0093] To facilitate a better understanding of the present invention by those skilled in the art, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. The following are merely exemplary and do not limit the scope of protection of the present invention.

[0094] Embodiment 1

[0095] A method for detecting anti-loosening marks on vehicle underbody bolts, as shown in Figure 1, includes:

[0096] S1: Obtain the original image containing the bolt to be detected.

[0097] S2: Based on the improved YOLOv11 network, bolt target detection and localization are performed on the original image, and the identified bolt images are output. Figure 2 shows the improved YOLOv11 network structure in this embodiment.

[0098] This step aims to output bolt images using the improved YOLOv11 network. Figure 3 shows an example of a partial bolt image.

[0099] In one alternative embodiment, the improved YOLOv11 network uses bolt type labels and corresponding undercarriage bolt atlases as training datasets during pre-training, thereby enabling it to identify the category of bolts in the original image.

[0100] In one alternative embodiment, the improved YOLOv11 network uses a small-target augmentation strategy to scale the feature maps at multiple scales during pre-training, simulating changes in bolt distance to enhance the model's scale robustness. During training, samples are weighted in inverse proportion to the number of samples in their class, so that the majority class does not dominate the training.
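
The inverse-proportional class weighting described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `inverse_frequency_weights`, the normalization to a mean weight of 1, and the example class names are all assumptions.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / (class sample count).

    The weights are rescaled so that their mean is 1 (an assumed convention).
    """
    counts = Counter(labels)
    raw = {cls: 1.0 / n for cls, n in counts.items()}
    mean_raw = sum(raw.values()) / len(raw)
    return {cls: w / mean_raw for cls, w in raw.items()}

# Hypothetical dataset: 8 samples of one bolt type, 2 of another.
labels = ["hex"] * 8 + ["flange"] * 2
weights = inverse_frequency_weights(labels)
print(weights)
```

The rarer class receives the larger weight, so its samples contribute more to the loss (or are drawn more often, if the weights drive a sampler).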

[0101] The dataset used during pre-training is augmented with scenes such as shadows, dirt, and blur to simulate the degradation effects of complex environments under real vehicles, thereby improving the model's stability and generalization ability under non-ideal imaging conditions.

[0102] The improved YOLOv11 network includes a preprocessing module, a backbone network, a feature fusion module, a detection head module, and a postprocessing module.

[0103] For ease of understanding, the following describes the various modules in the improved YOLOv11 network that are relevant to the embodiments of the present invention.

[0104] I. Preprocessing module.

[0105] This module fuses the original image with a grayscale image generated from the original image to produce a dual-modal feature map, thereby achieving input-side feature fusion. The original image is an RGB three-channel image.

[0106] After preprocessing, a bimodal feature map is generated and input into the backbone network to improve feature recognition in complex scenes with uneven lighting.

[0107] The formula for input feature fusion can be:

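[0108] $F_{in} = \mathrm{Conv}_{1\times 1}\big(\mathrm{Concat}(F_{RGB}, F_{gray})\big)$

[0109] where $F_{RGB}$ is the feature image of the RGB image, $F_{gray}$ is the feature image of the grayscale image, $\mathrm{Conv}_{1\times 1}$ is a 1×1 convolution operation, and $\mathrm{Concat}$ is a channel-dimension concatenation operation.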
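
As a rough illustration of the input-side fusion, the sketch below concatenates an RGB image with its grayscale version along the channel axis and applies a 1×1 convolution, which is equivalent to a per-pixel linear map. The function name `fuse_rgb_gray`, the BT.601 luminance coefficients, and the weight shapes are assumptions; the patent does not state how the grayscale image is computed.

```python
import numpy as np

def fuse_rgb_gray(rgb, weight, bias):
    """Fuse an RGB image with its grayscale version.

    rgb:    (H, W, 3) float array
    weight: (4, C_out) array, the 1x1 convolution kernel
    bias:   (C_out,) array
    """
    # Grayscale via BT.601 luminance coefficients (assumed conversion).
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    # Channel-dimension concatenation -> (H, W, 4).
    stacked = np.concatenate([rgb, gray[..., None]], axis=-1)
    # A 1x1 convolution is a linear map applied independently at every pixel.
    return stacked @ weight + bias

rgb = np.ones((2, 2, 3))
out = fuse_rgb_gray(rgb, weight=np.ones((4, 2)), bias=np.zeros(2))
print(out.shape)
```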

[0110] II. Backbone Network.

[0111] Used to extract features from the dual-modal feature map to obtain a feature layer for small targets.

[0112] The backbone network consists of four levels of C3k2 modules. Coordinate attention modules are embedded at the outputs of the first two C3k2 modules. The output of the first-level C3k2 module is connected to the P2 output layer, which performs multi-level feature extraction on the bimodal feature map, ultimately yielding a high-level feature set FCA4 with a size of 40×40×512. The coordinate attention module is constructed using a convolutional attention mechanism that combines channel and spatial coordinates.

[0113] By embedding coordinate attention modules into the first two levels of the C3k2 module in the backbone network, key features such as texture of the target are prioritized to be enhanced during the multi-level feature extraction process, while irrelevant information is suppressed, thus solving the problem of feature sparsity for small targets and finally outputting a high-level feature map containing rich semantic information.

[0114] The formula for embedding CA coordinate attention into the C3k2 module can be:

[0115]

[0116] Where F represents the feature map, AvgPool represents average pooling, MaxPool represents max pooling, ⊕ represents channel concatenation, MLP stands for multilayer perceptron, Mc and Ms are the channel attention map and the spatial attention map, respectively, and Split is channel splitting. The formula also uses an element-wise multiplication operator.

[0117] III. Feature Fusion Module.

[0118] An adaptive weighting method is used for feature fusion, with weights calculated based on the semantic features of bolts at different scales.

[0119] Adaptive weighting methods include:

[0120] S231: Align the feature layers output by the backbone network.

[0121] The feature layers output by the backbone network can be aligned by upsampling or downsampling so that they share a unified size, which can be 80×80×256.

[0122] S232: After performing global average pooling on the aligned feature layers, input them into the fully connected layer to reduce the dimensionality to one-dimensional feature values.

[0123] The vector obtained by global average pooling of the feature layer is input into the fully connected layer. The dimension of the vector can be 1×1×256. In the fully connected layer, the dimension is reduced to 1 dimension to obtain one-dimensional feature values.

[0124] S233: Activate the one-dimensional feature values through an activation function to obtain the original weights $w_i$.

[0125] The activation function can be the Sigmoid function.

[0126] S234: Normalize the original weights to obtain the final weights used for feature fusion.

[0127] The normalized weights are obtained using the Softmax function, which ensures that the weights sum to 1.

[0128] S235: The multi-scale feature fusion method is used to multiply each aligned feature layer with its corresponding normalized weight element by element and then sum them to output the fused feature map.

[0129] In the multi-scale feature fusion method, each aligned feature layer is multiplied element by element with its corresponding normalized weight, and the results are summed to output the fused feature map, whose size can be 80×80×256. Dynamically adjusting the weights of features at different scales enhances the distinctive features of different bolt types, covers extremely small bolts, and adapts to a wider range of bolt sizes and categories.

[0130] The formula for calculating and fusing adaptive weights can be:

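[0131] $w_i = \sigma\big(\mathrm{FC}(\mathrm{GAP}(F_i))\big), \quad \hat{w}_i = \dfrac{e^{w_i}}{\sum_j e^{w_j}}, \quad F_{out} = \sum_i \hat{w}_i \odot F_i$

[0132] where $F_{out}$ is the output feature map of the feature fusion module, $F_i$ is the i-th aligned feature layer, $w_i$ is the original weight, $\hat{w}_i$ is the weight normalized by the Softmax function, GAP is global average pooling, FC is the fully connected layer, and $\sigma$ is the Sigmoid function.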
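
Steps S231 to S235 can be sketched as below for feature layers that are already aligned. This is only an illustration under stated assumptions: the function name `adaptive_fuse` is hypothetical, and giving each scale its own fully connected layer (`fc_weights[i]`, `fc_biases[i]`) is an assumption, since the source does not say whether the layer is shared.

```python
import numpy as np

def adaptive_fuse(features, fc_weights, fc_biases):
    """Fuse aligned feature layers with adaptive, softmax-normalized weights.

    features:   list of (H, W, C) arrays with identical shapes
    fc_weights: list of (C,) arrays, one fully connected layer per scale (assumed)
    fc_biases:  list of floats
    """
    raw = []
    for f, w, b in zip(features, fc_weights, fc_biases):
        pooled = f.mean(axis=(0, 1))           # global average pooling -> (C,)
        z = float(pooled @ w + b)              # fully connected layer -> one value
        raw.append(1.0 / (1.0 + np.exp(-z)))   # Sigmoid -> original weight
    raw = np.array(raw)
    exp = np.exp(raw - raw.max())              # numerically stable Softmax
    norm = exp / exp.sum()
    # Weight each layer by its scalar (broadcast over all elements) and sum.
    fused = sum(a * f for a, f in zip(norm, features))
    return fused, norm

feats = [np.ones((4, 4, 3)), np.ones((4, 4, 3))]
fused, norm = adaptive_fuse(feats, [np.zeros(3), np.zeros(3)], [0.0, 0.0])
print(norm)
```

With identical inputs and zero-initialized layers, both scales receive equal weight, and the fused map equals the input.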

[0133] IV. Detection Head Module.

[0134] It is configured with independent regression and classification branches, that is, the coupled detection head is split into independent classification and regression branches.

[0135] Independent classification and regression branches can employ different loss function calculation methods that better suit the characteristics of bolts. Combined with bolt-specific anchors, this improves the positioning accuracy of small targets and the matching degree of bolt size.

[0136] The regression branch uses a small target convolution kernel to reduce feature dilution during small target localization and improve bounding box regression accuracy. The regression loss uses EPGIoU (Enhanced Generalized Intersection over Union) instead of the original CIoU to improve small box localization accuracy.

[0137] The regression loss EPGIoU formula can be:

[0138]

[0139] The quantities in the formula are: the intersection over union (IoU) of the predicted box and the ground-truth box; the squared Euclidean distance between the center of the predicted box and the center of the ground-truth box; the angular deviation between the predicted box and the ground-truth box; and the angle loss weight.

[0140] The classification branch retains 3×3 convolutional kernels to enhance category feature extraction. The classification branch contains convolutional layers, and a BatchNorm layer with GELU activation function is added after the convolutional layers to improve category discrimination.

[0141] The classification loss uses Focal Loss to reduce the weight of easily classified samples, focus on difficult-to-classify bolts and bolts photographed from different angles, and set higher weights for bolt types with fewer samples to balance the training process.

[0142] The classification weighted Focal Loss formula can be:

[0143]

[0144] The quantities in the formula are: the number of bolt types; the probability predicted by the model for each class; the ground-truth label; the category weight; and the focusing coefficient.
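
For orientation, the sketch below implements the commonly used class-weighted Focal Loss for a single sample with a one-hot label. It is an assumption that the patent's formula takes this standard form; the function name `weighted_focal_loss` and the default focusing coefficient `gamma=2.0` are illustrative.

```python
import math

def weighted_focal_loss(probs, target, class_weights, gamma=2.0):
    """Class-weighted Focal Loss for one sample.

    probs:         predicted probability for each class
    target:        index of the ground-truth class (one-hot label)
    class_weights: per-class weight, larger for rare bolt types
    gamma:         focusing coefficient (assumed default)
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    # (1 - p_t)^gamma shrinks the loss of confidently correct (easy) samples.
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)

easy = weighted_focal_loss([0.9, 0.1], 0, [1.0, 1.0])
hard = weighted_focal_loss([0.3, 0.7], 0, [1.0, 1.0])
print(easy < hard)
```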

[0145] The total loss formula can be:

[0146]

[0147] The coefficient in the formula adjusts the weight of the bolt detection task.

[0148] Anchors are generated based on bolt annotation datasets from vehicle undercarriage images and using the K-Means clustering algorithm. These anchors replace the default-sized native anchors, resulting in anchors that better match the bolt sizes.

[0149] The K-Means Anchor clustering formula can be:

[0150]

[0151] The formula for the Anchor matching Intersection over Union (IoU) threshold can be:

[0152]

[0153] Where 'a' is the set of all bolt width-to-height ratios in the dataset, the set of cluster centers represents the sizes of the generated anchors, and IoU is the intersection over union, which measures the degree of matching between bolts and anchors.

[0154] V. Post-processing module.

[0155] Overlapping prediction boxes are filtered using confidence scores calculated with a Gaussian weighting method. The Gaussian weighting method includes:

[0156] S251: Sort the prediction boxes in descending order of their original confidence levels;

[0157] S252: Calculate the IoU between the prediction box with the highest confidence and other prediction boxes, and filter the prediction boxes according to the preset threshold;

[0158] S253: Convert the original confidence scores of the selected prediction boxes into linear confidence scores using the linearly weighted Soft-NMS method;

[0159] S254: Convert linear confidence scores to Gaussian confidence scores using the Gaussian weighted Soft-NMS method.

[0160] The confidence score calculated using the Gaussian weighting method is used to filter overlapping prediction boxes, which solves the problem of excessive suppression of overlapping dense bolt boxes and reduces the false detection rate in dense scenes.

[0161] The linear Soft-NMS formula can be:

[0162]

[0163] The Gaussian weighted Soft-NMS formula can be:

[0164]

[0165] The quantities in the formulas are: the original confidence of the i-th box; the linear confidence of the i-th box; the Gaussian confidence of the i-th box; M, the candidate baseline box; the other detection boxes to be processed; the native IoU threshold; and the Gaussian attenuation coefficient.
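
The following sketch shows the standard Gaussian-weighted Soft-NMS, in which the score of each box overlapping the current best box is decayed by exp(-IoU²/σ) instead of being removed outright. It is not the patent's exact two-stage (linear, then Gaussian) procedure, whose formulas are not recoverable here; the names `gaussian_soft_nms`, `sigma`, and `score_min` and their default values are assumptions.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def gaussian_soft_nms(boxes, scores, sigma=0.5, score_min=0.001):
    """Return (box, score) pairs in selection order after Gaussian score decay."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Pick the box with the highest current confidence.
        remaining.sort(key=lambda bs: bs[1], reverse=True)
        best_box, best_score = remaining.pop(0)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in remaining:
            # Heavier overlap with the selected box gives stronger decay.
            score *= math.exp(-iou(best_box, box) ** 2 / sigma)
            if score >= score_min:
                decayed.append((box, score))
        remaining = decayed
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
kept = gaussian_soft_nms(boxes, [0.9, 0.8, 0.7])
for box, score in kept:
    print(box, round(score, 3))
```

The overlapping second box is kept with a reduced score rather than suppressed, which is the behavior that helps with densely packed bolts.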

[0166] S3: Based on the improved U-Net++ network, the bolts and anti-loosening lines in the bolt image are segmented to obtain a segmentation mask image.

[0167] This step aims to segment the bolt image output from step S2 using the improved U-Net++ network, producing the segmentation mask shown in Figure 4. Figure 4 is a schematic diagram comparing some of the segmentation mask images output by the improved U-Net++ network with the corresponding bolt images.

[0168] The improved U-Net++ network is pre-trained on a dataset of bolt images.

[0169] The improved U-Net++ network includes: a preprocessing module, an encoder, a decoder, a fine feature fusion branch module, and a loss module.

[0170] For ease of understanding, the following describes the various modules in the improved U-Net++ network that are relevant to the embodiments of the present invention.

[0171] I. Preprocessing module: used to suppress noise in the bolt image and in the HSV three-channel image converted from the bolt image.

[0172] S311: Before the bolt image is input into the improved U-Net++ network, a Gaussian filter kernel and a median filter kernel are introduced to suppress high-frequency noise in the bolt image and the HSV three-channel image converted from the bolt image, thereby preserving the bolt detail edges.

[0173] In this embodiment, a 3×3 Gaussian filter kernel and a median filter kernel are selected based on the original image size.

[0174] S312: Before the bolt image is input into the improved U-Net++ network, the CLAHE model (Contrast Limited Adaptive Histogram Equalization) is used to enhance the local brightness of the bolt image. The HSV three-channel image generated from the bolt image is enhanced by linear contrast stretching to improve the image contrast and strengthen the difference between the bolt and the background.

[0175] Linear contrast stretching maps the pixel values of the V channel to the [0,255] range to obtain an infrared image, enhancing the characteristic temperature of the bolt, as shown in the following formula:

[0176]

[0177] The symbol in the formula denotes the pixel value at the current coordinates of the V-channel infrared image.
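
Linear contrast stretching of the V channel to the [0,255] range is commonly done with a min-max mapping, sketched below. It is an assumption that the patent uses the channel minimum and maximum as the stretch limits; the function name `stretch_v_channel` is hypothetical.

```python
import numpy as np

def stretch_v_channel(v):
    """Linearly map the values of a V channel onto the range [0, 255]."""
    v = np.asarray(v, dtype=np.float64)
    v_min, v_max = v.min(), v.max()
    if v_max == v_min:
        # A constant channel has no contrast to stretch.
        return np.zeros(v.shape, dtype=np.uint8)
    return np.round((v - v_min) / (v_max - v_min) * 255.0).astype(np.uint8)

stretched = stretch_v_channel([[50, 100], [150, 200]])
print(stretched)
```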

[0178] S313: Before the bolt image is input into the improved U-Net++ network, red enhancement is performed on the bolt image to obtain a stitched image. The red enhancement method includes:

[0179] Generate HSV three-channel images based on bolt images;

[0180] The H channel of the HSV three-channel image is used to filter the red area, yielding the red mask area;

[0181] The bolt image is stitched together with the red mask area to generate a stitched image.

[0182] Applying red enhancement before the bolt image enters the improved U-Net++ network allows the model to focus on red features during the early stages of training, reducing background interference. The stitching formula can be:

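[0183] $F_{stitch} = \mathrm{Conv}_{1\times 1}\big(\mathrm{Concat}(F_{RGB}, F_{red})\big)$

[0184] where $F_{RGB}$ and $F_{red}$ are the feature images of the RGB image and of the red anti-loosening line mask, respectively, $\mathrm{Conv}_{1\times 1}$ is a 1×1 convolution operation, and $\mathrm{Concat}$ is a channel-dimension concatenation operation.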
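
The red enhancement steps can be illustrated with the standard-library `colorsys` module, as sketched below. The hue tolerance and the saturation and value floors are assumed thresholds that the source does not specify, the function names are hypothetical, and the 1×1 convolution that follows the stitching is omitted.

```python
import colorsys

def red_mask(rgb_pixels, hue_tol=20 / 360, min_sat=0.4, min_val=0.2):
    """Build a binary mask of red pixels from rows of (r, g, b) tuples in 0..255."""
    mask = []
    for row in rgb_pixels:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            # Red sits at both ends of the hue circle (near 0 and near 1).
            is_red = (h <= hue_tol or h >= 1 - hue_tol) and s >= min_sat and v >= min_val
            mask_row.append(1 if is_red else 0)
        mask.append(mask_row)
    return mask

def stitch(rgb_pixels, mask):
    """Append the mask as a fourth channel to every pixel."""
    return [[(r, g, b, m) for (r, g, b), m in zip(row, mask_row)]
            for row, mask_row in zip(rgb_pixels, mask)]

image = [[(220, 20, 20), (20, 200, 20)],
         [(128, 128, 128), (255, 0, 10)]]
mask = red_mask(image)
stitched = stitch(image, mask)
print(mask)
```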

[0185] II. Encoder.

[0186] The encoder introduces the CBAM attention mechanism, which uses channel attention to focus on red features and spatial attention to locate the anti-loosening lines. The encoder's convolutional layers use dilated convolution to gradually expand the receptive field, avoiding the loss of fine line features caused by pooling.

[0187] III. Decoder and fine feature fusion branch module.

[0188] The fine feature fusion branch module is used to fuse the low-level edge features of the encoder with the high-level semantic features of the decoder, which is beneficial for extracting less obvious features.

[0189] IV. Loss Module: The loss is calculated using a multi-loss joint optimization strategy.

[0190] Multi-loss joint optimization strategies include:

[0191] S351: Introduce the Focal Dice Loss function to calculate the fine-structure segmentation loss;

[0192] By combining Focal Loss's weighting of hard-to-classify samples with Dice Loss's enhanced fine-structure segmentation, a Focal Dice Loss fusion optimization loss function is generated, thereby addressing the low pixel ratio of anti-loosening lines and the imbalance between positive and negative samples.

[0193] Fine feature fusion provides higher quality feature input for Focal Dice Loss, enabling Focal Loss to more accurately weight difficult samples and Dice Loss to more accurately calculate pixel-level overlap.

[0194] The formula for Dice-enhanced fine-structure segmentation loss can be:

[0195] ;

[0196] The formula for the Focal Dice Loss fusion optimization loss can be:

[0197] ;

[0198] The quantities in the formulas are: the predicted mask pixel value; the ground-truth label mask pixel value (1 = anti-loosening line, 0 = background); a smoothing term that prevents the denominator from being 0; and the Focal weight.
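
As a reference point for the Dice term only, the sketch below computes the widely used smoothed Dice loss between a predicted mask and a ground-truth mask. It is an assumption that the patent's Dice term has this form, and the way it is combined with the Focal weight is not reproduced here; `dice_loss` and `eps` are illustrative names.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Smoothed Dice loss: 1 - (2 * overlap + eps) / (|pred| + |target| + eps)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    intersection = (pred * target).sum()
    # eps keeps the denominator away from zero when both masks are empty.
    return 1.0 - (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

perfect = dice_loss([[1, 0], [0, 0]], [[1, 0], [0, 0]])
disjoint = dice_loss([[1, 0], [0, 0]], [[0, 1], [0, 0]])
print(perfect, disjoint)
```

Because the loss depends on overlap relative to mask size rather than on pixel counts, it stays informative even when the anti-loosening line covers only a few pixels.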

[0199] S352: Introduce the Laplacian operator to calculate the edge loss;

[0200] The edge loss formula for the Laplacian operator can be:

[0201] ;

[0202] ;

[0203] The Laplacian operator is introduced to calculate edge loss, enhance the integrity of the anti-loosening line edge, and solve the problem of anti-loosening line breakage.

[0204] The quantities in the formulas are: the edge detection convolution kernel; the edge map of the ground-truth mask; and the edge map of the predicted mask.
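
One possible reading of the edge loss is sketched below: both masks are filtered with a Laplacian kernel and the resulting edge maps are compared. The 4-neighbor kernel, the zero padding, and the mean absolute difference used as the distance are all assumptions, since the source formulas are not available; the function names are hypothetical.

```python
import numpy as np

def laplacian(img):
    """4-neighbor Laplacian response with zero padding at the border."""
    p = np.pad(np.asarray(img, dtype=np.float64), 1)
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * p[1:-1, 1:-1])

def edge_loss(pred_mask, true_mask):
    """Mean absolute difference between the two Laplacian edge maps (assumed norm)."""
    return float(np.abs(laplacian(pred_mask) - laplacian(true_mask)).mean())

dot = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(edge_loss(dot, dot), edge_loss(dot, empty))
```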

[0205] S353: Calculate the total loss by weighting the fine-structure segmentation loss and the edge loss.

[0206] The total loss formula can be:

[0207] .

[0208] S4: Extract the anti-loosening lines for the bolt area and the non-bolt area based on the segmentation mask image, determine the vectors of the anti-loosening lines for the bolt area and the non-bolt area, and if the visible range of the top of the bolt is greater than the visible range of the side of the bolt, then perform projection correction on the vector of the anti-loosening line for the non-bolt area, and compare the angle between the vector of the anti-loosening line for the bolt area and the non-bolt area to determine whether the bolt is loose.

[0209] S41: Discard bolt images with low resolution that cannot be used to determine the anti-loosening line.

[0210] The image of the positioning nut is denoised using Gaussian filtering to obtain a denoised image. The Laplacian operator is then applied to the denoised image, and the variance of the resulting Laplacian response map is calculated. This variance is used as a sharpness index and compared with a preset threshold. Segmentation masks with low sharpness that cannot be used for anti-loosening line detection are discarded. Figure 5 compares low-resolution images that need to be discarded with bolt images that meet the resolution requirements.

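[0211] The formula for calculating Gaussian kernel weights can be:

[0212] $G(x,y) = \dfrac{1}{2\pi\sigma^2}\exp\!\left(-\dfrac{x^2+y^2}{2\sigma^2}\right)$;

[0213] The formula for the 4-neighborhood Laplace kernel operator can be:

[0214] $L(x,y) = f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4f(x,y)$;

[0215] The formula for calculating the mean of the Laplace response map can be:

[0216] $\mu = \dfrac{1}{HW}\sum_{x=1}^{W}\sum_{y=1}^{H} L(x,y)$;

[0217] The formula for calculating the standard deviation of the Laplace response map can be:

[0218] $\sigma_L = \sqrt{\dfrac{1}{HW}\sum_{x=1}^{W}\sum_{y=1}^{H}\big(L(x,y)-\mu\big)^2}$;

[0219] The formula for calculating the variance (sharpness index) of the Laplace response map can be:

[0220] $\mathrm{Var} = \sigma_L^2$;

[0221] Where $\sigma$ is the standard deviation of the Gaussian kernel, f(x,y) is the pixel value at (x,y) after Gaussian blur, H is the height of the nut ROI, W is the width of the nut ROI, L(x,y) is the pixel value of the Laplacian response map at (x,y), and $\mu$ and $\sigma_L$ are the mean and standard deviation of the Laplacian response map. The nut ROI refers to the "Region of Interest" set specifically for detecting nuts in the target detection task. By limiting the detection range, the efficiency and accuracy of nut detection can be improved.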
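
The sharpness check can be sketched as follows for an image that has already been denoised. The Gaussian pre-filter is omitted for brevity, only interior pixels are evaluated, and the threshold value of 100.0 is a placeholder rather than a value from the source; the function names are hypothetical.

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of the 4-neighbor Laplacian response over interior pixels."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def is_sharp(gray, threshold=100.0):
    """Keep an image only if its sharpness index reaches the (assumed) threshold."""
    return laplacian_variance(gray) >= threshold

flat = np.full((6, 6), 7.0)                    # no detail at all
checker = np.indices((6, 6)).sum(axis=0) % 2 * 255.0  # maximal fine detail
print(laplacian_variance(flat), laplacian_variance(checker))
```

A blurred image has weak second derivatives, so its Laplacian response varies little and the index is low; crisp edges push the index up.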

[0222] S42: Use morphological operations to remove images with broken, discontinuous, or irregular features in the masking feature area of the anti-loosening line.

[0223] S43: Extract the anti-loosening line of the bolt area based on the segmentation mask image, and calculate the vector of the anti-loosening line of the bolt area.

[0224] The anti-loosening line vector of the bolt area is determined by its starting point and ending point. As shown in Figure 6, determining it includes the following steps:

[0225] S431: Based on the overall bolt mask area and the anti-loosening line mask area, the anti-loosening line mask area of the bolt area is obtained.

[0226] ;

[0227] The formula uses the mask area of the bolt top surface and the anti-loosening line mask area.

[0228] S432: Calculate the coordinates of the center point of the minimum bounding rectangle of the bolt area anti-loosening line mask region as the starting point of the bolt area anti-loosening line vector.

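[0229] Let the coordinates of the four corner points of the minimum bounding rectangle of the bolt area anti-loosening line mask area be $(x_1,y_1)$, $(x_2,y_2)$, $(x_3,y_3)$, and $(x_4,y_4)$. The center coordinates are:

[0230] $(x_c, y_c) = \left(\dfrac{x_1+x_2+x_3+x_4}{4},\ \dfrac{y_1+y_2+y_3+y_4}{4}\right)$;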

[0231] S433: Calculate the center point of the short side of the minimum bounding rectangle of the bolt area anti-loosening line mask region as the termination point of the bolt area anti-loosening line vector.

[0232] First, calculate the center points of the two short sides of the minimum bounding rectangle of the anti-loosening line mask located in the bolt area.

[0233] ;

[0234] Based on the distance from the center of the nut, the point closer to the center is selected as the termination point of the bolt anti-loosening line vector. The calculation formula is as follows:

[0235] ;

[0236] ;
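
Steps S432 and S433 can be illustrated with plain coordinate geometry, as in the sketch below. It assumes the four corners of the minimum bounding rectangle are supplied in order around the rectangle (as typical rotated-rectangle routines return them); the function names and the example coordinates are hypothetical.

```python
import math

def rect_center(corners):
    """Center of a rectangle: the mean of its four corner points."""
    return (sum(p[0] for p in corners) / 4.0, sum(p[1] for p in corners) / 4.0)

def short_side_midpoints(corners):
    """Midpoints of the two short sides; corners must be ordered around the rectangle."""
    sides = [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    # Sides 0 and 2 are opposite each other, as are sides 1 and 3.
    if math.dist(*sides[0]) <= math.dist(*sides[1]):
        pair = (sides[0], sides[2])
    else:
        pair = (sides[1], sides[3])
    return [((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0) for a, b in pair]

def vector_end_point(corners, nut_center):
    """Pick the short-side midpoint that lies closer to the nut center."""
    return min(short_side_midpoints(corners), key=lambda m: math.dist(m, nut_center))

corners = [(0, 0), (10, 0), (10, 2), (0, 2)]
start = rect_center(corners)
end = vector_end_point(corners, nut_center=(-3, 1))
print(start, end)
```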

[0237] S44: Determine whether the visible area at the top of the bolt is greater than the visible area on the side of the bolt.

[0238] Extract the overall 2D bolt mask and the bolt side mask area from the segmentation mask image, and calculate the area ratio of the bolt side or top based on the overall bolt mask, bolt side mask, and bolt top mask. This determines whether the visible area of the bolt top is greater than the visible area of the bolt side. If so, the anti-loosening line is located at the bolt top; otherwise, it is located at the bolt side.

[0239] The formula for determining the position of the anti-loosening line can be:

[0240] ;

[0241] Among them, $S_{top}$ is the area of the bolt top and $S_{side}$ is the area of the bolt side.
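
The top-versus-side decision reduces to comparing two mask areas, as in the short sketch below. Counting nonzero pixels as the area and the function names used here are assumptions made for illustration.

```python
def mask_area(mask):
    """Area of a binary mask, counted as its number of nonzero pixels."""
    return sum(1 for row in mask for value in row if value)

def anti_loosening_line_location(top_mask, side_mask):
    """Return "top" when the bolt top is more visible than the side, else "side"."""
    return "top" if mask_area(top_mask) > mask_area(side_mask) else "side"

top_mask = [[1, 1, 1], [1, 1, 0]]
side_mask = [[0, 1, 0], [0, 1, 0]]
print(anti_loosening_line_location(top_mask, side_mask))
```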

[0242] S45: If the visible range of the top of the bolt is greater than the visible range of the side of the bolt, then the identified bolt area anti-loosening line is located at the top of the bolt. Then, the starting point of the non-bolt area anti-loosening line vector is projected and corrected using the starting point of the bolt area anti-loosening line vector as the reference point. The point closest to the center of the smallest bounding rectangle of the non-bolt area anti-loosening line mask area is selected from the non-bolt area anti-loosening line mask area as the ending point of the non-bolt area anti-loosening line vector.

[0243] Projection correction methods include:

[0244] S451: Extract the top mask area and the overall mask area of the bolt based on the segmented mask diagram;

[0245] S452: Calculate the bolt tilt angle.

[0246] The bolt tilt angle is the angle between the short-side midline of the minimum bounding rectangle of the bolt top mask area and the short-side midline of the minimum bounding rectangle of the overall bolt mask area. The calculation formula can be:

[0247] ;

[0248] The two angles in the formula are the rotation angle of the short-side midline of the minimum bounding rectangle of the bolt top mask area relative to the horizontal, and the rotation angle of the short-side midline of the minimum bounding rectangle of the overall bolt mask area relative to the horizontal.

[0249] S453: Calculate the offset step size l.

[0250] The offset step size is the vector magnitude from the center of the smallest bounding rectangle of the bolt top mask area to the center of the smallest bounding rectangle of the overall bolt mask area. The calculation formula can be:

[0251] $l = \lVert P_{top} - P_{top+side} \rVert_2$;

[0252] In the formula, $P_{top}$ is the center of the minimum bounding rectangle of the bolt top mask area, and $P_{top+side}$ is the center of the minimum bounding rectangle of the overall bolt mask area.

[0253] S454: The starting point of the bolt area anti-loosening line vector is used as the reference point. The reference point is corrected based on the bolt inclination angle and offset step size, and used as the starting point coordinates of the corrected non-bolt area anti-loosening line vector. The formula is:

[0254] = ;

[0255] S455: Select the point closest to the center of the smallest bounding rectangle of the non-bolt area anti-loosening line mask region from the non-bolt area anti-loosening line mask region as the termination point of the non-bolt area anti-loosening line vector.

[0256] S45: If the bolt area anti-loosening line is located on the side of the bolt, then the starting point of the non-bolt area anti-loosening line vector is the same as the starting point of the bolt area anti-loosening line vector, and the ending point of the non-bolt area anti-loosening line vector is located at the center point of the minimum bounding rectangle of the non-bolt area anti-loosening line mask area.

[0257] Let the coordinates of the two diagonal corner points of the minimum bounding rectangle of the non-bolt area anti-loosening line mask area be $(x_1,y_1)$ and $(x_2,y_2)$. The termination point of the anti-loosening line vector in the non-bolt area is:

[0258] $\left(\dfrac{x_1+x_2}{2},\ \dfrac{y_1+y_2}{2}\right)$;

[0259] Through steps S41-S45, the image shown in Figure 6 is obtained, in which the anti-loosening line vectors of the bolt area and the non-bolt area are annotated on the segmentation mask image.

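[0260] S46: Calculate the included angle of the anti-loosening lines using the vector dot product formula to determine whether the bolt is loose.

[0261] The formula for the vector dot product is as follows:

[0262] $\theta_{loose} = \arccos\!\left(\dfrac{\vec{v}_1 \cdot \vec{v}_2}{\lVert\vec{v}_1\rVert\,\lVert\vec{v}_2\rVert}\right)$, where $\vec{v}_1$ and $\vec{v}_2$ are the anti-loosening line vectors of the bolt area and the non-bolt area;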

[0263] If the anti-loosening line is on the top of the bolt, the included angle is calculated using the projection-corrected starting point coordinates of the non-bolt area anti-loosening line vector.

[0264] Whether a bolt is loose is determined by comparing the included angle of the anti-loosening lines with a preset threshold: if the anti-loosening line is at the top of the bolt, the bolt is considered loose when the included angle $\theta_{loose}$ is greater than 15°; if the anti-loosening line is on the side of the bolt, the bolt is considered loose when $\theta_{loose}$ is greater than 10°.
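
The angle test in S46 can be sketched with the dot product as below. The 15° and 10° thresholds come from the text above; the function names are hypothetical, and the sketch assumes both vectors follow the same orientation convention, so that an undisturbed mark gives an angle near 0°.

```python
import math

def included_angle_deg(v1, v2):
    """Angle in degrees between two 2D vectors, from the dot product formula."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("zero-length vector has no direction")
    # Clamp to [-1, 1] to guard acos against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def is_loose(v_bolt, v_non_bolt, line_on_top):
    """Apply the 15 degree (top) or 10 degree (side) threshold from the text."""
    threshold = 15.0 if line_on_top else 10.0
    return included_angle_deg(v_bolt, v_non_bolt) > threshold

# Example: the two line vectors differ by 12 degrees.
v_bolt = (1.0, 0.0)
v_non_bolt = (math.cos(math.radians(12)), math.sin(math.radians(12)))
print(is_loose(v_bolt, v_non_bolt, line_on_top=True))
print(is_loose(v_bolt, v_non_bolt, line_on_top=False))
```

The same 12° deviation passes when the line is on the bolt top but is flagged when it is on the side, reflecting the stricter side threshold.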

[0265] Example 2

[0266] A system for detecting the anti-loosening marking of underbody bolts, comprising:

[0267] The image acquisition module is configured to acquire a raw image containing the bolt to be inspected;

[0268] An improved YOLOv11 network is configured to detect and locate bolt targets in the original image and output the identified bolt images.

[0269] An improved U-Net++ network is configured to segment the bolts and anti-loosening lines in the bolt image to obtain a segmentation mask image;

[0270] The vector extraction module is configured to extract the anti-loosening lines in the bolt area and the non-bolt area based on the segmentation mask image, and obtain the vectors of the anti-loosening lines in the bolt area and the non-bolt area.

[0271] The pre-comparison module compares the visible range of the bolt top and the bolt side. If the visible range of the bolt top is greater than that of the bolt side, the anti-loosening line vector in the non-bolt area is projected and corrected; otherwise, no correction is made.

[0272] The anti-loosening judgment module compares the angle between the anti-loosening line vector of the bolt area and the anti-loosening line vector of the non-bolt area. If the angle exceeds a preset threshold, it is determined that the bolt has become loose.

[0273] The method for projection correction is:

[0274] Extract the top mask area and the overall mask area of the bolt based on the segmented mask image;

[0275] The bolt tilt angle is determined based on the angle difference between the short side midline of the minimum bounding rectangle of the bolt top mask area and the short side midline of the minimum bounding rectangle of the overall bolt mask area.

[0276] The offset step size is determined based on the vector magnitude from the center of the minimum bounding rectangle of the top mask area of the bolt to the center of the minimum bounding rectangle of the overall mask area of the bolt.

[0277] The starting point of the anti-loosening line vector in the bolt area is used as the reference point. The reference point is then corrected based on the bolt tilt angle and offset step size, and used as the starting point of the corrected anti-loosening line vector in the non-bolt area.

[0278] Example 3

[0279] This embodiment provides an electronic device for implementing the above-described method for detecting anti-loosening marks on vehicle underbody bolts. The electronic device includes at least one processor, a memory, and a computer program stored in the memory and executable on the processor.

[0280] When the processor executes the computer program, it implements all the steps of the vehicle underbody bolt anti-loosening mark detection method as described in Embodiment 1. Specifically, the processor is configured to:

[0281] acquire an original image containing the bolt to be detected; perform bolt target detection and localization on the original image and output the identified bolt image; segment the bolt and anti-loosening line in the bolt image to obtain a segmentation mask image; extract the anti-loosening lines in the bolt area and the non-bolt area based on the segmentation mask image and obtain the anti-loosening line vectors of both areas; compare the visible range of the bolt top and the bolt side and, if the visible range of the bolt top is greater than that of the bolt side, perform projection correction on the anti-loosening line vector of the non-bolt area, otherwise proceed directly to the next step; and compare the angle between the anti-loosening line vector of the bolt area and the anti-loosening line vector of the non-bolt area and, if the angle exceeds a preset threshold, determine that the bolt is loose.

[0282] The electronic device can be a dedicated high-performance computing workstation, a server, or a computing terminal integrated into the detection software system. Through the electronic device of this embodiment, efficient and reliable detection of bolt anti-loosening marks can be achieved through hardware execution.

[0283] Example 4

[0284] This embodiment provides a computer-readable storage medium on which a computer program is stored. The computer-readable storage medium includes, but is not limited to, various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0285] When the computer program in the storage medium is loaded and executed by one or more processors (e.g., the processor of the electronic device described in Embodiment 3), the processor performs all or part of the steps of the undercarriage bolt anti-loosening mark detection method as described in Embodiment 1.

[0286] Through the storage medium of this embodiment, the method for detecting the anti-loosening mark of the vehicle bottom bolts can be disseminated, distributed, and used in the form of a software product, making it easy to deploy the technical solution of the present invention in different computing environments.

[0287] The preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the present invention is not limited to the specific details of the above embodiments. Within the scope of the technical concept of the present invention, various simple modifications can be made to the technical solution of the present invention, and these simple modifications all fall within the protection scope of the present invention.

Claims

1. A method for detecting anti-loosening marks on vehicle underbody bolts, characterized in that it includes the following steps: obtaining an original image containing the bolt to be inspected; detecting and locating bolt targets in the original image based on an improved YOLOv11 network, and outputting the identified bolt image; segmenting the bolt and the anti-loosening lines in the bolt image using an improved U-Net++ network to obtain a segmentation mask image; extracting the anti-loosening lines in the bolt area and the non-bolt area from the segmentation mask image, and obtaining the vectors of the anti-loosening lines in the bolt area and the non-bolt area; comparing the visible range of the top of the bolt with that of the side of the bolt, and, if the visible range of the top of the bolt is greater than that of the side of the bolt, performing projection correction on the vector of the anti-loosening line in the non-bolt area, otherwise proceeding directly to the next step; and comparing the angle between the anti-loosening line vector of the bolt area and the anti-loosening line vector of the non-bolt area, and, if the angle exceeds a preset threshold, determining that the bolt has become loose; wherein the projection correction includes: extracting the top mask area and the overall mask area of the bolt based on the segmentation mask image; determining the bolt tilt angle based on the angle difference between the short-side midline of the minimum bounding rectangle of the bolt top mask area and the short-side midline of the minimum bounding rectangle of the overall bolt mask area; determining the offset step size based on the magnitude of the vector from the center of the minimum bounding rectangle of the bolt top mask area to the center of the minimum bounding rectangle of the overall bolt mask area; using the starting point of the anti-loosening line vector in the bolt area as a reference point; and correcting the reference point based on the bolt tilt angle and the offset step size, the corrected reference point being used as the starting point of the corrected anti-loosening line vector in the non-bolt area.
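The final judgment step of claim 1 can be illustrated with a minimal sketch. It assumes the two anti-loosening line vectors are already available as 2D `(dx, dy)` tuples; the function names and the 5-degree default threshold are hypothetical, since the claim only speaks of "a preset threshold". The offset step is computed as the claim describes it (the distance between the two rectangle centers), but the correction formula itself is not reproduced here because the claim does not spell it out.

```python
import math

def angle_between_deg(v1, v2):
    """Angle in degrees (0..180) between two 2D vectors."""
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("zero-length vector")
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))

def offset_step(top_center, overall_center):
    """Magnitude of the vector from the top-mask rectangle center
    to the overall-mask rectangle center."""
    return math.dist(top_center, overall_center)

def is_loose(bolt_vec, non_bolt_vec, threshold_deg=5.0):
    """True if the angle between the two line vectors exceeds the threshold.
    threshold_deg is an illustrative value, not taken from the patent."""
    return angle_between_deg(bolt_vec, non_bolt_vec) > threshold_deg
```

For example, `is_loose((1, 0), (1, 1))` compares two vectors 45 degrees apart and reports the bolt as loose under the illustrative threshold.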

2. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 1, characterized in that the improved YOLOv11 network includes a preprocessing module, which fuses the original image with a grayscale image generated from the original image to produce a dual-modal feature map, thereby achieving input feature fusion, the original image being an RGB three-channel image.
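One plausible reading of the input fusion in claim 2 is channel concatenation: a grayscale channel is derived from the RGB image and appended to it. The sketch below assumes exactly that, using the common ITU-R BT.601 luma weights; neither the concatenation nor the weights are stated in the claim, and the function names are hypothetical.

```python
def rgb_to_gray(r, g, b):
    """Grayscale value using ITU-R BT.601 luma weights (an assumption)."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def fuse_rgb_gray(image):
    """image: H x W nested list of (r, g, b) tuples.
    Returns an H x W nested list of (r, g, b, gray) tuples,
    i.e. a four-channel input built by channel concatenation."""
    return [[(r, g, b, rgb_to_gray(r, g, b)) for (r, g, b) in row]
            for row in image]
```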

3. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 2, characterized in that the improved YOLOv11 network includes a backbone network for extracting small-target feature layers from the dual-modal feature map; the backbone network includes four levels of C3k2 modules; coordinate attention modules are embedded at the outputs of the first two levels of C3k2 modules; and the output of the first level of C3k2 modules is introduced into the P2 output layer.

4. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 3, characterized in that the improved YOLOv11 network further includes a feature fusion module, which employs an adaptive weighting method for feature fusion, the adaptive weighting method including: aligning the feature layers output by the backbone network; subjecting the aligned feature layers to global average pooling and inputting them into a fully connected layer to reduce the dimensionality to one-dimensional weight values; activating the one-dimensional weight values using an activation function to obtain the original weights; normalizing the original weights to obtain the final normalized weights used for feature fusion; and, using a multi-scale feature fusion method, multiplying each aligned feature layer element by element with its corresponding normalized weight and summing the results to output a fused feature map.
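The adaptive weighting steps of claim 4 can be sketched as follows, assuming the feature layers are already aligned to the same shape. The claim does not name the activation function or the normalization; the sketch uses a sigmoid and division by the sum of weights as stand-ins, and the fully connected layer is reduced to one hypothetical weight vector and bias per layer.

```python
import math

def global_average_pool(layer):
    """layer: list of C channels, each an H x W nested list -> C channel means."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in layer]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def adaptive_fuse(layers, fc_weights, fc_biases):
    """Fuse aligned feature layers with one learned scalar weight per layer.
    fc_weights[i] (length C) and fc_biases[i] are hypothetical FC parameters."""
    raw = []
    for layer, w, b in zip(layers, fc_weights, fc_biases):
        pooled = global_average_pool(layer)              # C values
        scalar = sum(wi * pi for wi, pi in zip(w, pooled)) + b
        raw.append(sigmoid(scalar))                      # original weight
    total = sum(raw)
    norm = [r / total for r in raw]                      # normalized weights
    channels = len(layers[0])
    height = len(layers[0][0])
    width = len(layers[0][0][0])
    fused = [[[sum(n * layer[c][y][x] for n, layer in zip(norm, layers))
               for x in range(width)]
              for y in range(height)]
             for c in range(channels)]
    return fused, norm
```

With zero FC weights and biases, every layer receives the same weight, so the fused map is the plain average of the inputs.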

5. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 1, characterized in that the improved YOLOv11 network includes a detection head module, the anchors of which are generated from a bolt annotation dataset of vehicle undercarriage images using the K-Means clustering algorithm.
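A minimal sketch of anchor generation by K-Means is given below. It clusters annotated box sizes `(width, height)` and assigns each box to the anchor with the highest IoU, which is a common practice for YOLO-style anchors; the claim only names K-Means, so the IoU-based assignment, the deterministic initialization, and the function names are assumptions. It expects `k` to be no larger than the number of boxes.

```python
def iou_wh(box, anchor):
    """IoU of two (w, h) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (w, h) pairs into k anchors, returned sorted by area."""
    # Deterministic init: pick k boxes spread evenly over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    anchors = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_anchors.append((sum(b[0] for b in cluster) / len(cluster),
                                    sum(b[1] for b in cluster) / len(cluster)))
            else:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor
        if new_anchors == anchors:
            break  # assignments have stabilized
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])
```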

6. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 5, characterized in that the detection head module is configured with independent regression and classification branches; the regression branch uses a small-target convolution kernel for target localization, and the regression loss is calculated using the EPGIoU loss function; the classification branch adds a BatchNorm layer after the convolutional layer and is configured with the GELU activation function, and the classification loss is calculated using the Focal Loss function.
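For the classification branch, the standard binary Focal Loss can be sketched for a single prediction as below. The defaults `alpha=0.25` and `gamma=2.0` are commonly used values, not values taken from this patent, and the EPGIoU regression loss is not sketched because the claim does not define it.

```python
import math

def focal_loss(p, target, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for one predicted probability p and a 0/1 target.
    alpha and gamma are illustrative defaults, not patent values."""
    p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
    p_t = p if target == 1 else 1.0 - p      # probability of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` and `alpha=1` on a positive example, the expression reduces to ordinary cross-entropy, which is a quick sanity check on the formula.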

7. The method for detecting anti-loosening marks on vehicle underbody bolts according to any one of claims 1-6, characterized in that the improved YOLOv11 network further includes a post-processing module, which uses a Gaussian weighted method to calculate confidence scores so as to filter overlapping prediction boxes, the Gaussian weighted method including: sorting the prediction boxes in descending order of their original confidence scores; calculating the IoU between the prediction box with the highest confidence score and the other prediction boxes, and filtering the prediction boxes according to a preset threshold; converting the original confidence scores of the selected prediction boxes into linear confidence scores using the linearly weighted Soft-NMS method; and converting the linear confidence scores into Gaussian confidence scores using the Gaussian weighted Soft-NMS method.
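The post-processing of claim 7 can be sketched as a Soft-NMS loop in which boxes overlapping the current best box beyond the threshold have their scores decayed first linearly and then with a Gaussian factor. Chaining the two decays in this way is one reading of the claim; `iou_threshold`, `sigma`, and `score_threshold` are illustrative values, and boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
import math

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, iou_threshold=0.5, sigma=0.5, score_threshold=0.001):
    """Return (box, score) pairs in the order they were selected."""
    remaining = sorted(zip(boxes, scores), key=lambda bs: bs[1], reverse=True)
    kept = []
    while remaining:
        best_box, best_score = remaining.pop(0)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in remaining:
            overlap = iou(best_box, box)
            if overlap > iou_threshold:
                linear = score * (1.0 - overlap)                     # linear decay
                score = linear * math.exp(-(overlap ** 2) / sigma)   # Gaussian decay
            if score >= score_threshold:
                rescored.append((box, score))
        remaining = sorted(rescored, key=lambda bs: bs[1], reverse=True)
    return kept
```

Unlike hard NMS, a heavily overlapping box is not discarded outright; its score is reduced, and it is dropped only if the score falls below `score_threshold`.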

8. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 1, characterized in that the improved U-Net++ network includes a preprocessing module for bolt images, which performs red enhancement on the bolt image to obtain a stitched image, the red enhancement including: generating an HSV three-channel image from the bolt image; filtering the red area in the HSV three-channel image using the H channel to obtain a red mask area; and stitching the bolt image together with the red mask area to generate the stitched image.
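The red enhancement of claim 8 can be sketched with Python's standard `colorsys` module. The hue tolerance and the additional saturation and value floors are assumptions (the claim only mentions filtering on the H channel; the floors are added here so that gray pixels, whose hue is reported as 0, are not mistaken for red). "Stitching" is interpreted as appending the mask as a fourth channel, which is also an assumption.

```python
import colorsys

def red_mask(image, hue_tol_deg=10.0, min_sat=0.4, min_val=0.2):
    """image: H x W nested list of (r, g, b) in 0..255 -> H x W list of 0/1.
    All three thresholds are illustrative, not patent values."""
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hue_deg = h * 360.0
            # Red wraps around the hue circle, so test both ends.
            red_hue = hue_deg <= hue_tol_deg or hue_deg >= 360.0 - hue_tol_deg
            mask_row.append(1 if red_hue and s >= min_sat and v >= min_val else 0)
        mask.append(mask_row)
    return mask

def stitch(image, mask):
    """Append the mask (scaled to 0/255) as a fourth channel."""
    return [[(r, g, b, m * 255) for (r, g, b), m in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]
```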

9. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 1, characterized in that the improved U-Net++ network includes an encoder that introduces a CBAM attention mechanism; the CBAM attention mechanism uses channel attention to focus on red features and spatial attention to locate the position of the anti-loosening line; and the convolutional layers of the encoder use dilated convolution to gradually expand the receptive field.

10. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 9, characterized in that the improved U-Net++ network includes a decoder and a fine feature fusion branch module, the fine feature fusion branch module being used to fuse the low-level edge features of the encoder with the high-level semantic features of the decoder.

11. The method for detecting anti-loosening marks on vehicle underbody bolts according to any one of claims 8-10, characterized in that the improved U-Net++ network employs a multi-loss joint optimization strategy to calculate the loss, which includes: introducing the Focal Dice Loss function to calculate the fine-structure segmentation loss; introducing the Laplacian operator to calculate the edge loss; and calculating the total loss as a weighted combination of the fine-structure segmentation loss and the edge loss.
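The edge term and the weighted total of claim 11 can be sketched as follows. The sketch assumes a 4-neighbour Laplacian kernel with zero padding, a mean absolute difference between the Laplacian responses of prediction and ground truth, and illustrative weights; none of these details are given in the claim. The fine-structure segmentation loss is passed in as a precomputed number because the claim does not define the exact form of the Focal Dice Loss.

```python
def laplacian(img):
    """4-neighbour Laplacian of an H x W nested list, zero-padded at the borders."""
    h, w = len(img), len(img[0])

    def px(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0.0

    return [[px(y - 1, x) + px(y + 1, x) + px(y, x - 1) + px(y, x + 1)
             - 4.0 * px(y, x)
             for x in range(w)]
            for y in range(h)]

def edge_loss(pred, target):
    """Mean absolute difference between the Laplacian responses (an assumption)."""
    lap_pred, lap_target = laplacian(pred), laplacian(target)
    count = len(pred) * len(pred[0])
    return sum(abs(a - b)
               for row_p, row_t in zip(lap_pred, lap_target)
               for a, b in zip(row_p, row_t)) / count

def total_loss(seg_loss, edge, w_seg=1.0, w_edge=0.5):
    """Weighted sum of the two terms; the weights are illustrative."""
    return w_seg * seg_loss + w_edge * edge
```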

12. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 1, characterized in that, if the visible area of the top of the bolt is larger than the visible area of the side of the bolt, the point in the mask area of the non-bolt-area anti-loosening line that is closest to the center of the minimum bounding rectangle of the mask area of the bolt-area anti-loosening line is selected as the termination point of the vector of the non-bolt-area anti-loosening line.

13. The method for detecting anti-loosening marks on vehicle underbody bolts according to claim 1, characterized in that whether the visible area of the bolt top is greater than the visible area of the bolt side is determined as follows: the bolt side mask area is extracted based on the segmentation mask image; the area ratios of the bolt top and the bolt side are calculated based on the bolt top mask area and the bolt side mask area; and if the area ratio of the bolt top is greater than the area ratio of the bolt side, it is determined that the visible area of the bolt top is greater than the visible area of the bolt side.
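The comparison in claim 13 can be sketched as a pixel count over two binary masks. The claim does not state the denominator of the area ratio; the sketch assumes each ratio is taken relative to the combined top and side area, and the function names are hypothetical.

```python
def mask_area(mask):
    """Number of non-zero pixels in an H x W nested list."""
    return sum(1 for row in mask for value in row if value)

def top_more_visible(top_mask, side_mask):
    """True if the top's share of the combined top + side area exceeds the side's."""
    top = mask_area(top_mask)
    side = mask_area(side_mask)
    total = top + side
    if total == 0:
        raise ValueError("both masks are empty")
    return top / total > side / total
```

When this returns `True`, the method of claim 1 applies projection correction before comparing the line vectors; otherwise the comparison proceeds directly.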

14. A system for detecting anti-loosening marks on vehicle underbody bolts, characterized in that it includes: an image acquisition module configured to acquire an original image containing the bolt to be inspected; an improved YOLOv11 network configured to detect and locate bolt targets in the original image and output the identified bolt image; an improved U-Net++ network configured to segment the bolt and the anti-loosening lines in the bolt image to obtain a segmentation mask image; a vector extraction module configured to extract the anti-loosening lines in the bolt area and the non-bolt area based on the segmentation mask image, and to obtain the vectors of the anti-loosening lines in the bolt area and the non-bolt area; a pre-comparison module configured to compare the visible range of the bolt top with that of the bolt side and, if the visible range of the bolt top is greater than that of the bolt side, to perform projection correction on the anti-loosening line vector in the non-bolt area, otherwise making no correction; and an anti-loosening judgment module configured to compare the angle between the anti-loosening line vector of the bolt area and the anti-loosening line vector of the non-bolt area and, if the angle exceeds a preset threshold, to determine that the bolt has become loose; wherein the projection correction includes: extracting the top mask area and the overall mask area of the bolt based on the segmentation mask image; determining the bolt tilt angle based on the angle difference between the short-side midline of the minimum bounding rectangle of the bolt top mask area and the short-side midline of the minimum bounding rectangle of the overall bolt mask area; determining the offset step size based on the magnitude of the vector from the center of the minimum bounding rectangle of the bolt top mask area to the center of the minimum bounding rectangle of the overall bolt mask area; using the starting point of the anti-loosening line vector in the bolt area as a reference point; and correcting the reference point based on the bolt tilt angle and the offset step size, the corrected reference point being used as the starting point of the corrected anti-loosening line vector in the non-bolt area.

15. A computer device, characterized in that it includes a memory and a processor, wherein the memory stores computer instructions, and the processor executes the computer instructions to perform the method described in any one of claims 1-13.

16. A computer-readable storage medium, characterized in that the storage medium stores computer instructions that, when executed by a computer, implement the method described in any one of claims 1-13.