Bolt loosening defect target detection and control method based on image recognition

By constructing a bolt target detection dataset and building a YOLO series target detection network, the problems of low efficiency and insufficient accuracy in the detection of bolt defects in steel bridges have been solved. This has enabled high-precision, lightweight, and real-time detection of bolt detachment defects, improving detection efficiency and safety.

CN122244566A · Pending · Publication Date: 2026-06-19 · CCCC THIRD HARBOR ENGINEERING CO LTD +1


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: CCCC THIRD HARBOR ENGINEERING CO LTD
Filing Date: 2026-05-20
Publication Date: 2026-06-19


IPC: G06V10/764; G06V10/774; G06V10/80; G06V10/82; G06V10/52; G06V10/40; G06N3/096; G06N3/09

AI Tagging

Application Domain

Character and pattern recognition; Biological models


Abstract

A bolt loosening defect target detection and control method based on image recognition, belonging to the field of target detection and control technology, includes: Step 1: Constructing a bolt target detection dataset for steel bridge bolts; Step 2: Building a target detection network based on the bolt target detection dataset; Step 3: Training and optimizing the model based on the PyTorch deep learning framework; Step 4: Detecting bolt loosening defects based on the optimal model. This invention solves the drawbacks of traditional manual inspection, realizing intelligent and automated detection of bolt loosening defects in steel bridges, improving detection efficiency and accuracy, and ensuring the operational safety of steel bridge structures.


Description

Technical Field

[0001] This invention belongs to the field of target detection and control technology, specifically relating to a method for detecting and controlling bolt loosening defects based on image recognition.

Background Technology

[0002] Steel bridges are increasingly widely used in China's transportation infrastructure due to their advantages such as high strength, large span, and convenient construction. High-strength bolts at the connection points of steel bridges are core components ensuring the integrity and stability of the structure. However, during bridge operation, these bolts are subjected to high-frequency vibrations and impacts from vehicle loads over long periods, making them highly susceptible to loosening or even detachment. Bolt detachment directly degrades the mechanical properties of the steel bridge connection nodes, and in severe cases can cause major safety accidents such as structural instability and collapse.

[0003] Currently, the detection of bolt defects in steel bridges mainly relies on manual inspection, where inspectors check the condition of bolts by visual observation or using simple tools. This method has many limitations:

[0004] Low inspection efficiency: Large steel bridges have tens of thousands or even hundreds of thousands of bolts. Manual inspection requires substantial manpower and time, which makes it difficult to meet the needs of routine bridge maintenance;

[0005] Low safety factor: Some bolts are located in dangerous areas such as at height or near the edge of the bridge, and manual inspection can easily lead to safety accidents such as falls from height;

[0006] High rate of missed detection and misjudgment: Manual inspection is strongly affected by personnel experience, physical stamina and ambient lighting. Inspectors easily miss concealed locations and early signs of detachment, and easily misjudge bolt condition.

[0007] With the development of computer vision and deep learning technologies, image recognition-based target detection technology has been initially applied in the field of structural defect detection in engineering projects. However, existing detection methods primarily target general objects or other structural defects (such as cracks and spalling), and lack a dedicated detection scheme for bolt detachment in steel bridges. Furthermore, some models suffer from insufficient detection accuracy, poor real-time performance, and excessive model size, making them unsuitable for the complex conditions of on-site bridge inspections. Therefore, there is an urgent need to develop a high-precision, lightweight, real-time bolt detachment defect detection method to improve the detection efficiency and reliability of steel bridge bolt defects.

Summary of the Invention

[0008] The purpose of this invention is to provide a target detection and control method for bolt loosening defects based on image recognition, which solves the drawbacks of traditional manual inspection, realizes intelligent and automated detection of bolt loosening defects in steel bridges, improves detection efficiency and accuracy, and ensures the operational safety of steel bridge structures.

[0009] The present invention employs the following technical solution.

[0010] A method for detecting and controlling bolt loosening defects based on image recognition, comprising:

[0011] Step 1: Construct a bolt target detection dataset for steel bridge bolts;

[0012] Step 2: Construct a target detection network for the bolt target detection dataset;

[0013] Step 3: Train and optimize the model using the PyTorch deep learning framework;

[0014] Step 4: Detect bolt loosening defects based on the optimal model.

[0015] Preferably, step 1 specifically includes:

[0016] Step 1-1: Collect bolt images through three channels to construct the original dataset:

[0017] Step 1-2: To improve the model's generalization ability, perform augmentation operations on the original dataset to form the base dataset;

[0018] Step 1-3: Use the open-source annotation tool LabelImg to annotate the images of the augmented base dataset, marking two types of targets: bolt presence and bolt detachment. Generate XML label files in VOC format as the annotated dataset. Randomly divide the annotated dataset into training and validation sets in an 8:2 ratio. The training and validation sets constitute the bolt target detection dataset.
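The random 8:2 division in step 1-3 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the file names, the `seed` parameter, and the function name `split_dataset` are assumptions introduced here, and the count of 5000 annotated images is the figure the detailed description gives for the base dataset.

```python
import random

def split_dataset(image_names, train_ratio=0.8, seed=0):
    """Shuffle annotated image names and split them into train/val lists."""
    names = sorted(image_names)      # sort first so the shuffle is reproducible
    rng = random.Random(seed)        # local RNG; the seed is an assumed parameter
    rng.shuffle(names)
    n_train = int(round(len(names) * train_ratio))
    return names[:n_train], names[n_train:]

# Hypothetical file names standing in for the 5000 annotated images.
all_images = [f"bolt_{i:04d}.jpg" for i in range(5000)]
train_set, val_set = split_dataset(all_images)
print(len(train_set), len(val_set))
```

Splitting by file name before any training starts keeps the validation images fixed across all eight sub-models, so their metrics stay comparable.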

[0019] Preferably, step 1-1 specifically includes:

[0020] Collect 500 bolt images of actual steel bridge connection nodes;

[0021] Create a scaled-down model of the steel bridge node plate and take 1000 images of the bolts;

[0022] Crawl 500 bolt images from a professional engineering image library to supplement the variety of bolt types and scenarios.

[0023] Preferably, step 1-2 specifically includes:

[0024] Randomly crop bolt regions from images in the original dataset and add perturbation to the target location;

[0025] The images of the original dataset are translated in the plane, and the missing areas are filled with black pixels;

[0026] The brightness of image pixels in the original dataset is increased or decreased proportionally to simulate different lighting environments.

[0027] Inject Gaussian noise or salt-and-pepper noise into the images of the original dataset to simulate interference in on-site captured images;

[0028] The images in the original dataset are randomly rotated between 0 and 45° to adapt to bolt images taken from different angles;

[0029] Perform horizontal or vertical mirroring on the images in the original dataset to increase sample diversity.

[0030] Preferably, step 2 specifically includes:

[0031] A YOLO series of object detection networks was constructed for the bolt object detection dataset. The YOLO series of object detection networks includes eight sub-models, namely the YOLOv5 series and the YOLOv8 series.

[0032] Preferably, step 3 specifically includes:

[0033] Step 3-1: Configure the training environment;

[0034] Step 3-2: Perform transfer learning initialization to train the sub-model;

[0035] Step 3-3: Conduct model performance evaluation and selection.

[0036] Preferably, step 3-1 specifically includes:

[0037] A training environment covering the eight sub-models of the YOLOv5 series and the YOLOv8 series is built on the PyTorch deep learning framework. The hardware configuration is an Intel Core i5-13600KF CPU and an NVIDIA 4060Ti GPU. The software configuration is Python 3.8, CUDA 11.7, and the PyCharm 2023 IDE.

[0038] Preferably, step 3-2 specifically includes:

[0039] The pre-trained parameters of YOLO series models on the bolt target detection dataset were transferred to the bolt detection task. The parameters of the feature extraction layer of the backbone network were frozen, and only the classification and regression layers were trained.

[0040] Preferably, step 3-3 specifically includes:

[0041] The training results of the eight sub-models are evaluated from multiple dimensions using the following core metrics:

[0042] The intersection over union (IoU) is calculated as follows:

[0043] $$\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}$$;

[0044] where $|A \cap B|$ is the area of the intersection of detection box $A$ and ground-truth box $B$, and $|A \cup B|$ is the area of their union;

[0045] The precision $P$ is calculated as follows:

[0046] $$P = \frac{TP}{TP + FP}$$;

[0047] where $TP$ is the number of bolt targets correctly detected by the sub-model, and $FP$ is the number of non-bolt targets falsely detected by the sub-model;

[0048] The recall $R$ measures the proportion of true positive samples that are correctly detected, and is calculated as follows: $$R = \frac{TP}{TP + FN}$$;

[0049] where $FN$ is the number of bolt targets that were missed during detection;

[0050] The mean average precision (mAP) includes $\mathrm{mAP}@0.5$ and $\mathrm{mAP}@0.5{:}0.95$;

[0051] The frames per second (FPS) of the sub-model;

[0052] The comprehensive index of each sub-model is calculated from the core metrics as a weighted sum with weights 0.25, 0.2, 0.15, 0.12, 0.1 and 0.1 (the metric symbol attached to each weight is not preserved in the source text). The sub-model with the highest comprehensive index is selected as the optimal model.
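The IoU, precision and recall metrics above can be illustrated with a short sketch. It assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates, which the source does not specify. The function names are chosen here for illustration. The comprehensive index is left out because the source does not preserve which weight belongs to which metric.

```python
def box_area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); degenerate boxes give 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def iou(box_a, box_b):
    """Intersection over union of a detection box and a ground-truth box."""
    inter_box = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    inter = box_area(inter_box)
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0

def precision(tp, fp):
    """Share of detections that are real bolt targets."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(tp, fn):
    """Share of real bolt targets that were detected."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1)
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(precision(tp=99, fp=1), recall(tp=99, fn=1))
```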

[0053] Preferably, step 4 specifically includes:

[0054] The image of the steel bridge bolt to be detected is input into the trained optimal model. The optimal model extracts the image features through its backbone network. After multi-scale feature fusion in its Neck layer, the decoupled head outputs the category, location box and confidence of each bolt target. Finally, the detection result is output with visual annotation.

[0055] Step 4 specifically includes:

[0056] The image of the steel bridge bolt to be detected is input into the trained optimal model. The optimal model extracts the image features through its backbone network. After multi-scale feature fusion in its Neck layer, the decoupled head outputs the category (presence / detachment), location box and confidence of each bolt target. Finally, the detection result is output with visual annotation (marking the location and confidence of the detached bolt).

[0057] In step 4, the method by which the decoupled head outputs the category, location box, and confidence of the bolt target after multi-scale feature fusion in the Neck layer specifically includes:

[0058] After multi-scale feature fusion in the Neck layer, a four-step optimization process is added: feature consistency verification, dynamic confidence correction, dual threshold filtering, and anomaly suppression. The final output includes the category, bounding box, and corrected confidence of the bolt target, as detailed below:

[0059] Step 4-1: Extract the fused feature maps of the Neck layer at three scales, and obtain the class prediction, bounding box coordinates and initial confidence at each scale through the decoupled head branches;

[0060] Step 4-2: Calculate the multi-scale feature consistency coefficient and correct the initial confidence level;

[0061] In step 4-2, the multi-scale feature consistency coefficient is calculated as follows:

[0062] [formula not preserved in the source text];

[0063] where the formula involves: the multi-scale feature consistency coefficient; the scale index; the location box output by the decoupled head at each scale; the average coordinate box of the three scale location boxes; the intersection over union of each scale's location box with the average coordinate box; and the clarity score of the fused feature map at each scale;

[0064] In step 4-2, the corrected confidence is calculated as follows:

[0065] [formula not preserved in the source text];

[0066] where the formula involves: the corrected final confidence; the initial confidence output by the decoupled head; the position accuracy gain coefficient; the location box output after fusion; the average reference box of same-class targets in the training set; and the intersection over union of the location box with the reference box;

[0067] Step 4-3: For the two types of targets, bolt presence and bolt detachment, adaptively generate dynamic screening thresholds to filter out low-reliability results;

[0068] In step 4-3, the adaptively generated dynamic screening threshold is calculated as follows:

[0069] [formula not preserved in the source text];

[0070] where the formula involves, for each category: the dynamic screening threshold; the category weight coefficient; the average confidence of that category's targets in the training set; and the standard deviation of the confidence of that category's targets in the training set;

[0071] Step 4-4: Introduce the anomaly feature recognition module;

[0072] In step 4-4, the method of introducing the anomaly feature recognition module specifically includes:

[0073] First, calculate the abnormal feature suppression coefficient as follows:

[0074] [formula not preserved in the source text];

[0075] where the formula involves: the abnormal feature suppression coefficient; the suppression intensity coefficient; and the noise and occlusion assessment value;

[0076] The decision rule is then executed, as follows:

[0077] If the corrected confidence is greater than or equal to the dynamic screening threshold of its category and the abnormal feature suppression coefficient is greater than or equal to 0.3, the category, location box and corrected confidence of the target are output;

[0078] If the corrected confidence is less than the dynamic screening threshold or the abnormal feature suppression coefficient is less than 0.3, the target is considered an invalid target and is not output.
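The final decision rule can be sketched as a simple filter. The source text does not preserve the symbols in the rule, so this sketch rests on an assumption: that the two conditions compare the corrected confidence with the per-category dynamic screening threshold, and the abnormal feature suppression coefficient with the fixed bound 0.3. The class names, threshold values and the `Detection` structure are hypothetical. The formulas that produce the corrected confidence, the thresholds and the suppression coefficient are not reproduced here and are treated as given inputs.

```python
from dataclasses import dataclass

SUPPRESSION_MIN = 0.3  # fixed bound quoted in the decision rule

@dataclass
class Detection:
    category: str       # e.g. "bolt_present" or "bolt_detached" (hypothetical labels)
    box: tuple          # (x1, y1, x2, y2) location box
    confidence: float   # corrected confidence, assumed already computed
    suppression: float  # abnormal feature suppression coefficient, assumed given

def filter_detections(detections, thresholds):
    """Keep detections that pass both the per-category threshold and the 0.3 bound."""
    kept = []
    for det in detections:
        passes_threshold = det.confidence >= thresholds[det.category]
        passes_suppression = det.suppression >= SUPPRESSION_MIN
        if passes_threshold and passes_suppression:
            kept.append(det)
    return kept

# Hypothetical per-category dynamic thresholds.
thresholds = {"bolt_present": 0.50, "bolt_detached": 0.40}
candidates = [
    Detection("bolt_present", (10, 10, 50, 50), 0.92, 0.80),   # kept
    Detection("bolt_detached", (60, 10, 90, 40), 0.35, 0.90),  # below threshold
    Detection("bolt_detached", (60, 60, 95, 95), 0.75, 0.10),  # suppressed as anomalous
]
valid = filter_detections(candidates, thresholds)
print([d.category for d in valid])
```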

[0079] The beneficial effects of the present invention are as follows: Compared with the prior art, the technical effects of the present invention include:

[0080] High detection accuracy: This invention uses YOLOv8n as the core model. For bolt loosening defects, its mAP@0.5:0.95 reaches 97.8%, and both precision and recall exceed 99%, significantly reducing the false negative and false positive rates. The detection accuracy is far superior to traditional manual inspection and conventional YOLOv5 models.

[0081] High real-time performance: The detection time for a single image of the model is less than 1ms, and the FPS meets the batch detection needs of engineering sites. It can quickly complete the bolt defect investigation of large steel bridges, and the detection efficiency is more than 100 times higher than that of manual inspection.

[0082] Lightweight model: The YOLOv8n model has only 3.2M parameters, which is much smaller than models such as YOLOv5l (43.7M) and YOLOv8l (43.7M). It can be deployed on mobile devices, drones and other portable devices, and is suitable for the complex inspection environment of bridge sites.

[0083] Good generalization ability: Through multi-channel image acquisition and multi-dimensional data enhancement, the model can adapt to complex working conditions with different lighting, shooting angles and bolt arrangements, and shows good stability and reliability in actual bridge inspection.

[0084] This invention enables intelligent detection of bolt loosening defects in steel bridges, providing bridge maintenance units with an efficient and accurate detection tool. It can effectively reduce maintenance costs and improve the level of structural safety, and has significant engineering practical value and promotion prospects.

Attached Figure Description

[0085] Figure 1 is a flowchart of the bolt loosening defect target detection and control method based on image recognition described in this invention.

Detailed Implementation

[0086] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below in conjunction with the accompanying drawings of the embodiments of this invention. The embodiments described in this application are merely some embodiments of this invention, and not all embodiments. Based on the spirit of this invention, any other embodiments obtained by those skilled in the art without creative effort are within the protection scope of this invention.

[0087] As shown in Figure 1, the bolt loosening defect target detection and control method based on image recognition of the present invention includes:

[0088] Step 1: Construct a bolt target detection dataset for steel bridge bolts;

[0089] In a preferred but non-limiting embodiment of the present invention, step 1 specifically includes:

[0090] Step 1-1: Collect bolt images through three channels to construct the original dataset:

[0091] In a preferred but non-limiting embodiment of the present invention, step 1-1 specifically includes:

[0092] Real bridge steel plate photography: 500 bolt images of actual steel bridge connection nodes were collected, with high image resolution and good imaging conditions;

[0093] Experimental node plate photography: A scaled-down model of the steel bridge node plate was made, and 1,000 bolt images were taken. The bolts were arranged in a regular and highly controllable manner.

[0094] In step 1-1, the method for creating a scaled-down model of the steel bridge node plate and taking 1000 bolt images specifically includes:

[0095] I. Method for Constructing a Scaled-Down Model of a Steel Bridge Node Plate

[0096] 1.1 Model Design Basis

[0097] Using the node plates of key connection parts such as the web and flanges of an actual steel bridge (such as a large-span steel truss bridge or a steel box girder bridge) as prototypes, the model is designed at a scale of 1:5 to 1:10 (the scale selection must take into account both the feasibility of model fabrication and the identifiability of bolt details). The design must ensure that the core structure of the model is consistent with the prototype, including the size and proportion of the node plates, bolt hole diameter and spacing, bolt arrangement, etc., to ensure that the stress state and appearance of the bolts in the model are similar to those of the actual bridge.

[0098] 1.2 Material Selection

[0099] Node plate substrate: Use 2-5mm thick thin steel plate or aluminum alloy plate to ensure the flatness of the plate and facilitate drilling and bolt installation; acrylic plate can also be used, which has good light transmittance and can be adapted to image acquisition under different lighting conditions.

[0100] Bolt test pieces: Select M3~M6 nylon or stainless steel miniature bolts (matching the scale ratio) with large hexagonal heads (consistent with the shape of the high-strength bolts on the actual bridge). Prepare 1500~2000 pieces (including spares), together with nuts and washers of the corresponding specifications.

[0101] 1.3 Model Manufacturing Process

[0102] Drawing preparation: Based on the actual bridge node plate drawings, draw the model node plate processing drawings according to the scale, specifying the length and width dimensions of the node plate, the position coordinates of the bolt holes, the hole diameter and the hole spacing, and marking the number of rows and columns of the bolts.

[0103] Plate cutting: Laser cutting or wire cutting equipment is used to cut the node plate substrate according to the processing drawing, ensuring that the plate edges are free of burrs and the dimensional error is ≤ ±0.1mm.

[0104] Bolt hole drilling: Use a bench drill to precisely drill bolt holes on the node plate. After drilling, chamfer the hole opening to prevent the bolt from getting stuck during installation, and ensure that the hole position deviation is ≤ ±0.05mm.

[0105] Bolt installation: According to the designed layout, the bolts, washers and nuts are installed in sequence in the bolt holes of the node plate, divided into three states: fully tightened (simulating the normal working state), semi-loose (simulating an early defect state), and completely detached (simulating a severe defect state). The number of bolts in each state is allocated as needed (e.g., 70% of the bolts are tightened, 20% are loose, and 10% are detached).

[0106] Model fixation: Fix the assembled node board model to a wooden or metal support to ensure the overall stability of the model and prevent displacement during image acquisition.

[0107] II. Method for capturing 1000 bolt images

[0108] 2.1 Shooting Preparation

[0109] Shooting equipment: Use a smartphone with 12 megapixels or higher (such as iPhone 12 and above) or an entry-level SLR camera, along with a tripod (to ensure shooting stability), a fill light (to adjust the lighting), and a laser level (to calibrate the shooting angle). The equipment parameters need to be adjusted in advance to ensure image clarity.

[0110] Shooting environment: Choose an indoor laboratory environment to avoid interference from outdoor wind, rain, dust and other factors; set up a simple shooting table, fix the node board model on the table, and set up the shooting equipment on a tripod, maintaining an adjustable distance from the model.

[0111] Scene presets: To ensure image diversity, three basic scene presets are provided:

[0112] Standard scenario: uniform lighting, shooting angle perpendicular to the node plate surface, and bolts unobstructed;

[0113] Complex lighting scenes: Set different lighting conditions such as strong light, weak light, and side backlight;

[0114] Multi-angle scenarios: Set different shooting tilt angles within the range of 0°~45° to simulate different detection perspectives on site.

[0115] 2.2 Shooting Parameter Settings

[0116] Resolution: Set the image resolution to 3024×3024 pixels (to ensure bolt details are recognizable), and the format to JPG;

[0117] Focal length: Fixed focal length of 26mm (equivalent focal length) to avoid image distortion caused by zooming;

[0118] Exposure: Manually adjust the exposure parameters to ensure appropriate contrast between the bolts and the node plate, avoiding overexposure or underexposure;

[0119] White Balance: Set to "Auto White Balance" to adapt to color reproduction under different lighting conditions.

[0120] 2.3 Batch Image Acquisition

[0121] To achieve efficient acquisition of 1000 images, shooting will be conducted in the following batches and under the following conditions to ensure image diversity and coverage:

[0122] Batch 1: Standard Operating Conditions (300 images)

[0123] Shooting distance: 0.8~1.0m (fixed), shooting angle: 0° (perpendicular to the node plate);

[0124] Lighting conditions: Uniform white light (supplementary light illuminating from the front, brightness 70~150 lux);

[0125] Bolt status: Sequentially photograph bolts in different areas of the node plate, including three states: tight, loose, and detached. Each image contains 6 to 12 bolts to ensure that bolts in different states are evenly distributed in the image.

[0126] Batch 2: Variable Distance Working Conditions (200 images)

[0127] Shooting angle: 0° (fixed); Lighting conditions: uniform white light;

[0128] Shooting distance: Four settings were set at 0.6m, 0.8m, 1.0m, and 1.2m respectively, with 50 images taken at each distance. The focus was on capturing the changes in the clarity of the bolts at different distances to ensure no distortion at close range and that they could be identified at long range.

[0129] Batch 3: Variable lighting conditions (200 images)

[0130] Shooting distance: 0.8m (fixed), shooting angle: 0°;

[0131] Lighting conditions: Four levels were set: low light (0~30 lux), normal light (70~150 lux), strong light (200~350 lux), and side backlight (the light source is at a 45° angle to the shooting direction). 50 photos were taken at each lighting level, focusing on capturing the outline and detail of the bolts under different lighting conditions.

[0132] Batch 4: Variable Angle Working Conditions (200 images)

[0133] Shooting distance: 0.8m (fixed); Lighting conditions: uniform white light;

[0134] Shooting angles: Four tilt angles (horizontal rotation) were set at 0°, 15°, 30° and 45° respectively. 50 images were taken at each angle to simulate the oblique inspection perspective on site, focusing on capturing perspective distortion images of bolts caused by angle changes.

[0135] Batch 5: Mixed Operating Conditions (100 images)

[0136] Randomly combine shooting distance (0.6~1.2m), lighting (dark light to strong light), and angle (0°~45°), and add slight occlusion (such as thin plastic sheet covering the edge of bolt) to some images to simulate complex interference scenarios on site and improve the generalization of the dataset.
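The five batches above can be tallied with a few lines of code to confirm that the plan yields the 1000 scaled-model images and, together with the 500 real-bridge photos and 500 crawled images, the 2000-image original dataset. The dictionary layout and names are illustrative assumptions; the counts come from the text.

```python
# Shooting plan for the scaled-down model, taken from the five batches above.
# Each entry maps a batch name to (number of settings, images per setting).
SHOT_PLAN = {
    "standard": (1, 300),
    "variable_distance": (4, 50),   # 0.6 m, 0.8 m, 1.0 m, 1.2 m
    "variable_lighting": (4, 50),   # low, normal, strong, side backlight
    "variable_angle": (4, 50),      # 0, 15, 30, 45 degrees
    "mixed": (1, 100),
}

def images_per_batch(plan):
    """Total images in each batch: settings x images per setting."""
    return {name: settings * per_setting
            for name, (settings, per_setting) in plan.items()}

batch_totals = images_per_batch(SHOT_PLAN)
model_total = sum(batch_totals.values())

# Original dataset = real-bridge photos + scaled-model photos + crawled images.
original_total = 500 + model_total + 500
print(batch_totals, model_total, original_total)
```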

[0137] 2.4 Image Filtering and Processing

[0138] Screening: After shooting, review each image and remove invalid images that are blurry, overexposed, or misaligned to ensure that the number of valid images is ≥1000;

[0139] Naming and Classification: Images are named according to the rule of working condition type-serial number-bolt status (e.g., "Standard Working Condition-001-Tightening"), and folders are created according to working condition type for easy storage, which facilitates subsequent dataset annotation and model training.

[0140] Web scraping: 500 bolt images were scraped from a professional engineering image library to supplement the variety of bolt types and scenarios.

[0141] The original dataset was augmented to 5000 images to form the base dataset, as shown below:

[0142] Step 1-2: To improve the model's generalization ability, perform the following augmentation operations on the original dataset to form a base dataset (used in combination):

[0143] In a preferred but non-limiting embodiment of the present invention, step 1-2 specifically includes:

[0144] Random cropping: Randomly crop the bolt region from the image of the original dataset to add perturbation to the target location;

[0145] Random translation: The image of the original dataset is translated in a plane, and the missing areas are filled with black pixels;

[0146] Brightness adjustment: Proportionally increase or decrease the brightness of image pixels in the original dataset to simulate different lighting environments;

[0147] Noise addition: Gaussian noise or salt-and-pepper noise is injected into the images of the original dataset to simulate interference in the images taken on-site;

[0148] Angle rotation: The images in the original dataset are randomly rotated from 0 to 45° to adapt to bolt images taken from different angles;

[0149] Mirroring: Performs horizontal or vertical mirroring on images in the original dataset to increase sample diversity.

[0150] Each of the 2000 images in the original dataset is processed individually, and each image is randomly combined with 2-3 of the above enhancement methods to generate 3000 basic enhanced images. These images are then merged with the images in the original dataset to form a total of 5000 basic dataset images.
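A minimal NumPy sketch of the "randomly combine 2-3 augmentation methods per image" step is given below. It covers only four of the six operations (brightness, Gaussian noise, translation with black fill, horizontal mirroring); cropping and rotation are omitted for brevity. All parameter ranges (brightness factor 0.6 to 1.4, noise sigma 10, shifts up to 20 pixels) and function names are assumptions for illustration, since the source does not state them.

```python
import numpy as np

def adjust_brightness(img, factor):
    """Scale pixel values proportionally and clip to the valid 8-bit range."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def translate(img, dx, dy):
    """Shift the image by (dx, dy) pixels; uncovered areas are filled with black."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    src = img[max(0, -dy):max(0, min(h, h - dy)),
              max(0, -dx):max(0, min(w, w - dx))]
    y0, x0 = max(0, dy), max(0, dx)
    out[y0:y0 + src.shape[0], x0:x0 + src.shape[1]] = src
    return out

def mirror_horizontal(img):
    """Flip the image left to right."""
    return img[:, ::-1].copy()

def augment(img, rng):
    """Apply 2 or 3 randomly chosen operations, as described in the text."""
    ops = [
        lambda im: adjust_brightness(im, rng.uniform(0.6, 1.4)),
        lambda im: add_gaussian_noise(im, 10.0, rng),
        lambda im: translate(im, int(rng.integers(-20, 21)), int(rng.integers(-20, 21))),
        mirror_horizontal,
    ]
    k = int(rng.integers(2, 4))                       # 2 or 3 operations
    for idx in rng.choice(len(ops), size=k, replace=False):
        img = ops[idx](img)
    return img

rng = np.random.default_rng(0)
image = np.full((64, 64, 3), 128, dtype=np.uint8)     # stand-in for a bolt photo
augmented = augment(image, rng)
print(augmented.shape, augmented.dtype)
```

Because annotation with LabelImg happens after augmentation in this method, the sketch does not need to transform bounding boxes alongside the pixels.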

[0151] Step 1-3: Use the open-source annotation tool LabelImg to annotate the images of the augmented base dataset, marking two types of targets: bolt presence and bolt detachment. Generate XML label files in VOC format (containing image path, size, and target bounding box coordinates) as the annotated dataset. Randomly divide the annotated dataset into a training set (4000 images) and a validation set (1000 images) in an 8:2 ratio for model training and performance verification. The training set and the validation set constitute the bolt target detection dataset.
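For illustration, a VOC-format label file of the kind LabelImg writes can be read with the Python standard library and converted to the normalized center/width/height form that YOLO training code commonly expects. The sample XML, the class names `bolt_present` and `bolt_detached`, and the conversion step are assumptions introduced here; the source only states that VOC XML files are generated.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation with one object of each class.
SAMPLE_XML = """<annotation>
  <filename>bolt_0001.jpg</filename>
  <size><width>3024</width><height>3024</height><depth>3</depth></size>
  <object>
    <name>bolt_present</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>220</xmax><ymax>240</ymax></bndbox>
  </object>
  <object>
    <name>bolt_detached</name>
    <bndbox><xmin>400</xmin><ymin>120</ymin><xmax>520</xmax><ymax>240</ymax></bndbox>
  </object>
</annotation>"""

def parse_voc(xml_text):
    """Return (filename, (width, height), [(class_name, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    objects = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        box = tuple(int(bb.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), box))
    return root.findtext("filename"), (width, height), objects

def voc_to_yolo(box, image_size):
    """Convert corner coordinates to normalized (cx, cy, w, h)."""
    xmin, ymin, xmax, ymax = box
    width, height = image_size
    return ((xmin + xmax) / 2 / width, (ymin + ymax) / 2 / height,
            (xmax - xmin) / width, (ymax - ymin) / height)

filename, image_size, objects = parse_voc(SAMPLE_XML)
yolo_boxes = [(name, voc_to_yolo(box, image_size)) for name, box in objects]
print(filename, image_size, yolo_boxes)
```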

[0152] Step 2: Construct a target detection network for the bolt target detection dataset;

[0153] In a preferred but non-limiting embodiment of the present invention, step 2 specifically includes:

[0154] A YOLO series of object detection networks is constructed for the bolt target detection dataset. The YOLO series of object detection networks includes eight sub-models: the YOLOv5 series (YOLOv5s/m/l/x) and the YOLOv8 series (YOLOv8n/s/m/l). The core structure of each model is as follows:

[0155] YOLOv5 series: It adopts CSPDarknet as the backbone network, combines FPN+PAN structure to realize multi-scale feature fusion, and uses anchor box mechanism and non-maximum suppression (NMS) to optimize detection results at the output end;

[0156] YOLOv8 series: Based on CSPDarknet, it replaces the traditional C3 module with a lightweight C2f module, introduces a decoupled head to implement category and bounding box regression respectively, and adopts an anchor-free detection approach to improve detection flexibility and accuracy.

[0157] Step 3: Train and optimize the model using the PyTorch deep learning framework;

[0158] In a preferred but non-limiting embodiment of the present invention, step 3 specifically includes:

[0159] Step 3-1: Configure the training environment;

[0160] In a preferred but non-limiting embodiment of the present invention, step 3-1 specifically includes:

[0161] A training environment covering the eight sub-models of the YOLOv5 series (YOLOv5s/m/l/x) and the YOLOv8 series (YOLOv8n/s/m/l) is built on the PyTorch deep learning framework. The hardware configuration is an Intel Core i5-13600KF CPU and an NVIDIA 4060Ti GPU. The software configuration is Python 3.8, CUDA 11.7, and the PyCharm 2023 IDE.

[0162] The method for building the training environment based on the PyTorch deep learning framework, covering the eight sub-models of the YOLOv5 series (YOLOv5s/m/l/x) and the YOLOv8 series (YOLOv8n/s/m/l), includes:

[0163] Create a separate conda environment (critical, to avoid polluting the system environment).

[0164] Open Anaconda Prompt (Windows) / Terminal (Linux), and create an environment named bolt_det (the name can be customized). The specific code is shown below:

[0165] conda create -n bolt_det python=3.8 # Select Python version 3.8 (compatible with PyTorch 1.12.1+ and YOLOv5/v8)

[0166] Activate the environment. The specific code is as follows:

[0167] conda activate bolt_det # Same command on Windows and Linux

[0168] Install CUDA + cuDNN (GPU-accelerated core):

[0169] PyTorch's GPU acceleration relies on CUDA and cuDNN. It is recommended to install them directly via conda (which automatically matches the versions and avoids manually configuring the path). The specific code is shown below:

[0170] # Install CUDA 11.7 + cuDNN (compatible with YOLOv8n and mainstream NVIDIA graphics cards)

[0171] conda install pytorch-cuda=11.7 -c nvidia

[0172] If you choose to install CUDA manually (not recommended): you need to download the corresponding version from the NVIDIA website (https://developer.nvidia.com/cuda-toolkit-archive) and configure the system environment variables (Windows: add CUDA_PATH; Linux: configure LD_LIBRARY_PATH).

[0173] Install PyTorch + TorchVision (core framework):

[0174] Choose the appropriate PyTorch installation command based on your CUDA version (preferably the command recommended on the official website, https://pytorch.org/get-started/locally/):

[0175] Scenario 1: GPU version (recommended, fast training speed), the specific code is shown below:

[0176] # PyTorch 1.12.1 + TorchVision 0.13.1 (Compatible with CUDA 11.7, YOLOv5 / v8)

[0177] pip install torch==1.12.1+cu117 torchvision==0.13.1+cu117 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu117

[0178] Scenario 2: CPU version (for backup when there is no NVIDIA graphics card), the specific code is as follows:

[0179] pip install torch==1.12.1+cpu torchvision==0.13.1+cpu torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cpu

[0180] Verify that PyTorch is installed successfully:

[0181] Activate the bolt_det environment and run Python code to verify, as shown below:

[0182] import torch

[0183] import torchvision

[0184] # Verify PyTorch version

[0185] print(f"PyTorch version: {torch.__version__}")

[0186] # Verify CUDA availability (GPU version must output True)

[0187] print(f"Is CUDA available: {torch.cuda.is_available()}")

[0188] # Verify the number of GPU devices (output ≥1 if a GPU is present)

[0189] print(f"GPU count: {torch.cuda.device_count()}")

[0190] # Verify TorchVision version

[0191] print(f"TorchVision version: {torchvision.__version__}")

[0192] If the output shows "True" indicating CUDA availability, it means the GPU version was installed successfully.

[0193] If the output is False, check if the NVIDIA driver / CUDA version matches, or reinstall PyTorch.

[0194] Install YOLO model dependencies (for bolt detection tasks):

[0195] After setting up the PyTorch environment, you need to install the runtime dependencies for YOLOv5 / v8, as well as dataset processing tools. The specific code is shown below:

[0196] # 1. Install the official YOLO repository (ultralytics compatible with YOLOv5 / v8)

[0197] pip install ultralytics==8.0.200 # Fixed version to avoid compatibility issues

[0198] # 2. Install dataset processing / image enhancement dependencies

[0199] pip install opencv-python==4.8.0.76 # Image processing

[0200] pip install albumentations==1.3.1 # Data augmentation

[0201] pip install labelImg==1.8.6 # Labeling tool (optional)

[0202] pip install matplotlib==3.7.2 # Visualization

[0203] pip install pandas==2.0.3 # Data Management

[0204] pip install scikit-learn==1.2.2 # Dataset splitting

[0205] Verify the compatibility of the YOLO model with the PyTorch environment:

[0206] Run the following code in the bolt_det environment to verify whether the YOLOv8n sub-model can be loaded normally (based on PyTorch). The specific code is shown below:

[0207] from ultralytics import YOLO

[0208] import torch

[0209] # 1. Load the YOLOv8n model (built on PyTorch)

[0210] model = YOLO('yolov8n.pt') # Automatically download pre-trained weights (PyTorch format .pt)

[0211] # 2. Check the model device (it should display cuda:0, indicating that it is loaded onto the GPU).

[0212] print(f"Model deployment device: {next(model.model.parameters()).device}")

[0213] # 3. Test reasoning (using arbitrary bolt images)

[0214] img_path = "bolt_test.jpg" # Replace with the path to your bolt test image

[0215] results = model(img_path)

[0216] # 4. Output the inference results (verifies that the environment works)

[0217] print(f"Number of detected targets: {len(results[0].boxes)}")

[0218] results[0].show() # Visualize the detection results

[0219] If the model can be loaded normally, the detection results can be output and the image can be displayed, it means that the training environment of PyTorch + YOLO model has been successfully set up.

[0220] Step 3-2: Perform transfer learning initialization to train the sub-model;

[0221] In a preferred but non-limiting embodiment of the present invention, step 3-2 specifically includes:

[0222] The parameters of the YOLO series models pre-trained on the VOC general dataset are transferred to the bolt detection task. The feature extraction layers of the backbone network are frozen, and only the classification and regression layers are trained, which reduces the difficulty of model training and shortens the training cycle.

[0223] In step 3-2, the pre-trained parameters of the YOLO series models on the VOC general dataset are transferred to the bolt detection task. The parameters of the feature extraction layer of the backbone network are frozen, and only the classification and regression layers are trained. This includes:

[0224] S1: Defines the configuration file for the bolt inspection task;

[0225] Create a bolt.yaml configuration file, specifying the dataset path, number of categories, and category names (for compatibility with the Ultralytics framework). The code is as follows:

[0226] # bolt.yaml

[0227] path: ./bolt_dataset # Root directory of the bolt dataset

[0228] train: images/train # Path to the training set images

[0229] val: images/val # Path to the validation set images

[0230] test: images/val # Test set path (optional)

[0231] # Category configuration

[0232] nc: 2 # Number of categories: bolt (0), lost (1)

[0233] names: ['bolt', 'lost'] # Category names (consistent with the names in the XML annotations)

[0234] S2: Load the pre-trained model and freeze the backbone network;

[0235] By loading the pre-trained VOC weights of YOLOv8n using the ultralytics library, the backbone network layers are precisely frozen, allowing only the classification / regression layers to update their parameters. The code is as follows:

[0236] from ultralytics import YOLO

[0237] import torch

[0238] # 1. Load the pre-trained YOLOv8n model

[0239] model = YOLO('yolov8n.pt') # Automatically load VOC / COCO pre-trained weights (official default)

[0240] # 2. Examine the model structure and locate the backbone layers (model.model is the detection model, its layer sequence is model.model.model, and the backbone is taken here as layers 0 to 7)

[0241] print("YOLOv8n model structure:")

[0242] for i, layer in enumerate(model.model.model):

[0243]     print(f"Layer {i}: {layer.__class__.__name__}")

[0244] # 3. Freeze all parameters of the backbone network

[0245] # The backbone of YOLOv8n (CSPDarknet) is taken as the first 8 layers of the layer sequence; adjust the index to the actual structure

[0246] backbone_layers = model.model.model[:8]

[0247] for param in backbone_layers.parameters():

[0248]     param.requires_grad = False # Freeze the parameter so that it receives no gradient updates

[0249] # 4. Verify the freeze (gradients off for backbone parameters, on for the classification / regression layers)

[0250] print("\nParameter freezing verification:")

[0251] for name, param in model.model.named_parameters():

[0252]     layer_idx = int(name.split('.')[1]) # Parameter names have the form "model.<layer index>.<...>"

[0253]     part = "backbone" if layer_idx < 8 else "neck / head"

[0254]     # Backbone parameters should report requires_grad=False

[0255]     print(f"{part} parameter {name}: requires_grad={param.requires_grad}")

[0256] Key points:

[0257] The YOLOv8 model structure: model.model contains three parts: backbone, neck (feature fusion), and head (classification / regression);

[0258] The freeze range can be adjusted by changing the layer index at which the slice of backbone layers ends (e.g., the first 9 layers for the backbone in YOLOv8s). The key is to ensure that the backbone network's parameters have requires_grad=False.

[0259] The classification / regression layer (head) requires_grad=True by default and requires no additional settings.

[0260] S3: Configure training parameters and start transfer learning training;

[0261] To configure training hyperparameters and train only the classification / regression layer for bolt detection tasks, the code is as follows:

[0262] # Training parameter configuration (adapted to frozen-backbone training)

[0263] training_args = {

[0264]     'data': 'bolt.yaml', # Dataset configuration file

[0265]     'epochs': 100, # Number of training epochs (frozen training can use fewer epochs)

[0266]     'batch': 14, # Batch size (chosen to fit GPU memory)

[0267]     'lr0': 0.001, # Initial learning rate (can be slightly higher for frozen training)

[0268]     'lrf': 0.01, # Final learning rate as a fraction of lr0

[0269]     'imgsz': 640, # Input image size

[0270]     'device': 0 if torch.cuda.is_available() else 'cpu', # GPU / CPU

[0271]     'freeze': 8, # Extra fallback: freeze the first 8 layers (same as S2)

[0272]     'optimizer': 'SGD', # Optimizer

[0273]     'patience': 10, # Early-stopping patience (epochs without improvement)

[0274]     'save': True, # Save training checkpoints, including the best model

[0275]     'project': 'bolt_detection', # Directory for saving training results

[0276]     'name': 'yolov8n_frozen', # Experiment name

[0277]     'exist_ok': True # Overwrite an existing experiment directory

[0278] }

[0279] The hyperparameters can be set as follows: optimizer is SGD, initial learning rate is 0.001, batch size is 14, number of iterations is 200-600, image input size is 640×640 pixels, and learning rate adjustment strategy is exponential decay.

[0280] #Start training (update classification / regression layer parameters only)

[0281] results=model.train(**training_args)

[0282] S4: Post-training validation and model saving;

[0283] After training, verify the model performance and save the lightweight weights that only update the classification / regression layers. The code is as follows:

[0284] # 1. Validate the model (accuracy on the validation set)

[0285] val_results = model.val()

[0286] print(f"Validation set mAP@0.5:0.95: {val_results.box.map:.2f}")

[0287] print(f"bolt category AP: {val_results.box.maps[0]:.2f}") # maps holds the per-class mAP@0.5:0.95

[0288] print(f"lost category AP: {val_results.box.maps[1]:.2f}")

[0289] #2. Save the trained model (containing only the parameters updated for the classification / regression layers).

[0290] model.save('yolov8n_bolt_frozen.pt')

[0291] #3. Reasoning Test (Verifying the effectiveness of bolt detection)

[0292] test_img = './bolt_dataset/images/val/bolt_lost_001.jpg'

[0293] infer_results=model(test_img)

[0294] infer_results[0].show() # Visualize the detection results.

[0295] Step 3-3: Conduct model performance evaluation and selection.

[0296] In a preferred but non-limiting embodiment of the present invention, step 3-3 specifically includes:

[0297] The training results of the eight sub-models are evaluated from multiple dimensions using the following core metrics:

[0298] Intersection over Union ($IoU$): measures the degree of overlap between the detection bounding box and the ground truth bounding box. The ground truth bounding box is the manually labeled actual location of the target, representing the true location and extent of the bolt or bolt detachment in the image. The detection bounding box is the location of the target predicted by the YOLO sub-model, that is, the location and extent of the bolt or bolt detachment that the sub-model predicts after learning the data features. Its calculation formula is as follows:

[0299] $IoU = \dfrac{|B_p \cap B_{gt}|}{|B_p \cup B_{gt}|}$;

[0300] where $|B_p \cap B_{gt}|$ is the area of the intersection of the detection box $B_p$ and the ground truth box $B_{gt}$, and $|B_p \cup B_{gt}|$ is the area of their union;

[0301] Precision ($P$): measures the proportion of true positive samples among the results predicted as positive samples. The calculation formula is:

[0302] $P = \dfrac{TP}{TP + FP}$;

[0303] where $TP$ is the number of bolt targets correctly detected by the sub-model, and $FP$ is the number of non-bolt targets falsely detected by the sub-model;

[0304] Recall ($R$): measures the proportion of true positive samples that are correctly detected. The calculation formula is: $R = \dfrac{TP}{TP + FN}$;

[0305] where $FN$ is the number of bolt targets that were missed during detection;

[0306] Mean average precision ($mAP$): combines the average precision (AP) at different IoU thresholds to reflect the overall detection performance of the sub-model, including $mAP_{0.5}$ (IoU threshold of 0.5) and $mAP_{0.5:0.95}$ (10 IoU thresholds from 0.5 to 0.95 with a step size of 0.05);

[0307] Frames per second ($FPS$) of the sub-model: measures the real-time performance of the sub-model detection; the higher the value, the stronger the real-time performance.
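
The first three metrics above can be illustrated with a minimal Python sketch. The function names and the `(x1, y1, x2, y2)` corner format for boxes are assumptions chosen for illustration; they are not taken from the embodiment.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) = top-left and bottom-right corners.
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are true positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0
```

For example, two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, so their IoU is 1/7.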

[0308] The comprehensive index of each sub-model is calculated as a weighted sum of the core metrics, with weighting coefficients of 0.25, 0.2, 0.15, 0.12, 0.1, and 0.1. The sub-model with the highest comprehensive index is selected as the optimal model.
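
The weighted selection can be sketched as follows. The six weights come from the text, but which metric each weight multiplies is an assumption made only for illustration, as is the normalization of FPS to [0, 1]; the names `ASSUMED_WEIGHTS`, `comprehensive_index`, and `select_best` are hypothetical.

```python
from typing import Dict

# Weights are from the text; the metric attached to each weight is an ASSUMPTION.
ASSUMED_WEIGHTS: Dict[str, float] = {
    "iou": 0.25,
    "precision": 0.2,
    "recall": 0.15,
    "map50": 0.12,
    "map50_95": 0.1,
    "fps_norm": 0.1,  # assumes FPS has been rescaled to [0, 1] beforehand
}


def comprehensive_index(metrics: Dict[str, float]) -> float:
    """Weighted sum of the metrics, each expected in [0, 1]."""
    return sum(weight * metrics[name] for name, weight in ASSUMED_WEIGHTS.items())


def select_best(candidates: Dict[str, Dict[str, float]]) -> str:
    """Return the name of the candidate with the highest comprehensive index."""
    return max(candidates, key=lambda name: comprehensive_index(candidates[name]))
```

With this layout, `select_best` takes a dictionary that maps each sub-model name to its metric dictionary and returns the name with the highest score.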

[0309] As demonstrated by a comprehensive comparison of metrics, the YOLOv8n model performs optimally: With a detection accuracy of 97.8%, the model has a small number of parameters (only 3.2M) and is lightweight. The FPS meets the requirements for real-time detection, so YOLOv8n was selected as the core model for detecting bolt loosening defects, which is also the optimal model.

[0310] Step 4: Detect bolt loosening defects based on the optimal model.

[0311] In a preferred but non-limiting embodiment of the present invention, step 4 specifically includes:

[0312] The image of the steel bridge bolt to be detected is input into the trained optimal model. The optimal model extracts the image features of the steel bridge bolt image through its backbone network. After multi-scale feature fusion in its Neck layer, the decoupling head outputs the category (presence / detachment), location box and confidence of the bolt target. Finally, the detection result is output and the visualization annotation is realized (marking the location and confidence of the detached bolt).

[0313] In addition, the current method for outputting the category (presence / detachment), location box, and confidence score of bolt targets by the decoupling head after multi-scale feature fusion of the Neck layer still has the following shortcomings:

[0314] The category confidence score is not correlated with the location box reliability: The decoupled head outputs the category confidence score and location box coordinates independently, without considering the impact of the location box positioning accuracy on the category judgment. This may result in false detections where "the location box deviates from the true target but the category confidence score is too high".

[0315] The confidence threshold is uniform: the characteristic differences between the two types of targets, bolt presence and bolt detachment, are not taken into account. Screening the results with a single fixed threshold can easily lead to missed detection of detached targets, which are a small-sample class, or false detection of present targets.

[0316] The gain from multi-scale feature fusion is not fully exploited: the multi-scale features output by the Neck layer do not participate in confidence correction, and the output relies only on a single inference of the decoupled head, failing to use the consistency of target features at different scales to improve output reliability.

[0317] No error tolerance mechanism for abnormal scenarios: when there is strong noise, occlusion, or other interference in the image, the decoupled head outputs the result directly, without the ability to identify and suppress abnormal features, so its anti-interference capability is insufficient.

[0318] In a preferred but non-limiting embodiment of the present invention, after multi-scale feature fusion of the Neck layer in step 4, the method for outputting the category (presence / detachment), location box, and confidence level of the bolt target by the decoupling head specifically includes:

[0319] After multi-scale feature fusion in the Neck layer, a four-step optimization process is added: feature consistency verification, dynamic confidence correction, dual threshold filtering, and anomaly suppression. The final output includes the category, bounding box, and corrected confidence score of the bolt target, as detailed below:

[0320] Step 4-1: Extract the fused feature maps of the Neck layer at three scales (large, medium, and small), and obtain the class prediction, bounding box coordinates, and initial confidence scores at each scale by decoupling the head branch.

[0321] Step 4-2: Calculate the multi-scale feature consistency coefficient, correct the initial confidence, and use this to correlate the location box positioning accuracy with the category confidence.

[0322] In step 4-2, the multi-scale feature consistency coefficient is calculated by the following formula:

[0323] ;

[0324] where: the multi-scale feature consistency coefficient (value range [0,1]) reflects the consistency of the location boxes and the clarity of the features across the three scales; the scale index takes the values 1 (large scale), 2 (medium scale), and 3 (small scale); the location box of each scale is the box output by the decoupled head at that scale; the average coordinate box of the three per-scale location boxes is calculated from the average coordinates of the top-left and bottom-right corners of each scale's location box; the intersection over union (IoU) of each per-scale location box and the average coordinate box measures positional consistency; and the sharpness score of each scale's fused feature map (value range [0,1]) is obtained by calculating the gradient variance of the fused feature map with the Laplacian operator, reflecting the sharpness of the features;
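
The quantities defined above can be illustrated with a short Python sketch. The way the per-scale IoU and sharpness values are combined here (the mean of their products) is an assumption for illustration only, not the formula of the embodiment; the function names and the `(x1, y1, x2, y2)` box format are likewise hypothetical.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) corners


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_box(boxes: Sequence[Box]) -> Box:
    """Average the corner coordinates of the per-scale boxes."""
    n = len(boxes)
    return tuple(sum(b[k] for b in boxes) / n for k in range(4))


def consistency_coefficient(boxes: Sequence[Box], sharpness: Sequence[float]) -> float:
    """ASSUMED combination: mean over scales of IoU(box_i, mean box) * sharpness_i."""
    mean_box = average_box(boxes)
    terms = [iou(b, mean_box) * s for b, s in zip(boxes, sharpness)]
    return sum(terms) / len(terms)
```

Because each IoU and each sharpness score lies in [0, 1], this assumed combination also stays in [0, 1], matching the stated value range of the coefficient.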

[0325] In step 4-2, the formula for calculating the corrected initial confidence level is as follows:

[0326] ;

[0327] where: the corrected final confidence has the value range [0,1]; the initial confidence is the confidence output by the decoupled head (value range [0,1]); the position accuracy gain coefficient (empirical value 0.3) balances the weights of feature consistency and positioning accuracy; the fused output location box is obtained by averaging the coordinates of the three per-scale location boxes; the average reference box of targets of the same category in the training set is obtained from statistical analysis of the labeled data and reflects the typical size and location distribution of the targets; and the intersection over union (IoU) of the fused location box and the reference box measures the reasonableness of the positioning.

[0328] Step 4-3: For the two types of targets, bolt presence and bolt detachment, adaptively generate dynamic screening thresholds to filter out low-reliability results;

[0329] In step 4-3, the formula for calculating the adaptively generated dynamic filtering threshold is:

[0330] ;

[0331] where: the dynamic filtering threshold is defined for each category (category 0 indicates that the bolt is present, and category 1 indicates that the bolt has detached); the category weight coefficient is 0.6 for category 0 and 0.4 for category 1, adapted to the small-sample nature of detached targets; the average confidence of the targets of each category in the training set is obtained from validation set statistics; and the standard deviation of the confidence of the targets of each category in the training set reflects the degree of dispersion of that category's confidence distribution.

[0332] Step 4-4: Introduce an anomaly feature recognition module to suppress invalid outputs in scenarios with strong noise, occlusion, etc.

[0333] In step 4-4, the method of introducing the anomaly feature recognition module specifically includes:

[0334] First, calculate the abnormal feature suppression coefficient. The calculation formula is as follows:

[0335] ;

[0336] where: the abnormal feature suppression coefficient (value range [0,1]) suppresses the output in abnormal scenarios; the suppression intensity coefficient takes the empirical value 0.8; and the noise and occlusion evaluation value (value range [0,1]) is obtained by summing the proportion of noise pixels and the proportion of occluded area in the fused feature map;

[0337] The decision rule is then executed, which is as follows:

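[0338] If the corrected confidence is greater than or equal to the dynamic filtering threshold of the target's category and the abnormal feature suppression coefficient is greater than or equal to 0.3, the category, location box, and corrected confidence of the target are output;

[0339] If the corrected confidence is less than the dynamic filtering threshold or the abnormal feature suppression coefficient is less than 0.3, the target is considered invalid and is not output.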
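
The decision rule can be sketched in Python as a filter over candidate detections. The sketch assumes that the rule compares the corrected confidence with the dynamic threshold of the target's category and compares the suppression coefficient with 0.3; the `Detection` class, its field names, and `filter_detections` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    category: int                            # 0 = bolt present, 1 = bolt detached
    box: Tuple[float, float, float, float]   # assumed (x1, y1, x2, y2) corners
    corrected_conf: float                    # confidence after correction, in [0, 1]
    suppression_coeff: float                 # abnormal feature suppression coefficient, in [0, 1]


def filter_detections(
    detections: List[Detection],
    thresholds: Dict[int, float],
    min_suppression: float = 0.3,
) -> List[Detection]:
    """Keep a detection only if it passes both the per-category threshold
    and the minimum suppression coefficient."""
    return [
        d
        for d in detections
        if d.corrected_conf >= thresholds[d.category]
        and d.suppression_coeff >= min_suppression
    ]
```

The per-category thresholds are passed in as a dictionary so that the two classes can be screened with different values, in line with the dual threshold design.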

[0340] The technical advantages of this invention, which uses multi-scale feature fusion in its Neck layer to output the category (presence / detachment), location box, and confidence score of bolt targets from the decoupling head, are as follows:

[0341] The false detection rate is significantly reduced: By using multi-scale feature consistency verification and location box association correction, false detection targets with high confidence but offset positions are effectively filtered out. According to actual tests, the false detection rate has been reduced from 0.6% in the original scheme to below 0.15%.

[0342] The false negative rate has been further reduced: The adaptive dual threshold design for small sample targets such as bolt detachment avoids false negatives caused by a single threshold. The recall rate of detachment targets has been increased from 99.8% to 99.95%, almost eliminating the risk of false negatives for key defects.

[0343] Enhanced anti-interference capability: The abnormal feature suppression module can effectively cope with complex scenarios such as strong noise and occlusion. Under simulated on-site interference environment (Gaussian noise variance of 0.05 and occlusion area of 30%), the detection accuracy rate is still above 97%, which is 5 percentage points higher than the original solution.

[0344] Improved confidence level reliability: The revised confidence level is strongly correlated with the actual state of the target and the positioning accuracy. Among targets with a confidence level ≥ 0.8, the proportion of real and valid targets has increased from 98% in the original plan to 99.7%, providing a more reliable quantitative basis for subsequent maintenance decisions.

[0345] Real-time performance is unaffected: all new processes are lightweight calculations (the computational complexity of feature consistency coefficient, confidence correction, etc. is O(1)), the detection time for a single image only increases by 0.1ms, and the total time is still controlled within 1.1ms, which fully meets the real-time detection requirements of the engineering site.

[0346] Examples of the present invention are shown below:

[0347] 3.1 Dataset Construction and Implementation

[0348] Image acquisition: Images of the actual bridge node plate and experimental model bolts were taken using an iPhone 12 smartphone (12 megapixels, f / 1.6 aperture). Supplementary images were obtained by crawling from a professional engineering image library. A total of 2,000 original images were collected.

[0349] Data augmentation: The original images are subjected to random cropping (cropping area ratio 0.6-1.0), random translation (translation step size 0-50 pixels), brightness adjustment (brightness coefficient 0.5-1.5), Gaussian noise addition (noise variance 0.01-0.05), 0-45° rotation and horizontal mirroring operations in sequence, expanding to 5000 augmented images.
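
The parameter ranges listed above can be sampled as in the following Python sketch, which draws one set of augmentation parameters per image and leaves the image operations themselves to an image library. The uniform distributions, the 50% mirroring probability, and the names `AugParams` and `sample_aug_params` are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class AugParams:
    crop_ratio: float   # cropped area ratio, 0.6 to 1.0
    shift_px: int       # translation step, 0 to 50 pixels
    brightness: float   # brightness coefficient, 0.5 to 1.5
    noise_var: float    # Gaussian noise variance, 0.01 to 0.05
    angle_deg: float    # rotation angle, 0 to 45 degrees
    mirror: bool        # whether to apply horizontal mirroring


def sample_aug_params(rng: random.Random) -> AugParams:
    """Draw one parameter set; uniform sampling is an assumption."""
    return AugParams(
        crop_ratio=rng.uniform(0.6, 1.0),
        shift_px=rng.randint(0, 50),
        brightness=rng.uniform(0.5, 1.5),
        noise_var=rng.uniform(0.01, 0.05),
        angle_deg=rng.uniform(0.0, 45.0),
        mirror=rng.random() < 0.5,  # assumed mirroring probability
    )
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible between runs.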

[0350] Labeling and splitting: Use the LabelImg tool to label two types of targets, "bolt" (bolt present) and "lost" (bolt detached), and generate XML label files; split the data into a training set (4000 images) and a validation set (1000 images) in an 8:2 ratio, ensuring a balanced distribution of bolt scenes between the training set and the validation set.
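
The 8:2 split can be sketched with the Python standard library. This is a plain random split under a fixed seed; it does not by itself balance bolt scenes between the two sets, which would need a stratified split. The function name `split_dataset` and the file names in the usage note are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence[str], train_ratio: float = 0.8, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle the items with a fixed seed and cut them at train_ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Called on a list of 5000 image names such as `[f"img_{i:04d}.jpg" for i in range(5000)]`, the function returns 4000 training items and 1000 validation items.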

[0351] 3.2 Model Training Implementation

[0352] Environment setup: Install the PyTorch 1.12.1 framework on a Windows 10 system, configure CUDA 11.7 to accelerate training, and set up a Python 3.8 development environment.

[0353] Transfer learning: Load the pre-trained weights of YOLOv8n on the VOC dataset, freeze the backbone network parameters, train only the classification and regression layers of the head, set the initial learning rate to 0.001, and use the SGD optimizer for iterative training.

[0354] Hyperparameter tuning: During training, when the number of iterations reaches 50, 100, and 150, the learning rate is exponentially decayed to 50% of its original value; the batch size is set to 14 to ensure sufficient GPU memory; when the model loss converges at 500 iterations, training is stopped.
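
The step schedule described above can be written as a small Python function. The sketch assumes that the learning rate is halved again at each of the three milestones, which is one reading of "decayed to 50%"; the function name and parameter names are hypothetical.

```python
from typing import Sequence


def learning_rate(
    iteration: int,
    base_lr: float = 0.001,
    milestones: Sequence[int] = (50, 100, 150),
    factor: float = 0.5,
) -> float:
    """Learning rate at a given iteration, multiplied by `factor`
    once for every milestone already reached (assumed reading)."""
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * factor ** passed
```

Under this reading the learning rate is 0.001 before iteration 50, 0.0005 from iteration 50, 0.00025 from iteration 100, and 0.000125 from iteration 150 until training stops.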

[0355] 3.3 Verification of Detection Results

[0356] One hundred measured images of steel bridge bolts (including 20 images of bolt detachment) were selected and input into the YOLOv8n model for detection. The results show:

[0357] The accuracy rate for detecting loose bolts reached 99.4%, the recall rate reached 99.8%, the false negative rate was only 0.2%, and the false positive rate was 0.6%.

[0358] The model's mAP@0.5:0.95 was 97.8%, significantly better than that of the YOLOv5s model (mAP@0.5:0.95 of 80.2%).

[0359] The detection time for a single image is only 0.99ms, with an FPS exceeding 1000, meeting the real-time detection needs of engineering sites.
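
The relation between per-image detection time and FPS can be checked with a one-line conversion. This is a minimal sketch that assumes images are processed one at a time without batching; `fps_from_latency_ms` is a hypothetical helper name.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second for a given per-image latency in milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms
```

A latency of 0.99 ms corresponds to roughly 1010 frames per second, which is consistent with the stated FPS above 1000.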

[0360] The beneficial effects of the present invention are as follows: Compared with the prior art, the technical effects of the present invention include:

[0361] High detection accuracy: This invention uses YOLOv8n as the core model. For bolt loosening defects, its mAP@0.5:0.95 reached 97.8%, with both precision and recall exceeding 99%, significantly reducing the false negative and false positive rates. The detection accuracy was far superior to traditional manual inspection and conventional YOLOv5 models.

[0362] High real-time performance: The detection time for a single image of the model is less than 1ms, and the FPS meets the batch detection needs of engineering sites. It can quickly complete the bolt defect investigation of large steel bridges, and the detection efficiency is more than 100 times higher than that of manual inspection.

[0363] Lightweight model: The YOLOv8n model has only 3.2M parameters, which is much smaller than models such as YOLOv5l (43.7M) and YOLOv8l (43.7M). It can be deployed on mobile devices, drones and other portable devices, and is suitable for the complex inspection environment of bridge sites.

[0364] Good generalization ability: Through multi-channel image acquisition and multi-dimensional data enhancement, the model can adapt to complex working conditions with different lighting, shooting angles and bolt arrangements, and shows good stability and reliability in actual bridge inspection.

[0365] This invention enables intelligent detection of bolt loosening defects in steel bridges, providing bridge maintenance units with an efficient and accurate detection tool. It can effectively reduce maintenance costs and improve the level of structural safety, and has significant engineering practical value and promotion prospects.

[0366] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit it. Although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications or equivalent substitutions can still be made to the specific implementation of the present invention without departing from the spirit and scope of the present invention. Any modifications or equivalent substitutions should be covered within the scope of protection of the claims of the present invention.

Claims

1. A bolt loosening defect target detection and control method based on image recognition, characterized in that it includes: Step 1: Construct a bolt target detection dataset for steel bridge bolts; Step 2: Construct a target detection network for the bolt target detection dataset; Step 3: Train and optimize the model using the PyTorch deep learning framework; Step 4: Detect bolt loosening defects based on the optimal model; Step 4 specifically includes: The image of the steel bridge bolt to be detected is input into the trained optimal model. The optimal model extracts the image features of the steel bridge bolt image through its backbone network. After multi-scale feature fusion in its Neck layer, the decoupled head outputs the category, location box, and confidence of the bolt target. Finally, the detection result is output and visualized annotation is achieved. In step 4, the method by which the decoupled head outputs the category, location box, and confidence of the bolt target after multi-scale feature fusion in the Neck layer specifically includes: After multi-scale feature fusion in the Neck layer, a four-step optimization process is added: feature consistency verification, dynamic confidence correction, dual threshold filtering, and anomaly suppression. The final output includes the category, location box, and corrected confidence of the bolt target, as detailed below: Step 4-1: Extract the fused feature maps of the Neck layer at three scales, and obtain the category prediction, location box coordinates, and initial confidence at each scale through the decoupled head branches; Step 4-2: Calculate the multi-scale feature consistency coefficient and correct the initial confidence; In step 4-2, the calculation formula of the multi-scale feature consistency coefficient is: ; wherein the terms of the formula are: the multi-scale feature consistency coefficient; the scale index; the location box output by the decoupled head at each scale; the average coordinate box of the location boxes of the 3 scales; the intersection over union of the location box of each scale and the average coordinate box; and the sharpness score of the fused feature map of each scale; In step 4-2, the formula for calculating the corrected confidence is: ; wherein the terms of the formula are: the corrected final confidence; the initial confidence output by the decoupled head; the position accuracy gain coefficient; the location box output after fusion; the average reference box of targets of the same category in the training set; and the intersection over union of the fused location box and the reference box; Step 4-3: For the two types of targets, bolt presence and bolt detachment, adaptively generate dynamic filtering thresholds to filter out low-reliability results; In step 4-3, the formula for calculating the adaptively generated dynamic filtering threshold is: ; wherein the terms of the formula are: the dynamic filtering threshold of each category; the category weight coefficient; the average confidence of the targets of each category in the training set; and the standard deviation of the confidence of the targets of each category in the training set; Step 4-4: Introduce the anomaly feature recognition module; In step 4-4, the method of introducing the anomaly feature recognition module specifically includes: First, calculate the abnormal feature suppression coefficient by the following formula: ; wherein the terms of the formula are: the abnormal feature suppression coefficient; the suppression intensity coefficient; and the noise and occlusion evaluation value; The decision rule is then executed, which is as follows: if the corrected confidence is greater than or equal to the dynamic filtering threshold of the target's category and the abnormal feature suppression coefficient is greater than or equal to 0.3, the category, location box, and corrected confidence of the target are output; if the corrected confidence is less than the dynamic filtering threshold or the abnormal feature suppression coefficient is less than 0.3, the target is considered invalid and is not output.

2. The bolt loosening defect target detection and control method based on image recognition according to claim 1, characterized in that, Step 1 specifically includes: Step 1-1: Collect bolt images through three channels to construct the original dataset: Steps 1-2: To improve the model's generalization ability, perform augmentation operations on the original dataset to form the base dataset: Steps 1-3: Use the open-source annotation tool LabelImg to annotate the images of the enhanced base dataset, marking two types of targets: bolt presence and bolt detachment. Generate an XML tag file in VOC format as the annotated dataset. Randomly divide the annotated dataset into training and validation sets in an 8:2 ratio. The training and validation sets constitute the bolt target detection dataset.

3. The bolt loosening defect target detection and control method based on image recognition according to claim 2, characterized in that, Step 1-1 specifically includes: Collect 500 bolt images of actual steel bridge connection nodes; Create a scaled-down model of the steel bridge node plate and take 1000 images of the bolts; We crawled 500 bolt images from a professional engineering image library to supplement the variety of bolt types and scenarios.

4. The bolt loosening defect target detection and control method based on image recognition according to claim 3, characterized in that, Step 1-2 specifically includes: Randomly crop bolt regions from images in the original dataset and add perturbation to the target location; translate the images of the original dataset in the plane and fill the missing areas with black pixels; increase or decrease the pixel brightness of images in the original dataset proportionally to simulate different lighting environments; inject Gaussian noise or salt-and-pepper noise into the images of the original dataset to simulate interference in images captured on site; randomly rotate the images in the original dataset by 0 to 45° to adapt to bolt images taken from different angles; mirror the images in the original dataset horizontally or vertically to increase sample diversity.
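Several of the augmentation operations listed above can be sketched in plain Python on a small grayscale image stored as rows of 0-255 values. This is an illustrative sketch only: the representation, the function names, and the parameter values are assumptions, random cropping and rotation are omitted for brevity, and a real pipeline would normally operate on image arrays and also update the bounding-box annotations.

```python
import random
from typing import List

Image = List[List[int]]  # assumed: grayscale image as rows of 0-255 pixel values


def translate(img: Image, dx: int, dy: int) -> Image:
    """Shift the image by (dx, dy); uncovered pixels are filled with black (0)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def scale_brightness(img: Image, factor: float) -> Image:
    """Scale pixel brightness proportionally, clamped to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def gaussian_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise to every pixel, clamped to 0-255."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in img]


def salt_and_pepper(img: Image, prob: float, rng: random.Random) -> Image:
    """Replace each pixel by black or white with total probability `prob`."""
    out = []
    for row in img:
        new_row = []
        for p in row:
            r = rng.random()
            new_row.append(0 if r < prob / 2 else 255 if r < prob else p)
        out.append(new_row)
    return out


def mirror(img: Image, horizontal: bool = True) -> Image:
    """Horizontal mirroring flips left-right; vertical mirroring flips top-bottom."""
    return [row[::-1] for row in img] if horizontal else [row[:] for row in img[::-1]]


img = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
print(translate(img, dx=1, dy=0))  # [[0, 10, 20], [0, 40, 50], [0, 70, 80]]
print(mirror(img))                 # [[30, 20, 10], [60, 50, 40], [90, 80, 70]]
```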

5. The bolt loosening defect target detection and control method based on image recognition according to claim 4, characterized in that, Step 2 specifically includes: Construct YOLO-series target detection networks for the bolt target detection dataset, the YOLO-series target detection networks comprising eight sub-models from the YOLOv5 series and the YOLOv8 series.

6. The bolt loosening defect target detection and control method based on image recognition according to claim 5, characterized in that, Step 3 specifically includes: Step 3-1: Configure the training environment; Step 3-2: Perform transfer learning initialization to train the sub-model; Step 3-3: Conduct model performance evaluation and selection.

7. The bolt loosening defect target detection and control method based on image recognition according to claim 6, characterized in that, Step 3-1 specifically includes: A training environment is built based on the PyTorch deep learning framework and covers the eight YOLO-series sub-models from the YOLOv5 series and the YOLOv8 series. The hardware configuration of the training environment is an Intel Core i5-13600KF CPU and an NVIDIA 4060Ti GPU. The software configuration of the training environment is Python 3.8, CUDA 11.7, and the PyCharm 2023 development environment.

8. The bolt loosening defect target detection and control method based on image recognition according to claim 7, characterized in that, Step 3-2 specifically includes: Transfer the pre-trained parameters of the YOLO-series models on the bolt target detection dataset to the bolt detection task; freeze the parameters of the feature extraction layers of the backbone network and train only the classification and regression layers.

9. The bolt loosening defect target detection and control method based on image recognition according to claim 8, characterized in that, Step 3-3 specifically includes: The training results of the eight sub-models are evaluated from multiple dimensions using the following core metrics: the intersection over union $IoU$, calculated as $IoU = \frac{|A \cap B|}{|A \cup B|}$, where $|A \cap B|$ is the area of the intersection of the detection box $A$ and the ground-truth box $B$, and $|A \cup B|$ is the area of their union; the precision $P$, calculated as $P = \frac{TP}{TP + FP}$, where $TP$ is the number of bolt targets correctly detected by the sub-model and $FP$ is the number of non-bolt targets falsely detected by the sub-model; the recall $R$, which measures the proportion of true positive samples that are correctly detected, calculated as $R = \frac{TP}{TP + FN}$, where $FN$ is the number of bolt targets missed by the sub-model; the mean average precision, which comprises two indicators; and the frames per second of the sub-model. The comprehensive index of each sub-model is calculated from the core metrics as a weighted sum with coefficients 0.25, 0.2, 0.15, 0.12, 0.1, and 0.1, and the sub-model with the highest comprehensive index is selected as the optimal model.
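The evaluation and selection in Step 3-3 can be sketched as below. Precision and recall follow their standard definitions; the six weights are those stated in the claim. Which weight belongs to which metric, and how metrics such as frames per second are normalized before weighting, is not recoverable from the text, so the metric order, the normalization to the 0-1 range, the sub-model names, and every numeric value in the usage lines are hypothetical.

```python
from typing import Dict, Sequence

# Weights as stated in the claim; their pairing with metrics is an assumption.
WEIGHTS = (0.25, 0.2, 0.15, 0.12, 0.1, 0.1)


def precision(tp: int, fp: int) -> float:
    """Share of detections that are real bolt targets: TP / (TP + FP)."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of real bolt targets that were detected: TP / (TP + FN)."""
    return tp / (tp + fn) if tp + fn else 0.0


def composite_index(values: Sequence[float],
                    weights: Sequence[float] = WEIGHTS) -> float:
    """Weighted sum of metric values assumed to be normalized to [0, 1]."""
    if len(values) != len(weights):
        raise ValueError("one weight is required per metric")
    return sum(w * v for w, v in zip(weights, values))


def select_best(scores: Dict[str, Sequence[float]]) -> str:
    """Return the name of the sub-model with the highest composite index."""
    return max(scores, key=lambda name: composite_index(scores[name]))


# Hypothetical normalized metric vectors for two hypothetical sub-models.
scores = {
    "model_a": (0.80, 0.90, 0.85, 0.88, 0.60, 0.50),
    "model_b": (0.70, 0.85, 0.80, 0.80, 0.55, 0.90),
}
print(precision(90, 10), recall(90, 30))
print(select_best(scores))
```

Because the six stated weights sum to 0.92 rather than 1, the composite index here is a ranking score and not a normalized average.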