Lightweight intelligent detection method for railway tunnel lining defects
By constructing a lightweight convolutional neural network model, the problems of low efficiency and high cost in the existing railway tunnel lining defect detection are solved, realizing efficient and low-cost detection of multiple types of defects, which is suitable for long-distance tunnel detection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA RAILWAY DESIGN GRP CO LTD
- Filing Date
- 2024-12-23
- Publication Date
- 2026-06-26
AI Technical Summary
Existing methods for detecting defects in railway tunnel linings rely on manual identification, which is inefficient and lacks standardized criteria. Machine learning methods are computationally expensive for long tunnels, and deep learning models have a large number of parameters, which places a heavy load on detection equipment and raises its cost; no lightweight model is available.
A lightweight intelligent detection model is constructed using convolutional neural networks, including modules for efficient feature extraction, lightweight feature fusion, and feature enhancement. By adaptively scaling and padding images, combined with cross-layer weighted fusion and contextual Transformer modules, efficient detection of lining defects is achieved.
It improves the intelligence and efficiency of railway tunnel lining defect detection, reduces hardware memory requirements, enables accurate detection of multiple types of defects, is suitable for long-distance tunnel inspection, and reduces engineering costs.
Figure CN120014431B_ABST
Description
Technical Field
[0001] This invention relates to the field of image detection technology, specifically a lightweight intelligent detection method for defects in railway tunnel lining.
Background Technology
[0002] The stability of railway tunnels is crucial to the safety of railway operations and the lives and property of the people. Therefore, quality inspection of railway tunnels is of great significance. Among them, tunnel lining is an important inspection item in quality inspection projects, and the research and application of defect detection technology is particularly critical.
[0003] Currently, based on the non-destructive detection properties of tunnel lining radar, tunnel lining defect detection technology is typically conducted using ground-penetrating radar images. This primarily includes subjective identification methods based on manual search, machine learning methods based on feature extraction engineering, and intelligent detection methods based on deep learning. Subjective identification methods rely on the radar imaging mechanism to subjectively search for areas exhibiting typical defect characteristics. For long and narrow tunnels, subjective identification is extremely labor-intensive; furthermore, this method heavily depends on the personal experience of the personnel involved, leading to inconsistent standards for identifying various defects.
[0004] Machine learning methods extract image features containing texture, color, and edge information from the defect areas in radar images, train multi-classifiers for various defects, and then assist manual identification. These methods heavily rely on feature extraction from defect areas. In long-format radar images of tunnel lining, defect targets are often sparsely distributed. Therefore, directly performing feature extraction and computation globally would drastically increase hardware and time consumption. Furthermore, due to the complex background interference in the lining radar images, these methods are prone to generating many false alarms, hindering accurate and rapid defect identification.
[0005] In recent years, with the rapid development of computer technology and artificial intelligence, deep learning-based methods have been gradually introduced into radar image analysis of tunnel lining. The core step of these methods is to detect defect areas with specific image features from radar images. However, there are few studies directly applied to lining defect detection, especially those based on network design using typical defect samples and features. Furthermore, existing methods only detect single types of defects, such as voiding defects. From an application perspective, existing lining defect detection models have a large number of parameters and consume significant memory. In engineering, they rely on graphics processing units (GPUs) and parallel computing, which greatly increases computer load and engineering costs, and may reduce the stability of detection equipment. Lightweight model research is a key challenge in current engineering applications. However, there is currently no lightweight model specifically designed for railway tunnel lining defect detection.
[0006] In summary, existing tunnel lining inspection projects are gradually transitioning from manual identification to intelligent detection. Researching high-performance, lightweight methods for detecting various defects in railway tunnel linings is of great significance for replacing manual identification, improving the accuracy of detecting various lining defects, and achieving low-energy, low-cost engineering deployment and application.
Summary of the Invention
[0007] This invention aims to solve at least one of the technical problems existing in the prior art. To this end, this invention proposes a lightweight intelligent detection method for railway tunnel lining defects. This method is mainly applied to the detection of railway tunnel lining defects, and its conclusions are comparable to those obtained through manual identification. It can effectively replace manual identification and improve the efficiency and accuracy of long-distance tunnel lining quality inspection.
[0008] To address the above problems, this invention provides a lightweight intelligent detection method for railway tunnel lining defects, comprising the following steps:
[0009] Step 1: Acquire multiple radar images of railway tunnel linings at a specific mileage length. The images contain various reflective structures, including secondary lining, reinforcing steel, and initial support.
[0010] Step 2: Based on the mileage information of each image input data, perform annotation, segmentation, and target relative coordinate transformation operations on the lining radar image to construct an image dataset for training;
[0011] Step 3: Scale normalization is performed on each segmentation map $F_i$, using adaptive scaling and padding to obtain a fixed input scale of $c \times c$, where $c = 640$. Assuming the size of the segmentation map $F_i$ is $a \times b$ ($a \ge b$), the adaptive scaling factor $\lambda$ is defined as the reciprocal of the ratio of the longer side $a$ to the fixed length $c$, i.e. $\lambda = c / a$. After scaling, the longer side becomes $c$ and the shorter side becomes $\lambda \cdot b$. To reach the normalized scale, the short side is padded with 0 on both sides. The single-sided padding size $m$, in pixels, is calculated as:
[0012] $m = \dfrac{c - \lambda \cdot b}{2}$
[0013] Step 4: Construct a convolutional neural network for identifying voided structures and rebar, including an efficient feature extraction network, a lightweight feature fusion network, a feature enhancement module, and a feature decoding structure; input the location and category labels of all targets in the segmentation map to train the neural network and obtain weight parameters;
[0014] Step 5: Perform the preprocessing operations of Step 2 and Step 3 on the image to be inspected, then run network inference using the trained weight parameters from Step 4 to detect the category and location parameters of the rebar and void structures in all segmentation maps;
[0015] Step 6: Based on the detection results of segmentation map $F_i$, determine whether reinforcement is missing;
[0016] Step 7: Based on the segmentation parameters in Step 2, stitch together the detection results of all segmentation maps to obtain a long-format detection result with the same size as the input data.
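As an illustrative sketch only, the adaptive scaling and padding of Step 3 can be written as below. It assumes the scaling factor is $\lambda = c/a$ and that the scaled short side is rounded to whole pixels; the function name `letterbox_params` and the rounding are assumptions introduced for this example and are not stated in the description.

```python
def letterbox_params(a: int, b: int, c: int = 640):
    """Scaling and padding parameters for an a x b image (a is the longer side).

    Returns (lam, new_long, new_short, m), where m is the zero padding added
    on EACH side of the shorter dimension so the result is c x c.
    """
    if a < b:
        raise ValueError("expects a >= b (a is the longer side)")
    lam = c / a                    # reciprocal of the ratio a / c
    new_long = c                   # longer side is scaled to c
    new_short = round(lam * b)     # assumption: round to whole pixels
    m = (c - new_short) / 2        # single-sided padding on the short side
    return lam, new_long, new_short, m


# Example: a 1280 x 512 segmentation map
lam, new_long, new_short, m = letterbox_params(1280, 512)
print(lam, new_long, new_short, m)
```

When `c - new_short` is odd, `m` is a half pixel; a practical implementation would pad one side with one more pixel than the other.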
[0017] Preferably, step 2 includes the following sub-steps:
[0018] Step 2-1: The segmentation method adopts sliding window segmentation. The step size s of the sliding window is set to 1000 pixels, which corresponds to a segmentation image mileage of approximately 10m and a segmentation margin g of 200 pixels.
[0019] Step 2-2: Perform shifting processing on the segmentation edges to ensure the integrity of the targets and prevent the segmentation operation from cutting through rebar and void targets; determine whether there is a target at the edge of each window. If the segmentation line passes through a target, the segmentation line is shifted. If the segmentation line does not pass through a target, the width of the segmented area remains unchanged; several segmentation maps are obtained in this way.
[0020] Step 2-3: To facilitate training, the initially annotated bounding box positions are converted into bounding box coordinates within the segmentation map; assuming the original parameters of a target bounding box in the i-th segmentation map $F_i$ are $(x_1, y_1, x_2, y_2)$ and the shift is $e_i$ pixels, the relative position $(x_1', y_1', x_2', y_2')$ is calculated as follows:
[0021]
[0022] $y_1' = y_1$
[0023]
[0024] $y_2' = y_2$.
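A minimal sketch of the coordinate conversion in Step 2-3 follows. Because the segmentation is along the mileage (horizontal) direction, only the x coordinates change. The sketch assumes the left edge of the i-th window in the original image, here called `window_left`, is already known (it would combine the sliding-window position and the shift $e_i$); that name and the function name are hypothetical.

```python
def to_window_coords(box, window_left):
    """Convert a box from full-image coordinates to window-relative coordinates.

    box: (x1, y1, x2, y2) in the original long image.
    window_left: x position of the window's left edge in the original image
                 (assumed known; includes any shift applied to the cut line).
    """
    x1, y1, x2, y2 = box
    # y is unchanged, matching y1' = y1 and y2' = y2 in the text
    return (x1 - window_left, y1, x2 - window_left, y2)


relative = to_window_coords((2150, 40, 2230, 90), window_left=2000)
print(relative)
```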
[0025] Preferably, step 5 includes the following sub-steps:
[0026] Step 5-1: Input the preprocessed image into the efficient feature extraction network to obtain initial feature maps at three scales, $F_{i,80\times80\times128}$, $F_{i,40\times40\times256}$, and $F_{i,20\times20\times512}$;
[0027] Step 5-2: Use the lightweight feature fusion network to perform cross-layer weighted fusion of the initial features to obtain fused feature maps at three scales, $P_{i,80\times80\times64}$, $P_{i,40\times40\times128}$, and $P_{i,20\times20\times256}$;
[0028] Step 5-3: Perform contextual feature enhancement on the fused feature maps $P_{i,80\times80\times64}$, $P_{i,40\times40\times128}$, and $P_{i,20\times20\times256}$ to obtain enhanced feature maps;
[0029] Step 5-4: Input the enhanced feature maps into the feature decoding structure to obtain the prediction vector; the final prediction vector contains $(t_x, t_y, t_w, t_h)$ representing the location of the prediction box, $conf_i$ representing the confidence score of the current prediction box, and $Probability_{class}$ representing the probability of the target class. To normalize the predicted bounding box output, the center coordinates and size of the predicted bounding box $(b_x, b_y, b_w, b_h)$ are determined from the relative predicted coordinates $(t_x, t_y, t_w, t_h)$ of the corresponding grid cell. The calculation formulas are as follows:
[0030] $b_x = c_x + 2\sigma(t_x) - 0.5$
[0031] $b_y = c_y + 2\sigma(t_y) - 0.5$
[0032] $b_w = p_w \cdot (2\sigma(t_w))^2$
[0033] $b_h = p_h \cdot (2\sigma(t_h))^2$;
[0034] where $(c_x, c_y)$ represents the relative coordinates of the top-left corner of the grid cell responsible for predicting the target.
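The decoding in Step 5-4 can be sketched as follows, reading the offset terms as $2\sigma(t) - 0.5$ and taking $\sigma$ to be the logistic sigmoid. The description does not define $p_w$ and $p_h$; the sketch assumes they are the width and height of a prior (anchor) box, as is common for detectors using this parameterization. Function names are illustrative.

```python
import math


def sigmoid(t: float) -> float:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(t, cell, prior):
    """Decode raw predictions (tx, ty, tw, th) into (bx, by, bw, bh).

    cell:  (cx, cy) top-left corner of the responsible grid cell.
    prior: (pw, ph) prior box size (assumed meaning of p_w, p_h).
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = prior
    bx = cx + 2.0 * sigmoid(tx) - 0.5      # center may move 0.5 cell beyond the cell
    by = cy + 2.0 * sigmoid(ty) - 0.5
    bw = pw * (2.0 * sigmoid(tw)) ** 2     # size is bounded to (0, 4) x prior
    bh = ph * (2.0 * sigmoid(th)) ** 2
    return bx, by, bw, bh


# With all raw outputs at 0, sigmoid gives 0.5: the box sits at the cell
# center and keeps the prior size.
decoded = decode_box((0.0, 0.0, 0.0, 0.0), cell=(3, 4), prior=(10.0, 20.0))
print(decoded)
```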
[0035] Preferably, step 5-1 includes the following sub-steps:
[0036] Step 5-1-1: Perform shallow feature extraction on the input image with the DW-CBS module, which applies depthwise separable convolution, batch normalization, and SiLU activation;
[0037] Step 5-1-2: Stack the GhostELA-DWC3 module and the DW-CBS module multiple times for multi-scale feature extraction. GhostELA-DWC3 is an improved deep feature extraction module. The stacking is as follows: stacking the DW-CBS module and the GhostELA-DWC3 module twice yields the feature map $F_{i,80\times80\times128}$, stacking three times yields the feature map $F_{i,40\times40\times256}$, and stacking four times yields $F_{i,20\times20\times512}$.
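To illustrate why depthwise separable convolution suits a lightweight model, the short calculation below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1×1 pointwise convolution. The 3×3 kernel and the 128-to-256 channel example are assumptions chosen for illustration (bias terms are ignored); the description does not give the kernel sizes used inside DW-CBS.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (no bias)."""
    depthwise = k * k * c_in       # one k x k filter per input channel
    pointwise = c_in * c_out       # 1 x 1 conv that mixes channels
    return depthwise + pointwise


std = standard_conv_params(128, 256, 3)
dws = depthwise_separable_params(128, 256, 3)
print(std, dws, round(std / dws, 2))
```

For this example the separable form needs roughly one ninth of the weights of the standard convolution.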
[0038] Preferably, step 5-2 includes the following sub-steps:
[0039] Step 5-2-1: Spatial pyramid pooling is performed on $F_{i,20\times20\times512}$ using the SPPF block, followed by the GhostELA-DWC3 module, the DW-CBS module, and nearest neighbor interpolation to obtain the 2× upsampled feature $F_{i,40\times40\times128}$. This feature is concatenated with $F_{i,40\times40\times256}$ along the channel direction to obtain the upsampled fused feature map $F_{i,40\times40\times384}$. In the same way, the upsampled fused feature map at another scale, $F_{i,80\times80\times192}$, is obtained;
[0040] Step 5-2-2: The feature map $F_{i,80\times80\times192}$ is further processed by the GhostELA-DWC3 module to obtain the fused feature map $P_{i,80\times80\times64}$. Following the idea of cross-layer weighted fusion, the channel weights $W$ of the multi-layer feature maps are calculated using fast normalization:
[0041] $W_i = \dfrac{w_i}{\varepsilon + \sum_j w_j}$
[0042] In the formula, $w_i$ are learnable weights and $\varepsilon = 0.0001$ is used for stable gradient propagation; the SiLU activation function is applied after each $w_i$ to ensure the weight value $w_i > 0$; therefore, each normalized weight value is limited to between 0 and 1;
[0043] Step 5-2-3: The initial feature map, the upsampled fused feature map $F_{i,80\times80\times128}$, and the fused feature map $P_{i,80\times80\times64}$ are weighted and concatenated along the channel direction to obtain the fused feature map $P_{i,40\times40\times128}$. In the same way, the fused feature map $P_{i,20\times20\times256}$ is obtained.
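The fast normalization of Step 5-2-2 can be sketched as below, assuming it takes the commonly used form $W_i = w_i / (\varepsilon + \sum_j w_j)$ with weights that are already non-negative after the activation. The function name and the example weights are illustrative only.

```python
def fast_normalize(raw_weights, eps: float = 1e-4):
    """Normalize non-negative fusion weights so each lies between 0 and 1.

    Assumed form: W_i = w_i / (eps + sum_j w_j). Unlike softmax, no
    exponentials are needed, which keeps the fusion cheap.
    """
    if any(w < 0 for w in raw_weights):
        raise ValueError("weights are expected to be non-negative")
    total = eps + sum(raw_weights)
    return [w / total for w in raw_weights]


# Three feature maps competing for importance in one fusion node
weights = fast_normalize([1.0, 1.0, 2.0])
print(weights, sum(weights))
```

Because of $\varepsilon$, the normalized weights sum to slightly less than 1, and the division stays well defined even if every raw weight is 0.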
[0044] Preferably, step 5-3 includes the following sub-steps:
[0045] Step 5-3-1: Convert the fused feature map $P_{i,80\times80\times64}$ into the query $Q$, key $K$, and value $V$ as follows:
[0046] $Q = P_{i,80\times80\times64} \cdot M_q$
[0047] $K = P_{i,80\times80\times64} \cdot M_k$
[0048] $V = P_{i,80\times80\times64} \cdot M_v$;
[0049] where $M_q$, $M_k$, and $M_v$ are the embedding matrices used to produce the query, key, and value, respectively.
[0050] Step 5-3-2: Assume the center key of the context region is $X_{cen}$ and the size of the surrounding region is $k \times k$, where $k = 3$; applying a $k \times k$ convolution then yields the key information of each surrounding region; the learned context key $K_{Static}$ reflects static information about the center and its surroundings;
[0051] Step 5-3-3: Concatenate the context key and the query to form $[K_{Static}, Q]$, and perform self-attention encoding using two consecutive 1×1 convolutions:
[0052]
[0053] where $M_{att}$ denotes a 1×1 convolution, and the other operator denotes a 1×1 convolution followed by a SiLU activation layer; the local attention matrix $W_{att}$ is therefore learned from the query features and the contextual key features, that is, the "self-attention" of local regions is enhanced by mining contextual features;
[0054] Step 5-3-4: Aggregate the value feature matrix $V$, and apply a Softmax operation along the channel dimension to calculate the dynamic context self-attention weight matrix, as shown below:
[0055]
[0056] where the Softmax term represents the attention score;
[0057] Step 5-3-5: Fuse the static context feature $K_{Static}$ and the dynamic context feature $K_{dynamic}$ through a channel superposition and fusion mechanism; in the same way, perform feature enhancement on $P_{i,40\times40\times128}$ and $P_{i,20\times20\times256}$.
[0058] Preferably, step 6 includes the following sub-steps:
[0059] Step 6-1: Since the reinforcing bars in tunnel lining are usually arranged side by side, the length of the rebar target area in the detection result of $F_i$ is estimated as the distance between the leftmost and rightmost detection boxes. Assuming the number of rebar targets predicted in step 5-4 is $N$, and the set of center abscissas of the targets is $\{x_i \mid i = 1, \dots, N\}$, where $i$ is the index from left to right, the length $L$ of the rebar target area is calculated as follows:
[0060] $L = x_N - x_1$
[0061] Step 6-2: Calculate the average spacing of the predicted targets. Assuming the preset spacing is $d$ and the threshold between the average spacing and the preset spacing is $\theta$, the criterion for determining missing reinforcement is:
[0062]
[0063] Step 6-3: Based on the criterion for missing reinforcement, determine whether the detection result of the current segmentation map $F_i$ shows missing reinforcement. If reinforcement is missing, the result is marked with the word "Lackrebar".
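The post-processing of Step 6 can be sketched as follows. The criterion formula itself is not reproduced in the text above, so the sketch ASSUMES one plausible form: reinforcement is flagged as missing when the average spacing exceeds the preset spacing $d$ by more than the threshold $\theta$. The average spacing is taken as the rebar area length divided by $N - 1$. Both the criterion and all names are assumptions made for illustration, not the patented rule.

```python
def check_missing_rebar(center_xs, preset_spacing, theta):
    """Return (average_spacing, is_missing) for detected rebar centers.

    center_xs: x coordinates of detected rebar box centers (any order).
    Assumed criterion: average_spacing - preset_spacing > theta.
    Returns None when fewer than two rebars are detected.
    """
    xs = sorted(center_xs)                 # index i runs left to right
    n = len(xs)
    if n < 2:
        return None
    length = xs[-1] - xs[0]                # leftmost to rightmost detection
    average_spacing = length / (n - 1)
    is_missing = (average_spacing - preset_spacing) > theta
    return average_spacing, is_missing


sparse = check_missing_rebar([100, 300, 500, 700], preset_spacing=100, theta=50)
regular = check_missing_rebar([300, 100, 200], preset_spacing=100, theta=50)
print(sparse, regular)
```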
[0064] The advantages of this invention compared to the prior art are:
[0065] The present invention provides a lightweight intelligent detection method for railway tunnel lining defects. Compared with the traditional manual subjective identification method, the proposed method is based on deep learning to train a lining defect detection model, and predicts the type and location of defects in the image to be inspected, thereby improving the intelligence and efficiency of the identification.
[0066] Compared with machine learning methods, the proposed method is based on convolutional neural networks, which can extract nonlinear features containing depth information. These features are multi-scale, rotation-invariant, and environmentally robust, which is beneficial for improving the detection accuracy of voided structures and reinforcing bars in complex scenarios.
[0067] Existing lining defect detection models mostly target void defects. The proposed method includes two categories in the network model training: voided structures and reinforcing bars. It also adds post-processing of the reinforcing bar target for missing reinforcement determination, enabling the method of this invention to effectively detect both voided lining defects and missing reinforcing bars.
[0068] The proposed method constructs a detection model for lining voids and rebar targets. This detection model, based on an "end-to-end" detection architecture, introduces numerous innovative contributions. First, in the feature extraction and fusion sections, a lightweight convolutional module is improved and constructed from the perspectives of gradient splitting and redundant feature extraction, ensuring efficient feature representation and improving inference speed. Second, a cross-layer weighted channel feature fusion method is proposed in the feature fusion network. By establishing a weight competition mechanism for fused features, the channel feature fusion effect is improved and the number of model parameters is significantly reduced. Finally, a feature enhancement structure is added to the model. This structure constructs a multi-head attention convolutional module based on a context Transformer to enhance the feature interaction within the context, improving detection performance in complex scenarios.
[0069] From a feasibility perspective, the method of this invention implements a "segmentation-detection-merging" process for long-format tunnel lining images. Compared to directly detecting long-format lining radar images, this significantly improves detection efficiency. The lightweight detection model constructed by the proposed method has lower requirements for hardware memory and video memory, thus exhibiting strong portability. In summary, this method provides a practical, lightweight, and intelligent defect detection method for railway tunnel lining inspection, replacing manual identification, and is conducive to achieving high efficiency and intelligentization in railway tunnel engineering.
Attached Figure Description
[0070] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0071] Figure 1 is the overall structural framework for detecting defects in railway tunnel lining provided in an embodiment of the present invention;
[0072] Figure 2 is a schematic diagram of the sliding window segmentation of the original data provided in an embodiment of the present invention;
[0073] Figure 3 is a diagram of the lightweight detection network model for lining defects constructed according to an embodiment of the present invention;
[0074] Figure 4 is a schematic diagram of the efficient layer aggregation feature extraction module provided in an embodiment of the present invention;
[0075] Figure 5 is a flowchart of the feature enhancement module provided in an embodiment of the present invention;
[0076] Figure 6 is a diagram of the feature decoupling network structure provided in an embodiment of the present invention;
[0077] Figure 7 is a composite image of radar images of a long tunnel lining.
Detailed Implementation
[0078] The embodiments of this application are described in detail below. Examples of the embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain this application, and should not be construed as limiting this application.
[0079] In the description of this application, it should be noted that, unless otherwise expressly specified and limited, the terms "installation," "connection," and "joining" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; they can refer to the internal communication of two components or the interaction between two components. Those skilled in the art can understand the specific meaning of the above terms in this application according to the specific circumstances.
[0080] The present invention will now be described in further detail with reference to the accompanying drawings.
[0081] With reference to Figures 1 to 7, the present invention provides a lightweight intelligent detection method for railway tunnel lining defects. The experimental environment is as follows: the operating system is Windows 10, the processor is an Intel i7-10875H with a clock frequency of 2.90 GHz, and the memory is 16.00 GB. The experiments were developed and debugged in PyCharm using Python and the PyTorch framework.
[0082] The method is described below with reference to the overall structural framework in Figure 1 and the other example figures. It includes the following steps:
[0083] Step 1: Acquire multiple radar images of railway tunnel linings along a specific mileage. The images contain reflective structures of various types, including secondary lining, reinforcing steel, and initial support.
[0084] Step 2: Based on the mileage information of each input image, perform operations such as annotation, segmentation, and target relative coordinate transformation on the lining radar image to construct an image dataset for training. The specific operation is shown in Figure 2;
[0085] Step 2-1: The segmentation method adopts sliding window segmentation. The step size s of the sliding window is set to 1000 pixels, which corresponds to a segmentation image mileage of approximately 10m and a segmentation margin g of 200 pixels.
[0086] Step 2-2: To prevent the segmentation operation from cutting through rebar and void targets, and to ensure the integrity of the targets, shifting processing is performed on the segmentation edges. The processing depends on whether a target exists at the edge of each window. If the segmentation line passes through a target, the segmentation line is shifted; if the segmentation line does not pass through a target, the width of the segmented area remains unchanged. Several segmentation images are obtained in this way.
[0087] Step 2-3: To facilitate training, the initially annotated bounding box positions are converted into bounding box coordinates within the segmentation image. Assuming the original parameters of a target bounding box in the i-th segmentation image $F_i$ are $(x_1, y_1, x_2, y_2)$ and the shift is $e_i$ pixels, the relative position $(x_1', y_1', x_2', y_2')$ is calculated as follows:
[0088]
[0089] $y_1' = y_1$
[0090]
[0091] $y_2' = y_2$
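The sliding-window segmentation with shifted cut lines (Steps 2-1 and 2-2) can be sketched as below. The description gives the step size s = 1000 pixels and a margin g = 200 pixels but not the exact shifting rule, so the sketch ASSUMES that a cut line passing through a target is moved right to the target's right edge, by at most g pixels. The function name and the rule are hypothetical.

```python
def split_windows(width, targets, s=1000, g=200):
    """Return (left, right) pixel ranges that cover [0, width).

    targets: list of (x1, x2) horizontal extents of annotated boxes.
    Cut lines are nominally placed every s pixels; a cut that would pass
    through a target is shifted right (assumed rule, capped at g pixels).
    """
    cuts = [0]
    nominal = s
    while nominal < width:
        cut = nominal
        for x1, x2 in targets:
            if x1 < cut < x2:                 # cut line passes through a target
                cut = min(x2, nominal + g)    # shift past it, at most g pixels
                break
        cuts.append(min(cut, width))
        nominal += s
    cuts.append(width)
    # drop empty windows that appear if a shifted cut reaches the image end
    return [(left, right) for left, right in zip(cuts[:-1], cuts[1:]) if right > left]


# A 3000-pixel image with one target straddling the nominal cut at x = 1000
windows = split_windows(3000, [(950, 1080)])
print(windows)
```

In this example the first cut moves from 1000 to 1080, so the target stays whole inside the first window. The sketch checks only the first target that a cut crosses; a target wider than g would still be cut.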
[0092] Step 3: Perform adaptive scaling, padding, and other preprocessing operations on the segmented image set of the original data to obtain preprocessed images with a fixed scale of 640×640 pixels; that is, scale normalization is performed on each segmented image $F_i$, using adaptive scaling and padding to obtain a fixed input scale of $c \times c$ ($c = 640$). Assuming the size of the segmentation map $F_i$ is $a \times b$ ($a \ge b$), the adaptive scaling factor $\lambda$ is defined as the reciprocal of the ratio of the longer side $a$ to the fixed length $c$, i.e. $\lambda = c / a$. After scaling, the longer side becomes $c$ and the shorter side becomes $\lambda \cdot b$. To reach the normalized scale, the short side is padded with 0 on both sides. The single-sided padding size $m$, in pixels, is calculated as:
[0093] $m = \dfrac{c - \lambda \cdot b}{2}$
[0094] Step 4: Construct a tunnel lining structure detection network. Based on the target bounding box annotation data in the training set, train a detection model for reinforcing bars and voided structures to obtain model parameters and neural network weight files; that is, construct a convolutional neural network for identifying voided structures and reinforcing bars, as shown in Figure 3. It includes an efficient feature extraction network, a lightweight feature fusion network, a feature enhancement module, and a feature decoding structure. The location and category labels of all targets in the segmentation maps are input to train the neural network and obtain the weight parameters.
[0095] Step 5: Perform the preprocessing operations of Step 2 and Step 3 on the image to be inspected, then run inference with the network structure shown in Figure 3, using the trained weight parameters from Step 4, to detect the category and location parameters of the rebar and void structures in all segmentation maps;
[0096] Step 5-1: Input the preprocessed image into the efficient feature extraction network to obtain initial feature maps at three scales, $F_{i,80\times80\times128}$, $F_{i,40\times40\times256}$, and $F_{i,20\times20\times512}$;
[0097] Step 5-1-1: Perform shallow feature extraction on the input image using depthwise separable convolution, batch normalization, and SiLU activation (the DW-CBS module);
[0098] Step 5-1-2: Then, the GhostELA-DWC3 module and the DW-CBS module are stacked multiple times for multi-scale feature extraction. GhostELA-DWC3 is an improved deep feature extraction module, whose structure is shown in Figure 4. Specifically, stacking the DW-CBS module and the GhostELA-DWC3 module twice yields the feature map $F_{i,80\times80\times128}$, stacking three times yields the feature map $F_{i,40\times40\times256}$, and stacking four times yields $F_{i,20\times20\times512}$;
[0099] Step 5-2: Use the lightweight feature fusion network to perform cross-layer weighted fusion of the initial features to obtain fused feature maps at three scales, $P_{i,80\times80\times64}$, $P_{i,40\times40\times128}$, and $P_{i,20\times20\times256}$;
[0100] Step 5-2-1: Spatial pyramid pooling is performed on $F_{i,20\times20\times512}$ using the SPPF block, followed by the GhostELA-DWC3 module, the DW-CBS module, and nearest neighbor interpolation to obtain the 2× upsampled feature $F_{i,40\times40\times128}$. This feature is concatenated with $F_{i,40\times40\times256}$ along the channel direction to obtain the upsampled fused feature map $F_{i,40\times40\times384}$. In the same way, the upsampled fused feature map at another scale, $F_{i,80\times80\times192}$, is obtained;
[0101] Step 5-2-2: The feature map $F_{i,80\times80\times192}$ is further processed by the GhostELA-DWC3 module to obtain the fused feature map $P_{i,80\times80\times64}$. Following the idea of cross-layer weighted fusion, the channel weights $W$ of the multi-layer feature maps are calculated using fast normalization:
[0102] $W_i = \dfrac{w_i}{\varepsilon + \sum_j w_j}$
[0103] where $w_i$ are learnable weights and $\varepsilon = 0.0001$ is used for stable gradient propagation. The SiLU activation function is applied after each $w_i$ to ensure the weight value $w_i > 0$. Therefore, each normalized weight value is limited to between 0 and 1.
[0104] Step 5-2-3: The initial feature map, the upsampled fused feature map $F_{i,80\times80\times128}$, and the fused feature map $P_{i,80\times80\times64}$ are weighted and concatenated along the channel direction to obtain the fused feature map $P_{i,40\times40\times128}$. In the same way, the fused feature map $P_{i,20\times20\times256}$ is obtained;
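As a check on the feature map sizes quoted in Steps 5-1 and 5-2, the sketch below reproduces the shape arithmetic. It assumes the three scales correspond to downsampling strides of 8, 16, and 32 on the 640×640 input (which is what 80, 40, and 20 imply), and it infers that the upsampled branch at the 80×80 level has 192 − 128 = 64 channels; neither value is stated explicitly in the description.

```python
def concat_channels(*shapes):
    """Shape of a channel-wise concatenation of (H, W, C) feature maps."""
    h, w, _ = shapes[0]
    if any((sh, sw) != (h, w) for sh, sw, _ in shapes):
        raise ValueError("spatial sizes must match for channel concatenation")
    return (h, w, sum(c for _, _, c in shapes))


input_size = 640
grids = [input_size // stride for stride in (8, 16, 32)]   # assumed strides

f80, f40 = (80, 80, 128), (40, 40, 256)   # initial feature maps
up40 = (40, 40, 128)                      # 2x upsampled from the 20x20 level
fused40 = concat_channels(up40, f40)      # upsampled fused map at 40x40

up80 = (80, 80, 192 - f80[2])             # inferred channel count
fused80 = concat_channels(up80, f80)      # upsampled fused map at 80x80

print(grids, fused40, fused80)
```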
[0105] Step 5-3: Further, using the method shown in Figure 5, perform contextual feature enhancement on the fused feature maps $P_{i,80\times80\times64}$, $P_{i,40\times40\times128}$, and $P_{i,20\times20\times256}$ to obtain enhanced feature maps.
[0106] Step 5-3-1: Taking the fused feature map $P_{i,80\times80\times64}$ as an example, convert it into the query $Q$, key $K$, and value $V$ as follows:
[0107] $Q = P_{i,80\times80\times64} \cdot M_q$
[0108] $K = P_{i,80\times80\times64} \cdot M_k$
[0109] $V = P_{i,80\times80\times64} \cdot M_v$
[0110] where $M_q$, $M_k$, and $M_v$ are the embedding matrices used to produce the query, key, and value, respectively.
[0111] Step 5-3-2: Assume the center key of the context region is $X_{cen}$ and the size of the surrounding region is $k \times k$ ($k = 3$); applying a $k \times k$ convolution then yields the key information of each surrounding region. Similar to sliding window convolution, the learned context key $K_{Static}$ reflects static information about the center and its surroundings.
[0112] Step 5-3-3: Concatenate the context key and the query to form $[K_{Static}, Q]$, and perform self-attention encoding using two consecutive 1×1 convolutions:
[0113]
[0114] where $M_{att}$ denotes a 1×1 convolution, and the other operator denotes a 1×1 convolution followed by a SiLU activation layer. The local attention matrix $W_{att}$ is therefore learned from the query features and the contextual key features, that is, the "self-attention" of local regions is enhanced by mining contextual features;
[0115] Step 5-3-4: Aggregate the value feature matrix $V$, and apply a Softmax operation along the channel dimension to calculate the dynamic context self-attention weight matrix, as shown below:
[0116]
[0117] where the Softmax term represents the attention score;
[0118] Step 5-3-5: Fuse the static context feature $K_{Static}$ and the dynamic context feature $K_{dynamic}$ through a channel superposition and fusion mechanism. In the same way, perform feature enhancement on $P_{i,40\times40\times128}$ and $P_{i,20\times20\times256}$.
[0119] Step 5-4: Input the enhanced feature maps into the feature decoding structure to obtain the prediction vector. The specific process is shown in Figure 6. The final prediction vector contains $(t_x, t_y, t_w, t_h)$ representing the location of the prediction box, $conf_i$ representing the confidence score of the current prediction box, and $Probability_{class}$ representing the probability of the target class. To normalize the predicted bounding box output, the center coordinates and size of the predicted bounding box $(b_x, b_y, b_w, b_h)$ are determined from the relative predicted coordinates $(t_x, t_y, t_w, t_h)$ of the corresponding grid cell. The calculation formulas are as follows:
[0120] $b_x = c_x + 2\sigma(t_x) - 0.5$
[0121] $b_y = c_y + 2\sigma(t_y) - 0.5$
[0122] $b_w = p_w \cdot (2\sigma(t_w))^2$
[0123] $b_h = p_h \cdot (2\sigma(t_h))^2$
[0124] where $(c_x, c_y)$ represents the relative coordinates of the top-left corner of the grid cell responsible for predicting the target.
[0125] Step 6: Based on the preset rebar spacing and the number of detected rebars, determine whether the detection result of segmentation map $F_i$ shows missing reinforcement, and obtain the post-processed detection result image.
[0126] Step 6-1: Since the reinforcing bars in tunnel lining are usually arranged side by side, the length of the rebar target area in the detection result of $F_i$ can be estimated as the distance between the leftmost and rightmost detection boxes. Assuming the number of rebar targets predicted in step 5-4 is $N$, and the set of x-coordinates of the target centers is $\{x_i \mid i = 1, \dots, N\}$, where $i$ is the index from left to right, the length $L$ of the rebar target area is calculated as follows:
[0127] $L = x_N - x_1$
[0128] Step 6-2: Calculate the average spacing of the predicted targets. Assuming the preset spacing is $d$ and the threshold between the average spacing and the preset spacing is $\theta$, the criterion for determining missing reinforcement is:
[0129]
[0130] Step 6-3: Using the criterion for missing reinforcement, determine whether the detection results of the current segmentation map $F_i$ indicate missing rebar. If rebar is missing, the result is marked with the text "Lack rebar".
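The formulas for Steps 6-1 and 6-2 are not reproduced in this text, so the following Python sketch only illustrates one plausible reading of Step 6. It assumes that the region length is the difference between the rightmost and leftmost center x-coordinates, that the average spacing is this length divided by N − 1, and that rebar is judged missing when the average spacing exceeds the preset spacing d by more than θ. These three formulas and all function names are assumptions, not the patented formulas.

```python
def rebar_region_length(xs):
    """Step 6-1 (assumed form): distance between leftmost and rightmost centers."""
    ordered = sorted(xs)
    return ordered[-1] - ordered[0]


def average_spacing(xs):
    """Step 6-2 (assumed form): region length divided by the N - 1 gaps."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two rebar detections are needed")
    return rebar_region_length(xs) / (n - 1)


def lacks_rebar(xs, d, theta):
    """Step 6-3 (assumed criterion): average spacing exceeds d by more than theta."""
    return average_spacing(xs) - d > theta


def label_segment(xs, d, theta):
    """Return the marker text used in Step 6-3, or an empty string."""
    return "Lack rebar" if lacks_rebar(xs, d, theta) else ""
```

For example, centers at 0, 20, 40, and 60 pixels with d = 20 and θ = 5 give an average spacing of 20 and no marker, while centers at 0, 20, and 60 give an average spacing of 30 and the segment is marked. Under these assumptions, a bar missing at either end of the region shortens the measured length and would not be caught by this check.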
[0131] Step 7: Merge all post-processed segmentation map detection results. Based on the segmentation parameters from Step 2, the detection results of all segmentation maps are stitched together to obtain a final long-width detection result with the same size as the input data, as shown in Figure 7.
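The stitching in Step 7 can be sketched as mapping the boxes of each segmentation map back to coordinates in the full-length image. The sketch assumes that each segmentation map records the horizontal pixel offset of its left edge in the original image (the hypothetical value `x_offset`) and that boxes are already expressed in unscaled segment pixels. The `Box` type and `merge_detections` function are illustrative names, and duplicate detections inside overlapping margins are not removed here.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float      # center x, in pixels
    y: float      # center y, in pixels
    w: float      # width, in pixels
    h: float      # height, in pixels
    label: str    # e.g. "rebar" or "void"


def merge_detections(segments):
    """Map per-segment boxes back to full-image coordinates.

    segments: list of (x_offset, boxes) pairs, where x_offset is the
    left edge of the segment in the original image (ASSUMPTION).
    """
    merged = []
    for x_offset, boxes in segments:
        for b in boxes:
            # Only x changes: segments are cut along the mileage direction.
            merged.append(Box(b.x + x_offset, b.y, b.w, b.h, b.label))
    # Order results from left to right along the tunnel.
    merged.sort(key=lambda b: b.x)
    return merged
```

Sorting by the global x-coordinate keeps the merged result in mileage order regardless of the order in which segments were processed.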
[0132] The present invention and its embodiments have been described above. This description is not restrictive, and the accompanying drawings are only one embodiment of the present invention; the actual structure is not limited thereto. In conclusion, if those skilled in the art are inspired by this description and design similar structures and embodiments without departing from the spirit of the invention, such designs should fall within the protection scope of the present invention.
Claims
1. A lightweight intelligent detection method for railway tunnel lining defects, characterized in that it includes the following steps: Step 1: Acquire multiple radar images of the railway tunnel lining over a specific mileage, the images containing various reflective structures, including the secondary lining, reinforcing bars, and initial support; Step 2: Based on the mileage information of each input image, perform annotation, segmentation, and target relative-coordinate transformation on the lining radar images to construct an image dataset for training; Step 3: Perform scale normalization on each segmentation map $F_i$, using adaptive scaling and padding to obtain a fixed input scale $c \times c$, where $c = 640$; with $a$ denoting the longer side of a segmentation map, the adaptive scaling factor is defined as the reciprocal of the ratio of the longer side $a$ to the fixed length $c$, i.e., $c/a$; after scaling, the longer side has length $c$ and the shorter side is scaled by the same factor; both ends of the shorter side are padded with 0, the padding size on one end being $m$; Step 4: Construct a convolutional neural network for identifying void structures and rebar, comprising an efficient feature extraction network, a lightweight feature fusion network, a feature enhancement module, and a feature decoding structure; input the location and category labels of all targets in the segmentation maps to train the network and obtain weight parameters; Step 5: Apply the preprocessing operations of Step 2 and Step 3 to the image to be inspected, then run inference with the network using the weight parameters trained in Step 4 to detect the categories and location parameters of rebar and void structures in all segmentation maps; Step 5 includes the following sub-steps: Step 5-1: Input the preprocessed image into the efficient feature extraction network to obtain initial feature maps at three scales; Step 5-2: Use the lightweight feature fusion network to perform cross-layer weighted fusion of the initial feature maps to obtain fused feature maps at three scales; Step 5-3: Perform contextual feature enhancement on the fused feature maps to obtain enhanced feature maps; Step 5-4: Input the enhanced feature maps into the feature decoding structure to obtain the prediction vector; the final prediction vector contains $(t_x, t_y, t_w, t_h)$, representing the location of the prediction box, $\text{confi}$, representing the confidence score of the current prediction box, and $\text{Probability}_{class}$, representing the probability of the target class; to normalize the predicted bounding box output, the center coordinates and size $(b_x, b_y, b_w, b_h)$ of the predicted bounding box are determined from the relative predicted coordinates $(t_x, t_y, t_w, t_h)$ of the corresponding grid cell, with the calculation formula: $b_x = c_x + 2\sigma(t_x) - 0.5$, $b_y = c_y + 2\sigma(t_y) - 0.5$, $b_w = p_w \cdot (2\sigma(t_w))^2$, $b_h = p_h \cdot (2\sigma(t_h))^2$; where $(c_x, c_y)$ denotes the relative coordinates of the top-left corner of the grid cell responsible for predicting the target; Step 6: Based on the detection results of segmentation map $F_i$, determine whether rebar is missing; Step 7: Based on the segmentation parameters in Step 2, stitch together the detection results of all segmentation maps to obtain a long-width detection result with the same size as the input data.
2. The lightweight intelligent detection method for railway tunnel lining defects according to claim 1, characterized in that Step 2 includes the following sub-steps: Step 2-1: Use sliding-window segmentation as the segmentation method, setting the sliding-window step size $s$ to 1000 pixels, which corresponds to a segment distance of 10 m, and the segmentation margin $g$ to 200 pixels; Step 2-2: Shift the segmentation edges where necessary to preserve the integrity of targets and prevent the segmentation from cutting through rebar and void targets: check whether a target lies on the edge of each window; if the segmentation line passes through a target, shift the line; if it does not, the width of the segmented area remains unchanged; several segmentation maps are obtained in this way; Step 2-3: To facilitate training, convert the initially annotated bounding box positions into bounding box coordinates within the segmentation map; given the original parameters of a target bounding box in the $i$-th segmentation map $F_i$ and the pixel offset of that map, the relative position of the bounding box is calculated from them.
3. The lightweight intelligent detection method for railway tunnel lining defects according to claim 1, characterized in that Step 5-1 includes the following sub-steps: Step 5-1-1: Apply depthwise separable convolution, batch normalization, and SiLU activation to the input image, performing shallow feature extraction with the DW-CBS module; Step 5-1-2: Stack the GhostELA-DWC3 module and the DW-CBS module multiple times for multi-scale feature extraction, GhostELA-DWC3 being an improved deep feature extraction module; the stacking is as follows: stacking the DW-CBS module and the GhostELA-DWC3 module twice yields the first initial feature map, stacking three times yields the second, and stacking four times yields the third.
4. The lightweight intelligent detection method for railway tunnel lining defects according to claim 1, characterized in that Step 5-2 includes the following sub-steps: Step 5-2-1: Perform spatial pyramid pooling on the initial feature map with an SPPF block, then apply the GhostELA-DWC3 module, the DW-CBS module, and nearest-neighbor interpolation to obtain 2× upsampled features; concatenate these features with the initial feature map of matching scale along the channel direction to obtain an upsampled fused feature map; an upsampled fused feature map at another scale is obtained in the same way; Step 5-2-2: Process the feature map further with the GhostELA-DWC3 module to obtain a fused feature map; following the idea of cross-layer weighted fusion, the channel weights $W$ of the multi-layer feature maps are calculated using fast normalization, in which the weights are learnable and $\varepsilon = 0.0001$ is used for stable gradient propagation; the SiLU activation function is applied to each weight to ensure that the weight is greater than 0, so that each normalized weight value is limited to between 0 and 1; Step 5-2-3: Perform weighted concatenation of the initial feature map, the upsampled fused feature map, and the fused feature map along the channel direction to obtain a fused feature map; the fused feature maps at the other scales are obtained in the same way.
5. The lightweight intelligent detection method for railway tunnel lining defects according to claim 1, characterized in that Step 5-3 includes the following sub-steps: Step 5-3-1: Convert the fused feature map into query $Q$, key $K$, and value $V$ vectors by means of embedding matrices that transform the query, key, and value vectors, respectively; Step 5-3-2: Taking the center key of the context region with a surrounding area of size $k \times k$, where $k = 3$, compute a $k \times k$ convolution to obtain the key vector information of each surrounding region; the learned context key $K_{Static}$ reflects the static information of the center and its surroundings; Step 5-3-3: Concatenate the context key and the query vector as $[K_{Static}, Q]$ and perform self-attention encoding using two consecutive 1×1 convolutions, one a plain 1×1 convolution and the other a 1×1 convolution with a SiLU activation layer; the local attention matrix is therefore learned from the query features and the context key features, that is, it enhances the self-attention of local regions by mining contextual features; Step 5-3-4: Aggregate the feature matrix $V$ and perform a Softmax operation along the channel dimension to calculate the dynamic context self-attention weight matrix from the attention scores; Step 5-3-5: Fuse the static context features and the dynamic context features through a channel overlay and fusion mechanism; the other two fused feature maps are enhanced in the same way.
6. The lightweight intelligent detection method for railway tunnel lining defects according to claim 1, characterized in that Step 6 includes the following sub-steps: Step 6-1: Since the reinforcing bars in tunnel lining are usually arranged side by side, the length of the rebar target area in the detection result image of $F_i$ is estimated as the distance between the leftmost and rightmost detection boxes; assuming that the number of rebar targets predicted in Step 5-4 is $N$ and that the x-coordinates of the target centers are indexed by $i$ in order from left to right, the length of the rebar target area is calculated accordingly; Step 6-2: Calculate the average spacing of the predicted targets; assuming the preset spacing is $d$ and the threshold relating the average spacing to the preset spacing is $\theta$, the criterion for determining missing reinforcement is established; Step 6-3: Using the criterion for missing reinforcement, determine whether the detection results of the current segmentation map indicate missing rebar; if rebar is missing, the result is marked with the text "Lack rebar".