A method for detecting rail surface defects
By using multi-source data fusion and neural network detection, the problems of low efficiency and poor accuracy in detecting track surface defects have been solved, achieving efficient and automated defect detection and early warning.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- RES INST OF ZHEJIANG UNIV TAIZHOU
- Filing Date
- 2026-03-26
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies for detecting defects on track surfaces are inefficient, subjective, have a high rate of missed detections, produce fragmented data, and are limited by single sensors, making it difficult to meet the needs of large-scale track inspection.
A multi-source data acquisition system comprising a LiDAR and industrial cameras is combined with a YOLOv5-DeepLabV3+ hybrid neural network to fuse the multi-source data, extract the geometric and texture features of defects, and detect them automatically.
The method improves the detection accuracy of micro-cracks and spalling, supports detection at night and under complex lighting conditions, automates the full process, generates traceable defect reports, and triggers real-time early warnings.
Figure CN122300569A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of rail transit infrastructure inspection technology and relates to a method for detecting defects on the track surface.
Background Technology
[0002] Railway track is the fundamental carrier of railway transportation, and its surface quality directly affects the safety and stability of train operation. As railway operating mileage and transport capacity increase, track surface defects (such as fatigue cracks, corrugation, spalling, edge thickening, and abrasions) are becoming more prevalent. Traditional inspection relies mainly on manual inspection and small flaw detectors, which have the following significant drawbacks:
1. Low inspection efficiency: Manual inspection requires walking along the track or riding a track vehicle and takes about 2-3 hours per kilometer, which makes it difficult to meet the needs of large-scale line inspection;
2. High subjectivity and high missed detection rate: The inspection results depend on the experience of the inspectors, who cannot reliably identify small cracks (<0.5mm) and hidden defects, so defects are easily missed;
3. Fragmented data: Manual records lack unified standards, making it difficult to form a traceable digital defect archive;
4. Limitations of single sensors: Existing automated inspection equipment mostly uses vision alone or ultrasound alone. The vision method is strongly affected by lighting, and the ultrasonic method has insufficient resolution for shallow surface defects.
Summary of the Invention
[0003] In order to overcome at least one deficiency of the prior art, the present invention provides a method for detecting defects on track surfaces.
[0004] To achieve the above objectives, the present invention adopts the following technical solution: a method for detecting defects on track surfaces, comprising the following steps:
Step 1: Multi-source data acquisition: Collect three-dimensional point cloud data and two-dimensional texture images of the track surface using a multi-source data acquisition system mounted on the inspection vehicle, while maintaining the spatial calibration of the sensor coordinate system and the track coordinate system during acquisition;
Step 2: Data preprocessing: Perform denoising, downsampling, and normal vector estimation on the point cloud data; perform distortion correction, illumination equalization, and region of interest extraction on the image; and achieve spatiotemporal registration of the point cloud and the image through feature point matching;
Step 3: Defect feature extraction: Input the registered multi-source data into the YOLOv5-DeepLabV3+ hybrid neural network to extract the geometric and texture features of the defect simultaneously;
Step 4: Defect type identification and localization: Based on the extracted features, determine the defect type with a classifier, and determine its spatial location on the track by combining the point cloud coordinates;
Step 5: Defect severity assessment: Based on the set standards, combined with defect size thresholds and statistical analysis of historical data, classify the defect into levels (mild / moderate / severe / critical), and generate a defect report including location, type, level, and development rate;
Step 6: Trigger real-time early warnings for defects of severe or higher level, push them to the maintenance management system, and set maintenance plans based on the distribution patterns of the defects.
[0005] Furthermore, the multi-source data acquisition system includes a detection platform, a lidar, an industrial camera, and a spatial calibration structure. The lidar is used to scan the rail surface and the side of the rail head. The industrial camera is installed next to the lidar at a 30° angle to the rail surface and is illuminated by a ring LED light source. The spatial calibration structure is a calibration plate installed at the front of the detection vehicle. The transformation matrix between the lidar coordinate system {L} and the camera coordinate system {C} is obtained by hand-eye calibration.
[0006] Furthermore, the lidar and camera achieve microsecond-level time synchronization via the PTP protocol to ensure data spatiotemporal consistency.
[0007] Furthermore, the data preprocessing in step 2 includes:
Step 21: Point cloud data denoising: Statistical filtering is used to calculate, for each point, the average distance to the points in its neighborhood, and to remove outliers whose average distance exceeds the global mean by more than 3 times the standard deviation;
Step 22: Point cloud data downsampling: Use voxel grid filtering and set the voxel size;
Step 23: Point cloud data normal vector estimation: Based on the PCA algorithm over the K nearest neighbors (K=10), calculate the normal vector of each point for subsequent defect region segmentation;
Step 24: Image distortion correction: Obtain the camera intrinsic parameters and distortion coefficients using the Zhang Zhengyou calibration method, and use OpenCV's undistort function to correct radial and tangential distortion;
Step 25: Image illumination equalization: The CLAHE algorithm is used to perform histogram equalization on image blocks to improve the visibility of details in shadow areas;
Step 26: Image ROI extraction: Based on the track edge detection results, crop out the key areas of the rail head, rail web, and rail foot.
[0008] Furthermore, step 21 includes:
1. For each point $p_i = (x_i, y_i, z_i)$, search for all neighborhood points $p_j = (x_j, y_j, z_j)$ within its radius $r$, denoted as the set $N_r(p_i)$;
2. Calculate the average Euclidean distance $d_i$ from $p_i$ to its neighboring points: $d_i = \frac{1}{|N_r(p_i)|} \sum_{p_j \in N_r(p_i)} \lVert p_i - p_j \rVert$;
3. Calculate the global mean $\mu_d$ and standard deviation $\sigma_d$ of all $d_i$;
4. If $d_i > \mu_d + k\sigma_d$, determine $p_i$ to be an outlier and remove it; $k = 3$.
[0009] Furthermore, step 22 includes:
1. Define the voxel size $s$;
2. Partition the point cloud spatially, and represent all points falling into the same voxel by their centroid $p_{voxel}$: $p_{voxel} = \frac{1}{n} \sum_{k=1}^{n} p_k$, where $n$ is the number of points within the voxel;
3. Output the downsampled point cloud.
[0010] Furthermore, step 23 includes:
1. For each point $p_i$, search for its K nearest neighbors (K=10 in this example);
2. Calculate the covariance matrix $C$: $C = \frac{1}{K} \sum_{j=1}^{K} (p_j - \bar{p})(p_j - \bar{p})^T$, where $\bar{p}$ is the centroid of the neighborhood points;
3. Perform eigenvalue decomposition on $C$ and take the eigenvector corresponding to the smallest eigenvalue as the normal vector;
4. Unify the direction of the normal vectors.
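The PCA-based normal estimation of step 23 can be sketched roughly as follows. This is a minimal NumPy illustration, not the patent's implementation: the function name `estimate_normals`, the brute-force neighbor search, and the rule of orienting every normal toward a `viewpoint` (one possible way to "unify the direction") are assumptions made for the example.

```python
import numpy as np

def estimate_normals(points, k=10, viewpoint=(0.0, 0.0, 0.0)):
    """Estimate one unit normal per point from its k nearest neighbors."""
    points = np.asarray(points, dtype=float)
    viewpoint = np.asarray(viewpoint, dtype=float)
    normals = np.empty_like(points)
    for i, p in enumerate(points):
        # Brute-force neighbor search; index 0 is the point itself, so skip it.
        order = np.argsort(np.linalg.norm(points - p, axis=1))
        neighbors = points[order[1:k + 1]]
        # Covariance matrix C of the neighborhood around its centroid.
        centered = neighbors - neighbors.mean(axis=0)
        C = centered.T @ centered / len(neighbors)
        # eigh returns eigenvalues in ascending order, so column 0 belongs
        # to the smallest eigenvalue, i.e. the direction of least variance.
        _, eigvecs = np.linalg.eigh(C)
        n = eigvecs[:, 0]
        # Assumed orientation rule: flip the normal so it faces the viewpoint.
        if np.dot(n, viewpoint - p) < 0:
            n = -n
        normals[i] = n
    return normals
```

For a flat patch of points the estimated normals are perpendicular to the patch; a KD-tree would normally replace the brute-force search on real scans.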
[0011] Furthermore, step 24, image distortion correction, includes:
Step 241: Obtain the camera intrinsic parameter matrix K and the distortion coefficients using Zhang Zhengyou's calibration method;
Step 242: Remap the pixel coordinates of the original image I(u,v) and substitute them into the radial and tangential distortion model;
Step 243: Recover the distortion-free image by inverse mapping.
[0012] Furthermore, step 25, image illumination equalization, includes:
Step 251: Divide the image into M×N blocks (8×8 pixels in this example);
Step 252: Calculate the histogram for each block and perform contrast equalization;
Step 253: Use bilinear interpolation to eliminate block artifacts.
[0013] Furthermore, step 26, image ROI extraction, includes:
Step 261: Perform Canny edge detection on the corrected image to obtain a binary edge map;
Step 262: Use the Hough transform to detect straight segments and identify the upper and lower edges of the rail head and the side lines of the rail web;
Step 263: Calculate the vertices of the ROI polygon based on geometric relationships;
Step 264: Cut out the ROI and scale it to a uniform size for network input.
[0014] In summary, the advantages of this invention are as follows: multi-source data fusion compensates for the shortcomings of single sensors and greatly improves the accuracy of detecting micro-cracks and micro-peeling; the entire process is automated, which reduces manual intervention and supports detection at night and under complex lighting conditions; and the method supports online updates and is compatible with different track types and detection scenarios.
Attached Figure Description
[0015] Figure 1 is a flowchart of the detection method of the present invention.
[0016] Figure 2 is a diagram of the architecture of the multi-source data acquisition system of the present invention.
[0017] Figure 3 is a flowchart of the data preprocessing process of the present invention.
Detailed Implementation
[0018] The following specific examples illustrate the implementation of the present invention. Those skilled in the art can easily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other different specific embodiments, and various details in this specification can also be modified or changed based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that, unless otherwise specified, the following embodiments and features described therein can be combined with each other.
[0019] Example: As shown in Figures 1-3, a method for detecting defects on track surfaces includes the following steps:
Step 1: Multi-source data acquisition: Collect three-dimensional point cloud data and two-dimensional texture images of the track surface using a multi-source data acquisition system mounted on the inspection vehicle, while maintaining the spatial calibration of the sensor coordinate system and the track coordinate system during acquisition.
The multi-source data acquisition system includes a detection platform, a lidar, an industrial camera, and a spatial calibration structure. The lidar is used to scan the rail surface and the side of the rail head. The industrial camera is installed next to the lidar at a 30° angle to the rail surface and is illuminated by a ring LED light source. The spatial calibration structure is a calibration board (a 1000mm×1000mm checkerboard) installed at the front of the detection vehicle. The transformation matrix between the lidar coordinate system {L} and the camera coordinate system {C} is obtained by hand-eye calibration.
[0020] The lidar and camera achieve microsecond-level time synchronization through the PTP protocol, ensuring data spatiotemporal consistency.
[0021] Step 2: Data preprocessing: Perform denoising, downsampling, and normal vector estimation on the point cloud data; perform distortion correction, illumination equalization, and region of interest (ROI) extraction on the image; and achieve spatiotemporal registration of the point cloud and the image through feature point matching. Step 2, data preprocessing, includes the following steps:
Step 21: Point cloud data denoising: Using statistical filtering, calculate for each point the average distance to the points within its neighborhood (radius 0.5mm), and remove outliers whose average distance exceeds the global mean by more than 3 times the standard deviation:
1. For each point $p_i = (x_i, y_i, z_i)$, search for all neighborhood points $p_j = (x_j, y_j, z_j)$ within its radius $r$, denoted as the set $N_r(p_i)$;
2. Calculate the average Euclidean distance $d_i$ from $p_i$ to its neighboring points: $d_i = \frac{1}{|N_r(p_i)|} \sum_{p_j \in N_r(p_i)} \lVert p_i - p_j \rVert$;
3. Calculate the global mean $\mu_d$ and standard deviation $\sigma_d$ of all $d_i$;
4. If $d_i > \mu_d + k\sigma_d$, determine $p_i$ to be an outlier and remove it; $k = 3$.
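The statistical filtering of step 21 can be sketched as below. This is an illustrative NumPy version under stated assumptions: the function name `remove_outliers` is hypothetical, the pairwise distance matrix is only practical for small clouds, and points with no neighbor inside the radius are treated as outliers, a case the text does not address.

```python
import numpy as np

def remove_outliers(points, radius=0.5, k=3.0):
    """Keep points whose mean neighbor distance d_i is <= mu_d + k * sigma_d."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (O(n^2) memory; fine for a small sketch).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # Neighborhood N_r(p_i): points within `radius`, excluding p_i itself.
    neighbor = (dist <= radius) & ~np.eye(len(points), dtype=bool)
    counts = neighbor.sum(axis=1)
    valid = counts > 0  # assumption: isolated points are dropped as outliers
    if not valid.any():
        return points[:0]
    # d_i: mean distance from p_i to its neighbors.
    d = (dist * neighbor).sum(axis=1) / np.maximum(counts, 1)
    # Global mean and standard deviation over all defined d_i.
    mu, sigma = d[valid].mean(), d[valid].std()
    keep = valid & (d <= mu + k * sigma)
    return points[keep]
```

With `radius=0.5` (in mm, as in this example) and `k=3`, a stray point far from the rail surface is removed while densely sampled surface points are kept.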
[0022] Step 22: Point cloud data downsampling: Use voxel grid filtering with the voxel size set to 1mm³, which greatly reduces the number of points and improves processing speed while preserving key morphological features:
1. Define the voxel size $s$;
2. Partition the point cloud spatially, and represent all points falling into the same voxel by their centroid $p_{voxel}$: $p_{voxel} = \frac{1}{n} \sum_{k=1}^{n} p_k$, where $n$ is the number of points within the voxel;
3. Output the downsampled point cloud.
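A minimal sketch of the voxel grid filtering in step 22 follows. The function name `voxel_downsample` is hypothetical, and `s` is taken to be the edge length of a cubic voxel in the same unit as the coordinates, which is an assumption about how the voxel size is specified.

```python
import numpy as np

def voxel_downsample(points, s=1.0):
    """Replace all points inside each cubic voxel of edge `s` by their centroid."""
    points = np.asarray(points, dtype=float)
    # Integer voxel index of every point along each axis.
    idx = np.floor(points / s).astype(np.int64)
    # Group points that share the same voxel index.
    _, inverse, counts = np.unique(idx, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.reshape(-1)
    # Sum the coordinates per voxel, then divide by n to obtain the centroid.
    sums = np.zeros((len(counts), points.shape[1]))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]
```

Taking the centroid rather than the voxel center keeps the output on the measured surface, which matters when the defects of interest are sub-millimeter.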
[0023] Step 23: Point cloud data normal vector estimation: Based on the PCA algorithm over the K nearest neighbors (K=10), calculate the normal vector of each point for subsequent defect region segmentation:
1. For each point $p_i$, search for its K nearest neighbors (K=10 in this example);
2. Calculate the covariance matrix $C$: $C = \frac{1}{K} \sum_{j=1}^{K} (p_j - \bar{p})(p_j - \bar{p})^T$, where $\bar{p}$ is the centroid of the neighborhood points;
3. Perform eigenvalue decomposition on $C$ and take the eigenvector corresponding to the smallest eigenvalue as the normal vector;
4. Unify the direction of the normal vectors.
Step 24: Image distortion correction: Obtain the camera intrinsic parameters and distortion coefficients using the Zhang Zhengyou calibration method, and use OpenCV's undistort function to correct radial and tangential distortion. Step 24, image distortion correction, includes the following steps:
Step 241: Obtain the camera intrinsic parameter matrix K and the distortion coefficients using Zhang Zhengyou's calibration method;
Step 242: Remap the pixel coordinates of the original image I(u,v) and substitute them into the radial and tangential distortion model;
Step 243: Recover the distortion-free image by inverse mapping.
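The distortion model itself is not reproduced in the text above. As an illustration only, the sketch below assumes the common radial and tangential model with coefficients (k1, k2, p1, p2, k3), which is the convention used by OpenCV; the function names and the nearest-neighbor sampling are choices made for this example, and a real pipeline would call OpenCV's undistort function as stated in step 24.

```python
import numpy as np

def distorted_coords(u, v, K, dist):
    """Map undistorted pixel coordinates to their location in the distorted image."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    k1, k2, p1, p2, k3 = dist
    # Normalized image coordinates.
    x, y = (u - cx) / fx, (v - cy) / fy
    r2 = x * x + y * y
    # Radial term followed by the two tangential terms.
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return fx * xd + cx, fy * yd + cy

def undistort_nearest(image, K, dist):
    """Inverse mapping: fill each output pixel from its distorted source pixel."""
    h, w = image.shape[:2]
    v, u = np.mgrid[0:h, 0:w].astype(float)
    ud, vd = distorted_coords(u, v, K, dist)
    # Nearest-neighbor sampling, clamped to the image border.
    ui = np.clip(np.rint(ud).astype(int), 0, w - 1)
    vi = np.clip(np.rint(vd).astype(int), 0, h - 1)
    return image[vi, ui]
```

Iterating over output pixels and looking up their source location (rather than pushing source pixels forward) is what avoids holes in the corrected image.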
[0024] Step 25: Image Illumination Equalization: The CLAHE algorithm is used to perform histogram equalization on image blocks to improve the visibility of details in shadow areas; Step 25, image illumination equalization, includes the following steps: Step 251: Divide the image into M×N blocks (8×8 pixels in this example); Step 252: Calculate the histogram for each block and perform contrast equalization. Step 253: Use bilinear interpolation to eliminate block artifacts.
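To make step 252 concrete, the sketch below equalizes a single 8-bit block with a clipped histogram. It is a simplified illustration, not a full CLAHE implementation: the bilinear blending between neighboring blocks (step 253) is omitted, and the function name `equalize_tile`, the `clip_limit` value of 2.0, and the floor of 1 on the clip height are assumptions.

```python
import numpy as np

def equalize_tile(tile, clip_limit=2.0):
    """Contrast-limited histogram equalization of one uint8 image block."""
    hist = np.bincount(tile.ravel(), minlength=256).astype(float)
    # Clip each bin at clip_limit times the mean bin height (at least 1),
    # then spread the clipped excess evenly over all 256 bins.
    limit = max(clip_limit * tile.size / 256, 1.0)
    excess = np.clip(hist - limit, 0, None).sum()
    hist = np.minimum(hist, limit) + excess / 256
    # The normalized cumulative histogram becomes the gray-level lookup table.
    cdf = np.cumsum(hist) / tile.size
    lut = np.rint(255 * cdf).astype(np.uint8)
    return lut[tile]
```

Clipping the histogram before building the lookup table is what keeps noise in nearly uniform regions, such as a clean rail head, from being amplified as much as plain histogram equalization would.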
[0025] Step 26: Image ROI Extraction: Based on the results of track edge detection (Canny operator + Hough transform), key areas such as the track head, track waist, and track bottom are cropped to reduce irrelevant background interference and reduce the amount of data processed by 70%. Step 26, image ROI extraction, includes the following steps: Step 261: Perform Canny edge detection on the corrected image to obtain a binary edge map; Step 262: Use Hough transform to detect straight segments and identify the upper and lower edges of the rail head and the side lines of the rail web; Step 263: Calculate the vertices of the ROI polygon based on geometric relationships; Step 264: Cut out the ROI and scale it to a uniform size for network input.
[0026] Step 3: Defect feature extraction: Input the registered multi-source data into the YOLOv5-DeepLabV3+ hybrid neural network to extract the geometric features (length, width, depth, area) and texture features (grayscale distribution, edge gradient) of the defect. Step 3, defect feature extraction, includes:
Step 31: Dataset construction: Collect track data and label 5 types of defects (transverse cracks, longitudinal cracks, corrugation, peeling, and thick edges). A total of 12,000 images and 5 million point cloud points were labeled and divided into training, validation, and test sets in a 7:2:1 ratio.
Step 32: Build the YOLOv5-DeepLabV3+ hybrid neural network: An SE attention module is introduced into YOLOv5's CSPDarknet53 backbone to enhance the focus on minute defect features. The channel attention weights are calculated as $s = \sigma(W_2\,\delta(W_1 z))$, where $z_c$ is the global average pooling result for the c-th channel, $W_1$ and $W_2$ are the parameters of the fully connected layers, $\delta$ is the ReLU activation function, and $\sigma$ is the Sigmoid activation function. YOLOv5's FPN+PAN feature pyramid is combined with DeepLabV3+'s ASPP (atrous spatial pyramid pooling) module to fuse multi-scale features at three scales (1 / 4, 1 / 8, and 1 / 16), with dilation rates of 6, 12, and 18, respectively.
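The SE channel attention described in step 32 can be illustrated with a small NumPy forward pass. This is a sketch of the standard squeeze-and-excitation computation, assuming a ReLU between the two fully connected layers and omitting bias terms; the function name `se_attention` and the array shapes are chosen for the example and are not taken from the patent.

```python
import numpy as np

def se_attention(x, W1, W2):
    """Apply SE attention to a feature map x of shape (C, H, W).

    W1 has shape (C // r, C) and W2 has shape (C, C // r) for a reduction ratio r.
    Returns the reweighted feature map and the per-channel weights.
    """
    # Squeeze: global average pooling gives one value z_c per channel.
    z = x.mean(axis=(1, 2))
    # Excitation: FC layer, ReLU (assumed), FC layer, then Sigmoid.
    hidden = np.maximum(W1 @ z, 0.0)
    s = 1.0 / (1.0 + np.exp(-(W2 @ hidden)))
    # Scale every channel of the input by its attention weight.
    return x * s[:, None, None], s
```

Each weight lies between 0 and 1, so the module can only attenuate or preserve a channel; in the network it would sit inside the backbone blocks and its weights would be learned during training.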
[0027] Step 33: Training parameters: Adam optimizer was used, initial learning rate 0.001, batch size=16, number of iterations 100 epochs, loss function CIoU Loss+CrossEntropy Loss, training time on NVIDIA RTX 3090 GPU was 48 hours, model size 28MB.
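For reference, the bounding-box part of the loss named in step 33 can be written out as below. This sketch follows the usual definition of CIoU loss (IoU, normalized center distance, and an aspect-ratio term) for a single pair of boxes given as (center x, center y, width, height); the function name `ciou_loss` and the `eps` guard are assumptions, and the cross-entropy term is not shown.

```python
import math

def ciou_loss(box, gt, eps=1e-9):
    """Return 1 - CIoU for a predicted box and a ground-truth box (cx, cy, w, h)."""
    (cx, cy, w, h), (gx, gy, gw, gh) = box, gt
    # Corner coordinates of both boxes.
    x1, y1, x2, y2 = cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
    X1, Y1, X2, Y2 = gx - gw / 2, gy - gh / 2, gx + gw / 2, gy + gh / 2
    # Intersection over union.
    inter = max(0.0, min(x2, X2) - max(x1, X1)) * max(0.0, min(y2, Y2) - max(y1, Y1))
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)
    # Squared center distance over the squared diagonal of the enclosing box.
    rho2 = (cx - gx) ** 2 + (cy - gy) ** 2
    c2 = (max(x2, X2) - min(x1, X1)) ** 2 + (max(y2, Y2) - min(y1, Y1)) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - iou + rho2 / (c2 + eps) + alpha * v
```

Unlike plain IoU loss, the center-distance term still gives a useful gradient when the predicted box does not overlap the labeled defect at all, which is common for thin cracks.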
[0028] Step 4: Defect type identification and location: Based on the extracted features, the defect type (crack / wear / stripping / fat edge) is determined by a classifier (such as ResNet-50), and its spatial location on the track (mileage, distance from sleeper) is determined by combining the point cloud coordinates. Step 4, defect type identification and location, includes the following steps:
Feature extraction: For the test set samples, the defect bounding boxes (x, y, w, h) and class probabilities (P_crack, P_wear, P_spall, P_fat_edge) output by the network are used as geometric and semantic features, respectively. The variance of the point cloud normal vectors, $\sigma_n^2$, is used as an auxiliary feature: $\sigma_n^2 > 0.01$ indicates a crack, and $\sigma_n^2 < 0.005$ indicates a fat edge.
[0029] Classifier: A pre-trained ResNet-50 network is used, with a 128-dimensional feature vector as input (64-dimensional geometry + 32-dimensional semantics + 32-dimensional normal vector) and the output is the disease type. The accuracy on the test set reaches 96.3%.
[0030] Spatial positioning: By using the calibrated transformation matrix, the centroid coordinates (X,Y,Z) of the defect area in the point cloud are converted into track mileage (S) and distance from sleeper (D), with a positioning error of <5mm.
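The coordinate conversion in the spatial positioning step might look like the sketch below. Everything beyond the matrix multiplication is an assumption made for illustration: the track frame is taken to have its x axis along the rail, the vehicle's current mileage is assumed to be known from odometry, and sleepers are assumed to be uniformly spaced (0.6 m here) starting at mileage 0. The names `locate_defect`, `T_track_from_lidar`, `vehicle_mileage_m`, and `sleeper_spacing_m` are hypothetical.

```python
import numpy as np

def locate_defect(centroid_lidar, T_track_from_lidar, vehicle_mileage_m,
                  sleeper_spacing_m=0.6):
    """Convert a defect centroid (X, Y, Z) in the lidar frame to (S, D) in meters."""
    # Homogeneous transform from the lidar frame into the track frame.
    p = T_track_from_lidar @ np.append(np.asarray(centroid_lidar, dtype=float), 1.0)
    # Assumption: x in the track frame is the offset along the rail from the vehicle.
    S = vehicle_mileage_m + p[0]
    # Assumption: sleepers sit at integer multiples of the spacing, so D is the
    # distance past the previous sleeper.
    D = S % sleeper_spacing_m
    return float(S), float(D)
```

In practice the sleeper positions would come from detection or survey data rather than a fixed spacing, so `D` here only illustrates the form of the output.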
[0031] Step 5: Disease severity assessment: Based on the set standards, combined with disease size thresholds and historical data statistical analysis, the disease is classified into levels (mild / moderate / severe / critical), and a disease report including location, type, level, and development rate is generated; Step 6: Trigger real-time alerts for severe or higher-level defects, push them to the maintenance management system, and recommend repair solutions (such as grinding, welding, or replacement) based on the distribution patterns of the defects.
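Steps 5 and 6 can be illustrated with a small grading routine. The size thresholds, the use of a single size value, and the linear development rate between two inspections are placeholders invented for this sketch, since the text does not give the actual standards; the names `grade`, `build_report`, and `THRESHOLDS_MM` are likewise hypothetical.

```python
LEVELS = ("mild", "moderate", "severe", "critical")
# Hypothetical upper size limits (mm) for mild, moderate, and severe defects.
THRESHOLDS_MM = (1.0, 3.0, 6.0)

def grade(size_mm):
    """Map a defect size to one of the four severity levels."""
    for level, limit in zip(LEVELS, THRESHOLDS_MM):
        if size_mm < limit:
            return level
    return LEVELS[-1]

def build_report(location, defect_type, size_mm, prev_size_mm, days_since_prev):
    """Assemble a defect report with level, development rate, and alert flag."""
    level = grade(size_mm)
    # Development rate from the previous inspection (assumed linear growth).
    rate = (size_mm - prev_size_mm) / days_since_prev
    return {
        "location": location,
        "type": defect_type,
        "level": level,
        "development_rate_mm_per_day": rate,
        # Step 6: defects of severe or higher level trigger a real-time alert.
        "alert": level in ("severe", "critical"),
    }
```

For example, `build_report((1001.3, 0.5), "crack", 4.0, 3.0, 10)` would be graded "severe" under these placeholder thresholds and flagged for an alert.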
[0032] Obviously, the described embodiments are only a part of the embodiments of the present invention, and not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort should fall within the scope of protection of the present invention.
Claims
1. A method for detecting defects on track surfaces, characterized in that it includes the following steps:
Step 1: Multi-source data acquisition: Collect three-dimensional point cloud data and two-dimensional texture images of the track surface using a multi-source data acquisition system mounted on the inspection vehicle, while maintaining the spatial calibration of the sensor coordinate system and the track coordinate system during acquisition;
Step 2: Data preprocessing: Perform denoising, downsampling, and normal vector estimation on the point cloud data; perform distortion correction, illumination equalization, and region of interest extraction on the image; and achieve spatiotemporal registration of the point cloud and the image through feature point matching;
Step 3: Defect feature extraction: Input the registered multi-source data into the YOLOv5-DeepLabV3+ hybrid neural network to extract the geometric and texture features of the defect simultaneously;
Step 4: Defect type identification and localization: Based on the extracted features, determine the defect type with a classifier, and determine its spatial location on the track by combining the point cloud coordinates;
Step 5: Defect severity assessment: Based on the set standards, combined with defect size thresholds and statistical analysis of historical data, classify the defect into levels, and generate a defect report including location, type, level, and development rate;
Step 6: Trigger real-time early warnings for defects of severe or higher level, push them to the maintenance management system, and set maintenance plans based on the distribution patterns of the defects.
2. The method for detecting track surface defects according to claim 1, characterized in that: The multi-source data acquisition system includes a detection platform, a lidar, an industrial camera, and a spatial calibration structure. The lidar is used to scan the rail surface and the side of the rail head. The industrial camera is installed next to the lidar at a 30° angle to the rail surface and is illuminated by a ring LED light source. The spatial calibration structure is a calibration plate installed at the front of the detection vehicle. The transformation matrix between the lidar coordinate system {L} and the camera coordinate system {C} is obtained by hand-eye calibration.
3. The method for detecting track surface defects according to claim 2, characterized in that: The lidar and camera achieve microsecond-level time synchronization via the PTP protocol, ensuring data spatiotemporal consistency.
4. The method for detecting track surface defects according to claim 1, characterized in that the data preprocessing in step 2 includes:
Step 21: Point cloud data denoising: Statistical filtering is used to calculate, for each point, the average distance to the points in its neighborhood, and to remove outliers whose average distance exceeds the global mean by more than 3 times the standard deviation;
Step 22: Point cloud data downsampling: Use voxel grid filtering and set the voxel size;
Step 23: Point cloud data normal vector estimation: Based on the K-nearest neighbor PCA algorithm, calculate the normal vector of each point for subsequent defect region segmentation;
Step 24: Image distortion correction: Obtain the camera intrinsic parameters and distortion coefficients using the Zhang Zhengyou calibration method, and use OpenCV's undistort function to correct radial and tangential distortion;
Step 25: Image illumination equalization: The CLAHE algorithm is used to perform histogram equalization on image blocks to improve the visibility of details in shadow areas;
Step 26: Image ROI extraction: Based on the track edge detection results, crop out the key areas of the rail head, rail web, and rail foot.
5. The method for detecting track surface defects according to claim 4, characterized in that step 21 includes:
1. For each point $p_i = (x_i, y_i, z_i)$, search for all neighborhood points $p_j = (x_j, y_j, z_j)$ within its radius $r$, denoted as the set $N_r(p_i)$;
2. Calculate the average Euclidean distance $d_i$ from $p_i$ to its neighboring points: $d_i = \frac{1}{|N_r(p_i)|} \sum_{p_j \in N_r(p_i)} \lVert p_i - p_j \rVert$;
3. Calculate the global mean $\mu_d$ and standard deviation $\sigma_d$ of all $d_i$;
4. If $d_i > \mu_d + k\sigma_d$, determine $p_i$ to be an outlier and remove it; $k = 3$.
6. The method for detecting track surface defects according to claim 4, characterized in that step 22 includes:
1. Define the voxel size $s$;
2. Partition the point cloud spatially, and represent all points falling into the same voxel by their centroid $p_{voxel}$: $p_{voxel} = \frac{1}{n} \sum_{k=1}^{n} p_k$, where $n$ is the number of points within the voxel;
3. Output the downsampled point cloud.
7. The method for detecting track surface defects according to claim 4, characterized in that step 23 includes:
1. For each point $p_i$, search for its K nearest neighbors (K=10 in this example);
2. Calculate the covariance matrix $C$: $C = \frac{1}{K} \sum_{j=1}^{K} (p_j - \bar{p})(p_j - \bar{p})^T$, where $\bar{p}$ is the centroid of the neighborhood points;
3. Perform eigenvalue decomposition on $C$ and take the eigenvector corresponding to the smallest eigenvalue as the normal vector;
4. Unify the direction of the normal vectors.
8. The method for detecting track surface defects according to claim 4, characterized in that step 24, image distortion correction, includes the following steps:
Step 241: Obtain the camera intrinsic parameter matrix K and the distortion coefficients using Zhang Zhengyou's calibration method;
Step 242: Remap the pixel coordinates of the original image I(u,v) and substitute them into the radial and tangential distortion model;
Step 243: Recover the distortion-free image by inverse mapping.
9. The method for detecting track surface defects according to claim 4, characterized in that step 25, image illumination equalization, includes:
Step 251: Divide the image into M×N blocks (8×8 pixels in this example);
Step 252: Calculate the histogram for each block and perform contrast equalization;
Step 253: Use bilinear interpolation to eliminate block artifacts.
10. The method for detecting track surface defects according to claim 4, characterized in that step 26, image ROI extraction, includes:
Step 261: Perform Canny edge detection on the corrected image to obtain a binary edge map;
Step 262: Use the Hough transform to detect straight segments and identify the upper and lower edges of the rail head and the side lines of the rail web;
Step 263: Calculate the vertices of the ROI polygon based on geometric relationships;
Step 264: Cut out the ROI and scale it to a uniform size for network input.