A low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning

By constructing a hierarchical meta-task library and a dual-loop training model, combined with a high-speed imaging array and an image correction algorithm, the method addresses the accuracy and efficiency problems of rail vehicle outer-surface detection in new scenarios, achieving rapid adaptation under low-sample conditions and high-definition image acquisition.

CN122243930A · Pending · Publication Date: 2026-06-19 · SUZHOU LINGYUEZHIWEI DIGITAL TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SUZHOU LINGYUEZHIWEI DIGITAL TECHNOLOGY CO LTD
Filing Date
2026-03-18
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies for detecting images of the outer surface of rail vehicles suffer from decreased detection accuracy when faced with new scenarios. They also have high sample requirements and are difficult to adapt, resulting in high sample annotation costs and long model adaptation times.

Method used

A low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning is proposed. By constructing a hierarchical meta-task library, utilizing a meta-learning task system and a dual-loop training model, it achieves rapid adaptation of low-sample images and generates high-definition panoramic images by combining a high-speed imaging array and image correction algorithms.

Benefits of technology

It enables efficient acquisition of high-definition images without stopping or slowing down, reducing sample annotation costs and model adaptation time, and improving detection accuracy and adaptation efficiency. It is suitable for long-distance rail vehicle detection without stopping at stations.



Abstract

This invention discloses a low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning, belonging to the field of rail vehicle detection technology. The method includes the following steps: S1, image acquisition of a vehicle moving at constant speed; S2, construction of a meta-learning task system; S3, construction of a meta-learning detection model; S4, scene-specific detection model; S5, defect identification and result output. This invention builds a hierarchical meta-task library based on three core dimensions: vehicle type, defect type, and operating environment. It solves the problems of mismatch between general meta-learning tasks and vehicle detection scenarios, and weak generalization ability. For new vehicle types, new defects, and new environments, only a very small number of labeled samples are needed to quickly generate scene-specific detection models, significantly reducing sample labeling costs and model adaptation time. It achieves defect classification, pixel-level localization, and severity level assessment, solving the pain points of traditional detection models such as large sample requirements and difficulty in adaptation.

Description

Technical Field

[0001] This invention relates to the field of rail vehicle inspection technology, specifically to a low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning.

Background Technology

[0002] Accurate detection of defects on the outer surface of rail vehicles is a key link in ensuring driving safety and extending the service life of vehicles. Image detection technology has become the mainstream technical means for detecting defects on the outer surface of rail vehicles due to its advantages of non-contact and high efficiency. At present, existing image detection technologies for the outer surface of rail vehicles mostly integrate machine vision and deep learning technologies. After acquiring images of the outer surface of the vehicle through trackside imaging equipment, convolutional neural networks, target detection algorithms and other technologies are used to complete the identification and detection of defects.

[0003] Currently, the generic defect representations learned by general-purpose detection models are poorly matched to the actual operation and maintenance conditions of rail vehicles, so detection accuracy drops significantly across different detection scenarios. Adapting to a new scenario is also difficult and requires many samples: for new vehicle models, new defect types, or new operating environments, a large number of samples usually has to be re-collected and labeled for full model training. This incurs high sample labeling costs, consumes substantial model adaptation time, and results in very low adaptation efficiency.

Summary of the Invention

To address the shortcomings of existing technologies, this invention provides a low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning, which solves the problems described in the background technology.

[0004] To achieve the above objectives, the present invention provides the following technical solution: a low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning, comprising the following steps:
S1. Image acquisition of a vehicle moving at constant speed: under normal operating conditions in which the rail vehicle neither stops nor reduces speed, high-definition images of the outer surface are dynamically acquired through a high-speed synchronous imaging array and corrected for distortion.
S2. Constructing a meta-learning task system: based on three core dimensions (vehicle type, defect type, and operating environment), a hierarchical meta-task library for rail vehicle defects is constructed, and each meta-task is divided into a support set for low-sample learning and a query set for effect verification.
S3. Building a meta-learning detection model: using the hierarchical meta-task library constructed in step S2 as training data, basic meta-training is performed so that the model learns the general representation rules of rail vehicle defects, yielding the pre-trained meta-model parameters.
S4. Scene-specific detection model: low-sample defect images from a new scenario are obtained as a new support set. Based on the pre-trained meta-model obtained in step S3, only local parameters are quickly fine-tuned to obtain a scenario-specific detection model.
S5. Defect identification and result output: the high-definition image corrected in step S1 is input into the scene-specific detection model to complete defect classification, pixel-level localization, and severity level assessment, and to generate maintenance work order information.

[0005] Furthermore, in step S1, the acquisition of high-definition images without stopping or slowing down specifically includes the following sub-steps:
S1-1: Deploy multi-view high-speed linear array cameras, a laser velocity sensor, and a synchronization trigger on the trackside. The camera resolution is ≥ 4K and the frame rate is ≥ 1000 frames/second.
S1-2: Dynamically adjust the shooting frequency based on laser velocity measurement data. The adaptation formula [formula not reproduced] relates the camera shooting frequency to the real-time speed of the rail vehicle, the horizontal pixel density of the image, and the physical spacing between adjacent pixels, which is held fixed so that the image is neither stretched nor interrupted during scanning.
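The adaptation formula itself is not available in this text. As a minimal sketch, assuming the usual line-scan relationship in which the line rate equals the vehicle speed multiplied by the pixel density (f = v · ρ, equivalently f = v / d with d = 1/ρ), the shooting frequency could be derived from the laser speed reading as follows. The function name, the pixel density of 600 pixels/m, and the relationship itself are assumptions for illustration, not the patent's confirmed expression; the 15000 frames/second limit and the ±2 km/h band come from the description.

```python
def line_rate_hz(speed_kmh: float, px_per_m: float) -> float:
    """Line rate at which one scan line covers exactly one pixel spacing.

    Assumed relationship (not confirmed by the source): f = v * rho.
    """
    if speed_kmh <= 0 or px_per_m <= 0:
        raise ValueError("speed and pixel density must be positive")
    speed_m_per_s = speed_kmh / 3.6   # km/h -> m/s
    return speed_m_per_s * px_per_m   # lines per second


PX_PER_M = 600.0        # hypothetical pixel density, not from the source
CAMERA_MAX_HZ = 15000   # camera frame rate quoted in the worked example

# Re-evaluate the trigger frequency for each new laser speed reading.
for speed in (58.0, 60.0, 62.0):      # within the +/-2 km/h uniform-speed band
    rate = line_rate_hz(speed, PX_PER_M)
    print(f"{speed:5.1f} km/h -> {rate:8.1f} lines/s, within camera limit: {rate <= CAMERA_MAX_HZ}")
```

Tying the trigger rate to the measured speed is what keeps the along-track pixel spacing constant: a faster vehicle is sampled proportionally more often.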

[0006] Furthermore, step S1 also includes the following sub-step: S1-3: Motion blur compensation and geometric distortion correction, using adaptive Wiener filtering and a perspective transformation [formulas not reproduced].
Motion blur compensation (Wiener filtering): the quantities involved are the corrected frequency-domain image; the motion blur kernel, calculated from the vehicle speed; the frequency-domain representation of the original blurred image; the noise suppression coefficient, with a value from 0.01 to 0.05; and the inverse Fourier transform.
Geometric distortion correction (perspective transformation): the pixel coordinates of the distorted image are mapped to corrected coordinates by a 3×3 perspective transformation matrix, whose parameters are solved using a vehicle body calibration template.
The motion blur kernel is determined by the vehicle speed and the exposure time. Its expression involves the blur length (in pixels) and uses the Dirac function to model the linear blur caused by high-speed motion.
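The Wiener filtering step can be illustrated on a single image row. The sketch below assumes the textbook form of the filter, F_hat = conj(H) / (|H|² + K) · G, with a box-shaped linear motion kernel along the direction of travel. The patent's own formula is not available here, so this form, the 4-pixel blur length, the circular boundary handling, and all function names are assumptions. Only the range of the noise suppression coefficient K (0.01 to 0.05) comes from the text.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short rows)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def box_kernel(n, blur_len):
    """Linear motion blur: blur_len equal taps along the motion axis."""
    return [1.0 / blur_len if i < blur_len else 0.0 for i in range(n)]


def circular_blur(row, kernel):
    """Circular convolution, used here only to synthesize a blurred row."""
    n = len(row)
    return [sum(row[(i - j) % n] * kernel[j] for j in range(n)) for i in range(n)]


def wiener_deblur(blurred, kernel, k=0.01):
    """Assumed textbook Wiener filter: conj(H) / (|H|^2 + K) * G."""
    G, H = dft(blurred), dft(kernel)
    F_hat = [h.conjugate() / (abs(h) ** 2 + k) * g for h, g in zip(H, G)]
    return [v.real for v in dft(F_hat, inverse=True)]


def sq_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


sharp = [0.0] * 16 + [1.0] * 16               # one image row with a step edge
kernel = box_kernel(len(sharp), blur_len=4)   # hypothetical 4-pixel blur
blurred = circular_blur(sharp, kernel)
restored = wiener_deblur(blurred, kernel, k=0.01)
print("blurred error:", sq_error(blurred, sharp))
print("restored error:", sq_error(restored, sharp))
```

A larger K suppresses noise amplification at frequencies where the kernel response is weak, at the cost of a softer restoration, which is presumably why the text bounds it to a narrow range.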

[0007] Furthermore, step S1 also includes the following sub-steps: S1-4: Multi-view image stitching, using SIFT feature matching and RANSAC algorithm, generates a complete and distortion-free high-definition panoramic image of the vehicle's outer surface. Through precise algorithm adaptation, high-definition image acquisition without blurring or distortion under high-speed motion is achieved, providing high-quality data input for subsequent detection.
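The role of RANSAC in the stitching step can be sketched without an image library. The example below is deliberately simplified: it estimates only a 2-D translation between two overlapping views from already-matched keypoints, whereas a real stitching pipeline would normally fit a homography to SIFT matches. The match coordinates, the function name, and the translation-only model are assumptions for illustration. The 1-pixel tolerance mirrors the stitching error bound given in the detailed description.

```python
import random


def ransac_translation(matches, tol=1.0, iters=200, seed=0):
    """Estimate (dx, dy) from point matches while rejecting outliers.

    matches: list of ((x1, y1), (x2, y2)) pairs between two views.
    Simplified model: pure translation, so one match is a minimal sample.
    """
    rng = random.Random(seed)
    best = []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.choice(matches)      # minimal sample
        dx, dy = x2 - x1, y2 - y1                     # candidate model
        inliers = [
            (a, b) for a, b in matches
            if abs(b[0] - a[0] - dx) <= tol and abs(b[1] - a[1] - dy) <= tol
        ]
        if len(inliers) > len(best):                  # keep largest consensus
            best = inliers
    n = len(best)
    # Refit on all inliers of the best consensus set.
    dx = sum(b[0] - a[0] for a, b in best) / n
    dy = sum(b[1] - a[1] for a, b in best) / n
    return (dx, dy), best


# Hypothetical matches: 8 consistent with a (120, 3) shift, 3 mismatches.
good = [((10.0 * i, 5.0 * i), (10.0 * i + 120.0, 5.0 * i + 3.0)) for i in range(8)]
bad = [((0.0, 0.0), (40.0, 90.0)), ((5.0, 5.0), (300.0, 7.0)), ((9.0, 1.0), (11.0, 60.0))]
shift, inliers = ransac_translation(good + bad)
print("estimated shift:", shift, "inliers:", len(inliers))
```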

[0008] Furthermore, step S1-3 also includes illumination normalization using a Gamma correction algorithm [formula not reproduced]. The quantities involved are the image after illumination normalization, the original image pixel values, and the Gamma correction factor, which ranges from 1.2 to 1.8 and is adjusted adaptively according to ambient light.
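A minimal sketch of the illumination normalization follows. It assumes 8-bit pixel values and the form out = 255 · (in / 255)^(1/γ). The exponent 1/γ is chosen because the description states that a larger factor is used for darker images in order to raise brightness; the patent's exact formula is not available here. The brightness thresholds (≤ 80 and 80 to 180) and the factor ranges come from the detailed description. Picking the midpoint of each range, and leaving images with brightness ≥ 180 unchanged, are assumptions.

```python
def pick_gamma(mean_brightness: float) -> float:
    """Choose the Gamma factor from the average image brightness."""
    if mean_brightness <= 80:
        return 1.7   # midpoint of the 1.6~1.8 range (assumed choice)
    if mean_brightness < 180:
        return 1.3   # midpoint of the 1.2~1.4 range (assumed choice)
    return 1.0       # assumption: the source gives no rule for brightness >= 180


def gamma_correct(pixels, gamma: float):
    """Assumed form for 8-bit values: out = 255 * (in / 255) ** (1 / gamma)."""
    return [round(255 * (p / 255) ** (1.0 / gamma)) for p in pixels]


dark_row = [0, 40, 75, 110, 255]                 # hypothetical pixel values
mean = sum(dark_row[1:4]) / 3                    # average of the mid-tones: 75
corrected = gamma_correct(dark_row, pick_gamma(mean))
print(corrected)
```

Under this form, black and white stay fixed while mid-tones are lifted, which matches the stated goal of making weak-texture defects easier to see in dim conditions.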

[0009] Furthermore, in step S2, the configuration of the hierarchical meta-task library covers the layering dimensions and the meta-task structure.
Layering dimensions: the vehicle type dimension includes high-speed rail, subway, urban rail, and conventional-speed trains; the defect type dimension includes scratches, rust, paint peeling, cracks, and vehicle body deformation; the operating environment dimension includes dry, humid, high-temperature, dusty, and low-light night conditions.
Meta-task structure: each meta-task consists of a support set and a query set [notation not reproduced] and adopts an N-way K-shot low-sample configuration, where N is the number of defect categories, K = 5~10, and the query set contains 3~5 times as many samples as the support set.
The specific process of step S2 is as follows:
An initial meta-task set is generated according to the three-dimensional combination rule "vehicle type-defect type-operating environment". Each subclass of one dimension is matched with every subclass of the other two dimensions, so that all actual operation and maintenance scenarios of rail vehicles are covered.
The initial meta-task set is filtered to remove invalid combinations, that is, "vehicle type-defect-environment" combinations that do not occur in actual operation. Only valid meta-tasks that match on-site operation and maintenance are retained, which keeps the meta-task library practical.
For each valid meta-task, the samples are split into a support set for low-sample learning and a query set for effect verification, following the standard N-way K-shot low-sample configuration.
Each meta-task is assigned a hierarchical meta-task weight coefficient, which is the basis for calculating the weighted loss in the subsequent basic meta-training. The weight coefficient is calculated from the share of actual maintenance frequency of the operating condition corresponding to each meta-task: the higher the maintenance frequency, the larger the coefficient. The weight coefficients of all valid meta-tasks sum to 1.
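The construction process described above (full combination, filtering, support/query split, frequency-based weights) can be sketched as follows. The dimension values come from the text. The validity rule (no "dust" environment for subways), the maintenance counts, and all function names are hypothetical placeholders for whatever field data an operator would actually supply.

```python
from itertools import product

VEHICLES = ["high-speed rail", "subway", "urban rail", "conventional train"]
DEFECTS = ["scratch", "rust", "paint peeling", "crack", "body deformation"]
ENVIRONMENTS = ["dry", "humid", "high temperature", "dust", "low light at night"]


def build_library(is_valid, maintenance_counts):
    """Combine all dimensions, drop invalid tasks, normalize weights to sum to 1."""
    tasks = [t for t in product(VEHICLES, DEFECTS, ENVIRONMENTS) if is_valid(t)]
    total = sum(maintenance_counts[t] for t in tasks)
    if total <= 0:
        raise ValueError("maintenance counts must be positive")
    weights = {t: maintenance_counts[t] / total for t in tasks}
    return tasks, weights


def split_support_query(samples, k=5, query_ratio=3):
    """K support samples, then query_ratio * K query samples (3~5x in the text)."""
    need = k * (1 + query_ratio)
    if len(samples) < need:
        raise ValueError(f"need at least {need} samples, got {len(samples)}")
    return samples[:k], samples[k:need]


def is_valid(task):
    """Hypothetical filter: assume subways never run in the 'dust' environment."""
    vehicle, _defect, environment = task
    return not (vehicle == "subway" and environment == "dust")


# Hypothetical maintenance frequencies: uniform, with one frequent condition.
counts = {t: 1 for t in product(VEHICLES, DEFECTS, ENVIRONMENTS)}
counts[("subway", "paint peeling", "dry")] = 6

tasks, weights = build_library(is_valid, counts)
support, query = split_support_query(list(range(40)), k=10, query_ratio=3)
print(len(tasks), "valid meta-tasks;", len(support), "support /", len(query), "query samples")
```

The split in the last lines uses K = 10 with a query set three times as large, which matches the 10/30 configuration quoted in the worked example.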

[0010] Furthermore, in step S3, the basic meta-training adopts a hierarchical meta-task weighted loss function [formula not reproduced], whose quantities are the hierarchical meta-task weight coefficients, the inner-loop learning rate, and the global parameters of the model. The total loss function fuses the classification and localization tasks: it combines the cross-entropy loss and the generalized intersection-over-union (GIoU) loss through two weighting coefficients.
In step S3, the basic meta-training adopts a dual-loop training mode consisting of an inner loop and an outer loop, which is the core step by which the model learns general representation rules. The training process uses the hierarchical meta-task library constructed in S2 as its data foundation. The specific process is as follows:
Inner loop: for the support set of each meta-task, compute the loss and, using the meta-learning rate as the step size, apply a local update to the model's global parameters so that the model quickly adapts to the defect features of that single meta-task.
Outer loop: apply the parameters updated in the inner loop to the query set of the same meta-task and compute the loss of that meta-task; combine the losses with the hierarchical meta-task weight coefficients to obtain the global weighted loss; then, using the base learning rate as the step size, optimize and update the model's original global parameters based on the global weighted loss, so that the model learns the defect representation rules common to all meta-tasks.
Iteration: for all valid meta-tasks in the hierarchical meta-task library, execute the inner and outer loops in sequence to complete one round of global training. Repeat this process until the set number of iterations (1500-2500) is reached, so that the model is fully trained.
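The dual-loop procedure follows the general shape of MAML with per-task weights. As a toy sketch under stated assumptions, the example below replaces the detection network with a single scalar parameter and each meta-task with a one-parameter regression problem (target slope `slope`), so the gradient through the inner update can be written by hand. The task data, the learning rates, the 200 iterations, and all names are illustrative and are not the patent's settings. The only elements taken from the text are the structure (inner update on the support set, weighted query loss, outer update of the original parameters) and the weights summing to 1.

```python
def mean_sq(xs):
    return sum(x * x for x in xs) / len(xs)


def loss(theta, xs, slope):
    """Mean squared error of the model y = theta * x against y = slope * x."""
    return (theta - slope) ** 2 * mean_sq(xs)


def grad(theta, xs, slope):
    """Derivative of loss() with respect to theta."""
    return 2.0 * (theta - slope) * mean_sq(xs)


def adapt(theta, support_x, slope, alpha):
    """Inner loop: one local gradient step on the support set."""
    return theta - alpha * grad(theta, support_x, slope)


def meta_loss(theta, tasks, alpha):
    """Weighted query loss of the adapted parameters over all meta-tasks."""
    return sum(w * loss(adapt(theta, sx, a, alpha), qx, a) for w, a, sx, qx in tasks)


def meta_step(theta, tasks, alpha, beta):
    """Outer loop: update the original parameters from the weighted query loss."""
    meta_grad = 0.0
    for w, slope, support_x, query_x in tasks:
        adapted = adapt(theta, support_x, slope, alpha)
        d_adapted = 1.0 - alpha * 2.0 * mean_sq(support_x)   # d(adapted)/d(theta)
        meta_grad += w * grad(adapted, query_x, slope) * d_adapted
    return theta - beta * meta_grad


# (weight, target slope, support inputs, query inputs); weights sum to 1.
TASKS = [
    (0.5, 1.0, [1.0, 2.0], [1.5, 2.5]),
    (0.3, 2.0, [1.0, 2.0], [1.5, 2.5]),
    (0.2, 4.0, [1.0, 2.0], [1.5, 2.5]),
]
ALPHA, BETA = 0.05, 0.1          # illustrative inner and outer step sizes

theta = 0.0
initial = meta_loss(theta, TASKS, ALPHA)
for _ in range(200):             # the text specifies 1500-2500 for the real model
    theta = meta_step(theta, TASKS, ALPHA, BETA)
final = meta_loss(theta, TASKS, ALPHA)
print(f"theta={theta:.4f}, meta-loss {initial:.4f} -> {final:.4f}")
```

Because the task weights enter the outer gradient directly, the learned initialization is pulled toward the conditions with the highest maintenance frequency, which is the stated purpose of the weight coefficients.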

[0011] Furthermore, in step S4, the new scenarios include new vehicle models, new defects, and new environments.

[0012] Furthermore, in step S4, the rapid local parameter adaptation follows a parameter update formula [formula not reproduced] whose quantities are the fixed backbone network parameters, the initial detection head parameters, and the low-sample support set of the new scenario. The fine-tuned parameters account for 5% to 10% of the total parameters.
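The idea of freezing the backbone and updating only the detection head can be shown on a toy model. In the sketch below the "backbone" is a fixed feature map and the "head" is a two-weight linear layer trained by plain gradient descent on five support samples. The model, data, learning rate, and parameter counts are all hypothetical; only the structure (fixed backbone, head-only updates, a handful of samples and iterations, trainable share of 5% to 10%) reflects the text.

```python
def features(x, backbone):
    """Frozen 'backbone': fixed feature extraction, never updated here."""
    return [backbone[0] * x, backbone[1] * x * x]


def predict(x, backbone, head):
    return sum(h * f for h, f in zip(head, features(x, backbone)))


def support_loss(backbone, head, support):
    return sum((predict(x, backbone, head) - y) ** 2 for x, y in support) / len(support)


def finetune_head(backbone, head, support, lr=0.1, steps=10):
    """Gradient descent on the head weights only; backbone stays fixed."""
    head = list(head)
    for _ in range(steps):
        grads = [0.0] * len(head)
        for x, y in support:
            f = features(x, backbone)
            err = sum(h * fi for h, fi in zip(head, f)) - y
            for i, fi in enumerate(f):
                grads[i] += 2.0 * err * fi / len(support)
        head = [h - lr * g for h, g in zip(head, grads)]
    return head


def trainable_fraction(n_backbone, n_head):
    return n_head / (n_backbone + n_head)


BACKBONE = (1.0, 1.0)                                   # frozen, hypothetical
HEAD_INIT = [0.0, 0.0]                                  # initial head weights
SUPPORT = [(x, 3.0 * x) for x in (0.2, 0.4, 0.6, 0.8, 1.0)]   # 5 labeled samples

before = support_loss(BACKBONE, HEAD_INIT, SUPPORT)
head = finetune_head(BACKBONE, HEAD_INIT, SUPPORT, lr=0.1, steps=10)
after = support_loss(BACKBONE, head, SUPPORT)
share = trainable_fraction(n_backbone=11_200_000, n_head=900_000)   # hypothetical counts
print(f"loss {before:.3f} -> {after:.3f}, trainable share {share:.1%}")
```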

[0013] This invention provides a low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning, which has the following beneficial effects: 1. This low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning builds a hierarchical meta-task library based on three core dimensions: vehicle type, defect type, and operating environment. It solves the problems of mismatch between general meta-learning tasks and vehicle detection scenarios and weak generalization ability. For new vehicle types, new defects, new environments, and other new scenarios, only a very small number of labeled samples are needed to quickly generate scenario-specific detection models, which greatly reduces the cost of sample labeling and model adaptation time. It realizes defect classification, pixel-level localization, and severity level assessment, and solves the pain points of traditional detection models, such as large sample requirements and difficulty in adaptation.

[0014] 2. This low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning enables dynamic acquisition and full-dimensional optimization of high-definition images of the outer surface of rail vehicles under normal operating conditions, without stopping or reducing speed. By dynamically adjusting the shooting frequency of a high-speed linear array camera according to laser velocity measurement data, it removes the need of traditional detection methods to stop or slow the vehicle, which impacts operational efficiency. The method is therefore suitable for long-distance rail vehicle inspection scenarios without station stops. Furthermore, it achieves motion blur compensation, geometric distortion correction, and illumination normalization through adaptive Wiener filtering, perspective transformation, and Gamma correction, respectively. Finally, it uses SIFT feature matching and the RANSAC algorithm to stitch multi-view images into a complete, distortion-free, high-definition panoramic image. This effectively eliminates image distortion caused by high-speed motion, shooting angle, and ambient lighting, and makes weak-texture defects such as micro-scratches and shallow corrosion easier to recognize. The resulting images are high-definition, complete, and easy to recognize, which supports the accuracy of subsequent defect detection at the data source.

Description of the Drawings

[0015] Figure 1 is a flowchart of the steps of the low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning according to the present invention.

Detailed Implementation

[0016] The embodiments of the present invention will be described in further detail below with reference to the accompanying drawings and examples. The following examples are for illustrative purposes only and should not be construed as limiting the scope of the invention.

[0017] As shown in Figure 1, this invention provides a technical solution: a low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning. The core terms are defined as follows: the dual-speed optimized MAML meta-learning detection model is a meta-learning model in which the inner-loop and outer-loop learning rates are optimized separately; the backbone network is the network module responsible for image feature extraction, preferably a lightweight ResNet-series network; the detection head is the network module responsible for defect classification and pixel-level localization output, preferably a YOLO-series detection head; low sample size means 5-10 labeled defect images for a new scenario; a uniform-speed vehicle is a rail vehicle whose real-time speed fluctuates within ±2 km/h while passing through the image acquisition section. The method includes the following steps:

S1: Under normal operating conditions in which the rail vehicle neither stops nor reduces speed, high-definition images of the outer surface are dynamically acquired through a high-speed synchronous imaging array and corrected for distortion. During acquisition, the rail vehicle must meet the uniform-speed criterion, and the imaging equipment must meet the following technical requirements: laser speed sensor accuracy of ±0.1 km/h or better, and synchronization trigger delay ≤ 1 ms. This step acquires high-definition images without interference, blurring, or distortion during normal vehicle operation and without affecting operational efficiency. It removes the need of traditional acquisition to stop or slow the vehicle, making it suitable for long-distance scenarios without station stops.
The high-definition image acquisition without stopping or slowing down described in step S1 specifically includes the following sub-steps and corresponding algorithm models:
S1-1: Deploy multi-view high-speed linear array cameras, a laser velocity sensor, and a synchronization trigger on the trackside. The camera resolution is ≥ 4K and the frame rate is ≥ 1000 frames/second.
S1-2: Dynamically adjust the shooting frequency based on laser velocity measurement data. The adaptation formula [formula not reproduced] relates the camera shooting frequency to the real-time speed of the rail vehicle, the horizontal pixel density of the image, and the physical spacing between adjacent pixels, which is held fixed so that the image is neither stretched nor interrupted during scanning.
S1-3: Motion blur compensation and geometric distortion correction, using adaptive Wiener filtering and a perspective transformation [formulas not reproduced]. For motion blur compensation (Wiener filtering), the quantities involved are the corrected frequency-domain image; the motion blur kernel, calculated from the vehicle speed; the frequency-domain representation of the original blurred image; the noise suppression coefficient, with a value from 0.01 to 0.05; and the inverse Fourier transform. For geometric distortion correction (perspective transformation), the pixel coordinates of the distorted image are mapped to corrected coordinates by a 3×3 perspective transformation matrix, whose parameters are solved using a vehicle body calibration template. The motion blur kernel is determined by the vehicle speed and the exposure time; its expression involves the blur length (in pixels) and uses the Dirac function to model the linear blur caused by high-speed motion. This sub-step also includes illumination normalization using a Gamma correction algorithm [formula not reproduced], whose quantities are the image after illumination normalization, the original image pixel values, and the Gamma correction factor. The factor takes a value between 1.2 and 1.8 and is adjusted adaptively according to ambient light: when the average image brightness is ≤ 80, a value of 1.6~1.8 is used; when 80 < average image brightness < 180, a value of 1.2~1.4 is used. This unifies image brightness under different lighting conditions and makes weak-texture defects (micro-scratches, shallow corrosion) easier to recognize.
S1-4: Multi-view image stitching. SIFT feature matching and the RANSAC algorithm (existing public technologies that are not described in detail here) are used to generate a complete, distortion-free, high-definition panoramic image of the vehicle's outer surface, with a stitching error of ≤ 1 pixel and a panoramic image resolution of ≥ 8192×6144. Through precise algorithm adaptation, high-definition image acquisition without blurring or distortion under high-speed motion is achieved, providing high-quality data input for subsequent detection.

S2: First, identify the specific subcategories for each of the three dimensions: vehicle model, defect type, and operating environment. Then generate meta-tasks according to the three-dimensional combination rule "vehicle model-defect type-operating environment". After eliminating invalid combinations (that is, "vehicle model-defect-environment" combinations that do not occur in actual operation), construct the hierarchical meta-task library for rail vehicle defects, and divide each meta-task into a support set for low-sample learning and a query set for effect verification. This step constructs a meta-learning task system that fits the real operation and maintenance scenarios of rail vehicles, solving the problems of mismatch between general meta-learning tasks and vehicle detection scenarios and of weak generalization ability.
The specific configuration of the hierarchical meta-task library in step S2 is as follows:
1) Layering dimensions: vehicle type (high-speed rail, subway, urban rail, conventional train), defect type (scratches, rust, paint peeling, cracks, body deformation), and operating environment (dry, humid, high temperature, dust, low light at night).
2) Meta-task structure: each meta-task consists of a support set and a query set [notation not reproduced] and adopts an N-way K-shot low-sample configuration (N is the number of defect categories, K = 5~10). The query set contains 3~5 times as many samples as the support set. The hierarchical meta-task weight coefficient is calculated from the share of actual maintenance frequency of the operating condition corresponding to each meta-task: the higher the maintenance frequency, the larger the coefficient, and the weight coefficients of all meta-tasks sum to 1. This fixes the criteria for meta-task division and sample configuration, keeps the meta-training process stable and controllable, and ensures that the general representation closely matches actual working conditions.

S3: Build a dual-speed optimized MAML meta-learning detection model. Using the hierarchical meta-task library constructed in step S2 as training data, perform basic meta-training so that the model learns the general representation rules of rail vehicle defects, and obtain the pre-trained meta-model parameters. This step gives the model the core ability of "learning to learn" and lays a strong generalization foundation for rapid low-sample adaptation to new scenarios.
The basic meta-training described in step S3 uses a hierarchical meta-task weighted loss function [formula not reproduced], whose quantities are the hierarchical meta-task weight coefficients (based on operation and maintenance frequency statistics); the inner-loop learning rate, also called the meta-learning rate; and the global parameters of the model. The total loss function fuses the classification and localization tasks and measures the difference between the model's predictions and the ground truth: the smaller its value, the better the model's prediction performance. Its two weighting coefficients take values of 0.6~0.8 and 0.2~0.4 and adjust the relative importance of the classification loss and the localization loss in the total loss. The cross-entropy loss measures the difference between the predicted class and the true class in the classification task and reflects classification accuracy. The generalized intersection-over-union (GIoU) loss evaluates the degree of overlap and the relative position between the predicted bounding box and the true bounding box in the object detection task, in order to optimize localization accuracy.
Basic meta-training requires initialization: the global parameters of the dual-speed optimized MAML model are initialized with the Xavier random initialization method, and the base learning rate, the meta-learning rate, and hyperparameters such as the number of inner-loop and outer-loop training iterations are determined. The general parameters for basic meta-training are set as follows: meta-learning rate [value not reproduced]; number of iterations 1500-2500; batch size 8-16.

S4: Obtain 5-10 low-sample defect images from a new scenario (new vehicle model, new defect, or new environment) as a new support set. Based on the pre-trained meta-model obtained in step S3, quickly fine-tune only local parameters to obtain a scenario-specific detection model. Pseudo-labels can be generated for unlabeled images of the new scenario using the pre-trained meta-model, and valid pseudo-label samples with a confidence ≥ 0.9 are selected to enlarge the new support set. This step adapts to the new scenario with only a very small number of new samples and without retraining the entire model, which greatly reduces sample labeling cost and adaptation time.
The rapid local parameter adaptation in step S4 follows a parameter update formula [formula not reproduced] whose quantities are the fixed backbone network parameters, the initial detection head parameters, and the low-sample support set of the new scenario. The fine-tuned parameters account for 5% to 10% of the total parameters.
Pseudo-label generation and expansion process: the pre-trained meta-model performs inference on unlabeled images of the new scenario; pseudo-labels with a confidence ≥ 0.9 are kept as valid; the pseudo-label annotation format is the same as that of manual annotation. After expansion, the ratio of manually annotated samples to pseudo-label samples in the new support set is 1:1 to 1:3, and hybrid training is then performed.

S5: Input the high-definition image corrected in step S1 into the scene-specific detection model to complete defect classification, pixel-level localization, and severity level assessment, and generate maintenance work order information. This step achieves high-precision detection based on high-definition images and converts the detection results into structured information that can be used directly for maintenance.
The criteria for assessing the severity level of defects are as follows. Three levels are defined based on the pixel area of the defect and the vehicle body region in which it lies. Critical regions of the vehicle body include connection parts, load-bearing parts, and sealing parts; non-critical regions are the ordinary panel areas of the vehicle body.
① Minor defects: defect pixel area < 500 pixels, located in a non-critical region.
② Moderate defects: 500 ≤ defect pixel area < 2000 pixels, or defect pixel area < 500 pixels and located in a critical region.
③ Severe defects: defect pixel area ≥ 2000 pixels, or 500 ≤ defect pixel area < 2000 pixels and located in a critical region.
The output standard for pixel-level localization is as follows: the high-definition panoramic image stitched in step S1-4 is used as the coordinate reference system, the origin is the upper left corner of the panoramic image, the coordinate unit is pixels, and the defect bounding box is output as its diagonal pixel coordinates.
The maintenance work order information is standardized and structured, and includes at least nine core items: inspection time, vehicle number, vehicle model, operating environment, defect category, pixel-level location coordinates, defect severity level, vehicle panel location, and maintenance suggestions.
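The three-level severity rule can be written directly as a small function. The thresholds (500 and 2000 pixels) and the critical/non-critical distinction are taken from the criteria above. Two points follow from reading those criteria literally and are interpretations rather than explicit statements in the text: a 500 to 2000 pixel defect counts as moderate only in a non-critical region, and any defect of 2000 pixels or more is severe regardless of region. The function and field names are illustrative.

```python
def severity(area_px: int, in_critical_region: bool) -> str:
    """Grade a defect from its pixel area and the body region it lies in."""
    if area_px < 0:
        raise ValueError("area must be non-negative")
    if area_px >= 2000:
        return "severe"
    if area_px >= 500:
        return "severe" if in_critical_region else "moderate"
    return "moderate" if in_critical_region else "minor"


# The nine core work-order items listed in the text (key names are illustrative).
WORK_ORDER_FIELDS = (
    "inspection_time", "vehicle_number", "vehicle_model", "operating_environment",
    "defect_category", "pixel_location", "severity_level", "panel_location",
    "maintenance_suggestion",
)

# A 1200-pixel defect in a non-critical region, as in the worked example.
print(severity(1200, in_critical_region=False))
```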

[0018] Example: detection of paint peeling defects on the outer surface of a subway train traveling at 60 km/h without stopping. Four high-speed linear array cameras (4096×3072 resolution, 15000 frames/second), a laser velocity sensor (measurement range 0 to 120 km/h, accuracy ±0.1 km/h), high-frequency supplementary lighting (brightness ≥5000 lux), and edge computing equipment are deployed along the trackside. The hierarchical meta-task library contains 60 meta-tasks with a total sample size of 1800 images, a support set size of K=10, and a query set of 30 images. The implementation steps are as follows:
Step S1: High-definition capture without stopping or slowing down.
S1-1: The trackside camera array is installed on both sides of the subway track, with each camera's optical axis at an angle of 30° to the surface of the car body and at a distance of 2.5 m from the center line of the track.
S1-2: The subway train passes through the detection area at 60 km/h (16.67 m/s), and the laser sensor collects the vehicle speed in real time. The horizontal pixel density of the image (in pixels/m) and the physical spacing between adjacent pixels are substituted into the formula to calculate the shooting frequency (in frames per second), and the synchronization trigger activates the cameras to capture images.
S1-3: Motion blur compensation: the blur kernel is determined from the vehicle speed and the exposure time; the blur length (in pixels) is calculated and substituted, together with the noise suppression coefficient, into the Wiener filter formula to complete deblurring. Geometric distortion correction: four calibration vertices on the vehicle body are selected and the perspective transformation matrix is solved to complete distortion correction. Illumination normalization: the average image brightness is 75, and the Gamma correction factor is taken as ... and substituted into the Gamma correction formula to enhance brightness.
S1-4: SIFT feature matching and the RANSAC algorithm are used to stitch the images from the 4 cameras into a high-definition panoramic image of the subway's outer surface with a resolution of 8192×6144. The image is free of blur and distortion, and the texture of paint peeling defects is clearly visible.
Step S2: Application of the hierarchical meta-task library. The meta-task (vehicle type: subway; defect type: paint peeling; environment: daytime, dry) is selected. Its support set consists of 10 images with paint peeling annotations, its query set consists of 30 verification images, and it is assigned a weighting coefficient.
Step S3: Basic meta-training. Model structure: lightweight ResNet-18, a defect attention module, and a YOLOv5s detection head. Training parameters: a meta-learning rate and an inner-loop learning rate are set, with 2000 iterations and a batch size of 16; weights are assigned to the terms of the fusion loss function. Training results: an mAP of 92.3% is achieved on the meta-task library, and the pre-trained parameters are saved.
Step S4: Rapid low-sample adaptation. New scenario requirement: adapt to the paint peeling defect on the new train model of Metro Line 16, with only 5 labeled samples provided. Adaptation operation: the backbone network parameters are fixed and only the detection head parameters are fine-tuned at the fine-tuning learning rate, for 10 iterations taking 8 minutes. Pseudo-label enhancement: pseudo-labels are generated for 20 unlabeled paint peeling images, and the 12 samples with a confidence level ≥ 0.9 are selected to expand the support set, giving a final support set of 17 images with a ratio of manually labeled to pseudo-labeled samples of 1:2.4. Validation after adaptation: classification accuracy 94.5%, localization IoU 86.2%, and grade assessment accuracy 91.3%, passing performance validation.
Step S5: Defect detection and output. The high-definition panoramic image generated in S1 is input into the scene-specific model, with a single-frame inference time of 16 ms. The output includes: defect category (paint peeling), location box (3562, 1892, 3712, 2048), severity level (medium; judgment criteria: defect pixel area of 1200 pixels, located in a non-critical area), and panel coordinates (panel 2 on the left side of Metro Line 16). A standardized maintenance work order containing nine core items, such as detection time, metro number, and vehicle type, is automatically generated.
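The pseudo-label enhancement in Step S4 can be illustrated with a minimal sketch. The `Sample` type, the function name, and the individual confidence scores are hypothetical and chosen only so that the counts match the example (5 manual labels, 20 candidates, 12 kept at confidence ≥ 0.9); the source does not state how the confidences are produced.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    image_id: str
    label: str
    confidence: float  # 1.0 for manually labeled samples
    pseudo: bool       # True if the label came from the model


def expand_support_set(labeled, candidates, threshold=0.9):
    """Append pseudo-labeled candidates whose confidence meets the threshold."""
    kept = [c for c in candidates if c.confidence >= threshold]
    return labeled + kept


# 5 manually labeled samples, as in the example.
labeled = [Sample(f"manual_{i}", "paint_peeling", 1.0, False) for i in range(5)]

# Hypothetical confidences for 20 unlabeled images: 12 at or above 0.9, 8 below.
scores = [0.97, 0.95, 0.93, 0.92, 0.91, 0.90, 0.96, 0.94, 0.99, 0.98, 0.90, 0.92,
          0.85, 0.70, 0.60, 0.89, 0.55, 0.80, 0.75, 0.65]
candidates = [Sample(f"unlabeled_{i}", "paint_peeling", s, True)
              for i, s in enumerate(scores)]

support = expand_support_set(labeled, candidates)
n_pseudo = sum(s.pseudo for s in support)
ratio = n_pseudo / len(labeled)  # pseudo-labeled samples per manual sample
print(len(support), n_pseudo, ratio)
```

With these inputs the support set grows from 5 to 17 images, and the manual to pseudo-label ratio is 1:2.4, consistent with the figures in the example.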

[0019] This invention builds a hierarchical meta-task library based on three core dimensions: vehicle model, defect type, and operating environment. It solves the problems of mismatch between general meta-learning tasks and vehicle detection scenarios, as well as weak generalization ability. For new vehicle models, new defects, new environments, and other new scenarios, only a very small number of labeled samples are needed to quickly generate scenario-specific detection models, which greatly reduces the cost of sample labeling and model adaptation time. It enables defect classification, pixel-level localization, and severity assessment, solving the pain points of traditional detection models, such as large sample requirements and difficulty in adaptation.

[0020] This invention enables dynamic acquisition and full-dimensional optimization of high-definition images of the outer surface of rail vehicles under normal operating conditions, without stopping or reducing speed. High-speed linear array cameras are combined with laser velocity measurement data to dynamically adjust the shooting frequency, which removes the need in traditional inspection to stop or slow the vehicle at the expense of operational efficiency. The method is therefore suitable for inspecting long-distance rail vehicles that do not stop at stations. Furthermore, adaptive Wiener filtering, perspective transformation, and Gamma correction are used for motion blur compensation, geometric distortion correction, and illumination normalization, respectively. SIFT feature matching and the RANSAC algorithm are then used to stitch the multi-view images into a complete, distortion-free high-definition panoramic image. This effectively eliminates image distortion caused by high-speed motion, shooting angle, and ambient lighting, and makes weak-texture defects such as micro-scratches and shallow corrosion easier to recognize. Subsequent defect detection thus receives complete, high-definition, highly recognizable image data, which safeguards detection accuracy at the data source.

[0021] The embodiments of the present invention are given for illustrative and descriptive purposes only, and are not intended to be exhaustive or to limit the invention to the forms disclosed. Many modifications and variations will be apparent to those skilled in the art. The embodiments were chosen and described in order to better illustrate the principles and practical application of the invention, and to enable those skilled in the art to understand the invention and to design various embodiments with various modifications suitable for a particular purpose.

Claims

1. A low-sample intelligent image detection method for the outer surface of rail vehicles based on meta-learning, characterized in that it includes the following steps:
S1. Image acquisition of a vehicle moving at a constant speed: under normal operating conditions in which the rail vehicle does not stop or reduce speed, high-definition images of the outer surface are dynamically acquired through a high-speed synchronous imaging array and distortion correction is performed.
S2. Constructing a meta-learning task system: based on three core dimensions (vehicle type, defect type, and operating environment), a hierarchical meta-task library for rail vehicle defects is constructed, and each meta-task is divided into a support set for low-sample learning and a query set for effect verification.
S3. Building a meta-learning detection model: using the hierarchical meta-task library constructed in step S2 as training data, basic meta-training is performed so that the model learns the general representation rules of rail vehicle defects, and the pre-trained meta-model parameters are obtained.
S4. Scene-specific detection model: low-sample defect images from a new scenario are obtained as a new support set; based on the pre-trained meta-model obtained in step S3, only local parameters are quickly fine-tuned and adapted to obtain a scenario-specific detection model.
S5. Defect identification and result output: the high-definition image corrected in step S1 is input into the scene-specific detection model to complete the classification, pixel-level localization, and severity level assessment of defects, and maintenance work order information is generated.

2. The method for intelligent image detection of the outer surface of a rail vehicle based on meta-learning according to claim 1, characterized in that: in step S1, the acquisition of high-definition images without stopping or slowing down specifically includes the following sub-steps:
S1-1: Deploy multi-view high-speed linear array cameras, a laser velocity sensor, and a synchronization trigger on the trackside; the camera resolution is ≥4K and the frame rate is ≥1000 frames/second.
S1-2: Dynamically adjust the shooting frequency based on laser velocity measurement data; the adaptation formula relates the camera shooting frequency to the real-time speed of the rail vehicle, the horizontal pixel density of the image, and the physical spacing between adjacent pixels, and the frequency is set so that the image is not stretched or interrupted during scanning.
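The formula of sub-step S1-2 did not survive in this text. As a hedged illustration only, a common line-scan relation sets the line rate to the vehicle speed divided by the physical spacing between adjacent pixels, which equals the speed multiplied by the pixel density. The sketch below assumes that relation; the function names and the pixel density of 800 pixels/m are hypothetical and are not taken from the source.

```python
def kmh_to_ms(speed_kmh):
    """Convert km/h to m/s."""
    return speed_kmh / 3.6


def line_rate_hz(speed_ms, pixels_per_metre):
    """Assumed relation: f = v * rho = v / d, with d = 1 / rho.

    One line is captured each time the vehicle advances by one pixel
    spacing d, so the image is neither stretched nor interrupted.
    """
    if pixels_per_metre <= 0:
        raise ValueError("pixel density must be positive")
    return speed_ms * pixels_per_metre


speed = kmh_to_ms(60.0)      # 60 km/h, as in the example embodiment
rho = 800.0                  # hypothetical horizontal pixel density, pixels/m
spacing_mm = 1000.0 / rho    # physical spacing between adjacent pixels, mm
rate = line_rate_hz(speed, rho)
print(f"{speed:.2f} m/s, spacing {spacing_mm:.2f} mm, rate {rate:.1f} lines/s")
```

Because the rate scales linearly with speed, the trigger frequency can be recomputed from each new laser velocity reading.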

3. The method for intelligent image detection of the outer surface of a rail vehicle based on meta-learning according to claim 1, characterized in that: step S1 further includes the following sub-step:
S1-3: Motion blur compensation and geometric distortion correction, using adaptive Wiener filtering and perspective transformation algorithms.
Motion blur compensation (Wiener filtering): the formula is expressed in terms of the corrected frequency-domain image, the motion blur kernel (calculated from the vehicle speed), the frequency-domain representation of the original blurred image, the noise suppression coefficient (value range 0.01 to 0.05), and the inverse Fourier transform.
Geometric distortion correction (perspective transformation): the formula maps the pixel coordinates of the distorted image to the corrected coordinates through a 3×3 perspective transformation matrix, whose parameters are solved using a vehicle body calibration template.
The motion blur kernel is determined by the vehicle speed and the exposure time; its expression is given in terms of the blur length (in pixels) and uses the Dirac function to accurately model the linear blur caused by high-speed motion.
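The formulas referred to in this claim are not reproduced in this text. For orientation only, the textbook forms of the three relations are given below. All symbols are assumptions rather than the patent's own notation: G is the spectrum of the blurred image, H the spectrum of the blur kernel h, K the noise suppression coefficient, M the 3×3 perspective matrix, v_veh the vehicle speed, T the exposure time, and ρ the horizontal pixel density.

```latex
% Wiener deconvolution in the frequency domain, then inverse Fourier transform.
\hat{F}(u,v) = \frac{H^{*}(u,v)}{\lvert H(u,v)\rvert^{2} + K}\,G(u,v),
\qquad
\hat{f}(x,y) = \mathcal{F}^{-1}\!\left[\hat{F}(u,v)\right]

% Horizontal linear motion blur of length L pixels, written with Dirac deltas.
h(x,y) = \frac{1}{L}\sum_{i=0}^{L-1}\delta(x-i)\,\delta(y),
\qquad
L = v_{\mathrm{veh}}\,T\,\rho

% Perspective transformation in homogeneous coordinates.
\begin{pmatrix} x' \\ y' \\ w \end{pmatrix}
= M \begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad
(x_c,\,y_c) = \left(\frac{x'}{w},\,\frac{y'}{w}\right)
```

Since M is defined only up to scale, it has eight degrees of freedom, so four point correspondences are sufficient to solve for it, which is consistent with the four calibration vertices used in the example embodiment.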

4. The method for intelligent image detection of the outer surface of a rail vehicle based on meta-learning according to claim 3, characterized in that: Step S1 further includes the following sub-steps: S1-4: Multi-view image stitching, using SIFT feature matching and RANSAC algorithm, generates a complete and distortion-free high-definition panoramic image of the vehicle's outer surface. Through precise algorithm adaptation, high-definition image acquisition without blurring or distortion under high-speed motion is achieved, providing high-quality data input for subsequent detection.

5. The method for intelligent image detection of the outer surface of a rail vehicle based on meta-learning according to claim 3, characterized in that: sub-step S1-3 also includes illumination normalization using a Gamma correction algorithm, whose formula is expressed in terms of the image after illumination normalization, the original image pixel values, and the Gamma correction factor, which ranges from 1.2 to 1.8 and is adaptively adjusted according to ambient light.
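The Gamma correction formula itself is missing from this text. A minimal sketch follows, assuming the common form out = 255 · (in / 255)^(1/γ) for 8-bit pixels, under which a factor in the claimed range of 1.2 to 1.8 brightens a dark image. The function names and the sample pixel values are hypothetical.

```python
def gamma_correct(pixels, gamma):
    """Apply out = 255 * (in / 255) ** (1 / gamma) to 8-bit pixel values.

    Assumed form: with gamma > 1 the exponent is below 1, so mid-tones
    are lifted while 0 and 255 stay fixed.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Precompute a 256-entry lookup table, as is usual for 8-bit images.
    lut = [round(255 * (i / 255) ** (1 / gamma)) for i in range(256)]
    return [lut[p] for p in pixels]


def mean(values):
    return sum(values) / len(values)


# Hypothetical dark patch whose mean brightness is 75, as in the example.
dark = [40, 60, 75, 90, 110]
bright = gamma_correct(dark, gamma=1.5)
print(mean(dark), mean(bright))
```

An adaptive scheme could choose γ from the measured mean brightness; the source states only that the factor is adjusted according to ambient light, not how.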

6. The method for intelligent image detection of the outer surface of a rail vehicle based on meta-learning according to claim 1, characterized in that: in step S2, the hierarchical meta-task library is configured in terms of its layered dimensions and its meta-task structure.
Layered dimensions: the vehicle type dimension includes high-speed rail, subway, urban rail, and regular-speed trains; the defect type dimension includes scratches, rust, paint peeling, cracks, and vehicle body deformation; the operating environment dimension includes dry, humid, high-temperature, dusty, and low-light night conditions.
Meta-task structure: each meta-task comprises a support set and a query set and adopts an N-way K-shot low-sample configuration, where N is the number of defect categories, K = 5 to 10, and the number of query set samples is 3 to 5 times that of the support set.
The specific process of step S2 is as follows:
An initial meta-task set is generated according to the three-dimensional combination rule "vehicle type - defect type - operating environment": each subclass of one dimension is combined with every subclass of the other two dimensions, so that all actual operation and maintenance scenarios of rail vehicles are covered.
The initial meta-task set is filtered to remove invalid combinations, that is, "vehicle type - defect - environment" combinations that do not occur in actual operation. Only valid meta-tasks that match on-site operation and maintenance are retained, which ensures the practicality of the meta-task library.
For each valid meta-task, the samples are split into a support set for low-sample learning and a query set for effect verification, using the standard N-way K-shot low-sample configuration.
Each meta-task is assigned a hierarchical meta-task weight coefficient, which is the basis for calculating the weighted loss in the subsequent basic meta-training. The weight coefficient is calculated from the proportion of actual maintenance frequency of the operating condition corresponding to each meta-task: the higher the maintenance frequency, the greater the weight coefficient, and the weight coefficients of all valid meta-tasks sum to 1.
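The construction process of step S2 can be sketched as follows. The dimension values come from the claim; the validity rule, the maintenance-frequency function, and all function names are hypothetical placeholders, since the source does not specify which combinations are invalid or what the real frequencies are.

```python
from itertools import product

VEHICLES = ["high-speed rail", "subway", "urban rail", "regular-speed train"]
DEFECTS = ["scratch", "rust", "paint peeling", "crack", "body deformation"]
ENVIRONMENTS = ["dry", "humid", "high temperature", "dust", "night low light"]


def build_library(is_valid, maintenance_freq):
    """Combine all dimensions, drop invalid tasks, normalise weights to sum 1."""
    tasks = [t for t in product(VEHICLES, DEFECTS, ENVIRONMENTS) if is_valid(t)]
    total = sum(maintenance_freq(t) for t in tasks)
    return {t: maintenance_freq(t) / total for t in tasks}


def split_task(samples, k, query_ratio=3):
    """Split one meta-task into a K-shot support set and a query set."""
    if len(samples) < k * (1 + query_ratio):
        raise ValueError("not enough samples for the requested split")
    return samples[:k], samples[k:k + k * query_ratio]


# Purely illustrative validity rule: pretend subway trains never run in dust.
def is_valid(task):
    vehicle, _defect, environment = task
    return not (vehicle == "subway" and environment == "dust")


# Purely illustrative frequencies: paint peeling is maintained 3x as often.
def maintenance_freq(task):
    return 3.0 if task[1] == "paint peeling" else 1.0


library = build_library(is_valid, maintenance_freq)
support, query = split_task(list(range(40)), k=10, query_ratio=3)
print(len(library), round(sum(library.values()), 6), len(support), len(query))
```

The full combination yields 4 × 5 × 5 = 100 candidate meta-tasks; the illustrative filter removes 5 of them. Normalising by the total frequency guarantees that the weights of the retained meta-tasks sum to 1, as the claim requires.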

7. The method for intelligent image detection of the outer surface of a rail vehicle based on meta-learning according to claim 1, characterized in that: in step S3, the basic meta-training adopts a hierarchical meta-task weighted loss function, whose formula is expressed in terms of the hierarchical meta-task weight coefficients, the inner-loop learning rate, and the global parameters of the model. The total loss function is the final fusion of the classification and localization task losses: it combines, through two weighting coefficients, the cross-entropy loss function and the generalized intersection-over-union (GIoU) loss function.
In step S3, the basic meta-training adopts a dual-loop training mode consisting of an inner loop and an outer loop, which is the core step by which the model learns general representation rules. Training uses the hierarchical meta-task library constructed in S2 as its data foundation. The specific process is as follows:
Inner loop: for the support set of each meta-task, the loss is calculated and, taking the meta-learning rate as the baseline, the global parameters of the model are locally updated so that the model quickly adapts to the defect features of a single meta-task.
Outer loop: the parameters updated in the inner loop are applied to the query set of the corresponding meta-task to calculate the loss of that meta-task, which is combined with the hierarchical meta-task weight coefficients to calculate the global weighted loss; taking the base learning rate as the baseline, the original global parameters of the model are optimized and updated based on the global weighted loss, so that the model learns the defect representation rules common to all meta-tasks.
Looping and iteration: for all valid meta-tasks in the hierarchical meta-task library, the inner and outer loops above are executed in sequence to complete one round of global training; this process is repeated until the set number of iterations, 1500 to 2500, is reached, to ensure that the model is fully trained.
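The dual-loop procedure resembles weighted MAML-style training. The toy sketch below illustrates the mechanics on a one-parameter model with a quadratic loss per meta-task, so that both gradients are exact and no framework is needed. It is not the patent's detection model: the loss, the task targets, the weights, and the names `alpha` (inner-loop step size) and `beta` (outer-loop step size) are all assumptions.

```python
def grad(theta, target):
    """Gradient of the toy task loss 0.5 * (theta - target) ** 2."""
    return theta - target


def meta_train(tasks, alpha=0.1, beta=0.5, iterations=200, theta=0.0):
    """Weighted dual-loop training on (weight, target) toy meta-tasks.

    For simplicity the support and query sets of a task share one target.
    """
    for _ in range(iterations):
        meta_grad = 0.0
        for weight, target in tasks:
            # Inner loop: one local update on the task's support set.
            adapted = theta - alpha * grad(theta, target)
            # Outer loop: query loss at the adapted parameter, differentiated
            # with respect to the ORIGINAL theta (d adapted / d theta = 1 - alpha).
            meta_grad += weight * grad(adapted, target) * (1.0 - alpha)
        # Update the global parameter with the weighted meta-gradient.
        theta -= beta * meta_grad
    return theta


# Hypothetical meta-tasks: weights sum to 1, targets are arbitrary.
tasks = [(0.5, 1.0), (0.3, 2.0), (0.2, 4.0)]
theta = meta_train(tasks)
print(theta)
```

In this toy setting the global parameter settles at the weight-averaged target, which shows how the weight coefficients pull the shared initialisation toward frequently maintained conditions. In a real detector the per-task loss would be the fused classification and localization loss, and the gradients would come from automatic differentiation.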

8. The method for intelligent image detection of the outer surface of a rail vehicle based on meta-learning according to claim 1, characterized in that: In step S4, the new scenarios include new vehicle models, new defects, and new environments.

9. The method for intelligent image detection of the outer surface of a rail vehicle based on meta-learning according to claim 1, characterized in that: in step S4, the parameter update formula for rapid local parameter adaptation is expressed in terms of the fixed backbone network parameters, the initial detection head parameters, and the low-sample support set of the new scenario; the fine-tuned parameters account for 5% to 10% of the total parameters.
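The update formula of this claim is not reproduced in this text. One plausible reading, given as an assumption only, is a gradient step on the detection head while the backbone stays frozen. The symbols are hypothetical: θ_b denotes the fixed backbone parameters, φ_0 the initial detection head parameters, S_new the low-sample support set of the new scenario, η the fine-tuning learning rate, and L the fused detection loss.

```latex
% Head-only fine-tuning step; the backbone parameters \theta_b receive no update.
\phi' = \phi_0 - \eta\,
\nabla_{\phi}\,\mathcal{L}\bigl(\theta_b,\ \phi;\ S_{\mathrm{new}}\bigr)\Big|_{\phi=\phi_0}
```

Repeating this step for a small number of iterations updates only the head, which is consistent with the claim that the fine-tuned parameters make up 5% to 10% of the total.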