Image-based method for identifying loosening of train anti-hunting dampers
By constructing a visual anisotropic dimensionality reduction structure and orthogonal projection technology, the problems of dynamic background pixel intrusion and high-light reflection spots in the identification of loose train anti-hunting vibration dampers were solved, and high-precision identification in complex environments was achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HUITIE TECH CO LTD
- Filing Date
- 2026-04-02
- Publication Date
- 2026-06-16
Figure CN121963059B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to an image-based method for identifying loose train anti-hunting vibration dampers, belonging to the field of image understanding technology.
Background Technology
[0002] The current common technical strategy in this field is to construct a two-dimensional bounding box based on a visual target tracking algorithm and extract the geometric center of the bounding box to characterize the physical spatial position of the mechanical parts. This allows for the calculation of the relative displacement between different related parts to determine the structural connection status. This method can provide intuitive part contour tracking information under static or low-speed stable operating conditions.
[0003] However, as monitoring extends to high-speed alternating loads and complex outdoor lighting, this two-dimensional tracking logic reveals fundamental limitations. Beyond the vibration resistance of the hardware, the stability of the back-end recognition algorithm has become the core bottleneck restricting detection accuracy. For example, Chinese invention patent CN113947116B discloses a camera-based, non-contact, real-time method for detecting loosening in train tracks. It uses K-means clustering to extract the extreme gray values of pixels in the track and fastener areas as virtual feature points, and uses optical flow and FFT to resolve the natural frequencies. Because the main body of an anti-hunting vibration damper is typically a metal cylinder lacking high-frequency texture, the image stream acquired by the visual sensor under broadband mechanical vibration inevitably exhibits motion blur. Under these degraded visual conditions, the lateral edges of a traditional two-dimensional bounding box are prone to intrusion by dynamic background pixels around the track. At the same time, the high-gloss reflection spots on the metal surface shift transiently with the vehicle's attitude. This boundary distortion, which originates in the underlying optics, mixes a large amount of spurious drift noise into the extracted geometric-center coordinate sequence and exposes the monitoring system to false alarms.
[0004] Therefore, the technical problem to be solved by this invention is how to break the rigid dependence of traditional two-dimensional visual tracking on high-definition spatial edge features and construct a visual state tracking and analysis method that is immune to dynamic background pixel intrusion, adapts to spatial perspective distortion, and decouples high- and low-frequency environmental noise.
Summary of the Invention
[0005] To address the problems mentioned in the background art, the technical solution of the present invention is as follows: A method for identifying loose train anti-hunting shock absorbers based on images, comprising the following steps:
[0006] Step 101: Obtain continuous multi-frame train operation video image data including the train anti-hunting damper and associated reference components;
[0007] Step 102: Based on the macroscopic low-frequency contour template, delineate the first two-dimensional image block of the train anti-hunting damper and the second two-dimensional image block of the associated reference component in the continuous multi-frame train operation video image data respectively;
[0008] Step 103: Project the pixel gray values of the first two-dimensional image block and the second two-dimensional image block orthogonally along the direction perpendicular to the visual principal axis vector to construct the corresponding one-dimensional pixel distribution array, and calculate the weighted gray centroid of the one-dimensional pixel distribution array as the time-series tracking anchor point.
[0009] Step 104: Based on the temporal tracking anchor points of multiple consecutive frames, calculate the relative position difference between the weighted gray centroids of the first two-dimensional image block and the second two-dimensional image block within the corresponding frame, construct a relative coordinate difference sequence, set a moving average calculation window along the relative coordinate difference sequence, generate a local temporal baseline value based on the moving average calculation window, and subtract the local temporal baseline value from the relative coordinate difference sequence to generate a zero-mean residual sequence.
[0010] Step 105: Calculate the time series dispersion index of the zero-mean residual sequence. When the time series dispersion index is greater than the judgment threshold, output the identification result indicating that the train anti-hunting vibration damper is loose.
[0011] Preferably, the method includes the following steps: Step 201, after generating the zero-mean residual sequence, calculate the absolute difference between adjacent frame data nodes in the zero-mean residual sequence to generate the neighboring frame pixel gradient value; Step 202, compare the neighboring frame pixel gradient value with the pixel gradient threshold; Step 203, when the neighboring frame pixel gradient value is greater than the pixel gradient threshold, determine the corresponding data node as an outlier, and perform linear interpolation operation based on the normal data nodes before and after the outlier to replace the residual value corresponding to the outlier.
[0012] Preferably, step 301, calculating the time series dispersion index of the zero-mean residual sequence, is performed according to a formula (not reproduced in this text) in which D is the temporal dispersion index, N is the number of frames included in the moving average calculation window, and the remaining term is the residual value corresponding to the i-th frame in the zero-mean residual sequence.
[0013] Preferably, the macroscopic low-frequency profile template is generated through the following steps: Step 401, acquiring reference image data under static conditions of the train; Step 402, performing a Gaussian low-pass filter operation on the reference image data to extract the global low-frequency profile structure of the train anti-hunting damper and associated reference components; wherein, the global low-frequency profile structure is the energy distribution envelope reflecting the geometric boundary of the main body of the component, after filtering out texture noise with wavelengths smaller than a preset pixel threshold and fine reflective spots on the metal surface in the image by setting the Gaussian filter kernel size to 5×5 to 11×11 and the standard deviation σ to 1.2 to 2.0; Step 403, configuring the global low-frequency profile structure as a macroscopic low-frequency profile template.
[0014] Preferably, the visual principal axis vector is determined through the following steps: Step 501, extract the two-dimensional pixel coordinate set of the geometric centerline of the train anti-hunting vibration damper in the first two-dimensional image block; Step 502, apply the least squares method to fit the two-dimensional pixel coordinate set of the geometric centerline to generate the visual principal axis vector; Step 503, construct an orthogonal projection coordinate system in the image space based on the visual principal axis vector.
[0015] Preferably, the method includes the following steps: Step 601, after calculating the time series dispersion index, recording the time series dispersion index in multiple consecutive calculation cycles to generate a dispersion time series evolution sequence; Step 602, calculating the monotonically increasing slope of the dispersion time series evolution sequence; Step 603, outputting a warning signal when the monotonically increasing slope is greater than the evolution acceleration threshold; wherein, the evolution acceleration threshold is determined based on the average slope of the dispersion time series evolution sequence of the train in a healthy state under normal service conditions, and is set to 3 to 5 times the average slope, with a specific value range of 0.05 pixels per calculation cycle to 0.20 pixels per calculation cycle. When the measured slope exceeds this range, it is determined that the vibration damper connection stiffness has entered the nonlinear decay stage.
[0016] Preferably, the judgment threshold is generated through the following steps: Step 701, receiving a reference value characterizing the allowable mechanical displacement amplitude of the train; Step 702, mapping the reference value to the image pixel coordinate system according to the visual imaging ratio to generate a pixel displacement calibration value, the calculation being performed according to the following formula: $T_{pixel} = \frac{A_{mech} \cdot f}{\delta \cdot Z}$, where $T_{pixel}$ is the generated pixel displacement calibration value, $A_{mech}$ is the received reference value characterizing the allowable mechanical displacement amplitude of the train, $f$ is the physical focal length of the visual sensor, $\delta$ is the physical size of a pixel of the visual sensor, and $Z$ is the shooting distance of the visual sensor; Step 703, configuring the pixel displacement calibration value as the judgment threshold.
[0017] Preferably, the method includes the following steps: Step 801, receiving vehicle speed data representing the actual operating speed of the train and video frame rate parameters of multiple consecutive frames of train operation video image data; Step 802, calculating the ratio of vehicle speed data to video frame rate parameters, and generating a correction factor representing the sampling density of unit spatial displacement; Step 803, updating the number of window frames included in the moving average calculation window using the correction factor.
[0018] Preferably, the method includes the following steps: Step 901, after outputting the identification result indicating that the train anti-hunting shock absorber is loose, extracting the original video image segment corresponding to the time interval that triggered the identification result; Step 902, performing a color space conversion operation on the first two-dimensional image block and the second two-dimensional image block in the original video image segment, mapping the pixel grayscale values to a status label image, and setting the pseudo-color mapping interval of the status label image according to the residual value distribution of the zero-mean residual sequence; Step 903, outputting the identification result containing the status label image.
[0019] Preferably, the step of calculating the weighted gray centroid of a one-dimensional pixel distribution array includes the following steps: Step 1001, setting a truncation threshold based on the distribution of pixel gray values in the one-dimensional pixel distribution array; Step 1002, retaining the pixel position indices in the one-dimensional pixel distribution array whose pixel gray values are greater than the truncation threshold; Step 1003, using the pixel position indices and their corresponding pixel gray values to perform a weighted summation calculation to generate the weighted gray centroid.
[0020] Compared with the prior art, the beneficial effects of the present invention are:
[0021] 1. In the identification of loosening of anti-hunting vibration dampers in trains, this method constructs a visual anisotropic dimensionality reduction structure to reshape the generation logic of target tracking anchor points. For cylindrical metal components lacking high-frequency textures, this method extracts a one-dimensional visual principal axis vector representing the central axis of the component. The spatial coordinates of all pixels in the tracking area are orthogonally projected onto this principal axis vector. A one-dimensional mass distribution array is generated by accumulating pixel gray values along the orthogonal projection coordinates. The weighted gray centroid coordinates are calculated using the gray values of the pixels as weights. This mechanism forces the two-dimensional feature domain to collapse into a one-dimensional principal axis space. This physical dimensionality reduction operation spontaneously blocks the intrusion of dynamic background pixels perpendicular to the principal axis direction and the interference of the sliding of the high-gloss reflection spots on the metal surface on the positioning of coordinate points. The output weighted gray centroid coordinates purely map the distribution centroid of the physical mass inside the component at the optical level, thereby improving the signal-to-noise ratio of the tracking coordinate sequence in the underlying image processing domain and ensuring that the original visual data source input to the time series analysis model has real physical representativeness.
[0022] 2. A spatial attitude adaptive calibration mechanism is constructed by utilizing the inherent geometric topology of the tested scene. To address the nonlinear translational illusion induced on the two-dimensional imaging plane by the three-dimensional spatial attitude deflection when a train passes through a complex track section, this method extracts two local reference domains with a defined physical distance on the same rigid reference component. The temporal coordinates of the two domains are connected to generate a dynamic visual baseline vector that represents the real-time image attitude. The vertical projection distance from the centroid coordinates of the target component to this dynamic visual baseline vector is calculated to generate a projection distance sequence. The rigidity of the reference component itself is used as a dynamic scale in the image domain to transform the absolute difference comparison in the two-dimensional coordinate system into a one-dimensional orthogonal projection distance analysis. This spatial dimensionality reduction process relies on the geometric projection principle to eliminate pseudo-relative displacement components caused by camera perspective distortion or vehicle roll, so that the final displacement feature sequence purely reflects the real physical relative motion state between mechanical components.
[0023] 3. The environmental adaptability of the image understanding model is strengthened through a multi-dimensional temporal sequence purification mechanism that addresses the environmental interference faced in long-term outdoor monitoring. The method establishes a moving average calculation window along the relative coordinate difference sequence to generate local temporal baseline values, and subtracts the baseline from the original data to generate a zero-mean residual sequence. Because the moving average is itself a low-pass estimate of the baseline, subtracting it acts as a high-pass filter that removes low-frequency baseline drift caused by environmental factors such as thermodynamic creep of the camera bracket. At the same time, the method calculates the absolute difference between adjacent frames to generate the adjacent-frame gradient value, compares it with a dynamic extreme-value threshold representing the limit of physical displacement, and uses linear interpolation to replace tracking anomalies that exceed that limit. This mechanism introduces the dynamic constraints of heavy machinery into the cleaning of visual pixel data. By cutting off, through joint high- and low-frequency constraints, the paths by which optical outlier (flying-point) interference and slow environmental deformation reach the final decision model, it keeps the visual feature sequence consistent under extreme temperature differences and strong light.
Attached Figure Description
[0024] Figure 1 is a flowchart of the overall steps of the visual identification of loose train anti-hunting shock absorbers according to the present invention.
[0025] Figure 2 is a flowchart of the residual sequence outlier determination and time series data cleaning process of the present invention.
[0026] The objectives, features, and advantages of this invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed Implementation
[0027] The technical solutions of the embodiments of this application will be clearly described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of this application are within the scope of protection of this application.
[0028] An image-based method for identifying loose train anti-hunting dampers includes the following steps:
[0029] Step 101: Obtain continuous multi-frame train operation video image data including the train anti-hunting damper and associated reference components;
[0030] Step 102: Based on the macroscopic low-frequency contour template, delineate the first two-dimensional image block of the train anti-hunting damper and the second two-dimensional image block of the associated reference component in the continuous multi-frame train operation video image data respectively;
[0031] Step 103: Project the pixel gray values of the first two-dimensional image block and the second two-dimensional image block orthogonally along the direction perpendicular to the visual principal axis vector to construct the corresponding one-dimensional pixel distribution array, and calculate the weighted gray centroid of the one-dimensional pixel distribution array as the time-series tracking anchor point.
[0032] Step 104: Based on the temporal tracking anchor points of multiple consecutive frames, calculate the relative position difference between the weighted gray centroids of the first two-dimensional image block and the second two-dimensional image block within the corresponding frame, construct a relative coordinate difference sequence, set a moving average calculation window along the relative coordinate difference sequence, generate a local temporal baseline value based on the moving average calculation window, and subtract the local temporal baseline value from the relative coordinate difference sequence to generate a zero-mean residual sequence.
[0033] Step 105: Calculate the time series dispersion index of the zero-mean residual sequence. When the time series dispersion index is greater than the judgment threshold, output the identification result indicating that the train anti-hunting vibration damper is loose.
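The temporal part of the method (steps 104 and 105) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the moving average is taken as a centered window clipped at the sequence ends, and the dispersion index is assumed to be the root mean square of the residuals because the source formula is not reproduced in this text.

```python
import math

def moving_average_baseline(seq, window):
    """Centered moving average covering 2*(window//2)+1 frames, clipped at the ends."""
    half = window // 2
    baseline = []
    for i in range(len(seq)):
        lo = max(0, i - half)
        hi = min(len(seq), i + half + 1)
        baseline.append(sum(seq[lo:hi]) / (hi - lo))
    return baseline

def zero_mean_residual(seq, window):
    """Subtract the local temporal baseline from the relative coordinate difference sequence."""
    baseline = moving_average_baseline(seq, window)
    return [x - b for x, b in zip(seq, baseline)]

def dispersion_index(residual):
    """ASSUMPTION: dispersion taken as the RMS of the residuals, in pixels."""
    return math.sqrt(sum(r * r for r in residual) / len(residual))

def is_loose(rel_diff, window, threshold):
    """Step 105: flag loosening when the dispersion index exceeds the judgment threshold."""
    return dispersion_index(zero_mean_residual(rel_diff, window)) > threshold

# Slow drift alone is absorbed by the baseline; a fast oscillation on top of it is not.
drift = [0.01 * i for i in range(200)]
shaky = [d + (2.0 if i % 2 else -2.0) for i, d in enumerate(drift)]
print(is_loose(drift, 11, 1.0), is_loose(shaky, 11, 1.0))
```

The centered window is a design choice for offline analysis; a real-time variant would use a trailing window at the cost of some lag in the baseline.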
[0034] Preferably, the method includes the following steps: Step 201, after generating the zero-mean residual sequence, calculate the absolute difference between adjacent frame data nodes in the zero-mean residual sequence to generate the neighboring frame pixel gradient value; Step 202, compare the neighboring frame pixel gradient value with the pixel gradient threshold; Step 203, when the neighboring frame pixel gradient value is greater than the pixel gradient threshold, determine the corresponding data node as an outlier, and perform linear interpolation operation based on the normal data nodes before and after the outlier to replace the residual value corresponding to the outlier.
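Steps 201 to 203 can be illustrated with the sketch below. The function name is hypothetical, and one detail is an assumption rather than something the text states: each node is compared with the last node accepted as normal instead of strictly with its immediate neighbor, so that the frame following a single-frame spike is not flagged as well. A trailing outlier with no later normal node simply holds the previous normal value.

```python
def replace_outliers(residual, grad_threshold):
    """Flag jumps larger than grad_threshold and replace them by linear interpolation."""
    n = len(residual)
    is_outlier = [False] * n
    last_good = 0  # ASSUMPTION: the first node is treated as normal
    for i in range(1, n):
        if abs(residual[i] - residual[last_good]) > grad_threshold:
            is_outlier[i] = True
        else:
            last_good = i

    good = [i for i in range(n) if not is_outlier[i]]
    cleaned = list(residual)
    for i in range(n):
        if not is_outlier[i]:
            continue
        prev = max(g for g in good if g < i)             # always exists: node 0 is normal
        nxt = min((g for g in good if g > i), default=None)
        if nxt is None:
            cleaned[i] = residual[prev]                  # no later normal node: hold value
        else:
            t = (i - prev) / (nxt - prev)
            cleaned[i] = residual[prev] + t * (residual[nxt] - residual[prev])
    return cleaned

print(replace_outliers([0.1, 0.2, 9.0, 0.4, 0.3], 2.0))
```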
[0035] Preferably, step 301, calculating the time series dispersion index of the zero-mean residual sequence, is performed according to a formula (not reproduced in this text) in which D is the temporal dispersion index, N is the number of frames included in the moving average calculation window, and the remaining term is the residual value corresponding to the i-th frame in the zero-mean residual sequence.
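The formula for D does not survive in this text. Given only what the paragraph defines (D in pixels, a window of N frames, and a per-frame residual from a zero-mean sequence), one candidate that is consistent with those definitions is the root mean square of the residuals. The symbol $r_i$ for the i-th residual is introduced here for illustration, and the form below is an assumption, not the formula of the source.

```latex
D = \sqrt{\frac{1}{N}\sum_{i=1}^{N} r_i^{2}}
```

Because the residual sequence is zero-mean by construction, this coincides with the population standard deviation of the residuals over the window.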
[0036] Preferably, the macroscopic low-frequency profile template is generated through the following steps: Step 401, acquiring reference image data under static conditions of the train; Step 402, performing a Gaussian low-pass filter operation on the reference image data to extract the global low-frequency profile structure of the train anti-hunting damper and associated reference components; wherein, the global low-frequency profile structure is the energy distribution envelope reflecting the geometric boundary of the main body of the component, after filtering out texture noise with wavelengths smaller than a preset pixel threshold and fine reflective spots on the metal surface in the image by setting the Gaussian filter kernel size to 5×5 to 11×11 and the standard deviation σ to 1.2 to 2.0; Step 403, configuring the global low-frequency profile structure as a macroscopic low-frequency profile template.
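The Gaussian low-pass operation of step 402 can be sketched in plain Python as a separable filter. The function names and the replicate-edge border handling are assumptions; the default kernel size of 7 and σ of 1.5 are simply values inside the stated ranges (5×5 to 11×11, σ from 1.2 to 2.0).

```python
import math

def gaussian_kernel_1d(size, sigma):
    """Normalized 1D Gaussian kernel with an odd number of taps."""
    if size % 2 == 0:
        raise ValueError("kernel size must be odd")
    half = size // 2
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel; border pixels are replicated."""
    half = len(kernel) // 2
    out = []
    for row in image:
        width = len(row)
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, wk in enumerate(kernel):
                xi = min(max(x + k - half, 0), width - 1)
                acc += wk * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def gaussian_blur(image, size=7, sigma=1.5):
    """Separable Gaussian low-pass: filter the rows, then the columns."""
    kernel = gaussian_kernel_1d(size, sigma)
    rows_done = blur_rows(image, kernel)
    transposed = [list(col) for col in zip(*rows_done)]
    cols_done = blur_rows(transposed, kernel)
    return [list(col) for col in zip(*cols_done)]

# A one-pixel bright spot (a stand-in for a fine reflective spot) is strongly attenuated.
spot = [[0.0] * 9 for _ in range(9)]
spot[4][4] = 255.0
print(round(gaussian_blur(spot)[4][4], 1))
```

In practice an image library would do this far faster; the pure-Python form is used here only to keep the sketch self-contained.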
[0037] Preferably, the visual principal axis vector is determined through the following steps: Step 501, extract the two-dimensional pixel coordinate set of the geometric centerline of the train anti-hunting vibration damper in the first two-dimensional image block; Step 502, apply the least squares method to fit the two-dimensional pixel coordinate set of the geometric centerline to generate the visual principal axis vector; Step 503, construct an orthogonal projection coordinate system in the image space based on the visual principal axis vector.
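Steps 501 to 503, together with the projection of step 103, can be sketched as below. Two points are assumptions: the fit uses orthogonal (total) least squares via the second moments of the centerline points, which is one common reading of "least squares" for a line of arbitrary orientation, and the projection accumulates gray values into one-pixel bins along the fitted axis. All function names are hypothetical.

```python
import math

def fit_principal_axis(points):
    """Return (cx, cy, ux, uy): centroid and unit direction of the best-fit line."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    # Orientation of the major axis of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return cx, cy, math.cos(theta), math.sin(theta)

def project_block(block, axis, bin_width=1.0):
    """Sum gray values perpendicular to the axis, indexed by position along it."""
    cx, cy, ux, uy = axis
    bins = {}
    for y, row in enumerate(block):
        for x, gray in enumerate(row):
            s = (x - cx) * ux + (y - cy) * uy      # coordinate along the principal axis
            k = math.floor(s / bin_width + 0.5)    # nearest bin index
            bins[k] = bins.get(k, 0.0) + gray
    keys = sorted(bins)
    return keys, [bins[k] for k in keys]

axis = fit_principal_axis([(i, 2 * i + 1) for i in range(10)])
print(round(axis[3] / axis[2], 6))  # slope of the fitted direction
```

Because every pixel contributes only through its coordinate along the axis, anything that moves purely sideways across the component leaves the one-dimensional array unchanged.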
[0038] Preferably, the method includes the following steps: Step 601, after calculating the time series dispersion index, recording the time series dispersion index in multiple consecutive calculation cycles to generate a dispersion time series evolution sequence; Step 602, calculating the monotonically increasing slope of the dispersion time series evolution sequence; Step 603, outputting a warning signal when the monotonically increasing slope is greater than the evolution acceleration threshold; wherein, the evolution acceleration threshold is determined based on the average slope of the dispersion time series evolution sequence of the train in a healthy state under normal service conditions, and is set to 3 to 5 times the average slope, with a specific value range of 0.05 pixels per calculation cycle to 0.20 pixels per calculation cycle. When the measured slope exceeds this range, it is determined that the vibration damper connection stiffness has entered the nonlinear decay stage.
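A sketch of steps 601 to 603 follows. The slope is estimated here by an ordinary least-squares line through the recorded dispersion values, which is an assumption about how the "monotonically increasing slope" is computed. Clamping the threshold to the stated 0.05 to 0.20 pixels per calculation cycle is likewise one interpretation of the text, and the names and the default multiplier of 4 (inside the stated 3 to 5) are illustrative.

```python
def least_squares_slope(values):
    """Slope of the best-fit line through (0, v0), (1, v1), ... in pixels per cycle."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two calculation cycles")
    mean_t = (n - 1) / 2.0
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den

def evolution_warning(d_history, healthy_avg_slope, multiplier=4.0):
    """Warn when the dispersion trend exceeds the evolution acceleration threshold."""
    # ASSUMPTION: threshold = multiplier x healthy slope, clamped to 0.05..0.20 px/cycle.
    threshold = min(max(multiplier * healthy_avg_slope, 0.05), 0.20)
    return least_squares_slope(d_history) > threshold

print(evolution_warning([1.0, 1.3, 1.6, 1.9], healthy_avg_slope=0.02))
```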
[0039] Preferably, the judgment threshold is generated through the following steps: Step 701, receiving a reference value characterizing the allowable mechanical displacement amplitude of the train; Step 702, mapping the reference value to the image pixel coordinate system according to the visual imaging ratio to generate a pixel displacement calibration value, the calculation being performed according to the following formula: $T_{pixel} = \frac{A_{mech} \cdot f}{\delta \cdot Z}$, where $T_{pixel}$ is the generated pixel displacement calibration value, $A_{mech}$ is the received reference value characterizing the allowable mechanical displacement amplitude of the train, $f$ is the physical focal length of the visual sensor, $\delta$ is the physical size of a pixel of the visual sensor, and $Z$ is the shooting distance of the visual sensor; Step 703, configuring the pixel displacement calibration value as the judgment threshold.
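Under a pinhole-camera assumption, the mapping of step 702 scales a displacement in the object plane by the focal length over the shooting distance and then divides by the pixel pitch. The sketch below works through that arithmetic; the function name and the numeric values (a 16 mm lens, 5 µm pixels, a 2 m shooting distance, and a 2 mm allowable displacement) are illustrative assumptions and do not come from the source.

```python
def pixel_threshold(a_mech_mm, focal_mm, pixel_size_mm, distance_mm):
    """Map an allowable mechanical displacement (mm) to a displacement in pixels."""
    if pixel_size_mm <= 0 or distance_mm <= 0:
        raise ValueError("pixel size and shooting distance must be positive")
    return a_mech_mm * focal_mm / (pixel_size_mm * distance_mm)

# Hypothetical setup: 2 mm allowable displacement, f = 16 mm, 0.005 mm pixels, Z = 2000 mm.
print(pixel_threshold(2.0, 16.0, 0.005, 2000.0))
```

All four lengths must share one unit; mixing millimetres and metres is the most likely mistake when applying the ratio.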
[0040] Preferably, the method includes the following steps: Step 801, receiving vehicle speed data representing the actual operating speed of the train and video frame rate parameters of multiple consecutive frames of train operation video image data; Step 802, calculating the ratio of vehicle speed data to video frame rate parameters, and generating a correction factor representing the sampling density of unit spatial displacement; Step 803, updating the number of window frames included in the moving average calculation window using the correction factor.
[0041] Preferably, the method includes the following steps: Step 901, after outputting the identification result indicating that the train anti-hunting shock absorber is loose, extracting the original video image segment corresponding to the time interval that triggered the identification result; Step 902, performing a color space conversion operation on the first two-dimensional image block and the second two-dimensional image block in the original video image segment, mapping the pixel grayscale values to a status label image, and setting the pseudo-color mapping interval of the status label image according to the residual value distribution of the zero-mean residual sequence; Step 903, outputting the identification result containing the status label image.
[0042] Preferably, the step of calculating the weighted gray centroid of a one-dimensional pixel distribution array includes the following steps: Step 1001, setting a truncation threshold based on the distribution of pixel gray values in the one-dimensional pixel distribution array; Step 1002, retaining the pixel position indices in the one-dimensional pixel distribution array whose pixel gray values are greater than the truncation threshold; Step 1003, using the pixel position indices and their corresponding pixel gray values to perform a weighted summation calculation to generate the weighted gray centroid.
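Steps 1001 to 1003 can be sketched as a thresholded centroid over the one-dimensional array. The text does not say how the truncation threshold is derived from the gray-value distribution; taking a fixed fraction of the array maximum is an assumption made here for illustration, as are the function name and the default fraction.

```python
def truncated_weighted_centroid(profile, fraction=0.25):
    """Gray-weighted centroid over the bins whose value exceeds the truncation threshold."""
    threshold = fraction * max(profile)   # ASSUMPTION: threshold as a fraction of the peak
    num = 0.0
    den = 0.0
    for idx, gray in enumerate(profile):
        if gray > threshold:
            num += idx * gray
            den += gray
    if den == 0.0:
        raise ValueError("no bins above the truncation threshold")
    return num / den

clean = [0, 1, 10, 20, 10, 1, 0]
intruded = [4, 4, 10, 20, 10, 1, 0]      # low-level background added on one side
print(truncated_weighted_centroid(clean), truncated_weighted_centroid(intruded))
```

Both calls return the same anchor because the weak background bins fall below the threshold, whereas an untruncated centroid of the second array would be pulled toward index 0.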
[0043] Example 1: When a train is subjected to alternating loads and varying outdoor lighting, the continuous video stream of the train's anti-hunting damper acquired by the visual sensor suffers from motion blur and optical distortion. Dynamic background pixels around the track and high-gloss reflections on the metal surface distort high-frequency spatial feature boundaries, introducing drift noise into the coordinate tracking sequence. To address this optical degradation, continuous multi-frame train operation video image data containing the train's anti-hunting damper and associated reference components are acquired. Based on the macroscopic low-frequency contour template, a first two-dimensional image block of the train's anti-hunting damper and a second two-dimensional image block of the associated reference components are delineated in the continuous multi-frame train operation video image data. The two-dimensional pixel coordinate set of the geometric centerline of the train's anti-hunting damper in the first two-dimensional image block is extracted, and the least squares method is applied to fit it and generate the visual principal axis vector. The pixel gray values of the first two-dimensional image block and the second two-dimensional image block are orthogonally projected along the direction perpendicular to the visual principal axis vector to construct the corresponding one-dimensional pixel distribution arrays. This orthogonal projection filters out background pixels and optical artifacts that act perpendicular to the component's principal axis. A truncation threshold is then set according to the pixel gray value distribution in the one-dimensional pixel distribution array, and the pixel position indices whose gray values are greater than the truncation threshold are retained.
The retained pixel position indices and their corresponding pixel gray values are combined in a weighted summation to generate a weighted gray centroid, which is set as the temporal tracking anchor point. This step replaces two-dimensional edge analysis with an aggregation over the one-dimensional pixel mass distribution. The output weighted gray centroid coordinates characterize, at the optical level, the centroid of the physical mass distribution inside the component.
[0044] Based on the temporal tracking anchor points of multiple consecutive frames, the relative position difference between the weighted gray centroids of the first and second two-dimensional image blocks within the corresponding frame is calculated to construct a relative coordinate difference sequence. A moving average calculation window is set along the relative coordinate difference sequence to generate a local temporal baseline value. The local temporal baseline value is then subtracted from the relative coordinate difference sequence to generate a zero-mean residual sequence, which removes the slow baseline drift component caused by thermal deformation of the camera mount. Finally, the temporal dispersion index of the zero-mean residual sequence is calculated according to the formula (not reproduced in this text), in which D is the temporal dispersion index, in pixels; N is the number of frames included in the moving average calculation window, a dimensionless constant; and the residual value corresponding to the i-th frame in the zero-mean residual sequence is expressed in pixels. When the temporal dispersion index is greater than the judgment threshold, the identification result indicating loosening of the train anti-hunting vibration damper is output. Through pixel orthogonal projection and temporal difference calculation, the identification result corresponds to the physical relative motion between the mechanical components. Image jitter components caused by jumps in ambient light and by common-mode train vibration are suppressed in this data processing flow. The number of frames N in the moving average calculation window is determined according to the isolation boundary between the signal characteristic frequency bands.
Based on the video frame rate f of the continuous multi-frame train operation video image data, N is chosen as an integer such that N divided by f is greater than twice the characteristic period of mechanical loosening of the train anti-hunting vibration damper and less than one-tenth of the thermal creep time constant of the camera support structure, so that the zero-mean residual sequence separates the low-frequency baseline drift caused by ambient temperature differences from the high-frequency displacement caused by component loosening. When the judgment threshold for step 105 is determined, the physical focal length, physical pixel size, and shooting distance of the visual sensor are read, and the scaling factor between the image plane and the object plane is calculated. By locating two feature anchor points with a known physical separation of 150.0 mm on the associated reference component, the pixel span in the image coordinate system is measured in real time as 200 pixels, so the projection scaling factor at the current dynamic object distance is 150.0 mm / 200 pixels = 0.75 mm/pixel. The extracted weighted gray centroid coordinate difference is linearly compensated and scaled using this factor to decouple the pseudo-translation error in imaging scale caused by vehicle body roll. The limit of safe damper displacement defined by the rail transit standard is mapped to the corresponding pixel displacement value. Combined with the temporal dispersion distributions of the locked and loosened states under historical working conditions, the value that maximizes the inter-class separation between the feature envelopes of the two states is selected as the judgment benchmark.
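The window-length rule and the scale-factor arithmetic of this example can be checked with a short sketch. The function names are hypothetical, the frame rate is named `frame_rate_hz` to avoid confusion with the focal length f used elsewhere, and the loosening period and thermal time constant passed in below are placeholder values rather than figures from the source; only the 150.0 mm and 200 pixel values are taken from the text.

```python
import math

def window_frame_bounds(frame_rate_hz, loosening_period_s, thermal_time_constant_s):
    """Integer range of N with 2*T_loosen < N / frame_rate < tau_thermal / 10."""
    lower = math.floor(2.0 * loosening_period_s * frame_rate_hz) + 1
    upper = math.ceil(thermal_time_constant_s / 10.0 * frame_rate_hz) - 1
    if lower > upper:
        raise ValueError("no window length separates the two time scales")
    return lower, upper

def scale_mm_per_pixel(known_distance_mm, pixel_span):
    """Object-plane scale from two anchor points with a known physical separation."""
    return known_distance_mm / pixel_span

# Placeholder time scales: a 0.0505 s loosening period and a 600 s thermal time constant.
lo, hi = window_frame_bounds(500.0, 0.0505, 600.0)
scale = scale_mm_per_pixel(150.0, 200)   # values stated in the example
print(lo, scale, 4.0 * scale)            # a 4-pixel centroid difference expressed in mm
```

Recomputing the scale from the anchor points in every frame is what lets the measurement follow changes in object distance; a scale fixed at calibration time would fold those changes into the displacement estimate.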
[0045] Example 2: This example constructs a 1:1 scale physical vibration test platform for a railway vehicle bogie anti-hunting damper. The hardware used to acquire video image data is a global-shutter industrial vision sensor with a resolution of 1920×1080 and a sampling frequency of 500 Hz. A Gaussian white noise light source signal with an amplitude of 20 dB is actively introduced into the test platform environment. A programmable strobe lamp simulates the transient jump in ambient light from 1000 lx to 50000 lx that occurs when a train enters or exits a tunnel. Under these conditions, continuous multi-frame train operation video image data containing the reference component and the train anti-hunting damper are generated. The number of frames N contained in the moving average calculation window is set for real-time baseline tracking while preserving genuine low-frequency vibration. According to the laws of mechanical vibration, with the sampling frequency of the visual sensor fixed at 500 Hz, the dominant frequency of common-mode vibration during normal train operation is concentrated in the range of 0.5 Hz to 2 Hz. If N is too small, the moving-average baseline follows the genuine low-frequency relative displacement caused by mechanical loosening, which is then subtracted away. If N is too large, the system cannot filter out the low-frequency baseline drift caused by thermal expansion of the camera bracket. Decision rules are therefore established from this frequency band isolation boundary: the sliding time window width corresponding to N is selected to be greater than twice the actual loosening excitation period and less than the lower limit of the thermodynamic baseline drift time constant. Based on this physical decision logic, the test range of the frame number N is determined to be 80 to 150.
[0046] The scheme using a two-dimensional spatial bounding box to extract the geometric center was set as the first control group, the scheme including orthogonal projection but lacking sliding temporal differential filtering was set as the second control group, and the scheme of this invention was set as the experimental group. During the transient jump condition triggering stage, the high-brightness reflection spot slipped across the surface of the metal component in the original image. The geometric center sequence extracted by the first control group was affected by this optical distortion: its two-dimensional coordinate residual produced a jump peak with an amplitude of 12.4 pixels within 0.05 seconds, which increased the error of the output temporal dispersion index D. The experimental group extracted the two-dimensional pixel coordinate set of the geometric centerline of the train anti-hunting vibration damper in the first two-dimensional image block, fitted the visual principal axis vector to this coordinate set, and orthogonally projected the pixel gray values of the first two-dimensional image block and the second two-dimensional image block along the direction perpendicular to the visual principal axis vector to construct a one-dimensional pixel distribution array. This spatial dimension reduction mapping disperses the projection of the specular reflection spot of the three-dimensional cylinder in the non-principal axis direction. Under the same transient jump condition, the jump amplitude of the weighted gray centroid coordinate residual was suppressed to 1.2 pixels. The second control group also suppressed the optical reflection jump, but because the ambient temperature rose at a gradient of 2°C per minute, the camera bracket underwent thermodynamic creep, which caused the weighted gray centroid coordinates to shift continuously in one direction by 4.5 pixels within 10 minutes.
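The orthogonal projection step can be illustrated with the following minimal sketch. It rests on one reading of the text, which is an assumption: gray values are accumulated along lines perpendicular to the visual principal axis vector, so that the resulting one-dimensional array is indexed by position along the axis. The function name `project_block` and the nearest-bin accumulation are hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def project_block(block: Sequence[Sequence[float]],
                  axis: Tuple[float, float]) -> List[float]:
    """Sum gray values along lines perpendicular to `axis`.

    block[y][x] is a gray value; `axis` is the principal axis (dx, dy) in
    image coordinates. Bin k collects the pixels whose position along the
    axis, measured from the smallest such position in the block, rounds to k.
    """
    norm = math.hypot(axis[0], axis[1])
    if norm == 0:
        raise ValueError("axis must be non-zero")
    ux, uy = axis[0] / norm, axis[1] / norm
    # Signed position of every pixel along the unit axis direction.
    positions = [[x * ux + y * uy for x in range(len(row))]
                 for y, row in enumerate(block)]
    s_min = min(min(row) for row in positions)
    s_max = max(max(row) for row in positions)
    profile = [0.0] * (int(round(s_max - s_min)) + 1)
    for row, pos_row in zip(block, positions):
        for gray, s in zip(row, pos_row):
            profile[int(round(s - s_min))] += gray
    return profile


# Axis along x: each bin is a column sum; axis along y: each bin is a row sum.
demo = [[1, 2, 3],
        [4, 5, 6]]
along_x = project_block(demo, (1.0, 0.0))
along_y = project_block(demo, (0.0, 1.0))
```

Under this reading, a bright stripe that runs parallel to the axis over the full block length adds the same amount to every bin, which is one way to understand why the projected centroid is less sensitive to the reflection spot than a two-dimensional geometric center.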
[0047] The actual mechanical loosening gap was set to three discrete gradients: 0.5 mm, 1.0 mm, and 2.0 mm. For the frame number N, a lower out-of-range control group with N = 10 and an upper out-of-range control group with N = 500 were set, and, according to the formula..., the temporal dispersion index D for each group was calculated. In the lower out-of-range control group with N = 10, the window was so short that the low-frequency relative displacement caused by 2.0 mm of mechanical loosening was followed by the moving average baseline, so the zero-mean residual sequence output approached zero. The calculated temporal dispersion index D was less than 0.5 pixels, and the loosening went undetected (a missed detection). In the upper out-of-range control group with N = 500, the slow displacement introduced by environmental thermal creep could not be subtracted by the baseline. When the actual mechanical loosening gap was 0.5 mm, the output temporal dispersion index D reached 6.8 pixels, exceeding the judgment threshold and generating a false alarm. In the experimental group, when the frame number N is set to 100, the zero-mean residual sequence... The low-frequency component of thermal creep and the actual mechanical loosening component were separated. For loosening gaps of 0.5 mm, 1.0 mm and 2.0 mm, the temporal dispersion index D output by the experimental group increased stepwise and approximately linearly to 1.8 pixels, 3.5 pixels and 6.7 pixels, respectively. Environmental disturbances and baseline drift were suppressed by orthogonal projection dimensionality reduction and window temporal difference filtering, so that the output temporal dispersion index D maps the physical loosening state of the train anti-hunting vibration damper.
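The baseline subtraction whose window length is varied in this experiment can be sketched as below. The sketch assumes a trailing (causal) moving average that uses fewer samples during the first N - 1 frames; the text does not state whether the window is trailing or centered, and the function name is hypothetical. The dispersion index D itself is not computed here because its formula is not reproduced in this text.

```python
from typing import List, Sequence


def zero_mean_residual(diff_seq: Sequence[float], n_frames: int) -> List[float]:
    """Subtract a trailing moving-average baseline from each sample."""
    if n_frames < 1:
        raise ValueError("window must contain at least one frame")
    residuals = []
    running = 0.0  # sum of the samples currently inside the window
    for i, value in enumerate(diff_seq):
        running += value
        if i >= n_frames:
            running -= diff_seq[i - n_frames]  # drop the sample leaving the window
        count = min(i + 1, n_frames)
        residuals.append(value - running / count)
    return residuals


# A constant offset is removed completely.
flat = zero_mean_residual([1.0, 1.0, 1.0, 1.0], 2)

# A slow linear drift of 1 pixel per frame leaves only a small constant residual.
drift = zero_mean_residual([0.0, 1.0, 2.0, 3.0, 4.0], 2)
```

With a very short window the baseline tracks almost any motion, including real loosening displacement, which corresponds to the N = 10 behaviour reported above.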
[0048] Example 3: When a train crosses the boundary between light and dark or is affected by extreme weather, drastic changes in ambient light intensity cause local absolute grayscale values of the image to drift. Filtering pixels with static constant parameters will cause the loss of effective target features in the one-dimensional pixel distribution array. The system reads the one-dimensional pixel distribution array constructed from the first two-dimensional image block in the current frame, extracts the pixel grayscale values of all one-dimensional spatial nodes in the array, and calculates the mathematical expectation and standard deviation of the pixel grayscale values. The mathematical expectation and three times the standard deviation are added together to generate the dynamic truncation threshold for the current frame. This logic incorporates the overall grayscale shift caused by ambient light into the mathematical expectation variable, and uses the standard deviation parameter to filter out local high-frequency reflection disturbances, making the truncation threshold adaptable to the optical environment of the current frame. The system retains the pixel position indexes in the one-dimensional pixel distribution array whose pixel grayscale values are greater than the dynamic truncation threshold, and uses the pixel position indexes and their corresponding pixel grayscale values to apply a weighted summation calculation to generate a weighted grayscale centroid as a time-series tracking anchor point.
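The dynamic truncation threshold and weighted gray centroid of this example can be sketched as follows. This is a minimal illustration: the function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not specify which is meant. The synthetic profile values are placeholders.

```python
from statistics import fmean, pstdev
from typing import Optional, Sequence


def weighted_gray_centroid(profile: Sequence[float], k: float = 3.0) -> Optional[float]:
    """Centroid of the bins whose gray value exceeds mean + k * std."""
    threshold = fmean(profile) + k * pstdev(profile)
    kept = [(i, g) for i, g in enumerate(profile) if g > threshold]
    if not kept:
        return None  # no bin stands out from the background in this frame
    total = sum(g for _, g in kept)
    return sum(i * g for i, g in kept) / total


# Synthetic profile: flat background with a two-bin peak at indices 24 and 25.
dark = [10.0] * 50
dark[24] = dark[25] = 210.0

# Same scene after a uniform brightness shift of +100 gray levels.
bright = [g + 100.0 for g in dark]

c_dark = weighted_gray_centroid(dark)
c_bright = weighted_gray_centroid(bright)
```

Because the threshold moves with the mean, the same two bins are retained before and after the uniform brightness shift and the anchor position stays at 24.5. A fixed threshold of, say, 150 gray levels would retain those two bins in the dark frame but would retain only part of the array structure correctly by chance; it is not tied to the frame statistics.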
[0049] To address coordinate jumps in the zero-mean residual sequence caused by track obstructions or strong light flickering, a pixel gradient threshold is established based on the physical motion limits of the vibration damper's mechanical system. The system reads the nominal relative velocity of the train anti-hunting vibration damper under extreme load conditions and multiplies it by the sampling period of the visual sensor to calculate the theoretical displacement per frame. Then, combining the physical focal length and object distance parameters of the visual sensor, the theoretical displacement per frame is converted into a pixel-dimension translation in the corresponding image coordinate system. This pixel-dimension translation is multiplied by a system tolerance coefficient of 1.1 to establish the pixel gradient threshold. After the zero-mean residual sequence is generated, the absolute difference between adjacent frame data nodes in the zero-mean residual sequence is calculated to generate the adjacent-frame pixel gradient value, which is compared with the pixel gradient threshold. When the adjacent-frame pixel gradient value is greater than the pixel gradient threshold, the corresponding physical relative motion velocity exceeds the rigid body physical motion limit of the vibration damper. The system then determines the corresponding data node to be an anomaly point and performs linear interpolation based on the normal data nodes before and after the anomaly point to replace the residual value corresponding to the anomaly point. This data cleaning process establishes a logical benchmark based on the physical properties of mechanical kinematics, blocks the transmission path of non-structural distortion signals, and makes the calculated output time-series dispersion index map the real physical relative motion state of the component.
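The threshold construction and anomaly replacement described above can be sketched as follows. Several points are assumptions of this sketch and are not stated in the text: the conversion to pixels uses a millimetre-per-pixel scale factor instead of the focal length and object distance directly, the first node is treated as normal, each node is compared with the last normal node using a limit that grows with the frame gap (so that runs of consecutive anomalies are handled), and trailing anomalies hold the last normal value. The 150 mm/s velocity is a placeholder; the function names are hypothetical.

```python
from typing import List, Sequence


def pixel_gradient_threshold(v_max_mm_s: float, frame_rate_hz: float,
                             mm_per_pixel: float, tolerance: float = 1.1) -> float:
    """Largest physically plausible per-frame shift, in pixels."""
    disp_mm = v_max_mm_s / frame_rate_hz  # velocity times sampling period
    return disp_mm / mm_per_pixel * tolerance


def clean_residuals(residuals: Sequence[float], grad_threshold_px: float) -> List[float]:
    """Replace anomaly points by linear interpolation between normal nodes."""
    cleaned = list(residuals)
    last_good = 0  # assumption: the first node is normal
    for i in range(1, len(cleaned)):
        gap = i - last_good
        if abs(residuals[i] - residuals[last_good]) > grad_threshold_px * gap:
            continue  # node i moved faster than the rigid-body limit allows
        # Node i is normal: interpolate over any anomalies skipped before it.
        for j in range(last_good + 1, i):
            t = (j - last_good) / gap
            cleaned[j] = residuals[last_good] + t * (residuals[i] - residuals[last_good])
        last_good = i
    for j in range(last_good + 1, len(cleaned)):
        cleaned[j] = residuals[last_good]  # trailing anomalies: hold last normal value
    return cleaned


threshold = pixel_gradient_threshold(150.0, 500.0, 0.75)  # placeholder velocity
repaired = clean_residuals([0.0, 0.25, 5.0, 0.75, 1.0], 0.5)
```

In the example call the isolated spike of 5.0 pixels is replaced by 0.5, the value halfway between its normal neighbours.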
[0050] Example 4: In the case of initial deployment on an unknown type of rail vehicle or physical reset of the spatial pose of the vision sensor, initiate the pre-baseline calibration procedure; acquire reference image data under static conditions of the train, perform Gaussian low-pass filtering on the reference image data to filter out surface texture and dynamic high-frequency interference components, extract the global low-frequency contour structure of the train anti-hunting damper and associated reference components; establish the rigid body transformation matrix mapping relationship between the two-dimensional pixel coordinate system where the global low-frequency contour structure is located and the real physical coordinate system by combining the preset spatial coordinate boundary, and configure the global low-frequency contour structure as a macroscopic low-frequency contour template.
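The Gaussian low-pass filtering used to obtain the global low-frequency contour structure can be sketched in plain Python as follows. This is an illustrative sketch only: the function names are hypothetical, border pixels are handled by edge replication (an assumption), and the default kernel size 5 and standard deviation 1.2 are the lower ends of the ranges stated in claim 4. A production system would more likely call an image-processing library.

```python
import math
from typing import List, Sequence


def gaussian_kernel_1d(size: int, sigma: float) -> List[float]:
    """Normalized 1-D Gaussian weights; `size` must be odd."""
    if size < 1 or size % 2 == 0:
        raise ValueError("kernel size must be a positive odd integer")
    half = size // 2
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _filter_rows(image: Sequence[Sequence[float]],
                 kernel: Sequence[float]) -> List[List[float]]:
    """Convolve every row with the kernel, replicating edge pixels."""
    half = len(kernel) // 2
    out = []
    for row in image:
        last = len(row) - 1
        out.append([
            sum(w * row[min(max(x + k - half, 0), last)]
                for k, w in enumerate(kernel))
            for x in range(len(row))
        ])
    return out


def gaussian_low_pass(image: Sequence[Sequence[float]],
                      size: int = 5, sigma: float = 1.2) -> List[List[float]]:
    """Separable Gaussian blur: filter rows, then columns."""
    kernel = gaussian_kernel_1d(size, sigma)
    horizontal = _filter_rows(image, kernel)
    transposed = [list(col) for col in zip(*horizontal)]
    vertical = _filter_rows(transposed, kernel)
    return [list(col) for col in zip(*vertical)]


# A single bright pixel (fine texture) is spread out and strongly attenuated.
speck = [[0.0] * 7 for _ in range(7)]
speck[3][3] = 100.0
smoothed = gaussian_low_pass(speck)
```

The isolated bright pixel stands in for fine surface texture or a small reflective spot; after filtering its peak value is far lower while the total energy is preserved, whereas large uniform regions pass through unchanged.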
[0051] When determining the judgment threshold, a historical train operation image dataset containing calibrated physical loosening gap values is loaded; the first subset corresponding to the locking state is extracted and the temporal dispersion benchmark value is calculated; simultaneously, the second subset corresponding to the physical gap being greater than the safety limit extreme value is extracted and the temporal dispersion feature value is calculated; a data discrimination vector is constructed based on the upper boundary of the temporal dispersion benchmark value and the lower boundary of the temporal dispersion feature value; within the discrimination vector interval, a support vector machine classification algorithm is applied to solve for the hyperplane parameters with the maximum inter-class margin; the absolute value of the intercept corresponding to the hyperplane parameters is extracted and established as the judgment threshold. Through this offline calibration process, the critical state of mechanical deformation of the shock absorber is converted into a numerical decision boundary. This judgment threshold is used as a comparison benchmark to define the temporal dispersion index of the zero-mean residual sequence, so that the output identification result of the train anti-hunting shock absorber loosening maps the physical relative motion state between components.
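For a single scalar feature such as the temporal dispersion value, the maximum-margin boundary between two separable classes has a closed form, which the following sketch uses in place of a general support vector machine solver. That substitution, the function name, and the sample values are assumptions made for illustration. Note also that the boundary equals the absolute value of the intercept b only when the weight w is scaled to 1; in general the boundary lies at -b/w.

```python
from typing import Sequence, Tuple


def max_margin_threshold(locked_values: Sequence[float],
                         loose_values: Sequence[float]) -> Tuple[float, float, float]:
    """Hard-margin boundary for one feature: returns (threshold, w, b).

    The hyperplane is w * D + b = 0, scaled so that the two support
    vectors (upper locked value, lower loose value) give -1 and +1.
    """
    upper_locked = max(locked_values)
    lower_loose = min(loose_values)
    if upper_locked >= lower_loose:
        raise ValueError("classes overlap; a hard-margin boundary is undefined")
    w = 2.0 / (lower_loose - upper_locked)
    b = -(lower_loose + upper_locked) / (lower_loose - upper_locked)
    return -b / w, w, b


# Hypothetical dispersion values (pixels) for locked and loosened samples.
threshold, w, b = max_margin_threshold([0.5, 1.0, 1.5], [3.5, 4.0, 6.5])
```

In this example the boundary falls at the midpoint between the upper boundary of the locked subset (1.5) and the lower boundary of the loosened subset (3.5). If the two subsets overlapped, a soft-margin formulation would be needed, which this sketch does not cover.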
[0052] Example 5: When the system faces the objective condition that oil or local mud adheres to the surface of the train anti-hunting vibration damper, causing non-structural loss of edge pixels within the first two-dimensional image block, the system initiates the anti-interference fitting procedure for the visual principal axis vector. It reads the two-dimensional pixel coordinate set of the geometric centerline of the train anti-hunting vibration damper in the first two-dimensional image block, calculates the spatial distribution variance of the two-dimensional pixel coordinate set, and multiplies this value by a constant 2 to establish a distance truncation scale. It calculates the orthogonal Euclidean distance from each coordinate point in the two-dimensional pixel coordinate set to the initial prefitted line, and removes discrete coordinate points whose orthogonal Euclidean distance is greater than the scale value based on the distance truncation scale, retaining the remaining coordinate points to construct an interior point coordinate subset. It applies the least squares method to perform linear iterative fitting on the interior point coordinate subset, terminating the iteration when the absolute difference in the slope of the line output from two consecutive iterations is less than 0.01. It extracts the converged line direction parameters and outputs them as the visual principal axis vector. This numerical calculation process filters out coordinate extrema introduced by local optical contamination based on the spatial distribution density attribute, establishing that the geometric reference for the orthogonal projection operation is parallel to the physical mechanical axis of the vibration damper.
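The anti-interference fitting procedure can be sketched as follows, with several loudly stated assumptions. The text multiplies a variance by 2 to obtain the distance truncation scale; because a variance has units of squared pixels while the distances are in pixels, this sketch instead uses twice the standard deviation of the orthogonal distances to the prefitted line, which is a substitution and not the stated rule. The scale is computed once from the prefit, the inlier set is re-selected against the current line in each iteration, and the line is parameterized as y = a·x + b, so a near-vertical axis in image coordinates is not handled. All function names are hypothetical.

```python
import math
from statistics import pstdev
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def fit_line(points: Sequence[Point]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def orth_dist(point: Point, slope: float, intercept: float) -> float:
    """Orthogonal Euclidean distance from a point to the line."""
    x, y = point
    return abs(slope * x - y + intercept) / math.hypot(slope, 1.0)


def principal_axis(points: Sequence[Point], scale_factor: float = 2.0,
                   slope_tol: float = 0.01, max_iter: int = 20) -> Tuple[float, float]:
    """Unit direction vector of the robustly fitted centerline."""
    slope, intercept = fit_line(points)  # initial prefitted line
    dists: List[float] = [orth_dist(p, slope, intercept) for p in points]
    cutoff = scale_factor * pstdev(dists)  # assumption: std, not variance
    for _ in range(max_iter):
        inliers = [p for p in points if orth_dist(p, slope, intercept) <= cutoff]
        new_slope, new_intercept = fit_line(inliers)
        converged = abs(new_slope - slope) < slope_tol
        slope, intercept = new_slope, new_intercept
        if converged:
            break
    norm = math.hypot(1.0, slope)
    return 1.0 / norm, slope / norm


# Horizontal centerline with one contaminated point at x = 5.
pts = [(float(x), 0.0) for x in range(11)]
pts[5] = (5.0, 10.0)
ux, uy = principal_axis(pts)
```

In the example the contaminated point lies about 9.1 pixels from the prefitted line, beyond the cutoff of roughly 4.7 pixels, so it is excluded and the recovered axis is horizontal.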
[0053] For the calibration of the system tolerance coefficient in the pixel gradient threshold calculation logic, the system executes a pre-deployment on-site data calibration procedure; the vision sensor is controlled to collect benchmark verification video stream data of the target model train running continuously for 100 hours under full load without mechanical loosening, and the adjacent-frame pixel gradient values of all related frames in this period are calculated according to the aforementioned temporal differential filtering process to construct the numerical envelope set of the normal service state; the maximum adjacent-frame pixel gradient value in the numerical envelope set is extracted and divided by the maximum theoretical displacement of a single frame derived from the mechanical limit parameters to generate a dynamic compensation ratio; the dynamic compensation ratio is added to a fixed safety margin value of 0.05 to establish the system tolerance coefficient of the current model train. This parameter extraction logic transforms the inherent assembly tolerance of the specific train bogie and the normal vibration amplitude of the service track section into a quantitative calculation operator, so that the pixel gradient threshold generated by this coefficient is adapted to the background fluctuation boundary of the target deployment environment.
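The tolerance coefficient calibration reduces to a short calculation, sketched below. The function name and the sample gradient values are hypothetical, and the sketch assumes that the theoretical single-frame displacement has already been converted to pixels so that the ratio is dimensionless.

```python
from typing import Sequence


def system_tolerance_coefficient(healthy_gradients_px: Sequence[float],
                                 theoretical_disp_px: float,
                                 safety_margin: float = 0.05) -> float:
    """Dynamic compensation ratio plus the fixed safety margin."""
    if theoretical_disp_px <= 0:
        raise ValueError("theoretical displacement must be positive")
    ratio = max(healthy_gradients_px) / theoretical_disp_px
    return ratio + safety_margin


# Hypothetical envelope of adjacent-frame gradients from a healthy run (pixels)
# and a hypothetical theoretical single-frame displacement of 0.40 pixels.
coefficient = system_tolerance_coefficient([0.10, 0.30, 0.42], 0.40)
```

With these placeholder values the ratio is 1.05 and the coefficient comes out close to 1.1, the same order as the coefficient quoted in paragraph [0049]; real values depend on the recorded benchmark data.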
[0054] It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the present invention can be implemented in other specific forms without departing from the spirit or essential characteristics of the present invention.
[0055] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Claims
1. An image-based method for identifying loose train anti-hunting vibration dampers, characterized in that it includes the following steps: Step 101: Obtain continuous multi-frame train operation video image data including the train anti-hunting vibration damper and associated reference components; Step 102: Based on the macroscopic low-frequency contour template, delineate the first two-dimensional image block of the train anti-hunting vibration damper and the second two-dimensional image block of the associated reference component in the continuous multi-frame train operation video image data respectively; Step 103: Project the pixel gray values of the first two-dimensional image block and the second two-dimensional image block orthogonally along the direction perpendicular to the visual principal axis vector to construct the corresponding one-dimensional pixel distribution array, and calculate the weighted gray centroid of the one-dimensional pixel distribution array as the time-series tracking anchor point; Step 104: Based on the time-series tracking anchor points of multiple consecutive frames, calculate the relative position difference between the weighted gray centroids of the first two-dimensional image block and the second two-dimensional image block within the corresponding frame, construct a relative coordinate difference sequence, set a moving average calculation window along the relative coordinate difference sequence, generate a local temporal baseline value based on the moving average calculation window, and subtract the local temporal baseline value from the relative coordinate difference sequence to generate a zero-mean residual sequence; Step 105: Calculate the time-series dispersion index of the zero-mean residual sequence, and when the time-series dispersion index is greater than the judgment threshold, output the identification result indicating that the train anti-hunting vibration damper is loose.
2. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that it includes the following steps: Step 201: After generating the zero-mean residual sequence, calculate the absolute difference between adjacent frame data nodes in the zero-mean residual sequence to generate the adjacent-frame pixel gradient value; Step 202: Compare the adjacent-frame pixel gradient value with the pixel gradient threshold; Step 203: When the adjacent-frame pixel gradient value is greater than the pixel gradient threshold, determine the corresponding data node to be an outlier, and perform linear interpolation based on the normal data nodes before and after the outlier to replace the residual value corresponding to the outlier.
3. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that, Step 301, calculating the time series dispersion index of the zero-mean residual sequence, is performed according to the following formula: Where D is the temporal dispersion index, and N is the number of frames included in the moving average calculation window. is the residual value corresponding to the i-th frame in the zero-mean residual sequence.
4. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that the macroscopic low-frequency contour template is generated through the following steps: Step 401, acquire reference image data under static conditions of the train; Step 402, perform Gaussian low-pass filtering on the reference image data to extract the global low-frequency contour structure of the train anti-hunting vibration damper and associated reference components; wherein the global low-frequency contour structure is the energy distribution envelope that reflects the geometric boundary of the main body of the component after texture noise with wavelengths smaller than the preset pixel threshold and fine reflective spots on the metal surface have been filtered out by setting the Gaussian filter kernel size to 5×5 to 11×11 and the standard deviation σ to 1.2 to 2.0; Step 403, configure the global low-frequency contour structure as the macroscopic low-frequency contour template.
5. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that the visual principal axis vector is determined through the following steps: Step 501, extract the two-dimensional pixel coordinate set of the geometric centerline of the train anti-hunting vibration damper in the first two-dimensional image block; Step 502, apply the least squares method to fit the two-dimensional pixel coordinate set of the geometric centerline to generate the visual principal axis vector; Step 503, construct an orthogonal projection coordinate system in the image space based on the visual principal axis vector.
6. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that it includes the following steps: Step 601: After calculating the time-series dispersion index, record the time-series dispersion index over multiple consecutive calculation cycles to generate a dispersion time-series evolution sequence; Step 602: Calculate the monotonically increasing slope of the dispersion time-series evolution sequence; Step 603: When the monotonically increasing slope is greater than the evolution acceleration threshold, output a warning signal; wherein the evolution acceleration threshold is determined based on the average slope of the dispersion time-series evolution sequence of the train in its healthy state under normal service conditions, and is set to 3 to 5 times the average slope, with a specific value range of 0.05 pixels per calculation cycle to 0.20 pixels per calculation cycle; when the measured slope exceeds this range, it is determined that the connection stiffness of the vibration damper has entered the nonlinear decay stage.
7. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that the judgment threshold is generated through the following steps: Step 701, receive the reference value representing the allowable mechanical displacement amplitude of the train; Step 702, map the reference value to the image pixel coordinate system according to the visual imaging ratio to generate a pixel displacement calibration value, the visual imaging ratio being applied according to the following formula: $T_{pixel} = \dfrac{A_{mech} \cdot f}{\delta \cdot Z}$, where $T_{pixel}$ is the generated pixel displacement calibration value, $A_{mech}$ is the received reference value characterizing the allowable mechanical displacement amplitude of the train, $f$ is the physical focal length of the visual sensor, $\delta$ is the physical size of a pixel of the visual sensor, and $Z$ is the shooting distance of the visual sensor; Step 703, configure the pixel displacement calibration value as the judgment threshold.
8. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that it includes the following steps: Step 801: Receive vehicle speed data representing the actual operating speed of the train and the video frame rate parameter of the continuous multi-frame train operation video image data; Step 802: Calculate the ratio of the vehicle speed data to the video frame rate parameter to generate a correction factor representing the sampling density per unit spatial displacement; Step 803: Update the number of frames included in the moving average calculation window using the correction factor.
9. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that it includes the following steps: Step 901: After outputting the identification result indicating that the train anti-hunting vibration damper is loose, extract the original video image segment corresponding to the time interval that triggered the identification result; Step 902: Perform a color space conversion operation on the first two-dimensional image block and the second two-dimensional image block in the original video image segment, mapping the pixel gray values to a status label image, wherein the pseudo-color mapping interval of the status label image is set according to the residual value distribution of the zero-mean residual sequence; Step 903: Output the identification result containing the status label image.
10. The image-based method for identifying loose train anti-hunting vibration dampers according to claim 1, characterized in that calculating the weighted gray centroid of the one-dimensional pixel distribution array includes the following steps: Step 1001, set a truncation threshold based on the distribution of pixel gray values in the one-dimensional pixel distribution array; Step 1002, retain the pixel position indices in the one-dimensional pixel distribution array whose pixel gray values are greater than the truncation threshold; Step 1003, use the pixel position indices and their corresponding pixel gray values to perform a weighted summation calculation to generate the weighted gray centroid.