Railway vehicle running gear anomaly detection method based on symbolic regression and generative adversarial network
By employing methods based on symbolic regression and generative adversarial networks, the structural and dynamic characteristics of the running gear of rail vehicles are learned, and a health baseline is established. This solves the problem of untimely fault detection in existing technologies and enables online real-time anomaly detection and safety improvement.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- JIANGXI KMAX IND CO LTD
- Filing Date
- 2022-10-27
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies make it difficult to establish accurate fault detection models for the running gear of rail vehicles through mechanistic analysis, resulting in untimely fault detection.
A method based on symbolic regression and generative adversarial networks is adopted. By using a structural learning model and a generative adversarial network model, the structural characteristics of the running gear of the rail vehicle are learned as a health baseline, and anomaly detection is performed by combining real-time monitoring data.
It enables online real-time anomaly detection of the running gear of rail vehicles, which can dynamically track and promptly detect anomalies, improving safety and uptime.
Figure CN115688036B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of fault detection, and in particular to a method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks.
Background Technology
[0002] The running gear is a core system of modern rail vehicles, and has become a highly complex electromechanical device. Its operation relies on the coordinated operation of equipment from multiple fields, including mechanics, thermodynamics, electromechanical systems, electronics, and computers. It also features a complex operating environment and long service life. Therefore, utilizing the rapidly developing machine learning and big data technologies to achieve online fault detection and prediction of the running gear, thereby improving the safety and uptime of rail vehicles, has become a key trend in the development of rail transit equipment technology.
[0003] Faults in the running gear are typically monitored through the temperature of the axle box bearings. Temperature sensors are now widely deployed; generally, each axle of a rail vehicle is equipped with eight sensors. Two sense ambient temperature, one at each end of the axle, and the remaining six are evenly distributed over the two bearings at the ends of the axle, positioned at the upper left, top, and upper right of each bearing. Because axle temperature typically changes abruptly when an axle fault occurs, threshold-based monitoring leaves little warning time, and it is difficult to detect faults early based solely on temperature thresholds. Mechanism analysis is usually required to establish fault detection and prediction models. However, the running gear integrates various electromechanical components from diverse sources, making precise mechanism analysis difficult to implement.
Summary of the Invention
[0004] The technical problem to be solved by this invention is: addressing the technical problems existing in the prior art, this invention provides a method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks, learning the structural features of the running gear of rail vehicles as a health baseline, and supporting online real-time anomaly detection.
[0005] To solve the above-mentioned technical problems, the technical solution proposed by this invention is as follows:
[0006] A method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks includes the following steps:
[0007] Collect sensor data from the running gear of the rail vehicle and store it in a database;
[0008] A structural learning model and a generative adversarial network model are established. When training the structural learning model, the optimal model is selected and simplified according to the FFX algorithm and the Deap algorithm. The simplified optimal model predicts the sensor data of the target period based on the historical data of the database. The generative adversarial network model generates the data sequence of the target period based on the predicted and actual values of the sensor data of the target period. The results of the structural learning model and the generative adversarial network model are superimposed as the health baseline data.
[0009] Acquire sensor data from the running gear of the rail vehicle within the target time period as real-time monitoring data, and calculate the real-time deviation based on the real-time monitoring data and the corresponding health baseline data;
[0010] Calculate the average error of all real-time deviations within the target time period. If the average error exceeds the preset alarm threshold, an anomaly is determined to have occurred.
[0011] Furthermore, the specific steps for establishing a structure learning model include:
[0012] Set the operation subset P as the initial value of the basis function set Ψ, use the dependent variable X as the operand, and use the corresponding target variable y as the reference for calculating the evaluation index;
[0013] Select a target basis function from the basis function set Ψ as an operator or operand, and add it to the current basis function set. Repeat this step until the size and quality of the current basis function set reach the preset iteration trigger threshold.
[0014] The current model is obtained by fusing the current set of basis functions and model parameters. The target variable is estimated based on the dependent variable using the current model. The model parameters are learned by calculating the error between the estimated and actual values of the target variable.
[0015] The quality of the current model is evaluated using model evaluation metrics. If the quality of the current model does not meet the requirements, the process returns to the step of setting the set of operations P as the initial value of the set of basis functions Ψ, or returns to the step of selecting the target basis function from the set of basis functions Ψ as an operator or operand. If the quality of the current model meets the requirements, the current model is output.
[0016] Furthermore, the expression for the current model is as follows:
[0017] $\hat{y} = \sum_{i=1}^{n_b} a_i \Psi_i(X) + b$
[0018] In the above formula, $i = 1, \ldots, n_b$; $\hat{y}$ is the estimated value of the target variable; X is the dependent variable, i.e., the matrix formed by combining $x_1$ to $x_m$; $\Psi_i$ is the i-th basis function of the current basis function set; $a_i$ is the coefficient of the basis function $\Psi_i$; b is the offset; $n_b$ is the number of basis functions; m is the dimension of the input data; and $x_1$ to $x_m$ are the m observed variables, i.e., the sensing data from m sensors.
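As a minimal sketch of how such an expression can be evaluated, assume each basis function is a Python callable that maps one row of sensor readings to a number. The names `predict` and `basis_functions`, and the two example bases, are illustrative assumptions and do not come from the patent.

```python
import math

def predict(rows, basis_functions, coeffs, offset):
    """Evaluate y_hat = sum_i a_i * Psi_i(x) + b for every row x of X."""
    if len(basis_functions) != len(coeffs):
        raise ValueError("one coefficient is required per basis function")
    return [
        sum(a * psi(x) for a, psi in zip(coeffs, basis_functions)) + offset
        for x in rows
    ]

# Hypothetical example: m = 2 sensors, n_b = 2 basis functions.
basis_functions = [
    lambda x: x[0],            # Psi_1: first sensor reading
    lambda x: math.sin(x[1]),  # Psi_2: sine of the second reading
]
X = [[1.0, 0.0], [2.0, math.pi / 2]]
y_hat = predict(X, basis_functions, coeffs=[2.0, 3.0], offset=0.5)
```

Because the model is linear in the coefficients and the offset, the nonlinearity lives entirely inside the basis functions, which keeps the learned expression readable.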
[0019] Furthermore, the model evaluation metrics specifically include:
[0020] Model optimization objective: The optimization objective is to minimize the sum of squared errors between the estimated and actual values of the target variable, i.e., the accuracy factor;
[0021] Model generalization: The accuracy and complexity of the model are measured using an accuracy factor and a complexity factor. The model complexity is used as a constraint on the model, and the model is solved in conjunction with the accuracy factor to ensure both the accuracy and generalization of the model.
[0022] Furthermore, the accuracy factor is expressed as follows:
[0023] $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_{t_i} - \hat{y}_{t_i} \right)^2}$
[0024] In the above formula, RMSE is the standard root mean square error, $t_i$ is the i-th time of the sampling points t, $y_{t_i}$ is the target variable observation at time $t_i$, $\hat{y}_{t_i}$ is the target variable estimate at time $t_i$, and n is the size of the target variable vector y.
[0025] The complexity factor is represented by the number of basis functions, and the expression is as follows:
[0026] $C = \mathrm{size}(f(X)), \quad f(X) = \sum_{i=1}^{n_b} a_i \Psi_i(X) + b$
[0027] In the above formula, X is the dependent variable, i.e., the matrix formed by combining $x_1$ to $x_m$; $a_i$ is the coefficient of the basis function $\Psi_i$; b is the offset; $n_b$ is the number of basis functions; f(X) is the structure learning model; and size() measures the model size, i.e., the number of basis functions.
[0028] Furthermore, when measuring model accuracy and model complexity using accuracy and complexity factors, a balance is struck between the two, as expressed below:
[0029] L = RMSE + λC, λ ∈ [0, 1]
[0030] In the above formula, RMSE is the accuracy factor, C is the complexity factor, and λ is the balance factor between accuracy and complexity.
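The trade-off can be illustrated with a short sketch. It assumes the complexity factor C is simply the number of basis functions, as stated above; the function names and the toy numbers are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observations and estimates."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def model_score(y_true, y_pred, n_basis, lam=0.1):
    """L = RMSE + lambda * C, with C taken as the number of basis functions."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    return rmse(y_true, y_pred) + lam * n_basis

y_true = [1.0, 2.0, 3.0, 4.0]
# A small model with some error versus a large model that fits exactly.
small = model_score(y_true, [1.0, 2.0, 3.0, 6.0], n_basis=2, lam=0.5)
large = model_score(y_true, [1.0, 2.0, 3.0, 4.0], n_basis=6, lam=0.5)
```

With these toy values the smaller model scores lower (better) even though it fits less closely, which is the behaviour the balance factor λ is meant to control.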
[0031] Furthermore, the specific steps for selecting and simplifying the optimal model based on the FFX algorithm and the Deap algorithm during the training of the structure learning model include:
[0032] Generate a set of basis functions;
[0033] Identify the optimal basis function and use path regularization learning to learn the parameters;
[0034] By filtering redundant candidate functions based on minimizing the error rate and complexity, the optimal model set is obtained, and the optimal model set is used as the population.
[0035] According to the genetic algorithm, the population is reproduced to generate a candidate set of offspring, and mutation and crossover operations are performed.
[0036] The candidate set of offspring is evaluated. Based on the evaluation results, the best offspring are selected to form the population and the process is repeated until the population converges. The optimal model set corresponding to the population is then output.
[0037] Furthermore, the generator of the generative adversarial network includes a differential equation generator, the expression of which is as follows:
[0038] $\frac{dh}{dt} = G(h, t, z; \theta_G)$
[0039] Where t ∈ [0, 1]; the generator G consists of multiple subnetworks, each learning the dynamic characteristics of a specific waveform; the system state is $h = y - \hat{y}$, where y is the actual observed value of the target variable and $\hat{y}$ is the estimated value of the target variable; z is random noise; and $\theta_G$ are the network parameters of the generative adversarial network.
[0040] Furthermore, the generator of the generative adversarial network includes a differential equation solver, as shown in the following formula:
[0041] $h_{t+1} = \mathrm{ODETimeStep}(h_t, \Delta, G(h_t, t, z; \theta_G))$
[0042] Where t ∈ [0, 1]; the generator G consists of multiple subnetworks, each learning the dynamic features of a specific waveform; $h_t$ is the generated sample; and Δ is the step size.
[0043] Furthermore, the discriminator of the generative adversarial network model uses RNN classification for judgment.
[0044] Compared with the prior art, the advantages of the present invention are as follows:
[0045] This invention establishes a structural learning model and a generative adversarial network model, which can use the learned system structural features as a health baseline and optimize and simplify the model. In the real-time anomaly detection process, the structural learning model and the generative adversarial network model are used to estimate the health baseline, and the presence of anomalies is determined by comparing the average error between the health baseline and the real-time monitoring data against the alarm threshold, thereby realizing dynamic real-time tracking of anomalies in the running gear of rail vehicles.
Attached Figure Description
[0046] Figure 1 is a diagram illustrating the relationship between the structure learning model and the generative adversarial network in an embodiment of the present invention.
[0047] Figure 2 is a flowchart illustrating the establishment of a structure learning model in an embodiment of the present invention.
[0048] Figure 3 is a schematic diagram of the generative adversarial network structure in an embodiment of the present invention.
[0049] Figure 4 is a schematic diagram illustrating the training results of the structural learning model and the generative adversarial network model on the structural features of a certain axle temperature in an embodiment of the present invention.
[0050] Figure 5 is a flowchart illustrating an embodiment of the present invention.
[0051] Figure 6 is a flowchart of a method according to an embodiment of the present invention.
Detailed Implementation
[0052] The present invention will be further described below with reference to the accompanying drawings and specific preferred embodiments, but this does not limit the scope of protection of the present invention.
[0053] Symbol definition
[0054] Assume the data sampling points are t = [t1, t2, ..., tn], ordered non-decreasingly in terms of time units, with equal intervals by default.
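[0055] Suppose a single sampling result is (X, y), where X is the dependent variable and y is the target variable. The dependent variable $X \in \mathbb{R}^{n \times m}$, as shown in formula (1), is a matrix with n rows and m columns, where the n rows represent the observation results at the n times corresponding to t, and the m columns represent m different observed variables.
[0056] $X = [x_1, x_2, \ldots, x_m] = \begin{bmatrix} x_1(t_1) & \cdots & x_m(t_1) \\ \vdots & \ddots & \vdots \\ x_1(t_n) & \cdots & x_m(t_n) \end{bmatrix}$ (1)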
[0057] The target variable y, as shown in formula (2), is the observation results at n times corresponding to t.
[0058] $y = [y(t_1), y(t_2), \ldots, y(t_n)]^T$ (2)
[0060] Suppose we have an operation subset P, as shown in formula (3), where each element represents a basic operation. The current operations include: addition add(x, y), which returns the sum of two input vectors, x + y; subtraction sub(x, y), which returns the difference of two input vectors, x − y; multiplication mul(x, y), which returns the product of two input vectors, x * y; division divide(x, y), which returns the quotient of two input vectors, x / y; sine sin(x), which returns the sine of the input vector; cosine cos(x), which returns the cosine of the input vector; logarithm log(x, i), which returns the base-i logarithm of the input vector, $\log_i(x)$; exponentiation pow(x, i), which returns the input vector raised to the power i, $x^i$; maximum max(x), which returns the maximum value of the input vector; and minimum min(x), which returns the minimum value of the input vector.
[0061] $P = \{\mathrm{add}, \mathrm{sub}, \mathrm{mul}, \mathrm{divide}, \sin, \cos, \log, \mathrm{pow}, \max, \min\}$ (3)
[0062] The basis function vector Ψ, defined in formula (4), has a total of $n_b$ elements, where each element $\varphi_i$ represents a basis function, i.e., a linear or nonlinear composite function composed of several operators from the operator set P.
[0063] $\Psi = [\varphi_1, \varphi_2, \ldots, \varphi_{n_b}]$ (4)
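The operator set and a composite basis function can be sketched as follows. The dictionary layout, the element-wise reading of the binary operators, and the example function `phi_example` are assumptions made for illustration; the sketch does not guard against division by zero or logarithms of non-positive values.

```python
import math

# Operator set P, implemented element-wise on Python lists (assumed semantics).
P = {
    "add": lambda x, y: [a + b for a, b in zip(x, y)],
    "sub": lambda x, y: [a - b for a, b in zip(x, y)],
    "mul": lambda x, y: [a * b for a, b in zip(x, y)],
    "divide": lambda x, y: [a / b for a, b in zip(x, y)],
    "sin": lambda x: [math.sin(a) for a in x],
    "cos": lambda x: [math.cos(a) for a in x],
    "log": lambda x, i: [math.log(a, i) for a in x],  # base-i logarithm
    "pow": lambda x, i: [a ** i for a in x],
    "max": lambda x: max(x),
    "min": lambda x: min(x),
}

def phi_example(x1, x2):
    """Hypothetical composite basis function: pow(x1, 2) * cos(x2)."""
    return P["mul"](P["pow"](x1, 2), P["cos"](x2))

values = phi_example([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```

Composite basis functions are built by nesting operators in this way, so the search space grows quickly with nesting depth, which is why the size of the basis function set has to be controlled.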
[0064] Typically, analyzing equipment data requires a combined approach of mechanism and data analysis. However, the increasing complexity of rail transit equipment makes it difficult to establish accurate mechanistic models. Furthermore, the globalization of manufacturing and complex equipment ownership relationships exacerbate knowledge fragmentation, often hindering mechanism-guided data analysis in practice. Therefore, utilizing data-driven techniques for structure learning to build machine learning models capable of accurately predicting trends in equipment data is of paramount practical significance.
[0065] The FFX (Fast Function Extraction) algorithm, a deterministic optimization method, and the Deap genetic programming algorithm can serve as structure generation and parameter learning engines. The former can be used to generate initial solutions, while the latter can be used to prune the results generated by FFX to form a more simplified model.
[0066] As shown in Figure 1, this embodiment proposes a structure learning model based on symbolic regression, which integrates deterministic optimization algorithms and genetic algorithms for training, thereby revealing the underlying physical mechanism of the system. Since symbolic regression is an NP-hard problem, searching for an expression that fully describes the system is not always feasible; therefore, a generative adversarial network is also used to learn the dynamic features of the system through residual learning.
[0067] As shown in Figure 2, the process of building the structure learning model is as follows:
[0068] 1) Initialization: Set the operation subset P according to formula (3) as the initial value of the basis function set Ψ, and input the dependent variable X as the operand, where X is the matrix formed by combining $x_1$ to $x_m$, and $x_1$ to $x_m$ are the m observed variables, i.e., the sensing data from m sensors. The corresponding target variable y (i.e., the axle box bearing temperature, hereinafter referred to as axle temperature) is used as the reference for calculating the evaluation index.
[0069] 2) Structure generation: Based on expert experience or traversal, select several basis functions from the basis function set Ψ as operators or operands to form new, more complex basis functions, and add them to the current basis function set $\Psi_i$. This step iterates in a loop, using predefined structural evaluation indicators to control the size and quality of the basis function set. The iteration stops when the size and quality of the current basis function set $\Psi_i$ reach the iteration trigger threshold, and $\Psi_i$ is passed to the next step;
[0070] 3) Parameter learning: The input current basis function set $\Psi_i$ is fused with the model parameters to obtain the current model, which is used to estimate the target variable. The model parameters are randomly initialized, the estimated value $\hat{y}$ of the target variable is calculated, and the model parameters are learned from the error between $\hat{y}$ and the actual observed value y. The expression of the current model is as follows:
[0071] $\hat{y} = \sum_{i=1}^{n_b} a_i \Psi_i(X) + b$ (5)
[0072] In the above formula, $i = 1, \ldots, n_b$; $\hat{y}$ is the estimated value of the target variable; X is the dependent variable; $\Psi_i$ is the i-th basis function of the current basis function set; $a_i$ is the coefficient of the basis function $\Psi_i$; b is the offset; $n_b$ is the number of basis functions; m is the dimension of the input data; and model parameter learning means learning the parameters $\{a_i\}$, $i = 1, \ldots, n_b$, and b;
[0073] 4) Model Selection: Evaluate the quality of the current model using model evaluation metrics to decide whether to accept the current learning results. If the learning results are unsatisfactory, return to step 2, or even return to step 1 to adjust the model input. If the learning results are satisfactory, output the current model.
[0074] In this embodiment, the model evaluation metrics include:
[0075] Model optimization objective: the accuracy factor. The objective is to minimize the sum of squared errors between the estimated and actual values of the target variable, i.e., to minimize the following equation:
[0076] $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_{t_i} - \hat{y}_{t_i} \right)^2}$ (6)
[0077] In the above formula, RMSE is the standard root mean square error, n is the size of the target variable vector y, $t_i$ is the i-th time of the sampling points t, $\hat{y}_{t_i}$ is the target variable estimate at time $t_i$, and $y_{t_i}$ is the target variable observation at time $t_i$.
[0078] Model generalization: Unlike general regression methods, this embodiment considers both model accuracy and model complexity as measured by accuracy factors and complexity factors, respectively, and uses a balancing factor λ between accuracy and complexity to balance the two, as shown in formula (7). The accuracy factor is the sum of squared errors between the estimated value and the true value, and its expression is shown in formula (6). The complexity factor is measured by the model size, which is C, as shown in formula (8), and its expression is as follows:
[0079] L=RMSE+λC, λ∈[0, 1] (7)
[0080] $C = \mathrm{size}(f(X)), \quad f(X) = \sum_{i=1}^{n_b} a_i \Psi_i(X) + b$ (8)
[0081] In the above formulas, RMSE is the accuracy factor, C is the complexity factor, λ is the balance factor between accuracy and complexity, X is the dependent variable, i.e., the matrix formed by combining $x_1$ to $x_m$, $a_i$ is the coefficient of the basis function $\Psi_i$, b is the offset, $n_b$ is the number of basis functions, and f(X) is the structure learning model. The size() function indicates that the complexity factor is a norm function; here, the L0 norm is used to limit the complexity of the model, and λ balances accuracy against complexity. This is equivalent to optimizing accuracy under a complexity constraint, i.e., a constrained optimization problem.
[0082] As can be seen, the complexity factor C is represented by the number of basis functions. The more basis functions there are, the more complex the model and the larger the complexity penalty. The value of L decreases as the iteration progresses; the smaller the value of L, the better the current model is rated.
[0083] 5) Conclusion: The final output of the framework is one or more preferred models, along with an evaluation of each model (i.e., L in equation (7)). These models can express the underlying dynamic characteristics and operating mechanisms of the system, and can also provide support for other tasks in the later stages. The expressions for these preferred models are as follows:
[0084] $\hat{y} = f(X) = \sum_{i=1}^{n_b} a_i \Psi_i(X) + b$
[0085] $M = \{f_1, f_2, \ldots, f_{n_m}\}$
[0086] In the above formulas, $i = 1, \ldots, n_b$; $\hat{y}$ is the estimated value of the target variable; X is the dependent variable, i.e., the matrix formed by combining $x_1$ to $x_m$; $\Psi_i$ is the i-th basis function of the current basis function set; $a_i$ is the coefficient of the basis function $\Psi_i$; b is the offset; $n_b$ is the number of basis functions; m is the dimension of the input data; $x_1$ to $x_m$ are the m observed variables, i.e., the sensing data from m sensors; $f_1$ to $f_{n_m}$ are the models to be solved; M is the set of models to be solved; and $n_m$ is the size of that set.
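The parameter learning in step 3) can be sketched with plain gradient descent on the mean squared error. This is only an illustration under stated assumptions: the basis function outputs are taken as already computed, the learning rate, epoch count, and toy data are hypothetical, and the FFX training described below uses path regularization rather than this simple update.

```python
import random

def fit_parameters(features, y, lr=0.1, epochs=2000, seed=0):
    """Learn coefficients a_i and offset b by gradient descent on the MSE.

    features[k][i] holds the value of basis function Psi_i at sampling time t_k.
    """
    rng = random.Random(seed)
    n, n_b = len(y), len(features[0])
    a = [rng.uniform(-0.1, 0.1) for _ in range(n_b)]  # random initialisation
    b = rng.uniform(-0.1, 0.1)
    for _ in range(epochs):
        # Error between the current estimate and the observed value.
        errors = [sum(ai * f for ai, f in zip(a, row)) + b - yk
                  for row, yk in zip(features, y)]
        grad_a = [2.0 / n * sum(e * row[i] for e, row in zip(errors, features))
                  for i in range(n_b)]
        grad_b = 2.0 / n * sum(errors)
        a = [ai - lr * g for ai, g in zip(a, grad_a)]
        b -= lr * grad_b
    return a, b

# Toy data generated from y = 2 * Psi_1 - 1 * Psi_2 + 0.5 (hypothetical values).
features = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0], [0.0, 2.0]]
y = [2.0 * f1 - 1.0 * f2 + 0.5 for f1, f2 in features]
a, b = fit_parameters(features, y)
```

On this noise-free toy data the learned values approach the generating coefficients; with real sensor data the fit would be judged by the evaluation value L instead.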
[0087] After constructing the structure learning model, the training process for the structure learning model first executes the FFX algorithm to obtain an initial solution, which involves the following three steps:
[0088] 1) Generate a large-scale basis function set in a deterministic manner;
[0089] 2) In the structure generation step, the optimal basis functions are identified when selecting several basis functions from the basis function set Ψ; in the parameter learning step, when the model parameters are learned from the error between the estimated value of the target variable and the actual observed value y, path regularization learning is used;
[0090] 3) After the model selection step, redundant candidate basis functions are filtered out using the evaluation value L of each model, with the objective of minimizing the error rate and complexity;
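One way to read the filtering in step 3) is as a non-dominated (Pareto) filter over error and complexity. That reading is an assumption; the text only states that both quantities are minimized. The tuple layout and the candidate values below are hypothetical.

```python
def pareto_front(models):
    """Keep models that no other model beats on both error and complexity.

    Each model is a (name, error, complexity) tuple.
    """
    front = []
    for name, err, size in models:
        dominated = any(
            (e <= err and s <= size) and (e < err or s < size)
            for _, e, s in models
        )
        if not dominated:
            front.append((name, err, size))
    return sorted(front, key=lambda m: m[2])

candidates = [
    ("f1", 0.90, 1),
    ("f2", 0.40, 3),
    ("f3", 0.45, 4),  # dominated by f2: larger and less accurate
    ("f4", 0.10, 7),
    ("f5", 0.10, 9),  # dominated by f4: same error, larger
]
front = pareto_front(candidates)
```

The surviving models span the trade-off from small and coarse to large and accurate, which gives the later pruning stage a varied starting set.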
[0091] Through the above steps, the optimal model set M is selected. The optimal model set M output by FFX is analyzed to obtain the basis function set, which serves as the initial input for Deap genetic programming, i.e., as the initial population. Then, the following three steps are executed iteratively until convergence:
[0092] 1) Reproduction step: Generate a candidate set of offspring based on the candidate set of the current population, and ensure the diversity of the population through mutation and crossover operations;
[0093] 2) Evaluation step: Evaluate the candidate descendant set generated in the previous step;
[0094] 3) Selection step: Based on the evaluation results of the previous step, select the best candidate set of offspring to form the next generation population.
[0095] Through the above steps, the model in the candidate set of the convergent population is the simplified optimal model.
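The reproduction, evaluation, and selection loop above can be sketched in plain Python. This does not use the Deap library's API; it is a stand-in in which each individual is a tuple of 0/1 flags marking which basis functions are kept, and the fitness function, `USEFUL`, and `LAMBDA` are toy assumptions replacing a real RMSE computation.

```python
import random

def evolve(pop, fitness, rng, generations=30, mutation_rate=0.1):
    """Reproduction -> evaluation -> selection loop; lower fitness is better."""
    size = len(pop)
    for _ in range(generations):
        # Reproduction: one-point crossover of two parents, then bit-flip mutation.
        offspring = []
        for _ in range(size):
            p1, p2 = rng.sample(pop, 2)
            cut = rng.randrange(1, len(p1))
            child = p1[:cut] + p2[cut:]
            child = tuple(1 - g if rng.random() < mutation_rate else g for g in child)
            offspring.append(child)
        # Evaluation and selection: keep the best of parents plus offspring (elitist).
        pop = sorted(pop + offspring, key=fitness)[:size]
    return pop

USEFUL = {0, 3}  # hypothetical: basis functions the data really needs
LAMBDA = 0.1     # hypothetical balance factor

def fitness(ind):
    missing = sum(1 for i in USEFUL if not ind[i])  # toy stand-in for RMSE
    return missing + LAMBDA * sum(ind)              # error plus complexity penalty

rng = random.Random(42)
initial = [tuple(rng.randint(0, 1) for _ in range(8)) for _ in range(6)]
final = evolve(initial, fitness, rng)
best = final[0]
```

Because selection keeps the best individuals from parents and offspring together, the best fitness never gets worse from one generation to the next, and the complexity penalty pushes the population toward models with fewer basis functions.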
[0096] As shown in Figure 3, the generative adversarial network model in this embodiment learns dynamic features through the following steps:
[0097] First, the generator obtains the initial system state h0 and random noise z, and then generates a sequence of fake samples. The generator consists of a differential equation generator and a differential equation solver. The differential equation generator uses the following structure:
[0098] $\frac{dh}{dt} = G(h, t, z; \theta_G)$
[0099] Here, t ∈ [0, 1]; the generator G consists of multiple subnetworks, each learning the dynamic characteristics of a specific waveform. The system state $h = y - \hat{y}$ is used to simulate the dynamic characteristics of the system and is derived from the difference between the symbolic regression prediction and the observed value, where y is the actual observed value of the target variable and $\hat{y}$ is the estimated value of the target variable. $\theta_G$ are the network parameters of the generative adversarial network; they are initialized from random data and trained iteratively through the generator, using the same training method as general generative adversarial networks.
[0100] The differential equation solver uses numerical integration to solve ordinary differential equations, as shown in the following formula:
[0101] $h_{t+1} = \mathrm{ODETimeStep}(h_t, \Delta, G(h_t, t, z; \theta_G))$ (11)
[0102] Where $h_t$ is the generated sample, Δ is the step size, and $h_{t+1} = h_t + \Delta \cdot \frac{dh}{dt}$.
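The update rule above is a forward Euler step, which can be sketched as follows. The function `toy_G` is a hypothetical stand-in for the trained generator network G(h, t, z; θ_G), and the choice of a uniform grid over t ∈ [0, 1] is an assumption.

```python
def ode_time_step(h_t, delta, dh_dt):
    """Forward Euler update: h_{t+1} = h_t + delta * dh/dt."""
    return h_t + delta * dh_dt

def generate_sequence(h0, z, G, steps):
    """Roll the generator G forward over t in [0, 1] and return the samples."""
    delta = 1.0 / steps
    h, seq = h0, [h0]
    for k in range(steps):
        t = k * delta
        h = ode_time_step(h, delta, G(h, t, z))
        seq.append(h)
    return seq

def toy_G(h, t, z):
    """Hypothetical dynamics: decay toward the noise value z."""
    return -h + z

seq = generate_sequence(h0=1.0, z=0.0, G=toy_G, steps=10)
```

With this toy generator each step multiplies the state by 0.9, so the sequence decays smoothly; a trained generator would instead produce the residual dynamics learned from the data.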
[0103] The discriminator then receives the sample sequence and outputs the probability that the sequence is real. Its main task is to determine whether a sequence is generated or actual; here, RNN classification is used for this judgment.
[0104] In this embodiment, the symbolic regression result of the simplified optimal model is superimposed with the sequence generated by the generative adversarial network model to form the prediction value used for anomaly detection.
[0105] For the simplified optimal model and the generative adversarial network model, a certain type of electric locomotive was selected, and cross-validation was performed on a dataset collected during 1000 normal operations (for each operation, the first 50% of the data was used as the training set, and the last 50% as the validation set). Six structural features of axle temperatures were learned, labeled ZX_WD_1, ZX_WD_2, ZX_WD_3, ZX_WD_4, ZX_WD_5, and ZX_WD_6. The experimental results for learning the structural feature ZX_WD_4 of the fourth axle temperature are shown in Figure 4.
[0106] Experimental results show that the structural features learned by the simplified optimal model and the dynamic features learned by the generative adversarial network model have good interpretability, expressing the interaction relationships between multiple signals and demonstrating a good ability to fit the trajectory of real data, thus serving as a benchmark for the normal state of the system. Therefore, using the learned system structural features and dynamic features as a health baseline supports various applications of the online real-time anomaly detection framework.
[0107] As shown in Figure 5 and Figure 6, based on the structural learning model and generative adversarial network model described above, this embodiment proposes a method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks, including three stages: data acquisition, feature learning, and online real-time anomaly detection.
[0108] In the first phase, sensor data on the axle temperature of the rail vehicle's running gear is collected and stored in a database. Data generated during train operation is primarily collected through sensors. There are two main data flows: firstly, the collected data is transmitted to the data center's database for effective storage; secondly, real-time sensing data serves as input for online real-time anomaly detection.
[0109] In the second stage, a structural learning model and a generative adversarial network (GAN) model are established. During the training of the structural learning model, the optimal model is selected and simplified according to the FFX algorithm and the Deap algorithm. The simplified optimal model can predict the sensor data for the target period based on the historical data in the database. The GAN model obtains the difference between the predicted value of the sensor data for the target period and the corresponding actual observation value, and generates the data sequence for the target period. The predicted value of the simplified optimal model and the generated sequence of the GAN model are superimposed as the healthy baseline data. The aforementioned training methods are used to learn structural features and dynamic features to form a feature expression, which is used as the structural expression of the healthy baseline and then deployed as a model.
[0110] In the third stage, sensor data from the running gear of the rail vehicle within the target time period is acquired as real-time monitoring data. Real-time deviations are calculated based on the real-time monitoring data and the corresponding health baseline data. Then, the average error of all real-time deviations within the target time period is calculated. If the average error exceeds a preset alarm threshold, an anomaly is determined to have occurred. The specific steps include:
[0111] 1) Obtain the computational results of the structure learning model and overlay them with the computational results of the generative adversarial network model as a healthy baseline:
[0112] The result of the structure learning model is the prediction of the sensor data for the target time period by the simplified optimal model $f_{best}$, i.e., the estimated signal trajectory of the axle temperature over the most recent w moments, obtained with the following formula:
[0113] $[\hat{y}_{t_1}, \ldots, \hat{y}_{t_w}]^T = f_{best}(X)$
[0114] In the above formula, $\hat{y}_{t_1}, \ldots, \hat{y}_{t_w}$ represent the target variable estimates at the i-th time point of the sampling points t, for i = 1, ..., w;
[0115] The computational result of the generative adversarial network model is the sequence for the target time period that the model learns from the residual h between the predicted and observed values. The estimated signal trajectory is superimposed with the corresponding generated sequence to obtain the health baseline prediction value;
[0116] 2) Acquire real-time sensing data: Acquire real-time bearing temperature monitoring data y over the most recent w moments using sensors;
[0117] 3) Residual Generation: The predicted health baseline value y^ and the real-time monitoring data y are simultaneously fed to the residual generator, which calculates the real-time deviation of the system using the following formula:
[0118] $r_i = y_{t_i} - \hat{y}_{t_i}$
[0119] In the above formula, $y_{t_i}$ and $\hat{y}_{t_i}$ represent the target variable observation and the health baseline prediction at the i-th time point $t_i$, respectively;
[0120] 4) Error estimation: Calculate the average error, as shown in the following expression:
[0121] $\bar{r} = \frac{1}{w} \sum_{i=1}^{w} r_i$
[0122] In the above formula, $\bar{r}$ is the average error, w is the number of time points in the target time period, and $r_i$ is the error between the real-time monitoring data of the bearing temperature at the i-th time point and the corresponding health baseline. The average error is used for error estimation mainly for two reasons: (a) the internal bearing temperature rises exponentially when the system fails, so the average error is sufficient to detect the anomaly; (b) the entire online anomaly detection framework must run in real time, and overly complex evaluation methods are still difficult to support in engineering practice.
[0123] 5) Anomaly Decision-Making: Based on the error estimation results, a decision on whether to issue an alarm is made using the following formula:
[0124]
[0125] Where α is the alarm threshold, which is determined according to the actual application requirements. When the average error of the target time period is greater than the alarm threshold, it indicates that an abnormality has occurred and an alarm is triggered.
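Steps 3) to 5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the formula images for the residual, the average error, and the alarm decision are not reproduced in this text, so the sketch assumes r_i = y_ti - ŷ_ti, an average error equal to the arithmetic mean of r_i over the w time points, and an alarm when that mean exceeds α. The function name `detect_anomaly`, its arguments, the `use_abs` switch, and all numbers are hypothetical.

```python
import numpy as np

def detect_anomaly(y_obs, y_struct, h_gan, alpha, use_abs=False):
    """Return (alarm, mean_error, residuals) for one target time period.

    y_obs    : observed values y over the last w time points
    y_struct : structure learning model predictions for the same points
    h_gan    : GAN-generated residual trajectory for the same points
    alpha    : alarm threshold
    use_abs  : if True, average |r_i| instead of the signed r_i (assumption)
    """
    y_obs = np.asarray(y_obs, dtype=float)
    # Health baseline: superposition of the two model outputs (step 1).
    baseline = np.asarray(y_struct, dtype=float) + np.asarray(h_gan, dtype=float)
    # Step 3: real-time deviation at each time point.
    residuals = y_obs - baseline
    if use_abs:
        residuals = np.abs(residuals)
    # Step 4: average error over the w time points.
    mean_error = float(residuals.mean())
    # Step 5: alarm when the average error exceeds the threshold.
    return mean_error > alpha, mean_error, residuals


# Made-up bearing temperatures in degrees Celsius, w = 4.
y_struct = [40.0, 41.0, 42.0, 43.0]
h_gan = [0.5, 0.5, 0.5, 0.5]
healthy = [40.4, 41.6, 42.5, 43.5]
faulty = [42.5, 44.5, 47.5, 51.5]

print(detect_anomaly(healthy, y_struct, h_gan, alpha=1.0)[0])
print(detect_anomaly(faulty, y_struct, h_gan, alpha=1.0)[0])
```

With these made-up numbers the healthy window stays below the threshold, while the faulty window has a mean deviation of 4.5 and triggers the alarm. The signed mean lets positive and negative deviations cancel; setting `use_abs=True` avoids that if deviations in both directions should count.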
[0126] In summary, this embodiment has the following advantages:
[0127] 1) Data-driven automated mechanism mining
[0128] For complex equipment, incompletely understood mechanisms prevent the construction of accurate mechanistic models. To address this, the embodiment uses data-driven structural learning to build machine learning models that accurately predict trends in equipment data, thereby achieving mechanism reconstruction and system identification. On the one hand, the system's structural characteristics reflect its dynamics and the interactions between signals; on the other hand, they serve as a health baseline for system operation, on which an online real-time anomaly detection framework can be designed. This provides a new technical approach for constructing digital twins.
[0129] 2) System structural feature learning method based on symbolic regression
[0130] This embodiment proposes a symbolic-regression-based method for learning system structural features that combines a deterministic optimization algorithm with a genetic algorithm. FFX and Deap are integrated into the framework as the structure generation and parameter learning engines: FFX generates the initial solutions, and Deap prunes the FFX results to form a simpler model.
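The generate-then-prune idea can be illustrated with a small self-contained Python sketch. It is a simplified stand-in, not the embodiment's code: ordinary least squares replaces FFX's pathwise regularized learning, and a hand-written genetic loop replaces Deap, so neither library is called. The basis library, the score RMSE + λ × (number of bases kept), and every function name and parameter value are assumptions made for illustration.

```python
import numpy as np

def build_basis(X):
    """Candidate basis functions per input column (hypothetical small library)."""
    cols, names = [], []
    for j in range(X.shape[1]):
        x = X[:, j]
        cols += [x, x ** 2, np.abs(x)]
        names += [f"x{j}", f"x{j}^2", f"|x{j}|"]
    return np.column_stack(cols), names

def score(B, y, mask, lam):
    """RMSE of a least-squares fit on the selected bases, plus lam per basis kept."""
    A = np.column_stack([B[:, mask], np.ones(len(y))])  # selected bases + offset b
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rmse = float(np.sqrt(np.mean((A @ coef - y) ** 2)))
    return rmse + lam * int(mask.sum())

def prune_with_ga(B, y, lam=0.05, pop_size=20, generations=30, seed=0):
    """Evolve boolean masks over the columns of B; a lower score is better."""
    rng = np.random.default_rng(seed)
    n = B.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    pop[0] = True  # the unpruned model plays the role of the initial solution
    for _gen in range(generations):
        scores = np.array([score(B, y, m, lam) for m in pop])
        parents = pop[np.argsort(scores)[: pop_size // 2]]  # keep the best half
        children = []
        for _child in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                 # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < 0.1             # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([score(B, y, m, lam) for m in pop])
    return pop[int(np.argmin(scores))]


# Synthetic, noise-free data: y depends only on x0 and x1^2.
rng = np.random.default_rng(1)
X = rng.uniform(-2.0, 2.0, size=(200, 2))
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] ** 2 + 1.0
B, names = build_basis(X)
mask = prune_with_ga(B, y)
print([name for name, keep in zip(names, mask) if keep])
```

Because the best half of each generation is carried over unchanged, the returned model never scores worse than the unpruned starting model; the complexity penalty then favors masks that drop bases contributing nothing to accuracy.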
[0131] 3) Mechanism-based health baseline-driven fault detection
[0132] Fault data for complex equipment is characterized by small sample sizes and a long-tailed distribution. This embodiment therefore builds a baseline model from normal operation data to reflect the trend of the equipment state, and then performs abnormal behavior monitoring and fault mode learning under the guidance of that baseline. This constitutes a novel technical approach.
[0133] 4) Use generative networks to learn dynamic features
[0134] Because symbolic regression is an NP-hard problem, finding an expression that fully describes the system is not always feasible. A generative network is therefore used to learn the system's dynamic features through residual learning.
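The residual-learning step can be illustrated with a short Python sketch. The formulas for the differential equation generator and solver are not reproduced in this text, so the sketch assumes one common reading: the generator defines dh/dt = G(h, z, t) on t∈[0,1], and the solver advances h with explicit Euler steps. The actual solver may differ. `toy_generator` is a hypothetical stand-in for a trained network, and all names and numbers are made up.

```python
import numpy as np

def solve_residual_trajectory(G, h0, z, n_steps):
    """Integrate dh/dt = G(h, z, t) on t in [0, 1] with explicit Euler steps."""
    dt = 1.0 / n_steps                    # step size
    h = np.empty(n_steps + 1)
    h[0] = h0
    for k in range(n_steps):
        h[k + 1] = h[k] + dt * G(h[k], z, k * dt)
    return h

def toy_generator(h, z, t):
    """Hypothetical stand-in for the trained generator: two 'subnetworks' summed."""
    decay = -h                                   # pulls the residual toward zero
    ripple = 0.1 * z * np.cos(2.0 * np.pi * t)   # noise-scaled periodic component
    return decay + ripple


rng = np.random.default_rng(0)
z = rng.standard_normal()                        # random noise input
h_traj = solve_residual_trajectory(toy_generator, h0=1.0, z=z, n_steps=100)

y_struct = np.linspace(40.0, 42.0, h_traj.size)  # made-up structure-model output
baseline = y_struct + h_traj                     # superposition gives the baseline
```

The last line mirrors the superposition described in the embodiment: the structure learning model supplies the trend, and the generated residual trajectory supplies the dynamics that the symbolic expression does not capture.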
[0135] The above description is merely a preferred embodiment of the present invention and does not limit the invention in any way. Although the invention has been disclosed above through preferred embodiments, these are not intended to limit it. Therefore, any simple modification, equivalent change, or alteration made to the above embodiments on the basis of the technical essence of the present invention, without departing from its scope, falls within the protection scope of the present invention.
Claims
1. A method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks, characterized in that it includes the following steps: Collect sensor data from the running gear of the rail vehicle and store it in a database; Establish a structure learning model and a generative adversarial network (GAN) model. During training, the structure learning model is optimized and simplified using the FFX and Deap algorithms. The simplified optimal model predicts sensor data for a target time period based on historical data from the database. The GAN model generates a data sequence for the target time period based on the predicted and actual sensor data. The results of the structure learning model and the GAN model are superimposed to form the health baseline data. The generator of the GAN model includes a differential equation generator and a differential equation solver. The expression for the differential equation generator is as follows: where t∈[0,1]; the generator G consists of multiple subnetworks, each learning the dynamic characteristics of a specific waveform; the system state h = y - ŷ, where y is the actual observed value of the target variable and ŷ is the estimated value of the target variable; z represents random noise; and the remaining symbol denotes the network parameters of the generative adversarial network; The expression for the differential equation solver is as follows: where t∈[0,1]; the generator G consists of multiple subnetworks, each learning the dynamic features of a specific waveform; and the remaining symbols denote the generated sample and the step size; Acquire sensor data from the running gear of the rail vehicle within the target time period as real-time monitoring data, and calculate the real-time deviation based on the real-time monitoring data and the corresponding health baseline data; Calculate the average error of all real-time deviations within the target time period; If the average error exceeds the preset alarm threshold, an anomaly is determined to have occurred.
2. The method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks according to claim 1, characterized in that the specific steps for establishing the structure learning model include: Set the operation subset P as the initial value of the basis function set Ψ, use the dependent variable X as the operand, and use the corresponding target variable y as the reference for calculating the evaluation index; Select a target basis function from the basis function set Ψ as an operator or operand and add it to the current basis function set, repeating this step until the size and quality of the current basis function set reach the preset iteration trigger threshold; Obtain the current model by fusing the current basis function set with the model parameters; Estimate the target variable from the dependent variable using the current model, and learn the model parameters by calculating the error between the estimated and actual values of the target variable; Evaluate the quality of the current model using the model evaluation metrics. If the quality of the current model does not meet the requirements, return to the step of setting the operation subset P as the initial value of the basis function set Ψ, or return to the step of selecting the target basis function from the basis function set Ψ as an operator or operand. If the quality of the current model meets the requirements, output the current model.
3. The method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks according to claim 2, characterized in that the expression for the current model is as follows: In the above formula, i = 1, ..., n_b; ŷ is the estimated value of the target variable; X is the dependent variable, i.e., the merged matrix of x_1 to x_m; Ψ_i is the current set of basis functions; a_i is the coefficient of each basis function in Ψ_i; b is the offset; n_b is the number of basis function sets; m is the dimension of the input data; and x_1 to x_m are the m observed variables, i.e., the sensing data from m sensors.
4. The method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks according to claim 2, characterized in that the specific evaluation metrics for the model include: Model optimization objective: the objective is to minimize the sum of squared errors between the estimated and actual values of the target variable, i.e., the accuracy factor; Model generalization: the accuracy and complexity of the model are measured using an accuracy factor and a complexity factor. The model complexity is used as a constraint on the model, and the model is solved in conjunction with the accuracy factor to ensure both the accuracy and the generalization of the model.
5. The method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks according to claim 4, characterized in that the accuracy factor expression is as follows: In the above formula, RMSE is the standard root mean square error; t_i is the i-th time point of sampling point t; y_ti is the target variable observation at the i-th time point of sampling point t; ŷ_ti is the target variable estimate at the i-th time point of sampling point t; and n is the size of the target variable vector y; The complexity factor is represented by the coefficients and the number of basis functions, as shown in the following expression: In the above formula, X is the dependent variable, i.e., the merged matrix of x_1 to x_m; a_i is the coefficient of each basis function in the basis function set Ψ_i; b is the offset; n_b is the number of basis function sets; and f(X) is the structure learning model.
6. The method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks according to claim 5, characterized in that, when the accuracy factor and the complexity factor are used to measure model accuracy and complexity, a balance is struck between the two, as expressed below: In the above formula, RMSE is the accuracy factor, C is the complexity factor, and λ is the balance factor between accuracy and complexity.
7. The method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks according to claim 1, characterized in that the specific steps for selecting and simplifying the optimal model based on the FFX and Deap algorithms during the training of the structure learning model include: Generate a set of basis functions; Identify the optimal basis functions and learn their parameters using pathwise regularized learning; Filter out redundant candidate functions by minimizing the error rate and the complexity to obtain the optimal model set, and use the optimal model set as the population; According to the genetic algorithm, reproduce the population to generate a candidate set of offspring, performing mutation and crossover operations; Evaluate the candidate set of offspring, select the best offspring based on the evaluation results to form the new population, and then return to the step of reproducing the population according to the genetic algorithm to generate a candidate set of offspring. This process continues until the population converges, and the optimal model set corresponding to the population is output.
8. The method for detecting anomalies in the running gear of rail vehicles based on symbolic regression and generative adversarial networks according to claim 1, characterized in that the discriminator in the generative adversarial network model uses RNN classification for its judgment.