[0037] The present invention will be described in detail below through specific embodiments and drawings.
[0038] The method and system for fault identification based on neural network self-learning in this embodiment comprise the following parts: a CSM-based data acquisition subsystem, a data preprocessing subsystem, a feature selection subsystem, a model training subsystem, a real-time data analysis subsystem, and a self-learning subsystem. They address the technical problems of heavy workload, low efficiency, and high risk in the manual diagnosis of railway signal system faults in the prior art.
[0039] A neural network is composed mainly of neurons. The structure of a neuron is shown in Figure 2, where a1~an are the components of the input vector;
[0040] w1~wn are the weights of the synapses of the neuron;
[0041] b is the bias;
[0042] f is the transfer function, usually a nonlinear function. Common choices are sigmoid(), traingd(), tansig(), and hardlim(). The following description defaults to hardlim();
[0043] t is the neuron output
[0044] Mathematical representation: $t = f(\vec{W}\vec{A}' + b)$, where
[0045] $\vec{W}$ is the weight vector;
[0046] $\vec{A}$ is the input vector, and $\vec{A}'$ is its transpose;
[0047] b is the bias;
[0048] f is the transfer function.
[0049] It can be seen that a neuron computes the inner product of the input vector and the weight vector, adds the bias, and passes the result through a nonlinear transfer function to obtain a scalar output.
[0050] The role of a single neuron is to divide an n-dimensional vector space into two parts with a hyperplane (called the decision boundary). Given an input vector, the neuron determines on which side of the hyperplane the vector lies.
[0051] The equation of the hyperplane is $\vec{W}\vec{p} + b = 0$, where
[0052] $\vec{W}$ is the weight vector;
[0053] b is the bias;
[0054] $\vec{p}$ is a vector on the hyperplane.
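As a minimal sketch of the single neuron described above, the following assumes the hardlim() transfer function returns 1 for a non-negative argument and 0 otherwise. The function names and the two-dimensional example values are illustrative assumptions, not part of the embodiment.

```python
def hardlim(x):
    """Hard-limit transfer function: 1 if x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0


def neuron_output(weights, inputs, bias, transfer=hardlim):
    """Compute t = f(W . A + b) for a single neuron."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    # Inner product of the weight vector and the input vector, plus the bias.
    net = sum(w * a for w, a in zip(weights, inputs)) + bias
    return transfer(net)


# Hypothetical 2-D example: the hyperplane is the line a1 + a2 - 1 = 0.
weights = [1.0, 1.0]
bias = -1.0
side_a = neuron_output(weights, [2.0, 2.0], bias)  # point on the positive side
side_b = neuron_output(weights, [0.0, 0.0], bias)  # point on the negative side
```

With these example values the neuron reports which side of the line each point falls on; a point lying exactly on the hyperplane is treated as the positive side under the assumed hardlim() convention.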
[0055] A neural network must first learn according to certain learning criteria before it can work. Take the recognition of the handwritten letters "A" and "B" as an example. It is stipulated that when "A" is input to the network, the output should be "1", and when "B" is input, the output should be "0".
[0056] The criterion for network learning should therefore be: if the network makes a wrong decision, learning should reduce the likelihood that it makes the same mistake next time. First, a random value in the interval (0,1) is assigned to each connection weight of the network, and the image pattern corresponding to "A" is input to the network. The network computes the weighted sum of the input pattern, compares it with the threshold, and applies a nonlinear operation to obtain the network output. In this case, the probabilities that the network outputs "1" and "0" are each 50%, which means the output is completely random. If the output is "1" (the result is correct), the connection weights are increased so that the network can still make a correct judgment when it encounters the "A" pattern again.
[0057] Through learning and training, the neural network changes the connections between neurons and the weights of those connections so as to adapt to its environment. Starting from the same initial network configuration, learning on different training sets yields completely different neural networks. A neural network is a system with learning ability, and it can develop knowledge beyond the original knowledge level of its designer. Its learning and training methods generally fall into two types. One is supervised learning, which uses given sample labels for classification or imitation. The other is unsupervised learning, in which only the learning method or certain rules are prescribed, and the specific learning content varies with the environment of the system (that is, with the input signals); the system can automatically discover the characteristics and regularities of the environment, which more closely resembles the function of the human brain. The structural characteristics of neural networks make them well suited to distributed storage and parallel computing.
[0058] These characteristics make neural networks well suited to rail transit fault analysis and early warning, since the system can obtain massive monitoring data through the CSM. With the two learning methods used in parallel, the neural networks can be trained to analyze known faults and can also discover new fault types and causes through continuous learning.
[0059] Fault recognition mainly consists of three steps. The first is the data preparation stage, in which the original monitoring data undergoes preprocessing, feature selection, and format conversion to obtain a training set that the neural network can handle. The second is to find the appropriate number of neural network layers and parameters according to the given training set. The third is to use the model obtained in the second step to analyze the real-time monitoring data and determine whether the system has failed and the cause of the failure.
[0060] 1. Data acquisition subsystem
[0061] The data acquisition subsystem is connected to the CSM systems of the railway company, railway bureau, and power depot to obtain the historical monitoring data stored in the CSM as well as real-time monitoring data. Historical monitoring data is used in the model training stage to obtain a classification model. The trained model is then used to classify real-time monitoring data to obtain the current operating status of the system, such as whether there is a fault and the cause of the fault.
[0062] 2. Data preprocessing subsystem
[0063] The data preprocessing subsystem processes the collected monitoring data, including operations such as denoising, formatting, and normalization, and converts the data into vector space model (VSM) format. Data in this format is convenient for subsequent feature selection and neural network processing.
[0064] The monitoring data includes Boolean and analog values. The value ranges of different data items differ considerably, and some monitoring data, such as air temperature and water temperature, also include negative values. In view of this, a normalization algorithm is designed for each data type:
[0065] (1) Boolean
[0066] When the value of the data contains only two values, normalize the corresponding data to -1 and 1;
[0067] (2) Analog quantity containing only positive numbers
[0068] y=2*(x-min)/(max-min)-1, this formula normalizes the data to the interval [-1,1].
[0069] (3) Analog quantity containing positive and negative numbers
[0070] y=x/|max|, this formula also normalizes the data to the interval [-1,1].
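The three normalization rules above could be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the smaller of the two Boolean values is mapped to -1, and |max| in the signed case is taken to be the largest absolute value in the series so that the result stays within [-1,1].

```python
def normalize_boolean(values):
    """Map a two-valued series to -1 and 1 (assumption: smaller value -> -1)."""
    distinct = sorted(set(values))
    if len(distinct) != 2:
        raise ValueError("expected exactly two distinct values")
    low = distinct[0]
    return [-1.0 if v == low else 1.0 for v in values]


def normalize_positive(values):
    """Apply y = 2*(x - min)/(max - min) - 1 to a positive-only analog series."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max normalized")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def normalize_signed(values):
    """Apply y = x/|max|; |max| is assumed to be the largest absolute value."""
    scale = max(abs(v) for v in values)
    if scale == 0:
        raise ValueError("an all-zero series cannot be scaled")
    return [v / scale for v in values]
```

For example, a water-temperature series such as [-20.0, 5.0, 10.0] would be scaled by 20 under the assumed reading of |max|.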
[0071] 3. Feature selection subsystem
[0072] The CSM collects many signals, some of which are redundant. After these signals are converted into features, similarity calculations are performed on them and redundant features are removed, which can greatly reduce the amount of calculation and processing.
[0073] Unlike general similarity calculation methods, this method takes into account that many of the signals collected by the CSM are voltage and current signals, which are continuous and correlated. For example, if the current at point A increases, the current at point B directly connected to it will increase as well. It follows that the current value at point A can stand in for the trend of the current value at point B, so B is a redundant feature of A. Feature selection is carried out through this correlation between voltages and currents. The specific calculation method is as follows:
[0074] Va and Vb respectively represent the values of acquisition points a and b. First, Va and Vb are normalized so that the value ranges of the two features are the same, limited to [0,1]. Then the following quantity is calculated:
[0075] $\sum_{i=0}^{n} \log\left(\frac{V_a}{V_b}\right) / n$
[0076] where n is the number of Va and Vb values contained in the training set. Through normalization, the value ranges of Va and Vb are the same. If the value of the above formula is less than the given threshold, Va and Vb are redundant features: the values of Vb are removed and the values of Va are retained. The choice of threshold depends mainly on the noise at the collection point. When the noise is large, the threshold needs to be set large, and vice versa. Through the above steps, redundant features can be greatly reduced.
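The redundancy test above could be sketched as follows. Several details are assumptions made for illustration: a small constant eps keeps the logarithm defined when a normalized value is 0, the average is taken over the number of paired samples, and the absolute value of the average is compared with the threshold so that the test does not depend on which point is called a or b. The function names are hypothetical.

```python
import math


def mean_log_ratio(va, vb, eps=1e-6):
    """Average of log(Va/Vb) over paired, already-normalized samples.

    eps is an assumed guard that keeps the logarithm defined at zero.
    """
    if len(va) != len(vb) or not va:
        raise ValueError("va and vb must be non-empty and of equal length")
    total = sum(math.log((a + eps) / (b + eps)) for a, b in zip(va, vb))
    return total / len(va)


def is_redundant(va, vb, threshold):
    """Treat Vb as redundant with Va when |mean log ratio| is below threshold."""
    return abs(mean_log_ratio(va, vb)) < threshold
```

Two collection points that track each other closely give a ratio near 1 at every sample, so the average logarithm stays near 0 and falls under the threshold.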
[0077] The feature selection subsystem is used to process the space vector data after preprocessing. Because only part of the monitoring data is related to a specific fault, it is necessary to sort out fault-related features based on existing knowledge to form a fault feature library. The unused features can be used for unsupervised learning to discover new knowledge.
[0078] 4. Model training subsystem
[0079] As can be seen from the previous chapters of this article, there are three types of neural networks in this system: feedforward neural networks, feedback neural networks, and self-organizing neural networks. The designs of the three models are as follows:
[0080] (1) Feedforward neural network
[0081] Based on existing expert knowledge, certain definite faults and the specific causes of those faults can be summarized. The function of this model is therefore to obtain a definite model and parameters from the training data, and then to use the model to analyze real-time monitoring data and issue early warnings. The feedforward neural network model is shown in Figure 4. It is a three-layer feedforward neural network, in which the first layer is the input layer, the second layer is the hidden layer, and the third layer is the output layer. For this three-layer feedforward neural network, X denotes the input vector of the network, W1~W3 denote the connection weight vectors of each layer, and f(x) denotes the activation function of the three layers.
[0082] The weights W are initialized randomly;
[0083] Output model of a hidden-layer node: Oj=f(∑Wij×Xi-qj), where Wij is the weight of the j-th node in the i-th layer;
[0084] Output model of an output-layer node: Yk=f(∑Tjk×Oj-qk), where Yk is the output of the k-th node of the output layer; Tjk is the weight of the connection between hidden-layer node j and output-layer node k; qk is the regularity factor; and Xi is the input data of the i-th layer;
[0085] f is a nonlinear function: f(x)=1/(1+e^(-x))
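The two output models above amount to a forward pass through the hidden and output layers. The sketch below is one possible reading, in which weights[i][j] is assumed to connect node i of the previous layer to node j of the current layer and the q terms are subtracted before the sigmoid is applied. The function and parameter names are hypothetical.

```python
import math


def sigmoid(x):
    """f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, thresholds):
    """Return f(sum_i weights[i][j] * inputs[i] - thresholds[j]) for each node j."""
    return [
        sigmoid(sum(weights[i][j] * inputs[i] for i in range(len(inputs))) - thresholds[j])
        for j in range(len(thresholds))
    ]


def forward(x, w_hidden, q_hidden, t_output, q_output):
    """Hidden layer Oj = f(sum Wij*Xi - qj), then output Yk = f(sum Tjk*Oj - qk)."""
    hidden = layer_output(x, w_hidden, q_hidden)
    return layer_output(hidden, t_output, q_output)
```

With all weights and q terms set to zero, every node outputs f(0) = 0.5, which is a convenient sanity check before training.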
[0086] The error calculation model is a function that reflects the size of the error between the expected output of the neural network and the calculated output:
[0087] Where tj is the expected value of output layer node j; oj is the actual value of output layer node j;
[0088] Re-adjust the weight by error:
[0089] △Wij(n+1)=h×Ep×Oj+a×△Wij(n), where n is the iteration number. The weight change at iteration n+1 is calculated from the weight change at iteration n and from the difference between the output value and the expected value at iteration n; △Wij(n) is the weight change of the j-th node in the i-th layer at the n-th iteration.
[0090] Here, h is the learning factor; Ep is the calculated error of output node i; Oj is the calculated output of output node j; and a is the momentum factor.
[0091] Through the above steps, a definite neural network model and its parameters can be obtained. Such a neural network can be trained for any known fault to obtain an analysis model for that fault.
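The weight adjustment rule above can be illustrated with a small sketch. The names are hypothetical, and the step that adds the computed change to the current weight is an assumption drawn from common practice; the text itself only gives the expression for the change.

```python
def weight_delta(h, ep, oj, a, prev_delta):
    """Weight change: dW(n+1) = h * Ep * Oj + a * dW(n)."""
    return h * ep * oj + a * prev_delta


def apply_update(weight, h, ep, oj, a, prev_delta):
    """Return the updated weight and the change that produced it.

    Adding the change to the current weight is an assumed convention.
    """
    delta = weight_delta(h, ep, oj, a, prev_delta)
    return weight + delta, delta
```

The momentum term a×△Wij(n) carries part of the previous change into the current one, which tends to smooth successive updates.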
[0092] The above-mentioned junction box fault is a kind of known fault. The corresponding collected data is the junction box receiving terminal voltage, the cable terminal voltage, and the sending terminal voltage. The collected voltage data is preprocessed to form training data.
[0093] Construct a three-layer neural network with voltage as the input layer;
[0094] Use the empirical formula n1=sqrt(n+m)+d to determine the number of neurons, where n1 is the number of hidden layer units;
[0095] n is the number of input units;
[0096] m is the number of output units;
[0097] d is a constant between 0 and 10.
[0098] For this fault, n=3, m=4, and d is set to 5. Thus, the number of neurons n1 is 7, with the result rounded down.
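The empirical formula can be checked with a short sketch. Truncating the result to an integer is an assumption inferred from the stated value of 7, since sqrt(3+4)+5 is approximately 7.65. The function name is hypothetical.

```python
import math


def hidden_units(n_inputs, n_outputs, d):
    """n1 = sqrt(n + m) + d, truncated to an integer (assumed rounding)."""
    if not 0 <= d <= 10:
        raise ValueError("d is expected to be a constant between 0 and 10")
    return int(math.sqrt(n_inputs + n_outputs) + d)


n1 = hidden_units(3, 4, 5)  # the junction box example: n=3, m=4, d=5
```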
[0099] The weights between the input layer and the hidden layer and between the hidden layer and the output layer are initialized randomly, with a value range of (0,1), and the transfer function is the sigmoid function. The result is a three-layer neural network in which the input layer contains 3 nodes, the hidden layer contains 7 nodes, and the output layer contains 4 nodes. The training data is then used for training to obtain all the parameters of the neural network. The neural network structure is shown in Figure 9.
[0100] Finally, the real-time monitoring data is used as input to the obtained neural network, and the output of the neural network indicates whether there is a fault and the type of the fault.
[0101] (2) Feedback neural network.
[0102] According to the existing experience of on-site technicians, some faults can be known, but the cause of the fault is not comprehensive, and only part of the cause of the fault can be understood. At this time, the role of the feedback neural network model is reflected.
[0103] The structure of the feedback neural network is shown in Figure 5. Assuming that there are n inputs (I1, I2, ..., In) and m outputs (o1, o2, ..., om), the influence of each input on the outputs for the different fault types is calculated through feedback.
[0104] $R(I) \mathrel{+}= (o_t - o_{t-1}) \times (I_t - I_{t-1})$, where $o_t$ is the output at time t and $I_t$ is the input value at time t. Running this calculation over the training set yields a vector that records the correlation between each input and the output value o. Inputs with small correlation are then removed, and the calculation is repeated to sharpen the correlation between the fault and the features until all features related to the fault are determined. The remaining data is then the data related to the fault.
[0105] In addition to the data already known to be related, monitoring data that may possibly be related should be included among the input features as far as possible.
[0106] The model can be trained not only on the existing training data; the relationship between the fault and the monitoring data can also be continuously analyzed and mined from the real-time data and status obtained, so that the model is continuously improved. Figure 5 is a schematic diagram of the feedback neural network model.
[0107] The biggest difference from the feedforward neural network is that the feedback neural network learns not only from historical monitoring data but also from real-time monitoring data. When a fault occurs, the fault is marked manually to form a fault sample. The feedback data in Figure 5 is used as input data for the model, and the model automatically learns and improves the network to achieve the ability of fault analysis.
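The accumulation R(I) above could be sketched for a single output as follows. Treating one output at a time, using the absolute value of the accumulated score when discarding weakly related inputs, and the function names are all assumptions made for illustration.

```python
def accumulate_correlation(inputs, outputs):
    """Accumulate R[i] += (o_t - o_{t-1}) * (I_t[i] - I_{t-1}[i]) over time.

    inputs:  list of input vectors, one per time step
    outputs: list of scalar outputs for one fault type, one per time step
    """
    if len(inputs) != len(outputs) or len(inputs) < 2:
        raise ValueError("need at least two aligned time steps")
    n = len(inputs[0])
    r = [0.0] * n
    for t in range(1, len(inputs)):
        delta_o = outputs[t] - outputs[t - 1]
        for i in range(n):
            r[i] += delta_o * (inputs[t][i] - inputs[t - 1][i])
    return r


def select_features(r, min_abs_correlation):
    """Keep the indices of inputs whose accumulated score is large enough."""
    return [i for i, value in enumerate(r) if abs(value) >= min_abs_correlation]
```

An input that never changes contributes nothing to R and is dropped, while an input that moves together with the output accumulates a large score.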
[0108] (3) Self-organizing neural network:
[0109] With the continuous development of the railway monitoring system, more monitoring data will be generated and various new faults may appear. To maintain strong fault recognition ability, the system needs self-learning capability. The self-organizing neural network automatically finds the inherent laws and essential attributes in the samples, and self-organizes and adaptively changes the parameters and structure of the network, so that it is able to identify and analyze new faults.
[0110] A self-organizing neural network is an unsupervised learning network. It self-organizes and adaptively changes the network parameters and structure by automatically searching for the internal laws and essential attributes in the samples. As shown in Figure 6, the network includes two layers: the input layer and the competition layer. Under unsupervised learning, the model has clustering capability. Based on this property, the system is designed to cluster fault-free samples into one category and samples with faults into other categories, and thereby identify the fault.
[0111] The purpose of clustering is to group similar pattern samples into one category and to separate dissimilar ones, so as to achieve similarity within categories and separation between categories.
[0112] Here, the fault itself is also clustered as a feature, and this feature is a time series, because a fault has a certain temporal order: when some collection points become abnormal, the fault follows. The design of the feature clustering algorithm is as follows:
[0113] ●Cluster with the fault feature as the central feature, and the clustering result generated is the feature related to the fault;
[0114] ●For each central point, calculate the similarity between every non-fault feature and that central point. When the similarity exceeds a given threshold, the feature is clustered into the category of that central point;
[0115] ●Because some features may be related to multiple faults, the clustering results may overlap, that is, a feature can belong to multiple central points;
[0116] ●Each remaining unclassified feature is assigned directly to the category of the central point with which it has the greatest correlation.
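The clustering steps listed above could be sketched as follows, assuming the similarity between each fault (central point) and each feature has already been computed and is supplied as a nested dictionary. The function name, the data layout, and the fault and feature labels in the example are hypothetical.

```python
def cluster_features(similarity, threshold):
    """Group features around fault centers.

    similarity: {fault: {feature: score}}
    Returns {fault: set of features}; a feature may appear under several faults.
    """
    clusters = {fault: set() for fault in similarity}
    all_features = set()
    for scores in similarity.values():
        all_features.update(scores)

    # Step 1: threshold-based assignment; overlapping membership is allowed.
    assigned = set()
    for fault, scores in similarity.items():
        for feature, score in scores.items():
            if score > threshold:
                clusters[fault].add(feature)
                assigned.add(feature)

    # Step 2: leftover features go to the center they correlate with most.
    for feature in sorted(all_features - assigned):
        best = max(similarity, key=lambda f: similarity[f].get(feature, float("-inf")))
        clusters[best].add(feature)
    return clusters


example = {
    "F1": {"a": 0.9, "b": 0.8, "c": 0.1},
    "F2": {"a": 0.2, "b": 0.7, "c": 0.3},
}
result = cluster_features(example, threshold=0.6)
```

In this example feature "b" exceeds the threshold for both faults and so belongs to both categories, while feature "c" exceeds it for neither and falls back to the fault it is most similar to.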
[0117] The calculation formula of similarity is:
[0118] $C_{it}$ represents the value of fault i (the central point) at time t; $F_{jt}$ represents the value of feature j at time t;
[0119] $k = \frac{\sum_{t=1}^{l} (C_{it} - C_{i,t-1})}{\sum_{t=1}^{l} (F_{jt} - F_{j,t-1})}$
[0120] $w = \sum_{t=1}^{l} \log\left(\frac{C_{it} - C_{i,t-1}}{F_{jt} - F_{j,t-1}}\right) / k$
[0121] sim=1/w
[0122] l represents the maximum time range of the training set; n represents the number of training sets. When sim is greater than a given threshold, then it is judged that the feature j is similar to the fault i and belongs to the same category.
[0123] The result of the clustering is that the fault Ci is related to all the features under the category.
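The similarity calculation could be sketched as below for one fault series and one feature series. The sketch assumes that the step-to-step differences of the two series have the same sign and are non-zero, so that each logarithm is defined, and it returns infinity when w is exactly 0. These guards and the function name are assumptions added for illustration.

```python
import math


def similarity(c, f):
    """sim = 1/w for a fault series c and a feature series f of equal length."""
    if len(c) != len(f) or len(c) < 2:
        raise ValueError("need two aligned series with at least two samples")
    dc = [c[t] - c[t - 1] for t in range(1, len(c))]
    df = [f[t] - f[t - 1] for t in range(1, len(f))]
    if sum(df) == 0 or sum(dc) == 0:
        raise ValueError("k is undefined or zero for these series")
    k = sum(dc) / sum(df)
    log_sum = 0.0
    for x, y in zip(dc, df):
        if y == 0 or x / y <= 0:
            raise ValueError("log undefined: step ratio must be positive")
        log_sum += math.log(x / y)
    w = log_sum / k
    # Assumed convention: step ratios that multiply to 1 give w = 0.
    return math.inf if w == 0 else 1.0 / w
```

For a fault series [0, 2, 4] and a feature series [0, 1, 2], every step ratio is 2, k is 2, and w equals ln 2, so sim is 1/ln 2.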
[0124] (4) Model fusion
[0125] The three types of neural networks correspond to three different types of faults. The feedback neural network and the self-organizing neural network are used not only for fault analysis but also for feature selection and causality mining. All three, however, ultimately analyze and predict the collected monitoring data in the form of neural networks. There will be multiple models of each type. Although these models have the same structure and initial values, training on different data gives different results, that is, different models.
[0126] Suppose there are N models. The outputs of the N models are encoded, with a fault represented as 1 and no fault represented as 0;
[0127] The outputs of the N models can produce 2^N states. The system uses a hash table to map each binary state to a display state. For example, all 0s means no fault, and a 1 indicates a fault. Through fusion in this way, the different models are brought into a unified system, which facilitates the various kinds of data processing.
[0128] The monitoring data is obtained in VSM format, and ten-fold cross-validation is then performed on the data with different parameters in order to obtain the model and parameters with the best classification and generalization capability. Through the connection with the real-time analysis component, the trained model is transmitted to the analysis component.
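The encoding and hash-table mapping could be sketched with a Python dictionary as follows. The model names and the wording of the display strings are hypothetical, and the assumption that each bit position corresponds to one model's fault is made only for illustration.

```python
from itertools import product


def encode_state(outputs):
    """Pack N binary model outputs into a key such as '010'."""
    if any(o not in (0, 1) for o in outputs):
        raise ValueError("each model output must be 0 (no fault) or 1 (fault)")
    return "".join(str(o) for o in outputs)


def build_state_table(model_names):
    """Map every binary state of the N models to a display string."""
    table = {}
    for bits in product((0, 1), repeat=len(model_names)):
        active = [name for name, bit in zip(model_names, bits) if bit == 1]
        table[encode_state(bits)] = ("fault: " + ", ".join(active)) if active else "no fault"
    return table


# Hypothetical model names, one per trained fault model.
table = build_state_table(["junction_box", "track_circuit"])
status = table[encode_state([0, 1])]
```

With two models the table holds four entries, and looking up the encoded outputs of the models gives the display state directly.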
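Ten-fold cross-validation over candidate parameters could be sketched as follows. The evaluate callback, which is assumed to train a model on the training indices and return a score on the held-out indices, is hypothetical, as are the function names. The folds here are contiguous and unshuffled, which is a simplification.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def select_best(param_grid, n_samples, evaluate, k=10):
    """Return the parameters with the highest mean score across the k folds.

    evaluate(params, train_idx, test_idx) -> score is a hypothetical callback.
    """
    best_params, best_score = None, float("-inf")
    for params in param_grid:
        scores = [evaluate(params, tr, te) for tr, te in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

Each sample is held out exactly once, so the mean score over the ten folds estimates how well a given parameter choice generalizes to unseen monitoring data.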
[0129] 6. Real-time data analysis subsystem
[0130] Real-time monitoring data also needs to go through a process similar to that applied to historical monitoring data. The resulting real-time monitoring data in VSM format is then fed to the real-time data analysis component, which calculates whether the current system has a specific fault and the cause of the fault.