A chemical process zero-shot fault diagnosis method based on joint training and attribute matching of a graph neural network

By using a method of joint training of graph neural networks and attribute matching, a zero-sample fault diagnosis model for chemical processes is constructed, which solves the diagnostic difficulties caused by the lack of fault samples and the absence of fault types in chemical processes, and achieves high accuracy and stable fault diagnosis.

CN122241445A · Pending · Publication Date: 2026-06-19 · KARAMAY FUCHENG OIL & GAS SALES CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
KARAMAY FUCHENG OIL & GAS SALES CO LTD
Filing Date
2026-05-22
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing zero-sample fault diagnosis methods for chemical processes face difficulties in diagnosis when there are no fault samples or when dealing with unseen fault types. Furthermore, existing technologies fail to effectively embed the coupling structure of fault attributes, resulting in insufficient diagnostic accuracy and stability.

Method used

A joint training and attribute matching method using graph neural networks is adopted. By constructing a fault association graph and attribute matrix, graph attention network is used to mine topological associations and semantic dependencies. Combined with feature fusion mechanism, a joint loss model is constructed for training, and a Gaussian Naive Bayes classifier is used for diagnosis.

Benefits of technology

It enables accurate fault diagnosis even when no samples of a fault are available, improving the accuracy and stability of fault diagnosis in chemical processes and addressing the poor diagnostic performance of existing technologies.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention belongs to the field of industrial process fault diagnosis technology, specifically disclosing a zero-sample fault diagnosis method for chemical processes based on joint training of graph neural networks and attribute matching. The method includes the following steps: first, collecting and preprocessing simulated data of the chemical process; then, constructing a fault association graph and a network model for feature extraction; subsequently, training and optimizing the network model; finally, extracting features from the samples to be diagnosed and reducing their dimensionality, and calculating the Euclidean distance between the predicted attribute probability vector and the true attribute vector to obtain the diagnostic result. Addressing the difficulty of zero-sample fault diagnosis in chemical processes, this invention constructs a K-nearest neighbor graph and introduces a graph attention network, enabling the network model to use the semantic association information of known faults for reasoning when dealing with unknown faults. This solves the problem of difficulty in zero-sample fault diagnosis in chemical processes due to the lack of fault samples and the existence of unseen fault types during real-time monitoring and fault diagnosis.

Description

Technical Field

[0001] This invention relates to the field of industrial process fault diagnosis technology, and specifically to a zero-sample fault diagnosis method for chemical processes based on graph neural network joint training and attribute matching.

Background Technology

[0002] With the rapid development of industry towards large-scale, complex, and intelligent production, the safe and stable operation of continuous production processes in industries such as chemicals and energy places higher demands on fault diagnosis technology. In actual industrial scenarios, chemical processes are often accompanied by extreme environments such as high temperature and high pressure. Once a fault occurs, it can not only cause huge economic losses but may even lead to serious safety accidents. However, the scarcity of key fault samples, the frequent occurrence of unseen faults, and the insufficient generalization ability of traditional data-driven models limit the effectiveness of fault diagnosis. Therefore, real-time monitoring and accurate fault diagnosis of chemical processes are of great significance.

[0003] In recent years, data-driven fault diagnosis methods, especially deep learning technologies such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory Networks (LSTMs), have achieved remarkable results in the field of industrial fault diagnosis. However, the training and test sets of these deep learning methods typically contain the same fault categories, and each fault category has sufficient labeled samples. But in actual chemical production processes, industrial systems are in normal operation most of the time, making fault data, especially severe fault data, extremely difficult to obtain. Moreover, with changes in operating conditions or equipment replacement, fault types that were never seen during the training phase often appear.

[0004] Currently, existing zero-shot fault diagnosis methods fail to effectively embed attribute coupling structures into the discrimination process, and feature fusion lacks mechanistic consistency. Most schemes simply assume that fault attributes are independent, typically using attribute topology as shallow auxiliary information to map the feature space to the semantic space to identify unknown faults. They do not aggregate and embed higher-order coupling relationships using graph neural networks, making it difficult to integrate the dependencies between attributes into the sample feature learning process. Furthermore, the fusion methods of data features and semantic attribute features are relatively crude, lacking targeted alignment and enhancement mechanisms. This prevents the model from fully utilizing the structured knowledge of known faults to assist in the diagnosis of unknown faults, resulting in insufficient reasoning for unknown faults and weak generalization performance. In addition, existing technologies typically employ fixed graph structures and fail to optimize training objectives for the imbalanced characteristics of industrial attributes. Consequently, in real chemical processes, facing complex conditions with strong coupling, nonlinearity, and multiple disturbances, the diagnostic stability and accuracy of the models significantly decrease, failing to meet the needs of accurate identification of unknown faults in industrial settings.

[0005] In summary, it is necessary to design a zero-sample fault diagnosis method for chemical processes based on joint training of graph neural networks and attribute matching. This method aims to address the difficulty of zero-sample fault diagnosis in chemical processes due to the lack of fault samples and the existence of unseen fault types during real-time monitoring and fault diagnosis.

Summary of the Invention

[0006] The purpose of this invention is to provide a zero-sample fault diagnosis method for chemical processes based on joint training of graph neural networks and attribute matching, so as to solve the problem that, during real-time monitoring and fault diagnosis of chemical processes, zero-sample fault diagnosis is difficult because fault samples are lacking and unseen fault types occur.

[0007] To achieve the above objective, the basic solution provided by this invention is a zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks, comprising the following steps:

S1: Data acquisition and preprocessing: First, sensor data of the chemical process are collected. Then, the collected sensor data are segmented into multi-time-step samples using a sliding window. Next, the dimensionality of the multi-time-step samples is reduced by linear discriminant analysis to obtain the initial feature vector.

S2: Constructing the fault association graph and attribute matrix: First, a binary fault attribute matrix is constructed based on the semantic attributes of the fault types. Then, a K-nearest neighbor graph is constructed and converted into a sparse adjacency matrix.

S3: Constructing a feature extraction model based on graph neural networks: First, a network model is built that includes an attribute embedder, a graph attention context extractor, and a feature fusion module. Then, the graph attention network aggregates neighbor node information on the fault category association graph to generate context features. Finally, the initial feature vector is fused with the context features through the attention mechanism.

S4: Joint loss model training: First, the joint loss value is calculated using sample data of known fault types. Then, the model is jointly trained and optimized on the weighted sum of the binary cross-entropy loss of attribute prediction and the cross-entropy loss of fault classification.

S5: Zero-sample fault diagnosis: First, the fusion features of the samples of the unknown fault type to be diagnosed are extracted using the joint loss model trained in S4. Then, after further dimensionality reduction through principal component analysis, the attribute probability vector of each sample is predicted using a Gaussian Naive Bayes classifier, and the Euclidean distance between this probability vector and the true attribute vector of every fault type is calculated.
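
The neighbor aggregation in S3 can be illustrated with a small sketch. The text does not spell out the attention formula, so this assumes the standard single-head graph attention layer (LeakyReLU-scored, softmax-normalized attention over each node's neighbors plus a self-loop). The names `gat_layer`, `W`, and `a`, the self-loop, and the toy graph are illustrative assumptions, and the output nonlinearity is omitted for brevity.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def leaky_relu(v: float, slope: float = 0.2) -> float:
    return v if v > 0 else slope * v


def gat_layer(H: Matrix, adj: List[List[int]], W: Matrix, a: Vector) -> Matrix:
    """One single-head graph attention layer (assumed standard formulation).

    H: node features (one row per fault type), adj: 0/1 adjacency matrix,
    W: projection with shape (out_dim, in_dim), a: attention vector of
    length 2 * out_dim. Each node attends to its neighbors and to itself.
    """
    Z = [[sum(w * h for w, h in zip(row, feat)) for row in W] for feat in H]
    out = []
    for i, zi in enumerate(Z):
        nbrs = [j for j in range(len(H)) if adj[i][j] or j == i]
        # Unnormalized score e_ij = LeakyReLU(a . [z_i || z_j]).
        scores = [leaky_relu(sum(p * q for p, q in zip(a, zi + Z[j]))) for j in nbrs]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Context feature of node i: attention-weighted sum of projected neighbors.
        out.append([sum(al * Z[j][d] for al, j in zip(alphas, nbrs))
                    for d in range(len(zi))])
    return out


# Toy graph: 3 fault-type nodes with 2-dim embeddings, connected as 0 - 1 - 2.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection
a = [0.0, 0.0, 0.0, 0.0]       # zero attention vector gives uniform weights
context = gat_layer(H, adj, W, a)
print(context)
```

With a zero attention vector every neighbor receives the same weight, so each output row is the plain average over the node's neighborhood; a trained attention vector would shift that average toward the semantically closest fault types.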

[0008] The beneficial effects of this invention are as follows:

(1) This invention constructs a sparse adjacency matrix over the fault types and introduces a graph attention network to mine the topological associations and semantic dependencies between fault types. Combined with the weighted fusion of context feature vectors and semantic features and with feature concatenation, this lets the network model use the semantic association information of seen faults to reason about unseen faults, which effectively addresses the difficulty of extracting zero-sample fault features in chemical processes.

(2) It uses min-max normalization to bring the variables to a common scale and builds three-dimensional tensor samples, producing time-series monitoring data suited to chemical processes. Relying on a zero-sample learning architecture that combines semantic transfer of fault attributes with graph topology modeling, it can diagnose unseen faults without any samples of those faults.

(3) By introducing positive-sample weights calculated from the sample distribution, it uses a weighted binary cross-entropy to construct a multi-task joint loss function, which mitigates the class imbalance of fault samples. In addition, using low-variance attribute adaptive weights, it completes similarity matching with a weighted Euclidean distance and fault class center vectors, weakening the interference of redundant features and further improving the diagnostic accuracy for unseen faults in complex chemical processes.

[0009] Option 2, an optimized version of the basic solution: in S1, the multi-time-step samples are extracted with a time step size of T=8, transforming the sensor data into three-dimensional tensor samples $X \in \mathbb{R}^{N \times T \times D}$, where $\mathbb{R}$ represents the set of real numbers, N is the number of samples, and D is the number of sensor variables. Extracting the sensor time-series data with a fixed time step stably captures the dynamic features of chemical processes and gives the information a regular structure. The three-dimensional tensor format matches the input conventions of deep learning models, improving the effectiveness of subsequent feature extraction and the stability of model training.
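
The windowing step can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names `min_max_normalize` and `sliding_windows`, the stride of 1, and the toy data are assumptions. Only the window length T=8 and the samples × time steps × variables layout come from the text. Min-max normalization, which the embodiment applies before windowing, is included for completeness.

```python
from typing import List

Matrix = List[List[float]]  # rows = time steps, columns = sensor variables


def min_max_normalize(series: Matrix) -> Matrix:
    """Scale each sensor column linearly to [0, 1] (constant columns map to 0)."""
    cols = list(zip(*series))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp > 0 else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in series
    ]


def sliding_windows(series: Matrix, T: int = 8, stride: int = 1) -> List[Matrix]:
    """Cut a (time, D) series into N windows of shape (T, D); stride is an assumption."""
    return [series[s:s + T] for s in range(0, len(series) - T + 1, stride)]


# Toy data: 20 time steps, D = 3 hypothetical sensors.
raw = [[float(t), 2.0 * t + 1.0, 5.0] for t in range(20)]
windows = sliding_windows(min_max_normalize(raw), T=8)
print(len(windows), len(windows[0]), len(windows[0][0]))  # N, T, D -> 13 8 3
```

A stride of 1 yields overlapping windows and the largest number of samples; a larger stride trades sample count for less redundancy between neighboring windows.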

[0010] Option 3, a preferred version of the basic solution: in S2, the K-nearest neighbor graph is constructed as follows:

Step 1: First, calculate the cosine similarity between the attribute vectors of different fault types based on the binary fault attribute matrix, to quantify the degree of semantic association between faults;

Step 2: Next, for each fault type node, select the K nodes with the highest similarity as neighbor nodes and establish connecting edges to obtain the K-nearest neighbor graph;

Step 3: Finally, convert the K-nearest neighbor graph into a sparse adjacency matrix according to the rule of 1 for an edge and 0 for no edge.

Because chemical processes have significant time dynamics and time delays, data at a single moment cannot fully reflect the characteristics of a fault. Constructing a K-nearest neighbor graph guides the network model to use the semantic information of known faults to assist in reasoning about unknown faults, and it establishes fault associations automatically from semantic similarity, without the need to define fault dependencies manually.
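
The three construction steps can be sketched in a few lines. The attribute matrix below is a made-up 4 × 5 example, not the patent's attribute table; excluding self-loops, breaking ties by lower index, and leaving the adjacency matrix directed (not symmetrized) are assumptions, since the text does not address them.

```python
import math
from typing import List


def cosine(a: List[int], b: List[int]) -> float:
    """Cosine similarity A.B / (|A||B|); returns 0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn_adjacency(attrs: List[List[int]], k: int) -> List[List[int]]:
    """Row i has a 1 in column j when j is one of i's k most similar fault types."""
    n = len(attrs)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Sort by descending similarity; ties go to the lower index (an assumption).
        others.sort(key=lambda j: (-cosine(attrs[i], attrs[j]), j))
        for j in others[:k]:
            adj[i][j] = 1
    return adj


# Hypothetical 4 fault types x 5 binary attributes.
A = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
]
adj = knn_adjacency(A, k=2)
for row in adj:
    print(row)
```

Each row of the result contains exactly K ones, which keeps the adjacency matrix sparse regardless of how many fault types are defined.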

[0011] Option 4, a preferred version of the basic solution: in S3, the feature fusion module uses a cross-attention mechanism, specifically:

A: First, the initial feature vector obtained by dimensionality reduction in S1 is mapped to the query vector;

B: Next, the context features output by the graph attention network are mapped into key vectors and value vectors;

C: Then, the attention scores between the query vector and the key vectors are calculated, and the value vectors are summed with these weights to obtain the contextual semantic features;

D: Finally, the initial feature vector is concatenated with the contextual semantic features to form the fused feature vector.

With the adaptive weighting of graph attention, the network model automatically focuses on the key semantics. After fusion, the features retain both the original data features and the fault context semantics, so the feature representation is more comprehensive and more discriminative.
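
Steps A to D can be sketched with plain lists. This assumes scaled dot-product attention with a softmax over the fault nodes (the embodiment mentions scaled dot-product attention); the projection matrices `Wq`, `Wk`, `Wv`, the function name `cross_attention_fuse`, and the toy dimensions are illustrative assumptions.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply matrix W (rows = output dims) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores: Vector) -> Vector:
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(x: Vector, context: Matrix,
                         Wq: Matrix, Wk: Matrix, Wv: Matrix) -> Vector:
    """Query from x, keys/values from context, attend, then concatenate."""
    q = matvec(Wq, x)                                  # A: query vector
    keys = [matvec(Wk, c) for c in context]            # B: key vectors
    values = [matvec(Wv, c) for c in context]          # B: value vectors
    scale = math.sqrt(len(q))
    weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
    attended = [sum(w * v[d] for w, v in zip(weights, values))   # C: weighted sum
                for d in range(len(values[0]))]
    return x + attended                                # D: concatenation


# Toy sizes: 2-dim sample feature, 3 fault nodes with 2-dim context features.
x = [1.0, 0.0]
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_attention_fuse(x, context, identity, identity, identity)
print(len(fused), [round(v, 3) for v in fused])
```

Because the output is a concatenation, the first part of the fused vector is always the unchanged initial feature, which is how the original data features survive alongside the attended context.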

[0012] Option 5, a preferred version of the basic solution: in S4, the joint loss value $L$ is calculated as a weighted sum of $L_{\mathrm{attr}}$, the weighted binary cross-entropy loss for the fault attribute prediction task, and $L_{\mathrm{cls}}$, the cross-entropy loss for the auxiliary fault classification task, with $\lambda$ as the balancing coefficient. Constructing a joint loss function addresses the insufficient feature discriminative power of a single attribute prediction task.
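
The formula itself did not survive in this text. One form consistent with the description (a weighted sum in which a single balancing coefficient scales the auxiliary classification loss) is sketched below; the symbol names and the placement of the coefficient are assumptions, not the patent's confirmed expression.

```latex
L = L_{\mathrm{attr}} + \lambda \, L_{\mathrm{cls}}
```

Here $L_{\mathrm{attr}}$ stands for the weighted binary cross-entropy loss of attribute prediction, $L_{\mathrm{cls}}$ for the cross-entropy loss of the auxiliary fault classification task, and $\lambda$ for the balancing coefficient, which the embodiment sets to 0.05.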

[0013] Option 6, a preferred version of the basic solution: in S4, a weighted binary cross-entropy loss is used, and for each fault attribute dimension $j$ the positive-sample weight $w_j$ is calculated from $N_j^{+}$ and $N_j^{-}$, the numbers of positive and negative samples of the $j$-th fault attribute in the training set, together with the smoothing coefficient $\varepsilon$. Because the positive-sample weight is derived from the positive and negative sample counts, the weighted binary cross-entropy counteracts the imbalance between fault types and mitigates the missed detections and bias that sample imbalance causes.
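
The exact weight formula is not legible in this text. The sketch below assumes a common form, the negative count divided by the positive count plus a smoothing term, and applies it to the positive term of the binary cross-entropy. The function names `positive_weights` and `weighted_bce`, the default smoothing value, and the toy labels are illustrative assumptions.

```python
import math
from typing import List


def positive_weights(labels: List[List[int]], eps: float = 1e-6) -> List[float]:
    """Per-attribute weight N_neg / (N_pos + eps); this exact form is an assumption."""
    n = len(labels)
    weights = []
    for j in range(len(labels[0])):
        n_pos = sum(row[j] for row in labels)
        weights.append((n - n_pos) / (n_pos + eps))
    return weights


def weighted_bce(probs: List[List[float]], labels: List[List[int]],
                 pos_w: List[float]) -> float:
    """Mean binary cross-entropy with the positive term scaled by pos_w[j]."""
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y, w in zip(p_row, y_row, pos_w):
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
            total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
            count += 1
    return total / count


# Toy labels: attribute 0 is rare (1 positive in 4), attribute 1 is balanced.
labels = [[1, 1], [0, 1], [0, 0], [0, 0]]
probs = [[0.5, 0.5]] * 4
w = positive_weights(labels)
print([round(v, 3) for v in w], round(weighted_bce(probs, labels, w), 4))
```

In the toy data the rare attribute receives a weight near 3 and the balanced one a weight near 1, so a missed positive on the rare attribute costs about three times as much as a false alarm.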

[0014] Option 7, an optimized version of the basic solution: in S5, a Gaussian Naive Bayes classifier is trained for each attribute dimension. The trained classifiers then predict, for each sample of an unseen fault type, the probability in each fault attribute dimension, forming the attribute probability vector of that sample. Subsequently, the weighted Euclidean distance between the predicted attribute probability vector and each row vector of the predefined fault attribute matrix is calculated; the fault type whose row vector has the smallest distance is the diagnostic result. The Gaussian Naive Bayes classifier is fast and has few parameters, making it suitable for attribute prediction on high-dimensional data.

Description of the Drawings

[0015] Figure 1 is a flowchart of the zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks according to the present invention;

Figure 2 is a schematic diagram of the fault attribute graph neural network structure in the method;

Figure 3 is a schematic diagram of the cross-attention model in the method;

Figure 4 is a schematic diagram of the zero-sample diagnosis stage in the method.

Detailed Implementation

[0016] The present invention is further described in detail below through specific embodiments:

Example

The Tennessee-Eastman (TE) chemical process simulation dataset is used as the validation object. This chemical process includes core units such as reactors, condensers, and compressors, and involves 11 manipulated variables and 41 measured variables, for a total of 52 observed variables. The fault types studied in this embodiment are listed in Table 1. Fault 1, Fault 2, and Fault 4 in Table 1 are set as unseen fault types, and the remaining 12 fault types are used to train the model. As shown in Table 2, 18 fault attributes are selected as the semantic basis for the 15 fault types in Table 1. These fault attributes are suitable for characterizing seen fault types and can also be used to describe unseen fault types.

Table 1: Fault Type Information Table

Table 2: Fault Attribute Information Table

As shown in Figures 1 to 4, a zero-sample fault diagnosis method for chemical processes based on joint training of graph neural networks and attribute matching includes the following steps:

S1: Data acquisition and preprocessing: First, sensor time-series data of the chemical process are acquired from the TE simulation platform. Then min-max normalization is applied: through the formula $x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$, the acquired sensor time-series data are mapped to the interval [0, 1] by a linear transformation, where $x$ represents the raw data, $x_{\min}$ the minimum value in the dataset, $x_{\max}$ the maximum value in the dataset, and $x'$ the normalized value. Next, using a sliding window, the continuous sensor time-series data are segmented into multi-time-step samples with a time step size of T=8, transforming the sensor time-series data into three-dimensional tensor samples $X \in \mathbb{R}^{N \times T \times D}$, where $\mathbb{R}$ represents the set of real numbers, N is the number of samples, and D is the number of observed variables. Finally, linear discriminant analysis (LDA) is used to reduce the dimensionality of the three-dimensional tensor samples, resulting in an initial feature vector with a dimension of 9.

S2: Constructing the fault association graph and attribute matrix: First, a 15×18 binary fault attribute matrix is constructed based on the semantic attributes of the 15 fault types in Table 1 (as shown in Table 3). Then, using the binary fault attribute matrix, the cosine similarity between the attribute vectors of different fault types is calculated according to the formula $\cos(A, B) = \dfrac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$, where A and B represent the attribute vectors of two fault types and A·B represents their dot product; this quantifies the semantic association between faults. A K-nearest neighbor graph is then constructed from the calculated cosine similarities: for each fault node, the three nodes with the highest similarity are retained as neighbors, and the K-nearest neighbor graph is converted into a sparse adjacency matrix according to the rule of 1 for an edge and 0 for no edge.

Table 3: Binary Fault Attribute Matrix

S3: Constructing a feature extraction model based on graph neural networks: First, a multilayer perceptron (MLP) maps the binary fault attribute vector into a dense low-dimensional embedding vector. Then, the first graph attention layer is set with an input dimension of 96, an output dimension of 128, and single-head attention; the second graph attention layer has an input dimension of 128, an output dimension of 48, and three-head attention, whose outputs are concatenated, giving a context feature vector with a final output dimension of 48 × 3 = 144. Next, the initial feature vector obtained in S1 serves as the query vector, and the 144-dimensional context feature vector output by the graph attention network is linearly mapped to 96 dimensions. The attention weights are then calculated by scaled dot-product attention, and the context features are weighted and fused into the contextual semantic feature vector. Finally, the contextual semantic feature vector is concatenated with the initial feature vector to form the final fused feature vector.

S4: Joint loss model training: First, the joint loss function $L$ is constructed as a weighted sum of $L_{\mathrm{attr}}$, the weighted binary cross-entropy loss for the fault attribute prediction task, and $L_{\mathrm{cls}}$, the cross-entropy loss for the auxiliary fault classification task, with $\lambda$ as the balancing coefficient (set to 0.05 here). A positive-sample weight $w_j$ is introduced to compensate the loss of positive fault samples by weighting; it is calculated from $N_j^{+}$ and $N_j^{-}$, the numbers of positive and negative samples of the $j$-th fault attribute in the training set, together with the smoothing coefficient $\varepsilon$. Then, the known fault set data obtained from the TE simulation platform are input into the network model to calculate the joint loss value. The Adam optimizer, combined with a learning rate decay strategy, iteratively updates the network parameters based on the joint loss value until the joint loss value stabilizes and the joint loss model converges, at which point training stops.

S5: Zero-sample fault diagnosis: First, the fusion features of the sample to be diagnosed are extracted using the joint loss model trained in S4, and principal component analysis is used to reduce the feature dimension to 10. Next, 18 Gaussian Naive Bayes classifiers are trained, one for each of the 18 fault attributes. These classifiers then predict the posterior probability that the sample to be diagnosed possesses each of the 18 fault attributes, forming the predicted fault attribute probability vector. Weights are then assigned based on the variance of each attribute across the seen fault types, and the formula $d = \sqrt{\sum_{j=1}^{M} w_j \,(p_j - c_j)^2}$ is used to calculate the weighted Euclidean distance between the sample to be diagnosed and each fault type, where $M$ indicates the total number of fault attribute dimensions, $w_j$ the attribute weight, $p$ the vector of the sample to be diagnosed, and $c$ the fault type center vector. The fault type with the smallest distance is selected as the final diagnosis result.
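
The diagnosis stage in S5 can be sketched end to end on toy data: one two-class Gaussian Naive Bayes model per attribute, trained only on seen faults, followed by nearest-attribute-vector matching over the unseen candidates. Everything concrete here is an assumption made for illustration: the function names, the 2-dimensional features, the three hypothetical attributes, and the uniform weights, which stand in for the variance-based weights whose exact rule is not reproduced in the text.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Model = Dict[int, Tuple[float, Vector, Vector]]  # class -> (prior, means, variances)


def fit_gnb(X: List[Vector], y: List[int], var_floor: float = 1e-6) -> Model:
    """Fit a two-class Gaussian Naive Bayes model for one binary attribute.

    Both label values (0 and 1) must occur in y.
    """
    model: Model = {}
    for c in (0, 1):
        cols = list(zip(*[x for x, label in zip(X, y) if label == c]))
        n_c = len(cols[0])
        means = [sum(col) / n_c for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / n_c + var_floor
                     for col, m in zip(cols, means)]
        model[c] = (n_c / len(X), means, variances)
    return model


def prob_positive(model: Model, x: Vector) -> float:
    """Posterior P(attribute = 1 | x) under the naive independence assumption."""
    log_joint = {}
    for c, (prior, means, variances) in model.items():
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_joint[c] = ll
    top = max(log_joint.values())
    e0, e1 = (math.exp(log_joint[c] - top) for c in (0, 1))
    return e1 / (e0 + e1)


def weighted_distance(p: Vector, c: Vector, w: Vector) -> float:
    """Weighted Euclidean distance sqrt(sum_j w_j * (p_j - c_j)^2)."""
    return math.sqrt(sum(wj * (pj - cj) ** 2 for wj, pj, cj in zip(w, p, c)))


# Hypothetical attribute rows for three seen and two unseen fault types.
seen_attrs = {"seen_1": [0, 0, 1], "seen_2": [1, 0, 0], "seen_3": [0, 1, 0]}
unseen_attrs = {"unseen_A": [1, 1, 0], "unseen_B": [0, 0, 0]}
X = [[0.0, 0.0], [0.2, 0.2], [3.0, 0.0], [3.2, 0.2], [0.0, 3.0], [0.2, 3.2]]
classes = ["seen_1", "seen_1", "seen_2", "seen_2", "seen_3", "seen_3"]

# One classifier per attribute; labels are the attribute bit of each sample's class.
models = [fit_gnb(X, [seen_attrs[c][j] for c in classes]) for j in range(3)]
weights = [1 / 3] * 3  # placeholder for the variance-based attribute weights

sample = [3.1, 3.1]  # features of a sample from a fault type never seen in training
p = [prob_positive(m, sample) for m in models]
diagnosis = min(unseen_attrs, key=lambda k: weighted_distance(p, unseen_attrs[k], weights))
print([round(v, 3) for v in p], diagnosis)
```

The sample combines two attributes that never co-occur in any seen fault, yet the per-attribute classifiers each fire on their own evidence, and the matching step picks the candidate whose attribute row agrees with that combination. This is the mechanism that lets attribute-level learning transfer to unseen fault types.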

[0017] As shown in Table 4, the 15 fault types in the TE chemical process simulation dataset are divided into training fault types and test fault types. The zero-shot fault diagnosis models FDAT, AFT, ZSIDM-OC, and GLA-ZSL are used to diagnose faults in the sensor time-series data of the chemical process acquired from the TE simulation platform, and their diagnostic accuracy is compared with that of this embodiment. The results show that the average diagnostic accuracy of FDAT is 70.72%, AFT is 47.39%, ZSIDM-OC is 65.75%, and GLA-ZSL is 79.01%, while the average diagnostic accuracy of the method in this embodiment reaches 83.37%. This is higher than all four comparison models and 4.36 percentage points above the best of them, GLA-ZSL.

[0018] Table 4: Fault Type Dataset Division Table

Table 5: Experimental Validation and Comparison of TE Dataset

The above descriptions are merely embodiments of the present invention, and common knowledge regarding specific structures and characteristics is not elaborated upon here. It should be noted that those skilled in the art can make various modifications and improvements without departing from the structure of the present invention, and these should also be considered within the scope of protection of the present invention. These modifications and improvements will not affect the effectiveness of the present invention or the practicality of the patent. The scope of protection claimed in this application should be determined by the content of its claims, and the specific embodiments described in the specification can be used to interpret the content of the claims.

Claims

1. A zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks, characterized in that it comprises the following steps:

S1: Data acquisition and preprocessing: First, sensor data of the chemical process are collected. Then, the collected sensor data are segmented into multi-time-step samples using a sliding window. Next, the dimensionality of the multi-time-step samples is reduced by linear discriminant analysis to obtain the initial feature vector.

S2: Constructing the fault association graph and attribute matrix: First, a binary fault attribute matrix is constructed based on the semantic attributes of the fault types. Then, a K-nearest neighbor graph is constructed and converted into a sparse adjacency matrix.

S3: Constructing a feature extraction model based on graph neural networks: First, a network model is built that includes an attribute embedder, a graph attention context extractor, and a feature fusion module. Then, the graph attention network aggregates neighbor node information on the fault category association graph to generate context features. Finally, the initial feature vector is fused with the context features through the attention mechanism.

S4: Joint loss model training: First, the joint loss value is calculated using sample data of known fault types. Then, the model is jointly trained and optimized on the weighted sum of the binary cross-entropy loss of attribute prediction and the cross-entropy loss of fault classification.

S5: Zero-sample fault diagnosis: First, the fusion features of the samples of the unknown fault type to be diagnosed are extracted using the joint loss model trained in S4. Then, after further dimensionality reduction through principal component analysis, the attribute probability vector of each sample is predicted using a Gaussian Naive Bayes classifier, and the Euclidean distance between this probability vector and the true attribute vector of every fault type is calculated.

2. The zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks according to claim 1, characterized in that, in S1, the multi-time-step samples are extracted with a time step size of T=8, transforming the sensor data into three-dimensional tensor samples $X \in \mathbb{R}^{N \times T \times D}$, where $\mathbb{R}$ represents the set of real numbers, N is the number of samples, and D is the number of sensor variables.

3. The zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks according to claim 1, characterized in that, in S2, the K-nearest neighbor graph is constructed as follows:

Step 1: First, calculate the cosine similarity between the attribute vectors of different fault types based on the binary fault attribute matrix, to quantify the degree of semantic association between faults;

Step 2: Next, for each fault type node, select the K nodes with the highest similarity as neighbor nodes and establish connecting edges to obtain the K-nearest neighbor graph;

Step 3: Finally, convert the K-nearest neighbor graph into a sparse adjacency matrix according to the rule of 1 for an edge and 0 for no edge.

4. The zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks according to claim 1, characterized in that, in S3, the feature fusion module employs a cross-attention mechanism, specifically:

A: First, the initial feature vector obtained by dimensionality reduction in S1 is mapped to the query vector;

B: Next, the context features output by the graph attention network are mapped into key vectors and value vectors;

C: Then, the attention scores between the query vector and the key vectors are calculated, and the value vectors are summed with these weights to obtain the contextual semantic features;

D: Finally, the initial feature vector is concatenated with the contextual semantic features to form the fused feature vector.

5. The zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks according to claim 1, characterized in that, in S4, the joint loss value $L$ is calculated as a weighted sum of $L_{\mathrm{attr}}$, the weighted binary cross-entropy loss for the fault attribute prediction task, and $L_{\mathrm{cls}}$, the cross-entropy loss for the auxiliary fault classification task, with $\lambda$ as the balancing coefficient.

6. The zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks according to claim 5, characterized in that, in S4, a weighted binary cross-entropy loss is used, and for each fault attribute dimension $j$ the positive-sample weight $w_j$ is calculated from $N_j^{+}$ and $N_j^{-}$, the numbers of positive and negative samples of the $j$-th fault attribute in the training set, together with the smoothing coefficient $\varepsilon$.

7. The zero-sample fault diagnosis method for chemical processes based on joint training and attribute matching of graph neural networks according to claim 1, characterized in that, in S5, a Gaussian Naive Bayes classifier is trained for each attribute dimension. Then, the trained Gaussian Naive Bayes classifiers are used to predict the probability of each sample of an unseen fault type in each fault attribute dimension, forming the predicted attribute probability vector of that sample. Subsequently, the weighted Euclidean distance between the predicted attribute probability vector and each row vector in the predefined fault attribute matrix is calculated, and the fault type whose row vector has the smallest distance is the diagnosis result.