A distributed industrial process data filling method based on a lightweight heterogeneous graph neural network

By using a distributed lightweight heterogeneous graph neural network, the problem of missing data in distributed industrial environments is solved, high-precision data filling is achieved, data integrity and reliability are improved, and the intelligent and digital upgrading of industrial processes is promoted.

CN122309938A · Pending · Publication Date: 2026-06-30 · BEIJING UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING UNIV OF TECH
Filing Date
2026-03-30
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In distributed industrial environments, industrial process data is often missing due to equipment failures, communication anomalies, or human error. Existing methods struggle to effectively characterize the complex spatiotemporal dependencies and high-order heterogeneous relationships among multiple variables, resulting in insufficient data feature mining capabilities and low data filling accuracy, making it difficult to meet the demand for high-quality data in complex industrial processes.

Method used

A distributed lightweight heterogeneous graph neural network is adopted. By constructing a spatiotemporal joint prototype federated learning, a lightweight heterogeneous graph neural network data imputation strategy is designed. A lightweight multilayer linear model is used to extract temporal features, and a spatial conditional attention graph neural network is combined to model high-order heterogeneous relationships, so as to achieve high-precision data imputation.

Benefits of technology

In a distributed environment, it enables high-precision and automated filling of missing industrial process data, improving data integrity and reliability, reducing operational risks and costs, and promoting the intelligent and digital upgrading of industrial processes.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122309938A_ABST
Patent Text Reader

Abstract

This invention provides a distributed industrial process data imputation method based on a lightweight heterogeneous graph neural network. Working from real industrial process time-series data, the method uses a lightweight multilayer linear model to extract deep features from the time series and uses multiple kinds of high-order heterogeneous information to learn spatial relationships, thereby imputing missing values. It addresses two difficulties of distributed industrial process data: the complex high-order heterogeneous information among variables makes feature modeling difficult, and spatial information is captured insufficiently. The method achieves high-precision distributed industrial data imputation and provides technical support for industrial engineering.

Description

Technical Field

[0001] This invention establishes a distributed data imputation method based on a lightweight heterogeneous graph neural network, using real industrial process time series data. It uses a lightweight multilayer linear model to extract deep features from the time series and uses multiple kinds of high-order heterogeneous information to learn spatial relationships, thereby imputing missing values. The method addresses the challenges of distributed industrial process data, where complex high-order heterogeneous information makes feature modeling difficult and spatial information capture insufficient. It achieves high-precision distributed industrial data imputation, providing technical support for industrial engineering.

Background Technology

[0002] Modern complex industrial processes are developing rapidly, and decision-making relies on higher-quality data. However, due to human error or machine malfunction, collected data often contains many missing segments, affecting data quality. Decisions in industrial processes can be impacted by ignoring or mishandling missing values. Therefore, data imputation is crucial in industrial processes. Currently, data imputation is often done manually, which is cumbersome, has low accuracy, and is difficult to apply widely. Developing an intelligent, automated, and distributed industrial engineering data imputation method is key to improving data quality. Deep learning-based data imputation methods effectively improve the automation level and accuracy of data imputation. However, in actual industrial processes, because data is distributed and there are high-order heterogeneous relationships between diverse data, the feature discovery capability of the model is reduced, resulting in low data imputation accuracy. In practical application environments, accurately imputing data with complex high-order heterogeneous relationships in distributed environments has significant economic and social benefits. Therefore, the research results of this invention have broad application prospects.

[0003] Statistical imputation methods suffer from low accuracy and high cost. With the rapid development of China's industrial systems, the quantity and complexity of industrial process data are increasing dramatically, and these existing methods can no longer meet the needs of industrial process data imputation. To address the complexity issue, some machine learning-based imputation methods have been proposed. However, industrial process data has various complex characteristics and strong correlations between data points, and simple feature extraction models working on unit-level sensor data struggle to capture its comprehensive spatiotemporal features. These methods therefore produce significant numerical deviations and are unsuitable for large-scale industrial processes. With the development of deep learning, graph neural network-based data imputation methods have been extensively studied. As the number of sensors and the complexity of equipment in industrial processes increase, capturing heterogeneous features between data becomes difficult, so imputation methods that consider only homogeneous graph signals perform poorly. In recent years, data imputation methods based on heterogeneous graph neural networks have attracted widespread attention. These methods model multi-dimensional data in industrial processes as heterogeneous graph signals and learn complex heterogeneous information to meet imputation requirements. However, they do not consider the distributed nature of the data or the high-order, complex heterogeneous relationships, resulting in poor imputation performance and an inability to adapt to distributed environments. Therefore, how to learn the complex, high-order heterogeneous spatiotemporal characteristics of data in a distributed environment and achieve fast and accurate data filling has become an important research topic in the field of data filling, with significant practical implications.

[0004] This invention designs a distributed lightweight heterogeneous graph neural network method for industrial process data imputation, achieving intelligent data imputation. First, a spatiotemporal joint prototype federated learning distributed environment is established to adapt to the heterogeneity of distributed data. Then, a lightweight heterogeneous graph neural network data imputation strategy is designed: a lightweight multilayer linear model extracts rich temporal features from the data, and a spatial conditional attention graph neural network models the complex high-order heterogeneous relationships between data and imputes the missing values, thereby improving data imputation accuracy. In practical industrial process data imputation, this method can adapt to the distributed data environment, solve the modeling difficulties caused by high-order heterogeneous relationships between data, achieve high-precision data imputation, and provide an effective method for industrial process data imputation.

Summary of the Invention

[0005] The technical problem that this invention addresses:

[0006] (1) Technical problems of inaccurate filling of missing data in distributed industrial processes

[0007] In actual industrial production processes, numerous sensors continuously collect multivariate time series data. However, due to equipment failures, communication anomalies, or human error, industrial process data often exhibits varying degrees of missing information. Furthermore, industrial systems typically consist of multiple devices or process units, with data distributed across different device nodes, exhibiting a clear distributed characteristic. The variables not only have temporal dependencies but also complex spatial correlations and higher-order heterogeneous relationships. Existing data imputation methods mostly rely on statistical models or single deep learning structures, making it difficult to effectively model the complex spatiotemporal dependencies and heterogeneous relationships among multiple variables simultaneously in a distributed environment. This results in insufficient data feature mining capabilities and low imputation accuracy, failing to meet the high-quality data requirements of complex industrial processes. Therefore, how to effectively characterize the complex spatiotemporal relationships and higher-order heterogeneous structures of multivariate time series in a distributed industrial environment, and achieve high-precision, automated imputation of missing industrial process data, becomes the key technical problem that this invention aims to solve.

[0008] The present invention adopts the following technical solution and implementation steps:

[0009] A distributed lightweight heterogeneous graph neural network method for industrial process data imputation, characterized by the following steps: collecting and preprocessing industrial process data, establishing a distributed lightweight heterogeneous graph neural network data imputation method, training a distributed lightweight graph neural network data imputation model for industrial process data, and imputing the industrial process data.

[0010] (1) Acquisition and preprocessing of industrial process data

[0011] Obtain an industrial process time series dataset. The data come from a wastewater treatment plant and are distributed across edge devices in the influent chamber, tank A, and tank B. The influent chamber dataset contains 4 variables: the pH value at time t, obtained from a pH meter; the SS (suspended solids) at time t, in mg/L, collected by an SS sensor; the COD (chemical oxygen demand) at time t, in mg/L, obtained from a COD analyzer; and the NH3N (ammonia nitrogen) at time t, in mg/L, collected by an ammonia nitrogen analyzer. The tank A dataset contains 9 variables: the anaerobic ORP (oxidation-reduction potential) at time t, in mV, acquired by an ORP sensor; the pre-anoxic ORP at time t, in mV, acquired by an ORP sensor; the NO3N (nitrate nitrogen) at the end of the anoxic zone at time t, in mg/L, obtained from a nitrate nitrogen analyzer; the MLSS (mixed liquor suspended solids concentration) at the end of the anoxic zone at time t, in mg/L, collected by a suspended solids concentration meter; the final liquid level at time t, in mm, obtained from a liquid level sensor; the DO (dissolved oxygen) before the aerobic tank at time t, in mg/L, measured by a dissolved oxygen meter; the DO in the aerobic zone at time t, in mg/L, obtained from a dissolved oxygen meter; the DO in aerobic tank 2 at time t, in mg/L, obtained from a dissolved oxygen meter; and the orthophosphate concentration at the end of the second aerobic tank at time t, in mg/L, collected by an orthophosphate analyzer. The tank B dataset contains 13 variables:

[0012] the anaerobic ORP at time t, in mV, acquired by an ORP sensor; the pre-anoxic ORP at time t, in mV, acquired by an ORP sensor; the NO3N at the end of the anoxic zone at time t, in mg/L, obtained from a nitrate nitrogen analyzer; the MLSS at the end of the anoxic zone at time t, in mg/L, collected by a suspended solids concentration meter; the final liquid level at time t, in mm, obtained from a liquid level sensor; the DO before the aerobic tank at time t, in mg/L, measured by a dissolved oxygen meter; the DO in the aerobic zone at time t, in mg/L, obtained from a dissolved oxygen meter; a further DO value at time t, in mg/L, obtained from a dissolved oxygen meter; the orthophosphate concentration at time t, in mg/L, obtained from an orthophosphate analyzer; the effluent TN (total nitrogen) at time t, in mg/L, collected by a total nitrogen analyzer; the effluent COD at time t, in mg/L, obtained from a COD analyzer; the external return flow rate at time t, in m³/h, obtained from flow meter data; and the effluent pH value at time t, obtained from a pH meter. The above data are normalized using the following formula:

[0013] $$\tilde{x}_{i,j}^{t} = \frac{x_{i,j}^{t} - x_{i,j}^{\min}}{x_{i,j}^{\max} - x_{i,j}^{\min}} \tag{1}$$

[0014] where i denotes the i-th device, j the j-th variable, and t the t-th time point; $x_{i,j}^{\max}$ is the maximum value of the j-th variable in the i-th device, $x_{i,j}^{\min}$ is the minimum value of the j-th variable in the i-th device, and $x_{i,j}^{t}$ is the value of the j-th variable in the i-th device at time t;
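The normalization step can be sketched as follows. This is a minimal illustration that assumes missing readings are stored as `None` and are skipped when the minimum and maximum are computed; the function name `min_max_normalize` and the handling of a constant series are assumptions, not part of the patent text.

```python
from typing import List, Optional

def min_max_normalize(series: List[Optional[float]]) -> List[Optional[float]]:
    """Scale one variable of one device to [0, 1] using its own min and max."""
    observed = [x for x in series if x is not None]
    lo, hi = min(observed), max(observed)
    if hi == lo:
        # Assumption: a constant series maps to 0.0 to avoid dividing by zero.
        return [None if x is None else 0.0 for x in series]
    # Missing entries stay missing; they are filled later by the model.
    return [None if x is None else (x - lo) / (hi - lo) for x in series]

ph_readings = [6.8, None, 7.4, 7.1]
normalized = min_max_normalize(ph_readings)
```

Each device normalizes each of its variables independently, so no raw values need to leave the device for this step.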

[0015] (2) Establish a distributed lightweight heterogeneous graph neural network data imputation method

[0016] ① A distributed, lightweight, heterogeneous graph neural network data imputation method is constructed. This data imputation method trains three identical data imputation models simultaneously. The input of each model is all the variables of the corresponding device. The imputation is interactively learned through a spatiotemporal prototype federated learning module. Each model contains a decomposition module, a lightweight temporal embedding module, an edge vector generation module, and a spatial conditional attention embedding module. In the basic settings of each model, there are 4 types of heterogeneity, the number of nodes is equal to the number of variables j, the batch size is 8, the window size is 168, the number of training rounds is 100, the optimizer is Adam, the learning rate is 0.001, and the dropout rate is 0.2.
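The basic settings listed above can be gathered into one configuration object per device. The sketch below is only illustrative: the class name `ImputerConfig`, its field names, and the device keys are assumptions, while the numeric values are taken from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ImputerConfig:
    """Per-device settings; field names are hypothetical, values follow the text."""
    num_nodes: int                 # equals the number of variables on the device
    num_relation_types: int = 4    # types of heterogeneity
    batch_size: int = 8
    window_size: int = 168
    num_rounds: int = 100
    optimizer: str = "adam"
    learning_rate: float = 0.001
    dropout: float = 0.2

# Number of variables per device, as listed in the data description.
DEVICE_VARIABLES = {"influent_chamber": 4, "tank_a": 9, "tank_b": 13}

# Three models with identical settings, differing only in the node count.
configs = {name: ImputerConfig(num_nodes=n) for name, n in DEVICE_VARIABLES.items()}
```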

[0017] ② Construct a decomposition module, in which a moving average kernel of size 25 and step size 1 is used to divide the data into a trend component and a seasonal component, using the following formula:

[0018]

[0019] where the summed term is the value of the j-th variable in the i-th device at time t+k, and n = 25 is the kernel size; the formula yields the trend component and the seasonal component;
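A minimal sketch of the decomposition step follows. The text fixes only the kernel size (25) and the step size (1); the centered window, the replication of the first and last values as padding, and the function name `decompose` are assumptions made so that the trend has the same length as the input.

```python
from typing import List, Tuple

def decompose(series: List[float], kernel: int = 25) -> Tuple[List[float], List[float]]:
    """Split a series into a moving-average trend and the remaining seasonal part."""
    if kernel % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel size")
    half = (kernel - 1) // 2
    # Assumption: pad by repeating the edge values so every window is full.
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    # Step size 1: one averaged window per original time point.
    trend = [sum(padded[t:t + kernel]) / kernel for t in range(len(series))]
    seasonal = [x - m for x, m in zip(series, trend)]
    return trend, seasonal

trend, seasonal = decompose([1.0, 2.0, 3.0, 4.0], kernel=3)
```

By construction the two components add back to the input, so no information is lost before the temporal embedding.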

[0020] ③ Construct a lightweight temporal embedding module, which includes a ReLU layer, six linear layers, a dropout layer, and a residual connection layer, as shown in the formula:

[0021]

[0022] where the input is the raw data of the j-th variable in the i-th device, together with its trend component and seasonal component; W_h is the weight matrix of the h-th layer; b_h is the output bias parameter of the h-th layer, and in the initial stage of the network b_h can be any constant between 0 and 1; the number of layers h is 6; D(·) denotes the dropout module with a dropout rate of 0.2; finally, a residual connection is applied. T(·) is the output of the lightweight temporal embedding module. The original data, the trend component, and the seasonal component are each passed through formula (3) to obtain the temporal embedding vectors of the static part, trend part, and seasonal part;
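The structure of the temporal embedding module can be sketched in plain Python. The layer count (6), the dropout rate (0.2), the initialization in the range 0 to 1, and the residual connection come from the text; the ordering linear, ReLU, dropout inside each layer, the square layer width, and the class name `TemporalEmbedding` are assumptions, since formula (3) is not reproduced here.

```python
import random
from typing import List

def linear(x: List[float], W: List[List[float]], b: List[float]) -> List[float]:
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(x: List[float]) -> List[float]:
    return [max(0.0, v) for v in x]

def dropout(x: List[float], rate: float, rng: random.Random, training: bool) -> List[float]:
    if not training:
        return list(x)  # dropout is inactive at inference time
    keep = 1.0 - rate
    # Inverted dropout: rescale the kept units so the expected value is unchanged.
    return [v / keep if rng.random() < keep else 0.0 for v in x]

class TemporalEmbedding:
    """Stack of linear layers with ReLU and dropout, plus a residual connection."""

    def __init__(self, width: int, num_layers: int = 6, dropout_rate: float = 0.2, seed: int = 0):
        self.rng = random.Random(seed)
        self.dropout_rate = dropout_rate
        # Initial weights and biases are constants between 0 and 1, as in the text.
        self.weights = [[[self.rng.random() for _ in range(width)] for _ in range(width)]
                        for _ in range(num_layers)]
        self.biases = [[self.rng.random() for _ in range(width)] for _ in range(num_layers)]

    def __call__(self, x: List[float], training: bool = False) -> List[float]:
        h = list(x)
        for W, b in zip(self.weights, self.biases):
            h = dropout(relu(linear(h, W, b)), self.dropout_rate, self.rng, training)
        # Residual connection: add the input back to the transformed signal.
        return [xi + hi for xi, hi in zip(x, h)]

embed = TemporalEmbedding(width=4)
window = [0.1, 0.2, 0.3, 0.4]
embedding = embed(window)
```

In the method the same module is applied three times, once each to the original data, the trend component, and the seasonal component.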

[0023] ④ Construct an edge vector generation module, which uses matrix multiplication and a set of convolution kernels, as shown in the formula:

[0024]

[0025] The formula uses the feature matrix of the i-th device and the transpose of that matrix; Θ denotes the parameter matrix of the convolutional layer. The temporal embedding vectors of the static part, trend part, and seasonal part are input here, and the initial edge vectors of the static part, trend part, and seasonal part are obtained by formula (4).
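One possible reading of the edge vector generation step is sketched below. The product of the feature matrix with its transpose is stated in the text; treating the "set of convolution kernels" as scalar 1×1 kernels applied to each entry of that product is an assumption, as are the names `gram_matrix`, `edge_vectors`, and `theta`.

```python
from typing import List

Matrix = List[List[float]]

def gram_matrix(H: Matrix) -> Matrix:
    """Compute H times its transpose: one similarity score per pair of nodes."""
    return [[sum(a * b for a, b in zip(h_l, h_m)) for h_m in H] for h_l in H]

def edge_vectors(H: Matrix, theta: List[float]) -> List[List[List[float]]]:
    """Assumed form: each 1x1 kernel in theta yields one channel of the edge vector."""
    scores = gram_matrix(H)
    return [[[k * s for k in theta] for s in row] for row in scores]

# Two nodes with 2-dimensional temporal embeddings and two hypothetical kernels.
H = [[1.0, 0.0], [0.0, 2.0]]
E = edge_vectors(H, theta=[0.5, 2.0])
```

Here `E[l][m]` is the initial edge vector between nodes l and m; the step would be run separately on the static, trend, and seasonal embeddings.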

[0026] ⑤ Construct a spatial conditional attention embedding module, which includes two types of spatial attention methods and one spatial embedding method, where each variable of each device is treated as a node, and g_l and g_m denote the feature vectors of the l-th and m-th nodes, respectively;

[0027] The formula for the first spatial attention method is:

[0028]

[0029] where the symbols denote, in order: two types of second-order initial edge vectors; a projection weight matrix; learnable attention weights; learnable model parameters; the concatenation of two matrices along the feature dimension; the ReLU activation function; the aggregation method for circular relationships; and the relationship between nodes l and m conditioned on the edge;

[0030] The formula for the second spatial attention method is:

[0031]

[0032] where the edge vector is the second-order initial edge vector and the result is the relationship between nodes l and m conditioned on that edge. Using the temporal embedding vectors of each part and the corresponding edge vectors, the node relationships under the different heterogeneous relationships are obtained by formulas (5) to (6).

[0033] The spatial embedding method is constructed, and its formula is as follows:

[0034]

[0035] where the neighbor term is the feature of node m, and node m belongs to the set of neighboring nodes of node l under the different edge conditions. The formula also uses the relationship coefficient between node l and node m under the second-order edge condition, the relationship coefficient between node l and node m under the first-order edge condition, the set of first-order nodes, the set of second-order nodes, the learnable weight matrices, and the learnable attention weights for the different parts. After the node relationships under the different heterogeneous relationships are obtained, three layers of formula (7) are applied to obtain the updated node features, which give the final filling result of the single model.

[0036] ⑥ Construct a spatiotemporal prototype federated learning module, which includes two parts: local prototype aggregation and global prototype aggregation. The formula for local prototype aggregation is:

[0037]

[0038] where the inputs are the embedding vectors of the u-th layer under the four heterogeneous relationships, with U = 3 layers, and the outputs are the prototypes corresponding to the four heterogeneous relationships;

[0039] The global prototype aggregation formula is:

[0040]

[0041] where the local term is the prototype of class j in device i, the aggregation runs over the set of devices that contain data of class j, and the number of devices is 3.
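The two aggregation levels can be sketched as follows. Formulas (8) and (9) are not reproduced in the text, so the unweighted mean used at both levels is an assumption, as are the function names and the relation labels; the sketch only fixes the data flow, in which each device averages its U = 3 layer embeddings per relation and the server averages over the devices that hold that relation.

```python
from typing import Dict, List

Vector = List[float]

def mean_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def local_prototypes(layer_embeddings: Dict[str, List[Vector]]) -> Dict[str, Vector]:
    """Per relation type, average the embeddings of the U layers (assumed: plain mean)."""
    return {rel: mean_vectors(vecs) for rel, vecs in layer_embeddings.items()}

def global_prototypes(local_by_device: Dict[str, Dict[str, Vector]]) -> Dict[str, Vector]:
    """Per relation type, average over the devices that actually hold that relation."""
    grouped: Dict[str, List[Vector]] = {}
    for protos in local_by_device.values():
        for rel, proto in protos.items():
            grouped.setdefault(rel, []).append(proto)
    return {rel: mean_vectors(protos) for rel, protos in grouped.items()}

# One relation on one device, with embeddings from U = 3 layers.
local = local_prototypes({"r1": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]})

# Hypothetical local prototypes from two devices; only tank_b holds relation r2.
shared = global_prototypes({
    "tank_a": {"r1": [1.0, 1.0]},
    "tank_b": {"r1": [3.0, 5.0], "r2": [2.0, 2.0]},
})
```

Only prototypes are exchanged in this scheme, which is what lets devices with different variable counts (4, 9, and 13) cooperate, provided their embeddings share one dimension.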

[0042] ⑦ Construct the loss function for a distributed, lightweight, heterogeneous graph neural network, the formula of which is:

[0043]

[0044] where the terms are, in order: the input data; the different first-order and second-order heterogeneous relations; the actual value; the number of prediction points, which is 7; and the local prototype and global prototype of client i. The distance between prototypes is the L2 distance, and I = 3 is the number of clients.
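A hedged sketch of a loss with the two ingredients named above: a reconstruction error over the prediction points and an L2 distance between local and global prototypes. Formula (10) is not reproduced in the text, so the use of mean squared error, the additive combination, and the `proto_weight` coefficient are assumptions.

```python
import math
from typing import Dict, List

def imputation_loss(pred: List[float], target: List[float],
                    local_protos: Dict[str, List[float]],
                    global_protos: Dict[str, List[float]],
                    proto_weight: float = 1.0) -> float:
    """Reconstruction error plus prototype alignment for one client."""
    # Assumed reconstruction term: mean squared error over the prediction points.
    mse = sum((p - y) ** 2 for p, y in zip(pred, target)) / len(target)
    # L2 distance between each local prototype and its global counterpart.
    align = sum(math.dist(local_protos[rel], global_protos[rel])
                for rel in local_protos if rel in global_protos)
    return mse + proto_weight * align

loss = imputation_loss(
    pred=[1.0, 2.0], target=[1.0, 4.0],
    local_protos={"r1": [0.0, 0.0]}, global_protos={"r1": [3.0, 4.0]},
)
```

In the full method this per-client value would be summed over the I = 3 clients, with 7 prediction points per sample.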

[0045] The gradient descent algorithm is used to update the parameters of the distributed lightweight heterogeneous graph neural network model. The update formula is divided into two parts. The first part of the formula is as follows:

[0046]

[0047]

[0048]

[0049] where the symbols denote, respectively, the weight matrix in the l-th spatial embedding method at the (t+1)-th iteration, the weight matrix in the spatial attention method at the t-th iteration, and the weight matrix of the h-th layer of the multilayer linear model at the t-th iteration; each element of every initial value is a constant between 0 and 1; η is the learning rate of the gradient descent algorithm, which is 0.001. The single-model network is updated by the gradient descent algorithm shown in formulas (11) to (14). After the network is trained, the local prototypes and global prototypes are calculated by formulas (8) to (9).
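The update rule described here has the generic gradient descent form, in which each parameter moves against its gradient by a step of size η. The sketch below shows that form on a flat parameter list; the function name `gradient_step` is an assumption. Note that the basic settings name Adam as the optimizer while this paragraph describes plain gradient descent; the sketch follows the plain form.

```python
from typing import List

def gradient_step(params: List[float], grads: List[float], lr: float = 0.001) -> List[float]:
    """One plain gradient descent update: theta <- theta - lr * dL/dtheta."""
    return [p - lr * g for p, g in zip(params, grads)]

# Toy example: minimize L(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = [0.5]  # initial value between 0 and 1, as in the text
for _ in range(5000):
    w = gradient_step(w, [2.0 * (w[0] - 3.0)])
```

With the stated learning rate of 0.001 each step is small, so the toy problem needs a few thousand steps to approach its minimum at w = 3.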

[0050] The second part is:

[0051]

[0052] where the updated quantity is the prototype of class j in device i at the (t+1)-th iteration; η is the learning rate of the gradient descent algorithm, which is 0.001. After the local prototypes and the global prototypes are calculated, the distributed framework is updated by formula (15).

[0053] (3) Training a distributed lightweight graph neural network data filling model for industrial process data

[0054] The specific training process for data imputation in industrial systems using distributed lightweight heterogeneous graph neural networks is as follows:

[0055] 1) Use formulas (1) to (10) to complete a single-round forward pass of the model;

[0056] 2) Use formulas (11) to (15) to run the gradient descent algorithm and update the overall model;

[0057] 3) Repeat steps 1) to 2) until the set 100 rounds are reached, then stop training;
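The control flow of steps 1) to 3) can be sketched with a toy client. `ToyClient` is a hypothetical stand-in for a full per-device model: it tracks a single prototype and pulls it toward the global prototype with a gradient step. The step size of 0.5 is chosen only so that the toy converges visibly within 100 rounds; the patent's learning rate is 0.001.

```python
from typing import Dict, List

Prototypes = Dict[str, List[float]]

class ToyClient:
    """Hypothetical stand-in for one device's model; it only tracks one prototype."""

    def __init__(self, prototype: List[float], lr: float = 0.5):
        self.prototype = list(prototype)
        self.lr = lr

    def train_one_round(self, global_protos: Prototypes) -> Prototypes:
        target = global_protos.get("rel")
        if target is not None:
            # Gradient step on 0.5 * ||p - g||^2, pulling the local prototype to the global one.
            self.prototype = [p - self.lr * (p - g) for p, g in zip(self.prototype, target)]
        return {"rel": list(self.prototype)}

def aggregate(local_by_device: Dict[str, Prototypes]) -> Prototypes:
    """Average the local prototypes of each relation over the devices that hold it."""
    grouped: Dict[str, List[List[float]]] = {}
    for protos in local_by_device.values():
        for rel, vec in protos.items():
            grouped.setdefault(rel, []).append(vec)
    return {rel: [sum(col) / len(vecs) for col in zip(*vecs)] for rel, vecs in grouped.items()}

def federated_training(clients: Dict[str, ToyClient], num_rounds: int = 100) -> Prototypes:
    global_protos: Prototypes = {}
    for _ in range(num_rounds):
        # Steps 1) and 2): every client runs one local round and reports its prototypes.
        local = {name: c.train_one_round(global_protos) for name, c in clients.items()}
        global_protos = aggregate(local)
    # Step 3): stop after the configured number of rounds.
    return global_protos

clients = {
    "influent_chamber": ToyClient([0.0]),
    "tank_a": ToyClient([2.0]),
    "tank_b": ToyClient([4.0]),
}
final_protos = federated_training(clients)
```

In the full method each round would also update the network weights of every client; only the prototype exchange is shown here.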

[0058] (4) Fill in industrial process data

[0059] The specific process of using the distributed lightweight heterogeneous graph neural network for industrial system data filling is as follows: the distributed data to be filled are passed through the model trained in step (3) to obtain the filling result.
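The filling step can be sketched as below, assuming that observed readings are kept unchanged, that only missing positions (stored as `None`) take the model output, and that the result is mapped back to physical units by inverting the min-max normalization. The function names and these conventions are assumptions; the text states only that the data are passed through the trained model.

```python
from typing import List, Optional

def fill_missing(observed: List[Optional[float]], predicted: List[float]) -> List[float]:
    """Keep observed values and use the model output only where data are missing."""
    return [p if o is None else o for o, p in zip(observed, predicted)]

def denormalize(values: List[float], lo: float, hi: float) -> List[float]:
    """Invert min-max normalization to recover physical units."""
    return [v * (hi - lo) + lo for v in values]

# Normalized readings with one gap, and a hypothetical model output for the window.
observed = [0.0, None, 1.0]
model_output = [0.1, 0.5, 0.9]
filled = fill_missing(observed, model_output)
restored = denormalize(filled, lo=2.0, hi=6.0)
```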

[0060] The effects that this invention can achieve:

[0061] (1) Social effects

[0062] This invention utilizes a distributed, lightweight, heterogeneous graph neural network-based method for industrial process data infilling. This method achieves high-precision intelligent infilling of distributed, heterogeneous data, improving the integrity and reliability of industrial data, reducing manual intervention, and enhancing the stability and decision-making accuracy of industrial systems. It effectively addresses complex spatiotemporal relationships and data gaps, reduces operational risks and costs, and promotes the intelligent and digital upgrading of industrial processes. This has positive social significance for driving high-quality industrial development and green, sustainable transformation.

[0063] (2) Economic effects

[0064] This invention significantly improves the operational efficiency of industrial systems by enhancing the accuracy and stability of distributed industrial process data filling, reducing misjudgments and redundant corrections caused by missing data, and minimizing energy waste and operational losses during production. Simultaneously, this method reduces the frequency of manual data repair and on-site troubleshooting, lowering the manpower and time costs of operation and maintenance. With the widespread application of this technology, it can effectively improve the data utilization efficiency and production decision-making level of industrial enterprises, enhance the overall economic benefits of intelligent manufacturing systems, and promote the large-scale implementation of industrial digitalization and high-efficiency applications.

[0065] (3) Technical effects

[0066] This invention overcomes the limitations of traditional data imputation methods in characterizing high-order heterogeneous relationships and in modeling spatiotemporal features in distributed environments. By constructing a lightweight temporal embedding module and a spatial conditional attention embedding mechanism, it accurately extracts trend and periodic features from time series data and effectively models complex heterogeneous associations and spatial dependencies among multiple variables. It also introduces a spatiotemporal prototype federated learning strategy to achieve collaborative modeling and knowledge sharing among multiple devices, enhancing the model's generalization ability and stability in distributed scenarios. This method significantly improves data imputation accuracy while maintaining computational efficiency, and it can be integrated with existing industrial monitoring and control systems, providing technical support for the intelligent and highly reliable operation of industrial processes.

Attached Figure Description

[0067] Figure 1 is a diagram showing the data filling result of the present invention.

Detailed Implementation

[0068] The present invention adopts the following technical solution and implementation steps:

[0069] A distributed lightweight heterogeneous graph neural network method for industrial process data imputation, characterized by the following steps: collecting and preprocessing industrial process data, establishing a distributed lightweight heterogeneous graph neural network data imputation method, training a distributed lightweight graph neural network data imputation model for industrial process data, and imputing the industrial process data.

[0070] (1) Acquisition and preprocessing of industrial process data

[0071] Obtain an industrial process time series dataset. The data come from a wastewater treatment plant and are distributed across edge devices in the influent chamber, tank A, and tank B. The influent chamber dataset contains 4 variables: the pH value at time t, obtained from a pH meter; the SS (suspended solids) at time t, in mg/L, collected by an SS sensor; the COD (chemical oxygen demand) at time t, in mg/L, obtained from a COD analyzer; and the NH3N (ammonia nitrogen) at time t, in mg/L, collected by an ammonia nitrogen analyzer. The tank A dataset contains 9 variables: the anaerobic ORP (oxidation-reduction potential) at time t, in mV, acquired by an ORP sensor; the pre-anoxic ORP at time t, in mV, acquired by an ORP sensor; the NO3N (nitrate nitrogen) at the end of the anoxic zone at time t, in mg/L, obtained from a nitrate nitrogen analyzer; the MLSS (mixed liquor suspended solids concentration) at the end of the anoxic zone at time t, in mg/L, collected by a suspended solids concentration meter; the final liquid level at time t, in mm, obtained from a liquid level sensor; the DO (dissolved oxygen) before the aerobic tank at time t, in mg/L, measured by a dissolved oxygen meter; the DO in the aerobic zone at time t, in mg/L, obtained from a dissolved oxygen meter; the DO in aerobic tank 2 at time t, in mg/L, obtained from a dissolved oxygen meter; and the orthophosphate concentration at the end of the second aerobic tank at time t, in mg/L, collected by an orthophosphate analyzer. The tank B dataset contains 13 variables:

[0072] the anaerobic ORP at time t, in mV, acquired by an ORP sensor; the pre-anoxic ORP at time t, in mV, acquired by an ORP sensor; the NO3N at the end of the anoxic zone at time t, in mg/L, obtained from a nitrate nitrogen analyzer; the MLSS at the end of the anoxic zone at time t, in mg/L, collected by a suspended solids concentration meter; the final liquid level at time t, in mm, obtained from a liquid level sensor; the DO before the aerobic tank at time t, in mg/L, measured by a dissolved oxygen meter; the DO in the aerobic zone at time t, in mg/L, obtained from a dissolved oxygen meter; a further DO value at time t, in mg/L, obtained from a dissolved oxygen meter; the orthophosphate concentration at time t, in mg/L, obtained from an orthophosphate analyzer; the effluent TN (total nitrogen) at time t, in mg/L, collected by a total nitrogen analyzer; the effluent COD at time t, in mg/L, obtained from a COD analyzer; the external return flow rate at time t, in m³/h, obtained from flow meter data; and the effluent pH value at time t, obtained from a pH meter. The above data are normalized using the following formula:

[0073] $$\tilde{x}_{i,j}^{t} = \frac{x_{i,j}^{t} - x_{i,j}^{\min}}{x_{i,j}^{\max} - x_{i,j}^{\min}} \tag{1}$$

[0074] where i denotes the i-th device, j the j-th variable, and t the t-th time point; $x_{i,j}^{\max}$ is the maximum value of the j-th variable in the i-th device, $x_{i,j}^{\min}$ is the minimum value of the j-th variable in the i-th device, and $x_{i,j}^{t}$ is the value of the j-th variable in the i-th device at time t;

[0075] (2) Establish a distributed lightweight heterogeneous graph neural network data imputation method

[0076] ① A distributed, lightweight, heterogeneous graph neural network data imputation method is constructed. This data imputation method trains three identical data imputation models simultaneously. The input of each model is all the variables of the corresponding device. The imputation is interactively learned through a spatiotemporal prototype federated learning module. Each model contains a decomposition module, a lightweight temporal embedding module, an edge vector generation module, and a spatial conditional attention embedding module. In the basic settings of each model, there are 4 types of heterogeneity, the number of nodes is equal to the number of variables j, the batch size is 8, the window size is 168, the number of training rounds is 100, the optimizer is Adam, the learning rate is 0.001, and the dropout rate is 0.2.

[0077] ② Construct a decomposition module, in which a moving-average kernel of size 25 and stride 1 is used to decompose the data into a trend component and a seasonal component, using the following formula:

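[0078] $$x_{i,j}^{\mathrm{trend},t} = \frac{1}{n}\sum_{k} x_{i,j}^{t+k}, \qquad x_{i,j}^{\mathrm{season},t} = x_{i,j}^{t} - x_{i,j}^{\mathrm{trend},t} \tag{2}$$

[0079] where $x_{i,j}^{t+k}$ is the value of the j-th variable in the i-th device at time t+k, the sum runs over the n = 25 offsets k of the moving-average window, $x_{i,j}^{\mathrm{trend},t}$ is the trend component, and $x_{i,j}^{\mathrm{season},t}$ is the seasonal component;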
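A moving-average decomposition of this kind can be sketched as below. The kernel size of 25 and stride of 1 come from the text; the centered window, the edge padding that repeats the first and last rows, and the definition of the seasonal part as the remainder are assumptions borrowed from common series-decomposition practice, and the function name is hypothetical. The input is assumed to contain no NaNs.

```python
import numpy as np

def decompose(x, kernel_size=25):
    """Split a (T, J) series into a trend (moving average) and a seasonal remainder."""
    x = np.asarray(x, dtype=float)
    pad = (kernel_size - 1) // 2
    # Assumption: repeat the boundary rows so the trend keeps length T.
    padded = np.concatenate(
        [np.repeat(x[:1], pad, axis=0), x, np.repeat(x[-1:], pad, axis=0)], axis=0
    )
    kernel = np.ones(kernel_size) / kernel_size
    trend = np.stack(
        [np.convolve(padded[:, j], kernel, mode="valid") for j in range(x.shape[1])],
        axis=1,
    )
    seasonal = x - trend
    return trend, seasonal
```

By construction the two components always sum back to the input, so no information is lost by the split.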

[0080] ③ Construct a lightweight temporal embedding module, which includes a ReLU layer, six linear layers, a dropout layer, and a residual connection layer, as shown in the formula:

[0081]

[0082] where the input is the data of the j-th variable in the i-th device, namely the raw data together with its trend component and seasonal component; W_h is the weight matrix of the h-th layer; b_h is the output bias parameter of the h-th layer, which at the initial stage of the network can be any constant between 0 and 1; the number of layers is 6; D(·) denotes the dropout module with a dropout rate of 0.2; a residual connection is applied at the end of the model; and T(·) is the output of the lightweight temporal embedding module. The raw data, the trend component, and the seasonal component are passed through formula (3) to obtain the temporal embedding vectors of the static part, the trend part, and the seasonal part, respectively;
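One way to picture a temporal embedding built from linear layers, ReLU, dropout, and a residual connection is the NumPy sketch below. The exact form of formula (3) is not recoverable from this text, so the layer ordering, the hidden width, the scaled uniform initialization, and the function names are all assumptions made for illustration.

```python
import numpy as np

def init_mlp(in_dim, hidden_dim, num_layers=6, seed=0):
    """Create num_layers (W, b) pairs mapping in_dim -> hidden -> ... -> in_dim."""
    rng = np.random.default_rng(seed)
    dims = [in_dim] + [hidden_dim] * (num_layers - 1) + [in_dim]
    params = []
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        # Values lie in [0, 1); weights are scaled by fan-in for stability (assumption).
        W = rng.uniform(0.0, 1.0 / fan_in, size=(fan_in, fan_out))
        b = rng.uniform(0.0, 1.0, size=fan_out)
        params.append((W, b))
    return params

def temporal_embedding(x, params, dropout=0.2, training=False, rng=None):
    """Apply the linear stack along the time axis of x (J, window) and add x back."""
    if rng is None:
        rng = np.random.default_rng()
    h = x
    for idx, (W, b) in enumerate(params):
        h = h @ W + b
        if idx < len(params) - 1:
            h = np.maximum(h, 0.0)  # ReLU
            if training:
                keep = rng.random(h.shape) >= dropout
                h = h * keep / (1.0 - dropout)  # inverted dropout
    return x + h  # residual connection
```

The same function would be called three times, once each on the raw, trend, and seasonal windows, to produce the static, trend, and seasonal embeddings.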

[0083] ④ Construct an edge vector generation module, which uses matrix multiplication and a set of convolution kernels, as shown in the formula:

[0084]

[0085] where the first factor is the feature matrix of the i-th device, the second factor is the transpose of that matrix, and Θ is the parameter matrix of the convolutional layer. The temporal embedding vectors of the static part, trend part, and seasonal part are input, and the initial edge vectors of the static part, trend part, and seasonal part are obtained by formula (4).

[0086] ⑤ Construct a spatial conditional attention embedding module, which includes two spatial attention methods and one spatial embedding method. Each variable of each device is treated as a node, and g_l and g_m denote the feature vectors of the l-th and m-th nodes, respectively;

[0087] The formula for the first spatial attention method is:

[0088]

[0089] where the two edge inputs are two types of second-order initial edge vectors, and the remaining symbols denote, respectively: a projection weight matrix; the learnable attention weights; the learnable model parameters; the concatenation of two matrices along the feature dimension; the ReLU activation function; the aggregation operation for cyclic relationships; and the relationship between nodes l and m conditioned on the given edge;

[0090] The formula for the second spatial attention method is:

[0091]

[0092] where the edge input is the second-order initial edge vector, and the output is the relationship between nodes l and m conditioned on that edge. Using the temporal embedding vectors of each part and the corresponding edge vectors, formulas (5) and (6) yield the node relationships under the different heterogeneous relations.

[0093] The spatial embedding method is constructed, and its formula is as follows:

[0094]

[0095] where the symbols denote, respectively: the feature of node m, with node m belonging to the set of neighboring nodes of node l under the different edge conditions; the relationship coefficient between node l and node m under a second-order edge; the relationship coefficient between node l and node m under a first-order edge; the set of first-order nodes; the set of second-order nodes; the learnable weight matrices; and the learnable attention weights for the different parts. Once the node relationships under the different heterogeneous relations have been obtained, three layers of formula (7) are applied to update the node features and produce the final imputation result of a single model.

[0096] ⑥ Construct a spatiotemporal prototype federated learning module, which includes two parts: local prototype aggregation and global prototype aggregation. The formula for local prototype aggregation is:

[0097]

[0098] where the input is the embedding vector of the u-th layer under each of the four heterogeneous relationships, U is 3, and the outputs are the prototypes corresponding to the four heterogeneous relationships;

[0099] The global prototype aggregation formula is:

[0100]

[0101] where the input is the prototype of class j in device i; the device set is the set of devices that contain data of class j, so that device i belongs to it when device i contains data of class j; and the number of devices is 3.
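The two aggregation steps can be illustrated with a simple averaging scheme. Formulas (8) and (9) are not recoverable from this text, so the sketch below assumes plain means: a local prototype is the mean of a relation's embeddings over the U layers and over nodes, and a global prototype is the unweighted mean of the local prototypes from the devices that have that class. The function names and the dictionary layout are hypothetical.

```python
import numpy as np

def local_prototypes(layer_embeddings):
    """Average each relation's embeddings over layers and nodes.

    layer_embeddings: dict mapping relation name -> array of shape (U, N, D).
    Returns a dict mapping relation name -> prototype of shape (D,).
    """
    return {
        rel: np.asarray(emb, dtype=float).mean(axis=(0, 1))
        for rel, emb in layer_embeddings.items()
    }

def global_prototypes(client_protos):
    """Average each class prototype over the devices that contain that class.

    client_protos: list with one dict (relation -> prototype) per device.
    """
    collected = {}
    for protos in client_protos:
        for rel, p in protos.items():
            collected.setdefault(rel, []).append(p)
    return {rel: np.mean(ps, axis=0) for rel, ps in collected.items()}
```

Only prototypes, not raw sensor data, are exchanged between devices in this scheme, which is the usual motivation for prototype-based federated learning.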

[0102] ⑦ Construct the loss function for a distributed, lightweight, heterogeneous graph neural network, the formula of which is:

[0103]

[0104] where the symbols denote, respectively: the input data; the different first-order and second-order heterogeneous relations; the actual value; the number of prediction points, which is 7; the local prototype and the global prototype of client i; and the L2 distance metric. I is 3, which is the number of clients.
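A loss with the two ingredients named above, an imputation error against the actual values and an L2 distance between local and global prototypes, might look like the following for one client. The mean-squared-error form, the weighting factor `lam`, and the function name are assumptions; formula (10) itself is not recoverable from this text.

```python
import numpy as np

def imputation_loss(pred, target, local_protos, global_protos, lam=1.0):
    """Reconstruction error plus prototype alignment for one client.

    pred, target: arrays of predicted and actual values at the prediction points.
    local_protos, global_protos: dicts mapping relation name -> prototype vector.
    lam: hypothetical weight on the prototype term.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    recon = np.mean((pred - target) ** 2)
    # L2 distance between matching local and global prototypes.
    proto = sum(
        np.linalg.norm(local_protos[k] - global_protos[k])
        for k in local_protos
        if k in global_protos
    )
    return recon + lam * proto
```

The total objective over the I = 3 clients would then be the sum of this quantity across clients.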

[0105] The gradient descent algorithm is used to update the parameters of the distributed lightweight heterogeneous graph neural network model. The update formula is divided into two parts. The first part is as follows:

[0106]

[0107]

[0108]

[0109] where the symbols denote, respectively: the weight matrix in the l-th spatial embedding method at the (t+1)-th iteration; the weight matrix in the spatial attention method at the t-th iteration; and the weight matrix of the h-th layer in the multi-layer linear layers at the t-th iteration. Each element of every initial value is a constant between 0 and 1, and η is the learning rate of the gradient descent algorithm, which is 0.001. The single-model network is updated by the gradient descent algorithm shown in formulas (11) to (14). After the network is trained, the local prototype and the global prototype are calculated by formulas (8) and (9).
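For orientation, a gradient-descent step on any one of these weight matrices has the generic form below. This is the textbook update rule, given here as an assumption about the shape of the individual update formulas rather than a reconstruction of them; θ stands for whichever weight matrix is being updated and $\mathcal{L}$ for the loss.

```latex
% Generic gradient-descent step (assumed form; \theta is any trainable weight matrix)
\theta^{(t+1)} = \theta^{(t)} - \eta \, \frac{\partial \mathcal{L}}{\partial \theta^{(t)}},
\qquad \eta = 0.001
```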

[0110] The second part is:

[0111]

[0112] where the updated quantity is the prototype of class j in device i at the (t+1)-th iteration, and η is the learning rate of the gradient descent algorithm, which is 0.001. After the local prototype and the global prototype are calculated, the distributed framework is updated by formula (15).

[0113] (3) Training a distributed lightweight graph neural network data filling model for industrial process data

[0114] The specific training process for data imputation in industrial systems using distributed lightweight heterogeneous graph neural networks is as follows:

[0115] 1) Use formulas (1) to (10) to complete a single-round forward pass of the model;

[0116] 2) Use formulas (11) to (15) to perform gradient descent and update the overall model;

[0117] 3) Repeat steps 1) and 2) until the set 100 rounds are reached, then stop training;
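The round structure of the steps above can be illustrated with a deliberately tiny stand-in. In the sketch below each client's "model" is reduced to a single prototype vector that fits a local target while being pulled toward the shared global prototype; this is not the graph neural network of the method, and the function name, the quadratic objective, and the weight `lam` are all hypothetical.

```python
import numpy as np

def federated_rounds(local_targets, rounds=100, lr=0.001, lam=1.0):
    """Toy federated loop: local gradient steps alternated with global averaging."""
    targets = [np.asarray(t, dtype=float) for t in local_targets]
    protos = [np.zeros_like(t) for t in targets]
    for _ in range(rounds):
        # Server side: aggregate the current local prototypes.
        global_proto = np.mean(protos, axis=0)
        for i, target in enumerate(targets):
            # Client side: gradient of ||p - target||^2 + lam * ||p - global||^2,
            # with the global prototype held fixed during the step.
            grad = 2.0 * (protos[i] - target) + 2.0 * lam * (protos[i] - global_proto)
            protos[i] = protos[i] - lr * grad
    return protos, np.mean(protos, axis=0)
```

With three clients the call would be `federated_rounds([t1, t2, t3])`, which uses the 100 rounds and learning rate of 0.001 given in the text as defaults.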

[0118] (4) Fill in industrial process data

[0119] The specific process of using the distributed lightweight heterogeneous graph neural network for industrial system data filling is as follows: the distributed data to be filled are passed through the model trained in step (3) to obtain the filling result.
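A final filling step might be sketched as follows, assuming that missing readings are encoded as NaN and that observed readings are kept unchanged while only the gaps take the model output. Both points are assumptions, since the text does not specify how the output is merged with the observed data, and the function name is hypothetical.

```python
import numpy as np

def fill_missing(observed, predicted):
    """Keep observed values and take model predictions only where data are missing."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    missing = np.isnan(observed)  # assumption: NaN marks a missing reading
    return np.where(missing, predicted, observed), missing
```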

Claims

1. A distributed industrial process data imputation method based on a lightweight heterogeneous graph neural network, characterized in that the method comprises collecting and preprocessing industrial process data, establishing a distributed lightweight heterogeneous graph neural network data imputation method, training a distributed lightweight graph neural network data imputation model for industrial process data, and imputing the industrial process data, and includes the following steps:

(1) Acquisition and preprocessing of industrial process data. Obtain an industrial process time series dataset. The data come from a wastewater treatment plant and are distributed across edge devices in the influent chamber, tank A, and tank B.

The influent chamber dataset contains 4 variables, which are, in order: the pH value at time t, obtained from a pH meter; the suspended solids (SS) at time t, in mg/L, collected by an SS sensor; the chemical oxygen demand (COD) at time t, in mg/L, obtained from a COD analyzer; and the NH3N concentration at time t, in mg/L, obtained from an ammonia nitrogen analyzer.

The tank A dataset contains 9 variables, which are, in order: the oxidation-reduction potential (ORP) in the anaerobic zone at time t, in millivolts, acquired by an ORP sensor; the pre-anoxic ORP at time t, in millivolts, acquired by an ORP sensor; the nitrate nitrogen (NO3N) at the end of the anoxic zone at time t, in mg/L, obtained from a nitrate nitrogen analyzer; the mixed liquor suspended solids (MLSS) concentration at the end of the anoxic tank at time t, in mg/L, collected by a suspended solids concentration meter; the final liquid level at time t, in millimeters, obtained from a liquid level sensor; the dissolved oxygen (DO) before the aerobic tank at time t, in mg/L, measured by a dissolved oxygen meter; the DO concentration in the aerobic zone at time t, in mg/L, obtained from a dissolved oxygen meter; the DO in aerobic tank 2 at time t, in mg/L, obtained from a dissolved oxygen meter; and the orthophosphate concentration at the end of the second aerobic tank at time t, in mg/L, collected by an orthophosphate analyzer.

The tank B dataset contains 13 variables, which are, in order: the ORP in the anaerobic zone at time t, in millivolts, acquired by an ORP sensor; the pre-anoxic ORP at time t, in millivolts, acquired by an ORP sensor; the NO3N at the end of the anoxic zone at time t, in mg/L, obtained from a nitrate nitrogen analyzer; the MLSS at the end of the anoxic zone at time t, in mg/L, collected by a suspended solids concentration meter; the final liquid level at time t, in millimeters, obtained from a liquid level sensor; the DO before the aerobic zone at time t, in mg/L, measured by a dissolved oxygen meter; the DO concentration in the aerobic zone at time t, in mg/L, obtained from a dissolved oxygen meter; a further DO value at time t, in mg/L, obtained from a dissolved oxygen meter; the orthophosphate concentration at time t, in mg/L, obtained from an orthophosphate analyzer; the total nitrogen (TN) of the effluent at time t, in mg/L, collected by a total nitrogen analyzer; the COD of the effluent at time t, in mg/L, obtained from a COD analyzer; the external return flow rate at time t, in cubic meters per hour, obtained from a flow meter; and the pH value of the effluent at time t, obtained from a pH meter.

The above data are normalized using the following formula:

$$\tilde{x}_{i,j}^{t} = \frac{x_{i,j}^{t} - x_{i,j}^{\min}}{x_{i,j}^{\max} - x_{i,j}^{\min}} \tag{1}$$

where i denotes the i-th device, j the j-th variable, and t the t-th time point; $x_{i,j}^{\max}$ is the maximum value of the j-th variable in the i-th device; $x_{i,j}^{\min}$ is the minimum value of the j-th variable in the i-th device; $x_{i,j}^{t}$ is the value of the j-th variable in the i-th device at time t; and $\tilde{x}_{i,j}^{t}$ is the corresponding normalized value;

(2) Establish a distributed lightweight heterogeneous graph neural network data imputation method.

① Construct the distributed lightweight heterogeneous graph neural network data imputation method. The method trains three identical data imputation models simultaneously, one per device, and the input of each model is all the variables of its device. The models learn interactively through a spatiotemporal prototype federated learning module. Each model contains a decomposition module, a lightweight temporal embedding module, an edge vector generation module, and a spatial conditional attention embedding module. In the basic settings of each model, there are 4 types of heterogeneous relationships, the number of nodes equals the number of variables of the device, the batch size is 8, the window size is 168, the number of training rounds is 100, the optimizer is Adam, the learning rate is 0.001, and the dropout rate is 0.2.

② Construct a decomposition module, in which a moving-average kernel of size 25 and stride 1 is used to decompose the data into a trend component and a seasonal component, using the following formula:

$$x_{i,j}^{\mathrm{trend},t} = \frac{1}{n}\sum_{k} x_{i,j}^{t+k}, \qquad x_{i,j}^{\mathrm{season},t} = x_{i,j}^{t} - x_{i,j}^{\mathrm{trend},t} \tag{2}$$

where $x_{i,j}^{t+k}$ is the value of the j-th variable in the i-th device at time t+k, the sum runs over the n = 25 offsets k of the moving-average window, $x_{i,j}^{\mathrm{trend},t}$ is the trend component, and $x_{i,j}^{\mathrm{season},t}$ is the seasonal component;

③ Construct a lightweight temporal embedding module, which includes a ReLU layer, six linear layers, a dropout layer, and a residual connection layer, as given by formula (3), where the input is the data of the j-th variable in the i-th device, namely the raw data together with its trend component and seasonal component; W_h is the weight matrix of the h-th layer; b_h is the output bias parameter of the h-th layer, which at the initial stage of the network can be any constant between 0 and 1; the number of layers is 6; D(·) denotes the dropout module with a dropout rate of 0.2; a residual connection is applied at the end of the model; and T(·) is the output of the lightweight temporal embedding module. The raw data, the trend component, and the seasonal component are passed through formula (3) to obtain the temporal embedding vectors of the static part, the trend part, and the seasonal part, respectively;

④ Construct an edge vector generation module, which uses matrix multiplication and a set of convolution kernels, as given by formula (4), where the first factor is the feature matrix of the i-th device, the second factor is the transpose of that matrix, and Θ is the parameter matrix of the convolutional layer. The temporal embedding vectors of the static part, trend part, and seasonal part are input, and the initial edge vectors of the static part, trend part, and seasonal part are obtained by formula (4).

⑤ Construct a spatial conditional attention embedding module, which includes two spatial attention methods and one spatial embedding method. Each variable of each device is treated as a node, and g_l and g_m denote the feature vectors of the l-th and m-th nodes, respectively.

The first spatial attention method is given by formula (5), where the two edge inputs are two types of second-order initial edge vectors, and the remaining symbols denote, respectively: a projection weight matrix; the learnable attention weights; the learnable model parameters; the concatenation of two matrices along the feature dimension; the ReLU activation function; the aggregation operation for cyclic relationships; and the relationship between nodes l and m conditioned on the given edge.

The second spatial attention method is given by formula (6), where the edge input is the second-order initial edge vector, and the output is the relationship between nodes l and m conditioned on that edge. Using the temporal embedding vectors of each part and the corresponding edge vectors, formulas (5) and (6) yield the node relationships under the different heterogeneous relations.

The spatial embedding method is given by formula (7), where the symbols denote, respectively: the feature of node m, with node m belonging to the set of neighboring nodes of node l under the different edge conditions; the relationship coefficient between node l and node m under a second-order edge; the relationship coefficient between node l and node m under a first-order edge; the set of first-order nodes; the set of second-order nodes; the learnable weight matrices; and the learnable attention weights for the different parts. Once the node relationships under the different heterogeneous relations have been obtained, three layers of formula (7) are applied to update the node features and produce the final imputation result of a single model.

⑥ Construct a spatiotemporal prototype federated learning module, which includes two parts: local prototype aggregation and global prototype aggregation. Local prototype aggregation is given by formula (8), where the input is the embedding vector of the u-th layer under each of the four heterogeneous relationships, U is 3, and the outputs are the prototypes corresponding to the four heterogeneous relationships. Global prototype aggregation is given by formula (9), where the input is the prototype of class j in device i; the device set is the set of devices that contain data of class j; and the number of devices is 3;

⑦ Construct the loss function of the distributed lightweight heterogeneous graph neural network, given by formula (10), where the symbols denote, respectively: the input data; the different first-order and second-order heterogeneous relations; the actual value; the number of prediction points, which is 7; the local prototype and the global prototype of client i; and the L2 distance metric. I is 3, which is the number of clients.

2. The method according to claim 1, characterized in that the gradient descent algorithm is used to update the parameters of the distributed lightweight heterogeneous graph neural network model, and the update formula is divided into two parts. The first part is given by formulas (11) to (14), where the symbols denote, respectively: the weight matrix in the l-th spatial embedding method at the (t+1)-th iteration; the weight matrix in the spatial attention method at the t-th iteration; and the weight matrix of the h-th layer in the multi-layer linear layers at the t-th iteration. Each element of every initial value is a constant between 0 and 1, and η is the learning rate of the gradient descent algorithm, which is 0.001. The single-model network is updated by the gradient descent algorithm shown in formulas (11) to (14). After the network is trained, the local prototype and the global prototype are calculated by formulas (8) and (9). The second part is given by formula (15), where the updated quantity is the prototype of class j in device i at the (t+1)-th iteration, and η is the learning rate of the gradient descent algorithm, which is 0.001. After the local prototype and the global prototype are calculated, the distributed framework is updated by formula (15).

3. The method according to claim 1, characterized in that the specific training process for data imputation in industrial systems using the distributed lightweight heterogeneous graph neural network is as follows: 1) use formulas (1) to (10) to complete a single-round forward pass of the model; 2) use formulas (11) to (15) to perform gradient descent and update the overall model; 3) repeat steps 1) and 2) until the set 100 rounds are reached, then stop training; 4) fill in the industrial process data, wherein the distributed data to be filled are passed through the model trained in step 3) to obtain the filling result.