A method of industrial process monitoring
By constructing an adaptive spatiotemporal neighborhood feature learning autoencoder, the invention addresses the lack of neighborhood structure information in existing techniques, enabling more accurate industrial process monitoring with greater tolerance to process uncertainties.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- UNIV OF SCI & TECH BEIJING
- Filing Date
- 2025-03-10
- Publication Date
- 2026-06-12
AI Technical Summary
Existing industrial process monitoring methods lack neighborhood structure information during feature extraction, leading to inaccurate monitoring results.
An adaptive spatiotemporal neighborhood feature learning autoencoder is constructed. Dynamic spatiotemporal neighborhood relationships are established by considering temporal and spatial information simultaneously, and are used to constrain both the input data and the features of the autoencoder. An attention mechanism and a cross-entropy variant are used to measure topological similarity, and an adaptive spatiotemporal neighborhood feature learning loss function is constructed to capture dynamic spatiotemporal features.
Maintaining the spatiotemporal topology of data improves the accuracy of industrial process monitoring and its tolerance to process uncertainties, thereby enhancing monitoring capabilities.
Figure CN120276381B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of machine learning technology, and in particular to an industrial process monitoring method.
Background Technology
[0002] Industry is a hallmark of a nation's development, determining the modernization of the national economy and serving as its leading force. In modern industry, the scale of technological processes is expanding daily, placing higher demands on process safety and product quality. Failures can impose a significant burden on the economy and the environment, making monitoring systems crucial for the timely identification of faults and anomalies. Therefore, constructing a monitoring system capable of timely and reliable fault identification has become a critical task.
[0003] With advancements in distributed control systems and smart sensors, vast amounts of process data have been recorded, greatly promoting the development of data-driven monitoring methods. Compared to first-principles-based methods, data-driven methods do not require precise physical modeling; instead, they use historical data to describe relationships within complex industrial systems. As an important branch of data-driven methods, Multivariate Statistical Process Monitoring (MSPM) has been extensively studied, including techniques such as Principal Component Analysis (PCA), Partial Least Squares (PLS), and Independent Component Analysis (ICA).
[0004] To further mine deeper information from data, deep learning methods have been introduced to handle complex nonlinear process data. Related models include Convolutional Neural Networks (CNNs), Deep Belief Networks (DBNs), Recurrent Neural Networks (RNNs), and Autoencoders (AEs). Autoencoders achieve feature extraction and data reconstruction through their unsupervised architecture. Due to their superior monitoring performance and easily adjustable network structure, various variants have emerged, including Sparse Autoencoders (SAEs), Denoising Autoencoders (DAEs), and Variational Autoencoders (VAEs). By constructing statistical metrics in the latent layers, faults can be effectively detected. However, these methods do not consider the neighborhood structure of the process data. During encoding and decoding, imposing constraints only on the Euclidean distance between the input and reconstructed data may disrupt the data topology. Furthermore, extracted features may lose neighborhood information, negatively impacting the accuracy of process monitoring.
[0005] Process variables in industrial data often exhibit strong topological correlations, and preserving the data's topological structure during model training is beneficial for feature extraction and reconstruction. However, existing methods primarily extract features by minimizing the Euclidean distance between data points, which may disrupt the data's neighborhood topology. Consequently, the extracted features lack neighborhood structure information, leading to inaccurate monitoring results.
[0006] Manifold learning methods, such as Locally Linear Embedding (LLE), Isometric Mapping (ISOMAP), Locality Preserving Projections (LPP), and Neighborhood Preserving Embedding (NPE), can effectively capture the manifold structure of data. Methods applying manifold learning to autoencoders have gradually emerged, aiming to capture both the local and global structure of data.
[0007] Wang et al. proposed a novel stacked locality-preserving autoencoder to more effectively maintain local data structures in latent features. To extract more comprehensive features, both local and global neighborhood information must be considered simultaneously. Liu et al. proposed a stacked autoencoder that preserves both local and non-local structures by incorporating a regularizer that captures both kinds of structural information. Yang et al. proposed a novel stacked autoencoder that preserves local, non-local, and global structures to extract more comprehensive structure-related features. The above methods determine neighborhood relationships by calculating the distance between spatially proximate points. However, the deterministic structures derived in this way face challenges when addressing the inherent uncertainties of industrial process data. Probabilistic manifold methods, such as t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP), describe local and global neighborhood structures in a nonlinear probabilistic manner, making them more suitable for real-world industrial processes. Furthermore, constructing adjacency graphs in the spatial domain only captures static information from process data. Because of process inertia and feedback control, industrial data also exhibit temporal dynamics, which such spatial-only graphs neglect. Considering adjacency relationships in the time domain allows for a more comprehensive extraction of neighborhood features.
[0008] Recent research has applied the concept of preserving spatiotemporal features to the extraction of dynamic features. Song et al. integrated locality preserving projections with sequence information to preserve spatial and temporal locality jointly. Furthermore, some studies have incorporated spatiotemporal structure constraints into the loss function to extract spatiotemporal features. Liu et al. proposed a spatiotemporal neighborhood-preserving stacked autoencoder to learn deep dynamic features, in which the temporal distance between sampling instants is used to construct the temporal neighborhood structure. Wang et al. proposed a spatiotemporal neighborhood learning network combined with a mean teacher to learn the spatiotemporal neighborhood structure of the data, in which the weight of the temporal neighborhood is measured by the proportion of the time difference between samples. Both methods focus on the impact of sampling time on the data neighborhood structure: the closer the sampling times of two samples, the greater the weight. Wang et al. proposed a novel spatiotemporal feature extraction method by developing a local spatiotemporal structure-preserving stacked semi-supervised autoencoder, which considers both temporal and spatial correlations when constructing the neighborhood structure. However, the above methods only consider the spatial structure within the temporal neighborhood and lack temporal information in the description of the neighborhood structure. Considering both spatial and temporal information when constructing the neighborhood structure allows the network to capture more comprehensive spatiotemporal features. Furthermore, industrial process data form a slowly varying continuous time series, and different time periods typically exhibit different spatiotemporal characteristics. Fixed neighborhood relationships are therefore sensitive to noise and fluctuations, which degrades the accuracy of the monitoring results.
Summary of the Invention
[0009] This invention provides an industrial process monitoring method to solve the technical problem that existing industrial process monitoring methods lack neighborhood structure information in the extracted features, leading to inaccurate monitoring results.
[0010] To solve the above-mentioned technical problems, the present invention provides the following technical solution:
[0011] On one hand, the present invention provides an industrial process monitoring method, comprising:
[0012] collecting production process data from historical industrial processes to construct a sample dataset;
[0013] establishing a dynamic spatiotemporal neighborhood relationship that considers temporal and spatial information simultaneously, and using it to impose constraints on the input data and features of an autoencoder, thereby constructing an adaptive spatiotemporal neighborhood feature learning autoencoder;
[0014] training the adaptive spatiotemporal neighborhood feature learning autoencoder using the sample dataset;
[0015] performing industrial process monitoring based on the trained adaptive spatiotemporal neighborhood feature learning autoencoder.
[0016] Furthermore, collecting production process data from historical industrial processes to construct a sample dataset includes:
[0017] collecting, in chronological order, production process data recorded during normal production in historical industrial processes, and standardizing the collected data to zero mean and unit variance to form the sample dataset.
[0018] Furthermore, the adaptive spatiotemporal neighborhood feature learning autoencoder first uses the autoencoder model to learn the basic features of the input data through reconstruction constraints. It then uses the uniform manifold approximation and projection (UMAP) method to calculate the spatial topology between adjacent data points in the temporal neighborhood, for both the input layer and the hidden layer, and introduces an attention mechanism to learn the influence of temporal distance on the topology. This dynamically adjusts the spatial topology and yields a topology containing spatiotemporal information, which is converted into a joint probability distribution. The similarity between the two joint probability distributions (one for the input layer, one for the hidden layer) is measured with a cross-entropy variant and introduced into the loss function to constrain the autoencoder, enabling it to capture dynamic spatiotemporal features.
[0019] Furthermore, considering both temporal and spatial information, a dynamic spatiotemporal neighborhood relationship is established to impose constraints on the input data and features of the autoencoder, thus constructing an adaptive spatiotemporal neighborhood feature learning autoencoder, including:
[0020] Construct an autoencoder; whereby the autoencoder learns the features of the data by minimizing the reconstruction error;
[0021] Simultaneously considering temporal and spatial information, a dynamic spatiotemporal neighborhood relationship is established to construct an adaptive spatiotemporal neighborhood feature learning loss function, which constrains the autoencoder so that it can capture dynamic spatiotemporal features.
[0022] Furthermore, the calculation process of the adaptive spatiotemporal neighborhood feature learning loss function includes:
[0023] Calculate adaptive time distance weights for the data;
[0024] Calculate the spatial distance of the data;
[0025] Based on the adaptive temporal distance weights and spatial distances of the data, the adaptive spatiotemporal manifold loss is calculated, and combined with the loss function of the autoencoder, the adaptive spatiotemporal neighborhood feature learning loss function is calculated.
[0026] Furthermore, calculating the adaptive time distance weights of the data includes:
[0027] Construct the time distance matrix $T_d$:
[0028]
[0029] where N represents the number of samples in the sample dataset; t represents the current time; $T_t$ represents the sampling time at the current moment; $T_i$ represents the sampling time at the i-th moment, with $i \in (t, N)$; $T_{t+1}$ represents the sampling time at the next moment; and $T_N$ represents the sampling time at the N-th moment.
[0030] Identify the local maxima and local minima in the current time series. Calculate the distances from adjacent maxima to maxima, adjacent maxima to minima, adjacent minima to maxima, and adjacent minima to minima. Calculate the Shannon entropy of each of the four resulting sets of extremum distances, and average the four Shannon entropies to obtain the attention entropy $H_{attention}$;
[0031] Calculate the time weight $T_w$:
[0032]
[0033] With $T_w$ as input to the self-attention network, the key value k and the query value q are obtained respectively:
[0034] $k = W_k T_w + b_k$
[0035] $q = W_q T_w + b_q$
[0036] where $W_k$ and $W_q$ represent the weight matrices of the network, and $b_k$ and $b_q$ represent the bias vectors of the network;
[0037] Calculate attention score (sim):
[0038]
[0039] Where T represents the transpose of the matrix; D is the dimension of k;
[0040] Calculate the attention weight β = softmax(sim), where softmax is an activation function, also known as the normalized exponential function, which maps the i-th score $s_i$ to $\mathrm{softmax}(s_i) = e^{s_i} / \sum_j e^{s_j}$;
[0041] Calculate the adaptive time distance weight $S_T = \beta T_d$.
[0042] Furthermore, the formula for calculating the spatial distance of the data is expressed as:
[0043]
[0044] where the quantity being defined is the spatial distance between $x_i$ and $x_{ij}$; $x_i$ represents the i-th sample point in the sample dataset; $x_{ij}$ represents the j-th sample point within the temporal neighborhood of $x_i$; the temporal neighborhood of $x_i$ consists of the K sample points preceding $x_i$ and the K sample points following $x_i$, where K is a preset integer value.
[0045] Furthermore, the adaptive spatiotemporal manifold loss is calculated based on the adaptive temporal distance weights and spatial distances of the data, and combined with the loss function of the autoencoder, an adaptive spatiotemporal neighborhood feature learning loss function is calculated, including:
[0046] In the original space formed by the input data, the heat kernel function is used to calculate the conditional probability of the spatiotemporal relationship between the nearest neighbors of the data:
[0047]
[0048] where $p_{i|j}$ represents the relationship between point j and point i within a neighborhood centered at point i; the formula uses the spatial distance between $x_i$ and $x_{ij}$ and the time distance weight between $x_i$ and $x_{ij}$; $x_i$ represents the i-th sample point in the sample dataset; $x_{ij}$ represents the j-th sample point within the temporal neighborhood of $x_i$; the temporal neighborhood of $x_i$ consists of the K sample points preceding $x_i$ and the K sample points following $x_i$, where K is a preset integer value; $\sigma_i$ is the normalization factor, determined as follows:
[0049]
[0050] Define the joint probability $p_{ij}$ as:
[0051] $p_{ij} = (p_{j|i} + p_{i|j}) - p_{j|i}\,p_{i|j}$
[0052] where $p_{j|i}$ represents the relationship between point i and point j within a neighborhood centered at point j.
[0053] Calculate the spatiotemporal relationship probability between each sample point in the target embedding space formed by the features and the sample points in its temporal neighborhood; for the i-th sample point $z_i$ in the target embedding space and the j-th sample point $z_{ij}$ in its temporal neighborhood, the spatiotemporal relationship probability $q_{ij}$ is expressed as:
[0054]
[0055] where the two quantities in the formula are, respectively, the spatial distance between $z_i$ and $z_{ij}$ and the adaptive time-adjusted weight; a and b are parameters used to fit the function; the temporal neighborhood of $z_i$ consists of the K sample points preceding $z_i$ and the K sample points following $z_i$, where K is a preset integer value;
[0056] Construct the adaptive manifold regularization loss function $\mathrm{Loss}_{\mathrm{ASMR}}$:
[0057]
[0058] where N represents the number of samples in the sample dataset; $p_{ij}(x)$ represents the spatiotemporal relationship calculated on the original input data x; $q_{ij}(z)$ represents the spatiotemporal relationship calculated on the features z;
[0059] Calculate the adaptive spatiotemporal neighborhood feature learning loss function $\mathrm{Loss}_{\mathrm{ASMRAE}}$:
[0060] $\mathrm{Loss}_{\mathrm{ASMRAE}} = \mathrm{Loss}_{\mathrm{MSE}} + \alpha\,\mathrm{Loss}_{\mathrm{ASMR}}$
[0061] where $\mathrm{Loss}_{\mathrm{MSE}}$ is the loss function of the autoencoder, and α is a preset weight parameter.
[0062] Furthermore, the step of training the adaptive spatiotemporal neighborhood feature learning autoencoder using the sample dataset includes:
[0063] The training data in the sample dataset is divided into multiple batches and is not shuffled during training, with each batch sorted by time.
[0064] Initialize the parameters of the adaptive spatiotemporal neighborhood feature learning autoencoder network, input each batch of data into the adaptive spatiotemporal neighborhood feature learning autoencoder in sequence to obtain feature data and reconstructed data, calculate the adaptive spatiotemporal neighborhood feature learning loss function, and update the network parameters through backpropagation.
[0065] Furthermore, the industrial process monitoring based on the trained adaptive spatiotemporal neighborhood feature learning autoencoder includes:
[0066] Based on the trained adaptive spatiotemporal neighborhood feature learning autoencoder, use the sample dataset to construct the $T^2$ statistic for monitoring the feature space and the SPE statistic for monitoring the residual space;
[0067] Calculate the control limits of the $T^2$ statistic and the SPE statistic using the kernel density estimation method;
[0068] Collect production process data from the current industrial process, input the data into the trained adaptive spatiotemporal neighborhood feature learning autoencoder, and calculate the corresponding $T^2$ statistic and SPE statistic from the autoencoder's output;
[0069] If the $T^2$ statistic or the SPE statistic corresponding to the production process data of the current industrial process exceeds its control limit, the current industrial process is judged to be abnormal; otherwise, it is judged to be normal.
[0070] In another aspect, the present invention also provides an electronic device comprising a processor and a memory; wherein the memory stores at least one instruction, which is loaded and executed by the processor to implement the above-described method.
[0071] In another aspect, the present invention also provides a computer-readable storage medium storing at least one instruction, which is loaded and executed by a processor to implement the above-described method.
[0072] The beneficial effects of the technical solution provided by this invention include at least the following:
[0073] The present invention can maintain the spatiotemporal topology of data during the feature extraction process of the autoencoder. The adaptive spatiotemporal neighborhood structure calculation method can more effectively extract spatiotemporal neighborhood information in the data. At the same time, the probability-based method can adapt to the uncertainty of real industrial process data and improve the accuracy of industrial process monitoring.
Attached Figure Description
[0074] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0075] Figure 1 is a structural diagram of the adaptive spatiotemporal neighborhood feature learning autoencoder provided in the embodiments of the present invention;
[0076] Figure 2 is a schematic diagram of the process for industrial process monitoring using the adaptive spatiotemporal neighborhood feature learning autoencoder provided in an embodiment of the present invention;
[0077] Figure 3 is a system block diagram of the electronic device provided in the embodiments of the present invention.
Detailed Implementation
[0078] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
[0079] First, it should be noted that in the embodiments of the present invention, the words "exemplarily," "for example," etc., are used to indicate that they are examples, illustrations, or descriptions. Any embodiment or design scheme described as "exemplary" in the present invention should not be construed as being more preferred or advantageous than other embodiments or design schemes. Specifically, the use of the term "exemplarily" is intended to present the concept in a specific manner. Furthermore, in the embodiments of the present invention, the meaning expressed by "and / or" can be both, or it can be either one or the other.
[0080] First Embodiment
[0081] To address the problem of inaccurate monitoring results caused by the lack of neighborhood structure information in the features extracted by existing industrial process monitoring methods, this embodiment considers both temporal and spatial information to construct dynamic spatiotemporal neighborhood relationships. Constraints are imposed on input data and features to capture spatiotemporal neighborhood information. An adaptive spatiotemporal neighborhood feature learning autoencoder is proposed, which probabilistically describes the topological structure between data points, enhancing tolerance to process uncertainties while preserving the spatiotemporal neighborhood structure to ensure data topology consistency and improve the model's monitoring capabilities. Furthermore, dynamic time weights are considered, adaptively adjusting the spatial topology using temporal information to cope with the effects of noise and fluctuations, thus more accurately describing the spatiotemporal neighborhood structure. The generated elements contain comprehensive spatiotemporal neighborhood information, thereby improving the model's monitoring performance. Based on this, this embodiment proposes an industrial process monitoring method based on an adaptive spatiotemporal neighborhood feature learning autoencoder, which includes:
[0082] S1, collect production process data from historical industrial processes and construct a sample dataset;
[0083] Specifically, in this embodiment, the implementation process of S1 is as follows: Production process data during normal production in historical industrial processes are collected in chronological order, and the collected production process data is standardized to obtain raw data with zero mean and unit variance, forming a sample dataset. This method is widely applicable to various large-scale industrial processes, such as chemical processes, aluminum electrolysis production processes, and blast furnace ironmaking processes. Taking the production process of vinyl acetate as an example, data on 29 monitoring variables are typically collected, including feed rate, flow rate at various locations and instruments, liquid level, pressure, temperature, and concentration.
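The standardization in S1 can be sketched as follows. This is a minimal illustration assuming NumPy and a hypothetical array of time-ordered samples (rows) by process variables (columns); the function name `standardize` and the guard for constant variables are assumptions, not part of the described method.

```python
import numpy as np

def standardize(train, eps=1e-12):
    """Scale each column (process variable) to zero mean and unit variance."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std < eps, 1.0, std)  # assumed guard for constant variables
    return (train - mean) / std, mean, std

# Hypothetical data: 5 time-ordered samples of 2 process variables.
raw = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]])
scaled, mean, std = standardize(raw)
```

The returned `mean` and `std` would typically be kept and applied unchanged to data collected online, so that monitoring samples are scaled in the same way as the training samples.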
[0084] S2, taking into account both temporal and spatial information, establishes dynamic spatiotemporal neighborhood relationships to impose constraints on the input data and features of the autoencoder, and constructs an adaptive spatiotemporal neighborhood feature learning autoencoder;
[0085] Specifically, the structure of the adaptive spatiotemporal neighborhood feature learning autoencoder in this embodiment is shown in Figure 1, and its process of learning probabilistic neighborhood features is as follows:
[0086] First, an autoencoder model is used to learn the basic features of the input data through reconstruction constraints. Then, the uniform manifold approximation and projection (UMAP) method is employed to calculate the spatial topology between neighboring data points in the temporal neighborhood, for both the input layer and the hidden layer. An attention mechanism is introduced to learn the influence of temporal distance on the topology, enabling dynamic adjustment of the spatial topology and generating a topology containing spatiotemporal information, which is then converted into a joint probability distribution. The similarity between the two joint probability distributions is measured using a cross-entropy variant and incorporated into the loss function to constrain the autoencoder, enabling it to capture dynamic spatiotemporal features.
[0087] Specifically, the construction process of the adaptive spatiotemporal neighborhood feature learning autoencoder is as follows:
[0088] S21, Temporal Neighborhood Selection: For a dataset sorted by time, its temporal neighborhood is defined as follows:
[0089]
[0090] where DT denotes the temporal neighborhood of sample $x_i$, and 2K represents the length of the neighborhood. K must be set manually according to the data characteristics of the particular industrial process, and its optimal value is obtained through repeated experiments.
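As an illustration of the temporal neighborhood selection in S21, the sketch below returns the indices of the K samples before and the K samples after sample i in a time-sorted dataset. The function name and the truncation of the neighborhood at the start and end of the dataset are assumptions; the text does not say how boundary samples are handled.

```python
def temporal_neighborhood(i, n_samples, k):
    """Indices of up to k samples before and up to k samples after sample i."""
    before = range(max(0, i - k), i)
    after = range(i + 1, min(n_samples, i + k + 1))
    return list(before) + list(after)

# Interior sample: the full neighborhood of length 2K.
interior = temporal_neighborhood(5, 10, 2)
# Boundary sample: the neighborhood is truncated (an assumed convention).
boundary = temporal_neighborhood(0, 10, 2)
```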
[0091] S22, construct an autoencoder network;
[0092] An autoencoder typically consists of an encoder and a decoder with the same number of network layers. For input data $x_i$, the feature $z_i$ is obtained by passing $x_i$ through the hidden layers of the encoder, and the reconstructed data are then obtained by passing $z_i$ through the hidden layers of the decoder. The autoencoder learns the characteristics of the data by minimizing the reconstruction error. For a dataset of N samples, the loss function can be written as:
[0093]
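A minimal sketch of the encoder, decoder, and reconstruction loss in S22 is given below. It assumes NumPy, a single hidden layer with a tanh activation, and a loss averaged over the N samples; the layer sizes, the activation, and the exact normalization of the loss are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_autoencoder(n_inputs, n_features):
    """Randomly initialise a one-hidden-layer encoder and decoder."""
    return {
        "W_enc": rng.normal(scale=0.1, size=(n_inputs, n_features)),
        "b_enc": np.zeros(n_features),
        "W_dec": rng.normal(scale=0.1, size=(n_features, n_inputs)),
        "b_dec": np.zeros(n_inputs),
    }

def forward(params, x):
    """Encode x into features z, then decode z into the reconstruction."""
    z = np.tanh(x @ params["W_enc"] + params["b_enc"])
    x_hat = z @ params["W_dec"] + params["b_dec"]
    return z, x_hat

def mse_loss(x, x_hat):
    """Mean over samples of the squared reconstruction error."""
    return float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))

x = rng.normal(size=(8, 4))          # hypothetical batch: 8 samples, 4 variables
params = init_autoencoder(4, 2)
z, x_hat = forward(params, x)
loss = mse_loss(x, x_hat)
```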
[0094] S23, Calculate the adaptive time distance weight of the data; specifically including:
[0095] S231, Construct the time distance matrix $T_d$:
[0096]
[0097] This matrix describes the sampling intervals between sampling points. When the data fluctuate strongly within a time period, samples that are close in time become less correlated, so the influence of neighboring points should be reduced, i.e., the time distance between neighboring points should be increased, and vice versa. Here, N represents the number of samples in the sample dataset; t represents the current time; $T_t$ represents the sampling time at the current moment; $T_i$ represents the sampling time at the i-th moment, with $i \in (t, N)$; $T_{t+1}$ represents the sampling time at the next moment; and $T_N$ represents the sampling time at the N-th moment.
[0098] S232, To measure the impact of data volatility on the time weights in different time periods, attention entropy is used. Specifically, the local maxima and local minima within the current time series (the data within the temporal neighborhood DT defined above) are first identified as key points. The distances from adjacent maxima to maxima, adjacent maxima to minima, adjacent minima to maxima, and adjacent minima to minima are then calculated, where the distance between two extrema is the difference between their sampling times. For a given variable, these four sets of time-difference distances are computed over all sample points within the range, the Shannon entropy of each set is calculated, and the four entropies are averaged to obtain the attention entropy of that variable. Performing this operation for each variable at each sample point and averaging the results gives the overall attention entropy $H_{attention}$.
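The attention entropy of a single variable can be sketched as follows. This is an illustrative reading of S232 under stated assumptions: sampling is uniform, so index differences stand in for sampling-time differences; extrema are strict local extrema; "adjacent" means the next later extremum of the target kind; and the entropy uses the natural logarithm. All function names are hypothetical.

```python
import math
from collections import Counter

def local_extrema(series):
    """Indices of strict local maxima and minima of a 1-D sequence."""
    maxima, minima = [], []
    for i in range(1, len(series) - 1):
        if series[i - 1] < series[i] > series[i + 1]:
            maxima.append(i)
        elif series[i - 1] > series[i] < series[i + 1]:
            minima.append(i)
    return maxima, minima

def shannon_entropy(values):
    """Shannon entropy of the empirical distribution of the given values."""
    if not values:
        return 0.0
    total = len(values)
    return -sum((c / total) * math.log(c / total) for c in Counter(values).values())

def intervals(src, dst):
    """Distance from each index in src to the next later index in dst."""
    out = []
    for s in src:
        later = [d for d in dst if d > s]
        if later:
            out.append(later[0] - s)
    return out

def attention_entropy(series):
    """Average entropy of the max-max, max-min, min-max and min-min intervals."""
    maxima, minima = local_extrema(series)
    sets = [intervals(maxima, maxima), intervals(maxima, minima),
            intervals(minima, maxima), intervals(minima, minima)]
    return sum(shannon_entropy(s) for s in sets) / 4.0

regular = attention_entropy([0, 1, 0, 1, 0, 1, 0, 1, 0])
irregular = attention_entropy([0, 1, 0, 2, 1, 0, 3, 0, 1, 0])
```

A perfectly periodic series has identical intervals in every set and therefore zero entropy, while an irregular series gives a positive value. For multivariate data, the function would be applied to each variable and the results averaged, as the paragraph above describes.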
[0099] S233, Calculate the time weight $T_w$:
[0100]
[0101] S234, With $T_w$ as input to the self-attention network, the key value k and the query value q are obtained respectively:
[0102] $k = W_k T_w + b_k$ (5)
[0103] $q = W_q T_w + b_q$ (6)
[0104] where $\{W_k, b_k\}$ and $\{W_q, b_q\}$ represent the weight matrices and bias vectors of the network, respectively;
[0105] S235, the attention score sim is obtained by calculating the similarity between the key value and the query value:
[0106]
[0107] Where T represents the transpose of the matrix; D is the dimension of k;
[0108] S236, Calculate the attention weight β:
[0109] β = softmax(sim) (8)
[0110] Here, softmax is an activation function, also known as the normalized exponential function, which maps the i-th score $s_i$ to $\mathrm{softmax}(s_i) = e^{s_i} / \sum_j e^{s_j}$.
[0111] S237, Calculate the adaptive time-distance weight $S_T$:
[0112] $S_T = \beta T_d$ (9)
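Steps S234 to S237 can be sketched as below, assuming NumPy. Several points are assumptions rather than details confirmed by the text: the attention score is taken to be the usual scaled dot product of query and key divided by the square root of D; a row-vector convention (`t_w @ w_k + b_k`) replaces the column form of equations (5) and (6); and the example matrices `t_w` and `t_d` are hypothetical placeholders, since the formulas for $T_w$ and $T_d$ are not reproduced here.

```python
import numpy as np

def softmax(scores, axis=-1):
    """Numerically stable normalized exponential along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def adaptive_time_weights(t_w, t_d, w_k, b_k, w_q, b_q):
    """Return S_T = beta @ T_d and the attention weights beta."""
    k = t_w @ w_k + b_k                      # keys, cf. equation (5)
    q = t_w @ w_q + b_q                      # queries, cf. equation (6)
    sim = q @ k.T / np.sqrt(k.shape[-1])     # assumed scaled dot-product score
    beta = softmax(sim, axis=-1)             # equation (8)
    return beta @ t_d, beta                  # equation (9)

rng = np.random.default_rng(0)
n, d = 4, 3
t_w = rng.random((n, n))                                   # hypothetical time weights
t_d = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)  # hypothetical
w_k, w_q = rng.normal(size=(n, d)), rng.normal(size=(n, d))
b_k, b_q = np.zeros(d), np.zeros(d)
s_t, beta = adaptive_time_weights(t_w, t_d, w_k, b_k, w_q, b_q)
```

In training, `w_k`, `b_k`, `w_q`, and `b_q` would be learned together with the autoencoder parameters rather than drawn at random.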
[0113] S24, Calculate the spatial distance of the data: points that are closer in time should receive more attention, while points that are farther apart in time should be relatively ignored even if they are spatially close. Therefore, for a sample point $x_i$ and a sample point $x_{ij}$ in the temporal neighborhood of $x_i$, the spatial distance between them is calculated as follows:
[0114]
[0115] S25, Calculate the adaptive spatiotemporal manifold loss; specifically including:
[0116] S251, In the original space formed by the input data, the heat kernel function is used to calculate the conditional probability $p_{i|j}$ of the spatiotemporal relationship between the nearest neighbors of the data:
[0117]
[0118] where the formula uses the time distance weight between $x_i$ and $x_{ij}$, and $\sigma_i$ is the normalization factor, determined as follows:
[0119]
[0120] S252, To ensure the symmetry of the probabilities, the joint probability $p_{ij}$ is defined as:
[0121] $p_{ij} = (p_{j|i} + p_{i|j}) - p_{j|i}\,p_{i|j}$ (13)
[0122] where $p_{i|j}$ represents the relationship between point j and point i within a neighborhood centered at point i, and $p_{j|i}$ represents the relationship between point i and point j within a neighborhood centered at point j. The joint probability $p_{ij}$ is designed so that the relationship between two points is unique and does not depend on the order of calculation.
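Equation (13) is the probabilistic "or" of the two conditional probabilities, the same symmetrization used in UMAP. The sketch below applies it to a small hypothetical matrix of conditional probabilities stored as nested lists; the function name and the example values are illustrative only.

```python
def joint_probability(p_cond):
    """Symmetrise conditional probabilities as in equation (13)."""
    n = len(p_cond)
    return [
        [p_cond[i][j] + p_cond[j][i] - p_cond[i][j] * p_cond[j][i] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical, asymmetric conditional probabilities between two points.
p_cond = [[0.0, 0.8],
          [0.5, 0.0]]
p_joint = joint_probability(p_cond)
```

Here both off-diagonal entries become 0.8 + 0.5 - 0.8 × 0.5 = 0.9, so the result no longer depends on which point is treated as the center, and it stays within [0, 1] whenever the inputs do.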
[0123] S253, Correspondingly, for a point $z_i$ in the target embedding space formed by the features and each point $z_{ij}$ in its temporal neighborhood, the probability of a spatiotemporal relationship between the two points is:
[0124]
[0125] where the two quantities in the formula are, respectively, the spatial distance between $z_i$ and $z_{ij}$ and the adaptive time-adjusted weight; a and b are parameters used to fit the function; the spatial distance is calculated in the same way as in formula (10), and the adaptive time-adjusted weight is calculated in the same way as in formula (9).
[0126] S254, To measure the consistency between the two probability distributions, an adaptive manifold regularization loss function $\mathrm{Loss}_{\mathrm{ASMR}}$ is constructed:
[0127]
[0128] The first term of the loss function can make the nearest neighbor points in the original space closer in the target embedding space, and the second term can make the non-nearest neighbor points in the original space farther away in the target embedding space.
[0129] where $p_{ij}(x)$ represents the spatiotemporal relationship calculated on the original input data x, and $q_{ij}(z)$ represents the spatiotemporal relationship calculated on the features z;
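The two-term structure described in [0128] matches the fuzzy-set cross-entropy used by UMAP, so the sketch below assumes that form: an attractive term $p \log(p/q)$ and a repulsive term $(1-p)\log((1-p)/(1-q))$, summed over neighbor pairs. The exact cross-entropy variant and its normalization in the described method may differ; the clipping constant `eps` and the flat lists of probabilities are illustrative assumptions.

```python
import math

def fuzzy_cross_entropy(p, q, eps=1e-12):
    """Sum of p*log(p/q) + (1-p)*log((1-p)/(1-q)) over paired probabilities."""
    total = 0.0
    for p_ij, q_ij in zip(p, q):
        p_c = min(max(p_ij, eps), 1.0 - eps)   # clip to avoid log(0)
        q_c = min(max(q_ij, eps), 1.0 - eps)
        attract = p_c * math.log(p_c / q_c)                          # pulls neighbors together
        repel = (1.0 - p_c) * math.log((1.0 - p_c) / (1.0 - q_c))    # pushes non-neighbors apart
        total += attract + repel
    return total

p = [0.9, 0.1]                 # hypothetical p_ij(x) for two neighbor pairs
matched = fuzzy_cross_entropy(p, p)
mismatched = fuzzy_cross_entropy(p, [0.2, 0.7])
```

The loss is zero when the feature-space probabilities reproduce the input-space probabilities and grows as the two topologies diverge, which is the behavior the regularizer relies on.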
[0130] S255, Calculate the adaptive spatiotemporal neighborhood feature learning loss function $\mathrm{Loss}_{\mathrm{ASMRAE}}$:
[0131] $\mathrm{Loss}_{\mathrm{ASMRAE}} = \mathrm{Loss}_{\mathrm{MSE}} + \alpha\,\mathrm{Loss}_{\mathrm{ASMR}}$ (16)
[0132] Here, α is a parameter used to balance the sample reconstruction loss and the manifold constraint loss.
[0133] S3, Train the adaptive spatiotemporal neighborhood feature learning autoencoder using the sample dataset;
[0134] Specifically, in this embodiment, S3 is implemented as follows: First, the training data is divided into multiple batches that are not shuffled during training, with the data in each batch sorted by time. The adaptive spatiotemporal neighborhood feature learning autoencoder from step S2 is then built and its network parameters are initialized. The normalized data $x = [x_1, x_2, \ldots, x_N]^T$ is fed into the network to obtain the features $z = [z_1, z_2, \ldots, z_N]^T$ and the reconstructed data. The loss function is calculated according to equation (16), and the network parameters are updated through backpropagation.
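The batching rule described above (time-sorted batches, no shuffling) can be sketched as follows. The `Sample` layout, the function name `time_ordered_batches`, and the batch size are illustrative assumptions; the network itself and the backpropagation step are outside the scope of this sketch.

```python
from typing import Iterator, List, Sequence, Tuple

Sample = Tuple[float, List[float]]  # (sampling time, measured variables)


def time_ordered_batches(samples: Sequence[Sample],
                         batch_size: int) -> Iterator[List[Sample]]:
    """Yield consecutive batches in chronological order, never shuffled."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ordered = sorted(samples, key=lambda s: s[0])  # sort by sampling time
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


# Made-up samples given out of order; the batches come back in time order.
data = [(3.0, [0.3]), (1.0, [0.1]), (2.0, [0.2]), (5.0, [0.5]), (4.0, [0.4])]
batches = list(time_ordered_batches(data, batch_size=2))
```

Keeping each batch contiguous in time means the temporal neighbors of a sample are available inside the same batch when the neighborhood loss is evaluated.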
[0135] S4, Perform industrial process monitoring based on the trained adaptive spatiotemporal neighborhood feature learning autoencoder;
[0136] Specifically, in this embodiment, as shown in Figure 2, S4 is implemented as follows:
[0137] S41, Based on the trained adaptive spatiotemporal neighborhood feature learning autoencoder, construct the $T^2$ statistic and the SPE statistic using the sample dataset; where,
[0138] The $T^2$ statistic is used to monitor the feature space and is calculated as follows:
[0139]
[0140] Where $\Sigma_z$ represents the variance of $z$.
[0141] The SPE statistic is used to monitor the residual space, and it is calculated as follows:
[0142]
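A minimal sketch of the two monitoring statistics, assuming the forms commonly used in process monitoring: $T^2 = z^T \Sigma_z^{-1} z$ for mean-centered features, with $\Sigma_z$ taken as the sample covariance matrix of the training features, and SPE as the squared norm of the reconstruction residual. These forms, the helper names, and the numbers are assumptions for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def covariance(rows: Matrix) -> Matrix:
    """Sample covariance matrix of the training features (rows = samples)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    return [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a @ y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def t2_statistic(z: Vector, sigma_z: Matrix) -> float:
    """ASSUMED form: T^2 = z^T * inverse(Sigma_z) * z, with z mean-centered."""
    y = solve(sigma_z, z)
    return sum(zi * yi for zi, yi in zip(z, y))


def spe_statistic(x: Vector, x_hat: Vector) -> float:
    """ASSUMED form: SPE = squared norm of the residual x - x_hat."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


t2 = t2_statistic([2.0, 0.0], [[4.0, 0.0], [0.0, 1.0]])
spe = spe_statistic([1.0, 2.0], [0.0, 0.0])
```

Solving the linear system instead of inverting $\Sigma_z$ explicitly keeps the sketch short and avoids forming the inverse matrix.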
[0143] S42, Calculate the control limits of the $T^2$ statistic and the SPE statistic using kernel density estimation; if a statistic exceeds its control limit during online monitoring, this indicates that a fault has occurred in the industrial process.
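One way to obtain a control limit from kernel density estimation is sketched below. The source only states that kernel density estimation is used; the Gaussian kernel, the normal-reference rule-of-thumb bandwidth, the 99 % confidence level, the function names, and the training values are all assumptions.

```python
import math
from typing import Sequence


def kde_cdf(value: float, samples: Sequence[float], bandwidth: float) -> float:
    """CDF of a Gaussian kernel density estimate evaluated at `value`."""
    scale = bandwidth * math.sqrt(2.0)
    total = sum(0.5 * (1.0 + math.erf((value - s) / scale)) for s in samples)
    return total / len(samples)


def kde_control_limit(samples: Sequence[float], confidence: float = 0.99) -> float:
    """Threshold at which the estimated CDF reaches `confidence`.

    ASSUMPTIONS: Gaussian kernel, bandwidth 1.06 * std * n^(-1/5), and a
    default confidence level of 0.99.
    """
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    bandwidth = 1.06 * std * n ** (-0.2)
    if bandwidth <= 0.0:
        raise ValueError("samples must not all be identical")
    low = min(samples) - 10.0 * bandwidth
    high = max(samples) + 10.0 * bandwidth
    for _ in range(100):  # bisection on the monotone CDF
        mid = 0.5 * (low + high)
        if kde_cdf(mid, samples, bandwidth) < confidence:
            low = mid
        else:
            high = mid
    return high


train_spe = [0.5 + 0.01 * k for k in range(100)]  # made-up training statistics
limit = kde_control_limit(train_spe)
```

The same function can be applied separately to the training values of each statistic, giving one limit for $T^2$ and one for SPE.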
[0144] S43, Collect production process data from the current industrial process, input it into the trained adaptive spatiotemporal neighborhood feature learning autoencoder, and calculate the $T^2$ statistic and the SPE statistic corresponding to the production process data of the current industrial process based on the output of the autoencoder;
[0145] S44, If the $T^2$ statistic or the SPE statistic corresponding to the production process data of the current industrial process exceeds its control limit, the current industrial process is judged to be abnormal; otherwise, the current industrial process is judged to be normal.
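The decision rule of S44 reduces to a comparison of each statistic with its control limit. In the sketch below the function name `is_abnormal`, the limits, and the online values are made up for illustration.

```python
def is_abnormal(t2: float, spe: float, t2_limit: float, spe_limit: float) -> bool:
    """Return True if either statistic exceeds its control limit (S44)."""
    return t2 > t2_limit or spe > spe_limit


# Made-up control limits and (T^2, SPE) pairs for three online samples.
T2_LIMIT, SPE_LIMIT = 9.0, 2.5
online = [(3.1, 0.8), (12.4, 1.0), (2.0, 3.7)]
flags = [is_abnormal(t2, spe, T2_LIMIT, SPE_LIMIT) for t2, spe in online]
```

The second sample is flagged through the feature space and the third through the residual space, which is why both statistics are monitored together.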
[0146] In summary, this embodiment provides an adaptive spatiotemporal manifold regularized autoencoder, designs a novel loss function that probabilistically describes the topological structure between data points, enhances tolerance to process uncertainties, and preserves the spatiotemporal neighborhood structure, ensuring the consistency of the data topology. Furthermore, it designs a novel dynamic temporal distance weight that can adaptively adjust the spatial topology using temporal information, thereby addressing the impact of noise and fluctuations, more accurately describing the spatiotemporal neighborhood structure, and improving the model's monitoring performance.
[0147] Second Embodiment
[0148] This embodiment provides an electronic device. As shown in Figure 3, the electronic device includes a processor and a memory, which can be connected via a communication bus; the memory stores at least one instruction, which is loaded and executed by the processor to implement the method of the first embodiment described above. The electronic device may also include a transceiver, which can be connected to the processor via a communication bus and is used to communicate with other devices.
[0149] Each component of the electronic device is described in detail below with reference to Figure 3:
[0150] The processor is the control center of the electronic device. The electronic device may include multiple processors, each of which can be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). The term "processor" can refer to a single processor or a collective term for multiple processing elements. For example, a processor can be one or more central processing units (CPUs), other general-purpose processors, application-specific integrated circuits (ASICs), or one or more integrated circuits configured to implement embodiments of the present invention, such as one or more digital signal processors (DSPs), one or more field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor can be a microprocessor or any conventional processor. The processor can perform various functions of the electronic device by running or executing software programs stored in memory and by calling data stored in memory.
[0151] In a specific implementation, as one example, the processor may include one or more CPUs, such as CPU0 and CPU1 shown in Figure 3, which are merely illustrative.
[0152] The memory is used to store the software program that executes the solution of the present invention, and the processor controls its execution. For specific implementation methods, please refer to the above method embodiments, which will not be repeated here.
[0153] Optionally, the memory may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but is not limited thereto. The memory may be integrated with the processor, or may exist independently and be coupled to the processor through an interface circuit of the electronic device (not shown in Figure 3); this embodiment of the invention does not impose specific limitations on this.
[0154] The transceiver may include a receiver and a transmitter (not shown separately in Figure 3). The receiver implements the receiving function, and the transmitter implements the transmitting function. The transceiver may be integrated with the processor, or may exist independently and be coupled to the processor through an interface circuit of the electronic device (not shown in Figure 3); this embodiment of the invention does not specifically limit this.
[0155] In addition, it should be noted that the structure of the electronic device shown in Figure 3 does not limit the device. Actual devices may include more or fewer components than shown, combine certain components, or arrange components differently. For the technical effects achieved by this electronic device when performing the method of the first embodiment, reference can be made to the technical effects described in the first embodiment; they are not repeated here.
[0156] Third Embodiment
[0157] This embodiment provides a computer-readable storage medium storing at least one instruction, which is loaded and executed by a processor to implement the method of the first embodiment described above. The computer-readable storage medium may be a ROM, random access memory, CD-ROM, magnetic tape, floppy disk, or optical data storage device, etc. The instruction stored therein can be loaded and executed by a processor in a terminal.
[0158] Furthermore, it should be noted that the present invention can be provided as a method, apparatus, or computer program product. Therefore, embodiments of the present invention can take the form of an entirely or partially hardware embodiment, an entirely or partially software embodiment, or an embodiment combining software and hardware aspects. Moreover, when implemented in software, embodiments of the present invention can take the form of a computer program product implemented on one or more computer-usable storage media containing computer-usable program code. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are produced. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any usable medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. A semiconductor medium can be a solid-state drive (SSD).
[0159] Embodiments of the present invention are described with reference to flowchart illustrations and / or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, executed by the processor of the computer or other programmable data processing terminal device, produce means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0160] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram. These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device so that a series of operational steps is performed on the computer or other programmable terminal device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0161] It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. The terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising a..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes said element. Furthermore, the term "and / or" merely describes an association between related objects and indicates that three relationships can exist. For example, A and / or B can represent: A alone, both A and B, or B alone, where A and B can be singular or plural. Additionally, the character " / " in this text generally indicates an "or" relationship between the preceding and following objects, but it can also indicate an "and / or" relationship, depending on the context. "At least one" refers to one or more items, while "multiple" refers to two or more items. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or multiple items. For example, at least one of a, b, or c can be represented as: a, b, c, ab, ac, bc, or abc, where a, b, and c can each be single or multiple.
[0162] Furthermore, it is understood that in various embodiments of the present invention, the order of the above-mentioned process numbers does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
[0163] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.
[0164] In the several embodiments provided by this invention, it should be understood that the disclosed devices, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. For instance, the division of functional modules / units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the shown or discussed mutual couplings or direct couplings or communication connections may be through some interfaces; indirect couplings or communication connections between devices or units may be electrical, mechanical, or other forms. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e., they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs. Additionally, the functional units in the various embodiments of this invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
[0165] If the method is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0166] Finally, it should be noted that the above description is merely a preferred embodiment of the present invention. It should be pointed out that although preferred embodiments of the present invention have been described, those skilled in the art, once they understand the basic inventive concept of the present invention, can make several improvements and modifications without departing from the principles described herein. These improvements and modifications should also be considered within the scope of protection of the present invention. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of the embodiments of the present invention.
Claims
1. An industrial process monitoring method, characterized in that it comprises: collecting production process data from historical industrial processes to construct a sample dataset; simultaneously considering temporal and spatial information and establishing a dynamic spatiotemporal neighborhood relationship to impose constraints on the input data and features of an autoencoder, thereby constructing an adaptive spatiotemporal neighborhood feature learning autoencoder; training the adaptive spatiotemporal neighborhood feature learning autoencoder using the sample dataset; and performing industrial process monitoring based on the trained adaptive spatiotemporal neighborhood feature learning autoencoder. Constructing the adaptive spatiotemporal neighborhood feature learning autoencoder includes: constructing an autoencoder, wherein the autoencoder learns the features of the data by minimizing the reconstruction error; and, simultaneously considering temporal and spatial information, establishing a dynamic spatiotemporal neighborhood relationship to construct an adaptive spatiotemporal neighborhood feature learning loss function, which constrains the autoencoder so that it can capture dynamic spatiotemporal features. The calculation of the adaptive spatiotemporal neighborhood feature learning loss function includes: calculating the adaptive time distance weights of the data; calculating the spatial distance of the data; and, based on the adaptive time distance weights and the spatial distance of the data, calculating the adaptive spatiotemporal manifold loss and combining it with the loss function of the autoencoder to obtain the adaptive spatiotemporal neighborhood feature learning loss function.
Calculating the adaptive time distance weights of the data includes: constructing the time distance matrix by a formula in which $N$ represents the number of samples in the sample dataset and the remaining symbols denote the current moment, the sampling time at the current moment, the sampling time at the $i$-th moment, the sampling time at the next moment, and the sampling time at the $N$-th moment; identifying the local maxima and local minima within the current time series; calculating the distances from adjacent maxima to maxima, adjacent maxima to minima, adjacent minima to maxima, and adjacent minima to minima; calculating the Shannon entropy of each of the four resulting extreme-distance datasets and averaging the four Shannon entropies to obtain the attention entropy; calculating the time weights; taking the time weights as the input of a self-attention network to obtain the key and the query, respectively, each computed with a weight matrix and a bias vector of the network; calculating the attention score by a formula in which $T$ represents the transpose of a matrix and the remaining symbol denotes a dimension; calculating the attention weights using an activation function; and calculating the adaptive time distance weights.
2. The industrial process monitoring method as described in claim 1, characterized in that, The collection of production process data from historical industrial processes to construct a sample dataset includes: Production process data during normal production in historical industrial processes were collected in chronological order, and the collected production process data were standardized to obtain raw data with zero mean and unit variance, forming a sample dataset.
3. The industrial process monitoring method as described in claim 1, characterized in that, The adaptive spatiotemporal neighborhood feature learning autoencoder first uses the autoencoder to learn the basic features of the input data through reconstruction constraints. Then, it uses a unified manifold approximation and projection method to calculate the spatial topology between adjacent data points in the temporal neighborhood of the input layer and the hidden layer, and introduces an attention mechanism to learn the influence of temporal distance on the topology to achieve dynamic adjustment of the spatial topology, thereby generating a topology containing spatiotemporal information and converting it into a joint probability distribution. The similarity between the two joint probability distributions is measured by using a cross-entropy variant and introduced into the loss function to constrain the autoencoder so that it can capture dynamic spatiotemporal features.
4. The industrial process monitoring method as described in claim 1, characterized in that the spatial distance of the data is calculated by a formula in which: the distance term represents the spatial distance between $x_i$ and $x_{ij}$; $x_i$ represents the $i$-th sample point in the sample dataset; $x_{ij}$ represents the $j$-th sample point in the temporal neighborhood of $x_i$; and the temporal neighborhood of $x_i$ consists of the $K$ sample points located before $x_i$ and the $K$ sample points located after $x_i$, where $K$ is a preset integer value.
5. The industrial process monitoring method as described in claim 1, characterized in that calculating the adaptive spatiotemporal manifold loss based on the adaptive time distance weights and the spatial distance of the data, and combining it with the loss function of the autoencoder to obtain the adaptive spatiotemporal neighborhood feature learning loss function, includes: in the original space formed by the input data, using the heat kernel function to calculate the conditional probability of the spatiotemporal relationship between nearest-neighbor data points, where the conditional probability represents the relationship between point $i$ and a point $j$ within the neighborhood centered at point $i$; the distance term represents the spatial distance between $x_i$ and $x_{ij}$; $x_i$ represents the $i$-th sample point in the sample dataset; $x_{ij}$ represents the $j$-th sample point in the temporal neighborhood of $x_i$; the temporal neighborhood of $x_i$ consists of the $K$ sample points located before $x_i$ and the $K$ sample points located after $x_i$, where $K$ is a preset integer value; the weight term represents the time distance weight between $x_i$ and $x_{ij}$; and $\sigma_i$ is the normalization factor, determined by a formula in which $K$ is the preset integer value; defining the joint probability $p_{ij}$ from this conditional probability and the conditional probability that represents the relationship between point $j$ and a point $i$ within the neighborhood centered at point $j$; calculating the spatiotemporal relationship probability between a sample point in the target embedding space formed by the features and the sample points in its temporal neighborhood, where, for the $i$-th sample point $z_i$ in the target embedding space and the $j$-th sample point $z_{ij}$ in its temporal neighborhood, the probability of a spatiotemporal relationship is given by a formula in which the two quantities are, respectively, the spatial distance between $z_i$ and $z_{ij}$ and the adaptive time-adjusted weight, $a$ and $b$ are parameters used to fit the function, and the temporal neighborhood of $z_i$ consists of the $K$ sample points located before $z_i$ and the $K$ sample points located after $z_i$, where $K$ is the preset integer value; constructing the adaptive manifold regularization loss function $Loss_{ASMR}$, where $N$ represents the number of samples in the sample dataset, $p_{ij}(x)$ represents the spatiotemporal relationship calculated on the original input data $x$, and $q_{ij}(z)$ represents the spatiotemporal relationship calculated on the features $z$; and calculating the adaptive spatiotemporal neighborhood feature learning loss function $Loss_{ASMRAE} = Loss_{MSE} + \alpha\,Loss_{ASMR}$, where $Loss_{MSE}$ is the loss function of the autoencoder and $\alpha$ is a preset weight parameter.
6. The industrial process monitoring method as described in claim 1, characterized in that, The training of the adaptive spatiotemporal neighborhood feature learning autoencoder using the sample dataset includes: The training data in the sample dataset is divided into multiple batches and is not shuffled during training, with each batch sorted by time. Initialize the parameters of the adaptive spatiotemporal neighborhood feature learning autoencoder network, input each batch of data into the adaptive spatiotemporal neighborhood feature learning autoencoder in sequence to obtain feature data and reconstructed data, calculate the adaptive spatiotemporal neighborhood feature learning loss function, and update the network parameters through backpropagation.
7. The industrial process monitoring method as described in claim 1, characterized in that performing industrial process monitoring based on the trained adaptive spatiotemporal neighborhood feature learning autoencoder includes: based on the trained adaptive spatiotemporal neighborhood feature learning autoencoder, constructing, using the sample dataset, the $T^2$ statistic used to monitor the feature space and the SPE statistic used to monitor the residual space; calculating the control limits of the $T^2$ statistic and the SPE statistic, respectively, using kernel density estimation; collecting production process data from the current industrial process, inputting this data into the trained adaptive spatiotemporal neighborhood feature learning autoencoder, and calculating the $T^2$ statistic and the SPE statistic corresponding to the production process data of the current industrial process based on the output of the autoencoder; and, if the $T^2$ statistic or the SPE statistic corresponding to the production process data of the current industrial process exceeds its control limit, judging the current industrial process to be abnormal, and otherwise judging the current industrial process to be normal.