A batch process fast monitoring method based on multi-feature fusion and stacking

By collecting and fusing multiple features during penicillin fermentation and combining them with a Stacking ensemble model, the problems of incomplete feature extraction and poor model interpretability were solved, resulting in a higher fault identification rate and data understanding capability.

CN122221101A · Pending · Publication Date: 2026-06-16 · LIAONING UNIVERSITY OF TECHNOLOGY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
LIAONING UNIVERSITY OF TECHNOLOGY
Filing Date
2026-04-08
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies for industrial process fault monitoring suffer from problems such as imperfect feature extraction, poor model interpretability, and low fault identification rate, especially in the penicillin fermentation process where it is difficult to accurately monitor abnormal states.

Method used

A rapid batch process monitoring method based on multi-feature fusion and stacking is adopted. Data is collected by sensors, interpretable features and second-order interaction terms are added, and anomaly scores generated by multiple anomaly detection algorithms are fused in as further features. A Stacking ensemble model, consisting of multiple base learners and one meta-learner, is then trained to improve the accuracy and interpretability of the model.

Benefits of technology

It improves the accuracy of fault monitoring during penicillin fermentation, enhances the model's ability to understand data, and can intuitively explain the causes of faults, making it suitable for fault identification in complex industrial processes.



Abstract

The application discloses a rapid batch process monitoring method based on multi-feature fusion and stacking, comprising: step one, collecting historical operating data of the batch process with sensors and establishing an original dataset; step two, adding interpretable features to the original dataset and constructing second-order interaction terms to obtain a combined dataset; step three, computing anomaly scores on the original dataset with multiple anomaly detection algorithms and fusing them into the combined dataset as new features to obtain a fused dataset; step four, constructing a Stacking ensemble model and training it on the fused dataset to obtain a multi-feature fusion monitoring model; step five, inputting batch process data detected on site into the multi-feature fusion monitoring model to obtain the fault monitoring result. The method can overcome the problems of imperfect feature extraction, poor model interpretability, and low fault identification rate during monitoring, and improves the accuracy of fault monitoring in the penicillin fermentation process.

Description

Technical Field

[0001] This invention relates to a rapid batch process monitoring method based on multi-feature fusion and stacking, belonging to the field of industrial process monitoring technology.

Background Technology

[0002] Complex industrial processes are highly nonlinear, time-varying, and uncertain, and often require advanced techniques for monitoring and diagnosis. Industrial fault monitoring technology developed earlier outside China, particularly in aerospace, chemical engineering, metallurgy, and power generation, with many companies and research institutions making significant contributions. Early fault monitoring relied primarily on model-based methods built on physical or mathematical models, such as Kalman filtering, model predictive control (MPC), and linearized models. These methods establish a mathematical model of the system and detect and identify faults as deviations of the system from that model. However, they require high model accuracy and perform poorly when the system exhibits complex nonlinearities or unknown fault modes. Statistical methods, such as Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Partial Least Squares (PLS) regression, are also widely used in industrial fault monitoring. These methods identify anomalies in the data through dimensionality reduction and feature extraction, making them suitable for multivariate monitoring problems. For instance, PCA reduces redundancy by extracting the principal components of the data and identifies system deviations during detection.

[0003] Research on industrial fault monitoring in China started relatively late, but with the increasing intelligence and automation of industry, significant progress has been made in recent years. Firstly, data-driven methods are prevalent: most domestic research focuses on fault monitoring based on statistical analysis, machine learning, and deep learning. For example, Support Vector Machines (SVM), decision trees, and K-Nearest Neighbors (KNN) are widely used in equipment fault classification and diagnosis. Secondly, neural network-based deep learning is increasingly applied to fault monitoring in China, especially in the analysis of complex time-series data. Domestic researchers use methods such as Long Short-Term Memory networks (LSTM) and Convolutional Neural Networks (CNN) to automatically extract features from raw data and perform fault diagnosis. These methods have shown promise in anomaly detection and predictive maintenance for industrial processes. Thirdly, multimodal data fusion is a key area of research. Since industrial equipment often relies on multiple types of sensors (such as temperature, pressure, and vibration sensors), fusing data from different sensors has become a focus of domestic research. Multimodal data fusion methods, such as Principal Component Analysis (PCA) and deep fusion networks, are used to integrate multi-source information, thereby improving the accuracy and reliability of monitoring systems.

[0004] However, in actual production processes, problems such as missing data, difficulties in data annotation, and data class imbalance often exist. Industrial process fault monitoring not only requires high accuracy but also the ability to understand and interpret the model's decision-making process. Especially in some safety-critical application scenarios, it is essential to be able to interpret the model's output so that engineers can understand the causes of the faults and take appropriate measures.

Summary of the Invention

[0005] This invention designs and develops a rapid batch process monitoring method based on multi-feature fusion and stacking, which can overcome the problems of imperfect feature extraction, poor model interpretability, and low fault identification rate in the monitoring process, and improve the accuracy of fault monitoring in penicillin fermentation.

[0006] The technical solution provided by this invention is as follows:

[0007] A rapid batch process monitoring method based on multi-feature fusion and stacking includes:

[0008] Step 1: Collect historical operational data of the batch process using sensors to establish the original dataset;

[0009] Step 2: Add interpretable features and construct second-order interaction terms to the original dataset to obtain the combined dataset;

[0010] Step 3: Calculate anomaly scores on the original dataset using multiple anomaly detection algorithms, and then fuse these scores into the combined dataset as new features to obtain the fused dataset.

[0011] Step 4: Construct a Stacking ensemble model. Input the fused dataset into the Stacking ensemble model for training to obtain a multi-feature fusion monitoring model.

[0012] The Stacking ensemble model includes: multiple base learners and one meta-learner.

[0013] The meta-learner classifies based on probability values, and its expression is:

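[0014] $$P(y=1 \mid x) = \frac{1}{1 + e^{-(w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b)}}$$;

[0015] In the formula, $y$ is the output category label, where category 0 represents the normal state and category 1 represents the abnormal state; $x_1, x_2, \ldots, x_n$ are the input features; $w_1, w_2, \ldots, w_n$ are the weighting coefficients; and $b$ is the bias term;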

[0016] Step 5: Input the batch process data from the on-site inspection into the multi-feature fusion monitoring model to obtain the fault monitoring results.

[0017] Preferably, the batch process is a penicillin fermentation process.

[0018] Preferably, the interpretability features include:

[0019] Features derived from the process variables of penicillin production: the acid-base flow rate ratio, formed from the acid flow rate and the alkali flow rate; the total flow rate, formed from the acid flow rate, the alkali flow rate, and the cold water flow rate; and the CO2 / pH ratio, formed from the carbon dioxide concentration and the pH value.

[0020] Acid-base flow rate ratio = acid flow rate / (alkali flow rate + constant);

[0021] Total flow rate = acid flow rate + alkali flow rate + cold water flow rate;

[0022] CO2 / pH ratio = CO2 concentration / (pH value + constant);

[0023] The constant is $1 \times 10^{-6}$.

[0024] Preferably,

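[0025] For a given dataset with features $x_1, x_2, \ldots, x_n$, the generated second-order interaction features include:

[0026] The squared terms of all features: $x_i^2$, $i = 1, \ldots, n$; as well as

[0027] The pairwise interaction terms of all features: $x_i x_j$, $1 \le i < j \le n$.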

[0028] Preferably, the anomaly detection algorithm in step three includes:

[0029] Isolation Forest, Local Outlier Factor, and Elliptic Envelope.

[0030] Preferably, the meta-learner is a logistic regression model, which calculates a weighted sum of the input features and then maps it to a probability value between 0 and 1 using the sigmoid function.

[0031] Class prediction based on meta-learner probability values:

[0032] when the probability value reaches the classification threshold, the prediction result is category 1;

[0033] when the probability value is below the classification threshold, the prediction result is category 0.

[0034] Preferably, it also includes:

[0035] The monitoring results of the Stacking ensemble model are compared and evaluated with the monitoring results of the Voting ensemble strategy. The Voting ensemble strategy obtains the final prediction result by weighting or majority voting on the predictions of multiple base models.

[0036] The beneficial effects of this invention are as follows:

[0037] (1) Added interpretable features: In some complex industrial process application scenarios, by showing the relationship between variables, the output of the industrial model can be interpreted intuitively, so that engineers can understand the cause of the failure and take corresponding measures.

[0038] (2) Multi-level feature engineering captures the nonlinear relationships between features by generating second-order interactive features, thereby improving the model's expressive power and prediction accuracy. This method can better help the model understand the complex relationships in the data, and is especially suitable for complex systems in industrial processes.

[0039] (3) Fusion of multiple anomaly detection methods: Three different anomaly detection methods, Isolation Forest, Local Outlier Factor, and Elliptic Envelope, were used to calculate anomaly scores. By incorporating these scores as new features into the model, the advantages of different methods were combined to solve the class imbalance problem.

[0040] In the penicillin fermentation process, the acid-base flow rate ratio, total flow rate, and CO2 / pH ratio are used as practical interpretable features to enhance the model's data understanding and predictive performance. They help capture potential nonlinear relationships and key driving factors in the data, thereby improving the model's predictive ability.

Brief Description of the Drawings

[0041] Figure 1 is the overall flowchart of the rapid batch process monitoring method based on multi-feature fusion and stacking described in this invention.

[0042] Figure 2 is a schematic diagram of the Stacking ensemble learning framework described in this invention.

[0043] Figure 3 is the variable correlation heatmap described in this invention.

[0044] Figure 4(a) is a box plot showing the distribution of bacterial cell concentration under different abnormal conditions according to the present invention.

[0045] Figure 4(b) is a box plot showing the distribution of the product concentration under different abnormal conditions according to the present invention.

[0046] Figure 4(c) is a box plot showing the distribution of the culture medium volume under different abnormal conditions according to the present invention.

[0047] Figure 4(d) is a box plot showing the distribution of CO2 concentration under different abnormal conditions according to the present invention.

[0048] Figure 4(e) is a box plot showing the distribution of pH value under different abnormal conditions according to the present invention.

[0049] Figure 4(f) is a box plot showing the temperature distribution of the reaction vessel described in this invention under different abnormal conditions.

[0050] Figure 4(g) is a box plot showing the distribution of the reaction heat under different abnormal conditions according to the present invention.

[0051] Figure 4(h) is a box plot showing the distribution of the acid flow rate under different abnormal conditions according to the present invention.

[0052] Figure 4(i) is a box plot showing the distribution of the alkali flow rate under different abnormal conditions according to the present invention.

[0053] Figure 4(j) is a box plot showing the distribution of cold water flow velocity under different abnormal conditions according to the present invention.

[0054] Figure 5 is a schematic diagram comparing the accuracy of the various models described in this invention.

[0055] Figure 6(a) is a schematic diagram of Z-score anomaly detection according to the present invention.

[0056] Figure 6(b) is a schematic diagram of the IQR anomaly detection described in this invention.

[0057] Figure 6(c) is a schematic diagram of Mahalanobis distance anomaly detection according to the present invention.

[0058] Figure 6(d) is a schematic diagram of the Isolation Forest anomaly detection described in this invention.

Detailed Implementation

[0059] The present invention will now be described in further detail with reference to the accompanying drawings, so that those skilled in the art can implement it based on the description.

[0060] As shown in Figures 1 to 6, this invention provides a rapid batch process monitoring method based on multi-feature fusion and stacking, comprising:

[0061] Step 1: Collect historical operational data of the penicillin fermentation process using sensors to establish an original dataset;

[0062] The data collection time was 300 hours, and the sampling interval was 0.2 hours.

[0063] Step 2: Add interpretable features and construct second-order interaction terms to the original dataset to obtain the combined dataset;

[0064] Features derived from the process variables of penicillin production: the acid-base flow rate ratio, formed from the acid flow rate and the alkali flow rate; the total flow rate, formed from the acid flow rate, the alkali flow rate, and the cold water flow rate; and the CO2 / pH ratio, formed from the carbon dioxide concentration and the pH value.

[0065] Acid-base flow rate ratio = acid flow rate / (alkali flow rate + constant);

[0066] Total flow rate = acid flow rate + alkali flow rate + cold water flow rate;

[0067] CO2 / pH ratio = CO2 concentration / (pH value + constant).

[0068] The constant is $1 \times 10^{-6}$.

[0069] Step 3: Calculate anomaly scores on the original dataset using the Isolation Forest, Local Outlier Factor, and Elliptic Envelope algorithms, and then fuse these scores as new features into the combined dataset to obtain the fused dataset.

[0070] Step 4: Construct a Stacking ensemble model. Input the fused dataset into the Stacking ensemble model for training to obtain a multi-feature fusion monitoring model.

[0071] Stacking ensemble models consist of: multiple base learners and one meta-learner;

[0072] The meta-learner is a logistic regression model, which calculates a weighted sum of the input features and then uses the sigmoid function to map it to a probability value between 0 and 1.

[0073] The meta-learner classifies based on probability values, and its expression is:

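[0074] $$P(y=1 \mid x) = \frac{1}{1 + e^{-(w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b)}}$$;

[0075] In the formula, $y$ is the output category label, where category 0 represents the normal state and category 1 represents the abnormal state; $x_1, x_2, \ldots, x_n$ are the input features; $w_1, w_2, \ldots, w_n$ are the weighting coefficients; and $b$ is the bias term;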

[0076] Class prediction based on meta-learner probability values:

[0077] when the probability value reaches the classification threshold, the prediction result is category 1;

[0078] when the probability value is below the classification threshold, the prediction result is category 0.
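The decision rule of the meta-learner can be illustrated with a minimal NumPy sketch: a weighted sum of the inputs, a sigmoid, and a threshold on the resulting probability. The weights, bias, input values, and the 0.5 threshold below are illustrative assumptions; none of them are values taken from this application.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def meta_predict(x, w, b, threshold=0.5):
    """Return (probability of class 1, predicted label) for each row of x.

    threshold=0.5 is an assumed default, not a value from the source.
    """
    p = sigmoid(np.asarray(x) @ np.asarray(w) + b)
    return p, (p >= threshold).astype(int)

# Hypothetical inputs: each row holds class-1 probabilities from three base learners.
x = np.array([[0.9, 0.8, 0.7],
              [0.1, 0.2, 0.1]])
w = np.array([2.0, 2.0, 2.0])  # assumed weights
b = -3.0                       # assumed bias
p, label = meta_predict(x, w, b)
```

In this sketch the first row is assigned category 1 (abnormal) and the second category 0 (normal); in practice the weights and bias are learned during training.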

[0079] Step 5: Input the batch process data from the on-site inspection into the multi-feature fusion monitoring model to obtain the fault monitoring results.

[0080] The monitoring results of the Stacking ensemble model are compared and evaluated with those of the Voting ensemble strategy. The Voting ensemble strategy obtains the final prediction result by weighting or majority voting on the predictions of multiple base models.
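For comparison with Stacking, the two Voting variants mentioned above can be sketched in a few lines of NumPy. This is only an illustration of majority voting and weighted voting under assumed inputs; the predictions, probabilities, weights, and the 0.5 threshold are hypothetical.

```python
import numpy as np

def majority_vote(labels):
    """labels: (n_models, n_samples) array of 0/1 predictions."""
    labels = np.asarray(labels)
    # A sample is class 1 when more than half of the models predict class 1.
    return (labels.sum(axis=0) * 2 > labels.shape[0]).astype(int)

def weighted_vote(probas, weights, threshold=0.5):
    """probas: (n_models, n_samples) class-1 probabilities; weights: (n_models,)."""
    probas = np.asarray(probas, dtype=float)
    weights = np.asarray(weights, dtype=float)
    avg = weights @ probas / weights.sum()  # weighted mean probability per sample
    return (avg >= threshold).astype(int)

# Hypothetical outputs of three base models on three samples.
labels = [[1, 0, 1],
          [1, 0, 0],
          [0, 1, 0]]
probas = [[0.9, 0.4, 0.6],
          [0.8, 0.3, 0.4],
          [0.2, 0.9, 0.3]]
hard = majority_vote(labels)
soft = weighted_vote(probas, weights=[1.0, 1.0, 2.0])
```

Unlike Stacking, neither variant learns how to combine the base models: the combination rule is fixed in advance, which is the difference the comparison is meant to expose.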

[0081] Example

[0082] This method was applied to the penicillin fermentation process.

[0083] The penicillin fermentation process is a typical dynamic, nonlinear, multi-stage process that can be divided into two stages: 1. The Penicillium adaptation and growth stage: after a brief period of adaptation, the mold begins to grow and reproduce rapidly, accompanied by rapid consumption of the glucose substrate. 2. The penicillin synthesis stage: in this stage, the Penicillium begins to produce penicillin. To maintain rapid and efficient synthesis of penicillin, glucose must be fed to the fermenter during this stage.

[0084] Step 1: Data Preparation

[0085] Abnormal batches were generated using MATLAB 2014 and the PenSim 2.0 simulator, with a simulation duration of 300 hours. Because the penicillin fermentation process is intense, the sampling interval was set to 0.2 hours. All batches were simulated under closed-loop control of pH and temperature, with glucose feeding in open loop. Under default initial settings, the simulator added a small amount of white noise to simulate normal operating conditions. The generated dataset included both normal and abnormal data. Penicillin anomaly detection was treated as a binary classification task for machine learning, with label 0 representing the normal state and label 1 representing the abnormal state. Since no glucose was fed to the fermenter in stage 1, the substrate feed rate was 0, so step or ramp faults could not be introduced in stage 1. To illustrate the data distribution, quartiles, and outliers, and to help understand the central tendency and dispersion of the data, box plots are provided in Figures 4(a) to 4(j).

[0086] Figure 4(a) shows the cell concentration, which reflects the growth status of the microorganisms and is an important biological indicator of the fermentation process. Its distribution changes significantly under abnormal conditions, indicating strong sensitivity to faults and making it an important monitoring variable.

[0087] Figure 4(b) shows the product concentration, which reflects the penicillin yield and is a core quality indicator. It distinguishes abnormal states well and can be used as a key output variable for fault identification.

[0088] Figure 4(c) shows the culture medium volume, which reflects the system's material balance and operating conditions. Abnormal fluctuations in the culture medium volume may be an important sign of system instability.

[0089] Figure 4(d) shows the CO2 concentration, which reflects the intensity of microbial metabolism (respiration). It differs significantly between the abnormal and normal states and is one of the important, sensitive features for identifying abnormalities.

[0090] Figure 4(e) shows the pH value, which reflects the acid-base balance of the system and is one of the controlled variables. It responds markedly to abnormal states, indicating that the acid-base balance has been disrupted.

[0091] Figure 4(f) shows the reactor temperature, which reflects the reaction rate and microbial activity. Temperature anomalies directly affect the reaction process, making temperature an important process control variable.

[0092] Figure 4(g) shows the heat of reaction, which reflects the reaction intensity and energy changes. Changes in the heat of reaction can be used to identify energy anomalies in the system.

[0093] Figure 4(h) shows the acid flow rate, an important input variable for pH regulation. Changes in it reflect the system's regulatory behavior and play an auxiliary role in anomaly detection.

[0094] Figure 4(i) shows the alkali flow rate, which regulates pH together with the acid flow rate. Changes in it reflect system imbalance and are an important indirect indicator of anomalies.

[0095] Figure 4(j) shows the cold water flow rate; abnormal changes in it indicate a problem with the temperature control (cooling) system.

[0096] Box plot analysis of the process variables under normal and abnormal conditions revealed significant differences in how strongly different features respond to abnormal conditions. Specifically, CO2 concentration, pH value, cell concentration, and product concentration exhibited significant distribution shifts and dispersion changes between the two states, indicating that these variables are highly sensitive to system anomalies and discriminate them well. Variables such as culture medium volume, temperature, and heat of reaction mainly showed an increased fluctuation range, reflecting a decrease in operational stability. The box plot analysis shows that different types of variables respond differently under abnormal conditions, which provides data support for multi-feature fusion modeling and supports the use of fused multi-source information for fault detection.

[0097] Step 2: Add interpretable features and construct second-order interaction terms to the original dataset to obtain the combined dataset;

[0098] Based on the existing original features, new interpretable features are created to enhance the model's data understanding and predictive performance. These new features typically reflect key indicators and interrelationships in the production process.

[0099] Specifically, it includes:

[0100] Acid-base flow rate ratio, total flow rate, and CO2 / pH ratio;

[0101] The calculation formulas are as follows:

[0102] Acid-base flow rate ratio = acid flow rate / (alkali flow rate + small constant);

[0103] Total flow rate = acid flow rate + alkali flow rate + cold water flow rate;

[0104] CO2 / pH ratio = CO2 concentration / (pH value + small constant).

[0105] This ratio helps the model understand the relationship between CO2 concentration and solution pH. These features are created based on domain knowledge and are designed to capture potential nonlinear relationships and key drivers in the data, thereby enhancing the model's predictive power.
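The three formulas above translate directly into a few column operations. The following pandas sketch is one possible way to compute them; the column names and sample values are hypothetical, and only the small constant 1e-6 comes from the description.

```python
import pandas as pd

EPS = 1e-6  # small constant from the formulas above; avoids division by zero

def add_interpretable_features(df):
    """Return a copy of df with the three derived features appended.

    Column names are hypothetical; rename them to match the real dataset.
    """
    out = df.copy()
    out["acid_base_ratio"] = out["acid_flow"] / (out["alkali_flow"] + EPS)
    out["total_flow"] = out["acid_flow"] + out["alkali_flow"] + out["cold_water_flow"]
    out["co2_ph_ratio"] = out["co2"] / (out["ph"] + EPS)
    return out

# Two made-up samples; the first has zero acid and alkali flow.
sample = pd.DataFrame({
    "acid_flow": [0.0, 0.2],
    "alkali_flow": [0.0, 0.4],
    "cold_water_flow": [10.0, 12.0],
    "co2": [1.0, 1.5],
    "ph": [5.0, 5.1],
})
features = add_interpretable_features(sample)
```

The first sample shows why the constant is needed: with an alkali flow rate of zero the ratio would otherwise be undefined, whereas here it evaluates to 0.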

[0106] In experiments with large amounts of data, the relationships between variables are often not directly apparent. Therefore, a correlation heatmap, shown in Figure 3, is used to help understand the relationships between features. Highly correlated features may exhibit multicollinearity, affecting model stability and interpretability. Correlation analysis can identify highly correlated features, providing a basis for subsequent feature selection and dimensionality reduction. Understanding the relationships between features also helps interpret the model's predictions and improves its interpretability.

[0107] Figure 3 displays the degree of correlation between the features in the dataset. The correlation value ranges from -1 to 1, and the intensity of the color indicates the strength of the correlation, specifically as follows: Dark red: indicates a high positive correlation (correlation coefficient close to 1), meaning that as one variable increases, the other also increases. Dark blue: indicates a high negative correlation (correlation coefficient close to -1), meaning that as one variable increases, the other decreases. Near-white: indicates that there is almost no correlation between the two variables (correlation coefficient close to 0).

[0108] The values on the main diagonal are all 1, because each variable is perfectly correlated with itself. The correlation between cell concentration and product concentration is close to 1 (0.99), indicating that these two variables are highly positively correlated and may be key features driving anomalies or certain reactions. The correlation between culture medium volume and heat of reaction is also high (0.99), indicating that changes in the heat of reaction are closely related to the culture medium volume. Among the negatively correlated variables, the alkali flow rate and the cold water flow rate show a moderate negative correlation (-0.58), which may indicate that the alkali flow rate tends to decrease when the cold water flow rate increases. CO2 concentration and cell concentration also show a moderate negative correlation (-0.50), which may reflect that the cell concentration decreases when the carbon dioxide concentration increases, possibly because of certain reaction conditions or the environment. The anomaly label shows some positive or negative correlation with multiple features. For example, its correlation with cell concentration is 0.54, indicating that an increase in cell concentration may be associated with an abnormal state. Its correlation with product concentration is 0.73, indicating that this feature is strongly correlated with the abnormal state. Its correlations with reactor temperature and acid flow rate are -0.27 and -0.33, respectively, indicating that these variables may behave differently under abnormal conditions. Therefore, highly correlated variables (such as cell concentration and product concentration) can serve as important reference variables in anomaly detection and model building. Negatively correlated variables (such as alkali flow rate and cold water flow rate) show inverse patterns of change whose impact on the system can be analyzed further. The correlations between the anomaly label and multiple features also provide clues for anomaly detection; they can help determine which variables are likely to be the main factors behind the anomalies.
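The correlation analysis behind the heatmap can be sketched with pandas. The snippet below computes the Pearson correlation matrix and lists strongly correlated feature pairs; the synthetic data, the column names, and the 0.9 cut-off are assumptions made for illustration and do not reproduce the values in Figure 3.

```python
import numpy as np
import pandas as pd

def strong_pairs(df, threshold=0.9):
    """List feature pairs whose absolute Pearson correlation meets the threshold."""
    corr = df.corr()
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):  # upper triangle only, skip the diagonal
            r = corr.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], round(float(r), 3)))
    return pairs

# Synthetic stand-in data (not PenSim output).
rng = np.random.default_rng(0)
biomass = rng.normal(size=500)
demo = pd.DataFrame({
    "biomass": biomass,
    "product": biomass + 0.05 * rng.normal(size=500),  # nearly collinear with biomass
    "temperature": rng.normal(size=500),               # independent of both
})
pairs = strong_pairs(demo)
```

Pairs returned by such a check are candidates for feature selection or dimensionality reduction, since they carry largely redundant information.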

[0109] In machine learning and data science, feature engineering is a key step in improving model performance. This is especially true in complex chemical processes involving numerous variables and intricate nonlinear relationships, where single linear features often fail to effectively capture hidden patterns in the data. Generating second-order polynomial features is a method to enhance the expressive power of models, helping to better understand and model the complex behaviors in chemical processes.

[0110] Second-order interaction features are features generated by polynomial expansion of the original features. Specifically, they are new features obtained by multiplying pairs of features (or a feature by itself). Simply put:

[0111] For a given dataset with features $x_1, x_2, \ldots, x_n$, the generated second-order interaction features include:

[0112] The squared terms of all features: $x_i^2$, $i = 1, \ldots, n$; as well as

[0113] The pairwise interaction terms of all features: $x_i x_j$, $1 \le i < j \le n$;

[0114] In chemical processes, many variables are highly correlated, and these relationships are often non-linear. Therefore, generating second-order interaction features can help models better capture these complex relationships.

[0115] Step 3: Calculate anomaly scores on the original dataset using the Isolation Forest, Local Outlier Factor, and Elliptic Envelope algorithms, and then fuse these scores as new features into the combined dataset to obtain the fused dataset.

[0116] Multiple anomaly detection methods are employed to calculate anomaly scores, combining the advantages of various algorithms to improve the robustness and accuracy of anomaly detection. By combining the scores from multiple anomaly detection methods, the classification model can more comprehensively understand anomaly patterns in the data, reducing false positives and false negatives. For example, some samples may be identified as anomalies in Isolation Forest but not in LOF; by combining this information, the classification model can make more accurate judgments. Different anomaly detection algorithms are based on different assumptions and mechanisms, providing diverse anomaly information. Integrating this diverse information into the classification model can enhance its robustness to different data distributions and anomaly types, improving its generalization ability.

[0117] The specific methods are as follows. Isolation Forest: builds tree structures by randomly selecting features and split values, so that outlier samples are isolated on shorter paths; it is suitable for high-dimensional data and identifies outliers effectively. Local Outlier Factor: based on the local density of samples, it identifies samples whose density is significantly lower than that of their neighbors as outliers; it is suitable for data distributions with complex shapes. Elliptic Envelope: based on the Mahalanobis distance and assuming the data follows a Gaussian distribution, it constructs an envelope to identify outliers; it is suitable for multivariate normally distributed data.
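A minimal sketch of how the three detectors could produce score features with scikit-learn is shown below. It is an illustration under assumptions rather than the implementation used in this application: the hyperparameters are library defaults, the data is synthetic, and for all three columns a lower score means a more anomalous sample.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

def anomaly_score_features(X, random_state=0):
    """Return an (n_samples, 3) array of anomaly scores; lower = more anomalous."""
    iso = IsolationForest(random_state=random_state).fit(X)
    lof = LocalOutlierFactor(n_neighbors=20).fit(X)
    env = EllipticEnvelope(random_state=random_state).fit(X)
    return np.column_stack([
        iso.score_samples(X),          # Isolation Forest score
        lof.negative_outlier_factor_,  # negated Local Outlier Factor
        env.score_samples(X),          # negative Mahalanobis distance
    ])

# Synthetic stand-in data: 200 samples, the first 5 shifted far from the rest.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:5] += 8.0
scores = anomaly_score_features(X)
fused = np.hstack([X, scores])  # scores appended as new feature columns
```

Here the scores are appended to the raw features only for brevity; in the method described above they are fused into the combined dataset that already contains the interpretable and second-order features.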

[0118] As shown in Figure 6(a), Z-score anomaly detection is a classic method that relies on assumptions about the statistical distribution and is primarily used to identify samples that differ significantly from normal data patterns. It assumes that process variables approximately follow a stable distribution under normal operating conditions and uses the mean and standard deviation to describe their central tendency and fluctuation range, thereby quantifying how anomalous a single sample is. As penicillin fermentation proceeds, the cell concentration and the product concentration both increase; over this period the Z-score labels each sample as normal (0) or anomalous (1).

[0119] As shown in Figure 6(b), the IQR (Interquartile Range) anomaly detection method is a robust anomaly identification technique based on the statistical characteristics of the data distribution, and is particularly suitable for process data that is noisy or non-Gaussian. This method uses quartiles instead of the mean and variance to characterize the central interval of the data, so it is insensitive to extreme values and relatively stable.
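The Z-score and IQR rules can be written compactly in NumPy. The thresholds below (2.5 standard deviations in the demo, and 1.5 times the IQR) are conventional choices assumed for illustration; the source does not state which thresholds were used, and the sample values are made up.

```python
import numpy as np

def zscore_flags(x, k=3.0):
    """Flag samples more than k standard deviations from the mean (1 = anomaly)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return (np.abs(z) > k).astype(int)

def iqr_flags(x, k=1.5):
    """Flag samples outside [Q1 - k*IQR, Q3 + k*IQR] (1 = anomaly)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return ((x < q1 - k * iqr) | (x > q3 + k * iqr)).astype(int)

# Nine readings near 5.0 and one obvious outlier.
values = np.array([5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, 9.0])
z_flags = zscore_flags(values, k=2.5)
i_flags = iqr_flags(values)
```

With these ten values both rules flag only the last reading. The outlier inflates the mean and standard deviation, so its Z-score stays just under 3 and the usual 3-sigma rule would miss it, which illustrates why the quartile-based rule is described as more robust to extreme values.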

[0120] As shown in Figure 6(c), the Mahalanobis anomaly detection method is an anomaly identification technique that uses multivariate statistical distance to measure the degree to which a sample deviates from the normal data distribution in the feature space. It is suitable for multivariate process data where variables are correlated. This distance represents the degree of multidimensional deviation between the sample and the center of the normal data distribution. The larger the distance, the more obvious the deviation between the sample and the normal pattern, and the more likely it is to be an anomaly.
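As a sketch of the Mahalanobis distance computation, the following NumPy function measures how far each query sample lies from the centre of a reference set of normal data, taking the correlation between variables into account. The synthetic data and the use of a pseudo-inverse are assumptions made for the example.

```python
import numpy as np

def mahalanobis_distances(X, reference):
    """Distance of each row of X from the centre of `reference` (normal data)."""
    reference = np.asarray(reference, dtype=float)
    mu = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse tolerates near-singular covariance
    diff = np.asarray(X, dtype=float) - mu
    # Quadratic form diff_i^T * inv_cov * diff_i for every row i.
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))

# Synthetic normal data with two positively correlated variables.
rng = np.random.default_rng(0)
normal = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=500)
queries = np.array([[1.0, 1.0],    # consistent with the positive correlation
                    [1.0, -1.0]])  # contradicts it
d = mahalanobis_distances(queries, normal)
```

Both query points are equally far from the centre in Euclidean terms, but the second one receives a clearly larger Mahalanobis distance because it violates the correlation structure of the normal data.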

[0121] As shown in Figure 6(d), Isolation Forest (IF) is an unsupervised anomaly detection method based on the idea of random partitioning. It recursively partitions the sample space by constructing multiple independent random isolation trees. When each isolation tree is built, a feature f is randomly selected, and a split point p is randomly selected within the value range of this feature to divide the samples into left and right subsets; this is repeated until each sample is isolated in its own subset or the preset tree depth is reached.
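
As a non-limiting illustration, the random partitioning described above can be sketched in simplified form. Instead of storing whole trees and subsampling the data, this sketch follows only the branch that contains the sample of interest and counts the splits needed to isolate it; the function names, the depth limit of 10, and the synthetic data are assumptions of this example.

```python
import random

def path_length(point, data, rng, depth=0, max_depth=10):
    """Number of random splits needed to isolate `point` within `data`."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    f = rng.randrange(len(point))            # randomly selected feature f
    lo = min(row[f] for row in data)
    hi = max(row[f] for row in data)
    if lo == hi:                             # simplification: stop if f cannot split
        return depth
    p = rng.uniform(lo, hi)                  # randomly selected split point p
    # Keep only the subset (left or right) that contains the point
    if point[f] < p:
        subset = [row for row in data if row[f] < p]
    else:
        subset = [row for row in data if row[f] >= p]
    return path_length(point, subset, rng, depth + 1, max_depth)

def mean_path_length(point, data, n_trees=200, seed=0):
    """Average path length of `point` over n_trees random partitionings."""
    rng = random.Random(seed)
    return sum(path_length(point, data, rng) for _ in range(n_trees)) / n_trees

# Synthetic data: a cluster around the origin, one central sample, one far outlier
data_rng = random.Random(1)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(50)]
inlier, outlier = (0.0, 0.0), (8.0, 8.0)
data += [inlier, outlier]

inlier_len = mean_path_length(inlier, data)
outlier_len = mean_path_length(outlier, data)
print(inlier_len, outlier_len)
```

The outlier is isolated after fewer splits on average than the central sample, which is the property an Isolation Forest converts into an anomaly score.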

[0122] The scores from the aforementioned anomaly detection methods are used as new features and integrated into the original feature set. These scores provide anomaly information from different perspectives, enriching the model's input features and helping to improve the model's ability to identify anomalous samples.
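
As a non-limiting illustration, the fusion of anomaly scores into the feature set can be sketched with scikit-learn, using the three detectors named in paragraph [0117]. The random placeholder data, the variable names, and all hyperparameters are assumptions of this example and do not reproduce the data or settings of the present application.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X_original = rng.normal(size=(200, 4))                  # placeholder for the original dataset
X_combined = np.hstack([X_original, X_original ** 2])   # placeholder for the combined dataset

# Fit each detector on the original dataset
iso = IsolationForest(random_state=0).fit(X_original)
lof = LocalOutlierFactor(n_neighbors=20).fit(X_original)
env = EllipticEnvelope(random_state=0).fit(X_original)

# One score column per detector
scores = np.column_stack([
    iso.score_samples(X_original),   # lower = more anomalous
    lof.negative_outlier_factor_,    # lower = more anomalous (training samples only)
    env.mahalanobis(X_original),     # squared Mahalanobis distance, larger = more anomalous
])

# Append the scores to the combined dataset as new features
X_fused = np.hstack([X_combined, scores])
print(X_fused.shape)
```

In this sketch LocalOutlierFactor scores only the samples it was fitted on; scoring newly collected data would require constructing it with novelty=True and calling score_samples.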

[0123] Step 4: Construct a Stacking ensemble model. Input the fused dataset into the Stacking ensemble model for training to obtain a multi-feature fusion monitoring model.

[0124] The Stacking ensemble model comprises multiple base learners and one meta-learner;

[0125] By combining the prediction results of multiple base models, the overall model's performance and generalization ability are improved.

[0126] The specific process includes: Base model selection: Random Forest Classifier, Extra Trees Classifier, and Gradient Boosting Classifier are selected as base models, leveraging their advantages in different dimensions. Fusion model: Logistic Regression is used as the final fusion model, combining the prediction results of the base models to generate the final prediction output. Stacked classifiers can effectively utilize the diversity of base models and improve the overall performance of the model. The overall flowchart is shown in Figure 1.
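
As a non-limiting illustration, the Stacking structure described above can be sketched with scikit-learn's StackingClassifier. The synthetic dataset, the train/test split, the number of trees, and the 5-fold cross-validation used to generate the meta-learner's training inputs are assumptions of this example and are not taken from the present application.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the fused dataset (0 = normal, 1 = abnormal)
X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Base learners named in the description
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# Logistic Regression as the meta-learner, fed with base-model probabilities
model = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    stack_method="predict_proba",
    cv=5,  # out-of-fold predictions are used to train the meta-learner
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
# One meta-learner coefficient per base learner (binary case)
meta_weights = dict(zip(["rf", "et", "gb"], model.final_estimator_.coef_[0]))
print(accuracy, meta_weights)
```

The entries of meta_weights are the logistic regression coefficients applied to each base learner's predicted probability of class 1, so they give a direct reading of how strongly each base learner influences the final decision.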

[0127] The meta-learner is a logistic regression model, which calculates a weighted sum of the input features and then maps it to a probability value between 0 and 1 using the sigmoid function.

[0128] Meta-learners classify based on probability values.

[0129] Based on experimental verification and considering the characteristics of the penicillin production process, and in order to avoid overfitting, this experiment first tested the accuracy of each individual machine learning model and standardized the results (as shown in the figure). Through repeated trials, the three best-performing machine learning models were combined (hereinafter referred to as GRE) and ensemble learning was performed. As shown in Figure 5, the accuracy of the different fault detection models varies significantly. The Gradient Boosting model performs best, with an accuracy of 0.9607, followed by Random Forest (0.9333) and Extra Trees (0.9133), while Logistic Regression performs worst, with an accuracy of only 0.7000.

[0130] Meta-model selection

[0131] Logistic regression is chosen because it is a simple linear classification model with relatively straightforward formulas and principles. As a meta-model, logistic regression effectively handles the predictions (usually probability values or class labels) of the different base models, which helps to avoid overfitting and improve generalization when generating the final prediction result. Furthermore, this method primarily utilizes the relationship between features and ensemble learning. In ensemble learning, the decision-making process of multiple base models may be difficult to explain, whereas logistic regression offers better interpretability through its linear structure. By examining the coefficients of the logistic regression model, one can understand the importance of each base model to the final decision.

[0132] Linear component (the core of logistic regression)

[0133] This part is a linear model that represents the weighted sum of all input features, denoted $z$.

[0134] The sigmoid function, which maps the linear output to a probability value in [0, 1], is expressed as follows:

[0135] $$P(y=1\mid x)=\frac{1}{1+e^{-z}},\qquad z=w_1x_1+w_2x_2+\cdots+w_nx_n+b;$$

[0136] In the formula, $y$ is the output category label, where category 0 represents the normal state and category 1 represents the abnormal state; $x=(x_1,x_2,\ldots,x_n)$ are the input features; $w_1$, $w_2$, $\ldots$, $w_n$ are the weighting coefficients; and $b$ is the bias term;

[0137] Class prediction based on the probability value of the meta-learner:

[0138] when $P(y=1\mid x)\ge 0.5$, the prediction result is category 1;

[0139] when $P(y=1\mid x)<0.5$, the prediction result is category 0.

[0140] In simple terms, the logistic regression model calculates a weighted sum of the input features (i.e., z), then uses the sigmoid function to map it to a probability value between 0 and 1, and classifies the data based on this probability value.
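
As a non-limiting illustration, the calculation performed by the meta-learner can be sketched as follows. The weights, the bias, the input values (standing in for the probabilities output by three base learners), and the decision threshold of 0.5 are hypothetical values chosen for this example.

```python
import math

def sigmoid(z):
    """Map a real number to a probability value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def meta_predict(features, weights, bias, threshold=0.5):
    """Return (probability of category 1, predicted category)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias  # weighted sum z
    p = sigmoid(z)
    return p, (1 if p >= threshold else 0)

# Hypothetical coefficients of the meta-learner
weights = [2.0, 1.5, 2.5]
bias = -3.0

# Hypothetical base-learner probabilities for two samples
p_fault, label_fault = meta_predict([0.90, 0.80, 0.95], weights, bias)
p_ok, label_ok = meta_predict([0.10, 0.20, 0.05], weights, bias)
print(p_fault, label_fault)
print(p_ok, label_ok)
```

For the first sample z = 2.375, giving a probability of about 0.91 and category 1; for the second sample z = -2.375, giving a probability of about 0.09 and category 0.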

[0141] Step 5: Input the batch process data from the on-site inspection into the multi-feature fusion monitoring model to obtain the fault monitoring results.

[0142] Stacking is a powerful ensemble learning method that combines multiple base models of different types and uses a meta-learner to further optimize the prediction results. Its advantages include improved prediction accuracy, reduced overfitting, enhanced generalization ability, and the ability to capture complex feature interactions. Stacking performs particularly well on complex datasets, especially when the diversity and performance of the base models are well balanced. However, Stacking also has limitations, such as high computational cost, risk of overfitting, and long training times. Therefore, when using Stacking, it is crucial to select appropriate models and optimize the training process to maximize its effectiveness. The Stacking ensemble learning framework is shown in Figure 2.

[0143] The Stacking Classifier uses Random Forest Classifier, Extra Trees Classifier, and Gradient Boosting Classifier as base models and Logistic Regression as the second-level (meta) model, so that the overall model's performance and generalization ability are improved through the fusion of multiple models. Through the diversity of the base models and the combining capability of the fusion model, the Stacking Classifier further enhances the accuracy and stability of the predictions.

[0144] Voting, an ensemble strategy, improves overall prediction performance by combining the predictions of multiple models. Its main advantages are simplicity, ease of understanding, improved prediction accuracy, reduced overfitting, and enhanced robustness and stability. Voting is highly adaptable, accommodating combinations of various base learners, and is particularly suitable for integrating diverse models. However, Voting also has limitations, such as high computational cost and performance dependence on the quality of the base models.

[0145] Comparative experiment

[0146] The monitoring results of the Stacking ensemble model are compared and evaluated with those of the Voting ensemble strategy. The Voting ensemble strategy obtains the final prediction result by weighting or majority voting on the predictions of multiple base models.
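
As a non-limiting illustration, majority voting and weighted voting can be sketched as follows. The function names, the probabilities, the weights, the 0.5 threshold, and the tie-breaking rule (a tie is resolved as category 0) are assumptions of this example.

```python
def hard_vote(labels):
    """Majority vote over base-model class labels (0 or 1); a tie gives 0."""
    return 1 if sum(labels) * 2 > len(labels) else 0

def soft_vote(probabilities, weights=None, threshold=0.5):
    """Weighted average of the base-model probabilities of category 1."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    avg = sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)
    return 1 if avg >= threshold else 0

# Hypothetical probabilities of category 1 from three base models
probs = [0.40, 0.45, 0.95]

hard = hard_vote([1 if p >= 0.5 else 0 for p in probs])  # votes are 0, 0, 1
soft = soft_vote(probs)                                   # unweighted mean is 0.60
weighted = soft_vote(probs, weights=[4.0, 4.0, 1.0])      # weighted mean is about 0.48
print(hard, soft, weighted)
```

The three strategies disagree on this sample (0, 1, and 0 respectively), which shows that the result of Voting depends on fixed, manually chosen combination rules, whereas the Stacking meta-learner learns the combination from data.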

[0147] To ensure the reliability of the experiment, a Voting classifier was trained under the same processing conditions. The classification reports of the two methods, including key indicators such as precision, recall, and F1 score, were compared (as shown in the figure), which further verified the reliability of the experiment.

[0148] Table 1 Comparison results of different algorithms

[0149]

[0150] As can be seen from Table 1, our method significantly outperforms the Voting method in terms of Precision, Recall, and F1-score. This indicates that our method, through multi-feature fusion and the secondary learning of the Stacking ensemble strategy, can effectively integrate information, correct the bias of a single model, and enhance the model's ability to understand data and its predictive performance.

[0151] Stacked classifiers, by introducing a secondary learning process from the meta-classifier, can more intelligently combine the predictions of multiple base models, thereby improving overall prediction performance. Especially in complex and high-dimensional data, it can effectively reduce errors and compensate for the weaknesses of individual models. Voting classifiers, on the other hand, are a simpler ensemble method that uses simple weighting or majority voting on the predictions of multiple base models. While they can perform well in some cases, their performance is generally inferior to stacked classifiers, especially when dealing with complex tasks. Therefore, stacked classifiers typically outperform voting classifiers, particularly when there is a high diversity of base models or when the task is complex, as they better integrate the advantages of multiple models.

[0152] Although embodiments of the present invention have been disclosed above, they are not limited to the applications listed in the specification and embodiments. They can be applied to various fields suitable for the present invention. For those skilled in the art, other modifications can be easily made. Therefore, without departing from the general concept defined by the claims and their equivalents, the present invention is not limited to the specific details and illustrations shown and described herein.

Claims

1. A rapid batch process monitoring method based on multi-feature fusion and stacking, characterized in that it comprises: Step 1: collecting historical operating data of the batch process using sensors to establish an original dataset; Step 2: adding interpretable features to the original dataset and constructing second-order interaction terms to obtain a combined dataset; Step 3: calculating anomaly scores on the original dataset using multiple anomaly detection algorithms, and fusing these scores into the combined dataset as new features to obtain a fused dataset; Step 4: constructing a Stacking ensemble model, and inputting the fused dataset into the Stacking ensemble model for training to obtain a multi-feature fusion monitoring model; the Stacking ensemble model includes multiple base learners and one meta-learner; the meta-learner classifies based on probability values, and its expression is: $$P(y=1\mid x)=\frac{1}{1+e^{-z}},\qquad z=w_1x_1+w_2x_2+\cdots+w_nx_n+b;$$ in the formula, $y$ is the output category label, with category 0 representing the normal state and category 1 representing the abnormal state; $x=(x_1,x_2,\ldots,x_n)$ are the input features; $w_1$, $w_2$, $\ldots$, $w_n$ are the weighting coefficients; and $b$ is the bias term; Step 5: inputting the batch process data from on-site detection into the multi-feature fusion monitoring model to obtain the fault monitoring results.

2. The rapid batch process monitoring method based on multi-feature fusion and stacking according to claim 1, characterized in that the batch process is a penicillin fermentation process.

3. The rapid batch process monitoring method based on multi-feature fusion and stacking according to claim 2, characterized in that the interpretable features are constructed from process variables in penicillin production and include: the acid-alkali flow rate ratio, formed from the acid flow rate and the alkali flow rate; the total flow rate, formed from the acid flow rate, the alkali flow rate, and the cold water flow rate; and the CO2 / pH ratio, formed from the carbon dioxide concentration and the pH value; acid-alkali flow rate ratio = acid flow rate / (alkali flow rate + constant); total flow rate = acid flow rate + alkali flow rate + cold water flow rate; CO2 / pH ratio = CO2 concentration / (pH value + constant); the constant value is $1\times10^{-6}$.

4. The rapid batch process monitoring method based on multi-feature fusion and stacking according to claim 3, characterized in that, for a given dataset $X=\{x_1,x_2,\ldots,x_n\}$, the generated second-order interaction features include: the squared terms of all features, $x_i^2$ ($i=1,\ldots,n$); and the interaction terms of all features, $x_ix_j$ ($1\le i<j\le n$).

5. The rapid batch process monitoring method based on multi-feature fusion and stacking according to claim 4, characterized in that the anomaly detection algorithms in Step 3 include: Isolation Forest, Local Outlier Factor, and Elliptic Envelope.

6. The rapid batch process monitoring method based on multi-feature fusion and stacking according to claim 5, characterized in that the meta-learner is a logistic regression model, which calculates a weighted sum of the input features and then maps it to a probability value between 0 and 1 using the sigmoid function; class prediction is based on the probability value of the meta-learner: when $P(y=1\mid x)\ge 0.5$, the prediction result is category 1; when $P(y=1\mid x)<0.5$, the prediction result is category 0.

7. The rapid batch process monitoring method based on multi-feature fusion and stacking according to claim 6, characterized in that it further comprises: comparing and evaluating the monitoring results of the Stacking ensemble model against the monitoring results of the Voting ensemble strategy, wherein the Voting ensemble strategy obtains the final prediction result by weighting or majority voting on the predictions of multiple base models.