Method and system for predicting the quality of a milk-based base powder formulation
By constructing a causal relationship analysis model, a causal feature set of emulsion base powder formulation quality is generated, which solves the problem of lack of causal mechanism in the existing technology, realizes the optimization guidance of formulation and process parameters, and improves the interpretability and usability of prediction results.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANDONG TIANJIAO BIOLOGICAL TECH
- Filing Date
- 2026-03-11
- Publication Date
- 2026-06-26
Figure CN122290891A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and in particular to a method and system for predicting the quality of emulsion-based powder formulations.
Background Technology
[0002] In the fields of infant formula, nutritional supplements, and functional dairy products, the formulation development and quality control of milk-based powders are key technological aspects. Existing technologies typically rely on historical formulation data, employing statistical models and machine learning algorithms such as multiple linear regression, partial least squares, or random forests to establish empirical correlations between formulation components and finished product quality indicators. These are combined with rapid detection technologies such as near-infrared spectroscopy and Raman spectroscopy to analyze raw materials and intermediate products, enabling the prediction and optimization of formulation quality, thereby reducing trial-and-error costs and shortening the development cycle.
[0003] Existing methods for predicting the quality of milk-based powder formulations primarily rely on statistical correlation to establish predictive relationships. These methods typically predict quality attributes such as solubility and stability by numerically fitting historical formulation data with finished product quality test results. However, these methods focus on the degree of correlation between variables, lacking a characterization of the causal relationship and dependency pathways between formulation composition, process parameters, and finished product quality attributes. They fail to clarify the actual mechanisms by which different formulation factors affect the quality formation process. For example, in milk-based powders, when the milk protein ratio and spray drying temperature change simultaneously, while statistical models may provide predictions for quality attributes, they cannot distinguish the sequential relationship and transmission pathway between changes in milk protein content and changes in process conditions during quality formation. Furthermore, they struggle to explain the synergistic or inhibitory effects between the two. Due to the lack of clear causal constraints, the prediction results often present a black-box output with insufficient mechanistic explanation, making it difficult to determine which formulation or process parameters should be adjusted first. This limits the practical guiding value of the prediction results in targeted formulation and process optimization.
Summary of the Invention
[0004] In view of the aforementioned existing problems, the present invention is proposed.
[0005] Therefore, this invention provides a method for predicting the quality of emulsion-based powder formulations, to solve the problems that the formulation-process-quality relationship is difficult to explain because causal mechanisms are not modeled, and that prediction results therefore cannot effectively guide targeted formulation optimization.
[0006] To solve the above-mentioned technical problems, the present invention provides the following technical solution. In a first aspect, the present invention provides a method for predicting the quality of emulsion-based powder formulations, comprising: collecting formulation composition data, spectral detection data, and process data of the emulsion base powder, performing standardized preprocessing, and outputting a formula quality association dataset; performing causal relationship analysis and multimodal cross-validation on the formula quality association dataset to construct a causal constraint structure for formula quality; using the causal constraint structure of formula quality to perform feature extraction and causal-aware fusion on the formula quality association dataset with respect to finished product quality, generating a causal feature set of formula quality; using the causal feature set of formula quality, and based on the mapping logic established by the causal constraint structure of formula quality, predicting the finished product quality attributes of the emulsion base powder according to the formulation composition to obtain an initial quality prediction value, then decoupling the contribution of influencing factors and quantifying the uncertainty to output a quality prediction result; and optimizing the formulation and process parameters based on the quality prediction result, and iteratively calibrating the causal constraint structure and mapping logic of the formula quality.
[0007] Preferably, the method for outputting the formula quality association dataset includes: simultaneously collecting formulation composition data, spectral detection data, and process data of the emulsion base powder, and aligning them at the sample level according to the execution timestamp of the formulation batch to form an original multi-source dataset matched by formulation batch; and performing data standardization preprocessing on the original multi-source dataset and combining the result by formulation batch to output the formula quality association dataset.
[0008] Preferably, the method for constructing the causal constraint structure for formulation quality includes: Based on the formula quality association dataset, formula composition data, spectral detection data, process data and finished product quality attributes are used as the set of analysis variables; the statistical dependencies between different variable pairs are extracted, and combined with the order information of variables in the formula batch dimension, variable pairs with stable statistical correlation and reasonable time sequence are screened to form a set of causal candidate relationships. Consistency verification of the causal direction of the same variable pair in the causal candidate relation set is performed, retaining variable relations with consistent causal directions and eliminating variable relations with inconsistent directions; The set of causal candidate relationships after causal consistency verification is cross-validated in different batch subsets of the formula quality association dataset to retain stable and reproducible variable relationships in different batches, thus forming a stable set of causal relationships. Based on a set of stable causal relationships, each variable is treated as a node, and the stable causal relationships are organized as directed constraints to construct a causal constraint structure for formula quality that limits the direction of variable action and allows dependency paths.
[0009] Preferably, the method for generating the causal feature set of the formula quality includes: Based on the causal constraint structure of formula quality, causal path limitation and filtering are performed on each variable in the formula quality association dataset, and variables that have effective directed paths pointing to the finished product quality attributes in the causal constraint structure of formula quality are retained. Based on the causal direction and dependency order defined by the causal constraint structure of the formulation quality, hierarchical feature extraction is performed on the retained variables to form causal constraint features; Based on the stable causal relationship between variables in the causal constraint structure of formula quality, the causal constraint features are checked for causal consistency, and causal features with conflicting causal directions between formula batches are eliminated. The various causal constraint features for causal consistency verification are fused according to the causal dependency relationship defined by the causal constraint structure of formula quality to generate a causal feature set for formula quality.
[0010] Preferably, the method for predicting the quality attributes of the finished product based on the formulation of the emulsion base powder includes: According to the direction of variable action and causal dependency path defined in the causal constraint structure of formula quality, the causal features in the formula quality causal feature set are organized in an orderly manner to form a quality attribute associated causal feature set. Based on the mapping relationship between the quality attribute-related causal feature set and the finished product quality attributes corresponding to historical formula batches, the finished product quality attribute mapping calculation is performed on the quality attribute-related causal feature set to obtain the initial predicted quality value under the corresponding formula composition conditions; during the finished product quality attribute mapping calculation, only causal features located on the effective directed path pointing to the finished product quality attribute in the formula quality causal constraint structure are allowed to participate in the prediction calculation.
[0011] Preferably, the method for decoupling the contribution of influencing factors and quantifying uncertainty includes: Based on the quality attribute-related causal feature set, while keeping the values of other causal features unchanged, controlled perturbations are applied to each causal feature for the initial predicted quality under the current formula conditions. The influence intensity of each causal feature is calculated based on the change in the initial predicted quality before and after the perturbation, thus forming the causal feature influence intensity. Based on the causal dependency path defined by the causal constraint structure of formula quality, the influence intensity corresponding to each causal feature is assigned and summarized by causal dependency path to form the decoupling result of the contribution of influencing factors. Based on the distribution of prediction residuals between the initial predicted quality values of historical formula batches and the corresponding measured results, uncertainty quantification is performed on the current initial predicted quality value; and the initial predicted quality value, the decoupling results of the contribution of influencing factors, and the uncertainty description are correlated and organized to form the quality prediction result.
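The residual-based uncertainty step in the paragraph above can be illustrated with a minimal sketch. The text does not say which statistic of the residual distribution is used; the mean and sample standard deviation, the multiplier `k`, and the function name `residual_uncertainty` are all assumptions made for illustration.

```python
import statistics

def residual_uncertainty(prediction, residuals, k=2.0):
    """Describe uncertainty of a new prediction from historical residuals.

    residuals: measured value minus initial predicted value for past batches.
    k: assumed width multiplier (not specified in the source text).
    """
    bias = statistics.mean(residuals)      # systematic over/under-prediction
    spread = statistics.stdev(residuals)   # sample standard deviation
    centre = prediction + bias
    return {
        "prediction": prediction,
        "bias": bias,
        "lower": centre - k * spread,
        "upper": centre + k * spread,
    }
```

A quantile-based interval would be an equally reasonable choice when the residuals are clearly non-symmetric.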
[0012] Preferably, the method for optimizing the formulation and process parameters based on the quality prediction results includes: Based on the quality prediction results, identify the formulation composition parameters and process parameters whose corresponding contribution values are not zero in the decoupling results of the influencing factors; adjust the formulation composition parameters and process parameters to form candidate optimized formulation and process parameter combinations; Perform finished product quality attribute prediction on candidate optimized formulations and process parameter combinations, and select the formulation and process parameter combinations whose quality prediction results meet the preset quality requirements as the optimization results.
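As a rough sketch of the candidate search described above: the parameters with non-zero contribution are varied over a small set of candidate values, each combination is re-predicted, and those meeting the preset requirement are kept. The exhaustive grid, the "minimum threshold" form of the quality requirement, and the names `select_candidates`, `adjustable` and `target_min` are assumptions; the source does not specify the search strategy.

```python
from itertools import product

def select_candidates(predict, base, adjustable, target_min):
    """Return (parameters, predicted quality) pairs that meet the requirement.

    predict: callable mapping a parameter dict to a predicted quality value.
    base: current formulation and process parameters.
    adjustable: parameter name -> list of candidate values to try
                (only parameters with non-zero contribution belong here).
    target_min: assumed form of the preset requirement (quality >= target_min).
    """
    names = list(adjustable)
    accepted = []
    for combo in product(*(adjustable[n] for n in names)):
        candidate = {**base, **dict(zip(names, combo))}
        score = predict(candidate)
        if score >= target_min:
            accepted.append((candidate, score))
    return accepted
```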
[0013] Preferably, the method for iteratively calibrating the causal constraint structure and mapping logic of the formulation quality includes: Collect the measured results of the finished product quality attributes corresponding to the optimized formula and process parameters, and compare them with the corresponding initial quality prediction values to obtain the quality prediction deviation value; Based on the distribution of quality prediction deviation values in different formulation batches, the validity of the relationship between relevant variables in the causal constraint structure of formulation quality is updated, and the mapping logic used in the mapping calculation of finished product quality attributes is corrected.
[0014] Preferably, the method of applying controlled perturbations to each causal feature includes: in the quality attribute associated causal feature set, a single causal feature is selected in turn as the target causal feature, and the values of the remaining causal features are fixed at their values under the current formula conditions. For the target causal feature, without changing its causal hierarchy and dependency relationships in the causal constraint structure of formula quality, a positive or negative perturbation of limited range is applied to its value to form the perturbed causal feature value. Based on the quality attribute associated causal feature set containing the perturbed causal feature value, the finished product quality attribute mapping calculation is re-executed to obtain the corresponding initial predicted quality value after perturbation. The initial predicted quality value after perturbation is compared with the initial predicted quality value before perturbation, and the influence intensity of the target causal feature under the current formulation conditions is determined from the amount of change between the two.
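The one-feature-at-a-time perturbation described above might look like the following sketch. The relative step size `rel_step`, the fallback step for zero-valued features, and the function name are assumptions; `predict` stands in for whatever mapping calculation is actually used.

```python
def influence_intensities(predict, features, rel_step=0.05, sign=+1):
    """Perturb each causal feature in turn, holding the others fixed.

    predict: callable mapping a feature dict to an initial predicted quality.
    rel_step: assumed bounded perturbation, as a fraction of the feature value.
    sign: +1 for a positive perturbation, -1 for a negative one.
    Returns the unperturbed prediction and the change caused by each feature.
    """
    base = predict(features)
    intensities = {}
    for name, value in features.items():
        step = sign * rel_step * (abs(value) if value != 0 else 1.0)
        perturbed = dict(features)          # other features keep current values
        perturbed[name] = value + step
        intensities[name] = predict(perturbed) - base
    return base, intensities
```

Because only values are changed, the position of each feature in the causal structure is left untouched, which matches the constraint stated in the text.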
[0015] In a second aspect, the present invention provides a quality prediction system for emulsion-based powder formulations, comprising: a data acquisition module, used to collect formulation composition data, spectral detection data, and process data of the emulsion base powder, and to perform standardized preprocessing to output a formula quality association dataset; a causal construction module, used to perform causal relationship analysis and multimodal cross-validation on the formula quality association dataset and to construct the causal constraint structure for formula quality; a causal feature module, used to perform feature extraction and causal-aware fusion on the formula quality association dataset with respect to finished product quality, using the causal constraint structure of formula quality, to generate a causal feature set of formula quality; a prediction module, used to predict the finished product quality attributes of the emulsion base powder according to the formulation composition, using the causal feature set of formula quality and the mapping logic established by the causal constraint structure of formula quality, and to decouple the contribution of influencing factors, quantify the uncertainty of the prediction, and output the quality prediction result; and an optimization module, used to optimize the formulation and process parameters based on the quality prediction result and to iteratively calibrate the causal constraint structure and mapping logic of the formula quality.
[0016] The beneficial effects of this invention are as follows: By conducting causal relationship analysis and multimodal cross-validation, a causal structure for formulation quality is constructed and constrained. Based on this, a set of causal features for formulation quality with clear action directions and dependency paths is generated, enabling a clear characterization of the causal relationship between formulation composition parameters, process parameters, and finished product quality attributes. This avoids the problem of unclear mechanisms caused by relying solely on statistical correlation for quality prediction, providing a stable causal explanation basis for the prediction process. Furthermore, under causal constraints, controlled perturbation analysis is performed on each causal feature, and the changes in prediction results are quantitatively attributed in conjunction with causal dependency paths. This decouples the influence intensity and action path of each formulation composition parameter and process parameter in the quality formation process, allowing the quality prediction results to clearly identify the contribution source and transmission relationship of each influencing factor while providing the predicted value. This effectively supports targeted formulation and process parameter optimization based on the prediction results, improves the interpretability and usability of the prediction results, and reduces the cost of R&D trial and error.
Attached Figure Description
[0017] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the following description of the embodiments will be briefly introduced. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0018] Figure 1 is a flowchart of the method for predicting the quality of emulsion-based powder formulations in this invention; Figure 2 is a schematic diagram of the quality prediction system for emulsion-based powder formulations in this invention; Figure 3 is a flowchart illustrating the construction of the causal constraint structure for formula quality in this invention; Figure 4 is a flowchart illustrating the process of generating quality prediction results in this invention.
Detailed Implementation
[0019] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
[0020] Many specific details are set forth in the following description in order to provide a full understanding of the invention. However, the invention may also be practiced in other ways different from those described herein, and those skilled in the art can make similar extensions without departing from the spirit of the invention. Therefore, the invention is not limited to the specific embodiments disclosed below.
[0021] Secondly, the term "one embodiment" or "embodiment" as used herein refers to a specific feature, structure, or characteristic that may be included in at least one implementation of the present invention. The phrase "in one embodiment" appearing in different places in this specification does not necessarily refer to the same embodiment, nor is it a single or selective embodiment that is mutually exclusive with other embodiments.
[0022] Referring to Figure 1, Figure 2, Figure 3 and Figure 4, as an embodiment of the present invention, this embodiment provides a method for predicting the quality of emulsion-based powder formulations, comprising the following steps. The method for outputting the formula quality association dataset includes: simultaneously collecting formulation composition data, spectral detection data, and process data of the emulsion base powder, and aligning them at the sample level according to the execution timestamp of the formulation batch to form an original multi-source dataset matched by formulation batch.
[0023] Specifically, the following data are collected simultaneously: formulation composition data (raw material names, formulation batch numbers, and proportioning information); spectral detection data (spectral curves obtained by scanning samples with near-infrared or Raman spectrometers); and process data (equipment parameters for stages such as mixing and drying). A unified timestamp is assigned to all data, the timestamps are aligned and matched using the formulation batch number as an index, mismatched records are eliminated, and the aligned multi-source data from the same batch are combined to form the original multi-source dataset.
[0024] Perform data standardization preprocessing on the original multi-source dataset, and combine it according to the formula batch to output the formula quality correlation dataset.
[0025] It should be noted that the preprocessing includes: structuring the formulation composition data; verifying the band consistency and denoising the spectral detection data; checking the time continuity and correcting anomalies in the process data; and combining the processed data by batch using the formulation batch number as an index to obtain a unified formulation quality association dataset.
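The batch alignment and standardization steps above can be sketched as follows. The record layout (dicts carrying a `batch` key), the choice of z-scoring as the standardization, and the function names are assumptions made for illustration; the source text does not fix any of them.

```python
from collections import defaultdict

def align_by_batch(formulation, spectra, process):
    """Join three record lists on batch number.

    Batches missing from any source are dropped, mirroring the
    'eliminate mismatched records' step.
    """
    sources = {"formulation": formulation, "spectrum": spectra, "process": process}
    merged = defaultdict(dict)
    for name, records in sources.items():
        for rec in records:
            merged[rec["batch"]][name] = rec
    return {b: parts for b, parts in merged.items() if len(parts) == len(sources)}

def standardize(values):
    """Z-score a list of numbers (assumed form of 'standardized preprocessing')."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [0.0 if std == 0 else (v - mean) / std for v in values]
```

Spectral denoising and process anomaly correction are deliberately left out, since the text gives no detail on how they are performed.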
[0026] In the research and development of infant formula and functional dairy products, existing formula quality prediction methods largely rely on statistical models such as partial least squares or neural networks. While these methods can establish numerical relationships between variables, they are essentially "black box" predictions, with a core deficiency being the lack of in-depth analysis of the production mechanism. In actual production, there are complex temporal logics and transmission mechanisms between formula ratios, production process parameters (such as spray drying temperature), and finished product quality indicators. Statistical models often classify random numerical fluctuations as correlations, making prediction results susceptible to interference from non-causal noise and exhibiting poor stability. More importantly, existing technologies cannot clearly define the contribution and dependency paths of different factors to the quality formation process. When prediction results deviate, researchers find it difficult to determine whether to prioritize adjusting the formula composition or optimizing process parameters. This lack of causal relationships not only limits prediction accuracy but also renders the model ineffective in guiding targeted production optimization. Therefore, to achieve a leap from "black box" to "white box" prediction, this solution proposes a deep prediction architecture based on causal logic, as follows: Methods for constructing causal constraint structures for formulation quality include: Based on the formula quality association dataset, formula composition data, spectral detection data, process data, and finished product quality attributes are used as the set of analytical variables. 
Statistical dependencies between different variable pairs are extracted, and combined with the order information of variables in the formula batch dimension, variable pairs with stable statistical correlation and reasonable time sequence are screened to form a set of causal candidate relationships.
[0027] It should be noted that the formulation composition data, spectral detection data, process data, and finished product quality attributes of the emulsion base powder are organized according to a unified formulation batch number, and variable values from different sources that belong to the same formulation batch are treated as the same observation sample. The statistical dependence between each pair of variables (which can be characterized by a correlation indicator) is expressed as follows:

$$r_{ij} = \frac{\operatorname{Cov}(X_i, X_j)}{\sigma_{X_i}\,\sigma_{X_j}}$$

In the formula, $r_{ij}$ is the statistical correlation strength between variable $X_i$ and variable $X_j$; $\operatorname{Cov}(X_i, X_j)$ is the covariance of variable $X_i$ and variable $X_j$ across multiple batches of formulation samples; and $\sigma_{X_i}$ and $\sigma_{X_j}$ are the standard deviations of variable $X_i$ and variable $X_j$, respectively. A two-step screening process is then performed. Stability screening: variable pairs with stable dependencies across multiple batches are retained. Temporal rationality screening: taking into account the production process sequence (e.g., formula setting first, process execution later, and quality inspection last), only variable pairs with a reasonable time sequence are retained, forming the set of causal candidate relationships.
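A minimal sketch of the correlation measure and the temporal screening described above, in plain Python. The stage numbering in `STAGE`, the variable names, and the threshold of 0.7 are hypothetical; the text gives only the ordering "formula first, process later, quality inspection last".

```python
import math

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

# Hypothetical production order: formulation -> process -> spectrum -> quality.
STAGE = {"protein_ratio": 0, "dry_temp": 1, "nir_peak": 2, "solubility": 3}

def candidate_pairs(data, threshold=0.7):
    """Return (cause, effect, r) for pairs that are strongly correlated
    and whose cause belongs to an earlier production stage than the effect."""
    out = []
    names = list(data)
    for a in names:
        for b in names:
            if STAGE[a] < STAGE[b]:
                r = pearson(data[a], data[b])
                if abs(r) >= threshold:
                    out.append((a, b, r))
    return out
```

Here `data` maps each variable name to its per-batch values in a common batch order. The stability screening across batches is not covered by this sketch.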
[0028] Consistency verification is performed on the causal direction of the same variable pair in the causal candidate relation set. Variable relations with consistent causal directions are retained, while variable relations with inconsistent directions are removed.
[0029] Specifically, for each variable pair in the set of causal candidate relationships, it is checked whether the order of variable changes (cause first, effect later) in all historical batches agrees with its statistical dependency direction (e.g., when A increases, B increases). If the two agree in all batches, the causal direction is judged consistent and the relationship is retained; if there is a contradiction or the direction is unstable, the relationship is removed.
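One possible reading of this check, sketched below, compares the sign of batch-to-batch changes in the candidate cause and effect against the expected sign of the dependency. This interpretation, the treatment of zero changes as uninformative, and the function name are assumptions; the source does not spell out the test.

```python
def direction_consistent(cause, effect, expected_sign):
    """True if every informative batch-to-batch change agrees with expected_sign.

    cause, effect: per-batch values in chronological batch order.
    expected_sign: +1 if the effect should rise with the cause, -1 if it should fall.
    """
    for i in range(1, len(cause)):
        d_cause = cause[i] - cause[i - 1]
        d_effect = effect[i] - effect[i - 1]
        if d_cause == 0 or d_effect == 0:
            continue  # no change carries no direction information
        observed = 1 if d_cause * d_effect > 0 else -1
        if observed != expected_sign:
            return False
    return True
```

Requiring agreement in every batch is strict; a tolerance for a small fraction of disagreeing batches would be a natural relaxation on noisy data.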
[0030] The set of causal candidate relationships after causal consistency verification is cross-validated in different batch subsets of the formula quality association dataset. Stable and reproducible variable relationships in different batches are retained to form a stable set of causal relationships.
[0031] Specifically, the set of causal candidate relationships verified through causal direction consistency is divided into multiple independent formulation batch subsets according to the formulation batch number. The statistical dependency calculation and causal direction judgment process are repeated in each formulation batch subset. By comparing whether the same variable relationship can be repeatedly identified in different formulation batch subsets, when the same variable relationship exists stably in multiple formulation batch subsets, it is determined that the variable relationship has cross-formulation batch reproducibility and is retained. When the variable relationship only appears in a few formulation batch subsets or is unstable, it is determined that the variable relationship lacks stability and is removed, thus forming a stable causal relationship set.
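The cross-subset reproducibility filter can be sketched as a simple vote over batch subsets. The 80% threshold and the name `reproducible_relations` are assumptions; the text says only that a relationship must exist stably in multiple subsets rather than in a few.

```python
def reproducible_relations(subset_results, min_fraction=0.8):
    """Keep relations identified in at least min_fraction of the batch subsets.

    subset_results: one set of (cause, effect) tuples per batch subset, each
    produced by repeating the dependency and direction checks on that subset.
    """
    counts = {}
    for found in subset_results:
        for rel in found:
            counts[rel] = counts.get(rel, 0) + 1
    needed = min_fraction * len(subset_results)
    return {rel for rel, c in counts.items() if c >= needed}
```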
[0032] Based on the stable causal relationship set, each variable is treated as a node and the stable causal relationships are organized as directed constraints, to construct a causal constraint structure for formula quality that defines the permitted direction of variable action and the permitted dependency paths.
[0033] Specifically, based on a stable causal relationship set, the formulation composition data, spectral detection data, process data, and finished product quality attributes of the emulsion base powder are organized as independent variable nodes, and the direction of action of the variables determined in the stable causal relationship set is connected as a directed constraint relationship, thereby limiting the allowed direction of action and dependency path between variables. For example, in the stable causal relationship set, among the confirmed variable relationships of "formula composition data variable pointing to process data variable", "process data variable pointing to spectral detection data variable", and "spectral detection data variable pointing to finished product quality attribute", only directed constraint relationships in the above directions are allowed to exist, and dependent paths of finished product quality attribute pointing in reverse to spectral detection data, process data, or formula composition data are not allowed. In the organization process, it is clear that formula composition data is located upstream of the causal relationship, process data is located in the intermediate action layer, and spectral detection data and finished product quality attribute are located in the result layer. Variable paths that do not conform to the chronological logic are eliminated through directed constraint relationships, forming a causal constraint structure for formula quality.
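A minimal sketch of the directed constraint structure as an adjacency dictionary. The numeric layers in `LAYER` follow the example chain given above (formulation to process to spectrum to quality); placing spectral variables one layer before quality attributes is an assumption made so that the "spectrum points to quality" edge from the example is admissible.

```python
# Assumed layer order, following the example chain in the paragraph above.
LAYER = {"formulation": 0, "process": 1, "spectrum": 2, "quality": 3}

def build_constraint_graph(stable_relations, kind_of):
    """Build cause -> {effects}, keeping only edges that point downstream.

    stable_relations: iterable of (cause, effect) variable names.
    kind_of: variable name -> one of the LAYER keys.
    Edges that point backwards in production order are discarded.
    """
    graph = {v: set() for v in kind_of}
    for cause, effect in stable_relations:
        if LAYER[kind_of[cause]] < LAYER[kind_of[effect]]:
            graph[cause].add(effect)
    return graph
```

Because every retained edge goes from a lower layer to a strictly higher one, the resulting graph cannot contain a cycle.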
[0034] Methods for generating causal feature sets of formula quality include: Based on the causal constraint structure of formula quality, causal path limitation and filtering are performed on each variable in the formula quality association dataset, retaining variables that have a valid directed path pointing to the finished product quality attribute in the causal constraint structure of formula quality.
[0035] Specifically, in the causal constraint structure of formula quality, the finished product quality attribute is used as the path endpoint. The variables corresponding to the formula composition data, spectral detection data, and process data contained in the formula quality association dataset are checked one by one. During the path check, the variables are traced towards the finished product quality attribute along the directed constraint relationship determined in the causal constraint structure of formula quality. Only when the variable can point to the finished product quality attribute through continuous directed constraint relationship is it determined that there is a valid directed path association between the variable and the finished product quality attribute and it is retained.
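The path check above amounts to finding every node that can reach the quality attribute along directed edges. A sketch, assuming the structure is stored as a `cause -> {effects}` dictionary (a representation chosen here for illustration):

```python
def variables_reaching(graph, target):
    """Return all variables that have a directed path to `target`.

    graph: cause -> set of effects, assumed to be acyclic.
    The search walks the reversed edges starting from the target.
    """
    reverse = {v: set() for v in graph}
    for src, dsts in graph.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        for parent in reverse.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Variables outside the returned set have no valid directed path to the quality attribute and would be filtered out.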
[0036] Based on the causal direction and dependency order defined by the causal constraint structure of the formulation quality, hierarchical feature extraction is performed on the retained variables to form causal constraint features.
[0037] Specifically, based on the causal hierarchy and dependency order of variables defined in the causal constraint structure of formulation quality, variables selected through causal path screening are processed in layers: formulation composition data variables located upstream of the causal relationship are extracted based on the ratio changes and composition structure at the formulation batch level; process data variables located in the middle causal level are extracted based on the process representation formed by the changes of parameters at each process stage with the formulation batch; and spectral detection data variables located downstream of the causal level are extracted based on the stable response characteristics of the spectral curve at the formulation batch level. The feature formation order of each variable is consistent with the causal action direction defined in the causal constraint structure of formulation quality, forming causal constraint features with clear causal hierarchy attributes.
[0038] Based on the stable causal relationship between variables in the causal constraint structure of formula quality, the causal constraint features are checked for causal consistency, and causal features with conflicting causal directions between formula batches are eliminated.
[0039] It should be noted that the causal consistency verification uses the confirmed stable causal relationship in the causal constraint structure of the formulation quality as a reference, and compares and analyzes the changing trends of the causal constraint features corresponding to the same causal relationship in different formulation batches. When the causal constraint feature shows a changing direction inconsistent with the stable causal relationship in multiple formulation batches, it is determined that there is a causal direction conflict and it is removed. When it maintains a changing trend and direction consistent with the stable causal relationship in different formulation batches, it is determined that it has causal consistency and is retained, so as to improve the stability of the causal feature set in the formulation batch dimension.
[0040] The various causal constraint features for causal consistency verification are fused according to the causal dependency relationship defined by the causal constraint structure of formula quality to generate a causal feature set for formula quality.
[0041] Specifically, after completing the causal consistency verification, the causal dependency path defined in the causal constraint structure of the formulation quality is used as the fusion basis. The causal constraint features of the formulation composition data, the causal constraint features of the process data, and the causal constraint features of the spectral detection data located on the same causal path are combined according to the causal dependency order. During the fusion process, causal constraint features from different sources but belonging to the same causal path are organized in accordance with the principle of maintaining a clear causal hierarchy relationship to form a set of causal features for formulation quality.
[0042] Methods for predicting the quality attributes of finished products containing emulsion base powder according to their formulation composition include: Based on the variable action direction and causal dependency path defined in the causal constraint structure of formula quality, the causal features in the formula quality causal feature set are organized in an orderly manner to form a quality attribute associated causal feature set.
[0043] Specifically, using the causal constraint structure of formulation quality as the organizational basis, causal path labeling is performed on the variables corresponding to each causal feature in the causal feature set of formulation quality, clarifying the causal hierarchy position of the causal features in the causal constraint structure of formulation quality and their dependency paths pointing to the finished product quality attributes; subsequently, according to the causal action direction defined by the causal constraint structure of formulation quality, the causal features are sequentially ordered from upstream to downstream, so that the causal features corresponding to the formulation composition data, the causal features corresponding to the process data, and the causal features corresponding to the spectral detection data form a continuous causal transmission order in the sequence, and ensure that all the arranged causal features have a valid directed path pointing to the finished product quality attributes; causal features located on the same causal dependency path are grouped to form a quality attribute association causal feature set that transmits the changes in formulation composition to the finished product quality attributes through the process and spectral response.
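As an illustration of the path labeling and upstream-to-downstream ordering, the following sketch treats the causal constraint structure as a list of directed edges, keeps only variables that have a directed path to the quality attribute, and orders them topologically. The edge list and variable names are hypothetical; the use of the standard-library `graphlib` module is an implementation choice of this sketch, not something stated in the source.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple


def order_causal_features(edges: List[Tuple[str, str]], target: str) -> List[str]:
    """Return the variables with a directed path to `target`, upstream first."""
    # Build a child -> parents map from the (cause, effect) edges.
    parents: Dict[str, Set[str]] = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
        parents.setdefault(cause, set())

    # Walk backwards from the quality attribute to collect all its ancestors.
    reachable: Set[str] = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in reachable:
                reachable.add(parent)
                stack.append(parent)

    # Topological order restricted to the retained variables.
    subgraph = {node: parents[node] & reachable for node in reachable}
    return list(TopologicalSorter(subgraph).static_order())


# Hypothetical structure: one chain leads to solubility, one edge does not.
edges = [
    ("protein_ratio", "feed_viscosity"),
    ("feed_viscosity", "nir_band_1450"),
    ("nir_band_1450", "solubility"),
    ("ambient_noise", "logger_drift"),
]
print(order_causal_features(edges, "solubility"))
```

Variables such as `ambient_noise` that have no directed path to the quality attribute do not appear in the ordered result.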
[0044] Based on the mapping relationship between the quality attribute-related causal feature set and the finished product quality attributes corresponding to historical formula batches, the finished product quality attribute mapping calculation is performed on the quality attribute-related causal feature set to obtain the initial predicted quality value under the corresponding formula composition conditions; during the finished product quality attribute mapping calculation, only causal features located on the effective directed path pointing to the finished product quality attribute in the formula quality causal constraint structure are allowed to participate in the prediction calculation.
[0045] It should be noted that after the quality attribute association causal feature set is constructed, the mapping relationship between the quality attribute association causal feature set and the corresponding finished product quality attribute obtained in the historical formula batch is used as the basis to perform finished product quality attribute mapping calculation on the quality attribute association causal feature set under the current formula composition condition. In the specific process, the formula batch number is used as the index to organize the values of each causal feature in the quality attribute association causal feature set in the historical formula batch and the values of the finished product quality attribute that have been measured under the same formula batch, so as to form a one-to-one correspondence between the causal feature values and the finished product quality attribute. During the mapping calculation, only causal features that have a valid directed path to the finished product quality attribute in the causal constraint structure of the formulation quality are allowed to participate in the calculation. According to the causal dependency order determined in the set of causal features associated with quality attributes, the effect of each causal feature on the finished product quality attribute is calculated by superimposing or combining them item by item to avoid interference from non-causal features or reverse causal relationships on the prediction results. After completing the mapping calculation of all causal features, the initial predicted value of the finished product quality attribute corresponding to the current formulation composition conditions is output. This scheme achieves a technological leap from correlation fitting to mechanism derivation through causal constraint structure and feature path extraction. 
By utilizing time series constraints and path locking, it actively eliminates spurious correlations, ensuring that the prediction model is built on robust physical logic. The prediction process has strong anti-interference ability and cross-batch robustness, effectively avoiding statistical overfitting. Feature extraction and fusion strictly follow the causal hierarchy, making the prediction results not only highly accurate but also forming a clear causal chain.
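The source does not give a formula for the mapping calculation. As one possible reading of "superimposing item by item," the sketch below uses an additive form: a baseline plus one effect term per causal feature, accumulated in causal dependency order, with any feature outside the ordered causal set ignored. The additive form, the effect coefficients, the baseline, and all names are assumptions for illustration only.

```python
from typing import Dict, List


def predict_quality(
    features: Dict[str, float],
    effects: Dict[str, float],
    ordered_causal_features: List[str],
    baseline: float,
) -> float:
    """Additive mapping sketch: baseline + sum of per-feature effects.

    Only features listed in `ordered_causal_features` (those with a valid
    directed path to the quality attribute) contribute to the prediction.
    """
    prediction = baseline
    for name in ordered_causal_features:  # upstream -> downstream
        prediction += effects[name] * features[name]
    return prediction


# Hypothetical values; `logger_drift` is present in the data but is not a
# causal feature, so it never enters the calculation.
features = {"protein_ratio": 0.3, "inlet_temp": 180.0, "logger_drift": 5.0}
effects = {"protein_ratio": 10.0, "inlet_temp": -0.05}
print(predict_quality(features, effects, ["protein_ratio", "inlet_temp"], 95.0))
```

In practice the effect coefficients would be estimated from the batch-indexed historical pairs of causal feature values and measured quality attributes described above.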
[0046] After a preliminary prediction has been obtained from the causal logic, improving R&D efficiency depends on translating the predicted value into concrete control decisions. Existing technologies, however, often provide prediction results without quantitatively decomposing the contributions of the individual influencing factors, so the outcome is known but its cause remains unclear. When formulation composition and process parameters are coupled, traditional models cannot distinguish whether raw material fluctuations or process instability dominate, and quantifying the confidence interval of the predicted value itself is even more difficult. This lack of both interpretability and certainty forces R&D personnel to fall back on experience and blind trial and error when a prediction is unacceptable. This solution therefore establishes a mechanism that decouples the contributions of the factors and assesses prediction risk, as follows. Methods for decoupling the contribution of influencing factors and quantifying uncertainty include: Based on the quality attribute-related causal feature set, while keeping the values of other causal features unchanged, a controlled perturbation is applied to each causal feature in turn, taking the initial predicted quality value under the current formulation conditions as the reference. The influence intensity of each causal feature is calculated from the change in the initial predicted quality value before and after the perturbation, thus forming the causal feature influence intensity.
[0047] Specifically, the analysis object is the set of causal features associated with quality attributes, and the initial predicted quality value obtained under the current formulation composition is used as the benchmark result. While keeping the values of the remaining causal features other than the target causal feature unchanged, a controlled perturbation of a preset amplitude is applied to each causal feature in the set of causal features associated with quality attributes, and the finished product quality attribute mapping calculation is re-executed after the perturbation. By comparing the changes in the quality prediction results before and after the perturbation, the degree of influence of each causal feature on the initial predicted quality value under the current formulation composition is determined, and the causal feature influence intensity corresponding to each causal feature is formed.
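The one-feature-at-a-time perturbation can be sketched as follows. The sketch assumes the mapping calculation is available as a callable and uses a relative perturbation of preset amplitude; the toy linear mapping, the 5% step, and the feature names are hypothetical.

```python
from typing import Callable, Dict, List


def influence_intensity(
    predict: Callable[[Dict[str, float]], float],
    features: Dict[str, float],
    causal_features: List[str],
    rel_step: float = 0.05,
) -> Dict[str, float]:
    """Change in predicted quality when each causal feature is perturbed alone."""
    benchmark = predict(features)  # initial predicted quality value
    intensity = {}
    for name in causal_features:
        perturbed = dict(features)  # all other features keep their values
        perturbed[name] = features[name] * (1 + rel_step)
        intensity[name] = predict(perturbed) - benchmark
    return intensity


def toy_mapping(x: Dict[str, float]) -> float:
    # Stand-in for the finished product quality attribute mapping calculation.
    return 95.0 + 10.0 * x["protein_ratio"] - 0.05 * x["inlet_temp"]


current = {"protein_ratio": 0.3, "inlet_temp": 180.0}
print(influence_intensity(toy_mapping, current, ["protein_ratio", "inlet_temp"]))
```

The sign of each entry shows the direction in which the prediction moves, and the magnitude can serve as the influence intensity.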
[0048] Based on the causal dependency path defined by the causal constraint structure of formula quality, the influence intensity corresponding to each causal feature is assigned and summarized by causal dependency path to form the decoupling result of the contribution of influencing factors.
[0049] Specifically, based on the causal hierarchical position of causal features in the causal constraint structure of formula quality and their dependency paths pointing to the quality attributes of finished products, the influence intensity of each causal feature is assigned to the corresponding causal dependency path; the influence intensity of causal features belonging to the same causal dependency path is summarized according to the causal dependency order to obtain the path-level influence result, thereby achieving the decoupled expression of the influence contribution of different causal dependency paths and forming the decoupled result of the contribution degree of influencing factors.
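A minimal sketch of the path-level summary is shown below. It assumes each causal dependency path is given as a named list of its member features and that absolute influence intensities are summed per path and then normalized into shares. The use of absolute values, the normalization, and the path names are assumptions of this sketch; the source only states that intensities are assigned and summarized by path.

```python
from typing import Dict, List


def decouple_by_path(
    intensity: Dict[str, float],
    paths: Dict[str, List[str]],
) -> Dict[str, float]:
    """Sum absolute influence intensities per causal path and return shares."""
    totals = {
        path: sum(abs(intensity[feature]) for feature in members)
        for path, members in paths.items()
    }
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {path: 0.0 for path in totals}
    return {path: value / grand_total for path, value in totals.items()}


# Hypothetical intensities and path membership.
intensity = {"protein_ratio": 0.15, "lactose_ratio": 0.05, "inlet_temp": -0.45}
paths = {
    "formulation_path": ["protein_ratio", "lactose_ratio"],
    "process_path": ["inlet_temp"],
}
print(decouple_by_path(intensity, paths))
```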
[0050] Based on the distribution of prediction residuals between the initial predicted quality values of historical formula batches and the corresponding measured results, uncertainty quantification is performed on the current initial predicted quality value; and the initial predicted quality value, the decoupling results of the contribution of influencing factors, and the uncertainty description are correlated and organized to form the quality prediction result.
[0051] Specifically, after decoupling the contribution of influencing factors, based on the differences between the initial predicted quality values obtained in historical formula batches and the corresponding measured finished product quality attributes, the prediction residuals of each formula batch are statistically organized to form a prediction residual distribution. When performing uncertainty quantification, the initial quality prediction value obtained under the current formulation composition conditions is compared with the prediction residual distribution. By determining the corresponding position of the current initial quality prediction value in the prediction residual distribution, the possible prediction deviation interval is determined, and an uncertainty description of the reliability of the current quality prediction result is obtained. After obtaining the uncertainty description, the current initial quality prediction value, the decoupling results of the corresponding influencing factor contribution, and the uncertainty description are uniformly organized and associated to form the quality prediction result.
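One simple way to turn a residual distribution into a deviation interval is to take empirical quantiles of the historical residuals and shift them by the current prediction. The sketch below does this with linear interpolation between order statistics. The quantile-based interval, the 80% coverage default, and the function names are assumptions; the source does not specify how the interval is derived.

```python
from typing import List, Tuple


def empirical_quantile(sorted_values: List[float], q: float) -> float:
    """Quantile by linear interpolation between neighbouring order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def deviation_interval(
    prediction: float,
    historical_predicted: List[float],
    historical_measured: List[float],
    coverage: float = 0.8,
) -> Tuple[float, float]:
    """Interval for the current prediction from historical residual quantiles."""
    # Residual is defined here as measured minus predicted, per batch.
    residuals = sorted(
        m - p for p, m in zip(historical_predicted, historical_measured)
    )
    tail = (1 - coverage) / 2
    low = prediction + empirical_quantile(residuals, tail)
    high = prediction + empirical_quantile(residuals, 1 - tail)
    return low, high


# Hypothetical history of five batches.
print(deviation_interval(90.0, [90.0] * 5, [88.0, 89.0, 90.0, 91.0, 92.0]))
```

The resulting interval can be stored alongside the initial predicted value and the path-level contributions as the uncertainty description.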
[0052] By employing controlled perturbation and residual distribution modeling, this scheme successfully upgrades "static prediction" to "dynamic diagnosis." It not only provides predicted quality values but also, through sensitivity stripping of causal characteristics, identifies the contribution weights and transmission paths of each parameter to quality formation, making the decoupled results of influencing factors highly physically interpretable. Simultaneously, the introduction of uncertainty quantification in the predicted residual distribution provides decision-makers with clear risk boundaries, effectively avoiding decision-making errors caused by model extrapolation. This evolves the prediction results from a single dimension to a three-dimensional expression of "numerical value + contribution + confidence level," providing a quantitative basis for subsequent targeted formulation fine-tuning and process parameter optimization based on contribution priority.
[0053] Methods for optimizing formulation and process parameters based on quality prediction results include: Based on the quality prediction results, the formulation composition parameters and process parameters whose corresponding contribution values are not zero in the decoupling results of the contribution of influencing factors are identified; the formulation composition parameters and process parameters are adjusted to form candidate optimized formulation and process parameter combinations.
[0054] Specifically, after obtaining the quality prediction results under the current formulation composition conditions, the decoupling results of the contribution of influencing factors are used as the basis for adjustment. Formulation composition parameters and process parameters with non-zero contribution values are screened. A non-zero contribution value indicates that the corresponding parameter has an actual impact on the formation process of finished product quality attributes under the current formulation conditions. Combining the changes in finished product quality attributes under the same or similar parameter values in historical formulation batches, the screened formulation composition parameters and process parameters are directionally adjusted to form multiple parameter value combinations. The parameter adjustments are based on the premise of not changing the direction of variable action and causal dependence path in the causal constraint structure of the formulation quality, ensuring that the formed parameter value combinations still satisfy the existing causal constraint relationship, thereby obtaining candidate optimized formulations and process parameter combinations.
[0055] Perform finished product quality attribute prediction on candidate optimized formulations and process parameter combinations, and select the formulation and process parameter combinations whose quality prediction results meet the preset quality requirements as the optimization results.
[0056] It should be noted that each candidate optimized formula and process parameter combination is input as a new formula composition condition into the aforementioned finished product quality attribute mapping calculation process. According to the mapping relationship between the quality attribute association causal feature set and the finished product quality attribute, the corresponding quality prediction results are obtained one by one. In the prediction process, only causal features that have a valid directed path to the finished product quality attribute in the formula quality causal constraint structure are allowed to participate in the calculation. The quality prediction results corresponding to each candidate optimized formula and process parameter combination are compared with the preset quality requirements, and the formula and process parameter combinations whose prediction results meet the preset quality requirements are retained as the final optimization results. The preset quality requirements define the target range that the quality attributes of the finished product should reach. They can be derived from product quality standards, the statistical range of the finished product quality attributes in historical qualified formula batches, or the quality control range allowed by the production process. For example, a requirement may specify that a certain quality indicator is not lower than the lower limit observed in historically stable production batches.
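The screening of candidates against a preset quality requirement can be sketched as a simple filter. The sketch assumes the requirement is a lower limit (with an optional upper limit) on a single quality indicator and that the mapping calculation is available as a callable; the toy mapping and candidate values are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def select_candidates(
    candidates: List[Dict[str, float]],
    predict: Callable[[Dict[str, float]], float],
    lower_limit: float,
    upper_limit: Optional[float] = None,
) -> List[Dict[str, float]]:
    """Keep the candidates whose predicted quality lies in the required range."""
    selected = []
    for candidate in candidates:
        predicted = predict(candidate)
        if predicted < lower_limit:
            continue
        if upper_limit is not None and predicted > upper_limit:
            continue
        selected.append(candidate)
    return selected


def toy_mapping(x: Dict[str, float]) -> float:
    # Stand-in for the finished product quality attribute mapping calculation.
    return 95.0 + 10.0 * x["protein_ratio"] - 0.05 * x["inlet_temp"]


candidates = [
    {"protein_ratio": 0.30, "inlet_temp": 180.0},  # predicts about 89.0
    {"protein_ratio": 0.35, "inlet_temp": 170.0},  # predicts about 90.0
]
# Hypothetical requirement: indicator not lower than 89.5.
print(select_candidates(candidates, toy_mapping, lower_limit=89.5))
```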
[0057] Methods for iteratively calibrating the causal constraint structure and mapping logic of formulation quality include: The measured results of the finished product quality attributes corresponding to the optimized formula and process parameters are collected and compared with the corresponding initial quality prediction values to obtain the quality prediction deviation value.
[0058] Specifically, after determining the optimized formula and process parameter combination and putting it into actual production, the measured results of the finished product quality attributes under the corresponding formula batch are collected, and the measured results of the finished product quality attributes are matched one-to-one with the initial quality prediction values previously obtained under the same formula composition conditions. By calculating the difference between the initial quality prediction value and the measured results of the finished product quality attributes, the quality prediction deviation value, which characterizes the degree of deviation of the prediction result, is obtained.
[0059] Based on the distribution of quality prediction deviation values in different formulation batches, the validity of the relationship between relevant variables in the causal constraint structure of formulation quality is updated, and the mapping logic used in the mapping calculation of finished product quality attributes is corrected.
[0060] Specifically, the quality prediction deviation values obtained from multiple optimized formulation batches will be summarized and organized, and the quality prediction deviation values will be classified at the path level according to the causal dependency path defined in the formulation quality causal constraint structure. The deviation direction, deviation magnitude and persistence of each causal dependency path in different formulation batches are compared and analyzed; when the quality prediction deviation value corresponding to a certain causal dependency path continues to show the same deviation in multiple formulation batches, the relationship between each variable is traced back along the causal dependency path to determine the degree of correspondence between the variable change and the measured results of the finished product quality attributes. When it is found that a certain directed constraint relationship can no longer stably reflect the actual impact of the variable on the finished product quality attributes, the directed constraint relationship corresponding to the formulation quality causal constraint structure is adjusted or removed. After updating the causal constraint relationship, based on the updated causal constraint structure of the formula quality, the range of causal features involved in the prediction and their combination order in the finished product quality attribute mapping calculation are synchronously corrected. Only causal features located in the updated causal dependency path are retained to participate in the mapping calculation. 
The effect relationship of each causal feature on the finished product quality attribute is reorganized according to the updated causal dependency order, so that the prediction deviation between the subsequent initial quality prediction value and the measured finished product quality attribute gradually decreases, thereby realizing the iterative calibration of the formula quality causal constraint structure and mapping logic.
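The detection of a persistent, same-direction deviation along a causal dependency path can be sketched as follows. The sketch assumes deviations have already been grouped by path and flags a path when its deviations share one sign across enough batches and are large on average. Both thresholds and the path names are hypothetical; the subsequent adjustment or removal of the directed constraint is left to the reviewer of the flagged path.

```python
from typing import Dict, List


def flag_persistent_paths(
    deviations_by_path: Dict[str, List[float]],
    min_batches: int = 3,
    min_mean_abs: float = 0.5,
) -> List[str]:
    """Return the paths whose prediction deviations are persistently one-sided."""
    flagged = []
    for path, deviations in deviations_by_path.items():
        if len(deviations) < min_batches:
            continue  # not enough batches to judge persistence
        same_direction = all(d > 0 for d in deviations) or all(
            d < 0 for d in deviations
        )
        mean_abs = sum(abs(d) for d in deviations) / len(deviations)
        if same_direction and mean_abs >= min_mean_abs:
            flagged.append(path)
    return flagged


# Hypothetical deviations (predicted minus measured) over three batches.
deviations = {
    "formulation_path": [0.8, 1.1, 0.9],   # consistently over-predicted
    "process_path": [0.6, -0.7, 0.2],      # no persistent direction
}
print(flag_persistent_paths(deviations))
```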
[0061] Methods for applying controlled perturbations to each causal feature separately include: In the set of causal features associated with quality attributes, a single causal feature is selected as the target causal feature in turn, and the values of the remaining causal features are fixed to the corresponding values under the current formula conditions.
[0062] Specifically, the causal feature set associated with quality attributes is used as the operation object. According to the causal dependency order determined in the causal feature set associated with quality attributes, each causal feature is traversed and selected one by one. When a certain causal feature is selected as the target causal feature, the values of the other causal features are kept as the corresponding values used to obtain the initial predicted value of quality under the current formula composition conditions. This ensures that only the target causal feature changes during the perturbation process, eliminating the interference of changes in the values of other causal features on the calculation results of the finished product quality attribute mapping.
[0063] For the target causal feature, without changing its causal hierarchy and dependency relationship in the causal constraint structure of formulation quality, a positive or negative perturbation with a limited range is applied to its value to form the perturbed causal feature value.
[0064] It should be noted that the target feature value can be fine-tuned without disrupting the original causal constraint structure of the formula quality. The perturbation range can be set as a small percentage of the original value (such as ±1% to ±10%), one to two standard deviations of the historical statistical data, or a fixed adjustment amount allowed by the process. The perturbation direction can be positive or negative, but the perturbed causal feature value must remain within the range of values that the causal feature has taken in historical formula batches. The perturbed causal feature value is thus obtained.
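A bounded perturbation of this kind can be sketched as a relative step followed by clamping to the historical range. The sketch uses the percentage option; the 5% default, the clamping rule, and the example values are assumptions chosen for illustration.

```python
from typing import List


def perturbed_value(
    value: float,
    history: List[float],
    rel_step: float = 0.05,
    direction: int = 1,
) -> float:
    """Apply a relative perturbation and keep the result in the historical range.

    direction: +1 for a positive perturbation, -1 for a negative one.
    """
    candidate = value * (1 + direction * rel_step)
    lowest, highest = min(history), max(history)
    # Clamp so the perturbed value stays within values seen in past batches.
    return min(max(candidate, lowest), highest)


# Hypothetical inlet temperatures (degrees Celsius) from historical batches.
history = [170.0, 175.0, 185.0]
print(perturbed_value(180.0, history, direction=1))   # clamped to the maximum
print(perturbed_value(180.0, history, direction=-1))  # stays inside the range
```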
[0065] Based on the quality attribute association causal feature set containing the causal feature values after perturbation, the finished product quality attribute mapping calculation is re-executed to obtain the corresponding initial predicted value of quality after perturbation.
[0066] It should be noted that during the calculation of the finished product quality attribute mapping, the effects of each causal feature on the finished product quality attribute are calculated by superimposing or combining them according to the predetermined causal dependency order in the set of causal features associated with the quality attribute. Only the target causal feature is replaced with the value of the causal feature after the perturbation, while the other causal features remain fixed. This ensures that the difference between the initial predicted quality value before and after the perturbation is caused only by the change in the target causal feature.
[0067] The initial predicted quality value after the perturbation is compared with the initial predicted quality value before the perturbation. Based on the amount of change between the two, the influence intensity of the target causal feature under the current formulation conditions is determined.
[0068] It should be noted that the influence intensity corresponding to the target causal feature is calculated by the difference between the initial predicted quality value after perturbation and the initial predicted quality value obtained without perturbation. This difference is used to characterize the degree of influence of a small change in the value of the target causal feature on the prediction result of the finished product quality attribute, while keeping the values of other causal features unchanged. When the difference is large, it indicates that the target causal feature has a strong influence on the formation process of the finished product quality attribute under the current formulation composition conditions. When the difference is small, it indicates that the influence of the target causal feature is relatively weak.
[0069] This embodiment also provides a quality prediction system for emulsion-based powder formulations, including: The data acquisition module is used to collect formulation composition data, spectral detection data, and process data of emulsion base powder, and to perform standardized preprocessing to output a formulation quality correlation dataset. The causal construction module is used to perform causal relationship analysis and multimodal cross-validation on the formula quality association dataset, and to construct the causal constraint structure for formula quality. The causal feature module is used to extract features and fuse causal perception of the formula quality correlation dataset with the finished product quality by utilizing the causal constraint structure of formula quality to generate a causal feature set of formula quality. The prediction module is used to predict the finished product quality attributes of emulsion base powder according to the formulation composition by mapping logic established by the causal feature set of the formulation quality according to the causal constraint structure of the formulation quality; and to decouple the contribution of influencing factors and quantify the uncertainty of the prediction results, and output the quality prediction results. The optimization module is used to optimize the formulation and process parameters based on the quality prediction results, and to iteratively calibrate the causal constraint structure and mapping logic of the formulation quality.
[0070] This embodiment also provides a computer device applicable to the method for predicting the quality of emulsion-based powder formulations, comprising: a memory and a processor; the memory is used to store computer-executable instructions, and the processor is used to execute the computer-executable instructions to implement the method for predicting the quality of emulsion-based powder formulations as proposed in the above embodiment.
[0071] The computer device can be a terminal, comprising a processor, memory, communication interface, display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, carrier networks, NFC (Near Field Communication), or other technologies. The display screen can be an LCD screen or an e-ink screen. The input devices can be a touch layer covering the display screen, buttons, a trackball, or a touchpad on the computer device's casing, or an external keyboard, touchpad, or mouse.
[0072] This embodiment also provides a storage medium storing a computer program that, when executed by a processor, implements the method for predicting the quality of emulsion-based powder formulations as proposed in the above embodiments. The storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.
[0073] In summary, this invention constructs and constrains the causal structure of formulation quality through causal relationship analysis and multimodal cross-validation. Based on this, it generates a set of causal features for formulation quality with clear directions of action and dependent paths, enabling a clear characterization of the causal relationships between formulation composition parameters, process parameters, and finished product quality attributes. This avoids the mechanistic ambiguity caused by relying solely on statistical correlation for quality prediction, providing a stable causal explanation basis for the prediction process. Furthermore, under causal constraints, controlled perturbation analysis is performed on each causal feature, and the changes in prediction results are quantitatively attributed to causal dependent paths. This decouples the influence intensity and action path of each formulation composition parameter and process parameter in the quality formation process, allowing the quality prediction results to clearly identify the contribution source and transmission relationship of each influencing factor while providing predicted values. This effectively supports targeted formulation and process parameter optimization based on the prediction results, improving the interpretability and usability of the prediction results and reducing R&D trial-and-error costs.
[0074] It should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, and all such modifications or substitutions should be covered within the scope of the claims of the present invention.
Claims
1. A method for predicting the quality of emulsion-based powder formulations, characterized in that it includes: Collecting formulation composition data, spectral detection data, and process data of emulsion base powder, performing standardized preprocessing, and outputting a formulation quality correlation dataset; Performing causal relationship analysis and multimodal cross-validation on the formulation quality correlation dataset to construct a causal constraint structure for formulation quality; Using the causal constraint structure of formula quality, performing feature extraction and causal perception fusion, with respect to finished product quality, on the formula quality association dataset to generate a causal feature set of formula quality; Using the mapping logic established from the causal feature set of formula quality according to the causal constraint structure of formula quality, predicting the finished product quality attributes of the emulsion base powder according to the formulation composition to obtain an initial predicted quality value, decoupling the contribution of influencing factors and quantifying the uncertainty, and outputting the quality prediction results; Based on the quality prediction results, optimizing the formulation and process parameters, and iteratively calibrating the causal constraint structure and mapping logic of the formulation quality.
2. The method for predicting the quality of emulsion-based powder formulations as described in claim 1, characterized in that, The method for outputting the formula quality association dataset includes: Simultaneously collecting formulation composition data, spectral detection data, and process data of emulsion base powder, and aligning them at the sample level according to the execution timestamp of the formulation batch to form an original multi-source dataset matched by formulation batch; Performing data standardization preprocessing on the original multi-source dataset, and combining it by formula batch to output the formula quality correlation dataset.
3. The method for predicting the quality of emulsion-based powder formulations as described in claim 2, characterized in that, The method for constructing a causal constraint structure for formulation quality includes: Based on the formula quality association dataset, formula composition data, spectral detection data, process data and finished product quality attributes are used as the set of analysis variables; the statistical dependencies between different variable pairs are extracted, and combined with the order information of variables in the formula batch dimension, variable pairs with stable statistical correlation and reasonable time sequence are screened to form a set of causal candidate relationships. Consistency verification of the causal direction of the same variable pair in the causal candidate relation set is performed, retaining variable relations with consistent causal directions and eliminating variable relations with inconsistent directions; The set of causal candidate relationships after causal consistency verification is cross-validated in different batch subsets of the formula quality association dataset to retain stable and reproducible variable relationships in different batches, thus forming a stable set of causal relationships. Based on a set of stable causal relationships, each variable is treated as a node, and the stable causal relationships are organized as directed constraints to construct a causal constraint structure for formula quality that limits the direction of variable action and allows dependency paths.
4. The method for predicting the quality of emulsion-based powder formulations as described in claim 3, characterized in that, The method for generating the causal feature set of formula quality includes: Based on the causal constraint structure of formula quality, causal path limitation and filtering are performed on each variable in the formula quality association dataset, and variables that have effective directed paths pointing to the finished product quality attributes in the causal constraint structure of formula quality are retained. Based on the causal direction and dependency order defined by the causal constraint structure of the formulation quality, hierarchical feature extraction is performed on the retained variables to form causal constraint features; Based on the stable causal relationship between variables in the causal constraint structure of formula quality, the causal constraint features are checked for causal consistency, and causal features with conflicting causal directions between formula batches are eliminated. The various causal constraint features for causal consistency verification are fused according to the causal dependency relationship defined by the causal constraint structure of formula quality to generate a causal feature set for formula quality.
5. The method for predicting the quality of emulsion-based powder formulations as described in claim 4, characterized in that predicting the finished product quality attributes of the emulsion-based powder according to the formulation composition includes: according to the direction of variable action and the causal dependency paths defined in the causal constraint structure for formulation quality, organizing the causal features in the causal feature set of formulation quality in order to form a quality-attribute-associated causal feature set; and, based on the mapping relationship between the quality-attribute-associated causal feature set and the finished product quality attributes of historical formulation batches, performing the finished product quality attribute mapping calculation on the quality-attribute-associated causal feature set to obtain the initial predicted quality value under the corresponding formulation composition conditions; wherein, during the finished product quality attribute mapping calculation, only causal features located on a valid directed path pointing to the finished product quality attribute in the causal constraint structure for formulation quality are allowed to participate in the prediction calculation.
6. The method for predicting the quality of emulsion-based powder formulations as described in claim 5, characterized in that decoupling the contribution of influencing factors and quantifying uncertainty includes: based on the quality-attribute-associated causal feature set, applying a controlled perturbation to each causal feature in turn while keeping the values of the other causal features unchanged, and calculating the influence intensity of each causal feature from the change in the initial predicted quality value under the current formulation conditions before and after the perturbation, thereby forming the causal feature influence intensities; based on the causal dependency paths defined by the causal constraint structure for formulation quality, assigning and aggregating the influence intensity of each causal feature by causal dependency path to form the decoupling result of the contribution of influencing factors; based on the distribution of prediction residuals between the initial predicted quality values of historical formulation batches and the corresponding measured results, performing uncertainty quantification on the current initial predicted quality value; and associating and organizing the initial predicted quality value, the decoupling result of the contribution of influencing factors, and the uncertainty description to form the quality prediction result.
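The residual-based uncertainty quantification of claim 6 can be illustrated, under stated assumptions, by an empirical interval built from historical prediction residuals. The claim does not specify the statistical procedure; the truncated-rank quantile used here, the function name `residual_interval`, the `coverage` parameter, and all numbers are hypothetical.

```python
def residual_interval(predicted, measured, new_prediction, coverage=0.8):
    """Shift the empirical residual range onto a new prediction.

    Assumption: residual = measured - predicted for historical batches,
    and the residual distribution also applies to the new formulation.
    """
    residuals = sorted(m - p for p, m in zip(predicted, measured))
    n = len(residuals)
    # Crude quantile: trim the same number of residuals from each tail.
    lo_idx = int((1 - coverage) / 2 * (n - 1))
    hi_idx = (n - 1) - lo_idx
    return new_prediction + residuals[lo_idx], new_prediction + residuals[hi_idx]

# Made-up historical predictions and measurements for a quality attribute.
hist_predicted = [90, 92, 88, 95, 91]
hist_measured = [91, 91, 89, 93, 92]
low, high = residual_interval(hist_predicted, hist_measured, new_prediction=90)
```

With only five historical batches no residual is trimmed, so the interval simply spans the smallest and largest historical residual around the new prediction; a larger history would make the `coverage` setting meaningful.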
7. The method for predicting the quality of emulsion-based powder formulations as described in claim 6, characterized in that optimizing the formulation and process parameters based on the quality prediction result includes: based on the quality prediction result, identifying the formulation composition parameters and process parameters whose contribution values in the decoupling result of the contribution of influencing factors are non-zero; adjusting the identified formulation composition parameters and process parameters to form candidate optimized combinations of formulation and process parameters; and performing finished product quality attribute prediction on the candidate optimized combinations and selecting, as the optimization result, the combinations of formulation and process parameters whose quality prediction results meet the preset quality requirements.
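The candidate generation and selection of claim 7 can be sketched as a small grid search over the parameters with non-zero contribution. The grid of relative adjustments, the function name `optimise`, the toy `predict` function, and the threshold `target` are all assumptions for illustration; the application does not prescribe a particular search strategy.

```python
from itertools import product

def optimise(current, contributions, predict, target, step=0.05, n_steps=2):
    """Return accepted (score, candidate) pairs, best score first."""
    # Only parameters with a non-zero contribution are adjusted.
    adjustable = [name for name, c in contributions.items() if c != 0]
    factors = [1 + step * k for k in range(-n_steps, n_steps + 1)]
    accepted = []
    for combo in product(factors, repeat=len(adjustable)):
        candidate = dict(current)
        for name, factor in zip(adjustable, combo):
            candidate[name] = current[name] * factor
        score = predict(candidate)
        if score >= target:  # preset quality requirement (hypothetical threshold)
            accepted.append((score, candidate))
    return sorted(accepted, key=lambda item: item[0], reverse=True)

# Toy stand-in for the causal mapping; not the mapping logic of the application.
def predict(x):
    return 60 + 80 * x["protein_ratio"] - 0.2 * x["dry_temp"]

current = {"protein_ratio": 0.30, "dry_temp": 180.0, "fat_ratio": 0.20}
contributions = {"protein_ratio": 1.2, "dry_temp": -0.8, "fat_ratio": 0.0}
results = optimise(current, contributions, predict, target=50.0)
best_score, best_candidate = results[0]
```

Because `fat_ratio` has a zero contribution in this toy example, it is left at its current value in every candidate, which keeps the search space limited to the parameters that the decoupling result identifies as influential.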
8. The method for predicting the quality of emulsion-based powder formulations as described in claim 7, characterized in that iteratively calibrating the causal constraint structure for formulation quality and the mapping logic includes: collecting the measured results of the finished product quality attributes corresponding to the optimized formulation and process parameters, and comparing them with the corresponding initial predicted quality values to obtain quality prediction deviation values; and, based on the distribution of the quality prediction deviation values across different formulation batches, updating the validity of the relationships between the relevant variables in the causal constraint structure for formulation quality and correcting the mapping logic used in the finished product quality attribute mapping calculation.
9. The method for predicting the quality of emulsion-based powder formulations as described in claim 8, characterized in that applying controlled perturbations to each causal feature includes: in the quality-attribute-associated causal feature set, selecting a single causal feature in turn as the target causal feature, and fixing the values of the remaining causal features at their values under the current formulation conditions; for the target causal feature, without changing its causal hierarchy and dependency relationships in the causal constraint structure for formulation quality, applying a positive or negative perturbation of limited range to its value to form the perturbed causal feature value; based on the quality-attribute-associated causal feature set containing the perturbed causal feature value, re-executing the finished product quality attribute mapping calculation to obtain the corresponding initial predicted quality value after perturbation; and comparing the initial predicted quality value after perturbation with the initial predicted quality value before perturbation, and determining, from the amount of change between the two, the influence intensity of the target causal feature under the current formulation conditions.
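The one-at-a-time perturbation procedure of claim 9 can be sketched as follows. The relative perturbation size `rel_step`, the choice to take the larger of the positive and negative changes as the influence intensity, the function name `influence_intensity`, and the toy `predict` function are assumptions made for illustration.

```python
def influence_intensity(features, predict, rel_step=0.05):
    """Perturb one causal feature at a time and record the prediction change."""
    baseline = predict(features)
    intensity = {}
    for name, value in features.items():
        changes = []
        for sign in (+1, -1):
            perturbed = dict(features)  # all other features stay fixed
            perturbed[name] = value * (1 + sign * rel_step)
            changes.append(abs(predict(perturbed) - baseline))
        # Assumption: report the larger of the two absolute changes.
        intensity[name] = max(changes)
    return intensity

# Toy stand-in for the causal mapping, with an interaction term; fat_ratio is unused.
def predict(x):
    return 60 + 80 * x["protein_ratio"] - 0.2 * x["dry_temp"] \
        - 0.1 * x["protein_ratio"] * x["dry_temp"]

current = {"protein_ratio": 0.30, "dry_temp": 180.0, "fat_ratio": 0.20}
intensity = influence_intensity(current, predict)
```

Because the perturbation only changes a feature's value and never the graph, the causal hierarchy and dependency relationships of the target feature remain untouched, as the claim requires.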
10. A quality prediction system for emulsion-based powder formulations, based on the method for predicting the quality of emulsion-based powder formulations according to any one of claims 1 to 9, characterized in that the system includes: a data acquisition module, used to collect formulation composition data, spectral detection data, and process data of the emulsion-based powder, and to perform standardized preprocessing to output a formulation quality association dataset; a causal construction module, used to perform causal relationship analysis and multimodal cross-validation on the formulation quality association dataset and to construct the causal constraint structure for formulation quality; a causal feature module, used to perform causality-aware feature extraction and fusion on the formulation quality association dataset with respect to finished product quality, using the causal constraint structure for formulation quality, to generate a causal feature set of formulation quality; a prediction module, used to predict the finished product quality attributes of the emulsion-based powder according to the formulation composition, using mapping logic established from the causal feature set of formulation quality in accordance with the causal constraint structure for formulation quality, and to decouple the contribution of influencing factors and quantify the uncertainty of the prediction, outputting the quality prediction result; and an optimization module, used to optimize the formulation and process parameters based on the quality prediction result and to iteratively calibrate the causal constraint structure for formulation quality and the mapping logic.