Humanized automation control of industrial processes
By combining machine learning models with MPC, the system predicts operators' intervention tendencies and adjusts control outputs, solving the problem of frequent manual intervention by operators and achieving optimal operation and comprehensibility of industrial processes.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ABB (SCHWEIZ) AG
- Filing Date
- 2020-07-07
- Publication Date
- 2026-06-26
Figure CN116520705B_ABST
Abstract
Description
[0001] Divisional Application Instructions
[0002] This application is a divisional application of the Chinese invention patent application filed on July 7, 2020, with application number 202010644564.4 and entitled "Humanized automated control of industrial processes".
Technical Field
[0003] The present invention relates to a method for automatically controlling an industrial process in a manner that appears more reasonable to operators who monitor the industrial process as a backup.
Background Technology
[0004] Most industrial processes today are controlled automatically by process control systems (such as distributed control systems). However, operators still monitor the process as a backup. If a problem occurs and the automated control system cannot handle it, the operator can override the automated control system's recommended actions with manual intervention. For example, the control system may malfunction or be unable to handle unexpected sensor or actuator failures.
[0005] Operators are trained to monitor the development of certain state variables in a process and to manually intervene if that development becomes unreasonable for them. In this way, operators ensure the process proceeds in an orderly and safe manner, and that all constraints of the process are observed. However, each operator intervention incurs a cost. Automated control systems typically follow a trajectory in the state space chosen according to optimality criteria (such as saving materials or energy). If an operator intervenes, this trajectory deviates, and the process proceeds in a manner that is not optimal relative to those criteria.
[0006] It has been observed that operators tend to intervene frequently even when the automated control system is capable of handling the situation. As a result, processes operate in a suboptimal manner for a significant proportion of the time.
[0007] Invention Objective
[0008] Therefore, the objective of this invention is to allow industrial processes to be controlled in a manner that is unlikely to trigger unnecessary human intervention.
[0009] This objective is achieved through the method for controlling an industrial process as described in the main claim, the method for training a machine learning model as described in another independent claim, and the corresponding software product. More advantageous embodiments are described in detail in the corresponding dependent claims.
Summary of the Invention
[0010] This invention provides a method for controlling an industrial process. In this method, a process controller obtains a set of current and / or past values of the process's state variables. For example, these values may be provided as time-series data. Based at least in part on the set of values, the process controller generates a set of control outputs, which are translated into at least one physical operation in the process. Specifically, a closed feedback loop may exist, in which the process controller continuously obtains new values of the process's state variables. These new values indicate the industrial process's response to a previous physical operation caused by the process controller's control outputs.
[0011] The control output can be applied directly to at least one actuator, which in turn performs the physical operation. Alternatively or in combination, the control output can be applied to a lower-level controller, which in turn acts on such an actuator. For example, the control output may include a setpoint (such as a desired temperature or pressure) for the lower-level controller, and the lower-level controller may control the actuator (such as a heater or valve) to keep the process state variable (such as the temperature or pressure) close to that setpoint.
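By way of a non-limiting sketch, the closed loop of [0010] and the setpoint cascade of [0011] could be arranged as follows. The first-order thermal process, the proportional lower-level controller, the fixed setpoint, and every name and numeric value below are assumptions made for illustration; none of them is taken from the specification.

```python
def process_controller(history):
    """Hypothetical supervisory controller: returns a temperature setpoint.

    A real process controller would evaluate the recorded state values;
    this placeholder always requests 80.0 so the loop structure stays visible.
    """
    return 80.0


def lower_level_controller(setpoint, measured, gain=0.5):
    """Hypothetical proportional controller that drives a heater (the actuator)."""
    return gain * (setpoint - measured)


def plant_step(temperature, heater_power, ambient=20.0, loss=0.1):
    """Assumed first-order thermal process, advanced by one time step."""
    return temperature + heater_power - loss * (temperature - ambient)


def run_loop(steps=200, start=20.0):
    history = [start]  # time series of the state variable
    for _ in range(steps):
        setpoint = process_controller(history)                 # control output
        power = lower_level_controller(setpoint, history[-1])  # actuator command
        history.append(plant_step(history[-1], power))         # process response
    return history
```

With these assumed values the loop settles near 70 rather than at the setpoint of 80, which is the steady-state offset expected from proportional-only control.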
[0012] A trained machine learning model is queried based on at least a subset of the set of current values and at least a subset of the set of control outputs. The trained machine learning model is configured to output a classification and / or regression value that indicates the tendency of the monitoring operator to at least partially override the control outputs provided by the process controller.
[0013] In response to determining that the classification and / or regression value, and / or the tendency, meets a predetermined criterion, a countermeasure is taken to reduce the operator's tendency to override the control outputs. For example, the criterion may include a threshold on the probability of an operator override.
[0014] The countermeasure may include modifying at least one of the control outputs, and / or at least one parameter characterizing the behavior of the process controller, and / or at least one constraint on the operation of the process controller. For example, the parameter may represent the optimization objective pursued by the control of the process. Adjusting at least one constraint on the operation of the process controller is particularly advantageous if the process controller itself is only available as a "black box" whose internal workings and parameters are unknown. In this respect, a constraint can be viewed as an abstract handle through which the internal workings of the process controller can be influenced from the outside.
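The check described in [0012] and [0013] might, for instance, look like the sketch below. The function names, the label string "override", the clamping to the interval from 0 to 1, and the threshold of 0.3 are all assumptions; the specification only states that the criterion may include a probability threshold.

```python
def tendency_from_outputs(classification=None, regression=None):
    """Derive an override tendency in [0, 1] from whichever model output exists.

    Assumption: a regression value is read as a probability, and a
    classification value is one of the hypothetical labels "override"/"accept".
    """
    if regression is not None:
        return min(1.0, max(0.0, regression))
    return 1.0 if classification == "override" else 0.0


def countermeasure_needed(tendency, threshold=0.3):
    """Predetermined criterion: the tendency reaches an assumed threshold."""
    return tendency >= threshold
```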
[0015] The end result of this modification is that the final selected physical operations applied to the industrial process are more reasonable for the operators. That is, these operations are more in line with the results expected by the operators based on their understanding of the process, so that the operators have little reason to suspect that any problems have occurred and require manual intervention.
[0016] The inventors have discovered that while the increasing complexity of process controllers enables industrial processes to operate better with respect to given optimality criteria, it also makes the control outputs more difficult for operators to understand. A PID controller comprises three components, each with an easily understandable dependency on the input: proportional to the input itself, to the integral of the input, and to the derivative of the input, respectively. In contrast, a Model Predictive Controller (MPC) contains a mathematical model that approximates the process behavior. Creating such a model is a task for experts, and from the perspective of an average process operator, the output's dependency on the input is not direct.
[0017] In an illustrative example, running a process in the best way according to optimality criteria might not be straightforward, because the trajectory the process follows in state space needs to satisfy constraints that can be regarded as forbidden zones in that space. Complex process controllers (such as MPCs) know about these forbidden zones through the process model and can send the process along a circuitous trajectory in state space to bypass them. However, the operator is unaware of these forbidden zones, so from their perspective, the trajectory suddenly deviates without any apparent reason. The modification makes the trajectory less optimal but, in return, the trajectory contains fewer detours around forbidden zones.
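For comparison, the three easily understood components of a PID controller mentioned in [0016] can be written out in a few lines. This is a textbook discrete-time form given purely for illustration; the class name, the gains, and the sampling interval are assumptions.

```python
class PID:
    """Textbook discrete PID: output = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, error):
        self.integral += error * self.dt                    # integral component
        if self.prev_error is None:
            derivative = 0.0                                # no history yet
        else:
            derivative = (error - self.prev_error) / self.dt  # derivative component
        self.prev_error = error
        return (self.kp * error                             # proportional component
                + self.ki * self.integral
                + self.kd * derivative)
```

Each term can be traced back to the input by inspection, which is the property that, according to [0016], an MPC lacks from the operator's point of view.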
[0018] In another example, while the operator may know which constraints need to be satisfied, they may not be certain whether the strategy proposed by the MPC actually satisfies those constraints. For instance, if a state variable increases or decreases rapidly, it may not be immediately obvious to the operator whether the maximum or minimum constraint for that state variable will always be satisfied. This example will be discussed in detail below.
[0019] The countermeasure may also include conveying at least one message to the operator. For example, such a message could indicate that the process controller is functioning correctly, or that components included in the control of the process (such as sensors, actuators, or lower-level controllers) have just undergone a self-test and have been found to be functioning correctly.
[0020] Communicating the message to the operator therefore does not change the control output in order to make it more reasonable to the operator. Instead, it strives to modify the operator's understanding so that, through this modified understanding, the unmodified control output becomes reasonable.
[0021] Regardless of which strategy is adopted, individually or in combination, the end result always leads in the same direction: the probability that the industrial process proceeds along a trajectory in the state space that is better according to the given optimality criterion increases, because human-induced deviations from such a good trajectory into poorer regions of the state space are minimized.
[0022] In a particularly advantageous embodiment, the classification and / or regression value includes a probable reason why the operator overrides the control outputs provided by the process controller. The modification performed as part of the countermeasure can then be specifically aimed at reducing the prevalence of that probable reason in the behavior of the process.
[0023] In an illustrative example, the optimal trajectory might require the pump motor to be driven with a certain frequency spectrum. However, when the pump motor is driven with this frequency spectrum, the pump may produce a harsh noise very similar to the noise produced when the pump's bearing fails. The operator hears the noise and takes manual control of the pump to reduce its speed because of the suspected bearing failure. However, since the bearing has not actually failed, this results in an unnecessary deviation from the optimal trajectory. The modification can then change the noise emitted by the pump so that it no longer resembles the sound of a failed bearing.
[0024] In a particularly advantageous embodiment, the probable reasons for an override specifically include an overshoot or undershoot of at least one state variable in the process. Overshoot can specifically include the state variable increasing beyond the target value to which it should have increased. Similarly, undershoot can specifically include the state variable decreasing below the target value to which it should have decreased. If a state variable increases or decreases very rapidly over a period of time, this can give the operator the impression that the state variable is out of control due to some kind of malfunction. For example, a valve may be stuck in the open position, or a thermostat switch may be fused shut and unable to open.
[0025] The MPC knows the process model and therefore knows the maximum rate at which the rate of increase or decrease of a state variable can be changed. Therefore, if the MPC deems it advantageous to bring a state variable to a new target value in order to optimize the process according to the optimality criteria, it can initially increase or decrease the state variable at the maximum possible rate and then decelerate at the last moment, still avoiding overshoot or undershoot. In this way, the new target value is reached as quickly as possible, and the total time during which the process runs optimally according to the optimality criteria is maximized. However, the operator may not know the process model in as much detail as the MPC and may therefore suspect that some error has occurred and that the state variable is increasing or decreasing uncontrollably. Because the operator needs more time than the MPC to slow down the increase or decrease, the operator must decide to take manual control before the point at which the MPC plans to slow it down.
[0026] In this situation, it is particularly advantageous to gradually slow down the rate at which the state variable of the process increases or decreases. In this way, although the state variable takes longer to reach its target value, the decreasing rate of change immediately shows the operator that the process control is functioning correctly and is acting proactively.
[0027] If a message is communicated to the operator, it may specifically include an explanation of the control strategy behind the control outputs to be applied. In the context of the illustrative example above, such a message could read, "There is a restricted area in the state space directly ahead, and turning left around that restricted area is also prohibited. Therefore, I will turn right to bypass the restricted area." Alternatively, or in combination, the planned trajectory can be plotted so that the operator can see that the MPC knows all the constraints it must satisfy and actually plans to satisfy them. The operator then only needs to roughly check whether the observed behavior is consistent with this explanation.
[0028] Alternatively or in combination, the message may specifically include an invitation for the operator to select one of several candidate control strategies to apply to the process. These control strategies are more reasonable to the operator because they require less difficult maneuvering around the restricted area, at the cost of not being optimal with respect to the given optimality criterion. However, they are still far better than the path resulting from manual intervention by the operator.
[0029] In a particularly advantageous embodiment, the modification performed as part of the countermeasure specifically includes generating multiple candidate sets of control outputs. For each candidate set of control outputs, the trained machine learning model is queried again to obtain a candidate classification value and / or a candidate regression value. A candidate set of control outputs may yield a candidate classification value and / or candidate regression value indicating a lower override tendency of the operator than the current set of control outputs to be applied. If this occurs, that candidate set of control outputs becomes the new set of control outputs to be applied.
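The difference between the two approach strategies of [0025] and [0026] can be sketched numerically. Both functions below are illustrative assumptions: they treat the state variable as directly commandable, assume an increasing move (target above start), and use an assumed proportional taper; the specification does not prescribe any particular deceleration law.

```python
def fastest_profile(start, target, max_rate, dt=1.0):
    """Climb at the maximum rate and stop only at the last moment."""
    values = [start]
    while values[-1] < target:
        values.append(min(target, values[-1] + max_rate * dt))
    return values


def tapered_profile(start, target, max_rate, taper=0.3, tol=0.01, dt=1.0):
    """Rate proportional to the remaining distance, capped at max_rate.

    With taper * dt < 1 each step covers only part of the remaining
    distance, so the rate shrinks visibly as the target is approached.
    """
    values = [start]
    while target - values[-1] > tol:
        rate = min(max_rate, taper * (target - values[-1]))
        values.append(values[-1] + rate * dt)
    return values
```

For example, `fastest_profile(0.0, 10.0, 2.0)` reaches the target in fewer steps than `tapered_profile(0.0, 10.0, 2.0)`, but only the tapered profile shows the operator a steadily decreasing rate of change well before the target is reached.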
[0030] In this way, an active search can be performed in the space of possible control outputs. A candidate set of control outputs can be obtained using any suitable search strategy. The search strategy can be particularly dependent on the available time for the search, and this time depends on the rate of the industrial process to be controlled. In most large-scale industrial processes, the time constant is long enough that at least a few seconds can be allocated for the search.
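One possible search strategy for [0029] and [0030] is a simple random search around the current set of control outputs, sketched below. The random perturbation, the fixed candidate budget, and the stand-in model `toy_override_model` are assumptions; the specification leaves the search strategy open.

```python
import random


def search_less_surprising_output(state, current_output, predict_override,
                                  n_candidates=50, spread=1.0, seed=0):
    """Keep the candidate set with the lowest predicted override tendency."""
    rng = random.Random(seed)
    best_output = list(current_output)
    best_tendency = predict_override(state, best_output)
    for _ in range(n_candidates):
        # Candidate set: the current outputs plus a bounded random perturbation.
        candidate = [u + rng.uniform(-spread, spread) for u in current_output]
        tendency = predict_override(state, candidate)   # query the model again
        if tendency < best_tendency:
            best_output, best_tendency = candidate, tendency
    return best_output, best_tendency


def toy_override_model(state, output):
    """Hypothetical stand-in for the trained model: large moves look alarming."""
    return min(1.0, abs(output[0]) / 5.0)
```

This sketch minimizes the override tendency only. A practical search would presumably also weigh the loss in optimality, and the candidate budget would be chosen to fit the time available for the search.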
[0031] In another particularly advantageous embodiment, the determination of the set of control outputs by the process controller can be performed using model predictive control (MPC). In MPC, a model of the industrial process is used to predict how the state variables evolve from a given set of values to a new candidate set of values in response to a given candidate set of control outputs being applied to the process. Multiple candidate sets of control outputs are tested in this manner, and a goodness value is assigned to each resulting candidate set of state variables based on at least one optimality criterion. A candidate set of state variables whose goodness value satisfies a predetermined criterion is determined. The candidate set of control outputs corresponding to this candidate set of state variables is determined as the set of control outputs.
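The selection step of [0031] can be illustrated with a one-step sketch. The scalar toy model, the goodness function, and the choice of "highest goodness value" as the predetermined criterion are assumptions; an actual MPC typically optimizes over a prediction horizon.

```python
def mpc_select(state, candidate_outputs, predict, goodness):
    """Pick the candidate control output whose predicted state scores best."""
    best_output, best_score = None, float("-inf")
    for output in candidate_outputs:
        predicted_state = predict(state, output)  # process model prediction
        score = goodness(predicted_state)         # goodness value
        if score > best_score:                    # assumed criterion: maximum
            best_output, best_score = output, score
    return best_output, best_score


def toy_predict(state, output):
    """Hypothetical process model: the output shifts the state directly."""
    return state + output


def toy_goodness(predicted_state, target=5.0):
    """Hypothetical optimality criterion: closeness to an assumed target."""
    return -abs(predicted_state - target)
```

Calling `mpc_select(3.0, [-1.0, 0.0, 1.0, 2.0], toy_predict, toy_goodness)` selects the output 2.0, because it moves the toy state exactly onto the assumed target.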
[0032] As mentioned above, MPC performs exceptionally well in bypassing the forbidden zones in the state space, but at the cost of producing trajectories that are difficult for operators to understand. The countermeasures described above make the trajectories easier to understand and reduce the probability of operators overriding MPC.
[0033] Furthermore, MPC can be used again for the active search of new candidate sets of control outputs. For example, the constraints on the set of output values determined by MPC can be modified to obtain new candidate sets of output values. For instance, to avoid giving the impression that a state variable is about to overshoot, the constraint on the maximum value of the state variable can be set to a lower value.
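The constraint-based variant of [0033] treats the controller as a black box and only moves a constraint. The loop below is a sketch under stated assumptions: `run_controller`, the step size, the threshold, and both toy stand-ins are hypothetical and not part of the specification.

```python
def tighten_until_accepted(run_controller, predict_override, state,
                           upper_limit, threshold=0.3, step=0.5, max_runs=10):
    """Lower an upper-limit constraint until the predicted override tendency
    drops below the threshold or the run budget is used up."""
    limit = upper_limit
    output = run_controller(state, limit)
    for _ in range(max_runs):
        if predict_override(state, output) < threshold:
            break
        limit -= step                           # tighten the constraint
        output = run_controller(state, limit)   # rerun the black-box controller
    return output, limit


def toy_controller(state, limit):
    """Hypothetical black box: drives the state right up to the limit."""
    return limit - state


def toy_override_model(state, output):
    """Hypothetical trained model: large moves look alarming."""
    return min(1.0, abs(output) / 10.0)
```

Only the constraint is changed between runs, so nothing about the controller's internal workings needs to be known.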
[0034] The present invention also provides a method for training a machine learning model used in the above-described control method.
[0035] During this training method, multiple sets of control outputs from the process controller are recorded during the actual and / or simulated operation of the industrial process, under the control of the process controller and the monitoring of the operator. These control outputs are applied to at least one actuator and / or lower-level controller. The actuator and / or lower-level controller are configured to perform at least one (actual and / or simulated) physical operation in the industrial process.
[0036] For each of the multiple sets of control outputs provided by the process controller, the operator's decision whether to request that this set of control outputs be overridden is recorded. For example, if the operator lets the process run automatically without intervening, the operator's decision not to override the set of control outputs provided by the process controller can be recorded. A decision to override can be recorded whenever the operator takes control in some way.
[0037] Furthermore, in each case of recording the set of control outputs, a set of current and / or past values of the process's state variables is recorded. As mentioned above, the set of control outputs is always associated with the process situation characterized by the set of current and / or past values of the state variables.
[0038] For each set of control outputs, together with the corresponding set of current and / or past values of the state variables, the machine learning model is queried to obtain a classification and / or regression value. The parameters characterizing the behavior of the machine learning model are optimized so that the classification and / or regression value more accurately predicts whether the operator actually requested to override the corresponding set of control outputs.
[0039] That is, if the records show that the operator let the process run automatically when a certain set of control outputs was applied, then the classification and / or regression value should indicate that the operator has a low or zero tendency to override the control outputs. In contrast, if the records show that the operator intervened when a specific set of control outputs was applied, then the classification and / or regression value should indicate that the operator has a high, or at least non-zero, tendency to override the control outputs.
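The training objective of [0038] and [0039] can be illustrated with a deliberately small model. Logistic regression fitted by batch gradient descent is used here only as a stand-in for "a machine learning model"; the specification does not name it, and the feature layout, learning rate, and epoch count are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_override_model(features, overridden, epochs=2000, lr=0.1):
    """Fit logistic-regression parameters by batch gradient descent.

    features:   feature vectors built from state values and control outputs
    overridden: recorded operator decisions, 1 = override, 0 = no override
    """
    n = len(features[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(features, overridden):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
        bias -= lr * grad_b / len(features)
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad_w)]
    return weights, bias


def override_tendency(weights, bias, x):
    """Regression value: predicted probability of an operator override."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
```

With made-up records in which the single feature is a normalized rate of change and the operator overrode only the fast changes, the fitted model assigns a high tendency to fast changes and a low tendency to slow ones.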
[0040] If training is performed on a set of conditions with sufficiently high variability, the trained machine learning model can predict an operator's tendency to override control outputs in many situations, even those not part of the training. This is due to the generalization ability of such machine learning models. Simulated operation of industrial processes is particularly advantageous for increasing variability, as it easily introduces a wide variety of situations. Furthermore, the operation of industrial processes can be performed under the supervision of multiple independent operators to account for variability among these operators. For example, one operator may be more likely than another to react to a state variable that appears to be about to overshoot.
[0041] In a particularly advantageous embodiment, the method further includes recording the reason for a request to override a set of control outputs. The parameters of the machine learning model can then be optimized so that the classification and / or regression value also more accurately predicts the reason.
[0042] In another particularly advantageous embodiment, parameter optimization includes a first phase based on the sets of control outputs and the override decisions recorded for a first process, and a subsequent second phase based on the sets of control outputs and the override decisions recorded for a second process.
[0043] Specifically, the second phase can begin with the values of the parameters obtained at the end of the first phase. In this way, for example, the first phase of training performed on a more general version of an industrial process can be reused for multiple more specific instances of that process. This reuse saves computational time. Furthermore, the training data from the first phase of training does not need to be disclosed to anyone who wants to apply the machine learning model to a specific implementation of the industrial process. This training data can be confidential.
[0044] In another particularly advantageous embodiment, the corrective control inputs made by the operator are recorded. The sum of the set of control outputs that triggered the operator intervention and the corrective control inputs made as part of the intervention is recorded as a new set of control outputs. A low or zero override tendency of the operator is attributed to this new set of control outputs. The motivation is that, by making specific quantitative control inputs, the operator has clearly indicated which values of the control outputs he / she considers acceptable in the current situation.
[0045] In another particularly advantageous embodiment, a clustering algorithm is used to group all sets of control outputs for which the operator requested an override, and / or the corresponding sets of current and / or past values of the state variables of the process, into multiple clusters. A different reason for the override is associated with each cluster.
[0046] In this way, the appropriate categories for classification by the machine learning model can be determined automatically, without the need to build a catalog of possible categories (i.e., a catalog of reasons why the operator wants to override the MPC) using prior knowledge. For example, it may be unknown which type of behavior among the values of the state variables is most likely to trigger human intervention. For instance, clustering might produce a first cluster whose common feature is that at least one state variable increases sharply. This cluster corresponds to a perceived impending overshoot as the reason for the override. The clustering might also produce a second cluster whose common feature is that at least one state variable decreases sharply. This cluster corresponds to a perceived impending undershoot as the reason for the override.
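The bookkeeping of [0044] amounts to adding two training samples per intervention, as in the following sketch. The tuple layout `(state, control_output, label)` and the function name are assumptions.

```python
def add_corrected_sample(samples, state, control_output, correction):
    """Record an intervention as two labelled training samples.

    The set of control outputs that triggered the intervention is labelled 1
    (overridden); the same set plus the operator's corrective inputs is
    labelled 0 (low or zero override tendency).
    """
    corrected = [u + d for u, d in zip(control_output, correction)]
    samples.append((list(state), list(control_output), 1))
    samples.append((list(state), corrected, 0))
    return corrected
```

For instance, if the controller proposed `[4.0, 1.0]` and the operator corrected the first output by `-1.5`, the new accepted sample is `[2.5, 1.0]`.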
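The two clusters described in [0046] can be reproduced with a very small clustering example. Plain k-means on a single feature is used purely as an illustration; the specification does not name a clustering algorithm, and the feature (the slope of one state variable just before each override) and the sample values are assumptions.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalars (k >= 2).

    Centres start at evenly spaced order statistics of the data.
    """
    ordered = sorted(values)
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # Keep the old centre if a cluster happens to be empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
    return centres, labels


# Made-up slopes of a state variable recorded just before each override.
slopes = [4.1, 3.8, -3.9, 4.4, -4.2, -3.6]
centres, labels = kmeans_1d(slopes)
```

Here the cluster with the positive centre would be read as "perceived impending overshoot" and the one with the negative centre as "perceived impending undershoot".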
[0047] The primary outcome of training is a set of parameters characterizing the behavior of the trained machine learning model. If the model includes an artificial neural network, these parameters may include, for example, weights; these weights are used to weight the inputs of individual neurons when summing the activations of the corresponding neurons. If the model includes a support vector machine, these parameters may, for example, characterize the hyperplane that separates the different classes from each other. In the parameter set, the effort of collecting training data and the effort of training itself are compressed. Whoever possesses the parameter set can skip training and immediately use the trained machine learning model in the aforementioned control methods. Therefore, the parameter set is a product that can be sold separately.
[0048] The machine learning model can also be extended to take into account further quantities that characterize the situation to which each set of control outputs corresponds. For example, in addition to the set of current and / or past values of the process's state variables, at least one setpoint of the process and / or the future expectation of at least one state variable of the process can also be used to characterize the situation of the process.
[0049] This invention can be implemented, at least in part, in a separately marketable computer program. Therefore, this invention also provides a computer program having machine-readable instructions that, when executed by one or more computers, cause the one or more computers to perform the control method and / or training method described above.
[0050] Specifically, the computer program and / or the parameter set may be sold and provided in the form of a non-transitory storage medium and / or a downloadable product. A computer may be equipped with the parameter set, the computer program, and / or the non-transitory storage medium.
Attached Figure Description
[0051] In the following description, the invention is illustrated using accompanying drawings, which are not intended to limit the scope of the invention.
[0052] The drawings show:
[0053] Figure 1: Exemplary embodiment of control method 100;
[0054] Figure 2: A schematic diagram illustrating the trade-off between the optimality and understandability of a trajectory in the state space;
[0055] Figure 3: Avoiding the impression that the increase of state variable 11 from target value A to target value B will overshoot and exceed the constraint threshold T;
[0056] Figure 4: A schematic diagram of an embodiment in which operator 4 selects among control strategies 2a-2c;
[0057] Figure 5: Exemplary embodiment of training method 200.
Detailed Implementation
[0058] Figure 1 is a flowchart of an exemplary embodiment of control method 100. In step 110, based on the set 11 of current and / or past values of the state variables of process 1, the set 21 of control outputs is determined by process controller 2. In the example shown in Figure 1, process controller 2 performs model predictive control (MPC): According to block 111, for multiple candidate sets 21a-21c of control outputs, the evolution from the given set 11 of state variable values to candidate sets 11a-11c that results when the corresponding candidate set 21a-21c of control outputs is applied to process 1 is predicted. According to block 112, based on optimality criterion 24, each of the candidate sets 11a-11c of state variables is assigned a corresponding goodness value 25a-25c. According to block 113, based on these goodness values 25a-25c, one candidate set 21a-21c of control outputs is selected as the final set 21 of control outputs. This set 21 of control outputs is applied to process 1, for example, to actuator 12 or lower-level controller 13 of the process.
[0059] In step 120, the set 11 of state variables and the set 21 of control outputs are passed to a trained machine learning model 3, which returns classification values 31 and / or regression values 32, and / or reasons 34 why operator 4 wishes to override the set 21 of control outputs. The tendency 33 of operator 4 to override these control outputs 21 can be directly included in the output of the machine learning model 3, or it can be calculated from the classification values 31 and / or regression values 32.
[0060] In step 130, it is then determined whether the classification value 31 and / or the regression value 32 and / or the tendency 33 meets a predetermined criterion. If so (truth value 1), there are two options that can be performed alternatively or in combination.
[0061] In step 140, at least one of the control outputs 21, and / or at least one parameter 22 characterizing the behavior of process controller 2, and / or at least one constraint 23 on the operation of process controller 2, is modified with the aim of reducing the operator's tendency to override the control outputs 21. This modification is performed before the set 21 of control outputs is applied to process 1. In particular, where process controller 2 is an MPC controller with an unknown internal structure, changing the constraints 23 of the MPC and then rerunning the MPC is a preferred method.
[0062] Inside box 140, exemplary ways of performing the modification are depicted. According to box 141, the rate at which the state variable increases or decreases may be gradually slowed down so as not to give operator 4 the impression that the state variable is changing in an uncontrolled manner.
[0063] According to boxes 142-144, an active search can be performed for a new set 21 of control outputs. According to box 142, multiple candidate sets 21a-21c of control outputs can be generated. According to box 143, the trained machine learning model 3 can be queried again based on the set 11 of state variables and the candidate sets 21a-21c of control outputs to obtain candidate classification values 31a-31c and / or candidate regression values 32a-32c. Based on this output from the trained machine learning model 3, according to box 144, a candidate set 21a-21c with a lower operator override tendency 33 is selected as the new set 21 of control outputs.
[0064] In step 150, message 41 is communicated to operator 4 to assure operator 4 that the set 21 of control outputs to be applied is reasonable given the current state of process 1 as described by the set 11 of state variables. Specifically, message 41 may include an explanation 41a of the control strategy 2a to be applied, and / or an invitation 41b to select one of several candidate control strategies 2a-2c to be applied to process 1.
[0065] Figure 2 illustrates the trade-off between optimality and understandability of the control outputs 21 in a simple example. When the set 21 of control outputs is applied to process 1, this causes process 1 to follow a certain trajectory in state space 15. Process controller 2 operates according to optimality criterion 24, which assigns a grade to each trajectory 14a-14c. In the simple example of Figure 2, the grade is a school grade descending from A to F.
[0066] The trade-off between optimality and understandability stems from the complexity of physical process 1. This complexity can be described as the presence of restricted zones 15a-15f in state space 15, which trajectories 14a-14c must not enter or pass through. The higher the grade of a trajectory 14a-14c, the more complex the path it must take around restricted zones 15a-15f. More complex trajectories 14a-14c, in turn, require more complex patterns of control outputs to generate.
[0067] In the simple example, there exists an easily understandable and straightforward trajectory 14a. However, this trajectory only achieves grade D under optimality criterion 24. On the other hand, the optimal trajectory 14b, which achieves grade A, is extremely complex and takes many detours around restricted zones 15a-15e in state space 15. Since the restricted zones 15a-15f in state space 15 are part of the process model inside MPC process controller 2, they are invisible to operator 4, who may be confused by the circuitous trajectory 14b and suspect that some error has occurred. Therefore, operator 4 may take manual control of process 1 and follow the direct trajectory 14a, trading grade A for grade D.
[0068] In this situation, it is worthwhile to switch to trajectory 14c. Trajectory 14c has only one bend, around restricted zone 15f in state space 15, and is therefore more likely than the optimal trajectory 14b to be accepted as reasonable by operator 4. To achieve this, the grade A achieved by the optimal trajectory 14b needs to be traded for grade B, which is much less costly than a downgrade to grade D.
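The reasoning of [0067] and [0068] can be made quantitative with a small expected-value calculation. The grades of trajectories 14a, 14b, and 14c are those of the example; the grade-point scale and the override probabilities are assumptions chosen only to make the trade-off visible.

```python
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}  # assumed scale


def expected_points(grade, p_override, fallback_grade="D"):
    """Expected grade points if an override falls back to trajectory 14a (D)."""
    return ((1 - p_override) * GRADE_POINTS[grade]
            + p_override * GRADE_POINTS[fallback_grade])


# name: (grade from the example, assumed probability of an operator override)
trajectories = {"14a": ("D", 0.05), "14b": ("A", 0.9), "14c": ("B", 0.2)}
best = max(trajectories, key=lambda name: expected_points(*trajectories[name]))
```

Under these assumed probabilities the nominally optimal trajectory 14b is expected to end up close to grade D, whereas trajectory 14c retains most of its grade B and is therefore the better choice.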
[0069] Figure 3 This is another example of a situation where operator 4 might trigger intervention. State variable 11 in process 1 needs to be increased from the first objective value A to the second objective value B because MPC controller 2 deems it worthwhile according to its optimality criterion. The second objective value B is very close to the constraint threshold T that must not be exceeded. The increase in state variable 11 is caused by… Figure 3 This is caused by control input 21, which is not explicitly shown in the text.
[0070] Starting from the first target value A at time t1, the fastest way to reach the second target value B is to accelerate the increase of state variable 11 to its maximum possible rate and to decelerate at the last possible moment (trajectory a). If this strategy is followed, the second target value B is reached at time t2.
[0071] However, the sharp increase in state variable 11 will surprise operator 4. Fearing an overshoot beyond the constraint threshold T, operator 4 will take manual control, leave the optimal trajectory a, and guide state variable 11 to the second target value B along trajectory b. The cost of this very cautious approach is that trajectory b does not reach the second target value B until a later time t3. Process 1 therefore operates suboptimally between times t2 and t3.
[0072] The tendency of operator 4 to override MPC controller 2 can be reduced by gradually decreasing the rate at which state variable 11 increases. That is, after increasing at the maximum rate for only a very short period, state variable 11 is guided along trajectory c. Trajectory c also deviates from the optimal trajectory a and reaches the second target value B only at time t4, which is later than t2 but still much earlier than t3. Furthermore, process 1 remains under automated control, which avoids the risk that an error by operator 4 actually causes the very overshoot that the intervention was meant to prevent.
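The ordering of the times t2, t4 and t3 can be reproduced with a toy discrete-time simulation. The Python sketch below is only an illustration under stated assumptions: the state variable is modeled as a pure integrator without dynamics, and the values A = 0, B = 10, the maximum rate, the cautious manual rate and the taper factor 0.25 are hypothetical and do not come from the described embodiment.

```python
def steps_to_target(rate_fn, start=0.0, target=10.0, tol=0.05, max_steps=1000):
    """Count time steps until the state is within tol of the target value."""
    x, steps = start, 0
    while target - x > tol and steps < max_steps:
        remaining = target - x
        x += min(rate_fn(remaining), remaining)  # never step past the target
        steps += 1
    return steps, x


R_MAX = 1.0  # hypothetical maximum rate of increase per time step

# Trajectory a: maximum rate all the way to the target.
t2, x_a = steps_to_target(lambda remaining: R_MAX)
# Trajectory b: cautious constant rate chosen by the operator (assumed value).
t3, x_b = steps_to_target(lambda remaining: 0.25)
# Trajectory c: rate tapers off in proportion to the remaining distance.
t4, x_c = steps_to_target(lambda remaining: min(R_MAX, 0.25 * remaining))
```

In this toy model the tapered trajectory c finishes later than a but well before b, and none of the three trajectories exceeds the target value, so the constraint threshold T above it is never reached.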
[0073] Figure 4 shows an interactive method in which operator 4 is prompted to select one of several proposed control strategies 2a-2c. Initially, in step 110 of method 100, the MPC process controller 2 generates a set 21 of control outputs that corresponds to the first control strategy 2a.
[0074] In steps 120 and 130, it is determined that the tendency of operator 4 to override the set 21 of control outputs is too high. It is therefore decided to prompt operator 4 with a message 41, 41b. Further control strategies 2b and 2c are obtained from the MPC process controller 2. Although these further strategies 2b and 2c are suboptimal with respect to the optimality criterion 24, they are easier for operator 4 to understand and therefore have a greater chance of being considered reasonable. Operator 4 is prompted to select one of the proposed control strategies 2a-2c and the corresponding set 21 of control outputs. The selected set 21 of control outputs is applied to process 1.
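The decision flow of Figure 4 can be sketched as follows. This Python fragment is a minimal illustration, not the described implementation: `select_strategy`, `toy_tendency`, the threshold of 0.5 and the numeric control outputs are hypothetical, and the trained machine learning model 3 is replaced by a simple stand-in function.

```python
def select_strategy(strategies, predict_tendency, ask_operator, threshold=0.5):
    """Return (name, control_outputs) of the strategy to apply.

    strategies: list of (name, control_outputs); the first entry is the
    controller's preferred strategy (2a in the text).
    """
    first_name, first_outputs = strategies[0]
    if predict_tendency(first_outputs) <= threshold:
        return first_name, first_outputs  # tendency acceptable, no prompt
    # Tendency too high: prompt the operator to choose among all strategies.
    chosen = ask_operator([name for name, _ in strategies])
    return chosen, dict(strategies)[chosen]


def toy_tendency(outputs):
    """Stand-in for model 3 (assumption): large steps look less reasonable."""
    largest_step = max(abs(b - a) for a, b in zip(outputs, outputs[1:]))
    return min(1.0, largest_step)


strategies = [
    ("2a", [0.0, 1.0, 1.0, 1.0]),  # optimal but abrupt
    ("2b", [0.0, 0.4, 0.8, 1.0]),  # suboptimal, smoother
    ("2c", [0.0, 0.3, 0.6, 1.0]),  # suboptimal, smoother
]
# The operator's reply is simulated here by always picking the second option.
applied = select_strategy(strategies, toy_tendency,
                          ask_operator=lambda names: names[1])
```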
[0075] Figure 5 is a flowchart of an exemplary embodiment of training method 200. In step 210 of method 200, during actual and / or simulated operation of process 1 under the control of process controller 2, multiple sets 21 of control outputs from process controller 2 are recorded. Furthermore, in step 220, for each such set 21 of control outputs, the set 11 of current and / or past values of state variables of process 1 on which that set 21 is based is recorded.
[0076] In step 230, for each of the multiple sets 21 of control outputs provided by the process controller, a decision 42 is recorded as to whether operator 4 requested an override of the corresponding set 21 of control outputs in the respective situation. According to block 231, this may include grouping all sets 21 of control outputs for which operator 4 requested an override, and / or the corresponding sets 11 of current and / or past values of the state variables of the process, into multiple clusters using a clustering algorithm. According to block 232, different reasons 34 for the override may be attributed to the different clusters. For example, one cluster may relate to operator 4's fear of an overshoot of a state variable, and another cluster to the operator's fear of an undershoot.
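The clustering of overridden cases can be illustrated with a plain k-means pass. The text does not name a specific clustering algorithm, so k-means is an assumption here, as are the two features per case (rate of change of a state variable and its setpoint error), the sample values and the rule that maps a cluster to a reason 34.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on tuples of floats; returns (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared distance).
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2
                                  for p, c in zip(point, centroids[k])))
            for point in points
        ]
        # Move every centroid to the mean of its members.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[k])  # keep centroid of empty cluster
        centroids = updated
    return labels, centroids


# Hypothetical overridden cases: (rate of change, setpoint error).
cases = [(0.9, 0.2), (1.0, 0.1), (0.8, 0.3),
         (-0.9, -0.2), (-1.0, -0.1), (-0.7, -0.3)]
labels, centroids = kmeans(cases, [cases[0], cases[3]])

# Illustrative attribution of a reason 34 to each cluster.
reasons = ["fear of overshoot" if c[0] > 0 else "fear of undershoot"
           for c in centroids]
```

In practice the attribution of reasons to clusters might be made by a person reviewing the clusters; the sign test above only stands in for that step.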
[0077] In step 240, the machine learning model 3 being trained is queried with the set 21 of control outputs and the corresponding set 11 of state variables describing the situation to which the control outputs relate. The behavior of the machine learning model is characterized by a set of parameters 35. The machine learning model returns a classification value 31 and / or a regression value 32, from which the tendency 33 of operator 4 to override the set 21 of control outputs follows.
[0078] In step 250, the parameters 35 are optimized so that the tendency predicted from classification value 31 and / or regression value 32 becomes more accurate, i.e., better matches the actual decisions 42. This optimization can be performed in two stages 251 and 252 on different processes 1 and 1', respectively. The resulting parameters 35 are applied to machine learning model 3.
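Steps 240 and 250 can be illustrated with a deliberately small model. The Python sketch below swaps in a one-feature logistic regression trained by gradient descent in place of the neural network or support vector machine mentioned in the claims. The single feature (a normalized rate of change), the recorded decisions and the learning rate are hypothetical, and the two-stage optimization 251, 252 is not modeled.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_log_loss(w, b, xs, ys):
    """Mismatch between predicted tendency and recorded decisions 42."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(xs)


def train(xs, ys, lr=1.0, epochs=5000):
    """Gradient descent on the two parameters (w, b), standing in for 35."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


# Hypothetical records: feature of control outputs 21, decision 42 (1 = override).
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]

loss_before = mean_log_loss(0.0, 0.0, xs, ys)
w, b = train(xs, ys)
loss_after = mean_log_loss(w, b, xs, ys)
```

Comparing `loss_before` with `loss_after` shows whether the optimized parameters match the recorded decisions better than the initial ones.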
[0079] In step 260, the corrective control inputs 43 applied by operator 4 are recorded. In step 270, the superposition of the set 21 of control outputs deemed unreasonable by the operator and the corrective control inputs 43 applied by operator 4 in response is recorded as a new set 21* of control outputs. For this new set 21*, the tendency 33 of operator 4 to override is known to be low or zero. The new set 21* can be used in steps 240 and 250 like any other set 21 of control outputs.
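Steps 260 and 270 can be sketched as a simple data-augmentation step. The Python fragment below assumes that "superposition" means an element-wise sum of two equally long time series; the text does not spell this out, and the numeric values and the label encoding (1 = overridden, 0 = accepted) are hypothetical.

```python
def superpose(control_outputs, corrections):
    """Element-wise sum of rejected outputs 21 and corrective inputs 43."""
    if len(control_outputs) != len(corrections):
        raise ValueError("sequences must have the same length")
    return [u + d for u, d in zip(control_outputs, corrections)]


rejected_21 = [1.0, 1.0, 1.0, 0.0]        # set 21 that the operator overrode
corrections_43 = [-0.5, -0.25, 0.0, 0.25]  # operator's corrective inputs 43
new_21_star = superpose(rejected_21, corrections_43)

# Both sets become training samples: (control outputs, override label).
training_samples = [
    (rejected_21, 1),   # operator requested an override
    (new_21_star, 0),   # operator's own choice, tendency low or zero
]
```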
[0080] List of reference numerals
[0081] 1, 1' Industrial process
[0082] 11 Set of state variables of process 1
[0083] 11a-11c Candidate sets of state variables
[0084] 12 Actuators in process 1
[0085] 13 Low-level controller in process 1
[0086] 14a-14c Trajectories in state space 15
[0087] 15 State space of process 1
[0088] 15a-15f Restricted areas in state space 15
[0089] 2 Process controller
[0090] 2a-2c Control strategies
[0091] 21 Set of control outputs
[0092] 21a-21c Candidate sets of control outputs
[0093] 22 Parameters characterizing the behavior of process controller 2
[0094] 23 Constraints under which process controller 2 operates
[0095] 24 Optimality criterion of process controller 2
[0096] 25a-25c Merit values according to optimality criterion 24
[0097] 3 Machine learning model
[0098] 31 Classification value provided by model 3
[0099] 31a-31c Classification values for candidate sets 21a-21c
[0100] 32 Regression value provided by model 3
[0101] 32a-32c Regression values for candidate sets 21a-21c
[0102] 33 Tendency of operator 4 to override the set 21 of control outputs
[0103] 34 Reasons for the override
[0104] 35 Parameters characterizing the behavior of machine learning model 3
[0105] 4 Operator
[0106] 41 Message to operator 4
[0107] 41a Explanation of control strategies 2a-2c
[0108] 41b Invitation to select among control strategies 2a-2c
[0109] 42 Decision of operator 4
[0110] 43 Corrective control inputs applied by operator 4
[0111] 100 Control method
[0112] 110 Determining the set 21 of control outputs; 111 Predicting candidate sets 11a-11c of state variables
[0113] 112 Assigning merit values 25a-25c to candidate sets 11a-11c; 113 Selecting a candidate set 21a-21c as the set 21 of control outputs; 120 Querying trained machine learning model 3
[0114] 130 Criterion for classification value 31, regression value 32, and / or tendency 33; 140 Modifying control outputs 21, parameters 22, and / or constraints 23
[0115] 141 Gradually slowing the rate of increase / decrease of a state variable; 142 Generating multiple candidate sets 21a-21c
[0116] 143 Querying trained machine learning model 3
[0117] 144 Replacing the control outputs with a candidate set 21a-21c; 150 Message to operator 4
[0118] 200 Training method
[0119] 210 Recording multiple sets 21 of control outputs
[0120] 220 Recording sets 11 of state variable values
[0121] 230 Recording decisions 42
[0122] 231 Clustering the cases in which an override was requested
[0123] 232 Attributing different reasons 34 to different clusters
[0124] 240 Querying machine learning model 3
[0125] 250 Optimizing parameters 35 of model 3
[0126] 251 First stage of optimization 250 on first process 1; 252 Second stage of optimization 250 on second process 1'; 260 Recording corrective control inputs 43
[0127] 270 Recording a new set 21* of control outputs using control inputs 43; A First target value of state variable 11
[0128] B Second target value of state variable 11
[0129] T Constraint threshold for state variable 11
[0130] t Time
[0131] t1-t4 Points in time
Claims
1. A method (100) for controlling an industrial process (1), the method (100) comprising:
• determining (110), by a process controller (2), based at least in part on a set (11) of current and / or past values of state variables of the industrial process (1), a set (21) of control outputs to be applied to at least one actuator (12) and / or low-level controller (13), the at least one actuator and / or low-level controller being configured to perform at least one physical operation in the industrial process (1);
• querying (120), based on at least a subset of the set (11) of current and / or past values of the state variables and at least a subset of the set (21) of control outputs, a trained machine learning model (3) that is configured to output a classification value (31) and / or a regression value (32), the classification value and / or regression value indicating the tendency (33) of an operator (4) monitoring the industrial process (1) to at least partially override the control outputs (21) provided by the process controller (2); and
• in response to determining (130) that the classification value (31), and / or the regression value (32), and / or the tendency (33) of the operator (4) to override the control outputs (21), meets a predetermined criterion, modifying (140) at least one control output of the set (21) of control outputs, and / or at least one parameter (22) characterizing the behavior of the process controller (2), and / or at least one constraint (23) under which the process controller (2) operates;
wherein the modifying (140) specifically comprises:
• generating (142) multiple candidate sets (21a-21c) of control outputs;
• querying (143) the trained machine learning model (3) based on each candidate set (21a-21c) of control outputs to obtain candidate classification values (31a-31c) and / or candidate regression values (32a-32c); and
• replacing (144) the current set (21) of control outputs to be applied with a candidate set (21a-21c) of control outputs whose corresponding candidate classification value (31a-31c) and / or candidate regression value (32a-32c) indicates a lower tendency (33) of the operator to override.
2. The method (100) according to claim 1, wherein the classification value (31) and / or the regression value (32) comprises a possible reason (34) for the operator (4) to override the control outputs (21) provided by the process controller (2).
3. The method (100) according to claim 2, wherein the modification (140) is specifically aimed at reducing the prevalence of the possible reason (34) in the behavior of the industrial process (1).
4. The method (100) according to claim 3, wherein the possible reason (34) specifically comprises overshoot and / or undershoot of at least one state variable of the industrial process (1).
5. The method (100) according to claim 4, wherein the modification (140) specifically comprises: causing the rate at which a particular state variable of the industrial process (1) increases or decreases to slow down gradually.
6. The method (100) according to any one of claims 1 to 5, wherein determining (110) the set (21) of control outputs comprises:
• predicting (111), for multiple candidate sets (21a-21c) of control outputs and based on a model of the industrial process (1), candidate sets (11a-11c) of values of state variables into which a given set (11) of values of state variables will evolve in response to the corresponding candidate set (21a-21c) of control outputs being applied to the at least one actuator (12) and / or the low-level controller (13);
• assigning (112), based on at least one optimality criterion (24), merit values (25a-25c) to each candidate set (11a-11c) of values of state variables; and
• determining (113), as the set (21) of control outputs, the candidate set (21a-21c) of control outputs that corresponds to a candidate set (11a-11c) of values of state variables whose merit value (25a-25c) satisfies a predetermined criterion.
7. The method (100) according to any one of claims 1 to 5, wherein the machine learning model (3) is configured to determine the classification value (31) and / or the regression value (32) further based on at least one setpoint of the industrial process (1), and / or based on a future expectation of at least one state variable of the industrial process (1).
8. The method (100) according to any one of claims 1 to 5, wherein the machine learning model (3) comprises an artificial neural network and / or a support vector machine.
9. A computer program product comprising machine-readable instructions that, when executed by one or more computers, cause the one or more computers to perform the method (100) according to any one of claims 1 to 8.
10. A non-transitory computer-readable storage medium comprising the computer program product according to claim 9.
11. A computer provided with a computer program product according to claim 9 and / or a non-transitory computer-readable storage medium according to claim 10.