Three-phase electric arc furnace electrode regulation system optimization control method based on hierarchical Stackelberg-Nash game

By optimizing the electrode adjustment system of a three-phase electric arc furnace using hierarchical Stackelberg-Nash game theory and ADP technology, the problems of unstable electrode control and high energy consumption were solved, and the stability and energy efficiency of the system were improved.

CN117687300B · Active · Publication Date: 2026-06-23 · NORTHEASTERN UNIV CHINA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NORTHEASTERN UNIV CHINA
Filing Date
2023-12-14
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing three-phase electric arc furnace electrode control methods struggle to achieve effective optimization control when faced with asymmetric decision-making information, leading to unstable electrode operation and increased energy consumption.

Method used

An optimization control method based on hierarchical Stackelberg-Nash game theory is combined with adaptive dynamic programming (ADP): consistency error and performance index functions are constructed, an optimization control law and a single-critic network update law are designed, and the electrode adjustment system of the three-phase electric arc furnace is optimized.

Benefits of technology

It achieves optimized control of the electrode adjustment system in complex industrial processes, reduces computational burden, improves system stability and energy efficiency, and shortens smelting time.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides an optimization control method for a three-phase electric arc furnace electrode regulating system based on a hierarchical Stackelberg-Nash game. First, a model of the three-phase electric arc furnace electrode regulating system and its equivalent state-space expression are established, and the network topology of the electrode regulating system is defined. Consistency errors and performance index functions are then constructed from the perspectives of the leader and the followers, respectively, and a hierarchical Stackelberg-Nash game mechanism with multiple participants is established. A coupled HJB equation is constructed using the Bellman optimality principle, and an optimization control strategy is solved using a gradient descent method. Based on ADP optimization technology, an evaluation (critic) network is constructed to evaluate the control strategy executed at each step so as to realize the optimization control target. The effectiveness of the control method is verified by numerical simulation of the three-phase electric arc furnace electrode regulating system. The application provides a useful tool for analyzing a series of control problems of the three-phase electric arc furnace electrode regulating system in the field of industrial processes, and can enhance the controllability of the system to a certain extent.

Description

Technical Field

[0001] This invention belongs to the field of optimization control technology for three-phase electric arc furnace systems in industrial processes, and relates to an optimization control method for electrode adjustment systems of three-phase electric arc furnaces based on hierarchical Stackelberg-Nash game theory.

Background Technology

[0002] Three-phase electric arc furnace (EAF) smelting is a complex industrial process that utilizes the high temperatures generated by the electric arc to melt scrap steel, metals, or ores. The electrode control system is crucial to the EAF smelting process, and a reasonable electrode control strategy plays a vital role in improving EAF performance, reducing energy consumption, and shortening smelting time. Over the years, several EAF electrode control methods have been proposed, such as control methods based on arc impedance, dual-model electrode control methods based on arc current and arc impedance, and self-calibrating control methods. Currently, most practical methods employ PID control, but tuning the controller parameters requires experience. When the electrode operation becomes unstable during EAF smelting, the PID parameters need to be reset, resulting in long transition times and increased energy consumption. With the continuous development of modern control theory, intelligent control technologies, represented by reinforcement learning, adaptive control, and optimization game theory, are rapidly developing in the field of industrial control, bringing new ideas to the research of EAF electrode control strategies.

[0003] In recent years, game control has played a crucial role in simulating and analyzing complex behaviors such as collaboration, cooperation, competition, and adversarial interaction among multiple participants in various scenarios. With ongoing research, game control has been widely applied in areas such as network spectrum allocation, smart grid management, traffic route planning, and troop tactical deployment. Currently, many game control methods aim to achieve Nash equilibrium, meaning that all participants act synchronously at the same decision-making level. However, for game problems with asymmetric decision-making information, simply pursuing Nash equilibrium is no longer applicable. Stackelberg games provide a feasible mechanism for implementing hierarchical decision-making processes. Generally, a Stackelberg game in its nominal form consists of two participants: a leader and a follower. Under the hierarchical structure of asymmetric decision-making information, the leader enjoys an informational advantage and therefore acts first, choosing a strategy that optimizes its own performance index function, while the follower responds optimally to the strategy the leader has implemented.
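The leader-follower ordering described above can be illustrated with a small numerical sketch. The two quadratic cost functions are hypothetical and are not taken from the patent; they only show how a leader that anticipates the follower's best response can obtain a lower cost than under a simultaneous-move Nash solution.

```python
def follower_best_response(u0):
    # Hypothetical follower cost J1 = u1**2 + (u1 - u0)**2.
    # Setting dJ1/du1 = 2*u1 + 2*(u1 - u0) = 0 gives u1 = u0 / 2.
    return u0 / 2.0


def leader_cost(u0, u1):
    # Hypothetical leader cost J0 = (u0 - 1)**2 + u1**2.
    return (u0 - 1.0) ** 2 + u1 ** 2


def stackelberg_leader_cost(u0):
    # The leader evaluates its cost with the follower's reaction substituted in.
    return leader_cost(u0, follower_best_response(u0))


def ternary_search(f, lo, hi, iters=100):
    # Minimizes a convex scalar function on [lo, hi].
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Stackelberg solution: the leader moves first, the follower reacts.
u0_stackelberg = ternary_search(stackelberg_leader_cost, -2.0, 2.0)
u1_stackelberg = follower_best_response(u0_stackelberg)

# Nash solution for comparison: with u1 held fixed, the leader's cost is
# minimized at u0 = 1, and the follower still plays u1 = u0 / 2.
u0_nash = 1.0
u1_nash = follower_best_response(u0_nash)

print(u0_stackelberg, u1_stackelberg, leader_cost(u0_stackelberg, u1_stackelberg))
print(u0_nash, u1_nash, leader_cost(u0_nash, u1_nash))
```

In this toy problem the leader's cost is lower when it commits first, which is the asymmetry the hierarchical mechanism is meant to exploit.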

[0004] Adaptive Dynamic Programming (ADP) is an approximate optimal control method that emerged from the convergence of the control and artificial intelligence fields. Its core characteristic is the ability to adaptively adjust the control strategy based on a constantly changing, complex environment and on feedback information in order to achieve optimal control. The method originates from the principle of dynamic programming, which decomposes a problem with optimal substructure and overlapping subproblems and constructs optimal solutions, often without an explicit form, that satisfy specific constraints and objectives. ADP extends dynamic programming by introducing adaptive control and learning algorithms. It employs approximation structures, such as fuzzy logic systems or neural networks, to approximate the performance index function and the non-explicit optimal control strategy in dynamic programming so that they satisfy the Bellman optimality principle. Finally, it obtains an approximately optimal control strategy through value iteration or policy iteration.

Summary of the Invention

[0005] To address the optimization control problem of three-phase electric arc furnace electrode adjustment systems in industrial processes, a hierarchical Stackelberg-Nash game-based optimization control method is proposed. Under the established hierarchical Stackelberg-Nash game mechanism, a performance index function based on the consistency error is constructed, and ADP technology is introduced to design an optimization control law and a single-critic network update law that achieve the optimization control objective of the three-phase electric arc furnace electrode adjustment system.

[0006] An optimal control method for a three-phase electric arc furnace electrode adjustment system based on hierarchical Stackelberg-Nash game theory specifically includes:

[0007] S1. Establish the mathematical model of the three-phase electric arc furnace electrode adjustment system and its corresponding state-space equivalent expression, and define the network topology of the electrode adjustment system;

[0008] S11. The mathematical model of the three-phase electric arc furnace electrode adjustment system is established as follows:

[0009]

[0010] where $v_i$ and $I_i$ are the speed and output current of the i-th motor. The remaining quantities in equation (1) are: the control quantity that the electric arc furnace electrode adjustment system applies to the motor output current; the arc length gain coefficient; the arc length of the three-phase electric arc; and the Laplace transform variable $s$. The motor output current and the three-phase electric arc are related by a nonlinear mapping, in which $\xi_i$ is the arc length proportionality factor, used together with the reference current, and $h_i$ is a nonlinear term. Four gains represent the speed feedback gain, amplification stage gain, motor amplification transfer function gain, and integral stage gain, respectively, and three time constants are the rectifier and filter stage time constant, the motor electromechanical time constant, and the stator inductance time constant;

[0011] Define an augmented state variable $x_i = [x_{i,1}, x_{i,2}, x_{i,3}, x_{i,4}]^T$ with state components including $x_{i,1} = I_i$ and $x_{i,3} = v_i$. The mathematical model of the three-phase electric arc furnace electrode adjustment system is then reformulated as follows:

[0012]

[0013] in,

[0014] S12. Define the network topology of the three-phase electric arc furnace electrode adjustment system as an undirected graph with a vertex set and an undirected edge set $\Gamma$; $\Lambda_i$ is the i-th vertex in the vertex set, and $N_i = \{\Lambda_j : (\Lambda_i, \Lambda_j) \in \Gamma\}$ is the set of neighbors of the i-th vertex, where $\Lambda_j$ is the j-th vertex connected to $\Lambda_i$. The adjacency matrix is a 3×3 matrix with elements $a_{ij}$; since only undirected graph topologies are considered, it is symmetric, so $a_{ij} = a_{ji}$. In the Laplacian matrix corresponding to the adjacency matrix, $l_{ij} = -a_{ij}$ if $i \neq j$, and the diagonal entries $l_{ii}$ are defined separately. The information transmission relationship between the leader and the followers is described by a diagonal matrix: if the i-th follower can directly interact with the leader, then $r_{ii} > 0$; otherwise $r_{ii} = 0$;
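The graph quantities in S12 can be sketched in a few lines of Python. The example adjacency matrix and pinning (leader-access) gains are hypothetical, since the matrices of the embodiment are not reproduced in this text. The diagonal rule $l_{ii} = \sum_{j \neq i} a_{ij}$ is the standard graph Laplacian convention and is assumed here rather than quoted from the patent.

```python
def laplacian(adjacency):
    # Off-diagonal entries follow l_ij = -a_ij for i != j (as stated in S12).
    # Diagonal entries use the standard convention l_ii = sum_j a_ij (assumed).
    n = len(adjacency)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                lap[i][j] = -float(adjacency[i][j])
        lap[i][i] = float(sum(adjacency[i][j] for j in range(n) if j != i))
    return lap


def is_symmetric(matrix):
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


# Hypothetical 3-vertex undirected graph in which every pair is connected.
A = [[0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0],
     [1.0, 1.0, 0.0]]

# Hypothetical diagonal pinning gains: only the first vertex hears the leader.
R = [[1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]

L = laplacian(A)
print(L)
```

With this convention every row of the Laplacian sums to zero, which is a quick sanity check on any adjacency matrix entered by hand.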

[0015] S2. Based on step S1, construct the consistency error and performance index functions from the perspectives of leaders and followers, respectively.

[0016] S21. From the leader's perspective, the consistency error is constructed as follows:

[0017]

[0018] where $x_0$ and $x_i$ are the system states of the leader and the i-th follower, respectively.

[0019] S22. Based on equations (2) and (3), the dynamics of the consistency error $e_0$ are obtained as follows:

[0020]

[0021] S23. The consistency error is constructed from the follower's perspective as follows:

[0022]

[0023] where $x_j$ is the system state of the j-th follower connected to the i-th follower;
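Equation (5) itself is not reproduced in this text. As a hedged illustration only, the sketch below uses the neighborhood tracking error that is common in leader-follower consensus problems, $e_i = \sum_{j \in N_i} a_{ij}(x_i - x_j) + r_{ii}(x_i - x_0)$; the patent's exact form and sign convention may differ. The follower-to-follower adjacency and the pinning gains are hypothetical, while the state vectors are the initial values quoted later in the simulation section.

```python
def follower_consistency_error(i, follower_states, adjacency, pinning, leader_state):
    # Assumed form: e_i = sum_j a_ij * (x_i - x_j) + r_ii * (x_i - x_0).
    x_i = follower_states[i]
    error = [0.0] * len(x_i)
    for j, x_j in enumerate(follower_states):
        if j == i:
            continue
        for k in range(len(x_i)):
            error[k] += adjacency[i][j] * (x_i[k] - x_j[k])
    for k in range(len(x_i)):
        error[k] += pinning[i] * (x_i[k] - leader_state[k])
    return error


# Initial states of the leader and the two followers from the simulation section.
x0 = [0.1, 0.1, 0.1, 0.1]
followers = [[0.01, 0.04, 0.06, 0.05], [0.08, 0.1, 0.1, 0.09]]

# Hypothetical topology: the two followers are connected to each other,
# and only the first follower receives the leader's state directly.
A = [[0.0, 1.0], [1.0, 0.0]]
r = [1.0, 0.0]

e1 = follower_consistency_error(0, followers, A, r, x0)
e2 = follower_consistency_error(1, followers, A, r, x0)
print(e1, e2)
```

Under this assumed form, all errors vanish exactly when every follower state equals the leader state, which is the synchronization target described in the simulation results.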

[0024] S24. Based on equations (2) and (5), the dynamics of the consistency error $e_i$ are obtained as follows:

[0025]

[0026] S25. Based on the consistency errors $e_0$ and $e_i$, the state response of the electrode adjustment system of the three-phase electric arc furnace is characterized by the following performance index function:

[0027]

[0028] Here the initial value of the consistency error is used, and $u_{-i}$ is defined as follows: apart from the control input $u_i$ of the i-th follower, the control inputs $u_j$ of the followers connected to the i-th follower are collectively denoted $u_{-i}$;

[0029] S26. For the i-th follower, $q_i(e_i, u_0, u_i, u_{-i})$ is defined as:

[0030]

[0031] where $C_i$, $D_i$, and $E_i$ are positive definite symmetric matrices, and $Q_i$, $P_{ij}$, and $B_i$ are coupling coefficient matrices;

[0032] S27. For the leader, $q_0(e_0, u_0, u_i, u_{-i})$ is defined as:

[0033]

[0034] where $C_0$ and $D_0$ are both positive definite symmetric matrices, $L_j$ is a coupling coefficient matrix, and $u_j$ is the control strategy of the j-th follower connected to the i-th follower;

[0035] S3. Based on the performance index function obtained in step S2, establish a hierarchical Stackelberg-Nash game mechanism with multiple participants.

[0036] S31. For any given leader control policy $u_0 \in U_0$, if there exists a mapping $Z_i : U_0 \to U_i$ such that:

[0037]

[0038] then the strategy set $\{Z_1(u_0), Z_2(u_0), \ldots, Z_N(u_0)\}$ constitutes a Nash equilibrium among the followers; that is, when the other followers execute the policies $Z_{-i}(u_0)$, $Z_i(u_0)$ is the best response of the i-th follower, where $Z_{-i}(u_0)$ denotes the policies of the followers other than the i-th;
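The follower-level Nash condition in S31 can be illustrated with a toy example: for a fixed leader input, each follower repeatedly plays its best response to the other until neither changes. The quadratic costs below are hypothetical stand-ins, not the patent's performance index functions, and the fixed-point iteration is used only for illustration.

```python
def cost_1(u0, u1, u2):
    # Hypothetical cost of follower 1: track the leader and stay near follower 2.
    return (u1 - u0) ** 2 + 0.5 * (u1 - u2) ** 2


def cost_2(u0, u1, u2):
    # Hypothetical cost of follower 2: track half the leader input, stay near follower 1.
    return (u2 - 0.5 * u0) ** 2 + 0.5 * (u2 - u1) ** 2


def best_response_1(u0, u2):
    # d(cost_1)/du1 = 2*(u1 - u0) + (u1 - u2) = 0  ->  u1 = (2*u0 + u2) / 3
    return (2.0 * u0 + u2) / 3.0


def best_response_2(u0, u1):
    # d(cost_2)/du2 = 2*(u2 - 0.5*u0) + (u2 - u1) = 0  ->  u2 = (u0 + u1) / 3
    return (u0 + u1) / 3.0


def follower_nash(u0, iters=60):
    # Simultaneous best-response iteration; it contracts for these costs.
    u1, u2 = 0.0, 0.0
    for _ in range(iters):
        u1, u2 = best_response_1(u0, u2), best_response_2(u0, u1)
    return u1, u2


u0 = 0.8  # a given leader policy value (hypothetical)
u1_star, u2_star = follower_nash(u0)
print(u1_star, u2_star)
```

At the resulting pair, neither follower can lower its own cost by deviating alone, which is the property the mapping $Z_i$ is required to satisfy.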

[0039] S32. Based on S31, when the i-th follower executes its policy $Z_i(u_0)$ and the leader adopts its own strategy, if the performance index functions satisfy the following relationship:

[0040]

[0041] then the resulting strategy set constitutes a Stackelberg-Nash equilibrium between the leader and the followers;

[0042] S4. Based on the performance index function and using the Bellman optimality principle, construct the coupled HJB equation and solve the optimization control strategy by gradient descent.

[0043] S41. Update the performance index function (7) of the three-phase electric arc furnace electrode adjustment system as follows:

[0044]

[0045] Based on equation (12), the optimization strategy for the electrode adjustment system of the three-phase electric arc furnace is given as follows:

[0046]

[0047] where the resulting functions are the optimized performance index functions;

[0048] S42. For the i-th follower, the optimized performance index function is given as:

[0049]

[0050] For the leader, the optimized performance index function is given as:

[0051]

[0052] S43. Based on the follower consistency error dynamics, i.e., equation (6), the Hamiltonian function of the i-th follower is defined as:

[0053]

[0054] Based on the leader's consistency error dynamics, i.e., equation (4), the leader's Hamiltonian function is defined as:

[0055]

[0056] S44. For any given leader control policy u0, the optimal response of each follower is as follows:

[0057]

[0058] in, In strategy u0 and The corresponding performance index function; if but

[0059] S45. Combining equations (8), (16), and (18), the coupled HJB equations for each follower are obtained as follows:

[0060]

[0061] S46. Based on equation (19), the follower's optimization strategy is solved for; its specific form is as follows:

[0062]

[0063] S47. According to equation (17), the leader's optimization strategy is given as follows:

[0064]

[0065] S48. Based on equations (9), (17), (20), and (21), the coupled HJB equation of the leader is obtained as follows:

[0066]

[0067] S49. Based on equation (22), the leader's optimization strategy is solved for; its specific form is as follows:

[0068]

[0069] By combining equations (20) and (23), the optimized control strategy of each follower with respect to the leader's optimized strategy is given as follows:

[0070]

[0071] S5. Based on ADP optimization technology, an evaluation network is constructed to evaluate the optimization control strategy executed each time in order to achieve the optimization control objective of the three-phase electric arc furnace electrode adjustment system.

[0072] S51. A single-critic neural network is used to approximate the optimized performance index function, giving:

[0073]

[0074] where the three terms represent the weights, activation function, and approximation error of the single-critic neural network, respectively; the label c refers to the critic network.

[0075] According to equation (25), the gradient of the performance index function is expressed as follows:

[0076]

[0077] where the gradient terms are the gradients of the activation function and of the approximation error, respectively;

[0078] S52. Based on equations (23) and (26), the leader's optimization strategy is rewritten as follows:

[0079]

[0080] F1 and F2 are given as follows:

[0081]

[0082]

[0083] and $c = c_1 - c_2$ is bounded by a positive constant, where $c_1$ and $c_2$ are given as follows:

[0084]

[0085]

[0086] S53. Substituting equation (27) into equation (24), the optimized control strategy of each follower with respect to the leader's optimized strategy is rewritten as follows:

[0087]

[0088] where $d_i$ is norm-bounded, meaning that there exists a positive constant that bounds its norm;

[0089] S54. Based on equations (25) and (26), the optimized performance index function and its gradient are estimated as follows:

[0090]

[0091]

[0092] in, and They are respectively and The estimated value;

[0093] Combining equations (27)-(30), the optimization strategies of the leader and the followers are estimated as follows:

[0094]

[0095]

[0096] S55. Based on equations (8), (16), and (30), the estimate of the Hamiltonian function of the i-th follower is given as follows:

[0097]

[0098] in, and

[0099] Based on equations (9), (26), and (30), the estimate of the leader's Hamiltonian function is given as:

[0100]

[0101] in, and

[0102] S56. Based on equations (33) and (34), the Bellman residual function is defined as follows:

[0103]

[0104] S57. To ensure the minimum Bellman residual, the following objective function is defined:

[0105]

[0106] S58. Based on equation (36) and using gradient descent and normalization principles, the update law of the single-critic network is designed as follows:

[0107]

[0108] where $\rho_i$ is the update rate, and the accompanying term is used for normalization;
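Equation (37) is not reproduced in this text. As a hedged sketch only, the snippet below uses a normalized gradient descent law of the kind commonly seen in critic-only ADP schemes: the weights move along the negative gradient of $E = \tfrac{1}{2} e^2$, scaled by $(1 + \varphi^T \varphi)^2$. The linear residual model, the regressor sequence, and the target weights are hypothetical; only the update rate 0.87 is borrowed from the simulation section.

```python
def critic_step(weights, phi, offset, rate, dt):
    # Residual assumed linear in the weights: e = W^T phi + offset.
    e = sum(w * p for w, p in zip(weights, phi)) + offset
    # Normalization term (1 + phi^T phi)^2 keeps the step size bounded.
    norm = (1.0 + sum(p * p for p in phi)) ** 2
    # One Euler step of W_dot = -rate * phi * e / norm.
    return [w - dt * rate * p * e / norm for w, p in zip(weights, phi)]


def weight_error(weights, target):
    return sum((w - t) ** 2 for w, t in zip(weights, target)) ** 0.5


# Hypothetical "true" weights that make the residual zero for every regressor.
true_w = [0.5, -0.3]
# Hypothetical regressor sequence, rich enough to excite both weights.
regressors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

w = [0.0, 0.0]
initial_error = weight_error(w, true_w)
for step in range(3000):
    phi = regressors[step % 3]
    offset = -sum(t * p for t, p in zip(true_w, phi))
    w = critic_step(w, phi, offset, rate=0.87, dt=0.1)

print(w, weight_error(w, true_w))
```

The normalization is a design choice that trades convergence speed for robustness to large activation values; without it, the effective step grows with the square of the regressor magnitude.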

[0109] S59. Based on equation (37) and the weight error, the weight error dynamics are further obtained as follows:

[0110]

[0111] where $\eta_i$ is bounded by a positive constant.

[0112] Beneficial technical effects of the present invention:

[0113] Based on the topological relationships of undirected graphs, this invention proposes an optimization control method for a three-phase electric arc furnace electrode regulation system based on a hierarchical Stackelberg-Nash game. Compared with the nominal Stackelberg game involving only two participants, this invention establishes a hierarchical Stackelberg-Nash game mechanism with multiple participants, enabling the study of the bidirectional interaction characteristics between the leader and the followers from a hierarchical decision-making perspective. Compared with adaptive dynamic programming methods employing an actor-critic network architecture, this invention uses a single-critic network architecture, which reduces the computational burden, and designs a weight update law through gradient descent and normalization principles.

Attached Figure Description

[0114] Figure 1 is a flowchart of the optimization control method for the electrode adjustment system of a three-phase electric arc furnace based on the hierarchical Stackelberg-Nash game according to the present invention;

[0115] Figure 2 is a schematic diagram of the three-phase electric arc furnace smelting process in the simulation of an embodiment of the present invention;

[0116] Figure 3 is a control block diagram of the electrode adjustment system in the simulation of an embodiment of the present invention;

[0117] Figure 4 shows the topological relationship of the undirected graph in the simulation of this embodiment of the invention;

[0118] Figure 5 shows the system state variable response curves in the simulation of this embodiment of the invention, where Figure 5(a) shows the system state variable response curve of leader 0, Figure 5(b) that of follower 1, and Figure 5(c) that of follower 2;

[0119] Figure 6 shows the evaluation network weight updates in the simulation of this embodiment of the invention, where Figure 6(a) shows the evaluation network weights of leader 0, Figure 6(b) those of follower 1, and Figure 6(c) those of follower 2;

[0120] Figure 7 shows the control law response curves in the simulation of an embodiment of the present invention, where Figure 7(a) shows the control law response curve of leader 0, Figure 7(b) that of follower 1, and Figure 7(c) that of follower 2.

Detailed Implementation

[0121] The present invention will be further described below with reference to the accompanying drawings and embodiments;

[0122] An optimal control method for a three-phase electric arc furnace electrode regulation system based on hierarchical Stackelberg-Nash game theory, as shown in Figure 1, specifically includes:

[0123] S1. Establish the mathematical model of the three-phase electric arc furnace electrode adjustment system and its corresponding state-space equivalent expression, and define the network topology of the electrode adjustment system;

[0124] S11. The three-phase electric arc furnace smelting process is a relatively complex industrial process, as shown in Figure 2. One of its key aspects is controlling the distance between the electrode and the molten material using an electric motor, thereby maintaining the stability of the electric arc and achieving the optimal thermal effect. According to the control block diagram shown in Figure 3, the mathematical model of the three-phase electric arc furnace electrode adjustment system is established as follows:

[0125]

[0126] where $v_i$ and $I_i$ are the speed and output current of the i-th motor. The remaining quantities in equation (1) are: the control quantity that the electric arc furnace electrode adjustment system applies to the motor output current; the arc length gain coefficient; the arc length of the three-phase electric arc; and the Laplace transform variable $s$. The motor output current and the three-phase electric arc are related by a nonlinear mapping, in which $\xi_i$ is the arc length proportionality factor, used together with the reference current, and $h_i$ is a nonlinear term. Four gains represent the speed feedback gain, amplification stage gain, motor amplification transfer function gain, and integral stage gain, respectively, and three time constants are the rectifier and filter stage time constant, the motor electromechanical time constant, and the stator inductance time constant;

[0127] Define an augmented state variable $x_i = [x_{i,1}, x_{i,2}, x_{i,3}, x_{i,4}]^T$ with state components including $x_{i,1} = I_i$ and $x_{i,3} = v_i$. The mathematical model of the three-phase electric arc furnace electrode adjustment system is then reformulated as follows:

[0128]

[0129] in,

[0130] S12. Define the network topology of the three-phase electric arc furnace electrode adjustment system as an undirected graph with a vertex set and an undirected edge set $\Gamma$; $\Lambda_i$ is the i-th vertex in the vertex set, and $N_i = \{\Lambda_j : (\Lambda_i, \Lambda_j) \in \Gamma\}$ is the set of neighbors of the i-th vertex, where $\Lambda_j$ is the j-th vertex connected to $\Lambda_i$. The adjacency matrix is a 3×3 matrix with elements $a_{ij}$; since only undirected graph topologies are considered, it is symmetric, so $a_{ij} = a_{ji}$. In the Laplacian matrix corresponding to the adjacency matrix, $l_{ij} = -a_{ij}$ if $i \neq j$, and the diagonal entries $l_{ii}$ are defined separately. The information transmission relationship between the leader and the followers is described by a diagonal matrix: if the i-th follower can directly interact with the leader, then $r_{ii} > 0$; otherwise $r_{ii} = 0$;

[0131] S2. Based on step S1, construct the consistency error and performance index functions from the perspectives of leaders and followers, respectively.

[0132] S21. From the leader's perspective, the consistency error is constructed as follows:

[0133]

[0134] where $x_0$ and $x_i$ are the system states of the leader and the i-th follower, respectively.

[0135] S22. Based on equations (2) and (3), the dynamics of the consistency error $e_0$ are obtained as follows:

[0136]

[0137] S23. The consistency error is constructed from the follower's perspective as follows:

[0138]

[0139] where $x_j$ is the system state of the j-th follower connected to the i-th follower;

[0140] S24. Based on equations (2) and (5), the dynamics of the consistency error $e_i$ are obtained as follows:

[0141]

[0142] S25. Based on the consistency errors $e_0$ and $e_i$, the state response of the electrode adjustment system of the three-phase electric arc furnace is characterized by the following performance index function:

[0143]

[0144] Here the initial value of the consistency error is used, and $u_{-i}$ is defined as follows: apart from the control input $u_i$ of the i-th follower, the control inputs $u_j$ of the followers connected to the i-th follower are collectively denoted $u_{-i}$;

[0145] S26. For the i-th follower, $q_i(e_i, u_0, u_i, u_{-i})$ is defined as:

[0146]

[0147] where $C_i$, $D_i$, and $E_i$ are positive definite symmetric matrices, and $Q_i$, $P_{ij}$, and $B_i$ are coupling coefficient matrices;

[0148] S27. For the leader, $q_0(e_0, u_0, u_i, u_{-i})$ is defined as:

[0149]

[0150] where $C_0$ and $D_0$ are both positive definite symmetric matrices, $L_j$ is a coupling coefficient matrix, and $u_j$ is the control strategy of the j-th follower connected to the i-th follower;

[0151] S3. Based on the performance index function obtained in step S2, establish a hierarchical Stackelberg-Nash game mechanism with multiple participants.

[0152] S31. For any given leader control policy $u_0 \in U_0$, if there exists a mapping $Z_i : U_0 \to U_i$ such that:

[0153]

[0154] then the strategy set $\{Z_1(u_0), Z_2(u_0), \ldots, Z_N(u_0)\}$ constitutes a Nash equilibrium among the followers; that is, when the other followers execute the policies $Z_{-i}(u_0)$, $Z_i(u_0)$ is the best response of the i-th follower, where $Z_{-i}(u_0)$ denotes the policies of the followers other than the i-th;

[0155] S32. Based on S31, when the i-th follower executes its policy $Z_i(u_0)$ and the leader adopts its own strategy, if the performance index functions satisfy the following relationship:

[0156]

[0157] then the resulting strategy set constitutes a Stackelberg-Nash equilibrium between the leader and the followers;

[0158] S4. Based on the performance index function and using the Bellman optimality principle, construct the coupled HJB equation and solve the optimization control strategy by gradient descent.

[0159] S41. Update the performance index function (7) of the three-phase electric arc furnace electrode adjustment system as follows:

[0160]

[0161] Based on equation (12), the optimization strategy for the electrode adjustment system of the three-phase electric arc furnace is given as follows:

[0162]

[0163] where the resulting functions are the optimized performance index functions;

[0164] S42. For the i-th follower, the optimized performance index function is given as:

[0165]

[0166] For the leader, the optimized performance index function is given as:

[0167]

[0168] S43. Based on the follower consistency error dynamic formula (6), the Hamiltonian function of the i-th follower is defined as:

[0169]

[0170] Based on the leader's consistency error dynamic equation (4), the leader's Hamiltonian function is defined as:

[0171]

[0172] S44. For any given leader control policy u0, the optimal response of each follower is as follows:

[0173]

[0174] in, In strategy u0 and The corresponding performance index function; if but

[0175] S45. Combining equations (8), (16), and (18), the coupled HJB equations for each follower are obtained as follows:

[0176]

[0177] S46. Based on equation (19), the follower's optimization strategy is solved for; its specific form is as follows:

[0178]

[0179] S47. According to equation (17), the leader's optimization strategy is given as follows:

[0180]

[0181] S48. Based on equations (9), (17), (20), and (21), the coupled HJB equation of the leader is obtained as follows:

[0182]

[0183] S49. Based on equation (22), the leader's optimization strategy is solved for; its specific form is as follows:

[0184]

[0185] By combining equations (20) and (23), the optimized control strategy of each follower with respect to the leader's optimized strategy is given as follows:

[0186]

[0187] S5. Based on ADP optimization technology, an evaluation network is constructed to evaluate the optimization control strategy executed each time in order to achieve the optimization control objective of the three-phase electric arc furnace electrode adjustment system.

[0188] S51. A single-critic neural network is used to approximate the optimized performance index function, giving:

[0189]

[0190] where the three terms represent the weights, activation function, and approximation error of the single-critic neural network, respectively; the label c refers to the critic network.

[0191] According to equation (25), the gradient of the performance index function is expressed as follows:

[0192]

[0193] where the gradient terms are the gradients of the activation function and of the approximation error, respectively;

[0194] S52. Based on equations (23) and (26), the leader's optimization strategy is rewritten as follows:

[0195]

[0196] F1 and F2 are given as follows:

[0197]

[0198]

[0199] and $c = c_1 - c_2$ is bounded by a positive constant, where $c_1$ and $c_2$ are given as follows:

[0200]

[0201]

[0202] S53. Substituting equation (27) into equation (24), the optimized control strategy of each follower with respect to the leader's optimized strategy is rewritten as follows:

[0203]

[0204] where $d_i$ is norm-bounded, meaning that there exists a positive constant that bounds its norm;

[0205] S54. Based on equations (25) and (26), the optimized performance index function and its gradient are estimated as follows:

[0206]

[0207]

[0208] in, and They are respectively and The estimated value;

[0209] Combining equations (27)-(30), the optimization strategies of the leader and the followers are estimated as follows:

[0210]

[0211]

[0212] S55. Based on equations (8), (16), and (30), the estimate of the Hamiltonian function of the i-th follower is given as follows:

[0213]

[0214] in, and

[0215] Based on equations (9), (26), and (30), the estimate of the leader's Hamiltonian function is given as:

[0216]

[0217] in, and

[0218] S56. Based on equations (33) and (34), the Bellman residual function is defined as follows:

[0219]

[0220] S57. To ensure the minimum Bellman residual, the following objective function is defined:

[0221]

[0222] S58. Based on equation (36) and using gradient descent and normalization principles, the update law of the single-critic network is designed as follows:

[0223]

[0224] where $\rho_i$ is the update rate, and the accompanying term is used for normalization;

[0225] S59. Based on equation (37) and the weight error, the weight error dynamics are further obtained as follows:

[0226]

[0227] where $\eta_i$ is bounded by a positive constant.

[0228] To verify the effectiveness of the control method of the present invention, numerical simulation was performed on the three-phase electric arc furnace electrode adjustment system in the embodiments. The undirected graph topology is shown in Figure 4, from which the following adjacency matrix and Laplacian matrix are obtained:

[0229]

[0230]

[0231] Choose the state variables $x_i = [x_{i,1}, x_{i,2}, x_{i,3}, x_{i,4}]^T$ for $i = 0, 1, 2$, with state components including $x_{i,1} = I_i$ and $x_{i,3} = v_i$. The initial value of the leader's state variable is $x_0 = [0.1, 0.1, 0.1, 0.1]^T$, and the initial values of the state variables of the two followers are $x_1 = [0.01, 0.04, 0.06, 0.05]^T$ and $x_2 = [0.08, 0.1, 0.1, 0.09]^T$. Values are also assigned to the reference current, the speed feedback gains, the amplification stage gains, the motor transfer function gains, the integral stage gains, the rectifier-filter stage time constants, the motor electromechanical time constants, and the stator inductance time constants; the arc length scaling factor is $\xi_0 = \xi_1 = \xi_2 = 0.25$. Initial weights of the single-critic network update law are chosen for the leader and for the followers. The activation function is a Gaussian-type function with centers $c_{NN,l} = \{-2, -1, 0, 1, 2\}$ and widths $w_{NN,l} = \{2, 2, 1, 2, 2\}$, and the update rates are $\rho_0 = 0.87$, $\rho_1 = 1.15$, and $\rho_2 = 2.24$.
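The Gaussian-type activation with the centers and widths quoted above can be sketched as follows. The exact functional form is not given in this text, so the common choice $\varphi_l(z) = \exp(-(z - c_l)^2 / w_l^2)$ is assumed, applied to a scalar input $z$ for simplicity. How the patent maps the four-dimensional consistency error to the activation input is not stated, and the weights in the example are hypothetical.

```python
import math

# Centers and widths as listed in the simulation parameters.
CENTERS = [-2.0, -1.0, 0.0, 1.0, 2.0]
WIDTHS = [2.0, 2.0, 1.0, 2.0, 2.0]


def gaussian_features(z):
    # Assumed form: phi_l(z) = exp(-(z - c_l)^2 / w_l^2).
    return [math.exp(-((z - c) ** 2) / (w ** 2)) for c, w in zip(CENTERS, WIDTHS)]


def critic_value(weights, z):
    # Linear-in-weights critic: V_hat(z) = W^T phi(z).
    return sum(wt * f for wt, f in zip(weights, gaussian_features(z)))


# Hypothetical weights, only to exercise the functions.
example_weights = [1.0, 1.0, 1.0, 1.0, 1.0]
print(gaussian_features(0.0))
print(critic_value(example_weights, 0.0))
```

Because the critic is linear in its weights, the gradient of the approximated value with respect to the weights is simply the feature vector, which is what makes a gradient-descent weight update straightforward to implement.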

[0232] The simulation results are shown in Figures 5-7. Figure 5 shows the state response curves of the three-phase electric arc furnace electrode adjustment system (one leader and two followers) in the simulation of the embodiment of the present invention. The response curves of the output currents of the three motors, the arc lengths of the three electrodes, the speeds of the three motors, and their derivatives oscillate strongly at the beginning, then gradually become consistent and achieve state synchronization within a finite time. Figure 6 shows the single-critic network weight updates of the leader and the two followers; the weights converge and reach a stable state after a short tuning period. Figure 7 shows the control law response curves of the leader and the two followers, which oscillate slightly at first and gradually converge, eventually reaching a steady state.

Claims

1. An optimization control method for a three-phase electric arc furnace electrode adjustment system based on a hierarchical Stackelberg-Nash game, characterized in that it comprises:
S1. Establish the mathematical model of the three-phase electric arc furnace electrode adjustment system and its equivalent state-space expression, and define the network topology of the electrode adjustment system.
The mathematical model of the three-phase electric arc furnace electrode adjustment system is given by equation (1). Its symbols denote, for the i-th motor: the speed and the output current; the control quantity applied by the electrode adjustment system to the motor output current; the gain coefficient; the arc length of the three-phase electric arc; and the Laplace transform variable. A nonlinear mapping relates the motor output current to the three-phase arc; it involves the reference current, the arc length proportionality coefficient, and a nonlinear term. The remaining parameters are the speed feedback gain, the amplification stage gain, the motor amplification transfer function gain, and the integral stage gain, together with the rectifier-filter stage time constant, the motor electromechanical time constant, and the stator inductance time constant.
Define an augmented state variable and its four state components; the mathematical model of the three-phase electric arc furnace electrode adjustment system is then rewritten as equation (2), together with three associated definitions.
Define the network topology of the three-phase electric arc furnace electrode adjustment system as an undirected graph consisting of a vertex set and an undirected edge set. The i-th vertex is an element of the vertex set, and its neighbor set contains the j-th vertices connected to it. The adjacency matrix is a 3×3 matrix. Because only undirected network topologies are considered, the adjacency matrix is symmetric and its elements satisfy the corresponding conditions. The Laplacian matrix corresponding to the adjacency matrix is defined case by case for its diagonal and off-diagonal elements. The information transmission relationship between the leader and the followers is described by a diagonal matrix: its i-th diagonal entry takes one value if the i-th follower can interact directly with the leader and another value otherwise.
S2. Based on step S1, construct the consistency errors and performance index functions from the perspectives of the leader and the followers, respectively.
S3. Based on the performance index functions obtained in step S2, establish a hierarchical Stackelberg-Nash game mechanism with multiple participants.
S4. Based on the performance index functions and using the Bellman optimality principle, construct the coupled HJB equations and solve for the optimization control strategy by gradient descent.
S5. Based on ADP optimization technology, construct an evaluation network to evaluate the optimization control strategy executed each time, so as to achieve the optimization control objective of the three-phase electric arc furnace electrode adjustment system.
Step S2 is as follows:
S21. From the leader's perspective, the consistency error is constructed as equation (3), in terms of the system states of the leader and of the i-th follower.
S22. Based on equations (2) and (3), the dynamics of the leader's consistency error are given by equation (4).
S23. From the follower's perspective, the consistency error is constructed as equation (5), in terms of the system states of the i-th follower and of the j-th followers connected to it.
S24. Based on equations (2) and (5), the dynamics of the follower's consistency error are given by equation (6).
S25. Based on the two consistency errors, the state response of the three-phase electric arc furnace electrode adjustment system is characterized by the performance index function of equation (7). It depends on the initial value of the consistency error and on the control inputs of the j-th followers connected to the i-th follower, excluding the i-th follower's own control input.
S26. For the i-th follower, the corresponding term of the performance index function is defined by equation (8), in which three matrices are positive definite symmetric matrices and three further matrices are coupling coefficient matrices.
S27. For the leader, the corresponding term is defined by equation (9), in which two matrices are positive definite symmetric matrices, one matrix is a coupling coefficient matrix, and a further term denotes the control strategies of the j-th followers connected to the i-th follower.
Step S3 is as follows:
S31. For any given leader control strategy, if a mapping exists such that relation (10) holds, then the followers' strategy set constitutes a Nash equilibrium among the followers; that is, when the other followers execute their strategies, the strategy of the i-th follower is its best response.
S32. Based on S31, when each follower executes its strategy and the leader adopts its own strategy, if the performance index function satisfies relation (11), then the strategy set constitutes a Stackelberg-Nash equilibrium between the leader and the followers.
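The graph quantities named in step S1 and the follower-side consistency error of step S23 can be sketched numerically. The equations themselves are not reproduced in this text, so the sketch assumes the standard Laplacian L = D - A and a commonly used neighborhood error e_i = sum_j a_ij (x_i - x_j) + b_i (x_i - x_0). The function names, the example adjacency matrices, and the pinning gains are hypothetical; only the initial states are taken from paragraph [0231].

```python
def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix with zero diagonal."""
    n = len(adj)
    return [[sum(adj[i]) if i == j else -adj[i][j] for j in range(n)] for i in range(n)]


def follower_error(i, states, leader_state, adj, pinning):
    """Assumed consistency error of follower i (not the patent's confirmed form):
    e_i = sum_j a_ij * (x_i - x_j) + b_i * (x_i - x_0).
    """
    x_i = states[i]
    # Leader term, weighted by the pinning gain b_i.
    err = [pinning[i] * (a - b) for a, b in zip(x_i, leader_state)]
    # Neighbor terms, weighted by the adjacency entries a_ij.
    for j, x_j in enumerate(states):
        for k in range(len(x_i)):
            err[k] += adj[i][j] * (x_i[k] - x_j[k])
    return err


if __name__ == "__main__":
    # Hypothetical fully connected 3-vertex undirected graph.
    print(laplacian([[0, 1, 1], [1, 0, 1], [1, 1, 0]]))

    # Initial states from paragraph [0231].
    x0 = [0.1, 0.1, 0.1, 0.1]
    x1 = [0.01, 0.04, 0.06, 0.05]
    x2 = [0.08, 0.1, 0.1, 0.09]
    # Hypothetical follower graph: the two followers are neighbors,
    # and only the first follower communicates with the leader.
    follower_adj = [[0, 1], [1, 0]]
    pinning = [1, 0]
    print(follower_error(0, [x1, x2], x0, follower_adj, pinning))
```

With these assumptions every row of the Laplacian sums to zero, and the error of a follower vanishes only when its state matches both its neighbors and, if it is pinned, the leader.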

2. The optimization control method for a three-phase electric arc furnace electrode adjustment system based on a hierarchical Stackelberg-Nash game according to claim 1, characterized in that step S4 is as follows:
S41. Rewrite the performance index function (7) of the three-phase electric arc furnace electrode adjustment system as equation (12). Based on equation (12), the optimization strategy for the electrode adjustment system is given by equation (13), expressed in terms of the optimized performance index function.
S42. For the i-th follower, the optimized performance index function is given by equation (14); for the leader, it is given by equation (15).
S43. Based on the follower consistency error dynamics, equation (6), the Hamiltonian function of the i-th follower is defined by equation (16). Based on the leader consistency error dynamics, equation (4), the leader's Hamiltonian function is defined by equation (17).
S44. For any given leader control strategy, the optimized response of each follower is given by equation (18), stated in terms of the performance index function corresponding to the given leader and follower strategies, together with an accompanying conditional statement.
S45. Combining equations (8), (16), and (18), the coupled HJB equation of each follower is obtained as equation (19).
S46. Based on equation (19), solving the associated condition yields the follower strategy in the explicit form of equation (20).
S47. According to equation (17), the leader's optimization strategy is given by equation (21).
S48. Based on equations (9), (17), (20), and (21), the leader's coupled HJB equation is obtained as equation (22).
S49. Based on equation (22), solving the associated condition yields the leader strategy in the explicit form of equation (23). Combining equations (20) and (23), the optimized control strategy of each follower, expressed in terms of the leader's strategy, is given by equation (24).

3. The optimization control method for a three-phase electric arc furnace electrode adjustment system based on a hierarchical Stackelberg-Nash game according to claim 1, characterized in that step S5 is as follows:
S51. A single-judgment (critic) neural network is used to approximate the optimized performance index function, giving equation (25), whose terms are the weights, the activation function, and the approximation error of the single-judgment neural network; c refers to the critic network. According to equation (25), the gradient of the performance index function is expressed by equation (26), in terms of the gradients of the corresponding quantities.
S52. Based on equations (23) and (26), the leader's optimization strategy is rewritten as equation (27), with two auxiliary terms given by explicit expressions. These terms satisfy a bound defined by a positive constant.
S53. Substituting equation (27) into equation (24), the optimized control strategy of each follower, expressed in terms of the leader's strategy, is reformulated as equation (28). Its residual term is norm-bounded, meaning that there exists a positive constant that bounds its norm.
S54. Based on equations (25) and (26), the two approximated quantities are estimated by equations (29) and (30), using their estimated values. Combining equations (27) to (30), the optimized control strategies of the leader and the followers are estimated by equations (31) and (32).
S55. Based on equations (8), (16), and (30), the estimate of the Hamiltonian function of the i-th follower is given by equation (33), with two auxiliary definitions. Based on equations (9), (26), and (30), the estimate of the leader's Hamiltonian function is given by equation (34), with two auxiliary definitions.
S56. Based on equations (33) and (34), the Bellman residual function is defined by equation (35).
S57. To minimize the Bellman residual, the objective function is defined by equation (36).
S58. Based on equation (36) and using gradient descent and normalization principles, the update law of the single-judgment network is designed as equation (37), which contains the update rate and a term used for normalization.
S59. Based on equation (37) and the weight error, the weight error dynamics are obtained as equation (38), with two auxiliary definitions and a quantity bounded by a positive constant.
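Steps S56 to S58 describe minimizing a Bellman residual by gradient descent with normalization. Equations (35) to (37) are not reproduced in this text, so the following sketch uses a normalized gradient rule that is common in the ADP literature and is not confirmed to match the patent's update law. It assumes a residual that is linear in the weights, e = W·sigma + cost, an objective E = e^2 / 2, and an Euler step of dW/dt = -rho * sigma * e / (1 + sigma·sigma)^2. The names bellman_residual and critic_step, the regressor sigma, the cost value, and the step size dt are hypothetical; rho = 0.87 is the leader's update rate from paragraph [0231].

```python
def bellman_residual(weights, sigma, cost):
    """Assumed linear-in-weights residual: e = W . sigma + cost."""
    return sum(w * s for w, s in zip(weights, sigma)) + cost


def critic_step(weights, sigma, cost, rho, dt):
    """One Euler step of normalized gradient descent on E = e^2 / 2.

    The gradient of E with respect to W is sigma * e; it is divided by
    (1 + sigma . sigma)^2 so the step stays bounded for large regressors.
    """
    e = bellman_residual(weights, sigma, cost)
    norm = (1.0 + sum(s * s for s in sigma)) ** 2
    return [w - dt * rho * e * s / norm for w, s in zip(weights, sigma)]


if __name__ == "__main__":
    # Hypothetical regressor and cost; weights start at zero.
    sigma = [0.5, -0.2, 0.1, 0.3, 0.0]
    cost = 1.0
    weights = [0.0] * 5
    rho, dt = 0.87, 0.1

    for _ in range(200):
        weights = critic_step(weights, sigma, cost, rho, dt)
    print(abs(bellman_residual(weights, sigma, cost)))
```

With a fixed nonzero regressor, each step multiplies the residual by a factor strictly between 0 and 1 for these values of rho and dt, so its magnitude shrinks monotonically. In a closed-loop simulation the regressor changes over time, and convergence then depends on conditions this sketch does not check.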