A CMG frame servo system based on adaptive dynamic programming

The CMG frame servo system combines adaptive dynamic programming with sliding mode control to address insufficient control accuracy under multi-source interference, targeting high-precision, robust spacecraft attitude control.

CN115877706B · Active · Publication Date: 2026-06-23 · BEIJING INST OF CONTROL ENG


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: BEIJING INST OF CONTROL ENG
Filing Date: 2021-09-28
Publication Date: 2026-06-23

Application Information

IPC: G05B13/02

AI Tagging

Application Domain

Adaptive control

Technology Topics

Systems design, Control engineering

AI Technical Summary

Technical Problem

Existing CMG frame servo systems lack sufficient control accuracy and robustness under multi-source interference, which limits their use in high-dynamic, high-precision spacecraft attitude control.

Method used

The CMG frame servo system combines sliding mode control with an adaptive dynamic programming feedforward module built on a neural network. The neural network weights of the feedforward module are obtained through online training, providing adaptive compensation for multi-source disturbances and improving control accuracy and robustness.

Benefits of technology

The system achieves high-precision, robust control of the CMG frame servo system under multi-source disturbances. It suppresses the nonlinear interference introduced by gyro torque, unbalanced vibration torque, and nonlinear friction, and improves the accuracy and stability of speed control.

Abstract

The application relates to a CMG frame servo system based on adaptive dynamic programming and belongs to the field of CMG frame servo system design. Step one: establish the CMG frame servo system. Step two: calculate the frame motor control voltage signal u by sliding mode control and send u to the accumulation module. Step three: calculate the compensation amount of the frame motor control voltage signal by a feedforward operation and send it to the accumulation module. Step four: the accumulation module sums the received u and the compensation amount and outputs the result to control the frame motor. The application realizes adaptive compensation of multi-source interference in the frame system and high-precision, strongly robust control of the control moment gyro frame servo system under multi-source disturbance.


Description

Technical Field

[0001] This invention belongs to the field of CMG frame servo system design and relates to a CMG frame servo system based on adaptive dynamic programming.

Background Technology

[0002] Control moment gyroscopes (CMGs) are one of the core inertial actuators in spacecraft attitude control systems, widely used in the attitude control systems of large spacecraft and agile satellites, and have broad development prospects. With the continuous expansion of space missions, high-precision observation satellites, represented by high-resolution satellites, place increasingly higher demands on the attitude control accuracy of spacecraft platforms. Therefore, as the main inertial actuator for spacecraft attitude control, research on high-performance control methods for control moment gyroscopes has significant practical and engineering value.

[0003] The control accuracy of the control moment gyroscope frame servo system directly determines the accuracy of the overall output torque of the control moment gyroscope, and is one of the most critical performance indicators. As a typical space precision servo mechanism, the control moment gyroscope frame servo system is characterized by its complex and precise structure, strong nonlinearity in the control link, strong coupling of multi-source interference, variable load conditions, and microgravity. These characteristics present unique challenges to high-dynamic, high-precision servo control. Therefore, in-depth research is needed on the control methods of the control moment gyroscope frame servo system to meet the performance requirements of the control moment gyroscope.

Summary of the Invention

[0004] The technical problem solved by this invention is to overcome the shortcomings of the prior art by proposing a CMG frame servo system based on adaptive dynamic programming that adaptively compensates for multi-source disturbances in the frame system and realizes high-precision, robust control of the control moment gyroscope frame servo system under multi-source disturbances.

[0005] The solution of the present invention is:

[0006] A CMG frame servo system based on adaptive dynamic programming includes the following steps:

[0007] Step 1: Establish the CMG frame servo system, including the speed loop sliding mode control module, the adaptive dynamic programming feedforward module, and the accumulation module;

[0008] Step 2: Let the current CMG frame speed deviation value in the speed loop sliding mode control module be e(t); calculate the frame motor control voltage signal u by sliding mode control; and send u to the accumulation module;

[0009] Step 3: Feed the preceding $k_a$ CMG frame rotational speed deviation values $x = [e(t), e(t-1), \ldots, e(t-k_a)]^T$ into the adaptive dynamic programming feedforward module; calculate the compensation amount of the frame motor control voltage signal by the feedforward operation; and send the compensation amount to the accumulation module;

[0010] Step 4: The accumulation module sums the received u and the compensation amount and outputs the result to control the frame motor.
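The four steps amount to one control cycle in which two voltage contributions are computed separately and then summed. A minimal Python sketch of that data flow follows; the function name `control_step`, the callables `smc` and `feedforward`, and the numeric values are illustrative assumptions, not taken from the patent.

```python
from typing import Callable, Sequence


def control_step(
    omega_set: float,
    omega: float,
    history: Sequence[float],
    smc: Callable[[float], float],
    feedforward: Callable[[Sequence[float]], float],
) -> float:
    """One control cycle: sliding mode output plus feedforward compensation."""
    e = omega_set - omega        # Step 2: speed deviation e(t)
    u = smc(e)                   # Step 2: sliding mode voltage u
    x = [e, *history]            # Step 3: current and previous deviations
    u_comp = feedforward(x)      # Step 3: compensation amount
    return u + u_comp            # Step 4: accumulation module


# Placeholder controllers, for illustration only (not the patent's laws).
voltage = control_step(
    omega_set=10.0,
    omega=9.5,
    history=[0.4, 0.3],
    smc=lambda e: 2.0 * e,
    feedforward=lambda x: 0.1 * sum(x),
)
print(voltage)
```

Keeping the two controllers behind plain callables mirrors the modular structure of Step 1: either module can be replaced without touching the accumulation step.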

[0011] In the aforementioned CMG frame servo system based on adaptive dynamic programming, the calculation method for the current CMG frame rotational speed deviation value e(t) in step two is as follows:

[0012] $e(t) = \omega_{set}(t) - \omega(t)$

[0013] In the formula, $\omega_{set}(t)$ is the set rotational speed of the frame;

[0014] $\omega(t)$ is the actual rotational speed of the frame.

[0015] In the aforementioned CMG frame servo system based on adaptive dynamic programming, the calculation method for the frame motor control voltage signal u in step two is as follows:

[0016] Set the first variable x1 and the second variable x2;

[0017] Calculate the switching plane s of the sliding mode controller based on the first variable x1 and the second variable x2;

[0018]

[0019] In the formula, c is the adjustment coefficient of the switching plane;

[0020] Both p and q are positive odd numbers, and p < q;

[0021] The control voltage signal u for the frame motor is:

[0022]

[0023] In the formula, $k_v$ is the back-EMF coefficient of the motor;

[0024] $k_e$ is the motor torque coefficient;

[0025] J is the moment of inertia of the motor rotor;

[0026] r is the resistance of the two phases of the motor;

[0027] c is the adjustment coefficient for the switching plane;

[0028] $T_f$ is the disturbance torque;

[0029] $k_s$ is the proportional coefficient for sliding mode control, $k_s > 0$;

[0030] s is the switching plane of the sliding mode controller;

[0031] δ is the sliding mode control switching coefficient, δ > 0;

[0032] sign(s) is the sign function of s.
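The control law itself is not reproduced in this text, but the symbols $k_s$, δ and sign(s) are the usual ingredients of a proportional-plus-switching term. The Python sketch below assumes the common form $k_s s + \delta\,\mathrm{sign}(s)$; that form and the function name `reaching_term` are assumptions for illustration, not the patent's formula.

```python
import math


def reaching_term(s: float, k_s: float, delta: float) -> float:
    """Assumed proportional-plus-switching term: k_s * s + delta * sign(s)."""
    if k_s <= 0 or delta <= 0:
        raise ValueError("k_s and delta must both be positive")
    # sign(s) is taken as 0 at s == 0, so the term vanishes on the switching plane.
    sign_s = math.copysign(1.0, s) if s != 0 else 0.0
    return k_s * s + delta * sign_s


print(reaching_term(0.5, k_s=2.0, delta=0.1))
print(reaching_term(-0.5, k_s=2.0, delta=0.1))
```

In a term of this shape the proportional part dominates far from the switching plane, while the switching part keeps a fixed-magnitude push near it; the latter is also the usual source of chattering.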

[0033] In the aforementioned CMG frame servo system based on adaptive dynamic programming, the first variable x1 and the second variable x2 satisfy:

[0034]

[0035] In the formula, $\dot{x}_1$ is the derivative of x1;

[0036] $k_e$ is the motor torque coefficient;

[0037] i is the current;

[0038] $T_f$ is the disturbance torque;

[0039] J′ is the moment of inertia of the motor.

[0040] In the aforementioned CMG frame servo system based on adaptive dynamic programming, the compensation amount of the frame motor control voltage signal in step three is calculated as follows:

[0041] S31. Set the activation function of the neural network to the Sigmoid function: $\Phi(v) = (1 - e^{-v}) / (1 + e^{-v})$;

[0042] S32. Based on the preceding $k_a$ CMG frame rotational speed deviation values $x = [e(t), e(t-1), \ldots, e(t-k_a)]^T$ and the neural network activation function, calculate the compensation amount of the frame motor control voltage signal.
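The activation function in S31 is the bipolar form of the sigmoid: its output lies in (-1, 1), it is odd, and it equals tanh(v/2). A short Python check of these properties follows; the function name `phi` is chosen here for illustration.

```python
import math


def phi(v: float) -> float:
    """Activation from S31: (1 - e^-v) / (1 + e^-v)."""
    return (1.0 - math.exp(-v)) / (1.0 + math.exp(-v))


# Compare against tanh(v / 2) at a few sample points.
for v in (-2.0, 0.0, 2.0):
    print(v, phi(v), math.tanh(v / 2.0))
```

For very large negative inputs `math.exp(-v)` overflows, so an implementation that must accept unbounded inputs could call `math.tanh(v / 2.0)` directly.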

[0043] In the aforementioned CMG frame servo system based on adaptive dynamic programming, intermediate variables are introduced in step S32 as follows;

[0044] Calculate the first intermediate variable in the forward computation process of the adaptive dynamic programming feedforward module:

[0045]

[0046] In the formula, $k_a$ is the number of neurons in the input layer of the adaptive dynamic programming feedforward module;

[0047] i is the neuron row index;

[0048] j is the neuron column index;

[0049] Calculate the second intermediate variable in the forward computation process of the adaptive dynamic programming feedforward module:

[0050]

[0051] In the formula, $h_a$ is the number of hidden layer neurons in the adaptive dynamic programming feedforward module;

[0052] e is the base of the natural logarithm;

[0053] Calculate the third intermediate variable ψ(t) in the forward computation process of the adaptive dynamic programming feedforward module:

[0054]

[0055] In the formula, the first parameter is the weight from the input layer to the hidden layer;

[0056] the second parameter is the weight from the hidden layer to the output layer;

[0057] Calculate the compensation amount of the frame motor control voltage signal:

[0058]

[0059] In the aforementioned CMG frame servo system based on adaptive dynamic programming, the weights from the input layer to the hidden layer and from the hidden layer to the output layer are obtained through online training; the weights that minimize the cost function in the trained neural network are the optimal parameters.

[0060] In the aforementioned CMG frame servo system based on adaptive dynamic programming, the optimal parameters are calculated as follows:

[0061] Let the training network be called the evaluation network; let the controller network that makes up the feedforward module be called the behavior network;

[0062] The cost function J(t) of the evaluation network is:

[0063]

[0064] In the formula, γ is the discount factor that accelerates the convergence of the iteration, 0 < γ < 1;

[0065] the utility function is a quadratic function;

[0066] τ is a time variable, and the starting time of τ is the current time t;

[0067] According to the Bellman optimality principle, the minimum cost function $J^*(t)$ is:

[0068]

[0069] For the CMG system, the optimal parameters of the behavior network and the optimal parameters of the evaluation network are all unknown;

[0070] According to the behavior network parameters, calculate the compensation amount of the frame motor control voltage signal:

[0071]

[0072]

[0073]

[0074]

[0075] Set the input to the evaluation network as follows; according to the evaluation network parameters, calculate the output of the evaluation network:

[0076]

[0077]

[0078]

[0079] In the formula, $k_c$ is the number of neurons in the input layer of the evaluation network;

[0080] $h_c$ is the number of hidden layer neurons of the evaluation network;

[0081] The first intermediate variable in the forward computation process of the evaluation network;

[0082] The second intermediate variable in the forward computation process of the evaluation network;

[0083] The iterative update of the evaluation network is carried out by backpropagation, and the propagation error $E_c(t)$ of the evaluation network is:

[0084]

[0085] In the formula, $r(t) = -x^T(t) \cdot Q \cdot x(t)$ is a reinforcement signal that rewards or punishes the current control policy, used to promote the algorithm iteration process and accelerate the approach to the optimal control policy;

[0086] Q is a positive definite matrix;
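The reinforcement signal r(t) is a negative quadratic form in the deviation vector, so it is zero only when every deviation is zero and grows more negative as tracking degrades. A minimal Python sketch follows; the deviation values and the identity matrix used for Q are hypothetical.

```python
from typing import Sequence


def reinforcement_signal(x: Sequence[float], Q: Sequence[Sequence[float]]) -> float:
    """r(t) = -x(t)^T Q x(t) for a deviation vector x and a square matrix Q."""
    n = len(x)
    return -sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))


x = [0.5, 0.4]                  # hypothetical deviations [e(t), e(t-1)]
Q = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical positive definite weighting
r = reinforcement_signal(x, Q)
print(r)
```

Because Q is positive definite, r(t) is never positive: perfect tracking earns no penalty, and any nonzero deviation is punished.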

[0087] Based on Newton's gradient descent principle, the weight update rule of the evaluation network is as follows:

[0088]

[0089]

[0090] In the formula, $\lambda_c$ is the learning rate of the evaluation network, $\lambda_c > 0$;

[0091] Calculate the propagation error $E_a(t)$ of the behavior network:

[0092]

[0093] The update rule for the weights of the behavior network is as follows:

[0094]

[0095]

[0096] In the formula, $\lambda_a$ is the learning rate of the behavior network, $\lambda_a > 0$. Using the weight update rules defined above, train the neural network weights until the cost function is minimized; the parameters of the behavior network at that point are the optimal parameters.

[0097] The advantages of this invention compared to the prior art are:

[0098] (1) The neural network weight parameters of the adaptive dynamic programming feedforward module of the present invention are obtained through online training, which effectively compensates for the uncertainty of the CMG frame system, flexible dynamics, and strong multi-source coupling, and realizes high-precision, robust speed control of the CMG frame servo system.

[0099] (2) The present invention combines sliding mode control with the adaptive dynamic programming method, giving the frame servo system the ability to suppress nonlinear disturbance torques introduced by factors such as gyro torque, unbalanced vibration torque, and nonlinear friction, thereby improving the robustness of CMG frame servo system speed control.

Attached Figure Description

[0100] Figure 1 is a flowchart of the control process of the CMG frame servo system of the present invention;

[0101] Figure 2 is a schematic diagram of the CMG frame servo system of the present invention;

[0102] Figure 3 is a schematic diagram of the behavior network and evaluation network of the present invention.

Detailed Implementation

[0103] The present invention will be further described below with reference to the embodiments.

[0104] This invention provides a control method for a CMG frame servo system based on adaptive dynamic programming. This control method mainly includes two parts: the design of a speed loop sliding mode controller and adaptive dynamic programming feedforward. First, a sliding mode controller is used as the main controller to achieve robust and stable control of the closed-loop system. Second, an adaptive dynamic programming feedforward controller is used in conjunction with the sliding mode controller to form a composite controller. The parameters of the feedforward controller are obtained through online training, thereby achieving adaptive compensation for multi-source disturbances in the frame system and realizing high-precision, robust control of the control moment gyroscope frame servo system under multi-source disturbances.

[0105] As shown in Figure 1, the CMG frame servo system based on adaptive dynamic programming involves the following specific steps:

[0106] Step 1: Establish a CMG frame servo system, including a speed loop sliding mode control module, an adaptive dynamic programming feedforward module, and an accumulation module, as shown in Figure 2.

[0107] Step 2: Let the current CMG frame speed deviation value in the speed loop sliding mode control module be e(t); calculate the frame motor control voltage signal u by sliding mode control; and send u to the accumulation module. The current CMG frame speed deviation value e(t) is calculated as follows:

[0108] $e(t) = \omega_{set}(t) - \omega(t)$

[0109] In the formula, $\omega_{set}(t)$ is the set rotational speed of the frame;

[0110] $\omega(t)$ is the actual rotational speed of the frame.

[0111] The calculation method for the frame motor control voltage signal u is as follows:

[0112] Let there be a first variable x1 and a second variable x2, which satisfy:

[0113]

[0114] In the formula, $\dot{x}_1$ is the derivative of x1;

[0115] $k_e$ is the motor torque coefficient (taken as 2 N·m/A);

[0116] i is the current;

[0117] $T_f$ is the disturbance torque;

[0118] J′ is the moment of inertia of the motor (taken as 1 kg·m²).

[0119] Calculate the switching plane s of the sliding mode controller based on the first variable x1 and the second variable x2;

[0120]

[0121] In the formula, c is the adjustment coefficient of the switching plane;

[0122] Both p and q are positive odd numbers, and p < q;

[0123] The control voltage signal u for the frame motor is:

[0124]

[0125] In the formula, $k_v$ is the back-EMF coefficient of the motor (taken as 2 V/rpm);

[0126] $k_e$ is the motor torque coefficient;

[0127] J is the moment of inertia of the motor rotor;

[0128] r is the resistance of two phases of the motor, r = 3.0 Ω;

[0129] c is the adjustment coefficient for the switching plane, c = 1.0;

[0130] $T_f$ is the disturbance torque;

[0131] $k_s$ is the proportional coefficient for sliding mode control, $k_s > 0$;

[0132] s is the switching plane of the sliding mode controller;

[0133] δ is the sliding mode control switching coefficient, δ>0;

[0134] sign(s) is the sign function of s.

[0135] Step 3: Feed the preceding $k_a$ CMG frame rotational speed deviation values $x = [e(t), e(t-1), \ldots, e(t-k_a)]^T$ into the adaptive dynamic programming feedforward module; calculate the compensation amount of the frame motor control voltage signal by the feedforward operation; and send the compensation amount to the accumulation module.
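The feedforward module consumes a sliding window of recent speed deviations, newest first. One way to maintain that window is a fixed-length buffer, sketched below in Python. The class name `DeviationHistory` is hypothetical, and the window length is assumed to equal the number of input neurons ($k_a = 2$ in this embodiment); the source does not spell out the exact length.

```python
from collections import deque
from typing import List


class DeviationHistory:
    """Keeps the most recent speed deviations, newest first."""

    def __init__(self, n_inputs: int) -> None:
        # Start from zeros so the vector has full length from the first sample.
        self._buf = deque([0.0] * n_inputs, maxlen=n_inputs)

    def push(self, omega_set: float, omega: float) -> List[float]:
        """Record e(t) = omega_set - omega and return [e(t), e(t-1), ...]."""
        self._buf.appendleft(omega_set - omega)  # the oldest entry drops off the right
        return list(self._buf)


history = DeviationHistory(n_inputs=2)
print(history.push(10.0, 9.5))    # [0.5, 0.0]
print(history.push(10.0, 9.75))   # [0.25, 0.5]
```

Pre-filling with zeros means the first few vectors understate past deviations; a real system might instead wait until the buffer is full before enabling the compensation.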

[0136] The compensation amount of the frame motor control voltage signal is calculated as follows:

[0137] S31. Set the activation function of the neural network to the Sigmoid function: $\Phi(v) = (1 - e^{-v}) / (1 + e^{-v})$;

[0138] S32. Based on the preceding $k_a$ CMG frame rotational speed deviation values $x = [e(t), e(t-1), \ldots, e(t-k_a)]^T$ and the neural network activation function, calculate the compensation amount of the frame motor control voltage signal.

[0139] Specifically, intermediate variables are introduced as follows;

[0140] Calculate the first intermediate variable in the forward computation process of the adaptive dynamic programming feedforward module:

[0141]

[0142] In the formula, $k_a$ is the number of neurons in the input layer of the adaptive dynamic programming feedforward module, $k_a = 2$;

[0143] i is the neuron row index;

[0144] j is the neuron column index;

[0145] Calculate the second intermediate variable in the forward computation process of the adaptive dynamic programming feedforward module:

[0146]

[0147] In the formula, $h_a$ is the number of hidden layer neurons in the adaptive dynamic programming feedforward module, $h_a = 6$;

[0148] e is the base of the natural logarithm;

[0149] Calculate the third intermediate variable ψ(t) in the forward computation process of the adaptive dynamic programming feedforward module:

[0150]

[0151] In the formula, the first parameter is the weight from the input layer to the hidden layer;

[0152] the second parameter is the weight from the hidden layer to the output layer;

[0153] Calculate the compensation amount of the frame motor control voltage signal:

[0154]
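The forward computation equations are not reproduced in this text. The Python sketch below assumes a standard three-layer structure: weighted sums into the hidden layer, the activation Φ on each hidden neuron, a weighted sum ψ(t) at the output, and Φ applied once more to obtain the compensation. The layer sizes $k_a = 2$ and $h_a = 6$ come from the embodiment; the names `w1`, `w2`, `behavior_forward`, the random initial weights, and the final Φ are assumptions.

```python
import math
import random
from typing import List, Sequence

K_A, H_A = 2, 6  # input and hidden neuron counts given in the embodiment


def phi(v: float) -> float:
    """Activation from S31: (1 - e^-v) / (1 + e^-v)."""
    return (1.0 - math.exp(-v)) / (1.0 + math.exp(-v))


def behavior_forward(
    x: Sequence[float], w1: List[List[float]], w2: List[float]
) -> float:
    """Assumed forward pass: input -> hidden (phi) -> psi -> phi(psi)."""
    # First intermediate variable: weighted input sum for each hidden neuron.
    q = [sum(w1[i][j] * x[j] for j in range(K_A)) for i in range(H_A)]
    # Second intermediate variable: hidden neuron outputs.
    p = [phi(q_i) for q_i in q]
    # Third intermediate variable psi(t): weighted sum of hidden outputs.
    psi = sum(w2[i] * p[i] for i in range(H_A))
    return phi(psi)


rng = random.Random(0)  # fixed seed so the illustration is repeatable
w1 = [[rng.uniform(-1.0, 1.0) for _ in range(K_A)] for _ in range(H_A)]
w2 = [rng.uniform(-1.0, 1.0) for _ in range(H_A)]
u_comp = behavior_forward([0.5, 0.25], w1, w2)
print(u_comp)
```

With this structure a zero deviation vector always yields zero compensation, because Φ(0) = 0 and the sketch has no bias terms.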

[0155] The weights from the input layer to the hidden layer and from the hidden layer to the output layer are obtained through online training; the weights that minimize the cost function in the trained neural network are the optimal parameters.

[0156] The optimal parameters are calculated as follows:

[0157] Let the training network be called the evaluation network; let the controller network that makes up the feedforward module be called the behavior network; the schematic diagram is shown in Figure 3.

[0158] The cost function J(t) of the evaluation network is:

[0159]

[0160] In the formula, γ is the discount factor that accelerates the convergence of the iteration, 0 < γ < 1; here γ = 0.95;

[0161] the utility function is a quadratic function;

[0162] τ is a time variable, and the starting time of τ is the current time t;
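The cost function equation is not reproduced in this text. The Python sketch below assumes the standard discounted sum over utilities from the current time onward, $J(t) = \sum_{\tau \ge t} \gamma^{\tau - t} U(\tau)$, truncated to a finite list of samples. The function name `discounted_cost` and the utility values are hypothetical; γ = 0.95 is the value given in the embodiment.

```python
from typing import Sequence

GAMMA = 0.95  # discount factor given in the embodiment


def discounted_cost(utilities: Sequence[float], gamma: float = GAMMA) -> float:
    """Finite-horizon estimate of sum over tau of gamma^(tau - t) * U(tau)."""
    cost = 0.0
    # Work backwards so each step applies J(tau) = U(tau) + gamma * J(tau + 1).
    for u in reversed(utilities):
        cost = u + gamma * cost
    return cost


print(discounted_cost([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
print(discounted_cost([0.25, 0.16, 0.09]))          # hypothetical quadratic utilities
```

The backward loop uses the same one-step recursion that underlies the Bellman principle: the cost at one instant is the immediate utility plus the discounted cost from the next instant.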

[0163] According to the Bellman optimality principle, the minimum cost function $J^*(t)$ is:

[0164]

[0165] The Sigmoid function is still chosen as the activation function for neurons in the online training network.

[0166] For the CMG system, the optimal parameters of the behavior network and the optimal parameters of the evaluation network are all unknown;

[0167] According to the behavior network parameters, calculate the compensation amount of the frame motor control voltage signal:

[0168]

[0169]

[0170]

[0171]

[0172] Set the input to the evaluation network as follows; according to the evaluation network parameters, calculate the output of the evaluation network:

[0173]

[0174]

[0175]

[0176] In the formula, $k_c$ is the number of neurons in the input layer of the evaluation network, $k_c = 3$;

[0177] $h_c$ is the number of hidden layer neurons of the evaluation network, $h_c = 6$;

[0178] The first intermediate variable in the forward computation process of the evaluation network;

[0179] The second intermediate variable in the forward computation process of the evaluation network;

[0180] The iterative update of the evaluation network is carried out by backpropagation, and the propagation error $E_c(t)$ of the evaluation network is:

[0181]

[0182] In the formula, $r(t) = -x^T(t) \cdot Q \cdot x(t)$ is a reinforcement signal that rewards or punishes the current control policy, used to promote the algorithm iteration process and accelerate the approach to the optimal control policy;

[0183] Q is a positive definite matrix; it is used to facilitate the algorithm's iterative process and accelerate the approximation of the optimal control strategy.

[0184] Based on Newton's gradient descent principle, the weight update rule of the evaluation network is as follows:

[0185]

[0186]

[0187] In the formula, $\lambda_c$ is the learning rate of the evaluation network, $\lambda_c > 0$;
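The update equations are not reproduced in this text. As an illustration of a gradient-descent weight update with learning rate $\lambda_c$, the Python sketch below takes one step on a squared error $E_c = \frac{1}{2} e_c^2$ for the hidden-to-output weights only. The linear output layer, the error definition `e_c = j_hat - target`, and all numeric values are assumptions and are not the patent's definitions.

```python
from typing import List, Sequence, Tuple


def critic_output_step(
    w2: Sequence[float], p: Sequence[float], target: float, lam_c: float
) -> Tuple[List[float], float]:
    """One gradient-descent step on E_c = 0.5 * e_c**2 for the output weights."""
    j_hat = sum(w * p_i for w, p_i in zip(w2, p))  # linear output layer (assumed)
    e_c = j_hat - target                           # hypothetical error definition
    # dE_c/dw2_i = e_c * p_i, so each weight moves against its own gradient.
    new_w2 = [w - lam_c * e_c * p_i for w, p_i in zip(w2, p)]
    return new_w2, 0.5 * e_c ** 2


w2 = [0.0] * 6   # h_c = 6 hidden neurons, as in the embodiment
p = [0.5] * 6    # hypothetical hidden layer outputs
errors = []
for _ in range(3):
    w2, err = critic_output_step(w2, p, target=1.0, lam_c=0.1)
    errors.append(err)
print(errors)    # the squared error shrinks on every step
```

The same pattern (forward pass, error, gradient, step scaled by the learning rate) would apply to the input-to-hidden weights, with the chain rule carried through the activation function.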

[0188] Calculate the propagation error $E_a(t)$ of the behavior network:

[0189]

[0190] The update rule for the weights of the behavior network is as follows:

[0191]

[0192]

[0193] In the formula, $\lambda_a$ is the learning rate of the behavior network, $\lambda_a > 0$. Using the weight update rules defined above, train the neural network weights until the cost function is minimized; the parameters of the behavior network at that point are the optimal parameters.

[0194] Step 4: The accumulation module sums the received u and the compensation amount and outputs the result to control the frame motor.

[0195] After training, the connections among the evaluation network, the behavior network, and the sliding mode control module are disconnected. The trained parameters are then loaded into the behavior network to construct an offline feedforward compensation module, which is combined with the sliding mode control module to form a composite controller, thereby achieving robust control of the CMG frame servo system.

[0196] Although the present invention has been disclosed above with reference to preferred embodiments, it is not intended to limit the present invention. Any person skilled in the art can make possible changes and modifications to the technical solutions of the present invention by utilizing the methods and techniques disclosed above without departing from the spirit and scope of the present invention. Therefore, any simple modifications, equivalent changes and alterations made to the above embodiments based on the technical essence of the present invention without departing from the content of the technical solutions of the present invention shall fall within the protection scope of the technical solutions of the present invention.

Claims

1. A CMG frame servo system based on adaptive dynamic programming, characterized in that it includes the following steps:

Step 1: Establish the CMG frame servo system, including the speed loop sliding mode control module, the adaptive dynamic programming feedforward module, and the accumulation module;

Step 2: Let the current CMG frame speed deviation value in the speed loop sliding mode control module be e(t); calculate the frame motor control voltage signal u by sliding mode control; and send the frame motor control voltage signal u to the accumulation module;

Step 3: Feed the preceding $k_a$ CMG frame rotation speed deviation values $x = [e(t), e(t-1), \ldots, e(t-k_a)]^T$ into the adaptive dynamic programming feedforward module; calculate the compensation amount of the frame motor control voltage signal by the feedforward operation; and send the compensation amount to the accumulation module;

The compensation amount of the frame motor control voltage signal is calculated as follows:

S31. Set the activation function of the neural network to the Sigmoid function: $\Phi(v) = (1 - e^{-v}) / (1 + e^{-v})$;

S32. Based on the preceding $k_a$ CMG frame rotation speed deviation values x and the neural network activation function, calculate the compensation amount of the frame motor control voltage signal;

In step S32, intermediate variables are introduced as follows;

Calculate the first intermediate variable in the forward computation process of the adaptive dynamic programming feedforward module, where $k_a$ is the number of neurons in the input layer of the adaptive dynamic programming feedforward module, i is the neuron row index, and j is the neuron column index;

Calculate the second intermediate variable in the forward computation process of the adaptive dynamic programming feedforward module, where $h_a$ is the number of hidden layer neurons in the adaptive dynamic programming feedforward module and e is the base of the natural logarithm;

Calculate the third intermediate variable ψ(t) in the forward computation process of the adaptive dynamic programming feedforward module, using the weights from the input layer to the hidden layer and from the hidden layer to the output layer;

Calculate the compensation amount of the frame motor control voltage signal;

Step 4: The accumulation module sums the received u and the compensation amount and outputs the result to control the frame motor.

2. The CMG frame servo system based on adaptive dynamic programming according to claim 1, characterized in that: in step two, the current CMG frame rotation speed deviation value e(t) is calculated as $e(t) = \omega_{set}(t) - \omega(t)$, where $\omega_{set}(t)$ is the set rotational speed of the frame and $\omega(t)$ is the actual rotational speed of the frame.

3. The CMG frame servo system based on adaptive dynamic programming according to claim 2, characterized in that: in step two, the frame motor control voltage signal u is calculated as follows: set a first variable x1 and a second variable x2; calculate the switching plane s of the sliding mode controller from the first variable x1 and the second variable x2, where c is the adjustment coefficient of the switching plane, and p and q are both positive odd numbers with p < q; the frame motor control voltage signal u is then obtained, where $k_v$ is the back-EMF coefficient of the motor; $k_e$ is the motor torque coefficient; J is the moment of inertia of the motor rotor; r is the resistance of the two phases of the motor; c is the adjustment coefficient of the switching plane; $T_f$ is the disturbance torque; $k_s$ is the proportional coefficient for sliding mode control, $k_s > 0$; s is the switching plane of the sliding mode controller; δ is the sliding mode control switching coefficient, δ > 0; and sign(s) is the sign function of s.

4. The CMG frame servo system based on adaptive dynamic programming according to claim 3, characterized in that: the first variable x1 and the second variable x2 satisfy a relation in which $\dot{x}_1$ is the derivative of x1; $k_e$ is the motor torque coefficient; i is the current; $T_f$ is the disturbance torque; and J′ is the moment of inertia of the motor.

5. A CMG frame servo system based on adaptive dynamic programming according to claim 4, characterized in that: the weights from the input layer to the hidden layer and from the hidden layer to the output layer are obtained through online training; the weights that minimize the cost function in the trained neural network are the optimal parameters.

6. A CMG frame servo system based on adaptive dynamic programming according to claim 5, characterized in that the optimal parameters are calculated as follows:

Let the training network be called the evaluation network; let the controller network that makes up the feedforward module be called the behavior network;

The cost function J(t) of the evaluation network is defined, where γ is the discount factor that accelerates the convergence of the iteration, 0 < γ < 1; the utility function is a quadratic function; τ is a time variable, and the starting time of τ is the current time t;

According to the Bellman optimality principle, the minimum cost function $J^*(t)$ is obtained;

For the CMG system, the optimal parameters of the behavior network and the optimal parameters of the evaluation network are all unknown;

According to the behavior network parameters, calculate the compensation amount of the frame motor control voltage signal;

Set the input to the evaluation network; according to the evaluation network parameters, calculate the output of the evaluation network, where $k_c$ is the number of neurons in the input layer of the evaluation network; $h_c$ is the number of hidden layer neurons of the evaluation network; and the first and second intermediate variables are those of the forward computation process of the evaluation network;

The iterative update of the evaluation network is carried out by backpropagation, with the propagation error $E_c(t)$ of the evaluation network, where r(t) is a reinforcement signal that rewards or punishes the current control strategy, used to promote the algorithm iteration process and accelerate the approach to the optimal control strategy; and Q is a positive definite matrix;

Based on Newton's gradient descent principle, the weights of the evaluation network are updated, where $\lambda_c$ is the learning rate of the evaluation network, $\lambda_c > 0$;

Calculate the propagation error $E_a(t)$ of the behavior network;

The weights of the behavior network are updated, where $\lambda_a$ is the learning rate of the behavior network, $\lambda_a > 0$; using the defined weight update rules, the neural network weights are trained until the cost function is minimized, at which point the parameters of the behavior network are the optimal parameters.