Incomplete information underactuated system control method based on self-learning and event triggering

By using a self-learning and event-triggered control method, the error threshold is dynamically adjusted and control commands are generated using a self-learning model. This solves the problems of model dependency and resource waste in underactuated systems of UAVs and robots, and improves control accuracy and resource utilization.

CN122308128A · Pending · Publication Date: 2026-06-30 · INST OF AUTOMATION CHINESE ACAD OF SCI

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: INST OF AUTOMATION CHINESE ACAD OF SCI
Filing Date: 2026-03-05
Publication Date: 2026-06-30


Abstract

This invention belongs to the field of system control technology and provides a control method for underactuated systems based on self-learning and event triggering. The method includes: acquiring the real-time state of the underactuated system; calculating the state error between the real-time state and the target state, and outputting an event trigger command when the state error exceeds an error threshold, wherein the error threshold is dynamically determined based on changes in the real-time state or the running time of the underactuated system; inputting control parameter constraints and real-time state variables into a self-learning model; and, upon receiving the event trigger command, obtaining the control command output by the self-learning model. The sample states used to train the self-learning model are obtained by simulation based on a continuous-time nonlinear coupled dynamics model. Each control command label is the control command formed from the control parameters that minimize a preset performance index function when the sample state is taken as input. The performance index function characterizes at least the control accuracy. This invention achieves precise control of the underactuated system.

Description

Technical Field

[0001] This invention relates to the field of system control technology, and in particular to a control method for an underactuated system with incomplete information based on self-learning and event triggering.

Background Technology

[0002] Optimal control of underactuated systems for drones and robots is a core research topic in the field of intelligent unmanned systems. It is crucial for the efficient execution, energy optimization, and operational safety of tasks such as drone inspection, logistics transportation, robotic industrial operations, and service navigation. The core characteristic of underactuated systems is that the control input dimension is smaller than the system degrees of freedom. They are typical strongly nonlinear, multi-coupled complex systems. As the application scenarios of drones and robots become increasingly complex (e.g., complex outdoor weather, dynamic industrial environments, and confined indoor spaces), the requirements for the optimality, real-time performance, robustness, and resource utilization of their underactuated system control are also increasing.

[0003] Traditional control methods for underactuated systems of drones and robots mainly rely on precise physical dynamics models, such as numerical simulation control methods based on the Lagrange and Euler equations. While these methods are effective in theory, they suffer from the following technical problems in practical applications: strong dependence on physical dynamics models, insufficient robustness, flawed event-triggered mechanism design, lack of resource optimization design in self-learning control, and difficulty in balancing control optimality and resource efficiency.

Summary of the Invention

[0004] This invention provides a control method for an underactuated system with incomplete information based on self-learning and event triggering, in order to solve at least one of the aforementioned technical problems in the prior art.

[0005] This invention provides a control method for an underactuated system with incomplete information based on self-learning and event triggering, comprising the following steps: Obtain the real-time state of the underactuated system. Calculate the state error between the real-time state and the target state, and output an event trigger command if the state error is greater than the error threshold. The error threshold is dynamically determined based on the change of the real-time state or the running time of the underactuated system, and the target state is the state corresponding to the real-time state. Input the control parameter constraints and the real-time state variables into the self-learning model and, upon receiving the event trigger command, obtain the control command for the underactuated system output by the self-learning model. The self-learning model is trained based on the sample state of the underactuated system and the control command label corresponding to the sample state. The sample state of the underactuated system is obtained by simulation based on a continuous-time nonlinear coupled dynamics model. The control command label is the control command formed from the control parameters at which the preset performance index function takes its minimum value when the sample state is taken as input. The performance index function characterizes at least the control accuracy.

[0006] According to the present invention, a control method for an underactuated system based on self-learning and event triggering with incomplete information is provided, wherein the performance index function is used to characterize the comprehensive performance of control accuracy, energy consumption and triggering frequency.

[0007] According to the present invention, a control method for an underactuated system based on self-learning and event triggering with incomplete information is provided, wherein the performance index function is expressed by a formula [formula not reproduced in the source text]. The self-learning model uses a single-hidden-layer evaluation neural network to fit the performance index function [fitting expression not reproduced in the source text]. The symbols in these formulas denote, in order: the end time of the task; the state vector of the i-th underactuated system at time t; the terminal cost function at the end of the task; the vector of applied control parameters; the instantaneous performance loss during the control process; the transpose of the gradient vector; the derivative of the state vector; the transpose of the evaluation network weight matrix; and the activation function of the hidden layer.

[0008] According to the present invention, a control method for an underactuated system based on self-learning and event triggering with incomplete information is provided, wherein the error threshold is dynamically determined based on changes in the real-time state as follows: Calculate the first state error between the current real-time state and the corresponding target state; calculate the second state error between the previous real-time state and the corresponding target state; if the first state error is greater than the second state error, reduce the error threshold.

[0009] According to the present invention, a control method for an underactuated system based on self-learning and event triggering with incomplete information is provided. The error threshold is dynamically determined based on the running time of the underactuated system as follows: The longer the underactuated system runs, the smaller the error threshold becomes.

[0010] The present invention provides a control method for an underactuated system based on self-learning and event triggering with incomplete information, which further includes, before calculating the state error between the real-time state and the target state, eliminating interference information in the acquired real-time state.

[0011] According to the present invention, a control method for an underactuated system based on self-learning and event triggering with incomplete information is provided. When there are multiple underactuated systems, acquiring the real-time state of the underactuated system includes: multiple agents acquire the real-time states of their respective underactuated systems, and the agent of any underactuated system synchronizes its real-time state to the other agents.

[0012] The present invention also provides a control device for an underactuated system with incomplete information based on self-learning and event triggering, comprising the following modules: The real-time state acquisition module is used to acquire the real-time state of the underactuated system. The event triggering module is used to calculate the state error between the real-time state and the target state and to output an event trigger command if the state error is greater than the error threshold. The error threshold is dynamically determined based on the change of the real-time state or the running time of the underactuated system, and the target state is the state corresponding to the real-time state. The control command output module is used to input the control parameter constraints and the real-time state variables into the self-learning model, and to obtain the control command for the underactuated system output by the self-learning model when the event trigger command is received. The self-learning model is trained based on the sample state of the underactuated system and the control command label corresponding to the sample state. The sample state of the underactuated system is obtained by simulation based on a continuous-time nonlinear coupled dynamics model. The control command label is the control command formed from the control parameters at which the preset performance index function takes its minimum value when the sample state is taken as input. The performance index function characterizes at least the control accuracy.

[0013] The present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the control method for an underactuated system with incomplete information based on self-learning and event triggering as described above.

[0014] The present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the control method for an underactuated system with incomplete information based on self-learning and event triggering as described above.

[0015] The present invention provides a control method for underactuated systems based on self-learning and event triggering. The error threshold used to output event trigger commands is determined dynamically from changes in the real-time state or from the running time of the underactuated system, so that it adapts to the system's real-time state. This avoids both excessively frequent triggering, which wastes communication and computing resources, and excessively long trigger intervals, which degrade control accuracy. Furthermore, the control parameter constraints and the real-time state variables are input into a self-learning model, and the control command for the underactuated system output by the self-learning model is obtained upon receiving the event trigger command. The self-learning model takes the real-time state and the control parameter constraints as input and iteratively optimizes the control parameters with the goal of minimizing a preset performance index function. The control command is generated from the control parameters that minimize the performance index function, which avoids reliance on traditional physical dynamics models and improves control accuracy.

Description of the Drawings

[0016] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0017] Figure 1 is a flowchart of the control method for an underactuated system with incomplete information based on self-learning and event triggering provided by the present invention.

[0018] Figure 2 is a schematic structural diagram of the control device for an underactuated system with incomplete information based on self-learning and event triggering provided by the present invention.

[0019] Figure 3 is a schematic structural diagram of the electronic device provided by the present invention.

Detailed Description

[0020] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.

[0021] In the existing related technologies, control methods for underactuated systems that rely on physical dynamics models have the following shortcomings: 1. Strong model dependence and insufficient robustness: Building a physical model requires accurate knowledge of the dynamic characteristics, coupling terms, and constraints of the underactuated system. However, UAVs are subject to environmental disturbances such as wind speed and wind direction during actual flight, and robots are affected by uncertainties such as ground friction and joint wear during movement. This makes it extremely difficult to establish an accurate physical model, and the control accuracy of traditional model-based control algorithms drops significantly in the presence of disturbances.

[0022] 2. Deficiencies in the design of the event triggering mechanism: Most existing control methods that incorporate event triggering use static trigger thresholds, which cannot be adjusted dynamically according to the real-time state of the system. This easily leads either to excessively frequent triggering, which wastes communication and computing resources, or to excessively long trigger intervals, which degrade control accuracy.

[0023] 3. Self-learning control lacks resource optimization design: In recent years, self-learning control methods based on neural networks and adaptive dynamic programming have been applied in underactuated systems. They can optimize control laws (i.e. control commands) through data-driven optimization. However, these methods mostly adopt a mode of continuous sampling and real-time control updates, which greatly increases the energy consumption of UAVs and the computational load of robots, making it difficult to meet the actual needs of UAV endurance and robot real-time control.

[0024] 4. It is difficult to balance control optimality and resource efficiency: When designing control strategies for underactuated systems, traditional methods either focus on control accuracy and optimality while ignoring resource consumption, or focus on resource conservation while sacrificing control performance. An effective balance between the two has not yet been achieved.

[0025] To address at least one of the aforementioned technical problems in the existing related technologies, this invention provides a control method for an underactuated system with incomplete information based on self-learning and event triggering. As shown in Figure 1, the method includes the following steps S110 to S130.

Step S110: Obtain the real-time state of the underactuated system. In this step, the underactuated system can be that of a drone or a robot. For example, the real-time state of a drone underactuated system includes real-time state variables such as the drone's position, attitude, angular velocity, and flight speed. The real-time state of a robot underactuated system includes real-time state variables such as the robot's pose, joint angles, joint velocities, and motion speed.

[0026] For example, the real-time state of the underactuated system can be observed by an observer. This observer can be an external camera independent of the underactuated system, which observes the state of the UAV to obtain the aforementioned real-time state variables. Alternatively, it can be an observation module embedded in the UAV underactuated system, which directly obtains the aforementioned real-time state variables from the UAV underactuated system.

[0027] Step S120: Calculate the state error between the real-time state and the target state. If the state error is greater than the error threshold, output an event trigger command. The error threshold is dynamically determined based on the change in the real-time state or the running time of the underactuated system.

[0028] The target state is the desired state that the underactuated system is to be controlled to achieve. The target state corresponds to the real-time state; that is, each real-time state corresponds to a target state at a given moment. For example, to control a drone to fly from point A to point B along a predetermined flight path, the target state can at least include the drone's position, meaning the drone should not deviate from the predetermined flight path during flight. The target state can also include the drone's attitude (including pitch, yaw, and roll angles) at the target position. The more accurate the attitude, the less likely it is to deviate from the trajectory.

[0029] Taking the position of the UAV as the target state, the target position corresponding to the real-time state can be obtained along the trajectory from the predetermined speed and the flight time. The state error is the position error between the UAV's real-time position and the corresponding target position, and the error threshold is accordingly a position error threshold. While the UAV is being controlled by inputting control parameters to its underactuated system, if the error between the UAV's real-time position and the corresponding target position is greater than the position error threshold, the UAV has deviated significantly from the predetermined flight trajectory. In this case, an event trigger command is output to trigger the re-output of control commands, thereby steering the UAV back to the predetermined flight trajectory.
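The position example in paragraph [0029] can be sketched in Python as follows. This is a minimal illustration, assuming a straight-line trajectory segment, constant speed, and the Euclidean distance as the state error; the function names and numeric values are hypothetical and do not come from the patent.

```python
import math

def target_position(start, end, speed, elapsed):
    """Point reached after `elapsed` seconds at constant `speed` on the segment start -> end."""
    length = math.dist(start, end)
    # Clamp to the end of the segment once the whole distance has been covered.
    fraction = min(1.0, speed * elapsed / length) if length > 0 else 1.0
    return tuple(s + fraction * (e - s) for s, e in zip(start, end))

def should_trigger(position, target, threshold):
    """True when the position error exceeds the position error threshold."""
    return math.dist(position, target) > threshold

# Assumed example: 100 m segment flown at 5 m/s, checked after 4 s.
target = target_position((0.0, 0.0, 10.0), (100.0, 0.0, 10.0), speed=5.0, elapsed=4.0)
triggered = should_trigger((20.0, 0.5, 10.0), target, threshold=0.3)
```

Here the UAV is 0.5 m to the side of its target point, so a 0.3 m threshold produces a trigger while a looser threshold would not.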

[0030] In this step, the error threshold is dynamically determined based on real-time state changes or the runtime of the underactuated system. For example, when controlling the UAV's underactuated system to fly along a predetermined flight path, a larger error threshold can be set when the UAV is far from the destination to avoid frequent triggering. When the UAV is close to the destination (after a longer flight time), the error threshold can be dynamically reduced, resulting in more precise control and enabling the UAV to land accurately at the destination. Therefore, compared to existing technologies that often use static trigger thresholds and employ continuous sampling and real-time control updates, this approach achieves dynamic adjustment based on the system's real-time state, avoiding problems such as excessively frequent triggering leading to wasted communication and computing resources or excessively long trigger intervals affecting control accuracy.

[0031] Step S130: Input the control parameter constraints and the real-time state variables into the self-learning model. When the event trigger command is received, obtain the control command for the underactuated system output by the self-learning model. That is, the self-learning model predicts and outputs a control command based on the control parameter constraints and the real-time state variables only when triggered by the event trigger command.

[0032] Control commands are instructions formed from control parameters; that is, a control command contains the control parameters that are input to the underactuated system to control the drone or robot. For an underactuated drone system, the control parameters can be thrust (which controls flight speed) and angle (which controls flight direction and attitude). Each underactuated system has corresponding control parameter constraints, which include the value range of each control parameter and/or the nonlinear coupling characteristics between control parameters. For an underactuated drone system, for example, the constraints include the maximum thrust and the maximum range of angular steering.
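A value-range constraint of the kind described in paragraph [0032] can be illustrated with a simple clamp. The sketch below is only an assumption about how such limits might be enforced; the parameter names and numeric limits are hypothetical, and the nonlinear coupling constraints mentioned in the text are not modeled.

```python
def clip(value, low, high):
    """Clamp a single value to the closed interval [low, high]."""
    return max(low, min(high, value))

def apply_constraints(command, limits):
    """Clamp each control parameter in `command` to its permitted range."""
    return {name: clip(value, *limits[name]) for name, value in command.items()}

# Hypothetical limits: thrust in newtons, angle in radians.
limits = {"thrust": (0.0, 20.0), "angle": (-0.5, 0.5)}
safe = apply_constraints({"thrust": 25.0, "angle": -0.7}, limits)
```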

[0033] The self-learning model is trained based on the sample state of the underactuated system and the corresponding control command label. The sample state of the underactuated system is obtained by simulation based on a continuous-time nonlinear coupled dynamics model. The control command label is a control command generated by the control parameters corresponding to the minimum value of the preset performance index function when the sample state is taken as input. The performance index function at least characterizes the performance of control accuracy.

[0034] Specifically, in the continuous-time nonlinear coupled dynamics model, the state equation of the i-th underactuated system (of N ≥ 1 systems) is given by formula (1) [formula not reproduced in the source text].

[0035] In formula (1), the symbols denote the following: the real-time state of the i-th underactuated system and its time derivative; the underactuated characteristic, under which the state lies in the n_i-dimensional real space and the control parameters lie in the m_i-dimensional real space, with m_i < n_i; the control parameters, which satisfy saturation constraints; a nonlinear function that is locally Lipschitz continuous on a bounded compact set; and a bounded disturbance, which is the incomplete information referred to in this method and satisfies a bound that is not reproduced in the source text.
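Paragraphs [0033] to [0035] state that sample states come from simulating a continuous-time nonlinear dynamics model subject to a bounded disturbance. A minimal Python sketch of such a rollout is given here, assuming explicit Euler integration and a toy system with two state components and one control input. The toy dynamics, the function names, the step size, and the uniformly distributed disturbance are illustrative assumptions and are not the patent's model.

```python
import math
import random

def dynamics(x, u, d):
    """Toy nonlinear system (assumed): two state components, one control input."""
    x1, x2 = x
    return (x2, -math.sin(x1) + u + d)

def simulate(x0, u, steps, dt=0.01, d_max=0.0, seed=0):
    """Explicit-Euler rollout under constant input u; returns the visited states."""
    rng = random.Random(seed)
    x = tuple(x0)
    states = [x]
    for _ in range(steps):
        d = rng.uniform(-d_max, d_max)  # bounded disturbance, |d| <= d_max
        dx = dynamics(x, u, d)
        x = tuple(xi + dt * dxi for xi, dxi in zip(x, dx))
        states.append(x)
    return states

# Assumed usage: collect 200 sample states from one disturbed rollout.
samples = simulate((0.1, 0.0), u=0.2, steps=200, dt=0.01, d_max=0.05)
```

Fixing the seed keeps the generated samples reproducible, which is convenient when the same states are later paired with control command labels.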

[0036] The sample states used to train the self-learning model are obtained by simulating the continuous-time nonlinear coupled dynamics model, and each control command label is the control command formed from the control parameters that minimize the preset performance index function when the sample state is taken as input. As a result, the trained self-learning model takes the real-time state as input and, subject to the control parameter constraints, outputs a control command that approximates the command at which the performance index function is optimal, thereby optimizing the control accuracy of the underactuated system.
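The idea of labeling a sample state with the control parameters that minimize a performance index can be illustrated with a deliberately simplified search. The sketch below assumes a scalar state, a hypothetical one-step prediction model, a one-step cost with an accuracy term and a small energy term, and an exhaustive search over a few admissible control values. The patent's performance index is defined over the whole task, so this is a simplification for illustration only.

```python
def step(state, u, dt=0.1):
    """Hypothetical one-step model: the state moves by u * dt."""
    return state + dt * u

def cost(next_state, target, u, energy_weight=0.01):
    """Accuracy term plus a small control-effort (energy) term."""
    return (next_state - target) ** 2 + energy_weight * u ** 2

def label_for(state, target, candidates):
    """Control value with the lowest cost among the admissible candidates."""
    return min(candidates, key=lambda u: cost(step(state, u), target, u))

# Candidates already lie inside the assumed constraint range [-1, 1].
candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
label = label_for(0.0, 0.1, candidates)
```

In this example the largest input would reach the target exactly, but the energy term makes the moderate input 0.5 the cheaper choice, which shows how the index trades accuracy against effort.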

[0037] Of course, if the state error is not greater than the error threshold in step S120, no event triggering command will be output. In step S130, the self-learning model does not need to re-predict and output new control commands, and can continue to output the previous control commands to the underactuated system.
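The behavior described in steps S120 and S130, where a new command is computed only on a trigger and the previous command is otherwise repeated, can be sketched as a small wrapper class. The class name, the callable model interface, and the toy model are assumptions made for illustration.

```python
class EventTriggeredController:
    """Calls the model only on a trigger; otherwise repeats the last command."""

    def __init__(self, model, initial_command):
        self.model = model                  # callable: (state, constraints) -> command
        self.last_command = initial_command
        self.model_calls = 0                # counts how often the model was evaluated

    def step(self, state, constraints, triggered):
        if triggered:
            self.last_command = self.model(state, constraints)
            self.model_calls += 1
        return self.last_command

def toy_model(state, constraints):
    """Stand-in for the trained self-learning model: clamped proportional law."""
    low, high = constraints
    return max(low, min(high, -0.5 * state))

ctrl = EventTriggeredController(toy_model, initial_command=0.0)
a = ctrl.step(2.0, (-1.0, 1.0), triggered=False)  # no trigger: initial command is held
b = ctrl.step(2.0, (-1.0, 1.0), triggered=True)   # trigger: model is evaluated
c = ctrl.step(0.4, (-1.0, 1.0), triggered=False)  # no trigger: previous command is held
```

Counting model evaluations makes the resource saving visible: three control steps here cost a single model call.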

[0038] This embodiment provides a control method for an underactuated system with incomplete information based on self-learning and event triggering. The error threshold used to output event trigger commands is determined dynamically from changes in the real-time state or from the running time of the underactuated system, so that it adapts to the system's real-time state. This avoids both excessively frequent triggering, which wastes communication and computing resources, and excessively long trigger intervals, which degrade control accuracy. Furthermore, the control parameter constraints and the real-time state variables are input into the self-learning model, and the control command for the underactuated system output by the self-learning model is obtained upon receiving the event trigger command. The self-learning model takes the real-time state as input and, under the control parameter constraints, learns to output a control command that approximates the command at which the performance index function is optimal, which avoids dependence on traditional physical dynamics models and improves control accuracy. Moreover, once trained, the self-learning model predicts control commands more efficiently than simulation with a traditional dynamics model.

[0039] In some embodiments, the performance index function is used to characterize the combined performance of control accuracy, energy consumption, and trigger frequency. Specifically, the performance index function is expressed by formula (2) [formula not reproduced in the source text].

[0040] The self-learning model uses a single-hidden-layer evaluation network to fit the performance index function, and the fitting expression is formula (3) [formula not reproduced in the source text].

[0041] The symbols in formulas (2) and (3) denote, in order: the end time of the task (i.e., the end time of a single control task); the state vector of the i-th underactuated system at time t; the terminal cost function, which measures how well the underactuated system has reached the target state at the end of the task (for example, the cost of the deviation between a UAV and its target position at the end of the task); the vector of applied control parameters (the input); the instantaneous performance loss during the control process, such as the combined effect of energy consumption, control accuracy error, and trigger frequency penalty; the transpose of the gradient vector, which describes the rate at which the performance index changes with the state; the derivative of the state vector; the transpose of the evaluation network weight matrix; and the activation function of the hidden layer.
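Formulas (2) and (3) themselves do not appear in this text. A form commonly used in adaptive dynamic programming that is consistent with the symbol definitions given for them is sketched below. It is an assumption, not the patent's own formula, and the symbols T, \phi, r, J_i, W_c and \sigma are placeholders chosen for illustration.

```latex
% Assumed finite-horizon performance index (stand-in for formula (2))
J_i\bigl(x_i(t)\bigr) = \phi\bigl(x_i(T)\bigr)
  + \int_t^{T} r\bigl(x_i(\tau), u_i(\tau)\bigr)\,\mathrm{d}\tau

% Assumed single-hidden-layer evaluation-network approximation (stand-in for formula (3))
\hat{J}_i(x_i) = W_c^{\top}\,\sigma(x_i)

% Hamiltonian in which the gradient transpose and the state derivative would appear
H_i = r(x_i, u_i) + \bigl(\nabla J_i(x_i)\bigr)^{\top}\,\dot{x}_i
```

Under this reading, the terminal cost \phi and the running loss r together express accuracy, energy consumption, and trigger frequency, and the evaluation network is trained so that its output tracks J_i.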

[0042] The self-learning model uses a single-hidden-layer evaluation network to approximate the comprehensive performance index function, and can also use an execution network to approximate the optimal control command; that is, the execution network outputs the optimal control command. It is understood that the self-learning model in this embodiment can use existing deep learning neural network structures.
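The forward pass of a single-hidden-layer evaluation network can be sketched in a few lines of Python. The sketch assumes a tanh activation, a hidden-layer weight matrix applied to the state, and an output weight vector; none of these choices, nor the names used, are specified by the patent.

```python
import math

def critic_value(x, hidden_weights, output_weights):
    """Estimated performance index: output_weights^T * tanh(hidden_weights @ x)."""
    hidden = [math.tanh(sum(v * xi for v, xi in zip(row, x))) for row in hidden_weights]
    return sum(w * h for w, h in zip(output_weights, hidden))

# Assumed example: two state components, two hidden units.
V = [[1.0, 0.0], [0.0, 1.0]]   # hidden-layer weights (hypothetical)
W = [2.0, 3.0]                 # evaluation-network output weights (hypothetical)
value = critic_value((1.0, 0.0), V, W)
```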

[0043] In this embodiment, the performance index function is used to characterize the comprehensive performance of control accuracy, energy consumption and trigger frequency. The self-learning model achieves a balance between control accuracy and resource consumption of the underactuated system by approximating the performance index function.

[0044] In some embodiments, the error threshold is dynamically determined based on changes in real-time status as follows: Calculate the first state error between the current real-time state and the corresponding target state.

[0045] Calculate the second state error between the previous real-time state and the corresponding target state.

[0046] If the first state error is greater than the second state error, the error threshold is reduced.

[0047] Specifically, if the first state error is greater than the second state error, the underactuated system is drifting away from the target state. In this case the error threshold needs to be reduced so that new control commands are output more frequently, bringing the real-time state of the underactuated system closer to the corresponding target state and thereby achieving more precise control.
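The trend-based rule in paragraphs [0044] to [0047] can be sketched as a small update function. The text only states that the threshold is reduced when the current error exceeds the previous one; the multiplicative shrink factor, the lower bound, and leaving the threshold unchanged otherwise are assumptions of this sketch.

```python
def update_threshold(threshold, current_error, previous_error, shrink=0.8, floor=0.05):
    """Reduce the threshold when the error is growing; otherwise leave it unchanged."""
    if current_error > previous_error:
        # Never shrink below `floor`, so triggering cannot become unbounded.
        return max(floor, threshold * shrink)
    return threshold

tightened = update_threshold(1.0, current_error=0.6, previous_error=0.4)
unchanged = update_threshold(1.0, current_error=0.3, previous_error=0.4)
```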

[0048] In some embodiments, the error threshold is dynamically determined based on the underactuated system's runtime as follows: the longer the underactuated system runs, the smaller the error threshold. This is because a longer runtime means the control of the underactuated system is closer to the final target state, requiring more precise control to ensure the underactuated system accurately reaches the target state. Therefore, the longer the runtime, the smaller the error threshold. For example, in controlling a UAV's underactuated system to fly along a predetermined trajectory, the longer the flight time, the closer the UAV is to the destination. More precise control is needed to land the UAV at the destination. Therefore, a longer flight time results in a smaller error threshold, leading to more frequent output of new control commands and a more precise landing of the UAV at the destination.
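One way to make the threshold shrink with running time is a monotonically decreasing schedule. The sketch below assumes an exponential decay toward a lower bound; the patent only requires that a longer running time gives a smaller threshold, so the decay law and all numeric values are illustrative.

```python
import math

def runtime_threshold(elapsed_s, initial=1.0, floor=0.1, decay_rate=0.05):
    """Threshold that decreases monotonically with run time and never drops below `floor`."""
    return floor + (initial - floor) * math.exp(-decay_rate * elapsed_s)

early = runtime_threshold(5.0)    # looser threshold far from the destination
late = runtime_threshold(60.0)    # tighter threshold near the destination
```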

[0049] In some embodiments, before calculating the state error between the real-time state and the target state, the method further includes: eliminating interference information in the acquired real-time state, that is, eliminating the interference information in the above formula (1). Specifically, interference information can be eliminated by filtering out outliers in the observed real-time state, thereby eliminating the impact of interference information on control accuracy.
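Outlier filtering of the observed state, as mentioned in paragraph [0049], could look like the following sketch. It assumes a short window of scalar measurements and rejects samples that lie too far from the window median; the window, the deviation limit, and the function names are hypothetical.

```python
from statistics import median

def reject_outliers(samples, max_deviation):
    """Keep samples within `max_deviation` of the median of the window."""
    mid = median(samples)
    return [s for s in samples if abs(s - mid) <= max_deviation]

def filtered_estimate(samples, max_deviation):
    """Mean of the retained samples; falls back to the median if none are retained."""
    kept = reject_outliers(samples, max_deviation)
    return sum(kept) / len(kept) if kept else median(samples)

# Assumed altitude readings in metres with one spurious spike.
readings = [10.0, 10.2, 9.9, 42.0, 10.1]
estimate = filtered_estimate(readings, max_deviation=1.0)
```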

[0050] In some embodiments, when there are multiple underactuated systems, obtaining the real-time state of the underactuated system includes: multiple agents obtaining the real-time state of their respective underactuated systems, and for any agent of an underactuated system, synchronizing the corresponding real-time state to other agents.

[0051] Specifically, each underactuated system is configured with an agent as an observer, forming a distributed observer. For any agent in an underactuated system, the corresponding real-time state is synchronized to other agents.

[0052] In multi-agent systems, distributed observers enable each node (i.e., the underactuated system) to upgrade from "local perception" to "global cognition" by synchronizing the real-time states of the corresponding underactuated systems observed by each agent. The core function of this mechanism is to improve the accuracy of state estimation through multi-source information fusion, provide reliable global state consensus for cooperative control, and enhance the robustness of the system when some nodes fail.
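The synchronization step can be illustrated with a minimal in-memory sketch in which every agent shares its own latest observation with all others. The class and function names are hypothetical, and real systems would exchange these states over a communication link rather than through shared objects.

```python
class Agent:
    """Observer attached to one underactuated system."""

    def __init__(self, name):
        self.name = name
        self.known_states = {}  # agent name -> latest known state

    def observe(self, state):
        """Record this agent's own real-time state."""
        self.known_states[self.name] = state

def synchronize(agents):
    """Copy every agent's own latest observation to all other agents."""
    own = {a.name: a.known_states[a.name] for a in agents if a.name in a.known_states}
    for a in agents:
        a.known_states.update(own)

uav1, uav2 = Agent("uav1"), Agent("uav2")
uav1.observe((0.0, 0.0, 10.0))
uav2.observe((5.0, 0.0, 10.0))
synchronize([uav1, uav2])
```

After the call, each agent holds the states of both systems, which is the "global" view the text describes.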

[0053] In drone formation displays, each drone's distributed observer synchronizes its own position, attitude, and observation data from neighboring nodes to precisely maintain the formation and prevent pattern distortion due to local sensor failures. In multi-robot collaborative handling scenarios, each robot synchronizes force sensor and visual information through its observers to achieve a consistent understanding of the workpiece's attitude and force distribution, ensuring a smooth and safe handling process. Distributed observer state synchronization is a key technological support for achieving efficient and reliable collaboration in multi-agent systems.

[0054] The following describes the control device for an underactuated system with incomplete information based on self-learning and event triggering provided by the present invention. The control device described below and the control method described above may be read with reference to each other.

[0055] The present invention provides a control device for an underactuated system with incomplete information based on self-learning and event triggering. As shown in Figure 2, the device includes: The real-time state acquisition module 210, which is used to acquire the real-time state of the underactuated system.

[0056] The event triggering module 220 is used to calculate the state error between the real-time state and the target state. If the state error is greater than the error threshold, it outputs an event triggering command. The error threshold is dynamically determined based on the change of the real-time state or the running time of the underactuated system. The target state is the state corresponding to the real-time state.

[0057] The control command output module 230 is used to input the control parameter constraints and the real-time state variables into the self-learning model, and obtain the control command output by the self-learning model for the underactuated system when the event trigger command is received.

[0058] The self-learning model is trained based on sample states of the underactuated system and the control command labels corresponding to the sample states. The sample states of the underactuated system are obtained by simulation based on a continuous-time nonlinear coupled dynamics model. The control command label is the control command generated from the control parameters that, with the sample state as input, minimize a preset performance index function. The performance index function at least characterizes control accuracy.
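The label generation described above can be sketched as follows. This is an illustrative toy, not the training procedure of the invention: the pendulum-like dynamics in `step`, the linear feedback law, the gain grid standing in for the control parameter constraints, and the squared-error cost standing in for control accuracy are all assumptions.

```python
import math
from typing import List, Tuple

State = Tuple[float, float]
Gains = Tuple[float, float]

# Hypothetical control parameter constraints: the admissible feedback gains.
GAIN_GRID: List[Gains] = [(k1, k2) for k1 in (1.0, 4.0, 8.0) for k2 in (1.0, 2.0, 4.0)]


def step(x: State, u: float, dt: float) -> State:
    """One forward-Euler step of an assumed nonlinear model."""
    x1, x2 = x
    return (x1 + dt * x2, x2 + dt * (-math.sin(x1) + u))


def sample_states(x0: State, count: int, stride: int = 100, dt: float = 0.01) -> List[State]:
    """Collect sample states by simulating the uncontrolled model."""
    samples, x = [], x0
    for k in range(count * stride):
        if k % stride == 0:
            samples.append(x)
        x = step(x, 0.0, dt)
    return samples


def rollout_cost(x0: State, gains: Gains, dt: float = 0.01, steps: int = 300) -> float:
    """Accumulated squared state error under the feedback law u = -k1*x1 - k2*x2."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -gains[0] * x[0] - gains[1] * x[1]
        x = step(x, u, dt)
        cost += (x[0] ** 2 + x[1] ** 2) * dt
    return cost


def make_label(x0: State) -> Tuple[Gains, float]:
    """Pick the gains that minimize the cost and return the resulting command."""
    best = min(GAIN_GRID, key=lambda g: rollout_cost(x0, g))
    return best, -best[0] * x0[0] - best[1] * x0[1]


# Each entry pairs a sample state with its (gains, command) label.
dataset = [(x0, make_label(x0)) for x0 in sample_states((1.0, 0.0), count=3)]
```

A grid search is used here only because it is easy to verify; any optimizer that respects the parameter constraints could take its place.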

[0059] In this embodiment, the incomplete information underactuated system control device based on self-learning and event triggering dynamically determines the error threshold based on changes in the real-time state or on the running time of the underactuated system. The error threshold used to output event trigger commands is thus adjusted to the system's real-time state, which avoids both excessively frequent triggering, which wastes communication and computing resources, and excessively long triggering intervals, which degrade control accuracy. Furthermore, the control parameter constraints and the real-time state variables are input into the self-learning model, and, upon receiving the event trigger command, the control command output by the self-learning model for the underactuated system is obtained. The self-learning model takes the real-time state and the control parameter constraints as input and iteratively optimizes the control parameters with the goal of minimizing a preset performance index function. The control command is generated from the control parameters that minimize the performance index function, which avoids dependence on a traditional physical dynamics model and improves control accuracy.
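The interplay between the event triggering module and the control command output module can be sketched as below. This is a hedged illustration: the class `EventTriggeredController`, the Euclidean error norm, the clipped proportional `stand_in_model` used in place of a trained self-learning model, and holding the previous command between triggers are all assumptions not stated in the text.

```python
import math
from typing import Callable, List, Sequence, Tuple

Constraints = Tuple[float, float]  # assumed (lower, upper) bound on each command
Model = Callable[[Sequence[float], Constraints], List[float]]


def state_error(state: Sequence[float], target: Sequence[float]) -> float:
    """Euclidean distance between the real-time state and the target state."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(state, target)))


def stand_in_model(state: Sequence[float], constraints: Constraints) -> List[float]:
    """Placeholder for the trained model: clipped proportional feedback."""
    lo, hi = constraints
    return [min(hi, max(lo, -1.0 * s)) for s in state]


class EventTriggeredController:
    def __init__(self, model: Model, constraints: Constraints,
                 threshold: float, initial_command: List[float]) -> None:
        self.model = model
        self.constraints = constraints
        self.threshold = threshold
        self.command = initial_command

    def step(self, state: Sequence[float], target: Sequence[float]) -> Tuple[List[float], bool]:
        """Query the model only when the state error exceeds the threshold."""
        triggered = state_error(state, target) > self.threshold
        if triggered:
            self.command = self.model(state, self.constraints)
        # Assumption: the last command is held until the next trigger.
        return self.command, triggered
```

Because the model is evaluated only on a trigger, computation and communication are spent only when the error has grown beyond the threshold.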

[0060] In some embodiments, the performance index function is used to characterize the combined performance of control accuracy, energy consumption, and trigger frequency.
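One way to combine the three quantities named above is a weighted sum, sketched here under stated assumptions. The function name `performance_index`, the quadratic accuracy and energy terms, the trigger-rate term, and the weight values are all hypothetical; the text only states that the index characterizes accuracy, energy consumption, and trigger frequency together.

```python
from typing import Sequence


def performance_index(errors: Sequence[float], commands: Sequence[float],
                      triggers: Sequence[bool], dt: float,
                      w_acc: float = 1.0, w_energy: float = 0.1,
                      w_trig: float = 0.01) -> float:
    """Weighted sum of tracking error, control effort, and trigger rate."""
    accuracy = sum(e * e for e in errors) * dt       # integrated squared error
    energy = sum(u * u for u in commands) * dt       # integrated squared command
    duration = len(triggers) * dt
    trigger_rate = sum(triggers) / duration if duration > 0 else 0.0
    return w_acc * accuracy + w_energy * energy + w_trig * trigger_rate


# Two time steps of hypothetical data, one of which triggered an update.
value = performance_index([1.0, 0.5], [2.0, 0.0], [True, False], dt=0.5)
```

Raising `w_trig` relative to `w_acc` favors fewer control updates at the expense of tracking accuracy.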

[0061] In some embodiments, the performance index function is expressed by a formula [formula not reproduced in this text]. The self-learning model uses a single-hidden-layer evaluation neural network to fit the performance index function, with a fitting expression [formula not reproduced in this text]. The symbols in these expressions denote, in order: the task end time; the state vector of the i-th underactuated system at time t; the cost function at the task end time; the applied control parameter vector; the instantaneous performance loss during the control process; the transpose of the gradient vector; the derivative of [symbol not reproduced in this text]; the transpose of the evaluation network weight matrix; and the activation function of the hidden layer.
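The formulas in the paragraph above are not legible in this text. As a hedged illustration only, one form consistent with the verbal symbol definitions is a finite-horizon cost with a terminal term, approximated by a single-hidden-layer evaluation network. Every symbol below (T, x_i, u_i, Φ, L, W, σ) and the residual condition are assumptions chosen to match those definitions; the expressions in the application itself may differ.

```latex
% Assumed finite-horizon performance index of the i-th underactuated system
J_i\bigl(x_i(t)\bigr) = \Phi\bigl(x_i(T)\bigr)
  + \int_t^{T} L\bigl(x_i(\tau), u_i(\tau)\bigr)\,\mathrm{d}\tau

% Assumed single-hidden-layer evaluation-network approximation
\hat{J}_i(x_i) = W^{\top}\sigma(x_i)

% Assumed residual that training would drive toward zero
\nabla \hat{J}_i(x_i)^{\top}\,\dot{x}_i + L(x_i, u_i) \approx 0
```

Under these assumptions, Φ is the terminal cost at the task end time T, L is the instantaneous performance loss, and W and σ are the weight matrix and hidden-layer activation of the evaluation network.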

[0062] In some embodiments, the error threshold is dynamically determined based on changes in the real-time state as follows: calculate a first state error between the current real-time state and its corresponding target state; calculate a second state error between the previous real-time state and its corresponding target state; and, if the first state error is greater than the second state error, reduce the error threshold.
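The rule above can be sketched in a few lines. The multiplicative `shrink` factor, the lower bound `floor`, and leaving the threshold unchanged when the error is not growing are assumptions; the text only states that the threshold is reduced when the current error exceeds the previous one.

```python
def update_threshold(threshold: float, current_error: float,
                     previous_error: float, shrink: float = 0.8,
                     floor: float = 1e-3) -> float:
    """Reduce the threshold when the state error is growing."""
    if current_error > previous_error:
        # Assumed reduction rule: scale down, but never below the floor.
        return max(floor, threshold * shrink)
    # Assumption: the threshold is left unchanged otherwise.
    return threshold
```

A growing error then makes the next trigger more likely, so control updates become more frequent while the system is drifting away from its target.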

[0063] In some embodiments, the error threshold is dynamically determined based on the underactuated system runtime as follows: The longer the underactuated system runs, the smaller the error threshold becomes.
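A runtime-dependent threshold of this kind could, for example, decay exponentially toward a lower bound. The exponential form and the parameters `initial`, `floor`, and `decay_rate` are assumptions for illustration; the text only requires that the threshold become smaller as the running time increases.

```python
import math


def runtime_threshold(runtime: float, initial: float = 1.0,
                      floor: float = 0.05, decay_rate: float = 0.1) -> float:
    """Error threshold that shrinks monotonically with running time."""
    if runtime < 0:
        raise ValueError("runtime must be non-negative")
    return floor + (initial - floor) * math.exp(-decay_rate * runtime)
```

The lower bound keeps the threshold strictly positive, so triggering does not become continuous after a long run.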

[0064] In some embodiments, the incomplete information underactuated system control device based on self-learning and event triggering further includes: an interference information elimination module, used to eliminate interference information in the acquired real-time state before the state error between the real-time state and the target state is calculated.
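The text does not say how interference is eliminated. As one possible sketch, a sliding-window median filter can suppress isolated spikes in a scalar state measurement before the state error is computed. The class name `MedianFilter` and the window length are assumptions.

```python
from collections import deque
from statistics import median


class MedianFilter:
    """Sliding-window median filter for one scalar state component."""

    def __init__(self, window: int = 3) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self._buffer = deque(maxlen=window)  # oldest sample is dropped automatically

    def update(self, sample: float) -> float:
        """Add a new measurement and return the filtered value."""
        self._buffer.append(sample)
        return median(self._buffer)


# Hypothetical measurement stream with one spike at the third sample.
flt = MedianFilter(window=3)
filtered = [flt.update(s) for s in [1.0, 1.1, 9.0, 1.2, 1.0]]
```

The spike never reaches the output, so it cannot cause a spurious event trigger on its own.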

[0065] In some embodiments, when there are multiple underactuated systems, the real-time state acquisition module 210 is used to have multiple agents each acquire the real-time state of their corresponding underactuated system, and to have each agent synchronize its corresponding real-time state to the other agents.

[0066] It should be noted that the underactuated system control device based on self-learning and event triggering provided by the present invention can execute the underactuated system control method based on self-learning and event triggering described in any of the above embodiments during specific operation. This embodiment will not elaborate on this aspect.

[0067] Figure 3 is a schematic diagram of the structure of the electronic device provided by the present invention. As shown in Figure 3, the electronic device may include: a processor 310, a communications interface 320, a memory 330, and a communication bus 340, wherein the processor 310, the communications interface 320, and the memory 330 communicate with each other via the communication bus 340. The processor 310 can call logical instructions in the memory 330 to execute the incomplete information underactuated system control method based on self-learning and event triggering, the method comprising: acquiring the real-time state of the underactuated system.

[0068] Calculate the state error between the real-time state and the target state. If the state error is greater than the error threshold, output an event trigger command. The error threshold is dynamically determined based on the change of the real-time state or the running time of the underactuated system. The target state is the state corresponding to the real-time state.

[0069] The control parameter constraints and the real-time state variables are input into the self-learning model, and, upon receiving the event trigger command, the control command output by the self-learning model for the underactuated system is obtained.

[0070] The self-learning model is trained based on sample states of the underactuated system and the control command labels corresponding to the sample states. The sample states of the underactuated system are obtained by simulation based on a continuous-time nonlinear coupled dynamics model. The control command label is the control command generated from the control parameters that, with the sample state as input, minimize a preset performance index function. The performance index function at least characterizes control accuracy.

[0071] Furthermore, the logical instructions in the aforementioned memory 330 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, essentially, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0072] In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to execute the incomplete information underactuated system control method based on self-learning and event triggering provided in the above embodiments, the method comprising: acquiring the real-time state of the underactuated system.

[0073] Calculate the state error between the real-time state and the target state. If the state error is greater than the error threshold, output an event trigger command. The error threshold is dynamically determined based on the change of the real-time state or the running time of the underactuated system. The target state is the state corresponding to the real-time state.

[0074] The control parameter constraints and the real-time state variables are input into the self-learning model, and, upon receiving the event trigger command, the control command output by the self-learning model for the underactuated system is obtained.

[0075] The self-learning model is trained based on sample states of the underactuated system and the control command labels corresponding to the sample states. The sample states of the underactuated system are obtained by simulation based on a continuous-time nonlinear coupled dynamics model. The control command label is the control command generated from the control parameters that, with the sample state as input, minimize a preset performance index function. The performance index function at least characterizes control accuracy.

[0076] In another aspect, the present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon which, when executed by a processor, performs the incomplete information underactuated system control method based on self-learning and event triggering provided in the above embodiments, the method comprising: acquiring the real-time state of the underactuated system.

[0077] Calculate the state error between the real-time state and the target state. If the state error is greater than the error threshold, output an event trigger command. The error threshold is dynamically determined based on the change of the real-time state or the running time of the underactuated system. The target state is the state corresponding to the real-time state.

[0078] The control parameter constraints and the real-time state variables are input into the self-learning model, and, upon receiving the event trigger command, the control command output by the self-learning model for the underactuated system is obtained.

[0079] The self-learning model is trained based on sample states of the underactuated system and the control command labels corresponding to the sample states. The sample states of the underactuated system are obtained by simulation based on a continuous-time nonlinear coupled dynamics model. The control command label is the control command generated from the control parameters that, with the sample state as input, minimize a preset performance index function. The performance index function at least characterizes control accuracy.

[0080] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.

[0081] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.

[0082] All actions involving the acquisition of signal information or data in this application are carried out in accordance with the relevant data protection laws and policies of the country where the application is located, and with the authorization of the owner of the relevant device.

[0083] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. An incomplete information underactuated system control method based on self-learning and event triggering, characterized in that it comprises: acquiring the real-time state of the underactuated system; calculating the state error between the real-time state and the target state, and outputting an event trigger command if the state error is greater than the error threshold, wherein the error threshold is dynamically determined based on changes in the real-time state or the running time of the underactuated system, and the target state is the state corresponding to the real-time state; and inputting the control parameter constraints and the real-time state variables into the self-learning model and, upon receiving the event trigger command, obtaining the control command output by the self-learning model for the underactuated system; wherein the self-learning model is trained based on sample states of the underactuated system and the control command labels corresponding to the sample states; the sample states of the underactuated system are obtained by simulation based on a continuous-time nonlinear coupled dynamics model; the control command label is the control command generated from the control parameters that, with the sample state as input, minimize a preset performance index function; and the performance index function at least characterizes control accuracy.

2. The incomplete information underactuated system control method based on self-learning and event triggering according to claim 1, characterized in that the performance index function characterizes the combined performance of control accuracy, energy consumption, and trigger frequency.

3. The incomplete information underactuated system control method based on self-learning and event triggering according to claim 2, characterized in that the performance index function is expressed by a formula [formula not reproduced in this text]; the self-learning model uses a single-hidden-layer evaluation neural network to fit the performance index function, with a fitting expression [formula not reproduced in this text]; wherein the symbols denote, in order: the task end time; the state vector of the i-th underactuated system at time t; the cost function at the task end time; the applied control parameter vector; the instantaneous performance loss during control; the transpose of the gradient vector; the derivative of [symbol not reproduced in this text]; the transpose of the evaluation network weight matrix; and the hidden-layer activation function.

4. The incomplete information underactuated system control method based on self-learning and event triggering according to claim 1, characterized in that the error threshold is dynamically determined based on changes in the real-time state as follows: calculating a first state error between the current real-time state and its corresponding target state; calculating a second state error between the previous real-time state and its corresponding target state; and, if the first state error is greater than the second state error, reducing the error threshold.

5. The incomplete information underactuated system control method based on self-learning and event triggering according to claim 1, characterized in that the error threshold is dynamically determined based on the running time of the underactuated system as follows: the longer the underactuated system runs, the smaller the error threshold becomes.

6. The incomplete information underactuated system control method based on self-learning and event triggering according to any one of claims 1 to 5, characterized in that, before calculating the state error between the real-time state and the target state, the method further comprises: eliminating interference information in the acquired real-time state.

7. The incomplete information underactuated system control method based on self-learning and event triggering according to any one of claims 1 to 5, characterized in that, when there are multiple underactuated systems, acquiring the real-time state of the underactuated system comprises: multiple agents each acquiring the real-time state of their corresponding underactuated system; and each agent synchronizing its corresponding real-time state to the other agents.

8. An incomplete information underactuated system control device based on self-learning and event triggering, characterized in that it comprises: a real-time state acquisition module, used to acquire the real-time state of the underactuated system; an event triggering module, used to calculate the state error between the real-time state and the target state and to output an event trigger command if the state error is greater than the error threshold, wherein the error threshold is dynamically determined based on changes in the real-time state or the running time of the underactuated system, and the target state is the state corresponding to the real-time state; and a control command output module, used to input the control parameter constraints and the real-time state variables into the self-learning model and, upon receiving the event trigger command, to obtain the control command output by the self-learning model for the underactuated system; wherein the self-learning model is trained based on sample states of the underactuated system and the control command labels corresponding to the sample states; the sample states of the underactuated system are obtained by simulation based on a continuous-time nonlinear coupled dynamics model; the control command label is the control command generated from the control parameters that, with the sample state as input, minimize a preset performance index function; and the performance index function at least characterizes control accuracy.

9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the incomplete information underactuated system control method based on self-learning and event triggering according to any one of claims 1 to 7.

10. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the incomplete information underactuated system control method based on self-learning and event triggering according to any one of claims 1 to 7.