
Online learning method for optimal controller of nonlinear system

An online learning method for nonlinear systems, applied in the fields of adaptive control, general control systems, and control/regulation systems. It addresses the problems that the existing synchronous policy iteration method explores the policy space insufficiently and that the excitation noise added to compensate introduces a bias.

Active Publication Date: 2020-05-12
INFORMATION SCI RES INST OF CETC

AI Technical Summary

Problems solved by technology

However, this method still has the following problems and shortcomings: 1) It is an on-policy method and therefore suffers from insufficient exploration. To improve the algorithm's ability to explore the policy space, exploration noise must be added, and that noise introduces an excitation-noise bias; 2) It applies only to affine systems and not to more general non-affine systems.
[0004] To overcome the problems that the existing synchronous policy iteration method cannot be applied to general nonlinear non-affine systems and explores the policy space insufficiently, the technical problems to be solved in this patent include: 1. For general nonlinear non-affine systems, providing a reinforcement learning method that can learn the optimal controller online in real time.



Embodiment Construction

[0040] An online learning method for an optimal controller of a nonlinear system, comprising the following steps:

[0041] S1. Obtain the initial state, system state, and control input of the control system, where the control system includes the motion control system of the robot or the flight control system of the drone.

[0042] S2. Establish a continuous time system model:

[0043] $\dot{x}(t) = f(x(t), u(t)), \quad x(0) = x_0$

[0044] In the formula, $x \in \Omega \subseteq \mathbb{R}^n$ is the system state, $u \in \mathbb{R}^m$ is the control input of the system, $x(0) = x_0$ is the initial state of the system, and $\Omega$ is the state region.
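
The continuous-time model in step S2 can be explored numerically. The following is a minimal sketch, assuming a fixed-step Runge-Kutta integrator; `example_dynamics` and `example_policy` are hypothetical stand-ins chosen only to show a non-affine input, and none of the names come from the patent.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Dynamics = Callable[[Sequence[float], Sequence[float]], Vector]
Policy = Callable[[Sequence[float]], Vector]


def rk4_step(f: Dynamics, x: Sequence[float], u: Sequence[float], dt: float) -> Vector:
    """One classical Runge-Kutta step for x' = f(x, u), with u held constant."""
    def shifted(k: Sequence[float], scale: float) -> Vector:
        return [xi + scale * ki for xi, ki in zip(x, k)]

    k1 = f(x, u)
    k2 = f(shifted(k1, dt / 2), u)
    k3 = f(shifted(k2, dt / 2), u)
    k4 = f(shifted(k3, dt), u)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(f: Dynamics, policy: Policy, x0: Sequence[float],
             dt: float, steps: int) -> List[Vector]:
    """Roll out the closed loop from x(0) = x0 and return every visited state."""
    trajectory = [list(x0)]
    for _ in range(steps):
        x = trajectory[-1]
        trajectory.append(rk4_step(f, x, policy(x), dt))
    return trajectory


def example_dynamics(x: Sequence[float], u: Sequence[float]) -> Vector:
    """Hypothetical non-affine system: the input enters through tanh(u) and u^3."""
    x1, x2 = x
    (u1,) = u
    return [x2, -x1 - x2 + math.tanh(u1) + 0.1 * u1 ** 3]


def example_policy(x: Sequence[float]) -> Vector:
    """Hypothetical damping feedback, not a learned optimal controller."""
    return [-0.5 * x[1]]
```

Calling `simulate(example_dynamics, example_policy, [1.0, 0.0], 0.01, 2000)` returns the state trajectory over 20 time units; the same rollout could later supply data to whatever learning rule is used.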

[0045] S3. Define the objective function:

[0046]

[0047] In the formula, the function $r: \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ is a continuous positive definite function.
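
The formula in paragraph [0046] did not survive in this extract. For orientation only, an infinite-horizon integral of the running cost $r$ is the usual objective in continuous-time optimal control. The form below is an assumption, not the patent's recovered formula, and the symbol $J$ is likewise assumed.

```latex
J(x_0, u) = \int_0^{\infty} r\bigl(x(t), u(t)\bigr)\,\mathrm{d}t
```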

[0048] S4. Establish the optimal controller. The optimal controller $u^*$ satisfies the following HJB equation:

[0049]

[0050] The formula involves the Hamiltonian function, and $V^*$ is the value function corresponding to the optimal controller $u^*$, namely: ...
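
The HJB equation in paragraph [0049] is also missing from this extract. As a hedged reference, the textbook form for the system $\dot{x} = f(x, u)$ with running cost $r$ is sketched below. The symbol $H$, the gradient notation, and the boundary condition $V^*(0) = 0$ are assumptions taken from the standard theory, not from the patent text.

```latex
H\bigl(x, u, \nabla V\bigr) = r(x, u) + \nabla V(x)^{\top} f(x, u)

0 = \min_{u} H\bigl(x, u, \nabla V^*\bigr), \qquad V^*(0) = 0

u^*(x) = \arg\min_{u} H\bigl(x, u, \nabla V^*\bigr)
```

Under these assumptions the minimisation over $u$ has no closed-form solution when $f$ is non-affine in $u$, which is one reason methods restricted to affine systems do not carry over directly.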


Abstract

The invention relates to an online learning method for an optimal controller of a nonlinear system. The method comprises the following steps: the initial state, system state, and control input of a control system are acquired, wherein the control system comprises a motion control system of a robot or a flight control system of an unmanned aerial vehicle; a continuous-time system model is established; an objective function is defined; an optimal controller is established; a synchronous policy iteration algorithm based on off-policy learning is established; the optimal controller is trained online; and the optimal controller obtained by training is applied to an actual controlled object, wherein the controlled object comprises the control parameters of the motion control system of the robot or the control parameters of the flight control system of the unmanned aerial vehicle.

Description

Technical field

[0001] The invention relates to an online learning method for an optimal controller of a nonlinear system, in particular to a ...

Background technique

[0002] In engineering practice, engineers and technicians often need to optimize the controllers of complex nonlinear systems such as robots and aircraft. From the perspective of control theory and mathematics, finding the optimal controller for a nonlinear system is very difficult. Classical dynamic programming methods often face the "curse of dimensionality": the computational complexity grows exponentially with the dimension of the system state. In addition, obtaining the optimal controller requires solving the Hamilton-Jacobi-Bellman equation (HJB equation), a nonlinear partial differential equation that is very difficult to solve.

[0003] In recent years, reinforcement learning techniques are becoming a powerful tool for so...
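
The "curse of dimensionality" described in paragraph [0002] can be made concrete with a short calculation. This is a minimal sketch, assuming a hypothetical dynamic programming solver that stores one value per cell of a uniform grid over the state space; the function name and the choice of 100 points per axis are illustrative only.

```python
def grid_points(points_per_axis: int, state_dim: int) -> int:
    """Number of cells in a uniform grid with the same resolution on every state axis."""
    return points_per_axis ** state_dim


if __name__ == "__main__":
    # Each extra state dimension multiplies the table size by points_per_axis.
    for state_dim in (1, 2, 3, 6, 12):
        print(state_dim, grid_points(100, state_dim))
```

At 100 points per axis, a 6-dimensional state already needs 10^12 grid cells, which is why grid-based dynamic programming becomes impractical for robots and aircraft with many state variables.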


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G05B13/04
CPC: G05B13/042, Y02T10/40
Inventor: 李新兴, 查文中, 王雪源, 王蓉
Owner: INFORMATION SCI RES INST OF CETC