Robust control method based on reinforcement learning and Lyapunov function

The technology combines reinforcement learning with robust control and is applied to adaptive control, general control systems, and control/regulation systems. It addresses the low safety and stability of existing methods, with the effect of ensuring free exploration together with high efficiency and safety.

Active Publication Date: 2020-03-27

SUN YAT SEN UNIV

6 Cites, 14 Cited by


Problems solved by technology

[0005] To overcome the low safety and stability of the robot control methods in the prior art described above, the present invention provides a robust control method based on reinforcement learning and Lyapunov functions. It models the robot dynamics with adaptive online Bayesian inference and constructs a constrained reinforcement learning problem based on Lyapunov functions, so that the robot learns efficiently, operates stably, and explores safely.


Examples


Embodiment 1

[0038] As shown in Figure 1, an embodiment of a robust control method based on reinforcement learning and Lyapunov functions includes the following steps:

[0039] Step 1: Build an affine system model, then model the uncertainty in the system dynamics model with a Gaussian process (GP). The nonlinear affine system is modeled as the sum of two parts: f(s) + g(s)a, the prior model obtained by modeling the system dynamics and kinematics, and d(s), the deviation between the model and the real environment. Gaussian process regression uses Bayesian inference to obtain the mean and variance of the deviation d(s*) at a state s*:

[0040]

[0041]

[0042] Here, k(s_i, s_j) is the kernel function defined in the GP; k_n = [k(s_1, s*), k(s_2, s*), ..., k(s_n, s*)]; [K]_{i,j} = k(s_i, s_j) defines the kernel matrix K; the label vector collects the label data; σ_noise is the standard deviation of the label data noise; and I is the identity matrix.
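The mean and variance described in Step 1 correspond to the standard Gaussian process regression posterior: mean = k_n^T (K + σ_noise^2 I)^(-1) y and variance = k(s*, s*) - k_n^T (K + σ_noise^2 I)^(-1) k_n. The following is a minimal sketch of that computation, not the patent's implementation. The squared-exponential kernel, its hyperparameters, the label symbol y, and all function names are assumptions made for illustration.

```python
import math


def rbf_kernel(a, b, length_scale=1.0, signal_std=1.0):
    """Squared-exponential kernel k(a, b); the kernel choice is an assumption."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return signal_std ** 2 * math.exp(-0.5 * sq_dist / length_scale ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_posterior(states, labels, s_star, sigma_noise=0.1):
    """Return the posterior mean and variance of d(s*) given n observations."""
    n = len(states)
    # K + sigma_noise^2 * I
    K_noisy = [
        [rbf_kernel(states[i], states[j]) + (sigma_noise ** 2 if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    k_n = [rbf_kernel(s, s_star) for s in states]
    alpha = solve(K_noisy, labels)  # (K + sigma^2 I)^-1 y
    beta = solve(K_noisy, k_n)      # (K + sigma^2 I)^-1 k_n
    mean = sum(k * a for k, a in zip(k_n, alpha))
    variance = rbf_kernel(s_star, s_star) - sum(k * b for k, b in zip(k_n, beta))
    return mean, variance


# Hypothetical 1-D data: states and observed model deviations.
states = [(0.0,), (1.0,), (2.0,)]
labels = [0.0, 0.5, 0.2]
near_mean, near_var = gp_posterior(states, labels, (1.0,), sigma_noise=0.01)
far_mean, far_var = gp_posterior(states, labels, (10.0,), sigma_noise=0.01)
```

Near the training data the variance is small and the mean tracks the observed deviation; far from the data the variance returns to the prior value k(s*, s*), which is the behavior a confidence bound on d(s) relies on.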

[0043] Get a high confiden...

Embodiment 2

[0060] As shown in Figures 1-3, building on Embodiment 1, this embodiment takes trajectory tracking and obstacle avoidance by a quadrotor UAV under random wind-field disturbance as an example to illustrate the specific implementation steps of the method:

[0061] Step 1: Based on prior knowledge of the robot dynamics and the actual task scene, a baseline nonlinear affine system model is established, and a parameterized trajectory in three-dimensional space, including its start point and end point, is set and denoted r(t) ∈ R^3.
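As an illustration of a parameterized trajectory r(t) with a start point and an end point, the sketch below uses a straight line traversed at constant speed. The source does not state which parameterization is used, so the linear form, the function name, and the example coordinates are all assumptions.

```python
def reference_trajectory(t, start, end, duration):
    """Return r(t) in R^3 on a straight line from start to end (assumed form)."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Clamp the normalized time so r(t) stays at the end points outside [0, duration].
    tau = min(max(t / duration, 0.0), 1.0)
    return tuple(a + tau * (b - a) for a, b in zip(start, end))


# Hypothetical start and end points in meters, traversed in 10 seconds.
start_point = (0.0, 0.0, 1.0)
end_point = (4.0, 2.0, 3.0)
midpoint = reference_trajectory(5.0, start_point, end_point, 10.0)
```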

[0062] Define the UAV's safe state set C = {x | h(x) ≥ 0} and the target equilibrium point according to the task scenario (for example, avoiding obstacles, with the trajectory point as the equilibrium point), and design the corresponding control barrier function (CBF) h(s_t) and control Lyapunov function (CLF) V(s_t).
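One common way to realize such functions for obstacle avoidance and tracking is sketched below. It is an assumed example, not the design used in the patent: the state is reduced to the UAV position, the obstacle is a sphere, h is the squared distance to the obstacle center minus the squared radius, and V is the squared distance to the current trajectory point. All names and numbers are hypothetical.

```python
def barrier(position, obstacle_center, obstacle_radius):
    """CBF candidate h(s): non-negative exactly when outside the sphere."""
    sq_dist = sum((p - c) ** 2 for p, c in zip(position, obstacle_center))
    return sq_dist - obstacle_radius ** 2


def lyapunov(position, reference_point):
    """CLF candidate V(s): squared tracking error, zero at the equilibrium."""
    return sum((p - r) ** 2 for p, r in zip(position, reference_point))


def in_safe_set(position, obstacle_center, obstacle_radius):
    """Membership test for C = {x | h(x) >= 0}."""
    return barrier(position, obstacle_center, obstacle_radius) >= 0.0


# Hypothetical scene: a unit-radius obstacle centered at (2, 0, 1).
obstacle = (2.0, 0.0, 1.0)
safe = in_safe_set((0.0, 0.0, 1.0), obstacle, 1.0)    # 2 m from the center
unsafe = in_safe_set((2.5, 0.0, 1.0), obstacle, 1.0)  # 0.5 m from the center
tracking_error = lyapunov((1.0, 2.0, 3.0), (1.0, 2.0, 1.0))
```

With this choice, keeping h(s_t) non-negative keeps the UAV outside the obstacle, and driving V(s_t) toward zero drives it toward the trajectory point.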

[0063] Step 2: Select the framework of model predictive control as the benchmark strate...


Abstract

The invention relates to a robust control method based on reinforcement learning and a Lyapunov function. The robot dynamics are modeled through adaptive online Bayesian inference, a constrained reinforcement learning problem is constructed based on Lyapunov functions, and efficient learning, stable operation, and safe exploration of the robot are achieved by constructing a Lyapunov function control strategy and a barrier function control strategy. This solves the technical problems of unsafe, unstable, and inefficient behavior of a safety-critical robot system with nonlinear hybrid dynamics that faces system uncertainty and external environment uncertainty in task scenarios with constrained state and action spaces.

Description

Technical field

[0001] The invention relates to the field of robot control, and more specifically to a robust control method based on reinforcement learning and Lyapunov functions.

Background

[0002] Robots play an important role in many fields of modern life, such as intelligent manufacturing, transportation, medical care, and emergency rescue and disaster relief. Real-world environments are unstructured and subject to dynamic uncertainty. Therefore, for complex robot systems with nonlinear, hybrid dynamic models and constrained state and action spaces, operating in safety-critical task scenarios, it is necessary to design an adaptive controller that is efficient, stable, and safe at the same time.

[0003] Two kinds of methods are used to achieve these goals: optimal control methods and control methods based on reinforcement learning; the optimal control method is to u...


Application Information


IPC(8): G05B13/04

CPC: G05B13/027, G05B13/042

Inventors: 潘杰森 (Pan Jiesen), 郑磊 (Zheng Lei), 成慧 (Cheng Hui), 胡海峰 (Hu Haifeng)

Owner: SUN YAT SEN UNIV
