A Parameter Adaptive Backstepping Control Method for Underwater Robots Based on Double BP Neural Network Q-Learning Technology

Uses a BP neural network in underwater robot technology, applied in the fields of adaptive control, general control systems, and control/adjustment systems. It addresses low learning efficiency and the difficulty of adjusting parameters online in real time, improving adaptability, reducing the number of training runs, and achieving a good control effect.

Active Publication Date: 2022-05-13

HARBIN ENG UNIV

Problems solved by technology

[0004] The purpose of the present invention is to solve two problems: the low learning efficiency of the traditional Q-learning method when it is used to adjust controller parameters, and the difficulty of adjusting the parameters of the traditional backstepping method online in real time. To this end, it proposes a parameter-adaptive backstepping control method for underwater robots based on double BP neural network Q-learning.

Examples

Specific Embodiment 1

[0052] Specific Embodiment 1: The parameter-adaptive backstepping control method for underwater robots based on double BP neural network Q-learning described in this embodiment comprises the following steps:

[0053] Step 1. Design the speed control system and the heading control system of the underwater robot using the backstepping method, then determine the control law of each system from its design;

[0054] The speed control system of the underwater robot is shown in formula (1):

[0055]

[0056] where $m$ is the mass of the underwater robot; $X_{u|u|}$ is among the dimensionless hydrodynamic parameters; $u$ is the longitudinal velocity of the underwater robot; $|u|$ is the absolute value of $u$; $\dot{u}$ is the longitudinal acceleration of the underwater robot; $\tau_u$ is the longitudinal thr...

Specific Embodiment 2

[0110] Specific Embodiment 2: This embodiment differs from Specific Embodiment 1 in that, in Step 2, the output is the action value set $k'_u$, and the ε-greedy strategy is then used to select from $k'_u$ the optimal action value for the current state vector. The specific process is:

[0111] Define the action space to be divided as $k'_{u0}$, with $k'_{u0} \in [-1, 2]$. Dividing $k'_{u0}$ at intervals of 0.2 yields 16 action values, which form the action value set $k'_u$. The ε-greedy strategy is then used to select from $k'_u$ the optimal action value $k''_u$ for the current state vector.

[0112] The action value set is $k'_u = \{-1, -0.8, -0.6, -0.4, -0.2, \ldots, 1.4, 1.6, 1.8, 2\}$.
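The discretization in [0111] and [0112] can be sketched in a few lines. This is a minimal illustration, not the patent's implementation; the function name `build_action_set` and its parameters are hypothetical. Counting steps as integers avoids accumulating floating-point error when adding 0.2 repeatedly.

```python
def build_action_set(low=-1.0, high=2.0, step=0.2):
    """Discretize [low, high] at the given step, endpoints included.

    The defaults reproduce the interval [-1, 2] and the 0.2 spacing
    stated in the text; the function itself is a hypothetical helper.
    """
    # Number of grid points: 15 intervals of 0.2 span [-1, 2], so 16 points.
    count = round((high - low) / step) + 1
    # Compute each value from its integer index and round away float noise.
    return [round(low + i * step, 10) for i in range(count)]


action_set = build_action_set()
print(len(action_set))
print(action_set)
```

The printed list has 16 entries, matching the count given in [0111].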

[0113] The reinforcement-learning-based adaptive backstepping speed controller and heading controller select actions with the ε-greedy strategy, ε∈(0,1); ε=0 represents pure exploration, and ε=1 represents pure explorati...
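A minimal sketch of ε-greedy selection over a list of action values follows. The source states that ε=0 corresponds to pure exploration, which suggests that ε here is the weight given to the greedy choice rather than to the random one; that reading is an assumption, as is treating the action with the largest Q-value as the "optimal" one. The function name `epsilon_greedy` and the `rng` parameter are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return the index of the selected action.

    Assumed convention (inferred from "epsilon = 0 is pure exploration"):
    with probability `epsilon` take the action with the largest Q-value,
    otherwise pick an action uniformly at random.
    """
    if not q_values:
        raise ValueError("q_values must not be empty")
    if rng.random() < epsilon:
        # Greedy branch: index of the largest Q-value (first one on ties).
        return max(range(len(q_values)), key=lambda i: q_values[i])
    # Exploration branch: every action is equally likely.
    return rng.randrange(len(q_values))


# Hypothetical Q-values for three candidate actions.
q_example = [0.1, 0.9, 0.3]
greedy_index = epsilon_greedy(q_example, epsilon=1.0)
random_index = epsilon_greedy(q_example, epsilon=0.0)
```

Under this convention, `epsilon=1.0` always returns the greedy index and `epsilon=0.0` always samples at random. If the patent intends the more common convention, in which ε is the exploration probability, the comparison would be inverted.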

Specific Embodiment 3

[0114] Specific Embodiment 3: This embodiment differs from Specific Embodiment 1 in that, in Step 3, the reward value obtained after the first current BP neural network selects the optimal action $a_t$ in the current state $S_t$ and executes it is $r_{t+1}(S_{t+1}, a)$, which is expressed as:

[0115] $r_{t+1}(S_{t+1}, a) = c_1 \cdot s_{1u}^2 + c_2 \cdot s_{2u}^2 \qquad (13)$

[0116] where $c_1$ and $c_2$ are both positive numbers.

[0117] The reward and punishment function has a clear goal: to evaluate the performance of the controller. A controller is usually judged by its stability, accuracy, and speed of response. It should reach the expected value quickly and accurately, which appears in the response curve as a fast rise with small overshoot and oscillation. $c_1$ and $c_2$ are both positive numbers, respectively representing the proportion o...
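Formula (13) can be evaluated directly, as in the sketch below. It computes the expression exactly as it appears in this extract, including its sign. How $s_{1u}$ and $s_{2u}$ are defined is not visible here, so they are treated as plain numeric inputs; the function name `reward_value` and the default weights are hypothetical.

```python
def reward_value(s1u, s2u, c1=1.0, c2=1.0):
    """Evaluate c1 * s1u**2 + c2 * s2u**2, the form shown in formula (13).

    s1u and s2u are opaque inputs here; their definitions are not part
    of this extract. c1 and c2 must be positive, as stated in [0116].
    """
    if c1 <= 0 or c2 <= 0:
        raise ValueError("c1 and c2 must both be positive")
    return c1 * s1u ** 2 + c2 * s2u ** 2


# Hypothetical inputs: 0.5 * 2.0**2 + 2.0 * 3.0**2 = 2.0 + 18.0
example_reward = reward_value(2.0, 3.0, c1=0.5, c2=2.0)
print(example_reward)  # 20.0
```

Raising $c_1$ relative to $c_2$ makes the first term dominate the value, which is how the two weights trade off the quantities they scale.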

Abstract

The invention relates to a parameter-adaptive backstepping control method for underwater robots based on double BP neural network Q-learning, and belongs to the technical field of underwater robot controller parameter adjustment. It addresses two problems: learning efficiency is low when the traditional Q-learning method is used to adjust controller parameters, and the parameters of the traditional backstepping method are difficult to adjust online in real time. By combining a Q-learning algorithm based on a double BP neural network with the backstepping method, the invention adjusts the backstepping controller parameters autonomously and online, meeting the requirement that control parameters be adjustable online in real time. In addition, because the double BP neural network and the experience replay pool provide strong fitting ability, the method greatly reduces the number of training runs while achieving a better control effect, which improves learning efficiency. The invention can be applied to the adjustment of controller parameters of underwater robots.
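The abstract names two ingredients, an experience replay pool and a pair of BP neural networks, without giving their update rule in this extract. The sketch below shows how these pieces are commonly combined in DQN-style Q-learning: transitions are stored in a bounded pool, and a second (target) network supplies the bootstrapped value. Whether the patent uses exactly this target is an assumption, and the names `ReplayPool` and `td_targets`, the capacity, and the discount factor `gamma` are all hypothetical.

```python
import random
from collections import deque


class ReplayPool:
    """Bounded store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity=1000):
        # A deque with maxlen discards the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Draw without replacement; never ask for more than is stored.
        size = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), size)


def td_targets(batch, target_q, gamma=0.9):
    """Compute r + gamma * max_a Q_target(s', a) for each transition.

    `target_q` stands in for the second network: any callable that maps
    a state to a list of Q-values, one per action.
    """
    return [r + gamma * max(target_q(s_next)) for (_s, _a, r, s_next) in batch]


# Usage with a stand-in target network that returns fixed Q-values.
pool = ReplayPool(capacity=2)
pool.add((0.0,), 0, 1.0, (0.1,))
pool.add((0.1,), 1, 0.5, (0.2,))
pool.add((0.2,), 0, 0.2, (0.3,))  # evicts the oldest transition
targets = td_targets([((0.0,), 0, 1.0, (0.1,))], lambda s: [0.0, 2.0], gamma=0.5)
```

Sampling past transitions at random breaks the correlation between consecutive control steps, and holding the target network fixed between updates keeps the regression target stable while the current network is trained.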

Description

Technical field

[0001] The invention belongs to the technical field of parameter adjustment of underwater robot controllers, and in particular relates to a parameter-adaptive backstepping control method for underwater robots based on double BP neural network Q-learning.

Background art

[0002] As an important tool for marine resource exploration and submarine mission execution, an underwater robot depends heavily on its motion control performance for mission success. Currently, conventional controllers are widely used in industrial environments because of their robustness and scalability, but they are usually not optimally tuned and cannot achieve satisfactory performance. In practical applications, the controller parameters are fixed in the controller after repeated manual adjustment and cannot adapt to changes in the environment of the controlled process. How to perform real-time self-tuning of t...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G05B13/04, G05D1/10

CPC: G05B13/04, G05D1/10

Inventor: 王卓, 张佩, 秦洪德, 孙延超, 邓忠超, 张宇昂, 景锐洁, 曹禹

Owner: HARBIN ENG UNIV
