A Parameter Adaptive Backstepping Control Method for Underwater Robots Based on Double BP Neural Network Q-Learning Technology

BP neural network and underwater robot technology, applied in the fields of adaptive control and general control/adjustment systems, addresses the problems of low learning efficiency and difficult real-time online parameter adjustment, thereby improving adaptability, reducing the number of training runs, and achieving good control performance.

Active Publication Date: 2022-05-13
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to solve two problems: the low learning efficiency of the traditional Q-learning method when it is used to adjust controller parameters, and the difficulty of adjusting the parameters of the traditional backstepping method online in real time. To this end, the invention provides a parameter adaptive backstepping control method for underwater robots based on double BP neural network Q-learning technology.

Examples

Specific Embodiment 1

[0052] Specific Embodiment 1: This embodiment describes a parameter adaptive backstepping control method for underwater robots based on double BP neural network Q-learning technology. The method comprises the following steps:

[0053] Step 1. Design the speed control system and the heading control system of the underwater robot separately using the backstepping method, and then determine the control law of each system from its design;

[0054] The speed control system of the underwater robot is shown in formula (1):

[0055]

[0056] where m is the mass of the underwater robot; X_{\dot{u}} and X_{u|u|} are dimensionless hydrodynamic parameters; u is the longitudinal velocity of the underwater robot and |u| is its absolute value; \dot{u} is the longitudinal acceleration of the underwater robot; and τ_u is the longitudinal thr...

Specific Embodiment 2

[0110] Specific Embodiment 2: This embodiment differs from Specific Embodiment 1 in that, in Step 2, the output is the action value set k′_u, and the ε-greedy strategy is then used to select from k′_u the optimal action value corresponding to the current state vector. The specific process is:

[0111] Define the action space to be divided as k′_u0, with k′_u0 ∈ [-1, 2]. Dividing k′_u0 at intervals of 0.2 yields 16 action values, which form the action value set k′_u. The ε-greedy strategy is then used to select from k′_u the optimal action value k″_u corresponding to the current state vector.

[0112] The action value set is k′_u = {-1, -0.8, -0.6, -0.4, -0.2, ..., 1.4, 1.6, 1.8, 2}.
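The discretization in [0111] and [0112] can be sketched in a few lines of Python. This is only an illustration: the function name `build_action_set` and its parameters are hypothetical and do not come from the patent text. It confirms that stepping from -1 to 2 in increments of 0.2 gives exactly 16 values.

```python
def build_action_set(low=-1.0, high=2.0, step=0.2):
    """Return evenly spaced action values from low to high, inclusive."""
    count = round((high - low) / step) + 1
    # Derive each value from an integer index so float error does not accumulate;
    # rounding removes residue such as -0.19999999999999996.
    return [round(low + i * step, 10) for i in range(count)]


actions = build_action_set()
print(len(actions))             # 16
print(actions[0], actions[-1])  # -1.0 2.0
```

Computing each value from its index, rather than repeatedly adding 0.2, keeps the endpoints exact.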

[0113] The adaptive backstepping speed controller and heading controller based on reinforcement learning select actions with the ε-greedy strategy, ε∈(0,1), where ε=0 represents pure exploitation and ε=1 represents pure explorati...
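A minimal sketch of ε-greedy selection follows, under the common convention that ε is the probability of taking a random action and that a larger Q-value is better. The function name `epsilon_greedy` and the sample Q-values are hypothetical; the patent excerpt does not give an implementation.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, otherwise the argmax."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated Q-value.
    return max(range(len(q_values)), key=lambda i: q_values[i])


q = [0.1, 0.9, 0.3]                  # illustrative Q-values for three actions
greedy_index = epsilon_greedy(q, 0.0)  # epsilon = 0 always exploits
random_index = epsilon_greedy(q, 1.0)  # epsilon = 1 always explores
```

In this sketch the returned index would be used to look up the corresponding value in the action value set.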

Specific Embodiment 3

[0114] Specific Embodiment 3: This embodiment differs from Specific Embodiment 1 in that, in Step 3, the first current BP neural network selects the optimal action a_t in the current state S_t, and the reward value obtained after executing it is r_{t+1}(S_{t+1}, a), which is expressed as:

[0115] r_{t+1}(S_{t+1}, a) = c_1·s_{1u}^2 + c_2·s_{2u}^2    (13)

[0116] where c_1 and c_2 are both positive numbers.

[0117] The reward and punishment function has a clear goal: it is used to evaluate the performance of the controller. The quality of a controller is usually judged by its stability, accuracy, and rapidity. The controller should reach the expected value quickly and accurately, which means the response curve should rise quickly with little overshoot and oscillation. c_1 and c_2 are both positive numbers, respectively representing the proportion o...
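Formula (13) can be evaluated directly, as in the sketch below. The function name `reward`, the default weights, and the sample inputs are hypothetical; the excerpt does not define s_1u and s_2u beyond naming them as components of the state, so they are treated here as plain numbers.

```python
def reward(s1u, s2u, c1=1.0, c2=1.0):
    """Evaluate formula (13): c1 * s1u^2 + c2 * s2u^2, with c1 and c2 positive."""
    if c1 <= 0 or c2 <= 0:
        raise ValueError("c1 and c2 must both be positive")
    return c1 * s1u ** 2 + c2 * s2u ** 2


# Illustrative values only: 2.0 * 0.25 + 1.0 * 0.04
example_reward = reward(0.5, -0.2, c1=2.0, c2=1.0)
```

Because both terms are squared, the sign of each state component does not affect the result; only the weights c1 and c2 change their relative contribution.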

Abstract

The invention relates to a parameter adaptive backstepping control method for underwater robots based on double BP neural network Q-learning technology, and belongs to the technical field of underwater robot controller parameter adjustment. The invention solves two problems: the low learning efficiency of the traditional Q-learning method when it is used to adjust controller parameters, and the difficulty of adjusting the parameters of the traditional backstepping method online in real time. By combining a Q-learning algorithm based on double BP neural networks with the backstepping method, the invention adjusts the parameters of the backstepping controller autonomously online, which meets the requirement that control parameters be adjustable online in real time. At the same time, the strong fitting ability gained by introducing the double BP neural networks and the experience replay pool lets the method greatly reduce the number of training runs, improving learning efficiency while achieving better control performance. The invention can be applied to the adjustment of controller parameters of underwater robots.
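The abstract names two mechanisms, an experience replay pool and a pair of networks, without giving their details here. The sketch below shows one conventional way such pieces fit together in Q-learning, assuming the second network serves as a target network that supplies bootstrap values. The names `ReplayPool` and `td_targets`, the discount factor, and the stand-in `target_q` function are all hypothetical and are not taken from the patent.

```python
import random
from collections import deque


class ReplayPool:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Sample without replacement, never more than what is stored.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, target_q, gamma=0.9):
    """Compute r + gamma * max_a Q_target(s', a) for each sampled transition.

    target_q is assumed to map a state to a list of Q-values, one per action.
    """
    return [r + gamma * max(target_q(s2)) for (_s, _a, r, s2) in batch]


pool = ReplayPool(capacity=3)
for i in range(5):
    pool.add((i,), 0, float(i), (i + 1,))

# Stand-in for the target network: returns the same Q-values for every state.
targets = td_targets([((0,), 0, 1.0, (1,))], lambda s: [0.0, 2.0], gamma=0.5)
```

Sampling past transitions at random breaks the correlation between consecutive control steps, which is the usual motivation for a replay pool.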

Description

Technical field

[0001] The invention belongs to the technical field of parameter adjustment of underwater robot controllers, and in particular relates to a parameter adaptive backstepping control method for underwater robots based on double BP neural network Q-learning technology.

Background

[0002] Underwater robots are an important tool for marine resource exploration and submarine mission execution, and their motion control performance largely determines how well a mission is completed. Conventional controllers are currently widely used in industrial environments because of their robustness and scalability, but they are usually not optimally tuned and cannot achieve satisfactory performance. In practical applications, the controller parameters are fixed in the controller after repeated manual adjustment and cannot adapt to changes in the environment of the controlled process. How to perform real-time self-tuning of t...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G05B13/04, G05D1/10
CPC: G05B13/04, G05D1/10
Inventor: 王卓, 张佩, 秦洪德, 孙延超, 邓忠超, 张宇昂, 景锐洁, 曹禹
Owner: HARBIN ENG UNIV