Underwater robot parameter adaptive backstepping control method based on double-BP neural network Q learning technology

BP neural network and underwater robot technology, applied in the fields of adaptive control, general control systems, and control/regulation systems, addressing the problems of low learning efficiency and the difficulty of adjusting parameters online in real time.

Active Publication Date: 2020-05-19
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to solve two problems: the low learning efficiency of the traditional Q-learning method when it is used to adjust controller parameters, and the difficulty of adjusting the parameters of the traditional backstepping method online in real time. To this end, a parameter adaptive backstepping control method for underwater robots based on double BP neural network Q-learning technology is proposed.


Examples

Specific Embodiment 1

[0052] Specific Embodiment 1: The underwater robot parameter adaptive backstepping control method based on double BP neural network Q-learning technology described in this embodiment comprises the following steps:

[0053] Step 1: Using the backstepping method, design the speed control system and the heading control system of the underwater robot separately, and then determine the control law of each system from its design;

[0054] The speed control system of the underwater robot is shown in formula (1):

[0055]

[0056] where $m$ is the mass of the underwater robot, $X_{\dot{u}}$ and $X_{u|u|}$ are both dimensionless hydrodynamic parameters, $u$ is the longitudinal velocity of the underwater robot, $|u|$ is the absolute value of $u$, $\dot{u}$ is the longitudinal acceleration of the underwater robot, $\tau_u$ is the longitudinal thrust of the propeller, $v$ is the lateral speed of the ...

Specific Embodiment 2

[0110] Specific Embodiment 2: This embodiment differs from Specific Embodiment 1 in that, in Step 2, the output is the action value set $k'_u$, and the ε-greedy strategy is then used to select from $k'_u$ the optimal action value corresponding to the current state vector. The specific process is:

[0111] Define the action space to be divided as $k'_{u0}$, with $k'_{u0} \in [-1, 2]$. Discretize $k'_{u0}$ at intervals of 0.2 to obtain 16 action values, which form the action value set $k'_u$. Then use the ε-greedy strategy to select from $k'_u$ the optimal action value $k''_u$ corresponding to the current state vector.

[0112] Action value set: $k'_u = \{-1, -0.8, -0.6, -0.4, -0.2, \ldots, 1.4, 1.6, 1.8, 2\}$.

[0113] The reinforcement-learning-based adaptive backstepping speed controller and heading controller select actions with the ε-greedy strategy, ε ∈ (0, 1). Here ε = 0 means pure exploration and ε = 1 means pure exploitation, so its value is betwee...
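The discretization and selection rule described in this embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `build_action_set` and `select_action` and the example Q-values are hypothetical. It follows the convention stated in [0113], where ε is the probability of exploiting (ε = 1 is pure exploitation); many textbooks define ε the other way round, as the probability of exploring.

```python
import random

def build_action_set(low=-1.0, high=2.0, step=0.2):
    """Discretize [low, high] with the given step (16 values for the defaults)."""
    n = int(round((high - low) / step)) + 1
    return [round(low + i * step, 1) for i in range(n)]

def select_action(q_values, actions, epsilon, rng=random):
    """Epsilon-greedy selection using the convention of [0113]:
    epsilon = 1 is pure exploitation, epsilon = 0 is pure exploration."""
    if rng.random() < epsilon:
        # Exploit: pick the action with the largest estimated Q-value.
        best = max(range(len(actions)), key=lambda i: q_values[i])
        return actions[best]
    # Explore: pick any action uniformly at random.
    return rng.choice(actions)

actions = build_action_set()
q_values = [0.0] * len(actions)
q_values[5] = 1.0  # hypothetical Q estimates; index 5 is the action value 0.0
print(len(actions), actions[0], actions[-1])
print(select_action(q_values, actions, epsilon=1.0))
```

In practice the Q-values would come from the current BP neural network evaluated on the current state vector rather than from a hand-filled list.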

Specific Embodiment 3

[0114] Specific Embodiment 3: This embodiment differs from Specific Embodiment 1 in that, in Step 3, the first current BP neural network selects the optimal action $a_t$ in the current state $S_t$, and the reward value obtained after executing it is $r_{t+1}(S_{t+1}, a)$, given by:

[0115] $r_{t+1}(S_{t+1}, a) = c_1 \cdot S_{1u}^2 + c_2 \cdot S_{2u}^2$  (13)

[0116] where $c_1$ and $c_2$ are both positive numbers.

[0117] The reward and punishment function has a clear goal: it evaluates the performance of the controller. A controller is usually judged by its stability, accuracy, and rapidity. It should reach the expected value quickly and accurately, which on the response curve means a fast rise with small overshoot and oscillation. $c_1$ and $c_2$ are both positive numbers, which respectively represent the proportion of the influence of the deviation and the...
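As a small worked illustration of equation (13), the weighted sum of squares can be computed as below. The function name `reward` and the numeric values of the state components and weights are hypothetical, chosen only to show the arithmetic; the source does not give concrete values for $c_1$ and $c_2$.

```python
def reward(s1u, s2u, c1=1.0, c2=1.0):
    """Equation (13): r = c1 * S_1u^2 + c2 * S_2u^2, with c1, c2 > 0."""
    if c1 <= 0 or c2 <= 0:
        raise ValueError("c1 and c2 must be positive")
    return c1 * s1u ** 2 + c2 * s2u ** 2

# Hypothetical state components and weights, for illustration only.
r = reward(s1u=0.5, s2u=-0.2, c1=1.0, c2=2.0)
print(r)
```

Because both terms are squared, the value depends only on the magnitudes of the two state components, and raising one weight relative to the other shifts how strongly that component influences the evaluation.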

Abstract

The invention discloses an underwater robot parameter adaptive backstepping control method based on double BP neural network Q-learning technology, and belongs to the technical field of underwater robot controller parameter adjustment. It addresses two problems: learning efficiency is low when controller parameters are adjusted with the traditional Q-learning method, and the parameters of the traditional backstepping method are not easy to adjust online in real time. The method combines a Q-learning algorithm based on double BP neural networks with the backstepping method to adjust the parameters of the backstepping controller autonomously and online, which meets the requirement that control parameters be adjustable online in real time. In addition, by introducing the double BP neural networks and an experience replay pool, the method exploits the strong fitting capability of the networks to greatly reduce the number of training runs, so learning efficiency improves and a better control effect is achieved with fewer training runs. The method can be applied to parameter adjustment of underwater robot controllers.
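The role of the experience replay pool and the second network can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented algorithm: it assumes the second BP network serves as a target network in the style of DQN methods, that targets take the common form r + γ·max Q, and that γ = 0.9. The names `ReplayPool`, `td_targets`, and `toy_target_net` are hypothetical, and the toy function stands in for a trained network.

```python
import random
from collections import deque

class ReplayPool:
    """Fixed-capacity pool of (state, action_index, reward, next_state) tuples."""
    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest entries are dropped first
        self.rng = random.Random(seed)

    def add(self, state, action_index, reward, next_state):
        self.buffer.append((state, action_index, reward, next_state))

    def sample(self, batch_size):
        # Sample without replacement, never more than what is stored.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

def td_targets(batch, target_net, gamma=0.9):
    """Assumed target form: r + gamma * max_a Q_target(s', a)."""
    return [r + gamma * max(target_net(s_next)) for (_, _, r, s_next) in batch]

def toy_target_net(state):
    # Stand-in for the second BP network: one Q-value per action.
    return [0.1 * state[0], 0.2 * state[0]]

pool = ReplayPool(capacity=3, seed=0)
for t in range(5):
    pool.add((float(t),), 0, 1.0, (float(t + 1),))
batch = pool.sample(2)
targets = td_targets(batch, toy_target_net)
print(len(pool.buffer), targets)
```

Sampling past transitions at random lets each stored experience be reused across several updates, which is one common reason a replay pool reduces the number of training runs needed.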

Description

Technical field

[0001] The invention belongs to the technical field of underwater robot controller parameter adjustment, and in particular relates to an underwater robot parameter adaptive backstepping control method based on double BP neural network Q-learning technology.

Background

[0002] Underwater robots are an important tool for marine resource exploration and subsea missions, and the quality of their motion control greatly affects how well a mission is completed. At present, some conventional controllers are widely used in industrial environments because of their robustness and scalability. However, these controllers are usually not optimally tuned and cannot achieve satisfactory performance. In practical applications, the controller parameters are fixed after repeated manual tuning and cannot adapt to changes in the environment during the controlled process. How to set the controll...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G05B13/04, G05D1/10
CPC: G05B13/04, G05D1/10
Inventors: 王卓, 张佩, 秦洪德, 孙延超, 邓忠超, 张宇昂, 景锐洁, 曹禹
Owner: HARBIN ENG UNIV