Neural network reinforcement learning control method of autonomous underwater robot

An underwater robot and neural network technology, applied in the field of neural network reinforcement learning control for autonomous underwater robots, which addresses the problems of poor adaptability to environmental changes, time-consuming offline parameter tuning, and difficult model identification, and achieves the effect of improving controller performance.

Inactive Publication Date: 2019-05-10
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

[0005] Traditional controllers generally follow one of two approaches. The first requires engineers to adjust parameters offline based on experience, which is time-consuming and adapts poorly to environmental changes. The second relies on identifying an accurate model of the vehicle and its environment, which is difficult to obtain in practice.



Examples


Embodiment 1

[0055] With reference to Figure 1 to Figure 3, the combined speed and heading control of an underactuated AUV is taken as an example to illustrate the basic principle of the neural network reinforcement learning controller. The basic idea of reinforcement learning is to solve sequential decision-making problems by establishing a mapping between states, actions, and the environment. A sequential decision-making problem is usually defined as a Markov decision process (MDP) quadruple E={S,A,P,R}, whose basic model is shown in Figure 4. Here E represents the environment in which the Agent operates, S is the state space of the Agent, A is the action space corresponding to each state, P is the state transition probability, and R is the instantaneous reward corresponding to a state-action pair. The core idea is to find a policy for the Agent, that is, a sequence of actions, such that the value of the discounted cumulative reward is maximized.
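As a minimal illustration of the MDP quadruple E={S,A,P,R} and of policy search by maximizing the discounted cumulative reward, the following Python sketch runs tabular Q-learning on a small randomly generated MDP. It is not the patent's neural network controller; the state and action sizes, the random transition and reward tables, and all hyperparameters are illustrative assumptions.

```python
# Tabular Q-learning on a toy MDP quadruple E = {S, A, P, R} (illustrative only).
import numpy as np

n_states, n_actions = 5, 3                                               # |S|, |A| (assumed sizes)
P = np.random.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a]: next-state distribution
R = np.random.uniform(-1.0, 1.0, size=(n_states, n_actions))            # instantaneous reward R(s, a)
gamma, alpha, epsilon = 0.95, 0.1, 0.1                                   # discount, learning rate, exploration

Q = np.zeros((n_states, n_actions))  # state-action value estimates

s = 0
for step in range(10000):
    # epsilon-greedy action selection: explore occasionally, otherwise exploit current Q
    a = np.random.randint(n_actions) if np.random.rand() < epsilon else int(np.argmax(Q[s]))
    s_next = np.random.choice(n_states, p=P[s, a])   # sample next state from the transition probability P
    r = R[s, a]                                      # instantaneous reward for this state-action pair
    # Q-learning update toward the Bellman target r + gamma * max_a' Q(s', a')
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    s = s_next

policy = Q.argmax(axis=1)  # greedy policy: the action per state that maximizes the estimated discounted return
print("Greedy policy per state:", policy)
```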



Abstract

The invention provides a neural network reinforcement learning control method for an autonomous underwater robot. The method comprises the steps that the current pose information of the autonomous underwater vehicle (AUV) is obtained; the state quantities are calculated and input into a reinforcement learning neural network, which computes Q values through forward propagation, and the controller parameters are determined by selecting an action A; the control parameters and the control deviation are input into the controller, and the control output is calculated; the autonomous robot performs thrust allocation according to the arrangement of its actuators; and a reward value is calculated from the control response, a reinforcement learning iteration is carried out, and the parameters of the reinforcement learning neural network are updated. By combining the reinforcement learning idea with a traditional control method, the AUV evaluates its own motion performance during navigation, adjusts its controller performance online according to the experience generated in motion, and adapts to a complex environment more quickly through self-learning, thereby obtaining better control precision and control stability.
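To make the flow of the abstract concrete, here is a minimal, self-contained Python sketch, assuming a single linear layer as the Q-network, a discrete set of candidate PID gains as the action space, and a toy first-order surge-speed plant standing in for the real AUV and its thrust allocation; none of these concrete choices come from the patent itself.

```python
# Hypothetical sketch of the closed loop described in the abstract: a small Q-network
# selects controller gains, a PID-style controller computes the control output, and the
# network weights are updated from the observed reward. Plant, state, reward, and gain
# sets are illustrative assumptions, not the patent's concrete design.
import numpy as np

GAIN_SETS = [(1.0, 0.0, 0.1), (2.0, 0.1, 0.3), (4.0, 0.2, 0.5)]  # candidate (Kp, Ki, Kd) actions
gamma, alpha, epsilon = 0.95, 1e-3, 0.1                           # discount, learning rate, exploration
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, len(GAIN_SETS)))               # linear Q-network: state (3,) -> Q-values

def q_forward(state):
    return state @ W                                              # forward propagation of Q values

target, speed, integ, prev_err = 1.0, 0.0, 0.0, 0.0               # desired surge speed and controller memory
for step in range(500):
    err = target - speed
    state = np.array([err, integ, err - prev_err])                # state quantities from the control deviation
    q = q_forward(state)
    a = rng.integers(len(GAIN_SETS)) if rng.random() < epsilon else int(np.argmax(q))
    kp, ki, kd = GAIN_SETS[a]                                     # selected action A -> controller parameters
    u = kp * err + ki * integ + kd * (err - prev_err)             # controller output (before thrust allocation)
    speed = 0.9 * speed + 0.1 * np.clip(u, -5.0, 5.0)             # toy first-order plant in place of the AUV
    integ, prev_err = integ + err, err
    new_err = target - speed
    reward = -abs(new_err)                                        # reward computed from the control response
    next_state = np.array([new_err, integ, new_err - prev_err])
    td_target = reward + gamma * np.max(q_forward(next_state))    # reinforcement learning iteration
    W[:, a] += alpha * (td_target - q[a]) * state                 # update Q-network parameters (linear-layer gradient)
```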

Description

Technical field

[0001] The invention belongs to the technical field of autonomous underwater robot control, and in particular relates to a neural network reinforcement learning control method for an autonomous underwater robot.

Background technique

[0002] An Autonomous Underwater Vehicle (AUV), as a member of the family of unmanned carrier systems, is an underwater vehicle with an autonomous navigation capability. By carrying various sensor devices and communication systems, it obtains the position and attitude information of the vehicle, which together with the power actuators forms the control system of the AUV. At the same time, as an underwater vehicle, an AUV can carry the equipment required by its assigned tasks to achieve purposes such as detection and military use. The performance of an AUV is typically judged by the speed, accuracy and stability with which it executes its tasks; an AUV with good performance can usually achieve higher efficiency in task execution.


Application Information

IPC(8): G05B13/04
Inventors: 万磊, 张子洋, 王卓, 牛广智, 徐钰斐, 郑晓波, 陈国防
Owner: HARBIN ENG UNIV