Trajectory control method and control system of underwater robot based on deep reinforcement learning

An underwater-robot trajectory control technology based on deep reinforcement learning, applicable to general control systems, control/regulation systems, height or depth control, etc.; it addresses problems such as low trajectory-tracking accuracy.

Active Publication Date: 2019-12-10
SOUTH CHINA NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a trajectory control method for underwater robots based on deep reinforcement learning. Through this control method, precise control of the trajectory of the underwater robot can be realized, and the problem of low trajectory-tracking accuracy caused by the underwater robot's high-dimensional continuous action space and nonlinear characteristics can be avoided.



Examples


Embodiment 1

[0155] This embodiment discloses a trajectory control method for an underwater robot based on deep reinforcement learning, which comprises a learning phase and an application phase. In the learning phase, the operating process of the underwater robot is simulated by a simulator, data on the simulated operation of the underwater robot are collected, and the decision-making neural network, auxiliary decision-making neural network, evaluation neural network and auxiliary evaluation neural network are learned from these data; the specific steps are as follows:

[0156] S1. First, four neural networks are established, used respectively as the decision-making neural network, auxiliary decision-making neural network, evaluation neural network and auxiliary evaluation neural network, and the parameters of the four neural networks are initialized; the parameters of a neural network refer to the connection weights of each layer of neurons in the network ...
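As a concrete illustration of step S1, the sketch below builds four networks of the kind described (decision, auxiliary decision, evaluation, auxiliary evaluation) and initializes each auxiliary network as a copy of its counterpart, so all connection weights start out identical. The state, target-state and action dimensions, the layer sizes, the activations and the use of PyTorch are illustrative assumptions; the patent text does not fix them.

```python
# Hypothetical sketch of step S1 (not the patented implementation):
# build decision/evaluation networks and their auxiliary copies.
import copy
import torch.nn as nn

STATE_DIM = 12    # assumed size of the robot state vector
TARGET_DIM = 12   # assumed size of the next-moment target state
ACTION_DIM = 6    # assumed number of propulsive-force components

def make_decision_net():
    # Maps (current state, next-moment target state) -> propulsive forces.
    return nn.Sequential(
        nn.Linear(STATE_DIM + TARGET_DIM, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, ACTION_DIM), nn.Tanh(),  # bounded thrust commands
    )

def make_evaluation_net():
    # Scores a (state, target state, action) triple with a single value.
    return nn.Sequential(
        nn.Linear(STATE_DIM + TARGET_DIM + ACTION_DIM, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, 1),
    )

# The four networks of step S1; each auxiliary network starts as an exact copy,
# so its connection weights are initialized together with the main network.
decision_net = make_decision_net()
evaluation_net = make_evaluation_net()
aux_decision_net = copy.deepcopy(decision_net)
aux_evaluation_net = copy.deepcopy(evaluation_net)
```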

Embodiment 2

[0221] This embodiment discloses a trajectory control method for an underwater robot based on deep reinforcement learning. It differs from the trajectory control method disclosed in Embodiment 1 in that the learning phase in this embodiment further includes the following step: S8. During operation of the robot, real-time data are collected at each moment, and the following re-learning is carried out for the decision-making neural network, auxiliary decision-making neural network, evaluation neural network and auxiliary evaluation neural network learned in step S7; specifically:

[0222] S81. First initialize the experience data buffer; use the decision-making neural network, auxiliary decision-making neural network, evaluation neural network and auxiliary evaluation neural network learned in step S7 as the initial neural networks; then, for these initial neural networks, starting from the initial moment, enter step S82 to begin learning; ...
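A minimal sketch of the experience data buffer initialized in step S81, assuming a fixed-capacity buffer of per-moment transitions and a warm start from networks saved after step S7; the capacity, batch size and file names are placeholders, not details taken from the patent.

```python
# Hypothetical sketch of step S81: an experience data buffer plus a warm start
# from the networks learned in step S7. Capacity and batch size are assumptions.
import random
from collections import deque

class ExperienceBuffer:
    def __init__(self, capacity=100_000):
        self.data = deque(maxlen=capacity)

    def add(self, state, target_state, action, reward, next_state, next_target):
        # One transition collected from the running robot at a single moment.
        self.data.append((state, target_state, action, reward, next_state, next_target))

    def sample(self, batch_size=64):
        # Random minibatch for one re-learning update.
        return random.sample(self.data, min(batch_size, len(self.data)))

buffer = ExperienceBuffer()  # S81: initialize the experience data buffer
# The networks learned in step S7 serve as the initial networks for re-learning;
# the file names below are placeholders, e.g.:
# decision_net.load_state_dict(torch.load("decision_s7.pt"))
# aux_decision_net.load_state_dict(torch.load("aux_decision_s7.pt"))
```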

Embodiment 3

[0251] This embodiment discloses a trajectory control method for an underwater robot based on deep reinforcement learning, which comprises a learning phase and an application phase. In the learning phase, the specific steps are as follows:

[0252] S1. First, four neural networks are established, used respectively as the decision-making neural network, auxiliary decision-making neural network, evaluation neural network and auxiliary evaluation neural network, and the parameters of the four neural networks are initialized; the parameters of a neural network refer to the connection weights of each layer of neurons in the network. At the same time, an experience data buffer is established and initialized. Then, for the above four initialized neural networks, starting from the initial moment, enter step S2 to begin learning (a sketch of a learning update follows these steps);

[0253] S2. Judge whether the current moment is the initial moment;

[0254] If so, then collect ...
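The remaining steps of this embodiment are truncated above. For orientation only, the sketch below shows one common way a single learning update can combine the experience buffer with the four networks, continuing the hypothetical names from the earlier sketches: the two auxiliary networks supply the evaluation target, the main networks are updated, and the auxiliary networks slowly track them. The loss forms, discount factor gamma and soft-update rate tau follow the general deep-deterministic-policy-gradient pattern and are assumptions, not details confirmed by the patent.

```python
# Hypothetical learning update using the four networks and the experience
# buffer; gamma, tau, learning rates and loss forms are illustrative choices.
import torch
import torch.nn.functional as F

gamma, tau = 0.99, 0.005
opt_decision = torch.optim.Adam(decision_net.parameters(), lr=1e-4)
opt_evaluation = torch.optim.Adam(evaluation_net.parameters(), lr=1e-3)

def learning_step(batch):
    # Unpack a minibatch of (state, target, action, reward, next state, next target).
    s, g, a, r, s2, g2 = (torch.as_tensor(x, dtype=torch.float32) for x in zip(*batch))
    r = r.unsqueeze(1)

    # Evaluation target computed with the two auxiliary networks.
    with torch.no_grad():
        a2 = aux_decision_net(torch.cat([s2, g2], dim=1))
        y = r + gamma * aux_evaluation_net(torch.cat([s2, g2, a2], dim=1))

    # Update the evaluation network toward that target.
    q = evaluation_net(torch.cat([s, g, a], dim=1))
    critic_loss = F.mse_loss(q, y)
    opt_evaluation.zero_grad(); critic_loss.backward(); opt_evaluation.step()

    # Update the decision network to maximize the evaluation network's score.
    a_pred = decision_net(torch.cat([s, g], dim=1))
    actor_loss = -evaluation_net(torch.cat([s, g, a_pred], dim=1)).mean()
    opt_decision.zero_grad(); actor_loss.backward(); opt_decision.step()

    # Let the auxiliary networks slowly track the main networks.
    for net, aux in ((decision_net, aux_decision_net), (evaluation_net, aux_evaluation_net)):
        for p, p_aux in zip(net.parameters(), aux.parameters()):
            p_aux.data.mul_(1.0 - tau).add_(tau * p.data)
```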



Abstract

The present invention discloses a method and system for controlling the trajectory of an underwater robot based on deep reinforcement learning. The method comprises a learning phase and an application phase. In the learning phase, a simulator simulates the running process of the underwater robot and data on the simulated running robot are collected; the data comprise the state at each moment and the corresponding target state for the next moment, and a decision neural network, an auxiliary decision neural network, an evaluation neural network and an auxiliary evaluation neural network are trained on these data. In the application phase, the state of the underwater robot at the current moment and its target state for the next moment are obtained and input to the decision neural network obtained from the final learning in the learning phase, and the decision neural network calculates the propulsive force required by the underwater robot at the current moment. The method and system can thereby realize accurate control of the robot's underwater motion trajectory.
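As an illustration of the application phase described above, the sketch below feeds the current state and the next-moment target state to the learned decision network and uses its output as the propulsive-force command; read_robot_state, next_target_state and apply_thrust are hypothetical interfaces to the vehicle, not part of the patent.

```python
# Hypothetical application-phase control step: the learned decision network
# maps (current state, next-moment target state) to a propulsive-force command.
import torch

def control_step(decision_net, read_robot_state, next_target_state, apply_thrust):
    state = torch.as_tensor(read_robot_state(), dtype=torch.float32)
    target = torch.as_tensor(next_target_state(), dtype=torch.float32)
    with torch.no_grad():  # inference only; no learning in the application phase
        thrust = decision_net(torch.cat([state, target]).unsqueeze(0)).squeeze(0)
    apply_thrust(thrust.tolist())
```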

Description

Technical field

[0001] The invention relates to underwater robot control technology, and in particular to an underwater robot trajectory control method and control system based on deep reinforcement learning.

Background technique

[0002] In recent years, underwater robots have been widely used in many marine science fields such as ocean exploration and marine environmental protection, and their status has become increasingly important. Through precise control of the trajectory of an underwater robot, people can safely complete tasks with relatively high risk factors, such as exploring seabed oil and repairing seabed pipelines. At present, it is still common for underwater robots to complete tasks along a specified trajectory under manual operation. Manual operation requires a great deal of energy and labor, and especially when the water flow changes or there is external interference, manual operation alone is not only complicated but also diff...


Application Information

Patent Type & Authority Patents(China)
IPC IPC(8): G05D1/06G05B13/04
CPCG05B13/042G05D1/0692
Inventor 马琼雄余润笙石振宇黄晁星李腾龙张庆茂
Owner SOUTH CHINA NORMAL UNIVERSITY