A Floating Control Method for the Target Area of an Underwater Vehicle Based on Double-Critic Reinforcement Learning Technology

An underwater vehicle and target area technology, applied in neural learning methods, underwater vessels, underwater operation equipment, etc. It addresses problems of existing approaches such as overestimation of Q values, slow convergence of algorithm training, and failure to make use of easily obtained, reliable expert data, and it achieves a good control effect and fast convergence.

CN113033119B | Active | Publication Date: 2022-03-25 | SHANDONG UNIV

Examples

Embodiment 1

[0098] A method for controlling the floating of an underwater vehicle into a target area based on double-critic reinforcement learning technology. The implementation of the present invention is divided into two parts, the task environment construction stage and the floating strategy training stage, and includes the following steps:

[0099] 1. Define the task environment and model:

[0100] 1-1. Construct the task environment of the target area where the underwater vehicle is located and the dynamic model of the underwater vehicle;

[0101] The task environment for the underwater vehicle simulation is written in Python in the VS Code integrated development environment. The geographic coordinate system E-ξηζ of the constructed simulated pool map is shown in Figure 3. The three-dimensional pool is set to 50 m × 50 m × 50 m, and the area in which floating counts as successful is a cylindrical area with the center of the wa...
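The geometry described in [0101] can be sketched as two membership checks. This is only an illustrative sketch, not the patent's environment code: the function names are hypothetical, the pool origin is assumed to sit at a corner so that each coordinate runs from 0 to 50 m, the cylinder axis is assumed to be vertical with ζ measured downward from the surface, and the cylinder's center, radius, and height are left as caller-supplied parameters because the source text is truncated before stating them.

```python
import math

POOL_SIZE_M = 50.0  # pool edge length stated in the text (50 m per axis)


def in_pool(position):
    """Return True if (xi, eta, zeta) lies inside the cubic pool.

    Assumption: the origin of E-xi-eta-zeta is at a pool corner, so every
    coordinate ranges over [0, POOL_SIZE_M].
    """
    return all(0.0 <= c <= POOL_SIZE_M for c in position)


def in_target_cylinder(position, center_xi_eta, radius_m, height_m):
    """Return True if the position lies inside a vertical cylinder.

    Assumptions (not given in the source): the cylinder axis is vertical,
    zeta = 0 is the water surface, and zeta increases with depth.
    center_xi_eta, radius_m, and height_m are hypothetical parameters.
    """
    xi, eta, zeta = position
    horizontal = math.hypot(xi - center_xi_eta[0], eta - center_xi_eta[1])
    return horizontal <= radius_m and 0.0 <= zeta <= height_m
```

A training loop could call `in_target_cylinder` on the vehicle position at every step to decide whether the episode has ended successfully, and `in_pool` to detect that the vehicle has left the simulated map.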

Abstract

The invention relates to a method for controlling the floating of an underwater vehicle into a target area based on double-critic reinforcement learning technology. It belongs to the technical field of marine control experiments and builds on the DDPG algorithm framework in deep reinforcement learning. Training uses both previously obtained expert data and interaction data collected as the agent interacts with the task environment; sampling from the two sources together greatly improves the convergence speed of the algorithm. In addition, the invention uses two mutually independent sets of critic networks and obtains the loss function of the actor network by taking the minimum of the Q(s, a) values output by the two sets, which effectively reduces the overestimation present in reinforcement learning algorithms.
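The two ideas in the abstract, mixed sampling and the minimum over two critics, can be illustrated with a small framework-free sketch. The function names, the `expert_fraction` parameter, and the sign convention of the loss (the negative mean of the minimum Q, as is common in DDPG-style actor updates) are assumptions made for illustration; the source does not state the mixing ratio or the exact form of the loss.

```python
import random


def sample_mixed_batch(expert_buffer, interaction_buffer, batch_size,
                       expert_fraction, rng=random):
    """Draw one training batch from both data sources.

    expert_fraction is a hypothetical knob: the share of the batch taken
    from expert data. The remainder comes from agent-environment data.
    """
    n_expert = min(len(expert_buffer), round(batch_size * expert_fraction))
    n_inter = min(len(interaction_buffer), batch_size - n_expert)
    batch = rng.sample(expert_buffer, n_expert)
    batch += rng.sample(interaction_buffer, n_inter)
    rng.shuffle(batch)  # avoid a fixed expert-then-interaction ordering
    return batch


def actor_loss(q1_values, q2_values):
    """Actor loss built from the element-wise minimum of two critics.

    Assumed form: -mean(min(Q1(s, a), Q2(s, a))), so that minimizing the
    loss pushes the actor toward actions both critics rate highly.
    """
    if not q1_values or len(q1_values) != len(q2_values):
        raise ValueError("critic outputs must be non-empty and equal length")
    mins = [min(q1, q2) for q1, q2 in zip(q1_values, q2_values)]
    return -sum(mins) / len(mins)
```

Taking the minimum means that a single critic's overly optimistic estimate cannot dominate the actor update, which is the mechanism the abstract credits for reducing overestimation. In a real implementation the Q values would be tensors from two separately initialized critic networks rather than Python lists.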

Description

Technical field

[0001] The invention relates to a method for controlling the floating of an underwater vehicle into a target area based on double-critic reinforcement learning technology, and belongs to the technical field of ocean control experiments.

Background

[0002] As key marine equipment, underwater vehicles are widely used in many scientific research and engineering fields such as ocean topographic mapping, resource exploration, archaeological investigation, pipeline maintenance, and biological monitoring, and they are an important means for humans to explore the ocean. However, the seabed environment is complex and changeable. If an underwater vehicle working in such an environment encounters a fault or strong interference and fails to float up to the area where the mother ship is located in a timely, safe, and intelligent manner, economic losses and the loss of important data are inevitable. Therefore, in order to enhance the adaptabil...

Claims

Application Information

Patent Timeline
25 Mar 2022: Publication CN113033119B

IPC: G06F30/28; G06N3/04; G06N3/08; B63G8/18; B63G8/14
CPC: G06N3/08; G06F30/28; B63G8/14; B63G8/18; G06N3/045
Inventors: 李沂滨; 张天泽