Unmanned ship real-time obstacle avoidance algorithm based on deep reinforcement learning

This reinforcement learning technology is applied in the field of unmanned boats. It addresses problems such as strongly dynamic, unpredictable, and complex environmental information, and it achieves real-time performance, an optimized network structure, and enriched navigation information.

Inactive Publication Date: 2019-11-19
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0004] (1) Environmental information is complex and highly dynamic. When navigating on the water surface, unmanned boats are disturbed by natural factors such as wind, waves, and currents. These disturbances are highly dynamic and difficult to predict. In addition, complex environmental information also includes static obst...



Examples


Embodiment 1

[0032] In an embodiment of the present invention, a real-time obstacle avoidance algorithm for unmanned boats based on deep reinforcement learning includes the following steps:

[0033] S10. Add two long short-term memory (LSTM) networks on top of the convolutional neural network (CNN). The first network (LSTM1) contains 64 hidden units, and its input is the image information and the previous reward. The second network (LSTM2) contains 256 hidden units, and its input is the image information, the output of LSTM1, the current speed, and the last action. After each iteration, the network retains the previous image information i_t, the action taken a_{t-1}, and the reward r_{t-1} of that action as a reference for the next learning step.
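The wiring of the two LSTM networks in step S10 can be illustrated with a minimal pure-Python sketch. Only the hidden sizes (64 and 256) and the list of inputs come from the text. The feature size `FEATURE_SIZE`, the action count `N_ACTIONS`, the one-hot encoding of the last action, the scalar speed, and the random weight initialization are all assumptions made for illustration, and the CNN is replaced by a fixed feature vector.

```python
import math
import random

FEATURE_SIZE = 32  # hypothetical size of the CNN image feature vector
N_ACTIONS = 4      # hypothetical number of discrete actions


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Minimal LSTM cell with small random weights (illustration only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.W = [[[rng.uniform(-0.1, 0.1) for _ in range(n)]
                   for _ in range(hidden_size)] for _ in range(4)]
        self.b = [[0.0] * hidden_size for _ in range(4)]

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state

        def gate(k, j):
            return sum(w * v for w, v in zip(self.W[k][j], z)) + self.b[k][j]

        h_new, c_new = [], []
        for j in range(self.hidden_size):
            i_g = sigmoid(gate(0, j))
            f_g = sigmoid(gate(1, j))
            g = math.tanh(gate(2, j))
            o_g = sigmoid(gate(3, j))
            c_j = f_g * c[j] + i_g * g
            c_new.append(c_j)
            h_new.append(o_g * math.tanh(c_j))
        return h_new, c_new


def build_networks(seed=0):
    rng = random.Random(seed)
    # LSTM1 input: image features + previous reward.
    lstm1 = LSTMCell(FEATURE_SIZE + 1, 64, rng)
    # LSTM2 input: image features + LSTM1 output + current speed + last action.
    lstm2 = LSTMCell(FEATURE_SIZE + 64 + 1 + N_ACTIONS, 256, rng)
    return lstm1, lstm2


def initial_state():
    return (([0.0] * 64, [0.0] * 64), ([0.0] * 256, [0.0] * 256))


def forward(lstm1, lstm2, state, image_feat, prev_reward, speed, last_action):
    (h1, c1), (h2, c2) = state
    h1, c1 = lstm1.step(image_feat + [prev_reward], h1, c1)
    one_hot = [1.0 if a == last_action else 0.0 for a in range(N_ACTIONS)]
    h2, c2 = lstm2.step(image_feat + h1 + [speed] + one_hot, h2, c2)
    return h2, ((h1, c1), (h2, c2))


lstm1, lstm2 = build_networks()
state = initial_state()
image_feat = [0.5] * FEATURE_SIZE  # stand-in for the CNN output
out, state = forward(lstm1, lstm2, state, image_feat,
                     prev_reward=0.0, speed=1.2, last_action=2)
print(len(out))  # 256
```

The returned `state` carries the hidden and cell values of both networks into the next step, which is how the previous observation, action, and reward can influence later decisions in this sketch.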

[0034] S20. Add two auxiliary tasks of depth detection and loop detection to the A3C algorithm to enrich navigation information.

[0035] Specifically, step S20 includes:

[0036] S201. Add a depth detection network ...
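One common way to attach auxiliary tasks to an actor-critic objective is to add their losses to the main loss with weighting factors. The text does not state how the depth detection and loop detection losses are defined or combined, so the following is only a sketch under assumptions: a mean squared error for depth, a binary cross-entropy for loop detection, and hypothetical weights `beta_depth` and `beta_loop`.

```python
import math


def depth_loss(pred_depth, true_depth):
    """Mean squared error over depth values (assumed loss form)."""
    return sum((p - t) ** 2 for p, t in zip(pred_depth, true_depth)) / len(true_depth)


def loop_loss(pred_prob, is_loop):
    """Binary cross-entropy for the loop detection output (assumed loss form)."""
    eps = 1e-12
    p = min(max(pred_prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p) if is_loop else -math.log(1.0 - p)


def total_loss(a3c_loss, pred_depth, true_depth, pred_loop, is_loop,
               beta_depth=1.0, beta_loop=1.0):
    """Weighted sum of the A3C loss and the two auxiliary losses (assumed combination)."""
    return (a3c_loss
            + beta_depth * depth_loss(pred_depth, true_depth)
            + beta_loop * loop_loss(pred_loop, is_loop))


loss = total_loss(0.5, [1.0, 2.0], [1.0, 4.0], pred_loop=0.5, is_loop=True)
print(round(loss, 4))  # 3.1931
```

Because the auxiliary terms share the network's earlier layers with the policy, their gradients provide additional training signal even when rewards are sparse. The weights would need tuning in practice.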

Embodiment 2

[0047] In this embodiment of the present invention, the method described in Embodiment 1 is elaborated with reference to the accompanying drawings:

[0048] Referring to Figure 1, in the optimized network of the present invention, the CNN is composed of 2 convolutional layers and 1 fully connected layer. The input image i_t is decoded to output image information and depth information D, and the outputs of LSTM2 are the policy π, the value V, and the loop information L.
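The three outputs of LSTM2 can be pictured as separate linear heads applied to the same hidden vector. This is a sketch under assumptions, not the patented network: the softmax policy head, the scalar value head, the sigmoid loop head, the action count, and the random weights are all hypothetical choices.

```python
import math
import random


def linear(weights, bias, h):
    return [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_heads(hidden_size, n_actions, seed=0):
    """Create random weights for the policy, value, and loop heads (illustration only)."""
    rng = random.Random(seed)

    def mat(rows):
        return [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)] for _ in range(rows)]

    return {"policy": (mat(n_actions), [0.0] * n_actions),
            "value": (mat(1), [0.0]),
            "loop": (mat(1), [0.0])}


def apply_heads(heads, h):
    pi = softmax(linear(*heads["policy"], h))           # policy π over actions
    value = linear(*heads["value"], h)[0]               # value V
    loop_prob = sigmoid(linear(*heads["loop"], h)[0])   # loop information L as a probability
    return pi, value, loop_prob


heads = make_heads(hidden_size=256, n_actions=4)
h = [0.1] * 256  # stand-in for the LSTM2 output
pi, value, loop_prob = apply_heads(heads, h)
print(len(pi), round(sum(pi), 6))  # 4 1.0
```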

[0049] Referring to Figure 2, the present invention uses an asynchronous training method in which multiple agents sample simultaneously. The parameters of the main network are assigned directly to the sub-networks in the agents, and the gradients computed in each agent update the parameters of the main network. The main network trains directly on the obtained samples, and the training queue and prediction queue are batched before being input to the network on the GPU. Considering the characteristics of G...
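The parameter flow described above (main network to sub-networks, agent gradients back to the main network) can be sketched with Python threads. This is a toy illustration under assumptions: the squared-error objective stands in for the A3C gradients, the class and function names are hypothetical, and the training queue, prediction queue, and GPU batching are omitted.

```python
import random
import threading


class MainNetwork:
    """Holds the shared parameters; a lock keeps each update consistent."""

    def __init__(self, n_params):
        self.params = [0.0] * n_params
        self.lock = threading.Lock()

    def snapshot(self):
        with self.lock:
            return list(self.params)

    def apply_gradients(self, grads, lr):
        with self.lock:
            for k, g in enumerate(grads):
                self.params[k] -= lr * g


def toy_gradient(local_params, target, rng):
    """Noisy gradient of a squared-error loss (stand-in for agent gradients)."""
    return [2.0 * (p - t) + rng.gauss(0.0, 0.01) for p, t in zip(local_params, target)]


def worker(main, target, steps, lr, seed):
    rng = random.Random(seed)
    for _ in range(steps):
        local_params = main.snapshot()                    # main network -> sub-network
        grads = toy_gradient(local_params, target, rng)   # agent samples and computes gradients
        main.apply_gradients(grads, lr)                   # agent gradients -> main network


def train(n_agents=4, steps=200, lr=0.05):
    target = [1.0, -2.0, 0.5]  # hypothetical optimum of the toy objective
    main = MainNetwork(len(target))
    threads = [threading.Thread(target=worker, args=(main, target, steps, lr, i))
               for i in range(n_agents)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return main.params, target


params, target = train()
print([round(p, 2) for p in params])
```

Each agent may apply a gradient computed from slightly stale parameters, which is the usual trade-off of asynchronous updates. With a small learning rate, the toy example still settles near the target.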


Abstract

The invention discloses a real-time obstacle avoidance algorithm for unmanned ships based on deep reinforcement learning and relates to the technical field of unmanned ships. A deep learning method is used to obtain image information. On the basis of the A3C algorithm, the network structure is optimized and the obstacle avoidance information is enriched. According to the three requirements of path planning, obstacle avoidance, and environment exploration and adaptation, the action space of the agent is re-standardized, and three kinds of environments are selected for training. In combination with a GPU platform, the pre-training data is integrated into the deep neural network, which improves training efficiency and ensures the real-time performance of the algorithm. The results show that, while meeting the single processing speed requirement, the method shortens the training time by 59.3% and improves efficiency by more than 71.7%, and that the performance of the algorithm model in an unknown environment is effectively improved and is superior to the existing scheme.

Description

Technical field

[0001] The invention relates to the technical field of unmanned boats, in particular to a real-time obstacle avoidance algorithm for unmanned boats based on deep reinforcement learning.

Background technique

[0002] As an autonomous surface vehicle, the unmanned boat has broad application prospects in military and civilian fields due to its small size, high intelligence, and ability to complete tasks independently. Local obstacle avoidance, one of the important criteria for measuring its intelligence, requires the unmanned boat to perceive and assess the surrounding known and unknown environments within a certain range, to avoid obstacles quickly, and finally to arrive at the designated location safely. Unmanned boats usually use infrared, camera, ultrasonic and laser sensors as information acquisition sources. With the advancement of hardware technology and manufacturing capabilities, the accuracy...


Application Information

IPC(8): G06N3/08, G06N3/04
CPC: G06N3/08, G06N3/049, G06N3/045
Inventor: 周治国
Owner: BEIJING INSTITUTE OF TECHNOLOGY