Unmanned ship real-time obstacle avoidance algorithm based on deep reinforcement learning
A deep reinforcement learning technique applied in the field of unmanned boats. It addresses problems such as strongly dynamic and unpredictable environments and complex environmental information, and achieves the effects of ensuring real-time performance, optimizing the network structure, and enriching navigation information.
Examples
Embodiment 1
[0032] In an embodiment of the present invention, a real-time obstacle avoidance algorithm for unmanned boats based on deep reinforcement learning includes the following steps:
[0033] S10. Add two long short-term memory (LSTM) networks on top of the convolutional neural network (CNN). The first network (LSTM1) contains 64 hidden units, and its input is the image information and the previous reward. The second network (LSTM2) contains 256 hidden units, and its input is the image information, the output of LSTM1, the current speed, and the last action. After each iteration, the network retains the previous image information i_t, the action taken a_{t-1}, and the reward r_{t-1} of that action, to provide a reference for the next learning step.
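As a rough illustration of the wiring described in S10, the following Python sketch assembles the input vectors of LSTM1 and LSTM2 and counts their trainable parameters. Only the hidden sizes (64 and 256) come from the text. The image feature width, the velocity dimension, the number of discrete actions, and all function names are assumptions made for illustration, and the parameter count assumes a standard LSTM cell with one bias vector per gate.

```python
from typing import List, Sequence

LSTM1_HIDDEN = 64    # stated in S10
LSTM2_HIDDEN = 256   # stated in S10

# Hypothetical sizes, not given in the source text.
FEATURE_DIM = 256    # width of the CNN image feature vector (assumption)
VELOCITY_DIM = 2     # e.g. surge speed and yaw rate (assumption)
NUM_ACTIONS = 5      # size of the discrete action set (assumption)


def one_hot(index: int, size: int) -> List[float]:
    """Encode a discrete action index as a one-hot vector."""
    if not 0 <= index < size:
        raise ValueError("action index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def lstm1_input(image_features: Sequence[float], prev_reward: float) -> List[float]:
    """LSTM1 sees the image features and the previous reward r_{t-1}."""
    return list(image_features) + [float(prev_reward)]


def lstm2_input(image_features: Sequence[float],
                lstm1_output: Sequence[float],
                velocity: Sequence[float],
                last_action: int,
                num_actions: int) -> List[float]:
    """LSTM2 sees image features, the LSTM1 output, current speed and a_{t-1}."""
    if len(lstm1_output) != LSTM1_HIDDEN:
        raise ValueError("LSTM1 output must have 64 entries")
    return (list(image_features) + list(lstm1_output)
            + list(velocity) + one_hot(last_action, num_actions))


def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Four gates, each with input weights, recurrent weights and one bias."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


if __name__ == "__main__":
    feats = [0.0] * FEATURE_DIM
    x1 = lstm1_input(feats, prev_reward=0.5)
    x2 = lstm2_input(feats, [0.0] * LSTM1_HIDDEN, [1.0, 0.0], 2, NUM_ACTIONS)
    print("LSTM1 input width:", len(x1))
    print("LSTM2 input width:", len(x2))
    print("LSTM1 parameters:", lstm_param_count(len(x1), LSTM1_HIDDEN))
    print("LSTM2 parameters:", lstm_param_count(len(x2), LSTM2_HIDDEN))
```

Under these assumed sizes, LSTM2 is much larger than LSTM1 because it receives both the raw image features and the 64-dimensional LSTM1 output; changing the assumed feature width changes both counts.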
[0034] S20. Add two auxiliary tasks of depth detection and loop detection to the A3C algorithm to enrich navigation information.
[0035] Specifically, step S20 includes:
[0036] S201. Add a depth detection network ...
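One common way to attach auxiliary tasks to an actor-critic objective is to add their losses to the main loss as a weighted sum. The Python sketch below shows that pattern for the two auxiliary tasks named in S20. The source text is truncated before it specifies the loss functions, so the choice of mean squared error for depth, binary cross-entropy for loop detection, the weights, and all function names are assumptions for illustration only, not the patent's formulation.

```python
import math
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error, used here as a stand-in depth loss (assumption)."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def binary_cross_entropy(prob: float, label: int, eps: float = 1e-7) -> float:
    """Stand-in loop-detection loss: was this location visited before? (assumption)"""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def total_loss(a3c_loss: float,
               depth_pred: Sequence[float],
               depth_target: Sequence[float],
               loop_prob: float,
               loop_label: int,
               depth_weight: float = 1.0,   # hypothetical weight
               loop_weight: float = 1.0) -> float:  # hypothetical weight
    """Main A3C loss plus weighted auxiliary losses."""
    return (a3c_loss
            + depth_weight * mse(depth_pred, depth_target)
            + loop_weight * binary_cross_entropy(loop_prob, loop_label))


if __name__ == "__main__":
    loss = total_loss(1.0, [0.5, 1.0], [0.5, 0.0], loop_prob=0.5, loop_label=1,
                      depth_weight=2.0, loop_weight=1.0)
    print("combined loss:", loss)
```

Because the auxiliary terms share the network's lower layers with the policy and value heads, their gradients also shape the shared features; the weights control how strongly they do so.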
Embodiment 2
[0047] In the embodiment of the present invention, the method described in Embodiment 1 will be supplemented with reference to the accompanying drawings:
[0048] Please refer to Figure 1. In the optimized network of the present invention, the CNN is composed of 2 fully convolutional layers and 1 fully connected layer. The input image i_t is decoded to output image information and depth information D, and the outputs of LSTM2 are the policy π, the value V, and the loop information L.
[0049] See Figure 2. The present invention uses an asynchronous training method in which multiple agents sample simultaneously. The parameters of the main network are assigned directly to the sub-networks in the agents, and the gradients from each agent can update the parameters of the main network. The main network trains directly on the collected samples, and the training queue and prediction queue are fed to the GPU network after batch processing. Considering the characteristics of G...
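The parameter flow described above (main network to sub-networks, gradients back to the main network) can be sketched with Python threads. This is a minimal toy under stated assumptions: the `ParameterServer` class, the two-parameter quadratic objective, the learning rate, and the worker count are all hypothetical stand-ins, and the sketch does not model the training and prediction queues or GPU batching mentioned in the text.

```python
import threading
from typing import List

TARGET = [3.0, -1.0]  # hypothetical optimum of the toy objective


class ParameterServer:
    """Holds the main-network parameters (here just a list of floats)."""

    def __init__(self, params: List[float]) -> None:
        self._params = list(params)
        self._lock = threading.Lock()

    def pull(self) -> List[float]:
        """Copy the main parameters into a worker's sub-network."""
        with self._lock:
            return list(self._params)

    def push(self, grads: List[float], lr: float) -> None:
        """Apply a worker's gradients to the main parameters."""
        with self._lock:
            for i, g in enumerate(grads):
                self._params[i] -= lr * g


def toy_gradient(local_params: List[float]) -> List[float]:
    """Gradient of sum((p - t)^2); stands in for rollouts plus backprop."""
    return [2.0 * (p - t) for p, t in zip(local_params, TARGET)]


def worker(server: ParameterServer, steps: int, lr: float) -> None:
    for _ in range(steps):
        local = server.pull()          # main network -> sub-network
        grads = toy_gradient(local)    # computed on the possibly stale copy
        server.push(grads, lr)         # agent gradients -> main network


def train(num_workers: int = 4, steps: int = 500, lr: float = 0.05) -> List[float]:
    server = ParameterServer([0.0, 0.0])
    threads = [threading.Thread(target=worker, args=(server, steps, lr))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.pull()


if __name__ == "__main__":
    print("final parameters:", train())
```

Each worker may compute its gradient from a slightly outdated copy of the parameters, which is the trade-off asynchronous methods accept in exchange for parallel sampling; with a small learning rate the toy objective still converges close to `TARGET`.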