Mobile robot obstacle avoidance method based on DoubleDQN network and deep reinforcement learning

This mobile robot and reinforcement learning technology, applied to biological neural network models, instruments, non-electric variable control and related fields, addresses the problems of long training time, low obstacle avoidance success rate, and high response delay. Its stated effects are shorter training time, improved training efficiency, and reduced response delay.

Active Publication Date: 2019-03-01
HARBIN INST OF TECH +1
Cites: 8; Cited by: 46

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to solve the problems of high response delay, long training tim

Examples

Specific Embodiment 1

[0024] Specific Embodiment 1: As shown in Figure 1, the mobile robot obstacle avoidance method based on a Double DQN network and deep reinforcement learning described in this embodiment comprises the following steps:

[0025] Step 1: Use the Kinect on the mobile robot to map the robot's current environment and extract information on all obstacles in that environment;

[0026] Step 2: Transform the mobile robot itself, the target position, and all obstacle information extracted in Step 1 from the global coordinate system to the local coordinate system, and use these quantities, expressed in the local coordinate system, as the state input of the Double DQN network;

[0027] Step 3: Design the decision-making action space output by the Double DQN network;

[0028] Step 4: Design the reward function of the Double DQN network. The rew...

Specific Embodiment 2

[0033] Specific Embodiment 2: This embodiment differs from Specific Embodiment 1 in that the specific process of Step 2 is as follows:

[0034] Transform the mobile robot itself, the target position, and all obstacle information extracted in Step 1 from the global coordinate system to the local coordinate system. The coordinate transformation is shown in Figure 2, where v denotes the mobile robot's velocity (direction and magnitude) expressed in the local coordinate system. The mobile robot itself, the target position, and all obstacle information extracted in Step 1, expressed in the local coordinate system, are used as the state input of the Double DQN network. The local coordinate system takes the mobile robot itself as the origin; the direction from the mobile robot to the target position is the positive x-axis, and the y-axis direction satisfies the right-hand rule and is perp...
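The local frame described in [0034] can be sketched as a translation to the robot's position followed by a rotation that aligns the x-axis with the robot-to-target direction. The snippet below is a minimal illustration, not the patent's implementation; it assumes a planar (2D) workspace, and the function names `to_local_frame` and `velocity_to_local` are hypothetical.

```python
import math


def _heading_to_target(robot, target):
    # Angle of the robot-to-target line in the global frame (assumed 2D).
    return math.atan2(target[1] - robot[1], target[0] - robot[0])


def to_local_frame(point, robot, target):
    """Express a global-frame 2D point in the robot-centred local frame.

    Local frame as described in the text: origin at the robot, +x pointing
    from the robot to the target, +y given by the right-hand rule.
    """
    theta = _heading_to_target(robot, target)
    dx, dy = point[0] - robot[0], point[1] - robot[1]
    # Rotate the offset by -theta so the target direction becomes +x.
    x_local = math.cos(theta) * dx + math.sin(theta) * dy
    y_local = -math.sin(theta) * dx + math.cos(theta) * dy
    return x_local, y_local


def velocity_to_local(v, robot, target):
    """Rotate a global-frame velocity into the local frame (no translation)."""
    theta = _heading_to_target(robot, target)
    return (math.cos(theta) * v[0] + math.sin(theta) * v[1],
            -math.sin(theta) * v[0] + math.cos(theta) * v[1])
```

In this sketch the target always lands on the positive x-axis of the local frame, at a distance equal to the robot-to-target range, and an obstacle on the robot's left (when facing the target) has a positive local y coordinate.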

Specific Embodiment 3

[0040] Specific Embodiment 3: This embodiment differs from Specific Embodiment 1 in that the specific process of Step 3 is as follows:

[0041] In the local coordinate system, the set of decision actions a output by the Double DQN network is designed as A. Taking the x-axis direction of the local coordinate system as the center direction, A is the set of candidate velocity directions whose angular offsets from the center direction are -90°, -85°, -80°, ..., 0°, 5°, ..., 85°, 90°, so A contains 37 candidate actions. The action space is shown schematically in Figure 3, where the candidate actions are indicated by dashed arrows.
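The action set A in [0041] can be enumerated directly: offsets from -90° to 90° in 5° steps give 37 candidate directions. The following is a minimal sketch; the names `ACTION_ANGLES_DEG` and `action_to_direction` are hypothetical, and mapping an action index to a unit vector is an assumption made for illustration (the excerpt does not say how speed magnitude is chosen).

```python
import math

ANGLE_STEP_DEG = 5
# Angular offsets from the local x-axis: -90, -85, ..., 0, ..., 85, 90.
ACTION_ANGLES_DEG = list(range(-90, 91, ANGLE_STEP_DEG))


def action_to_direction(action_index):
    """Return the unit direction vector (local frame) for a candidate action."""
    angle = math.radians(ACTION_ANGLES_DEG[action_index])
    return math.cos(angle), math.sin(angle)
```

With this indexing, the middle index (18) corresponds to heading straight toward the target along the local x-axis.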

Abstract

The invention belongs to the technical field of mobile robot navigation and provides a mobile robot obstacle avoidance method based on a Double DQN network and deep reinforcement learning. It addresses the high response delay, long training time, and low obstacle avoidance success rate of existing deep reinforcement learning obstacle avoidance methods. A special decision action space and a reward function are designed. Mobile robot trajectory data collection and Double DQN network training run in parallel in two threads, which effectively improves training efficiency and addresses the long training time required by existing deep reinforcement learning obstacle avoidance methods. The invention uses the Double DQN network to perform unbiased estimation of the action value, which prevents falling into a local optimum and addresses the low success rate and high response delay of existing deep reinforcement learning obstacle avoidance methods. Compared with the prior art, the network training time is shortened to below 20% of the prior-art time, and a 100% obstacle avoidance success rate is maintained. The method can be applied in the technical field of mobile robot navigation.
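The role of Double DQN in action-value estimation can be illustrated with the standard Double DQN target, in which the online network selects the next action and the target network evaluates it. Decoupling selection from evaluation in this way reduces the overestimation seen with a single-network maximum. The snippet below is a minimal sketch of that general technique, not the patent's implementation; the function name, the list-based Q-values, and the discount factor `gamma` are assumptions.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Compute the Double DQN training target for one transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    gamma: discount factor (assumed value; not given in the excerpt).
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]
```

For comparison, a plain DQN target would use `max(next_q_target)` in place of the target network's value at the online network's chosen action.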

Description

Technical field

[0001] The invention belongs to the technical field of mobile robot navigation, and in particular relates to an obstacle avoidance method for a mobile robot.

Background art

[0002] With the development of the mobile robotics industry, collision avoidance has become central to many robotic applications, such as multi-agent coordination, home service robots, and warehouse robots. However, ensuring accurate obstacle avoidance while finding the path that takes the least time remains very challenging, because in many cases the robot must reach a given target position in the shortest time while accurately avoiding obstacles.

[0003] At present, obstacle avoidance algorithms are divided into two categories according to whether the mobile robots communicate with each other: communicating and non-communicating. However, in practical application scenarios, it is sometimes difficult for us to obtain reliable communication in...

Application Information

IPC(8): G05D1/02, G06N3/04
CPC: G05D1/021, G06N3/045
Inventors: 李湛, 杨柳, 薛喜地, 孙维超, 林伟阳, 佟明斯, 高会军
Owner: HARBIN INST OF TECH