Multi-robot path planning method based on priori knowledge and DQN algorithm

Prior-knowledge and multi-robot technology, applied in the fields of instrumentation, computation, two-dimensional position/course control, etc., addresses the high training randomness and slow convergence of the target Q network.

Active Publication Date: 2019-10-11
CHONGQING UNIV OF TECH
5 Cites, 18 Cited by

AI Technical Summary

Problems solved by technology

[0006] Aiming at the deficiencies of the above-mentioned prior art, the technical problem to be solved by the present invention is: how to better help improv

Examples

Embodiment 1

[0098] As shown in Figure 1, a multi-robot path planning method based on prior knowledge and the DQN algorithm includes:

[0099] S1: Initialize the multi-robot system: the iteration number threshold N, exploration step threshold M, time step threshold C, prior knowledge, prior rules, experience pool D, iteration number i, exploration step number t, and the Q network, and randomly generate the weights ω of the Q network. The prior knowledge is generated from the optimal path of each single robot, and the prior rules include the state sequence, action sequence, special state sequence, and prior Q-value vector Q_p. Initialize the Q table and the target Q network through the prior knowledge, so that the network weight of the target Q network...
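Step S1 names an experience pool D and a time step threshold C. A minimal sketch of these two pieces follows, assuming that D is a fixed-capacity replay buffer and that C is the period at which the Q-network weights are copied into the target Q network, as in standard DQN. The source does not confirm either reading, and the names `ExperiencePool` and `maybe_sync_target` are hypothetical. The prior-knowledge initialization of the target Q network is not modeled here.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity pool D of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # Assumption: when the pool is full, the oldest entry is dropped first.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement, capped by the current pool size.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def maybe_sync_target(step, period_c, q_weights, target_weights):
    """Copy Q-network weights into the target network every `period_c` steps.

    Assumption: weights are flat Python lists; a real network would use tensors.
    """
    if step % period_c == 0:
        target_weights[:] = q_weights  # in-place copy
        return True
    return False
```

In a training loop, `store` would be called after every executed action and `maybe_sync_target` once per time step, so the target network changes only at multiples of C.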

[0100] Specifically, in this embodiment, the state sequence, the action sequence, and the special state sequence of the multi-robot system are preset (their defining expressions are missing from this extract). When a special state p_i occurs, the optimal action selection strategy...
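The abstract states the selection rule in words: in a special state, take the action with the largest entry in the prior Q-value vector; otherwise use an ε-greedy strategy. The sketch below illustrates that rule under stated assumptions: states are hashable identifiers, the prior Q-value vectors are stored in a dictionary keyed by special state, and the function name `select_action` and its parameters are hypothetical rather than taken from the patent.

```python
import random


def select_action(state, q_values, prior_q, epsilon, rng=random):
    """Pick an action index.

    state    : hashable state identifier
    q_values : list of current Q-value estimates, one per action
    prior_q  : dict mapping each special state to its prior Q-value vector
    epsilon  : exploration probability for the epsilon-greedy branch
    """
    if state in prior_q:
        # Special state: follow the prior knowledge directly.
        vector = prior_q[state]
        return max(range(len(vector)), key=vector.__getitem__)
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit
```

For example, `select_action("p1", [0, 0, 0], {"p1": [0.1, 0.9, 0.3]}, epsilon=1.0)` returns index 1 regardless of ε, because the prior branch is checked first.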

Embodiment 2

[0111] Embodiment 2: This embodiment also discloses a planning method for an optimal path of a single robot.

[0112] As shown in Figure 3, the planning method for the optimal path of a single robot includes:

[0113] S101: Initialize the following for the single-robot system: the action set A, state set S, maximum number of iterations n, maximum number of exploration steps m, minimum number of paths MinPathNum, maximum number of successful pathfindings MaxSuccessNum, exploration factor ε, single update step size eSize, exploration factor change cycle eCycle, maximum counting threshold h, start update time B(s, a), complete update time, action value function Q(s, a), state-action visit count C(s, a), reward function storage U(s, a), number of successful pathfindings SuccessNum, number of successful paths PathNum, successful path PathList, storage table List of successful paths, iteration number i, and exploration step number t.
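Three of the quantities in S101 are indexed by state-action pair: Q(s, a), C(s, a), and U(s, a). One simple way to hold them is as dictionaries keyed by (state, action). The sketch below assumes zero initial values, which the source does not specify, and the helper name `init_tables` is hypothetical.

```python
from itertools import product


def init_tables(states, actions):
    """Create zero-initialized Q(s, a), C(s, a), and U(s, a) lookup tables."""
    keys = list(product(states, actions))
    q = {k: 0.0 for k in keys}  # action value function Q(s, a)
    c = {k: 0 for k in keys}    # state-action visit counts C(s, a)
    u = {k: 0.0 for k in keys}  # stored rewards U(s, a)
    return q, c, u
```

Calling `init_tables(range(3), ["up", "down"])` yields three tables with six entries each, one per state-action pair.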

[0114] S102:...

Embodiment 3

[0139] In this embodiment, a simulation experiment of path planning of a multi-robot system is disclosed.

[0140] 1. Description of the simulation experiment

[0141] 1) For the simulation experiment, the software platform is the Windows 10 operating system, the CPU is an Intel Core i5-8400, and the memory size is 16 GB. The single-robot path planning algorithm is implemented in Python with the TensorFlow deep learning toolkit, and the multi-robot path planning algorithm is written in the MATLAB language on MATLAB 2016a.

[0142] 2) The grid method is used to describe the environment: the working space of the robot system is divided into small grid cells, and each cell represents one state of the robot system. In the map, a white cell indicates a safe area and a black cell indicates an obstacle.
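The grid description above can be sketched as a small occupancy map. The example map, the four-direction action set, the row-by-row state numbering, and the rule that a blocked move leaves the robot in place are all illustrative assumptions; the source states only that white cells are safe, black cells are obstacles, and each cell is a state.

```python
# 0 = white cell (safe area), 1 = black cell (obstacle)
GRID = [
    [0, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

# Hypothetical four-direction action set as (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def state_index(row, col, grid=GRID):
    """Map a cell to a single integer state, numbering cells row by row."""
    return row * len(grid[0]) + col


def step(row, col, action, grid=GRID):
    """Return the next cell; stay in place if the move leaves the map or hits an obstacle."""
    d_row, d_col = ACTIONS[action]
    n_row, n_col = row + d_row, col + d_col
    inside = 0 <= n_row < len(grid) and 0 <= n_col < len(grid[0])
    if inside and grid[n_row][n_col] == 0:
        return n_row, n_col
    return row, col
```

With this map, `step(0, 0, "right")` moves to cell (0, 1), while `step(0, 2, "right")` stays at (0, 2) because cell (0, 3) is an obstacle.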

[0143] The target state and obstacles in the environmen...

Abstract

The invention relates to the technical field of robot path planning, in particular to a multi-robot path planning method based on prior knowledge and the DQN algorithm. The method comprises the steps of: initializing the parameters of a multi-robot system; judging whether a special state occurs and, if so, selecting the action instruction corresponding to the maximum value of the prior Q-value vector, and if not, generating an action instruction by an ε-greedy strategy; then calculating the running state parameters and the reward function after the robot executes the action instruction, storing the related data in an experience pool, and updating the target Q network; and, according to the target Q network and the initial state parameters of the multi-robot system, repeatedly selecting action instructions and generating state parameters to plan an optimal path for the multi-robot system. The method better addresses the slow convergence of the target Q network and the excessive training randomness that arise when the DQN algorithm is used for path planning of a multi-robot system.

Description

Technical field

[0001] The invention relates to the technical field of robot path planning, in particular to a multi-robot path planning method based on prior knowledge and the DQN algorithm.

Background

[0002] Mobile robots are widely used in home, agricultural, industrial, military, and other fields. The three core topics in research on controlling robot movement are robot positioning, task assignment, and path planning. Among them, path planning is the primary condition for a mobile robot to reach the task goal and complete the task. For example, household cleaning robots need reasonable path planning for the indoor environment to complete cleaning tasks; agricultural picking robots need path planning to move between crops to complete picking tasks; and industrial robots need path planning to complete their given tasks in shared workspaces.

[0003] With the development of robot techn...

Application Information

IPC(8): G06F17/50, G06N3/04, G05D1/02
CPC: G05D1/0221, G06F30/20, G06N3/045
Inventor: 李波, 易洁, 梁宏斌
Owner: CHONGQING UNIV OF TECH