
Curriculum learning method for learning a multi-robot formation navigation strategy under sparse reward signals

A curriculum learning method for multi-robot systems, applied in the field of multiple mobile robots, that addresses the difficulty robot formations have in learning navigation strategies.

Active Publication Date: 2020-10-27
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

Under sparse reward signals, it is difficult for multi-robot formations to learn effective navigation policies through general deep reinforcement learning-based methods.

Method used

A curriculum learning method based on the fusion of relative performance and absolute performance. Scenes are first classified according to the distance from the starting point to the target point; the scene type to interact with next is then arranged on the basis of the multi-robot formation's relative and absolute performance in the different scene types.


Examples


Embodiment 1

[0042] As shown in Figure 1 and Figure 2, this embodiment provides a curriculum learning method for learning a multi-robot formation navigation policy under sparse reward signals. Curriculum learning based on the fusion of relative performance and absolute performance allows the multi-robot formation to still learn an effective navigation strategy. Fusion here means that, as training progresses, the method gradually switches from relative-performance-based curriculum learning to absolute-performance-based curriculum learning. In the early stage of training, the basic navigation strategy is mastered quickly through relative-performance-based curriculum learning; in the later stage, complex navigation strategies are mastered through absolute-performance-based curriculum learning.
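The gradual switch described in [0042] can be illustrated with a minimal sketch. The text does not define how relative or absolute performance is computed, so everything below is an assumption made for illustration: per-class success rates stand in for performance, the relative rule is a softmax that favours classes the formation currently handles comparatively well, the absolute rule favours classes still below a fixed target success rate, and the mixing weight `alpha` decays linearly over training. All function names, `temperature`, and `target` are hypothetical.

```python
import math
import random

def relative_weights(success_rates, temperature=0.2):
    """Assumed relative rule: softmax over success rates, so classes the
    formation handles comparatively well are sampled more often."""
    exps = [math.exp(s / temperature) for s in success_rates]
    total = sum(exps)
    return [e / total for e in exps]

def absolute_weights(success_rates, target=0.9):
    """Assumed absolute rule: weight each class by how far its success
    rate falls short of a fixed target; uniform if every class meets it."""
    gaps = [max(target - s, 0.0) for s in success_rates]
    total = sum(gaps)
    if total == 0.0:
        return [1.0 / len(gaps)] * len(gaps)
    return [g / total for g in gaps]

def fused_weights(success_rates, step, total_steps):
    """Blend the two rules; alpha goes linearly from 1 (purely relative)
    at step 0 to 0 (purely absolute) at total_steps and stays at 0 after."""
    alpha = max(0.0, 1.0 - step / total_steps)
    rel = relative_weights(success_rates)
    ab = absolute_weights(success_rates)
    return [alpha * r + (1.0 - alpha) * a for r, a in zip(rel, ab)]

def sample_scene_class(success_rates, step, total_steps, rng=random):
    """Draw the index of the next scene class to train on."""
    weights = fused_weights(success_rates, step, total_steps)
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

Under these assumptions, early training mostly samples scene classes where the formation already obtains some reward, and late training concentrates on the classes that remain unsolved. The linear schedule is only one possible choice; the source says only that the switch is gradual.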

[0043] Compared with the general multi-robot formation navigation method based on deep reinforcem...


Abstract

The invention belongs to the technical field of multiple mobile robots within robotics, and particularly relates to a curriculum learning method for learning a multi-robot formation navigation strategy under sparse reward signals. When reward signals are sparse, multi-robot formation navigation methods based on deep reinforcement learning have difficulty learning an effective navigation strategy through trial and error. To enable a multi-robot formation to still learn a navigation strategy under sparse reward signals, the invention provides a curriculum learning method based on the fusion of relative performance and absolute performance. The method first classifies scenes according to the distance from the starting point to the target point, and then arranges the scene type to interact with next on the basis of the multi-robot formation's relative performance and absolute performance in the different scene types. With the curriculum learning method provided by the invention, the multi-robot formation can learn an effective navigation strategy under sparse reward signals.
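The first step in the abstract, classifying scenes by the distance from the starting point to the target point, might look like the sketch below. The abstract gives neither the number of scene classes nor the distance thresholds, so `DISTANCE_EDGES`, its values, and the function name `scene_class` are hypothetical, and Euclidean distance is assumed.

```python
import math
from bisect import bisect_right

# Hypothetical distance thresholds separating scene classes; the source
# does not state the actual values or how many classes are used.
DISTANCE_EDGES = [2.0, 4.0, 6.0]

def scene_class(start, goal, edges=DISTANCE_EDGES):
    """Return the scene class index for a start/goal pair.

    Class 0 holds the shortest start-to-goal distances; the last class
    holds every scene at or beyond the final edge.
    """
    distance = math.dist(start, goal)  # Euclidean distance (assumed metric)
    return bisect_right(edges, distance)
```

For example, with the assumed edges a scene from (0, 0) to (3, 4) has distance 5 and falls in class 2.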

Description

Technical field

[0001] The invention belongs to the technical field of multiple mobile robots within robotics, and more specifically relates to a curriculum learning method for learning multi-robot formation navigation strategies under sparse reward signals.

Background

[0002] Multi-robot formations have broad application prospects, such as large-scale search and rescue, surveying and mapping, and agricultural plant protection. When working, a multi-robot formation relies on multi-robot formation navigation to move.

[0003] Patent CN2019103948935 discloses an end-to-end distributed multi-robot formation navigation method based on deep reinforcement learning. A multi-robot formation navigation method based on deep reinforcement learning can obtain an excellent navigation strategy through trial and error and requires little manual intervention. In addition, the navigation strateg...


Application Information

IPC(8): G05D1/02
CPC: G05D1/0257, G05D1/0223, G05D1/0221, G05D1/0276
Inventors: 林俊潼, 成慧
Owner: SUN YAT SEN UNIV