A Curriculum Learning Approach for Learning Multi-Robot Formation Navigation Policies Under Sparse Reward Signals

A curriculum learning method for multi-mobile-robot systems that addresses the difficulty multi-robot formations have in learning navigation policies.

Active Publication Date: 2021-09-07
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

Under sparse reward signals, it is difficult for multi-robot formations to learn effective navigation policies with general deep reinforcement learning methods.


Examples

Embodiment 1

[0042] As shown in Figures 1 and 2, a curriculum learning method for learning a multi-robot formation navigation policy under sparse reward signals uses curriculum learning that fuses relative performance and absolute performance, so that the multi-robot formation can still learn an effective navigation policy. Fusing the two means that, as training progresses, the method gradually switches from relative-performance-based curriculum learning to absolute-performance-based curriculum learning. In the early stage of training, basic navigation policies are mastered quickly through relative-performance-based curriculum learning; in the later stage, complex navigation policies are tackled through absolute-performance-based curriculum learning.
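
The gradual switch described in [0042] can be illustrated with a minimal Python sketch. The text here does not define how relative or absolute performance is measured, so everything concrete below is an assumption made for illustration: relative performance is taken as the recent change in success rate for a scenario type, absolute performance as the remaining gap to a perfect success rate, and the switch as a linear weight over training steps. The names `blend_weight`, `scenario_scores`, and `sample_scenario_type` are hypothetical.

```python
import random


def blend_weight(step, total_steps):
    """Weight on the absolute-performance criterion.

    Assumed schedule: rises linearly from 0 (start) to 1 (end of training).
    """
    return min(1.0, max(0.0, step / total_steps))


def scenario_scores(prev_success, curr_success, step, total_steps):
    """Score each scenario type by blending two assumed criteria."""
    w = blend_weight(step, total_steps)
    scores = {}
    for kind, rate in curr_success.items():
        # Assumption: relative performance = recent change in success rate.
        relative = abs(rate - prev_success[kind])
        # Assumption: absolute performance = gap to a perfect success rate.
        absolute = 1.0 - rate
        scores[kind] = (1.0 - w) * relative + w * absolute
    return scores


def sample_scenario_type(scores, rng):
    """Pick the next scenario type with probability proportional to its score."""
    kinds = list(scores)
    weights = [scores[k] + 1e-6 for k in kinds]  # keep every type reachable
    return rng.choices(kinds, weights=weights, k=1)[0]


prev = {"open": 0.2, "corridor": 0.0}
curr = {"open": 0.5, "corridor": 0.0}
early = scenario_scores(prev, curr, step=0, total_steps=1000)
late = scenario_scores(prev, curr, step=1000, total_steps=1000)
next_type = sample_scenario_type(late, random.Random(0))
```

Under these assumptions, early training favors scenario types where the success rate is changing fastest, while late training favors the types that remain unsolved, which mirrors the early-stage and late-stage roles described in the paragraph.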

[0043] Compared with the general multi-robot formation navigation method based on deep reinfo...

Abstract

The invention belongs to the technical field of multi-robot systems in robotics, and more specifically relates to a curriculum learning method for learning multi-robot formation navigation policies under sparse reward signals. When the reward signal is sparse, it is difficult for multi-robot formation navigation methods based on deep reinforcement learning to learn effective navigation policies through trial and error. To allow the multi-robot formation to learn a navigation policy even when the reward signal is sparse, the invention proposes a curriculum learning method based on the fusion of relative performance and absolute performance: scenarios are classified into types, and the type of scenario to interact with next is then scheduled according to the relative and absolute performance of the multi-robot formation in the different scenario types. With the proposed curriculum learning method, the multi-robot formation can learn an effective navigation policy when the reward signal is sparse.
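
The bookkeeping implied by the abstract, classifying scenarios into types and tracking how the formation performs on each type, might look like the following Python sketch. The abstract does not state the classification criterion or the performance measure, so the obstacle-count rule, the sliding-window success rate, and the names `classify_scenario` and `PerformanceTracker` are all hypothetical choices for illustration.

```python
from collections import defaultdict, deque


def classify_scenario(num_obstacles):
    """Hypothetical rule: bucket a scenario by its obstacle count."""
    if num_obstacles == 0:
        return "open"
    if num_obstacles <= 5:
        return "sparse_obstacles"
    return "dense_obstacles"


class PerformanceTracker:
    """Tracks a sliding-window success rate per scenario type (assumed metric)."""

    def __init__(self, window=100):
        # Each scenario type keeps only its most recent `window` outcomes.
        self.outcomes = defaultdict(lambda: deque(maxlen=window))

    def record(self, scenario_type, success):
        """Store the outcome of one episode: 1.0 for success, 0.0 for failure."""
        self.outcomes[scenario_type].append(1.0 if success else 0.0)

    def success_rate(self, scenario_type):
        """Mean outcome over the window; 0.0 if the type has not been seen."""
        history = self.outcomes.get(scenario_type, ())
        return sum(history) / len(history) if history else 0.0


tracker = PerformanceTracker(window=2)
for outcome in (True, False, False):
    tracker.record(classify_scenario(0), outcome)
```

A scheduler could then read `tracker.success_rate(...)` for each type when deciding which type of scenario to present next.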

Description

Technical field

[0001] The invention belongs to the technical field of multi-robot systems in robotics, and more specifically relates to a curriculum learning method for learning multi-robot formation navigation policies under sparse reward signals.

Background

[0002] Multi-robot formations have broad application prospects, such as large-scale search and rescue, surveying and mapping, and agricultural plant protection. When working, a multi-robot formation relies on formation navigation to move.

[0003] Patent CN2019103948935 discloses an end-to-end distributed multi-robot formation navigation method based on deep reinforcement learning. Such a method can learn a strong navigation policy through trial and error and requires little manual intervention. In addition, the navigation strateg...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G05D1/02
CPC: G05D1/0257, G05D1/0223, G05D1/0221, G05D1/0276
Inventors: 林俊潼, 成慧
Owner: SUN YAT SEN UNIV