A Curriculum Learning Approach for Learning Multi-Robot Formation Navigation Policies Under Sparse Reward Signals

A learning method for multi-robot systems, applied in the field of multiple mobile robots, that addresses the difficulty robots have in learning navigation policies while moving in formation.

Active Publication Date: 2021-09-07

SUN YAT SEN UNIV

Cites: 3; Cited by: 0


AI Technical Summary

Problems solved by technology

Under sparse reward signals, it is difficult for multi-robot formations to learn effective navigation policies with general methods based on deep reinforcement learning.
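To make "sparse reward" concrete, the following is a minimal sketch of what such a signal might look like for a formation. The function name, the thresholds, and the reward values are illustrative assumptions and do not come from the patent; the point is only that almost every step returns zero.

```python
import math

GOAL_RADIUS = 0.5      # assumption: formation centroid within 0.5 m counts as arrival
COLLISION_DIST = 0.2   # assumption: any robot closer than 0.2 m to an obstacle collides


def sparse_reward(centroid, goal, min_obstacle_dist):
    """Return (reward, done) for one step; the reward is nonzero only at terminal events."""
    if min_obstacle_dist < COLLISION_DIST:
        return -1.0, True   # episode ends with a collision
    if math.dist(centroid, goal) <= GOAL_RADIUS:
        return 1.0, True    # episode ends with the goal reached
    return 0.0, False       # every other step carries no learning signal


print(sparse_reward((0.0, 0.0), (10.0, 0.0), 1.0))
```

With a signal of this shape, a randomly exploring formation in a difficult scenario may never observe a positive reward, which is the difficulty the passage above describes.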



Embodiment 1

[0042] As shown in Figure 1 and Figure 2, a curriculum learning method for learning a multi-robot formation navigation policy under sparse reward signals uses curriculum learning that fuses relative performance and absolute performance, so that the multi-robot formation can still learn an effective navigation policy. Under this fused scheme, training gradually switches from relative-performance-based curriculum learning to absolute-performance-based curriculum learning as it progresses. In the early stage of training, the basic navigation policy is mastered quickly through relative-performance-based curriculum learning; in the later stage, complex navigation policies are mastered through absolute-performance-based curriculum learning.
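The gradual switch described in [0042] can be sketched as a weighted blend whose weight moves from relative to absolute performance over training. This is only an illustration under stated assumptions: the excerpt does not define either performance measure or the switching schedule, so the linear schedule, the meaning of the two scores, and all function names below are hypothetical.

```python
import random


def fusion_weight(step, total_steps):
    """Weight on the absolute-performance term: 0 at the start of training, 1 at the end.

    Assumption: a linear schedule; the source only says the switch is gradual.
    """
    return min(1.0, max(0.0, step / total_steps))


def fused_priority(relative, absolute_gap, step, total_steps):
    """Blend the two curriculum signals for one scenario type.

    Assumptions: `relative` is how quickly the formation is improving on this type,
    `absolute_gap` is how far it still is from mastering it, both scaled to [0, 1].
    """
    w = fusion_weight(step, total_steps)
    return (1.0 - w) * relative + w * absolute_gap


def choose_scenario_type(stats, step, total_steps, rng=random):
    """Sample the next scenario type with probability proportional to its fused priority.

    `stats` maps a type name to a (relative, absolute_gap) pair.
    """
    names = list(stats)
    # A small constant keeps every type selectable even when its priority is zero.
    scores = [fused_priority(*stats[n], step, total_steps) + 1e-6 for n in names]
    return rng.choices(names, weights=scores, k=1)[0]


stats = {"easy": (0.8, 0.1), "hard": (0.1, 0.9)}
print(choose_scenario_type(stats, step=0, total_steps=1000))
```

Early in training the "easy" type dominates because its relative score is high; late in training the "hard" type dominates because its absolute gap is high. Proportional sampling is one possible choice here; a greedy pick of the highest priority would fit the same description.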

[0043] Compared with the general multi-robot formation navigation method based on deep reinfo...


Abstract

The invention belongs to the technical field of multi-robot systems within robotics, and more specifically relates to a curriculum learning method for learning multi-robot formation navigation policies under sparse reward signals. When the reward signal is sparse, it is difficult for multi-robot formation navigation methods based on deep reinforcement learning to learn effective navigation policies through trial and error. To allow the multi-robot formation to learn a navigation policy even when the reward signal is sparse, the invention proposes a curriculum learning method based on the fusion of relative performance and absolute performance: scenarios are classified into types, and the type of scenario to interact with next is arranged based on the relative and absolute performance of the multi-robot formation in the different types of scenarios. With the proposed curriculum learning method, the multi-robot formation can learn an effective navigation policy when the reward signal is sparse.
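The bookkeeping implied by the abstract (classify scenarios, then measure performance per type) might look like the sketch below. The abstract does not say how scenarios are classified or how the two performance measures are computed, so the obstacle-count classifier, the windowed success rate used as absolute performance, and the half-window difference used as relative performance are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def scenario_type(num_obstacles):
    """Hypothetical classifier: bucket a scenario by its obstacle count."""
    if num_obstacles == 0:
        return "open"
    return "sparse_obstacles" if num_obstacles <= 5 else "dense_obstacles"


class TypePerformance:
    """Tracks recent episode outcomes (1 = success, 0 = failure) per scenario type."""

    def __init__(self, window=20):
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, type_name, success):
        self.history[type_name].append(1.0 if success else 0.0)

    def absolute(self, type_name):
        """Assumed absolute performance: success rate over the window (0.0 if empty)."""
        h = self.history[type_name]
        return sum(h) / len(h) if h else 0.0

    def relative(self, type_name):
        """Assumed relative performance: newer-half success rate minus older-half rate."""
        h = list(self.history[type_name])
        half = len(h) // 2
        if half == 0:
            return 0.0
        old, new = h[:half], h[half:]
        return sum(new) / len(new) - sum(old) / len(old)


perf = TypePerformance(window=4)
for outcome in (False, False, True, True):
    perf.record(scenario_type(0), outcome)
print(perf.absolute("open"), perf.relative("open"))
```

A scheduler could read these two numbers for each type when deciding which type of scenario the formation should interact with next.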

Description

Technical field

[0001] The invention belongs to the technical field of multi-robot systems within robotics, and more specifically relates to a curriculum learning method for learning multi-robot formation navigation policies under sparse reward signals.

Background art

[0002] Multi-robot formations have broad application prospects, such as large-scale search and rescue, surveying and mapping, and agricultural plant protection. When a multi-robot formation is working, it relies on multi-robot formation navigation to move the formation.

[0003] Patent CN2019103948935 discloses an end-to-end distributed multi-robot formation navigation method based on deep reinforcement learning. Such a method can obtain an excellent navigation policy through trial and error and requires little human intervention. In addition, the navigation strateg...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G05D1/02

CPC: G05D1/0257, G05D1/0223, G05D1/0221, G05D1/0276

Inventor: 林俊潼, 成慧

Owner: SUN YAT SEN UNIV
