Curriculum learning method for learning multi-robot formation navigation strategy under sparse reward signals

A curriculum learning technology for multi-robot formations, applied in the field of multiple mobile robots, that addresses the difficulty of learning navigation strategies for robot formations.

Active Publication Date: 2020-10-27

SUN YAT SEN UNIV

4 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

However, under sparse reward signals, it is difficult for multi-robot formations to learn effective navigation policies through general deep reinforcement learning-based methods.
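To make the difficulty concrete, here is a minimal sketch of what a sparse reward signal can look like. The source does not give the reward definition, so the function name `sparse_reward`, the arrival tolerance `tol`, and the reward values are all illustrative assumptions rather than the patent's formulation.

```python
import math

def sparse_reward(positions, goals, collided, tol=0.2):
    """Hypothetical sparse reward: non-zero only at terminal events.

    positions, goals: one (x, y) tuple per robot.
    collided: True if any robot hit an obstacle or another robot.
    tol: assumed arrival tolerance, not taken from the source.
    """
    if collided:
        return -1.0
    # The formation is rewarded only once every robot is at its goal.
    arrived = all(math.dist(p, g) <= tol for p, g in zip(positions, goals))
    return 1.0 if arrived else 0.0
```

With a signal like this, almost every step returns zero, so a randomly exploring formation rarely sees any feedback when the target is far away. That is the situation in which ordering the training scenes becomes useful.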

Examples

Embodiment 1

[0042] As shown in Figure 1 and Figure 2, a curriculum learning method for learning a multi-robot formation navigation policy under sparse reward signals uses curriculum learning that fuses relative performance and absolute performance, so that the multi-robot formation can still learn an effective navigation strategy. Fusing the two means that, as training progresses, the method gradually switches from curriculum learning based on relative performance to curriculum learning based on absolute performance. In the early stage of training, the basic navigation strategy is mastered quickly through curriculum learning based on relative performance; in the later stage of training, complex navigation strategies are mastered through curriculum learning based on absolute performance.
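The gradual switch described in [0042] can be sketched as a blend of two sampling distributions over scene classes. The visible text does not define how relative or absolute performance is scored, so everything here is an assumption made for illustration: relative performance favours classes where the success rate is above the formation's own average, absolute performance favours classes still below a fixed `target` success rate, and the blend is linear in training progress. All function names are hypothetical.

```python
import random

def relative_weights(success):
    """Assumed scoring: favour classes doing better than the average class."""
    mean = sum(success) / len(success)
    raw = [max(s - mean, 0.0) + 1e-3 for s in success]  # small floor keeps every class reachable
    total = sum(raw)
    return [r / total for r in raw]

def absolute_weights(success, target=0.9):
    """Assumed scoring: favour classes still below a fixed target success rate."""
    raw = [max(target - s, 0.0) + 1e-3 for s in success]
    total = sum(raw)
    return [r / total for r in raw]

def fused_weights(success, progress):
    """Linear blend: progress 0 uses relative only, progress 1 uses absolute only."""
    alpha = min(max(progress, 0.0), 1.0)
    rel = relative_weights(success)
    ab = absolute_weights(success)
    return [(1.0 - alpha) * r + alpha * a for r, a in zip(rel, ab)]

def pick_scene_class(success, progress, rng=random):
    """Sample the index of the scene class to interact with next."""
    weights = fused_weights(success, progress)
    return rng.choices(range(len(success)), weights=weights, k=1)[0]

# Example: per-class success rates, ordered from short to long start-to-goal distance.
success = [0.8, 0.3, 0.0]
early = fused_weights(success, progress=0.0)  # mass concentrates on class 0
late = fused_weights(success, progress=1.0)   # mass concentrates on class 2
```

Under these assumptions, early training keeps the formation on scenes where it already gets some reward, and later training shifts attention to the classes it has not yet solved. A different scoring rule or a non-linear schedule would fit the same description.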

[0043] Compared with the general multi-robot formation navigation method based on deep reinforcem...

Abstract

The invention belongs to the technical field of multiple mobile robots within robotics, and particularly relates to a curriculum learning method for learning a multi-robot formation navigation strategy under sparse reward signals. When the reward signals are sparse, it is difficult for a multi-robot formation navigation method based on deep reinforcement learning to learn an effective navigation strategy through trial and error. To enable a multi-robot formation to still learn a navigation strategy under sparse reward signals, the invention provides a curriculum learning method based on the fusion of relative performance and absolute performance. The method first classifies scenes according to the distance from the starting point to the target point, and then arranges the scene types to interact with next on the basis of the relative performance and absolute performance of the multi-robot formation in the different types of scenes. With the curriculum learning method provided by the invention, the multi-robot formation can learn an effective navigation strategy under sparse reward signals.
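The first step in the abstract, classifying scenes by start-to-target distance, can be sketched as a simple binning function. The source does not state how many classes there are or where the boundaries lie, so the `edges` values and the names `scene_class` and `group_scenes` are hypothetical.

```python
import math
from collections import defaultdict

def scene_class(start, goal, edges=(2.0, 5.0, 10.0)):
    """Return a class index that grows with start-to-goal distance.

    edges: assumed upper bounds (in metres) of each class; distances
    beyond the last edge fall into one extra, hardest class.
    """
    distance = math.dist(start, goal)
    for index, edge in enumerate(edges):
        if distance <= edge:
            return index
    return len(edges)

def group_scenes(scenes, edges=(2.0, 5.0, 10.0)):
    """Group (start, goal) pairs by their distance class."""
    groups = defaultdict(list)
    for start, goal in scenes:
        groups[scene_class(start, goal, edges)].append((start, goal))
    return dict(groups)
```

Each resulting class can then track its own success rate, which is the per-class performance that the curriculum uses to decide which scene type to present next.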

Description

Technical field

[0001] The invention belongs to the technical field of multiple mobile robots within robotics, and more specifically relates to a curriculum learning method for learning multi-robot formation navigation strategies under sparse reward signals.

Background technique

[0002] Multi-robot formations have broad application prospects, such as large-scale search and rescue, surveying and mapping, and agricultural plant protection. When a multi-robot formation is working, it relies on multi-robot formation navigation to move the formation.

[0003] Patent CN2019103948935 discloses an end-to-end distributed multi-robot formation navigation method based on deep reinforcement learning. A multi-robot formation navigation method based on deep reinforcement learning can find an excellent navigation strategy through trial and error, and requires little manual intervention. In addition, the navigation strateg...

Application Information

IPC(8): G05D1/02

CPC: G05D1/0257, G05D1/0223, G05D1/0221, G05D1/0276

Inventors: 林俊潼, 成慧

Owner: SUN YAT SEN UNIV
