A multi-agent deep reinforcement learning method, system and application

A reinforcement learning and multi-agent technique, applied in the field of multi-agent deep reinforcement learning, that addresses long training time, slow neural network training, and low learning efficiency, and achieves high usability.

Active Publication Date: 2021-11-05

ARMY ENG UNIV OF PLA

Cites: 8. Cited by: 0.


AI Technical Summary

Problems solved by technology

However, DRL adapts poorly to dynamic, changing environments and faces several research problems. First, learning efficiency is low: DRL is in essence a trial-and-error learning process, in which experience is generated through continuous interaction between the agent and the environment and stored in a cache. Because the quality of this experience is uneven, the network model has difficulty learning from effective sample data.

Second, training time is long: as the number of agents increases, the action space grows exponentially, and the dimension of the decision output becomes ever larger. Moreover, each agent's behavior decision must consider not only the state of its environment but also the impact of the other agents' decisions on its own strategy, which slows neural network training and can even prevent convergence.
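The exponential growth of the action space mentioned above can be made concrete with a little arithmetic. The sketch below assumes that every agent chooses independently from the same number of discrete actions; the figure of 5 actions per agent and the function name `joint_action_count` are illustrative assumptions, not values from the patent.

```python
def joint_action_count(actions_per_agent: int, n_agents: int) -> int:
    """Number of joint actions when each agent picks one action independently."""
    return actions_per_agent ** n_agents


if __name__ == "__main__":
    # Assumed setting: 5 discrete actions per agent (hypothetical).
    for n_agents in (1, 2, 4, 8):
        print(n_agents, joint_action_count(5, n_agents))
```

Under this assumption, doubling the number of agents squares the number of joint actions, which is why a decision output defined over joint actions becomes hard to train as agents are added.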



Embodiment Construction

[0039] To make the purpose, features and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained from them by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0040] The invention discloses a multi-agent deep reinforcement learning method, including the following process:

[0041] 1. Experience replay with a partitioned buffer

[0042] In general multi-agent deep reinforcement learning, the agent realizes the transition from one state s to the next state s′ by performing a certain behavio...
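A minimal sketch of a partitioned replay buffer of the kind named in this section may help. The listing gives no thresholds or allocation rule, so everything specific here is an assumption: the reward cut-offs `low` and `high`, the class name `PartitionedReplayBuffer`, the `Transition` tuple, and the choice of proportional allocation for the stratified random sampling.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class PartitionedReplayBuffer:
    """Replay buffer split into positive, neutral and negative partitions."""

    def __init__(self, capacity_per_partition=10000, low=-0.1, high=0.1, seed=None):
        # low/high are hypothetical cut-offs that divide the reward space.
        self.low, self.high = low, high
        self.partitions = {
            name: deque(maxlen=capacity_per_partition)
            for name in ("positive", "neutral", "negative")
        }
        self.rng = random.Random(seed)

    def partition_of(self, reward):
        if reward > self.high:
            return "positive"
        if reward < self.low:
            return "negative"
        return "neutral"

    def add(self, state, action, reward, next_state):
        item = Transition(state, action, reward, next_state)
        self.partitions[self.partition_of(reward)].append(item)

    def __len__(self):
        return sum(len(buf) for buf in self.partitions.values())

    def sample(self, batch_size):
        """Stratified random sampling with proportional allocation (assumed)."""
        total = len(self)
        if total == 0:
            return []
        batch_size = min(batch_size, total)
        # Ideal (fractional) share of the batch for each partition.
        quotas = {n: batch_size * len(b) / total for n, b in self.partitions.items()}
        counts = {n: int(q) for n, q in quotas.items()}
        # Hand the remaining slots to the largest fractional remainders.
        leftover = batch_size - sum(counts.values())
        by_remainder = sorted(quotas, key=lambda n: quotas[n] - counts[n], reverse=True)
        for name in by_remainder[:leftover]:
            counts[name] += 1
        batch = []
        for name, buf in self.partitions.items():
            batch.extend(self.rng.sample(list(buf), counts[name]))
        self.rng.shuffle(batch)
        return batch
```

With 6 positive, 3 neutral and 3 negative transitions stored, `sample(4)` under this allocation draws 2, 1 and 1 items from the respective partitions, so each kind of experience is represented in the batch in proportion to its share of the buffer.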


Abstract

The invention discloses a multi-agent deep reinforcement learning algorithm based on partitioned experience and multi-thread interaction. First, the algorithm uses experience replay with a partitioned buffer: it divides the reward space to distinguish positive, negative and neutral experience, and draws these experiences by stratified random sampling during training. Second, the algorithm uses multi-threaded interaction to facilitate the trial-and-error process between the agent and the environment: multiple clones of the agent learn in parallel and pool their learning experience to train the parameters of the network model. The advantages are as follows: the proposed algorithm brings the strengths of the partitioned experience buffer and the multi-thread interaction mode into multi-agent deep reinforcement learning; it is superior to existing models in convergence speed and training efficiency, has higher usability in a multi-agent environment, and can be used to solve the problem of multi-agent cooperative target tracking.
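The multi-thread interaction described in the abstract can be illustrated with a small sketch in which several clones interact with their own copy of an environment and merge their experience into one shared store. This is not the patent's implementation: `ToyEnv`, `worker`, `collect`, the random stand-in policy and the reward rule are all hypothetical, and the network update that would consume the merged experience is omitted.

```python
import random
import threading


class ToyEnv:
    """Hypothetical 1-D environment: the state is an integer position."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        next_state = self.state + action
        # Reward +1 for moving toward the origin, -1 for moving away.
        reward = float(abs(self.state) - abs(next_state))
        self.state = next_state
        return next_state, reward


def worker(worker_id, n_steps, shared, lock):
    """One clone of the agent interacting with its own environment copy."""
    env = ToyEnv()
    rng = random.Random(worker_id)
    local = []
    for _ in range(n_steps):
        state = env.state
        action = rng.choice((-1, 1))  # random stand-in for the clone's policy
        next_state, reward = env.step(action)
        local.append((state, action, reward, next_state))
    with lock:  # merge this clone's experience into the shared store
        shared.extend(local)


def collect(n_workers=4, n_steps=50):
    shared, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(i, n_steps, shared, lock))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


if __name__ == "__main__":
    experience = collect()
    print(len(experience), "transitions gathered by all clones")
```

Each clone buffers its transitions locally and takes the lock only once, which keeps contention on the shared store low. In CPython the threads interleave rather than run Python bytecode simultaneously, so a sketch like this shows the data flow rather than a real speed-up.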

Description

Technical field

[0001] The invention relates to a multi-agent deep reinforcement learning method, system and application, and belongs to the field of multi-agent technology.

Background

[0002] Deep reinforcement learning (DRL) is an efficient strategy search algorithm that combines deep learning (DL) and reinforcement learning (RL). It extracts data features from a high-dimensional state space and searches for the optimal behavior strategy. At present, DRL research results can be applied to multi-agent systems to realize complex combat tasks such as cooperation and competition among multiple agents. However, DRL adapts poorly to dynamic, changing environments and faces several research problems. First, learning efficiency is low: DRL is in essence a trial-and-error learning process, and the learning experience is generated through the continuous interaction between the agent and the environme...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06N3/08

CPC: G06N3/08

Inventor: 张婷婷, 董会, 张赛男

Owner: ARMY ENG UNIV OF PLA
