A multi-agent deep reinforcement learning method, system and application

A reinforcement learning and multi-agent technology, applied in the field of multi-agent deep reinforcement learning, that addresses long training time, slow neural network training and low learning efficiency, and achieves higher usability.

Active Publication Date: 2021-11-05
ARMY ENG UNIV OF PLA

AI Technical Summary

Problems solved by technology

However, DRL adapts poorly to dynamic, changing environments and faces several research problems. First, learning efficiency is low: DRL is essentially a trial-and-error process in which experience is generated through the agent's continuous interaction with the environment and stored in a cache.
Because the quality of this experience is uneven, it is difficult for the network model to learn from effective sample data. Second, training time is long: as the number of agents increases, the action space grows exponentially and the dimension of the decision output becomes ever larger.
Moreover, each agent's behavior decision must consider not only the state of its environment but also the impact of other agents' decisions on its own strategy, which slows neural network training and can even prevent convergence.
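
The exponential growth of the action space can be made concrete with a small calculation. The sketch below assumes every agent chooses independently from the same number of discrete actions; the function name and the values 5 and 1, 2, 4, 8 are hypothetical and are not taken from the patent.

```python
def joint_action_space_size(actions_per_agent: int, num_agents: int) -> int:
    """Number of joint actions when each agent picks one of its own actions."""
    return actions_per_agent ** num_agents


# Assumed example: 5 discrete actions per agent, growing team size.
for num_agents in (1, 2, 4, 8):
    print(num_agents, joint_action_space_size(5, num_agents))
```

Under this assumption, going from 4 to 8 agents multiplies the number of joint actions by 5^4 = 625, which is why a single centralized decision output quickly becomes impractical.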



Embodiment Construction

[0039] To make the purpose, features and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0040] The invention discloses a multi-agent deep reinforcement learning method, including the following process:

[0041] 1. Experience replay with a partitioned buffer

[0042] In the general multi-agent deep reinforcement learning, the agent realizes the transition from one state s to the next state s′ by performing a certain behavio...
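
A partitioned replay buffer of the kind named in this step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name, the reward thresholds `low` and `high`, the per-partition capacity and the proportional allocation used for stratified sampling are all assumptions, since the available text does not specify them.

```python
import random
from collections import deque
from typing import Any, Deque, Dict, List, Optional, Tuple

Transition = Tuple[Any, Any, float, Any]  # (state, action, reward, next_state)


class PartitionedReplayBuffer:
    """Replay buffer split into positive, neutral and negative partitions."""

    def __init__(self, capacity_per_partition: int = 10_000,
                 low: float = -0.1, high: float = 0.1) -> None:
        # `low` and `high` are hypothetical thresholds that divide the reward space.
        self.low = low
        self.high = high
        self.partitions: Dict[str, Deque[Transition]] = {
            name: deque(maxlen=capacity_per_partition)
            for name in ("positive", "neutral", "negative")
        }

    def partition_of(self, reward: float) -> str:
        """Map a reward to the name of the partition that stores it."""
        if reward > self.high:
            return "positive"
        if reward < self.low:
            return "negative"
        return "neutral"

    def add(self, state: Any, action: Any, reward: float, next_state: Any) -> None:
        self.partitions[self.partition_of(reward)].append(
            (state, action, reward, next_state))

    def __len__(self) -> int:
        return sum(len(p) for p in self.partitions.values())

    def sample(self, batch_size: int,
               rng: Optional[random.Random] = None) -> List[Transition]:
        """Stratified random sampling with proportional allocation (assumed)."""
        rng = rng or random.Random()
        total = len(self)
        batch_size = min(batch_size, total)
        batch: List[Transition] = []
        if batch_size == 0:
            return batch
        names = [n for n, p in self.partitions.items() if p]
        quotas = {n: batch_size * len(self.partitions[n]) // total for n in names}
        # Hand out the slots lost to integer division, largest partition first.
        leftover = batch_size - sum(quotas.values())
        for n in sorted(names, key=lambda n: len(self.partitions[n]), reverse=True):
            if leftover == 0:
                break
            if quotas[n] < len(self.partitions[n]):
                quotas[n] += 1
                leftover -= 1
        for n in names:
            # Sample without replacement inside each stratum.
            batch.extend(rng.sample(list(self.partitions[n]), quotas[n]))
        rng.shuffle(batch)
        return batch
```

With 60 positive, 30 neutral and 10 negative transitions stored, `sample(20)` under this allocation returns 12, 6 and 2 transitions from the respective partitions. An equal split per partition would be another reasonable choice if rare experience should be over-represented.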


Abstract

The invention discloses a multi-agent deep reinforcement learning algorithm based on partitioned experience replay and multi-threaded interaction. First, the algorithm uses a partitioned replay buffer: it divides the reward space to separate positive, negative and neutral experience, and draws from these partitions by stratified random sampling during training. Second, the algorithm uses multi-threaded interaction to accelerate the trial-and-error process between the agent and the environment: multiple clones of the agent learn in parallel, and their pooled experience is used to train the parameters of the network model. The advantages are that the proposed algorithm combines the strengths of the partitioned experience buffer and the multi-threaded interaction mode within multi-agent deep reinforcement learning; it is superior to existing models in convergence speed and training efficiency, has higher usability in multi-agent environments, and can be used to solve multi-agent cooperative target-tracking problems.
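
The multi-threaded interaction described in the abstract can be sketched roughly as below. Everything here is an assumption made for illustration: `ToyEnv` is a hypothetical one-dimensional environment, the random action choice stands in for a cloned policy network, and the shared list stands in for the experience buffer. The patent text available here does not give these details.

```python
import random
import threading
from typing import List, Tuple

Transition = Tuple[int, int, float, int]  # (state, action, reward, next_state)


class ToyEnv:
    """Hypothetical 1-D environment: start at 0, try to reach `goal`."""

    def __init__(self, goal: int = 5) -> None:
        self.goal = goal
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        """Apply action (-1 or +1); return (next_state, reward, done)."""
        self.state += action
        if self.state >= self.goal:
            return self.state, 1.0, True    # positive experience
        if self.state < 0:
            return self.state, -1.0, True   # negative experience
        return self.state, 0.0, False       # neutral experience


def worker(seed: int, steps: int, shared: List[Transition],
           lock: threading.Lock) -> None:
    """One agent clone interacting with its own copy of the environment."""
    rng = random.Random(seed)  # random policy as a stand-in for a cloned network
    env = ToyEnv()
    state = env.reset()
    local: List[Transition] = []
    for _ in range(steps):
        action = rng.choice((-1, 1))
        next_state, reward, done = env.step(action)
        local.append((state, action, reward, next_state))
        state = env.reset() if done else next_state
    with lock:  # merge this clone's experience into the shared pool
        shared.extend(local)


def collect_parallel(num_workers: int = 4,
                     steps_per_worker: int = 100) -> List[Transition]:
    """Run several clones in parallel threads and pool their experience."""
    shared: List[Transition] = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(seed, steps_per_worker, shared, lock))
        for seed in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared
```

Calling `collect_parallel(4, 100)` pools 400 transitions that a training loop could then sample from. Note that in CPython, threads mainly help when environment steps wait on I/O or release the global interpreter lock; CPU-bound simulators may need processes instead.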

Description

Technical field

[0001] The invention relates to a multi-agent deep reinforcement learning method, system and application, and belongs to the field of multi-agent technology.

Background

[0002] Deep reinforcement learning (DRL) is an efficient strategy search algorithm that combines deep learning (DL) and reinforcement learning (RL). It extracts data features in a high-dimensional state space and searches for the optimal behavior strategy. At present, DRL research results can be applied to multi-agent systems to carry out complex combat tasks involving cooperation and competition among multiple agents. However, DRL adapts poorly to dynamic, changing environments and faces several research problems. First, learning efficiency is low: DRL is essentially a trial-and-error learning process, and experience is generated through the continuous interaction between the agent and the environme...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N3/08
CPC: G06N3/08
Inventor: 张婷婷, 董会, 张赛男
Owner: ARMY ENG UNIV OF PLA