Multi-agent cooperation model based on deep reinforcement learning

A reinforcement learning and multi-agent technology, applied in computing models, machine learning, and related computing fields, that addresses problems such as low efficiency, slow convergence, and poor stability in order to ensure consistency, improve adaptability, and update rules.

Pending Publication Date: 2021-11-02

DALIAN UNIV



AI Technical Summary

Problems solved by technology

[0005] To address the low efficiency, slow convergence, and poor stability of existing multi-agent reinforcement learning methods, this application provides a multi-agent cooperation model based on deep reinforcement learning. The model ensures consistency between the global optimal action and the local optimal actions, thereby improving the efficiency of multi-agent exploration in continuous action spaces.


Examples


Embodiment 1

[0038] This embodiment adopts the basic structure of CCDA. The distributed Actor networks support distributed execution by the agents: each one interacts with the environment to generate state-action information and stores it in the experience buffer. To counter the non-stationarity of the environment, the centralized Critic network takes the global state-action information as input, uses a global reward R designed around the task of the cooperative multi-agent system, and learns a global action value Qtot from the TD error. To ensure consistency between each single agent's optimal action and the global optimal action, the invention introduces the idea of value decomposition: it adds a Q value decomposition network (QDN) that decomposes the global action value Qtot into action values Qi, one per agent. This realizes implicit credit assignment, so that the contribution of a single agent to the team can be expressed; in addit...
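The data flow described in this embodiment (Actors writing joint transitions to a shared experience buffer, the Critic sampling from it and learning Qtot from the TD error) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class and function names, the uniform sampling, the use of target networks, and the toy stand-in functions for the Critic and Actors are all assumptions made for the example.

```python
import random
from collections import deque


class ExperienceBuffer:
    """Shared buffer of joint transitions written by all Actor networks."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def store(self, obs, actions, reward, next_obs, done):
        # obs, actions and next_obs hold one entry per agent; reward is the global R.
        self.data.append((obs, actions, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement (an assumed sampling scheme).
        return random.sample(list(self.data), min(batch_size, len(self.data)))


def td_error(q_tot, target_q_tot, target_actors, transition, gamma=0.95):
    """TD error of the centralized critic for one joint transition."""
    obs, actions, reward, next_obs, done = transition
    # Each (assumed) target actor sees only its own next observation.
    next_actions = [pi(o) for pi, o in zip(target_actors, next_obs)]
    bootstrap = 0.0 if done else gamma * target_q_tot(next_obs, next_actions)
    return reward + bootstrap - q_tot(obs, actions)


# Toy stand-ins for the networks (hypothetical, for illustration only).
critic = lambda obs, acts: sum(obs) + sum(acts)
actors = [lambda o: 0.5 * o, lambda o: 0.5 * o]

buffer = ExperienceBuffer(capacity=100)
buffer.store([1.0, 2.0], [0.5, 0.5], 1.0, [1.0, 1.0], False)
batch = buffer.sample(batch_size=1)
errors = [td_error(critic, critic, actors, t, gamma=0.5) for t in batch]
```

In a real system the Critic would be a neural network trained to reduce the squared TD error over each sampled batch; the lambdas here only make the arithmetic easy to follow.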


Abstract

The invention discloses a multi-agent cooperation model based on deep reinforcement learning, comprising a centralized Critic network, a plurality of distributed Actor networks, and a Q value decomposition network. Each Actor network interacts with the environment to generate state-action information, which is stored in an experience buffer. The Critic network samples from the experience buffer and takes all state-action information as input; a global reward R is designed with the task of the cooperative multi-agent system as the goal, and a global action value Qtot is learned from the TD error. The Q value decomposition network decomposes the global action value Qtot into action values Qi, one per agent, and the gradient update of each Actor network depends on the action value Qi of the corresponding agent after decomposition. The method ensures consistency between the global optimal action and the local optimal actions, thereby improving the exploration efficiency of multiple agents in a continuous action space.
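To make the consistency claim concrete, the sketch below shows one way a decomposition can keep local and global optima aligned. The text available here does not state how the Q value decomposition network combines the Qi, so the example assumes a non-negatively weighted sum, which keeps Qtot monotone in every Qi. The function names, the toy Qi functions, the candidate action grid, and the finite-difference actor step are all hypothetical simplifications; in the patent the Actor is a network whose parameters are updated, not the action itself.

```python
from itertools import product


def mix(q_values, weights):
    """Assumed stand-in for the decomposition: Q_tot = sum_i w_i * Q_i, w_i >= 0."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep Q_tot monotone in each Q_i")
    return sum(w * q for w, q in zip(weights, q_values))


def local_best(q_fns, candidates):
    """Each agent picks the candidate action that maximizes its own Q_i."""
    return [max(candidates, key=q_i) for q_i in q_fns]


def global_best(q_fns, weights, candidates):
    """Exhaustive search for the joint action that maximizes Q_tot."""
    def q_tot(joint):
        return mix([q_i(a) for q_i, a in zip(q_fns, joint)], weights)
    return list(max(product(candidates, repeat=len(q_fns)), key=q_tot))


def actor_step(action, q_i, lr=0.1, eps=1e-4):
    """One ascent step on Q_i with respect to the agent's own action,
    using a central finite difference in place of backpropagation."""
    grad = (q_i(action + eps) - q_i(action - eps)) / (2 * eps)
    return action + lr * grad


# Toy per-agent action values (assumed, for illustration only).
q_fns = [lambda a: -(a - 0.5) ** 2, lambda a: -(a + 1.0) ** 2]
candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
weights = [2.0, 0.5]

print(local_best(q_fns, candidates))
print(global_best(q_fns, weights, candidates))
```

With these toy values both searches select the same joint action, agent 1 at 0.5 and agent 2 at -1.0, so each agent can follow its own Qi without a joint search. The `actor_step` helper shows the second point of the abstract: an agent improves its action using only the gradient of its own Qi.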

Description

Technical field

[0001] The invention relates to the technical field of multi-agent reinforcement learning, and in particular to a multi-agent cooperation model based on deep reinforcement learning.

Background art

[0002] A multi-agent system (MAS) is a distributed decision-making system composed of multiple agents interacting with the environment. MAS has been studied extensively since the 1970s, with the aim of building swarm intelligence systems that have a specific level of autonomy and the ability to learn autonomously. Its information sharing, distributed computing, and collaborative execution are in wide demand in real-life applications, especially in fields such as the military, industry, and transportation. In decision-making optimization problems, reinforcement learning shows a large advantage in online learning and is closer to the learning mechanism of biological groups. With the upsurge of reinforcement learning led by ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06N20/00

CPC: G06N20/00

Inventor: 邹启杰, 蒋亚军, 高兵, 秦静, 李丹, 李文雪

Owner: DALIAN UNIV
