Cooperative agent learning method based on multi-agent reinforcement learning

A multi-agent reinforcement learning technique, applied in the field of machine learning, that addresses poor collaboration and low efficiency among agents, with the effects of improved efficiency, stable model training, and reduced model complexity.

Pending Publication Date: 2020-02-28
SUN YAT SEN UNIV
0 Cites, 25 Cited by

AI Technical Summary

Problems solved by technology

[0006] To overcome the poor collaboration and low efficiency of multiple agents in a cooperative environment in the prior art described above, the present invention provides a learning method for cooperative agents based on multi-agent reinforcement learning. The method improves the collaboration and efficiency of multiple agents in a cooperative environment and enhances agent performance.

Examples

Embodiment

[0039] As shown in Figures 1-3, a learning method for cooperative agents based on multi-agent reinforcement learning comprises the following steps:

[0040] Step 1: Reset multiple target environments, each of which has the property that the agents are in a cooperative relationship, share information, and act together;

[0041] Step 2: Initialize the model parameter θ_π of the policy network π_θ and the model parameter θ_f of the global information prediction network f_θ;

[0042] Step 3: In the multiple environments, sample the agents in parallel with the current policy π for a fixed number of steps. At each step, the agents in environment e_i share the same state s_{i,t}. A global feature s_{i,t,global} is extracted from this state, a local feature is extracted from s_{i,t} for each agent, and the two are combined into the agent feature s_{i,t,comb}, which is used as the input to the policy network model;

[0043] The agents in...
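
The feature construction described in Step 3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the state layout (one (x, y) position per agent), the choice of global feature (the centroid of all agents), the choice of local feature (the agent's own position), and every function name are assumptions made for the example. The environments are processed in a simple loop here, whereas the source describes parallel sampling.

```python
from typing import List, Tuple

# Hypothetical shared state of one environment: one (x, y) position per agent.
State = List[Tuple[float, float]]


def global_feature(state: State) -> List[float]:
    """Hypothetical global feature s_global: centroid of all agent positions."""
    n = len(state)
    return [sum(p[0] for p in state) / n, sum(p[1] for p in state) / n]


def local_feature(state: State, agent: int) -> List[float]:
    """Hypothetical local feature: the agent's own position."""
    x, y = state[agent]
    return [x, y]


def combined_features(state: State) -> List[List[float]]:
    """Build s_comb for every agent: local feature concatenated with the global one."""
    g = global_feature(state)
    return [local_feature(state, k) + g for k in range(len(state))]


def sample_step(states: List[State]) -> List[List[List[float]]]:
    """One sampling step over several environments e_i, each with its own shared state."""
    return [combined_features(s) for s in states]


# Two environments with two and three agents respectively.
states = [[(0.0, 0.0), (2.0, 4.0)], [(1.0, 1.0), (3.0, 1.0), (2.0, 4.0)]]
batch = sample_step(states)
```

Each entry of `batch[i]` is the combined feature vector for one agent in environment i. Because every agent's input has the same shape, a single policy network could consume all of them.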

Abstract

The invention relates to a cooperative agent learning method based on multi-agent reinforcement learning. The method comprises: step 1, resetting a plurality of target environments; step 2, initializing the model parameter θ_π of the policy network π_θ and the model parameter θ_f of the global information prediction network f_θ; step 3, sampling the multiple agents in the multiple environments according to the current policy π, wherein at each step the agents in an environment share the same state, and features are extracted from that state for each agent and used as input to the model; step 4, updating the model parameters θ_π and θ_f; and step 5, repeating the update until the model converges or the maximum number of steps is reached. The method makes better use of global feature information in environments where the agents are in a cooperative relationship. Through the model that predicts global information from local information, each agent learns to perceive the relationship between the two, so that the agents cooperate better. As a result, different agents can directly share model parameters, which simplifies the model and improves efficiency.
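
The idea of predicting global information from local information with parameters shared across agents can be illustrated with a toy example. This sketch rests on several assumptions and is not the patented model: it replaces the global information prediction network f_θ with a single linear map, uses plain stochastic gradient descent on squared error, and invents the data (each agent observes one local scalar, and the global target is the mean over the agents in that step). The policy network and the reinforcement learning update are omitted entirely.

```python
import random


def predict(theta, x):
    """Linear stand-in for f_theta: maps a local feature x to a predicted global feature."""
    w, b = theta
    return w * x + b


def mse(theta, samples):
    """Mean squared prediction error over (local, global) pairs."""
    return sum((predict(theta, x) - g) ** 2 for x, g in samples) / len(samples)


def sgd_pass(theta, samples, lr=0.01):
    """One pass of SGD; every agent's sample updates the same shared parameters."""
    w, b = theta
    for x, g in samples:
        err = (w * x + b) - g
        w -= lr * 2 * err * x   # gradient of err**2 with respect to w
        b -= lr * 2 * err       # gradient of err**2 with respect to b
    return (w, b)


rng = random.Random(0)
samples = []
for _ in range(50):                                  # 50 hypothetical environment steps
    local = [rng.uniform(-1, 1) for _ in range(4)]   # 4 agents, one local scalar each
    g = sum(local) / len(local)                      # hypothetical global feature: the mean
    samples.extend((x, g) for x in local)            # all agents feed one shared dataset

theta_f = (0.0, 0.0)
loss_before = mse(theta_f, samples)
for _ in range(20):
    theta_f = sgd_pass(theta_f, samples)
loss_after = mse(theta_f, samples)
```

In this toy setup one agent's observation only partly determines the global mean, so the prediction error falls but does not reach zero. The point of the sketch is only that a single parameter set, trained on samples from all agents, can pick up the relationship between local and global information.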

Description

Technical field

[0001] The invention relates to the field of machine learning, and more specifically to a learning method for cooperative agents based on multi-agent reinforcement learning.

Background

[0002] Reinforcement learning is a subfield of machine learning whose goal is to choose actions based on the environment so as to obtain the maximum return. Deep reinforcement learning introduces deep learning as a function approximation method for learning value functions and policies, which greatly improves end-to-end performance compared with manual feature extraction and thereby solves a series of problems that traditional reinforcement learning could not. In video games, for example, deep reinforcement learning has even exceeded the average human level.

[0003] The existing reinforcement learning methods have a relatively mature system, including model-based and model-free metho...

Application Information

IPC(8): G06N20/00
CPC: G06N20/00
Inventor: 陈伟威, 潘嵘
Owner SUN YAT SEN UNIV