
A Reinforcement Learning Approach for Agent Contribution Assignment under Multi-Agent Cooperation Tasks

A reinforcement learning and multi-agent technology, applied in the field of agent contribution assignment under multi-agent reinforcement learning cooperation tasks, that addresses problems such as model growth and model training difficulty, and achieves the effect of improved performance.

Active Publication Date: 2022-08-05
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

This method is easy to implement and control when the number of agents is small. However, when the number of agents is large, the parameters of the model grow exponentially, making the model difficult to train.
In that case, considering the workload and difficulty of engineering implementation, each agent's model is usually trained in a distributed, independent fashion to reduce training difficulty. This approach, however, requires a local reward for each agent to serve as an independent training signal. When the environment provides only a global reward signal, assigning contribution, that is, estimating the contribution of each independent agent, becomes an urgent problem to be solved.
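The scaling problem described above can be made concrete with a back-of-envelope sketch. Assuming n agents each choosing among k discrete actions (the numbers below are illustrative, not from the patent), a single centralized policy must distinguish k**n joint actions, while n independently trained policies only need n * k action outputs in total:

```python
def joint_action_space(n_agents: int, n_actions: int) -> int:
    """Size of the joint action space a centralized policy must cover."""
    return n_actions ** n_agents

def independent_action_space(n_agents: int, n_actions: int) -> int:
    """Total action outputs across n independent per-agent policies."""
    return n_agents * n_actions

# Illustrative numbers: 5 actions per agent, growing agent counts.
for n in (2, 5, 10):
    print(n, joint_action_space(n, 5), independent_action_space(n, 5))
```

With 10 agents and 5 actions each, the joint space is 5**10 (nearly ten million) versus just 50 independent outputs, which is why the distributed training scheme, and hence per-agent credit assignment, becomes attractive.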




Embodiment Construction

[0022] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, and do not limit the protection scope of the present invention.

[0023] In the scenario of multiple agents performing a cooperative task, it is straightforward to map the positions of all agents onto a two-dimensional plane. The following example describes a specific implementation. As shown in Figure 1, the scenario of this example is as follows: (1) 5 agents are scattered on a two-dimensional plane, and the relative distance between any two agents can be measured by the Euclidean distance; (2) the dotted boxes represent the observation ranges of the agents; the three dotted boxes shown in Figure 1 are the observation ranges of agent 2, agent 3 an...
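The 2D scenario above can be sketched in code: a minimal builder for the undirected interaction graph that connects two agents whenever their Euclidean distance falls within an observation radius. The coordinates and radius below are made up for illustration; the patent's figure shows 5 agents but gives no numeric positions.

```python
import math

def interaction_graph(positions, obs_radius):
    """Undirected adjacency matrix: agents i and j are connected
    if their Euclidean distance is within obs_radius.
    (A single shared radius is an assumption for illustration.)"""
    n = len(positions)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if math.hypot(dx, dy) <= obs_radius:
                adj[i][j] = adj[j][i] = 1
    return adj

# 5 agents scattered on a plane (hypothetical coordinates).
pos = [(0, 0), (1, 0), (5, 5), (5, 6), (9, 9)]
adj = interaction_graph(pos, obs_radius=2.0)
```

Here agents 0 and 1 (distance 1) and agents 2 and 3 (distance 1) are connected, while distant pairs are not; this graph is the input to the contribution-weight computation described in the abstract.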



Abstract

The invention discloses a method for assigning agent contributions under a multi-agent reinforcement learning cooperation task, which includes: each agent independently observes the environment state, feeds it into its own policy network, and obtains its own action decision; the actions of all agents are executed in a simulated environment, which feeds back a global reward; the interactions between agents are modeled as an undirected graph, which is used to compute the contribution weight of each agent; the contribution weight of each agent is used to compute the local reward of each agent, and each agent's policy network is trained with this local reward. This method can attribute the results (rewards) of the interaction between multiple agents and the environment, serving the role of credit assignment, providing a more accurate reward description for the training algorithm, and helping the multi-agent system learn a better strategy for cooperative tasks.
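The abstract's pipeline (interaction graph, then contribution weights, then local rewards) can be sketched in Python. The degree-based weighting below is an illustrative assumption; the abstract does not disclose the exact weight formula:

```python
def contribution_weights(adj):
    """Weight each agent by its normalized degree in the undirected
    interaction graph, so agents interacting with more peers are
    credited with a larger share of the global reward.
    (Degree-based weighting is an assumption for illustration.)"""
    degrees = [sum(row) for row in adj]
    total = sum(degrees) or 1  # avoid division by zero for an empty graph
    return [d / total for d in degrees]

def local_rewards(global_reward, weights):
    """Split the single global reward into per-agent local rewards."""
    return [global_reward * w for w in weights]

# 3 agents: agent 0 interacts with agents 1 and 2; 1 and 2 do not interact.
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]
w = contribution_weights(adj)   # degrees 2, 1, 1 -> weights 0.5, 0.25, 0.25
r = local_rewards(8.0, w)       # -> [4.0, 2.0, 2.0]
```

Each agent then trains its own policy network on its local reward instead of the shared global one, which is the credit-assignment effect the abstract describes.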

Description

Technical Field
[0001] The invention belongs to the field of artificial intelligence automation, and in particular relates to a method for assigning agent contributions under a multi-agent reinforcement learning cooperation task.
Background
[0002] Using reinforcement learning to train agents to perform tasks is a common solution in the modern field of AI automation, where many scenarios rely on multiple agents cooperating to achieve a common goal. In a multi-agent cooperation system, the global reward information can be used as the training signal for the global value function and for each agent's value function; alternatively, the contribution of each agent can be assigned separately, the local reward of each agent can be estimated and used as that agent's learning signal, providing gradients for training each agent's policy or value network. [0003] The engineering implementation of using all the rewards of each agent is relati...
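Once a local reward has been estimated for an agent, it can serve as that agent's independent training signal as the background section describes. Below is a hedged sketch, not the patent's exact algorithm: one REINFORCE-style policy-gradient step for a single agent. The tiny linear-softmax policy, scalar observation encoding, and step size are illustrative assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reinforce_step(theta, obs, action, local_reward, lr=0.1):
    """One policy-gradient step for a 1-parameter-per-action policy:
    theta[a] += lr * R_local * d/dtheta[a] log pi(action | obs)."""
    probs = softmax([t * obs for t in theta])
    for a in range(len(theta)):
        indicator = 1.0 if a == action else 0.0
        theta[a] += lr * local_reward * (indicator - probs[a]) * obs
    return theta

# A positive local reward for taking action 0 shifts the policy toward it.
theta = reinforce_step([0.0, 0.0], obs=1.0, action=0, local_reward=1.0)
```

Running one such step per agent with its own local reward, rather than the shared global reward, is what gives each distributed policy an independent, better-attributed gradient signal.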

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06F30/27, G06F17/18
CPC: G06F30/27, G06F17/18
Inventor: 谭哲越, 尹建伟, 尚永衡, 张鹿鸣, 李莹, 邓水光
Owner: ZHEJIANG UNIV