Multi-agent cooperation information processing method and system, storage medium and intelligent terminal

An information processing method based on multi-agent technology, applied in neural learning methods, biological neural network models, instruments, and similar fields. It addresses problems such as difficult training, long algorithm convergence time, and slow learning.

Status: Inactive
Publication Date: 2020-08-25
CHENGDU UNIV OF INFORMATION TECH
0 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

Combining multi-agent systems with deep reinforcement learning (DRL) raises several problems. Compared with a single-agent system, the policy of each agent in a multi-agent system is affected by the other agents in the same environment, which makes it difficult to formulate good learning objectives. As the number of agents increases, the action dimension of the policy output grows and the joint action space grows exponentially. Simultaneous learning by multiple agents makes the environment unstable. A larger number of agents also makes policy learning more likely to fall into an endless loop, so good policies are difficult to learn.
[0016] (1) In the prior art, agents learn inefficiently, and the methods are not suited to randomly changing environments.
[0017] (2) In the prior art, it is difficult for multiple agents to form groups to accomplish a target task the way humans do. In MADDPG, the interactions among all agents are fully connected, which increases the algorithm's convergence time and can even prevent convergence, so performance is poor in scenarios that require division of labor and cooperation.
[0018] (3) In traditional algorithms, the agents start from scratch in every training round. A round ends only when the target is found or the maximum number of steps is reached, so training time becomes very long. Some agents also get stuck in dead ends during training and learn very slowly.
[0019] The difficulty of solving the above technical problems: the larger the environment, the more time agents need to explore it and the more easily they enter an infinite loop, so they cannot learn efficient policies. As the number of agents increases, multi-agent algorithms are prone to dimensional explosion and become difficult to train.
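
The scaling problems described above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the patent: the function names and the assumption that every agent has the same number of discrete actions are illustrative only.

```python
def joint_action_count(actions_per_agent: int, num_agents: int) -> int:
    """Size of the joint action space when each agent independently
    picks one of `actions_per_agent` discrete actions (assumed equal
    for all agents, which is a simplification)."""
    return actions_per_agent ** num_agents


def fully_connected_links(num_agents: int) -> int:
    """Number of pairwise interactions when every agent interacts with
    every other agent, as in a fully connected setup."""
    return num_agents * (num_agents - 1) // 2


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, joint_action_count(5, n), fully_connected_links(n))
```

With 5 actions per agent, the joint action space grows as a power of the number of agents, while the number of pairwise interactions grows quadratically. Restricting interactions to smaller groups reduces the second quantity.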


Examples


Embodiment 1

[0210] Embodiment 1: urban planning. When a city has a large number of vehicles, deep multi-agent reinforcement learning is used to recommend the optimal travel route for each vehicle, so as to reduce overall traffic congestion time and keep traffic flowing smoothly. It is also used to optimize bus routes and traffic light control, as shown in Figure 13.

[0211] Step 1: Construct the road network of Mianyang City.

[0212] Step 2: Analyze the origin-destination (OD) pairs of the driving trajectories, and map the driving trajectories onto the road network.

[0213] Step 3: Analyze the spatio-temporal patterns of motor vehicle travel in Mianyang City based on traffic checkpoint ("bayonet") data.

[0214] Step 4: Generate the OD heat map of the driving trajectories, as shown in Figure 14.

[0215] Step 5: Use the GAED-MADDPG algorithm to find the optimal traffic organization scheme, as shown in Figure 15.
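
The OD analysis in the steps above can be illustrated with a minimal sketch. It assumes, hypothetically, that each trajectory has already been mapped to an ordered list of road-network node IDs; the function name `od_counts` and the node labels are invented for illustration and do not come from the patent.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence


def od_counts(trajectories: Iterable[Sequence[Hashable]]) -> Counter:
    """Count trips per (origin, destination) pair.

    Each trajectory is an ordered sequence of road-network node IDs.
    The first node is treated as the origin and the last node as the
    destination. Trajectories with fewer than two nodes are skipped.
    """
    counts: Counter = Counter()
    for trajectory in trajectories:
        if len(trajectory) < 2:
            continue
        counts[(trajectory[0], trajectory[-1])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical trajectories over nodes "A".."D".
    trips = [["A", "B", "C"], ["A", "D", "C"], ["B", "C", "D"], ["A"]]
    for (origin, destination), n in od_counts(trips).most_common():
        print(origin, destination, n)
```

Counts of this kind are the raw input for an OD heat map: each pair becomes a cell whose intensity reflects the number of trips.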

[0216] It should be noted that the embodiments of the present invention can be realized by ...


Abstract

The invention belongs to the technical field of artificial intelligence and discloses a multi-agent cooperation information processing method and system, a storage medium, and an intelligent terminal. Each agent leaves its own information trace in the environment when it takes its next action; when other agents reach that state in the environment, they search for surrounding information traces and add them to the neural network for training. The grouping model finds a better cooperation strategy among the multiple agents and uses the grouping relationships among the agents to predict their optimal grouping at the next moment. Each time a round of training through the G model finishes, the loss function of each agent is taken as its fitness, the mean loss value of the agent trajectories in each round is computed, and the loss values of all agent trajectories in each round are summarized. The method improves the team learning efficiency of the multiple agents and enables them to complete tasks better through team cooperation.
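
The information-trace idea in the abstract can be sketched as a small data structure. This is a hedged illustration under stated assumptions, not the patented method: the grid-cell representation of states, the class name `TraceGrid`, the `radius` parameter, and the stored `(agent_id, action)` pairs are all hypothetical choices.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


class TraceGrid:
    """Stores traces that agents leave in grid cells (assumed layout)."""

    def __init__(self, radius: int = 1) -> None:
        self.radius = radius  # how far an agent looks around itself
        self.traces: Dict[Cell, List[Tuple[int, str]]] = defaultdict(list)

    def leave(self, agent_id: int, cell: Cell, action: str) -> None:
        """Record that `agent_id` took `action` while in `cell`."""
        self.traces[cell].append((agent_id, action))

    def nearby(self, agent_id: int, cell: Cell) -> List[Tuple[Cell, int, str]]:
        """Return traces left by other agents within `radius` of `cell`."""
        x, y = cell
        found = []
        for dx in range(-self.radius, self.radius + 1):
            for dy in range(-self.radius, self.radius + 1):
                neighbor = (x + dx, y + dy)
                # .get avoids creating empty entries in the defaultdict
                for other_id, action in self.traces.get(neighbor, []):
                    if other_id != agent_id:
                        found.append((neighbor, other_id, action))
        return found


if __name__ == "__main__":
    grid = TraceGrid(radius=1)
    grid.leave(0, (2, 2), "up")
    grid.leave(1, (5, 5), "left")
    print(grid.nearby(1, (2, 3)))
```

In a training loop, the list returned by `nearby` would be encoded and appended to the agent's observation before it is fed to the network; how the patent encodes the traces is not stated in this summary.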

Description

Technical field

[0001] The invention belongs to the technical field of artificial intelligence, and in particular relates to a multi-agent cooperative information processing method, system, storage medium, and intelligent terminal.

Background technique

[0002] At present, multi-agent cooperation is a new and challenging topic in practical applications. a) Making multiple agents learn efficiently in larger, random environments is a persistent challenge in reinforcement learning. Some reinforcement learning algorithms use policy iteration to train agents and can generalize to larger environments, but this approach can only be used to optimize single-agent algorithms and is not well suited to multi-agent systems. b) Making multiple agents work together to accomplish goals as humans do is a new task. Deep reinforcement learning uses an asynchronous framework to train multiple agents. Each agent is independent from other agents...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/08, G06N3/02, G06N20/00
CPC: G06N3/02, G06N3/08, G06N3/084, G06N20/00
Inventor: 邹长杰, 郑皎凌, 张中雷
Owner: CHENGDU UNIV OF INFORMATION TECH