Multi-agent system cooperative control method and system based on reinforcement learning algorithm

A multi-agent system and cooperative control technology, applied in the field of control. It addresses the complexity of the HJB equations and the low computational efficiency of existing methods, which struggle to meet practical requirements, and it achieves high scalability and broad application value.

Pending Publication Date: 2021-10-22
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

In practical applications, controllers are subject to limitations of one kind or another, which means that the control input is bounded.
Under bounded control, the already difficult HJB equation becomes even more complicated.
Current cooperative control methods for multi-agent systems face complex computations, and their computational efficiency is too low for practical applications that involve large amounts of data. The main challenge is therefore to improve the cooperative control capability of multi-agent systems and to overcome the low efficiency of existing cooperative control methods.

Examples

Embodiment 1

[0027] As shown in Figure 1, the multi-agent system cooperative control method based on a reinforcement learning algorithm includes:

[0028] Construct a dynamic graph game model according to the network topology of the multi-agent system, construct a value function according to the dynamic graph game model, and use the value function as the performance index of the multi-agent system;
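
The excerpt does not give the formulas for the graph game model or the value function. As a minimal sketch, assuming the local neighborhood error and quadratic cost commonly used in graphical games (an assumption, not the patent's stated definition), the quantities in step [0028] could be computed as follows. The names local_errors, stage_cost, adjacency, pinning, q and r are hypothetical.

```python
import numpy as np

def local_errors(adjacency, pinning, states, leader_state):
    """Assumed local error: delta_i = sum_j a_ij (x_i - x_j) + g_i (x_i - x_0).

    adjacency[i, j] = a_ij is the weight of the edge from agent j to agent i,
    pinning[i] = g_i is nonzero when agent i observes the leader state x_0.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    pinning = np.asarray(pinning, dtype=float)
    states = np.asarray(states, dtype=float)          # shape (num_agents, state_dim)
    leader_state = np.asarray(leader_state, dtype=float)
    in_degree = adjacency.sum(axis=1)
    # sum_j a_ij (x_i - x_j) = d_i * x_i - sum_j a_ij * x_j
    neighbor_term = in_degree[:, None] * states - adjacency @ states
    leader_term = pinning[:, None] * (states - leader_state)
    return neighbor_term + leader_term

def stage_cost(delta_i, u_i, q=1.0, r=1.0):
    """Assumed quadratic running cost for one agent: q*|delta_i|^2 + r*|u_i|^2."""
    return q * float(delta_i @ delta_i) + r * float(u_i @ u_i)

# Three agents in a chain: agent 0 sees the leader, 1 sees 0, 2 sees 1.
adjacency = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
pinning = [1, 0, 0]
delta = local_errors(adjacency, pinning, [[1.0], [2.0], [4.0]], [0.0])
```

Under this assumption, each agent's value function would accumulate its stage cost over time, so it depends only on the agent's own error and the controls in its neighborhood.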

[0029] The first neural network is used to fit the value function of each agent, and the second neural network is used to fit the control strategy of each agent;
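
The excerpt does not describe the architecture of either network. The sketch below assumes a one-hidden-layer tanh network for both, with one value-function network (critic) and one control-strategy network (actor) per agent. TinyMLP, build_agent_networks and all layer sizes are hypothetical choices made for illustration.

```python
import numpy as np

class TinyMLP:
    """One-hidden-layer tanh network (an assumed architecture)."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(scale=0.1, size=(in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(scale=0.1, size=(hidden_dim, out_dim))
        self.b2 = np.zeros(out_dim)

    def __call__(self, x):
        hidden = np.tanh(np.asarray(x, dtype=float) @ self.w1 + self.b1)
        return hidden @ self.w2 + self.b2

def build_agent_networks(num_agents, error_dim, control_dim, hidden_dim=8):
    """Return one value network and one control network per agent."""
    critics = [TinyMLP(error_dim, hidden_dim, 1, seed=2 * i)
               for i in range(num_agents)]
    actors = [TinyMLP(error_dim, hidden_dim, control_dim, seed=2 * i + 1)
              for i in range(num_agents)]
    return critics, actors

critics, actors = build_agent_networks(num_agents=3, error_dim=2, control_dim=1)
value_estimate = critics[0](np.zeros(2))   # scalar value estimate, shape (1,)
control = actors[1](np.ones(2))            # control output, shape (1,)
```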

[0030] Based on the reinforcement learning algorithm, the value function and the control strategy are iterated online, and the parameters of the first and second neural networks are updated by gradient descent until they converge to the optimal value function at Nash equilibrium. At that point, the multi-agent system achieves cooperative control.
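
The update rules themselves are not given in the excerpt. As a hedged illustration of what one gradient-descent update of the two approximators might look like, the sketch below uses a discrete-time temporal-difference error as a stand-in for the HJB residual, with linear-in-parameter approximators instead of full networks. The functions quadratic_features, critic_step and actor_step, the learning rates, and the target_control argument (the improved control the actor is pulled toward, whose computation the excerpt does not specify) are all assumptions.

```python
import numpy as np

def quadratic_features(delta):
    """Critic features: all products delta_a * delta_b with a <= b."""
    delta = np.asarray(delta, dtype=float)
    rows, cols = np.triu_indices(delta.size)
    return delta[rows] * delta[cols]

def critic_step(w_critic, delta, cost, delta_next, lr=0.1, gamma=1.0):
    """One semi-gradient descent step on 0.5 * td_error**2."""
    phi = quadratic_features(delta)
    phi_next = quadratic_features(delta_next)
    td_error = cost + gamma * (w_critic @ phi_next) - w_critic @ phi
    # Semi-gradient: the bootstrapped target is treated as a constant.
    return w_critic + lr * td_error * phi, td_error

def actor_step(w_actor, delta, target_control, lr=0.1):
    """One gradient step on 0.5 * |u - target|^2 with u = w_actor @ delta."""
    delta = np.asarray(delta, dtype=float)
    error = w_actor @ delta - np.asarray(target_control, dtype=float)
    return w_actor - lr * np.outer(error, delta)

# Scalar example with hand-checkable numbers.
w_critic, td_error = critic_step(np.zeros(1), [2.0], cost=5.0, delta_next=[1.0])
w_actor = actor_step(np.zeros((1, 1)), [2.0], target_control=[-1.0])
```

In an online scheme, these two steps would be repeated for every agent as new measurements arrive, stopping once the parameters no longer change appreciably.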

[0031] In the process of algorithm iteration, it needs ...

Embodiment 2

[0096] This embodiment also provides a multi-agent system cooperative control system based on a reinforcement learning algorithm, including:

[0097] The multi-agent system construction module is used to establish a multi-agent system, construct a dynamic graph game model according to the network topology of the multi-agent system, construct a value function according to the dynamic graph game model, and use the value function as the performance index of the multi-agent system;

[0098] The neural network construction module is used to fit the value function of each agent with the first neural network and the control strategy of each agent with the second neural network;

[0099] The data processing module is used to iterate the value function and the control strategy online based on the reinforcement learning algorithm, and to update the parameters of the first and second neural networks by gradient descent until the optimal approximation v...
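
One way to picture how the three modules fit together is as a pipeline of interchangeable callables. This is only a structural sketch: the class name CooperativeControlSystem, its field names, and the placeholder lambdas are hypothetical and do not reflect any interface defined in the patent.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class CooperativeControlSystem:
    """Wires the three modules of Embodiment 2 together (hypothetical interface)."""
    build_model: Callable[[Any], Any]      # topology -> graph game model
    build_networks: Callable[[Any], Any]   # model -> value and control networks
    iterate: Callable[[Any, Any], Any]     # (model, networks) -> trained networks

    def run(self, topology):
        model = self.build_model(topology)
        networks = self.build_networks(model)
        return self.iterate(model, networks)

# Placeholder modules, used only to show the data flow.
system = CooperativeControlSystem(
    build_model=lambda topology: {"num_agents": len(topology)},
    build_networks=lambda model: [0.0] * model["num_agents"],
    iterate=lambda model, networks: [w + 1.0 for w in networks],
)
result = system.run([[0, 1], [1, 0]])
```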

Embodiment 3

[0103] A computer-readable storage medium stores computer instructions. When the computer instructions are executed by a processor, they carry out the multi-agent system cooperative control method based on the reinforcement learning algorithm described in the above embodiments.

Abstract

The invention provides a multi-agent system cooperative control method and system based on a reinforcement learning algorithm. The method comprises: building a multi-agent system, constructing a value function according to a dynamic graph game model, and using the value function as the performance index of the multi-agent system; fitting the value function of each agent with a first neural network and the control strategy of each agent with a second neural network; iterating the value function and the control strategy online based on the reinforcement learning algorithm until convergence, to obtain an optimal approximation value that reaches Nash equilibrium; and performing cooperative control of the multi-agent system according to the optimal approximation value. The method searches for the solution of the dynamic graph game online and does not require the agents' dynamics equations. Suggestions are provided for the current strategy, and the actor neural network provides the control strategy. This solves the practical problem and meets the design goal of the algorithm, and multi-agent cooperative control based on the method is more efficient and reasonable.

Description

Technical field

[0001] The disclosure belongs to the field of control and relates to a multi-agent system cooperative control method and system based on a reinforcement learning algorithm.

Background

[0002] The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.

[0003] In the past two decades, distributed multi-agent cooperative control systems have attracted extensive attention due to their applications in computer science, spacecraft, unmanned aerial vehicles, and mobile robots. Synchronization means that all agents eventually reach a certain state under appropriate control strategies. To obtain the optimal strategy of each agent and bring the system to Nash equilibrium, the traditional approach is to solve a set of coupled HJB equations, which is computationally very demanding because of the coupling. With t...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G05B13/04, G06N3/04, G06N20/00
CPC: G05B13/042, G06N3/04, G06N20/00
Inventors: 王炳昌, 张宝强, 王天祥
Owner: SHANDONG UNIV