Multi-agent fault tolerance consistency method and system based on reinforcement learning

A reinforcement learning technique for multi-agent systems, applied in the field of reinforcement learning and fault-tolerant control. It addresses environmental instability and the resulting violation of the Markov assumption, and achieves high tolerance to noise.

Pending Publication Date: 2022-01-11

ZHEJIANG SCI-TECH UNIV

Problems solved by technology

The unpredictable behavior of faulty agents makes the environment unstable, which violates the Markov assumption. This is a common and thorny problem in multi-agent reinforcement learning (MARL).

Embodiment Construction

[0048] Embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0049] The preferred embodiment of the present invention is a multi-agent fault-tolerant consistency method based on reinforcement learning. First, according to the characteristics of the distributed system, the following system topology model is established, as shown in Figure 1:

[0050] The system is a network of n agents. The multi-agent topology is represented by a directed graph G(V, E), where V = {1, 2, …, n} is the set of agents and E ⊆ V × V is the set of connections between agents. If agent i can receive information from agent j, agent j is called a neighbor of agent i, and the neighbor set of agent i is N_i = {j | (j, i) ∈ E}. In Figure 1, nodes 0, 1, 2, and 3 are faulty agents; the fault types include random state values and constant state values. Nodes 4-11 represent normal age...
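
The neighbor-set definition above can be illustrated with a short sketch. The helper name `neighbor_sets` and the four-agent edge list are hypothetical and are not taken from the patent or its Figure 1; the only thing carried over from the text is the convention that an edge (j, i) means agent i receives information from agent j.

```python
def neighbor_sets(n, edges):
    """Return N_i = {j | (j, i) in E} for every agent i in 0..n-1.

    Each edge (j, i) is directed: agent i can receive information from agent j.
    """
    neighbors = {i: set() for i in range(n)}
    for j, i in edges:
        neighbors[i].add(j)
    return neighbors


# Hypothetical 4-agent directed graph (illustrative only).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
N = neighbor_sets(4, edges)
print(N[2])  # agents whose state agent 2 can observe
```

Storing the sets keyed by the receiving agent matches the way the method is described later: each agent only needs the states of the agents in its own N_i.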

Abstract

The invention provides a multi-agent fault-tolerant consistency method and system based on reinforcement learning. The method comprises the following steps: S1, build the system network topology and design a reward function; S2, let the agents interact: according to the designed reward function, gradually adjust the weights of adjacent agents until the states of the normal agents are consistent. The invention introduces the trial-and-error idea of MARL, that is, continuous trial, and adopts an algorithm, D-OPDPG, that can be applied to the multi-agent fault-tolerant consistency problem. By combining the natural characteristics of the MARC system, it solves some problems of the prior art: the weights of adjacent agents are adjusted step by step according to the designed reward function, so that the influence of faulty agents is mitigated and the faulty agents are gradually identified. The invention is highly tolerant to noise without adding extra energy consumption to the system. In addition, because a distributed method based on reinforcement learning is adopted, the restrictions on the network topology are relaxed: the topology only needs to be a connected graph.
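
The general idea of steps S1 and S2 (adjusting neighbor weights from a reward until the normal agents agree) can be sketched as below. This is not D-OPDPG, whose details are not given in this excerpt. It is a simple stand-in that assumes a disagreement-penalty reward and a multiplicative weight update; the function name `run_consensus`, the step sizes `eps` and `eta`, the six-agent complete graph, and the initial states are all hypothetical.

```python
import math


def run_consensus(states, neighbors, faulty, steps=200, eps=0.5, eta=0.2):
    """Weighted consensus with reward-driven neighbor weights (illustrative stand-in).

    Assumed reward: r_ij = -|x_j - x_i|, so neighbors that disagree are penalized.
    Assumed weight rule: w_ij <- w_ij * exp(eta * r_ij).
    Faulty agents hold a constant state and never update.
    """
    x = dict(states)
    # Each normal agent keeps one weight per neighbor, all starting at 1.
    w = {i: {j: 1.0 for j in neighbors[i]} for i in neighbors if i not in faulty}
    for _ in range(steps):
        # Adjust neighbor weights from the reward.
        for i, wi in w.items():
            for j in wi:
                reward = -abs(x[j] - x[i])
                wi[j] *= math.exp(eta * reward)
        # Synchronous state update using the normalized weights.
        new_x = dict(x)
        for i, wi in w.items():
            total = sum(wi.values())
            new_x[i] = x[i] + eps * sum(wi[j] / total * (x[j] - x[i]) for j in wi)
        x = new_x
    return x, w


# Hypothetical setup: agent 0 is faulty with a constant state, agents 1-5 are normal.
agents = range(6)
neighbors = {i: {j for j in agents if j != i} for i in agents}
states = {0: 10.0, 1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}
final, weights = run_consensus(states, neighbors, faulty={0})
print({i: round(v, 4) for i, v in final.items()})
```

In this sketch the weight that a normal agent assigns to the faulty neighbor shrinks at every step because that neighbor keeps disagreeing, while the weights between normal agents level off as their states converge. A learned policy, as in the patented method, would replace the fixed multiplicative rule used here.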

Description

Technical field

[0001] The invention belongs to the technical field of reinforcement learning and fault-tolerant control, in particular to a multi-agent fault-tolerant consistency method and system based on reinforcement learning.

Background technique

[0002] In recent years, multi-agent technology has been widely used in modern infrastructure systems, such as transportation systems, power grids, wireless communication networks, medical care equipment and other fields. However, problems such as unpredictable environments and internal failures of agents in practical applications have brought many challenges to multi-agent systems. Fault-tolerant consistency means that in a multi-agent system, when there are erroneous input data, the agents can still achieve state consistency through interaction with each other.

[0003] In a multi-agent system, the direct way to achieve multi-agent resilient consensus (MARC) is to remove faulty agents. Assuming that the maximum number of f...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06N3/08, G06N3/04, G06F11/07

CPC: G06N3/08, G06F11/0751, G06N3/045

Inventors: 侯健, 邱鹏鹏, 王方圆

Owner: ZHEJIANG SCI-TECH UNIV
