Method for improving convergence and training speed of neural network with multiple agents


Pending Publication Date: 2021-05-18

厦门吉比特网络技术股份有限公司

AI Technical Summary

Problems solved by technology

The problem lies in how the AI's reward is set. In existing techniques, all agents share a single reward, so the characters being controlled succeed or fail together. If one character makes a mistake, the whole team is punished: even when only one character in the team errs, the shared reward is still reduced (although the penalty is smaller than when the whole team errs).
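The shared-reward problem described above can be illustrated with a small sketch. The outcome values and the use of a team mean are assumptions made for illustration only; the source does not say how the shared reward is computed.

```python
from typing import List


def shared_reward(outcomes: List[float]) -> List[float]:
    """Give every agent the same team-level reward.

    Using the mean of the individual outcomes is an assumption;
    the source only states that the reward is shared.
    """
    team = sum(outcomes) / len(outcomes)
    return [team] * len(outcomes)


# Hypothetical outcomes: agents 0 and 1 acted well, agent 2 made a mistake.
one_mistake = shared_reward([1.0, 1.0, -1.0])
# Hypothetical outcomes: every agent made a mistake.
all_mistakes = shared_reward([-1.0, -1.0, -1.0])

print(one_mistake)   # all three agents receive the same reduced reward
print(all_mistakes)  # the penalty is larger when the whole team errs
```

With a shared reward, agents 0 and 1 are penalized for agent 2's mistake, and the reward alone does not reveal which agent erred.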

Embodiment Construction

[0024] As shown in Figure 3, the present invention discloses a method for improving the convergence and training speed of a neural network with multiple agents. The method is implemented on a multi-agent system that comprises a multi-agent master controller and N agents. The feedback of each agent contains a tracking point (an instrumentation hook), which is used to judge whether the agent's instruction was wrong and whether the agent made an excellent decision. The method is as follows:

[0025] Input the state information, and pass the current state information to the N agents;

[0026] Each agent outputs its own instruction according to its own neural network, combined with the current state information;

[0027] Each agent receives reward or punishment feedback according to the result of its instruction, combined with the tracking-point judgment in the feedback;

[0028] Collect the rewards and punishments of the N agents into a list of rewards and puni...
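The steps visible above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Feedback` fields, the `penalty` and `bonus` values, and the `evaluate` callback are hypothetical names, and plain functions stand in for the agents' neural networks.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Feedback:
    result: float        # outcome score of the instruction (hypothetical scale)
    was_error: bool      # tracking-point flag: the instruction was wrong
    was_excellent: bool  # tracking-point flag: the decision was excellent


def reward_from_feedback(fb: Feedback, penalty: float = 1.0, bonus: float = 1.0) -> float:
    """Turn one agent's feedback into that agent's own reward."""
    reward = fb.result
    if fb.was_error:
        reward -= penalty
    if fb.was_excellent:
        reward += bonus
    return reward


def step(state: Any,
         policies: List[Callable[[Any], str]],
         evaluate: Callable[[int, str], Feedback]) -> List[float]:
    """Pass the state to every agent and return a per-agent reward list."""
    instructions = [policy(state) for policy in policies]
    feedbacks = [evaluate(i, instr) for i, instr in enumerate(instructions)]
    return [reward_from_feedback(fb) for fb in feedbacks]


# Stand-in policies; a real agent would query its neural network here.
policies = [lambda s: "attack", lambda s: "wait", lambda s: "retreat"]


def evaluate(agent_id: int, instruction: str) -> Feedback:
    # Hypothetical environment judgment keyed by instruction.
    table = {
        "attack": Feedback(0.0, was_error=False, was_excellent=True),
        "wait": Feedback(0.0, was_error=False, was_excellent=False),
        "retreat": Feedback(0.0, was_error=True, was_excellent=False),
    }
    return table[instruction]


rewards = step({"hp": 10}, policies, evaluate)
print(rewards)  # one entry per agent instead of a single shared value
```

Keeping one reward per agent preserves the information about which agent erred, which a single shared scalar would discard.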

Abstract

The invention relates to a method and a device for improving the convergence and training speed of a neural network with multiple agents, and to a storage medium. Rewards and punishments are directed at individual agents: within a multi-agent task, the agent that has currently made the optimal decision is encouraged and retained, and the agent that made a wrong decision is punished directly, so that the neural network optimization of the other agents is not affected. As a result, when the multi-agent AI performs back propagation, the agent that erred is clearly identified and only that agent is punished during differentiation, which increases the convergence and training speed of the neural network and further improves the effect of the multi-agent AI.
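One way to picture the directed punishment during back propagation is a per-agent policy-gradient update. The sketch below is an assumption-laden illustration: it swaps in a REINFORCE-style update on a one-parameter Bernoulli policy per agent, which the source does not specify, and the names `thetas`, `actions`, `rewards`, and `lr` are hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def policy_gradient_step(thetas: List[float],
                         actions: List[int],
                         rewards: List[float],
                         lr: float = 0.1) -> List[float]:
    """Update each agent's parameter using only that agent's own reward."""
    updated = []
    for theta, action, reward in zip(thetas, actions, rewards):
        p = sigmoid(theta)            # probability of choosing action 1
        grad_log_prob = action - p    # d/dtheta of log pi(action) for a Bernoulli policy
        updated.append(theta + lr * reward * grad_log_prob)
    return updated


thetas = [0.0, 0.0, 0.0]     # one parameter per agent (stand-in for a network)
actions = [1, 1, 1]          # every agent chose action 1
rewards = [1.0, 0.0, -1.0]   # agent 0 excelled, agent 1 was neutral, agent 2 erred

updated = policy_gradient_step(thetas, actions, rewards)
print(updated)
```

In this sketch agent 0's preference for its action is reinforced, agent 2's is reduced, and agent 1's parameter is left untouched, which mirrors the claim that punishing the erring agent does not disturb the optimization of the others.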

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence reinforcement learning, and in particular to a method for improving the convergence and training speed of a neural network with multiple agents.

Background

[0002] As shown in Figure 1, in reinforcement learning an agent learns by trial and error, and its behavior is guided by the rewards it obtains through interaction with the environment. The goal is for the agent to obtain the maximum reward. Reinforcement learning differs from supervised learning in connectionist learning mainly in the reinforcement signal: the signal provided by the environment is an evaluation of the quality of the generated action, rather than an instruction telling the reinforcement learning system (RLS) how to generate the correct action. With little information from the external environment, RLS m...

Application Information

IPC(8): G06N3/04, G06N3/08, G06N20/00, A63F13/67

CPC: G06N3/04, G06N3/084, G06N20/00, A63F13/67, A63F2300/6027

Inventor: 陈晨

Owner: 厦门吉比特网络技术股份有限公司
