
Method for improving convergence and training speed of neural network with multiple agents

A multi-agent neural network technique, applied to neural networks with multiple agents, that improves convergence and training speed.

Pending Publication Date: 2021-05-18
厦门吉比特网络技术股份有限公司

AI Technical Summary

Problems solved by technology

The problem lies in how the AI's reward is set. In the existing technology, all agents share a single reward: the multiple characters being controlled share the same honor and disgrace. If one character makes a mistake, the whole team is punished. Even if only one character on the team makes a mistake, the shared reward is still penalized (although the penalty is smaller than when the whole team makes mistakes).
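
The shared-reward problem described above can be illustrated with a minimal sketch. The averaging rule and the outcome values are assumptions made for illustration; the source does not say how the shared reward is computed.

```python
def shared_reward(per_agent_outcomes):
    """Give every agent the same team-level reward.

    Assumption: the team reward is the mean of the individual outcomes.
    The source only states that the reward is shared, not how it is formed.
    """
    team = sum(per_agent_outcomes) / len(per_agent_outcomes)
    return [team] * len(per_agent_outcomes)


# Hypothetical outcomes: agents 0 and 1 acted well, agent 2 made a mistake.
outcomes = [1.0, 1.0, -1.0]
rewards = shared_reward(outcomes)

# All three agents see the same reduced reward, so the two agents that
# acted correctly are penalized for the mistake of the third.
print(rewards)
```

Under this scheme the training signal cannot tell which agent caused the penalty, which is the difficulty the method sets out to address.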



Embodiment Construction

[0024] As shown in Figure 3, the present invention discloses a method for improving the convergence and training speed of a multi-agent neural network. The method is implemented on a multi-agent system comprising a multi-agent master control and N agents. The feedback of each agent contains an instrumentation point (a "buried point"), which is used to judge whether the agent's instruction was wrong and whether the agent made an excellent decision. The method is as follows:

[0025] Input the state information, and pass the current state information to the N agents;

[0026] Each agent outputs its own instruction according to its own neural network, combined with the current state information;

[0027] Each agent is given reward and punishment feedback according to the result of its instruction, combined with the instrumentation-point judgment in the feedback;

[0028] Collect the rewards and punishments of the N agents into a list of rewards and puni...
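
Steps [0025] to [0028] can be sketched roughly as follows. This is an illustrative outline only: the action names, the stand-in `policy` function (in place of each agent's neural network), the `Feedback` fields, and the reward values of +1, 0, and -1 are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass

ACTIONS = ["attack", "defend", "retreat"]  # hypothetical instruction set


@dataclass
class Feedback:
    """Per-agent feedback carrying the instrumentation-point flags."""
    wrong_instruction: bool
    excellent_decision: bool


def policy(agent_id, state):
    # Stand-in for the agent's own neural network (assumption):
    # a fixed rule so that the example is deterministic.
    return ACTIONS[(agent_id + state) % len(ACTIONS)]


def environment_feedback(instruction):
    # Hypothetical judgment rule for the instrumentation point.
    return Feedback(
        wrong_instruction=(instruction == "retreat"),
        excellent_decision=(instruction == "attack"),
    )


def reward_from_feedback(fb):
    # Hypothetical reward values: punish errors, reward excellent decisions.
    if fb.wrong_instruction:
        return -1.0
    if fb.excellent_decision:
        return 1.0
    return 0.0


def step(state, n_agents):
    # [0025]/[0026]: every agent receives the state and emits an instruction.
    instructions = [policy(i, state) for i in range(n_agents)]
    # [0027]: each agent gets its own feedback and its own reward.
    feedbacks = [environment_feedback(ins) for ins in instructions]
    # [0028]: collect the per-agent rewards into one list.
    reward_list = [reward_from_feedback(fb) for fb in feedbacks]
    return instructions, reward_list


instructions, reward_list = step(state=0, n_agents=3)
print(list(zip(instructions, reward_list)))
```

The point of the sketch is the shape of the result: one reward entry per agent rather than a single team-wide scalar.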


Abstract

The invention relates to a method and a device for improving the convergence and training speed of a neural network with multiple agents, and to a storage medium. The rewards of the multiple agents are assigned directionally: for each single agent in a multi-agent task, an agent that has just made an optimal decision is encouraged and retained, while an agent that has made a wrong decision is punished individually, so that the neural network optimization of the other agents is not affected. As a result, when the multi-agent AI performs back propagation, the agent that made the error is clearly identified and only that agent is penalized when the gradients are computed. This increases the convergence and training speed of the neural network and further improves the performance of the multi-agent AI.
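
One way to picture the directional punishment during back propagation is a per-agent policy-gradient update in which each agent's parameters are adjusted only by that agent's own reward. The sketch below uses a plain REINFORCE-style update on softmax logits; this choice of algorithm, the learning rate, and all names are assumptions for illustration, since the abstract does not name a specific training algorithm.

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(logits, action, reward, lr=0.1):
    """One REINFORCE-style step for a single agent (assumed algorithm).

    The gradient of log pi(action) with respect to the logits is
    one_hot(action) - softmax(logits); it is scaled by the agent's own reward.
    """
    probs = softmax(logits)
    return [
        x + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]


def directional_update(all_logits, actions, rewards, lr=0.1):
    # Each agent is updated with its own entry of the reward list only.
    return [
        reinforce_update(logits, a, r, lr)
        for logits, a, r in zip(all_logits, actions, rewards)
    ]


# Hypothetical example: three agents with two actions each.
all_logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
actions = [0, 0, 1]
rewards = [1.0, 0.0, -1.0]  # agent 0 rewarded, agent 1 neutral, agent 2 punished
updated = directional_update(all_logits, actions, rewards)
print(updated)
```

In this example the neutral agent's logits stay unchanged, the rewarded agent's chosen action becomes more likely, and only the punished agent has the logit of its chosen action pushed down.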

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence reinforcement learning, and in particular to a method for improving the convergence and training speed of a neural network with multiple agents.

Background technique

[0002] As shown in Figure 1, in reinforcement learning an agent learns by trial and error and guides its behavior through the rewards it obtains by interacting with the environment. The goal is for the agent to obtain the maximum reward. Reinforcement learning differs from supervised learning in connectionist learning mainly in the reinforcement signal: the reinforcement signal provided by the environment is an evaluation of the quality of the generated action, rather than an instruction telling the reinforcement learning system (RLS) how to generate the correct action. With little information from the external environment, RLS m...


Application Information

IPC(8): G06N3/04, G06N3/08, G06N20/00, A63F13/67
CPC: G06N3/04, G06N3/084, G06N20/00, A63F13/67, A63F2300/6027
Inventor: 陈晨
Owner: 厦门吉比特网络技术股份有限公司