Reinforcement learning training optimization method and device for multi-agent confrontation

A multi-agent reinforcement learning technology, applied in the field of machine learning, that addresses low training efficiency and achieves more efficient training.

Active Publication Date: 2020-04-10

NAT INNOVATION INST OF DEFENSE TECH PLA ACAD OF MILITARY SCI

Problems solved by technology

However, these methods have significant limitations. The main problem is that training efficiency drops as the number of agents increases.

As the number of agents grows, the state-action space of the multi-agent system grows exponentially, and trial-and-error exploration takes more and more time, resulting in low training efficiency.
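
To make the exponential growth concrete, here is a small sketch. It assumes, purely for illustration, that every agent chooses from the same discrete set of 5 actions; the patent text gives no such number. Under that assumption the number of joint actions is 5 raised to the number of agents.

```python
def joint_action_count(actions_per_agent: int, num_agents: int) -> int:
    """Number of joint actions when each agent independently picks one action.

    Assumes all agents share the same discrete action set size (an
    illustrative simplification, not something stated in the patent).
    """
    return actions_per_agent ** num_agents


if __name__ == "__main__":
    # Hypothetical action-set size of 5 per agent.
    for n in (1, 2, 5, 10):
        print(f"{n} agents -> {joint_action_count(5, n)} joint actions")
```

Even with this small per-agent action set, ten agents already give close to ten million joint actions, which is why undirected exploration becomes slow.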

Embodiment Construction

[0028] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0029] Figure 1 is a flow chart of the reinforcement learning training optimization method for multi-agent confrontation provided by an embodiment of the present invention. As shown in Figure 1, the method includes:

[0030] Step 101, the rule-coupled algorithm training process, including: for each training step, obtain the initia...

Abstract

The embodiment of the invention provides a reinforcement learning training optimization method and device for multi-agent confrontation. The method comprises a rule-coupled algorithm training process: for each training step, an initial first state result set of the red-party multi-agent is acquired; if that set meets a preset action rule, a decision-making behavior result set is obtained according to the preset action rule; otherwise, the decision-making behavior result set is obtained according to a preset reinforcement learning algorithm. Reinforcement learning training is then performed on the red-party multi-agent using training samples formed from the decision-making behavior result set and other preset parameters. Throughout training, the preset action rule guides the agents' actions and avoids invalid actions, which addresses the excessive invalid exploration and slow training of the prior art and significantly improves training efficiency.
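
The rule-or-policy decision described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the state layout, the `preset_rule` condition, the action names, and the random `rl_policy` stand-in are all hypothetical, and the rule is checked per agent here for simplicity, whereas the abstract states the condition on the state result set.

```python
import random
from typing import List, Optional, Tuple

State = Tuple[int, int]  # hypothetical layout: (distance_to_opponent, ammo)
ACTIONS = ["advance", "hold", "fire", "retreat"]  # hypothetical action set


def preset_rule(state: State) -> Optional[str]:
    """Hypothetical preset action rule: return an action if the rule applies, else None."""
    _distance, ammo = state
    if ammo == 0:
        return "retreat"  # firing with no ammo would be an invalid action
    return None


def rl_policy(state: State, rng: random.Random) -> str:
    """Stand-in for the preset reinforcement learning algorithm (random exploration here)."""
    return rng.choice(ACTIONS)


def select_actions(states: List[State], rng: random.Random) -> List[Tuple[str, str]]:
    """For each red-party agent, use the rule if it applies, otherwise the RL policy."""
    decisions = []
    for state in states:
        action = preset_rule(state)
        if action is not None:
            decisions.append((action, "rule"))
        else:
            decisions.append((rl_policy(state, rng), "rl"))
    return decisions


if __name__ == "__main__":
    rng = random.Random(0)
    red_states = [(3, 0), (3, 5)]
    decisions = select_actions(red_states, rng)
    # Training samples pair each state with the chosen action; the "other preset
    # parameters" mentioned in the abstract are not specified, so they are omitted.
    samples = list(zip(red_states, decisions))
    print(samples)
```

Tagging each decision with its source ("rule" or "rl") is a design choice made for this sketch only, so it is easy to see how often the rule short-circuits exploration.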

Description

Technical field

[0001] The invention relates to the technical field of machine learning, and in particular to a reinforcement learning training optimization method and device for multi-agent confrontation.

Background

[0002] Artificial intelligence is a technical science that researches and develops theories, methods, technologies, and applications for simulating and extending human intelligence. One of the main goals of artificial intelligence research is to have intelligent agents simulate human decision-making, so that they can handle complex tasks that otherwise require human intelligence. The limited ability of a single agent to cope with complex tasks has motivated the concept of multi-agent systems. A multi-agent system consists of multiple agents that make independent decisions and interact with each other; they share the same environment and have their own perception and execution mechanisms. At present, multi-agent systems have become a...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/62, G06N3/00

CPC: G06N3/008, G06F18/214

Inventor: 徐新海, 李渊, 戴华东, 王之元, 张冠宇, 宋菲菲

Owner: NAT INNOVATION INST OF DEFENSE TECH PLA ACAD OF MILITARY SCI
