Multi-agent reinforcement learning method and system based on population training

A multi-agent reinforcement learning technology, applied in the field of agent training, that addresses heavy dependence on training data, small amounts of training data, and the resulting low winning rate of trained agents.

Active Publication Date: 2021-03-26

NO 15 INST OF CHINA ELECTRONICS TECH GRP

Cites: 3; Cited by: 2

Problems solved by technology

[0004] At present, there are relatively few studies on intelligent command and control training systems for robot confrontation scenarios, and the existing ones have significant limitations. The main problems are heavy reliance on training data, a small amount of available training data, and a low winning rate for agents trained on such small datasets.

Embodiment Construction

[0043] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0044] The purpose of the present invention is to provide a multi-agent reinforcement learning method and system based on population training, using the StarCraft platform to train an agent capable of simulating the combat command of an unmanned system.

[0045] Many large-scale and complex dynamic environmental problems in the real world, such as road traffic systems, weather forecasts, economic forecasts, smart city management, and military decision-making, a...

Abstract

The invention relates to a multi-agent reinforcement learning method and system based on population training. The method comprises: obtaining a first training set from game videos; training a multi-layer fully convolutional LSTM network with the first training set to obtain a first agent; having the first agent play against itself (self-play) and obtaining a first population after a set time period; selecting a second agent, a first agent set, and a second agent set from the first population; having the first agent fight the three selected groups of agents at the same time, storing and updating the first population until any one of the three selected groups goes wrong, so as to obtain a second population; selecting a replacement agent from the second population to replace the battle agent and continue fighting the first agent, storing and updating the second population to obtain a third population; and, once the number of agents in the third population reaches a preset value, outputting the first agent. The invention makes it possible to train an agent capable of simulating unmanned system combat command and control.
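The population loop described in the abstract can be illustrated with a minimal toy sketch. It is not the patented implementation. Everything in it is an assumption made for illustration: the `Agent` class with a single scalar `skill`, the logistic win model, the fixed `improve` step, the opponent selection rule, and the reading of an opponent "going wrong" as losing a match. Each of the three opponent groups is reduced to a single agent, and the video-based supervised training and the LSTM network are replaced by a starting skill of 0.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Toy stand-in for a trained policy: one scalar 'skill' value."""
    name: str
    skill: float


def main_wins(main: Agent, opponent: Agent, rng: random.Random) -> bool:
    """Sample a match result; higher skill wins more often (logistic model)."""
    p_main = 1.0 / (1.0 + math.exp(opponent.skill - main.skill))
    return rng.random() < p_main


def improve(agent: Agent, won: bool) -> Agent:
    """Hypothetical learning step: a small gain per match, larger after a loss."""
    return Agent(agent.name, agent.skill + (0.05 if won else 0.10))


def snapshot(agent: Agent, population: list) -> None:
    """Store a frozen copy of the current main agent in the population."""
    population.append(Agent(f"snapshot-{len(population)}", agent.skill))


def train(target_size: int = 20, self_play_rounds: int = 5, seed: int = 0):
    rng = random.Random(seed)
    main = Agent("main", 0.0)  # stands in for the supervised first agent
    population: list = []

    # Self-play phase: the main agent plays a copy of itself and is snapshotted.
    for _ in range(self_play_rounds):
        main = improve(main, main_wins(main, main, rng))
        snapshot(main, population)

    # Select three opponent slots from the population (selection rule assumed).
    opponents = [max(population, key=lambda a: a.skill),
                 rng.choice(population), rng.choice(population)]

    # League phase: fight all three slots each round, replacing any that fails.
    while len(population) < target_size:
        for i, opp in enumerate(opponents):
            won = main_wins(main, opp, rng)
            main = improve(main, won)
            if won:  # assumed meaning of an opponent "going wrong"
                opponents[i] = rng.choice(population)
        snapshot(main, population)
    return main, population
```

Calling `train(target_size=20)` returns the main agent together with a population of 20 snapshots. In a real system the scalar skill would be replaced by network weights, and each match by a StarCraft episode.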

Description

Technical field

[0001] The invention relates to the field of agent training, in particular to a multi-agent reinforcement learning method and system based on population training.

Background art

[0002] A milestone in the field of intelligent agents in recent years was AlphaGo, an agent based on reinforcement learning, defeating top human Go players, which led deep reinforcement learning to be regarded as the most likely path to artificial intelligence. The main technique used by AlphaGo is self-play. Self-play reduces dependence on the size of the dataset and can even allow an agent to surpass human experts, which is almost impossible for ordinary deep learning.

[0003] In the coming era of intelligence, a large number of robots that can replace human pilots in reconnaissance, strike, and confrontation tasks will be used in military warfare. This puts forward high requirements for the accuracy, timeliness, and effectiveness of the command and contr...

Application Information

IPC(8): G06N3/04, G06N3/08, G06N3/00, A63F13/822

CPC: G06N3/08, G06N3/006, A63F13/822, A63F2300/807, G06N3/044, G06N3/045

Inventor: 王滨, 杨军, 原鑫, 钟晨

Owner NO 15 INST OF CHINA ELECTRONICS TECH GRP
