Target detection and distribution method and device based on multi-agent reinforcement learning

Multi-agent reinforcement learning technology, applied in the field of simulation, that addresses the slow convergence of combat behavior model optimization

Pending Publication Date: 2020-12-25
中国人民解放军军事科学院评估论证研究中心 +1

AI Technical Summary

Problems solved by technology

[0005] In view of this, the purpose of the present invention is to overcome the deficiencies of the prior art and provide a target detection and assignment method and device based on multi-agent reinforcement learning, in order to solve the prior-art problem that optimization of the combat behavior model in the war game deduction system converges slowly.




Embodiment Construction

[0051] In order to make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be described in detail below. Obviously, the described embodiments are only some, but not all, embodiments of the present invention. Based on the embodiments of the present invention, all other implementations obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.

[0052] The invention draws on MADDPG (Multi-Agent Deep Deterministic Policy Gradient), a multi-agent algorithm that makes a series of improvements to the policy gradient algorithm so that it is suitable for complex multi-agent scenarios that traditional algorithms cannot handle. The MADDPG algorithm has the following three characteristics:

[0053] 1. The learned optimal policy needs only local information to give the optimal action when it is applied.

[0054] 2. There is no n...
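Characteristic 1 can be illustrated with a minimal sketch. It assumes the structure of the published MADDPG algorithm (one actor per agent that sees only that agent's own observation, plus a critic that sees every agent's observations and actions during training only). The class names `Actor` and `CentralCritic`, the linear weights, and the dimensions are hypothetical and are not taken from the patent text.

```python
import math
import random


class Actor:
    """Deterministic policy: maps ONE agent's local observation to an action."""

    def __init__(self, obs_dim, act_dim, rng):
        # Hypothetical linear policy; a real actor would be a neural network.
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
                  for _ in range(act_dim)]

    def act(self, local_obs):
        # tanh keeps every action component inside [-1, 1].
        return [math.tanh(sum(w * o for w, o in zip(row, local_obs)))
                for row in self.w]


class CentralCritic:
    """Q(all observations, all actions) -> scalar. Needed only while training."""

    def __init__(self, joint_dim, rng):
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(joint_dim)]

    def q_value(self, all_obs, all_actions):
        joint = [x for obs in all_obs for x in obs]
        joint += [a for act in all_actions for a in act]
        return sum(w * x for w, x in zip(self.w, joint))


rng = random.Random(0)
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2  # illustrative sizes only

actors = [Actor(OBS_DIM, ACT_DIM, rng) for _ in range(N_AGENTS)]
critics = [CentralCritic(N_AGENTS * (OBS_DIM + ACT_DIM), rng)
           for _ in range(N_AGENTS)]

observations = [[rng.random() for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]

# Execution: each agent acts from its own observation only.
actions = [actor.act(obs) for actor, obs in zip(actors, observations)]

# Training: each critic scores the joint observation-action vector.
q_values = [critic.q_value(observations, actions) for critic in critics]
```

In this sketch, changing agent 1's observation leaves agent 0's action unchanged, which is the sense in which the deployed policy relies on local information alone; the critics can be discarded once training ends.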


Abstract

The invention relates to a target detection and distribution method and device based on multi-agent reinforcement learning. The method comprises the following steps: constructing a combat behavior model and a reinforcement learning training environment; training the combat behavior model in the reinforcement learning training environment until the model converges, to obtain an artificial intelligence behavior model; and training the artificial intelligence behavior model using a combat simulation engine, and outputting an optimized model. The reinforcement learning algorithm MADDPG is integrated into the war game deduction system, simulation environments ranging from simple to complex are constructed, and the convergence rate of reinforcement learning is improved, which effectively solves the problem of slow convergence of intelligent agents in the war game deduction system.
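The two-stage schedule in the abstract (train in a lightweight environment until convergence, then continue in the combat simulation engine) can be sketched with a deliberately tiny stand-in. Everything here is an assumption made for illustration: the "model" is a single number, "training" is gradient descent on a quadratic objective, and the names `train`, `SIMPLE_ENV_OPTIMUM` and `ENGINE_OPTIMUM` are hypothetical. It is not the patent's training procedure; it only shows why a model that has already converged in a similar, simpler environment may need fewer updates in the expensive one.

```python
def train(param, env_optimum, lr=0.2, tol=1e-3, max_steps=1000):
    """Toy stand-in for RL training: gradient descent on (param - env_optimum)**2.

    Returns the trained parameter and the number of update steps used.
    """
    for step in range(1, max_steps + 1):
        param -= lr * 2.0 * (param - env_optimum)
        if abs(param - env_optimum) < tol:  # convergence check
            return param, step
    return param, max_steps


SIMPLE_ENV_OPTIMUM = 1.0  # hypothetical: lightweight RL training environment
ENGINE_OPTIMUM = 1.2      # hypothetical: full combat simulation engine

# Stage 1: train in the simplified environment until convergence.
pretrained, stage1_steps = train(0.0, SIMPLE_ENV_OPTIMUM)

# Stage 2: continue training the pretrained model in the engine.
optimized, warm_steps = train(pretrained, ENGINE_OPTIMUM)

# Baseline for comparison: train in the engine from scratch.
_, cold_steps = train(0.0, ENGINE_OPTIMUM)

print(f"engine steps: warm start {warm_steps}, cold start {cold_steps}")
```

The comparison only holds in this toy because the two optima are close; how much a real war game model benefits would depend on how well the simplified environment matches the engine.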

Description

Technical field

[0001] The invention belongs to the technical field of simulation, and in particular relates to a target detection and allocation method and device based on multi-agent reinforcement learning.

Background

[0002] With the development of artificial intelligence, the era of relying on humans alone to research tactics and formulate military plans is gradually passing. In the past, when computers were applied to war game simulation, people relied on differential equations and war theory to simulate the process of war, which greatly improved the combat level of the army. Today, the application of artificial intelligence in wargaming will play an even more important role. Multi-agent-based modeling has advantages over traditional modeling methods in its ability to describe complex systems and to model behavior in dynamic environments. The emergence of multi-agent systems provides a new platform for the furt...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F30/27, G06F119/14
CPC: G06F30/27, G06F2119/14
Inventor: 伊山, 魏晓龙, 鹿涛, 黄谦, 齐智敏, 蔡春晓, 赵昊, 张帅, 亢原平
Owner: 中国人民解放军军事科学院评估论证研究中心