Reinforcement learning training optimization method and device for multi-agent confrontation

A multi-agent reinforcement learning technology, applied in the field of machine learning, that addresses low training efficiency in the prior art and achieves more efficient training.
CN110991545A · Active · Publication Date: 2020-04-10 · NAT INNOVATION INST OF DEFENSE TECH PLA ACAD OF MILITARY SCI

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: NAT INNOVATION INST OF DEFENSE TECH PLA ACAD OF MILITARY SCI
Publication Date: 2020-04-10


Abstract

The embodiments of the invention provide a reinforcement learning training optimization method and device for multi-agent confrontation. The method includes a rule-coupled algorithm training process: for each training step, an initial first state result set of the red party's multi-agent is acquired; if that state result set satisfies a preset action rule, a decision-making behavior result set is obtained according to the preset action rule; otherwise, the decision-making behavior result set is obtained according to a preset reinforcement learning algorithm. Reinforcement learning training is then performed on the red party's multi-agent using training samples formed from the decision-making behavior result set and other preset parameters. Throughout training, the preset action rule guides the agents' actions and avoids invalid actions, which addresses the excessive invalid exploration and slow training of the prior art and significantly improves training efficiency.
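
The rule-coupled decision step described in the abstract might be sketched roughly as follows. This is only an illustrative sketch, not the patent's implementation: the names `choose_actions`, `build_samples`, `retreat_rule` and `placeholder_policy`, the tuple state encoding, and the choice of reward and next state as the "other preset parameters" are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

State = Tuple[int, ...]  # hypothetical per-agent state encoding
Action = int

# A rule returns an action when it applies to a state, or None when it does not.
Rule = Callable[[State], Optional[Action]]
# A policy stands in for the preset reinforcement learning algorithm.
Policy = Callable[[State], Action]


def choose_actions(
    states: Sequence[State], rule: Rule, policy: Policy
) -> List[Tuple[Action, bool]]:
    """For each agent, use the preset rule if it applies, else the learned policy.

    Returns (action, came_from_rule) pairs, one per agent.
    """
    decisions = []
    for state in states:
        ruled_action = rule(state)
        if ruled_action is not None:
            decisions.append((ruled_action, True))
        else:
            decisions.append((policy(state), False))
    return decisions


def build_samples(states, decisions, rewards, next_states):
    """Pack per-agent transitions (state, action, reward, next_state).

    Reward and next state are assumed stand-ins for the abstract's
    "other preset parameters".
    """
    return [
        (s, action, r, s_next)
        for s, (action, _), r, s_next in zip(states, decisions, rewards, next_states)
    ]


# Hypothetical rule: retreat (action 0) when the first state entry is below 2.
def retreat_rule(state: State) -> Optional[Action]:
    return 0 if state[0] < 2 else None


# Placeholder for an RL policy's output; always picks action 1.
def placeholder_policy(state: State) -> Action:
    return 1


states = [(1, 5), (4, 3)]
decisions = choose_actions(states, retreat_rule, placeholder_policy)
samples = build_samples(
    states, decisions, rewards=[0.0, 1.0], next_states=[(1, 4), (4, 2)]
)
```

In this toy run the first agent's state triggers the rule and the second falls through to the policy; both transitions end up in the same sample list, which a reinforcement learning update would then consume.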

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to a multi-agent confrontation-oriented reinforcement learning training optimization method and device.

Background technique

[0002] Artificial intelligence is a technical science that researches and develops theories, methods, technologies and applications for simulating and extending human intelligence. One of its main research goals is to have intelligent agents simulate human decision-making so that they can handle complex tasks that otherwise require human intelligence. Because a single agent has limited capacity for such tasks, the concept of the multi-agent system emerged. A multi-agent system consists of multiple agents that make independent decisions and interact with each other; the agents share the same environment and each has perception and execution mechanisms. At present, multi-agent systems have become a...

Claims
