Distributed proximal policy optimization method based on cognitive behavior knowledge and application thereof

An optimization method using distributed technology, applied in the field of deep reinforcement learning, which addresses problems such as high sampling complexity limiting the application of reinforcement learning algorithms.

Active Publication Date: 2021-06-04
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

Reinforcement learning algorithms require a large number of samples and continuously optimize the Agent's strategy through trial and error.
This huge sampling complexity limits the application of reinforcement learning algorithms in practical problems.



Examples


Embodiment Construction

[0085] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0086] The distributed proximal policy optimization method based on cognitive behavioral knowledge proposed by the present invention comprises the following steps:

[0087] S1. Use cognitive behavioral knowledge to establish a cognitive behavioral model of the Agent, introduce the cognitive behavioral model into deep reinforcement learning, and construct a deep reinforcement learning framework based on cognitive behavioral knowledge. The deep reinforcement learning framework based on the cognitive behavioral model is shown in Figure 1. The process of interaction between the GOAL-based cognitive behavioral model and the environment is shown in Figure 2. The invention adopts a unified agent modeling method and expresses elements such as knowledge, beliefs, intentions, and rules in a form that can be unders...
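The excerpt does not show how the cognitive behavioral model is actually encoded, so the following is only a minimal sketch of the general idea in step S1: beliefs are updated from observations, and condition-action rules turn those beliefs into a suggested action. The class names, the belief keys (`threat`, `distance`), and the action labels are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Beliefs = Dict[str, float]


@dataclass
class Rule:
    """Hypothetical condition-action rule over the agent's beliefs."""
    name: str
    condition: Callable[[Beliefs], bool]
    action: str


@dataclass
class CognitiveBehaviorModel:
    """Illustrative rule-based model; not the patent's actual representation."""
    rules: List[Rule] = field(default_factory=list)
    beliefs: Beliefs = field(default_factory=dict)

    def perceive(self, observation: Beliefs) -> None:
        # Merge the latest observation into the current beliefs.
        self.beliefs.update(observation)

    def suggest(self) -> Optional[str]:
        # Return the action of the first rule whose condition holds,
        # or None when no rule applies.
        for rule in self.rules:
            if rule.condition(self.beliefs):
                return rule.action
        return None


# Assumed example rules, listed in priority order.
model = CognitiveBehaviorModel(rules=[
    Rule("evade", lambda b: b.get("threat", 0.0) > 0.8, "turn_away"),
    Rule("pursue", lambda b: b.get("distance", float("inf")) < 5.0, "close_in"),
])
model.perceive({"threat": 0.9, "distance": 3.0})
suggestion = model.suggest()
```

In a framework of the kind described, a suggestion like this would be one input to the learning agent rather than a final decision; how the patent combines the two is not visible in this excerpt.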


Abstract

The invention discloses a distributed proximal policy optimization method based on cognitive behavior knowledge and an application thereof. The method comprises the following steps: using cognitive behavior knowledge to establish a cognitive behavior model of an Agent, introducing the cognitive behavior model into deep reinforcement learning, and constructing a deep reinforcement learning framework based on cognitive behavior knowledge; based on this framework, providing a distributed proximal policy optimization algorithm based on cognitive behavior knowledge; and quantitatively designing how the cognitive behavior model guides the Agent's policy updates, so that the Agent learns continuously on the basis of cognitive behavior knowledge. The method makes effective use of cognitive behavior knowledge and updates the policy on that basis, which improves the learning efficiency of the Agent.
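The abstract does not give the objective that the algorithm optimizes or the form of the guidance term. For orientation only, the sketch below computes the standard clipped surrogate objective of proximal policy optimization (PPO) on a batch of samples. It is not the patent's objective: the cognitive-behavior guidance and the distributed sample collection are omitted, and the function name and the default `clip_eps=0.2` are assumptions.

```python
import math
from typing import Sequence


def ppo_clip_objective(
    new_logp: Sequence[float],
    old_logp: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,
) -> float:
    """Mean standard PPO clipped surrogate (to be maximized).

    new_logp / old_logp are log-probabilities of the sampled actions under
    the current and the data-collecting policy; advantages are the
    corresponding advantage estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (lower) bound of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# A ratio of 2 with positive advantage is capped at 1 + clip_eps.
capped = ppo_clip_objective([math.log(2.0)], [0.0], [1.0])
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is why PPO can reuse each batch of samples for several gradient steps.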

Description

Technical field

[0001] The invention relates to the technical field of deep reinforcement learning, in particular to a distributed proximal policy optimization method based on cognitive behavioral knowledge and its application in air combat maneuver decision-making.

Background

[0002] In recent years, deep reinforcement learning has found widespread application in video games, traffic light control, robotics, and more. However, reinforcement learning algorithms require a large number of samples and continuously optimize the agent's policy through trial and error. This huge sampling complexity limits the application of reinforcement learning algorithms to practical problems, and using existing cognitive behavioral knowledge to accelerate agent policy learning is an effective way to address this.

[0003] Human beings and learning agents differ considerably in cognition level, cognition mode, and behavior mode. It is very difficu...


Application Information

IPC(8): G06F30/20, G06F30/15, G06N20/00, G06N5/00
CPC: G06F30/20, G06F30/15, G06N20/00, G06N5/00, Y02T10/40
Inventor: 黄健, 陈浩, 李嘉祥, 刘权, 龚建兴, 韩润海
Owner NAT UNIV OF DEFENSE TECH