Behavior imitation training method for air intelligent game

A behavior imitation training method, applied in the field of machine learning, that addresses the poor robustness, timeliness and accuracy of decision support, the limited number of sensing parameters or target objects, and the low level of intelligent decision-making. It improves the reward function design process and the random exploration process during initial training, addresses slow convergence or non-convergence, and solves the cold-start problem of training.

Active Publication Date: 2021-08-06
HANGZHOU EBOYLAMP ELECTRONICS CO LTD
Cites: 3; Cited by: 8

AI Technical Summary

Problems solved by technology

[0003] However, existing air decision-support systems are relatively backward: the number of sensing parameters or target objects that can be controlled simultaneously is limited, and the robustness, timeliness and accuracy of decision support are poor. The training model is prone to convergence difficulties, so training a practical agent takes a long time, and an effective decision-making agent may not be obtained at all.
In addition, because rewards in aerial intelligent self-play confrontation are sparse and reward design is cumbersome and inefficient, the agent's decision-making level is low and training takes a long time. Reward design must also be customized by hand for each scenario, which incurs high labor costs and poor reusability, and algorithm training suffers from a cold-start problem.



Examples


Embodiment Construction

[0083] The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments of the application. Obviously, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.

[0084] It should be noted that, unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the application. The terms used herein in the description of the application are only for the purpose of describing specific embodiments, and are not intended to limit the application.

[0085] As shown in Figures 1-3, a behavior imitation training method for aerial intelligent games, including th...



Abstract

The invention discloses a behavior imitation training method for an air intelligent game. The method comprises the following steps:
  • S1, constructing an agent game decision model;
  • S2, determining the environment state and the action space, and shaping a continuous, non-sparse reward function for each action;
  • S3, carrying out an air game in the model, and executing the following steps:
  • S31, generating the next environment state according to the executed action, obtaining a reward, and iterating in a loop to maximize the accumulated reward;
  • S32, performing inverse reinforcement learning based on expert behaviors to obtain a target reward function;
  • S33, calculating the similarity between each agent behavior and the expert behavior;
  • S34, obtaining a comprehensive reward;
  • S4, training the agent game decision model.
The method improves the traditional, inefficient reward function design process and the random exploration process of model training, so that the reward function is interpretable and open to human intervention, the agent's decision level and convergence speed are improved, and the cold-start problem of model training is solved.
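
Steps S2 and S33 to S34 can be illustrated with a minimal Python sketch. The abstract does not state how the rewards are combined or how similarity is measured, so everything below is an assumption made for illustration: the distance-based shaping in `shaped_reward`, the use of cosine similarity between action vectors, the weighted sum in `comprehensive_reward`, and all weights and parameter names are hypothetical and are not taken from the patent.

```python
import math


def shaped_reward(distance, max_distance=100.0):
    """Dense (non-sparse) reward in [0, 1]: larger when the agent is closer.

    Assumption: a distance-to-target shaping term; the patent does not
    specify the actual shaping function.
    """
    clipped = min(max(distance, 0.0), max_distance)
    return 1.0 - clipped / max_distance


def cosine_similarity(agent_action, expert_action):
    """Similarity between agent and expert action vectors (assumed metric)."""
    dot = sum(a * e for a, e in zip(agent_action, expert_action))
    norm_a = math.sqrt(sum(a * a for a in agent_action))
    norm_e = math.sqrt(sum(e * e for e in expert_action))
    if norm_a == 0.0 or norm_e == 0.0:
        return 0.0  # undefined for zero vectors; treat as no similarity
    return dot / (norm_a * norm_e)


def comprehensive_reward(env_reward, irl_reward, similarity,
                         w_env=0.5, w_irl=0.3, w_sim=0.2):
    """Weighted sum of the three signals (hypothetical combination rule).

    env_reward: shaped environment reward (S2/S31)
    irl_reward: value of the reward function learned from experts (S32)
    similarity: agent-vs-expert behavior similarity (S33)
    """
    return w_env * env_reward + w_irl * irl_reward + w_sim * similarity


if __name__ == "__main__":
    r_env = shaped_reward(distance=25.0)
    sim = cosine_similarity([1.0, 0.0, 0.5], [0.9, 0.1, 0.4])
    # irl_reward would come from the learned target reward function;
    # a fixed placeholder value is used here.
    print(comprehensive_reward(r_env, irl_reward=0.6, similarity=sim))
```

Under these assumptions, raising `w_sim` pulls the agent toward expert behavior early in training, which is one way the cold-start issue described above could be eased; the weights would need tuning for any real scenario.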

Description

Technical field

[0001] The invention belongs to the technical field of machine learning, and in particular relates to a behavior imitation training method for aerial intelligent games.

Background technique

[0002] Beyond obtaining the most accurate intelligence from various detection systems anytime and anywhere to achieve an information advantage, future air games will depend even more on machine learning, artificial intelligence, cloud computing and other technologies to achieve a decision-making advantage. To better exploit information for a decision-making advantage, exert game effectiveness, and ensure air superiority, an air decision-support system is required in addition to the excellent flying skills of the pilots and the good command skills of the commanders. As an artificial intelligence auxiliary system, the air decision-support system can provide decision-making reference in a highly dynamic and complex co...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F30/27, G06N20/00
CPC: G06F30/27, G06N20/00
Inventor: 包骐豪, 朱燎原, 夏少杰, 瞿崇晓
Owner: HANGZHOU EBOYLAMP ELECTRONICS CO LTD