
Deep reinforcement learning training acceleration method for collision avoidance of multiple unmanned aerial vehicles

A technology combining reinforcement learning and collision avoidance, applied in the field of drones, which can solve the problems of cumbersome training and a low degree of automation, and achieve the effects of an accelerated training process, a good control strategy, and a simple principle.

Active Publication Date: 2021-12-03
NAT UNIV OF DEFENSE TECH
Cites: 9 | Cited by: 4

AI Technical Summary

Problems solved by technology

Methods based on transfer learning are mostly used for visual perception tasks, such as target recognition; for sensors such as lidar, however, the trained network model and parameters cannot be transferred and applied directly.
[0006] (2) The degree of automation is not high.
Staged training decomposes a task into multiple stages that are trained sequentially, which is cumbersome.
Moreover, in staged training, later stages of training may cause the policy learned in the pre-training stage to be forgotten.
In addition, most existing human-guided training requires people to participate in the training process as teachers, which has a low degree of automation and consumes a great deal of developers' time and energy.

Method used



Examples


Embodiment Construction

[0049] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0050] As shown in Figures 1 and 2, the deep reinforcement learning training acceleration method for multi-UAV collision avoidance of the present invention is a deep reinforcement learning method assisted by human experience, and it includes:

[0051] Step S1: Formally model the fully distributed UAV swarm obstacle avoidance problem based on a partially observable Markov decision process;

[0052] Step S2: Design a deep neural network that constructs the mapping from observation input to action output, together with a network update algorithm;

[0053] Step S3: Design a scheme for incorporating human experience to accelerate training (a minimal sketch of one such scheme follows).
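The text here only states that human experience is fused into training to accelerate it; one common way such fusion is realised is to pre-load the replay buffer with human-demonstration transitions so that early updates draw on expert data. The sketch below illustrates that idea; the class name, transition format and capacity are illustrative assumptions, not taken from the patent.

import random

class DemoSeededReplayBuffer:
    """Replay buffer pre-filled with human-demonstration transitions before
    any trial-and-error experience is collected (illustrative sketch)."""

    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self.buffer = []

    def _append(self, transition):
        self.buffer.append(transition)
        if len(self.buffer) > self.capacity:
            self.buffer.pop(0)  # drop the oldest transition

    def add_demonstrations(self, demo_transitions):
        # demo_transitions: iterable of (obs, action, reward, next_obs, done)
        # tuples recorded from a human-controlled obstacle-avoidance flight.
        for t in demo_transitions:
            self._append(t)

    def add(self, obs, action, reward, next_obs, done):
        # Transitions gathered by the learning agent itself.
        self._append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling; early batches are dominated by demonstrations,
        # which is what accelerates the first phase of training.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))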

[0054] In a specific application example, in step S1, the formal modeling process includes:

[0055] The problem of cooperative obstacle avoidance as multiple UAVs travel to their target locations...
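The formal modeling passage is truncated above, so the following is only a hedged sketch of how the per-UAV observation, action and reward of a partially observable formulation might be laid out; the field names, dimensions and reward weights are assumptions, not values from the patent.

import numpy as np

N_BEAMS = 36  # assumed number of lidar beams per UAV

def make_observation(lidar_ranges, goal_vec, own_velocity):
    """Each UAV sees only its own lidar returns, relative goal direction and
    velocity, not the global swarm state (partial observability)."""
    return np.concatenate([lidar_ranges, goal_vec, own_velocity])

def step_reward(prev_dist_to_goal, dist_to_goal, collided,
                w_progress=1.0, w_collision=10.0):
    """Reward shaping: progress toward the goal minus a collision penalty."""
    r = w_progress * (prev_dist_to_goal - dist_to_goal)
    if collided:
        r -= w_collision
    return r

# Example: a 36-beam lidar scan, a 2-D goal vector and a 2-D velocity give a
# 40-dimensional observation for the policy network.
obs = make_observation(np.ones(N_BEAMS), np.array([5.0, 2.0]), np.zeros(2))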



Abstract

The invention discloses a deep reinforcement learning training acceleration method for collision avoidance of multiple unmanned aerial vehicles. The method comprises the steps of: S1, formally modeling the completely distributed unmanned aerial vehicle cluster obstacle avoidance problem based on a Markov decision process; S2, constructing a deep neural network and building the observation input-action output mapping and the network update method; and S3, fusing human experience to accelerate training. The method has the advantages of a simple principle, a high degree of training intelligence, and an accelerated deep reinforcement learning training process.
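Step S2 above only says that a deep neural network provides the observation input-action output mapping; the sketch below is a minimal two-layer MLP rendering of such a mapping in PyTorch. The hidden size, activation choices and bounded output are assumptions for illustration, not the network described in the patent.

import torch
import torch.nn as nn

class ObsToActionNet(nn.Module):
    """Minimal observation-input to action-output mapping (cf. step S2)."""

    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # bounded velocity command
        )

    def forward(self, obs):
        return self.net(obs)

# Example: a 40-dimensional observation mapped to a 2-D velocity command.
policy = ObsToActionNet(obs_dim=40, act_dim=2)
action = policy(torch.zeros(1, 40))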

Description

Technical field

[0001] The invention relates mainly to the technical field of unmanned aerial vehicles, and in particular to a deep reinforcement learning training acceleration method for collision avoidance of multiple unmanned aerial vehicles.

Background technique

[0002] As the application fields of UAVs gradually expand, the demand for UAVs that perform tasks autonomously keeps growing. Autonomous positioning, environment perception, path planning and collision avoidance are the key technologies that enable UAVs to perform tasks autonomously. Compared with a single UAV, multiple UAVs can carry more mission payload, cover a larger detection range, and perform a wider variety of tasks.

[0003] Deep reinforcement learning not only has the ability of deep learning to understand complex high-dimensional data, but also has the general learning ability of reinforcement learning, which learns by itself through a trial-and-error mechanism. However, deep reinforcement learning mostly faces...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G05D1/10
CPC: G05D1/104; Y02T10/40
Inventor 刘志宏王祥科王冠政李杰相晓嘉丛一睿陈浩周文宏杨凌杰胡新雨
Owner NAT UNIV OF DEFENSE TECH