Energy efficiency-oriented multi-agent deep reinforcement learning optimization method for unmanned aerial vehicle group

A technology of energy efficiency and reinforcement learning, applied in wireless communication, power management, electrical components, etc., which can solve problems such as the algorithm failing to converge or converging too slowly, and achieve the effects of enhancing dynamic adaptability, prolonging the life cycle, and improving energy efficiency

Active Publication Date: 2020-04-03
YANGTZE NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

[0005] However, due to the dynamics of the UAV swarm, energy-efficiency optimization decision-making for UAV swarm communication faces severe challenges from the large decision space. With traditional reinforcement learning methods, the large decision space causes the algorithm either to fail to converge or to converge too slowly.

Embodiment Construction

[0039] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0040] As shown in Figure 3, the present invention discloses an energy efficiency-oriented multi-agent deep reinforcement learning optimization method for a UAV swarm, comprising the following steps:

[0041] S1. Obtain the current status information of the UAV cluster;

[0042] S2. Obtain the historical information of the UAV cluster, where the historical information includes historical status information and historical decision information;

[0043] For each time slot, the historical information of multiple previous time slots is collected as the input of the neural network for learning, so as to obtain the decision information of the current time slot.

[0044] S3. Using the improved DQN deep reinforcement learning method based on Q-learning, the historical information is used to train and update the neural network of each agent, so as to obtain the channel selection and power selection decisions of each agent in the UAV cluster.
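
As a loose illustration of steps S1–S3 (and of paragraph [0043]'s window of previous time slots), the sketch below stacks the current state with the state/decision pairs of the preceding slots into a fixed-length network input and then picks a joint channel/power action epsilon-greedily from the Q-values. The window length H, the sizes of the channel and power sets, and the epsilon-greedy rule are illustrative assumptions, not details taken from the patent.

```python
# Minimal per-agent sketch of the history-window input (S2/[0043]) and the
# joint channel/power action choice (S3). All sizes below are assumptions.
from collections import deque
import numpy as np

H = 5                         # number of previous time slots fed to the network (assumed)
N_CHANNELS, N_POWERS = 4, 3   # discrete channel / transmit-power choices (assumed)
N_ACTIONS = N_CHANNELS * N_POWERS

history = deque(maxlen=H)     # stores (state_vector, action_index) for past slots

def build_input(current_state):
    """Concatenate the current 1-D state vector with stored past state/decision pairs (S1 + S2)."""
    flat = [current_state]
    for past_state, past_action in history:
        flat.append(past_state)
        flat.append(np.eye(N_ACTIONS)[past_action])   # one-hot encoding of the past decision
    # zero-pad so the input length is fixed even before the window is full
    missing = H - len(history)
    flat.append(np.zeros(missing * (len(current_state) + N_ACTIONS)))
    return np.concatenate(flat)

def select_action(q_values, epsilon=0.1):
    """Epsilon-greedy choice over the joint (channel, power) action space."""
    if np.random.rand() < epsilon:
        a = np.random.randint(N_ACTIONS)
    else:
        a = int(np.argmax(q_values))
    return a, divmod(a, N_POWERS)   # action index and its (channel, power) decomposition

# After acting in a slot, record it so later slots can use it as history:
# history.append((state_vector, action_index))
```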


Abstract

The invention discloses an energy efficiency-oriented multi-agent deep reinforcement learning optimization method for an unmanned aerial vehicle (UAV) swarm. The method adopts an improved DQN deep reinforcement learning method based on Q-learning: the neural network of each agent is trained and updated with historical information of the UAV cluster to obtain the channel selection and power selection decisions of each agent, a short-time experience replay mechanism is used to train the networks during training, and the optimization target of each neural network is to maximize the energy efficiency value of the corresponding agent. By adopting a distributed multi-agent deep reinforcement learning method and training the neural networks with a short-time experience replay mechanism, the invention mines the patterns of change contained in the dynamic network environment and overcomes the problem that traditional reinforcement learning cannot obtain a convergent solution in a large state space. Multi-agent distributed cooperative learning is achieved, the energy efficiency of UAV cluster communication is improved, the life cycle of the UAV cluster is prolonged, and the dynamic adaptive capacity of the UAV cluster communication network is enhanced.
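
A hedged sketch of the training step described above is given below: each agent keeps its own Q-network and a deliberately short replay buffer, so that transitions gathered under an outdated network state are discarded quickly, and the network is updated toward the standard Bellman target. The layer sizes, buffer capacity, batch size, discount factor and the use of PyTorch are illustrative assumptions; the reward r stands for the agent's energy-efficiency value (e.g. delivered bits per unit of consumed energy), as described in the abstract.

```python
# Illustrative per-agent DQN update with a short-time experience replay buffer.
# Hyperparameters and network shape are assumptions, not values from the patent.
import random
from collections import deque
import torch
import torch.nn as nn

class AgentDQN(nn.Module):
    """Small Q-network mapping the history-augmented state to joint channel/power Q-values."""
    def __init__(self, input_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(input_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, x):
        return self.net(x)

class ShortTimeReplay:
    """Keeps only recent transitions so training tracks the changing UAV network."""
    def __init__(self, capacity=200):                 # short buffer (assumed size)
        self.buf = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):                  # s, s_next: 1-D float tensors
        self.buf.append((s, a, r, s_next))

    def sample(self, batch_size=32):
        batch = random.sample(list(self.buf), min(batch_size, len(self.buf)))
        s, a, r, s2 = zip(*batch)
        return (torch.stack(s), torch.tensor(a),
                torch.tensor(r, dtype=torch.float32), torch.stack(s2))

def train_step(q_net, target_net, replay, optimizer, gamma=0.9):
    """One gradient step toward the Bellman target; the reward is the energy-efficiency value."""
    s, a, r, s2 = replay.sample()
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)            # Q(s, a) for taken actions
    with torch.no_grad():
        target = r + gamma * target_net(s2).max(dim=1).values    # bootstrapped target
    loss = nn.functional.mse_loss(q, target)                     # TD error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Typical wiring (assumed): optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3),
# with target_net periodically synchronised from q_net.
```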

Description

Technical field

[0001] The invention relates to the technical field of UAV cluster communication network access, in particular to an energy efficiency-oriented multi-agent deep reinforcement learning optimization method for UAV swarms.

Background technique

[0002] At present, the rapid development and application promotion of UAV technology is one of the frontier and hot issues, which has attracted extensive attention. Among them, the research on UAV swarms is the most eye-catching. UAV swarms can use low-cost UAVs to form groups according to different roles, and play a huge role in coordinated actions.

[0003] But the key to the synergy of drone swarms lies in their robust communication network. Without a communication system to support the internal members of the UAV cluster, its coordinated action is out of the question.

[0004] At the same time, the optimization of energy consumption of small UAVs, especially battery-powered UAVs, is crucial. The construction and ...


Application Information

Patent Type & Authority: Application (China)
IPC(8): H04W52/24, H04W84/08
CPC: H04W52/241, H04W52/242, H04W84/08
Inventors: 姚昌华, 王修来, 党随虎, 李松柏, 阮郎, 田辉, 范浩人, 张海波
Owner: YANGTZE NORMAL UNIVERSITY