Power allocation method in downlink NOMA based on deep deterministic policy gradient

A power allocation method based on a deterministic-gradient technique, applied in the field of NOMA resource allocation. It addresses the problems that the optimal solution is not easy to find, that numerical simulation methods lack an accurate system model, and that repeated iterative calculation consumes a great deal of time; the effects are alleviating spectrum scarcity, improving the average transmission rate, and improving utilization efficiency.

Pending Publication Date: 2021-03-12
LIAONING TECHNICAL UNIVERSITY


Problems solved by technology

[0004] Power allocation problems are mostly NP-hard and non-convex, so the optimal solution is not easy to find. Many studies therefore use explicit or implicit optimization techniques, searching for the optimal solution through iterative calculation. Such traditional methods can effectively improve system performance, but numerical simulation methods do not have an accurate system model, and the repeated iterative calculations involve a heavy computational load and consume a great deal of time, so they cannot meet the real-time data-processing requirements of a communication system. In addition, traditional methods have no self-learning ability and cannot adapt to changeable, complex communication systems, so their handling of actual communication scenarios is imperfect.
[0005] Following the traditional optimization algorithms, some scholars have proposed using deep learning to solve the power allocation problem in NOMA systems. These methods use deep neural networks or their variants with supervised learning: a multi-layer neural network extracts data features and learns the mapping from data to labels. Compared with the repeated iterative calculations of traditional power allocation methods, running a neural network is more efficient and less complex, but training it requires a large amount of well-prepared sample data, which is difficult to obtain in a communication system. Moreover, the supervised learning approach requires a benchmark algorithm for training, so its performance is limited by that benchmark.
[0006] Compared with supervised learning, reinforcement learning adopts a self-learning strategy: the agent continuously learns from the observed environment information, keeps updating its behavior-selection policy, and finally learns the optimal behavior-control policy. Q-learning is the most classic reinforcement learning algorithm, but it suffers from the "curse of dimensionality": it cannot handle high-dimensional state-action spaces, and it can only deal with discrete action spaces, being powerless in a continuous action space. Deep reinforcement learning effectively overcomes the curse of dimensionality by replacing the traditional Q-value function with a deep neural network. The most widely used algorithm is the Deep Q-Network (DQN), but it still cannot handle continuous action spaces, among other issues, and therefore has certain limitations.
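Where this distinction matters in practice is how the network's output is interpreted. The sketch below (PyTorch, with illustrative layer sizes and an assumed state dimension, not the patent's actual architecture) contrasts a DQN-style network that scores a fixed set of quantized power levels with a DDPG-style deterministic actor that outputs a continuous power-allocation coefficient directly.

```python
# Contrast: DQN scores a finite set of quantized power levels, while a DDPG
# actor maps the state straight to a continuous coefficient in (0, 1).
# Layer sizes, STATE_DIM, and N_DISCRETE_LEVELS are illustrative assumptions.
import torch
import torch.nn as nn

STATE_DIM = 6          # assumed size of the state vector (e.g. SINR and rate features)
N_DISCRETE_LEVELS = 10 # a DQN must quantize power into a finite set of levels


class DQNet(nn.Module):
    """DQN-style network: one Q-value per discrete power level."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_DISCRETE_LEVELS),
        )

    def forward(self, state):
        return self.net(state)               # shape: (batch, N_DISCRETE_LEVELS)


class DDPGActor(nn.Module):
    """Deterministic actor: maps the state to a continuous coefficient in (0, 1)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),   # sigmoid keeps the action inside (0, 1)
        )

    def forward(self, state):
        return self.net(state)                # shape: (batch, 1), a continuous action


state = torch.randn(1, STATE_DIM)
discrete_choice = DQNet()(state).argmax(dim=1)  # best quantized power level (an index)
continuous_action = DDPGActor()(state)          # power-allocation factor in (0, 1)
print(discrete_choice.item(), continuous_action.item())
```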

Method used



Examples


Embodiment 1

[0056] Embodiment 1: Figure 1 shows a structural diagram of a cellular-network power allocation method according to an embodiment of the present invention. This embodiment provides a downlink NOMA system power allocation method based on the deep deterministic policy gradient algorithm. The specific steps are as follows:

[0057] 1) Initialize the downlink NOMA system simulation environment. Figure 4 shows the simulated communication system, which includes a base station and multiple end users. Considering the decoding complexity at the receiving end, the case of two users per subchannel is considered;

[0058] 2) Initialize the weight parameters of the two neural networks contained in the actor network module and the critic network module;

[0059] 3) Use a suitable matching algorithm to pair users with sub-channels, and distribute power equally among the sub-channels;

[0060] 4) Obtain the initialization state, first ca...
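As a rough illustration of how steps 1) through 4) fit together, the following sketch sets up a toy version of the simulation environment; the user count, channel model, pairing heuristic, and placeholder state are assumptions for illustration, not the patent's exact procedure.

```python
# Toy setup mirroring steps 1)-4): environment, network weights, user pairing,
# equal inter-subchannel power, and an initial state. Dimensions, the channel
# model, and the pairing heuristic are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_USERS = 6          # total downlink users served by the single base station
N_SUBCHANNELS = 3    # two users are multiplexed on each subchannel
P_TOTAL = 12.0       # total base-station transmit power (illustrative value)

# 1) initialize the simulated downlink NOMA environment: draw user channel gains
channel_gain = rng.rayleigh(scale=1.0, size=N_USERS)

# 2) initialize actor / critic weights (placeholders standing in for the networks)
actor_weights = rng.normal(size=(8, 1)) * 0.01
critic_weights = rng.normal(size=(9, 1)) * 0.01

# 3) user-channel matching: sort users by gain and pair the strongest with the
#    weakest on each subchannel (one common heuristic), then split power equally
#    across subchannels
order = np.argsort(channel_gain)
pairs = [(order[i], order[-1 - i]) for i in range(N_SUBCHANNELS)]
p_subchannel = np.full(N_SUBCHANNELS, P_TOTAL / N_SUBCHANNELS)

# 4) obtain the initial state; previous-slot SINR and rate are not defined yet,
#    so the state starts at zero
state = np.zeros(8)

print("user pairs per subchannel:", pairs)
print("per-subchannel power:", p_subchannel)
```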

Embodiment 2

[0070] Embodiment 2: This embodiment specifically explains the small-scale fading, large-scale fading, action set, neural network structure, and parameter update method of the target network in embodiment 1.

[0071] (1) Small-scale fading: the fading at the current moment is related to the fading at the previous moment through a correlation coefficient ρ.

[0072] The correlation coefficient ρ is calculated as ρ = J₀(2π f_d T_s), where J₀(·) denotes the zeroth-order Bessel function of the first kind, f_d is the maximum Doppler frequency, and T_s is the time interval between adjacent moments, in milliseconds.
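The fading update expression itself appears in the original filing only as a figure; a commonly used model consistent with this correlation coefficient is the first-order Gauss-Markov (Jakes) update below, stated here as an assumption rather than as the patent's exact formula.

```latex
% Assumed first-order Gauss-Markov (Jakes) fading update, consistent with the
% stated correlation coefficient; not quoted from the patent itself.
h_t = \rho\, h_{t-1} + \sqrt{1-\rho^{2}}\; e_t,
\qquad e_t \sim \mathcal{CN}(0,1),
\qquad \rho = J_0\!\left(2\pi f_d T_s\right)
```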

[0073] (2) Large-scale fading, the formula is: PL⁻¹(d) = −120.9 − 37.6 log₁₀(d) + 10 log₁₀(z)

[0074] Here z is a random variable following a log-normal distribution with a standard deviation of 8 dB, and d is the distance from the transmitter to the receiver, in km.
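As a quick numerical illustration of the formula above, the following snippet evaluates the large-scale channel gain for an assumed distance; the 0.3 km example value is arbitrary.

```python
# Numerical check of the large-scale fading model
#   PL^(-1)(d) = -120.9 - 37.6*log10(d) + 10*log10(z)   [dB]
# where 10*log10(z) is Gaussian shadowing with an 8 dB standard deviation and
# d is the transmitter-receiver distance in km. The 0.3 km example is arbitrary.
import numpy as np

rng = np.random.default_rng(0)

def large_scale_gain_db(d_km, shadow_std_db=8.0):
    shadow_db = rng.normal(0.0, shadow_std_db)   # 10*log10(z), in dB
    return -120.9 - 37.6 * np.log10(d_km) + shadow_db

print(large_scale_gain_db(0.3))   # channel gain in dB for a user 300 m away
```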

[0075] (3) The action set is a set of continuous values, ranging from 0 to 1, but not including 0 and 1. The ...
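One way such a continuous action can be used, sketched below under the standard two-user downlink NOMA model with successive interference cancellation (the SINR expressions and the parameter values are assumptions, not quoted from the patent), is to split the subchannel power between the paired users and compute their achievable rates.

```python
# Map a continuous action alpha in (0, 1) to the power split between the two
# users sharing one subchannel and compute their SIC rates. The SINR expressions
# follow the standard two-user downlink NOMA model; the numbers are illustrative.
import numpy as np

def two_user_rates(alpha, p_sub, g_strong, g_weak, n0=1e-3, bw=1.0):
    p_strong = alpha * p_sub          # power assigned to the strong (near) user
    p_weak = (1.0 - alpha) * p_sub    # power assigned to the weak (far) user
    # weak user decodes its own signal, treating the strong user's as interference
    sinr_weak = p_weak * g_weak / (p_strong * g_weak + n0)
    # strong user first removes the weak user's signal via SIC, then decodes its own
    sinr_strong = p_strong * g_strong / n0
    return bw * np.log2(1 + sinr_strong), bw * np.log2(1 + sinr_weak)

r_strong, r_weak = two_user_rates(alpha=0.2, p_sub=4.0, g_strong=0.8, g_weak=0.1)
print(f"strong user: {r_strong:.2f} bit/s/Hz, weak user: {r_weak:.2f} bit/s/Hz")
```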



Abstract

The invention discloses a power allocation method for a downlink NOMA system based on the deep deterministic policy gradient algorithm. The method employs a dual neural network structure and an experience replay mechanism, which can effectively handle a large-scale state-action space and reduce the correlation between training samples; it uses a deterministic policy to select actions, so actions can be chosen from a continuous action space. The algorithm takes state information as the input of the neural network; the state space, action space, and reward function are designed according to the simulated downlink NOMA system, and the signal-to-interference-plus-noise ratio and rate information of the previous moment are used as components of the current state. In this way the agent can learn, and use what it has learned, to improve its behavior policy more effectively, and an optimal power allocation strategy is obtained after many iterations. The method can effectively solve the multi-user power allocation problem in the downlink NOMA system, generalizes well across different numbers of users and base-station transmit power levels, effectively improves the rationality of power allocation, consumes little computation time, and effectively improves the efficiency of power allocation.
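As a small illustration of the state design described above, the sketch below concatenates the previous slot's SINR and rate values into a state vector; the ordering and the example numbers are assumptions.

```python
# Assemble the state vector described in the abstract: the previous slot's SINR
# and rate values are concatenated into the current state. Ordering and the
# example numbers are assumptions.
import numpy as np

def build_state(prev_sinr, prev_rate):
    """prev_sinr, prev_rate: one entry per user on the subchannel."""
    return np.concatenate([np.asarray(prev_sinr, float), np.asarray(prev_rate, float)])

state = build_state(prev_sinr=[12.5, 3.1], prev_rate=[3.75, 2.04])
print(state)   # this vector is what the actor network takes as input
```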

Description

Technical field

[0001] The invention relates to the field of NOMA resource allocation, and in particular to a power allocation method for a downlink NOMA system based on the deep deterministic policy gradient algorithm.

Background technique

[0002] With the continuous access of mobile terminal devices and the growing user density in wireless communication systems, the amount of data in communication systems has grown exponentially, and orthogonal multiple access technology can no longer meet the demand for high system capacity. The fifth-generation mobile communication system has emerged in response. 5G technology focuses mainly on increasing data rates and reducing end-to-end latency, so as to accommodate the exponential growth of wireless traffic. Non-orthogonal multiple access (NOMA) is considered a promising technology for 5G communication systems, as it allows multiple users to communicate on the s...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): H04W72/04; G06N3/04
CPC: H04W72/0473; G06N3/045; H04W72/53; Y04S10/50; Y02E40/70; Y02D30/70
Inventor: 王伟, 殷爽爽, 吕明海, 武聪 (Wang Wei, Yin Shuangshuang, Lü Minghai, Wu Cong)
Owner: LIAONING TECHNICAL UNIVERSITY