Q routing method based on arc tangent learning rate factor

The arctangent learning rate technique, applied in the field of communication, addresses problems that degrade the application performance of routing algorithms, such as unstable algorithm performance and slow convergence. Its effects are improved parameter adjustment capability, improved performance and more stable delivery delay, and faster convergence.

Active Publication Date: 2020-01-21
XIAN UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

However, several common problems of Q routing remain: 1. Q values are unreliable for a long time in the early stage of training; 2. convergence to the optimal solution is slow; 3. the parameters are not robust, and the performance of the algorithm is unstable.
[0006] These defects in the existing technology limit the improvement of routing performance in the network, resulting in increased network delay, slow convergence, and unstable algorithm performance, and thus degrade the application performance of routing algorithms based on Q-learning.


Examples

Embodiment 1

[0032] Networks are an integral part of daily life, and in practical settings such as hotels, airports, and earthquake relief sites many nodes can be connected wirelessly into a network. Routing in wireless ad hoc networks has been an active research topic for many years. A wireless ad hoc network is a multi-hop mobile network that is especially suited to deployment in emergency environments. Nodes form the network ad hoc, gather information from the environment, and exchange it. Because of their poor flexibility and high computational complexity, traditional routing algorithms cannot adapt to highly variable networks. Reinforcement learning is an effective alternative for handling real-world network conditions. Some existing routing algorithms based on reinforcement learning have the advantages of small state and action space requirements, use of only local node information, and self-adaptive adjustment, but there are still inaccurate Q values in the early t...

Embodiment 2

[0057] The Q routing method based on the arctangent learning rate factor is the same as in Embodiment 1. In step 5 of the present invention, each Q value in the Q value table of the current node x that corresponds to destination d and another neighbor node y_2 is updated one by one using the arctangent learning rate factor η'. The update is given by the following formula:

[0058]

[0059] where y_2 is any other neighbor node of the current node x; η' is the arctangent adaptive learning rate factor, whose value lies in the range (0,1); s_2 is the link transmission time of the packet from x to node y_2; and the two Q terms denote the value of Q_x(d, y_2) at time T and its updated value at time T+1, respectively.

[0060] Only update the Q value of the next-hop node determined in the network, and do not update the Q values of other neighbor nodes. A...
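The update formula of paragraph [0058] is not reproduced in this text, so the following is only a minimal sketch of how the step-5 update of the other neighbors might look. It assumes the conventional Q-routing form, in which the old value moves toward a target by a fraction η'. The target used here, s_2 plus the neighbor's own remaining-time estimate, is an assumption, and the names `update_other_neighbors`, `link_time`, and `neighbor_estimate` are hypothetical.

```python
def update_other_neighbors(q_row, chosen, eta_prime, link_time, neighbor_estimate):
    """Return a copy of q_row with every non-chosen neighbor updated.

    q_row             -- dict: neighbor y -> Q_x(d, y) for one destination d
    chosen            -- the next hop that actually received the packet
    eta_prime         -- arctangent learning rate factor
    link_time         -- dict: neighbor y_2 -> s_2 (link transmission time)
    neighbor_estimate -- dict: neighbor y_2 -> that neighbor's own estimate of
                         the remaining delivery time (ASSUMED term, not
                         taken from the source text)
    """
    updated = dict(q_row)
    for y2, q_old in q_row.items():
        if y2 == chosen:
            continue  # the chosen next hop is updated in a separate step
        # Assumed target: transmission time plus the neighbor's estimate.
        target = link_time[y2] + neighbor_estimate[y2]
        # Move the old value toward the target by a fraction eta_prime.
        updated[y2] = q_old + eta_prime * (target - q_old)
    return updated


row = {"a": 10.0, "b": 20.0, "c": 30.0}
new_row = update_other_neighbors(
    row,
    chosen="a",
    eta_prime=0.5,
    link_time={"b": 1.0, "c": 2.0},
    neighbor_estimate={"b": 9.0, "c": 18.0},
)
```

In this example the chosen neighbor `a` keeps its value, while `b` and `c` each move halfway toward their assumed targets.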

Embodiment 3

[0063] The Q routing method based on the arctangent learning rate factor is the same as in Embodiments 1 and 2. The arctangent learning rate factor η' in step 5 of the present invention is computed as follows:

[0064] η' = 1 - (2/π) · atan((T_max - T_est) / (2 · k_2 · π))

[0065] where k_2 is a constant with value range (0,1]; adjusting k_2 adjusts the arctangent learning rate factor η' and thereby the routing strategy. The difference between the maximum delivery time T_max and the average delivery time T_est is calculated from the Q value table of the node x currently transmitting the data packet, where T_est is the arithmetic mean of the Q values of all neighbors corresponding to sink d in the Q value table of the current node x, and T_max is the maximum of all T_est values that the current node x has obtained so far. When the difference T_max - T_est between the maximum delivery time and the average delivery time of the current node x is larger, the value of η...
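As an illustration of the formula in [0064], the following sketch computes η' from the Q values a node holds for one sink, keeping T_max as the running maximum of the T_est values seen so far. The class name `ArctanLearningRate` and its interface are hypothetical; only the formula and the definitions of T_est, T_max, and k_2 come from the text above.

```python
import math


class ArctanLearningRate:
    """Tracks T_max for one (node, sink) pair and returns eta'."""

    def __init__(self, k2=1.0):
        if not 0.0 < k2 <= 1.0:
            raise ValueError("k2 must lie in (0, 1]")
        self.k2 = k2
        self.t_max = None  # largest T_est observed so far

    def factor(self, neighbor_q_values):
        # T_est: arithmetic mean of the neighbors' Q values for the sink.
        t_est = sum(neighbor_q_values) / len(neighbor_q_values)
        # T_max: running maximum over all T_est values obtained so far.
        self.t_max = t_est if self.t_max is None else max(self.t_max, t_est)
        diff = self.t_max - t_est
        # eta' = 1 - (2/pi) * atan((T_max - T_est) / (2 * k2 * pi))
        return 1.0 - 2.0 * math.atan(diff / (2.0 * self.k2 * math.pi)) / math.pi


rate = ArctanLearningRate(k2=1.0)
eta_first = rate.factor([10.0, 20.0])  # T_est = 15, so T_max - T_est = 0
eta_second = rate.factor([5.0, 5.0])   # T_est = 5, so T_max - T_est = 10
```

Because atan increases with its argument, a larger gap between T_max and T_est gives a smaller η' under this formula, and a smaller k_2 makes η' fall off faster for the same gap.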


Abstract

The invention discloses a Q routing method based on an arctangent learning rate factor, which addresses the limited adjustment capability of the additional learning rate factor in existing algorithms. The method comprises the following steps: setting up the network topology; establishing a network Q value table; having each network node obtain the estimated value of the neighbor with the minimum time cost; making the routing decision for a data packet, transmitting the data packet, and updating the Q value of the neighbor node with the minimum time cost; having the current node update the Q values of the other neighbor nodes; and repeating the routing process to achieve self-adaptive routing adjustment of the wireless ad hoc network. In this method, the Q values of neighbor nodes that do not receive the data packet are updated using a learning rate factor with a large adjustment range, which adapts to different network conditions. The invention reduces the average delivery time of data and the oscillation between routes under high and low loads. Route selection requires only local node information, which avoids excessive network overhead. The method is used for wireless ad hoc network communication.
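The first steps listed in the abstract (set up a topology, establish a Q value table, pick the neighbor with the minimum time cost) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the nested-dictionary layout, the initial Q value of 0.0, and the function names `build_q_table` and `next_hop` are all assumptions.

```python
def build_q_table(topology, initial=0.0):
    """Build q[x][d][y]: node x's estimated delivery time to destination d
    when forwarding through neighbor y.

    topology -- dict: node -> list of its neighbors
    initial  -- starting estimate for every entry (ASSUMED value)
    """
    nodes = list(topology)
    return {
        x: {d: {y: initial for y in topology[x]} for d in nodes if d != x}
        for x in nodes
    }


def next_hop(q_table, x, d):
    """Routing decision: the neighbor of x with the smallest Q value for d."""
    row = q_table[x][d]
    return min(row, key=row.get)


# A four-node example topology (hypothetical).
topology = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
q = build_q_table(topology)
q["A"]["D"]["B"] = 4.0
q["A"]["D"]["C"] = 2.5
hop = next_hop(q, "A", "D")
```

Each node reads only its own row of the table when deciding, which matches the abstract's point that route selection needs only local node information.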

Description

Technical field

[0001] The invention belongs to the technical field of communication and relates to Q routing in wireless ad hoc networks, in particular to a Q routing method based on an arctangent learning rate factor for use in wireless ad hoc networks.

[0002] Without increasing network routing overhead, it enables sound routing decisions for data packets, reduces network routing delay, reduces routing oscillation, and improves the successful delivery rate of data packets.

Background technique

[0003] A wireless ad hoc network is a network without fixed infrastructure. There are usually no centralized control nodes in the network, and the nodes communicate by forming the network ad hoc. The nodes can usually move freely. In a mobile ad hoc network, the constant movement of the nodes leads to constant changes in the topology, and these changes pose great challenges to network routing. Traditional routing technique...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04W40/02, H04W40/24, H04W40/34, H04L12/751, H04L12/721, H04L12/727, G06N20/00, H04W84/18, H04L45/02, H04L45/121
CPC: H04W40/02, H04W40/248, H04W40/34, H04L45/02, H04L45/14, H04L45/121, G06N20/00, H04W84/18, Y02D30/70
Inventor: 黄庆东, 袁润芝, 曹艺苑
Owner: XIAN UNIV OF POSTS & TELECOMM