Method and apparatus for optimizing power consumption of 5G terminal based on discontinuous reception, and medium

By constructing a Markov decision model in the 5G NR-U scenario and using the Actor-Critic reinforcement learning algorithm to optimize the discontinuous reception cycle parameters, the problem of increased power consumption and latency of 5G terminals in dense heterogeneous networks is solved, and a balance optimization of power consumption and latency is achieved.

CN117768982B · Active · Publication Date: 2026-06-23 · BEIJING UNIV OF POSTS & TELECOMM +5

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING UNIV OF POSTS & TELECOMM
Filing Date
2023-10-24
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

In 5G NR-U scenarios, discontinuous reception technology leads to increased terminal power consumption and latency, especially when unlicensed frequency bands share channel resources with various wireless access technologies such as Wi-Fi. The uneven competition for channel resources results in additional power consumption and increased average reception latency.

Method used

A Markov decision model is constructed for dense heterogeneous network scenarios. Combined with the Actor-Critic reinforcement learning algorithm, the discontinuous reception cycle parameters of 5G terminals are optimized. By adaptively adjusting the DRX cycle length and terminal state transition, the power consumption and latency overhead of beam search and channel competition are reduced.

Benefits of technology

The power consumption and latency of 5G terminals in unlicensed frequency bands have been optimized, achieving a balance between energy saving and latency in dense heterogeneous network scenarios.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a 5G terminal power consumption optimization method and device based on discontinuous reception and a medium. The method comprises the following steps: constructing a Markov decision model according to the discontinuous reception state cycle length of a 5G terminal and the terminal state transition action, and obtaining the corresponding system time delay and energy saving trade-off benefit based on different terminal state transition actions; obtaining the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model according to the system time delay and the energy saving trade-off benefit, and constructing the optimal discontinuous reception cycle parameters corresponding to the 5G terminal; and sending the optimal discontinuous reception cycle parameters to the 5G terminal, so that the 5G terminal performs power consumption optimization adjustment according to the optimal discontinuous reception cycle parameters. The application reduces the additional power consumption and time delay overhead caused by beam search and channel competition of the 5G terminal in the unlicensed frequency band.

Description

Technical Field

[0001] This invention relates to the field of communication technology, and in particular to a method, apparatus and medium for optimizing the power consumption of 5G terminals based on discontinuous reception.

Background Technology

[0002] Discontinuous Reception (DRX) technology is an important technology for saving power consumption of User Equipment (UE) in Long Term Evolution (LTE) systems. It can control the UE to periodically enter sleep mode and reduce energy consumption by turning off the radio transceiver unit.

[0003] In the DRX mechanism of the 3GPP standard protocol, the UE wakes up in the DRX ON state between sleep cycles and activates the receiver to monitor the Physical Downlink Control Channel (PDCCH) to determine whether data is being transmitted from the Base Station (BS). In licensed frequency bands, channel resources can be exclusively occupied by a single radio technology, and the base station has complete control over the allocation and use of licensed channel resources. In unlicensed frequency bands, however, 5G New Radio in Unlicensed Spectrum (5G NR-U) shares channel resources with other radio access technologies such as Wi-Fi. When the BS has data to send to the UE but the channel is occupied by other devices during the UE's wake-up period, the BS cannot win the channel within that period, and the data packet must be delayed until at least the next wake-up time, which increases the average reception latency. Furthermore, even if the BS obtains access to an unlicensed channel, the Maximum Channel Occupancy Time (MCOT) of unlicensed bands limits how long a network node can continuously use the spectrum, in order to ensure fairness between NR-U and Wi-Fi. As a result, spectrum sharing in unlicensed bands reduces the probability that the UE can use an available channel for data transmission, and the UE consumes additional power while waiting for an unlicensed channel to become available.

[0004] Therefore, there is an urgent need for a 5G terminal power consumption optimization method, device, and medium based on discontinuous reception to solve the above problems.

Summary of the Invention

[0005] To address the problems existing in the prior art, this invention provides a method, apparatus, and medium for optimizing the power consumption of 5G terminals based on discontinuous reception.

[0006] This invention provides a 5G terminal power consumption optimization method based on discontinuous reception, applied to the base station side, comprising:

[0007] Based on the discontinuous reception state cycle length and terminal state transition actions of the 5G terminal, a Markov decision model corresponding to the dense heterogeneous network scenario is constructed, and based on different terminal state transition actions, the corresponding system latency and energy saving trade-off benefits are obtained. The discontinuous reception state cycle length includes the discontinuous reception active state cycle length and the terminal short sleep cycle length.

[0008] Based on the trade-off between system latency and energy saving benefits, obtain the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model.

[0009] Based on the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length, the optimal discontinuous reception cycle parameters corresponding to the 5G terminal in the dense heterogeneous network scenario are constructed, and the optimal discontinuous reception cycle parameters are sent to the 5G terminal so that the 5G terminal can perform power consumption optimization adjustment based on the optimal discontinuous reception cycle parameters.

[0010] According to the present invention, a 5G terminal power consumption optimization method based on discontinuous reception is provided, wherein the method comprises constructing a Markov decision model corresponding to a dense heterogeneous network scenario based on the discontinuous reception state cycle length and terminal state transition actions of the 5G terminal, including:

[0011] Based on the terminal state transition actions of the 5G terminal, a terminal state set, an action set, and a transition probability set are determined. The terminal state set includes a terminal active state, a terminal sleep state, a discontinuous reception active state, and a beam search state. The terminal sleep state includes a terminal short sleep state and a terminal long sleep state.

[0012] An energy-saving factor is constructed based on the number of data packets cached, the number of short sleep state cycles, and the duration of long sleep states.

[0013] Based on the ratio between the data packet waiting delay and the terminal sleep state period when the 5G terminal is in sleep state, the length of the discontinuous reception activation state period, and the number of terminal channel contention failures, a delay factor is constructed.

[0014] Based on the energy saving factor and the delay factor, an instantaneous reward function is constructed to calculate the trade-off between system delay and energy saving.

[0015] Based on the terminal state set, the action set, the transition probability set, and the instant reward function, a Markov decision model corresponding to the dense heterogeneous network scenario is constructed.
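As a rough illustration of the four ingredients named above (terminal state set, action set, transition probability set, instantaneous reward function), the following Python sketch bundles them into one container. The names `TerminalState`, `DrxAction`, and `MarkovDecisionModel`, the millisecond units, and the shape of the action (a candidate pair of cycle lengths) are assumptions made for illustration and are not taken from the patent text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Tuple


class TerminalState(Enum):
    """The five terminal states listed in the text."""
    ACTIVE = "terminal active"
    SHORT_SLEEP = "terminal short sleep"
    LONG_SLEEP = "terminal long sleep"
    DRX_ACTIVE = "discontinuous reception active"
    BEAM_SEARCH = "beam search"


@dataclass(frozen=True)
class DrxAction:
    """Hypothetical action: one candidate pair of cycle lengths (ms)."""
    t_on_ms: float
    t_ssc_ms: float


@dataclass
class MarkovDecisionModel:
    states: List[TerminalState]
    actions: List[DrxAction]
    # transition[(state, action)] maps each next state to its probability.
    transition: Dict[Tuple[TerminalState, DrxAction], Dict[TerminalState, float]]
    # reward(alpha, beta) returns the instantaneous reward R.
    reward: Callable[[float, float], float]

    def validate(self) -> None:
        """Check that every transition distribution sums to 1."""
        for key, dist in self.transition.items():
            total = sum(dist.values())
            if abs(total - 1.0) > 1e-9:
                raise ValueError(f"probabilities for {key} sum to {total}")
```

Keeping the transition set as a plain dictionary makes it easy to inspect or fill in from measurements; a real base-station implementation would likely use a different representation.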

[0016] According to the present invention, a 5G terminal power consumption optimization method based on discontinuous reception is provided, wherein the formula for the instantaneous reward function is:

[0017] R = δ*α + (1-δ)*(β+P);

[0018]

[0019]

[0020] Where δ represents the weight between terminal power consumption and data packet waiting latency, α represents the power saving factor, β represents the latency factor, and P represents the adjustment parameter of the latency factor; t_ssc is the duration corresponding to the short sleep timer and represents the length of the terminal's short sleep cycle; T_W N1 represents the decision window length; N2 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the short sleep state; N3 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the long sleep state; N4 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the discontinuous reception active state; M_i represents the number of short sleep state cycles experienced by the 5G terminal during the time interval between acquiring the i-th data packet and the (i+1)-th data packet when the 5G terminal is in the short sleep state and the discontinuous reception activation state; N_ss represents the maximum number of short sleep state cycles; t_lsc indicates the duration corresponding to the long sleep timer; a further term indicates the duration of the long sleep state experienced by the 5G terminal when the base station receives the i-th data packet; another term represents the ratio between the waiting latency of the i-th data packet acquired by the base station when the 5G terminal is in sleep mode and the corresponding terminal sleep state period; t_on is the duration corresponding to the discontinuous reception timer and represents the length of the discontinuous reception activation state period; b_i represents the number of times the 5G terminal failed to compete for the terminal channel when the base station receives the i-th data packet; and t_c indicates the timing extension duration corresponding to the discontinuous reception timer when the terminal channel contention fails.
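The instantaneous reward R = δ*α + (1-δ)*(β+P) can be evaluated directly once α and β are known. The sketch below is a minimal Python rendering of that single formula; it takes α and β as precomputed inputs because their own formulas are not reproduced in this text, and the check that δ lies in [0, 1] is an assumption suggested by its role as a weight, not something the text states.

```python
def instantaneous_reward(delta: float, alpha: float, beta: float, p: float) -> float:
    """Return R = delta * alpha + (1 - delta) * (beta + p).

    delta: weight between power consumption and waiting latency
    alpha: power saving factor (precomputed)
    beta:  latency factor (precomputed)
    p:     adjustment parameter of the latency factor
    """
    # Assumption: delta is a convex weight, so it is restricted to [0, 1].
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta is assumed to lie in [0, 1]")
    return delta * alpha + (1.0 - delta) * (beta + p)
```

With δ = 1 the reward reduces to the power saving factor alone, and with δ = 0 it reduces to β + P, which is one way to see how δ steers the trade-off.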

[0021] According to the present invention, a 5G terminal power consumption optimization method based on discontinuous reception is provided, wherein obtaining the target discontinuous reception active state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model based on the system latency and energy saving trade-off benefits includes:

[0022] Based on the trade-off between system latency and energy saving, the Markov decision model is solved using the Actor-Critic reinforcement learning algorithm to obtain the target discontinuous reception activation state cycle length and target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model.
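To make the Actor-Critic step more concrete, the following Python sketch shows a generic one-step tabular actor-critic update with a softmax policy. It is not the patent's solver: the discount factor, learning rates, episode structure, and the toy environment `toy_step` with its reward table are all hypothetical, chosen only so that the example runs on its own.

```python
import math
import random
from typing import Callable, List, Tuple


def softmax(prefs: List[float]) -> List[float]:
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(x - m) for x in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic(
    n_states: int,
    n_actions: int,
    step: Callable[[int, int, random.Random], Tuple[int, float]],
    episodes: int = 500,
    horizon: int = 20,
    gamma: float = 0.9,      # assumed discount factor
    lr_actor: float = 0.05,  # assumed learning rates
    lr_critic: float = 0.1,
    seed: int = 0,
) -> Tuple[List[List[float]], List[float]]:
    rng = random.Random(seed)
    theta = [[0.0] * n_actions for _ in range(n_states)]  # actor: preferences
    value = [0.0] * n_states                              # critic: state values
    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(horizon):
            probs = softmax(theta[s])
            a = rng.choices(range(n_actions), weights=probs)[0]
            s_next, r = step(s, a, rng)
            # The critic's temporal-difference error scores the chosen action.
            td_error = r + gamma * value[s_next] - value[s]
            value[s] += lr_critic * td_error
            # Policy-gradient update for a softmax policy.
            for b in range(n_actions):
                grad = (1.0 if b == a else 0.0) - probs[b]
                theta[s][b] += lr_actor * td_error * grad
            s = s_next
    return theta, value


# Hypothetical toy environment: one state, three candidate DRX settings,
# each with a fixed made-up reward.
REWARDS = [0.2, 1.0, 0.5]


def toy_step(state: int, action: int, rng: random.Random) -> Tuple[int, float]:
    return 0, REWARDS[action]
```

Calling `actor_critic(1, 3, toy_step)` trains the policy on the toy environment; in the setting described here, each action would instead stand for a candidate combination of discontinuous reception activation state cycle length and short sleep cycle length, and the reward would come from the instantaneous reward function.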

[0023] This invention also provides a 5G terminal power consumption optimization method based on discontinuous reception, applied to 5G terminals, comprising:

[0024] The optimal discontinuous reception period parameter is received from the base station. The optimal discontinuous reception period parameter is obtained by solving the system latency and energy saving trade-off between different terminal state transition actions based on the Markov decision model corresponding to the dense heterogeneous network scenario. The Markov decision model is constructed based on the discontinuous reception state period length and terminal state transition actions of the 5G terminal.

[0025] Based on the Markov state transition model of the preset discontinuous reception mechanism, the current terminal state is determined. If the current terminal state is a terminal sleep state or a discontinuous reception active state, the period length corresponding to the terminal sleep state and the discontinuous reception active state is adjusted according to the optimal discontinuous reception period parameter. The terminal sleep state includes a terminal short sleep state and a terminal long sleep state.
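The terminal-side rule in the paragraph above can be sketched as a small guard: the received parameters are applied only when the terminal is in a sleep state or in the discontinuous reception active state. The type names, field names, and millisecond units below are hypothetical, and leaving the timers unchanged in the other states is an assumption, since the text does not say what happens there.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TerminalState(Enum):
    ACTIVE = auto()
    SHORT_SLEEP = auto()
    LONG_SLEEP = auto()
    DRX_ACTIVE = auto()
    BEAM_SEARCH = auto()


@dataclass(frozen=True)
class DrxTimers:
    t_on_ms: float   # discontinuous reception active state period
    t_ssc_ms: float  # short sleep cycle
    t_lsc_ms: float  # long sleep cycle


# States in which the text says the period lengths are adjusted.
ADJUSTABLE_STATES = frozenset(
    {TerminalState.SHORT_SLEEP, TerminalState.LONG_SLEEP, TerminalState.DRX_ACTIVE}
)


def apply_optimal_params(
    state: TerminalState, current: DrxTimers, optimal: DrxTimers
) -> DrxTimers:
    """Adopt the received parameters only in an adjustable state."""
    # Assumption: in all other states the current timers are kept as they are.
    return optimal if state in ADJUSTABLE_STATES else current
```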

[0026] According to the present invention, a 5G terminal power consumption optimization method based on discontinuous reception is provided. The preset discontinuous reception mechanism Markov state transition model is constructed from multiple terminal states of the 5G terminal, including terminal active state, terminal short sleep state, terminal long sleep state, discontinuous reception active state, and beam search state.

[0027] According to the present invention, a 5G terminal power consumption optimization method based on discontinuous reception is provided, the method further comprising:

[0028] If, during the discontinuous reception activation state, a data packet to be sent by the base station is detected on the physical downlink control channel, a terminal channel contention operation is performed. If the terminal channel contention operation fails, the discontinuous reception activation state is maintained, the terminal channel contention operation is performed again, and the duration of the discontinuous reception timer corresponding to the discontinuous reception activation state is increased.

[0029] If the terminal channel contention operation is successful, it switches to the beam search state. If a target beam pair that meets the preset beam conditions is found within the preset beam search time, the beam search state is switched to the terminal active state, and data is transmitted with the base station through the target beam pair.
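The transitions described in the two paragraphs above can be sketched as a small event-driven state machine. This is an illustrative reading only: the event names are hypothetical, and extending the timer by a fixed amount `t_c_ms` on each contention failure is an assumption based on the timing extension duration t_c mentioned elsewhere in the text. Transitions that the text does not describe (for example, a beam search that times out) are deliberately rejected rather than guessed.

```python
from enum import Enum, auto
from typing import Tuple


class State(Enum):
    DRX_ACTIVE = auto()
    BEAM_SEARCH = auto()
    ACTIVE = auto()


class Event(Enum):
    CONTENTION_FAILED = auto()
    CONTENTION_SUCCEEDED = auto()
    BEAM_PAIR_FOUND = auto()


def on_event(
    state: State, event: Event, t_on_ms: float, t_c_ms: float
) -> Tuple[State, float]:
    """Return the next state and the (possibly extended) DRX timer duration."""
    if state is State.DRX_ACTIVE and event is Event.CONTENTION_FAILED:
        # Stay in the DRX active state, retry contention, extend the timer.
        return State.DRX_ACTIVE, t_on_ms + t_c_ms
    if state is State.DRX_ACTIVE and event is Event.CONTENTION_SUCCEEDED:
        return State.BEAM_SEARCH, t_on_ms
    if state is State.BEAM_SEARCH and event is Event.BEAM_PAIR_FOUND:
        # A target beam pair was found within the preset search time.
        return State.ACTIVE, t_on_ms
    raise ValueError(f"transition {state.name} + {event.name} is not described in the text")
```

For example, two failed contention attempts followed by a success would extend the timer twice before the terminal moves on to the beam search state.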

[0030] This invention provides a 5G terminal power consumption optimization device based on discontinuous reception, applied at the base station end, comprising:

[0031] The dense heterogeneous network scenario construction module is used to construct a Markov decision model corresponding to the dense heterogeneous network scenario based on the discontinuous reception state cycle length and terminal state transition action of the 5G terminal, and to obtain the corresponding system latency and energy saving trade-off benefits based on different terminal state transition actions. The discontinuous reception state cycle length includes the discontinuous reception active state cycle length and the terminal short sleep cycle length.

[0032] The processing module is used to obtain the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model based on the trade-off between system latency and energy saving benefits.

[0033] The period parameter generation module is used to construct the optimal discontinuous reception period parameter of the 5G terminal in the dense heterogeneous network scenario based on the target discontinuous reception activation state period length and the target terminal short sleep period length, and send the optimal discontinuous reception period parameter to the 5G terminal so that the 5G terminal can perform power consumption optimization adjustment based on the optimal discontinuous reception period parameter.

[0034] According to the present invention, a 5G terminal power consumption optimization device based on discontinuous reception is provided, wherein the dense heterogeneous network scenario construction module is specifically used for:

[0035] Based on the terminal state transition actions of the 5G terminal, a terminal state set, an action set, and a transition probability set are determined. The terminal state set includes a terminal active state, a terminal sleep state, a discontinuous reception active state, and a beam search state. The terminal sleep state includes a terminal short sleep state and a terminal long sleep state.

[0036] An energy-saving factor is constructed based on the number of data packets cached, the number of short sleep state cycles, and the duration of long sleep states.

[0037] Based on the ratio between the data packet waiting delay and the terminal sleep state period when the 5G terminal is in sleep state, the length of the discontinuous reception activation state period, and the number of terminal channel contention failures, a delay factor is constructed.

[0038] Based on the energy saving factor and the delay factor, an instantaneous reward function is constructed to calculate the trade-off between system delay and energy saving.

[0039] Based on the terminal state set, the action set, the transition probability set, and the instant reward function, a Markov decision model corresponding to the dense heterogeneous network scenario is constructed.

[0040] According to the present invention, a 5G terminal power consumption optimization device based on discontinuous reception is provided, wherein the formula of the instantaneous reward function is:

[0041] R = δ*α + (1-δ)*(β+P);

[0042]

[0043]

[0044] Where δ represents the weight between terminal power consumption and data packet waiting latency, α represents the power saving factor, β represents the latency factor, and P represents the adjustment parameter of the latency factor; t_ssc is the duration corresponding to the short sleep timer and represents the length of the terminal's short sleep cycle; T_W N1 represents the decision window length; N2 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the short sleep state; N3 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the long sleep state; N4 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the discontinuous reception active state; M_i represents the number of short sleep state cycles experienced by the 5G terminal during the time interval between acquiring the i-th data packet and the (i+1)-th data packet when the 5G terminal is in the short sleep state and the discontinuous reception activation state; N_ss represents the maximum number of short sleep state cycles; t_lsc indicates the duration corresponding to the long sleep timer; a further term indicates the duration of the long sleep state experienced by the 5G terminal when the base station receives the i-th data packet; another term represents the ratio between the waiting latency of the i-th data packet acquired by the base station when the 5G terminal is in sleep mode and the corresponding terminal sleep state period; t_on is the duration corresponding to the discontinuous reception timer and represents the length of the discontinuous reception activation state period; b_i represents the number of times the 5G terminal failed to compete for the terminal channel when the base station receives the i-th data packet; and t_c indicates the timing extension duration corresponding to the discontinuous reception timer when the terminal channel contention fails.

[0045] According to the present invention, a 5G terminal power consumption optimization device based on discontinuous reception is provided, wherein the processing module is specifically used for:

[0046] Based on the trade-off between system latency and energy saving, the Markov decision model is solved using the Actor-Critic reinforcement learning algorithm to obtain the target discontinuous reception activation state cycle length and target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model.

[0047] The present invention also provides a 5G terminal power consumption optimization device based on discontinuous reception, applied to a 5G terminal, comprising:

[0048] The periodic parameter receiving module is used to receive the optimal discontinuous reception period parameters sent by the base station. The optimal discontinuous reception period parameters are obtained by solving the system latency and energy saving trade-offs based on the Markov decision model corresponding to the dense heterogeneous network scenario and different terminal state transition actions. The Markov decision model is constructed based on the discontinuous reception state period length and terminal state transition actions of the 5G terminal.

[0049] The period parameter adjustment module is used to determine the current terminal state based on a preset discontinuous reception mechanism Markov state transition model. If the current terminal state is a terminal sleep state or a discontinuous reception active state, the period length corresponding to the terminal sleep state and the discontinuous reception active state is adjusted according to the optimal discontinuous reception period parameter. The terminal sleep state includes a terminal short sleep state and a terminal long sleep state.

[0050] According to the present invention, a 5G terminal power consumption optimization device based on discontinuous reception is provided. The preset discontinuous reception mechanism Markov state transition model is constructed from multiple terminal states of the 5G terminal, including terminal active state, terminal short sleep state, terminal long sleep state, discontinuous reception active state, and beam search state.

[0051] According to the present invention, a 5G terminal power consumption optimization device based on discontinuous reception is provided, the device further being used for:

[0052] If, during the discontinuous reception activation state, a data packet to be sent by the base station is detected on the physical downlink control channel, a terminal channel contention operation is performed. If the terminal channel contention operation fails, the discontinuous reception activation state is maintained, the terminal channel contention operation is performed again, and the duration of the discontinuous reception timer corresponding to the discontinuous reception activation state is increased.

[0053] If the terminal channel contention operation is successful, it switches to the beam search state. If a target beam pair that meets the preset beam conditions is found within the preset beam search time, the beam search state is switched to the terminal active state, and data is transmitted with the base station through the target beam pair.

[0054] The present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the 5G terminal power consumption optimization method based on discontinuous reception as described above.

[0055] The present invention provides a 5G terminal power consumption optimization method, device and medium based on discontinuous reception. By establishing a Markov decision model in a dense heterogeneous network scenario, the DRX period parameters of the 5G terminal discontinuous reception mechanism are adaptively adjusted, thereby reducing the additional power consumption and latency overhead caused by beam search and channel contention of 5G terminals in unlicensed frequency bands.

Attached Figure Description

[0056] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0057] Figure 1 is the first flowchart of the 5G terminal power consumption optimization method based on discontinuous reception provided by the present invention;

[0058] Figure 2 is a schematic diagram of the framework of the Actor-Critic reinforcement learning algorithm provided by the present invention;

[0059] Figure 3 is the second flowchart of the 5G terminal power consumption optimization method based on discontinuous reception provided by the present invention;

[0060] Figure 4 is a schematic diagram of the Markov state transition model for the discontinuous reception mechanism of 5G terminals provided by the present invention;

[0061] Figure 5 is a comparative diagram of the cumulative energy efficiency factor provided by the present invention;

[0062] Figure 6 is a comparative diagram of the cumulative delay factor provided by the present invention;

[0063] Figure 7 is the first structural schematic diagram of the 5G terminal power consumption optimization device based on discontinuous reception provided by the present invention;

[0064] Figure 8 is the second structural schematic diagram of the 5G terminal power consumption optimization device based on discontinuous reception provided by the present invention;

[0065] Figure 9 is a schematic diagram of the structure of the electronic device provided by the present invention.

Detailed Implementation

[0066] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.

[0067] In existing 5G mobile communication service application scenarios, interference between different access methods in multi-wireless heterogeneous scenarios is not considered, and the setting of period parameters is relatively fixed, lacking flexibility. Furthermore, current discontinuous reception DRX data transmission methods typically only involve DRX message exchange mechanisms between user equipment and base stations based on the protocol layer, lacking DRX parameter optimization mechanisms. The DRX period used only includes a fixed-length sleep period, lacking modeling and analysis of latency and energy consumption indicators based on DRX parameter configuration, resulting in certain deficiencies in flexibility and scalability.

[0068] This invention provides a 5G terminal power consumption optimization method for dense heterogeneous network scenarios. Building on the DRX mechanism of LTE technology, the method applies DRX to dense multi-radio heterogeneous network scenarios and mitigates the additional power consumption overhead caused by terminal channel contention. It proposes a Markov state transition model of the preset discontinuous reception mechanism that accounts for channel contention and the beam search state in 5G NR-U scenarios. Based on this model, it further proposes a DRX parameter optimization strategy using the Actor-Critic reinforcement learning algorithm to obtain a DRX parameter configuration (i.e., the optimal discontinuous reception period parameters) that meets the UE latency and energy consumption requirements in dense heterogeneous network scenarios.

[0069] Figure 1 is the first flowchart of the 5G terminal power consumption optimization method based on discontinuous reception provided by the present invention. As shown in Figure 1, this invention provides a 5G terminal power consumption optimization method based on discontinuous reception, applied to the base station side, including:

[0070] Step 101: Based on the discontinuous reception state cycle length and terminal state transition action of the 5G terminal, construct a Markov decision model corresponding to the dense heterogeneous network scenario, and obtain the corresponding system latency and energy saving trade-off benefits based on different terminal state transition actions. The discontinuous reception state cycle length includes the discontinuous reception active state cycle length and the terminal short sleep cycle length.

[0071] In this invention, for a UE applying the DRX mechanism, the conditional probability distribution of transitioning from the current state to the next state is independent of its past states and depends only on the current state of the UE. Furthermore, the time between state transitions of different DRX mechanisms is random. Therefore, the Markov decision process in this invention can be regarded as a semi-Markov process.

[0072] Furthermore, this invention comprehensively considers factors such as unlicensed frequency bands and millimeter-wave beam search, and models the 5G terminal power consumption optimization problem in dense heterogeneous network scenarios as a semi-Markov model with 5 states (i.e., the Markov state transition model of the preset discontinuous reception mechanism, details of which can be found in subsequent steps). A corresponding Markov decision model is then constructed based on this semi-Markov model. In this invention, the 5G terminal states are S1 terminal active state, S2 terminal short sleep state, S3 terminal long sleep state, S4 discontinuous reception active state, and S5 beam search state.

[0073] In LTE, the standard DRX mechanism divides the UE's sleep time into alternating long and short cycles. However, existing schemes mainly set these cycle lengths to fixed values, and fixed long and short cycles cannot meet the energy-saving and latency balance requirements of 5G devices in scenarios with dense deployment of multiple wireless UEs. Therefore, based on the aforementioned Markov state transition model of the preset discontinuous reception mechanism, this invention introduces an Actor-Critic reinforcement learning algorithm that enables the base station to adjust the DRX state period length of the 5G terminal, thereby optimizing the system's power consumption and latency.

[0074] It should be noted that, in addition to the target discontinuous reception activation state period length and the target terminal short sleep period length, the optimal DRX state period length obtained by solving the constructed Markov decision model also includes a target terminal long sleep period length, which is determined from the target terminal short sleep period length. In this invention, the ratio between the terminal long sleep period length and the terminal short sleep period length can be preset (e.g., the terminal long sleep period length is N times the terminal short sleep period length), so that the target terminal long sleep period length can be obtained quickly once the target terminal short sleep period length is determined.
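The preset ratio described above makes the long sleep period a simple multiple of the short one. A minimal Python sketch of that derivation follows; the names `DrxCycleParams` and `build_drx_params`, the millisecond units, and the requirement that the ratio be at least 1 are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrxCycleParams:
    t_on_ms: float   # target discontinuous reception activation state period
    t_ssc_ms: float  # target terminal short sleep period
    t_lsc_ms: float  # target terminal long sleep period (derived)


def build_drx_params(
    t_on_ms: float, t_ssc_ms: float, long_to_short_ratio: int
) -> DrxCycleParams:
    """Derive the long sleep period as N times the short sleep period."""
    # Assumption: the long sleep period is never shorter than the short one.
    if long_to_short_ratio < 1:
        raise ValueError("long_to_short_ratio is assumed to be at least 1")
    return DrxCycleParams(t_on_ms, t_ssc_ms, long_to_short_ratio * t_ssc_ms)
```

For instance, `build_drx_params(8.0, 20.0, 4)` yields a long sleep period of 80.0 ms, so only two values need to be learned while the third follows from the preset ratio.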

[0075] Step 102: Based on the trade-off between system latency and energy saving, obtain the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model.

[0076] In this invention, based on reinforcement learning algorithms, the expected returns of taking different actions in a given state are calculated by mathematically modeling state transition probabilities, reward functions, and value functions. The action with the highest expected return is selected as the final strategy. Based on different types of terminal state transition actions, the system's latency and energy consumption are evaluated. The balance between latency and energy consumption under different strategies is compared, thereby obtaining the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value.
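The idea of computing expected returns per action and picking the highest one can be illustrated with a one-step lookahead over known transition probabilities, rewards, and state values. The sketch below is a generic textbook-style calculation, not the patent's procedure; the discount factor `gamma`, the dictionary layout, and all numbers in the usage note are hypothetical.

```python
from typing import Dict, Hashable, Tuple

# For one action: next_state -> (transition probability, immediate reward).
Outcomes = Dict[Hashable, Tuple[float, float]]


def expected_return(
    outcomes: Outcomes, value: Dict[Hashable, float], gamma: float
) -> float:
    """Sum over next states of p * (r + gamma * V(next_state))."""
    return sum(p * (r + gamma * value[s_next]) for s_next, (p, r) in outcomes.items())


def best_action(
    actions: Dict[Hashable, Outcomes],
    value: Dict[Hashable, float],
    gamma: float = 0.9,  # assumed discount factor
) -> Tuple[Hashable, float]:
    """Return the action with the highest expected return, and that return."""
    scores = {a: expected_return(o, value, gamma) for a, o in actions.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

As a made-up example, with state values `{"sleep": 0.0, "active": 1.0}`, an action leading to `"active"` half the time with reward 1.0 scores higher at `gamma=0.5` than an action that always stays in `"sleep"` with reward 0.5, so the first action is selected.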

[0077] Step 103: Based on the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length, construct the optimal discontinuous reception cycle parameter corresponding to the 5G terminal in the dense heterogeneous network scenario, and send the optimal discontinuous reception cycle parameter to the 5G terminal so that the 5G terminal can optimize and adjust power consumption according to the optimal discontinuous reception cycle parameter.

[0078] In this invention, the optimal discontinuous reception cycle parameter values are obtained by solving the Markov decision model, and then sent to the 5G terminal. The 5G terminal adjusts the cycle length corresponding to its sleep state and discontinuous reception active state according to these parameter values, thereby achieving power optimization while ensuring a balance between latency and power consumption.
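The construction of the parameter set described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name DrxCycleParams, the function build_drx_params, the millisecond unit, and the example values are all assumptions introduced here. Only the relation "long sleep length = N times short sleep length" comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrxCycleParams:
    """Hypothetical container for the parameters sent to the terminal (ms assumed)."""
    t_on: float   # discontinuous reception activation state period length
    t_ssc: float  # terminal short sleep period length
    t_lsc: float  # terminal long sleep period length, derived from t_ssc


def build_drx_params(t_on: float, t_ssc: float, long_to_short_ratio: int) -> DrxCycleParams:
    """Derive the long sleep length as a preset multiple N of the short sleep length."""
    if t_on <= 0 or t_ssc <= 0 or long_to_short_ratio < 1:
        raise ValueError("period lengths must be positive and the ratio at least 1")
    return DrxCycleParams(t_on=t_on, t_ssc=t_ssc, t_lsc=long_to_short_ratio * t_ssc)


# Illustrative values only: target t_on = 10, target t_ssc = 20, preset ratio N = 4.
params = build_drx_params(t_on=10.0, t_ssc=20.0, long_to_short_ratio=4)
```

Keeping the ratio as a preset input means the solver only has to search over two quantities, the activation period and the short sleep period.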

[0079] The present invention provides a 5G terminal power consumption optimization method based on discontinuous reception. By establishing a Markov decision model in a dense heterogeneous network scenario, the method adaptively adjusts the DRX period parameters of the 5G terminal discontinuous reception mechanism, thereby reducing the additional power consumption and latency overhead caused by beam search and channel contention in unlicensed frequency bands.

[0080] Based on the above embodiments, the step of constructing a Markov decision model corresponding to a dense heterogeneous network scenario according to the discontinuous reception state period length and terminal state transition actions of the 5G terminal includes:

[0081] Based on the terminal state transition actions of the 5G terminal, a terminal state set, an action set, and a transition probability set are determined. The terminal state set includes a terminal active state, a terminal sleep state, a discontinuous reception active state, and a beam search state. The terminal sleep state includes a terminal short sleep state and a terminal long sleep state.

[0082] In this invention, a Markov decision model M = <S, A, T, R> is constructed. Here, S represents the set of terminal states, A represents the set of actions, T represents the set of transition probabilities, and R represents the reward function. In each decision step, the system selects action a∈A, transitions from the current state s∈S to the new state s′∈S, and receives a reward R(s,a).

[0083] Furthermore, in the parameter optimization mechanism for DRX cycle length proposed in this invention, the base station configures the DRX cycle length T_DRX = t_on + t_ssc for the 5G terminal through system actions in each decision window. The constructed Markov decision model is then solved using a subsequent reinforcement learning algorithm to obtain the DRX cycle length configuration strategy under the optimal state value.

[0084] Specifically, in order to enable 5G terminals to achieve the optimal balance between energy consumption and latency, this invention keeps the DRX cycle length constant within each decision window. The decision window length T_W is determined by the number of data packets buffered at the base station (i.e., when the 5G terminal is in a sleep state or a discontinuous reception activation state, the base station buffers the data packets to be sent to the 5G terminal). That is, N data packets arrive in each decision window. When a data packet arrives at the base station, the UE state (S1, S2, S3, S4, or S5) is determined by the DRX cycle length.

[0085] Furthermore, the state is defined at the moment a data packet arrives. In the current state s_k, the base station determines its action, the DRX cycle length, according to a stochastic policy π_θ(s_k, a_k), and the environment then transitions to the state of the next time step. Here, π_θ(s_k, a_k) represents the probability of taking action a_k in the current state s_k under the policy parameter θ. When action a_k is chosen, the probability function of the state transition between the current time k and the next time k+1 is expressed as:

[0086]

[0087] An energy-saving factor is constructed based on the number of data packets cached, the number of short sleep state cycles, and the duration of long sleep states.

[0088] Based on the ratio between the data packet waiting delay and the terminal sleep state period when the 5G terminal is in sleep state, the length of the discontinuous reception activation state period, and the number of terminal channel contention failures, a delay factor is constructed.

[0089] Based on the energy saving factor and the delay factor, an instantaneous reward function is constructed to calculate the trade-off between system delay and energy saving.

[0090] In this invention, a system measurement index for the 5G terminal power consumption optimization method in dense heterogeneous network scenarios is defined (i.e., an energy saving factor is constructed) to measure the energy saving effect of the terminal; at the same time, a latency factor β is defined to measure the UE latency. The UE latency mainly includes the latency caused by the data packet arriving at the base station buffer when the UE is in sleep mode and the latency caused by the failure of channel contention in DRX ON mode.

[0091] Based on the above embodiments, the formula for the instant reward function is:

[0092] R = δ*α + (1-δ)*(β+P);

[0093]

[0094]

[0095] Where δ represents the weight between terminal energy consumption and data packet waiting latency, 0≤δ≤1, so that the priority of latency and energy saving can be adjusted according to different business scenarios; α represents the energy saving factor; β represents the latency factor; P represents the adjustment parameter of the latency factor, with P=-∞ when the latency exceeds the maximum latency requirement of the data packet and P=0 otherwise; t_ssc is the duration corresponding to the short sleep timer and represents the length of the terminal's short sleep cycle; T_W represents the decision window length (the adjacent symbol N1 is not defined in this text); N2 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the short sleep state; N3 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the long sleep state; N4 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the discontinuous reception active state; M_i represents the number of short sleep state cycles experienced by the 5G terminal during the time interval between acquiring the i-th data packet and the (i+1)-th data packet when the 5G terminal is in the short sleep state (i.e., state S2) and the discontinuous reception activation state (i.e., state S4); N_ss represents the maximum number of short sleep state cycles; t_lsc indicates the duration corresponding to the long sleep timer; the long sleep duration term for the i-th data packet indicates the duration of the long sleep state experienced by the 5G terminal when the base station receives the i-th data packet; the waiting ratio term for the i-th data packet represents the ratio between the waiting latency of the i-th data packet acquired by the base station and the corresponding terminal sleep state period when the 5G terminal is in the terminal sleep state (terminal short sleep state S2 or terminal long sleep state S3); t_on is the duration corresponding to the discontinuous reception timer, representing the length of the discontinuous reception activation state period; b_i represents the number of times the 5G terminal failed terminal channel contention when the base station receives the i-th data packet; t_c indicates the timing extension duration corresponding to the discontinuous reception timer when the terminal channel contention fails.
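The instant reward R = δ*α + (1-δ)*(β+P) can be sketched as below. This is an illustrative sketch under stated assumptions: the function name instant_reward and its arguments are hypothetical, α and β are taken as already computed (their formulas are not reproduced in this text), and returning -∞ directly on a latency violation, rather than evaluating (1-δ)*(β+P), is a choice made here to avoid the undefined product 0*(-∞) when δ = 1.

```python
import math


def instant_reward(alpha: float, beta: float, delta: float,
                   latency: float, max_latency: float) -> float:
    """R = delta*alpha + (1 - delta)*(beta + P); P is -inf on a latency violation, else 0."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    if latency > max_latency:
        # Assumption: a violated latency bound makes the reward -inf for every delta.
        return -math.inf
    return delta * alpha + (1.0 - delta) * beta


# Illustrative numbers only; the sign convention of beta follows whatever the source defines.
r_ok = instant_reward(alpha=0.8, beta=-0.2, delta=0.5, latency=5.0, max_latency=10.0)
r_violation = instant_reward(alpha=0.8, beta=-0.2, delta=0.5, latency=20.0, max_latency=10.0)
```

A larger δ puts more weight on energy saving, a smaller δ on latency, which matches the role the text assigns to the weight.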

[0096] Furthermore, a state-value function for the Markov decision model is constructed, thereby representing the cumulative discounted reward. The formula for the state-value function is as follows:

[0097]

[0098] Where γ∈[0,1] represents the discount factor; s1 is the initial state, i.e. the terminal activation state.
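As an illustration of a cumulative discounted reward, the sketch below evaluates a single sampled reward sequence. It assumes the standard discounted-sum form, in which the k-th reward is weighted by γ^k, and is not a reproduction of the formula in [0097]; the state value would be the expectation of this quantity over trajectories starting from s1. The function name discounted_return is hypothetical.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**k * rewards[k], accumulated from the last reward backwards."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three decision windows with reward 1.0 each and gamma = 0.5: 1 + 0.5 + 0.25.
value_estimate = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```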

[0099] Based on the terminal state set, the action set, the transition probability set, and the instant reward function, a Markov decision model corresponding to the dense heterogeneous network scenario is constructed.

[0100] In this invention, a Markov decision model corresponding to a dense heterogeneous network scenario can be constructed based on the terminal state set, action set, transition probability set, and immediate reward function. This model can be solved using reinforcement learning-based methods (such as Markov decision process, Q-learning, deep reinforcement learning, etc.). Elements such as state transition probabilities and reward functions in the model can be used to calculate the expected reward of taking different actions in each state, and the action with the highest reward can be selected as the optimal policy.
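The idea of computing the expected reward of each action and selecting the best one can be sketched with a one-step lookahead. This is a generic illustration, assuming known transition probabilities and state values; the function names, the dictionary layout, and the numbers are hypothetical and do not come from the source.

```python
from typing import Dict, Hashable


def expected_return(reward: float, transitions: Dict[str, float],
                    values: Dict[str, float], gamma: float) -> float:
    """Q(s, a) = R(s, a) + gamma * sum over s' of T(s' | s, a) * V(s')."""
    return reward + gamma * sum(p * values[s_next] for s_next, p in transitions.items())


def greedy_action(q_by_action: Dict[Hashable, float]) -> Hashable:
    """Return the action whose expected return is largest."""
    return max(q_by_action, key=q_by_action.get)


# Illustrative numbers: the action leads to S1 or S2 with equal probability.
q = expected_return(reward=1.0, transitions={"S1": 0.5, "S2": 0.5},
                    values={"S1": 2.0, "S2": 4.0}, gamma=0.5)
# Candidate DRX cycle lengths mapped to made-up expected returns.
best = greedy_action({20: 1.0, 40: q, 80: 2.0})
```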

[0101] Based on the above embodiments, the step of obtaining the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model according to the system latency and energy saving trade-off benefits includes:

[0102] Based on the trade-off between system latency and energy saving, the Markov decision model is solved using the Actor-Critic reinforcement learning algorithm to obtain the target discontinuous reception activation state cycle length and target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model.

[0103] Figure 2 is a schematic diagram of the framework of the Actor-Critic reinforcement learning algorithm provided by the present invention. As shown in Figure 2, the Actor-Critic framework can be used to solve Markov decision processes with continuous state and action spaces. The reinforcement learning agent (the Actor) optimizes its behavior by interacting with the environment, while the evaluator (the Critic) uses the temporal-difference error to estimate the latency and energy saving trade-off benefit of the new state caused by the new DRX cycle length setting, and judges whether the system benefit has improved. At the same time, the agent adjusts its policy according to the temporal-difference error, thereby accelerating the iterative search for the maximum state value function.

[0104] Specifically, when the Actor-Critic reinforcement learning algorithm selects an action in the k-th decision window with environment state s_k ∈ S, the base station sets the DRX period length according to a stochastic policy and balances two competing objectives, namely, exploring for a better DRX period length and obtaining more reward. In this invention, a Gaussian distribution is used as the stochastic policy, and the base station in state s_k chooses action a_k according to this probability. The stochastic policy can be expressed as:

[0105]

[0106] Where μ(s_k) and σ(s_k) are the expectation and variance of the action T_DRX, whose parameterized representation is:

[0107]

[0108] Here, the two symbols in the parameterized representation denote the policy feature vector and the policy parameter vector, respectively.
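A Gaussian stochastic policy over the DRX cycle length can be sketched as follows. The sketch assumes the usual normal density with mean μ and standard deviation σ (the text calls σ(s_k) the variance; treating it as a standard deviation is an assumption made here), and the clipping of sampled actions to a feasible range [t_min, t_max] is also an assumption, not something stated in the source. Function names are hypothetical.

```python
import math
import random


def gaussian_policy_pdf(action: float, mu: float, sigma: float) -> float:
    """Density of choosing `action` under a normal distribution N(mu, sigma**2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (action - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def sample_drx_cycle(mu: float, sigma: float, t_min: float, t_max: float,
                     rng: random.Random) -> float:
    """Draw a DRX cycle length from the policy and clip it to [t_min, t_max]."""
    return min(t_max, max(t_min, rng.gauss(mu, sigma)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
t_drx = sample_drx_cycle(mu=40.0, sigma=5.0, t_min=10.0, t_max=80.0, rng=rng)
```

A larger σ spreads samples further from μ and so explores more DRX cycle lengths; a smaller σ concentrates on the current best estimate.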

[0109] Furthermore, the DRX mechanism based on the selected action behavior is as follows:

[0110] The UE executes the DRX mechanism based on the selected T_DRX. When all data packets within the k-th decision window have been sent, the state transitions to s_{k+1}. The temporal-difference error of this decision window can be calculated as follows:

[0111]

[0112] Here, the reward term indicates the reward for taking action a_k in state s_k; the value term indicates the state value function in state s_k; and γ represents the discount factor that maps future costs to the current state.

[0113] Furthermore, the state value function is updated:

[0114] After the agent selects action a_k ∈ A using a greedy algorithm, the Critic updates the parameter vector of the value function based on the temporal-difference error and the action value function, so that the agent can transition to the next state based on the selected action.

[0115] Furthermore, the policy function is updated:

[0116] Based on the updated state value, the Actor's policy function is updated using the temporal-difference error:

[0117]

[0118] Where α_k is the learning rate of the value function in the k-th decision window. The policy gradient can be calculated from μ(s_k) and σ(s_k).
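One Actor-Critic update for a decision window can be sketched as below. This is a textbook-style sketch under explicit assumptions, not the update rule of the source (whose formulas are not reproduced here): the value function and the policy mean are both linear in a feature vector, σ is held fixed, the temporal-difference error takes the standard form r + γ·V(s_{k+1}) − V(s_k), and all names and learning rates are hypothetical.

```python
from typing import List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def actor_critic_step(theta: List[float], w: List[float],
                      phi_s: List[float], phi_next: List[float],
                      action: float, reward: float,
                      sigma: float = 1.0, gamma: float = 0.9,
                      lr_actor: float = 0.01, lr_critic: float = 0.1
                      ) -> Tuple[List[float], List[float], float]:
    """Return updated (theta, w) and the TD error for one decision window."""
    # Critic: TD error with a linear value estimate V(s) = w . phi(s).
    td_error = reward + gamma * dot(w, phi_next) - dot(w, phi_s)
    new_w = [wi + lr_critic * td_error * f for wi, f in zip(w, phi_s)]
    # Actor: Gaussian policy with mean mu(s) = theta . phi(s) and fixed sigma.
    mu = dot(theta, phi_s)
    score = (action - mu) / (sigma ** 2)  # d/d(mu) of log N(action; mu, sigma^2)
    new_theta = [ti + lr_actor * td_error * score * f for ti, f in zip(theta, phi_s)]
    return new_theta, new_w, td_error


theta, w, td = actor_critic_step(theta=[0.0], w=[0.0], phi_s=[1.0], phi_next=[1.0],
                                 action=2.0, reward=1.0)
```

A positive TD error moves the policy mean toward the action just taken, and a negative one moves it away, which is how the Critic's judgment steers the Actor.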

[0119] Figure 3 is the second flowchart of the 5G terminal power consumption optimization method based on discontinuous reception provided by the present invention. As shown in Figure 3, this invention provides a 5G terminal power consumption optimization method based on discontinuous reception, applied to 5G terminals, including:

[0120] Step 301: Receive the optimal discontinuous reception cycle parameters sent by the base station. The optimal discontinuous reception cycle parameters are obtained by solving the system latency and energy saving trade-offs based on the Markov decision model corresponding to the dense heterogeneous network scenario and different terminal state transition actions. The Markov decision model is constructed based on the discontinuous reception state cycle length of the 5G terminal and the terminal state transition actions.

[0121] Step 302: Based on the preset discontinuous reception mechanism Markov state transition model, determine the current terminal state. If the current terminal state is a terminal sleep state or a discontinuous reception active state, adjust the period length corresponding to the terminal sleep state and the discontinuous reception active state according to the optimal discontinuous reception period parameter. The terminal sleep state includes a terminal short sleep state and a terminal long sleep state.

[0122] Based on the above embodiments, the preset discontinuous reception mechanism Markov state transition model is constructed from multiple terminal states of the 5G terminal, including terminal active state, terminal short sleep state, terminal long sleep state, discontinuous reception active state, and beam search state.

[0123] Figure 4 is a schematic diagram of the Markov state transition model for the discontinuous reception mechanism of 5G terminals provided by the present invention. As shown in Figure 4, in this invention, the UE performs data transmission in the terminal active state S1 (i.e., the active state) and continuously monitors the PDCCH. The UE consumes the most power in this state. The duration of the S1 state is controlled by the timing duration t_I of the inactivity timer: any data packet arriving at the UE before the inactivity timer expires resets the timer to 0, keeping the UE in the active state S1.

[0124] Furthermore, after the inactivity timer's timing duration t_I expires (i.e., no data packets are received by the UE during this period), the UE enters the terminal short sleep state S2 (i.e., short sleep), the duration of which is controlled by the short sleep timer's duration t_ssc. When the short sleep timer's duration t_ssc expires, the UE enters the discontinuous reception activation state S4 (i.e., DRX activation state) and starts the DRX timer, whose timing duration is t_on. The UE then briefly wakes up in the DRX active state and listens to the PDCCH. If the UE detects no data from the base station before the DRX timer's duration t_on expires, the terminal returns to short sleep state S2 and starts a new short sleep cycle. After N_ss consecutive short sleep cycles, the UE enters the terminal long sleep state S3 (i.e., long sleep) and starts the long sleep timer, whose timing duration is t_lsc; the longer duration t_lsc further saves power.

[0125] Furthermore, when the timing duration t_lsc of the long sleep timer expires, the UE re-enters the discontinuous reception active state S4 and starts the DRX timer. The UE briefly wakes up in the DRX active state and listens to the PDCCH. If the UE detects no data from the base station before the DRX timer's duration t_on expires, the terminal returns to the long sleep state S3 and starts a new long sleep cycle. The cycle continues until the UE successfully detects the PDCCH after entering the discontinuous reception active state S4 from a terminal long sleep state S3. In this case, the discontinuous reception active state S4 is maintained, and channel contention is performed.

[0126] Furthermore, after the UE successfully listens to the PDCCH in the discontinuous reception activation state S4, before entering the terminal activation state S1 to transmit and receive data, it needs to use the LBT (Listen Before Talk) mechanism to obtain unlicensed channel access permissions for 5G NR-U. Based on the above embodiment, the method further includes: if a data packet to be transmitted by the base station is detected on the physical downlink control channel in the discontinuous reception activation state, a terminal channel contention operation is performed; if the terminal channel contention operation fails, the discontinuous reception activation state is maintained, the terminal channel contention operation is performed again, and the duration of the discontinuous reception timer corresponding to the discontinuous reception activation state is increased.

[0127] In this invention, if the 5G terminal fails channel contention, it remains in the discontinuous reception active state S4. To reduce the latency caused by channel contention when the channel is busy, let t_c denote the time of one channel contention process. Whenever channel contention fails, the expiration timer t_on of the discontinuous reception activation state S4 is extended by t_c. The UE thus repeats the "channel contention, timer extension" process until t_on reaches its maximum value t_on-MAX. At this point, the 5G terminal enters short sleep state S2 again, repeating the state transition process between short sleep state S2, long sleep state S3, and discontinuous reception activation state S4 described in the above embodiments. In this invention, to ensure fair use of unlicensed frequency bands by wireless access technologies such as NR-U and WiFi, the maximum service time for data packet reception by the UE in terminal activation state S1 is set to not exceed the MCOT. When the MCOT expires, if the UE still has buffered data, it needs to re-enter discontinuous reception activation state S4 and perform channel contention again.
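The contention and timer-extension loop described above can be sketched as follows. It is a simplified illustration: the function name contend_with_extension is hypothetical, the outcome of each contention attempt is supplied by a caller-provided callable rather than by a real LBT procedure, and capping the extended timer exactly at the maximum is an assumption about how the limit is applied.

```python
from typing import Callable, Tuple


def contend_with_extension(t_on: float, t_c: float, t_on_max: float,
                           try_channel: Callable[[], bool]) -> Tuple[bool, float, int]:
    """Retry contention, extending t_on by t_c per failure, until t_on hits t_on_max.

    Returns (won_channel, final_t_on, number_of_failures).
    """
    if t_c <= 0:
        raise ValueError("t_c must be positive so the loop terminates")
    failures = 0
    while t_on < t_on_max:
        if try_channel():
            return True, t_on, failures          # proceed toward beam search (S5)
        failures += 1
        t_on = min(t_on + t_c, t_on_max)         # extend the DRX timer by t_c
    return False, t_on, failures                 # give up and return to short sleep (S2)


# Illustrative run: two busy-channel failures followed by a success.
outcomes = iter([False, False, True])
won, final_t_on, n_fail = contend_with_extension(10.0, 2.0, 16.0, lambda: next(outcomes))
```

The returned failure count corresponds to the per-packet number of contention failures that the latency factor takes as input.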

[0128] Furthermore, if the terminal channel contention operation is successful, it switches to the beam search state. If a target beam pair that meets the preset beam conditions is found within the preset beam search time, the beam search state is switched to the terminal active state, and data is transmitted with the base station through the target beam pair.

[0129] In this invention, beamforming-based directional communication is a key technology for solving problems such as high path loss in the millimeter-wave band. Before transitioning from a sleep state to the active state (i.e., the terminal activation state), the UE needs to search for the optimal beam pair between the transmitter Tx and the receiver Rx through a beam search process. Specifically, the DRX semi-Markov model proposed in this invention extends the 3GPP standard's DRX mechanism by adding a beam search state between the DRX active state and the UE active state. When the UE detects data sent by the BS on the PDCCH during the discontinuous reception active state S4 and successfully competes for channel access, the UE enters the beam search state S5, whose duration is the beam search time t_bs. After finding a suitable beam pair (e.g., one meeting the preset signal strength) within the preset beam search time, the UE enters the terminal activation state S1 and transmits data with the BS.
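The five-state transition logic described in the preceding paragraphs can be sketched as a small state machine. The state labels S1 to S5 and the limit N_ss follow the text; the event names, the step function, and the way the short sleep counter is tracked are assumptions introduced for illustration. Transitions the text does not describe (for example a failed beam search) are rejected rather than guessed.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    ACTIVE = "S1"
    SHORT_SLEEP = "S2"
    LONG_SLEEP = "S3"
    DRX_ON = "S4"
    BEAM_SEARCH = "S5"


class Event(Enum):
    INACTIVITY_TIMEOUT = 1      # t_I expired with no packet
    SLEEP_TIMEOUT = 2           # t_ssc or t_lsc expired
    NO_DATA = 3                 # t_on expired, nothing detected on the PDCCH
    CONTENTION_FAILED = 4       # channel contention lost, stay in S4
    CONTENTION_WON = 5          # channel contention won
    ON_TIMER_MAXED = 6          # t_on reached t_on-MAX
    BEAM_FOUND = 7              # suitable beam pair found within t_bs
    MCOT_EXPIRED_WITH_DATA = 8  # MCOT over while data is still buffered


_SIMPLE = {
    (State.ACTIVE, Event.INACTIVITY_TIMEOUT): State.SHORT_SLEEP,
    (State.ACTIVE, Event.MCOT_EXPIRED_WITH_DATA): State.DRX_ON,
    (State.SHORT_SLEEP, Event.SLEEP_TIMEOUT): State.DRX_ON,
    (State.LONG_SLEEP, Event.SLEEP_TIMEOUT): State.DRX_ON,
    (State.DRX_ON, Event.CONTENTION_FAILED): State.DRX_ON,
    (State.DRX_ON, Event.CONTENTION_WON): State.BEAM_SEARCH,
    (State.DRX_ON, Event.ON_TIMER_MAXED): State.SHORT_SLEEP,
    (State.BEAM_SEARCH, Event.BEAM_FOUND): State.ACTIVE,
}


def step(state: State, event: Event, short_cycles: int, n_ss: int) -> Tuple[State, int]:
    """Return (next_state, short sleep cycles started since leaving the active state)."""
    if state is State.DRX_ON and event is Event.NO_DATA:
        if short_cycles < n_ss:
            return State.SHORT_SLEEP, short_cycles + 1
        return State.LONG_SLEEP, short_cycles
    if (state, event) not in _SIMPLE:
        raise ValueError(f"transition not described: {state.value} on {event.name}")
    nxt = _SIMPLE[(state, event)]
    if nxt is State.SHORT_SLEEP:
        return nxt, 1   # first short sleep cycle of a new sleep sequence
    if nxt is State.ACTIVE:
        return nxt, 0
    return nxt, short_cycles


state, cycles = step(State.ACTIVE, Event.INACTIVITY_TIMEOUT, short_cycles=0, n_ss=2)
```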

[0130] The present invention provides a 5G terminal power consumption optimization method based on discontinuous reception. By establishing a Markov decision model in a dense heterogeneous network scenario, the method adaptively adjusts the DRX period parameters of the 5G terminal discontinuous reception mechanism, thereby reducing the additional power consumption and latency overhead caused by beam search and channel contention in unlicensed frequency bands.

[0131] In one embodiment, the energy consumption optimization method for 5G terminals in dense heterogeneous network scenarios proposed in this invention is simulated and evaluated. The proposed method is compared with the DRX mechanism under the 3GPP standard (C-DRX) and the idealized DRX mechanism (I-DRX) to evaluate the effectiveness of the proposed Actor-Critic based DRX mechanism (AC-DRX). The C-DRX mechanism uses a fixed DRX period configuration based on the 3GPP standard without considering service traffic characteristics. I-DRX assumes that all packet arrivals follow a Poisson distribution (setting λ = 1 / 30), and theoretically calculates the optimal DRX period that satisfies the average delay constraint, which can serve as a theoretical reference value for the optimal effect of the DRX mechanism. Two performance indicators, the cumulative energy efficiency factor (AEE) and the cumulative delay factor (AD), are defined to measure the effectiveness of the proposed DRX parameter optimization method in improving UE energy efficiency. The wireless environment parameter settings are shown in Table 1 below.
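The Poisson arrival assumption used for I-DRX can be illustrated with a short traffic generator. This is only a sketch of how such arrivals might be simulated: the function name is hypothetical, the time unit of λ = 1/30 is not stated in the text and is left abstract here, and the seed and packet count are arbitrary.

```python
import random
from typing import List


def poisson_arrival_times(rate: float, n_packets: int, rng: random.Random) -> List[float]:
    """Arrival times of a Poisson process, built from exponential inter-arrival gaps."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    now = 0.0
    times = []
    for _ in range(n_packets):
        now += rng.expovariate(rate)  # mean gap is 1 / rate
        times.append(now)
    return times


rng = random.Random(42)  # fixed seed so the sketch is repeatable
arrivals = poisson_arrival_times(rate=1 / 30, n_packets=5000, rng=rng)
mean_gap = arrivals[-1] / len(arrivals)  # should be close to 30 time units
```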

[0132] Table 1

[0133]

[0134]

[0135] During the simulation, the changes in the cumulative energy efficiency factor (AEE) of the three DRX mechanisms as the number of decision windows increases are compared first. Figure 5 is a comparative diagram of the cumulative energy efficiency factor provided by this invention. As shown in Figure 5, I-DRX achieves the theoretically best AEE performance, serving as an upper bound for measuring the AEE performance of the C-DRX and AC-DRX methods. Compared with the C-DRX mechanism, the AC-DRX mechanism provided by this invention achieves a significant improvement in AEE. This improvement comes from dynamically optimizing the DRX cycle using the reinforcement learning Actor-Critic method within the Markov decision process framework.

[0136] Furthermore, the impact of the three DRX mechanisms on system latency is compared using the cumulative delay factor (AD). Figure 6 is a comparative diagram of the cumulative delay factor provided by the present invention. As shown in Figure 6, under ideal conditions, I-DRX performs best, AC-DRX outperforms C-DRX, and all three DRX mechanisms meet the Maximum Channel Occupancy Time (MCOT) requirement in unlicensed frequency bands. Simulation results demonstrate that the 5G terminal power consumption optimization method proposed in this invention for dense heterogeneous network scenarios can effectively improve the terminal's energy consumption and latency performance through an adaptive DRX parameter optimization mechanism.

[0137] This invention, based on a semi-Markov chain, establishes a model for the state and state transitions of the discontinuous reception mechanism of 5G terminals in dense heterogeneous network scenarios. This state transition model improves upon the DRX mechanism state in LTE standard scenarios by introducing the 5G NR beam search state and channel contention between multiple radio technologies in unlicensed channels into the state transition model, thus mitigating the additional power consumption overhead caused by terminal channel contention. Simultaneously, this invention introduces a reinforcement learning Actor-Critic algorithm to adaptively adjust the DRX cycle parameters of the discontinuous reception mechanism of 5G terminals in dense heterogeneous network scenarios. It defines energy-saving and latency factors to measure the impact of DRX cycle length on system latency and energy consumption, and uses the Actor-Critic framework to obtain the maximum benefit of balancing latency and energy consumption in the system.

[0138] The following describes the 5G terminal power consumption optimization device based on discontinuous reception provided by the present invention. The 5G terminal power consumption optimization device based on discontinuous reception described below and the 5G terminal power consumption optimization method based on discontinuous reception described above can be referred to in correspondence with each other.

[0139] Figure 7 is the first structural schematic diagram of the 5G terminal power consumption optimization device based on discontinuous reception provided by the present invention. As shown in Figure 7, this invention provides a 5G terminal power consumption optimization device based on discontinuous reception, applied to a base station. It includes a dense heterogeneous network scenario construction module 701, a processing module 702, and a cycle parameter generation module 703. The dense heterogeneous network scenario construction module 701 is used to construct a Markov decision model corresponding to the dense heterogeneous network scenario based on the discontinuous reception state period length and terminal state transition actions of the 5G terminal and, based on different terminal state transition actions, to obtain the corresponding system latency and energy saving trade-off benefits. The discontinuous reception state period length includes the discontinuous reception active state period length and the terminal short sleep period length. The processing module 702 is used to obtain the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model based on the system latency and energy saving trade-off benefits. The cycle parameter generation module 703 is used to construct the optimal discontinuous reception cycle parameters of the 5G terminal in the dense heterogeneous network scenario based on the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length, and to send the optimal discontinuous reception cycle parameters to the 5G terminal so that the 5G terminal can perform power consumption optimization adjustment based on the optimal discontinuous reception cycle parameters.

[0140] The 5G terminal power consumption optimization system based on discontinuous reception provided by this invention establishes a Markov decision model in a dense heterogeneous network scenario to adaptively adjust the DRX period parameters of the 5G terminal discontinuous reception mechanism, thereby reducing the additional power consumption and latency overhead caused by beam search and channel contention of 5G terminals in unlicensed frequency bands.

[0141] Based on the above embodiments, the dense heterogeneous network scenario construction module is specifically used for:

[0142] Based on the terminal state transition actions of the 5G terminal, a terminal state set, an action set, and a transition probability set are determined. The terminal state set includes a terminal active state, a terminal sleep state, a discontinuous reception active state, and a beam search state. The terminal sleep state includes a terminal short sleep state and a terminal long sleep state.

[0143] An energy-saving factor is constructed based on the number of data packets cached, the number of short sleep state cycles, and the duration of long sleep states.

[0144] Based on the ratio between the data packet waiting delay and the terminal sleep state period when the 5G terminal is in sleep state, the length of the discontinuous reception activation state period, and the number of terminal channel contention failures, a delay factor is constructed.

[0145] Based on the energy saving factor and the delay factor, an instantaneous reward function is constructed to calculate the trade-off between system delay and energy saving.

[0146] Based on the terminal state set, the action set, the transition probability set, and the instant reward function, a Markov decision model corresponding to the dense heterogeneous network scenario is constructed.

[0147] Based on the above embodiments, the formula for the instant reward function is:

[0148] R = δ*α + (1-δ)*(β+P);

[0149]

[0150]

[0151] Where δ represents the weight between terminal power consumption and data packet waiting latency, α represents the power saving factor, β represents the latency factor, and P represents the adjustment parameter of the latency factor; t_ssc is the duration corresponding to the short sleep timer and represents the length of the terminal's short sleep cycle; T_W represents the decision window length (the adjacent symbol N1 is not defined in this text); N2 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the short sleep state; N3 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the long sleep state; N4 represents the number of data packets to be sent to the 5G terminal that the base station receives when the 5G terminal is in the discontinuous reception active state; M_i represents the number of short sleep state cycles experienced by the 5G terminal during the time interval between acquiring the i-th data packet and the (i+1)-th data packet when the 5G terminal is in the short sleep state and the discontinuous reception activation state; N_ss represents the maximum number of short sleep state cycles; t_lsc indicates the duration corresponding to the long sleep timer; the long sleep duration term for the i-th data packet indicates the duration of the long sleep state experienced by the 5G terminal when the base station receives the i-th data packet; the waiting ratio term for the i-th data packet represents the ratio between the waiting latency of the i-th data packet acquired by the base station when the 5G terminal is in sleep mode and the corresponding terminal sleep state period; t_on is the duration corresponding to the discontinuous reception timer, representing the length of the discontinuous reception activation state period; b_i represents the number of times the 5G terminal failed terminal channel contention when the base station receives the i-th data packet; t_c indicates the timing extension duration corresponding to the discontinuous reception timer when the terminal channel contention fails.

[0152] Based on the above embodiments, the processing module is specifically used for:

[0153] Based on the trade-off between system latency and energy saving, the Markov decision model is solved using the Actor-Critic reinforcement learning algorithm to obtain the target discontinuous reception activation state cycle length and target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model.

[0154] Figure 8 is the second structural schematic diagram of the 5G terminal power consumption optimization device based on discontinuous reception provided by the present invention. As shown in Figure 8, this invention provides a 5G terminal power consumption optimization device based on discontinuous reception, applied to a 5G terminal. It includes a period parameter receiving module 801 and a period parameter adjustment module 802. The period parameter receiving module 801 receives the optimal discontinuous reception period parameter sent by the base station. This optimal discontinuous reception period parameter is obtained by solving a Markov decision model based on a dense heterogeneous network scenario, considering the system latency and energy saving trade-offs for different terminal state transition actions. The Markov decision model is constructed based on the discontinuous reception state period length and the terminal state transition actions of the 5G terminal. The period parameter adjustment module 802 determines the current terminal state based on a preset discontinuous reception mechanism Markov state transition model. If the current terminal state is a terminal sleep state or a discontinuous reception active state, the period length corresponding to the terminal sleep state and the discontinuous reception active state is adjusted according to the optimal discontinuous reception period parameter. The terminal sleep state includes a short sleep state and a long sleep state.

[0155] The 5G terminal power consumption optimization system based on discontinuous reception provided by this invention establishes a Markov decision model in a dense heterogeneous network scenario to adaptively adjust the DRX period parameters of the 5G terminal discontinuous reception mechanism, thereby reducing the additional power consumption and latency overhead caused by beam search and channel contention of 5G terminals in unlicensed frequency bands.

[0156] Based on the above embodiments, the preset discontinuous reception mechanism Markov state transition model is constructed from multiple terminal states of the 5G terminal, including terminal active state, terminal short sleep state, terminal long sleep state, discontinuous reception active state, and beam search state.

[0157] Based on the above embodiments, the device is further used for:

[0158] If, during the discontinuous reception activation state, a data packet to be sent by the base station is detected on the physical downlink control channel, a terminal channel contention operation is performed. If the terminal channel contention operation fails, the discontinuous reception activation state is maintained, the terminal channel contention operation is performed again, and the duration of the discontinuous reception timer corresponding to the discontinuous reception activation state is increased.

[0159] If the terminal channel contention operation is successful, it switches to the beam search state. If a target beam pair that meets the preset beam conditions is found within the preset beam search time, the beam search state is switched to the terminal active state, and data is transmitted with the base station through the target beam pair.
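The contention and beam search transitions in the two preceding paragraphs can be sketched as a small state walk. The function name, its parameters, and the injected lists of outcomes are assumptions for illustration. The description here does not say what happens when no suitable beam pair is found in time, so the sketch simply leaves the terminal in the beam search state in that case.

```python
from enum import Enum, auto


class TerminalState(Enum):
    ACTIVE = auto()
    SHORT_SLEEP = auto()
    LONG_SLEEP = auto()
    DRX_ACTIVE = auto()
    BEAM_SEARCH = auto()


def handle_pending_packet(contention_outcomes, beam_pair_found, on_timer_ms, extension_ms):
    """Walk the transitions from DRX_ACTIVE once a pending packet is seen on the PDCCH.

    contention_outcomes: iterable of booleans, one per contention attempt
                         (injected here for illustration).
    beam_pair_found:     whether a qualifying beam pair is found within the search time.
    Returns (state, on_timer_ms) after the attempts are consumed.
    """
    state = TerminalState.DRX_ACTIVE
    for won_channel in contention_outcomes:
        if won_channel:
            # Contention succeeded: switch to the beam search state.
            state = TerminalState.BEAM_SEARCH
            break
        # Contention failed: stay in DRX_ACTIVE and extend the DRX timer before retrying.
        on_timer_ms += extension_ms
    if state is TerminalState.BEAM_SEARCH and beam_pair_found:
        # A target beam pair was found in time: data transfer happens in the active state.
        state = TerminalState.ACTIVE
    return state, on_timer_ms
```

Calling `handle_pending_packet([False, False, True], True, 10, 2)` models two failed attempts followed by a success, so the timer is extended twice before the terminal reaches the active state.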

[0160] The apparatus provided by the present invention is used to execute the above-described method embodiments. For specific processes and details, please refer to the above embodiments, which will not be repeated here.

[0161] Figure 9 is a schematic diagram of the structure of the electronic device provided by the present invention. As shown in Figure 9, the electronic device may include a processor 901, a communication interface 902, a memory 903, and a communication bus 904, wherein the processor 901, the communication interface 902, and the memory 903 communicate with each other through the communication bus 904. The processor 901 can call logic instructions in the memory 903 to execute a 5G terminal power consumption optimization method based on discontinuous reception. This method includes: constructing a Markov decision model corresponding to a dense heterogeneous network scenario based on the discontinuous reception state cycle length and terminal state transition actions of the 5G terminal, and obtaining the corresponding system latency and energy saving trade-off benefits based on different terminal state transition actions, wherein the discontinuous reception state cycle length includes the discontinuous reception active state cycle length and the terminal short sleep cycle length; obtaining the target discontinuous reception active state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model based on the system latency and energy saving trade-off benefits; constructing the optimal discontinuous reception cycle parameters corresponding to the 5G terminal in the dense heterogeneous network scenario based on the target discontinuous reception active state cycle length and the target terminal short sleep cycle length; and sending the optimal discontinuous reception cycle parameters to the 5G terminal so that the 5G terminal can perform power consumption optimization adjustments based on the optimal discontinuous reception cycle parameters.

[0162] Alternatively, the system receives the optimal discontinuous reception cycle parameter sent by the base station. This optimal discontinuous reception cycle parameter is obtained by solving a Markov decision model based on the dense heterogeneous network scenario, considering the system latency and energy-saving trade-offs for different terminal state transition actions. The Markov decision model is constructed based on the discontinuous reception state cycle length of the 5G terminal and the terminal state transition actions. Based on the preset discontinuous reception mechanism Markov state transition model, the current terminal state is determined. If the current terminal state is a terminal sleep state or a discontinuous reception active state, the cycle length corresponding to the terminal sleep state and the discontinuous reception active state is adjusted according to the optimal discontinuous reception cycle parameter. The terminal sleep state includes a short sleep state and a long sleep state.

[0163] Furthermore, the logical instructions in the aforementioned memory 903 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, essentially, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0164] In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to execute the 5G terminal power consumption optimization method based on discontinuous reception provided by the above method embodiments. The method comprises: constructing a Markov decision model corresponding to a dense heterogeneous network scenario based on the discontinuous reception state cycle length and terminal state transition actions of the 5G terminal, and obtaining the corresponding system latency and energy saving trade-off benefits based on different terminal state transition actions, wherein the discontinuous reception state cycle length includes the discontinuous reception active state cycle length and the terminal short sleep cycle length; obtaining, based on the system latency and energy saving trade-off benefits, the target discontinuous reception active state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model; constructing, based on the target discontinuous reception active state cycle length and the target terminal short sleep cycle length, the optimal discontinuous reception cycle parameters corresponding to the 5G terminal in the dense heterogeneous network scenario; and sending the optimal discontinuous reception cycle parameters to the 5G terminal so that the 5G terminal can perform power consumption optimization adjustment based on the optimal discontinuous reception cycle parameters.

[0165] Alternatively, the system receives the optimal discontinuous reception cycle parameter sent by the base station. This optimal discontinuous reception cycle parameter is obtained by solving a Markov decision model based on the dense heterogeneous network scenario, considering the system latency and energy-saving trade-offs for different terminal state transition actions. The Markov decision model is constructed based on the discontinuous reception state cycle length of the 5G terminal and the terminal state transition actions. Based on the preset discontinuous reception mechanism Markov state transition model, the current terminal state is determined. If the current terminal state is a terminal sleep state or a discontinuous reception active state, the cycle length corresponding to the terminal sleep state and the discontinuous reception active state is adjusted according to the optimal discontinuous reception cycle parameter. The terminal sleep state includes a short sleep state and a long sleep state.

[0166] In another aspect, the present invention also provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the 5G terminal power consumption optimization method based on discontinuous reception provided in the above embodiments. The method includes: constructing a Markov decision model corresponding to a dense heterogeneous network scenario based on the discontinuous reception state cycle length and terminal state transition actions of the 5G terminal, and obtaining the corresponding system latency and energy saving trade-off benefits based on different terminal state transition actions, wherein the discontinuous reception state cycle length includes the discontinuous reception active state cycle length and the terminal short sleep cycle length; obtaining the target discontinuous reception active state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model based on the system latency and energy saving trade-off benefits; constructing the optimal discontinuous reception cycle parameters corresponding to the 5G terminal in the dense heterogeneous network scenario based on the target discontinuous reception active state cycle length and the target terminal short sleep cycle length; and sending the optimal discontinuous reception cycle parameters to the 5G terminal so that the 5G terminal can perform power consumption optimization adjustments based on the optimal discontinuous reception cycle parameters.

[0167] Alternatively, the system receives the optimal discontinuous reception cycle parameter sent by the base station. This optimal discontinuous reception cycle parameter is obtained by solving a Markov decision model based on the dense heterogeneous network scenario, considering the system latency and energy-saving trade-offs for different terminal state transition actions. The Markov decision model is constructed based on the discontinuous reception state cycle length of the 5G terminal and the terminal state transition actions. Based on the preset discontinuous reception mechanism Markov state transition model, the current terminal state is determined. If the current terminal state is a terminal sleep state or a discontinuous reception active state, the cycle length corresponding to the terminal sleep state and the discontinuous reception active state is adjusted according to the optimal discontinuous reception cycle parameter. The terminal sleep state includes a short sleep state and a long sleep state.

[0168] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.

[0169] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.

[0170] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for optimizing the power consumption of a 5G terminal based on discontinuous reception, characterized in that the method is applied to a base station and comprises:
constructing, based on the discontinuous reception state cycle length and terminal state transition actions of the 5G terminal, a Markov decision model corresponding to a dense heterogeneous network scenario, and obtaining, based on different terminal state transition actions, the corresponding system latency and energy saving trade-off benefits, wherein the discontinuous reception state cycle length includes the discontinuous reception active state cycle length and the terminal short sleep cycle length;
obtaining, based on the system latency and energy saving trade-off benefits, the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model;
constructing, based on the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length, the optimal discontinuous reception cycle parameter corresponding to the 5G terminal in the dense heterogeneous network scenario, and sending the optimal discontinuous reception cycle parameter to the 5G terminal so that the 5G terminal can perform power consumption optimization adjustment based on the optimal discontinuous reception cycle parameter;
wherein the step of constructing the Markov decision model corresponding to the dense heterogeneous network scenario based on the discontinuous reception state cycle length and terminal state transition actions of the 5G terminal includes:
determining, based on the terminal state transition actions of the 5G terminal, a terminal state set, an action set, and a transition probability set, wherein the terminal state set includes a terminal active state, a terminal sleep state, a discontinuous reception active state, and a beam search state, and the terminal sleep state includes a terminal short sleep state and a terminal long sleep state;
constructing an energy saving factor based on the number of cached data packets, the number of short sleep state cycles, and the duration of the long sleep state;
constructing a delay factor based on the ratio between the data packet waiting delay and the terminal sleep state period when the 5G terminal is in the sleep state, the length of the discontinuous reception activation state period, and the number of terminal channel contention failures;
constructing, based on the energy saving factor and the delay factor, an instant reward function used to calculate the system latency and energy saving trade-off benefits;
constructing, based on the terminal state set, the action set, the transition probability set, and the instant reward function, the Markov decision model corresponding to the dense heterogeneous network scenario;
wherein the formula for the instant reward function is: […]; […]; […]; in which the symbols denote, in order: the weight between terminal power consumption and data packet latency; the energy saving factor; the delay factor; the parameter for adjusting the delay factor; the duration corresponding to the short sleep timer, representing the terminal short sleep cycle length; the length of the decision window; the number of data packets to be sent to the 5G terminal that the base station obtains while the 5G terminal is in the short sleep state; the number of data packets to be sent to the 5G terminal that the base station receives and buffers while the 5G terminal is in the long sleep state; the number of data packets to be sent to the 5G terminal that the base station receives and buffers while the 5G terminal is in the discontinuous reception activation state; the number of short sleep state cycles experienced by the 5G terminal within the time period between the base station obtaining two of the data packets while the 5G terminal is in the short sleep state and the discontinuous reception activation state; the maximum number of short sleep state cycles; the duration corresponding to the long sleep timer; the long sleep state duration corresponding to the 5G terminal being in the long sleep state when the base station obtains the i-th data packet; the ratio between the waiting latency of the i-th data packet obtained by the base station while the 5G terminal is in the sleep state and the corresponding terminal sleep state period; the duration corresponding to the discontinuous reception timer, representing the length of the discontinuous reception activation state period; the number of times the 5G terminal has failed terminal channel contention when the base station obtains the i-th data packet; and the timing extension duration applied to the discontinuous reception timer when terminal channel contention fails.

2. The 5G terminal power consumption optimization method based on discontinuous reception according to claim 1, characterized in that the step of obtaining the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model based on the system latency and energy saving trade-off benefits includes: solving the Markov decision model using the Actor-Critic reinforcement learning algorithm, based on the system latency and energy saving trade-off benefits, to obtain the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model.

3. A method for optimizing the power consumption of a 5G terminal based on discontinuous reception, characterized in that the method is applied to a 5G terminal and comprises:
receiving the optimal discontinuous reception period parameter sent by the base station, wherein the optimal discontinuous reception period parameter is obtained by solving a Markov decision model corresponding to a dense heterogeneous network scenario, according to the system latency and energy saving trade-off benefits of different terminal state transition actions, and the Markov decision model is constructed based on the discontinuous reception state period length and terminal state transition actions of the 5G terminal, the construction including:
determining, based on the terminal state transition actions of the 5G terminal, a terminal state set, an action set, and a transition probability set, wherein the terminal state set includes a terminal active state, a terminal sleep state, a discontinuous reception active state, and a beam search state, and the terminal sleep state includes a terminal short sleep state and a terminal long sleep state;
constructing an energy saving factor based on the number of cached data packets, the number of short sleep state cycles, and the duration of the long sleep state;
constructing a delay factor based on the ratio between the data packet waiting delay and the terminal sleep state period when the 5G terminal is in the sleep state, the length of the discontinuous reception activation state period, and the number of terminal channel contention failures;
constructing, based on the energy saving factor and the delay factor, an instant reward function used to calculate the system latency and energy saving trade-off benefits;
constructing, based on the terminal state set, the action set, the transition probability set, and the instant reward function, the Markov decision model corresponding to the dense heterogeneous network scenario;
wherein the formula for the instant reward function is: […]; […]; […]; in which the symbols denote, in order: the weight between terminal power consumption and data packet latency; the energy saving factor; the delay factor; the parameter for adjusting the delay factor; the duration corresponding to the short sleep timer, representing the terminal short sleep cycle length; the length of the decision window; the number of data packets to be sent to the 5G terminal that the base station obtains while the 5G terminal is in the short sleep state; the number of data packets to be sent to the 5G terminal that the base station receives and buffers while the 5G terminal is in the long sleep state; the number of data packets to be sent to the 5G terminal that the base station receives and buffers while the 5G terminal is in the discontinuous reception activation state; the number of short sleep state cycles experienced by the 5G terminal within the time period between the base station obtaining two of the data packets while the 5G terminal is in the short sleep state and the discontinuous reception activation state; the maximum number of short sleep state cycles; the duration corresponding to the long sleep timer; the long sleep state duration corresponding to the 5G terminal being in the long sleep state when the base station obtains the i-th data packet; the ratio between the waiting latency of the i-th data packet obtained by the base station while the 5G terminal is in the sleep state and the corresponding terminal sleep state period; the duration corresponding to the discontinuous reception timer, representing the length of the discontinuous reception activation state period; the number of times the 5G terminal has failed terminal channel contention when the base station obtains the i-th data packet; and the timing extension duration applied to the discontinuous reception timer when terminal channel contention fails;
determining the current terminal state based on a preset discontinuous reception mechanism Markov state transition model, and, if the current terminal state is a terminal sleep state or a discontinuous reception active state, adjusting the period lengths corresponding to the terminal sleep state and the discontinuous reception active state according to the optimal discontinuous reception period parameter, wherein the terminal sleep state includes a terminal short sleep state and a terminal long sleep state.

4. The 5G terminal power consumption optimization method based on discontinuous reception according to claim 3, characterized in that the preset discontinuous reception mechanism Markov state transition model is constructed from multiple terminal states of the 5G terminal, including a terminal active state, a terminal short sleep state, a terminal long sleep state, a discontinuous reception active state, and a beam search state.

5. The 5G terminal power consumption optimization method based on discontinuous reception according to claim 4, characterized in that the method further includes:
if, during the discontinuous reception activation state, a data packet to be sent by the base station is detected on the physical downlink control channel, performing a terminal channel contention operation;
if the terminal channel contention operation fails, maintaining the discontinuous reception activation state, performing the terminal channel contention operation again, and increasing the duration of the discontinuous reception timer corresponding to the discontinuous reception activation state;
if the terminal channel contention operation succeeds, switching to the beam search state, and, if a target beam pair that meets the preset beam conditions is found within the preset beam search time, switching from the beam search state to the terminal active state and transmitting data with the base station through the target beam pair.

6. A 5G terminal power consumption optimization device based on discontinuous reception, characterized in that the device is applied to a base station and comprises:
a dense heterogeneous network scenario construction module, used to construct a Markov decision model corresponding to a dense heterogeneous network scenario based on the discontinuous reception state cycle length and terminal state transition actions of the 5G terminal, and to obtain the corresponding system latency and energy saving trade-off benefits based on different terminal state transition actions, wherein the discontinuous reception state cycle length includes the discontinuous reception active state cycle length and the terminal short sleep cycle length;
a processing module, used to obtain, based on the system latency and energy saving trade-off benefits, the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model;
a period parameter generation module, used to construct the optimal discontinuous reception period parameter of the 5G terminal in the dense heterogeneous network scenario based on the target discontinuous reception activation state period length and the target terminal short sleep period length, and to send the optimal discontinuous reception period parameter to the 5G terminal so that the 5G terminal can perform power consumption optimization adjustment based on the optimal discontinuous reception period parameter;
wherein the dense heterogeneous network scenario construction module is specifically used for:
determining, based on the terminal state transition actions of the 5G terminal, a terminal state set, an action set, and a transition probability set, wherein the terminal state set includes a terminal active state, a terminal sleep state, a discontinuous reception active state, and a beam search state, and the terminal sleep state includes a terminal short sleep state and a terminal long sleep state;
constructing an energy saving factor based on the number of cached data packets, the number of short sleep state cycles, and the duration of the long sleep state;
constructing a delay factor based on the ratio between the data packet waiting delay and the terminal sleep state period when the 5G terminal is in the sleep state, the length of the discontinuous reception activation state period, and the number of terminal channel contention failures;
constructing, based on the energy saving factor and the delay factor, an instant reward function used to calculate the system latency and energy saving trade-off benefits;
constructing, based on the terminal state set, the action set, the transition probability set, and the instant reward function, the Markov decision model corresponding to the dense heterogeneous network scenario;
wherein the formula for the instant reward function is: […]; […]; […]; in which the symbols denote, in order: the weight between terminal power consumption and data packet latency; the energy saving factor; the delay factor; the parameter for adjusting the delay factor; the duration corresponding to the short sleep timer, representing the terminal short sleep cycle length; the length of the decision window; the number of data packets to be sent to the 5G terminal that the base station obtains while the 5G terminal is in the short sleep state; the number of data packets to be sent to the 5G terminal that the base station receives and buffers while the 5G terminal is in the long sleep state; the number of data packets to be sent to the 5G terminal that the base station receives and buffers while the 5G terminal is in the discontinuous reception activation state; the number of short sleep state cycles experienced by the 5G terminal within the time period between the base station obtaining two of the data packets while the 5G terminal is in the short sleep state and the discontinuous reception activation state; the maximum number of short sleep state cycles; the duration corresponding to the long sleep timer; the long sleep state duration corresponding to the 5G terminal being in the long sleep state when the base station obtains the i-th data packet; the ratio between the waiting latency of the i-th data packet obtained by the base station while the 5G terminal is in the sleep state and the corresponding terminal sleep state period; the duration corresponding to the discontinuous reception timer, representing the length of the discontinuous reception activation state period; the number of times the 5G terminal has failed terminal channel contention when the base station obtains the i-th data packet; and the timing extension duration applied to the discontinuous reception timer when terminal channel contention fails.

7. The 5G terminal power consumption optimization device based on discontinuous reception according to claim 6, characterized in that the processing module is specifically used for: solving the Markov decision model using the Actor-Critic reinforcement learning algorithm, based on the system latency and energy saving trade-off benefits, to obtain the target discontinuous reception activation state cycle length and the target terminal short sleep cycle length corresponding to the maximum state value of the Markov decision model.

8. A 5G terminal power consumption optimization device based on discontinuous reception, characterized in that it is applied to a 5G terminal and comprises:
a cycle parameter receiving module, configured to receive the optimal discontinuous reception cycle parameters sent by the base station, wherein the optimal discontinuous reception cycle parameters are obtained from a Markov decision model corresponding to a dense heterogeneous network scenario, according to the system latency and energy-saving trade-off benefit of different terminal state transition actions; the Markov decision model is constructed from the discontinuous reception state cycle length and the terminal state transition actions of the 5G terminal, and its construction comprises: determining a terminal state set, an action set, and a transition probability set based on the terminal state transition actions of the 5G terminal, wherein the terminal state set includes a terminal active state, a terminal sleep state, a discontinuous reception activation state, and a beam search state, and the terminal sleep state includes a terminal short sleep state and a terminal long sleep state; constructing an energy-saving factor based on the number of buffered data packets, the number of short sleep state cycles, and the duration of the long sleep state; constructing a delay factor based on the ratio between the data packet waiting delay and the terminal sleep state cycle when the 5G terminal is in the sleep state, the length of the discontinuous reception activation state cycle, and the number of terminal channel contention failures; constructing, based on the energy-saving factor and the delay factor, an instantaneous reward function for calculating the system latency and energy-saving trade-off benefit; and constructing, based on the terminal state set, the action set, the transition probability set, and the instantaneous reward function, the Markov decision model corresponding to the dense heterogeneous network scenario;
the formula for the instantaneous reward function is: [formulas and their symbols are not reproduced in this text], in which the quantities are, in order:
a weight between terminal power consumption and data packet latency;
the energy-saving factor;
the delay factor;
a parameter for adjusting the delay factor;
the duration corresponding to the short sleep timer, which represents the terminal short sleep cycle length;
the length of the decision window;
the number of data packets to be sent to the 5G terminal that the base station obtains while the 5G terminal is in the short sleep state;
the number of data packets to be sent to the 5G terminal that the base station receives and buffers while the 5G terminal is in the long sleep state;
the number of data packets to be sent to the 5G terminal that the base station receives and buffers while the 5G terminal is in the discontinuous reception activation state;
the number of short sleep state cycles experienced by the 5G terminal within the time period between two indexed data packets obtained by the base station while the 5G terminal is in the short sleep state and the discontinuous reception activation state;
the maximum number of short sleep state cycles;
the duration corresponding to the long sleep timer;
the duration of the long sleep state of the 5G terminal when the base station obtains a given data packet while the 5G terminal is in the long sleep state;
the ratio between the waiting delay of a given data packet obtained by the base station while the 5G terminal is in the sleep state and the corresponding terminal sleep state cycle;
the duration corresponding to the discontinuous reception timer, which represents the discontinuous reception activation state cycle length;
the number of times the 5G terminal fails terminal channel contention when the base station obtains a given data packet;
the timing extension duration applied to the discontinuous reception timer when terminal channel contention fails;
and a cycle parameter adjustment module, configured to determine the current terminal state based on a preset discontinuous reception mechanism Markov state transition model and, if the current terminal state is the terminal sleep state or the discontinuous reception activation state, to adjust the cycle lengths corresponding to the terminal sleep state and the discontinuous reception activation state according to the optimal discontinuous reception cycle parameters, wherein the terminal sleep state includes the terminal short sleep state and the terminal long sleep state.
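
The adjustment rule at the end of claim 8 can be read as a simple guard on the current terminal state. The following is a minimal sketch under that reading. The names `DrxCycleParams`, `ADJUSTABLE_STATES`, and `adjust_cycle_params`, the string state labels, and the millisecond units are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# States in which claim 8 says the cycle lengths are adjusted (labels are hypothetical).
ADJUSTABLE_STATES = {"short_sleep", "long_sleep", "drx_activation"}


@dataclass(frozen=True)
class DrxCycleParams:
    drx_activation_cycle_ms: float  # discontinuous reception activation state cycle length
    short_sleep_cycle_ms: float     # terminal short sleep cycle length


def adjust_cycle_params(state: str, current: DrxCycleParams,
                        optimal: DrxCycleParams) -> DrxCycleParams:
    """Return the cycle parameters the terminal should use in `state`."""
    if state in ADJUSTABLE_STATES:
        return optimal  # sleep or DRX activation state: adopt the received values
    return current      # active or beam search state: keep the current values
```

For example, `adjust_cycle_params("beam_search", current, optimal)` returns `current` unchanged, while the same call with `"short_sleep"` returns `optimal`.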

9. The 5G terminal power consumption optimization device based on discontinuous reception according to claim 8, characterized in that the preset discontinuous reception mechanism Markov state transition model is constructed from multiple terminal states of the 5G terminal, including the terminal active state, the terminal short sleep state, the terminal long sleep state, the discontinuous reception activation state, and the beam search state.

10. The 5G terminal power consumption optimization device based on discontinuous reception according to claim 9, characterized in that the device is further configured to: if, during the discontinuous reception activation state, a data packet to be sent by the base station is detected on the physical downlink control channel, perform a terminal channel contention operation; if the terminal channel contention operation fails, remain in the discontinuous reception activation state, perform the terminal channel contention operation again, and extend the duration of the discontinuous reception timer corresponding to the discontinuous reception activation state; if the terminal channel contention operation succeeds, switch to the beam search state; and if a target beam pair that meets the preset beam conditions is found within the preset beam search time, switch from the beam search state to the terminal active state and transmit data with the base station through the target beam pair.
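
The transitions in claim 10 can be sketched as a small state machine. This is an illustrative reading rather than the claimed implementation: the class and field names, the initial timer value, and the per-failure extension are hypothetical, the beam search time limit is reduced to a boolean `beam_found`, and the claim does not say what happens when no beam pair is found, so the sketch simply stays in the beam search state.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TerminalState(Enum):
    ACTIVE = auto()
    SHORT_SLEEP = auto()
    LONG_SLEEP = auto()
    DRX_ACTIVATION = auto()
    BEAM_SEARCH = auto()


@dataclass
class Terminal:
    state: TerminalState = TerminalState.DRX_ACTIVATION
    drx_timer_ms: float = 10.0       # hypothetical initial DRX timer duration
    timer_extension_ms: float = 2.0  # hypothetical extension per contention failure
    contention_failures: int = 0


def step(term: Terminal, pdcch_has_data: bool,
         contention_ok: bool, beam_found: bool) -> TerminalState:
    """Advance one step along the transitions described in claim 10."""
    if term.state is TerminalState.DRX_ACTIVATION and pdcch_has_data:
        if contention_ok:
            term.state = TerminalState.BEAM_SEARCH
        else:
            # Stay in the DRX activation state, extend the timer, retry next step.
            term.contention_failures += 1
            term.drx_timer_ms += term.timer_extension_ms
    elif term.state is TerminalState.BEAM_SEARCH and beam_found:
        term.state = TerminalState.ACTIVE  # data is then sent over the beam pair
    return term.state
```

Calling `step` repeatedly with `contention_ok=False` keeps the terminal in `DRX_ACTIVATION` and lengthens `drx_timer_ms` each time, which is the source of the extra delay that the delay factor in claim 8 counts through the number of contention failures.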

11. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the 5G terminal power consumption optimization method based on discontinuous reception according to any one of claims 1 to 2, or the 5G terminal power consumption optimization method based on discontinuous reception according to any one of claims 3 to 5.