Adapting beam prediction neural network models
By adapting beam prediction neural networks with layer normalization and affine parameter adjustments, the model improves beam selection accuracy and reduces latency in unique scenarios, addressing the challenges of data distribution differences and changing propagation conditions.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- LENOVO UNITED STATES INC
- Filing Date
- 2026-02-05
- Publication Date
- 2026-06-25
AI Technical Summary
Existing beam prediction neural network models face suboptimal performance in unique or unknown scenarios due to differences in data distributions and changing propagation conditions, leading to high latency and overhead in beam selection processes, especially for base stations with large antenna arrays.
Adapting beam prediction neural networks by incorporating layer normalization layers and adjusting affine parameters using unlabeled data samples to tailor the model to specific cell sites, allowing for improved beam selection accuracy and reduced latency.
The adaptation of beam prediction neural networks enhances beam selection by mitigating prediction errors and overhead, enabling efficient beam selection even in unknown scenarios through the use of unlabeled data samples.
Abstract
Description
Lenovo Ref. No. SMM920240248-WO-PCT
ADAPTING BEAM PREDICTION NEURAL NETWORK MODELS
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Patent Application No. 19/047,270, filed on February 6, 2025, which is incorporated by reference in its entirety.
TECHNICAL FIELD
[0002] The present disclosure relates to wireless communications, and more specifically to the adaptation of beam prediction neural network (NN) models.
BACKGROUND
[0003] A wireless communications system may include one or multiple network communication devices, such as base stations, which may support wireless communications for one or multiple user communication devices, which may be otherwise known as user equipment (UE), or other suitable terminology. The wireless communications system may support wireless communications with one or multiple user communication devices by utilizing resources of the wireless communications system (e.g., time resources (e.g., symbols, slots, subframes, frames, or the like) or frequency resources (e.g., subcarriers, carriers, or the like)). Additionally, the wireless communications system may support wireless communications across various radio access technologies including third generation (3G) radio access technology, fourth generation (4G) radio access technology, fifth generation (5G) radio access technology, among other suitable radio access technologies beyond 5G (e.g., sixth generation (6G)).
[0004] In order for a UE to connect to a base station, the UE or the base station or both may perform beam search or beam selection. Beam selection involves the selection of a beam at a base station and a corresponding beam at a UE, where the UE selects a beam-pair (e.g., the beam at the base station and the beam at the UE) that results in a high signal strength for communication between the devices.
Firm Ref. No. 793MS0275PC
SUMMARY
[0005] An article “a” before an element is unrestricted and understood to refer to “at least one” of those elements or “one or more” of those elements. The terms “a,” “at least one,” “one or more,” and “at least one of one or more” may be interchangeable. As used herein, including in the claims, “or” as used in a list of items (e.g., a list of items prefaced by a phrase such as “at least one of” or “one or more of” or “one or both of”) indicates an inclusive list such that, for example, a list of at least one of A, B, or C means A or B or C or AB or AC or BC or ABC (i.e., A and B and C). Also, as used herein, the phrase “based on” shall not be construed as a reference to a closed set of conditions. For example, an example step that is described as “based on condition A” may be based on both a condition A and a condition B without departing from the scope of the present disclosure. In other words, as used herein, the phrase “based on” shall be construed in the same manner as the phrase “based at least in part on.” Further, as used herein, including in the claims, a “set” may include one or more elements.
[0006] The present disclosure relates to methods, apparatuses, and systems that enable a network to adapt beam prediction NN models for beam selection procedures at specific cell sites.
[0007] A first node for wireless communication is described. The first node may be configured to, capable of, or operable to perform one or more operations as described herein. For example, the first node may comprise at least one memory and at least one processor coupled with the at least one memory and configured to cause the first node to determine an NN model for beam prediction, by computing a set of NN parameters associated with multiple neural layers of the NN model and computing a set of affine parameters associated with at least one layer normalization (LN) layer of the NN model, and transmit, to a second node, a set of model parameters that includes the set of NN parameters and the set of affine parameters.
[0008] A method performed or performable by the first node is described. The method may comprise determining an NN model for beam prediction, by computing a set of NN parameters associated with multiple neural layers of the NN model and computing a set of affine parameters associated with at least one LN layer of the NN model, and transmitting, to a second node, a set of model parameters that includes the set of NN parameters and the set of affine parameters.
[0009] In some implementations of the first node and method described herein, the set of affine parameters includes a set of scale parameters and a set of shift parameters for the at least one LN layer.
[0010] In some implementations of the first node and method described herein, the NN model is based on a set of data samples that includes unlabeled data samples, labeled data samples, or a combination of unlabeled data samples and labeled data samples.
[0011] In some implementations of the first node and method described herein, the first node and method may further be configured to, capable of, performed, performable, or operable to determine the set of data samples.
[0012] In some implementations of the first node and method described herein, the first node and method may further be configured to, capable of, performed, performable, or operable to receive the set of data samples from another node.
[0013] In some implementations of the first node and method described herein, the first node and method may further be configured to, capable of, performed, performable, or operable to determine the NN model for beam prediction by computing a global set of parameters, which includes the set of NN parameters and the set of affine parameters, via a learning algorithm that is based on the set of data samples.
[0014] A second node for wireless communication is described. The second node may be configured to, capable of, or operable to perform one or more operations as described herein. For example, the second node may comprise at least one memory and at least one processor coupled with the at least one memory and configured to cause the second node to determine whether to update an NN model for beam prediction, wherein the NN model for beam prediction includes multiple neural layers, at least one LN layer, and a set of affine parameters associated with the at least one LN layer, and update the set of affine parameters based on a set of input data samples.
[0015] A method performed or performable by the second node is described. The method may comprise determining whether to update an NN model for beam prediction, wherein the NN model for beam prediction includes multiple neural layers, at least one LN layer, and a set of affine parameters associated with the at least one LN layer, and updating the set of affine parameters based on a set of input data samples.
[0016] In some implementations of the second node and method described herein, the second node and method may further be configured to, capable of, performed, performable, or operable to determine the set of input data samples.
[0017] In some implementations of the second node and method described herein, the second node and method may further be configured to, capable of, performed, performable, or operable to receive the set of data samples from another node.
[0018] In some implementations of the second node and method described herein, the second node and method may further be configured to, capable of, performed, performable, or operable to update the set of affine parameters by determining a first quantity associated with a measure of uncertainty in an output of the NN model resulting from the set of input data samples, determining a second quantity associated with a remaining uncertainty in the output of the NN model resulting from the set of input data samples, determining a weighted first quantity for the first quantity and a weighted second quantity for the second quantity, and determining the set of affine parameters by minimizing a difference between the second weighted quantity and the first weighted quantity.
[0019] In some implementations of the second node and method described herein, the measure of uncertainty in the output of the NN model is a measure of entropy of the output.
[0020] In some implementations of the second node and method described herein, the measure of remaining uncertainty in the output of the NN model is a measure of conditional entropy in the output.
[0021] In some implementations of the second node and method described herein, the weighted first quantity and the weighted second quantity are based on weights having values between 0 and 1, inclusive.
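The affine-parameter update described in paragraphs [0018] to [0021] can be illustrated with a small sketch of the objective being minimized. This is only one possible reading: the text does not say how the two quantities are estimated, so the sketch assumes the first quantity is the entropy of the batch-averaged output distribution and the second is the average per-sample entropy (used as an estimate of the conditional entropy of the output given the input). The function names and the default weights are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores for one sample into a probability vector."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def adaptation_objective(batch_logits, w_first=1.0, w_second=1.0):
    """Weighted second quantity minus weighted first quantity.

    Assumed estimators (not specified in the text):
      first quantity  = entropy of the batch-averaged beam distribution
      second quantity = mean per-sample entropy (conditional entropy estimate)
    """
    if not (0.0 <= w_first <= 1.0 and 0.0 <= w_second <= 1.0):
        raise ValueError("weights must lie in [0, 1]")
    probs = [softmax(row) for row in batch_logits]
    n = len(probs)
    num_beams = len(probs[0])
    mean_probs = [sum(p[k] for p in probs) / n for k in range(num_beams)]
    first = entropy(mean_probs)
    second = sum(entropy(p) for p in probs) / n
    return w_second * second - w_first * first


# Unlabeled batches of NN outputs over four candidate beams (made-up values).
confident_and_diverse = [[10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
uniformly_uncertain = [[0, 0, 0, 0]] * 4
print(adaptation_objective(confident_and_diverse))
print(adaptation_objective(uniformly_uncertain))
```

Under these assumptions the objective is lowest when each individual prediction is confident while the predictions across the batch remain spread over different beams, and no labels are needed to evaluate it.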
[0022] In some implementations of the second node and method described herein, the second node and method may further be configured to, capable of, performed, performable, or operable to determine to update the NN model based on information that indicates periodic time intervals for updating the NN model.
[0023] In some implementations of the second node and method described herein, the second node and method may further be configured to, capable of, performed, performable, or operable to determine to update the NN model based on receiving an indication to update the NN model from a first node.
[0024] In some implementations of the second node and method described herein, the second node and method may further be configured to, capable of, performed, performable, or operable to determine to update the NN model based on receiving a configuration associated with updating the NN model from a first node.
[0025] In some implementations of the second node and method described herein, the second node and method may further be configured to, capable of, performed, performable, or operable to determine to update the NN model based on information that indicates certain conditions of a communications network that includes the second node.
[0026] In some implementations of the second node and method described herein, the second node and method may further be configured to, capable of, performed, performable, or operable to determine to update the NN model based on determining a quality metric for a functionality of the NN model.
BRIEF DESCRIPTION OF THE DRAWINGS
[0027] Figure 1 illustrates an example of a wireless communications system in accordance with aspects of the present disclosure.
[0028] Figure 2 illustrates example communications between a first node and a second node in accordance with aspects of the present disclosure.
[0029] Figure 3 illustrates an example beam prediction NN in accordance with aspects of the present disclosure.
[0030] Figure 4 illustrates a flowchart of a method for adapting a beam prediction NN in accordance with aspects of the present disclosure.
[0031] Figure 5 illustrates an example of a UE in accordance with aspects of the present disclosure.
[0032] Figure 6 illustrates an example of a processor in accordance with aspects of the present disclosure.
[0033] Figure 7 illustrates an example of a network equipment (NE) in accordance with aspects of the present disclosure.
[0034] Figure 8 illustrates a flowchart of a method performed by an NE in accordance with aspects of the present disclosure.
[0035] Figure 9 illustrates a flowchart of a method performed by a UE in accordance with aspects of the present disclosure.
DETAILED DESCRIPTION
[0036] Beam selection often involves the selection of an optimal beam (e.g., a beam offering a maximum signal strength) across all available beams. For example, a UE may perform an exhaustive search procedure, where a base station sends a reference signal on a subset of or all transmitting (Tx) beams, and the UE measures the signal strength (e.g., the reference signal received power (RSRP), the signal to interference plus noise ratio (SINR), and so on). The UE may then report the beam (e.g., via a beam index) with the highest signal strength to the base station. Such a procedure, while useful for beam selection, may introduce problems associated with high overhead and latency, especially for base stations having large antenna arrays that support many beams (e.g., base stations using millimeter wave (mmWave) frequencies).
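The exhaustive search procedure described above can be sketched as follows. This is a minimal illustration, assuming the UE already holds one RSRP measurement (in dBm) per transmitted beam; the function names and sample values are hypothetical and not part of the disclosure.

```python
def exhaustive_beam_search(rsrp_dbm):
    """Return (best_beam_index, best_rsrp) after measuring every Tx beam."""
    if not rsrp_dbm:
        raise ValueError("at least one beam measurement is required")
    best_index = max(range(len(rsrp_dbm)), key=lambda k: rsrp_dbm[k])
    return best_index, rsrp_dbm[best_index]


def sweep_cost(num_tx_beams, num_rx_beams=1):
    """Number of measurements needed to sweep every Tx/Rx beam pair."""
    return num_tx_beams * num_rx_beams


# Made-up RSRP values for four Tx beams.
best_index, best_rsrp = exhaustive_beam_search([-95.0, -80.5, -101.2, -88.0])
print(best_index, best_rsrp)
print(sweep_cost(64, 4))
```

The measurement count grows with the product of the Tx and Rx beam counts, which is the overhead and latency concern raised for base stations with large antenna arrays.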
[0037] To mitigate such problems, wireless communications systems may employ artificial intelligence (AI) and / or machine learning (ML) techniques, such as deep learning, to perform beam selection by predicting the beam having the highest signal strength. Beam prediction enhances or improves beam selection by employing and training (e.g., using supervised learning) an NN (e.g., a deep neural network, or DNN) to determine an optimal beam index for a cell site (e.g., at a base station). For example, the DNN performs measurements of a subset of beams for a cell site and outputs a mapping between the beam measurements and a best beam index.
[0038] While such beam prediction techniques are useful for targeted, known, or generalized scenarios, they may provide suboptimal predictions when deployed in unique or unknown scenarios, such as in scenarios where the data input into the DNN has different or unique statistical characteristics than the data from which the DNN or AI / ML model was trained. For example, beam prediction models are often specifically adapted to a certain cell site and trained on the physical characteristics of the cell site.
[0039] However, when deployed to a different cell site, data samples (e.g., $\mathcal{D}_T = \{(x_i^T, y_i^T)\}_i \sim P_{XY}^T$) that are input into the model may be different from the data samples (e.g., $\mathcal{D}_S = \{(x_i^S, y_i^S)\}_i \sim P_{XY}^S$) used to train the model (e.g., $P_{XY}^T$ is different than $P_{XY}^S$). Thus, a distribution of beam measurements $(S_0, S_1, \ldots)$, which is the distribution $P(x)$ of the input samples to the model, may be vulnerable to differences or variations. Further, a mapping between the beam measurements and a corresponding optimal beam may change based on physical characteristics of the propagation medium (e.g., between outdoor scenarios and indoor scenarios), and $P(y|x)$, a distribution of the output conditioned on the input, may change for beam selection.
[0040] The systems and methods described herein adapt a beam selection NN (e.g., a beam prediction NN), such as a DNN, to perform predictions for unseen or unknown target domains, such as by using unlabeled data samples from the target domains. For example, the DNN may include normalization layers between neural layers of the DNN. Further, the adaptation of the DNN may be an adaptation of the normalization layer (e.g., adapting affine parameters of LN layers without labeled data samples), and not the entire DNN.
[0041] In doing so, the systems and methods can utilize a DNN (or other AI / ML models) to perform beam prediction for a base station or cell site that is specific to the characteristics of the base station or cell site, taking advantage of employing beam prediction as a beam selection technique while mitigating issues with latency, overhead, and prediction errors, among other benefits.
[0042] Figure 1 illustrates an example of a wireless communications system 100 in accordance with aspects of the present disclosure. The wireless communications system 100 may include one or more NE 102, one or more UE 104, and a core network (CN) 106. The wireless communications system 100 may support various radio access technologies. In some implementations, the wireless communications system 100 may be a 4G network, such as an LTE network or an LTE-Advanced (LTE-A) network. In some other implementations, the wireless communications system 100 may be an NR network, such as a 5G network, a 5G-Advanced (5G-A) network, or a 5G ultrawideband (5G-UWB) network. In other implementations, the wireless communications system 100 may be a combination of a 4G network and a 5G network, or other suitable radio access technology including Institute of Electrical and Electronics Engineers (IEEE) 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), or IEEE 802.20. The wireless communications system 100 may support radio access technologies beyond 5G, for example, 6G. Additionally, the wireless communications system 100 may support technologies, such as time division multiple access (TDMA), frequency division multiple access (FDMA), or code division multiple access (CDMA), etc.
[0043] The one or more NE 102 may be dispersed throughout a geographic region to form the wireless communications system 100. One or more of the NE 102 described herein may be or include or may be referred to as a network node, a base station, a network element, a network function, a network entity, a radio access network (RAN), a NodeB, an eNodeB (eNB), a next-generation NodeB (gNB), or other suitable terminology. An NE 102 and a UE 104 may communicate via a communication link, which may be a wireless or wired connection. For example, an NE 102 and a UE 104 may perform wireless communication (e.g., receive signaling, transmit signaling) over a Uu interface.
[0044] An NE 102 may provide a geographic coverage area for which the NE 102 may support services for one or more UEs 104 within the geographic coverage area. For example, an NE 102 and a UE 104 may support wireless communication of signals related to services (e.g., voice, video, packet data, messaging, broadcast, etc.) according to one or multiple radio access technologies. In some implementations, an NE 102 may be moveable, for example, a satellite associated with a non-terrestrial network (NTN). In some implementations, different geographic coverage areas associated with the same or different radio access technologies may overlap, but the different geographic coverage areas may be associated with different NE 102.
[0045] The one or more UE 104 may be dispersed throughout a geographic region of the wireless communications system 100. A UE 104 may include or may be referred to as a remote unit, a mobile device, a wireless device, a remote device, a subscriber device, a transmitter device, a receiver device, or some other suitable terminology. In some implementations, the UE 104 may be referred to as a unit, a station, a terminal, or a client, among other examples. Additionally, or alternatively, the UE 104 may be referred to as an Internet-of-Things (IoT) device, an Internet-of-Everything (IoE) device, or machine-type communication (MTC) device, among other examples.
[0046] A UE 104 may be able to support wireless communication directly with other UEs 104 over a communication link. For example, a UE 104 may support wireless communication directly with another UE 104 over a device-to-device (D2D) communication link. In some implementations, such as vehicle-to-vehicle (V2V) deployments, vehicle-to-everything (V2X) deployments, or cellular-V2X deployments, the communication link may be referred to as a sidelink. For example, a UE 104 may support wireless communication directly with another UE 104 over a PC5 interface.
[0047] An NE 102 may support communications with the CN 106, or with another NE 102, or both. For example, an NE 102 may interface with other NE 102 or the CN 106 through one or more backhaul links (e.g., S1, N2, or another network interface). In some implementations, the NE 102 may communicate with each other directly. In some other implementations, the NE 102 may communicate with each other indirectly (e.g., via the CN 106). In some implementations, one or more NE 102 may include subcomponents, such as an access network entity, which may be an example of an access node controller (ANC). An ANC may communicate with the one or more UEs 104 through one or more other access network transmission entities, which may be referred to as radio heads, smart radio heads, or transmission-reception points (TRPs).
[0048] The CN 106 may support user authentication, access authorization, tracking, connectivity, and other access, routing, or mobility functions. The CN 106 may be an evolved packet core (EPC), or a 5G core (5GC), which may include a control plane entity that manages access and mobility (e.g., a mobility management entity (MME), an access and mobility management function (AMF)) and a user plane entity that routes packets or interconnects to external networks (e.g., a serving gateway (S-GW), a Packet Data Network (PDN) gateway (P-GW), or a user plane function (UPF)). In some implementations, the control plane entity may manage non-access stratum (NAS) functions, such as mobility, authentication, and bearer management (e.g., data bearers, signal bearers, etc.) for the one or more UEs 104 served by the one or more NE 102 associated with the CN 106.
[0049] The CN 106 may communicate with a packet data network over one or more backhaul links (e.g., via an S1, N2, or another network interface). The packet data network may include an application server. In some implementations, one or more UEs 104 may communicate with the application server. A UE 104 may establish a session (e.g., a protocol data unit (PDU) session, or the like) with the CN 106 via an NE 102. The CN 106 may route traffic (e.g., control information, data, and the like) between the UE 104 and the application server using the established session (e.g., the established PDU session). The PDU session may be an example of a logical connection between the UE 104 and the CN 106 (e.g., one or more network functions of the CN 106).
[0050] In the wireless communications system 100, the NEs 102 and the UEs 104 may use resources of the wireless communications system 100 (e.g., time resources (e.g., symbols, slots, subframes, frames, or the like) or frequency resources (e.g., subcarriers, carriers)) to perform various operations (e.g., wireless communications). In some implementations, the NEs 102 and the UEs 104 may support different resource structures. For example, the NEs 102 and the UEs 104 may support different frame structures. In some implementations, such as in 4G, the NEs 102 and the UEs 104 may support a single frame structure. In some other implementations, such as in 5G and among other suitable radio access technologies, the NEs 102 and the UEs 104 may support various frame structures (i.e., multiple frame structures). The NEs 102 and the UEs 104 may support various frame structures based on one or more numerologies.
[0051] One or more numerologies may be supported in the wireless communications system 100, and a numerology may include a subcarrier spacing and a cyclic prefix. A first numerology (e.g., μ=0) may be associated with a first subcarrier spacing (e.g., 15 kHz) and a normal cyclic prefix. In some implementations, the first numerology (e.g., μ=0) associated with the first subcarrier spacing (e.g., 15 kHz) may utilize one slot per subframe. A second numerology (e.g., μ=1) may be associated with a second subcarrier spacing (e.g., 30 kHz) and a normal cyclic prefix. A third numerology (e.g., μ=2) may be associated with a third subcarrier spacing (e.g., 60 kHz) and a normal cyclic prefix or an extended cyclic prefix. A fourth numerology (e.g., μ=3) may be associated with a fourth subcarrier spacing (e.g., 120 kHz) and a normal cyclic prefix. A fifth numerology (e.g., μ=4) may be associated with a fifth subcarrier spacing (e.g., 240 kHz) and a normal cyclic prefix.
[0052] A time interval of a resource (e.g., a communication resource) may be organized according to frames (also referred to as radio frames). Each frame may have a duration, for example, a 10 millisecond (ms) duration. In some implementations, each frame may include multiple subframes. For example, each frame may include 10 subframes, and each subframe may have a duration, for example, a 1 ms duration. In some implementations, each frame may have the same duration. In some implementations, each subframe of a frame may have the same duration.
[0053] Additionally, or alternatively, a time interval of a resource (e.g., a communication resource) may be organized according to slots. For example, a subframe may include a number (e.g., quantity) of slots. The number of slots in each subframe may also depend on the one or more numerologies supported in the wireless communications system 100. For instance, the first, second, third, fourth, and fifth numerologies (i.e., μ=0, μ=1, μ=2, μ=3, μ=4) associated with respective subcarrier spacings of 15 kHz, 30 kHz, 60 kHz, 120 kHz, and 240 kHz may utilize a single slot per subframe, two slots per subframe, four slots per subframe, eight slots per subframe, and 16 slots per subframe, respectively. Each slot may include a number (e.g., quantity) of symbols (e.g., OFDM symbols). In some implementations, the number (e.g., quantity) of slots for a subframe may depend on a numerology. For a normal cyclic prefix, a slot may include 14 symbols. For an extended cyclic prefix (e.g., applicable for 60 kHz subcarrier spacing), a slot may include 12 symbols. The relationship between the number of symbols per slot, the number of slots per subframe, and the number of slots per frame for a normal cyclic prefix and an extended cyclic prefix may depend on a numerology. It should be understood that reference to a first numerology (e.g., μ=0) associated with a first subcarrier spacing (e.g., 15 kHz) may be used interchangeably between subframes and slots.
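The numerology relationships listed in paragraphs [0051] to [0053] follow a power-of-two pattern, which the sketch below makes explicit. It assumes the five numerologies named in the text (μ = 0 to 4), ten 1 ms subframes per frame, and the extended cyclic prefix only at 60 kHz; the function names are hypothetical.

```python
SUBFRAMES_PER_FRAME = 10  # each 10 ms frame holds ten 1 ms subframes


def _check_mu(mu):
    """Reject numerology indices outside the five listed in the text."""
    if mu not in range(5):
        raise ValueError("numerology index must be between 0 and 4")


def subcarrier_spacing_khz(mu):
    """15, 30, 60, 120, 240 kHz for mu = 0..4."""
    _check_mu(mu)
    return 15 * 2 ** mu


def slots_per_subframe(mu):
    """1, 2, 4, 8, 16 slots for mu = 0..4."""
    _check_mu(mu)
    return 2 ** mu


def slots_per_frame(mu):
    return SUBFRAMES_PER_FRAME * slots_per_subframe(mu)


def symbols_per_slot(mu, extended_cp=False):
    """14 symbols with a normal cyclic prefix, 12 with an extended one."""
    _check_mu(mu)
    if extended_cp and mu != 2:
        raise ValueError("extended cyclic prefix is only listed for 60 kHz")
    return 12 if extended_cp else 14


def slot_duration_ms(mu):
    """A 1 ms subframe divided evenly among its slots."""
    return 1.0 / slots_per_subframe(mu)


for mu in range(5):
    print(mu, subcarrier_spacing_khz(mu), slots_per_subframe(mu),
          slots_per_frame(mu), slot_duration_ms(mu))
```

For example, the fourth numerology (μ=3, 120 kHz) yields eight slots per subframe, 80 slots per frame, and a slot duration of 0.125 ms.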
[0054] In the wireless communications system 100, an electromagnetic (EM) spectrum may be split, based on frequency or wavelength, into various classes, frequency bands, frequency channels, etc. By way of example, the wireless communications system 100 may support one or multiple operating frequency bands, such as frequency range designations FR1 (410 MHz - 7.125 GHz), FR2 (24.25 GHz - 52.6 GHz), FR3 (7.125 GHz - 24.25 GHz), FR4 (52.6 GHz - 114.25 GHz), FR4a or FR4-1 (52.6 GHz - 71 GHz), and FR5 (114.25 GHz - 300 GHz). In some implementations, the NEs 102 and the UEs 104 may perform wireless communications over one or more of the operating frequency bands. In some implementations, FR1 may be used by the NEs 102 and the UEs 104, among other equipment or devices for cellular communications traffic (e.g., control information, data). In some implementations, FR2 may be used by the NEs 102 and the UEs 104, among other equipment or devices for short-range, high data rate capabilities.
[0055] FR1 may be associated with one or multiple numerologies (e.g., at least three numerologies). For example, FR1 may be associated with a first numerology (e.g., μ=0), which includes 15 kHz subcarrier spacing; a second numerology (e.g., μ=1), which includes 30 kHz subcarrier spacing; and a third numerology (e.g., μ=2), which includes 60 kHz subcarrier spacing. FR2 may be associated with one or multiple numerologies (e.g., at least 2 numerologies). For example, FR2 may be associated with a third numerology (e.g., μ=2), which includes 60 kHz subcarrier spacing; and a fourth numerology (e.g., μ=3), which includes 120 kHz subcarrier spacing.
[0056] As described herein, in some embodiments, a DNN is enhanced to include LN layers between neural layers (e.g., an LN between every two neural layers), which facilitates the adaptation of the DNN when deployed at a specific cell site (e.g., such as a gNB). Thus, in some embodiments, the DNN may be adapted by only adapting the LN layers (e.g., affine parameters of the LN layers based on unlabeled data samples), and not the other layers or parameters of the DNN.
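Adapting only the LN layers amounts to holding every other parameter fixed during the update. The sketch below shows one way this could look, assuming parameters are stored in a flat dictionary, that LN affine parameters can be recognized by a naming convention, and that the update is a plain gradient-descent step. The parameter names, the ".gamma"/".beta" suffixes, and the optimizer are all illustrative assumptions; the text does not prescribe them.

```python
LN_AFFINE_SUFFIXES = (".gamma", ".beta")  # hypothetical naming convention


def is_ln_affine(name):
    """True for LN scale/shift parameters under the assumed naming scheme."""
    return name.endswith(LN_AFFINE_SUFFIXES)


def adapt_step(params, grads, lr=0.01):
    """Apply one gradient-descent step to LN affine parameters only.

    All other parameters (e.g., dense-layer weights) are copied unchanged,
    so the neural layers stay exactly as they were received.
    """
    updated = {}
    for name, values in params.items():
        if is_ln_affine(name):
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)
    return updated


# Made-up parameters and gradients for a single dense layer plus one LN layer.
params = {
    "dense1.weight": [0.5, -0.2],
    "ln1.gamma": [1.0, 1.0],
    "ln1.beta": [0.0, 0.0],
}
grads = {name: [1.0, 1.0] for name in params}
print(adapt_step(params, grads, lr=0.1))
```

Because the affine parameters are a small fraction of the full parameter set, an update of this kind touches far fewer values than retraining the whole DNN.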
[0057] Figure 2 illustrates example communications 200 between a first node 205 and a second node 215 in accordance with aspects of the present disclosure. The first node 205 (e.g., an encoder) may be associated with the NE 102 (e.g., may be the NE 102, such as a gNB) and the second node 215 (e.g., a decoder) may be associated with the UE 104 (e.g., may be the UE 104). While shown as being associated with the NE 102, the first node 205, in some cases, may be part of or otherwise associated with the UE 104, and, similarly, the second node 215 may be part of or otherwise associated with the NE 102.
[0058] The UE 104 may initiate a beam selection or beam prediction procedure to connect to the NE 102. Before doing so, the UE 104, acting as the second node 215, may receive a set of NN model parameters 210 transmitted from the NE 102, acting as the first node 205. For example, the set of NN model parameters 210 may include a set of NN parameters for neural layers and a set of affine parameters for one or more LN layers of a DNN 220 that is employed by the UE 104 when performing beam prediction. The UE 104 may adapt the DNN 220 with the received affine parameters to tailor or modify the DNN 220 to the specific NE 102. Using the adapted DNN 220, the UE 104 performs beam prediction, and selects a suitable or optimal beam for connection to the NE 102.
[0059] The set of NN parameters for the neural layers may include weights of edges of the NN, as well as other information, parameters, and / or details about the architecture and / or configuration of the DNN 220. For example, the DNN 220 may be categorized, defined, and / or specified (uniquely) by some or all of the following information or parameters (e.g., the set of NN parameters): a number of layers $L$; a number of neurons in each layer, $n_\ell$, for $\ell = 1, \ldots, L$; activation functions of the neurons; connectivity between the neurons belonging to successive layers (e.g., whether the edge $(i, j)$ exists between the $i$-th neuron in layer $\ell$ and the $j$-th neuron in layer $\ell - 1$, for $\ell = 2, \ldots, L$, $i = 1, \ldots, n_\ell$, and $j = 1, \ldots, n_{\ell-1}$); weights of all the edges between neurons belonging to every successive pair of layers, $W = \{w_{ij}^{(\ell)}\}$, $\ell = 1, \ldots, L$, $i = 1, \ldots, n_\ell$, $j = 1, \ldots, n_{\ell-1}$; and so on.
[0060] As described herein, the DNN 220 may be a beam prediction NN, or $f_\theta$, which includes LN layers. The DNN 220 may be based on training data from one or more source data domains, where $\theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ (e.g., a set of all learnable / trainable parameters / weights, or $N$ parameters, of the NN). The DNN 220 may include one or more LN layers, which operate to adapt or change a statistical distribution of data samples received from a preceding layer, before feeding the adapted / changed data samples as inputs to a subsequent layer.
[0061] Figure 3 illustrates an example beam prediction NN 300 in accordance with aspects of the present disclosure. The NN 300, which may be the DNN 220 or another AI / ML model, includes an input layer 320, which receives data samples 310 (e.g., a training data set), hidden layers 330 (e.g., neural layers), and an output layer 340, which outputs a beam prediction index 350. Between the different layers are LN layers 325, which, as described herein, adapt and / or change statistical distributions of data samples between layers. The NN 300, therefore, may include an input layer, an output layer, and hidden layers, such as neural layers (e.g., hidden layers 330) and LN layers (e.g., LN layers 325).
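[0062] The LN layers 325 perform layer normalization (e.g., on received input samples). Layer normalization may be defined as follows:
$$\hat{x}_i = \frac{x_i - \mu_i}{\sigma_i}, \qquad \mu_i = \frac{1}{|S_i|} \sum_{j \in S_i} x_j, \qquad \sigma_i = \sqrt{\frac{1}{|S_i|} \sum_{j \in S_i} (x_j - \mu_i)^2 + \varepsilon}$$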
[0063] where the set Si is defined as Si = {j | jB = iB}, with iB (and jB) denoting the sub-index of i (and j) along the B-axis (e.g., batch axis). Here, |Si| is the cardinality of Si, and ε is a small positive value. Thus, the LN layers 325 compute a mean μ and a standard deviation σ along the (C, H, W) axes for each data sample.
[0064] Then, x̂i is transformed into LN(x̂i) = γi x̂i + βi, where γi and βi are learnable / trainable parameters, indexed by iC. Note that the multiplication of x̂i with γi is element-wise multiplication and the addition of βi is element-wise addition. Thus, γi and βi each have the same length as xi. Essentially, the LN layers 325 normalize input samples in all neurons in a same layer for each data sample. Also, all elements of γi have the same value and all elements of βi have the same value. The parameter γi is a learnable scale parameter and the parameter βi is a learnable shift parameter for feature i. These two parameters are affine parameters of the LN layers 325. Further, the ith LN layer has γi and βi as its learnable / trainable parameters, and the length of each of γi and βi is equal to the length of feature xi.
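The normalization and affine transform described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `layer_norm`, the default `eps`, and the example values are hypothetical, and a single scalar `gamma` and `beta` are used for the whole feature vector, following the statement that all elements of γi (and of βi) share one value.

```python
import math

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one data sample x (a list of floats) over all of its
    features, then apply the affine transform gamma * x_hat + beta."""
    n = len(x)
    mu = sum(x) / n                               # per-sample mean
    var = sum((v - mu) ** 2 for v in x) / n       # per-sample variance
    sigma = math.sqrt(var + eps)                  # eps keeps sigma positive
    return [gamma * (v - mu) / sigma + beta for v in x]

sample = [1.0, 2.0, 3.0, 4.0]                     # hypothetical feature values
normalized = layer_norm(sample)                   # zero mean, near-unit variance
shifted = layer_norm(sample, gamma=2.0, beta=0.5) # same sample, other affine values
```

Changing only `gamma` and `beta` rescales and shifts the normalized features without touching any other weight, which is the property the adaptation described later relies on.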
[0065] A DNN (e.g., the DNN 220) may have Nnorm LN layers 325, denoted by ℓ1, ..., ℓNnorm. Let θA denote a set of all the learnable / trainable parameters of all the LN layers 325 in the NN 300 (e.g., the NN fθ). Note that θA is a subset of θ (i.e., θA ⊂ θ), where θ denotes the set of all learnable / trainable parameters of the NN fθ. When each of the LN layers 325 in the NN fθ performs layer normalization, for LN layers ℓ1, ..., ℓNnorm, the set of learnable / trainable parameters is given by θA = {γj,i, βj,i}, j = 1, ..., Nnorm, i = 1, ..., Fj, where Fj is the total number of features at the input of the normalization layer ℓj.
[0066] The NN 300 may be trained using a labeled training data set Dtr (e.g., via supervised learning / training, semi-supervised learning, self-supervised learning, unsupervised learning, and so on). For example, supervised learning / training of an NN fθ, with θ = {θ1, θ2, ..., θN} as its learnable / trainable parameters, may include selecting an architecture / structure of the network (e.g., a number of layers, how the layers are connected, and so on), such as convolutional NNs (CNNs), recurrent NNs (RNNs), long short-term memory (LSTM) NNs, and so on. Once the architecture / structure is selected, optimal values of the learnable / trainable parameters are determined by minimizing a loss function ℒ over a training data set.
[0067] As described herein, the NN 300 includes LN layers 325, which are placed in between the other layers of the NN 300. For example, the NN 300 includes an LN layer 325 between every pair of regular, non-normalization, neural layers (e.g., the hidden layers 330). The NN 300 depicts, as an example, a four-layer NN, with one input layer (e.g., the input layer 320), one output layer (e.g., the output layer 340), and two hidden layers (e.g., the hidden layers 330).
[0068] The LN layers 325 are placed or positioned between the input layer 320 and the first hidden layer 330, between the first hidden layer 330 and the second hidden layer 330, and between the second hidden layer 330 and the output layer 340. Thus, there are three LN layers 325 in the depicted four-layer NN. More generally, in an NN having a total of NL layers, with one LN layer 325 between every pair of neural layers, there are Nnorm = NL − 1 LN layers 325.
[0069] The NN 300, or fθ, includes one LN layer 325 between every pair of other, regular (non-normalization) neural layers (e.g., the hidden layers 330), where θ = {θ1, θ2, ..., θN} represents the learnable parameters of the entire NN (e.g., the learnable parameters of the LN layers 325 and the hidden layers 330). The NN 300 may be trained through supervised learning by minimizing a loss function over the labeled training data set (e.g., the data samples 310). For example, the set of optimal parameters θ = {θ1, θ2, ..., θN} is determined by solving the following problem through a chosen / appropriate optimization algorithm (e.g., stochastic gradient descent (SGD), adaptive moment estimation (ADAM), and so on):
$$\theta = \arg\min_{\tilde{\theta} \in \mathbb{R}^{N}} \mathcal{L}(\tilde{\theta}, D_{tr})$$
[0070] In some cases, the NN 300 functions to map a set of M' beam measurements into one of M available beams, where the NN 300 is expressed as fθ: ℝ^M′ → ℕ, where θ = {θ1, θ2, ..., θN} denotes the set of all the learnable / trainable parameters / weights of the underlying NN.
[0071] The NN 300 may be probabilistic, as it generates a probability distribution over the predictions, conditioned on the input data samples 310 and parameterized by θ, where the conditional distribution is differentiable in θ. Thus, the NN 300, which is a beam prediction AI / ML model, trains on one or more source domains and generates a probability distribution P(y|x; θ) that is differentiable in θ. The prediction y is a best beam out of M beams. Hence, y is a discrete variable with y ∈ {1, ..., M}, and P(y|x; θ) is the probability that y is the predicted label for x under the model parameters θ. Thus, P(yc|xi; θ) is the probability that yc is the best beam for the given set of beam measurements xi, as per the prediction / inference made by the NN 300 with θ as its model parameters. Note that ∑_{c=1}^{M} P(yc|xi; θ) = 1, or equivalently, ∑_{y∈ℳ} P(y|xi; θ) = 1, where ℳ = {1, ..., M}.
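To make the mapping from M' beam measurements to a probability distribution over M beams concrete, the following is a minimal sketch of a four-layer network with an LN layer ahead of each neural layer and a softmax output. Everything specific here is an assumption made for illustration: the layer sizes, the random weights, the tanh activation, the placement of each LN layer immediately before the weights it feeds, the scalar (gamma, beta) pair per LN layer, and the dBm-like input values.

```python
import math
import random

def layer_norm(x, gamma, beta, eps=1e-5):
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return [gamma * (v - mu) / math.sqrt(var + eps) + beta for v in x]

def dense(x, weights):
    # weights[i][j] is the edge weight from input j to output neuron i
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def softmax(z):
    m = max(z)                                   # subtract max for stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(x, weights, affine):
    """Return P(y | x; theta) over the M candidate beams."""
    h = x
    last = len(weights) - 1
    for k, (w, (gamma, beta)) in enumerate(zip(weights, affine)):
        h = layer_norm(h, gamma, beta)           # LN layer between neural layers
        h = dense(h, w)
        if k < last:
            h = [math.tanh(v) for v in h]        # hidden activation (assumed)
    return softmax(h)

rng = random.Random(0)
sizes = [8, 16, 16, 4]    # assumed: M' = 8 inputs, two hidden layers, M = 4 beams
weights = [[[rng.gauss(0, 0.5) for _ in range(n_in)] for _ in range(n_out)]
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
affine = [(1.0, 0.0) for _ in weights]           # one (gamma, beta) per LN layer
x = [rng.uniform(-100.0, -60.0) for _ in range(sizes[0])]  # hypothetical measurements
probs = predict_proba(x, weights, affine)
best_beam = max(range(len(probs)), key=probs.__getitem__) + 1  # beams indexed 1..M
```

With four layers of neurons the sketch has three weight matrices and three LN layers, matching the three-LN-layer count given for the depicted four-layer NN.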
[0072] As described herein, the NN 300 includes one or more LN layers 325, with each LN layer having a set of scale parameters and a set of shift parameters (e.g., collectively denoted as affine parameters). For example, the NN 300 may have NL LN layers 325 (here NL denotes the number of LN layers, also written as Nnorm herein). For a jth normalization layer, where j ∈ {1, ..., NL}, the jth normalization layer has Fj learnable / trainable scale parameters γj,1, ..., γj,Fj and Fj learnable / trainable shift parameters βj,1, ..., βj,Fj, where Fj is a total number of features at the jth normalization layer input. Each γj,i and βj,i, i = 1, ..., Fj, includes Lj scalar parameters (e.g., γj,i = {γj,i,1, γj,i,2, ..., γj,i,Lj} and βj,i = {βj,i,1, βj,i,2, ..., βj,i,Lj} for i = 1, ..., Fj), where Lj is the length of the feature vector at the input of the jth normalization layer, or, equivalently, a number of neurons in the jth normalization layer.
[0073] However, for layer normalization, γj,i,1 = γj,i,2 = ··· = γj,i,Lj and βj,i,1 = βj,i,2 = ··· = βj,i,Lj. Thus, there are Fj learnable / trainable scale parameters γj,1, ..., γj,Fj and Fj learnable / trainable shift parameters βj,1, ..., βj,Fj, where Fj is the total number of features at the jth normalization layer input. Again, as described herein, the scale and shift parameters may be affine parameters of the LN layers 325. Further, let θA denote a set of the affine parameters of all the LN layers 325 in the NN 300, where
$$\theta_A = \{\gamma_{j,1}, \ldots, \gamma_{j,F_j}, \beta_{j,1}, \ldots, \beta_{j,F_j}\}_{j=1}^{N_L}$$
θA may be a subset of θ (i.e., θA ⊂ θ), as θ denotes the set of all learnable / trainable parameters of the NN 300.
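As a worked illustration of how small θA can be, the count below follows directly from having Fj scale parameters and Fj shift parameters per LN layer. The feature counts in the numeric example are hypothetical values chosen only for illustration.

```latex
|\theta_A| = \sum_{j=1}^{N_L} 2F_j,
\qquad \text{e.g., } N_L = 3,\ (F_1, F_2, F_3) = (8, 16, 16)
\ \Rightarrow\ |\theta_A| = 2(8 + 16 + 16) = 80
```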
[0074] As described herein, the NN 300 (e.g., the DNN 220) may be adapted, modified, or optimized by adapting weights of the LN layers 325, such as by modifying θA, the set of affine parameters of the LN layers 325. Thus, in some cases, other parameters of the NN 300 (e.g., parameters / weights associated with the hidden layers 330) are unchanged or otherwise not adapted. For example, the NN 300 may be adapted to an unseen target domain with a few unlabeled samples from the target domain, where the target domain has a different distribution than one or more source domains over which the NN 300 is or was trained.
[0075] In some cases, the first node 205 (e.g., the NE 102, such as the gNB) triggers the adaptation of the NN 300 (e.g., the DNN 220). In other cases, the second node 215 may trigger the adaptation of the NN 300, and / or other network nodes (e.g., nodes associated with life cycle management of AI / ML models deployed by a wireless communications system) may trigger the adaptation.
[0076] The first node 205, the second node 215, and / or another node may trigger the adaptation of the NN 300, or otherwise determine to update the NN 300, in a variety of ways. For example, the first node 205, the second node 215, and / or another node may determine to update the NN 300 based on information that indicates periodic time intervals for updating the NN model (e.g., the NN model is updated periodically in set intervals), based on receiving an indication to update the NN model from another node (e.g., a transmitting node), based on receiving a configuration associated with updating the NN model (e.g., from another node), based on information that indicates certain conditions of a communications network that supports the nodes, based on determining a quality metric for a functionality of the NN 300 (e.g., the model is accurately predicting the optimal beam over a threshold percentage of instances), and so on.
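Three of the example triggers listed above (a periodic interval, a received indication, and a quality metric) could be combined as in the following sketch. The function name, its parameters, and the rule that an update is triggered when top-1 accuracy falls below a threshold are all hypothetical and are not specified by the disclosure.

```python
def should_update_model(now_s, last_update_s, period_s,
                        update_indicated, top1_accuracy, accuracy_threshold):
    """Return True if any of the example trigger conditions holds."""
    periodic_due = (now_s - last_update_s) >= period_s   # periodic interval elapsed
    quality_low = top1_accuracy < accuracy_threshold     # assumed quality rule
    return periodic_due or update_indicated or quality_low

# Hypothetical values: times in seconds, accuracies as fractions.
decision_periodic = should_update_model(3600.0, 0.0, 1800.0, False, 0.95, 0.90)
decision_idle = should_update_model(600.0, 0.0, 1800.0, False, 0.95, 0.90)
decision_quality = should_update_model(600.0, 0.0, 1800.0, False, 0.80, 0.90)
```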
[0077] The second node 215, acting as a decoder, may capture, access, or otherwise utilize a set of data samples from a target domain, such as unlabeled data samples (e.g., data samples with unknown labels or target attributes). The second node 215, therefore, accesses or utilizes a few input samples xᵗ = {x₁ᵗ, x₂ᵗ, ..., xₙᵗ} from the target domain, where each xᵢᵗ is a set of M' beam measurements from the target domain (e.g., a domain associated with the NE 102). As described herein, the second node 215 may not know a best beam for each set of beam measurements (e.g., does not have information identifying corresponding labels yᵢ, i = 1, ..., n, for the input samples xᵢᵗ, i = 1, ..., n).
[0078] Figure 4 illustrates a flowchart of a method 400 for adapting a beam prediction NN in accordance with aspects of the present disclosure. For example, the method 400 may adapt the NN fθ to the target domain with n unlabeled samples xᵢᵗ, i = 1, ..., n, from the target domain, as follows.
[0079] As described herein, θ is the set of all the parameters of the NN 300 and θA is the set of affine parameters of all the LN layers 325 of the NN 300, where θA ⊂ θ. Thus, all the parameters in the set {θ\θA} (i.e., all the parameters in the set θ except those that belong to the set θA) are fixed (e.g., remain unchanged), and only the parameters in the set θA are adapted / adjusted to the target domain. The adaptation process, therefore, in a backward pass, adapts the affine parameters of all the LN layers 325, using the gradient of the loss function, as described herein. For example, a mapping of fθ produces a conditional distribution P(y|x; θ) that is differentiable in θ. The method computes P(y|xᵢᵗ; θ) for i = 1, ..., n.
[0080] At 402, the method 400 computes a conditional entropy of the beam prediction NN (e.g., the NN 300). For example, the method 400 computes the conditional entropy given n input data samples xᵗ:
$$H(P(y|x^t; \theta)) = \sum_{i=1}^{n} P(x_i^t)\, H(P(y|x_i^t; \theta))$$
[0081] where P(x) is the probability distribution of the input data samples for the NN 300. In some cases, the probability distribution of the input data samples may be available for the NN 300 or may be provided to the node / entity performing the model selection. For example, the probability distribution may be based on a set of training data samples used to train the model. Assuming the input data samples are equally likely, the conditional entropy may be given as the average of the entropy of the AI / ML model predictions (y) given the n input data samples xᵗ, or
$$H(P(y|x^t; \theta)) = \frac{1}{n} \sum_{i=1}^{n} H(P(y|x_i^t; \theta)) = -\frac{1}{n} \sum_{i=1}^{n} \sum_{y \in \mathcal{M}} P(y|x_i^t; \theta) \log P(y|x_i^t; \theta)$$
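The averaged form of the conditional entropy can be sketched as follows, assuming equally likely samples and the natural logarithm. The function names and the two sets of per-sample prediction distributions are hypothetical examples, not outputs of a trained model.

```python
import math

def entropy(p):
    """Shannon entropy (natural log) of one discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def conditional_entropy(pred_dists):
    """Average entropy of the per-sample predictions P(y | x_i; theta)."""
    return sum(entropy(p) for p in pred_dists) / len(pred_dists)

# Hypothetical predictions over M = 4 beams for n = 2 samples.
confident = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]
uncertain = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]
h_conf = conditional_entropy(confident)   # small: predictions are peaked
h_unc = conditional_entropy(uncertain)    # log(4): predictions are uniform
```

Lower values correspond to more peaked, and in that sense more confident, per-sample predictions.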
[0082] In some cases, the logarithm (log) may be a natural logarithm or a base-2 logarithm. Further, adapting the parameters to minimize the conditional entropy (1 / n) ∑_{i=1}^{n} H(P(y|xᵢᵗ; θ)) may assist in improving the confidence in individual predictions and / or enable the model to generate more confident predictions. However, in some cases, considering only the conditional entropy minimization can lead to degenerate solutions where the adapted model puts all the probability mass on a single or very few labels, predicting a single or very few best beams for all instances of beam measurements.
[0083] At 404, the method 400 computes an empirical marginal distribution of predicted labels. For example, the method 400 computes the empirical marginal distribution of predicted labels using:
$$\tilde{P}(y; \theta) = \frac{1}{n} \sum_{i=1}^{n} P(y|x_i^t; \theta)$$
[0084] In some cases, the computed distribution P̃(y; θ) is an approximation of P(y; θ), which is the true marginal distribution of y (e.g., the superscript tilde indicating the approximation).
[0085] At 406, the method 400 computes the entropy of the predicted labels. For example, the method 400 may compute the entropy of the predicted labels, by:
$$H(\tilde{P}(y; \theta)) = -\sum_{y \in \mathcal{M}} \tilde{P}(y; \theta) \log \tilde{P}(y; \theta)$$
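The empirical marginal distribution and its entropy can be sketched as below. The two prediction sets are hypothetical: one mimics the degenerate case in which every sample is assigned the same beam, and the other spreads confident predictions evenly across the beams.

```python
import math

def entropy(p):
    """Shannon entropy (natural log) of one discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def marginal(pred_dists):
    """Empirical marginal: average of the per-sample distributions."""
    n = len(pred_dists)
    num_beams = len(pred_dists[0])
    return [sum(p[c] for p in pred_dists) / n for c in range(num_beams)]

# Hypothetical one-hot predictions over M = 4 beams for n = 4 samples.
collapsed = [[1.0, 0.0, 0.0, 0.0] for _ in range(4)]   # same beam every time
balanced = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
h_collapsed = entropy(marginal(collapsed))   # zero: all mass on one beam
h_balanced = entropy(marginal(balanced))     # log(4): uniform marginal
```

Both prediction sets have zero conditional entropy, so only the marginal entropy distinguishes the degenerate solution from the balanced one.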
[0086] In some cases, a higher value of H(P̃(y; θ)) indicates balance across label predictions, or that the marginal distribution of the predicted labels is close to a uniform distribution, which is a desired quality with a reasonable number of data samples xᵢᵗ, i = 1, ..., n. The method 400 may then set a loss function to:
$$\mathcal{L}(\theta; x^t) = w_1 \frac{1}{n} \sum_{i=1}^{n} H(P(y|x_i^t; \theta)) - w_2 H(\tilde{P}(y; \theta))$$
[0087] where w1 and w2 are the weights (e.g., the importance) assigned to each term in the loss function and are considered hyper-parameters with 0 < w1, w2 ≤ 1. When w1 = w2 = 1, ℒ(θ; xᵗ) = −I(xᵗ; y), where I(xᵗ; y) denotes the mutual information between xᵗ and y, the input and output of the NN 300.
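The identity for w1 = w2 = 1 follows from the standard decomposition of mutual information into marginal and conditional entropy. The short derivation below assumes equally likely samples, so that the first term of the loss is the conditional entropy H(y | xᵗ) and the second is the entropy of the empirical marginal; the final bound uses the fact that mutual information is non-negative and cannot exceed log M for M beams.

```latex
\begin{aligned}
\mathcal{L}(\theta; x^t)\Big|_{w_1 = w_2 = 1}
  &= \frac{1}{n}\sum_{i=1}^{n} H\big(P(y \mid x_i^t; \theta)\big)
     - H\big(\tilde{P}(y; \theta)\big) \\
  &= H(y \mid x^t) - H(y) \\
  &= -\big(H(y) - H(y \mid x^t)\big) \\
  &= -I(x^t; y),
\end{aligned}
\qquad
-\log M \;\le\; \mathcal{L}(\theta; x^t) \;\le\; 0 .
```

Minimizing the loss in this case is therefore equivalent to maximizing the mutual information between the measurements and the predicted beam.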
[0088] In some cases, because the conditional distribution P(y|x; θ) produced by the mapping fθ is differentiable in θ, the loss function is differentiable in θA. Hence, loss function minimization can be performed through gradient based methods. However, instead of straightforward minimization of the entropy, the method 400 minimizes the entropy as well as the sharpness of the entropy, as described herein.
[0089] At 408, the method 400 adapts affine parameters by minimizing the loss function. For example, the method 400 minimizes the loss function to realize:
$$\theta_A^t = \arg\min_{\theta_A} \mathcal{L}(\theta; x^t)$$
[0090] where θAᵗ denotes the adapted / modified affine parameters of all of the LN layers 325 of the beam prediction NN 300. Note that θ = {θ1, θ2, ..., θN}, and the adapted beam prediction model is denoted by fθᵗ, where θᵗ denotes the adapted parameters of the entire NN 300. The method 400 only adapts θA (e.g., the affine parameters of the LN layers 325 of the NN 300) to θAᵗ and does not change {θ\θA} (e.g., the remaining parameters of the NN 300). Thus, the final adapted model parameters, denoted by θᵗ, are θᵗ = θAᵗ ∪ {θ\θA}.
[0091] The following is an example implementation of the method 400, where a beam prediction DNN model is adapted via its LN layers using unlabeled samples. The input is a beam prediction DNN model fθ and unlabeled samples xᵗ = {x₁ᵗ, x₂ᵗ, ..., xₙᵗ} from a target domain. The output is an adapted beam prediction DNN model fθᵗ.
[0092] In step 1, a set θA is formed comprising all the affine parameters of all the LN layers of the beam prediction DNN, such that θ = θA ∪ {θ\θA}.
[0093] In step 2, xᵗ and fθ are used to determine P(y|xᵢᵗ; θ) for i = 1, ..., n.
[0094] In step 3, the conditional entropy is computed, as follows:
$$H(P(y|x^t; \theta)) = \sum_{i=1}^{n} P(x_i^t)\, H(P(y|x_i^t; \theta))$$
[0095] Next, in step 4, the entropy of the predicted labels is computed as
$$H(\tilde{P}(y; \theta)) = -\sum_{y \in \mathcal{M}} \tilde{P}(y; \theta) \log \tilde{P}(y; \theta), \qquad \text{where } \tilde{P}(y; \theta) = \frac{1}{n} \sum_{i=1}^{n} P(y|x_i^t; \theta)$$
[0096] In step 5, the loss function ℒ(θ; xᵗ) = H(P(y|xᵗ; θ)) − H(P̃(y; θ)) is computed.
[0097] In step 6, the adapted parameters θAᵗ are determined as
$$\theta_A^t = \arg\min_{\theta_A} \mathcal{L}(\theta; x^t)$$
[0098] In step 7, the adapted beam prediction DNN model parameters are given by θᵗ = θAᵗ ∪ {θ\θA}.
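Steps 1 through 7 can be sketched end to end on a toy model. This is an illustration under several loudly stated assumptions rather than the disclosed implementation: the network is a small two-LN-layer model with random fixed weights, each LN layer has one scalar (gamma, beta) pair, the samples are synthetic, central finite differences stand in for the backward pass, and a step is accepted only if it lowers the loss (with the step size halved otherwise). All names are hypothetical.

```python
import copy
import math
import random

def layer_norm(x, gamma, beta, eps=1e-5):
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return [gamma * (v - mu) / math.sqrt(var + eps) + beta for v in x]

def dense(x, w):
    return [sum(a * b for a, b in zip(row, x)) for row in w]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0.0)

def predict_proba(x, weights, affine):
    # Step 1: affine = [gamma_1, beta_1, gamma_2, beta_2] plays the role of theta_A
    h = layer_norm(x, affine[0], affine[1])
    h = [math.tanh(v) for v in dense(h, weights[0])]
    h = layer_norm(h, affine[2], affine[3])
    return softmax(dense(h, weights[1]))

def loss(affine, weights, samples, w1=1.0, w2=1.0):
    preds = [predict_proba(x, weights, affine) for x in samples]   # step 2
    cond = sum(entropy(p) for p in preds) / len(preds)             # step 3
    marg = [sum(p[c] for p in preds) / len(preds)
            for c in range(len(preds[0]))]
    return w1 * cond - w2 * entropy(marg)                          # steps 4 and 5

def adapt(affine, weights, samples, steps=40, lr=0.5, h=1e-4):
    affine = list(affine)
    current = loss(affine, weights, samples)
    for _ in range(steps):                                         # step 6
        grad = []
        for k in range(len(affine)):   # finite differences, not backpropagation
            up = affine[:k] + [affine[k] + h] + affine[k + 1:]
            dn = affine[:k] + [affine[k] - h] + affine[k + 1:]
            grad.append((loss(up, weights, samples)
                         - loss(dn, weights, samples)) / (2 * h))
        trial = [a - lr * g for a, g in zip(affine, grad)]
        trial_loss = loss(trial, weights, samples)
        if trial_loss < current:
            affine, current = trial, trial_loss   # accept the improving step
        else:
            lr *= 0.5                             # otherwise shrink the step size
    return affine, current

rng = random.Random(0)
weights = [[[rng.gauss(0, 0.5) for _ in range(8)] for _ in range(16)],
           [[rng.gauss(0, 0.5) for _ in range(16)] for _ in range(4)]]
frozen = copy.deepcopy(weights)                   # reference copy of theta \ theta_A
samples = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(20)]  # unlabeled
theta_a = [1.0, 0.0, 1.0, 0.0]
initial_loss = loss(theta_a, weights, samples)
theta_a_adapted, final_loss = adapt(theta_a, weights, samples)
# Step 7: the adapted model is (theta_a_adapted, weights); weights are unchanged.
```

Only the four affine values are ever modified, so the adapted model differs from the original by a handful of scalars. A practical system would likely use automatic differentiation instead of finite differences.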
[0099] Thus, in various embodiments, an AI / ML model employed to perform beam prediction for a cell site (e.g., a base station, such as a gNB) may be adapted to specific parameters or characteristics associated with the cell site by adapting the parameters (e.g., the affine parameters) of the LN layers of the AI / ML model using target domain data samples. In doing so, a wireless communications system may employ the beam prediction AI / ML model to assist beam prediction procedures in a targeted and efficient manner, among other benefits, via an AI / ML model (e.g., a DNN) that is adapted or otherwise tailored to a specific cell site or scenario.
[0100] Figure 5 illustrates an example of a UE 500 in accordance with aspects of the present disclosure. The UE 500 may include a processor 502, a memory 504, a controller 506, and a transceiver 508. The processor 502, the memory 504, the controller 506, or the transceiver 508, or various combinations thereof or various components thereof may be examples of means for performing various aspects of the present disclosure as described herein. These components may be coupled (e.g., operatively, communicatively, functionally, electronically, electrically) via one or more interfaces.
[0101] The processor 502, the memory 504, the controller 506, or the transceiver 508, or various combinations or components thereof may be implemented in hardware (e.g., circuitry). The hardware may include a processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or other programmable logic device, or any combination thereof configured as or otherwise supporting a means for performing the functions described in the present disclosure.
[0102] The processor 502 may include an intelligent hardware device (e.g., a general-purpose processor, a DSP, a CPU, an ASIC, an FPGA, or any combination thereof). In some implementations, the processor 502 may be configured to operate the memory 504. In some other implementations, the memory 504 may be integrated into the processor 502. The processor 502 may be configured to execute computer-readable instructions stored in the memory 504 to cause the UE 500 to perform various functions of the present disclosure.
[0103] The memory 504 may include volatile or non-volatile memory. The memory 504 may store computer-readable, computer-executable code including instructions that, when executed by the processor 502, cause the UE 500 to perform various functions described herein. The code may be stored in a non-transitory computer-readable medium such as the memory 504 or another type of memory. Computer-readable media includes both non-transitory computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A non-transitory storage medium may be any available medium that may be accessed by a general-purpose or special-purpose computer.
[0104] In some implementations, the processor 502 and the memory 504 coupled with the processor 502 may be configured to cause the UE 500 to perform one or more of the functions described herein (e.g., executing, by the processor 502, instructions stored in the memory 504).
[0105] For example, the processor 502 may support wireless communication at the UE 500 in accordance with examples as disclosed herein. The UE 500 may be configured to support a means for determining an NN model for beam prediction, by computing a set of NN parameters associated with multiple neural layers of the NN model and computing a set of affine parameters associated with at least one LN layer of the NN model, and transmitting, to a second node, a set of model parameters that includes the set of NN parameters and the set of affine parameters.
[0106] As another example, the UE 500 may be configured to support a means for determining whether to update an NN model for beam prediction, wherein the NN model for beam prediction includes multiple neural layers, at least one LN layer, and a set of affine parameters associated with at least one LN layer, and updating the set of affine parameters based on a set of input data samples.
[0107] The controller 506 may manage input and output signals for the UE 500. The controller 506 may also manage peripherals not integrated into the UE 500. In some implementations, the controller 506 may utilize an operating system such as iOS®, ANDROID®, WINDOWS®, or other operating systems. In some implementations, the controller 506 may be implemented as part of the processor 502.
[0108] In some implementations, the UE 500 may include at least one transceiver 508. In some other implementations, the UE 500 may have more than one transceiver 508. The transceiver 508 may represent a wireless transceiver. The transceiver 508 may include one or more receiver chains 510, one or more transmitter chains 512, or a combination thereof.
[0109] A receiver chain 510 may be configured to receive signals (e.g., control information, data, packets) over a wireless medium. For example, the receiver chain 510 may include one or more antennas for receiving the signal over the air or wireless medium. The receiver chain 510 may include at least one amplifier (e.g., a low-noise amplifier (LNA)) configured to amplify the received signal. The receiver chain 510 may include at least one demodulator configured to demodulate the received signal and obtain the transmitted data by reversing the modulation technique applied during transmission of the signal. The receiver chain 510 may include at least one decoder for decoding and processing the demodulated signal to receive the transmitted data.
[0110] A transmitter chain 512 may be configured to generate and transmit signals (e.g., control information, data, packets). The transmitter chain 512 may include at least one modulator for modulating data onto a carrier signal, preparing the signal for transmission over a wireless medium. The at least one modulator may be configured to support one or more techniques such as amplitude modulation (AM), frequency modulation (FM), or digital modulation schemes like phase-shift keying (PSK) or quadrature amplitude modulation (QAM). The transmitter chain 512 may also include at least one power amplifier configured to amplify the modulated signal to an appropriate power level suitable for transmission over the wireless medium. The transmitter chain 512 may also include one or more antennas for transmitting the amplified signal into the air or wireless medium.
[0111] Figure 6 illustrates an example of a processor 600 in accordance with aspects of the present disclosure. The processor 600 may be an example of a processor configured to perform various operations in accordance with examples as described herein. The processor 600 may include a controller 602 configured to perform various operations in accordance with examples as described herein. The processor 600 may optionally include at least one memory 604, which may be, for example, an L1 / L2 / L3 cache. Additionally, or alternatively, the processor 600 may optionally include one or more arithmetic-logic units (ALUs) 606. One or more of these components may be in electronic communication or otherwise coupled (e.g., operatively, communicatively, functionally, electronically, electrically) via one or more interfaces (e.g., buses).
[0112] The processor 600 may be a processor chipset and include a protocol stack (e.g., a software stack) executed by the processor chipset to perform various operations (e.g., receiving, obtaining, retrieving, transmitting, outputting, forwarding, storing, determining, identifying, accessing, writing, reading) in accordance with examples as described herein. The processor chipset may include one or more cores, one or more caches (e.g., memory local to or included in the processor chipset (e.g., the processor 600) or other memory (e.g., random access memory (RAM), read-only memory (ROM), dynamic RAM (DRAM), synchronous dynamic RAM (SDRAM), static RAM (SRAM), ferroelectric RAM (FeRAM), magnetic RAM (MRAM), resistive RAM (RRAM), flash memory, phase change memory (PCM), and others)).
[0113] The controller 602 may be configured to manage and coordinate various operations (e.g., signaling, receiving, obtaining, retrieving, transmitting, outputting, forwarding, storing, determining, identifying, accessing, writing, reading) of the processor 600 to cause the processor 600 to support various operations in accordance with examples as described herein. For example, the controller 602 may operate as a control unit of the processor 600, generating control signals that manage the operation of various components of the processor 600. These control signals include enabling or disabling functional units, selecting data paths, initiating memory access, and coordinating timing of operations.
[0114] The controller 602 may be configured to fetch (e.g., obtain, retrieve, receive) instructions from the memory 604 and determine subsequent instruction(s) to be executed to cause the processor 600 to support various operations in accordance with examples as described herein. The controller 602 may be configured to track memory addresses of instructions associated with the memory 604. The controller 602 may be configured to decode instructions to determine the operation to be performed and the operands involved. For example, the controller 602 may be configured to interpret the instruction and determine control signals to be output to other components of the processor 600 to cause the processor 600 to support various operations in accordance with examples as described herein. Additionally, or alternatively, the controller 602 may be configured to manage flow of data within the processor 600. The controller 602 may be configured to control transfer of data between registers, arithmetic logic units (ALUs), and other functional units of the processor 600.
[0115] The memory 604 may include one or more caches (e.g., memory local to or included in the processor 600 or other memory, such as RAM, ROM, DRAM, SDRAM, SRAM, MRAM, flash memory, etc.). In some implementations, the memory 604 may reside within or on a processor chipset (e.g., local to the processor 600). In some other implementations, the memory 604 may reside external to the processor chipset (e.g., remote to the processor 600).
[0116] The memory 604 may store computer-readable, computer-executable code including instructions that, when executed by the processor 600, cause the processor 600 to perform various functions described herein. The code may be stored in a non-transitory computer-readable medium such as system memory or another type of memory. The controller 602 and / or the processor 600 may be configured to execute computer-readable instructions stored in the memory 604 to cause the processor 600 to perform various functions. For example, the processor 600 and / or the controller 602 may be coupled with or to the memory 604, and the processor 600, the controller 602, and the memory 604 may be configured to perform various functions described herein. In some examples, the processor 600 may include multiple processors and the memory 604 may include multiple memories. One or more of the multiple processors may be coupled with one or more of the multiple memories, which may, individually or collectively, be configured to perform various functions herein.
[0117] The one or more ALUs 606 may be configured to support various operations in accordance with examples as described herein. In some implementations, the one or more ALUs 606 may reside within or on a processor chipset (e.g., the processor 600). In some other implementations, the one or more ALUs 606 may reside external to the processor chipset (e.g., the processor 600). One or more ALUs 606 may perform one or more computations such as addition, subtraction, multiplication, and division on data. For example, one or more ALUs 606 may receive input operands and an operation code, which determines an operation to be executed. One or more ALUs 606 may be configured with a variety of logical and arithmetic circuits, including adders, subtractors, shifters, and logic gates, to process and manipulate the data according to the operation. Additionally, or alternatively, the one or more ALUs 606 may support logical operations such as AND, OR, exclusive-OR (XOR), not-OR (NOR), and not-AND (NAND), enabling the one or more ALUs 606 to handle conditional operations, comparisons, and bitwise operations.
[0118] The processor 600 may support wireless communication in accordance with examples as disclosed herein. For example, the processor 600 may be configured to support a means for determining an NN model for beam prediction, by computing a set of NN parameters associated with multiple neural layers of the NN model and computing a set of affine parameters associated with at least one LN layer of the NN model, and transmitting, to a second node, a set of model parameters that includes the set of NN parameters and the set of affine parameters.
[0119] As another example, the processor 600 may be configured to support a means for determining whether to update an NN model for beam prediction, wherein the NN model for beam prediction includes multiple neural layers, at least one LN layer, and a set of affine parameters associated with at least one LN layer, and updating the set of affine parameters based on a set of input data samples.
[0120] Figure 7 illustrates an example of a NE 700 in accordance with aspects of the present disclosure. The NE 700 may include a processor 702, a memory 704, a controller 706, and a transceiver 708. The processor 702, the memory 704, the controller 706, or the transceiver 708, or various combinations thereof or various components thereof may be examples of means for performing various aspects of the present disclosure as described herein. These components may be coupled (e.g., operatively, communicatively, functionally, electronically, electrically) via one or more interfaces.
[0121] The processor 702, the memory 704, the controller 706, or the transceiver 708, or various combinations or components thereof may be implemented in hardware (e.g., circuitry). The hardware may include a processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or other programmable logic device, or any combination thereof configured as or otherwise supporting a means for performing the functions described in the present disclosure.
[0122] The processor 702 may include an intelligent hardware device (e.g., a general-purpose processor, a DSP, a CPU, an ASIC, an FPGA, or any combination thereof). In some implementations, the processor 702 may be configured to operate the memory 704. In some other implementations, the memory 704 may be integrated into the processor 702. The processor 702 may be configured to execute computer-readable instructions stored in the memory 704 to cause the NE 700 to perform various functions of the present disclosure.
[0123] The memory 704 may include volatile or non-volatile memory. The memory 704 may store computer-readable, computer-executable code including instructions that, when executed by the processor 702, cause the NE 700 to perform various functions described herein. The code may be stored in a non-transitory computer-readable medium such as the memory 704 or another type of memory. Computer-readable media include both non-transitory computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A non-transitory storage medium may be any available medium that may be accessed by a general-purpose or special-purpose computer.
[0124] In some implementations, the processor 702 and the memory 704 coupled with the processor 702 may be configured to cause the NE 700 to perform one or more of the functions described herein (e.g., executing, by the processor 702, instructions stored in the memory 704).
[0125] For example, the processor 702 may support wireless communication at the NE 700 in accordance with examples as disclosed herein. The NE 700 may be configured to support a means for determining an NN model for beam prediction, by computing a set of NN parameters associated with multiple neural layers of the NN model and computing a set of affine parameters associated with at least one LN layer of the NN model, and transmitting, to a second node, a set of model parameters that includes the set of NN parameters and the set of affine parameters.
[0126] As another example, the NE 700 may be configured to support a means for determining whether to update an NN model for beam prediction, wherein the NN model for beam prediction includes multiple neural layers, at least one LN layer, and a set of affine parameters associated with at least one LN layer, and updating the set of affine parameters based on a set of input data samples.
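As an illustration of what the affine parameters of an LN layer do, the following minimal Python sketch normalizes one activation vector and then applies a per-element scale and shift. The function name `layer_norm`, the `eps` stabilizer, and the sample values are assumptions made for illustration only; the disclosure does not prescribe a particular implementation.

```python
import math

def layer_norm(x, scale, shift, eps=1e-5):
    """Normalize one activation vector, then apply the affine parameters.

    scale and shift are the per-element affine parameters of the LN layer.
    eps is an assumed small constant that avoids division by zero.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, scale, shift)]

# Hypothetical activations feeding an LN layer.
activations = [1.0, 2.0, 3.0, 4.0]

# Unit scale and zero shift leave the normalized values unchanged.
identity = layer_norm(activations, scale=[1.0] * 4, shift=[0.0] * 4)

# Changing only the affine parameters rescales and offsets the layer output
# without touching any other parameter of the model.
adapted = layer_norm(activations, scale=[2.0] * 4, shift=[0.5] * 4)
```

Because the scale and shift enter only after normalization, they form a small set of parameters that can be adjusted on their own while the remaining NN parameters stay fixed.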
[0127] The controller 706 may manage input and output signals for the NE 700. The controller 706 may also manage peripherals not integrated into the NE 700. In some implementations, the controller 706 may utilize an operating system such as iOS®, ANDROID®, WINDOWS®, or other operating systems. In some implementations, the controller 706 may be implemented as part of the processor 702.
[0128] In some implementations, the NE 700 may include at least one transceiver 708. In some other implementations, the NE 700 may have more than one transceiver 708. The transceiver 708 may represent a wireless transceiver. The transceiver 708 may include one or more receiver chains 710, one or more transmitter chains 712, or a combination thereof.
[0129] A receiver chain 710 may be configured to receive signals (e.g., control information, data, packets) over a wireless medium. For example, the receiver chain 710 may include one or more antennas for receiving the signal over the air or wireless medium. The receiver chain 710 may include at least one amplifier (e.g., a low-noise amplifier (LNA)) configured to amplify the received signal. The receiver chain 710 may include at least one demodulator configured to demodulate the received signal and obtain the transmitted data by reversing the modulation technique applied during transmission of the signal. The receiver chain 710 may include at least one decoder for decoding and processing the demodulated signal to recover the transmitted data.
[0130] A transmitter chain 712 may be configured to generate and transmit signals (e.g., control information, data, packets). The transmitter chain 712 may include at least one modulator for modulating data onto a carrier signal, preparing the signal for transmission over a wireless medium. The at least one modulator may be configured to support one or more techniques such as amplitude modulation (AM), frequency modulation (FM), or digital modulation schemes like phase-shift keying (PSK) or quadrature amplitude modulation (QAM). The transmitter chain 712 may also include at least one power amplifier configured to amplify the modulated signal to an appropriate power level suitable for transmission over the wireless medium. The transmitter chain 712 may also include one or more antennas for transmitting the amplified signal into the air or wireless medium.
[0131] Figure 8 illustrates a flowchart of a method in accordance with aspects of the present disclosure. The operations of the method may be implemented by an NE as described herein. In some implementations, the NE may execute a set of instructions to control the function elements of the NE to perform the described functions.
[0132] At 802, the method may include determining an NN model for beam prediction, by computing a set of NN parameters associated with multiple neural layers of the NN model and computing a set of affine parameters associated with at least one LN layer of the NN model. The operations of 802 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 802 may be performed by an NE as described with reference to Figure 7.
[0133] At 804, the method may include transmitting, to a second node, a set of model parameters that includes the set of NN parameters and the set of affine parameters. The operations of 804 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 804 may be performed by an NE as described with reference to Figure 7.
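One way to picture the set of model parameters transmitted at 804 is as a single message that keeps the NN parameters and the affine parameters in separate groups. The sketch below is only an assumption for illustration: the field names, the layer names `dense_0` and `ln_0`, and the JSON encoding are hypothetical, and the disclosure does not define a message format.

```python
import json

def build_model_parameters(nn_parameters, affine_parameters):
    """Bundle both parameter groups into one set of model parameters."""
    return {"nn_parameters": nn_parameters, "affine_parameters": affine_parameters}

def encode_model_parameters(model_parameters):
    """Serialize the set for transfer (JSON is an assumed encoding)."""
    return json.dumps(model_parameters, sort_keys=True).encode("utf-8")

def decode_model_parameters(payload):
    """Recover the set of model parameters at the receiving node."""
    return json.loads(payload.decode("utf-8"))

# Hypothetical parameters for one dense layer and one LN layer.
nn_parameters = {"dense_0": {"weights": [[0.1, -0.2], [0.3, 0.4]], "bias": [0.0, 0.1]}}
affine_parameters = {"ln_0": {"scale": [1.0, 1.0], "shift": [0.0, 0.0]}}

message = build_model_parameters(nn_parameters, affine_parameters)
payload = encode_model_parameters(message)   # bytes handed to the transmitter
received = decode_model_parameters(payload)  # view of the second node
```

Keeping the affine parameters in their own group lets a receiving node locate and later adjust them without parsing the rest of the model.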
[0134] It should be noted that the method described herein describes a possible implementation, and that the operations and the steps may be rearranged or otherwise modified and that other implementations are possible.
[0135] Figure 9 illustrates a flowchart of a method in accordance with aspects of the present disclosure. The operations of the method may be implemented by a UE as described herein. In some implementations, the UE may execute a set of instructions to control the function elements of the UE to perform the described functions.
[0136] At 902, the method may include determining whether to update an NN model for beam prediction, wherein the NN model for beam prediction includes multiple neural layers, at least one LN layer, and a set of affine parameters associated with at least one LN layer. The operations of 902 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 902 may be performed by a UE as described with reference to Figure 5.
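The claims name several possible bases for the determination at 902, including periodic time intervals, an indication received from a first node, and a quality metric for a functionality of the NN model. The following sketch combines three of them into one decision. The class `UpdateContext`, its field names, and the threshold comparison are hypothetical choices for illustration, not a procedure defined by the disclosure.

```python
from dataclasses import dataclass

@dataclass
class UpdateContext:
    """Inputs to the update decision; all field names are hypothetical."""
    seconds_since_last_update: float
    update_period_seconds: float      # assumed configured periodic interval
    update_indication_received: bool  # e.g., an indication from a first node
    quality_metric: float             # assumed: higher means better prediction
    quality_threshold: float          # assumed minimum acceptable quality

def should_update(ctx):
    """Return True if any of the illustrated triggers calls for an update."""
    if ctx.update_indication_received:
        return True
    if ctx.seconds_since_last_update >= ctx.update_period_seconds:
        return True
    return ctx.quality_metric < ctx.quality_threshold

# Example: no indication and the period has not elapsed, but quality dropped.
ctx = UpdateContext(
    seconds_since_last_update=30.0,
    update_period_seconds=600.0,
    update_indication_received=False,
    quality_metric=0.62,
    quality_threshold=0.80,
)
decision = should_update(ctx)
```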
[0137] At 904, the method may include updating the set of affine parameters based on a set of input data samples. The operations of 904 may be performed in accordance with examples as described herein. In some implementations, aspects of the operations of 904 may be performed by a UE as described with reference to Figure 5.
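The claims describe one way to perform the update at 904: weight a measure of entropy of the model output and a measure of conditional entropy of the output, then choose the affine parameters that minimize the weighted second quantity minus the weighted first quantity. The sketch below is a toy rendering of that idea on unlabeled samples. It rests on several assumptions that are not in the disclosure: the first quantity is taken as the entropy of the batch-averaged output, the second as the average per-sample entropy, the model is a single LN layer followed by a frozen linear layer and softmax, and a finite-difference gradient with step halving stands in for backpropagation.

```python
import math

def layer_norm(x, scale, shift, eps=1e-5):
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, scale, shift)]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def predict(x, affine, weights):
    # affine holds the scale values followed by the shift values.
    half = len(affine) // 2
    hidden = layer_norm(x, affine[:half], affine[half:])
    return softmax([sum(w * h for w, h in zip(row, hidden)) for row in weights])

def objective(samples, affine, weights, w_first=1.0, w_second=1.0):
    outputs = [predict(x, affine, weights) for x in samples]
    mean_output = [sum(col) / len(outputs) for col in zip(*outputs)]
    first = entropy(mean_output)                               # entropy of the output
    second = sum(entropy(p) for p in outputs) / len(outputs)  # conditional entropy
    return w_second * second - w_first * first

def adapt_affine(samples, affine, weights, steps=25, lr=0.5, h=1e-6):
    """Update only the affine parameters; the linear weights stay frozen."""
    affine = list(affine)
    loss = objective(samples, affine, weights)
    for _ in range(steps):
        grad = []
        for i in range(len(affine)):
            bumped = affine[:i] + [affine[i] + h] + affine[i + 1:]
            grad.append((objective(samples, bumped, weights) - loss) / h)
        step = lr
        while step > 1e-8:
            trial = [a - step * g for a, g in zip(affine, grad)]
            trial_loss = objective(samples, trial, weights)
            if trial_loss < loss:  # accept only steps that lower the objective
                affine, loss = trial, trial_loss
                break
            step /= 2.0
        else:
            break  # no improving step found; stop early
    return affine, loss

# Hypothetical unlabeled input samples and frozen weights for three beams.
samples = [[1.0, 2.0, 4.0], [0.5, -1.0, 3.0], [2.0, 0.0, -1.0], [-1.0, 1.5, 0.5]]
weights = [[0.5, -0.3, 0.2], [-0.4, 0.6, 0.1], [0.1, 0.2, -0.5]]
initial_affine = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]  # unit scale, zero shift

initial_loss = objective(samples, initial_affine, weights)
adapted_affine, adapted_loss = adapt_affine(samples, initial_affine, weights)
```

No labels appear anywhere in the objective, which is what allows the update to run on unlabeled samples collected at a specific site. Under the stated assumptions, lowering the objective pushes each individual prediction toward a confident beam choice while keeping the predictions across the batch spread over several beams.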
[0138] It should be noted that the method described herein describes a possible implementation, and that the operations and the steps may be rearranged or otherwise modified and that other implementations are possible.
[0139] The description herein is provided to enable a person having ordinary skill in the art to make or use the disclosure. Various modifications to the disclosure will be apparent to a person having ordinary skill in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not limited to the examples and designs described herein but is to be accorded the broadest scope consistent with the principles and novel features disclosed herein.
Claims
What is claimed is:
1. A first node for wireless communication, comprising:
    at least one memory; and
    at least one processor coupled with the at least one memory and configured to cause the first node to:
        determine a neural network (NN) model for beam prediction, by:
            computing a set of NN parameters associated with multiple neural layers of the NN model; and
            computing a set of affine parameters associated with at least one layer normalization (LN) layer of the NN model; and
        transmit, to a second node, a set of model parameters that includes the set of NN parameters and the set of affine parameters.
2. The first node of claim 1, wherein the set of affine parameters includes a set of scale parameters and a set of shift parameters for the at least one LN layer.
3. The first node of claim 1, wherein the NN model is based on a set of data samples that includes unlabeled data samples, labeled data samples, or a combination of unlabeled data samples and labeled data samples.
4. The first node of claim 3, wherein the at least one processor is further configured to cause the first node to determine the set of data samples.
5. The first node of claim 3, wherein the at least one processor is further configured to cause the first node to receive the set of data samples from another node.
6. The first node of claim 3, wherein the at least one processor is configured to cause the first node to determine the NN model for beam prediction by:
    computing a global set of parameters, which includes the set of NN parameters and the set of affine parameters, via a learning algorithm that is based on the set of data samples.
7. A second node for wireless communication, comprising:
    at least one memory; and
    at least one processor coupled with the at least one memory and configured to cause the second node to:
        determine whether to update a neural network (NN) model for beam prediction, wherein the NN model for beam prediction includes:
            multiple neural layers;
            at least one layer normalization (LN) layer; and
            a set of affine parameters associated with at least one LN layer; and
        update the set of affine parameters based on a set of input data samples.
8. The second node of claim 7, wherein the at least one processor is further configured to cause the second node to determine the set of input data samples.
9. The second node of claim 7, wherein the at least one processor is further configured to cause the second node to receive the set of input data samples from another node.
10. The second node of claim 7, wherein the at least one processor is configured to cause the second node to update the set of affine parameters by:
    determining a first quantity associated with a measure of uncertainty in an output of the NN model resulting from the set of input data samples;
    determining a second quantity associated with a remaining uncertainty in the output of the NN model resulting from the set of input data samples;
    determining a weighted first quantity for the first quantity and a weighted second quantity for the second quantity; and
    determining the set of affine parameters by minimizing a difference between the weighted second quantity and the weighted first quantity.
11. The second node of claim 10, wherein the measure of uncertainty in the output of the NN model is a measure of entropy of the output.
12. The second node of claim 10, wherein the measure of remaining uncertainty in the output of the NN model is a measure of conditional entropy in the output.
13. The second node of claim 10, wherein the weighted first quantity and the weighted second quantity are based on weights having values between 0 and 1, inclusive.
14. The second node of claim 7, wherein the at least one processor is configured to cause the second node to determine to update the NN model based on information that indicates periodic time intervals for updating the NN model.
15. The second node of claim 7, wherein the at least one processor is configured to cause the second node to determine to update the NN model based on receiving an indication to update the NN model from a first node.
16. The second node of claim 7, wherein the at least one processor is configured to cause the second node to determine to update the NN model based on receiving a configuration associated with updating the NN model from a first node.
17. The second node of claim 7, wherein the at least one processor is configured to cause the second node to determine to update the NN model based on information that indicates certain conditions of a communications network that includes the second node.
18. The second node of claim 7, wherein the at least one processor is configured to cause the second node to determine to update the NN model based on determining a quality metric for a functionality of the NN model.
19. A method performed by a first node of a communications network, the method comprising:
    determining a neural network (NN) model for beam prediction, by:
        computing a set of NN parameters associated with multiple neural layers of the NN model; and
        computing a set of affine parameters associated with at least one layer normalization (LN) layer of the NN model; and
    transmitting, to a second node, a set of model parameters that includes the set of NN parameters and the set of affine parameters.
20. A processor for wireless communication, comprising:
    at least one controller coupled with at least one memory and configured to cause the processor to:
        determine whether to update a neural network (NN) model for beam prediction, wherein the NN model for beam prediction includes:
            multiple neural layers;
            at least one layer normalization (LN) layer; and
            a set of affine parameters associated with at least one LN layer; and
        update the set of affine parameters based on a set of input data samples.