Hybrid sequential training for encoder and decoder models
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- QUALCOMM INC
- Filing Date
- 2023-07-24
- Publication Date
- 2026-07-02
AI Technical Summary
Existing wireless communication systems face challenges in efficiently training encoder and decoder models for improved spectral efficiency and integration with multiple access technologies, particularly in 5G and NR networks, which affect the performance of wireless communication devices.
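The disclosure applies hybrid sequential training to encoder and decoder models: a first device receives, from a second device, a function that outputs gradients associated with a trained first model, and trains a second model by selecting its weights using gradients obtained by inputting activity values and inputs to that function. In a complementary aspect, a device trains a first model and transmits, to another device, a function associated with the trained model.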
Enhances the training process for encoder and decoder models, improving spectral efficiency and integration with multiple access technologies, thereby optimizing wireless communication systems.
Abstract
Description
[Technical Field]
[0001] (CROSS-REFERENCE TO RELATED APPLICATIONS)
[0001] This patent application claims priority to commonly assigned PCT Patent Application No. PCT/CN2022/129967, entitled "HYBRID SEQUENTIAL TRAINING FOR ENCODER AND DECODER MODELS," filed November 4, 2022. The disclosure of the prior application is considered part of, and incorporated by reference into, this patent application.
[0002] Aspects of the present disclosure relate generally to wireless communications and to techniques and apparatus for hybrid sequential training for encoder and decoder models.
[Background technology]
[0003] Wireless communication systems are widely deployed to provide various telecommunication services such as telephony, video, data, messaging, and broadcasting. Typical wireless communication systems may employ multiple access technologies capable of supporting communication with multiple users by sharing available system resources (e.g., bandwidth, transmit power, etc.). Examples of such multiple access technologies include Code Division Multiple Access (CDMA) systems, Time Division Multiple Access (TDMA) systems, Frequency Division Multiple Access (FDMA) systems, Orthogonal Frequency Division Multiple Access (OFDMA) systems, Single-Carrier Frequency Division Multiple Access (SC-FDMA) systems, Time Division Synchronous Code Division Multiple Access (TD-SCDMA) systems, and Long Term Evolution (LTE). LTE / LTE-Advanced is a set of extensions to the Universal Mobile Telecommunications System (UMTS) mobile standard promulgated by the Third Generation Partnership Project (3GPP).
[0004] A wireless network may include one or more network nodes that support communication for wireless communication devices, such as a user equipment (UE) or multiple UEs. A UE may communicate with a network node via downlink and uplink communications. "Downlink" (or "DL") refers to the communication link from a network node to a UE, and "uplink" (or "UL") refers to the communication link from a UE to a network node. Some wireless networks may support device-to-device communications via a local link (e.g., a sidelink (SL), a wireless local area network (WLAN) link, and / or a wireless personal area network (WPAN) link, among other examples).
[0005] The above multiple access technologies have been adopted in various telecommunications standards to provide a common protocol that allows various UEs to communicate at a city, country, region, and / or global level. New Radio (NR), sometimes referred to as 5G, is a set of enhancements to the LTE mobile standard promulgated by 3GPP. NR is designed to better support mobile broadband Internet access by improving spectral efficiency, lowering costs, improving service, utilizing new spectrum, and by better integrating with other open standards using Orthogonal Frequency Division Multiplexing (OFDM) with a Cyclic Prefix (CP) (CP-OFDM) on the downlink and CP-OFDM and / or Single-Carrier Frequency Division Multiplexing (SC-FDM) (also known as Discrete Fourier Transform Spread OFDM (DFT-s-OFDM)) on the uplink, as well as supporting beamforming, multiple-input multiple-output (MIMO) antenna technology, and carrier aggregation. As demand for mobile broadband access continues to grow, further improvements in LTE, NR, and other radio access technologies remain useful.
[Summary of the Invention]
[0006] Some aspects described herein relate to a first device for wireless communication. The first device may include one or more memories and one or more processors coupled to the one or more memories. The one or more processors may be configured to receive, from a second device, a function associated with a trained first model, the function being configured to output one or more gradients associated with the trained first model. The one or more processors may be configured to train a second model based on selecting one or more weights associated with the second model using the one or more gradients, the one or more gradients being obtained based on inputting one or more activity values and one or more inputs to the function.
[0007] Some aspects described herein relate to a first device for wireless communication. The first device may include one or more memories and one or more processors coupled to the one or more memories. The one or more processors may be configured to train a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activity values associated with outputs of the trained first model. The one or more processors may be configured to transmit, to a second device, a function associated with the trained first model, the function being configured to output one or more activity values based on ground truth inputs.
[0008] Some aspects described herein relate to a method of wireless communication performed by a first device. The method may include receiving, from a second device, a function associated with a trained first model, the function being configured to output one or more gradients associated with the trained first model. The method may include training a second model based on selecting one or more weights associated with the second model using the one or more gradients, the one or more gradients being obtained based on inputting one or more activity values and one or more inputs to the function.
[0009] Some aspects described herein relate to a method of wireless communication performed by a first device. The method may include training a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activity values associated with outputs of the trained first model. The method may include transmitting to a second device a function associated with the trained first model, the function being configured to output one or more activity values based on ground truth inputs.
[0010] Some aspects described herein relate to a non-transitory computer-readable medium storing a set of instructions for wireless communication by a first device. The set of instructions, when executed by one or more processors of the first device, may cause the first device to receive from a second device a function associated with a trained first model, the function being configured to output one or more gradients associated with the trained first model. The set of instructions, when executed by the one or more processors of the first device, may cause the first device to train a second model based on selecting one or more weights associated with the second model using the one or more gradients obtained based on inputting one or more activity values and one or more inputs to the function.
[0011] Some aspects described herein relate to a non-transitory computer-readable medium storing a set of instructions for wireless communication by a first device. The set of instructions, when executed by one or more processors of the first device, may cause the first device to train a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activity values associated with outputs of the trained first model. The set of instructions, when executed by the one or more processors of the first device, may cause the first device to transmit, to a second device, a function associated with the trained first model, the function being configured to output one or more activity values based on ground truth inputs.
[0012] Some aspects described herein relate to an apparatus for wireless communication. The apparatus may include means for receiving, from a second device, a function associated with a trained first model, the function being configured to output one or more gradients associated with the trained first model. The apparatus may include means for training a second model based on selecting one or more weights associated with the second model using the one or more gradients, the one or more gradients being obtained based on inputting one or more activity values and one or more inputs to the function.
[0013] Some aspects described herein relate to an apparatus for wireless communication. The apparatus may include means for training a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activity values associated with outputs of the trained first model. The apparatus may include means for transmitting, to a second device, a function associated with the trained first model, the function being configured to output one or more activity values based on ground truth inputs.
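The gradient-function aspects above can be illustrated with a toy sketch. It is not the claimed implementation. It assumes, purely for illustration, that the trained first model is a scalar linear decoder, that the second model is a scalar linear encoder, that the "activity values" are the encoder outputs, and that the loss is a squared reconstruction error. The names `make_gradient_function` and `train_encoder` are hypothetical.

```python
from typing import Callable, List

GradientFn = Callable[[List[float], List[float]], List[float]]


def make_gradient_function(decoder_weight: float) -> GradientFn:
    """Second device: wrap an already trained (frozen) decoder in a function
    that returns gradients of the loss with respect to the activity values.

    Assumed loss per sample: (decoder_weight * z - x) ** 2.
    """
    def gradient_fn(activations: List[float], inputs: List[float]) -> List[float]:
        return [
            2.0 * (decoder_weight * z - x) * decoder_weight
            for z, x in zip(activations, inputs)
        ]
    return gradient_fn


def train_encoder(gradient_fn: GradientFn, inputs: List[float],
                  lr: float = 0.1, steps: int = 200) -> float:
    """First device: select the encoder weight using only the gradients
    returned by the received function (the decoder itself is never seen)."""
    weight = 0.0
    for _ in range(steps):
        activations = [weight * x for x in inputs]      # encoder forward pass
        grads = gradient_fn(activations, inputs)        # dLoss/dActivation
        # Chain rule: dLoss/dWeight = mean(dLoss/dActivation * dActivation/dWeight),
        # and dActivation/dWeight = x for this linear encoder.
        grad_weight = sum(g * x for g, x in zip(grads, inputs)) / len(inputs)
        weight -= lr * grad_weight
    return weight


inputs = [0.5, -1.0, 1.5, 2.0]                # illustrative training inputs
gradient_fn = make_gradient_function(decoder_weight=0.5)
encoder_weight = train_encoder(gradient_fn, inputs)
print(f"learned encoder weight: {encoder_weight:.6f}")
```

With the assumed decoder weight of 0.5, the encoder weight converges toward 2.0, so that decoding the encoder output reproduces the input. The point of the sketch is that the encoder side only needs the received function, not the decoder parameters.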
[0014] Aspects generally include methods, apparatus, systems, computer program products, non-transitory computer-readable media, user equipment, base stations, network entities, network nodes, wireless communication devices, and / or processing systems substantially as described herein with reference to the drawings and this specification, and as illustrated by the drawings and this specification.
[0015] The foregoing has outlined rather broadly the features and technical advantages of embodiments according to the present disclosure in order that the following Detailed Description may be better understood. Additional features and advantages will be described hereinafter. The concepts and specific examples disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present disclosure. Such equivalent constructions do not depart from the scope of the appended claims. The nature of the concepts disclosed herein, both their organization and method of operation, together with associated advantages, will be better understood by considering the following description in conjunction with the accompanying figures. Each of the figures is provided for purposes of illustration and description, and not as a definition of the limits of the claims.
[0016] Although aspects are described in this disclosure by way of example with respect to some examples, those skilled in the art will understand that such aspects can be implemented in many different configurations and scenarios. The techniques described herein can be implemented using a variety of platform types, devices, systems, shapes, sizes, and / or packaging configurations. For example, some aspects can be implemented via integrated chip embodiments or other non-modular component-based devices (e.g., end-user devices, vehicles, communications devices, computing devices, industrial equipment, retail / purchasing devices, medical devices, and / or artificial intelligence devices). Aspects can be implemented in chip-level components, modular components, non-modular components, non-chip-level components, device-level components, and / or system-level components. Devices incorporating the described aspects and features may include additional components and features for implementing and practicing the claimed and described aspects. For example, the transmission and reception of wireless signals may include one or more components (e.g., hardware components including antennas, radio frequency (RF) chains, power amplifiers, modulators, buffers, processors, interleavers, summers, and / or analog summers) for analog and digital purposes. It is contemplated that aspects described herein may be practiced in a wide variety of devices, components, systems, distributed configurations, and / or end-user devices of various sizes, shapes, and configurations.
[Brief explanation of the drawings]
[0017] So that the above-recited features of the present disclosure can be understood in detail, a more particular description, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the accompanying drawings. However, it should be noted that the accompanying drawings only show certain exemplary embodiments of the present disclosure, and therefore should not be considered as limiting the scope of the present disclosure, since the description may admit of other equally effective embodiments. The same reference numbers in different drawings may identify the same or similar elements.
[Figure 1]
[0018] FIG. 1 illustrates an example of a wireless network according to the present disclosure.
[Figure 2]
[0019] FIG. 2 illustrates an example of a network node in communication with a user equipment (UE) in a wireless network, according to the present disclosure.
[Figure 3]
[0020] FIG. 3 illustrates an exemplary disaggregated base station architecture in accordance with the present disclosure.
[Figure 4]
[0021] FIG. 4 illustrates an example architecture of a functional framework for radio access network intelligence enabled by data collection, in accordance with the present disclosure.
[Figure 5]
[0022] FIG. 5 illustrates an example architecture associated with artificial intelligence / machine learning (AI / ML) based channel state feedback compression according to the present disclosure.
[Figure 6]
[0023] FIG. 6 illustrates an example associated with multi-vendor AI / ML training in accordance with the present disclosure.
[Figure 7A]
[0024] FIG. 7A illustrates an example associated with joint training for encoder and decoder models, in accordance with the present disclosure.
[Figure 7B]
FIG. 7B illustrates an example associated with joint training for encoder and decoder models, in accordance with the present disclosure.
[Figure 8]
[0025] FIG. 8 illustrates an example associated with sequential training for encoder and decoder models according to the present disclosure.
[Figure 9A]
[0026] FIG. 9A illustrates an example associated with vector quantization according to the present disclosure.
[Figure 9B]
FIG. 9B illustrates an example associated with vector quantization according to the present disclosure.
[Figure 10]
[0027] FIG. 10 is a diagram of an example associated with hybrid sequential training for encoder and decoder models according to the present disclosure.
[Figure 11]
[0028] FIG. 11 is a diagram of an example associated with hybrid sequential training for encoder and decoder models according to the present disclosure.
[Figure 12]
[0029] FIG. 12 illustrates an exemplary process performed, for example, by a first device, according to the present disclosure.
[Figure 13]
[0030] FIG. 13 illustrates an exemplary process performed, for example, by a first device, according to the present disclosure.
[Figure 14]
[0031] FIG. 14 is a diagram of an exemplary apparatus for wireless communication according to the present disclosure.
[Figure 15]
[0032] FIG. 15 is a diagram of an exemplary apparatus for wireless communication according to the present disclosure.
[Detailed Description of the Invention]
[0033] Various aspects of the present disclosure will now be described more fully with reference to the accompanying drawings. However, the present disclosure may be embodied in many different forms and should not be construed as limited to any specific structure or function presented throughout this disclosure. Rather, these aspects are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. Those skilled in the art will appreciate that the scope of the present disclosure is intended to encompass all aspects of the present disclosure disclosed herein, whether implemented independently or in combination with any other aspects of the present disclosure. For example, an apparatus can be implemented or a method can be practiced using any number of the aspects described herein. Furthermore, the scope of the present disclosure is intended to encompass such apparatuses or methods practiced using other structure, functionality, or structure and functionality in addition to or other than the various aspects of the present disclosure described herein. It should be understood that any aspect of the present disclosure disclosed herein can be embodied by one or more elements of a claim.
[0034] Several aspects of telecommunications systems will now be presented with reference to various devices and techniques. These devices and techniques are described in the Detailed Description below and illustrated in the accompanying drawings by various blocks, modules, components, circuits, steps, processes, algorithms, etc. (collectively referred to as "elements"). These elements may be implemented using hardware, software, or a combination thereof. Whether such elements are implemented as hardware or software depends on the particular application and design constraints imposed on the overall system.
[0035] Although aspects may be described herein using terminology commonly associated with 5G or New Radio (NR) radio access technology (RAT), aspects of the present disclosure may also be applied to other RATs, such as 3G RAT, 4G RAT, and / or RATs subsequent to 5G (e.g., 6G).
[0036] FIG. 1 illustrates an example of a wireless network 100 in accordance with the present disclosure. Wireless network 100 may be, or may include elements of, a 5G (e.g., NR) network and / or a 4G (e.g., Long Term Evolution (LTE)) network, among other examples. Wireless network 100 may include one or more network nodes 110 (shown as network node 110a, network node 110b, network node 110c, and network node 110d), a user equipment (UE) 120 or multiple UEs 120 (shown as UE 120a, UE 120b, UE 120c, UE 120d, and UE 120e), and / or other entities. Network node 110 is a network node that communicates with UE 120. As shown, network node 110 may include one or more network nodes. For example, network node 110 may be an aggregated network node, meaning that the aggregated network node is configured to utilize a radio protocol stack that is physically or logically integrated within a single radio access network (RAN) node (e.g., within a single device or unit). As another example, network node 110 may be a disaggregated network node (sometimes referred to as a disaggregated base station), meaning that network node 110 is configured to utilize a protocol stack that is physically or logically distributed among two or more nodes (e.g., one or more central units (CUs), one or more distributed units (DUs), or one or more radio units (RUs)).
[0037] In some embodiments, the network node 110 is or includes a network node, such as a RU, that communicates with the UE 120 over a radio access link. In some embodiments, the network node 110 is or includes a network node, such as a DU, that communicates with other network nodes 110 over a fronthaul link or a midhaul link. In some embodiments, the network node 110 is or includes a network node, such as a CU, that communicates with other network nodes 110 over a midhaul link or with a core network over a backhaul link. In some embodiments, the network node 110 (e.g., an aggregated network node 110 or a disaggregated network node 110) may include multiple network nodes, such as one or more RUs, one or more CUs, and / or one or more DUs. The network nodes 110 may include, for example, NR base stations, LTE base stations, Node Bs, eNBs (e.g., in 4G), gNBs (e.g., in 5G), access points, transmission reception points (TRPs), DUs, RUs, CUs, network mobility elements, core network nodes, network elements, network equipment, RAN nodes, or combinations thereof. In some embodiments, the network nodes 110 may be interconnected to each other or to one or more other network nodes 110 within the wireless network 100 through various types of fronthaul, midhaul, and / or backhaul interfaces, such as direct physical connections, air interfaces, or virtual networks, using any suitable transport network.
[0038] In some embodiments, a network node 110 may provide communication coverage for a particular geographic area. In the Third Generation Partnership Project (3GPP), the term “cell” can refer to the coverage area of the network node 110 and / or a network node subsystem serving that coverage area, depending on the context in which the term is used. The network node 110 may provide communication coverage for a macro cell, a pico cell, a femto cell, and / or another type of cell. A macro cell may cover a relatively large geographic area (e.g., a few kilometers in radius) and may allow unrestricted access by UEs 120 with service subscriptions. A pico cell may cover a relatively small geographic area and may allow unrestricted access by UEs 120 with service subscriptions. A femto cell may cover a relatively small geographic area (e.g., a home) and may allow restricted access by UEs 120 having an association with the femto cell (e.g., UEs 120 in a closed subscriber group (CSG)). A network node 110 for a macro cell may be referred to as a macro network node. A network node 110 for a pico cell may be referred to as a pico network node. A network node 110 for a femto cell may be referred to as a femto network node or a home network node. In the example shown in FIG. 1, network node 110a may be a macro network node for macro cell 102a, network node 110b may be a pico network node for pico cell 102b, and network node 110c may be a femto network node for femto cell 102c. A network node may support one or multiple (e.g., three) cells. In some examples, a cell may not necessarily be fixed, and the geographic area of a cell may move according to the location of a network node 110 that is mobile (e.g., a mobile network node).
[0039] In some aspects, the term “base station” or “network node” may refer to an aggregated base station, a disaggregated base station, an integrated access and backhaul (IAB) node, a relay node, or one or more components thereof. For example, in some aspects, a “base station” or a “network node” may refer to a CU, a DU, an RU, a Near-Real Time (Near-RT) RAN Intelligent Controller (RIC), or a Non-Real Time (Non-RT) RIC, or a combination thereof. In some aspects, the term “base station” or “network node” may refer to one device configured to perform one or more functions, such as those described herein with respect to the network node 110. In some aspects, the term “base station” or “network node” may refer to multiple devices configured to perform one or more functions. For example, in some distributed systems, multiple different devices (which may be located in the same geographic location or different geographic locations) may each be configured to perform at least a portion of the functions or to replicate the performance of at least a portion of the functions, and the term “base station” or “network node” may refer to any one or more of those different devices. In some aspects, the term “base station” or “network node” may refer to one or more virtual base stations or one or more virtual base station functions. For example, in some aspects, two or more base station functions may be instantiated on a single device. In some aspects, the term “base station” or “network node” may refer to one of the base station functions and not another base station function. In this manner, a single device may include more than one base station.
[0040] The wireless network 100 may include one or more relay stations. A relay station is a network node that can receive a data transmission from an upstream node (e.g., a network node 110 or a UE 120) and transmit the data transmission to a downstream node (e.g., a UE 120 or a network node 110). A relay station may also be a UE 120 that can relay a transmission for another UE 120. In the embodiment shown in FIG. 1, network node 110d (e.g., a relay network node) can communicate with network node 110a (e.g., a macro network node) and UE 120d to facilitate communications between network node 110a (e.g., a macro network node) and UE 120d. A network node 110 that relays communications may also be referred to as a relay station, a relay base station, a relay network node, a relay node, a repeater, etc.
[0041] The wireless network 100 may be a heterogeneous network that includes different types of network nodes 110, such as macro network nodes, pico network nodes, femto network nodes, relay network nodes, etc. These different types of network nodes 110 may have different transmit power levels, different coverage areas, and / or different susceptibility to interference within the wireless network 100. For example, the macro network nodes may have high transmit power levels (e.g., 5-40 watts), while the pico network nodes, femto network nodes, and relay network nodes may have lower transmit power levels (e.g., 0.1-2 watts).
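Transmit power is often quoted in dBm rather than watts. As a small illustration only, the example power levels above can be converted with the standard relation P(dBm) = 10·log10(P(W) × 1000). The function name `watts_to_dbm` is hypothetical and is not part of the disclosure.

```python
import math


def watts_to_dbm(power_watts: float) -> float:
    """Convert a power in watts to dBm (decibels relative to 1 milliwatt)."""
    if power_watts <= 0:
        raise ValueError("power must be positive")
    return 10.0 * math.log10(power_watts * 1000.0)


# Example power levels quoted in the paragraph above.
examples = {
    "macro node, low end": 5.0,
    "macro node, high end": 40.0,
    "pico/femto/relay node, low end": 0.1,
    "pico/femto/relay node, high end": 2.0,
}
for label, watts in examples.items():
    print(f"{label}: {watts} W = {watts_to_dbm(watts):.2f} dBm")
```

For instance, 0.1 W corresponds to 20 dBm and 40 W to roughly 46 dBm, which gives a sense of the spread between node types in a heterogeneous network.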
[0042] A network controller 130 may be coupled to or in communication with a set of network nodes 110 and may provide coordination and control for these network nodes 110. The network controller 130 may communicate with the network nodes 110 via backhaul or midhaul communication links. The network nodes 110 may communicate with each other directly or indirectly via wireless or wired backhaul communication links. In some aspects, the network controller 130 may be or may include a CU or a core network device.
[0043] The UEs 120 may be dispersed throughout the wireless network 100, and each UE 120 may be stationary or mobile. The UEs 120 may include, for example, access terminals, terminals, mobile stations, and / or subscriber units. The UE 120 may be a mobile phone (e.g., a smartphone), a personal digital assistant (PDA), a wireless modem, a wireless communication device, a handheld device, a laptop computer, a cordless phone, a wireless local loop (WLL) station, a tablet, a camera, a gaming device, a netbook, a smartbook, an ultrabook, a medical device, a biometric device, a wearable device (e.g., a smart watch, smart clothing, smart glasses, smart wristband, smart jewelry (e.g., a smart ring or smart bracelet)), an entertainment device (e.g., a music device, a video device, and / or satellite radio), a vehicle component or sensor, a smart meter / sensor, industrial manufacturing equipment, a global positioning system device, a UE function of a network node, and / or any other suitable device configured to communicate over a wireless or wired medium.
[0044] Some UEs 120 may be considered machine-type communication (MTC) UEs or evolved or enhanced machine-type communication (eMTC) UEs. MTC UEs and / or eMTC UEs may include, for example, robots, drones, remote devices, sensors, meters, monitors, and / or location tags capable of communicating with a network node, another device (e.g., a remote device), or some other entity. Some UEs 120 may be considered Internet-of-Things (IoT) devices and / or may be implemented as narrowband IoT (NB-IoT) devices. Some UEs 120 may be considered customer premises equipment. The UE 120 may be included within a housing that houses components of the UE 120, such as a processor component and / or a memory component. In some embodiments, the processor component and the memory component may be coupled together. For example, a processor component (e.g., one or more processors) and a memory component (e.g., memory) may be operatively coupled, communicatively coupled, electronically coupled, and / or electrically coupled.
[0045] In general, any number of wireless networks 100 may be deployed within a given geographic area. Each wireless network 100 may support a particular RAT and may operate on one or more frequencies. A RAT may also be referred to as a radio technology, air interface, etc. A frequency may also be referred to as a carrier, frequency channel, etc. To avoid interference between wireless networks of different RATs, each frequency may support a single RAT within a given geographic area. In some cases, NR or 5G RAT networks may be deployed.
[0046] In some embodiments, two or more UEs 120 (e.g., those shown as UE 120a and UE 120e) may communicate directly (e.g., without using network node 110 as an intermediary to communicate with each other) using one or more sidelink channels. For example, the UEs 120 may communicate using peer-to-peer (P2P) communication, device-to-device (D2D) communication, a vehicle-to-everything (V2X) protocol (which may include, e.g., a vehicle-to-vehicle (V2V) protocol, a vehicle-to-infrastructure (V2I) protocol, or a vehicle-to-pedestrian (V2P) protocol), and / or a mesh network. In such embodiments, the UEs 120 may perform scheduling operations, resource selection operations, and / or other operations described elsewhere herein as being performed by the network node 110.
[0047] In some examples, wireless network 100 may include one or more servers, such as servers 135a, 135b, and 135c. In some examples, servers 135a, 135b, and 135c may be wirelessly connected or otherwise connected, such as via a wired connection. Servers 135a, 135b, and 135c may be UE-side servers and may communicate with one or more UEs, such as UEs 120a, 120b, and / or 120c. For example, server 135a may be a UE-side server associated with a first UE vendor and may communicate with UE 120a (e.g., a UE associated with the first UE vendor). Server 135b may be a second UE-side server associated with a second UE vendor different from the first UE vendor and may communicate with UE 120b (e.g., UE 120b may be associated with the second UE vendor). A vendor may be a manufacturer or entity that designs, markets, maintains, and / or sells a device (such as a UE or a network node), or one or more components of a device, among other examples. Server 135b may have similar functionality to server 135a, as described in more detail elsewhere herein, for example, with respect to Figures 10-15. Server 135c may be a network-side server and may communicate with one or more network nodes, such as network node 110a. Servers 135a, 135b, and 135c may also communicate with each other. Servers 135a, 135b, and 135c may communicate using various wireless or wired technologies, such as Ethernet, Wi-Fi, or cellular technologies. For example, servers 135a and 135b may each host an encoder and train the encoder to be used by one or more UEs in encoding information, such as sensed channel condition feedback from reference signals transmitted by one or more network nodes, e.g., by using one or more machine learning (ML) algorithms, as described in more detail elsewhere herein. Server 135c may host a decoder and train the decoder for use by one or more network nodes in decoding information, e.g., by using one or more ML algorithms, as described in more detail elsewhere herein.
In some examples, a UE-side server and a network-side server may cooperate to train an encoder for use by UEs in encoding information for transmission to the network nodes. For example, server 135a may provide input information to server 135c, such as sensed channel condition feedback received by server 135a from one or more UEs. Server 135c may train the decoder and encoder using the received input information and provide the training information to server 135a for use by server 135a in training the encoder to be provided to one or more UEs. Server 135a may train the encoder using the training information and may send encoder parameters for the trained encoder to one or more UEs, such as UE 120a, for use in encoding information to be transmitted to the network nodes.
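The server cooperation described above can be sketched as a three-step flow. This is a toy illustration under explicit assumptions, not the disclosed procedure: the models are scalar and linear, the "training information" returned by the network-side server is assumed to be pairs of inputs and reference-encoder outputs, and the UE-side server fits its encoder to those pairs by least squares. The names `network_server_train` and `ue_server_train` are hypothetical.

```python
from typing import List, Tuple

TrainingInfo = List[Tuple[float, float]]   # assumed form: (input, reference activation)


def network_server_train(inputs: List[float], lr: float = 0.05,
                         steps: int = 200) -> Tuple[float, TrainingInfo]:
    """Network-side server (135c in the text): jointly train a reference
    encoder and a decoder, then package training information for the UE side."""
    enc, dec = 0.5, 0.5
    mean_sq = sum(x * x for x in inputs) / len(inputs)
    for _ in range(steps):
        # Loss = mean((dec * enc * x - x) ** 2) = (enc * dec - 1) ** 2 * mean_sq
        err = enc * dec - 1.0
        grad_enc = 2.0 * dec * err * mean_sq
        grad_dec = 2.0 * enc * err * mean_sq
        enc -= lr * grad_enc
        dec -= lr * grad_dec
    training_info = [(x, enc * x) for x in inputs]
    return dec, training_info


def ue_server_train(training_info: TrainingInfo) -> float:
    """UE-side server (135a in the text): fit its own encoder weight to the
    received pairs with a closed-form least-squares solution."""
    numerator = sum(z * x for x, z in training_info)
    denominator = sum(x * x for x, _ in training_info)
    return numerator / denominator


# Step 1: the UE-side server forwards collected feedback (illustrative values).
feedback = [0.5, -1.0, 1.5, 2.0]
# Step 2: the network-side server trains and returns training information.
decoder_weight, training_info = network_server_train(feedback)
# Step 3: the UE-side server trains the encoder it would later send to UEs.
encoder_weight = ue_server_train(training_info)
reconstruction = [decoder_weight * encoder_weight * x for x in feedback]
print(encoder_weight, decoder_weight, reconstruction)
```

In this sketch the decoder parameters never leave the network side; only the training information crosses the server boundary, which is one way to read the multi-vendor motivation in the text.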
[0033]
[0048] Devices of wireless network 100 may communicate using an electromagnetic spectrum, which may be subdivided by frequency or wavelength into various classes, bands, channels, etc. For example, devices of wireless network 100 may communicate using one or more operating bands. In 5G NR, two initial operating bands have been identified with frequency range designations FR1 (410 MHz to 7.125 GHz) and FR2 (24.25 GHz to 52.6 GHz). It should be understood that FR1 is often referred to (interchangeably) as the “sub-6 GHz” band in various documents and papers, although portions of FR1 are above 6 GHz. Similar nomenclature issues may arise with respect to FR2, which is often referred to (interchangeably) as the “millimeter wave” band in documents and papers, even though it is different from the extremely high frequency (EHF) band (30 GHz to 300 GHz), which is identified by the International Telecommunications Union (ITU) as the “millimeter wave” band.
[0034]
[0049] Frequencies between FR1 and FR2 are often referred to as mid-band frequencies. Recent 5G NR studies have identified the operating band for these mid-band frequencies as a frequency range designated FR3 (7.125 GHz to 24.25 GHz). Frequency bands included within FR3 may inherit FR1 and / or FR2 characteristics, thus effectively extending the characteristics of FR1 and / or FR2 to the mid-band frequencies. Higher frequency bands are currently being explored to extend 5G NR operation beyond 52.6 GHz. For example, three higher operating bands have been identified as frequency ranges designated FR4a or FR4-1 (52.6 GHz to 71 GHz), FR4 (52.6 GHz to 114.25 GHz), and FR5 (114.25 GHz to 300 GHz). Each of these higher frequency bands is included within the EHF band.
[0035]
[0050] With the above examples in mind, it should be understood that, unless otherwise specified, terms such as "sub-6 GHz," as used herein, may broadly refer to frequencies that may be below 6 GHz, frequencies that may be in the FR1 range, or frequencies that may include mid-band frequencies. Furthermore, unless otherwise specified, it should be understood that terms such as "millimeter wave," as used herein, may broadly refer to frequencies that may include mid-band frequencies, frequencies that may be in the FR2, FR4, FR4-a, or FR4-1, and / or FR5 ranges, or frequencies that may be in the EHF band. It is contemplated that frequencies included within these operating bands (e.g., FR1, FR2, FR3, FR4, FR4-a, FR4-1, and / or FR5) may be modified, and the techniques described herein may be applicable to those modified frequency ranges.
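The frequency range designations discussed above can be captured in a small lookup table. The following Python sketch is illustrative only: the name `designations_for`, the combined label "FR4a/FR4-1", and the treatment of both range edges as inclusive are assumptions and are not part of this disclosure.

```python
# Frequency range designations as listed in the description (values in GHz).
# Treating both edges of each range as inclusive is an assumption.
FREQUENCY_RANGES_GHZ = {
    "FR1": (0.410, 7.125),
    "FR2": (24.25, 52.6),
    "FR3": (7.125, 24.25),
    "FR4a/FR4-1": (52.6, 71.0),
    "FR4": (52.6, 114.25),
    "FR5": (114.25, 300.0),
    "EHF": (30.0, 300.0),
}


def designations_for(freq_ghz: float) -> list[str]:
    """Return every designation whose range contains freq_ghz."""
    return [
        name
        for name, (low, high) in FREQUENCY_RANGES_GHZ.items()
        if low <= freq_ghz <= high
    ]
```

For example, `designations_for(6.5)` returns `["FR1"]` even though 6.5 GHz is above 6 GHz, and `designations_for(28.0)` returns `["FR2"]` without "EHF", which mirrors the nomenclature issues noted for "sub-6 GHz" and "millimeter wave."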
[0036]
[0051] In some aspects, a server (e.g., server 135a, 135b, and / or 135c) may include communications manager 140. A server may also be referred to herein as a “server device.” As described in more detail elsewhere herein, communications manager 140 may receive, from another server, a function associated with a trained first model, the function being configured to output one or more gradients associated with the trained first model, and train a second model based on selecting one or more weights associated with a second model using the one or more gradients, the one or more gradients being obtained based on inputting one or more activation values and one or more inputs into the function. Additionally or alternatively, communications manager 140 may perform one or more other operations described herein.
[0037]
[0052] In some aspects, a server (e.g., server 135a, 135b, and / or 135c) may include a communications manager 150. As described in more detail elsewhere herein, communications manager 150 may train a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activation values associated with the output of the trained first model, and transmit to another server a function associated with the trained first model, the function being configured to output one or more activation values based on ground truth. Additionally or alternatively, communications manager 150 may perform one or more other operations described herein.
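As a minimal sketch of the exchange described for communications managers 140 and 150, the following Python example assumes scalar linear models and a squared reconstruction error. One device wraps its trained first model (here a decoder) in a function that returns a gradient for a given activation value and input, and the other device selects the weight of a second model (here an encoder) using only those gradients. The names `make_gradient_function` and `train_encoder`, the loss, the learning rate, and the model shapes are all hypothetical and are chosen only to make the data flow concrete.

```python
from typing import Callable, Sequence

GradientFn = Callable[[float, float], float]


def make_gradient_function(decoder_weight: float) -> GradientFn:
    """Device holding the trained first model: expose it only as a function
    mapping (activation, input) to the gradient of the squared
    reconstruction error with respect to the activation."""

    def gradient_fn(activation: float, model_input: float) -> float:
        reconstruction = decoder_weight * activation
        # d/d(activation) of (reconstruction - model_input) ** 2
        return 2.0 * (reconstruction - model_input) * decoder_weight

    return gradient_fn


def train_encoder(
    gradient_fn: GradientFn,
    inputs: Sequence[float],
    learning_rate: float = 0.1,
    epochs: int = 200,
) -> float:
    """Device training the second model: select the encoder weight using
    gradients obtained by passing activations and inputs to gradient_fn."""
    encoder_weight = 0.0
    for _ in range(epochs):
        for x in inputs:
            activation = encoder_weight * x
            grad_activation = gradient_fn(activation, x)
            # Chain rule: d(activation)/d(encoder_weight) = x
            encoder_weight -= learning_rate * grad_activation * x
    return encoder_weight
```

With a decoder weight of 0.5, `train_encoder(make_gradient_function(0.5), [1.0, 2.0, -1.5, 0.5])` converges toward an encoder weight of 2.0, the value at which the decoder reconstructs each input. In this sketch, the device training the encoder never accesses the decoder weight directly.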
[0038]
[0053] As noted above, Figure 1 is provided as an example. Other examples may differ from those described with respect to Figure 1.
[0039]
[0054] FIG. 2 illustrates an example 200 of a network node 110 in communication with a UE 120 in a wireless network 100 in accordance with the present disclosure. The network node 110 may be equipped with a set of antennas 234a through 234t, such as T antennas (T ≥ 1). The UE 120 may be equipped with a set of antennas 252a through 252r, such as R antennas (R ≥ 1). The network node 110 of example 200 includes one or more radio frequency components, such as the antennas 234 and a modem 232. In some embodiments, the network node 110 may include an interface, a communication component, or another component that facilitates communication with the UE 120 or another network node. Some network nodes 110 may not include a radio frequency component that facilitates direct communication with the UE 120, such as one or more CUs or one or more DUs.
[0040]
[0055] At the network node 110, a transmit processor 220 may receive data destined for a UE 120 (or set of UEs 120) from a data source 212. The transmit processor 220 may select one or more modulation and coding schemes (MCSs) for the UE 120 based at least in part on one or more channel quality indicators (CQIs) received from the UE 120. The network node 110 may process (e.g., encode and modulate) data for the UE 120 and provide data symbols for the UE 120 based at least in part on the MCS(s) selected for the UE 120. The transmit processor 220 may process system information (e.g., for semi-static resource partitioning information (SRPI)) and control information (e.g., CQI requests, grants, and / or higher layer signaling) and provide overhead symbols and control symbols. The transmit processor 220 may generate reference symbols for a reference signal (e.g., a cell-specific reference signal (CRS) or a demodulation reference signal (DMRS)) and a synchronization signal (e.g., a primary synchronization signal (PSS) or a secondary synchronization signal (SSS)). The transmit (TX) multiple-input multiple-output (MIMO) processor 230 may perform spatial processing (e.g., precoding) on the data symbols, control symbols, overhead symbols, and / or reference symbols, if applicable, and may provide a set of output symbol streams (e.g., T output symbol streams) to a corresponding set of modems 232 (e.g., T modems), depicted as modems 232a through 232t. For example, each output symbol stream may be provided to a modulator component (denoted as MOD) of modem 232. Each modem 232 may use a corresponding modulator component to process (e.g., for OFDM) the corresponding output symbol stream to obtain an output sample stream. Each modem 232 may further use a corresponding modulator component to process (e.g., convert to analog, amplify, filter, and / or upconvert) the output sample stream to obtain a downlink signal.
Modems 232a through 232t may transmit a set of downlink signals (e.g., T downlink signals) via a corresponding set of antennas 234 (e.g., T antennas), denoted as antennas 234a through 234t.
[0041]
[0056] At the UE 120, a set of antennas 252 (depicted as antennas 252a through 252r) may receive downlink signals from the network node 110 and / or other network nodes 110 and may provide a set of received signals (e.g., R received signals) to a set of modems 254 (e.g., R modems) depicted as modems 254a through 254r. For example, each received signal may be provided to a demodulator component (depicted as DEMOD) of the modem 254. Each modem 254 may obtain input samples by conditioning (e.g., filtering, amplifying, downconverting, and / or digitizing) the received signal using a corresponding demodulator component. Each modem 254 may further process the input samples (e.g., for OFDM) using the demodulator component to obtain received symbols. A MIMO detector 256 may obtain received symbols from the modems 254, perform MIMO detection on the received symbols, if applicable, and provide the detected symbols. The receive processor 258 may process (e.g., demodulate and decode) the detected symbols, provide decoded data for the UE 120 to a data sink 260, and provide decoded control and system information to a controller / processor 280. The term “controller / processor” may refer to one or more controllers, one or more processors, or a combination thereof. The channel processor may determine a reference signal received power (RSRP) parameter, a received signal strength indicator (RSSI) parameter, a reference signal received quality (RSRQ) parameter, and / or a CQI parameter, among other examples. In some embodiments, one or more components of the UE 120 may be included within a housing 284.
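The OFDM processing performed by the modulator and demodulator components can be illustrated with a short loopback sketch. The Python example below assumes an ideal channel and one symbol per subcarrier, and it uses a naive discrete Fourier transform for clarity. The function names, the cyclic prefix length, and the QPSK test symbols are hypothetical. Analog conversion, amplification, filtering, frequency conversion, and MIMO detection are outside the scope of the sketch.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (inverse when requested)."""
    n = len(values)
    sign = 1 if inverse else -1
    scale = 1.0 / n if inverse else 1.0
    return [
        scale * sum(
            v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
            for m, v in enumerate(values)
        )
        for k in range(n)
    ]


def ofdm_modulate(symbols, cp_len):
    """Modulator side: map one symbol per subcarrier to time-domain
    samples and prepend a cyclic prefix of cp_len samples."""
    samples = dft(symbols, inverse=True)
    return samples[len(samples) - cp_len:] + samples


def ofdm_demodulate(samples, cp_len):
    """Demodulator side: drop the cyclic prefix and recover the
    per-subcarrier received symbols."""
    return dft(samples[cp_len:])
```

Passing the QPSK symbols `[1+1j, -1+1j, -1-1j, 1-1j]` through `ofdm_modulate(..., cp_len=1)` and then `ofdm_demodulate(..., cp_len=1)` recovers the same symbols to within floating-point error, because no channel impairment is modeled.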
[0042]
[0057] The network controller 130 may include a communication unit 294, a controller / processor 290, and a memory 292. The network controller 130 may include, for example, one or more devices in a core network. The network controller 130 may communicate with the network node 110 via the communication unit 294.
[0043]
[0058] One or more antennas (e.g., antennas 234a-t and / or antennas 252a-r) may include or be contained within one or more antenna panels, one or more antenna groups, one or more sets of antenna elements, and / or one or more antenna arrays, among other examples. An antenna panel, antenna group, set of antenna elements, and / or antenna array may include one or more antenna elements (in a single housing or multiple housings), a set of coplanar antenna elements, a set of non-coplanar antenna elements, and / or one or more antenna elements coupled to one or more transmitting and / or receiving components, such as one or more components of FIG. 2.
[0044]
[0059] On the uplink, at the UE 120, the transmit processor 264 may receive and process data from the data source 262 and control information from the controller / processor 280 (e.g., for reports including RSRP, RSSI, RSRQ, and / or CQI). The transmit processor 264 may generate reference symbols for one or more reference signals. The symbols from the transmit processor 264 may be precoded by the TX MIMO processor 266, if applicable, further processed by the modem 254 (e.g., for DFT-s-OFDM or CP-OFDM), and transmitted to the network node 110. In some embodiments, the modem 254 of the UE 120 may include a modulator and a demodulator. In some embodiments, the UE 120 includes a transceiver. The transceiver may include any combination of the antenna(s) 252, the modem(s) 254, the MIMO detector 256, the receive processor 258, the transmit processor 264, and / or the TX MIMO processor 266. The transceiver may be used by a processor (e.g., controller / processor 280) and memory 282 to implement aspects of any of the methods described herein (e.g., with reference to Figures 10-15).
[0045]
[0060] At the network node 110, uplink signals from the UE 120 and / or other UEs may be received by an antenna 234, processed by a modem 232 (e.g., a demodulator component of the modem 232, denoted as DEMOD), detected by a MIMO detector 236, if applicable, and further processed by a receive processor 238 to obtain decoded data and control information sent by the UE 120. The receive processor 238 may provide the decoded data to a data sink 239 and the decoded control information to a controller / processor 240. The network node 110 may include a communication unit 244 and may communicate with the network controller 130 via the communication unit 244. The network node 110 may include a scheduler 246 for scheduling one or more UEs 120 for downlink and / or uplink communications. In some embodiments, the modem 232 of the network node 110 may include a modulator and a demodulator. In some embodiments, the network node 110 includes a transceiver. The transceiver may include any combination of antenna(s) 234, modem(s) 232, MIMO detector 236, receive processor 238, transmit processor 220, and / or TX MIMO processor 230. The transceiver may be used by a processor (e.g., controller / processor 240) and memory 242 to implement aspects of any of the methods described herein (e.g., with reference to FIGS. 10-15).
[0046]
[0061] Reference to an element in the singular shall mean "one or more," and not "one and only one," unless expressly stated otherwise. For example, reference to an element (e.g., a "processor," a "controller," a "memory," etc.) should be understood to refer to one or more elements (e.g., "one or more processors," "one or more controllers," and / or "one or more memories," among other examples) unless otherwise specified. When one or more elements performing functions (e.g., method steps) are referred to, one element may perform all of the functions, or more than one element may collectively perform the functions. When more than one element collectively performs a function, each function may not be performed by each of the elements (e.g., different functions may be performed by different elements), and / or each function may not be performed in its entirety by only one element (e.g., different elements may perform different sub-functions of the function). Similarly, when referring to one or more elements configured to cause another element (e.g., a device) to perform a function, one element may be configured to cause the other element to perform all of the functions, or more than one element may be collectively configured to cause the other element to perform the functions.
[0047]
[0062] In some examples, a server described herein (e.g., a network server, a UE server, servers 135a, 135b, and / or 135c) may include a bus, a processor, a memory, input components, output components, and / or communication components. A bus may include one or more components that enable wired and / or wireless communication between components of the server. For example, a bus may include electrical connections (e.g., wires, traces, and / or leads) and / or a wireless bus. A processor may include a central processing unit, a graphics processing unit, a microprocessor, a controller, a microcontroller, a digital signal processor, a field programmable gate array, an application specific integrated circuit, and / or another type of processing component. A processor may be implemented in hardware, firmware, or a combination of hardware and software. In some examples, a processor may include one or more processors that can be programmed to perform one or more operations or processes described elsewhere herein.
[0048]
[0063] The memory may include volatile and / or nonvolatile memory. For example, the memory may include random access memory (RAM), read only memory (ROM), a hard disk drive, and / or another type of memory (e.g., flash memory, magnetic memory, and / or optical memory). The memory may include internal memory (e.g., RAM, ROM, or a hard disk drive) and / or removable memory (e.g., removable via a Universal Serial Bus connection). The memory may be a non-transitory computer-readable medium. The memory may store information, one or more instructions, and / or software (e.g., one or more software applications) related to the operation of the server. The input component may enable the server to receive input, such as user input and / or sensed input. For example, the input component may include a touchscreen, a keyboard, a keypad, a mouse, a button, a microphone, a switch, a sensor, a global positioning system sensor, an accelerometer, a gyroscope, and / or an actuator, among other examples. The output components may enable the server to provide output via a display, a speaker, and / or a light emitting diode, etc. The communication components may enable the server to communicate with other devices via wired and / or wireless connections. For example, the communication components may include a receiver, a transmitter, a transceiver, a modem, a network interface card, and / or an antenna.
[0049]
[0064] The server may perform one or more operations or processes described herein, e.g., process 1200 of FIG. 12, process 1300 of FIG. 13, and / or other processes as described herein. For example, a non-transitory computer-readable medium (e.g., memory) may store a set of instructions (e.g., one or more instructions or code) for execution by a processor. The processor may execute the set of instructions to perform one or more operations or processes described herein. In some implementations, execution of the set of instructions by one or more processors causes the one or more processors and / or server to perform one or more operations or processes described herein. In some implementations, hardwired circuitry may be used in place of or in combination with instructions to perform one or more operations or processes described herein. Additionally or alternatively, the processor of the server may be configured to perform one or more operations or processes described herein, e.g., process 1200 of FIG. 12, process 1300 of FIG. 13, and / or other processes as described herein. Thus, implementations described herein are not limited to any specific combination of hardware circuitry and software.
[0050]
[0065] As described in more detail elsewhere herein, the controller / processor 240 of the network node 110, the controller / processor 280 of the UE 120, and / or any other component(s) of FIG. 2 may perform one or more techniques associated with hybrid sequential training for the encoder and decoder models. In some aspects, a server described herein is the network node 110, is included in the network node 110, or includes one or more components of the network node 110 shown in FIG. 2. In some other aspects, a server described herein is the UE 120, is included in the UE 120, or includes one or more components of the UE 120 shown in FIG. 2. In other aspects, a server described herein may be a device separate from the network node 110 and / or the UE 120 and may be configured to communicate with the network node 110 and / or the UE 120.
[0051]
[0066] For example, the controller / processor 240 of the network node 110, the controller / processor 280 of the UE 120, the controller / processor of the server, and / or any other component(s) of Figure 2 may perform or direct the operations of, for example, process 1200 of Figure 12, process 1300 of Figure 13, and / or other processes as described herein. The memory 242 and the memory 282 may store data and program codes for the network node 110 and the UE 120, respectively. In some embodiments, the memory 242 and / or the memory 282 may include a non-transitory computer-readable medium storing one or more instructions (e.g., code and / or program code) for wireless communication. For example, the one or more instructions, when executed by one or more processors of the server, the network node 110, and / or the UE 120 (e.g., directly or after being compiled, translated, and / or interpreted), may cause the one or more processors, the server, the UE 120, and / or the network node 110 to perform or direct the operations of, for example, process 1200 of Figure 12, process 1300 of Figure 13, and / or other processes as described herein. In some embodiments, executing the instructions may include running the instructions, translating the instructions, compiling the instructions, and / or interpreting the instructions, among other examples.
[0052]
[0067] In some aspects, a server (e.g., server 135a, 135b, and / or 135c) includes means for receiving, from another device, a function associated with a trained first model, the function configured to output one or more gradients associated with the trained first model, and / or means for training a second model based on selecting one or more weights associated with the second model using the one or more gradients, the one or more gradients obtained based on inputting one or more activation values and one or more inputs to the function. In some aspects, the means for the server to perform the operations described herein may include, for example, one or more of communications manager 140, an antenna, a modem, a MIMO detector, a receive processor, a transmit processor, a TX MIMO processor, a controller / processor, an input component, an output component, a communication component, and / or a memory, among other examples.
[0053]
[0068] In some aspects, a server (e.g., a network server, a UE server, servers 135a, 135b, and / or 135c) includes means for training a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activation values associated with an output of the trained first model, and / or means for transmitting, to a second device, a function associated with the trained first model, the function being configured to output one or more activation values based on ground truth inputs. In some aspects, the means for the server to perform the operations described herein may include, for example, one or more of communications manager 150, an antenna, a modem, a MIMO detector, a receive processor, a transmit processor, a TX MIMO processor, a controller / processor, an input component, an output component, a communication component, and / or a memory, among other examples.
[0054]
[0069] Although the blocks in FIG. 2 are shown as separate components, the functionality described above with respect to these blocks may be implemented in a single hardware, software, or combined component, or in various combinations of components. For example, functionality described with respect to transmit processor 264, receive processor 258, and / or TX MIMO processor 266 may be implemented by or under the control of controller / processor 280.
[0055]
[0070] As noted above, Figure 2 is provided as an example. Other examples may differ from those described with respect to Figure 2.
[0056]
[0071] The deployment of a communication system, such as a 5G NR system, can be configured in multiple ways with various components or parts. In a 5G NR system or network, a network node, network entity, network mobility element, RAN node, core network node, network element, base station, or network equipment can be implemented in an aggregated or disaggregated architecture. For example, a base station (e.g., a Node B (NB), evolved NB (eNB), NR base station, 5G NB, access point (AP), TRP, or cell, among other examples), or one or more units (or one or more components) performing base station functionality, may be implemented as an aggregated base station (also known as a standalone base station or monolithic base station) or a disaggregated base station. A "network entity" or a "network node" may refer to a disaggregated base station or may refer to one or more units (e.g., one or more CUs, one or more DUs, one or more RUs, or a combination thereof) of a disaggregated base station.
[0057]
[0072] An aggregated base station (e.g., an aggregated network node) may be configured to utilize a radio protocol stack that is physically or logically integrated within a single RAN node (e.g., within a single device or unit). A disaggregated base station (e.g., a disaggregated network node) may be configured to utilize a protocol stack that is physically or logically distributed among two or more units (e.g., one or more CUs, one or more DUs, or one or more RUs). In some embodiments, a CU may be implemented within a network node, and one or more DUs may be co-located with the CU or, alternatively, may be geographically or virtually distributed throughout one or more other network nodes. A DU may be implemented to communicate with one or more RUs. Each of the CU, DU, and RU may also be implemented as a virtual unit, such as a virtual central unit (VCU), a virtual distributed unit (VDU), or a virtual radio unit (VRU), among other examples.
[0058]
[0073] The operation or network design of a base station type may take into account the aggregation characteristics of base station functions. For example, disaggregated base stations may be utilized in an IAB network, an open radio access network (O-RAN, such as the network configuration supported by the O-RAN Alliance), or a virtualized radio access network (vRAN, also known as a cloud radio access network (C-RAN)) to facilitate scaling of a communication system by separating base station functions into one or more units that can be deployed independently. A disaggregated base station may include functions implemented across two or more units in various physical locations as well as functions implemented virtually in at least one unit, which may allow flexibility in network design. Various units of a disaggregated base station may be configured for wired or wireless communication with at least one other unit of the disaggregated base station.
[0059]
[0074] FIG. 3 illustrates an example disaggregated base station architecture 300 according to the present disclosure. The disaggregated base station architecture 300 may include a CU 310 that can communicate directly with a core network 320 via a backhaul link or indirectly with the core network 320 via one or more disaggregated control units (e.g., a near-real-time (near-RT) RAN intelligent controller (RIC) 325 via an E2 link, or a non-real-time (non-RT) RIC 315 associated with a Service Management and Orchestration (SMO) framework 305, or both). The CU 310 can communicate with one or more DUs 330 via respective midhaul links, e.g., through an F1 interface. Each of the DUs 330 can communicate with one or more RUs 340 via respective fronthaul links. Each of the RUs 340 can communicate with one or more UEs 120 via respective radio frequency (RF) access links. In some implementations, a UE 120 may be served by multiple RUs 340 simultaneously.
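The chain of links in architecture 300 can be sketched as a simple lookup structure. The Python example below is illustrative only: it models a single unit of each type with one upstream link each, whereas the description allows multiple DUs, RUs, and serving RUs per UE, and it omits the RIC and SMO connections. The names `LINKS` and `path_to_core` are hypothetical.

```python
# Upstream links named in the description, keyed by (lower unit, upper unit).
# One unit of each type with a single upstream link is a simplification.
LINKS = {
    ("UE 120", "RU 340"): "RF access link",
    ("RU 340", "DU 330"): "fronthaul link",
    ("DU 330", "CU 310"): "midhaul link (e.g., F1 interface)",
    ("CU 310", "core network 320"): "backhaul link",
}


def path_to_core(start: str) -> list[tuple[str, str, str]]:
    """Follow upstream links from start and return (from, link, to) hops."""
    hops = []
    current = start
    while True:
        next_hop = next(
            (
                (upper, link)
                for (lower, upper), link in LINKS.items()
                if lower == current
            ),
            None,
        )
        if next_hop is None:
            return hops
        upper, link = next_hop
        hops.append((current, link, upper))
        current = upper
```

Calling `path_to_core("UE 120")` yields four hops that traverse the RF access, fronthaul, midhaul, and backhaul links in that order, ending at the core network 320.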
[0060]
[0075] Each of the units, including the CU 310, the DU 330, the RU 340, the near-RT RIC 325, the non-RT RIC 315, and the SMO framework 305, may include or be coupled to one or more interfaces configured to receive or transmit signals, data, or information (collectively, signals) over a wired or wireless transmission medium. Each of the units, or an associated processor or controller that provides instructions to the corresponding unit's one or more communication interfaces, may be configured to communicate with one or more of the other units over a transmission medium. In some embodiments, each of the units may include a wired interface configured to receive or transmit signals over a wired transmission medium to one or more of the other units, and a wireless interface, which may include a receiver, transmitter, or transceiver (e.g., an RF transceiver) configured to receive, transmit, or transmit and receive signals over a wireless transmission medium to one or more of the other units.
[0061]
[0076] In some aspects, the CU 310 can host one or more higher-layer control functions. Such control functions may include a radio resource control (RRC) function, a packet data convergence protocol (PDCP) function, or a service data adaptation protocol (SDAP) function, among other examples. Each control function may implement an interface configured to communicate signals with other control functions hosted by the CU 310. The CU 310 may be configured to handle user plane functions (e.g., a Central Unit-User Plane (CU-UP) function), control plane functions (e.g., a Central Unit-Control Plane (CU-CP) function), or a combination thereof. In some implementations, the CU 310 may be logically divided into one or more CU-UP units and one or more CU-CP units. When implemented in an O-RAN configuration, the CU-UP unit may bidirectionally communicate with the CU-CP unit via an interface, such as an E1 interface. The CU 310 may be implemented to communicate with the DU 330 as needed for network control and signaling.
[0062]
[0077] Each DU 330 may correspond to a logical unit including one or more base station functions for controlling the operation of one or more RUs 340. In some aspects, the DU 330 may host one or more of a radio link control (RLC) layer, a medium access control (MAC) layer, and one or more upper physical (PHY) layers, at least in part according to a functional division such as that defined by 3GPP. In some aspects, the one or more upper PHY layers may be implemented by one or more modules for forward error correction (FEC) encoding and decoding, scrambling, and modulation and demodulation, among other examples. In some aspects, the DU 330 may further host one or more lower PHY layers, such as those implemented by one or more modules for fast Fourier transform (FFT), inverse FFT (iFFT), digital beamforming, or physical random access channel (PRACH) extraction and filtering, among other examples. Each layer (sometimes referred to as a module) may be implemented with an interface configured to communicate signals with other layers (and modules) hosted by the DU 330 or with control functions hosted by the CU 310.
[0063]
[0078] Each RU 340 may implement lower layer functions. In some deployments, the RU 340 controlled by the DU 330 may correspond to a logical node hosting RF processing functions or lower PHY layer functions, such as performing FFT, performing iFFT, digital beamforming, or PRACH extraction and filtering, among other examples, based on a functional partition such as a lower layer functional partition (e.g., a functional partition defined by 3GPP). In such an architecture, each RU 340 may be operated to handle over-the-air (OTA) communications with one or more UEs 120. In some implementations, real-time and non-real-time aspects of control plane and user plane communications with the RU(s) 340 may be controlled by the corresponding DU 330. In some scenarios, this configuration may enable each DU 330 and CU 310 to be implemented in a cloud-based RAN architecture, such as a vRAN architecture.
[0064]
[0079] The SMO framework 305 may be configured to support RAN deployment and provisioning of non-virtualized and virtualized network elements. For non-virtualized network elements, the SMO framework 305 may be configured to support the deployment of dedicated physical resources for RAN coverage requirements, which may be managed via an operation and maintenance interface (e.g., an O1 interface). For virtualized network elements, the SMO framework 305 may be configured to interact with a cloud computing platform (e.g., an open cloud (O-cloud) platform 390) to perform lifecycle management of the network element (e.g., to instantiate virtualized network elements) via a cloud computing platform interface (e.g., an O2 interface). Such virtualized network elements may include, but are not limited to, the CU 310, the DU 330, the RU 340, the non-RT RIC 315, and the near-RT RIC 325. In some implementations, the SMO framework 305 may communicate with hardware aspects of a 4G RAN, such as an open eNB (O-eNB) 311, via the O1 interface. Additionally, in some implementations, the SMO framework 305 can communicate directly with each of the one or more RUs 340 via a corresponding O1 interface. The SMO framework 305 can also include a non-RT RIC 315 configured to support the functionality of the SMO framework 305.
[0065]
[0080] The non-RT RIC 315 may be configured to include logic functions that enable non-real-time control and optimization of RAN elements and resources, artificial intelligence / machine learning (AI / ML) workflows, including model training and updates, or policy-based guidance of applications / features in the near-RT RIC 325. The non-RT RIC 315 may be coupled to or communicate with the near-RT RIC 325 (e.g., via an A1 interface). The near-RT RIC 325 may be configured to include logic functions that enable near-real-time control and optimization of RAN elements and resources through data collection and action via interfaces (e.g., via an E2 interface) that connect one or more CUs 310, one or more DUs 330, or both, and the O-eNB 311 to the near-RT RIC 325.
[0066]
[0081] In some implementations, the non-RT RIC 315 can receive parameters or external enrichment information from an external server to generate AI / ML models to be deployed in the near-RT RIC 325. Such information can be utilized by the near-RT RIC 325 and can be received at the SMO framework 305 or non-RT RIC 315 from non-network data sources or from network functions. In some embodiments, the non-RT RIC 315 or near-RT RIC 325 can be configured to adjust RAN behavior or performance. For example, the non-RT RIC 315 may monitor long-term trends and patterns in performance and use the AI / ML models to implement corrective actions through the SMO framework 305 (e.g., reconfiguration via the O1 interface) or through the creation of RAN management policies (e.g., A1 interface policies).
[0067]
[0082] As noted above, Figure 3 is provided as an example. Other examples may differ from those described with respect to Figure 3.
[0068]
[0083] FIG. 4 illustrates an example architecture 400 of a functional framework for Radio Access Network (RAN) intelligence enabled by data collection, in accordance with the present disclosure. In some scenarios, the functional framework for RAN intelligence may be enabled by further enhancements of data collection through use cases and / or examples. For example, principles or algorithms for RAN intelligence enabled by AI / ML and related functional frameworks (e.g., AI functionality and / or component inputs / outputs for AI-enabled optimization) are utilized or studied to identify benefits of an AI-enabled RAN through possible use cases (e.g., compression, beam management, energy savings, load balancing, mobility management, and / or coverage optimization, among other examples). In one embodiment, as illustrated by architecture 400, the functional framework for RAN intelligence may include multiple logical entities, such as a model training host 402, a model inference host 404, a data source 406, and an actor 408.
[0069]
[0084] The model inference host 404 may be configured to execute an AI / ML model based on inference data provided by the data source 406. The model inference host 404 may generate an output (e.g., a prediction) using the inference data and may provide the output to the actor 408. The actor 408 may be an element or entity of the core network or RAN. For example, the actor 408 may be a UE, a network node, a base station (e.g., a gNB), a CU, a DU, and / or an RU, among other examples. In addition, the actor 408 may also depend on the type of task performed by the model inference host 404, the type of inference data provided to the model inference host 404, and / or the type of output generated by the model inference host 404, among other examples. For example, if the output from the model inference host 404 is associated with beam management, then the actor 408 may be a UE, a DU, or an RU. In another example, if the output from the model inference host 404 is associated with Tx / Rx scheduling, then the actor 408 may be a CU or a DU.
[0070]
[0085] After the actor 408 receives the output from the model inference host 404, the actor 408 may decide whether to take action based on the output. For example, if the actor 408 is a DU or RU and the output from the model inference host 404 is associated with beam management, the actor 408 may decide whether to change and / or modify the Tx / Rx beams based on the output. If the actor 408 decides to take action based on the output, the actor 408 may indicate the action to at least one action target 410. For example, if the actor 408 decides to change / modify the Tx / Rx beams for communication between the actor 408 and the action target 410 (e.g., UE 120), then the actor 408 may send a beam (re)configuration or beam switching instruction to the action target 410. The actor 408 may modify the Tx / Rx beams based on the beam (re)configuration, such as switching to a new Tx / Rx beam or applying different parameters for the Tx / Rx beam, among other examples. As another example, the actor 408 may be a UE, and the output from the model inference host 404 may be associated with beam management. For example, the output may be one or more predicted measurements for one or more beams. The actor 408 (e.g., a UE) may determine that a measurement report (e.g., a Layer 1 (L1) RSRP report) should be sent to the network node 110.
[0071]
[0086] The data source 406 may also be configured to collect data to be used as training data for training an ML model or as inference data for feeding into an ML model inference operation. For example, the data source 406 may collect data from one or more core network and / or RAN entities, which may include the action target 410, and provide the collected data to the model training host 402 for ML model training. For example, after the action target 410 (e.g., the UE 120) receives a beam configuration from the actor 408, the action target 410 may provide performance feedback associated with the beam configuration to the data source 406, which may be used by the model training host 402 to monitor or evaluate the performance of the ML model, such as whether the output (e.g., prediction) provided to the actor 408 is accurate. In some examples, if the output provided to the actor 408 is inaccurate (or if the accuracy is below an accuracy threshold), then the model training host 402 may decide to modify or retrain the ML model used by the model inference host 404, for example, via an ML model deployment / update.
[0072]
[0087] In cross-node machine learning, a neural network may be divided into two parts, with the first part including an encoder in the UE and the second part including a decoder in the network node. The UE's encoder output may be transmitted to the network node as input to the decoder. For example, the input to the encoder may be channel state information (CSI), such as one or more channel estimates, one or more precoders (e.g., one or more precoding vectors), and / or one or more measurements, among other examples. The encoder may use a trained AI / ML model to compress the CSI. The output of the encoder model (e.g., the trained AI / ML model) may be transmitted to the network node. The network node may input the received information into a decoder in the network node. The decoder may use the trained AI / ML model to attempt to reconstruct the CSI (e.g., the input to the encoder at the UE). To evaluate use cases of machine learning-based CSI compression, one or more different types of quantization or dequantization methods, such as vector quantization and / or scalar quantization, among other examples, may be used. For the use case of CSI compression using a two-sided model, multiple types of machine learning model training may be utilized.
[0073]
[0088] UEs and network nodes designed, marketed, and maintained by different vendors may implement different encoders and decoders to encode and decode information, such as channel condition feedback information. A UE server (e.g., server 135a or server 135b) may train an encoder offline for implementation by one or more UEs, for example, by applying one or more ML algorithms to train the encoder. The UE server may be operated and maintained by a particular UE vendor, for example, and may determine encoder parameters for transmission to one or more UEs associated with the particular UE vendor. In some cases, the one or more UEs on which the encoder is being trained may send input information, such as channel condition feedback information, to the UE server.
[0074]
[0089] The UE server may send such input information to a network server, such as server 135c. The network server may train the decoder offline for implementation by one or more network nodes, for example, by applying one or more ML algorithms to train the decoder. The network server may be operated and maintained by a particular network node vendor, for example, and may determine the decoder parameters to be provided to one or more network nodes associated with the particular network node vendor. In some cases, one or more network nodes on which the decoder is trained may send input information, such as channel condition feedback information, to the network server. The network server may further oversee the training of the encoder by one or more UE servers. For example, the network server may receive input information from one or more UE servers or from another source and may use the input information to train both the encoder and the decoder. The network server may then encode the input information using the trained encoder to generate training information. The training information may include both the input information and the output of the encoder, such as the coded input information. The network server may send the training information to one or more UE servers. One or more UE servers may use the training information to perform offline training of each respective UE server's encoder. Such training may generate one or more encoder parameters for use by the one or more UEs in encoding information, and the encoder parameters may be transmitted by the one or more UE-side servers to the one or more UEs.
[0075]
[0090] As noted above, Figure 4 is provided as an example. Other examples may differ from those described with respect to Figure 4.
[0076]
[0091] FIG. 5 illustrates an example architecture 500 and associated AI / ML-based channel state feedback compression according to the present disclosure. As described elsewhere herein, in cross-node machine learning, a neural network may be split into two parts, with a first part including an encoder 502 at the UE and a second part including a decoder 504 at the network node. The encoder may include an encoder model, which is an AI / ML model trained to compress the CSI. The encoder output at the UE is transmitted to the network node to provide as input to the decoder. The decoder may include a decoder model, which is an AI / ML model trained to reconstruct or restore the CSI.
[0077]
[0092] As shown in FIG. 5, the encoder 502 may output a compressed channel state feedback (CSF) or another data signal, which is received as input at the decoder 504. The decoder 504 may output a reconstructed CSF (e.g., a recovered CSF) or another data signal, such as a precoding vector, among other examples. In multi-vendor training, each vendor (e.g., a UE vendor or a network node vendor) may be associated with a corresponding server that participates in offline training. The UE server(s) (e.g., server 135a and / or server 135b) may communicate with the network server(s) (e.g., server 135c) during training using a server-to-server connection.
[0078]
[0093] For the use case of CSI compression using a two-sided model, multiple types of machine learning model training may be utilized. In some examples, joint training of two-sided models at a single side / entity (e.g., the UE side or the network side) may be utilized. In some examples, joint training of two-sided models at the network side and the UE side, respectively, may be utilized. In yet some other examples, separate training at the network side and the UE side may be utilized, in which the UE side CSI generation portion and the network side CSI reconstruction portion are trained by the UE side and the network side, respectively (e.g., separate training may also be referred to as sequential training). "Joint training" may mean that the generation model and the reconstruction model are trained in the same loop for forward and backward propagation. Joint training may be performed at a single node or across multiple nodes (e.g., through gradient exchange between nodes or servers). Separate training may include sequential training starting from UE side training, sequential training starting from network side training, or parallel training by the UE server and the network server.
[0079]
[0094] As noted above, Figure 5 is provided as an example. Other examples may differ from those described with respect to Figure 5.
[0080]
[0095] FIG. 6 is a diagram illustrating an example embodiment 600 associated with multi-vendor AI / ML training in accordance with the present disclosure.
[0081]
[0096] For example, as shown in FIG. 6, a first network node (NN1) (e.g., network node 110) may be associated with a first cell, and a second network node (NN2) (e.g., network node 110) may be associated with a second cell. Multiple UEs 120 (e.g., UE1, UE2, UE3, UE4) may be within the coverage area of NN1 and / or NN2. In cases without multi-vendor training, each UE-network node pair may need to utilize a different encoder-decoder pair. Multi-vendor training eliminates the need to utilize a different encoder-decoder pair for each UE-network node pairing. For example, in cases of multiple UE vendors with one network node vendor, a common network node decoder may be trained to work with multiple UE encoders. As a result, a network node (e.g., NN1) may not need to maintain a separate decoder model for each UE located within the coverage area of the network node's cell. In a single UE vendor example with multiple network node vendors, a common UE encoder may be trained to work with multiple network node decoders. In such an example, the UE may not need to maintain a separate encoder model for each network node (e.g., as the UE moves to a new cell). In a multiple UE vendor example with multiple network node vendors, the UE encoder may be trained to work with multiple network node decoders, while the network node decoder may be trained to work with multiple UE encoders. For example, as shown in FIG. 6, the encoders for UE1 and UE2 may be trained to work with the decoder of NN1, while the encoder for UE4 may be trained to work with the decoder of NN2. However, UE3 may be at the cell edge between NN1 and NN2, so the encoder for UE3 may be trained to work with the decoders of both NN1 and NN2. In other words, when UE3 moves from the coverage area of NN1 to the coverage area of NN2, UE3 may deploy the same encoder model to communicate with NN1 and NN2 (e.g., where NN1 and NN2 may be associated with different vendors and / or different decoder models). This may reduce the training overhead and / or complexity associated with the AI / ML-based CSI compression described herein, as the UE may not need to maintain multiple encoder models for different network node vendors and / or different network node decoder models. Additionally or alternatively, the network node may not need to maintain multiple decoder models for different UE vendors and / or different UE encoder models.
[0082]
[0097] As noted above, Figure 6 is provided as an example. Other examples may differ from those described with respect to Figure 6.
[0083]
[0098] FIGS. 7A and 7B illustrate examples 700 and 710 associated with joint training for encoder and decoder models in accordance with the present disclosure. As used herein, joint training performed on a single device may be referred to as Type 1 training. For example, Type 1 training may be associated with joint training of two-sided models (e.g., encoder and decoder models) on a single side / entity.
[0084]
[0099] As shown in FIG. 7A, an input or ground truth (e.g., shown as V_in in FIG. 7A) may be provided to an encoder model at the UE. For example, the input may include CSI as described in more detail elsewhere herein. V_in may be compressed by the encoder model. The encoder model may output an activation value or activation function (e.g., shown as Z in FIG. 7A). An "activation function" or "activation value" may refer to the output of a neural network (e.g., of an encoder model). For example, the activation function of a node in a neural network defines the output of the node given an input or set of inputs. The UE may transmit the activation function Z, and a network node may receive it. The network node may provide the activation function Z as an input to a decoder model. The decoder model may generate an output based on the activation function Z (e.g., shown as V_out in FIG. 7A). The output V_out may be a reconstruction of V_in and / or a recovery from the activation function Z.
[0085]
[0100] As shown in FIG. 7B, example 710 illustrates Type 1 training and model transfer. For example, a device (e.g., a UE server or a network server) may train the encoder model and the decoder model. The device may provide V_in and V_out to a loss function, where V_in is the original input of the encoder and V_out is the reconstructed version of that input output by the decoder. The loss function may determine a difference between the original input and the reconstructed output. A gradient may be calculated based on the loss function, and the weights of the encoder or decoder may be updated to train the encoder or decoder. As shown in FIG. 7B, if the joint training is performed at the UE server, then the UE server may send, and the network server may receive, an indication of the trained decoder model (e.g., provided by the network server to one or more network nodes). As another example, if the joint training is performed at the network server, then the network server may send, and the UE server may receive, an indication of the trained encoder model (e.g., provided by the UE server to one or more UEs).
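The Type 1 loop described above (forward pass, loss on V_in and V_out, gradient, weight update for both models on one device) can be illustrated with a minimal sketch. It assumes linear encoder and decoder models, a mean squared error loss, synthetic data in place of CSI, and hypothetical dimensions and learning rate; none of these choices come from the disclosure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8-dimensional input, 4-dimensional activation Z.
N_SAMPLES, D_IN, D_LATENT = 256, 8, 4

# Synthetic stand-in for CSI samples (V_in); not real channel data.
v_in = rng.standard_normal((N_SAMPLES, D_IN))

# Linear encoder and decoder weights, both held by the same device.
w_enc = 0.1 * rng.standard_normal((D_IN, D_LATENT))
w_dec = 0.1 * rng.standard_normal((D_LATENT, D_IN))


def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((a - b) ** 2))


def joint_training_step(v, w_enc, w_dec, lr=0.1):
    """One forward/backward pass that updates both models together."""
    z = v @ w_enc                            # encoder output (activation Z)
    v_out = z @ w_dec                        # decoder reconstruction (V_out)
    grad_out = 2.0 * (v_out - v) / v.size    # dLoss/dV_out for the MSE loss
    grad_w_dec = z.T @ grad_out              # gradient for decoder weights
    grad_z = grad_out @ w_dec.T              # gradient flowing back into Z
    grad_w_enc = v.T @ grad_z                # gradient for encoder weights
    return w_enc - lr * grad_w_enc, w_dec - lr * grad_w_dec


initial_loss = mse(v_in @ w_enc @ w_dec, v_in)
for _ in range(500):
    w_enc, w_dec = joint_training_step(v_in, w_enc, w_dec)
final_loss = mse(v_in @ w_enc @ w_dec, v_in)
```

Because both weight matrices live in the same process, whichever server runs this loop ends up holding both trained models, which is the exposure concern discussed for Type 1 training.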
[0086]
[0101] In joint training (e.g., Type 1 training), both the encoder and decoder may be jointly trained, thereby allowing both encoder and decoder model weights to be jointly optimized. In offline joint training, models may be trained offline and provided to either the network node or the UE. However, one-sided joint training may allow a trained model to be exposed to the network node or the UE. Joint training may occur at the UE server or the network server. For example, a UE vendor may train both the encoder and decoder models using its own data set and share the trained decoder model with a network server (e.g., associated with a vendor different from the UE vendor). A decoder model shared with another vendor may reveal or provide relevant information about the implementation details of the UE's components (e.g., the UE's modem, etc.). Similarly, in examples where a network server trains both the encoder and decoder models, the shared encoder model may reveal or provide relevant information about the implementation details of the network node's components. This information may be revealed, in part, due to the symmetry that typically exists between the encoder and decoder. Thus, the trained encoders and decoders may contain confidential information that may be trade secrets or that a vendor may not want to reveal to another vendor.
[0087]
[0102] In some other examples, the encoder model and the decoder model may be trained simultaneously on different devices (e.g., the encoder model and the decoder model are trained in the same loop for forward propagation and backpropagation). For example, the UE server may train the encoder model, and the network server may train the decoder model. Simultaneous training on different devices may be referred to as Type 2 training. For example, Type 2 training may include joint training of both models (e.g., decoder and encoder) on the network side and the UE side, respectively. For example, for each forward propagation loop and / or each backpropagation loop, the UE server may generate a forward propagation result by providing an input (e.g., V_in) to the encoder model to generate Z. For example, one or more UEs may provide data (e.g., CSI) to the UE server to be used to train the encoder and / or decoder. The UE server may send the forward propagation results (e.g., Z and V_in) to the network server. The network server may then provide Z to the decoder model to generate V_out. The network server may generate backpropagation results (e.g., gradients) based on a loss function that compares V_out and V_in. The network server may send the backpropagation results (e.g., gradients), and the UE server may receive them. The UE server may train an encoder model based on the backpropagation results (e.g., gradients). For example, the UE server may update one or more weights of the neural network of the encoder model based on the backpropagation results (e.g., gradients). After training the model, the UE server may send the trained encoder model to one or more UEs. Similarly, the network server may send the trained decoder model to one or more network nodes. The UE(s) and network node(s) may perform inference using the trained model, as described in more detail elsewhere herein.
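The exchange of forward propagation results and gradients in Type 2 training can be sketched as two objects that never share weights. The class names `UEServer` and `NetworkServer`, the linear models, the mean squared error loss, and the synthetic data are all illustrative assumptions; in practice the two sides would exchange these values over a server-to-server connection rather than through in-process calls.

```python
import numpy as np

rng = np.random.default_rng(1)
D_IN, D_LATENT = 8, 4
v_in = rng.standard_normal((128, D_IN))  # synthetic stand-in for CSI


class UEServer:
    """Holds the encoder; never sees the decoder weights."""

    def __init__(self):
        self.w_enc = 0.1 * rng.standard_normal((D_IN, D_LATENT))

    def forward(self, v):
        self.v = v
        return v @ self.w_enc                   # Z, sent to the network server

    def backward(self, grad_z, lr=0.1):
        self.w_enc -= lr * self.v.T @ grad_z    # update from received gradient


class NetworkServer:
    """Holds the decoder; never sees the encoder weights."""

    def __init__(self):
        self.w_dec = 0.1 * rng.standard_normal((D_LATENT, D_IN))

    def train_step(self, z, v, lr=0.1):
        v_out = z @ self.w_dec
        grad_out = 2.0 * (v_out - v) / v.size
        grad_z = grad_out @ self.w_dec.T        # gradient returned to UE server
        self.w_dec -= lr * z.T @ grad_out       # local decoder update
        return grad_z, float(np.mean((v_out - v) ** 2))


ue, net = UEServer(), NetworkServer()
losses = []
for _ in range(500):
    z = ue.forward(v_in)                        # forward result: UE -> network
    grad_z, loss = net.train_step(z, v_in)      # backprop result: network -> UE
    ue.backward(grad_z)
    losses.append(loss)
```

Each iteration requires both sides to be available at the same time, which reflects the timing constraint attributed to Type 2 training.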
[0088]
[0103] Type-2 training ensures that secrets and / or confidential information is not shared between the UE server and the network server during training (e.g., using distributed training on different devices rather than on a single device as in Type-1 training). Additionally, Type-2 training may be associated with improved training of models because the models are trained simultaneously and in the same loop for forward and backward propagation. However, Type-2 training is performed simultaneously on different devices. For example, to perform Type-2 training, a training session may be established between the UE server and the network server. Thus, Type-2 training may be associated with constraints on the timing of training (e.g., because performing Type-2 training requires a training session between the UE server and the network server).
[0089]
[0104] As mentioned above, Figures 7A and 7B are provided as examples, and other examples may differ from those described with respect to Figures 7A and 7B.
[0090]
[0105] FIG. 8 illustrates an example embodiment 800 associated with sequential training for encoder and decoder models according to the present disclosure. As used herein, sequential training or separate training may be referred to as Type 3 training. For example, Type 3 training may be associated with separate training of two-sided models (e.g., encoder model and decoder model) at different entities. For example, FIG. 8 illustrates network-driven sequential training. However, Type 3 training may also include UE-driven (e.g., UE-server-driven) sequential training in a manner similar to that described herein.
[0091]
[0106] As shown in Figure 8, multiple UE encoders may be trained based on the trained network node decoder. For example, the network server may train the decoder model in a manner similar to that described with respect to Figures 7A and 7B (e.g., using an encoder model at the network server). The network server may transmit a dataset, which the UE server may receive. The dataset may include one or more inputs (e.g., one or more V_in values) used to train the decoder model and / or one or more outputs of the encoder (e.g., one or more Z values). This may allow different UE servers to use the dataset to train encoder models. For example, as shown in FIG. 8, the UE server may provide V_in from the dataset as input to the encoder model. The UE server may then provide the output obtained from the encoder model (e.g., Z_UE) and input from the dataset (e.g., V_in) to the loss function. The loss function may output gradients that are used by the UE server to update one or more weights of the encoder model, as described in more detail elsewhere herein. For example, training the UE encoder can be achieved by minimizing the loss between Z (e.g., the output of the network node encoder) and Z_UE, which is the output of the UE encoder (i.e., the input signal and the output signal). Thus, Type 3 training allows for offline, separate training on different devices. In addition, Type 3 training can be performed at different times on different devices, providing more flexibility in training the encoder and decoder models (e.g., compared to Type 2 training described elsewhere herein).
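The network-driven sequential training described above can be sketched as a UE server fitting its own encoder to a shared dataset of (V_in, Z) pairs by minimizing the loss between Z and Z_UE. The reference encoder `w_ref`, the linear UE encoder, the mean squared error loss, and the synthetic data are assumptions made for illustration; an actual UE encoder would typically have its own architecture.

```python
import numpy as np

rng = np.random.default_rng(2)
D_IN, D_LATENT = 8, 4

# Dataset shared by the network server: inputs V_in and the outputs Z of a
# hypothetical network-side reference encoder (random and fixed here).
w_ref = rng.standard_normal((D_IN, D_LATENT))
v_in = rng.standard_normal((200, D_IN))
z_ref = v_in @ w_ref

# The UE server trains its own encoder offline, with no live session.
w_ue = np.zeros((D_IN, D_LATENT))
for _ in range(300):
    z_ue = v_in @ w_ue                          # UE encoder output Z_UE
    grad_z = 2.0 * (z_ue - z_ref) / z_ue.size   # dLoss/dZ_UE for the MSE loss
    w_ue -= 0.5 * v_in.T @ grad_z               # encoder weight update

final_loss = float(np.mean((v_in @ w_ue - z_ref) ** 2))
```

Only the dataset crosses the vendor boundary in this sketch, so the UE server can run the loop at any time after receiving it.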
[0092]
[0107] As noted above, Figure 8 is provided as an example. Other examples may differ from those described with respect to Figure 8.
[0093]
[0108] FIGS. 9A and 9B are diagrams illustrating embodiments 900 and 910 associated with vector quantization according to the present disclosure.
[0094]
[0109] In vector quantization, an input vector may be quantized and mapped to one or more vectors in a quantization codebook. In some examples, the quantization codebook may contain vectors of size 2 or 4, with each entry represented by 2 bits or another amount of bits. However, in other examples, the quantization codebook may contain vectors of different sizes.
[0095]
[0110] As shown in Figure 9A, the input V_in can be input into the encoder model, which generates the encoder output Z_E. The encoder output Z_E is quantized to produce the quantized output Z_q. The quantized output Z_q can be processed by a decoder model to reconstruct V_in as V_out. As shown in FIG. 9B, to perform quantization, a quantizer receives the encoder output Z_E and splits Z_E into sub-vectors of size d (e.g., 2 or 4). Each sub-vector (e.g., Z_{E,0}, Z_{E,1}) is quantized into a quantized sub-vector (e.g., Z_{q,0}, Z_{q,1}), where the quantized sub-vector is mapped to one of the vectors in the codebook. To perform the codebook-based mapping, the quantizer maps the value of the sub-vector to one of the values in the codebook (e.g., one of K values in the codebook). For example, the quantizer may map the input to the closest quantized value in the codebook. The quantized sub-vectors are then combined to form the quantized output Z_q.
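The quantizer steps described above (split Z_E into size-d sub-vectors, map each to the closest codebook vector, recombine into Z_q) can be sketched as follows. The function name `vector_quantize`, the use of squared Euclidean distance as the closeness measure, and the example codebook values are assumptions for illustration.

```python
import numpy as np


def vector_quantize(z_e, codebook):
    """Map each size-d sub-vector of z_e to its nearest codebook vector.

    z_e: 1-D encoder output whose length is a multiple of d.
    codebook: array of shape (K, d) holding K candidate vectors.
    Returns (indices, z_q): the chosen codebook index per sub-vector and
    the quantized output Z_q with the same shape as z_e.
    """
    d = codebook.shape[1]
    sub_vectors = z_e.reshape(-1, d)                  # Z_{E,0}, Z_{E,1}, ...
    # Squared Euclidean distance from every sub-vector to every codeword.
    diffs = sub_vectors[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    indices = dists.argmin(axis=1)                    # closest codeword index
    z_q = codebook[indices].reshape(z_e.shape)        # recombine into Z_q
    return indices, z_q


# Hypothetical codebook with K = 4 vectors of size d = 2.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z_e = np.array([0.1, 0.9, 0.8, 0.2])
indices, z_q = vector_quantize(z_e, codebook)
# indices -> [2, 1]; z_q -> [0.0, 1.0, 1.0, 0.0]
```

With K = 4 codewords, each index fits in 2 bits, so only the indices would need to be conveyed for the receiver to rebuild Z_q from the same codebook.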
[0096]
[0111] As mentioned above, Figures 9A and 9B are provided as examples, and other examples may differ from those described with respect to Figures 9A and 9B.
[0097]
[0112] As described above, different training techniques may be used to train the encoder and decoder models for CSI compression. For example, Type 2 training may be used to ensure that secrets and / or confidential information is not shared between the UE server and the network server during training (e.g., using distributed training on different devices rather than on a single device as in Type 1 training). However, Type 2 training is performed simultaneously on different devices. For example, a training session may be established between the UE server and the network server to perform Type 2 training. Thus, Type 2 training may be associated with constraints on the timing of training (e.g., because performing Type 2 training requires a training session between the UE server and the network server). Type 3 training may be used to provide more flexibility in when training is performed (e.g., by performing separate training on different devices). However, in some cases, Type 2 training may be associated with improved results and / or accuracy of the trained model compared to Type 3 training (e.g., because models in Type 2 training are trained simultaneously and in the same loop for forward and backpropagation rather than using the datasets described above). Thus, the device(s) performing the training may choose to either improve the results and / or accuracy of the training (e.g., by performing Type 2 training) or increase flexibility regarding when the training is performed (e.g., by performing Type 3 training).
[0098]
[0113] Some techniques and apparatus described herein enable hybrid sequential training for encoder and decoder models. For example, a first device may transmit, and a second device may receive, an indication of a function associated with a trained model (e.g., a trained encoder model or a trained decoder model) associated with the first device. For example, the first device may train the first model offline in a manner similar to Type 3 training. The first device may transmit to a second device a function that simulates forward and backward propagation paths to facilitate simultaneous training of a second model on the second device. For example, the function may be an application programming interface (API), a software program, a set of instructions, code, and / or another function.
[0099]
[0114] For example, the first device may be a network server, and the first model may be a decoder model. The second device may be a UE server, and the second model may be an encoder model. The network server may provide the UE server with a function that takes an activation function (e.g., Z) and a ground truth (e.g., V_in) as input and outputs one or more gradients (e.g., to simulate a backpropagation path of a trained decoder model). The UE server may use the one or more gradients to train an encoder model (e.g., to update one or more weights of the encoder model based at least in part on the one or more gradients). As another example, the first device may be a UE server, and the first model may be an encoder model. The second device may be a network server, and the second model may be a decoder model. The UE server may transmit, and the network server may receive, a function that takes the ground truth (e.g., V_in) as input and outputs an activation function Z (e.g., to simulate the forward propagation path of the trained encoder model). The network server may train a decoder model using the activation function and the ground truth (e.g., by providing the activation function and the ground truth to a loss function and using the gradient of the loss function to update the weights of the decoder model).
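The second example above, in which the UE server supplies a function that simulates the forward propagation path of its trained encoder, can be sketched as a closure over frozen weights. The helper name `make_forward_function`, the linear models, the mean squared error loss, and the random stand-in for a trained encoder are illustrative assumptions, not details from the disclosure.

```python
import numpy as np


def make_forward_function(w_enc):
    """Wrap a trained (frozen) linear encoder as a forward-path function."""
    w_frozen = w_enc.copy()

    def forward_function(v_in):
        return v_in @ w_frozen        # activation Z for ground truth V_in

    return forward_function


rng = np.random.default_rng(4)
D_IN, D_LATENT = 8, 4
v_in = rng.standard_normal((200, D_IN))   # synthetic stand-in for CSI

# Stand-in for an encoder already trained by the UE server.
forward_fn = make_forward_function(rng.standard_normal((D_IN, D_LATENT)))

# The network server trains its decoder from Z and V_in alone.
w_dec = np.zeros((D_LATENT, D_IN))
losses = []
for _ in range(300):
    z = forward_fn(v_in)                          # simulated forward path
    v_out = z @ w_dec
    losses.append(float(np.mean((v_out - v_in) ** 2)))
    grad_out = 2.0 * (v_out - v_in) / v_in.size   # gradient of the MSE loss
    w_dec -= 0.05 * z.T @ grad_out                # decoder weight update
```

In this sketch the network server calls `forward_fn` locally whenever it chooses to train, so no live exchange with the UE server occurs inside the loop.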
[0100]
[0115] As a result, the encoder model and the decoder model may be trained using the forward and backpropagation paths within the same training loop (e.g., in a manner similar to Type 2 training), while also being trained sequentially and / or separately. For example, a function provided by a first device to a second device may enable the forward and backpropagation paths (e.g., fixed at the first device) to be simulated at the second device for simulated joint or simultaneous training. This may improve the accuracy of the training of the encoder and / or decoder models (e.g., by training simultaneously and within the same loop for forward and backpropagation). Additionally, this may increase flexibility regarding when training occurs (e.g., because the encoder and decoder models may be trained separately and / or at different times). For example, a training session may not need to be established between a UE server and a network server to jointly train the encoder and decoder models.
[0101]
[0116] FIG. 10 is a diagram of an example embodiment 1000 associated with hybrid sequential training for encoder and decoder models in accordance with the present disclosure. As shown in FIG. 10 , a network node 110 (e.g., a base station, a CU, a DU, and / or a RU) may be in communication with a UE 120. In some aspects, the network node 110 and the UE 120 may be part of a wireless network (e.g., wireless network 100). The UE 120 and the network node 110 may establish a wireless connection prior to the operations shown in FIG. 10 . As shown in FIG. 10 , the UE 120 may be in communication with a UE server 1005 (e.g., server 135a or server 135b). The UE server 1005 may be associated with a vendor of the UE 120. Similarly, the network node 110 may be in communication with a network server 1010 (e.g., server 135c). The network server 1010 may be associated with a vendor of the network node 110.
[0102]
[0117] As described herein, operations performed by UE 120 and / or UE server 1005 may be referred to as "UE-side" operations. Similarly, operations performed by network node 110 and / or network server 1010 may be referred to as "network-side" operations. In some aspects, one or more (or all) operations described herein as being performed by UE server 1005 may be performed by UE 120. Similarly, one or more (or all) operations described herein as being performed by network server 1010 may be performed by network node 110 (or another network node).
[0103]
[0118] In some aspects, actions described herein as being performed by the network node 110 may be performed by multiple different network nodes. For example, configuration actions may be performed by a first network node (e.g., a CU or DU), and wireless communication actions may be performed by a second network node (e.g., a DU or RU). As used herein, the network node 110 “transmitting” a communication to the UE 120 may refer to direct transmission (e.g., from the network node 110 to the UE 120) or indirect transmission via one or more other network nodes or devices. For example, if the network node 110 is a DU, indirect transmission to the UE 120 may include the DU transmitting the communication to the RU, and the RU transmitting the communication to the UE 120. Similarly, the UE 120 “transmitting” a communication to the network node 110 may refer to direct transmission (e.g., from the UE 120 to the network node 110) or indirect transmission via one or more other network nodes or devices. For example, if the network node 110 is a DU, indirect transmission to the network node 110 may involve the UE 120 transmitting the communication to an RU, which in turn transmits the communication to the DU.
[0104]
[0119] As indicated by reference numeral 1015, the network server 1010 may train a decoder model associated with the network node 110. For example, the network server 1010 may train the decoder model in a manner similar to that described elsewhere herein, e.g., with respect to Type 1 training and / or Type 3 training. For example, the network server 1010 may receive one or more data sets from the network node 110, the UE server 1005, and / or one or more UEs 120 to be used as inputs for training the decoder model. For example, the one or more data sets may include CSI. The network server 1010 may deploy an encoder model at the network server 1010. The encoder model may be configured to output an activation (e.g., Z) based on an input or ground truth (e.g., V_in) provided to the encoder model. The network server 1010 may input the activation into the decoder model, and the decoder model may output V_out, which is a reconstruction of the input or ground truth (e.g., V_in). The network server 1010 may provide the input or ground truth (e.g., V_in) and the output V_out to a loss function. The loss function may compare V_in and V_out. The network server 1010 may obtain gradients based on the output of the loss function. The network server 1010 may train the decoder model based on the gradients. For example, the network server 1010 may use the gradients to update one or more weights of the neural network of the decoder model (e.g., in an attempt to minimize the loss function). The network server 1010 may run one or more training loops in a similar manner to update the weights of the decoder model until the output of the loss function meets a training threshold. For example, the network server 1010 may perform one or more training loops until the difference between the output V_out of the decoder model and the input or ground truth V_in is sufficiently small (e.g., until V_in is sufficiently reconstructed).
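As a non-limiting illustration, one training loop of the kind described in this paragraph may be written as follows. The squared-error loss, the learning rate η, the threshold ε, and the symbols f_θ (encoder model) and g_φ (decoder model) are assumptions introduced here for illustration only; the disclosure does not fix a particular loss function or optimizer.

```latex
\begin{aligned}
Z &= f_\theta(V_{in}), \qquad V_{out} = g_\phi(Z), \\
\mathcal{L}(\phi) &= \tfrac{1}{2}\,\lVert V_{out} - V_{in} \rVert_2^2, \\
\phi &\leftarrow \phi - \eta\,\nabla_\phi \mathcal{L}(\phi)
\quad \text{repeated until } \mathcal{L}(\phi) \le \epsilon .
\end{aligned}
```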
[0105]
[0120] As indicated by reference numeral 1020, the network server 1010 may generate a function based on the trained decoder model. The function may be an API, a set of instructions, code, a software program, and / or another function. The function may be configured to output one or more gradients based on input of activation values and inputs. For example, based on training of the decoder model, the network server 1010 may configure the function to simulate forward and backpropagation paths of the decoder model using information obtained via the training loop and / or based on a loss function. For example, the function may be configured to mimic or simulate the forward and backpropagation paths of the trained decoder model when executed by a device (e.g., UE server 1005). For example, the function, when executed by the device, may take activation values (e.g., Z) and ground truth (e.g., V_in) as input and return gradients as output (which may be used, for example, to update the weights of an encoder model, as described in more detail elsewhere herein). In other words, the function may be configured to provide the results of a backpropagation pass (e.g., for a training loop) associated with a trained decoder model.
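As a non-limiting sketch of such a function, the following Python example wraps a frozen decoder so that a caller supplies activation values (Z) and ground truth (V_in) and receives gradients with respect to Z. The linear decoder, the squared-error loss, and the names `make_gradient_function` and `gradient_function` are hypothetical and chosen only for illustration; an actual function could wrap any trained decoder model.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def make_gradient_function(
    decoder_weights: Matrix,
) -> Callable[[Vector, Vector], Vector]:
    """Wrap a frozen (already trained) linear decoder as a gradient oracle.

    Assumption for illustration: linear decoder, squared-error loss.
    """
    # Copy the weights so the caller cannot modify the frozen decoder.
    frozen = [row[:] for row in decoder_weights]

    def gradient_function(z: Vector, v_in: Vector) -> Vector:
        # Simulated forward path: V_out = W z.
        v_out = matvec(frozen, z)
        # Loss L = 0.5 * ||V_out - V_in||^2, so dL/dV_out = V_out - V_in.
        error = [o - t for o, t in zip(v_out, v_in)]
        # Simulated backpropagation path: dL/dZ = W^T (V_out - V_in).
        return [
            sum(frozen[i][j] * error[i] for i in range(len(frozen)))
            for j in range(len(z))
        ]

    return gradient_function


# Decoder mapping a 2-value activation to a 3-value reconstruction.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
grad_fn = make_gradient_function(W)
grads = grad_fn([1.0, 2.0], [1.0, 2.0, 2.0])
print(grads)
```

The caller never sees the decoder weights; only the gradients cross the interface, which is what allows the other side to update its own model separately.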
[0106]
[0121] In some aspects, the network server 1010 may generate the function based on training a decoder model. For example, the network server 1010 may use various activation values (e.g., Z) and ground truth (e.g., V_in) during the decoder model training process. The network server 1010 may configure the function to provide a given gradient based on given activation values and / or ground truth input to the function (e.g., using information obtained via the training loop and / or based on a loss function). This may allow the encoder model to be trained using forward and backpropagation paths in the same training loop as the decoder model (e.g., in a manner similar to Type 2 training), while also allowing the decoder and encoder models to be trained sequentially and / or separately. Additionally or alternatively, the function may be pre-configured (e.g., by a vendor associated with the network server 1010). In such an example, the network server 1010 may retrieve the function from a memory of the network server 1010.
[0107]
[0122] In some aspects, the network server 1010 (and / or the network node 110) may determine a codebook for vector quantization associated with the compressed CSI, as described in more detail elsewhere herein (e.g., with respect to FIGS. 9A and 9B). For example, the network server 1010 (and / or the network node 110) may train a vector quantization model as part of training a decoder model. For example, the network server 1010 (and / or the network node 110) may train a quantizer associated with the vector quantization. The quantization codebook (e.g., a vector codebook or a scalar codebook) may be determined in the network server 1010 (and / or the network node 110) as part of training the decoder model. In some aspects, a function (e.g., an API or other function) generated by the network server 1010 may include a vector quantization component. For example, the function may be configured to simulate the effect of quantization on activation values or other information output by or input to a trained decoder model.
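As a non-limiting sketch of quantizing with a vector codebook, the following Python example maps an activation vector to its nearest codeword. The nearest-neighbor rule, the squared Euclidean distance, the four-entry codebook, and the name `quantize` are assumptions for illustration; the disclosure does not specify how the codebook is searched or learned.

```python
from typing import List, Tuple

Vector = List[float]


def quantize(z: Vector, codebook: List[Vector]) -> Tuple[int, Vector]:
    """Return (index, codeword) of the codebook entry nearest to z."""

    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda i: sq_dist(z, codebook[i]))
    return index, codebook[index]


# Hypothetical 4-entry codebook for 2-dimensional activations.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
index, codeword = quantize([0.9, 0.2], codebook)
print(index, codeword)
```

With a four-entry codebook, only the index (two bits) would need to be signaled rather than the activation values themselves.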
[0108]
[0123] In some aspects, the network server 1010 may generate functions associated with multiple decoder models. For example, the network server 1010 may train multiple decoder models (e.g., in a manner similar to that described in more detail elsewhere herein). In some aspects, the multiple decoder models may be associated with respective UE vendors. As another example, the multiple decoder models may be associated with respective types of CSI (e.g., a first decoder model may be associated with a precoding vector and a second decoder model may be associated with a channel estimation, among other examples). As another example, the multiple decoder models may be associated with respective channel conditions. As another example, the multiple decoder models may be associated with respective CSI sizes (e.g., the size of the CSI to be communicated between the network node 110 and the UE 120). The network server 1010 may generate functions configured to simulate forward and backward propagation paths of the multiple trained decoder models.
[0109]
[0124] As indicated by reference numeral 1025, the network server 1010 may transmit a function (e.g., associated with a trained decoder model) and the UE server 1005 may receive it. For example, the network server 1010 and the UE server 1005 may establish a connection (e.g., a wireless connection or a wired connection). The function may be transmitted from the network server 1010 to the UE server 1005 over the connection.
[0110]
[0125] As indicated by reference numeral 1030, the UE server 1005 may train the encoder model using the function. For example, the UE server 1005 may train the encoder model based on selecting or updating one or more weights associated with the encoder model using one or more gradients. In some aspects, the one or more gradients may be a function of the output from the encoder (e.g., Z) and the input to the encoder (e.g., V_in). For example, the one or more gradients may be obtained based on inputting one or more activation values and one or more inputs (e.g., ground truth) into the function. For example, the UE server 1005 may train the encoder model in a manner similar to Type 2 training, as described in more detail above. However, rather than providing one or more activation values and one or more inputs (e.g., ground truth) to the network server 1010 (e.g., as in Type 2 training), the UE server 1005 may input the one or more activation values and the one or more inputs (e.g., ground truth) into the function received from the network server 1010. The function may simulate a forward propagation path of the trained decoder model (e.g., providing the activation values to the decoder model to obtain V_out) and a backpropagation path (e.g., providing a gradient based on the output of the loss function). Thus, the encoder model may be trained using forward and backpropagation paths in the same training loop (e.g., in a manner similar to Type 2 training), while also being trained sequentially and / or separately, unlike Type 2 training.
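As a non-limiting sketch of this UE-side training, the following Python example updates the weights of a small linear encoder using only the gradients returned by a received function. The linear models, the squared-error loss, the learning rate, the toy data set, and the names `gradient_function` and `train_encoder` are all assumptions for illustration.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


# Stand-in for the function received from the network server. The frozen
# linear decoder and squared-error loss are assumptions for illustration.
_DECODER: Matrix = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]


def gradient_function(z: Vector, v_in: Vector) -> Vector:
    """Return dL/dZ for L = 0.5 * ||decoder(z) - v_in||^2."""
    error = [o - t for o, t in zip(matvec(_DECODER, z), v_in)]
    return [
        sum(_DECODER[i][j] * error[i] for i in range(len(error)))
        for j in range(len(z))
    ]


def train_encoder(
    samples: List[Vector],
    grad_fn: Callable[[Vector, Vector], Vector],
    n_act: int = 2,
    lr: float = 0.1,
    epochs: int = 200,
) -> Matrix:
    """Train a linear encoder Z = E v_in using only gradients from grad_fn."""
    n_in = len(samples[0])
    encoder = [[0.0] * n_in for _ in range(n_act)]
    for _ in range(epochs):
        for v_in in samples:
            z = matvec(encoder, v_in)   # local forward pass (encoder)
            grad_z = grad_fn(z, v_in)   # decoder passes, simulated by the function
            # Chain rule for a linear encoder: dL/dE[j][k] = grad_z[j] * v_in[k].
            for j in range(n_act):
                for k in range(n_in):
                    encoder[j][k] -= lr * grad_z[j] * v_in[k]
    return encoder


samples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
encoder = train_encoder(samples, gradient_function)
# Largest remaining gradient magnitude over the data set after training.
residual = max(
    abs(g) for v in samples for g in gradient_function(matvec(encoder, v), v)
)
print(encoder, residual)
```

In this sketch no activation values or ground truth leave the UE side during training; they are passed only to the locally executed function.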
[0111]
[0126] In some aspects, the function may include a vector quantization component, as described above. For example, the function may be configured to simulate the effect of quantization on activation values or other information output by or input to a trained decoder model. In such an example, the UE server 1005 may use the function to train a quantizer and / or a vector quantization model. In another example, the UE server 1005 (or the UE 120) may determine a codebook for vector quantization associated with the compressed CSI, as described in more detail elsewhere herein (e.g., with respect to FIGS. 9A and 9B). For example, the UE server 1005 (and / or the UE 120) may train the vector quantization model as part of training an encoder model. For example, the UE server 1005 (and / or the UE 120) may train a quantizer associated with the vector quantization. A quantization codebook (e.g., a vector codebook or a scalar codebook) may be determined in the UE server 1005 (and / or the UE 120) as part of training the encoder model. In such an example, the input provided to the function (e.g., API) may include quantized activation values output by the encoder model (e.g., quantized using vector quantization and / or a quantization codebook determined by the UE server 1005 and / or the UE 120). In other words, a quantizer may be trained using the encoder model (e.g., and the function may not simulate the effects of such quantization).
[0112]
[0127] In some aspects, as described above, a function may be associated with multiple trained decoder models. In such an example, training the encoder model may include providing to the function an indication of an identifier associated with the decoder model. For example, input to the function (e.g., an API) may include a model identifier (e.g., associated with a given encoder model and / or decoder model). The function may be configured to provide information based on the model identifier provided to the function. In some examples, the UE server 1005 may train a single encoder model to be operable with each of multiple trained decoder models. In other examples, the UE server 1005 may train multiple encoder models to be operable with respective decoder models from the multiple trained decoder models (e.g., if the function is associated with N trained decoder models, then the UE server 1005 may train N encoder models).
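As a non-limiting sketch of a function associated with multiple trained decoder models, the following Python example selects the decoder by a model identifier supplied as an input. The identifier strings, the registry `_DECODERS`, the linear decoders, and the squared-error loss are hypothetical and used only for illustration.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]

# Hypothetical registry of frozen linear decoders keyed by model identifier.
_DECODERS: Dict[str, Matrix] = {
    "precoding": [[1.0, 0.0], [0.0, 1.0]],
    "channel_estimate": [[2.0, 0.0], [0.0, 2.0]],
}


def gradient_function(model_id: str, z: Vector, v_in: Vector) -> Vector:
    """Return dL/dZ for the decoder selected by model_id."""
    if model_id not in _DECODERS:
        raise KeyError(f"unknown model identifier: {model_id!r}")
    w = _DECODERS[model_id]
    # Forward path of the selected decoder, then squared-error gradient.
    v_out = [sum(a * b for a, b in zip(row, z)) for row in w]
    error = [o - t for o, t in zip(v_out, v_in)]
    return [
        sum(w[i][j] * error[i] for i in range(len(w))) for j in range(len(z))
    ]


g_a = gradient_function("precoding", [1.0, 1.0], [0.0, 0.0])
g_b = gradient_function("channel_estimate", [1.0, 1.0], [0.0, 0.0])
print(g_a, g_b)
```

The same activation values and ground truth yield different gradients depending on the identifier, so one encoder could be trained per identifier or a single encoder could be trained across all of them.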
[0113]
[0128] In some aspects, the UE server 1005 may receive another function (e.g., a second function) from another network server (e.g., another network server 1010). For example, the other network server may be associated with a different network node vendor than the vendor associated with the network server 1010. The UE server 1005 may train an encoder using a first function (e.g., received from the network server 1010) and using a second function (e.g., received from the other network server 1010). In other words, the UE server 1005 may train an encoder model using multiple functions provided by network servers associated with different vendors. In this manner, a trained encoder model may be configured to be operable with trained decoders associated with multiple functions (e.g., in a manner similar to that described with respect to FIG. 6).
[0114]
[0129] As indicated by reference numeral 1035, the UE server 1005 may send, and the UE 120 may receive, an indication of the trained encoder model. For example, the UE 120 may download the trained encoder model (e.g., trained using a function associated with the decoder model of the network node 110) from the UE server 1005. Similarly, as indicated by reference numeral 1040, the network server 1010 may send, and the network node 110 may receive, an indication of the trained decoder model. For example, the network node 110 may download the trained decoder model from the network server 1010.
[0115]
[0130] As indicated by reference numeral 1045, the UE 120 and the network node 110 may communicate using a trained encoder model and a trained decoder model, respectively. For example, the UE 120 may obtain CSI to be transmitted to the network node 110. The UE 120 may input the CSI into the trained encoder model. The trained encoder model may output an activation (e.g., compressed CSI). In some aspects, the UE 120 may quantize the activation output by the trained encoder model (e.g., using a quantization codebook and / or vector quantization). The UE 120 may transmit, and the network node 110 may receive, the activation output by the trained encoder model (e.g., compressed CSI). In some aspects, the UE 120 may transmit, and the network node 110 may receive, a quantized representation of the activation (e.g., compressed CSI) output by the trained encoder model. The network node 110 may input the activation into the trained decoder model. The trained decoder model may output reconstructed CSI, which is a reconstruction of the CSI input to the encoder model (e.g., at UE 120).
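As a non-limiting sketch of this inference flow, the following Python example compresses CSI with an encoder, quantizes the result, and reconstructs the CSI with a decoder. The weight values, the uniform scalar quantizer with step 0.25, and the names `ENCODER`, `DECODER`, `quantize`, and `dequantize` are assumptions for illustration and are not taken from the disclosure.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


# Hypothetical trained weights (3-value CSI compressed to 2 values).
ENCODER: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # deployed at the UE
DECODER: Matrix = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # deployed at the network node


def quantize(z: Vector, step: float = 0.25) -> List[int]:
    """Uniform scalar quantizer: map each value to an integer level."""
    return [round(x / step) for x in z]


def dequantize(levels: List[int], step: float = 0.25) -> Vector:
    """Map integer levels back to representative values."""
    return [level * step for level in levels]


csi = [0.5, -0.25, 0.0]
# UE side: encode the CSI, then quantize the activation for transmission.
payload = quantize(matvec(ENCODER, csi))
# Network side: dequantize the received payload and decode it.
reconstructed = matvec(DECODER, dequantize(payload))
print(payload, reconstructed)
```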
[0116]
[0131] As a result, the encoder model and the decoder model may be trained using the forward and backpropagation paths within the same training loop (e.g., in a manner similar to Type 2 training), while also being trained sequentially and / or separately. For example, a function provided by a first device to a second device may enable the forward and backpropagation paths (e.g., fixed to the first device) to be simulated at the second device for simulated joint or simultaneous training. This may improve the accuracy of the training of the encoder and / or decoder models (e.g., by training simultaneously and within the same loop for forward and backpropagation). Additionally, this may increase flexibility regarding when training occurs (e.g., because the encoder and decoder models may be trained separately and / or at different times). For example, a training session may be established between a UE server and a network server to jointly train the encoder and decoder models.
[0117]
[0132] As noted above, FIG. 10 is provided as an example. Other examples may differ from those described with respect to FIG. 10.
[0118]
[0133] FIG. 11 is a diagram of an example embodiment 1100 associated with hybrid sequential training for encoder and decoder models according to the present disclosure. As shown in FIG. 11, a network node 110 (e.g., a base station, a CU, a DU, and / or an RU) may communicate with a UE 120. In some aspects, the network node 110 and the UE 120 may be part of a wireless network (e.g., wireless network 100). The UE 120 and the network node 110 may establish a wireless connection prior to the operations shown in FIG. 11. As shown in FIG. 11, the UE 120 may communicate with a UE server 1005 in a manner similar to that described above with respect to FIG. 10. Similarly, the network node 110 may communicate with a network server 1010 in a manner similar to that described above with respect to FIG. 10.
[0119]
[0134] As indicated by reference numeral 1105, the UE server 1005 may train an encoder model associated with the UE 120. For example, the UE server 1005 may train the encoder model in a manner similar to that described elsewhere herein with respect to Type 1 training and / or Type 3 training. For example, the UE server 1005 may receive from the UE 120 one or more data sets used as input for training the encoder model. For example, the one or more data sets may include CSI. The UE server 1005 may deploy a decoder model at the UE server 1005. The decoder model may be configured to output reconstructed CSI (e.g., V_out) based on an input of an activation (e.g., Z). The UE server 1005 may provide the ground truth (e.g., V_in) as input to the encoder model. The encoder model may output an activation (e.g., Z). The UE server 1005 may input the activation into the decoder model. The decoder model may then generate a reconstruction of the ground truth (e.g., V_out). The UE server 1005 may use a loss function to compare the output of the decoder model, V_out, with V_in to determine a gradient. The UE server 1005 may use the gradient to update one or more weights of the encoder model (e.g., to minimize the loss function). For example, the UE server 1005 may use the gradient to update one or more weights of the neural network of the encoder model (e.g., in an attempt to minimize the loss function). The UE server 1005 may perform one or more training loops in a similar manner to update the weights of the encoder model until the output of the loss function meets a training threshold. For example, the UE server 1005 may perform one or more training loops until the difference between the output of the decoder model, V_out, and the input or ground truth V_in is sufficiently small (e.g., until V_in is sufficiently reconstructed).
[0120]
[0135] As indicated by reference numeral 1110, the UE server 1005 may generate a function based on the trained encoder model. The function may be an API, a set of instructions, code, a software program, and / or another function. The function may be configured to output an activation (e.g., Z) based on an input (e.g., ground truth, V_in). For example, the function, when executed by a device (e.g., network server 1010), may be configured to mimic or simulate the forward and backpropagation paths of the trained encoder model. For example, based on training the encoder model, the UE server 1005 may configure the function to simulate the forward and backpropagation paths of the encoder model using information obtained via the training loop and / or based on a loss function. For example, the function, when executed by a device, may take ground truth (e.g., V_in) as input and return as output an activation (e.g., Z) (which may be used as an input for training a decoder model, e.g., as described in more detail elsewhere herein). In other words, the function may be configured to provide the results of a forward propagation path (e.g., for a training loop) associated with the trained encoder model. Thus, the decoder model may be trained using forward and backpropagation paths within the same training loop (e.g., in a manner similar to Type 2 training), while also being trained sequentially and / or separately, unlike in Type 2 training.
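As a non-limiting sketch of a function generated from a trained encoder model, the following Python example wraps frozen encoder weights so that a caller supplies ground truth (V_in) and receives an activation (Z). The linear encoder and the names `make_forward_function` and `forward_function` are hypothetical and used only for illustration.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def make_forward_function(encoder_weights: Matrix) -> Callable[[Vector], Vector]:
    """Wrap a frozen (already trained) linear encoder as a forward-pass function."""
    # Copy the weights so later changes by the caller do not affect the function.
    frozen = [row[:] for row in encoder_weights]

    def forward_function(v_in: Vector) -> Vector:
        # Simulated forward path of the encoder: Z = E v_in.
        return [sum(w * x for w, x in zip(row, v_in)) for row in frozen]

    return forward_function


# Hypothetical encoder compressing a 3-value input to a 2-value activation.
E = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
encode = make_forward_function(E)
z = encode([2.0, 4.0, 1.0])

# Changing the original weights afterwards does not change the function.
E[0][0] = 9.0
z_again = encode([2.0, 4.0, 1.0])
print(z, z_again)
```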
[0121]
[0136] In some aspects, the UE server 1005 may generate the function based on training an encoder model. For example, the UE server 1005 may use ground truth (e.g., V_in) during the encoder model training process. The UE server 1005 may configure the function to provide a given activation based on a given ground truth input to the function (e.g., using information obtained via a training loop and / or based on a loss function). This may allow the decoder model to be trained using forward and backpropagation paths in the same training loop as the encoder model (e.g., in a manner similar to Type 2 training), while also allowing the decoder and encoder models to be trained sequentially and / or separately. Additionally or alternatively, the function may be pre-configured (e.g., by a vendor associated with the UE server 1005). In such an example, the UE server 1005 may retrieve the function from a memory of the UE server 1005.
[0122]
[0137] In some aspects, the UE server 1005 (and / or the UE 120) may determine a codebook for vector quantization associated with the compressed CSI, as described in more detail elsewhere herein (e.g., with respect to FIGS. 9A and 9B). For example, the UE server 1005 (and / or the UE 120) may train a vector quantization model as part of training an encoder model. For example, the UE server 1005 (and / or the UE 120) may train a quantizer associated with the vector quantization. A quantization codebook (e.g., a vector codebook or a scalar codebook) may be determined in the UE server 1005 (and / or the UE 120) as part of training the encoder model. In some aspects, a function (e.g., an API or other function) generated by the UE server 1005 may include a vector quantization component. For example, the function may be configured to simulate the effect of quantization on activation values or other information output by or input to a trained encoder model.
[0123]
[0138] In some aspects, the UE server 1005 may generate functions associated with multiple encoder models. For example, the UE server 1005 may train multiple encoder models (e.g., in a manner similar to that described in more detail elsewhere herein). In some aspects, the multiple encoder models may be associated with respective network node vendors. As another example, the multiple encoder models may be associated with respective types of CSI (e.g., a first encoder model may be associated with a precoding vector and a second encoder model may be associated with a channel estimation, among other examples). As another example, the multiple encoder models may be associated with respective channel conditions. As another example, the multiple encoder models may be associated with respective CSI sizes (e.g., sizes of CSI to be communicated between the network node 110 and the UE 120). The UE server 1005 may generate functions configured to simulate forward and backward propagation paths of the multiple trained encoder models.
[0124]
[0139] As indicated by reference numeral 1115, the UE server 1005 may send a function (e.g., associated with a trained encoder model) and the network server 1010 may receive it. For example, the network server 1010 and the UE server 1005 may establish a connection (e.g., a wireless connection or a wired connection). The function may be sent from the UE server 1005 to the network server 1010 over the connection.
[0125]
[0140] As indicated by reference numeral 1120, the network server 1010 may train the decoder model using the function. For example, the network server 1010 may train the decoder model based on selecting or updating one or more weights associated with the decoder model using one or more gradients obtained from a loss function, as described in more detail elsewhere herein. The one or more gradients may be obtained based on inputting one or more inputs (e.g., ground truth) into the function. For example, the network server 1010 may train the decoder model in a manner similar to Type 2 training. However, rather than receiving one or more activation values and one or more inputs (e.g., ground truth) from the UE server 1005 (e.g., as in Type 2 training), the network server 1010 may obtain the one or more activation values and / or the one or more inputs (e.g., ground truth) from the function received from the UE server 1005. The function may simulate forward and backpropagation paths of the trained encoder model. Thus, the decoder model may be trained using forward and backpropagation paths in the same training loop as the encoder model (e.g., in a manner similar to Type 2 training), while also being trained sequentially and / or separately, unlike Type 2 training.
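As a non-limiting sketch of this network-side training, the following Python example trains a small linear decoder on activations obtained from a received function and stops once the loss falls below a training threshold. The linear models, the squared-error loss, the learning rate, the threshold value, the toy data set, and the names `forward_function` and `train_decoder` are assumptions for illustration.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


# Stand-in for the function received from the UE server. The frozen linear
# encoder is an assumption for illustration.
_ENCODER: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]


def forward_function(v_in: Vector) -> Vector:
    """Return the activation Z that the trained encoder produces for v_in."""
    return matvec(_ENCODER, v_in)


def train_decoder(
    samples: List[Vector],
    forward_fn: Callable[[Vector], Vector],
    lr: float = 0.1,
    threshold: float = 1e-12,
    max_epochs: int = 1000,
) -> Tuple[Matrix, float]:
    """Train a linear decoder V_out = D Z until the epoch loss meets the threshold."""
    n_out = len(samples[0])
    n_act = len(forward_fn(samples[0]))
    decoder = [[0.0] * n_act for _ in range(n_out)]
    total_loss = float("inf")
    for _ in range(max_epochs):
        total_loss = 0.0
        for v_in in samples:
            z = forward_fn(v_in)            # encoder pass, simulated by the function
            v_out = matvec(decoder, z)      # local forward pass (decoder)
            error = [o - t for o, t in zip(v_out, v_in)]
            total_loss += 0.5 * sum(e * e for e in error)
            # Gradient of the squared-error loss: dL/dD[i][j] = error[i] * z[j].
            for i in range(n_out):
                for j in range(n_act):
                    decoder[i][j] -= lr * error[i] * z[j]
        if total_loss < threshold:
            break
    return decoder, total_loss


samples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
decoder, final_loss = train_decoder(samples, forward_function)
print(decoder, final_loss)
```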
[0126]
[0141] In some aspects, the function may include a vector quantization component, as described above. For example, the function may be configured to simulate the effect of quantization on activation values or other information output by or input to a trained encoder model. In such an example, the network server 1010 may use the function to train a quantizer and / or vector quantization model. In other examples, the network server 1010 (or network node 110) may determine a codebook for vector quantization associated with the compressed CSI, as described in more detail elsewhere herein (e.g., with respect to FIGS. 9A and 9B). For example, the network server 1010 (and / or network node 110) may train the vector quantization model as part of training the decoder model. For example, the network server 1010 (and / or network node 110) may train a quantizer associated with vector quantization. A quantization codebook (e.g., a vector codebook or a scalar codebook) may be determined in the network server 1010 (and / or network node 110) as part of training the decoder model. In such examples, the input provided to the decoder model may include quantized activation values output by the function (e.g., API) (e.g., quantized using vector quantization and / or a quantization codebook determined by network server 1010 and / or network node 110). In other words, the quantizer may be trained using the decoder model (e.g., and the function may not simulate the effects of such quantization).
[0127]
[0142] In some aspects, as described above, a function may be associated with multiple trained encoder models. In such examples, training a decoder model may include providing to the function an indication of, and / or an identifier associated with, an encoder model (from the multiple trained encoder models). For example, input to the function (e.g., an API) may include a model identifier (e.g., associated with a given encoder model and / or decoder model). The function may be configured to provide information based on the model identifier provided to the function. In some examples, the network server 1010 may train a single decoder model to be operable with each of the multiple trained encoder models. In other examples, the network server 1010 may train multiple decoder models to be operable with respective encoder models from the multiple trained encoder models (e.g., if the function is associated with N trained encoder models, then the network server 1010 may train N decoder models).
[0128]
[0143] In some aspects, the network server 1010 may receive another function (e.g., a second function) from another UE server (e.g., another UE server 1005). For example, the other UE server may be associated with a different UE vendor than the vendor associated with the UE server 1005. The network server 1010 may train the decoder model using a first function (e.g., received from the UE server 1005) and using a second function (e.g., received from the other UE server). In other words, the network server 1010 may train the decoder model using multiple functions provided by UE servers associated with different vendors. In this manner, the trained decoder model may be configured to be operable with a trained encoder associated with multiple functions (e.g., in a manner similar to that described with respect to FIG. 6).
[0129]
[0144] As indicated by reference numeral 1125, the UE server 1005 may send, and the UE 120 may receive, an indication of the trained encoder model. For example, the UE 120 may download the trained encoder model from the UE server 1005. Similarly, as indicated by reference numeral 1130, the network server 1010 may send, and the network node 110 may receive, an indication of a trained decoder model (e.g., trained using a function associated with the UE 120's encoder model). For example, the network node 110 may download the trained decoder model from the network server 1010.
[0130]
[0145] As indicated by reference numeral 1135, the UE 120 and the network node 110 may communicate using a trained encoder model and a trained decoder model, respectively. For example, the UE 120 may obtain CSI to be transmitted to the network node 110. The UE 120 may input the CSI into the trained encoder model. The trained encoder model may output an activation (e.g., compressed CSI). In some aspects, the UE 120 may quantize the activation output by the trained encoder model (e.g., using a quantization codebook and / or vector quantization). The UE 120 may transmit, and the network node 110 may receive, the activation output by the trained encoder model (e.g., compressed CSI). In some aspects, the UE 120 may transmit, and the network node 110 may receive, a quantized representation of the activation output by the trained encoder model (e.g., compressed CSI). The network node 110 may input the activation into the trained decoder model. The trained decoder model may output reconstructed CSI, which is a reconstruction of the CSI input to the encoder model (e.g., at UE 120).
[0131]
[0146] As a result, the encoder model and the decoder model may be trained using the forward and backpropagation paths within the same training loop (e.g., in a manner similar to Type 2 training), while also being trained sequentially and / or separately. For example, a function provided by a first device to a second device may enable the forward and backpropagation paths (e.g., fixed to the first device) to be simulated at the second device for simulated joint or simultaneous training. This may improve the accuracy of the training of the encoder and / or decoder models (e.g., by training simultaneously and within the same loop for forward and backpropagation). Additionally, this may increase flexibility regarding when training occurs (e.g., because the encoder and decoder models may be trained separately and / or at different times). For example, a training session may be established between a UE server and a network server to jointly train the encoder and decoder models.
[0132]
[0147] As noted above, FIG. 11 is provided as an example. Other examples may differ from those described with respect to FIG. 11.
[0133]
[0148] FIG. 12 illustrates an example process 1200 implemented, for example, by a first device, in accordance with the present disclosure. The example process 1200 is an example in which a first device (e.g., a server, a UE server 1005, a network server 1010, a UE 120, and / or a network node 110) performs operations associated with hybrid sequential training for encoder and decoder models.
[0134]
[0149] As shown in FIG. 12, in some aspects, process 1200 may include receiving, from a second device, a function associated with a trained first model, the function being configured to output one or more gradients associated with the trained first model (block 1210). For example, the first device (e.g., using the communications manager 140 and / or the receiving component 1402 shown in FIG. 14) may receive, from the second device, a function associated with the trained first model, the function being configured to output one or more gradients associated with the trained first model, as described above.
[0135]
[0150] As further shown in FIG. 12, in some aspects, process 1200 may include training a second model based on selecting one or more weights associated with the second model using one or more gradients, the one or more gradients being obtained based on inputting one or more activation values and one or more inputs into the function (block 1220). For example, the first device (e.g., using the communications manager 140 and / or model training component 1408 shown in FIG. 14) may train the second model based on selecting one or more weights associated with the second model using one or more gradients, the one or more gradients being obtained based on inputting the one or more activation values and the one or more inputs into the function, as described above.
[0136]
[0151] Process 1200 may include additional aspects, such as any single aspect or any combination of aspects described below and / or in connection with one or more other processes described elsewhere herein.
[0137]
[0152] In a first aspect, the process 1200 includes transmitting the second model to the UE or the network node after training the second model.
[0138]
[0153] In a second aspect, alone or in combination with the first aspect, the second model is configured to output compressed CSI, the one or more activation values including the compressed CSI, and the trained first model is configured to output CSI from an input of the compressed CSI, the one or more inputs including the CSI.
[0139]
[0154] In a third aspect, alone or in combination with one or more of the first and second aspects, the process 1200 includes training a vector quantization model using one or more gradients.
[0140]
[0155] In a fourth aspect, either alone or in combination with one or more of the first to third aspects, the function is configured to perform vector quantization associated with the output of the function.
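As a non-limiting illustration of vector quantization applied to an output of the function, the following minimal sketch maps an activation vector to the nearest codeword of a fixed codebook. The codebook values, the squared-Euclidean distance measure, and the name quantize are assumptions made for illustration only and are not taken from the disclosure.

```python
from typing import List, Tuple

Vector = List[float]


def quantize(z: Vector, codebook: List[Vector]) -> Tuple[int, Vector]:
    """Return the index and value of the codeword nearest to z.

    Nearest is measured by squared Euclidean distance (an assumed metric).
    """
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda i: sq_dist(z, codebook[i]))
    return index, codebook[index]


# Hypothetical two-dimensional codebook with four codewords.
CODEBOOK = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Quantize an example activation vector Z.
index, codeword = quantize([0.9, 0.2], CODEBOOK)
```

In such a sketch, only the codeword index would need to be conveyed, because the receiving side can look up the codeword in the same codebook.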
[0141]
[0156] In a fifth aspect, alone or in combination with one or more of the first to fourth aspects, the function is associated with a plurality of trained first models, and training the second model includes providing an identifier associated with the trained first model as an input to the function.
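As a non-limiting illustration, a single function associated with a plurality of trained first models may accept an identifier that selects which trained first model is simulated. In the sketch below, the frozen linear decoders, the identifiers "decoder_a" and "decoder_b", and the squared-error loss are all assumptions made for illustration only.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]

# Hypothetical frozen linear decoders, keyed by a model identifier.
DECODERS: Dict[str, Matrix] = {
    "decoder_a": [[1.0], [1.0]],
    "decoder_b": [[2.0], [0.0]],
}


def gradient_function(model_id: str, z: Vector, v_in: Vector) -> Vector:
    """Return dLoss/dZ for the decoder selected by model_id.

    Loss is assumed to be 0.5 * ||decoder(Z) - V_in||^2.
    """
    w = DECODERS[model_id]
    # Forward path: reconstruct V_hat from the activations Z.
    v_hat = [sum(wij * zj for wij, zj in zip(row, z)) for row in w]
    err = [p - t for p, t in zip(v_hat, v_in)]
    # Backward path: transpose(W) applied to the reconstruction error.
    return [sum(w[i][j] * err[i] for i in range(len(err))) for j in range(len(z))]


# The same activations and ground truth yield different gradients per identifier.
grad_a = gradient_function("decoder_a", [1.0], [1.0, 1.0])
grad_b = gradient_function("decoder_b", [1.0], [1.0, 1.0])
```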
[0142]
[0157] In a sixth aspect, alone or in combination with one or more of the first through fifth aspects, training the second model further includes training the second model to be configured to operate with each of the plurality of trained first models.
[0143]
[0158] In a seventh aspect, alone or in combination with one or more of the first through sixth aspects, training the second model further includes training a plurality of second models, including the second model, configured to operate with a respective trained first model from the plurality of trained first models.
[0144]
[0159] In an eighth aspect, alone or in combination with one or more of the first through seventh aspects, the function is a first function, and the process 1200 includes receiving, from a third device, instructions for a second function associated with another trained first model, and training the second model includes training the second model using the first function and the second function.
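As a non-limiting illustration of training with both a first function and a second function, the sketch below queries each received function with the same activations and ground truth and averages the returned gradients. Averaging is an assumed combining rule, and first_function and second_function are hypothetical stand-ins for functions received from the second device and the third device.

```python
from typing import Callable, List, Sequence

Vector = List[float]
GradientFunction = Callable[[Vector, Vector], Vector]


def combined_gradient(
    functions: Sequence[GradientFunction], z: Vector, v_in: Vector
) -> Vector:
    """Average the gradients returned by each function (assumed combining rule)."""
    grads = [f(z, v_in) for f in functions]
    return [sum(g[j] for g in grads) / len(grads) for j in range(len(z))]


def first_function(z: Vector, v_in: Vector) -> Vector:
    # Hypothetical stand-in for the function received from the second device.
    return [2.0 * (z[0] - v_in[0])]


def second_function(z: Vector, v_in: Vector) -> Vector:
    # Hypothetical stand-in for the function received from the third device.
    return [4.0 * (z[0] - v_in[0])]


grad = combined_gradient([first_function, second_function], [1.5], [1.0])
```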
[0145]
[0160] In a ninth aspect, alone or in combination with one or more of the first to eighth aspects, the function is an API.
[0146]
[0161] In a tenth aspect, alone or in combination with one or more of the first through ninth aspects, the first device is a server associated with a UE, the trained first model is a decoder model, and the second model is an encoder model (e.g., in a manner similar to that shown and described with respect to FIG. 10). In some aspects, the function takes as input a combination of activations (e.g., Z) and a ground truth (e.g., V_in), and the function may output one or more gradients (e.g., to simulate the forward and backward propagation paths of the decoder model). The one or more gradients may be used to update one or more weights of the encoder model.
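As a non-limiting illustration of the encoder-side training described above, the following minimal sketch trains a linear encoder using only gradients returned by an opaque function. The linear models, the squared-error loss, the learning rate, and the names make_gradient_function and train_encoder are assumptions made for illustration only; the disclosure does not specify them.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def make_gradient_function(decoder_weights: Matrix) -> Callable[[Vector, Vector], Vector]:
    """Hypothetical stand-in for the function received from the second device.

    It hides a frozen linear decoder and returns dLoss/dZ, where
    Loss = 0.5 * ||decoder(Z) - V_in||^2 (an assumed loss).
    """
    def gradient_function(z: Vector, v_in: Vector) -> Vector:
        v_hat = matvec(decoder_weights, z)           # simulated forward path
        err = [p - t for p, t in zip(v_hat, v_in)]   # dLoss/dV_hat
        # Simulated backward path: transpose(W_d) applied to the error.
        return [sum(decoder_weights[i][j] * err[i] for i in range(len(err)))
                for j in range(len(z))]
    return gradient_function


def train_encoder(encoder_weights: Matrix, samples: List[Vector],
                  gradient_function: Callable[[Vector, Vector], Vector],
                  lr: float = 0.1, epochs: int = 200) -> None:
    """Update the encoder weights in place by stochastic gradient descent."""
    for _ in range(epochs):
        for v_in in samples:
            z = matvec(encoder_weights, v_in)        # activations Z
            grad_z = gradient_function(z, v_in)      # gradients from the function
            for j, g in enumerate(grad_z):
                for k, x in enumerate(v_in):
                    encoder_weights[j][k] -= lr * g * x


def reconstruction_loss(enc: Matrix, dec: Matrix, samples: List[Vector]) -> float:
    total = 0.0
    for v in samples:
        v_hat = matvec(dec, matvec(enc, v))
        total += 0.5 * sum((p - t) ** 2 for p, t in zip(v_hat, v))
    return total / len(samples)


# Toy data: 2-dimensional inputs compressed to a single activation value.
decoder_weights = [[1.0], [1.0]]   # known only to the function's provider
samples = [[1.0, 1.0], [0.5, 0.5], [-1.0, -1.0]]
encoder_weights = [[0.0, 0.0]]

loss_before = reconstruction_loss(encoder_weights, decoder_weights, samples)
train_encoder(encoder_weights, samples, make_gradient_function(decoder_weights))
loss_after = reconstruction_loss(encoder_weights, decoder_weights, samples)
```

In this sketch the encoder-side code never reads the decoder weights directly; it only calls the function, which is the property that allows the two models to be trained separately.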
[0147]
[0162] In an eleventh aspect, alone or in combination with one or more of the first through tenth aspects, the first device is a server associated with a network node, the trained first model is an encoder model, and the second model is a decoder model (e.g., in a manner similar to that shown and described with respect to FIG. 11). In some aspects, the function takes a ground truth (e.g., V_in) as input, and the function may output activations (e.g., Z). The output of the function (e.g., the activations Z) may be used as input to the decoder model to train the decoder model.
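As a non-limiting illustration of the decoder-side training described above, the following minimal sketch trains a linear decoder on activations returned by an opaque function that takes the ground truth as input. The linear models, the squared-error loss, the learning rate, and the names make_activation_function and train_decoder are assumptions made for illustration only.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def make_activation_function(encoder_weights: Matrix) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for the received function.

    It hides a frozen linear encoder and returns the activations Z for V_in.
    """
    def activation_function(v_in: Vector) -> Vector:
        return matvec(encoder_weights, v_in)
    return activation_function


def train_decoder(decoder_weights: Matrix, samples: List[Vector],
                  activation_function: Callable[[Vector], Vector],
                  lr: float = 0.1, epochs: int = 200) -> None:
    """Update the decoder weights in place so that decoder(Z) approximates V_in."""
    for _ in range(epochs):
        for v_in in samples:
            z = activation_function(v_in)            # activations from the function
            v_hat = matvec(decoder_weights, z)       # decoder forward pass
            err = [p - t for p, t in zip(v_hat, v_in)]
            for i, e in enumerate(err):
                for j, zj in enumerate(z):
                    decoder_weights[i][j] -= lr * e * zj


# Toy data: the hidden encoder averages the two input components.
encoder_weights = [[0.5, 0.5]]     # known only to the function's provider
samples = [[1.0, 1.0], [0.5, 0.5], [-1.0, -1.0]]
decoder_weights = [[0.0], [0.0]]

train_decoder(decoder_weights, samples, make_activation_function(encoder_weights))
```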
[0148]
[0163] In a twelfth aspect, alone or in combination with one or more of the first to eleventh aspects, the first device is a UE or a network node.
[0149]
[0164] In a thirteenth aspect, alone or in combination with one or more of the first to twelfth aspects, the function is configured to simulate forward and backward propagation paths of a trained first model based on one or more gradients.
[0150]
[0165] Although FIG. 12 illustrates example blocks of process 1200, in some aspects, process 1200 may include additional blocks, fewer blocks, different blocks, or differently arranged blocks than those illustrated in FIG. 12. Additionally, or alternatively, two or more of the blocks of process 1200 may be performed in parallel.
[0151]
[0166] FIG. 13 illustrates an example process 1300 performed, for example, by a first device, in accordance with the present disclosure. Example process 1300 is an example in which the first device (e.g., a server, a UE server 1005, a network server 1010, a UE 120, and / or a network node 110) performs operations associated with hybrid sequential training for encoder and decoder models.
[0152]
[0167] As shown in FIG. 13, in some aspects, process 1300 may include training a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activation values associated with an output of the trained first model (block 1310). For example, the first device (e.g., using the communications manager 150 and / or the model training component 1508 shown in FIG. 15) may train the first model based on the one or more inputs to obtain the trained first model, the trained first model being associated with one or more activation values associated with an output of the trained first model, as described above.
[0153]
[0168] As further shown in FIG. 13, in some aspects, process 1300 may include transmitting, to a second device, a function associated with the trained first model, the function being configured to output the one or more activation values based on ground truth inputs (block 1320). For example, the first device (e.g., using the communications manager 150 and / or the transmitting component 1504 shown in FIG. 15) may transmit, to the second device, a function associated with the trained first model, the function being configured to output the one or more activation values based on ground truth inputs, as described above.
[0154]
[0169] Process 1300 may include additional aspects, such as any single aspect or any combination of aspects described below and / or in connection with one or more other processes described elsewhere herein.
[0155]
[0170] In a first aspect, process 1300 includes transmitting the trained first model to a UE or a network node after training the first model.
[0156]
[0171] In a second aspect, alone or in combination with the first aspect, the trained first model is configured to output compressed CSI or to output CSI from an input of compressed CSI.
[0157]
[0172] In a third aspect, alone or in combination with one or more of the first and second aspects, the process 1300 includes training a vector quantization model using the trained first model.
[0158]
[0173] In a fourth aspect, either alone or in combination with one or more of the first to third aspects, the function is configured to perform vector quantization associated with the output of the function.
[0159]
[0174] In a fifth aspect, alone or in combination with one or more of the first to fourth aspects, the function is an API.
[0160]
[0175] In a sixth aspect, alone or in combination with one or more of the first through fifth aspects, the first device is a server associated with a network node, the first model is a decoder model, and the second device is associated with a UE and an encoder model (e.g., in a manner similar to that shown and described with respect to FIG. 10). In some aspects, the function may take as input a combination of activations (e.g., Z) and a ground truth (e.g., V_in), and the function may output one or more gradients (e.g., to simulate the forward and backward propagation paths of the decoder model). The one or more gradients may be used to update one or more weights of the encoder model.
[0161]
[0176] In a seventh aspect, alone or in combination with one or more of the first through sixth aspects, the first device is a server associated with a UE, the first model is an encoder model, and the second device is associated with a network node and a decoder model (e.g., in a manner similar to that shown and described with respect to FIG. 11). In some aspects, the function takes a ground truth (e.g., V_in) as input, and the function may output activations (e.g., Z). The output of the function (e.g., the activations Z) may be used as input to the decoder model to train the decoder model.
[0162]
[0177] In an eighth aspect, alone or in combination with one or more of the first to seventh aspects, the first device is a network node or a UE.
[0163]
[0178] In a ninth aspect, alone or in combination with one or more of the first to eighth aspects, the function is configured to simulate a forward propagation path and a backward propagation path of the first model.
[0164]
[0179] Although FIG. 13 illustrates example blocks of process 1300, in some aspects, process 1300 may include additional blocks, fewer blocks, different blocks, or differently arranged blocks than those illustrated in FIG. 13. Additionally, or alternatively, two or more of the blocks of process 1300 may be performed in parallel.
[0165]
[0180] FIG. 14 is a diagram of an example apparatus 1400 for wireless communication in accordance with the present disclosure. The apparatus 1400 may be a first device, or the first device may include the apparatus 1400. In some aspects, the first device may be a server, a UE server 1005, a network server 1010, a UE 120, and / or a network node 110. In some aspects, the apparatus 1400 includes a receiving component 1402 and a transmitting component 1404 that may communicate with each other (e.g., via one or more buses and / or one or more other components). As shown, the apparatus 1400 may communicate with another apparatus 1406 (such as a UE, a base station, or another wireless communication device) using the receiving component 1402 and the transmitting component 1404. As further shown, the apparatus 1400 may include a communications manager 140. The communications manager 140 may include a model training component 1408, among other examples.
[0166]
[0181] In some aspects, apparatus 1400 may be configured to perform one or more operations described herein with respect to FIGS. 10 and 11. Additionally or alternatively, apparatus 1400 may be configured to perform one or more processes described herein, such as process 1200 of FIG. 12, or a combination thereof. In some aspects, apparatus 1400 and / or one or more components shown in FIG. 14 may include one or more components of the first device described with respect to FIG. 2. Additionally or alternatively, one or more components shown in FIG. 14 may be implemented within one or more components described in connection with FIG. 2. Additionally or alternatively, one or more components of the set of components may be implemented at least in part as software stored in memory. For example, a component (or a portion of a component) may be implemented as instructions or code stored in a non-transitory computer-readable medium and executable by a controller or processor to perform the function or operation of the component.
[0167]
[0182] The receiving component 1402 may receive communications, such as reference signals, control information, data communications, or a combination thereof, from the apparatus 1406. The receiving component 1402 may provide the received communications to one or more other components of the apparatus 1400. In some aspects, the receiving component 1402 may perform signal processing (such as filtering, amplification, demodulation, analog-to-digital conversion, demultiplexing, deinterleaving, demapping, equalization, interference cancellation, or decoding, among other examples) on the received communications and provide the processed signals to one or more other components of the apparatus 1400. In some aspects, the receiving component 1402 may include one or more antennas, a modem, a demodulator, a MIMO detector, a receive processor, a controller / processor, a memory, or a combination thereof, of the first device described with respect to FIG. 2.
[0168]
[0183] The transmitting component 1404 may transmit communications, such as reference signals, control information, data communications, or a combination thereof, to the apparatus 1406. In some aspects, one or more other components of the apparatus 1400 may generate a communication and provide the generated communication to the transmitting component 1404 for transmission to the apparatus 1406. In some aspects, the transmitting component 1404 may perform signal processing (such as filtering, amplification, modulation, digital-to-analog conversion, multiplexing, interleaving, mapping, or encoding, among other examples) on the generated communication and transmit the processed signal to the apparatus 1406. In some aspects, the transmitting component 1404 may include one or more antennas, a modem, a modulator, a transmit MIMO processor, a transmit processor, a controller / processor, a memory, or a combination thereof, of the first device described with respect to FIG. 2. In some aspects, the transmitting component 1404 may be co-located with the receiving component 1402 within a transceiver.
[0169]
[0184] The receiving component 1402 may receive, from a second device, a function associated with a trained first model, the function being configured to output one or more gradients based on input of one or more activation values and one or more inputs. The model training component 1408 may train a second model based on selecting one or more weights associated with the second model using the one or more gradients, the one or more gradients being obtained based on inputting the one or more activation values and the one or more inputs into the function.
[0170]
[0185] The transmitting component 1404 may transmit the second model to a UE or a network node after training the second model.
[0171]
[0186] The model training component 1408 may train a vector quantization model using the one or more gradients.
[0172]
[0187] The number and arrangement of components shown in FIG. 14 are provided as an example. In practice, there may be additional components, fewer components, different components, or differently arranged components than those shown in FIG. 14. Furthermore, two or more components shown in FIG. 14 may be implemented within a single component, or a single component shown in FIG. 14 may be implemented as multiple distributed components. Additionally, or alternatively, a set of components shown in FIG. 14 may perform one or more functions that are described as being performed by another set of components shown in FIG. 14.
[0173]
[0188] FIG. 15 is a diagram of an example apparatus 1500 for wireless communication in accordance with the present disclosure. The apparatus 1500 may be a first device, or the first device may include the apparatus 1500. In some aspects, the first device may be a server, a UE server 1005, a network server 1010, a UE 120, and / or a network node 110. In some aspects, the apparatus 1500 includes a receiving component 1502 and a transmitting component 1504 that may communicate with each other (e.g., via one or more buses and / or one or more other components). As shown, the apparatus 1500 may communicate with another apparatus 1506 (such as a UE, a base station, or another wireless communication device) using the receiving component 1502 and the transmitting component 1504. As further shown, the apparatus 1500 may include a communications manager 150. The communications manager 150 may include one or more of a model training component 1508 and / or a function generation component 1510, among other examples.
[0174]
[0189] In some aspects, apparatus 1500 may be configured to perform one or more operations described herein with respect to FIGS. 10 and 11. Additionally or alternatively, apparatus 1500 may be configured to perform one or more processes described herein, such as process 1300 of FIG. 13, or a combination thereof. In some aspects, apparatus 1500 and / or one or more components shown in FIG. 15 may include one or more components of the first device described with respect to FIG. 2. Additionally or alternatively, one or more components shown in FIG. 15 may be implemented within one or more components described in connection with FIG. 2. Additionally or alternatively, one or more components of the set of components may be implemented at least in part as software stored in memory. For example, a component (or a portion of a component) may be implemented as instructions or code stored in a non-transitory computer-readable medium and executable by a controller or processor to perform the function or operation of the component.
[0175]
[0190] The receiving component 1502 may receive communications, such as reference signals, control information, data communications, or a combination thereof, from the apparatus 1506. The receiving component 1502 may provide the received communications to one or more other components of the apparatus 1500. In some aspects, the receiving component 1502 may perform signal processing (such as filtering, amplification, demodulation, analog-to-digital conversion, demultiplexing, deinterleaving, demapping, equalization, interference cancellation, or decoding, among other examples) on the received communications and provide the processed signals to one or more other components of the apparatus 1500. In some aspects, the receiving component 1502 may include one or more antennas, a modem, a demodulator, a MIMO detector, a receive processor, a controller / processor, a memory, or a combination thereof, of the first device described with respect to FIG. 2.
[0176]
[0191] The transmitting component 1504 may transmit communications, such as reference signals, control information, data communications, or a combination thereof, to the apparatus 1506. In some aspects, one or more other components of the apparatus 1500 may generate a communication and provide the generated communication to the transmitting component 1504 for transmission to the apparatus 1506. In some aspects, the transmitting component 1504 may perform signal processing (such as filtering, amplification, modulation, digital-to-analog conversion, multiplexing, interleaving, mapping, or encoding, among other examples) on the generated communication and transmit the processed signal to the apparatus 1506. In some aspects, the transmitting component 1504 may include one or more antennas, a modem, a modulator, a transmit MIMO processor, a transmit processor, a controller / processor, a memory, or a combination thereof, of the first device described with respect to FIG. 2. In some aspects, the transmitting component 1504 may be co-located with the receiving component 1502 within a transceiver.
[0177]
[0192] The model training component 1508 may train a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activation values associated with an output of the trained first model. The transmitting component 1504 may transmit, to a second device, a function associated with the trained first model, the function being configured to output the one or more activation values based on ground truth inputs.
[0178]
[0193] The transmitting component 1504 may transmit the trained first model to a UE or a network node after training the first model.
[0179]
[0194] The model training component 1508 may train a vector quantization model using the trained first model.
[0180]
[0195] The function generation component 1510 may generate a function based at least in part on training the first model.
[0181]
[0196] The number and arrangement of components shown in FIG. 15 are provided as an example. In practice, there may be additional components, fewer components, different components, or differently arranged components than those shown in FIG. 15. Furthermore, two or more components shown in FIG. 15 may be implemented within a single component, or a single component shown in FIG. 15 may be implemented as multiple distributed components. Additionally, or alternatively, a set of components shown in FIG. 15 may perform one or more functions that are described as being performed by another set of components shown in FIG. 15.
[0182]
[0197] The following provides a summary of several aspects of the disclosure.
[0183]
[0198] Aspect 1: A method of wireless communication performed by a first device, the method including: receiving from a second device a function associated with a trained first model, the function configured to output one or more gradients associated with the trained first model; and training a second model based on selecting one or more weights associated with the second model using the one or more gradients, the one or more gradients being obtained based on inputting one or more activation values and one or more inputs to the function. This allows the trained first model and the second model to be trained using forward and back propagation paths within the same training loop (e.g., in a manner similar to Type 2 training), while also being trained sequentially and / or separately.
[0184]
[0199] Aspect 2: The method of aspect 1, further comprising transmitting the second model to a user equipment (UE) or a network node after training the second model, which provides increased flexibility regarding when training occurs.
[0185]
[0200] Aspect 3: The method of aspect 1 or 2, wherein the second model is configured to output compressed channel state information (CSI), the one or more activation values including the compressed CSI, and the trained first model is configured to output CSI from an input of the compressed CSI, the one or more inputs including the CSI. This improves the accuracy of the CSI compression model (e.g., by training the models in the same loop for forward propagation and backpropagation).
[0186]
[0201] Aspect 4: The method of any one of aspects 1 to 3, further comprising training a vector quantization model using the one or more gradients.
[0187]
[0202] Aspect 5: The method of any one of aspects 1 to 3, wherein the function is configured to perform vector quantization associated with the output of the function.
[0188]
[0203] Aspect 6: The method of any one of aspects 1 to 5, wherein the function is associated with a plurality of trained first models, and training the second model includes providing an identifier associated with the trained first model as an input to the function. This allows a single function to simulate forward and backward propagation paths for multiple trained models, thereby saving resources that would otherwise be used to configure, transmit, and / or use multiple functions for the multiple trained models.
[0189]
[0204] Aspect 7: The method of aspect 6, wherein training the second model further includes training the second model to be configured to operate with each of the multiple trained first models. This allows the second model to be trained to operate with the multiple trained models, thereby saving resources that would otherwise be used to configure, transmit, and / or use multiple models for the multiple trained models.
[0190]
[0205] Aspect 8: The method of aspect 6, wherein training the second model further includes training a plurality of second models, including the second model, each configured to operate with a respective trained first model from the plurality of trained first models.
[0191]
[0206] Aspect 9: The method of any one of aspects 1 to 8, wherein the function is a first function, and the method further includes receiving, from a third device, instructions for a second function associated with another trained first model, and training the second model includes training the second model using the first function and the second function.
[0192]
[0207] Aspect 10: The method of any one of aspects 1 to 9, wherein the function is an application programming interface (API).
[0193]
[0208] Aspect 11: The method of any one of aspects 1 to 10, wherein the first device is a server associated with a user equipment (UE), the trained first model is a decoder model, and the second model is an encoder model.
[0194]
[0209] Aspect 12: The method of any one of aspects 1 to 10, wherein the first device is a server associated with a network node, the trained first model is an encoder model, and the second model is a decoder model.
[0195]
[0210] Aspect 13: The method of any one of aspects 1 to 10, wherein the first device is a user equipment (UE) or a network node.
[0196]
[0211] Aspect 14: A method of wireless communication performed by a first device, the method including: training the first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activation values associated with outputs of the trained first model; and transmitting to a second device a function associated with the trained first model, the function being configured to output one or more activation values based on ground truth inputs.
[0197]
[0212] Aspect 15: The method of aspect 14, further comprising transmitting the trained first model to a user equipment (UE) or a network node after training the first model.
[0198]
[0213] Aspect 16: The method of aspect 14 or 15, wherein the trained first model is configured to output compressed channel state information (CSI) or to output CSI from an input of compressed CSI.
[0199]
[0214] Aspect 17: The method of any one of aspects 14 to 16, further comprising training a vector quantization model using the trained first model.
[0200]
[0215] Aspect 18: The method of any one of aspects 14 to 16, wherein the function is configured to perform vector quantization associated with the output of the function.
[0201]
[0216] Aspect 19: The method of any one of aspects 14 to 18, wherein the function is an application programming interface (API).
[0202]
[0217] Aspect 20: The method of any one of aspects 14 to 19, wherein the first device is a server associated with a network node, the first model is a decoder model, and the second device is associated with a user equipment (UE) and an encoder model.
[0203]
[0218] Aspect 21: The method of any one of aspects 14 to 19, wherein the first device is a server associated with a user equipment (UE), the first model is an encoder model, and the second device is associated with a network node and a decoder model.
[0204]
[0219] Aspect 22: The method of any one of aspects 14 to 19, wherein the first device is a network node or a user equipment (UE).
[0205]
[0220] Aspect 23: An apparatus for wireless communication in a device, comprising: one or more processors; one or more memories coupled to the one or more processors; and instructions stored in the one or more memories, the instructions being executable by the one or more processors to cause the apparatus to perform one or more methods of aspects 1-13.
[0206]
[0221] Aspect 24: A device for wireless communication, comprising: one or more memories; and one or more processors coupled to the one or more memories, the one or more processors configured to perform one or more methods of aspects 1 to 13.
[0207]
[0222] Aspect 25: An apparatus for wireless communication, comprising at least one means for performing one or more of the methods of aspects 1-13.
[0208]
[0223] Aspect 26: A non-transitory computer-readable medium having stored thereon code for wireless communications, the code including instructions executable by one or more processors to perform one or more of the methods of aspects 1-13.
[0209]
[0224] Aspect 27: A non-transitory computer-readable medium storing a set of instructions for wireless communication, the set of instructions including one or more instructions that, when executed by one or more processors of a device, cause the device to perform one or more methods of aspects 1-13.
[0210]
[0225] Aspect 28: An apparatus for wireless communication in a device, comprising: one or more processors; one or more memories coupled to the one or more processors; and instructions stored in the one or more memories, the instructions being executable by the one or more processors to cause the apparatus to perform one or more methods of aspects 14 to 22.
[0211]
[0226] Aspect 29: A device for wireless communication, comprising: one or more memories; and one or more processors coupled to the one or more memories, the one or more processors configured to perform one or more methods of aspects 14 to 22.
[0212]
[0227] Aspect 30: An apparatus for wireless communication, comprising at least one means for performing one or more of the methods of aspects 14-22.
[0213]
[0228] Aspect 31: A non-transitory computer-readable medium storing code for wireless communications, the code including instructions executable by one or more processors to perform one or more of the methods of aspects 14-22.
[0214]
[0229] Aspect 32: A non-transitory computer-readable medium storing a set of instructions for wireless communication, the set including one or more instructions that, when executed by one or more processors of a device, cause the device to perform one or more of the methods of aspects 14-22.
[0215]
[0230] The above disclosure provides illustration and description, but is not intended to be exhaustive or to limit the embodiments to the precise form disclosed. Modifications and variations may be made in light of the above disclosure or may be acquired from practice of the embodiments.
[0216]
[0231] As used herein, the term "component" is intended to be broadly construed as hardware and / or combinations of hardware and software. "Software" is intended to be broadly construed to mean, among other examples, instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, and / or functions, whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. As used herein, a "processor" is implemented in hardware and / or a combination of hardware and software. It will be apparent that the systems and / or methods described herein can be implemented in various forms of hardware and / or combinations of hardware and software. The actual specialized control hardware or software code used to implement these systems and / or methods is not limiting. Thus, the operation and behavior of the systems and / or methods are described herein without reference to specific software code, with the understanding that those skilled in the art will be able to design software and hardware to implement the systems and / or methods based at least in part on the description herein.
[0217]
[0232] As used herein, "meeting a threshold" can refer to a value being greater than the threshold, greater than or equal to the threshold, less than the threshold, less than or equal to the threshold, equal to the threshold, not equal to the threshold, etc., depending on the context.
[0218]
[0233] Although particular combinations of features are recited in the claims and / or disclosed herein, those combinations are not intended to limit the disclosure of various aspects. Many of these features can be combined in ways not specifically recited in the claims and / or disclosed herein. The disclosure of various aspects includes each dependent claim in combination with every other claim in the claim set. As used herein, phrases referring to "at least one of" a list of items refer to any combination of those items, including single members. As an example, "at least one of a, b, or c" is intended to encompass a, b, c, a+b, a+c, b+c, and a+b+c, as well as any combination having multiple identical elements (e.g., a+a, a+a+a, a+a+b, a+a+c, a+b+b, a+c+c, b+b, b+b+b, b+b+c, c+c, and c+c+c, or any other permutation of a, b, and c).
[0219]
[0234] No element, act, or instruction used herein should be construed as essential or required unless expressly described as such. Also, as used herein, the articles "a" and "an" are intended to include one or more items and may be used interchangeably with "one or more." Furthermore, as used herein, the article "the" is intended to include one or more items referred to in connection with the article "the" and may be used interchangeably with "one or more." Furthermore, as used herein, the terms "set" and "group" are intended to include one or more items and may be used interchangeably with "one or more." Where only one item is intended, the phrase "only one" or similar language is used. Also, as used herein, terms such as "has," "have," and "having" are intended to be open-ended terms that do not limit the elements they modify (e.g., an element that "has" A can also have B). Furthermore, the phrase "based on" is intended to mean "based at least in part on," unless expressly stated otherwise. As used herein, the term "or" is also intended to be inclusive when used in a series, and may be used interchangeably with "and / or," except where expressly stated otherwise (e.g., when used in combination with "either" or "only one of").
Claims
1. A first device for wireless communication, comprising: one or more memories; and one or more processors coupled to the one or more memories, the one or more processors configured to: receive, from a second device, a function associated with a trained first model, the function being configured to output one or more gradients associated with the trained first model; and train a second model based on selecting one or more weights associated with the second model using the one or more gradients, the one or more gradients being obtained based on inputting one or more activation values and one or more inputs into the function.
2. The first device according to claim 1, wherein the one or more processors are further configured to transmit the second model to a user equipment (UE) or a network node after training the second model.
3. The first device according to claim 1, wherein the second model is configured to output compressed channel state information (CSI), the one or more activation values comprise the compressed CSI, the trained first model is configured to output CSI from an input of the compressed CSI, and the one or more inputs comprise the CSI.
4. The first device according to claim 1, wherein the one or more processors are further configured to train a vector quantization model using the one or more gradients.
5. The first device according to claim 1, wherein the function is configured to perform a vector quantization associated with the output of the function.
6. The first device according to claim 1, wherein the function is associated with a plurality of trained first models, and wherein the one or more processors, to train the second model, are configured to provide an identifier associated with the trained first model as an input to the function.
7. The first device according to claim 6, wherein the one or more processors, to train the second model, are configured to perform one of: training the second model to be configured to operate with each of the plurality of trained first models; or training a plurality of second models, including the second model, to be configured to operate with each of the plurality of trained first models.
8. The first device according to claim 1, wherein the function is a first function, wherein the one or more processors are further configured to receive, from a third device, an indication of a second function associated with another trained first model, and wherein the one or more processors, to train the second model, are configured to train the second model using the first function and the second function.
9. The first device according to claim 1, wherein the function is an application programming interface (API).
10. The first device according to claim 1, wherein the first device is a server associated with a user equipment (UE), the trained first model is a decoder model, and the second model is an encoder model.
11. The first device according to claim 1, wherein the first device is a user equipment (UE) or a network node.
12. The first device according to claim 1, wherein the function is configured to simulate the forward and backpropagation paths of the trained first model based on the one or more gradients.
13. A first device for wireless communication, comprising: one or more memories; and one or more processors coupled to the one or more memories, the one or more processors configured to: train a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activation values associated with an output of the trained first model; and transmit, to a second device, a function that is associated with the trained first model and is configured to output one or more activation values based on a ground truth input.
14. A method of wireless communication performed by a first device, comprising: receiving, from a second device, a function that is associated with a trained first model and is configured to output one or more gradients associated with the trained first model; and training a second model based on selecting one or more weights associated with the second model using the one or more gradients, the one or more gradients being obtained based on inputting one or more activation values and one or more inputs into the function.
15. A method of wireless communication performed by a first device, comprising: training a first model based on one or more inputs to obtain a trained first model, the trained first model being associated with one or more activation values associated with an output of the trained first model; and transmitting, to a second device, a function that is associated with the trained first model and is configured to output one or more activation values based on a ground truth input.
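Illustrative sketch (not part of the application). The training flow recited in claims 1 and 14 can be pictured with a minimal Python example: one side holds a trained first model (here a decoder) and hands over only a function that returns gradients, and the other side uses those gradients to select the weights of a second model (here an encoder). Everything specific in the sketch is an assumption made for illustration: the models are linear, the loss is a mean squared error, the update rule is plain gradient descent, and the names make_gradient_function, train_encoder, and mean_reconstruction_error are hypothetical. Following claim 3, the inputs play the role of CSI and the activation values play the role of compressed CSI.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def make_gradient_function(decoder_weights):
    """Second-device side: wrap a frozen, already trained linear decoder.

    The returned function exposes only gradients, never the decoder weights.
    Assumption: the loss is the mean squared error between the decoder output
    and the ground-truth input.
    """
    def gradient_function(activations, ground_truth):
        reconstruction = matvec(decoder_weights, activations)
        error = [r - t for r, t in zip(reconstruction, ground_truth)]
        scale = 2.0 / len(error)
        # Gradient of the loss with respect to each activation value.
        return [
            scale * sum(decoder_weights[i][j] * error[i] for i in range(len(error)))
            for j in range(len(activations))
        ]
    return gradient_function


def train_encoder(gradient_function, samples, code_dim, lr=0.1, epochs=200, seed=0):
    """First-device side: select encoder weights using the received function."""
    rng = random.Random(seed)
    in_dim = len(samples[0])
    encoder = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(code_dim)]
    for _ in range(epochs):
        for x in samples:
            z = matvec(encoder, x)            # activation values (compressed input)
            grad_z = gradient_function(z, x)  # gradients obtained from the function
            # Chain rule for a linear encoder: dL/dE[j][i] = grad_z[j] * x[i].
            for j in range(code_dim):
                for i in range(in_dim):
                    encoder[j][i] -= lr * grad_z[j] * x[i]
    return encoder


def mean_reconstruction_error(decoder_weights, encoder_weights, samples):
    """Evaluation helper only; it needs both models, so it is not part of the flow."""
    total = 0.0
    for x in samples:
        reconstruction = matvec(decoder_weights, matvec(encoder_weights, x))
        total += sum((r - t) ** 2 for r, t in zip(reconstruction, x)) / len(x)
    return total / len(samples)


if __name__ == "__main__":
    # Toy decoder that expands a 2-value code into a 4-value vector.
    decoder = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    samples = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0],
               [1.0, 1.0, 1.0, 1.0], [0.5, -0.5, 0.5, -0.5]]
    function = make_gradient_function(decoder)
    encoder = train_encoder(function, samples, code_dim=2)
    print(mean_reconstruction_error(decoder, encoder, samples))
```

In this sketch the encoder side never sees the decoder weights; it only calls the received function with activation values and inputs, which is one way to read the separation between the two devices in the claims. A nonlinear model or a different loss would change the gradient expressions but not the overall exchange.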