Model training method and communication apparatus

The terminal side receives the encoding model output from the network side, adjusts the model input to reconstruct it, and combines the reconstructed input with training data information for model training. This addresses the data transmission overhead in dual-end AI CSI feedback, making model training more efficient and reducing power consumption.

WO2026124349A1 · PCT designated stage · Publication Date: 2026-06-18 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
HUAWEI TECH CO LTD
Filing Date
2025-12-04
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

In dual-end AI CSI feedback, how can we reduce the overhead of data transmission from the network side to the terminal side to reduce repetitive training and power consumption on the terminal side?

Method used

The terminal side receives the output of the encoding model from the network side and, by adjusting the input according to a data generation strategy, reconstructs the input of the encoding model. It then trains the model using the reconstructed input together with information related to the training data. The network side transmits only a portion of the CSI feedback information, which reduces data transmission overhead.

Benefits of technology

It effectively reduces the data transmission overhead from the network side to the terminal side, avoids repeated training and power consumption on the terminal side, and improves the efficiency and performance of model training.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN2025140097_18062026_PF_FP_ABST

Abstract

The present application provides a model training method and a communication apparatus. In the method, when dual-end model connection is completed between a terminal side and a network side, if the terminal side performs further training (or enhanced development) on an encoding model of the terminal side, the network side transmits only some pieces of CSI feedback information that is determined on the basis of a data set of a model of the network side. Compared with a network side transmitting a data set of original CSI, overhead of data transmission can be reduced. In addition, repeated training of the terminal side caused by the network side transmitting the data set of the original CSI and the power consumption of the terminal side caused thereby can also be avoided.

Description

A method for training a model and a communication device

[0001] This application claims priority to Chinese Patent Application No. 202411846051.6, filed on December 13, 2024, entitled "A Method and Communication Apparatus for Training a Model", the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of artificial intelligence (AI), and more specifically, to a method for training a model and a communication device.

Background Technology

[0003] In AI-based channel state information (CSI) feedback, to support the integration and training of encoding or decoding models in dual-end AI CSI feedback, training datasets are needed for either the terminal or network side for model training. In one implementation, an auto-encoder (AE) model is deployed on the network side. After the network side completes the training of the AE model, it transmits the dataset to the terminal side for encoding model training, thus completing the integration of the dual-end models on both the network and terminal sides.

[0004] Therefore, in the enhanced development of the encoding model on the terminal side after the dual-end model is connected, how to reduce the overhead of data transmission from the network side to the terminal side is an urgent problem to be solved.

Summary of the Invention

[0005] This application provides a method and communication device for training a model, which can reduce the overhead of transmitting data from the network side to the terminal side during model training on the terminal side after the two-end models are connected.

[0006] In a first aspect, a method for training a model is provided. The method is executed by a communication device or by a module for the communication device (e.g., a processor, chip, circuit, AI entity, etc., or a logic module, hardware, and / or software capable of implementing all or part of the functions of the communication device), where the communication device corresponds to the terminal-side device in the method embodiments. The method includes: receiving a first output of a first encoding model, wherein the first encoding model is matched with a first decoding model, and the first output includes compressed information of channel information; obtaining a first input of the first encoding model based on the first output and the first encoding model, wherein the first input of the first encoding model corresponds to a second output of the first encoding model, and the error between the second output and the first output is less than a threshold; and obtaining a second encoding model based on the first input.

[0007] In the technical solution of this application, during model training on the terminal side, the network side only transmits a portion of the CSI feedback information to the terminal side. Compared to transmitting the original CSI dataset from the network side, this reduces data transmission overhead. Furthermore, it avoids redundant training on the terminal side caused by the network side transmitting the original CSI dataset, and the resulting power consumption on the terminal side.
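As a rough illustration of the saving described above, the following sketch compares the payload of an original CSI dataset with the payload of compressed encoder outputs. Every figure in it (sample count, antenna ports, subbands, bit widths, code length) is a hypothetical assumption chosen only for the arithmetic and is not a value taken from this application.

```python
def payload_bits(num_samples, values_per_sample, bits_per_value):
    """Total number of bits needed to transmit a dataset."""
    return num_samples * values_per_sample * bits_per_value

NUM_SAMPLES = 10_000                      # hypothetical dataset size

# Hypothetical original CSI sample: 32 ports x 13 subbands, real and
# imaginary parts, each stored as a 32-bit float.
raw_bits = payload_bits(NUM_SAMPLES, 32 * 13 * 2, 32)

# Hypothetical compressed encoder output: a 64-bit code per sample.
compressed_bits = payload_bits(NUM_SAMPLES, 64, 1)

reduction_factor = raw_bits / compressed_bits
print(f"raw: {raw_bits} bits, compressed: {compressed_bits} bits, "
      f"reduction: {reduction_factor:.0f}x")
```

Under these assumed sizes the compressed outputs are several hundred times smaller than the original dataset; the actual ratio depends on the CSI dimensions and the encoder output length in a given deployment.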

[0008] In conjunction with the first aspect, in some implementations of the first aspect, obtaining the first input of the first encoding model based on the first output and the first encoding model includes: obtaining the first input based on the first output, the first encoding model, and a data generation strategy; the data generation strategy includes: keeping the model parameters of the first encoding model unchanged, adjusting the input of the first encoding model to determine the input of the first encoding model when the error between the output of the first encoding model and a given label is less than the threshold; wherein the label corresponds to the first output of the first encoding model; and, during the adjustment of the input of the first encoding model, when the error between the output of the first encoding model and the label is less than the threshold, the input corresponding to the output of the first encoding model is the first input.

[0009] In this implementation, the terminal side reconstructs the input of the first encoding model (i.e., the first input) using a data generation strategy and the first output of the first encoding model received from the network side. The first output sent by the network side serves as the label for the output of the first encoding model in the data generation strategy. The model parameters of the first encoding model remain unchanged. The input of the first encoding model is adjusted until the error between the output of the first encoding model and the first output received from the network side is less than a threshold, which yields the reconstructed input of the first encoding model. This input is then used to train the second encoding model on the terminal side.
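The data generation strategy can be sketched as gradient descent on the input while the encoder parameters stay fixed. The example below is a minimal sketch under stated assumptions: the encoder form (`tanh` of a small fixed matrix `W`), the squared-error measure, the zero starting point, the learning rate, and the threshold are all hypothetical choices, and the application does not prescribe a particular optimizer.

```python
import math

# Hypothetical frozen first encoding model: z = tanh(W x), 4 values -> 2.
W = [[0.5, -0.3, 0.8, 0.1],
     [-0.2, 0.7, 0.4, -0.6]]

def encode(x):
    """First encoding model; its parameters W are never updated here."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def squared_error(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def reconstruct_input(label, threshold=1e-8, lr=0.5, max_steps=10_000):
    """Adjust only the input until the encoder output is close to the label."""
    x = [0.0] * len(W[0])                 # starting point is an assumption
    for _ in range(max_steps):
        out = encode(x)
        if squared_error(out, label) < threshold:
            break
        # Gradient of the squared error with respect to the input:
        # sum_i 2 * (out_i - label_i) * (1 - out_i^2) * W[i][j]
        delta = [2 * (o - l) * (1 - o * o) for o, l in zip(out, label)]
        grad = [sum(d * row[j] for d, row in zip(delta, W))
                for j in range(len(x))]
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x

# first_output stands in for what the network side sends (the label).
first_output = encode([0.2, -0.1, 0.4, 0.3])
first_input = reconstruct_input(first_output)
second_output = encode(first_input)
print(squared_error(second_output, first_output) < 1e-8)
```

Because the encoder compresses, many inputs map to nearly the same output, so the reconstructed input need not equal the sample the network side used. The condition in the text only requires the error between the second output and the first output to fall below the threshold.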

[0010] In conjunction with the first aspect, in some implementations of the first aspect, the method further includes: receiving information related to the training data of the first encoding model, wherein the first output is obtained based on the training data.

[0011] In this implementation, the network side provides the terminal side with information related to the training data of the first encoding model (or AE model), such as the mean and / or variance of the training dataset used by the network side to train the first encoding model. Since the training of the first encoding model on the network side utilizes this information related to the training data, which is extracted from the AE model (or the first encoding model) on the network side, providing this information to the terminal side helps the terminal side to better reconstruct the input of the first encoding model, thereby improving the performance of the trained second encoding model.
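A minimal sketch of the statistics mentioned above, computed per dimension over a toy dataset. The dataset values and the function name `training_data_stats` are hypothetical, and using the mean as the starting point for input reconstruction is only one possible use that is assumed here for illustration; the text does not specify how the terminal side applies the statistics.

```python
from statistics import fmean, pvariance

def training_data_stats(dataset):
    """Per-dimension mean and population variance of equal-length samples."""
    columns = list(zip(*dataset))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]
    return means, variances

# Toy stand-in for the network-side training set: 3 samples, 2 dimensions.
dataset = [[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]
mean, variance = training_data_stats(dataset)

# Assumed terminal-side use: start the input search at the dataset mean
# instead of at an arbitrary point.
initial_input = list(mean)
print(mean, variance)
```

Only two short vectors are exchanged regardless of how many samples the training set contains.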

[0012] In conjunction with the first aspect, in some implementations of the first aspect, obtaining the second encoding model based on the first input includes: obtaining the second encoding model based on the first input and the information related to the training data.

[0013] In this implementation, the terminal side trains the model based on the reconstructed CSI information (i.e., the first input) and the information related to the training data of the first encoding model provided by the network side, to obtain the second encoding model.

[0014] In conjunction with the first aspect, in some implementations of the first aspect, the method further includes: receiving first information, the first information indicating the data generation strategy.

[0015] In this implementation, the process of reconstructing the encoding model input on the terminal side can be based on a data generation strategy. Optionally, this data generation strategy can be indicated to the terminal side by the network side.

[0016] In conjunction with the first aspect, in some implementations of the first aspect, obtaining the second encoding model based on the first input includes: obtaining a second decoding model based on the first input and the first encoding model, wherein the output of the second decoding model corresponds to the first input of the first encoding model, and the input of the second decoding model is the output of the first encoding model when the first input is used as the input of the first encoding model; obtaining the second encoding model based on the second decoding model, wherein the second encoding model matches the first decoding model.

[0017] In this implementation, the terminal side first trains a decoder (the second decoding model) based on the reconstructed input of the encoding model, then freezes the decoder and trains the second encoding model on the terminal side.
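The two training stages can be sketched with scalar linear models. This is a toy illustration under stated assumptions: the first encoding model is taken to be `z = 0.5 * x`, the reconstructed inputs are made-up numbers, and plain gradient descent on the mean squared error stands in for whatever training procedure is actually used. The names `first_encoder` and `fit_scale` are hypothetical.

```python
def first_encoder(x):
    """Frozen first encoding model (hypothetical scalar stand-in)."""
    return 0.5 * x

def fit_scale(pairs, lr=0.1, steps=500):
    """Fit y ~ w * x over (x, y) pairs by gradient descent on the MSE."""
    w = 0.0
    n = len(pairs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in pairs) / n
        w -= lr * grad
    return w

first_inputs = [0.5, -1.0, 1.5, 2.0]      # reconstructed inputs (made up)

# Stage 1: train the second decoding model. Its input is the first encoder's
# output for each reconstructed input, and its target is that input.
decoder_pairs = [(first_encoder(x), x) for x in first_inputs]
second_decoder = fit_scale(decoder_pairs)

# Stage 2: freeze the decoder and train the second encoding model so that
# decoder(encoder(x)) ~ x. With a frozen scalar decoder d this is the same
# as fitting x ~ e * (d * x), so the helper above can be reused.
encoder_pairs = [(second_decoder * x, x) for x in first_inputs]
second_encoder = fit_scale(encoder_pairs)

print(second_decoder, second_encoder)
```

In this reading, the frozen second decoding model acts as a terminal-side stand-in for the first decoding model, which is why an encoder trained against it is expected to match the network-side decoder. In the scalar toy case the second encoder simply recovers the first encoder's weight; with real models the two encoders may differ in structure.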

[0018] In conjunction with the first aspect, in some implementations of the first aspect, obtaining the second encoding model based on the first input includes: obtaining a third output based on the first input and the first encoding model; obtaining the second encoding model, wherein the input of the second encoding model includes the first input, the output of the second encoding model includes the third output, and the second encoding model matches the first decoding model.

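[0019] In this implementation, the terminal side trains the second encoding model directly on the reconstructed input of the encoding model, using the output of the first encoding model for that input (the third output) as the training target.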
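This direct approach resembles supervised fitting of a new encoder to input and output pairs produced by the first encoder. The sketch below assumes a scalar first encoding model `z = 0.5 * x`, made-up reconstructed inputs, and a second encoder of the form `w * x + b` trained by gradient descent; all of these are hypothetical simplifications.

```python
def first_encoder(x):
    """Frozen first encoding model (hypothetical scalar stand-in)."""
    return 0.5 * x

def train_second_encoder(inputs, targets, lr=0.1, steps=500):
    """Fit z ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(steps):
        errors = [w * x + b - z for x, z in zip(inputs, targets)]
        grad_w = sum(2 * e * x for e, x in zip(errors, inputs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

first_inputs = [0.5, -1.0, 1.5, 2.0]                  # reconstructed inputs
third_outputs = [first_encoder(x) for x in first_inputs]

# The second encoding model takes the first input and is trained to
# reproduce the third output.
w2, b2 = train_second_encoder(first_inputs, third_outputs)
print(w2, b2)
```

Since the second encoder learns to emit the same codes as the first encoder, its output remains compatible with the first decoding model without a decoder being trained on the terminal side.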

[0020] In conjunction with the first aspect, in some implementations of the first aspect, the method further includes: receiving the first encoding model.

[0021] In this implementation, the terminal obtains the first encoding model by receiving it from the network side, for example by receiving the model file and / or model parameters. The training of the second encoding model can be an enhancement of the first encoding model. Provided the second encoding model meets the performance requirements of the network side, the terminal can deploy it instead of the first encoding model provided by the network side. This allows the terminal to deploy a suitable model based on performance monitoring, rather than being limited to the model provided by the network side, which is especially useful when that model is not well suited to the terminal.

[0022] In a second aspect, a method for training a model is provided. The method is executed by a communication device or by a module for the communication device (e.g., a processor, chip, circuit, AI entity, etc., or a logic module, hardware, and / or software capable of implementing all or part of the functions of the communication device), wherein the communication device may correspond to the network-side device in the method embodiments. The method includes: acquiring a first encoding model and a first decoding model, wherein the first encoding model and the first decoding model are matched; and sending a first output of the first encoding model, wherein the first output is used to determine a second encoding model, and the input of the second encoding model is determined based on the first output.

[0023] In conjunction with the second aspect, in some implementations of the second aspect, the method further includes: sending information related to the training data of the first encoding model, wherein the first output is obtained based on the training data.

[0024] In conjunction with the second aspect, in some implementations of the second aspect, the method further includes: sending first information, the first information indicating a data generation strategy, the data generation strategy and the first output being used to determine the input of the second encoding model.

[0025] In conjunction with the second aspect, in some implementations of the second aspect, the method further includes: sending the first encoding model.

[0026] In some implementations of the first or second aspect, the information related to the training data of the first encoding model includes at least one of the mean and variance of the training data.

[0027] In a third aspect, a communication device is provided, which has the function of implementing the method of the first aspect, or any possible implementation of the first aspect. The function can be implemented by hardware, by software, or by hardware executing corresponding software. The hardware or software includes one or more units corresponding to the above-described function.

[0028] In a fourth aspect, a communication device is provided, the communication device having the function of implementing the method of the second aspect, or any possible implementation of the second aspect. The function can be implemented by hardware, by software, or by hardware executing corresponding software. The hardware or software includes one or more units corresponding to the above-described function.

[0029] In a fifth aspect, a communication device is provided, comprising at least one processor configured to cause the communication device to execute a method of the first aspect or the third aspect, or any possible implementation thereof; or to execute a method of the second aspect or the fourth aspect, or any possible implementation thereof. Optionally, the at least one processor is coupled to at least one memory for storing a computer program or instructions, the at least one processor being configured to call and execute the computer program or instructions from the at least one memory, causing the communication device to execute a method of the first aspect or any possible implementation thereof; or to execute a method of the second aspect or any possible implementation thereof. Optionally, the at least one processor may be included in the communication device or configured externally to the communication device. Optionally, the communication device further includes the at least one memory. Optionally, the communication device further includes a communication interface.

[0030] In a sixth aspect, a communication device is provided, comprising a communication circuit and a processing circuit. The communication circuit is configured to receive a signal to be processed and transmit the signal to the processing circuit. The processing circuit is configured to process the signal to perform a method as described in the first aspect or any possible implementation thereof; or to perform a method as described in the second aspect or any possible implementation thereof. Optionally, the communication circuit is further configured to output the processed signal. As an example, the communication circuit may be a transceiver, hardware circuit, bus, module, pin, or other type of communication interface. The signal includes information and / or data. Optionally, the communication device may be a chip or a chip system.

[0031] A seventh aspect provides a computer-readable storage medium storing computer program code or instructions that, when executed on a computer, cause the method as described in the first aspect or any possible implementation thereof to be implemented; or, the method as described in the second aspect or any possible implementation thereof to be implemented.

[0032] In an eighth aspect, a computer program product is provided, the computer program product comprising computer program code or instructions which, when executed on a computer, cause the method in the first aspect or any possible implementation thereof to be implemented; or cause the method in the second aspect or any possible implementation thereof to be implemented.

[0033] A ninth aspect provides a wireless communication system, including a communication device as described in the third aspect and a communication device as described in the fourth aspect.

Attached Figure Description

[0034] Figure 1 is a schematic diagram of a communication system applicable to an embodiment of this application.

[0035] Figure 2 is another schematic diagram of a communication system applicable to an embodiment of this application.

[0036] Figure 3 is a schematic diagram of a possible application framework in a communication system.

[0037] Figure 4 is a schematic diagram of another possible application framework in a communication system.

[0038] Figure 5 is a schematic diagram of AI-CSI feedback based on the AE model.

[0039] Figure 6 is a schematic flowchart of pairing of two-end models based on dataset transmission.

[0040] Figure 7 is a schematic flowchart of pairing two-end models based on model passing.

[0041] Figure 8 is a schematic flowchart of the method 200 for training a model provided in this application.

[0042] Figure 9 is a schematic diagram of the process of reconstructing the input of the encoding model based on the output of the encoding model provided by the network side on the terminal side.

[0043] Figure 10 is a schematic diagram of one implementation of training the second encoding model on the terminal side.

[0044] Figure 11 is a schematic diagram of the complete process of training the second encoding model on the terminal side based on method 1.

[0045] Figure 12 is a schematic diagram of another implementation of training the second encoding model on the terminal side.

[0046] Figure 13 is a schematic diagram of the complete process of training the second encoding model on the terminal side based on method 2.

[0047] Figure 14 is a schematic diagram of the application of the technical solution of this application to the ORAN architecture.

[0048] Figure 15 is a schematic block diagram of the communication device 1000 provided in this application.

[0049] Figure 16 is a schematic block diagram of another communication device 1100 provided in this application.

[0050] Figure 17 is a schematic structural diagram of the chip provided in this application.

Detailed Implementation

[0051] The technical solutions in this application will now be described with reference to the accompanying drawings.

[0052] The technical solutions provided in this application can be applied to various communication systems, such as 5th generation (5G) or new radio (NR) systems, long term evolution (LTE) systems, LTE frequency division duplex (FDD) systems, LTE time division duplex (TDD) systems, wireless local area network (WLAN) systems, and satellite communication systems. Furthermore, they can be applied to device-to-device (D2D) communication, vehicle-to-everything (V2X) communication, machine-to-machine (M2M) communication, machine-type communication (MTC), as well as Internet of Things (IoT) communication systems, future communication systems, or integrated systems of multiple systems.

[0053] In a communication system, one network element can send signals to or receive signals from another network element. These signals can include information, signaling, or data. A network element can also be replaced by an entity, network entity, device, communication device, communication module, node, or communication node; this embodiment uses a device as an example. For instance, a communication system can include at least one terminal device and at least one network device. The network device can send downlink signals to the terminal device, and / or the terminal device can send uplink signals to the network device.

[0054] Figure 1 is a schematic diagram of a communication system applicable to an embodiment of this application. As shown in Figure 1, the communication system 100 may include at least one network device, such as network device 110 shown in Figure 1; the communication system 100 may also include at least one terminal device, such as terminal device 120 and terminal device 130 shown in Figure 1. Network device 110 and terminal devices (such as terminal devices 120 and 130) can communicate via a wireless link. The communication devices in this communication system, for example, network device 110 and terminal device 120, can communicate via multi-antenna technology.

[0055] Optionally, the communication system may also include at least one AI node.

[0056] Figure 2 is another schematic diagram of a communication system applicable to embodiments of this application. Compared to the communication system 100 shown in Figure 1, the communication system 100 shown in Figure 2 further includes an AI node 140. The AI node 140 is used to perform AI-related operations, such as building training datasets, training or inferring AI models, etc.

[0057] In one implementation, network device 110 can send data related to AI model training to AI node 140, whereby AI node 140 constructs a training dataset and trains the AI model. As an example, the data related to AI model training may include data reported by terminal devices. AI node 140 can send the results of AI model-related operations to network device 110, which then forwards them to the terminal devices. For example, the results of AI model-related operations may include at least one of the following: a trained AI model, model evaluation results, or test results. Exemplarily, a portion of the trained AI model may be deployed on network device 110, and another portion on the terminal devices. Optionally, the trained AI model may be deployed on network device 110, or it may be deployed on the terminal devices.

[0058] It should be understood that Figure 2 is only used as an example of AI node 140 being directly connected to network device 110. In other scenarios, AI node 140 can also be connected to terminal device. Alternatively, AI node 140 can be connected to both network device 110 and terminal device simultaneously. Alternatively, AI node 140 can also be connected to one or more of network device 110 and terminal device through a third-party network element. This application embodiment does not limit the connection relationship between AI network element and other network elements.

[0059] Alternatively, in another implementation, the AI node 140 can also be configured as a module in a network device and / or a terminal device, for example, in the network device 110, terminal device 120, or terminal device 130 shown in Figure 1.

[0060] It should be noted that Figures 1 and 2 are schematic diagrams for ease of understanding only. The communication system may also include other devices, such as wireless relay devices and / or wireless backhaul devices, which are not shown in Figures 1 and 2. Furthermore, in practical applications, the communication system may include multiple network devices or multiple terminal devices. This application does not limit the number of network devices and terminal devices.

[0061] In the embodiments of this application, the terminal device may also be referred to as user equipment (UE), access terminal, user unit, user station, mobile station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication device, user agent, or user apparatus. The terminal device can be a device that provides voice and / or data connectivity to a user, such as a handheld device or vehicle-mounted device with wireless connectivity. Currently, examples of terminals include: mobile phones, tablets, laptops, mobile internet devices (MIDs), wearable devices, virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, wireless terminals in self-driving, wireless terminals in remote medical surgery, wireless terminals in smart grids, wireless terminals in transportation safety, wireless terminals in smart cities, wireless terminals in smart homes, cellular phones, cordless phones, session initiation protocol (SIP) phones, wireless local loop (WLL) stations, personal digital assistants (PDAs), handheld devices with wireless communication capabilities, computing devices or other processing devices connected to wireless modems, terminal devices in 5G networks, or terminal devices in future communication systems, etc., and this application does not limit these examples.

[0062] In this embodiment, the device used to implement the functions of the terminal device can be the terminal device itself, or any device capable of supporting the terminal device in implementing corresponding functions, such as a processor, circuit, or chip. This device can be configured in the terminal device or used in conjunction with the terminal device. In this embodiment, the terminal device is used as an example to illustrate the function of the terminal device, and this does not constitute a limitation on the solution of this embodiment.

[0063] The network devices in this application embodiment may include radio access network (RAN) nodes that connect terminal devices to wireless networks, such as base stations. The term base station broadly covers, or can be replaced by, the following names: NodeB, evolved NodeB (eNB), next-generation NodeB (gNB), relay station, transmission and reception point (TRP), transmission point (TP), master station, secondary station, multi-standard radio (MSR) node, home base station, network controller, access node, wireless node, access point (AP), transmission node, transceiver node, baseband unit (BBU), remote radio unit (RRU), active antenna unit (AAU), remote radio head (RRH), central unit (CU), distributed unit (DU), radio unit (RU), etc. A base station can be a macro base station, micro base station, relay node, donor node, or a combination thereof. A base station can also refer to a communication module, modem, or chip installed within the aforementioned equipment or apparatus. A base station can also be a mobile switching center, equipment performing base station functions in D2D, V2X, and M2M communications, or equipment performing base station functions in future communication systems. A base station can support networks using the same or different access technologies. Optionally, a RAN node can also be a server, wearable device, vehicle, or in-vehicle equipment. For example, the access network equipment in vehicle-to-everything (V2X) technology can be a roadside unit (RSU). The embodiments of this application do not limit the specific technologies or equipment forms used in the network equipment.

[0064] Base stations can be fixed or mobile. For example, a helicopter or drone can be configured to act as a mobile base station, and one or more cells can move depending on the location of the mobile base station. In other examples, a helicopter or drone can be configured as a device to communicate with another base station.

[0065] In some deployments, the network devices mentioned in the embodiments of this application may be devices including CU, DU, or CU and DU, or devices with control plane CU nodes (central unit-control plane (CU-CP)) and user plane CU nodes (central unit-user plane (CU-UP)) and DU nodes. For example, the network devices may include gNB-CU-CP, gNB-CU-UP, and gNB-DU.

[0066] In some deployments, multiple RAN nodes collaborate to assist terminals in achieving wireless access, with different RAN nodes each implementing some of the base station's functions. For example, RAN nodes can be CUs, DUs, CU-CPs, CU-UPs, or RUs. CUs and DUs can be configured separately or included in the same network element, such as a BBU. RUs can be included in radio frequency equipment or radio frequency units, such as RRUs, AAUs, or RRHs.

[0067] In one possible design, the processing unit in the BBU used to implement baseband functions is called the baseband high (BBH) unit, and the processing unit in the RRU / AAU / RRH used to implement baseband functions is called the baseband low (BBL) unit.

[0068] In different systems, CU (or CU-CP and CU-UP), DU, or RU may have different names, but those skilled in the art will understand their meaning. For example, in an open radio access network (ORAN / O-RAN) system, CU can also be called O-CU (open CU), DU can also be called O-DU, CU-CP can also be called O-CU-CP, CU-UP can also be called O-CU-UP, and RU can also be called O-RU. Any of the units among CU (or CU-CP, CU-UP), DU, and RU in this application can be implemented through software modules, hardware modules, or a combination of software modules and hardware modules.

[0069] In this embodiment, the device used to implement the functions of the network device can be a network device itself; it can also be a device capable of supporting the network device in implementing corresponding functions, such as a processor, circuit, or chip. This device can be configured within the network device or used in conjunction with the network device. In this embodiment, the network device is used as an example to illustrate the function of the network device, and this does not constitute a limitation on the solutions described in this embodiment.

[0070] Network devices and / or terminal devices can be deployed on land, including indoors or outdoors, handheld or vehicle-mounted; they can also be deployed on water; and they can also be deployed in the air on airplanes, balloons, and satellites. This application does not limit the scenario in which the network devices and terminal devices are located. Furthermore, terminal devices and network devices can be hardware devices, or software functions running on dedicated hardware or general-purpose hardware, such as virtualization functions instantiated on a platform (e.g., a cloud platform), or entities that include dedicated or general-purpose hardware devices and software functions. This application does not limit the specific form of the terminal devices and network devices.

[0071] Optionally, the AI node can be deployed in one or more of the following locations within the communication system: access network equipment, terminal equipment, or core network equipment, etc. Alternatively, the AI node can be deployed independently, for example, in a location other than any of the aforementioned devices, such as in the host or cloud server of an over-the-top (OTT) system. The AI node can communicate with other devices in the communication system, which can be, for example, one or more of the following: access network equipment, terminal equipment, or core network equipment, etc.

[0072] This application does not limit the number of AI nodes. For example, when there are multiple AI nodes, they can be divided based on function, such as different AI nodes being responsible for different functions.

[0073] Optionally, AI nodes can be independent devices, integrated into the same device to implement different functions, or they can be network elements in hardware devices, software functions running on dedicated hardware, or virtualization functions instantiated on a platform (e.g., a cloud platform). This application does not limit the specific form of the AI nodes described above. AI nodes can also be called AI network elements or AI modules.

[0074] Figure 3 illustrates a possible application framework in a communication system. As shown in Figure 3, network elements in the communication system are connected via interfaces (e.g., NG, Xn) or air interfaces. These network elements, such as core network equipment, access network equipment (RAN nodes), terminals, or one or more devices in the operation administration and maintenance (OAM) system, are equipped with one or more AI modules. Access network equipment can be a single RAN node or can include multiple RAN devices, such as CUs and DUs. The CUs and / or DUs can also be equipped with one or more AI modules. Optionally, the CU can be further divided into CU-CP and CU-UP. One or more AI models are configured in the CU-CP and / or CU-UP.

[0075] The AI module is used to implement corresponding AI functions. AI modules deployed in different network elements can be the same or different. Depending on the parameter configuration, the AI module can implement different functions. The model of an AI module can be configured based on one or more of the following parameters: structural parameters (e.g., at least one of the following: number of neural network layers, neural network width, inter-layer connections, neuron weights, neuron activation function, or bias in the activation function), input parameters (e.g., type and / or dimension of input parameters), or output parameters (e.g., type and / or dimension of output parameters). The bias in the activation function can also be referred to as the neural network bias.

[0076] An AI module can have one or more models. A model can infer an output, which includes one or more parameters. The learning, training, or inference processes of different models can be deployed on different nodes or devices, or they can be deployed on the same node or device.

[0077] The network device can be a network device equipped with one or more AI modules. For example, the network device can be one or more devices in the core network, access network, or OAM as shown in Figure 3. The AI module can be the RAN intelligent controller (RIC) shown in Figure 4, such as a near real-time RIC or a non-real-time RIC. For example, a near real-time RIC is deployed in a RAN node (e.g., in a CU or DU), while a non-real-time RIC is deployed in the OAM, a cloud server, a core network device, or another network device.

[0078] Figure 4 illustrates another possible application framework in a communication system. As shown in Figure 4, the communication system includes a RAN intelligent controller (RIC). For example, the RIC could be the AI module in the RAN device shown in Figure 1, used to implement AI-related functions. RICs include near-real-time RICs (near-RT RICs) and non-real-time RICs (non-RT RICs). Non-real-time RICs primarily process non-real-time information, such as data that is not sensitive to latency, with latency on the order of seconds. Near-real-time RICs primarily process near-real-time information, such as data that is relatively sensitive to latency, with latency on the order of tens of milliseconds.

[0079] Near real-time RICs are used for model training and inference. For example, they are used to train AI models and then use those models for inference. Near real-time RICs can obtain network-side and / or terminal-side information from RAN devices (e.g., CU, CU-CP, CU-UP, DU, and / or RU) and / or terminal devices. This information can be used as training data or as data for inference.

[0080] Optionally, near real-time RIC can deliver inference results to RAN devices and / or terminal devices.

[0081] Optionally, inference results can be exchanged between CU and DU, and / or between DU and RU. For example, near real-time RIC submits inference results to DU, and DU sends them to RU.

[0082] Non-real-time RICs are also used for model training and inference. For example, they can be used to train AI models and then use those models for inference. Non-real-time RICs can obtain network-side and / or terminal-side information from RAN devices (e.g., CU, CU-CP, CU-UP, DU, and / or RU) and / or terminals. This information can be used as training data or as inference data, and the inference results can be delivered to RAN nodes and / or terminals.

[0083] Optionally, inference results can be exchanged between CU and DU, and / or between DU and RU. For example, a non-real-time RIC can submit inference results to DU, which in turn can send them to RU.

[0084] Near real-time RICs and non-real-time RICs can also be configured as separate devices. Alternatively, near real-time RICs and non-real-time RICs can also be part of other devices. For example, near real-time RICs can be configured in RAN nodes (e.g., CU, DU), while non-real-time RICs can be configured in OAM, cloud servers, core network devices, or other devices.

[0085] Optionally, the AI model can be implemented as hardware circuitry, software, or a combination of both, without limitation. Non-limiting examples of software include: program code, program, subroutine, instruction, instruction set, code, code segment, software module, application program, or software application.

[0086] The following section introduces some related technologies involved in the technical solution of this application.

[0087] 1. AI-CSI Feedback

[0088] In existing communication systems such as Long Term Evolution (LTE) and New Radio (NR), base stations acquire channel state information (CSI) to determine one or more of the following configurations for scheduling the UE's downlink data channel: resources, modulation and coding scheme (MCS), and precoding. In TDD systems, due to the reciprocity of uplink and downlink channels, the base station can obtain the uplink CSI by measuring the uplink reference signal and then infer a relatively accurate downlink CSI, for example, by using the uplink CSI as the downlink CSI. In FDD systems, uplink and downlink reciprocity cannot be guaranteed. The downlink CSI is obtained by the UE measuring a downlink reference signal, such as the CSI reference signal (CSI-RS) or the synchronization signal block (SSB), i.e., the synchronization signal / physical broadcast channel block (SS / PBCH block). Therefore, the UE needs to generate a CSI report according to a protocol-predefined method or the base station configuration and feed the CSI back to the base station so that the base station can obtain the downlink CSI.

[0089] In AI-based CSI (also known as AI-CSI) feedback, when the model is deployed at the base station, the base station obtains the CSI-RS estimation results at the terminal as labels for model training. An autoencoder (AE) model generally refers to a network structure composed of two sub-models, an encoder and a decoder, each of which can be an AI model. AE models are also called bilateral models, dual-end models, or collaborative models. The encoder and decoder of an AE are usually trained together and used in combination; together they can also be called a self-model. CSI feedback and reconstruction can be implemented based on AE models.

[0090] Figure 5 illustrates the AI-CSI feedback based on the AE model. As shown in Figure 5, the terminal side compresses the CSI using an encoder (i.e., the encoding model), while the network side recovers the CSI using a decoder (i.e., the decoding model). For the network side, the input to the model is the fed-back CSI, and the output is the recovered CSI; the model training uses the CSI measured by the terminal side as the label (or ground truth) of the recovered CSI.

[0091] Optionally, in Figure 5, the encoder input is the measured CSI, which can also be replaced with the "target CSI". In this implementation, the terminal side compresses the target CSI through the encoder, and the network side reconstructs the target CSI through the decoder.

[0092] Optionally, in one implementation, the encoder shown in Figure 5 may further include a quantizer, and the decoder may include a dequantizer. In this implementation, the encoder compresses and quantizes the CSI, and the decoder dequantizes and decompresses the feedback CSI, outputting the recovered CSI. For example, the terminal can use the target CSI (e.g., denoted as V) as input to the encoder, which can compress the target CSI to obtain the compressed CSI. Optionally, the terminal can quantize the compressed CSI to obtain the quantized CSI. The terminal can then send the quantized CSI as CSI feedback information (e.g., denoted as C) to the network side, for example, via a CSI report.

[0093] The network side can first dequantize the CSI feedback information to obtain compressed CSI with quantization loss. The network side can then use this compressed CSI as input to the decoder, which decompresses the compressed CSI to obtain the reconstructed CSI (e.g., represented as ...).
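
The compress, quantize, dequantize, and decompress chain described above can be sketched as follows. This is only an illustrative toy, not the model of this application: the encoder and decoder are stood in for by a hypothetical orthonormal projection `W` and its transpose, the CSI is real-valued, and the sizes `N`, `M`, the bit width `BITS`, and the uniform quantizer range are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 32, 8            # assumed sizes: target CSI length and compressed length
BITS = 4                # assumed number of bits per compressed element
LEVELS = 2 ** BITS

# Hypothetical stand-in for a trained encoder/decoder pair:
# W has orthonormal rows, so W @ W.T is the identity.
W = np.linalg.qr(rng.standard_normal((N, M)))[0].T   # shape (M, N)

def encode(v):
    """Compress the target CSI V into M values (compressed CSI)."""
    return W @ v

def quantize(z, lo=-1.0, hi=1.0):
    """Uniform scalar quantizer: map each value to an integer index."""
    step = (hi - lo) / (LEVELS - 1)
    return np.clip(np.round((z - lo) / step), 0, LEVELS - 1).astype(int)

def dequantize(idx, lo=-1.0, hi=1.0):
    """Map indices back to values (compressed CSI with quantization loss)."""
    step = (hi - lo) / (LEVELS - 1)
    return lo + idx * step

def decode(z):
    """Decompress to obtain the reconstructed CSI."""
    return W.T @ z

# Terminal side: toy target CSI V, then CSI feedback information C.
v = W.T @ rng.uniform(-0.9, 0.9, size=M)
c = quantize(encode(v))

# Network side: dequantize, then decompress.
v_hat = decode(dequantize(c))

print("feedback indices:", c)
print("reconstruction error:", np.linalg.norm(v - v_hat))
```

In this toy setting the only loss comes from quantization, so the reconstruction error is bounded by half a quantization step per compressed element.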

[0094] The quantizer used to perform quantization can be predefined, such as protocol predefined or network-side indication, and is not limited. The encoder on the terminal side can be deployed inside the terminal device or in other devices outside the terminal device, such as the aforementioned OTT server; the decoder on the network side can be deployed inside the network device or in other devices outside the network device, such as in intelligent network elements. Furthermore, this application does not limit the number of models included in the AE model. It is understood that the AE model is only one possible model for implementing the above functions and should not constitute any limitation on this application. The AE model can also be replaced with other AI models that can achieve the same or similar functions.

[0095] To support the pairing (or docking) and training of encoding or decoding models in dual-end AI CSI feedback, a dataset is needed for model training on either the terminal or network side. Taking the encoding model as an example, the dataset includes the input of the encoding model, the corresponding output of the encoding model, the quantization method, etc. The encoder input and its corresponding output are determined based on the dataset. The input is then fed into the encoder to be trained, yielding its actual output, and the error between the actual output and the encoder output included in the dataset is computed. If the error exceeds a threshold, the encoder is trained (or adjusted) until that error is less than the threshold.
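
A minimal sketch of the threshold-driven training loop described above, under stated assumptions: the encoder is a plain linear map, the dataset outputs are generated by a hypothetical reference encoder `W_ref`, the error is the mean squared error, and `THRESHOLD`, `LR`, and `MAX_STEPS` are illustrative values only.

```python
import numpy as np

rng = np.random.default_rng(1)

K, N, M = 200, 16, 4                               # assumed dataset and model sizes
W_ref = rng.standard_normal((M, N)) / np.sqrt(N)   # hypothetical reference encoder
X = rng.standard_normal((K, N))                    # encoder inputs in the dataset
C = X @ W_ref.T                                    # encoder outputs in the dataset

W = np.zeros((M, N))        # encoder to be trained (linear, for illustration)
THRESHOLD = 1e-4            # assumed error threshold
LR = 0.5                    # assumed learning rate
MAX_STEPS = 5000            # safety cap on the number of updates

def dataset_error(weights):
    """Mean squared error between actual outputs and dataset outputs."""
    return np.mean((X @ weights.T - C) ** 2)

steps = 0
while dataset_error(W) >= THRESHOLD and steps < MAX_STEPS:
    residual = X @ W.T - C                    # actual output minus dataset output
    grad = 2.0 * residual.T @ X / (K * M)     # gradient of the mean squared error
    W -= LR * grad                            # adjust the encoder
    steps += 1

print("steps:", steps, "final error:", dataset_error(W))
```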

[0096] Typical pairing of two-end models can include pairing based on dataset transfer or pairing based on model transfer, which will be explained below with reference to Figures 6 and 7 respectively.

[0097] 2. Pairing of two-end models based on dataset transmission

[0098] Dataset transfer mainly refers to the sender sending a dataset to the receiver for model training.

[0099] Figure 6 is a schematic flowchart of the pairing of two-end models based on dataset transmission. As shown in Figure 6, in stage 1, the network side can use a virtual CSI compression model to compress the target CSI, and then use a CSI reconstruction model to decompress the compressed CSI to obtain the reconstructed CSI. That is, the target CSI is the input of the CSI compression model, and the output of the CSI compression model can be the compressed CSI. This compressed CSI can be the input of the CSI reconstruction model, and the output of the CSI reconstruction model can be the reconstructed CSI. As introduced in Figure 5 above, the compressed CSI can be further quantized to obtain the quantized CSI. During the model training phase, both the compressed CSI and the quantized CSI can be used as CSI feedback information. The above process is the process of the network side jointly training the encoder and decoder locally, which can also be called joint training. After training is completed, the trained encoder and decoder are matched to form the network side's self-model. In order to send the encoder in the trained self-model to the terminal side to realize the pairing of the two-end models between the terminal side and the network side, the network side sends a dataset to the terminal side for training the CSI compression model on the terminal side. For example, the dataset sent from the network side to the terminal side may include one or more of the following: target CSI, CSI feedback information, or reconstructed CSI. Optionally, the dataset may also include a quantization method. In other words, the quantization method is also indicated by the network side. Optionally, the quantization method is predefined. The quantization method is used to determine how to quantize the compressed CSI and / or how to dequantize the quantized CSI. The quantization method may also be replaced by a dequantization method, a quantizer, or a dequantizer.

[0100] In stage 2, for example, the terminal side uses the received dataset to train a model, obtaining a CSI compression model on the terminal side. The CSI compression model trained on the terminal side is matched with the CSI reconstruction model in the self-model on the network side, thereby completing the pairing of the two-end models on the network side and the terminal side.

[0101] In the process shown in Figure 6, the network side transmits the dataset to the terminal side for training the terminal encoder, which is therefore called the pairing (or docking) of the two-end models based on the dataset transmission.

[0102] 3. Pairing of two-end models based on model transfer

[0103] Model transfer primarily refers to the sender sending model files and / or model parameters to the receiver for model development (e.g., model training) and deployment. In this application's embodiments, "sending a model" can refer to sending model files and / or model parameters, and "receiving a model" can refer to receiving model files and / or model parameters.

[0104] Figure 7 is a schematic flowchart of pairing two-end models based on model transfer. As shown in Figure 7, the network side jointly trains the encoder and decoder locally. After training, the network side sends the encoder's model file and / or model parameters from the trained self-model to the terminal side so that the terminal side can obtain the encoding model. This implementation is called pairing two-end models based on model transfer. For example, the network side can obtain the CSI compression model through joint training. The joint training process is described with reference to Figure 6 and will not be repeated here. The network side can distribute the CSI compression model or CSI reconstruction model obtained through joint training to the terminal side. For example, the network side can send the CSI compression model's model file and / or model parameters to the terminal side, or send the CSI reconstruction model's model file and / or model parameters to the terminal side. The terminal side can then train the model based on the received model file and / or model parameters.

[0105] It is understandable that the CSI compression model shown in Figures 6 and 7 above can be an encoder, and the CSI reconstruction model can be a decoder.

[0106] 4. Enhanced development monitoring of the counterpart model

[0107] Enhanced development monitoring of the counterpart model refers to monitoring the receiving party's enhancement development of at least one of the following: model, model parameters, or dataset. After receiving the model, model parameters, or dataset, the receiving party enhances the encoder or decoder it will use in the subsequent inference stage, and it is then monitored whether the enhanced model (i.e., the model used for inference) meets the network side's requirements.

[0108] In many possible scenarios, after pairing the terminal and network models using the methods shown in Figure 6 or Figure 7, the terminal may perform enhancements on the encoding model. When the terminal performs these enhancements, the network provides the terminal with information related to the training objectives, such as the compression or reconstruction performance targets the model needs to meet, for example in terms of squared generalized cosine similarity (SGCS) or normalized mean square error (NMSE), to ensure that the terminal's enhancements meet the network's requirements.
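
For reference, the two performance metrics named above can be computed as follows for a single complex-valued vector. This sketch uses the commonly used definitions (SGCS as the squared magnitude of the normalized inner product, NMSE as the squared error normalized by the energy of the reference); the function names `sgcs` and `nmse` are illustrative, and averaging over subbands, layers, or samples is omitted.

```python
import numpy as np

def sgcs(v, v_hat):
    """Squared generalized cosine similarity between two complex vectors."""
    num = np.abs(np.vdot(v, v_hat)) ** 2                    # |v^H v_hat|^2
    den = np.vdot(v, v).real * np.vdot(v_hat, v_hat).real   # ||v||^2 ||v_hat||^2
    return float(num / den)

def nmse(v, v_hat):
    """Normalized mean square error of v_hat with respect to v."""
    return float(np.sum(np.abs(v - v_hat) ** 2) / np.sum(np.abs(v) ** 2))

v = np.array([1.0, 1.0j])
print(sgcs(v, 2j * v))   # SGCS ignores a common complex scaling factor
print(nmse(v, v))        # identical vectors give zero error
```

SGCS lies between 0 and 1 and is insensitive to a common phase rotation or scaling, which suits precoding-vector feedback; NMSE penalizes any deviation, including scaling.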

[0109] Furthermore, in the model training on the terminal side after the two-end models are paired (or connected), how to reduce the overhead of the network side transmitting data for model training on the terminal side is an urgent problem to be solved.

[0110] Therefore, this application provides a model training method that helps reduce the overhead of data transmission from the network side to the terminal side in scenarios where the model is trained on the terminal side after pairing two models. In the embodiments of this application, the enhancement development of the model can also be described as model updating, adjustment, optimization, training, or fine-tuning, etc., without limitation.

[0111] First, the technical solution of this application is summarized as follows: In the enhancement development or training of the encoding model on the terminal side after model docking, the network side can transmit CSI feedback information (i.e., the encoder output in the self-model of the network side) obtained based on data-free data distillation algorithms represented by DeepDream to the terminal side. The terminal side reconstructs the target CSI based on the received CSI feedback information, and then trains the encoding model on the terminal side based on the reconstructed information.
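
One way to picture how an encoder input can be reconstructed from a received encoder output is gradient-based input optimization, in the spirit of DeepDream-style methods: the encoder weights are frozen and the input is adjusted until the encoder output matches the received CSI feedback information. The sketch below is a generic illustration under assumptions, not the specific algorithm of this application: the encoder is a hypothetical single `tanh` layer with weights `W`, and the step size and iteration count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

N, M = 16, 4
W = rng.standard_normal((M, N)) / np.sqrt(N)   # hypothetical frozen encoder weights

def encoder(v):
    """Toy encoding model: one linear layer followed by tanh."""
    return np.tanh(W @ v)

# Encoder output received from the network side (CSI feedback information C).
v_unknown = 0.5 * rng.standard_normal(N)       # input that is not available locally
c = encoder(v_unknown)

# Reconstruct an input by minimizing ||encoder(v) - c||^2 with respect to v.
v = np.zeros(N)
STEP = 0.2
for _ in range(5000):
    out = encoder(v)
    residual = out - c
    grad = 2.0 * W.T @ ((1.0 - out ** 2) * residual)   # chain rule through tanh
    v -= STEP * grad

print("output mismatch:", np.max(np.abs(encoder(v) - c)))
```

Because the encoder compresses, many inputs map to the same output, so the reconstructed input generally differs from the original one while still reproducing the received output. The resulting input and output pairs could then serve as training data for the terminal-side encoding model.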

[0112] The embodiments of this application involve the following technical terms related to AI models, which are uniformly introduced here:

[0113] 1) Dataset: The data used for model training, validation, or testing in machine learning. The quantity and quality of the data will affect the effectiveness of machine learning.

[0114] 2) Model training: By selecting an appropriate loss function, the model parameters are trained using optimization algorithms to minimize the loss function value.

[0115] 3) Loss function: used to measure the difference between the model's predicted value and the true value.

[0116] 4) Channel state information (CSI): The meaning of CSI in this application is broader than that of traditional CSI, including but not limited to channel quality indicator (CQI), precoding matrix indicator (PMI), rank indicator (RI), and CSI-RS resource indicator (CRI), and may also include one or more of the following: channel response information (such as channel response matrix, frequency domain channel response information, time domain channel response information), weight information corresponding to the channel response, precoding matrix information corresponding to the channel response, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), etc.

[0117] 5) In addition, the following CSI-related terms are involved:

[0118] Target CSI: Also known as the full CSI information, the uncompressed CSI information, the raw CSI, or the original CSI information. For ease of distinction and explanation, the target CSI will be represented as V in the following text.

[0119] CSI feedback information: also known as channel measurement result feedback information, channel information feedback information, compressed information, compressed channel information, compressed CSI information, or compressed CSI, etc. For ease of distinction and explanation, CSI feedback information will be represented by "C" below.

[0120] In the embodiments of this application, during the training phase, CSI feedback information can refer to the information obtained after compressing the target CSI, which can be simply referred to as compressed CSI; or it can refer to the information obtained by compressing and quantizing the target CSI, which can be simply referred to as quantized CSI. During the model deployment phase, CSI feedback information can be information sent from the terminal side to the network side. To save feedback overhead, CSI feedback information can refer to the information obtained by compressing and quantizing the measured CSI (also known as the target CSI), i.e., quantized CSI.

[0121] It is understandable that quantizing compressed CSI yields quantized CSI; dequantizing quantized CSI yields compressed CSI. The compressed CSI obtained by dequantizing quantized CSI is compressed CSI with quantization loss.

[0122] Reconstructed CSI: Also known as recovered CSI, restored CSI, reconstructed channel information, decompressed CSI, decompressed channel information, etc. For ease of distinction and explanation, the term "reconstructed CSI" is used in the following text.

[0123] 6) Model training: By selecting an appropriate function (such as a loss function), the model parameters are trained using optimization algorithms to minimize the difference between the model's predicted value and the ground truth (or target value, label).

[0124] For example, model training methods include, but are not limited to, supervised learning, self-supervised learning, and knowledge distillation. These methods are briefly explained below.

[0125] 7) Supervised learning: Based on collected sample values and labels, machine learning algorithms learn the mapping relationship between sample values and labels, and express this learned mapping relationship using a machine learning model. The process of training the machine learning model is the process of learning this mapping relationship. For example, in signal detection, the noisy received signal is the sample, and the corresponding real constellation point is the label. Machine learning aims to learn the mapping relationship between samples and labels through training, that is, to enable the machine learning model to learn a signal detector. During training, the model parameters are optimized by calculating the error between the model's predicted value and the real label. Once the mapping relationship is learned, it can be used to predict the label of each new sample. The mapping relationship learned in supervised learning can include linear mappings and nonlinear mappings. Based on the type of label, the learning task can be divided into classification tasks and regression tasks.
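
The signal detection example above can be illustrated with a very small supervised learner. The sketch assumes a QPSK constellation, additive Gaussian noise with an arbitrary standard deviation, and a nearest-centroid classifier as the learned detector; none of these choices come from this application.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed QPSK constellation; the label of a sample is the index of its point.
constellation = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)

def make_samples(n, noise_std=0.1):
    """Generate noisy received signals (samples) and their true labels."""
    labels = rng.integers(0, 4, size=n)
    noise = noise_std * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    return constellation[labels] + noise, labels

# Training: learn one centroid per label from labeled samples.
x_train, y_train = make_samples(400)
centroids = np.array([x_train[y_train == k].mean() for k in range(4)])

def detect(x):
    """Learned detector: predict the label of the nearest centroid."""
    return np.argmin(np.abs(x[:, None] - centroids[None, :]), axis=1)

# Inference on new samples.
x_test, y_test = make_samples(200)
accuracy = np.mean(detect(x_test) == y_test)
print("detection accuracy:", accuracy)
```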

[0126] 8) Self-supervised learning: This is a type of unsupervised learning. Unsupervised learning relies on collected sample values to allow algorithms to discover inherent patterns within the samples themselves. Self-supervised learning uses the samples themselves as supervisory signals; that is, the model learns the mapping relationship from sample to sample. During training, model parameters are optimized by calculating the error between the model's predicted values and the samples themselves. Self-supervised learning can be used in signal compression and decompression recovery applications; common algorithms include autoencoders and generative adversarial networks.

[0127] 9) Knowledge distillation: Generally, large models are often single complex networks or collections of networks, possessing excellent performance and generalization ability, while small models, due to their smaller network size, have limited expressive power. Therefore, the knowledge learned by the large model can be used to guide the training of the small model; this process is called knowledge distillation. Knowledge distillation can enable small models to achieve performance comparable to large models, but with fewer parameters and shorter inference latency, thus achieving model compression and acceleration. Furthermore, directly training small models with massive amounts of data often does not yield good performance, while training large models with massive amounts of data and then using the large model to perform knowledge distillation on the small model can achieve better results. In addition, knowledge distillation can also be used to integrate and transfer datasets from different domains.

[0128] Knowledge distillation employs a teacher-student model, where a teacher model assists in training a student model. The teacher model is a complex, large model, while the student model is a simple, small model. Because the teacher model has strong learning capabilities, it can transfer the knowledge it learns to the relatively weaker student model, thereby enhancing the student model's generalization ability. The complex, cumbersome, but effective teacher model remains offline, simply acting as a mentor; the flexible and lightweight student model is the one actually deployed for prediction tasks.
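
As a hedged illustration of how a teacher can guide a student, one common formulation for classification models compares temperature-softened output distributions with a Kullback-Leibler divergence. The sketch below shows only that loss term; the function names, the temperature value, and the example logits are assumptions, and other distillation objectives exist.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature parameter."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL divergence from teacher to student soft targets."""
    p = softmax(teacher_logits, temperature)   # teacher soft targets
    q = softmax(student_logits, temperature)   # student predictions
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

teacher = np.array([[2.0, 1.0, 0.1]])
student = np.array([[0.5, 0.5, 0.5]])
print("loss before training:", distillation_loss(student, teacher))
print("loss when student matches teacher:", distillation_loss(teacher, teacher))
```

Minimizing this loss with respect to the student's parameters pulls the student's output distribution toward the teacher's.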

[0129] 10) Data distillation: This is a data compression and refinement method that borrows the idea of knowledge distillation. Its basic principle is to extract a small synthetic dataset from a large-scale real-world dataset so that a model trained on this small synthetic dataset can achieve similar performance to a model trained on the original large-scale dataset.

[0130] The model training methods listed above are merely examples, and this application does not limit the methods used for model training.

[0131] Model training in this application embodiment may include initial training (or original training) or retraining of the model. Model training enables model enhancement. Model enhancement can refer to improving model functionality through model structure optimization and / or model training, such as adding other functions or neural networks to the existing model functionality, so that the enhanced model can be applied to more scenarios or richer requirements.

[0132] 11) Model files and model parameters: Model files and / or model parameters can be used to determine the model. Optionally, the model in this application may refer to the model itself, or it may refer to the model files and / or model parameters used to determine the model.

[0133] The model file can be used to indicate the model structure, which may include, but is not limited to, feedforward neural networks (FNNs), convolutional neural networks (CNNs), or recurrent neural networks (RNNs). The model file can have a fixed format, such as a standard predefined format, or a format pre-negotiated by both ends of the connection. Model parameters can refer to parameters in the neural network model, such as, but not limited to, the number of layers in the neural network, the type and weights of neurons in each layer, etc. This application does not limit the method of distributing model parameters.

[0134] Take a deep neural network (DNN) as an example. The idea behind the DNN comes from the neuronal structure of the brain. Each neuron performs a weighted summation of its inputs and then passes the result through a nonlinear function to generate its output. For example, the input of a neuron is $x = [x_0, x_1, \ldots, x_{N-1}]$, the weights corresponding to the inputs are $w = [w_0, w_1, \ldots, w_{N-1}]$, and the bias of the weighted summation is $b$. The nonlinear function $f(\cdot)$ can take many forms; for example, it can be the maximum value function $\max\{0, x\}$. The effect of a neuron's execution is then $f\left(\sum_{n=0}^{N-1} w_n x_n + b\right)$, where $N$ is a positive integer and $n$ is an integer greater than or equal to 0 and less than or equal to $N-1$.
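
The single-neuron computation described above (weighted sum, bias, then the maximum value function max{0, x}) can be written in a few lines. The function name `neuron` and the numeric values are illustrative only.

```python
import numpy as np

def neuron(x, w, b):
    """Weighted sum of the inputs plus bias, followed by max{0, .}."""
    return max(0.0, float(np.dot(w, x) + b))

x = np.array([1.0, 2.0, 3.0])     # inputs
w = np.array([0.5, -1.0, 2.0])    # weights

print(neuron(x, w, b=-1.0))    # 0.5 - 2.0 + 6.0 - 1.0 = 3.5
print(neuron(x, w, b=-10.0))   # weighted sum is negative, so the output is 0.0
```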

[0135] A DNN typically has multiple neural network layers, including an input layer, one or more hidden layers, and an output layer. Generally, the first layer is the input layer, the last layer is the output layer, and the layers in between are hidden layers. Each layer contains multiple neurons. Layers are fully connected; that is, any neuron in the i-th layer is connected to any neuron in the (i+1)-th layer. The input layer processes the received values (i.e., the DNN's input) through neurons and then passes them to the hidden layers. Similarly, the hidden layers pass the computation results to the final output layer, producing the DNN's output. This application does not limit the structure and parameters used in the AI model.
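
A forward pass through such a fully connected network can be sketched as below. The layer sizes, the random weights, the use of max{0, x} in the hidden layers, and the linear output layer are assumptions made for illustration.

```python
import numpy as np

def relu(z):
    """Element-wise max{0, z}."""
    return np.maximum(0.0, z)

def forward(x, layers):
    """Pass x through fully connected layers given as (weights, bias) pairs."""
    a = x
    for i, (weights, bias) in enumerate(layers):
        z = weights @ a + bias
        # Hidden layers apply the nonlinearity; the output layer stays linear.
        a = relu(z) if i < len(layers) - 1 else z
    return a

rng = np.random.default_rng(4)
sizes = [4, 8, 8, 2]   # assumed widths: input, two hidden layers, output
layers = [(rng.standard_normal((n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

y = forward(rng.standard_normal(4), layers)
print("DNN output:", y)
```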

[0136] One of the model structure or model parameters can be predefined, while the other can be sent by the sender (e.g., the network side). Alternatively, both the model structure and model parameters can be sent by the sender (e.g., the network side). This application does not impose any restrictions on this.

[0137] In this embodiment of the application, sending a model may refer to sending a model file and / or model parameters, and receiving a model may refer to receiving a model file and / or model parameters.

[0138] The technical solution of this application will be described in detail below.

[0139] Figure 8 is a schematic flowchart of the method 200 for training a model provided in this application. Method 200 involves a network-side device and a terminal-side device, and method 200 can be implemented by the network-side device and the terminal-side device each performing corresponding steps.

[0140] Optionally, one or more steps in method 200 performed by a communication device (e.g., a network device or a terminal device) can instead be performed by an apparatus serving that communication device (e.g., referred to as a first device). The first device can be a chip, processor, circuit, or AI entity serving the communication device. The AI entity can be deployed on or outside the communication device. If the first device is deployed outside the communication device, air interface interaction between the first device and the communication device may also be involved, such as the exchange of datasets or models, or of information related to model training or adjustment, so that the first device can perform the relevant processing based on the datasets, models, or information provided by the communication device and complete the method for training the model provided in this application. This description applies to any embodiment of this application and will not be repeated below.

[0141] As an example, when the communication device is a terminal-side device, the aforementioned AI entity can be a host or cloud server of an over-the-top (OTT) system; when the communication device is a network-side device, the aforementioned AI entity can be a deployment device for a network-side AI model, collectively referred to as an intelligent network element, such as the near real-time RIC in Figure 4 above. The deployment of the network-side or terminal-side AI model may not be within the same physical entity as the network-side device or the terminal-side device. The following embodiments use network-side devices and terminal-side devices as examples for description.

[0142] As described above, the network-side device may include a network device or an apparatus for a network device, such as an intelligent network element; the terminal-side device may include a terminal device or an apparatus for a terminal device, such as an OTT server. Method 200 may be implemented with the corresponding steps performed by the network device, the apparatus for the network device (e.g., an intelligent network element), the terminal device, and the apparatus for the terminal device (e.g., an OTT server).

[0143] 210. The network-side device sends the first output of the first encoding model. The first output is used to determine the second encoding model on the terminal side. The input of the second encoding model is determined based on the first output.

[0144] The first output includes compressed information of the channel information.

[0145] Furthermore, the first encoding model and the first decoding model are matched. As an example, the first encoding model and the first decoding model can be obtained through training on the network side. The network-side device sends the first output of the first encoding model, which the terminal side uses to obtain its second encoding model based on the first output. Therefore, the first encoding model on the terminal side is received from the network side, and the second encoding model can be obtained by adjusting the first encoding model, or it can be obtained by the terminal side by training its own designed encoding model based on the first input of the first encoding model received from the network side.

[0146] 220. The network-side device acquires the first encoding model and the first decoding model.

[0147] Optionally, before the network-side device sends the first encoding model, the network device acquires the first encoding model and the first decoding model. For example, the network device trains a self-model based on a collected dataset, which includes the first encoding model and the first decoding model. Alternatively, the network device provides a dataset for training the self-model to the intelligent network element, which then trains the self-model to obtain the model.

[0148] The output of the first encoding model after training is the first output in this embodiment of the application.

[0149] The network side sends the first output of the first encoding model to the terminal side. The network side can do this autonomously or in response to a request from the terminal side. For example, before the terminal device enhances the first encoding model, it requests the first output of the first encoding model from the network side and receives the first output from the network side. Then, based on the received first output, the terminal side reconstructs the input of the first encoding model and obtains the second encoding model based on the reconstructed input.

[0150] Optionally, the second encoding model may be an optimization, adjustment or retraining of the first encoding model, or it may be a redesigned encoding model on the terminal side, without limitation.

[0151] The terminal side obtains the second encoding model based on the first output of the first encoding model provided by the network side through steps 230 to 250, as follows.

[0152] 230. The terminal device receives the first output of the first encoding model.

[0153] Step 230 corresponds to step 210. Step 210 is executed by the network side, and step 230 is executed by the terminal side.

[0154] 240. The terminal device obtains the first input of the first encoding model based on the first output and the first encoding model.

[0155] The first input of the first encoding model corresponds to the second output of the first encoding model, and the error between the second output and the first output is less than a threshold.

[0156] After the terminal device obtains the first output of the first encoding model, it reconstructs the input of the first encoding model based on the first output. The reconstructed input is the first input in this embodiment of the application.

[0157] As an example, the terminal device can obtain the first input based on the first output of the first encoding model, the first encoding model, and a data generation strategy received from the network side. The data generation strategy includes: keeping the model parameters of the first encoding model unchanged, and adjusting the input of the first encoding model until the error between the output of the first encoding model and a given label is less than a threshold. The input of the first encoding model at that point is the first input.

[0158] When the data generation strategy is applied to determine the first input of the first encoding model, the label in the data generation strategy is the first output of the first encoding model sent by the network side.

[0159] Figure 9 illustrates how the terminal side reconstructs the input of the encoding model from the output of the encoding model provided by the network side. The first encoding model is encoding model 1 in Figure 9, and its initial input is random Gaussian noise. When the random Gaussian noise, for example input 1, is fed in, encoding model 1 produces a corresponding output, for example output 1. Output 1 is compared with the first output of the first encoding model received from the network side. If the error between output 1 and the first output is greater than a threshold, the input of encoding model 1 is adjusted, for example to input 2. With input 2, encoding model 1 produces output 2, which is again compared with the first output. If the error between output 2 and the first output is greater than the threshold, the input of encoding model 1 is adjusted again. This adjustment process is repeated. Assuming that, for input x, encoding model 1 produces output y and the error between output y and the first output is less than the threshold, the adjustment process can end at this point. Input x is the reconstruction of the input of the first encoding model determined by the terminal side based on the first output of the first encoding model received from the network side, that is, the first input of the first encoding model in this embodiment.
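
The adjustment loop of Figure 9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: encoding model 1 is replaced by a small fixed linear map, the error is a squared error, and the input is adjusted by plain gradient descent. The names `encode`, `reconstruct_input`, the weights, the step size, and the threshold are all hypothetical.

```python
import random


def encode(weights, x):
    """Toy stand-in for encoding model 1: a fixed linear map whose parameters never change."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def squared_error(output, label):
    """Squared error between an encoder output and the label (the first output)."""
    return sum((o - t) ** 2 for o, t in zip(output, label))


def reconstruct_input(weights, first_output, threshold=1e-6, lr=0.05,
                      max_iters=10000, seed=0):
    """Adjust only the input until the encoder output matches first_output."""
    rng = random.Random(seed)
    # Input 1: random Gaussian noise, as in Figure 9.
    x = [rng.gauss(0.0, 1.0) for _ in range(len(weights[0]))]
    for _ in range(max_iters):
        output = encode(weights, x)
        if squared_error(output, first_output) < threshold:
            break  # error below the threshold: x plays the role of the first input
        residual = [o - t for o, t in zip(output, first_output)]
        # Gradient of the squared error with respect to the input: 2 * W^T * residual.
        for j in range(len(x)):
            grad_j = 2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
            x[j] -= lr * grad_j
    return x, squared_error(encode(weights, x), first_output)


# Hypothetical frozen parameters: 4 input values compressed to 2 output values.
W = [[1.0, 0.5, -0.2, 0.3], [0.2, -1.0, 0.4, 0.1]]
c1 = encode(W, [0.3, -0.7, 1.1, 0.4])  # plays the role of the first output
x_hat, err = reconstruct_input(W, c1)
```

Because the toy encoder compresses four values into two, many inputs produce the same output, so `x_hat` matches the first output without necessarily equalling the original input. This is one reason a regularization term on the input can be useful.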

[0160] The first output of the first encoding model can be called the feedback channel information, and it is sent from the network side to the terminal side. The first input of the first encoding model can also be called the reconstructed channel information, which is obtained by the terminal side based on the feedback channel information.

[0161] As an example, the data generation strategy used by the terminal side when determining the first input of the first encoding model can be the DeepDream algorithm.

[0162] The DeepDream algorithm is a visualization technique that amplifies the features of a convolutional neural network (CNN) using gradient ascent. Specifically, in the embodiments of this application, given a randomly initialized input $\hat{x}$ to the CNN and a label $y$, the input $\hat{x}$ is optimized. During optimization, the model parameters of the CNN remain unchanged, that is, they are fixed, and the input of the CNN is adjusted to solve $\min_{\hat{x}} L(\hat{x}, y) + R(\hat{x})$, where $L(\cdot)$ is the classification loss function and $R(\cdot)$ is a regularization term. In the DeepDream algorithm, the regularization term $R(\hat{x})$ consists of a total variance penalty term and an L2 norm penalty term. For the dataset constructed from the output of the network-side encoding model (i.e., CSI feedback information), these penalty terms are used to make the optimized CNN input closer to the real input. Here, the optimized CNN input is the first input of the first encoding model, and the real input refers to the input corresponding to the first output of the first encoding model.
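
The regularized objective described above (a loss against the label plus a total variance penalty and an L2 norm penalty on the input) can be sketched as follows. This is an illustration under assumptions: the classification loss is replaced by a squared error, since the label here is CSI feedback; the total variance penalty is taken to be the sum of absolute differences between neighbouring input elements; and the weights `alpha_tv` and `alpha_l2` are hypothetical.

```python
def total_variance_penalty(x):
    """Assumed form: sum of absolute differences between neighbouring elements."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


def l2_norm_penalty(x):
    """L2 norm of the candidate input."""
    return sum(v * v for v in x) ** 0.5


def regularized_objective(encoder, x, label, alpha_tv=1e-3, alpha_l2=1e-3):
    """Loss between encoder(x) and the label, plus the two input penalties.

    The encoder is any callable with fixed parameters; only x is meant to be optimized.
    """
    output = encoder(x)
    loss = sum((o - t) ** 2 for o, t in zip(output, label))  # squared-error stand-in
    return (loss
            + alpha_tv * total_variance_penalty(x)
            + alpha_l2 * l2_norm_penalty(x))
```

Minimizing this value over `x` with the encoder parameters fixed favours inputs that both reproduce the label and remain smooth and small in magnitude.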

[0163] After determining the first input through a data generation strategy, the terminal device obtains the second encoding model based on the first input.

[0164] 250. The terminal device obtains the second encoding model based on the first input.

[0165] In this application, the first output of the first encoding model sent from the network side to the terminal side can specifically be CSI feedback information in the AI CSI feedback. The CSI feedback information sent by the network side can be obtained by processing the dataset composed of the outputs of the network side's encoding model using a data-free data distillation method, of which DeepDream is a representative example.

[0166] For information on data distillation methods, please refer to the above introduction; further details will not be provided here.

[0167] Compared with the network side providing the terminal side with a dataset consisting of raw CSI data for enhancing or training the encoding model, sending CSI feedback information obtained through data distillation from the network side can save air interface overhead. Furthermore, it can avoid redundant training on the terminal side.

[0168] Optionally, the network side instructs the terminal side on the data generation strategy, in which case method 200 may further include step 260.

[0169] 260. The network-side device sends first information, which indicates the data generation strategy.

[0170] The terminal device receives the first information.

[0171] It is understandable that when the data generation strategy adopted by the network side is known to the terminal side, for example, it may be specified by the protocol or a default configuration, the network side may not need to indicate the data generation strategy to the terminal side. In this case, method 200 may not include step 260.

[0172] As an example, this application provides two methods for obtaining the second encoding model on the terminal side, as shown in Method 1 and Method 2 below.

[0173] In the following embodiments, the first encoding model is represented as encoding model 1 (or encoder 1), and the first decoding model is represented as decoding model 1 (or decoder 1). Furthermore, the second encoding model obtained by performing model enhancement development on the terminal side is represented as encoding model 2.

[0174] Method 1

[0175] The terminal device trains a decoding model (referred to as the second decoding model in the following embodiment) based on the first input of the first encoding model; the second decoding model is then fixed and the second encoding model on the terminal side is trained.

[0176] Figure 10 is a schematic diagram of one implementation of training the second encoding model on the terminal side. As shown in Figure 10, the first input of encoding model 1 is represented as the target CSI V′. Optionally, target CSI V′ can be dataset V1 transmitted from the network side, or dataset V2 obtained by the terminal side. For example, V2 can contain data collected by the terminal side through actual measurements and / or data generated through simulation. The data contained in dataset V2 is not exactly the same as the data contained in dataset V1. Here, dataset V1 represents the input of encoding model 1 in the self-model trained by the network side. Encoding model 1 is first frozen, i.e., the model parameters of encoding model 1 are kept unchanged, and decoding model 2 is trained. After decoding model 2 is trained, when target CSI V′ is used as the input of encoding model 1, the output of decoding model 2 is a reconstruction of the input of encoding model 1. As an example, during the training of decoding model 2, the input of encoding model 1 can be the dataset transmitted from the network side (denoted as target CSI V1), or the dataset obtained by the terminal side itself (denoted as target CSI V2).

[0177] After decoding model 2 is trained, it is frozen, and encoding model 2 is trained. During the training of encoding model 2, the model parameters of decoding model 2 remain unchanged, and its output is the dataset obtained in the previous training step. The process of training encoding model 2 may include adjusting the model parameters of encoding model 2: the output of the currently trained encoding model is used as the input of decoding model 2, and if the error between the output of decoding model 2 and the target CSI V′ is less than the threshold, the currently trained encoding model is encoding model 2. Encoding model 2 can be obtained by training an encoding model designed by the terminal side (e.g., encoding model 3). The initial values of the model parameters of encoding model 3 can be determined by the terminal-side device itself.
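
The two-stage procedure of method 1 (freeze encoding model 1 and train decoding model 2, then freeze decoding model 2 and train encoding model 2) can be illustrated with a deliberately tiny toy. The assumptions are strong: each model is a single scalar gain, the error is a mean squared reconstruction error, and training is plain gradient descent. The helper `train_gain`, the sample values, and the hyperparameters are hypothetical and serve only to show the order of freezing and training.

```python
def train_gain(samples, frozen_gain, init, lr=0.05, threshold=1e-8, max_iters=100000):
    """Fit one scalar gain g so that g * frozen_gain * v reconstructs v.

    frozen_gain belongs to the model whose parameters are kept unchanged;
    only g is updated.
    """
    g = init
    for _ in range(max_iters):
        errors = [g * frozen_gain * v - v for v in samples]
        mse = sum(e * e for e in errors) / len(samples)
        if mse < threshold:
            break  # reconstruction error below the threshold: training ends
        grad = sum(2.0 * e * frozen_gain * v for e, v in zip(errors, samples)) / len(samples)
        g -= lr * grad
    return g


target_csi = [1.0, -2.0, 0.5, 1.5]  # stands in for target CSI V' (reconstructed inputs)
encoder_1 = 0.5                     # frozen encoding model 1 (toy scalar gain)

# Stage 1: encoding model 1 is frozen, decoding model 2 is trained.
decoder_2 = train_gain(target_csi, frozen_gain=encoder_1, init=0.0)

# Stage 2: decoding model 2 is frozen, encoding model 2 is trained from its own
# initial value (standing in for terminal-designed encoding model 3).
encoder_2 = train_gain(target_csi, frozen_gain=decoder_2, init=0.1)
```

In this toy, `encoder_2` ends up close to `encoder_1` only because a scalar gain has a unique inverse. With real models, encoding model 2 may differ in structure and parameters while still matching decoding model 2.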

[0178] Figure 11 is a schematic diagram of the complete process of training the second encoding model on the terminal side based on method 1. As shown in Figure 11, on the network side, the network side obtains a self-model, which includes encoding model 1 and decoding model 1, and the encoding model 1 and decoding model 1 are matched. The output of encoding model 1 is represented, for example, as CSI feedback C1. The network side also sends the output of encoding model 1 (i.e., the first output, or C1) to the terminal side. In addition, the network side also sends encoding model 1, for example, sending the model file and / or model parameters of encoding model 1. On the terminal side, the terminal device obtains encoding model 2 through three steps (as in steps 1 to 3). Step 1 is that the terminal side determines the first input of the first encoding model based on the first output of encoding model 1 received from the network side; Step 2 is that the terminal side trains decoding model 2 based on the first input of encoding model 1; Step 3 is that the terminal side fixes decoding model 2, thus training encoding model 2. For the specific implementation on the terminal side, please refer to the detailed description of the above embodiments, which will not be repeated here.

[0179] Method 2

[0180] The terminal device trains the second encoding model based on the first input of the first encoding model. As can be seen, unlike method 1, the terminal directly trains the second encoding model based on the first input of the first encoding model.

[0181] Figure 12 illustrates another implementation of training the second encoding model on the terminal side. As shown in Figure 12, the terminal side obtains the output of encoding model 1 based on encoding model 1 and the target CSI V′, as shown in CSI feedback C2 in Figure 12. Then, the terminal side trains the encoding model, where CSI feedback C2 serves as the label for the output of encoding model 2. When the error between the output C3 of the currently trained encoding model and CSI feedback C2 is less than a threshold, the encoding model at this point is encoding model 2.
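
Method 2 can be sketched in the same toy style: encoding model 1 labels the target CSI V′ with its own outputs (C2), and encoding model 2 is trained until its outputs (C3) match those labels. The sketch assumes linear single-output encoders, a mean squared error, and batch gradient descent; `encoder_output`, `train_encoder_2`, and all numeric values are hypothetical.

```python
def encoder_output(weights, v):
    """Toy linear encoder with a single output value."""
    return sum(w * vi for w, vi in zip(weights, v))


def train_encoder_2(samples, labels, lr=0.1, threshold=1e-10, max_iters=100000):
    """Train the weights of encoding model 2 so that its output C3 matches labels C2."""
    w = [0.0] * len(samples[0])
    for _ in range(max_iters):
        residuals = [encoder_output(w, v) - c for v, c in zip(samples, labels)]
        mse = sum(r * r for r in residuals) / len(samples)
        if mse < threshold:
            break  # error between C3 and C2 below the threshold
        for j in range(len(w)):
            grad_j = sum(2.0 * r * v[j] for r, v in zip(residuals, samples)) / len(samples)
            w[j] -= lr * grad_j
    return w


target_csi = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]  # stands in for V'
encoder_1 = [0.8, -0.3]                                          # frozen encoding model 1
c2 = [encoder_output(encoder_1, v) for v in target_csi]          # labels from encoding model 1
encoder_2 = train_encoder_2(target_csi, c2)
```

Unlike method 1, no decoding model is trained on the terminal side here; the outputs of encoding model 1 serve directly as the training labels.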

[0182] Figure 13 is a schematic diagram of the complete process of training the second encoding model on the terminal side based on method 2. As shown in Figure 13, the network side obtains a self-model, which includes encoding model 1 and decoding model 1, and encoding model 1 and decoding model 1 are matched. The output of encoding model 1 is represented, for example, as CSI feedback C1. The network side sends the output of encoding model 1 (i.e., the first output, or C1) to the terminal side. In addition, the network side also sends encoding model 1, for example, by sending the model file and / or model parameters of encoding model 1. On the terminal side, the terminal device obtains the second encoding model through three steps. In step 1, the terminal side determines the first input of the first encoding model (i.e., the target CSI V′) based on the first output of encoding model 1 received from the network side. In step 2, the terminal side obtains the output of encoding model 1, represented as C2, based on the first input and encoding model 1. In step 3, the terminal side trains encoding model 2 based on the target CSI V′ and the C2 determined in step 2, where C2 serves as the label for the output of the encoding model being trained. When the error between the output C3 of the currently trained encoding model and C2 is less than the threshold, the encoding model at that point is encoding model 2.

[0183] The above describes two methods for training the second encoding model on the terminal side.

[0184] In method 200, for example, before step 210, the network side also sends a first encoding model to the terminal side, as in step 270.

[0185] 270. The network side sends the first encoding model. Correspondingly, the terminal side receives the first encoding model.

[0186] As described in the embodiments above, there are various ways for the network side to send the model to the terminal side. For example, the network side can send the model file and / or model parameters of the first encoding model to the terminal side. For a description of the model file or model parameters, please refer to the embodiments above.

[0187] In step 270, when the network side sends model parameters to the terminal side, the network side may send all model parameters of the first encoding model to the terminal side; or it may send only some model parameters, while the remaining model parameters are designed by the terminal side itself, or determined by the terminal side through pre-configuration, default configuration, or protocol specifications. The model file and / or model parameters of the first encoding model may be only a portion of the parameters used to train the second encoding model; these parameters also include the first input of the first encoding model, the data generation strategy, etc.

[0188] Step 270, which precedes step 210, is merely an example and can be implemented in other ways. For instance, step 270 could be placed after step 210, or the first encoding model and its first output could be sent together from the network side to the terminal side.

[0189] Optionally, to improve the performance of the encoding model trained on the terminal side, the network side can send information related to the training data of the self-model to the terminal side, facilitating the terminal side to train the second encoding model. In this implementation, method 200 also includes step 280.

[0190] 280. The network-side device sends information related to the training data of the first encoding model.

[0191] Accordingly, the terminal receives information related to the training data of the first encoding model.

[0192] It is understandable that, since the first encoding model and the first decoding model are trained together on the network side, the training data of the first encoding model is also the training data of the self-model. As an example, the information related to the training data of the first encoding model may include, but is not limited to, one or more of the mean and the variance of the training data of the first encoding model.


[0194] Where method 200 includes step 280, i.e., in the implementation where the network side sends information related to the training data of the first encoding model to the terminal side, in step 250 the terminal side reconstructs the target CSI based on an optimized variant of the DeepDream algorithm. As an example, this optimized algorithm could be DeepInversion.

[0195] The DeepInversion algorithm optimizes the regularization term used in DeepDream. Because the regularization terms in DeepDream cannot ensure that the reconstructed information of the target CSI (corresponding to target CSI V′ in the aforementioned embodiment) has information similar to that of the real target CSI, the regularization term for the features in the DeepInversion algorithm is defined as follows:

$$R_{\text{feature}}(\hat{x}) = \sum_{l} \left\| \mu_l(\hat{x}) - \mathbb{E}\left(\mu_l(x) \mid X\right) \right\|_2 + \sum_{l} \left\| \sigma_l^2(\hat{x}) - \mathbb{E}\left(\sigma_l^2(x) \mid X\right) \right\|_2$$

[0196] where $\mu_l(\hat{x})$ and $\sigma_l^2(\hat{x})$ represent the batch-wise mean and variance estimates of the features of the l-th layer, that is, the mean and variance of the dataset reconstructed on the terminal side, with $\hat{x}$ being the input reconstructed on the terminal side. Correspondingly, $\mathbb{E}(\mu_l(x) \mid X)$ and $\mathbb{E}(\sigma_l^2(x) \mid X)$ represent the mean and variance of the features at layer l on the full dataset X, corresponding to the mean and variance provided by the network side. $\|\cdot\|_2$ denotes the L2 norm. The running mean and running variance of the batch normalization (BN) layer in the teacher model preserve the mean and variance of the training data. This allows the mean and variance of the reconstructed target CSI information to be closer to the statistical values of the real target CSI dataset, resulting in better performance of the trained second encoding model. Here, the running mean and running variance are the mean and variance calculated based on the sample data up to the present time in an online computing scenario.
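
The feature regularization term described above compares, layer by layer, the batch-wise mean and variance of the reconstructed data with the reference mean and variance provided by the network side, using an L2 norm. A minimal sketch follows. It assumes that the features of each layer are available as plain lists (one list of feature values per sample) and that a population variance is used; the function name `feature_regularizer` and its arguments are hypothetical.

```python
from statistics import fmean, pvariance


def feature_regularizer(batch_features, ref_means, ref_vars):
    """Sum over layers of the L2 distances between batch and reference statistics.

    batch_features[l] holds, for layer l, one list of feature values per sample.
    ref_means[l] and ref_vars[l] hold the reference per-feature mean and variance
    for layer l (for example, statistics provided by the network side).
    """
    total = 0.0
    for feats, mu_ref, var_ref in zip(batch_features, ref_means, ref_vars):
        columns = list(zip(*feats))               # one column per feature
        mu = [fmean(c) for c in columns]          # batch-wise mean
        var = [pvariance(c) for c in columns]     # batch-wise (population) variance
        total += sum((a - b) ** 2 for a, b in zip(mu, mu_ref)) ** 0.5
        total += sum((a - b) ** 2 for a, b in zip(var, var_ref)) ** 0.5
    return total
```

The value is zero when the reconstructed batch reproduces the reference statistics exactly and grows as the batch statistics drift away from them, so adding it to the reconstruction objective pulls the reconstructed data toward the statistics of the original training data.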

[0197] In the above method 200, the different execution entities of the terminal-side device or the network-side device are illustrated by deploying them on a single device. For example, the terminal-side AI CSI encoder is deployed on the internal chip of the terminal side, and the network-side AI CSI decoder is deployed on the internal chip of the network-side device.

[0198] In another possible implementation, the terminal-side AI CSI encoder can be deployed outside the terminal-side device, for example, in the host or cloud server of the OTT system; similarly, the network-side AI CSI decoder can also be deployed outside the network-side device, for example, on a network-side AI model deployment device collectively referred to as intelligent network elements, such as a near real-time RIC. As an example, the near real-time RIC is set in the RAN node, for example, in the CU / DU. This will be explained below with reference to Figure 14.

[0199] Figure 14 is a schematic diagram of the technical solution of this application applied to the ORAN architecture. As shown in Figure 14, when the AI CSI model on the network side is deployed outside the network device, such as on an intelligent network element, the intelligent network element sends the first encoding model (denoted as encoding model 1) and the dataset (denoted as dataset #1) of the output of encoding model 1 (specifically, the feedback CSI) to the network device. The network device sends dataset #1 and encoding model 1 to the terminal device. When the AI CSI model on the terminal side is deployed outside the terminal device, such as on an OTT server, the terminal device sends dataset #1 and encoding model 1 to the OTT server. In this way, dataset #1 and encoding model 1 are transmitted from the intelligent network element to the OTT server.

[0200] When the terminal side needs to enhance or further develop encoding model 1, or to train an encoding model of its own design, for example because the performance of encoding model 1 issued by the network side does not meet the performance requirements, the OTT server can train encoding model 2 on the terminal side based on method 1 or method 2 in the above embodiments. For the process of training encoding model 2, refer to the detailed description in the above embodiments, which is not repeated here.

[0201] After obtaining encoding model 2, the terminal device can determine whether the performance of encoding model 2 meets the performance requirements issued by the network side based on the performance-related information of encoding model 2 indicated by the OTT server. As an example, the terminal side can determine the performance of encoding model 2 by monitoring it. Model monitoring refers to monitoring the performance of the AI model to determine whether the AI model is working normally. If the AI model's performance is poor, the system can switch to non-AI mode, replace the AI model, or update the AI model. Model monitoring can monitor the performance of the AI model's output or the system performance. The performance of the AI model's output can also be called an intermediate key performance indicator (KPI). System performance can be called an eventual KPI. Monitoring the performance of the AI model's output (e.g., accuracy) is done by comparing the AI model's output with the corresponding label or ground truth to determine whether the AI model's performance meets the requirements. Monitoring system performance is done by monitoring whether the performance of the communication system meets the requirements after the AI model is used. As an example, intermediate KPIs typically include one or more of the following: generalized cosine similarity (GCS), squared generalized cosine similarity (SGCS), and normalized mean square error (NMSE). Eventual KPIs typically include one or more of the following: throughput, spectral efficiency, transmission rate, block error rate (BLER), hypothetical BLER, and hybrid automatic repeat request (HARQ) feedback. Model monitoring can be performed on the terminal side or the network side.
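
Two of the intermediate KPIs named above can be computed as follows for a single pair of vectors. This sketch uses the common definitions, in which SGCS is the squared magnitude of the normalized inner product between the true vector and its reconstruction, and NMSE is the squared error normalized by the energy of the true vector. The function names and the use of plain Python lists of complex numbers are illustrative assumptions.

```python
def sgcs(v, v_hat):
    """Squared generalized cosine similarity between a vector and its reconstruction."""
    inner = sum(a.conjugate() * b for a, b in zip(v, v_hat))
    energy_v = sum(abs(a) ** 2 for a in v)
    energy_hat = sum(abs(b) ** 2 for b in v_hat)
    return abs(inner) ** 2 / (energy_v * energy_hat)


def nmse(v, v_hat):
    """Normalized mean square error of the reconstruction relative to the true vector."""
    error_energy = sum(abs(a - b) ** 2 for a, b in zip(v, v_hat))
    return error_energy / sum(abs(a) ** 2 for a in v)
```

SGCS is insensitive to a common scaling or phase rotation of the reconstruction, whereas NMSE penalizes both, so the two indicators can rank the same model differently.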

[0202] When the terminal device determines that encoding model 2 meets the model performance requirements issued by the network side, it can instruct the OTT server to deploy encoding model 2. If encoding model 2 does not meet the model performance requirements issued by the network side, the terminal device indicates to the network device that it cannot meet the model performance requirements. The network device can request other datasets from the intelligent network element to re-pair (or interface) the model with the terminal side. Alternatively, the network device can request the intelligent network element to disable the corresponding feature, such as disabling the use of the AI model and switching to working with a non-AI model. Based on the feedback from the network device, the intelligent network element can choose to issue other datasets or disable the feature. Optionally, the terminal side can train a new encoding model or perform enhancements on the encoding model more than once. For example, after deploying an encoding model that meets the model performance requirements, the terminal side may train a new encoding model or perform further model enhancements for reasons such as changes in channel conditions or changes in model performance requirements.

[0203] This application uses AI-CSI feedback as an example, but the concept of the technical solution in this application can be applied to future scenarios involving dual-end model training.

[0204] Based on the model training method provided in this application, when the terminal side trains a model after model docking, the network side transmits only a portion of the CSI feedback information rather than the original CSI dataset, which can reduce data transmission overhead. Furthermore, this avoids the redundant training on the terminal side that would be caused by the network side transmitting the original CSI dataset, and the resulting power consumption on the terminal side. Optionally, the network side can indicate to the terminal side the performance metrics that the enhanced encoding model should meet, to ensure the performance of the AI CSI feedback.

[0205] The above provides a detailed description of the model training method provided in this application. The following describes the corresponding communication device.

[0206] Figure 15 is a schematic block diagram of the communication device 1000 provided in this application. As shown in Figure 15, the communication device 1000 may include a processing module 1001 and a communication module 1002. The communication device 1000 may be a terminal device, or a communication device applied to or used in conjunction with a terminal device to achieve the corresponding functions of the terminal device, such as a processor, chip, circuit, or AI entity. Alternatively, the communication device 1000 may be a network device, or a communication device applied to or used in conjunction with a network device to achieve the corresponding functions of the network device, such as a processor, chip, circuit, or AI entity.

[0207] The communication module can also be called a transceiver module, transceiver, transceiver device, or transceiver apparatus. The processing module can also be called a processor, processing board, processing unit, or processing apparatus. Optionally, the communication module is used to perform the sending and receiving operations of the terminal-side device or network-side device in the above method. The device in the communication module that implements the receiving function can be regarded as a receiving unit, and the device in the communication module that implements the sending function can be regarded as a sending unit. That is, the communication module includes a receiving unit and a sending unit. When the communication device 1000 is applied to the network-side device or terminal-side device, the processing module 1001 can be used to implement the processing functions of the network-side device or terminal-side device in the embodiments of Figures 8 to 14, and the communication module 1002 can be used to implement the sending and receiving functions of the network-side device or terminal-side device. For example, when applied to the network side, the communication module 1002 can be used to send the first output of the first encoding model, send first information to indicate a data generation strategy, send model parameters of the first encoding model, or send information related to the training data of the first encoding model, etc.; the processing module 1001 can be used to: acquire the first encoding model and the first decoding model; optionally, monitor the model performance of the encoding model trained on the terminal side, etc. 
When the communication device 1000 is applied to a terminal-side device, the communication module 1002 can be used to: receive the first output of the first encoding model, receive first information to determine a data generation strategy, receive the first encoding model, or receive information related to the training data of the first encoding model, etc.; the processing module 1001 can be used to: train a second encoding model based on data (e.g., the first output of the first encoding model, or the first encoding model) and information (e.g., first information, or information related to the training data of the first encoding model, etc.) received from the network side, etc.

[0208] Furthermore, it should be noted that the aforementioned communication module and / or processing module can be implemented through virtual modules. For example, the processing module can be implemented through software functional units or virtual devices, and the communication module can be implemented through software functions or virtual devices. Alternatively, the processing module or communication module can also be implemented through physical devices. For example, if the device is implemented using a chip / chip circuit, the communication module can be an input / output circuit and / or a communication interface, performing input operations (corresponding to the aforementioned receiving operation) and output operations (corresponding to the aforementioned sending operation); the processing module is an integrated processor, microprocessor, or integrated circuit.

[0209] The module division in this application is illustrative and represents only one logical functional division. In actual implementation, other division methods are possible. Furthermore, the functional modules in the various examples of this application can be integrated into a single processor, exist as separate physical entities, or be integrated into a single module. The integrated modules described above can be implemented in hardware, as software functional modules, or a combination of hardware and software.

[0210] Figure 16 is a schematic block diagram of another communication device 1100 provided in this application. Optionally, the communication device 1100 may be a chip or a chip system. Optionally, in this application, the chip system may be composed of chips or may include chips and other discrete devices.

[0211] The communication device 1100 can be used to implement the functions of any of the network elements (e.g., network-side devices or terminal-side devices) described in the foregoing embodiments. The communication device 1100 may include at least one processor 1110. Optionally, the processor 1110 is coupled to a memory, which may be located within the communication device 1100, integrated with the processor, or located outside the communication device 1100. As an example, the communication device 1100 may also include at least one memory 1120. The memory 1120 stores the necessary computer programs (or computer instructions) and / or data for implementing the corresponding functions of any of the network elements in any of the above method embodiments; the processor 1110 may execute the computer programs stored in the memory 1120 to complete the methods implemented by any of the network elements in any of the above method embodiments.

[0212] The communication device 1100 may also include a communication interface 1130, through which the communication device 1100 can interact with other devices. For example, the communication interface 1130 may be a transceiver, circuit, bus, module, pin, or other type of communication interface. When the communication device 1100 is a chip-based device or circuit, the communication interface 1130 in the device 1100 may also be an input / output circuit, capable of inputting information (or receiving information) and outputting information (or sending information). The processor may be an integrated processor, a microprocessor, an integrated circuit, or a logic circuit, and the processor can determine the output information based on the input information.

[0213] The coupling in this application refers to indirect coupling or communication connection between devices, units, or modules, which can be electrical, mechanical, or other forms, used for information exchange between devices, units, or modules. The processor 1110 may operate in conjunction with the memory 1120 and the communication interface 1130. This application does not limit the specific connection medium between the processor 1110, the memory 1120, and the communication interface 1130.

[0214] Optionally, as shown in Figure 16, the processor 1110, the memory 1120, and the communication interface 1130 are interconnected via a bus 1140. The bus 1140 can be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc. The bus can be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one line is used to represent the bus 1140 in Figure 16, but this does not indicate that there is only one bus or one type of bus.

[0215] Figure 17 is a schematic structural diagram of the chip provided in this application. Chip 30 includes a processing circuit 31 and a communication circuit 32. The processing circuit 31 can be a logic circuit, integrated circuit, etc., and the communication circuit 32 can be an input / output circuit, input / output interface, interface circuit, etc., capable of inputting information (or receiving information) or outputting information (or sending information). Chip 30 can execute the methods performed by the network-side device or the terminal-side device in the various embodiments of this application. The processing circuit 31 can be one or more processors, or all or part of the circuitry in one or more processors used for control or processing. Optionally, the functions on the terminal side or the network side can be deployed in different parts of the chip.

[0216] In addition, this application also provides a computer-readable storage medium storing computer instructions, which, when executed on a computer, cause operations and / or processes performed by a terminal-side device or a network-side device in the various method embodiments of this application to be executed.

[0217] This application also provides a computer program product, which includes computer program code or instructions. When the computer program code or instructions are run on a computer, the operations and / or processes performed by the terminal-side device or the network-side device in the various method embodiments of this application are executed.

[0218] This application also provides a chip including a processor, where a memory for storing a computer program is provided independently of the chip. The processor executes the computer program stored in the memory, such that operations and / or processes performed by a terminal-side device or a network-side device in any method embodiment are executed. Further, the chip may also include a communication interface. The communication interface may be an input / output interface or an interface circuit, etc. Further, the chip may also include a memory.

[0219] This application also provides a chip, which may include circuitry and an input / output interface. The circuitry may be logic circuitry, integrated circuits, etc. For example, the circuitry may be one or more processors, or all or part of the circuitry in one or more processors used to implement one or more processing, control, or computing functions. The input / output interface may also be an input / output circuit, or an interface circuit, capable of inputting information (or receiving information) and / or outputting information (or sending information). The chip may include a chip system. Optionally, the chip system may be composed of chips or may include chips and other discrete devices. The chip can be used to execute the methods implemented by terminal-side devices or network-side devices in the various embodiments of this application. Optionally, the chip may be a baseband chip, also known as a modem.

[0220] Furthermore, this application provides a communication system, including a terminal-side device and a network-side device as described in any embodiment of this application. This communication system can implement the model training method provided in any of the embodiments shown in Figures 8 to 14.

[0221] Those skilled in the art will understand that, for the sake of convenience and brevity, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.

[0222] The processor in this application embodiment has signal processing capabilities and can be a central processing unit (CPU), or a general-purpose processor, digital signal processor (DSP), application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the methods disclosed in this application can be directly embodied in the execution of the hardware processor, or executed by a combination of hardware and software modules within the processor. The software modules can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. This storage medium is located in memory; the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above methods.

[0223] In the embodiments of this application, memory is any medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but is not limited thereto. The memory in this application can also be a circuit or any other means capable of implementing a storage function for storing computer programs and / or data. As an example, memory can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. Non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. Volatile memory can be random access memory (RAM), which serves as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous linked dynamic random access memory (SLDRAM), and direct Rambus RAM (DR RAM). It should be noted that the memory used in the systems and methods described herein is intended to include, but is not limited to, the types described above or any other suitable types of memory.

[0224] The technical solutions provided in this application can be implemented in whole or in part through software, hardware, firmware, or any combination thereof. When implemented using software, they can be implemented in whole or in part as a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, a terminal device, an access network device, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., digital video discs (DVDs)), or semiconductor media, etc.

[0225] In the embodiments of this application, "at least one" refers to one or more items. "More than one" means two or more items. "And / or" is used to describe the relationship between related objects, indicating that there can be three relationships. For example, A and / or B can represent three situations: A exists alone, A and B exist simultaneously, and B exists alone.

[0226] The term "comprising" and any variations thereof used in the embodiments of this application are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the steps or units listed, but may optionally include other steps or units not listed, or may optionally include other steps or units inherent to these processes, methods, products, or devices.

[0227] In this application, examples may reference each other without logical contradiction. For example, methods and / or terms between method embodiments may reference each other, functions and / or terms between device embodiments may reference each other, and functions and / or terms between device examples and method examples may reference each other.

[0228] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0229] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.

[0230] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0231] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0232] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, ROM, RAM, magnetic disks, or optical disks.

[0233] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A method for training a model, characterized in that the method includes: receiving a first output of a first encoding model, wherein the first encoding model and a first decoding model are matched, and the first output includes compressed information of channel information; obtaining a first input of the first encoding model based on the first output and the first encoding model, wherein the first input of the first encoding model corresponds to a second output of the first encoding model, and the error between the second output and the first output is less than a threshold; and obtaining a second encoding model based on the first input.

2. The method according to claim 1, characterized in that the step of obtaining the first input of the first encoding model based on the first output and the first encoding model includes: obtaining the first input based on the first output, the first encoding model, and a data generation strategy; wherein the data generation strategy includes: keeping the model parameters of the first encoding model unchanged, and adjusting the input of the first encoding model to determine an input of the first encoding model such that the error between the output of the first encoding model and a given label is less than the threshold, wherein the label corresponds to the first output of the first encoding model; and, during the process of adjusting the input of the first encoding model, when the error between the output of the first encoding model and the label is less than the threshold, the input corresponding to that output of the first encoding model is the first input.

3. The method according to claim 2, characterized in that the method further includes: receiving information related to the training data of the first encoding model, wherein the first output is obtained based on the training data.

4. The method according to claim 3, characterized in that the information related to the training data includes at least one of the mean and the variance of the training data.

5. The method according to claim 3 or 4, characterized in that the step of obtaining the second encoding model based on the first input includes: obtaining the second encoding model based on the first input and the information related to the training data.

6. The method according to any one of claims 2 to 5, characterized in that the method further includes: receiving first information, wherein the first information indicates the data generation strategy.

7. The method according to any one of claims 1 to 6, characterized in that the step of obtaining the second encoding model based on the first input includes: obtaining a second decoding model based on the first input and the first encoding model, wherein the output of the second decoding model corresponds to the first input of the first encoding model, and the input of the second decoding model is the output of the first encoding model when the first input is used as the input of the first encoding model; and obtaining the second encoding model based on the second decoding model, wherein the second encoding model matches the first decoding model.

8. The method according to any one of claims 1 to 6, characterized in that the step of obtaining the second encoding model based on the first input includes: obtaining a third output based on the first input and the first encoding model; and obtaining the second encoding model, wherein the input of the second encoding model includes the first input, the output of the second encoding model includes the third output, and the second encoding model matches the first decoding model.

9. The method according to any one of claims 1 to 8, characterized in that the method further includes: receiving the first encoding model.

10. A method for training a model, characterized in that the method includes: obtaining a first encoding model and a first decoding model, wherein the first encoding model and the first decoding model are matched; and sending a first output of the first encoding model, wherein the first output is used to determine a second encoding model, and the input of the second encoding model is determined based on the first output.

11. The method according to claim 10, characterized in that the method further includes: sending information related to the training data of the first encoding model, wherein the first output is obtained based on the training data.

12. The method according to claim 11, characterized in that the information related to the training data includes at least one of the mean and the variance of the training data.

13. The method according to any one of claims 10 to 12, characterized in that the method further includes: sending first information, wherein the first information indicates a data generation strategy, and the data generation strategy and the first output are used to determine the input of the second encoding model.

14. The method according to any one of claims 10 to 13, characterized in that the method further includes: sending the first encoding model.

15. A communication device, characterized in that it includes modules or units for performing the method as described in any one of claims 1 to 9; or it includes modules or units for performing the method as described in any one of claims 10 to 14.

16. A communication device, characterized in that the device includes a processor coupled to a memory, the processor being configured to execute a computer program or instructions stored in the memory to cause the communication device to perform the method as described in any one of claims 1 to 9, or to perform the method as described in any one of claims 10 to 14.

17. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions that, when executed on a computer, implement the method as described in any one of claims 1 to 9, or implement the method as described in any one of claims 10 to 14.

18. A computer program product, characterized in that, when the computer program product is run, it causes the method as described in any one of claims 1 to 9 to be implemented, or causes the method as described in any one of claims 10 to 14 to be implemented.

19. A communication system, characterized in that it includes means for performing the method as described in any one of claims 1 to 9 and means for performing the method as described in any one of claims 10 to 14.
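
Illustrative sketch (not part of the claims). The data generation strategy of claims 1 and 2 can be pictured as an optimization over the encoder input while the encoder parameters stay frozen. The Python sketch below is a minimal illustration under stated assumptions: the first encoding model is replaced by a small fixed linear map, the error is a squared error, and the input is adjusted by plain gradient descent. The names FIRST_ENCODER_WEIGHTS, encode, squared_error, reconstruct_input, network_side_csi and training_pairs, as well as the threshold and step size, are hypothetical and do not come from this application, which does not prescribe a particular model, error metric, or optimizer.

```python
import random

# Hypothetical stand-in for the first encoding model: a fixed linear map y = W x.
# A real CSI encoder would be a neural network; only the frozen-parameter idea matters here.
FIRST_ENCODER_WEIGHTS = [
    [0.5, -0.2, 0.1, 0.3],
    [0.0, 0.4, -0.3, 0.2],
]


def encode(weights, x):
    """Apply the encoder to one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def squared_error(a, b):
    """Sum of squared differences between two vectors (assumed error metric)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def reconstruct_input(weights, label, threshold=1e-8, lr=0.5, max_steps=10_000, seed=0):
    """Adjust only the input (never the weights) until the encoder output
    is within `threshold` of the received first output (`label`)."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(len(weights[0]))]
    for _ in range(max_steps):
        residual = [o - t for o, t in zip(encode(weights, x), label)]
        if sum(r * r for r in residual) < threshold:
            return x  # this plays the role of the "first input"
        # Gradient of the squared error with respect to the input: 2 * W^T * residual.
        grad = [
            2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(x))
        ]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    raise RuntimeError("input adjustment did not reach the error threshold")


# Network side (simulated): first outputs computed from data the terminal never sees.
network_side_csi = [[0.9, -0.4, 0.2, 0.7], [-0.3, 0.8, 0.5, -0.6]]
first_outputs = [encode(FIRST_ENCODER_WEIGHTS, csi) for csi in network_side_csi]

# Terminal side: rebuild one encoder input per received first output.
first_inputs = [
    reconstruct_input(FIRST_ENCODER_WEIGHTS, label, seed=k)
    for k, label in enumerate(first_outputs)
]

# Pair each first input with the encoder output it produces (the "third output" of claim 8).
# Such pairs could then supervise the training of a second encoding model.
training_pairs = [(x, encode(FIRST_ENCODER_WEIGHTS, x)) for x in first_inputs]
```

Because the stand-in encoder compresses four values into two, a reconstructed input generally differs from the original network-side sample. It only has to reproduce the received output within the threshold, which is the condition stated in claim 1.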