Model training method and apparatus

By training on different types of channel information data, the model matching between the terminal and network equipment is ensured, solving the problem of network equipment recovering channel information when the terminal reports partial channel information, improving the accuracy of channel measurement and reducing data transmission resource overhead.

CN122310102A · Pending · Publication Date: 2026-06-30 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HUAWEI TECH CO LTD
Filing Date
2024-12-31
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In scenarios where terminals report partial channel information, how the network device model can correctly recover the channel information becomes a problem that needs to be solved.

Method used

By acquiring different types of data used to characterize channel information, the second model is trained so that it can not only process the data correctly, but also match the first model, thereby enabling the correct transmission of channel information between the terminal and network devices.

Benefits of technology

This technology enables network devices to correctly recover channel information even when the terminal reports partial channel information, thereby improving the accuracy of channel measurement and reducing the overhead of data transmission resources.

Abstract

This application discloses a model training method and apparatus, relating to the field of communication technology. In scenarios where the terminal reports partial channel information, it enables a network device's model to correctly recover channel information based on the compressed information reported by the terminal. The method includes: acquiring first type data, second type data, and third type data; and training a second model based on the first type data, the second type data, and the third type data. The first type data is used to characterize first channel information. The second type data is the output data obtained by inputting the first type data into a first model. The third type data is used to characterize second channel information, and the second channel information includes at least one of the following: all or part of the channel information in the first channel information, or third channel information, wherein the third channel information is different from the first channel information.

Description

Technical Field

[0001] This application relates to the field of communication technology, and in particular to a model training method and apparatus.

Background Technology

[0002] After obtaining channel information by measuring the reference signal, the terminal can input this information into its model (e.g., an encoder) to obtain compressed channel information, which it then feeds back to the base station. The network device inputs this compressed channel information into its model (e.g., a decoder) to reconstruct the channel information. This approach leverages the nonlinear feature extraction capabilities of neural networks to improve the accuracy of channel measurements and reduces the amount of data transmitted between the terminal and network devices, thereby lowering the resource overhead of the terminal reporting channel information.

[0003] To further reduce the amount of data reported by the terminal when reporting channel information, the terminal can use a model to compress the channel information of a portion of the channels and report the compressed information of that portion. However, in scenarios where the terminal reports channel information of only a portion of the channels, how the network device's model can correctly recover the channel information based on the compressed information reported by the terminal becomes a problem that needs to be solved.

Summary of the Invention

[0004] To address the aforementioned technical issues, this application provides a model training method and apparatus that enable the network device's model to correctly recover channel information based on the compressed information reported by the terminal, in scenarios where the terminal reports channel information for only a portion of the channels.

[0005] In a first aspect, a model training method is provided. This method can be executed by a first device, or by a component of the first device, such as its processor, chip, or chip system, or by a logic module or software capable of implementing all or part of the functions of the first device. The following description uses the execution of this method by a first device as an example. The model training method includes: the first device acquiring first-type data representing first channel information, third-type data representing second channel information, and second-type data. The second-type data represents the output of the first model based on the first-type data. The second channel information includes at least one of the following: all or part of the channel information in the first channel information, or third channel information, wherein the third channel information is different from the first channel information; the first device trains the second model based on the aforementioned first-type data, third-type data, and second-type data. As an example, the first-type data includes processed first channel information, such as the result of performing a Fourier transform on the first channel information.
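
As a non-limiting illustration of the three types of data described above, the following Python sketch groups them into one training sample. The names `TrainingSample`, `build_sample`, and `toy_first_model` are hypothetical, and the pair-averaging function merely stands in for a first model; this application does not specify any of them.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


@dataclass
class TrainingSample:
    first_type: Vector   # characterizes the first channel information
    second_type: Vector  # output of the first model for first_type
    third_type: Vector   # characterizes the second channel information


def build_sample(first_type: Vector, third_type: Vector,
                 first_model: Callable[[Vector], Vector]) -> TrainingSample:
    """Derive the second-type data by running the first model on the first-type data."""
    return TrainingSample(first_type, first_model(first_type), third_type)


def toy_first_model(x: Vector) -> Vector:
    """Hypothetical stand-in for a first model: averages adjacent pairs of values."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


# Assumed toy data: the second channel information covers eight channels,
# and the first channel information is the first four of them.
full_channel = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
partial_channel = full_channel[:4]

sample = build_sample(partial_channel, full_channel, toy_first_model)
```

In this toy case the second channel information contains all of the first channel information plus additional channels, which is one of the options listed in the method.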

[0006] In this embodiment, when the first device trains the second model, the training data includes not only first-type data and third-type data, but also second-type data, which represents the output of the first model based on the first-type data. The second model trained based on this data will not only be able to correctly process both first-type and third-type data, but also achieve the functionality of the first model or be compatible with it. Specifically, when a terminal uses the second model, the terminal's second model can achieve the functionality of the first model. Alternatively, when a network device uses the second model and a terminal uses the first model, the terminal's first model and the network device's second model can be correctly matched. This allows the network device's model to correctly recover channel information based on the compressed information reported by the terminal, even when the terminal reports partial channel information. The first device can be either a terminal or a network device; this application does not limit this.

[0007] In one possible implementation, the second model has the same function as the first model; in another possible implementation, the second model is used to recover the second type of data.

[0008] For example, the first model is a proxy encoder jointly trained with the decoder in the network device, and the second model is the encoder of the terminal. Training in this way makes the terminal's encoder more similar to the proxy encoder of the network device. Since the proxy encoder of the network device is jointly trained with the decoder, the proxy encoder matches the decoder, which in turn enables the terminal's encoder in this application to match the network device's decoder. As another example, the first model is the terminal's encoder, and the second model is the network device's decoder. The network device's decoder is trained based on the output data of the terminal's encoder, allowing the network device's decoder to match the terminal's encoder.

[0009] In one possible implementation, the method for acquiring the first type of data includes: after acquiring the third type of data and a first mapping relationship representing the mapping relationship between the third type of data and the first type of data, the first device acquires the first type of data from the third type of data based on the first mapping relationship. In this way, the first device does not need to directly receive the first type of data, thereby reducing the transmission resources required for the first device to acquire the first type of data. The first device can then train a second model based on the first type of data, the second type of data, and the third type of data.
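
As a minimal sketch of this implementation, the following Python code derives first-type data from third-type data. It assumes that the first mapping relationship takes the form of a list of indices into the third-type data; this application does not mandate that form, and the name `apply_mapping` and the sample values are hypothetical.

```python
def apply_mapping(third_type, mapping):
    """Select the entries of the third-type data named by the mapping (a list of indices)."""
    return [third_type[index] for index in mapping]


# Assumed toy data: third-type data covering eight channels.
third_type = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]

# Assumed first mapping relationship: every second channel is measured and reported.
first_mapping = [0, 2, 4, 6]

first_type = apply_mapping(third_type, first_mapping)
```

Under this assumption, only the mapping needs to reach the first device in addition to the third-type data, rather than a separate copy of the first-type data.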

[0010] In one possible implementation, the method further includes: a first device determining a second mapping relationship; the first device obtaining fourth type data from third type data based on the second mapping relationship, the fourth type data being used to characterize fourth channel information, the fourth channel information being a portion of the channel information in the second channel information, and the fourth channel information not being entirely identical to the first channel information; and the first device training a second model based on the fourth type data and the third type data. This increases the diversity of training data for the second model, thereby improving the robustness of model training. Optionally, the first device can perform augmented training on the second model based on the fourth type data and the second type data, thereby further improving the model performance of the second model.
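
The augmentation described in this implementation can be sketched as follows, again assuming that a mapping relationship is a list of indices. Drawing the second mapping relationships at random is only one hypothetical way of determining them; the function names, the seed, and the sample values are likewise assumptions.

```python
import math
import random


def apply_mapping(third_type, mapping):
    """Select the entries of the third-type data named by the mapping (a list of indices)."""
    return [third_type[index] for index in mapping]


def sample_second_mappings(total, size, first_mapping, count, seed=0):
    """Draw `count` index subsets of the given size that differ from the first mapping."""
    if math.comb(total, size) < 2:
        raise ValueError("need at least two distinct index subsets to choose from")
    rng = random.Random(seed)
    excluded = sorted(first_mapping)
    mappings = []
    while len(mappings) < count:
        candidate = sorted(rng.sample(range(total), size))
        if candidate != excluded:  # fourth channel info must not equal the first
            mappings.append(candidate)
    return mappings


third_type = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
first_mapping = [0, 2, 4, 6]

second_mappings = sample_second_mappings(len(third_type), 4, first_mapping, count=3)
fourth_type = [apply_mapping(third_type, mapping) for mapping in second_mappings]
```

Each entry of `fourth_type` is a portion of the second channel information that is not entirely identical to the first channel information, and can be paired with the third-type data as an additional training sample.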

[0011] In one possible implementation, the first device determines the first mapping relationship by: the first device obtaining the first mapping relationship from other devices; or, the first device determining the first mapping relationship autonomously. That is, the first device can determine the first mapping relationship based on different methods.

[0012] In one possible implementation, the first model and the third model are matched.

[0013] In one possible implementation, a first model is used to determine the output of the first model based on the first channel information, and a third model is used to recover the second channel information based on the output of the first model. In other words, the first model and the third model in this application can cooperate to achieve the conversion from the first channel information to the second channel information. Thus, when the first model is a proxy encoder of a network device, the third model is a decoder of a network device, and the second model is an encoder of a terminal, the second model trained based on the output data of the first model can cooperate with the third model to achieve the conversion from the first channel information to the second channel information. Specifically, the third model outputs the second channel information or recovers the second channel information; that is, the data output by the third model is the second channel information; or, since there is some loss when the third model recovers the channel information, resulting in inconsistency between the recovered channel information and the actual channel information, the output of the third model can be an approximation of the second channel information.

[0014] In one possible implementation, the process of the first device training the second model includes: training the second model and the fourth model based on first type data, second type data, and third type data. In other words, the first device jointly trains the second model and the fourth model to obtain the trained second model. This allows the trained second model to be applicable to scenarios where two-end models work together to convert information from the first channel to the second channel. Specifically, when the second model has the same function as the first model, the fourth model has the same function as the third model. For example, the fourth model is a proxy decoder for the terminal, and the third model is a decoder for the network device; in this case, the fourth model and the third model have the same function. Alternatively, when the functions of the second model and the first model match, the fourth model has the same function as the first model. For example, the fourth model is a proxy encoder for the network device, and the first model is an encoder for the terminal; in this case, the fourth model and the first model have the same function.

[0015] In one possible implementation, the process of the first device jointly training the second model and the fourth model includes: the first device training the fourth model based on the first type of data and the second type of data; the first device training the second model and the trained fourth model based on the second data, wherein the second data includes the first type of data and the third type of data, or the second data includes the first type of data, the second type of data and the third type of data.

[0016] For example, when the first device is a network device, it first trains the fourth model (in this case, the fourth model is the proxy encoder of the network device) based on the first type of data and the second type of data, so that the fourth model can realize the function of the first model (in this case, the first model is the encoder of the terminal). Afterward, the first device jointly trains the second model (in this case, the second model is the decoder in the network device) and the fourth model as a whole, so that the trained second model can match the fourth model, and thus match the first model. Optionally, during the joint training process, the first device may not adjust the parameters of the fourth model, but only adjust the parameters of the second model, so that the trained second model can match the trained fourth model, and thus match the first model.

[0017] In one possible implementation, the process of the first device jointly training the second model and the fourth model includes: the first device training the fourth model based on the second type of data and the third type of data; the first device training the second model and the trained fourth model based on the second data, wherein the second data includes the first type of data and the third type of data, or the second data includes the first type of data, the second type of data and the third type of data.

[0018] For example, when training the model in the terminal, the first device first trains the fourth model (in this case, the fourth model is the terminal's proxy decoder) based on the second and third types of data, so that the fourth model matches the first model (in this case, the first model is the network device's proxy encoder). After this, the first device jointly trains the second model (in this case, the second model is the encoder in the terminal) and the fourth model as a whole, so that the trained second model can perform the function of the first model. Optionally, during the joint training process, the first device may not adjust the parameters of the fourth model, but only adjust the parameters of the second model, so that the trained second model can match the trained fourth model, thereby enabling the second model to perform the function of the first model.

[0019] In one possible implementation, the first device can determine a first loss function based on the first data and train a fourth model based on the first loss function, so that the trained fourth model can perform the function of processing the first data.

[0020] In one possible implementation, when the first device is a network device, the fourth model is the proxy encoder of the network device. The process of the first device determining a first loss function based on the first data and training the fourth model based on the first loss function includes: the first device inputting first type data into the fourth model to obtain the output data of the fourth model; the first device calculating the first loss function of the fourth model based on the output data and second type data of the fourth model; and the first device iteratively training the fourth model based on the first loss function, the first type data, and the second type data until the convergence condition of the fourth model is met, so as to obtain the trained fourth model.
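
The iterative training in this implementation can be sketched in Python as follows. The sketch assumes a linear fourth model, a mean-squared-error first loss function, and stochastic gradient descent with a loss threshold as the convergence condition; none of these choices is specified by this application. `HIDDEN_TERMINAL_ENCODER` is a hypothetical matrix used only to generate toy second-type data.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def train_proxy_encoder(first_type, second_type, lr=0.05, tol=1e-8, max_epochs=5000):
    """Fit a linear fourth model so that its output approaches the second-type data."""
    n_out, n_in = len(second_type[0]), len(first_type[0])
    weights = [[0.0] * n_in for _ in range(n_out)]
    loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        for x, target in zip(first_type, second_type):
            output = matvec(weights, x)  # output data of the fourth model
            errors = [o - t for o, t in zip(output, target)]
            loss += sum(e * e for e in errors) / n_out  # first loss (MSE) for this sample
            for r in range(n_out):
                for c in range(n_in):
                    # Gradient of the per-sample MSE with respect to weights[r][c].
                    weights[r][c] -= lr * 2.0 * errors[r] * x[c] / n_out
        loss /= len(first_type)
        if loss < tol:  # assumed convergence condition
            break
    return weights, loss


# Toy data: the second-type data is produced by a hypothetical terminal encoder.
HIDDEN_TERMINAL_ENCODER = [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.3, 0.2]]
rng = random.Random(0)
first_type = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(32)]
second_type = [matvec(HIDDEN_TERMINAL_ENCODER, x) for x in first_type]

proxy_encoder, final_loss = train_proxy_encoder(first_type, second_type)
```

Because the toy second-type data is an exact linear function of the first-type data, the trained proxy encoder approaches the hidden encoder; with a real terminal encoder, the proxy would only approximate it.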

[0021] In one possible implementation, when the first device is a terminal, the fourth model is a proxy decoder for the terminal. The process of the first device determining a first loss function based on the first data and training the fourth model based on the first loss function includes: the first device inputting second type data into the fourth model to obtain the output data of the fourth model; the first device calculating the first loss function of the fourth model based on the output data of the fourth model and the third type data; and the first device iteratively training the fourth model based on the first loss function, the second type data, and the third type data until the convergence condition of the fourth model is met, so as to obtain the trained fourth model.

[0022] In one possible implementation, the method further includes: determining a second loss function based on the second data, and using the second loss function to train a second model. This enables the trained second model to process the second data.

[0023] In one possible implementation, when the first device is a terminal, the second model is the terminal's encoder, and the fourth model is the terminal's proxy decoder. The process of the first device determining a second loss function based on the second data and training the second model based on the second loss function includes:

[0024] The first device inputs first type data into the second model to obtain second output data. It then inputs the second output data into a trained fourth model to obtain third output data. Based on the third output data and the third type data, the first device calculates a second loss function; alternatively, the first device calculates a first sub-loss function based on the second output data and the second type data, and a second sub-loss function based on the third output data and the third type data. The first and second sub-loss functions are then weighted and summed to obtain the second loss function. After this, the first device iteratively trains the second model and the trained fourth model based on the second loss function until the convergence conditions for training the second and fourth models are met, thus obtaining the trained second model in the first device.
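
The two ways of forming the second loss function described in this implementation can be sketched as follows. Mean squared error as the sub-loss and the weights `w1` and `w2` are assumptions, as are the toy callables standing in for the second model (terminal encoder) and the trained fourth model (proxy decoder).

```python
def mse(predicted, target):
    """Mean squared error between two equally long vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def second_loss(first_type, second_type, third_type,
                second_model, fourth_model, w1=None, w2=1.0):
    """Second loss: reconstruction only (w1 is None) or a weighted sum of two sub-losses."""
    second_output = second_model(first_type)     # output of the terminal encoder
    third_output = fourth_model(second_output)   # output of the trained proxy decoder
    reconstruction = mse(third_output, third_type)   # second sub-loss
    if w1 is None:
        return reconstruction
    matching = mse(second_output, second_type)       # first sub-loss
    return w1 * matching + w2 * reconstruction


def toy_encoder(x):
    """Hypothetical second model."""
    return [x[0], 2.0 * x[1]]


def toy_proxy_decoder(code):
    """Hypothetical trained fourth model."""
    return [code[0], code[0], code[1], code[1]]


first_type = [1.0, 1.0]
second_type = [1.0, 4.0]            # assumed output of the first model
third_type = [1.0, 1.0, 2.0, 4.0]   # assumed second channel information

loss_reconstruction_only = second_loss(first_type, second_type, third_type,
                                       toy_encoder, toy_proxy_decoder)
loss_weighted = second_loss(first_type, second_type, third_type,
                            toy_encoder, toy_proxy_decoder, w1=0.25, w2=0.75)
```

In the weighted form, the first sub-loss pulls the second model toward the output of the first model, while the second sub-loss keeps the end-to-end reconstruction close to the third-type data.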

[0025] In one possible implementation, when the first device is a network device, the second model is the decoder of the network device, and the fourth model is the proxy encoder of the network device. The process of the first device determining the second loss function based on the second data and training the second model based on the second loss function includes:

[0026] The training process of the first device on the second model and the trained fourth model includes: the first device inputs first type of data into the trained fourth model to obtain the output data of the trained fourth model; the first device inputs the output data of the fourth model into the second model to obtain the output data of the second model. The first device calculates a second loss function based on the output data of the second model and the third type of data; or, the first device calculates a third sub-loss function based on the output data of the fourth model and the second type of data, and calculates a fourth sub-loss function based on the output data of the second model and the third type of data, and then performs a weighted sum of the third and fourth sub-loss functions to obtain the second loss function. After this, the first device iteratively trains the second model and the trained fourth model based on the second loss function until the convergence conditions for training the second and fourth models are met, thus obtaining the trained second model in the first device.
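
A minimal Python sketch of the network-side case follows, using the option in which the parameters of the trained fourth model are not adjusted and only the second model is updated. Linear models, a mean-squared-error second loss function, and toy channel data generated from a hypothetical two-dimensional basis are all assumptions made so that the example is small and exactly solvable.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def train_decoder(codes, third_type, lr=0.05, tol=1e-8, max_epochs=5000):
    """Train a linear second model (decoder) on the outputs of a frozen fourth model."""
    n_out, n_in = len(third_type[0]), len(codes[0])
    decoder = [[0.0] * n_in for _ in range(n_out)]
    loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        for code, target in zip(codes, third_type):
            output = matvec(decoder, code)  # output data of the second model
            errors = [o - t for o, t in zip(output, target)]
            loss += sum(e * e for e in errors) / n_out  # second loss (MSE) for this sample
            for r in range(n_out):
                for c in range(n_in):
                    decoder[r][c] -= lr * 2.0 * errors[r] * code[c] / n_out
        loss /= len(codes)
        if loss < tol:  # assumed convergence condition
            break
    return decoder, loss


# Hypothetical toy channel: four channels driven by two latent values.
BASIS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
# Hypothetical trained fourth model (proxy encoder); its parameters stay frozen.
FROZEN_PROXY_ENCODER = [[1.0, 0.5], [-0.5, 1.0]]

rng = random.Random(1)
latents = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(32)]
third_type = [matvec(BASIS, z) for z in latents]   # full channel information
first_type = [full[:2] for full in third_type]     # partial channel information
codes = [matvec(FROZEN_PROXY_ENCODER, x) for x in first_type]

decoder, final_loss = train_decoder(codes, third_type)
```

Here the decoder learns to recover all four channels from an encoding of only two of them, which is possible in this toy setting because the unmeasured channels are linear combinations of the measured ones.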

[0027] In one possible implementation, the first and third models are models obtained by training on first and third types of data, respectively. In this way, the second model can directly cooperate with the trained first or third model to transmit channel information.

[0028] In one possible implementation, a first model is used to compress (or encode) first channel information to obtain compressed data (or encoded data); a third model is used to recover second channel information based on the compressed data (or encoded data).

[0029] In a second aspect, a model training apparatus is provided for implementing the various methods described above. This model training apparatus can be the first device described in the first aspect, or an apparatus containing the first device, or an apparatus contained within the first device, such as a chip. The model training apparatus includes modules, units, or means corresponding to the methods described above. These modules, units, or means can be implemented in hardware, software, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the functions described above.

[0030] In some possible designs, the model training device may include a processing module and a transceiver module. The transceiver module, also called a transceiver unit, is used to implement the sending and/or receiving functions in any of the above aspects and their possible implementations. The transceiver module may consist of transceiver circuits, transceivers, or communication interfaces. The processing module can be used to implement the processing functions in any of the above aspects and their possible implementations.

[0031] In some possible designs, the transceiver module includes a sending module and a receiving module, which are used to implement the sending and receiving functions in any of the above aspects and any possible implementation methods.

[0032] In a third aspect, a model training apparatus is provided, comprising: at least one processor; the processor being configured to execute a computer program or instructions stored in a memory to cause the model training apparatus to perform the methods of any of the above aspects. The memory may be coupled to the processor, or may be independent of the processor. The model training apparatus may be the first device as described in the first aspect, or an apparatus comprising the first device, or an apparatus contained within the first device, such as a chip. In some possible designs, the model training apparatus includes a memory for storing necessary program instructions and data.

[0033] In one possible implementation, the processor includes logic circuitry and input and / or output interfaces. The output interfaces are used to perform the sending action in the corresponding method, and the input interfaces are used to perform the receiving action in the corresponding method.

[0034] In one possible implementation, the model training device further includes a communication interface and a communication bus, with the processor, memory, and communication interface connected via the communication bus. The communication interface is used to perform the sending and receiving actions in the corresponding method. The communication interface can also be called a transceiver. Optionally, the communication interface includes a transmitter and a receiver; in this case, the transmitter is used to perform the sending action in the corresponding method, and the receiver is used to perform the receiving action in the corresponding method.

[0035] In some possible designs, the model training device can be a chip or a chip system. When the model training device is a chip system, it can be composed of chips or include chips and other discrete components. When the model training device is a chip, the aforementioned transmitting action / function can be understood as output, and the aforementioned receiving action / function can be understood as input.

[0036] In a fourth aspect, a chip is provided, which includes a processor for implementing the functions involved in any of the above aspects or any implementation thereof.

[0037] In some possible designs, the chip includes a memory for storing necessary program instructions and data.

[0038] In a fifth aspect, a computer-readable storage medium is provided that stores a computer program or instructions that, when run on a model training device, enable the model training device to perform the methods of any of the above aspects or any implementation thereof.

[0039] In a sixth aspect, a computer program product containing instructions is provided, which, when run on a model training device, enables the model training device to perform the methods of any of the above aspects or any implementation thereof.

[0040] The technical effects of any of the implementation methods in aspects two through six can be found in the technical effects of the corresponding implementation methods in aspect one, and will not be repeated here.

[0041] It should be noted that any of the possible implementations of any of the above aspects can be combined, provided that the solutions do not contradict each other.

Attached Figure Description

[0042] Figure 1 is a schematic diagram of a neural network model;

[0043] Figure 2 is a schematic diagram illustrating the application of a dual-ended model encoder and decoder in a CSI-RS measurement scenario;

[0044] Figure 3 is a schematic diagram of the system architecture of a communication system;

[0045] Figure 4 is a schematic diagram of the system architecture of another communication system;

[0046] Figure 5 is a schematic diagram of the structure of an O-RAN system;

[0047] Figure 6 is a schematic diagram of a model training device;

[0048] Figure 7 is a flowchart illustrating a model training method;

[0049] Figure 8 is a flowchart illustrating another model training method;

[0050] Figure 9 is a schematic diagram of a first device training a second model independently based on training data, in the case where the second model is an encoder at the terminal;

[0051] Figure 10 is a schematic diagram of a first device training a second model independently based on training data, in the case where the second model is a decoder of a network device;

[0052] Figure 11 is a flowchart illustrating another model training method;

[0053] Figure 12 is a flowchart illustrating another model training method;

[0054] Figure 13 is a schematic diagram of training an encoder in a terminal;

[0055] Figure 14 is a schematic diagram of training a decoder in a network device;

[0056] Figure 15 is a schematic diagram of another model training device.

Detailed Implementation

[0057] To facilitate understanding of the technical solutions of the embodiments of this application, a brief introduction to the relevant technologies of this application is given below.

[0058] 1. Artificial intelligence (AI)

[0059] Artificial intelligence (AI) is a technology that uses artificial methods and techniques to imitate, extend, and expand human intelligence, enabling machines to react in a manner similar to human intelligence. AI can endow machines with learning capabilities, allowing them to accumulate experience and solve problems that humans can solve through experience, such as natural language understanding, image recognition, and chess.

[0060] 2. Machine learning (ML)

[0061] Machine learning is one way of implementing artificial intelligence. It is a method that gives machines the ability to perform functions that cannot be achieved through direct programming. In machine learning, a model is trained on training data, and the trained model is then used to make predictions.

[0062] 3. Neural Network (NN)

[0063] Neural networks are a type of machine learning method; they are mathematical models that mimic the behavioral characteristics of animal neural networks to process information. For example, Figure 1 illustrates a neural network model (which can be simply referred to as the model) provided in an embodiment of this application. The neural network model includes three types of computational layers: an input layer, a hidden layer, and an output layer. Each computational layer includes one or more logical decision units (also called neurons). Each neuron performs a weighted summation operation on its input values and generates an output through a nonlinear function using the weighted summation result. The weights and nonlinear function used in the weighted summation operation of the neurons in the above neural network are called the parameters of the neural network. The connections between neurons in the neural network are called the structure of the neural network, and the parameters of all neurons in the neural network constitute the parameters of this neural network.
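
The neuron computation described above can be sketched in a few lines of Python. The choice of ReLU as the nonlinear function and the specific weights are assumptions made purely for illustration.

```python
def relu(value):
    """An assumed nonlinear function: passes positive values, clips negatives to zero."""
    return max(0.0, value)


def neuron(inputs, weights, activation=relu):
    """Weighted summation of the inputs followed by a nonlinear function."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)


def layer(inputs, weight_rows, activation=relu):
    """A computational layer: one neuron per row of weights, all sharing the same inputs."""
    return [neuron(inputs, weights, activation) for weights in weight_rows]


def forward(inputs, hidden_weights, output_weights):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_weights)
    return layer(hidden, output_weights)


hidden_weights = [[0.5, -1.0], [0.5, 0.25]]  # two hidden neurons, two inputs each
output_weights = [[2.0, 3.0]]                # one output neuron

result = forward([1.0, 2.0], hidden_weights, output_weights)
```

With these values the first hidden neuron has a weighted sum of -1.5, which the nonlinear function clips to 0.0, and the second has a weighted sum of 1.0, so the output neuron produces 3.0.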

[0064] Common types of neural networks include feedforward neural networks (FNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs). These neural networks are all based on neurons.

[0065] A neural network with multiple hidden layers is called a deep neural network (DNN), and the process of machine learning based on deep neural networks is called deep learning.

[0066] 5. Channel State Information-Reference Signal (CSI-RS)

[0067] CSI-RS is a measurement reference signal used to estimate downlink channel quality. By measuring CSI-RS, channel state information (CSI) can be obtained. As an example, CSI includes: channel quality indicator (CQI), rank indicator (RI), and precoding matrix indicator (PMI).

[0068] 6. Application of neural networks in channel measurement

[0069] With the continuous increase in antenna array size (e.g., antenna port count increasing to 256T) and communication bandwidth expansion (e.g., bandwidth increasing to 100MHz), the accuracy of channel measurements is difficult to guarantee under current limited measurement overhead. For example, in scenarios where channel measurements are based on sounding reference signals (SRS), bandwidth expansion leads to higher frequency points, which in turn result in more significant Doppler effects and greater path loss, leading to channel aging and a low signal-to-noise ratio (SNR) in the measured channel. Furthermore, in scenarios based on CSI-RS measurements and channel feedback, increased channel dimensionality makes channel compression more difficult and results in greater quantization loss. Therefore, in communication systems, it is necessary to improve the accuracy of channel measurements under large array and high bandwidth conditions.

[0070] In some implementations, neural network models can be used for compressed feedback of channel information. After the terminal measures the reference signal to obtain channel information, it inputs this information into its neural network model to obtain compressed channel information. The terminal then feeds back the compressed channel information to the base station. The network device inputs this compressed channel information into its neural network model to reconstruct the channel information measured by the terminal. This leverages the nonlinear feature extraction capabilities of neural networks to improve the accuracy of channel measurements.

[0071] As an example, Figure 2 illustrates the application of a dual-ended model encoder and decoder in a CSI-RS measurement scenario. The terminal model is the encoder, used to compress the CSI-RS measurement data. The terminal sends the compressed CSI-RS measurement data to the network device. Upon receiving the compressed CSI-RS measurement data, the network device uses its decoder to reconstruct the CSI-RS measurement data, obtaining the final CSI-RS measurement data. This allows for large-port, full-band channel measurement between the terminal and network device with limited CSI-RS measurement overhead, based on a dual-ended model. During CSI-RS measurements, multiple resource sets can be used to achieve multi-port channel measurement. Currently, one resource set supports channel measurement for a maximum of 32 antenna ports; adding multiple resource sets can support channel measurement for larger arrays. In related technologies, to support high-bandwidth channel measurement, the number of resource blocks (RBs) that the terminal needs to measure can be indicated. The density indicates the number of channels measured within one RB, and the value can be 0.5, 1, or 3.
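
As a small worked example of the resource-set arithmetic above, the following Python sketch computes how many resource sets would be needed for a given number of antenna ports. It assumes that each resource set is filled up to its 32-port maximum before another is added, which this application does not state; the function name is hypothetical.

```python
MAX_PORTS_PER_RESOURCE_SET = 32  # maximum stated above for one resource set


def resource_sets_needed(num_ports, ports_per_set=MAX_PORTS_PER_RESOURCE_SET):
    """Smallest number of resource sets that covers num_ports antenna ports."""
    if num_ports <= 0:
        raise ValueError("num_ports must be positive")
    # Integer ceiling division.
    return (num_ports + ports_per_set - 1) // ports_per_set


sets_for_256_ports = resource_sets_needed(256)
```

Under this assumption, a 256-port array would need eight resource sets, since 256 / 32 = 8.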

[0072] In the embodiments of this application, with reference to the CSI-RS measurement scenario shown in Figure 2, the terminal can measure only a portion of the CSI-RS and compress the measurement data for that portion; correspondingly, the network device recovers the compressed partial CSI-RS measurement data to obtain the full CSI-RS measurement data. Alternatively, the terminal can measure only a portion of the CSI-RS and compress the measurement data for that portion, and the network device not only recovers the compressed partial CSI-RS measurement data but also predicts the measurement data for the unmeasured CSI-RS, to obtain the full CSI-RS measurement data. Alternatively, the terminal can measure only a portion of the CSI-RS and compress the measurement data for that portion; after the terminal has sent compressed partial CSI-RS measurement data multiple times, the network device predicts the full CSI-RS measurement data based on these multiple transmissions.
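As a rough illustration of partial measurement followed by full recovery, the sketch below lets the terminal keep every second sample and lets the network side fill in the rest. Linear interpolation is used purely as a stand-in for the network device's model; the function names and the sample data are hypothetical.

```python
def measure_partial(full, step):
    """Terminal side: keep only every step-th sample (the measured portion)."""
    return full[::step]


def recover_full(partial, step, full_len):
    """Network side (stand-in for the model): linearly interpolate between
    measured samples; positions past the last measured sample hold its value."""
    out = []
    for i in range(full_len):
        lo = i // step
        hi = min(lo + 1, len(partial) - 1)
        frac = (i - lo * step) / step
        out.append(partial[lo] * (1 - frac) + partial[hi] * frac)
    return out


# Hypothetical channel gains that vary smoothly across nine positions.
full_channel = [1.0 + 0.5 * i for i in range(9)]

reported = measure_partial(full_channel, step=2)   # 5 of 9 samples are measured
recovered = recover_full(reported, step=2, full_len=len(full_channel))
```

Interpolation recovers this smooth example exactly, but a real channel is not linear across ports or subcarriers, which is why a trained model matched to the terminal's model is needed.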

[0073] The above provides a detailed description of the technologies involved in this application.

[0074] As mentioned earlier, in scenarios where the terminal reports channel information for some channels, the terminal's model and the network device's model cannot be correctly matched, resulting in the inability to correctly transmit channel information between the terminal and the network device.

[0075] To address the aforementioned technical problems, this application provides a model training method. When the first device trains the second model, the training data includes not only first channel information and second channel information, but also output data obtained by inputting the first channel information into the first model. The second model trained based on this data will not only match the first and second channel information, but also match the first model. Thus, when a terminal uses the second model and a network device uses a model jointly trained with the first model to transmit channel information, the terminal's second model will be able to correctly match the network device's model, thereby enabling normal transmission of channel information. Alternatively, when the network device uses the second model and the terminal uses the first model to transmit channel information, the terminal's first model and the network device's second model will be able to correctly match, thereby enabling normal transmission of channel information. The first device can be either a terminal or a network device; this application does not limit its use.
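The roles of the three kinds of training data can be sketched with a toy example. Here the second model is assumed to sit on the receiving side, and a one-dimensional least-squares fit stands in for neural network training; `first_model`, `fit_second_model`, and the sample values are hypothetical.

```python
from statistics import mean


def first_model(x):
    """Hypothetical fixed first model, treated as a black box during training."""
    return 0.5 * x + 1.0


def fit_second_model(second_type, third_type):
    """Least-squares fit of y ~ a*z + b (toy stand-in for training the second model)."""
    z_bar, y_bar = mean(second_type), mean(third_type)
    cov = sum((z - z_bar) * (y - y_bar) for z, y in zip(second_type, third_type))
    var = sum((z - z_bar) ** 2 for z in second_type)
    a = cov / var
    b = y_bar - a * z_bar
    return a, b


# First type data: characterizes the first channel information.
first_type = [0.2, -1.0, 0.7, 1.5, -0.3]
# Second type data: output of the first model for the first type data.
second_type = [first_model(x) for x in first_type]
# Third type data: here assumed to be all of the first channel information.
third_type = list(first_type)

a, b = fit_second_model(second_type, third_type)
recovered = [a * z + b for z in second_type]
```

Because the fit sees the first model's actual outputs, the resulting second model inverts that particular first model rather than some other encoder, which is the matching property the paragraph describes.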

[0076] The following is a detailed description of the solutions provided in the embodiments of this application. Before introducing the embodiments of this application, the following points should be noted.

[0077] In the description of this application, unless otherwise stated, " / " indicates that the objects before and after are in an "or" relationship. For example, A / B can mean A or B. "And / or" in this application is merely a description of the relationship between the related objects, indicating that there can be three relationships. For example, A and / or B can mean: A exists alone, A and B exist simultaneously, and B exists alone. A and B can be singular or plural.

[0078] In the description of this application, A sending a message to B can be understood as A sending a message to B through one or more network elements.

[0079] In the description of this application, unless otherwise stated, "multiple" means two or more. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or multiple items. For example, at least one of a, b and / or c can mean: a, b, c, ab, ac, bc, or abc, where a, b, and c can be single or multiple.
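The combinations covered by "and / or" and "at least one of" can be enumerated mechanically. The helper name below is hypothetical.

```python
from itertools import combinations


def at_least_one_of(items):
    """List every non-empty combination of the given items."""
    return [
        "".join(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


and_or = at_least_one_of(["A", "B"])      # ['A', 'B', 'AB']
abc = at_least_one_of(["a", "b", "c"])    # ['a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']
```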

[0080] Furthermore, to facilitate a clear description of the technical solutions in the embodiments of this application, the terms "first" and "second" are used to distinguish identical or similar items with substantially the same function and effect. Those skilled in the art will understand that the terms "first" and "second" do not limit quantity or execution order, and that items qualified by "first" and "second" are not necessarily different.

[0081] In the embodiments of this application, the terms "exemplary" or "for example" are used to indicate that something is an example, illustration, or description. Any embodiment or design that is described as "exemplary" or "for example" in the embodiments of this application should not be construed as being more preferred or advantageous than other embodiments or design. Specifically, the use of terms such as "exemplary" or "for example" is intended to present the relevant concepts in a specific manner to facilitate understanding.

[0082] It is understood that the term "embodiment" used throughout the specification means that a specific feature, structure, or characteristic related to an embodiment is included in at least one embodiment of this application. Therefore, various embodiments throughout the specification do not necessarily refer to the same embodiment. Furthermore, these specific features, structures, or characteristics can be combined in any suitable manner in one or more embodiments. It is understood that in the various embodiments of this application, the sequence number of each process does not imply the order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0083] It is understood that in this application, "...when" and "if" both refer to the corresponding processing that will be carried out under certain objective circumstances, and are not limited to a specific time, nor do they require a judgment action to be performed during implementation, nor do they imply any other limitations.

[0084] It is understood that some optional features in the embodiments of this application can be implemented independently in certain scenarios without relying on other features, such as the current solution on which they are based, to solve the corresponding technical problems and achieve the corresponding effects. Alternatively, they can be combined with other features as needed in certain scenarios. Correspondingly, the apparatus given in the embodiments of this application can also implement these features or functions, which will not be elaborated here.

[0085] In this application, unless otherwise specified, the same or similar parts of the various embodiments can be referred to each other. Across the embodiments of this application, and across the implementations within each embodiment, unless otherwise specified or logically conflicting, the terminology and / or descriptions are consistent and can be mutually referenced. The technical features of different embodiments, and of the implementations within each embodiment, can be combined according to their inherent logical relationships to form new embodiments or implementations. The embodiments described below do not constitute a limitation on the scope of protection of this application.

[0086] The model training method provided in this application can be applied to communication systems in which terminals and network devices communicate. For example, Figure 3 is an architectural schematic of a communication system 30 provided in an embodiment of this application. The communication system 30 includes a network device 301 and a terminal 302. The network device 301 and the terminal 302 can transmit channel information via a model. The model used for transmitting channel information in the network device 301 and the terminal 302 can be a symmetric dual-ended model or an asymmetric dual-ended model. This application does not limit this.

[0087] Optionally, as shown in Figure 3, the communication system may also include an AI entity 303, which is used to perform operations such as constructing training datasets and training models. Network device 301 and / or terminal 302 can send model-related information to AI entity 303, which then constructs the training datasets and trains the models, and sends the constructed dataset and / or trained model to network device 301 and / or terminal 302.

[0088] It should be noted that the AI entity 303 can be an independently deployed physical device, or it can be a unit, module, chip, or the like inside the network device 301 and / or terminal 302. This application does not limit this. Figure 3 uses the example of a communication system consisting of one network device, two terminals, and one AI entity. In actual implementation, the number of each of these devices can be one or more, and this application does not limit the number.

[0089] The technical solutions of this application embodiment can be used in various communication systems, including 3GPP communication systems such as 4th generation (4G) systems (e.g., Long Term Evolution (LTE) systems), 5th generation (5G) systems (e.g., New Radio (NR) systems), LTE and 5G hybrid networking systems, integrated communication and sensing systems, non-terrestrial networks (NTN), device-to-device (D2D) communication systems, vehicle-to-everything (V2X) communication systems, machine-type communication (MTC) systems, Internet of Things (IoT) systems, or other future communication systems. The communication system can also be a non-3GPP communication system; there is no limitation on this.

[0090] The communication systems described above are merely examples of systems to which this application applies, and they do not limit the solutions of this application. This is stated here once and is not repeated below.

[0091] Figure 4 is a schematic diagram illustrating one possible, non-limiting communication system. As shown in Figure 4, the communication system includes a radio access network (RAN) 100 and a core network (CN) 200. RAN 100 includes at least one RAN node (e.g., 110a and 110b in Figure 4, collectively referred to as 110) and at least one terminal (e.g., 120a-120j in Figure 4, collectively referred to as 120). RAN 100 may also include other RAN nodes, such as wireless relay equipment and / or wireless backhaul equipment (not shown in Figure 4). Terminal 120 is connected to RAN node 110 wirelessly. RAN node 110 is connected to core network 200 wirelessly or via a wired connection. The core network node in core network 200 and RAN node 110 in RAN 100 can be different physical devices, or they can be the same physical device integrating core network logical functions and radio access network logical functions.

[0092] In one possible implementation, a core network node can refer to equipment in the core network 200 that provides service support to terminal 120. The core network node in core network 200 may also include at least one of the following: access and mobility management function (AMF) network elements, session management function (SMF) network elements, user plane function (UPF) network elements, policy control function (PCF) network elements, unified data management (UDM) network elements, application function (AF) network elements, network exposure function (NEF) network elements, network slice selection function (NSSF) network elements, or location management function (LMF) network elements, etc. Of course, core network 200 may also include other core network nodes, without limitation.

[0093] In one possible implementation, RAN 100 can be a cellular system related to the 3rd Generation Partnership Project (3GPP), such as a 4G or 5G mobile communication system, or a future evolved system. RAN 100 can also be an open RAN (O-RAN or ORAN), a cloud radio access network (CRAN), an NTN (such as an NTN supporting transparent mode and / or regenerative mode, or an NTN supporting staring mode (earth-fixed cell) and / or non-staring mode (earth-moving cell)), or a wireless fidelity (WiFi) system. RAN 100 can also be a communication system that integrates two or more of the above systems.

[0094] RAN node 110, sometimes also referred to as access network equipment, a RAN entity, or an access node, constitutes part of the communication system and assists terminals in achieving wireless access. Multiple RAN nodes 110 in RAN 100 can be of the same type or of different types. In some scenarios, the roles of RAN node 110 and terminal 120 are relative. For example, network element 120i in Figure 4 can be a helicopter or a drone, and it can be configured as a mobile base station. For terminal 120j, which accesses RAN 100 through network element 120i, network element 120i is a base station; for base station 110a, however, network element 120i is a terminal. RAN node 110 and terminal 120 are sometimes both referred to as communication devices. For example, network elements 110a and 110b in Figure 4 can be understood as communication devices with base station functions, while network elements 120a-120j can be understood as communication devices with terminal functions.

[0095] For RAN node 110, in one possible scenario, RAN node 110 can be a base station, an evolved NodeB (eNodeB, also known as eNB), an access point (AP), a transmission reception point (TRP), a next-generation NodeB (gNB), a base station in a future mobile communication system, or an access node in a WiFi system, etc. RAN node 110 can also be a macro base station (such as 110a in Figure 4), a micro base station or an indoor station (as also shown in Figure 4), a relay node or a donor node, or a wireless controller in a CRAN scenario. Further examples include: satellite base stations, radio network controllers (RNCs), base station controllers (BSCs), base transceiver stations (BTSs), home base stations (e.g., home evolved NodeBs, or home NodeBs, HNBs), relay stations, balloon stations, drone stations, wireless backhaul nodes, or grant nodes (G nodes) in StarSpark. It is understood that network equipment can be ground-based or non-ground-based (e.g., satellites, drones, high-altitude communication equipment). Furthermore, the names of network equipment with base station functions may differ in communication systems employing different wireless access technologies; this application does not limit this. Optionally, RAN node 110 can also be a server, wearable device, vehicle, or in-vehicle equipment. For example, in vehicle-to-everything (V2X) technology, the access network equipment can be a roadside unit (RSU). RAN node 110 may also be referred to as a next generation RAN (NG-RAN) node.

[0096] In another possible scenario, multiple RAN nodes 110 collaborate to assist the terminal in achieving wireless access, with each RAN node 110 implementing a portion of the base station's functions. For example, a RAN node 110 can be a central unit (CU), a distributed unit (DU), a CU-control plane (CU-CP), a CU-user plane (CU-UP), or a radio unit (RU), etc. CUs and DUs can be configured separately or included in the same network element, such as a baseband unit (BBU). RUs can be included in radio frequency equipment or radio frequency units, such as remote radio units (RRUs), active antenna units (AAUs), or remote radio heads (RRHs).

[0097] In different systems, CU (or CU-CP and CU-UP), DU, or RU may have different names, but those skilled in the art will understand their meaning. For example, in an ORAN system, CU can also be called O-CU (open CU), DU can also be called O-DU, CU-CP can also be called O-CU-CP, CU-UP can also be called O-CU-UP, and RU can also be called O-RU. For ease of description, this application uses CU, CU-CP, CU-UP, DU, and RU as examples. Any of the units among CU (or CU-CP, CU-UP), DU, and RU in this application can be implemented through software modules, hardware modules, or a combination of software and hardware modules.

[0098] In one possible scenario, terminal 120 can be a device used to implement wireless communication functions, such as a terminal, a chip or circuit that can be used in the terminal, or an entity associated with the terminal. Specifically, terminal 120 can be user equipment (UE), an access terminal, a terminal unit, a terminal station, a mobile station (MS), a remote station, a remote terminal, a mobile device, wireless communication equipment, a terminal agent or terminal device, a subscriber unit, a smartphone, a wireless data card, a tablet computer, a wireless modem, a laptop computer, a machine type communication (MTC) terminal, a tag, etc., in a 5G network or a future evolved public land mobile network (PLMN). The access terminal can be a cellular phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA), a handset with wireless communication capabilities, a computing device or other processing device connected to a wireless modem, an in-vehicle device or wearable device, a virtual reality (VR) terminal, an augmented reality (AR) terminal, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical care, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, or a terminal node (T-node) in StarSpark, etc. In one possible implementation, terminal 120 can be mobile or fixed. It is understood that the terminal and the mobile user can be completely independent. All user-related information can be stored in a subscriber identity module (SIM) card, which can be used on the terminal device. The terminal can then interact with network-side devices by sending and / or receiving signals over the air interface.

[0099] The chip or circuit in the terminal includes components inside the terminal, such as at least one of a chip, a central processing unit (CPU), a network processing unit (NPU), and a terminal radio frequency module.

[0100] Entities associated with the terminal include terminal-side servers, computing / processing nodes, computing / processing entities, computing / processing units, and servers such as over-the-top (OTT) servers. OTT refers to various services provided to users by a third party other than the network operator via the operator's network. Examples of OTT services include OTT voice communication services, OTT multimedia services, and OTT data processing services. The terminal exchanges relevant information (e.g., data) with this associated entity through communication. For example, the associated entity and the terminal may belong to the same vendor. Since model training, model selection, etc., may be executed not on the terminal but on the terminal-side OTT server, the term "terminal" in this embodiment also includes the terminal-side OTT server.

[0101] It should be understood that the terminal in this embodiment may also be referred to as the "UE side" or the "UE part".

[0102] In one possible implementation, the network device (e.g., access node or core network node) and terminal 120 in this embodiment can also be referred to as communication devices. These devices can be general-purpose or dedicated devices. The network device may include an access node (RAN node), an operation administration and maintenance (OAM) device, or a core network node. For the OAM device, it may include devices in an element management system (EMS) or a network management system (NMS). It should be understood that the network device in this embodiment can also be referred to as a "network side" or a "network part." This embodiment does not specifically limit its use in this regard.

[0103] In one possible implementation, the relevant functions of the terminal 120 or network device in this application embodiment can be implemented by one device, multiple devices working together, or one or more functional modules within a single device. This application embodiment does not specifically limit this. It is understood that the above functions can be network elements in hardware devices, software functions running on dedicated hardware, a combination of hardware and software, or virtualization functions instantiated on a platform (e.g., a cloud platform).

[0104] It should be noted that a RAN node can be a device or a component within a device in the aforementioned NG-RAN, such as an ng-eNB node, a gNB node, or a transmission point (TP), transmission and reception point (TRP) within an ng-eNB node and a gNB node, or a central unit (CU) integrated on the NG-RAN. A RAN node can also be a network element with transmission capabilities, such as a transmission measurement function (TMF) network element. In some embodiments, a RAN node can also be an access node in an O-RAN system. A RAN typically consists of a series of modules, such as antennas, RRUs, and BBUs. Traditional RAN architectures define the overall reception and output of a RAN node but do not restrict the transmission and communication between internal modules. O-RAN architectures define the architectural connections and standardized interfaces between various modules within the RAN, allowing the RAN to be decoupled into multiple standard modules, thereby enabling the combination and replacement of modules.

[0105] For example, Figure 5 illustrates a possible, non-limiting O-RAN system architecture. The Service Management and Orchestration Framework (SMO), as the network management device in the O-RAN, is used for the operation and management of devices within the O-RAN. The Non-Real-Time RAN Intelligent Controller (Non-RT RIC), located within the SMO module, implements non-real-time intelligent management of RAN functions, such as AI / ML workflows including model training and updates, and guides applications / functions within the Near-RT RIC based on policies. The Near-Real-Time RAN Intelligent Controller (Near-RT RIC) enables near-real-time intelligent management of the RAN. Through data collection and related operations on the E2 interface, it achieves near-real-time control and optimization of O-RAN modules and resources.

[0106] The O-RAN central unit (O-CU) comprises the O-RAN central unit control plane (O-CU-CP) and the O-RAN central unit user plane (O-CU-UP). The O-CU implements the radio resource control (RRC) layer, the packet data convergence protocol (PDCP) layer, the service data adaptation protocol (SDAP) layer, and other control functions. Specifically, the O-CU-CP implements the RRC layer functions and the PDCP control plane functions. The O-CU-UP implements the SDAP layer functions and the PDCP user plane functions.

[0107] The O-RAN distributed unit (O-DU) is used to implement the radio link control (RLC) layer, media access control (MAC) layer, and higher physical layer (Higher PHY). The higher physical layer functions include one or more of the following: forward error correction (FEC) encoding / decoding, scrambling / descrambling, or modulation / demodulation.

[0108] The O-RAN radio unit (O-RU) is used to implement lower physical layer (PHY) functions and radio frequency (RF) functions. These PHY functions include one or more of the following: Fast Fourier Transform (FFT) / Inverse Fast Fourier Transform (iFFT), digital beamforming, or extraction and filtering of the physical random access channel (PRACH). In other words, the O-RU possesses functions similar to TRP and RRH RF devices, as well as PHY processing capabilities. Furthermore, the O-RU, O-CU, and O-DU can also be used as a single unit, i.e., the O-eNB / gNB, to implement the aforementioned functions.
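The functional split described in the preceding paragraphs can be summarized as a lookup table. The dictionary and helper names are hypothetical, and the entries simply restate the text.

```python
# Protocol functions hosted by each O-RAN unit, as described in the text.
ORAN_FUNCTIONS = {
    "O-CU-CP": ["RRC", "PDCP control plane"],
    "O-CU-UP": ["SDAP", "PDCP user plane"],
    "O-DU": ["RLC", "MAC", "Higher PHY"],
    "O-RU": ["Lower PHY", "RF"],
}


def unit_for(function_name):
    """Return the O-RAN unit that hosts the given function, or None if unknown."""
    for unit, functions in ORAN_FUNCTIONS.items():
        if function_name in functions:
            return unit
    return None


mac_host = unit_for("MAC")   # 'O-DU'
rrc_host = unit_for("RRC")   # 'O-CU-CP'
```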

[0109] O-RAN cloud (O-Cloud) is a cloud computing platform that includes physical infrastructure nodes for hosting O-RAN functions such as RIC and O-DU. O-Cloud supports software components (such as operating systems, virtual machine monitoring, and container runtimes), management, and orchestration functions.

[0110] In one possible scenario, the O-RAN system also includes a sensing unit (SU). The SU is mainly used to implement sensing-related functions, such as sending sensing signals and / or receiving echo signals of sensing signals, performing corresponding signal processing based on the received echo signals to obtain sensing measurement data, and performing sensing-related processing, etc.

[0111] As one possible implementation, a RAN node may include at least one of a CU, DU, SU, and RU. A communication interface exists between the CU and the SU. A communication interface may or may not exist between the SU and the DU. If no communication interface exists between the SU and the DU, the SU and the DU can communicate through the CU.

[0112] In the O-RAN architecture, the module that receives the report of the difference between the twin channel and the measurement channel can be CU, RT RIC, Non-RT RIC, etc. DU is responsible for receiving signals, signal processing, multipath measurement, and channel difference calculation.

[0113] For example, an O-RAN system includes communication interfaces between newly added internal components and other communication interfaces. For instance, the A1 interface serves as the interface between Non-RT RICs and Near-RT RICs, used for intelligent and dynamic control of radio resources within the O-RAN. Non-RT RICs can provide policies, enriched information, and ML model updates to Near-RT RICs via the A1 interface, while Near-RT RICs can provide policy feedback to Non-RT RICs via the A1 interface.

[0114] The E2 interface is an open interface that connects two endpoints: the Near-RT RIC and the RAN node. The RAN node includes the CU and DU in 5G, the O-RAN compatible eNB in 4G, and the O-CU (O-CU-CP and / or O-CU-UP) and / or O-DU in O-RAN. The Near-RT RIC can obtain data collection and feedback from the RAN node through the E2 interface, and the RAN node can obtain control feedback from the Near-RT RIC through the E2 interface.

[0115] The O1 interface is the interface between the management entity in the SMO and the O-RAN module, used for operation management. This interface enables network management (fault management, configuration management, accounting management, performance management, and security management, together known as FCAPS management), software management, and file management. The O2 interface is the interface between the SMO and the infrastructure management framework that supports O-RAN virtual network functions.

[0116] The Open Fronthaul (FH) CUS-Plane interface includes a control plane (C-Plane), a user plane (U-Plane), and a synchronization plane (S-Plane). The control plane is used for real-time control between the O-DU and O-RU, such as transmitting beamforming weights from the O-DU to the O-RU or performing power control from the O-DU to the O-RU. The user plane is used to transmit communication data between the DU and RU for access network devices and terminals. The synchronization plane is used by the O-DU to provide clock synchronization to the O-RU. The Open FH M-Plane interface is the management plane interface, used for connection between the O-RU and O-DU, as well as the SMO, enabling management, monitoring, and configuration functions.

[0117] In addition, the NG interface is the interface between RAN nodes (e.g., base stations, CUs, CU-CPs, CU-UPs) and the core network; NG-u is the user plane NG interface; and NG-c is the control plane NG interface. The Xn interface is the interface between NR RAN nodes; Xn-u is the user plane Xn interface; and Xn-c is the control plane Xn interface. The X2 interface is the interface between LTE RAN nodes; X2-u is the user plane X2 interface; and X2-c is the control plane X2 interface. In NR systems, the X2 interface is mainly used in E-UTRA-NR dual connectivity (EN-DC) scenarios, where the primary base station is an LTE RAN node connected to the LTE core network via the X2 interface. The E1 interface is the interface between CU-CPs and CU-UPs; the F1-C interface is the interface between CU-CPs and DUs; and the F1-U interface is the interface between CU-UPs and DUs.
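The two-endpoint interfaces named in the preceding paragraphs can be collected into a lookup table for quick reference. The table and helper names are hypothetical, the entries restate the text, and the Open FH M-Plane interface is left out because the text gives it more than two endpoints.

```python
# Interface name -> the two endpoints it connects, as described in the text.
INTERFACES = {
    "A1": ("Non-RT RIC", "Near-RT RIC"),
    "E2": ("Near-RT RIC", "RAN node"),
    "O1": ("SMO", "O-RAN module"),
    "O2": ("SMO", "infrastructure management framework"),
    "Open FH CUS-Plane": ("O-DU", "O-RU"),
    "NG": ("RAN node", "core network"),
    "Xn": ("NR RAN node", "NR RAN node"),
    "X2": ("LTE RAN node", "LTE RAN node"),
    "E1": ("CU-CP", "CU-UP"),
    "F1-C": ("CU-CP", "DU"),
    "F1-U": ("CU-UP", "DU"),
}


def interfaces_between(a, b):
    """Return the sorted names of interfaces whose endpoints are exactly a and b."""
    return sorted(name for name, ends in INTERFACES.items() if set(ends) == {a, b})


cp_to_du = interfaces_between("DU", "CU-CP")              # ['F1-C']
ric_link = interfaces_between("Near-RT RIC", "Non-RT RIC")  # ['A1']
```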

[0118] In one possible implementation, Figure 6 is a schematic diagram illustrating the composition of a model training device 600 provided in an embodiment of this application. The network devices and terminals shown in Figures 2 to 4 can all adopt the composition structure shown in Figure 6, or include the components shown in Figure 6; alternatively, the components (e.g., chips) in the network devices and terminals shown in Figures 2 to 4 can all adopt the composition structure shown in Figure 6, or include the components shown in Figure 6. It is understood that the model training device 600 includes means of the necessary form, such as modules, units, elements, circuits, or interfaces, suitably configured together to perform this solution.

[0119] As shown in Figure 6, the model training device 600 includes one or more processors 601. The processors 601 are used to implement the processing and determination processes performed by the various devices in the following embodiments. The processor 601 can be a general-purpose processor or a dedicated processor, such as a baseband processor or a central processing unit. The baseband processor can be used to process communication protocols and communication data, while the central processing unit can be used to control the model training device (e.g., RAN node, terminal, or chip), execute software programs, and process data from the software programs.

[0120] Optionally, in one design, processor 601 may include program 603 (sometimes referred to as code or instructions) that can be run on processor 601 to cause model training apparatus 600 to perform the methods described in the following embodiments.

[0121] Optionally, the model training apparatus 600 may include one or more memories 602 storing a program 604 (sometimes referred to as code or instructions) that can be run on the processor 601 to cause the model training apparatus 600 to perform the methods described in the following method embodiments.

[0122] Optionally, processor 601 and / or memory 602 may include AI modules 607 and 608, which are used to implement AI-related functions. These AI modules can be implemented through software, hardware, or a combination of both. For example, an AI module may include a RAN intelligent controller (RIC) module. For instance, the AI module may be a near-real-time RIC or a non-real-time RIC.

[0123] Optionally, the processor 601 and / or memory 602 may also store data. The processor and memory may be configured separately or integrated together.

[0124] Optionally, the model training device 600 may further include a transceiver 605, which is used to implement the transmission and reception processes performed by the various devices in the following embodiments. The processor 601, sometimes referred to as a processing unit, controls the model training device (e.g., a RAN node or terminal). The transceiver 605 is sometimes referred to as a transceiver unit, a transceiver machine, or a transceiver circuit. The model training device 600 may also include an antenna 606.

[0125] It should be pointed out that the composition structure shown in Figure 6 does not constitute a limitation on the model training device. Besides the components shown in Figure 6, the model training device may include more or fewer components than illustrated, combine certain components, or have a different arrangement of components.

[0126] In this embodiment of the application, the chip system may be composed of chips or may include chips and other discrete devices.

[0127] Furthermore, the actions, terms, etc., involved in the various embodiments of this application can be referenced interchangeably without limitation. The message names or parameter names in the messages exchanged between the various devices in the embodiments of this application are merely examples, and other names may be used in specific implementations without limitation.

[0128] The model training method provided in the embodiments of this application is described below with reference to Figures 1 to 6.

[0129] It should be noted that in the following embodiments of this application, the message names between network elements, the names of each parameter, or the names of each piece of information are just examples. Other names may also be used in other embodiments. The model training method provided in this application does not specifically limit these names.

[0130] It is understood that in the embodiments of this application, each network element may execute some or all of the steps in the embodiments of this application. These steps or operations are merely examples, and the embodiments of this application may also execute other operations or variations thereof. Furthermore, the steps may be executed in an order different from that presented in the embodiments of this application, and it is not necessary to execute all the operations in the embodiments of this application.

[0131] It is understood that this application uses terminals and network devices as examples to illustrate the execution of the interaction, but this application does not limit the execution subject of the interaction. For example, the method executed by the terminal in this application can also be executed by a module applied to the terminal (e.g., a chip, chip system, or processor), or by a logical node, logical module, or software that can implement all or part of the terminal's functions; similarly, the method executed by the network device in this application can also be executed by a module applied to the network device (e.g., a chip, chip system, or processor), or by a logical node, logical module, or software that can implement all or part of the network device's functions. This application does not specifically limit these aspects.

[0132] The functions and actions performed by each device in the communication system provided in the embodiments of this application are described below. As shown in Figure 7, the model training method includes the following steps:

[0133] Step 701: The first device acquires the first type of data, the second type of data, and the third type of data.

[0134] The data is categorized into three types: a first type of data representing first channel information, a second type of data representing output data obtained by inputting the first type of data into a first model, and a third type of data representing second channel information. The second channel information includes at least one of the following: all or part of the channel information in the first channel information, or third channel information, wherein the third channel information differs from the first channel information. The difference between the third channel information and the first channel information can be understood as the third channel information and the first channel information being channel information from different frequency bands, different ports, or different times; this application does not impose any limitations on this.

[0135] In some embodiments, the first device can directly acquire the first type of data; alternatively, the first device can also acquire the first type of data from the third type of data based on a first mapping relationship. The first mapping relationship is used to characterize the first channel information within the third channel information. As an example, the first mapping relationship is a mask, used to indicate which parts of the third channel information correspond to the first channel information.

[0136] Optionally, the first type of data includes first channel information or processed first channel information; for example, the first type of data includes the result of performing a Fourier transform on the first channel information. The third type of data includes second channel information or processed second channel information, such as information after performing a Fourier transform on the second channel information.
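The optional preprocessing above can be sketched as follows. This is only an illustration, assuming the first channel information is a short list of complex channel coefficients and the processing is a plain discrete Fourier transform; the function name `dft` and the sample values are hypothetical and do not come from this application.

```python
import cmath

def dft(samples):
    """Naive discrete Fourier transform of a list of complex samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]

# Hypothetical first channel information: four complex channel coefficients.
first_channel_info = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]

# The first type data may carry the raw information or, as here, its transform.
first_type_data = dft(first_channel_info)
```

For this particular input, all of the energy falls into a single transform bin, which is the kind of compact representation such preprocessing may aim for.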

[0137] As one implementation, the first model and the third model are used to determine the second channel information based on the first channel information. For example, the first model is used to determine the output of the first model based on the first channel information, and the third model is used to recover the second channel information based on the output of the first model. Specifically, the output of the third model is the second channel information; or, since there is some loss when the third model recovers the channel information, the recovered channel information is inconsistent with the actual channel information. Therefore, the output of the third model can be an approximation of the second channel information.

[0138] For example, the first model is an encoder used to compress (or encode) the first channel information to obtain compressed data (or encoded data); the third model is a decoder used to recover the second channel information based on the compressed data (or encoded data). The third model can not only decompress the compressed first channel information to recover it, but also predict channel information other than the first channel information based on the compressed first channel information. For example, there are four channels between the terminal and the network device: channel #1, channel #2, channel #3, and channel #4. The first channel information is the channel information of channel #1 and channel #2. After the first model compresses the channel information of channel #1 and channel #2, the compressed data is input into the third model; the third model recovers the channel information of channel #1 and channel #2 from the compressed data. Furthermore, the third model can predict the channel information of channel #3 and / or channel #4 based on the compressed data. It should be noted that whether the third model recovers the compressed data, predicts data other than the compressed data, or does both can be selected according to actual needs, and this application does not limit this. The compression in the embodiments of this application can also be understood as encoding, and this application does not limit this.
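The four-channel example above can be illustrated with a deliberately simple toy. It assumes, purely for illustration, that the four channel coefficients are fixed multiples of one latent value `s`, so that one number is enough to describe all of them; real encoders and decoders would be trained neural networks. The functions `encode` and `decode` and all values are hypothetical.

```python
def encode(partial_info):
    """Hypothetical first model: compress channel #1 and #2 into one number."""
    h1, h2 = partial_info
    return (h1 + h2) / 3.0  # equals s when h1 = s and h2 = 2 * s

def decode(code):
    """Hypothetical third model: recover channels #1-#2 and predict #3-#4."""
    return [code * k for k in (1, 2, 3, 4)]

s = 0.8
all_channels = [s * k for k in (1, 2, 3, 4)]  # ground truth for channels #1..#4
first_channel_info = all_channels[:2]          # only #1 and #2 are reported

compressed = encode(first_channel_info)        # second type data in this toy
second_channel_info = decode(compressed)       # recovered plus predicted channels
```

In this toy the decoder output matches all four channels up to rounding, even though only two were compressed and reported.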

[0139] Optionally, the first model and the third model can be trained models, for example, models obtained after training based on first-type data and third-type data. As an example, the first model is a proxy encoder of the network device, and the third model is a decoder. The network device or model computing device jointly trains the proxy encoder and decoder of the network device based on first-type data (partial channel information) and third-type data (full channel information), and deploys the third model to the network side. Then, based on the first model and first-type data, second-type data can be determined. Afterward, the terminal can obtain the output data of the proxy encoder of the network device, as well as the first-type data and third-type data, to train the encoder in the terminal. This enables the encoder in the terminal to implement the function of the proxy encoder of the network device, thereby matching the encoder in the terminal with the decoder of the network device. It should be noted that the proxy encoder of the network device is used to assist in training the decoder of the network device and may not be used in the actual encoding process. Similarly, the proxy decoder of the terminal is only used to assist in training the encoder of the terminal and may not be used in the actual decoding process. Optionally, the model computing device in this embodiment can be the AI entity shown in Figure 3 above; this application does not limit this.

[0140] For example, the first model is an encoder, and the third model is a proxy decoder. The terminal or model computing device jointly trains the encoder and proxy decoder based on first-type data (partial channel information) and third-type data (full channel information), and deploys the first model to the terminal. Based on the first model and the first-type data, second-type data can be determined. Subsequently, the network device acquires the first, second, and third-type data to train its decoder, enabling the network device's decoder to match the terminal's encoder. Optionally, the network device can acquire the first, second, or third-type data from the terminal or model computing device, or the first, second, or third-type data can be manually input; this application does not limit this.

[0141] It should be noted that the first type of data in the embodiments of this application can also be understood as the input data of the encoder, the second type of data can also be understood as the output data of the encoder or the input data of the decoder, and the third type of data can also be understood as the output data of the decoder (or the corresponding label output by the decoder). When two models with the same function in this application (such as the encoder of the terminal and the proxy encoder of the network device, or the proxy decoder of the terminal and the decoder of the network device) belong to different devices, the input data and output data of the two models with the same function can be the same or different. For example, the input data and output data of the proxy encoder and decoder of the network device can be the same or different from the input data and output data of the encoder and proxy decoder of the terminal. In other words, when training the encoders and decoders in different devices, the first type of data, the second type of data, and / or the third type of data used can be the same or different. During the training process, the first type of data input by the encoders of different devices can be the same or different; the second type of data output by the encoders of different devices can be the same or different; the second type of data input by the decoders of different devices can be the same or different; and the third type of data output by the decoders of different devices can be the same or different.

[0142] Step 702: The first device trains the second model based on the first type of data, the second type of data, and the third type of data.

[0143] In some embodiments, the first device uses first type data, second type data, and third type data as training data to train the second model. Optionally, the first device can train the second model independently based on the aforementioned training data, or it can jointly train the second model and the fourth model based on the aforementioned training data. The fourth model can be a virtual model used in conjunction with the second model to determine the second channel data based on the first channel data. Taking the first device as a terminal as an example, the second model can be the terminal's encoder, and the fourth model can be a virtual proxy decoder for the terminal; taking the first device as a network device as an example, the second model can be the network device's decoder, and the fourth model can be a virtual proxy encoder for the network device.

[0144] As one implementation, when the first device is the terminal and the second model is the encoder of the terminal, the process of the first device training the second model independently based on the aforementioned training data includes: the first device using first type data as the input data of the second model and second type data as the target output data of the second model to train the second model, thereby obtaining the trained second model. Optionally, the first device can also verify the trained second and fourth models of the terminal based on the first type data and the third type data to improve the accuracy of the training results.

[0145] When the first device is a network device and the second model is the decoder of the network device, the process of the first device training the second model independently based on the aforementioned training data includes: the first device training the second model using second-type data as input data and third-type data as output target data to obtain the trained second model. Optionally, the first device can also verify the trained second and fourth models of the network device based on the first-type data and the third-type data to improve the accuracy of the training results.

[0146] When the first device is the terminal and the second model is the encoder of the terminal, the process of the first device jointly training the second and fourth models based on the aforementioned training data includes: First, the first device trains the fourth model using the second type of data as input data and the third type of data as the target output data of the fourth model, obtaining a trained fourth model. Then, the first device jointly trains the second and fourth models using the first type of data as input data for the second model, the output data of the second model as input data for the fourth model, and the third type of data as the target output data of the fourth model, obtaining a trained second model.
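The two-stage procedure for the terminal can be sketched with scalar stand-ins for the models. This is a minimal illustration under stated assumptions, not the training procedure of this application: the second model is `z = w * x`, the fourth model is `y = d * z`, the loss is mean squared error, and the optimizer is plain gradient descent. The data, learning rates, and tolerances are hypothetical.

```python
def mse(outputs, targets):
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)

def train_proxy_decoder(second_type, third_type, lr=0.5, tol=1e-10, max_iters=10_000):
    """Stage 1: fit the fourth model y = d * z from second type to third type data."""
    d = 0.0
    for _ in range(max_iters):
        outputs = [d * z for z in second_type]
        if mse(outputs, third_type) < tol:
            break
        grad = sum(2 * (o - t) * z
                   for o, t, z in zip(outputs, third_type, second_type)) / len(second_type)
        d -= lr * grad
    return d

def train_jointly(first_type, third_type, d, w=0.1, lr=0.01, tol=1e-10, max_iters=10_000):
    """Stage 2: update encoder weight w and decoder weight d together."""
    n = len(first_type)
    for _ in range(max_iters):
        codes = [w * x for x in first_type]   # output of the second model
        outputs = [d * z for z in codes]      # output of the fourth model
        if mse(outputs, third_type) < tol:
            break
        grad_w = sum(2 * (o - t) * d * x
                     for o, t, x in zip(outputs, third_type, first_type)) / n
        grad_d = sum(2 * (o - t) * z
                     for o, t, z in zip(outputs, third_type, codes)) / n
        w -= lr * grad_w
        d -= lr * grad_d
    return w, d

first_type = [1.0, 2.0, -1.0, 0.5]
second_type = [0.5 * x for x in first_type]  # hypothetical first model: z = 0.5 * x
third_type = [2.0 * x for x in first_type]   # hypothetical labels: y = 2 * x

d_stage1 = train_proxy_decoder(second_type, third_type)
w, d = train_jointly(first_type, third_type, d_stage1)
```

In this toy, stage 1 drives `d` toward 4, and stage 2 drives the product `w * d` toward 2. The end-to-end loss alone does not pin `w` to the first model's weight of 0.5, which is one reason the second type data is useful as an additional training signal.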

[0147] When the first device is a network device and the second model is the decoder of the network device, the process of the first device jointly training the second and fourth models based on the aforementioned training data includes: First, the first device trains the fourth model using first-type data as input data and second-type data as the target output data of the fourth model, obtaining a trained fourth model. Then, the first device jointly trains the second and fourth models using first-type data as input data for the fourth model, the output data of the fourth model as input data for the second model, and third-type data as the target output data of the second model, obtaining a trained second model.

[0148] It should be noted that the first device in the embodiments of this application can be a terminal, network device, AI entity, real-time RIC, or non-real-time RIC, etc. This application does not limit it. The second model in the embodiments of this application can be a model of a terminal, used to encode the first channel information; or the second model can also be a model of a network device, used to recover the second channel information based on the encoded first channel information. The channel information in the embodiments of this application can also be understood as channel measurement information, channel measurement data, channel measurement results, or information after preprocessing the channel matrix (e.g., singular value decomposition (SVD), etc.). For example, the channel information can be channel state information obtained by measuring CSI-RS, or channel information obtained through other means, which is not limited in this application. The channel information in the embodiments of this application can be channel information obtained by actual measurement, or channel information determined by simulation, etc., which is not limited in this application.

[0149] In some embodiments, the second model has the same function as the first model; or, the second model is used to determine second type data, wherein the output of the second model is second type data; or, since the second model incurs some loss when determining the second type data, resulting in the recovered second type data being inconsistent with the actual second type data, the output of the second model may be data that is approximately the second type data. Optionally, when the second model is a terminal model, the second model has the same function as the first model; for example, the second model is an encoder for the terminal, and the first model is a proxy encoder for the network device. When the second model is a network device model, the second model is used to determine the second type data; for example, the second model is a decoder for the network device, and the first model is an encoder for the terminal.

[0150] In this embodiment, when the first device trains the second model, the training data includes not only first-type data and third-type data, but also second-type data, that is, the output data obtained by inputting the first-type data into the first model. The second model trained based on this data will not only be able to correctly process the first-type data and second-type data, but also achieve the functions of the first model or match it. Specifically, when the terminal uses the second model, the terminal's second model can achieve the functions of the first model. Alternatively, when the network device uses the second model and the terminal uses the first model, the terminal's first model and the network device's second model can be correctly matched. This allows the network device's model to correctly recover the channel information based on the compressed information reported by the terminal, even when the terminal reports only partial channel information. The first device can be either a terminal or a network device; this application does not limit this.

[0151] In this embodiment, the second channel information includes at least one of the following: all or part of the channel information in the first channel information, or a third channel information, wherein the third channel information is different from the first channel information. In other words, in this embodiment, the terminal's encoder can compress a portion of the channel information and send the compressed portion of the channel information to the network device. The network device's decoder can decode the compressed portion of the channel information to obtain all or part of the channel information in the compressed portion of the channel information, and / or predict channel information other than the aforementioned portion of the channel information based on the compressed portion of the channel information.

[0152] When the second channel information includes the first channel information, the first device can directly acquire the first type of data; alternatively, the first device can also acquire the first type of data based on the third type of data. The process by which the first device acquires first type data based on third type data is described in detail below with reference to Figure 8. This process can be implemented through the following steps 801 and 802:

[0153] Step 801: The first device acquires the third type of data and the first mapping relationship.

[0154] Optionally, the first mapping relationship is used to characterize the first type of data in the third type of data. As an example, the first mapping relationship can be represented by a Mask, where the Mask is a dataset, and each data point in the dataset corresponds to a channel information point in the third type of data. The values in the dataset include 0 and 1. A value of 1 in the Mask indicates that the channel information corresponding to that data point in the third type of data is channel information in the first channel information. A value of 0 in the Mask indicates that the channel information corresponding to that data point in the third type of data is not channel information in the first channel information. Take as an example a terminal and network device with 4 channels between them, namely channel #1, channel #2, channel #3, and channel #4, where the third type of data includes the channel information of these 4 channels and the Mask is [1, 1, 0, 0]. This Mask indicates that the first type of data includes the channel information of the first two of channel #1 to channel #4, that is, the channel information of channel #1 and channel #2.

[0155] In some embodiments, the first device may obtain the first mapping relationship from other devices, or the first device may determine the first mapping relationship independently; this application does not limit this. In other words, the first device obtaining the first mapping relationship includes: receiving the first mapping relationship, or: determining the first mapping relationship.

[0156] Step 802: The first device obtains the first type of data from the third type of data based on the first mapping relationship.

[0157] In some embodiments, the first device determines the first channel information in the third channel information based on the first mapping relationship, and obtains the first channel information from the third channel information as the first type of data.

[0158] For example, referring to the example in step 801 above, the first device determines the first mapping relationship as Mask: [1, 1, 0, 0], and the third type of data includes the channel information of channel #1, channel #2, channel #3, and channel #4. Based on this, the first device obtains the first two channel information from the third type of data, that is, the channel information of channel #1 and channel #2, as the first type of data.
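Steps 801 and 802 with the Mask [1, 1, 0, 0] can be sketched as follows. The helper name `apply_mask` is hypothetical, and the strings `"h1"` to `"h4"` are placeholders standing in for the channel information of channel #1 to channel #4.

```python
def apply_mask(third_type_data, mask):
    """Keep the entries of third_type_data whose mask value is 1."""
    if len(third_type_data) != len(mask):
        raise ValueError("mask and data must have the same length")
    return [item for item, keep in zip(third_type_data, mask) if keep == 1]

# Placeholders for the channel information of channel #1..#4.
third_type_data = ["h1", "h2", "h3", "h4"]
first_mapping = [1, 1, 0, 0]  # the Mask from the example above

first_type_data = apply_mask(third_type_data, first_mapping)
# first_type_data is ["h1", "h2"], i.e. channel #1 and channel #2
```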

[0159] Based on this, the first device can obtain the first type of data through the Mask without having to obtain the complete first type of data from other devices. Since the amount of data in the Mask is usually much smaller than the amount of data in the first type of data, the first device can greatly reduce the transmission resources required to transmit the first type of data by obtaining the first type of data through the Mask.

[0160] In some embodiments, the first device may also obtain fourth type data from the third type of data based on a second mapping relationship, and train the second model based on the fourth type of data. As shown in Figure 8, this process can be specifically implemented through the following steps 803 to 805:

[0161] Step 803: The first device determines the second mapping relationship.

[0162] In some embodiments, the first device may directly determine the second mapping relationship; or, the first device may determine the second mapping relationship by adjusting the first mapping relationship.

[0163] For example, referring to the example in step 801 above, the first mapping relationship is Mask: [1, 1, 0, 0]. The first device adjusts Mask to [1, 0, 1, 0], and the first device determines that the adjusted Mask: [1, 0, 1, 0] is the second mapping relationship.

[0164] Step 804: The first device obtains the fourth type of data from the third type of data based on the second mapping relationship.

[0165] The fourth type of data is used to represent the fourth channel information, which is a portion of the channel information in the second channel information, and is not entirely the same as the first channel information. In other words, the first device, based on the second mapping relationship, obtains from the second channel information the channel information that is not entirely the same as the first channel information, as the fourth channel information.

[0166] As an example, referring to the example in step 803 above, the first device determines the second mapping relationship as Mask: [1, 0, 1, 0]. The second mapping relationship indicates that the first and third channel information in the third type of data are the fourth type of data. Based on the second mapping relationship, the first device obtains the channel information of channel #1 and channel #3 from the third type of data as the fourth type of data.

[0167] Optionally, the fourth type of data includes fourth channel information or processed fourth channel information; for example, the fourth type of data includes the result of performing a Fourier transform on the fourth channel information.

[0168] Step 805: The first device trains the second model based on the fourth type of data and the third type of data.

[0169] In some embodiments, the first device may adjust the first mapping relationship multiple times to determine a plurality of second mapping relationships. Based on these plurality of second mapping relationships, the first device determines a plurality of fourth type data from the third type of data, and then trains the second model based on the plurality of fourth type data and the third type of data. This increases the diversity of training data for the second model, thereby improving the robustness of model training.
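One possible way to derive several second mapping relationships from the first is sketched below. It assumes, as an illustration only, that each adjusted Mask keeps the same number of ones as the first Mask and differs from it in at least one position; this application does not prescribe how the Mask is adjusted. The helper names and placeholder data are hypothetical.

```python
from itertools import combinations

def candidate_second_masks(first_mask):
    """All masks with as many ones as first_mask, excluding first_mask itself."""
    n, ones = len(first_mask), sum(first_mask)
    masks = []
    for positions in combinations(range(n), ones):
        mask = [1 if i in positions else 0 for i in range(n)]
        if mask != list(first_mask):
            masks.append(mask)
    return masks

def apply_mask(data, mask):
    """Keep the entries of data whose mask value is 1."""
    return [item for item, keep in zip(data, mask) if keep == 1]

third_type_data = ["h1", "h2", "h3", "h4"]  # placeholders for channel #1..#4
first_mask = [1, 1, 0, 0]

second_masks = candidate_second_masks(first_mask)
fourth_type_sets = [apply_mask(third_type_data, m) for m in second_masks]
```

With four channels and two selected, this yields five candidate Masks, the first of which is [1, 0, 1, 0] and selects channel #1 and channel #3, as in the example of step 804.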

[0170] It is understandable that there is no corresponding second type of data for the fourth type of data (that is, the first device did not input the fourth type of data into the first model to obtain the corresponding output data). Therefore, after training the second model in step 702 above, the first device can perform augmentation training on the second model based on the fourth type of data and the third type of data, thereby further improving the model performance of the second model.

[0171] Based on this, the first device can also obtain fourth channel information based on the second mapping relationship, and further train the second model based on the fourth type of data and the third type of data, increasing the diversity of training data for the second model and thus improving the robustness of model training. Alternatively, it can perform augmented training on the second model based on the fourth type of data and the third type of data, thereby further improving the model performance of the second model.

[0172] When training the second model in step 702 above, the first device can train the second model alone based on the training data (referred to as scenario 1), or it can jointly train the second model and the fourth model based on the training data (referred to as scenario 2). The following provides a detailed explanation of each:

[0173] Scenario 1: The first device trains the second model independently based on the training data.

[0174] In Scenario 1, when the first device trains the second model independently based on the training data, it first determines the input and output data of the second model, then selects the training data corresponding to the input and output data of the second model, and trains the second model based on the training data. After training is complete, the first device can also verify the second and fourth models based on the training data to determine the performance of the second model and improve the accuracy of the training results.
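The verification step mentioned above can be sketched as an end-to-end check of the encoder and decoder chain. The metric is assumed here to be mean squared error, which this application does not specify; the function `validate_chain` and the toy models are hypothetical.

```python
def validate_chain(encoder, decoder, first_type, third_type):
    """Mean squared error of decoder(encoder(x)) against the third type data."""
    total, count = 0.0, 0
    for x, label in zip(first_type, third_type):
        recovered = decoder(encoder(x))
        total += sum((r - t) ** 2 for r, t in zip(recovered, label))
        count += len(label)
    return total / count

# Toy stand-ins for a trained encoder and decoder (hypothetical).
toy_encoder = lambda x: [(x[0] + x[1]) / 3.0]
toy_decoder = lambda z: [z[0] * k for k in (1, 2, 3, 4)]
mismatched_decoder = lambda z: [z[0] * k for k in (1, 1, 1, 1)]

first_type = [[0.8, 1.6], [1.0, 2.0]]
third_type = [[0.8, 1.6, 2.4, 3.2], [1.0, 2.0, 3.0, 4.0]]

matched_error = validate_chain(toy_encoder, toy_decoder, first_type, third_type)
mismatched_error = validate_chain(toy_encoder, mismatched_decoder, first_type, third_type)
```

A small error for the matched pair and a clearly larger error for the mismatched pair is the kind of signal such a verification step would look for.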

[0175] In the case where the second model is a terminal model (i.e., the terminal's encoder) (referred to as Case 1), the input data of the second model is first type data, and the output target data is second type data. In this case, the training data selected for the second model includes first type data and second type data. In the case where the second model is a network device model (i.e., the network device's decoder) (referred to as Case 2), the input data of the second model is second type data, and the output target data is third type data. In this case, the training data selected for the second model includes second type data and third type data. The training processes for the second model differ in Case 1 and Case 2, and will be explained separately below.
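The selection of training data in Case 1 and Case 2 can be sketched as a small helper that returns (input, target) pairs. The `Sample` structure, the role strings, and the sample values are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Sample:
    first_type: List[float]   # characterizes the first channel information
    second_type: List[float]  # output of the first model for first_type
    third_type: List[float]   # characterizes the second channel information

def select_pairs(samples: List[Sample], role: str) -> List[Tuple[List[float], List[float]]]:
    """Return (input, target) pairs for the second model, depending on its role."""
    if role == "terminal_encoder":   # Case 1: first type in, second type as target
        return [(s.first_type, s.second_type) for s in samples]
    if role == "network_decoder":    # Case 2: second type in, third type as target
        return [(s.second_type, s.third_type) for s in samples]
    raise ValueError(f"unknown role: {role}")

samples = [Sample(first_type=[0.8, 1.6], second_type=[0.8], third_type=[0.8, 1.6, 2.4, 3.2])]
encoder_pairs = select_pairs(samples, "terminal_encoder")
decoder_pairs = select_pairs(samples, "network_decoder")
```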

[0176] Case 1: The second model is the encoder of the terminal.

[0177] In Case 1, the process of the first device training the second model independently based on the aforementioned training data is shown in Figure 9. The process includes: the first device inputs first type data into the second model to obtain the output data of the second model. The first device compares the output data of the second model with the second type data to determine a third loss function. If the third loss function satisfies the third convergence condition, the first device ends training to obtain a trained second model. If the third loss function does not satisfy the third convergence condition, the first device continues iterative training of the second model based on the third loss function, the first type data, and the second type data until the third loss function satisfies the third convergence condition, thus obtaining a trained second model. After this, the first device can also jointly validate the second model and the fourth model based on the first type data and the third type data to improve the accuracy of the training results.
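The loop described for Case 1 can be sketched with a one-weight stand-in for the encoder. This is a minimal illustration, assuming a mean squared error as the third loss function, a fixed threshold as the third convergence condition, and plain gradient descent as the update rule; none of these choices is specified by this application, and the data are hypothetical.

```python
def train_encoder(first_type, second_type, lr=0.1, tol=1e-8, max_iters=10_000):
    """Fit a one-weight 'encoder' y = w * x so its output matches second type data."""
    w = 0.0
    loss = float("inf")
    n = len(first_type)
    for _ in range(max_iters):
        outputs = [w * x for x in first_type]
        # Stand-in for the third loss function: mean squared error.
        loss = sum((o - t) ** 2 for o, t in zip(outputs, second_type)) / n
        if loss < tol:  # stand-in for the third convergence condition
            break
        grad = sum(2 * (o - t) * x
                   for o, t, x in zip(outputs, second_type, first_type)) / n
        w -= lr * grad
    return w, loss

first_type = [1.0, 2.0, -1.0, 0.5]
second_type = [0.5 * x for x in first_type]  # hypothetical first model: y = 0.5 * x

w, loss = train_encoder(first_type, second_type)
```

Because the convergence check runs before each update, training stops as soon as the loss is small enough, and the learned weight approaches the first model's weight of 0.5.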

[0178] Case 2: The second model is the decoder for a network device.

[0179] In Case 2, the process of the first device training the second model independently based on the aforementioned training data is shown in Figure 10. The process includes: the first device inputs second type data into the second model to obtain the output data of the second model. The first device compares the output data of the second model with the third type data to determine a fourth loss function. If the fourth loss function satisfies the fourth convergence condition, the first device ends training to obtain a trained second model. If the fourth loss function does not satisfy the fourth convergence condition, the first device continues iterative training of the second model based on the fourth loss function, the second type data, and the third type data until the fourth loss function satisfies the fourth convergence condition, resulting in a trained second model. Afterward, the first device can also jointly validate the second and fourth models based on the first type data and the third type data to improve the accuracy of the training results.
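The loop described for Case 2 can be sketched with a small linear stand-in for the decoder that maps one compressed value to four channel values. The mean squared error loss, the threshold, the gradient descent update, and the data are all assumptions made for illustration and are not specified by this application.

```python
def train_decoder(second_type, third_type, lr=0.05, tol=1e-10, max_iters=10_000):
    """Fit a linear 'decoder' out[i] = v[i] * z against third type labels."""
    n_out = len(third_type[0])
    n = len(second_type)
    v = [0.0] * n_out
    loss = float("inf")
    for _ in range(max_iters):
        outputs = [[vi * z for vi in v] for z in second_type]
        errors = [[o - t for o, t in zip(out, label)]
                  for out, label in zip(outputs, third_type)]
        # Stand-in for the fourth loss function: mean squared error over all outputs.
        loss = sum(e * e for row in errors for e in row) / (n * n_out)
        if loss < tol:  # stand-in for the fourth convergence condition
            break
        for i in range(n_out):
            # Gradient of the squared error of output i, averaged over samples.
            grad_i = sum(2 * row[i] * z for row, z in zip(errors, second_type)) / n
            v[i] -= lr * grad_i
    return v, loss

second_type = [1.0, -2.0, 0.5]  # hypothetical compressed values
third_type = [[z, 2 * z, 3 * z, 4 * z] for z in second_type]  # hypothetical labels

v, loss = train_decoder(second_type, third_type)
```

In this toy the decoder learns weights close to 1, 2, 3, and 4, so a single compressed value is expanded into the channel information of all four channels.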

[0180] Scenario 2: The first device jointly trains the second and fourth models based on the training data.

[0181] In Scenario 2, with reference to Figure 7 and as shown in Figure 11, step 702 above can be specifically implemented through the following step 1101:

[0182] Step 1101: The first device trains the second model and the fourth model based on the first type of data, the second type of data and the third type of data.

[0183] The fourth model has the same function as the third model, or the fourth model has the same function as the first model. Optionally, the third model is used in conjunction with the first model to determine a third type of data. For example, when the first model is an encoder for a terminal, the third model is a proxy decoder for the terminal; or, when the first model is a proxy encoder for a network device, the third model is a decoder for the network device.

[0184] As an example, when the fourth model is a proxy decoder for a terminal, the fourth model functions the same as the third model. When the fourth model is a proxy encoder for a network device, the fourth model functions the same as the first model.

[0185] With reference to Figure 11 and as shown in Figure 12, the process of training the second and fourth models described above can be specifically implemented through the following steps 1201 and 1202:

[0186] Step 1201: The first device trains the fourth model based on the first data.

[0187] The first data includes first type data and second type data, or the first data includes second type data and third type data.

[0188] In some embodiments, the first device may determine a first loss function based on the first data, and the first loss function is used to train the fourth model. It should be noted that the second model can be an encoder for the terminal, and correspondingly, the fourth model can be a proxy decoder for the terminal (denoted as Case 3); or, the second model can be a decoder for a network device, and correspondingly, the fourth model can be a proxy encoder for the network device (denoted as Case 4). In Case 3 and Case 4 above, the process of the first device training the fourth model based on the first data is different, and will be explained separately below:

[0189] Case 3: The second model is the encoder of the terminal, and the fourth model is the proxy decoder of the terminal.

[0190] In Case 3, the first data includes the second type data and the third type data. The process of the first device training the fourth model includes: the first device inputs the second type data into the fourth model to obtain the first output data of the fourth model; the first device calculates the first loss function of the fourth model based on the first output data and the third type data; and the first device iteratively trains the fourth model based on the first loss function, the second type data, and the third type data until the convergence condition of the fourth model is met, thereby obtaining the trained fourth model. It should be noted that if the first loss function already meets the convergence condition of the fourth model in the first round of training, the first device uses the fourth model obtained in that round as the trained fourth model.

[0191] Case 4: The second model is the decoder of the network device, and the fourth model is the proxy encoder of the network device.

[0192] In Case 4, the first data includes the first type data and the second type data. The process of the first device training the fourth model includes: the first device inputs the first type data into the fourth model to obtain the output data of the fourth model; the first device calculates the first loss function of the fourth model based on the output data and the second type data; and the first device iteratively trains the fourth model based on the first loss function, the first type data, and the second type data until the convergence condition of the fourth model is met, thereby obtaining the trained fourth model. It should be noted that if the first loss function already meets the convergence condition of the fourth model in the first round of training, the first device uses the fourth model obtained in that round as the trained fourth model.
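The iterative training of the fourth model in Case 3 and Case 4 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not specified by this application: the fourth model is reduced to a single weight, the first loss function is taken to be a mean squared error, and the names `train_fourth_model`, `lr`, `tol` and `max_iters` are hypothetical.

```python
from typing import Sequence, Tuple


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error, used here as a stand-in for the first loss function."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def train_fourth_model(
    inputs: Sequence[float],
    targets: Sequence[float],
    lr: float = 0.05,
    tol: float = 1e-6,
    max_iters: int = 10_000,
) -> Tuple[float, int]:
    """Fit a one-weight proxy model y = w * x by gradient descent.

    Case 3: inputs = second type data, targets = third type data (proxy decoder).
    Case 4: inputs = first type data, targets = second type data (proxy encoder).
    Returns the trained weight and the number of training rounds used.
    """
    w = 0.0
    for round_idx in range(1, max_iters + 1):
        # One training round: a single gradient step on the loss.
        grad = sum(2 * (w * x - t) * x for x, t in zip(inputs, targets)) / len(inputs)
        w -= lr * grad
        loss = mse([w * x for x in inputs], targets)
        if loss < tol:
            # Convergence condition met; this may already happen in round 1.
            return w, round_idx
    return w, max_iters


# Case 3 style call: toy second type data in, toy third type data as target.
weight, rounds = train_fourth_model([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The early return mirrors the note that, when the loss already satisfies the convergence condition in the first round, the model from that round is used as the trained fourth model.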

[0193] Step 1202: The first device trains the second model and the trained fourth model based on the second data.

[0194] The second data includes the first type of data and the third type of data, or the second data includes the first type of data, the second type of data and the third type of data.

[0195] In some embodiments, the first device may determine a second loss function based on the second data, and the second loss function is used to train the second model. It should be noted that, combining cases 3 and 4 above, the process by which the first device trains the second model and the trained fourth model based on the second data differs in different cases, as explained below:

[0196] Based on Case 3, the second model is the encoder of the terminal, and the fourth model is the proxy decoder of the terminal.

[0197] In Case 3, the process by which the first device trains the second model and the trained fourth model includes: the first device inputs the first type data into the second model to obtain second output data, and then inputs the second output data into the trained fourth model to obtain third output data. The first device calculates a second loss function based on the third output data and the third type data; alternatively, the first device calculates a first sub-loss function based on the second output data and the second type data, calculates a second sub-loss function based on the third output data and the third type data, and takes a weighted sum of the first and second sub-loss functions as the second loss function. The first device then iteratively trains the second model and the trained fourth model based on the second loss function, the first type data, and the third type data until the convergence condition for training the second and fourth models is met, thereby obtaining the trained second model in the first device. If the second loss function already meets the convergence condition for the second and fourth models in the first round of training, the first device uses the second model obtained in that round as the trained second model. It should be noted that during the above training process, the first device can also use the second type data as a constraint on the output data of the second model to improve training efficiency and the performance of the trained second model.
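As a rough illustration of the joint training in Case 3 with the weighted-sum form of the second loss function, the following sketch reduces the encoder and the proxy decoder to one weight each. The scalar models, the squared-error sub-losses, the weights `alpha` and `beta`, and the function name `joint_train_case3` are all assumptions for illustration; this application does not prescribe them.

```python
from typing import Sequence, Tuple


def joint_train_case3(
    first_type: Sequence[float],
    second_type: Sequence[float],
    third_type: Sequence[float],
    b_init: float,
    alpha: float = 0.5,
    beta: float = 0.5,
    lr: float = 0.02,
    tol: float = 1e-8,
    max_iters: int = 20_000,
) -> Tuple[float, float]:
    """Jointly train encoder weight a (second model) and decoder weight b (fourth model).

    b_init is the weight of the already trained proxy decoder from step 1201.
    """
    a, b = 0.0, b_init
    n = len(first_type)
    for _ in range(max_iters):
        grad_a = grad_b = loss = 0.0
        for x, d2, d3 in zip(first_type, second_type, third_type):
            z = a * x  # second output data (encoder output)
            y = b * z  # third output data (proxy decoder output)
            e2, e3 = z - d2, y - d3
            # Weighted sum of the first and second sub-loss functions.
            loss += (alpha * e2 ** 2 + beta * e3 ** 2) / n
            grad_a += (2 * alpha * e2 * x + 2 * beta * e3 * b * x) / n
            grad_b += (2 * beta * e3 * z) / n
        if loss < tol:  # convergence condition for both models
            break
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Toy data: the second type data is half of the first type data,
# and the third type data equals the first type data.
enc_w, dec_w = joint_train_case3(
    [1.0, 2.0, 3.0], [0.5, 1.0, 1.5], [1.0, 2.0, 3.0], b_init=2.0
)
```

The `alpha` term plays the role of the constraint that the second type data places on the encoder output. The form of the second loss function that uses only the third output data corresponds to `alpha = 0.0` and `beta = 1.0`.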

[0198] Based on Case 4, the second model is the decoder of the network device, and the fourth model is the proxy encoder of the network device.

[0199] In Case 4, the process by which the first device trains the second model and the trained fourth model includes: the first device inputs the first type data into the trained fourth model to obtain the output data of the trained fourth model, and then inputs the output data of the fourth model into the second model to obtain the output data of the second model. The first device calculates a second loss function based on the output data of the second model and the third type data; alternatively, the first device calculates a third sub-loss function based on the output data of the fourth model and the second type data, calculates a fourth sub-loss function based on the output data of the second model and the third type data, and takes a weighted sum of the third and fourth sub-loss functions as the second loss function. The first device then iteratively trains the second model and the trained fourth model based on the second loss function, the first type data, and the third type data until the convergence condition for training the second and fourth models is met, thereby obtaining the trained second model in the first device. If the second loss function already meets the convergence condition for the second and fourth models in the first round of training, the first device uses the second model obtained in that round as the trained second model. It should be noted that during the above training process, the first device can also use the second type data as a constraint on the output data of the second model to improve training efficiency and the performance of the trained second model.
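The two forms of the second loss function in Case 4 can be written compactly as follows. The notation is an assumption introduced here for illustration and does not appear in this application: x, z and y denote the first, second and third type data, g_φ denotes the trained fourth model (proxy encoder), f_θ denotes the second model (decoder), ℓ is any distance measure such as the mean squared error, and λ_3, λ_4 are the weights of the third and fourth sub-loss functions.

```latex
% Form 1: the second loss uses only the output of the second model.
\mathcal{L}_2 = \ell\bigl(f_\theta(g_\phi(x)),\, y\bigr)

% Form 2: weighted sum of the third and fourth sub-loss functions.
\mathcal{L}_2 = \lambda_3\, \ell\bigl(g_\phi(x),\, z\bigr)
              + \lambda_4\, \ell\bigl(f_\theta(g_\phi(x)),\, y\bigr)
```

In the second form, the first term keeps the proxy encoder output close to the second type data, while the second term drives the decoder output toward the third type data.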

[0200] The model training method in the embodiments of this application will be described below with reference to specific implementation methods.

[0201] Taking the training of the encoder in the terminal as an example, the first device in this embodiment can be a terminal or a computing center of a terminal manufacturer. As shown in Figure 13, the model training method provided in this application includes:

[0202] Step 1301: The network device trains the first encoder and the first decoder.

[0203] In some embodiments, the network device acquires first type data and third type data to train a first encoder and a first decoder, so that the first encoder and the first decoder cooperate to predict the third type data based on the first type data. Optionally, the first encoder is a proxy encoder of the network device.

[0204] Step 1302: The network device or data node inputs the first type of data into the first encoder that has been trained to obtain the second type of data.

[0205] The first type of data in step 1302 may be the same as the first type of data in step 1301 above, or it may be different from the first type of data in step 1301 above.

[0206] Step 1303: The terminal acquires the first type of data, the second type of data, and the third type of data.

[0207] The first type of data and the second type of data acquired by the terminal are the same as those in step 1302 above. The third type of data acquired by the terminal may be the same as or different from the third type of data in step 1301 above. The third type of data in step 1303 corresponds to the first type of data in step 1302. For example, the first type of data in step 1302 may be a portion of the channel information in the third type of data in step 1303.
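Steps 1302 and 1303 amount to assembling, for each sample, a triplet of first, second and third type data. A minimal sketch is given below, assuming that the first type data is a portion of the third type data selected by index and that the trained first encoder is available as a callable. The helper `build_training_triplets` and the pair-averaging `toy_first_encoder` are hypothetical stand-ins, not models from this application.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def build_training_triplets(
    third_type_samples: Sequence[Vector],
    keep_indices: Sequence[int],
    first_encoder: Callable[[Vector], Vector],
) -> List[Tuple[Vector, Vector, Vector]]:
    """Return (first type, second type, third type) triplets for the terminal."""
    triplets = []
    for full in third_type_samples:
        partial = [full[i] for i in keep_indices]  # first type data (partial channel info)
        code = first_encoder(partial)  # second type data, as in step 1302
        triplets.append((partial, code, full))
    return triplets


def toy_first_encoder(x: Vector) -> Vector:
    """Hypothetical stand-in for the trained first encoder: averages adjacent pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


samples = [[1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]]
triplets = build_training_triplets(
    samples, keep_indices=[0, 1, 2, 3], first_encoder=toy_first_encoder
)
```

Each triplet keeps the correspondence required in step 1303: the third type data in a triplet is the sample from which its first type data was taken.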

[0208] Step 1304: The terminal trains the second decoder based on the second type of data and the third type of data.

[0209] The second decoder is a proxy decoder configured for the terminal.

[0210] It should be noted that the specific implementation process of step 1304 can refer to case 3 in step 1201 above, and this application will not repeat it here.

[0211] Step 1305: The terminal performs joint training on the second decoder and the second encoder based on the first type of data and the third type of data.

[0212] The second encoder is the encoder configured for the terminal.

[0213] It should be noted that the specific implementation process of step 1305 can refer to case 3 in step 1202 above, and this application will not repeat it here.

[0214] Taking the training of the decoder in the network device as an example, the first device in this embodiment can be a network device. As shown in Figure 14, the model training method provided in this application includes:

[0215] Step 1401: The terminal trains the third encoder and the third decoder.

[0216] In some embodiments, the terminal acquires first type data and third type data to train a third encoder and a third decoder, so that the third encoder and the third decoder cooperate to predict the third type data based on the first type data. The third decoder is a proxy decoder for the terminal.

[0217] Step 1402: The terminal inputs the first type of data into the trained third encoder to obtain the second type of data.

[0218] The first type of data in step 1402 may be the same as the first type of data in step 1401 above, or it may be different from the first type of data in step 1401 above.

[0219] Step 1403: The network device acquires the first type of data, the second type of data, and the third type of data.

[0220] The first type of data and the second type of data acquired by the network device are the same as those in step 1402 above. The third type of data acquired by the network device may be the same as or different from the third type of data in step 1401 above. The third type of data in step 1403 corresponds to the first type of data in step 1402. For example, the first type of data in step 1402 may be a portion of the channel information in the third type of data in step 1403.

[0221] Step 1404: The network device trains the fourth encoder based on the first type of data and the second type of data.

[0222] The fourth encoder is a proxy encoder configured for network devices.

[0223] It should be noted that the specific implementation process of step 1404 can refer to case 4 in step 1201 above, and this application will not repeat it here.

[0224] Step 1405: The network device performs joint training on the fourth decoder and the fourth encoder based on the first type of data and the third type of data.

[0225] The fourth decoder is the decoder configured for the network device.

[0226] It should be noted that the specific implementation process of step 1405 can refer to case 4 in step 1202 above, and this application will not repeat it here.

[0227] In the embodiments of this application, the partial channel information may be partial channel information in the frequency domain and / or time domain, or it may be channel information of some channels among multiple channels. This application does not limit this.

[0228] The above mainly describes the solutions provided by the embodiments of this application from the perspective of interaction between network elements. Correspondingly, the embodiments of this application also provide a model training device for implementing the various methods described above. The model training device can be the first device in the above method embodiments, or a device containing the first device, or a component usable in the first device; the model training device can be a network device in the above method embodiments, or a device containing the network device, or a component usable in the network device; or, the model training device can be a terminal in the above method embodiments, or a device containing the terminal, or a component usable in the terminal. It is understood that, in order to achieve the above functions, the model training device includes hardware structures and / or software modules corresponding to the execution of each function. Those skilled in the art should readily recognize that, in conjunction with the units and algorithm steps of the various examples described in the embodiments disclosed herein, this application can be implemented in hardware or a combination of hardware and computer software. Whether a function is executed in hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0229] This application embodiment can divide the model training device into functional modules according to the above method embodiment. For example, each function can be divided into its own functional module, or two or more functions can be integrated into one processing module. The integrated module can be implemented in hardware or as a software functional module. It should be understood that the module division in this application embodiment is illustrative and only represents one logical functional division. In actual implementation, there may be other division methods.

[0230] For example, Figure 15 is a schematic diagram of a model training device 1500 provided in an embodiment of this application. The model training device includes a transceiver module 1510 and, optionally, a processing module 1520. The transceiver module 1510, also known as a transceiver unit, is used to implement transceiver functions, and may be, for example, a transceiver circuit, transceiver, transceiver device, or communication interface.

[0231] Taking the model training device 1500 as the first device in the above method embodiment, or a device containing the above first device, or a component that can be used in the first device as an example, then: the transceiver module 1510 is used to acquire first type data, second type data and third type data, wherein the first type data is used to characterize first channel information, the second type data is the output data obtained by inputting the first type data into the first model, and the second type data is used to determine the third type data, the third type data is used to characterize second channel information, the second channel information includes at least one of the following: all or part of the channel information in the first channel information, or the third channel information, the third channel information being different from the first channel information.

[0232] The processing module 1520 is used to train the second model based on the first type of data, the second type of data, and the third type of data.

[0233] In one possible implementation, the second model has the same function as the first model; or, the second model is used to determine a second type of data.

[0234] In one possible implementation, the transceiver module 1510 is specifically used to acquire the third type of data and the first mapping relationship; the processing module 1520 is specifically used to acquire the first type of data from the third type of data based on the first mapping relationship.

[0235] In one possible implementation, the processing module 1520 is further configured to: determine a second mapping relationship; based on the second mapping relationship, obtain a fourth type of data from the third type of data, the fourth type of data being used to characterize fourth channel information, the fourth channel information being a portion of the channel information in the second channel information, and the fourth channel information not being completely identical to the first channel information; and train the second model based on the fourth type of data and the third type of data.
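The first and second mapping relationships can be pictured as index selections applied to the third type data. The sketch below assumes that a mapping relationship is simply a list of positions to keep. The representation, the helper names and the toy values are assumptions for illustration and are not defined by this application.

```python
from typing import List, Sequence


def apply_mapping(third_type: Sequence[float], mapping: Sequence[int]) -> List[float]:
    """Select the entries of the third type data indicated by a mapping relationship."""
    return [third_type[i] for i in mapping]


def differs_from_first(first_mapping: Sequence[int], second_mapping: Sequence[int]) -> bool:
    """Check that the second mapping does not select exactly the same entries as the first."""
    return set(first_mapping) != set(second_mapping)


third_type = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
first_mapping = [0, 2, 4]   # yields the first type data
second_mapping = [1, 2, 5]  # yields the fourth type data

first_type = apply_mapping(third_type, first_mapping)
fourth_type = apply_mapping(third_type, second_mapping)
```

Because the two mappings may overlap but must not coincide, the fourth type data gives the second model additional partial views of the same third type data.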

[0236] In one possible implementation, the transceiver module 1510 is specifically used to receive the first mapping relationship, or the processing module 1520 is specifically used to determine the first mapping relationship.

[0237] In one possible implementation, the first type of data includes the first channel information.

[0238] In one possible implementation, the first model and the third model are used to determine the second channel information based on the first channel information.

[0239] In one possible implementation, the first model is used to determine the output of the first model based on the first channel information, and the third model is used to recover the second channel information based on the output of the first model.

[0240] In one possible implementation, the processing module 1520 is further configured to: train a second model and a fourth model based on the first type of data, the second type of data, and the third type of data, wherein the fourth model has the same function as the third model, or the fourth model has the same function as the first model.

[0241] In one possible implementation, the processing module 1520 is further configured to: train the fourth model based on first data, wherein the first data includes first type data and second type data, or the first data includes second type data and third type data; train the second model and the trained fourth model based on second data, wherein the second data includes first type data and third type data, or the second data includes first type data, second type data and third type data.

[0242] In one possible implementation, the processing module 1520 is further configured to: determine a first loss function based on the first data, wherein the first loss function is used to train the fourth model.

[0243] In one possible implementation, the processing module 1520 is further configured to: input the second type of data into the fourth model to obtain the first output data of the fourth model; calculate the first loss function of the fourth model based on the first output data and the third type of data; and iteratively train the fourth model based on the first loss function, the second type of data, and the third type of data until the convergence condition of the fourth model is met, so as to obtain the trained fourth model.

[0244] In one possible implementation, the processing module 1520 is further configured to: determine a second loss function based on the second data, the second loss function being used to train the second model.

[0245] In one possible implementation, the processing module 1520 is further configured to: input the first type of data into the second model to obtain the second output data output by the second model; input the second output data into the trained fourth model to obtain the third output data output by the trained fourth model; calculate the second loss function based on the third output data and the third type of data; and iteratively train the second model and the trained fourth model based on the second loss function, the first type of data, and the third type of data until the convergence condition for training the second model and the fourth model is met, so as to obtain the trained second model in the first device.

[0246] In one possible implementation, the first model and the third model are models obtained after training based on the first type of data and the third type of data.

[0247] In one possible implementation, a first model is used to compress (or encode) first channel information to obtain compressed data (or encoded data); a third model is used to recover second channel information based on the compressed data (or encoded data).

[0248] All relevant content of each step involved in the above method embodiments can be referred to in the functional description of the corresponding functional module, and will not be repeated here. Optionally, the model training device 1500 may further include a storage module 1530, which can be used to store instructions and / or data, and the processing module 1520 can read the instructions and / or data in the storage module 1530.

[0249] In this embodiment, the model training device 1500 is presented in the form of functional modules divided in an integrated manner. Here, "module" can refer to an application-specific integrated circuit (ASIC), a circuit, a processor and memory executing one or more software or firmware programs, integrated logic circuits, and / or other devices that can provide the aforementioned functions. In a simple embodiment, those skilled in the art will understand that the model training device 1500 can take the form of the model training device 600 shown in Figure 6.

[0250] Specifically, the functions / implementation processes of the transceiver module 1510 and the processing module 1520 in Figure 15 can be implemented by the processor 601 in the model training device 600 shown in Figure 6 calling the computer-executable instructions stored in the memory 602. Alternatively, the function / implementation process of the processing module 1520 in Figure 15 can be implemented by the processor 601 in the model training device 600 shown in Figure 6 calling the computer-executable instructions stored in the memory 602, and the function / implementation process of the transceiver module 1510 in Figure 15 can be implemented by the transceiver 605 in the model training device 600 shown in Figure 6.

[0251] Since the model training apparatus provided in this application embodiment can execute the above-described model training method, the technical effects it can achieve can be referred to the above-described method embodiment, and will not be repeated here.

[0252] It should be understood that one or more of the above modules or units can be implemented by software, hardware, or a combination of both. When any of the above modules or units are implemented by software, the software exists as computer program instructions and is stored in memory. The processor can be used to execute the program instructions and implement the above method flow. The processor can be built into a SoC (System-on-a-Chip) or ASIC, or it can be a separate semiconductor chip. In addition to the core that executes software instructions for computation or processing, the processor may further include necessary hardware accelerators, such as field-programmable gate arrays (FPGAs), programmable logic devices (PLDs), or logic circuits that implement dedicated logic operations.

[0253] When the above modules or units are implemented in hardware, the hardware can be any one or any combination of a central processing unit (CPU), microprocessor, digital signal processing (DSP) chip, microcontroller unit (MCU), artificial intelligence processor, ASIC, SoC, FPGA, PLD, application-specific digital circuit, hardware accelerator, or non-integrated discrete device, which can run the necessary software or perform the above method flow independently of software.

[0254] Optionally, embodiments of this application also provide a model training apparatus (e.g., the model training apparatus may be a chip or a chip system), which includes a processor for implementing the methods in any of the above method embodiments. In one possible design, the model training apparatus further includes a memory. The memory is used to store necessary program instructions and data, and the processor can call the program code stored in the memory to instruct the model training apparatus to execute the methods in any of the above method embodiments. Of course, the memory may not be included in the model training apparatus. When the model training apparatus is a chip system, it may be composed of chips or may include chips and other discrete devices; embodiments of this application do not specifically limit this.

[0255] Optionally, embodiments of this application also provide a computer-readable storage medium storing a computer program or instructions that, when run on a model training device, enable the model training device to execute the methods described in any of the above method embodiments or any implementation thereof.

[0256] Optionally, embodiments of this application also provide a communication system, which includes the network device and the terminal described in the above method embodiments.

[0257] The above embodiments can be implemented, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented using software programs, they can be implemented, in whole or in part, in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flows or functions according to the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available media can be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid-state disks (SSDs)).

[0258] Although this application has been described herein in conjunction with various embodiments, those skilled in the art, by reviewing the accompanying drawings, the disclosure, and the appended claims, will understand and implement other variations of the disclosed embodiments in carrying out the claimed application. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude multiple instances. A single processor or other unit can implement several functions listed in the claims. While different dependent claims may recite certain measures, this does not mean that these measures cannot be combined to produce good results.

[0259] Although this application has been described in conjunction with specific features and embodiments, it is obvious that various modifications and combinations can be made thereto without departing from the scope of this application. Accordingly, this specification and drawings are merely exemplary illustrations of the application as defined by the appended claims, and are intended to cover any and all modifications, variations, combinations, or equivalents within the scope of this application. Clearly, those skilled in the art can make various alterations and variations to this application without departing from its scope. Thus, if such modifications and variations fall within the scope of the claims and their equivalents, this application is also intended to include them.

Claims

1. A model training method, characterized in that it comprises: acquiring first type data, second type data, and third type data, wherein the first type data is used to characterize first channel information, the second type data is output data obtained by inputting the first type data into a first model, the second type data is used to determine the third type data, the third type data is used to characterize second channel information, and the second channel information includes at least one of the following: all or part of the channel information in the first channel information, or third channel information, the third channel information being different from the first channel information; and training a second model based on the first type data, the second type data, and the third type data.

2. The method according to claim 1, characterized in that the second model has the same function as the first model; or, the second model is used to determine the second type data.

3. The method according to claim 1 or 2, characterized in that acquiring the first type data comprises: obtaining the third type data and a first mapping relationship; and obtaining, based on the first mapping relationship, the first type data from the third type data.

4. The method according to claim 3, characterized in that the method further comprises: determining a second mapping relationship; obtaining, based on the second mapping relationship, fourth type data from the third type data, wherein the fourth type data is used to characterize fourth channel information, the fourth channel information is part of the channel information in the second channel information, and the fourth channel information is not completely identical to the first channel information; and training the second model based on the fourth type data and the third type data.

5. The method according to claim 3 or 4, characterized in that obtaining the first mapping relationship comprises: receiving the first mapping relationship; or determining the first mapping relationship.

6. The method according to any one of claims 1-5, characterized in that, The first type of data includes the first channel information.

7. The method according to any one of claims 1-6, characterized in that, The first model and the third model are used to determine the second channel information based on the first channel information.

8. The method according to claim 7, characterized in that, The first model is used to determine the output of the first model based on the first channel information, and the third model is used to recover the second channel information based on the output of the first model.

9. The method according to claim 7 or 8, characterized in that training the second model based on the first type data, the second type data, and the third type data comprises: training the second model and a fourth model based on the first type data, the second type data, and the third type data, wherein the fourth model has the same function as the third model, or the fourth model has the same function as the first model.

10. The method according to claim 9, characterized in that training the second model and the fourth model based on the first type data, the second type data, and the third type data comprises: training the fourth model based on first data, wherein the first data includes the first type data and the second type data, or the first data includes the second type data and the third type data; and training the second model and the trained fourth model based on second data, wherein the second data includes the first type data and the third type data, or the second data includes the first type data, the second type data, and the third type data.

11. The method according to claim 10, characterized in that training the fourth model based on the first data comprises: determining a first loss function based on the first data, wherein the first loss function is used to train the fourth model.

12. The method according to claim 10 or 11, characterized in that training the second model and the trained fourth model based on the second data comprises: determining a second loss function based on the second data, wherein the second loss function is used to train the second model.

13. The method according to any one of claims 7-12, characterized in that, The first model and the third model are models obtained after training based on the first type of data and the third type of data.

14. A model training device, characterized in that it comprises: a functional unit for performing the method according to any one of claims 1-13, wherein the actions performed by the functional unit are implemented by hardware or by hardware executing corresponding software.

15. A model training device, characterized in that it comprises a processor, wherein the processor is connected to a memory, the memory is used to store computer-executable instructions, and the processor executes the computer-executable instructions stored in the memory to cause the model training device to implement the method according to any one of claims 1-13.

16. A computer-readable storage medium, characterized in that it includes instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1-13.

17. A chip, characterized in that the chip includes a processor, wherein the processor is connected to a memory, the memory is used to store computer-executable instructions, and the processor executes the computer-executable instructions stored in the memory to cause a model training device to implement the method according to any one of claims 1-13.

18. A computer program product containing instructions, characterized in that, when the instructions are run on a model training device, the model training device is caused to perform the method according to any one of claims 1-13.