Training data acquisition method and related apparatus

By acquiring and processing the first data sequence from T data sequences, and using its overall correlation to update the weights of the neural network model, the problem of insufficient accuracy in neural network model training output is solved, and higher output accuracy is achieved.

WO2025209536A9 · PCT designated stage · Publication Date: 2026-06-25 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
HUAWEI TECH CO LTD
Filing Date
2025-04-02
Publication Date
2026-06-25

AI Technical Summary

Technical Problem

Existing neural network model training methods often result in models with insufficient accuracy, failing to meet the required standards.

Method used

By acquiring T data sequences, the first data sequence is determined as the input data sequence and/or label of the neural network model. The correlation between multiple data is considered, and the model weights are updated using the overall difference to improve the output accuracy.

Benefits of technology

The trained neural network model can learn more dimensions of input-output relationships, thus improving the accuracy of the output.

✦ Generated by Eureka AI based on patent content.

Figure CN2025086887_25062026_PF_FP_ABST

Abstract

The present application is applied to the field of artificial intelligence. Provided are a training data acquisition method and a related apparatus. In the technical solution provided in the present application, an input data sequence for training a neural network model is acquired on the basis of a plurality of data sequences, wherein a plurality of pieces of data in the input data sequence are sequentially input into the neural network model. The technical solution provided in the present application enables acquired data to improve the accuracy of neural network models obtained by means of training.

Description

Methods and related devices for acquiring training data

[0001] This application claims priority to Chinese Patent Application No. 202410407969.4, filed on April 3, 2024, entitled “Method and Apparatus for Acquiring Training Data”, the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of artificial intelligence, and in particular to methods and apparatus for acquiring training data.

Background Technology

[0003] With the development of artificial intelligence research, the applications of neural network models are constantly expanding. Neural network models typically need to be trained to acquire the corresponding functions.

[0004] The training method for neural network models is as follows: A data acquisition device collects training data to form a dataset. The training device then obtains the input data and its labels from this dataset. The input data is fed into the neural network model to obtain its actual output. The difference between the actual output and the labels is compared. If the difference exceeds a threshold, the model parameters are adjusted until the difference is less than the threshold. The trained neural network model learns the relationship between the input data and the labels. Based on this relationship, the model can process the input data to obtain the corresponding output. However, the accuracy of the output from the neural network model trained using this method often fails to meet requirements.

Summary of the Invention

[0005] This application provides a method and related apparatus for acquiring training data, which helps to improve the accuracy of the output data of the trained neural network model.

[0006] In a first aspect, this application provides a method for acquiring training data. The method includes: acquiring T data sequences, each of the T data sequences containing at least one data point, where T is an integer greater than 1; determining a first data sequence based on the T data sequences, the first data sequence containing M data points, where M is a positive integer greater than 1, the first data sequence being used as an input data sequence for a neural network model and / or as a label for an input data sequence for a neural network model, wherein the data points in the input data sequence are input into the neural network model sequentially.

[0007] In this method, because multiple data points in the first data sequence are sequentially input into the neural network, and the label corresponding to the first data sequence is the label of the first data sequence as a whole, the neural network model trained based on the first data sequence and its overall label can consider or utilize the influence of the relationships between multiple data points in the input data sequence on the relationship between input and output when obtaining the output data sequence based on the input data sequence. Compared to inputting multiple data points into a neural network model in an unordered or random manner and training the neural network based on the labels associated with each data point, this method, by considering or utilizing the influence of the relationships between multiple data points in the input data sequence on the relationship between input and output, allows the trained neural network model to learn or extract more dimensions of the relationship between output and input, thus obtaining a more accurate output based on the input.

[0008] Some implementations of this method further include: obtaining first information, where the first information indicates P data sequences from T data sequences used to obtain the first data sequence, where P is a positive integer less than or equal to T. Correspondingly, determining the first data sequence based on the T data sequences includes: determining the first data sequence based on the first information and the T data sequences.

[0009] In some implementations, the T data sequences include a first sequence, wherein the first information includes first indication information, which indicates the data sequence among the T data sequences that is concatenated with the first sequence.

[0010] In this way, based on the first indication information, the data sequence that is combined with the first sequence to obtain the first data sequence can be determined, thereby determining the first data sequence.

[0011] In some implementations, the T data sequences include a first sequence and a second sequence, wherein the first information includes second indication information, which indicates whether the second sequence is concatenated with the first sequence.

[0012] In this way, based on the second indication information, it can be determined whether the second sequence can be combined with the first sequence to obtain the first data sequence, thereby determining the first data sequence.

[0013] In some implementations, the second sequence can be a sequence that follows the first sequence in the transmission order or in the storage location.
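The concatenation indication in paragraphs [0011] to [0013] can be pictured as one flag per sequence. The following is a minimal sketch only: the function name `merge_by_flags` and the one-flag-per-sequence layout are assumptions made for illustration and are not specified by the application.

```python
def merge_by_flags(sequences, concat_with_previous):
    """Merge each sequence into the one before it when its flag is True.

    `concat_with_previous[i]` plays the role of the indication information
    for sequence i (hypothetical layout: one boolean per sequence).
    """
    merged = []
    for seq, flag in zip(sequences, concat_with_previous):
        if flag and merged:
            # Concatenate with the sequence that precedes it in order.
            merged[-1] = merged[-1] + list(seq)
        else:
            # Start a new, independent data sequence.
            merged.append(list(seq))
    return merged


sequences = [[1, 2], [3], [4, 5], [6]]
flags = [False, True, False, True]
merged = merge_by_flags(sequences, flags)
```

Under these assumptions, the four stored sequences become two longer data sequences, each of which could then serve as an input data sequence.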

[0014] In some implementations, the first information includes at least one of the following: a start sequence identifier of the first data sequence, P, or an end sequence identifier of the first data sequence.

[0015] In this implementation, P data sequences for obtaining the first data sequence are determined from T data sequences based on the start sequence identifier, P, or the end sequence identifier of the first data sequence, thereby determining the first data sequence.

[0016] Here, the P contained in the first information can be understood as the number of data sequences contained in the first data sequence.

[0017] In some implementations, the first information indicates P data sequences, including: the first information indicates the characteristics of the P data sequences. This allows the P data sequences out of T data sequences that satisfy the characteristics indicated by the first information to be identified as the data sequences used to obtain the first data sequence, thereby determining the first data sequence.

[0018] For example, if the first information indicates the length of the first sequence, then P data sequences with a length equal to the length of the first sequence are used to obtain the first data sequence.

[0019] For example, if the first information indicates an odd number, then the P data sequences marked as odd are used to obtain the first data sequence.

[0020] For example, if the first information indicates an even number, then P data sequences marked as even numbers are used to obtain the first data sequence.

[0021] In some implementations, the first information includes the identifier information for each of the P data sequences. The identifier information for each data sequence can be its index among the T data sequences.

[0022] In this implementation, the P data sequences indicated by the P identifiers contained in the first information can be determined as the data sequences used to obtain the first data sequence, thereby determining the first data sequence.
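The selection options described in paragraphs [0014] to [0022] can be sketched as follows. This is an illustrative sketch, not the claimed method: the `FirstInfo` container, its field names, the use of list indices as sequence identifiers, and concatenation of the selected sequences in the order they are selected are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FirstInfo:
    """Hypothetical container for the 'first information'."""
    start: Optional[int] = None          # start sequence identifier
    count: Optional[int] = None          # P, the number of sequences used
    end: Optional[int] = None            # end sequence identifier (inclusive)
    indices: Optional[List[int]] = None  # explicit identifiers of the P sequences
    parity: Optional[str] = None         # "odd" or "even" characteristic


def select_indices(info: FirstInfo, t: int) -> List[int]:
    """Return the indices of the P sequences indicated among t sequences."""
    if info.indices is not None:
        return list(info.indices)
    if info.parity is not None:
        wanted = 1 if info.parity == "odd" else 0
        return [i for i in range(t) if i % 2 == wanted]
    start = info.start if info.start is not None else 0
    if info.end is not None:
        end = info.end
    elif info.count is not None:
        end = start + info.count - 1
    else:
        end = t - 1
    return list(range(start, end + 1))


def build_first_data_sequence(sequences, info):
    """Concatenate the selected sequences into one data sequence."""
    first = []
    for i in select_indices(info, len(sequences)):
        first.extend(sequences[i])
    return first


t_sequences = [["h0", "h1"], ["h2"], ["h3", "h4"], ["h5"]]
by_range = build_first_data_sequence(t_sequences, FirstInfo(start=1, count=2))
by_index = build_first_data_sequence(t_sequences, FirstInfo(indices=[0, 3]))
by_parity = build_first_data_sequence(t_sequences, FirstInfo(parity="odd"))
```

Each call selects P of the T = 4 sequences in a different way (start identifier plus P, explicit identifiers, or a parity characteristic) and yields one first data sequence.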

[0023] In some implementations, the T data sequences include a first sequence, wherein obtaining the T data sequences includes obtaining the first sequence, and obtaining the first sequence includes obtaining second information, the second information being used to obtain the first sequence, and obtaining the first sequence based on the second information.

[0024] In some implementations, the second information is used to indicate the format of the first sequence. In some scenarios, the first sequence is generated based on the second information, i.e., the second information is used to generate the first sequence; in other scenarios, the first sequence is identified based on the second information, i.e., the second information is used to identify the first sequence.

[0025] When the second information indicates the format of the first sequence, in some implementations, the second information may include at least one of the following: the representation of the first sequence, the start data identifier of the first sequence, the length of the first sequence, or the end data identifier of the first sequence. In this implementation, the second information may be referred to as auxiliary information associated with the first sequence.
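One way to read the format fields in paragraph [0025] is as a slice specification over stored data. The sketch below assumes that data identifiers are integer positions in a flat list and that the end identifier is inclusive; the function name `extract_sequence` is hypothetical.

```python
def extract_sequence(stream, start=None, length=None, end=None):
    """Identify one sequence in `stream` from any two of start / length / end.

    `start` and `end` stand in for the start and end data identifiers
    (assumed here to be integer positions, with `end` inclusive).
    """
    if start is None:
        if end is None or length is None:
            raise ValueError("need at least two of start, length, end")
        start = end - length + 1
    if end is None:
        if length is None:
            raise ValueError("need at least two of start, length, end")
        end = start + length - 1
    return stream[start:end + 1]


stream = list(range(10))
by_start_length = extract_sequence(stream, start=2, length=3)
by_start_end = extract_sequence(stream, start=2, end=4)
by_end_length = extract_sequence(stream, end=4, length=3)
```

All three calls identify the same sequence, which illustrates why any subset of the fields that fixes both boundaries is sufficient.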

[0026] In some implementations, the second information may include the initial values of the model state corresponding to the first sequence. The initial values of the model state corresponding to the first sequence can also be referred to as auxiliary information associated with the first sequence.

[0027] In some implementations, multiple data sequences among the P data sequences are associated one-to-one with initial model states. In this case, as an example, the initial model state corresponding to the first data sequence is the initial model state associated with the data sequence, among the P data sequences, that is input into the neural network model first.

[0028] The initial value of the model state corresponding to the first data sequence refers to the initial value of the model state used by the neural network model when obtaining the output data sequence corresponding to the first data sequence.

[0029] In some implementations, this method further includes: obtaining first network state information, wherein the first network state information indicates the initial value of the model state used by the neural network model when obtaining the output data sequence corresponding to the first data sequence.

[0030] In some implementations, the data in the first data sequence includes channel state information.

[0031] In some implementations, neural network models are used for compression and / or prediction of channel state information.

[0032] In some implementations, the first data sequence is also used as the label corresponding to the first data sequence.

[0033] In some implementations, the T data sequences contain a third sequence, wherein obtaining the T data sequences includes obtaining the third sequence, which includes obtaining third information, the third information being used to obtain the third sequence, and obtaining the third sequence based on the third information.

[0034] In some implementations, the third information is used to indicate the format of the third sequence. In some scenarios, the third sequence is generated based on the third information, i.e., the third information is used to generate the third sequence; in other scenarios, the third sequence is identified based on the third information, i.e., the third information is used to identify the third sequence.

[0035] When the third information indicates the format of the third sequence, in some implementations, the third information may include at least one of the following: the representation of the third sequence, the start data identifier of the third sequence, the length of the third sequence, or the end data identifier of the third sequence. In this implementation, the third information may be referred to as auxiliary information associated with the third sequence.

[0036] In some implementations, the third information can include the initial values of the model state corresponding to the third sequence. These initial values can also be referred to as auxiliary information associated with the third sequence.

[0037] In some implementations, at least some of the auxiliary information corresponding to the first sequence and the auxiliary information corresponding to the third sequence are common auxiliary information.

[0038] When the first data sequence is used only as the input data sequence for the neural network model, in some implementations, this method further includes: obtaining S data sequences, each of which contains at least one data point, where S is an integer greater than 1; determining a second data sequence based on the S data sequences, where the second data sequence contains Q data points, where Q is a positive integer greater than 1, and using the second data sequence as a label for the first data sequence, wherein the multiple data points in the first data sequence are input into the neural network model sequentially.

[0039] In some implementations, each data point in each of the S data sequences is associated with at least one data point in the T data sequences. A second data sequence can be derived based on this association and the S data sequences.

[0040] In some implementations, these S data sequences include a fourth sequence. Obtaining the S data sequences includes obtaining the fourth sequence, which involves: obtaining fourth information, which is used to obtain the fourth sequence, and obtaining the fourth sequence based on the fourth information.

[0041] In some implementations, the fourth information is used to indicate the format of the fourth sequence. In some scenarios, the fourth sequence is generated based on the fourth information, i.e., the fourth information is used to generate the fourth sequence; in other scenarios, the fourth sequence is identified based on the fourth information, i.e., the fourth information is used to identify the fourth sequence.

[0042] When the fourth information indicates the format of the fourth sequence, in some implementations, the fourth information may include at least one of the following: the representation of the fourth sequence, the start data identifier of the fourth sequence, the length of the fourth sequence, or the end data identifier of the fourth sequence. In this implementation, the fourth information may be referred to as auxiliary information associated with the fourth sequence.

[0043] In some implementations, the fourth information may include the initial values of the model state corresponding to the fourth sequence. These initial values can also be referred to as auxiliary information associated with the fourth sequence.

[0044] In some implementations, these S data sequences include a fifth sequence. Obtaining the S data sequences includes obtaining the fifth sequence, which in turn includes obtaining fifth information, which is used to obtain the fifth sequence, and obtaining the fifth sequence based on the fifth information.

[0045] In some implementations, the fifth information is used to indicate the format of the fifth sequence. In some scenarios, the fifth sequence is generated based on the fifth information, i.e., the fifth information is used to generate the fifth sequence; in other scenarios, the fifth sequence is identified based on the fifth information, i.e., the fifth information is used to identify the fifth sequence.

[0046] When the fifth information indicates the format of the fifth sequence, in some implementations, the fifth information may include at least one of the following: the representation of the fifth sequence, the start data identifier of the fifth sequence, the length of the fifth sequence, or the end data identifier of the fifth sequence. In this implementation, the fifth information may be referred to as auxiliary information associated with the fifth sequence.

[0047] In some implementations, the fifth information may include the initial values of the model state corresponding to the fifth sequence. These initial values can also be referred to as auxiliary information associated with the fifth sequence.

[0048] In some implementations, at least some of the auxiliary information corresponding to the fourth sequence and the auxiliary information corresponding to the fifth sequence are common auxiliary information.

[0049] In some implementations, at least some of the auxiliary information corresponding to the first sequence and the auxiliary information corresponding to the fourth sequence are common auxiliary information.

[0050] In a second aspect, this application provides a training data acquisition device. This device may include modules corresponding to the methods / operations / steps / actions described in the first aspect. These modules may be hardware circuits, software, or a combination of hardware circuits and software.

[0051] In one design, the device may include a processing module and a communication module. The communication module is used to perform the sending and receiving actions in the method described in the first aspect above, while the processing module is used to perform actions involving processing (e.g., acquisition and determination) in the method described in the first aspect above.

[0052] In one design, the device can be a terminal, or a device, module, circuit, or chip configured in the terminal, or a device that can be used in conjunction with the terminal.

[0053] In one design, the device can be a network-side device, such as a network device, a device, module, circuit, or chip configured in the network device, a device that can be used in conjunction with the network device, or a device that communicates with the network device, such as a server.

[0054] In a third aspect, an apparatus is provided, including a processor and a storage medium storing instructions that, when executed by the processor, cause a method as described in the first aspect or any possible implementation thereof to be implemented.

[0055] In a fourth aspect, an apparatus is provided, including processing circuitry for processing data and / or information such that the methods described in the first aspect or any possible implementation thereof are implemented.

[0056] The processing circuit may include one or more processors, or all or part of the circuitry in one or more processors used for processing functions.

[0057] Optionally, the apparatus may further include a memory for storing programs or instructions, and the processor for running the programs or instructions to implement the methods as described in the first aspect or any possible implementation thereof.

[0058] Optionally, the device may also include a transceiver circuit or an input / output interface.

[0059] In a fifth aspect, a chip is provided, including processing circuitry for running programs or instructions to implement methods as described in the first aspect or any possible implementation thereof.

[0060] Optionally, the chip may further include a memory for storing programs or instructions.

[0061] Optionally, the chip may also include transceiver circuitry, or input / output interfaces.

[0062] A sixth aspect provides a computer-readable storage medium comprising instructions that, when executed by a processor, cause the method as described in the first aspect or any possible implementation thereof to be implemented.

[0063] In a seventh aspect, a computer program product is provided, the computer program product comprising computer program code or instructions, which, when executed, cause the method as described in the first aspect or any possible implementation thereof to be implemented.

[0064] [Revised according to Article 91, 19.11.2025] In an eighth aspect, a training system for a neural network model is provided, the system including means for performing the first aspect or any possible implementation of the first aspect.

[0065] In some possible implementations, the training system includes a communication system, or the communication system includes the training system.

Attached Figure Description

[0066] Figure 1 is a system architecture diagram of an embodiment of this application;

[0067] Figure 2 is a flowchart of a training method according to an embodiment of this application;

[0068] Figure 3 is a flowchart of a training method according to an embodiment of this application;

[0069] Figure 4 is a flowchart of a method for obtaining training data according to an embodiment of this application;

[0070] Figure 5 is an architecture diagram of a communication system according to an embodiment of this application;

[0071] Figure 6 is a flowchart of a communication system according to an embodiment of this application;

[0072] Figure 7 is a flowchart of a communication system according to an embodiment of this application;

[0073] Figure 8 is a flowchart of a communication system according to an embodiment of this application;

[0074] Figure 9 is a structural diagram of a training data acquisition device according to an embodiment of this application;

[0075] Figure 10 is a structural diagram of a training data acquisition device according to an embodiment of this application.

Detailed Implementation

[0076] The technical solutions in the embodiments of this application will now be described with reference to the accompanying drawings.

[0077] To facilitate a clear description of the technical solutions in the embodiments of this application, the terms "first" and "second" are used in the embodiments of this application to distinguish identical or similar items with essentially the same function and effect. Those skilled in the art will understand that the terms "first" and "second" do not limit the quantity or execution order, and that items described as "first" and "second" are not necessarily different.

[0078] It should be noted that, in the embodiments of this application, the terms "exemplary" or "for example" are used to indicate examples, illustrations, or descriptions. Any embodiment or design scheme described as "exemplary" or "for example" in this application should not be construed as being more preferred or advantageous than other embodiments or design schemes. Specifically, the use of terms such as "exemplary" or "for example" is intended to present the relevant concepts in a specific manner.

[0079] In this application embodiment, "at least one" refers to one or more, and "more than one" refers to two or more. "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A alone, A and B simultaneously, or B alone, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can represent: a, b, c, ab, ac, bc, or abc, where a, b, and c can be single or multiple.

[0080] To better understand the embodiments of this application, some relevant knowledge about neural network models will be introduced below.

[0081] The work of each layer in a neural network model (such as a deep neural network model) can be described by the mathematical expression y = a(W*x + b). The work of each layer in a neural network model can be understood as transforming the input space (the set of input vectors) to the output space (i.e., from the row space to the column space of a matrix) through five operations on the input space: 1. Dimensionality increase / decrease; 2. Magnification / reduction; 3. Rotation; 4. Translation; 5. "Bending". Operations 1, 2, and 3 are performed by "W*x", operation 4 by "+b", and operation 5 by "a()". The term "space" is used here because the objects being classified are not individual things, but a class of things; space refers to the set of all individuals of this class of things. Here, W is a weight vector, and each value in this vector represents the weight value of a neuron in that layer of the neural network. This vector W determines the spatial transformation from the input space to the output space described above; that is, the weights W of each layer control how the space is transformed. The purpose of training a neural network model is to ultimately obtain the weight matrix of all layers of the trained neural network model (the weight matrix formed by the vectors W of many layers). Therefore, the training process of a neural network model is essentially learning how to control the transformation space, and more specifically, learning the weight matrix.
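The expression y = a(W*x + b) can be made concrete with a small worked example. The sketch below is illustrative only: the function names, the choice of ReLU as the "bending" function a(), and the specific numbers are assumptions, not taken from the application.

```python
def relu(v):
    """One possible choice for a(): the nonlinear 'bending' step."""
    return max(0.0, v)


def dense_layer(weights, bias, x, activation):
    """Compute y = a(W*x + b) for one layer; `weights` is a list of rows."""
    # W*x: each row of W produces one output coordinate (scaling, rotation,
    # and a change of dimension when W is not square).
    # +b: translation of each coordinate.
    pre_activation = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]
    # a(): applied element-wise.
    return [activation(v) for v in pre_activation]


W = [[1.0, -1.0], [0.5, 0.5], [0.0, 2.0]]  # 3 x 2: maps a 2-D input to 3-D
b = [0.0, 1.0, -1.0]
y = dense_layer(W, b, [2.0, 1.0], relu)
```

Here a 2-dimensional input is mapped to a 3-dimensional output, which corresponds to the dimensionality increase described above.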

[0082] Because we want the output of the neural network model to be as close as possible to the desired target value, we can compare the current output value of the neural network model with the target value, and then update the weight vector of each layer of the neural network model based on the difference between the two (of course, there is usually an initialization process before the first update, that is, pre-configuring the parameters or initial values of each layer in the neural network model). For example, if the output value of the neural network model is too high, the weight vector is adjusted to make the output value of the neural network model lower, and this adjustment is continued until the neural network model can output the target value. Therefore, it is necessary to predefine "how to compare the difference between the output value and the target value", which is the loss function or objective function. These are important equations used to measure the difference between the output value and the target value. Taking the loss function as an example, the higher the output value (loss) of the loss function, the greater the difference. Therefore, the training of the deep neural network becomes the process of minimizing this loss as much as possible. In this application, the target value can be called the label, the true value, the expected value, the target output, or the expected output.
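The idea of lowering the weight when the output is too high can be illustrated with a one-weight model. The sketch below assumes a squared-error loss and plain gradient descent; the application itself does not prescribe a particular loss function or update rule, and the function name and numbers are hypothetical.

```python
def train_scalar(x, target, w=0.0, lr=0.1, steps=50):
    """Minimize the loss (w*x - target)^2 over a single weight w."""
    for _ in range(steps):
        y = w * x                       # current output of the model
        grad = 2.0 * (y - target) * x   # derivative of the loss w.r.t. w
        # If y is above the target (and x > 0), grad is positive and w is
        # lowered; if y is below the target, w is raised.
        w -= lr * grad
    return w


w = train_scalar(x=2.0, target=6.0)
loss = (w * 2.0 - 6.0) ** 2
```

With these values the weight settles near 3, the point at which the output equals the target and the loss is close to zero.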

[0083] It is understood that the neural network model in this application can also be replaced by an artificial intelligence (AI) model, a machine learning (ML) model, or simply a model.

[0084] Referring to Figure 1, one embodiment of this application provides a system architecture. The system 100 includes an execution device 110, a training device 120, a database 130, and a data acquisition device 140.

[0085] Data acquisition device 140 is used to acquire data and store it in database 130. Training device 120 is used to train a neural network model based on the data maintained in database 130 to obtain neural network model 101. The neural network model 101 obtained by training device 120 is applied to execution device 110. Execution device 110 is used to process the input data using neural network model 101 to obtain output.

[0086] In some implementations, the database 130 may be located in the training device.

[0087] In some implementations, the data acquisition device 140 and the training device 120 can be the same device.

[0088] In some implementations, the training device 120 and the execution device 110 can be the same device.

[0089] Figure 2 is a schematic diagram of a model training method according to an embodiment of this application. Taking the system shown in Figure 1 as an example, this method can be implemented or executed by the training device 120 in Figure 1.

[0090] As shown in Figure 2, H1 is input to the neural network model, and the output of the neural network model after H1 is input is denoted as C1. C1 is compared with the label C1' corresponding to H1, and the weight vector of the neural network model is updated based on the difference between C1 and C1'. H2 is then input to the neural network model, and the output after H2 is input is denoted as C2. C2 is compared with the label C2' corresponding to H2, and the neural network model is updated based on the difference between C2 and C2', for example, by updating the weight vector of the neural network model. And so on: Hn is input to the neural network model, the output of the neural network model after Hn is input is denoted as Cn, Cn is compared with the label Cn' corresponding to Hn, and the weight vector of the neural network model is updated based on the difference between Cn and Cn', until the difference between the output and the label is less than a certain threshold.
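The per-input procedure of Figure 2 can be sketched as a loop that updates the weight after every single input. This is a minimal sketch under assumptions: a one-weight model C = w * H, a squared-error difference, a gradient-descent update, and a stopping rule based on the largest per-input difference in one pass. None of these specifics come from the application.

```python
def train_per_sample(samples, w=0.0, lr=0.05, threshold=1e-6, max_epochs=1000):
    """Update w after each (input, label) pair, independently of the others."""
    for _ in range(max_epochs):
        worst = 0.0
        for h, label in samples:
            c = w * h                   # output for this input alone
            diff = c - label            # difference from this input's label
            w -= lr * 2.0 * diff * h    # update based on this pair only
            worst = max(worst, diff * diff)
        if worst < threshold:           # every difference is small enough
            break
    return w


samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (H, C') pairs
w_per_sample = train_per_sample(samples)
```

Each update in this loop looks at one input and one label at a time, so nothing in the procedure depends on the order of the inputs or on any relationship between them.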

[0091] However, this training method may have the following problems: the trained neural network model processes the inputs H1 and H2 independently and cannot extract the relationship between H1 and H2 to improve the performance of the neural network.

[0092] To address the aforementioned problems, this application proposes a new model training method. Taking the system shown in Figure 1 as an example, this method can be implemented or executed by the training device 120 in Figure 1.

[0093] In the model training method proposed in this application, at least one actual output is obtained by inputting multiple data into the neural network model; these multiple inputs are treated as a whole to correspond to at least one target output; the at least one actual output is regarded as a whole, and the at least one target output is regarded as a whole, the difference between the two wholes is compared, and the weight vector of the neural network model is updated based on the difference.

[0094] As shown in Figure 3, H1 is input into the neural network model, and the actual output of the neural network model is denoted as C1, while the target output corresponding to H1 is denoted as C1'. Next, H2 is input into the neural network model, and the actual output of the neural network model is denoted as C2, while the target output corresponding to H2 is denoted as C2'. This process continues until Hn is input into the neural network model, the actual output of the neural network model is denoted as Cn, and the target output corresponding to Hn is denoted as Cn'. The actual outputs "C1, C2...Cn" are then compared with the target outputs "C1', C2'...Cn'", and the weight vector of the neural network model is updated based on the difference between them, where n is an integer greater than 1.

[0095] For example, the mean squared error, squared error, covariance, or correlation between the output sequence "C1, C2...Cn" and the target value sequence "C1', C2'...Cn'" is calculated and used as the difference between the two sequences, and the weight vector of the neural network model is updated based on this difference.
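The four measures named in paragraph [0095] can be computed over whole sequences as sketched below. The function names are hypothetical, and the population form (division by n) for covariance and correlation is an assumption. Note that covariance and correlation grow as the sequences become more alike, so a training loss would need a quantity such as 1 minus the correlation; that conversion is also an assumption and is not stated in the application.

```python
import math


def mean(values):
    return sum(values) / len(values)


def squared_error(outputs, targets):
    """Sum of squared differences over the whole sequence."""
    return sum((c - t) ** 2 for c, t in zip(outputs, targets))


def mean_squared_error(outputs, targets):
    return squared_error(outputs, targets) / len(outputs)


def covariance(outputs, targets):
    """Population covariance between the two sequences."""
    mo, mt = mean(outputs), mean(targets)
    return sum((c - mo) * (t - mt) for c, t in zip(outputs, targets)) / len(outputs)


def correlation(outputs, targets):
    """Pearson correlation; undefined if either sequence is constant."""
    denom = math.sqrt(covariance(outputs, outputs) * covariance(targets, targets))
    return covariance(outputs, targets) / denom


outputs = [1.0, 2.0, 3.0]   # C1, C2, C3
targets = [1.0, 2.0, 5.0]   # C1', C2', C3'
se = squared_error(outputs, targets)
mse = mean_squared_error(outputs, targets)
corr = correlation(outputs, targets)
```

In every case the difference is a single number computed from the two sequences as wholes, rather than one number per input.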

[0096] Here, each of H1, H2, ..., Hn is called one input, each of C1, C2, ..., Cn is called one actual output, and each of C1', C2', ..., Cn' is called one target output. H1, H2, ..., Hn as a whole is called an input data sequence, or simply an input sequence; C1, C2, ..., Cn as a whole is called an actual output data sequence, or simply an actual output sequence; C1', C2', ..., Cn' as a whole is called a target output data sequence, or simply a target output sequence. The target output sequence can also be called a label sequence; it is the label, or the label sequence, corresponding to the input sequence. The input sequence, the actual output sequence, and the target output sequence can each be referred to as a data sequence, or simply a sequence.

[0097] Because these multiple inputs are input into the neural network model in a predefined order, they can be considered ordered among themselves, and the corresponding at least one output can also be considered ordered. The labels corresponding to these multiple inputs can also be considered ordered.

[0098] During training or inference, the model generates or stores corresponding state information. This state information can affect the output obtained by the model when processing the same input. This state information can be determined by the initial value of the model's state information and the training or inference performed in the past. That is, the model shown in Figure 3 has temporal continuity.
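
The effect of state information can be sketched as follows. The toy model below is only an assumption made for illustration: it keeps a single numeric state that is updated at every input and is returned as the output. The update rule, the name `run_stateful_model`, and the `decay` parameter are hypothetical and do not describe the model of Figure 3.

```python
def run_stateful_model(inputs, initial_state, decay=0.5):
    """Feed inputs one by one into a toy model that carries state forward.

    The state after each step depends on the previous state and the current
    input, so the same input can produce different outputs.
    """
    state = initial_state
    outputs = []
    for x in inputs:
        state = decay * state + x  # hypothetical state update rule
        outputs.append(state)
    return outputs, state


# The same input sequence with two different initial state values.
outputs_a, _ = run_stateful_model([1.0, 1.0], initial_state=0.0)
outputs_b, _ = run_stateful_model([1.0, 1.0], initial_state=2.0)
```

With an initial state of 0.0, the two identical inputs yield different outputs, and changing the initial state changes the outputs again, which is the dependence on past processing and on the initial value described above.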

[0099] In some possible implementations, the difference between the output sequence and the label sequence can be represented based on the mean squared error, squared error, covariance, or correlation between the two sequences.

[0100] In some possible implementations, the label of the input sequence of the neural network model is the input sequence itself. As an example, multiple data points from the first input sequence are sequentially input into the first neural network model to obtain at least one actual output data point, denoted as the second input sequence. At least one data point from the second input sequence is then input into the second neural network model to obtain multiple actual output data points, which constitute the output sequence. The difference between the output sequence and the first input sequence is determined, and the weight vectors of the first and second neural network models are updated based on this difference, so that the difference between the output of the second neural network model and the input of the first neural network model meets the requirements. The target neural network model obtained through training can be the first neural network model, the second neural network model, or both.

[0101] The function of the second neural network model is the inverse of that of the first neural network model. In other words, when first data is input into the first neural network model to obtain second data, and the second data is input into the second neural network model, the expected output of the second neural network model is the first data; when third data is input into the second neural network model to obtain fourth data, and the fourth data is input into the first neural network model, the expected output of the first neural network model is the third data.
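
The case in which the input sequence serves as its own label can be sketched as a round trip through the two models. In the following Python code, both models are reduced to a single scalar weight each, which is an assumption made only to keep the example short; the names `first_model`, `second_model`, and `reconstruction_difference` are hypothetical.

```python
def first_model(sequence, w1):
    """Toy first model: maps the first input sequence to the second input sequence."""
    return [w1 * x for x in sequence]


def second_model(sequence, w2):
    """Toy second model: maps the second input sequence back to an output sequence."""
    return [w2 * z for z in sequence]


def reconstruction_difference(sequence, w1, w2):
    """Mean squared error between the round-trip output and the original input."""
    restored = second_model(first_model(sequence, w1), w2)
    return sum((r - x) ** 2 for r, x in zip(restored, sequence)) / len(sequence)


inputs = [1.0, 2.0, 3.0]
# The second model inverts the first model, so the difference vanishes.
perfect = reconstruction_difference(inputs, w1=2.0, w2=0.5)
# The second model does not invert the first model, so a difference remains.
imperfect = reconstruction_difference(inputs, w1=2.0, w2=1.0)
```

In an actual training procedure this difference would drive the update of the weights of both models; the update step itself is not shown here.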

[0102] As can be seen from the above, in the method proposed in this application, the difference between the expected outputs corresponding to multiple inputs and the actual outputs corresponding to these multiple inputs is calculated as a whole, and the weights of the neural network model are updated based on this difference. As a result, the transformation from input space to output space learned by the trained neural network model includes not only the relationship between one input and its output, but also the relationship between other inputs and the output for the current input. Thus, in an application scenario of this neural network model, taking the system shown in Figure 1 as an example, when the execution device 110 uses the trained neural network model 101 to obtain an output based on an input, it also utilizes the relationship between other inputs and the target value corresponding to the current input, which can improve the similarity between the current actual output and the target value, that is, improve the accuracy of the current output.

[0103] For the model training method described above, this application provides a method for acquiring training data. Figure 4 is a schematic flowchart of a method for acquiring training data according to an embodiment of this application. As shown in Figure 4, the method includes steps S410, S420, and S430.

[0104] S410: Obtain T data sequences, where each of the T data sequences contains at least one piece of data and T is an integer greater than 1.

[0105] S420: Obtain first information, where the first information indicates P data sequences, among the T data sequences, that are used to obtain the first data sequence, and P is a positive integer less than or equal to T.

[0106] S430: Determine a first data sequence based on the first information and the T data sequences. The first data sequence contains M pieces of data, where M is an integer greater than 1. The first data sequence is used as the input data sequence of the neural network model and/or as the label of the input data sequence of the neural network model. The data in the input data sequence are input into the neural network model sequentially.
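
Steps S410 to S430 can be sketched as follows. This is a minimal illustration under stated assumptions: the first information is represented as a list of 0-based indices into the T data sequences, and the first data sequence is formed by concatenating the indicated sequences in the listed order. The function name `determine_first_data_sequence` and the sample data are hypothetical.

```python
def determine_first_data_sequence(data_sequences, selected_indices):
    """Build the first data sequence from the P sequences named by the indices.

    data_sequences: the T data sequences (S410), each a non-empty list.
    selected_indices: assumed form of the first information (S420).
    """
    t = len(data_sequences)
    if t < 2:
        raise ValueError("T must be an integer greater than 1")
    if any(len(seq) < 1 for seq in data_sequences):
        raise ValueError("each data sequence must contain at least one piece of data")
    p = len(selected_indices)
    if not 1 <= p <= t:
        raise ValueError("P must be a positive integer less than or equal to T")

    # S430: concatenate the indicated sequences in the indicated order.
    first_data_sequence = []
    for index in selected_indices:
        first_data_sequence.extend(data_sequences[index])
    if len(first_data_sequence) < 2:
        raise ValueError("M must be greater than 1")
    return first_data_sequence


t_sequences = [["a1", "a2"], ["b1"], ["c1", "c2", "c3"]]  # T = 3
first_information = [0, 2]                                # P = 2
first_data_sequence = determine_first_data_sequence(t_sequences, first_information)
```

Here the resulting first data sequence contains M = 5 pieces of data, which would then be input into the neural network model one after another.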

[0107] In this embodiment, when the first data sequence is used as the input sequence for training a neural network model, multiple data points in the first data sequence can be sequentially input into the neural network model. An example of the first data sequence is H1, H2, ..., Hn in Figure 3, where n is an integer greater than 1; an example of how the first data sequence is sequentially input into the neural network model is the way in which H1, H2, ..., Hn are input into the neural network model in Figure 3.

[0108] In some implementations of this application, P data sequences are arranged sequentially to obtain a first data sequence. Here, "sequentially" can be understood as the order of the P data sequences in the T data sequences, or the order of the P data sequences indicated by the first information.

[0109] In some implementations of this application, one understanding of how multiple data points in a data sequence can be sequentially input into a neural network model is as follows: the multiple data points in the data sequence are ordered.

[0110] Optionally, the order in which multiple data points are input into the neural network model in the data sequence may be the same as or different from the order of these multiple data points in the data sequence. This application does not impose any restrictions on this, as long as the order in which these multiple data points are input into the neural network model can be determined or known based on the order between these multiple data points.

[0111] Optionally, the order in which the multiple data points in the data sequence are input into the neural network model can be predefined, or can be indicated by indication information.

[0112] In the method of this embodiment, a data sequence being associated with a label can be understood as: the data sequence is associated with a label sequence, or the data sequence is associated with a single label.

[0113] In the method of this embodiment, the first data sequence is associated with a label sequence or a label. This can be understood as follows: the label sequence associated with the first data sequence contains at least one label corresponding to multiple data in the first data sequence. After multiple data in the first data sequence are sequentially input into the neural network model to obtain at least one actual output corresponding to these multiple data, the difference between the at least one actual output and the at least one label is calculated, and the difference is used to update the weight vector of the neural network model.

[0114] In some implementations of this embodiment, when the first data sequence is used to train a neural network model, the first data sequence can also be used as a label for the input sequence of the neural network model.

[0115] When a data sequence is used both as the input sequence of a neural network model and as the label of that input sequence, an example of training the neural network model based on the data sequence can be found in the foregoing description of the case in which the label of the input sequence is the input sequence itself, and is not repeated here.

[0116] In some implementations of this embodiment, obtaining the first data sequence includes: generating the first data sequence. It can be understood that generation in this embodiment can be interpreted as a process from nothing to something. For example, generating the first data sequence can be understood as: there are multiple data sequences that can be used to train a neural network model, but it is uncertain whether these data sequences can constitute a new data sequence, or whether there is a defined order for inputting the data from these multiple data sequences into the neural network model, and / or whether they can be associated with a label sequence as a whole. That is, originally there was no concept of a first data sequence; after the generation operation, the order of inputting these data into the neural network model is determined, and / or it is determined that these data as a whole can be associated with a label sequence.

[0117] In some implementations, obtaining the first data sequence includes the data acquisition device shown in Figure 1 generating the first data sequence. For example, the data acquisition device acquires a large amount of data and generates the first data sequence based on this data. In other words, it determines which of the acquired data are the multiple data included in the first data sequence, or which of the acquired data constitute the first data sequence. One example of constituting is concatenation.

[0118] In some implementations, obtaining the first data sequence includes the database shown in Figure 1 generating the first data sequence. For example, the data acquisition device generates T data sequences and sends these data sequences to the database, which then generates the first data sequence.

[0119] In some implementations, obtaining the first data sequence involves generating it using the training device shown in Figure 1. For example, the database obtains T data sequences and sends them to the training device, which then generates the first data sequence.

[0120] The T data sequences obtained by the database can be generated by the database itself or received from a data acquisition device.

[0121] In some implementations of this embodiment, the method further includes sending a first data sequence.

[0122] For example, after the data acquisition device obtains the first data sequence, it can send the first data sequence to the database.

[0123] For example, after the database obtains the first data sequence, it can send the first data sequence to the training device. Here, the database obtaining the first data sequence can be either generating the first data sequence or receiving the first data sequence from the data acquisition device.

[0124] Some implementations of this embodiment also include: training a neural network model using a first data sequence.

[0125] For example, after the training device acquires the first data sequence, it can use the first data sequence to train a neural network model. Here, acquiring the first data sequence can be achieved by the training device generating the first data sequence, or by the training device receiving the first data sequence from a database, or, if the database is located on the training device, by the training device receiving the first data sequence from a data acquisition device.

[0126] In some implementations of this embodiment, when the training device trains a neural network model using a first data sequence, the label sequence associated with the first data sequence can be associated with the first data sequence by the training device, wherein the first data sequence can be generated by the training device, a database, or a data acquisition device; or, the label sequence associated with the first data sequence can be associated with the first data sequence by the database, wherein the first data sequence can be generated by the database or a data acquisition device; or, the label sequence associated with the first data sequence can be associated with the first data sequence by the data acquisition device, wherein the first data sequence can be generated by the data acquisition device.

[0127] In some implementations of this embodiment, the method further includes: obtaining the initial value of the model state corresponding to the first data sequence.

[0128] The initial value of the model state corresponding to the first data sequence can be understood as: the initial value of the model state of the neural network model when the first data sequence is used as the input of the neural network model and the actual output sequence of the neural network model is expected to be the label sequence associated with the first data sequence.

[0129] In this application, the initial value of the model state can also be referred to as the initial information of the model state, or the initial value of the model state information. The model state information can also be referred to as any one or more of the following: model state value, cached information related to the model, storage information related to the model, intermediate information (e.g., intermediate information generated by the model), internal information (e.g., internal information of the device deploying the model or internal information of the model), and parameter information generated or updated by the model.

[0130] It can be understood that when the first data sequence is input into the neural network model, the model state value of the neural network model is the initial value of the model state corresponding to the first data sequence.

[0131] In some implementations, the initial value of the model state corresponding to the first data sequence can be the initial value of the model state corresponding to the data sequence, among the P data sequences, that is input into the neural network model first.

[0132] The initial value of the model state can be represented as a scalar, such as 0 or 1. In this case, if the model state is actually an N-dimensional tensor, the initial value of the model state is obtained by assigning this scalar to each element of the N-dimensional tensor. Alternatively, the initial value of the model state can directly be an N-dimensional tensor. N is a positive integer.
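
Assigning a scalar initial value to every element of an N-dimensional state can be sketched as follows. The state tensor is represented here as nested Python lists, and the shape `(2, 3)` is a hypothetical choice made only for illustration; neither is specified by this application.

```python
def broadcast_initial_state(scalar, shape):
    """Return a nested list of the given shape with every element set to scalar.

    shape: tuple of positive integers, one per dimension of the state tensor.
    """
    if not shape:
        return scalar
    return [broadcast_initial_state(scalar, shape[1:]) for _ in range(shape[0])]


# A scalar initial value of 0.0 expanded to a hypothetical 2 x 3 model state.
initial_state = broadcast_initial_state(0.0, (2, 3))
```

Signaling only the scalar, rather than the full tensor, keeps the initial value compact when all elements of the state start from the same value.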

[0133] In some implementations, the first information may include at least one of the following: a start sequence identifier of the first data sequence, P, or an end sequence identifier of the first data sequence.

[0134] The start sequence identifier and the end sequence identifier of the first data sequence can be the same or different.

[0135] In some implementations, when the start sequence identifier and the end sequence identifier of the first data sequence are the same, the start sequence identifier and the end sequence identifier of the first data sequence can be collectively referred to as sequence identifiers.

[0136] In some implementations of this embodiment, at least some of the information in the first information may be predefined or have a default configuration.

[0137] In some implementations of this embodiment, at least part of the first information may be received from other devices, such as from an OAM device.

[0138] In some implementations, the T data sequences include a first sequence, and the first information includes first indication information, which indicates the data sequence, among the T data sequences, that is concatenated with the first sequence.

[0139] In this embodiment, concatenating one data sequence with another can be understood as arranging the data of the one data sequence, in order, before or after the data of the other data sequence. As an example, "in order" here can be understood as keeping the order in which the data appear in their own data sequence.

[0140] For example, suppose the T data sequences are six sequences identified as ID1, ID2, ID3, ID4, ID5, and ID6. Sequence ID2 carries indication information containing the sequence identifier ID1, indicating that sequence ID2 can be concatenated with sequence ID1. Sequence ID4 carries indication information containing the sequence identifier ID2, indicating that sequence ID4 can be concatenated with sequence ID2. It follows that concatenating sequences ID1, ID2, and ID4 in order yields the first data sequence.
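
The example above can be sketched as following a chain of identifiers. The code assumes, for illustration only, that the indication information is available as a mapping from a sequence identifier to the identifier of the sequence it is concatenated with, and that the chain is resolved starting from its last sequence (ID4); the name `resolve_chain` is hypothetical.

```python
def resolve_chain(last_id, concatenated_with):
    """Follow the indications backwards from last_id and return the chain in order.

    concatenated_with: maps a sequence identifier to the identifier of the
    sequence that precedes it in the concatenation (assumed representation).
    """
    chain = [last_id]
    seen = {last_id}
    while chain[-1] in concatenated_with:
        previous = concatenated_with[chain[-1]]
        if previous in seen:
            raise ValueError("indication information forms a cycle")
        chain.append(previous)
        seen.add(previous)
    chain.reverse()
    return chain


# ID2 indicates ID1, and ID4 indicates ID2, as in the example above.
indications = {"ID2": "ID1", "ID4": "ID2"}
chain = resolve_chain("ID4", indications)
```

The data of the sequences in `chain` would then be concatenated in that order to form the first data sequence.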

[0141] In some implementations, the T data sequences include a first sequence and a second sequence, wherein the first information includes a second indication information, which indicates whether the second sequence is concatenated with the first sequence.

[0142] For example, suppose the T data sequences are six sequences identified as ID1, ID2, ID3, ID4, ID5, and ID6. For sequence ID2, the indication information shows that sequence ID2 can be concatenated with the preceding sequence ID1; for sequence ID3, the indication information shows that sequence ID3 cannot be concatenated with the preceding sequence ID2. It follows that concatenating sequences ID1 and ID2 in order yields the first data sequence.
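
A per-sequence flag of this kind can be sketched as follows. The code assumes that the second indication information is available as a mapping from a sequence identifier to a Boolean that states whether the sequence is concatenated with its predecessor, and that a missing flag is treated as "not concatenated"; both the representation and the name `leading_run` are hypothetical.

```python
def leading_run(ordered_ids, joins_previous):
    """Collect sequences from the first one until a sequence does not join its predecessor.

    ordered_ids: sequence identifiers in their order among the T data sequences.
    joins_previous: assumed form of the second indication information.
    """
    run = [ordered_ids[0]]
    for seq_id in ordered_ids[1:]:
        if not joins_previous.get(seq_id, False):
            break  # this sequence starts a new, separate data sequence
        run.append(seq_id)
    return run


ordered_ids = ["ID1", "ID2", "ID3", "ID4", "ID5", "ID6"]
flags = {"ID2": True, "ID3": False}
run = leading_run(ordered_ids, flags)
```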

[0143] In some implementations, the first information includes at least one of the following: a start sequence identifier of the first data sequence, P, or an end sequence identifier of the first data sequence.

[0144] For example, suppose the T data sequences are six sequences identified as ID1, ID2, ID3, ID4, ID5, and ID6. The first information indicates that the start sequence is ID2 and that P equals 3. Therefore, sequences ID2, ID3, and ID4 are concatenated in order to obtain the first data sequence.

[0145] Here, P in the first information can be understood as the number of data sequences used to form the first data sequence.

[0146] For example, suppose the T data sequences are six sequences identified as ID1, ID2, ID3, ID4, ID5, and ID6. The first information indicates that the end sequence is ID5 and that P equals 3. Therefore, sequences ID3, ID4, and ID5 are concatenated in order to obtain the first data sequence.

[0147] For example, suppose the T data sequences are six sequences identified as ID1, ID2, ID3, ID4, ID5, and ID6. The first information indicates that the start sequence is ID3 and the end sequence is ID5. Therefore, sequences ID3, ID4, and ID5 are concatenated in order to obtain the first data sequence.
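
The three examples above (start identifier with P, end identifier with P, and start identifier with end identifier) can be sketched with one helper. The function name `select_by_bounds` and its parameter names are hypothetical, and the sketch assumes that the sequences are ordered as listed.

```python
def select_by_bounds(ordered_ids, start=None, count=None, end=None):
    """Select P consecutive sequence identifiers from any two of start, P (count), end."""
    if start is not None and end is not None:
        first, last = ordered_ids.index(start), ordered_ids.index(end)
    elif start is not None and count is not None:
        first = ordered_ids.index(start)
        last = first + count - 1
    elif end is not None and count is not None:
        last = ordered_ids.index(end)
        first = last - count + 1
    else:
        raise ValueError("at least two of start, count, and end are required")
    if first < 0 or last >= len(ordered_ids) or first > last:
        raise ValueError("the indicated range does not fit the T data sequences")
    return ordered_ids[first:last + 1]


ordered_ids = ["ID1", "ID2", "ID3", "ID4", "ID5", "ID6"]
by_start_and_p = select_by_bounds(ordered_ids, start="ID2", count=3)
by_end_and_p = select_by_bounds(ordered_ids, end="ID5", count=3)
by_start_and_end = select_by_bounds(ordered_ids, start="ID3", end="ID5")
```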

[0148] In some implementations, the first information indicates P data sequences, including: the first information indicates the characteristics of the P data sequences. This allows the P data sequences out of T data sequences that satisfy the characteristics indicated by the first information to be identified as the data sequences used to obtain the first data sequence, thereby determining the first data sequence.

[0149] As an example, if the first information indicates the length of the first sequence, then P data sequences with a length greater than or equal to the length of the first sequence are used to obtain the first data sequence.

[0150] For example, if the sequence length indicated by the first information is 100, and the number of data contained in each of the sequences ID3, ID4, and ID5 is greater than or equal to 100, then the sequences ID3, ID4, and ID5 can be concatenated in sequence to obtain the first data sequence.

[0151] As an example, if the first information indicates odd identifiers, then the P data sequences whose identifiers are odd are used to obtain the first data sequence.

[0152] For example, if the first information indicates that the identifiers of the P sequences are odd, then sequences ID1, ID3, and ID5 can be concatenated in order to obtain the first data sequence.

[0153] As another example, if the first information indicates even identifiers, then the P data sequences whose identifiers are even are used to obtain the first data sequence.
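
Selecting the P data sequences by a characteristic can be sketched as a filter. The two helpers below cover the characteristics used in the examples above (a minimum sequence length and the parity of the identifier); their names, the representation of the sequences as a dictionary, and the sample lengths are assumptions made for illustration.

```python
def select_by_min_length(sequences, min_length):
    """Return identifiers of sequences that contain at least min_length pieces of data.

    sequences: dict mapping a sequence identifier to its list of data,
    in the order of the T data sequences.
    """
    return [seq_id for seq_id, data in sequences.items() if len(data) >= min_length]


def select_by_parity(sequence_numbers, parity):
    """Return the sequence numbers that are odd or even, as indicated."""
    if parity not in ("odd", "even"):
        raise ValueError("parity must be 'odd' or 'even'")
    remainder = 1 if parity == "odd" else 0
    return [number for number in sequence_numbers if number % 2 == remainder]


# Hypothetical sequences whose lengths are chosen to match the example above.
sequences = {
    "ID1": [0] * 10,
    "ID3": [0] * 100,
    "ID4": [0] * 120,
    "ID5": [0] * 100,
    "ID6": [0] * 99,
}
long_enough = select_by_min_length(sequences, 100)
odd_ids = select_by_parity([1, 2, 3, 4, 5, 6], "odd")
even_ids = select_by_parity([1, 2, 3, 4, 5, 6], "even")
```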

[0154] In some implementations, the first information includes the identifier information for each of the P data sequences. The identifier information for each data sequence can be its index among the T data sequences.

[0155] For example, suppose the T data sequences are six sequences identified as ID1, ID2, ID3, ID4, ID5, and ID6. The first information indicates ID1, ID2, and ID3, so sequences ID1, ID2, and ID3 are concatenated in order to obtain the first data sequence.

[0156] In some implementations, the order of ID1, ID2, ID3, ID4, ID5, and ID6 can be the transmission order, the storage order, or the order of the ID numbers.

[0157] In some implementations of this method, when the first data sequence is used only as the input sequence, the method further includes: obtaining S data sequences, each of which contains at least one data point, where S is an integer greater than 1; determining a second data sequence based on the S data sequences, where the second data sequence contains Q data points, where Q is a positive integer greater than 1, and using the second data sequence as a label for the first data sequence, wherein multiple data points in the first data sequence are used to sequentially input into the neural network model.

[0158] For the method of determining the second data sequence based on the S data sequences, reference can be made to the method of determining the first data sequence based on the T data sequences described above, which is not repeated here.

[0159] In some implementations, acquiring T data sequences includes receiving T data sequences. Acquiring the first information includes receiving the first information. In other words, the first information is sent by the device sending the T data sequences to the device receiving the T data sequences.

[0160] As an example, the first information and the T data sequences are contained in the same message and transmitted together.

[0161] In some implementations, the first information can be predefined.

[0162] The following example, using one of the T data sequences as the first sequence, illustrates how to obtain the T data sequences.

[0163] The first sequence contains multiple data points. The first sequence is used as the input data sequence, and the multiple data points in the first sequence are input into the neural network model in sequence. The first sequence is associated with a label.

[0164] In some implementations of this embodiment, obtaining the first sequence includes generating the first sequence. It can be understood that generation in this embodiment can be interpreted as a process from nothing to something. For example, generating the first sequence can be understood as: there are multiple data points that can be used to train a neural network model, but these data points do not have a defined order for inputting into the neural network model, and / or, they do not have a collective label sequence to associate with; that is, originally there was no concept of a first sequence. After the generation operation, the order in which these data points are input into the neural network model is determined, and / or it is determined that these data points, as a whole, can be associated with a label sequence.

[0165] In some implementations, obtaining the first sequence includes the data acquisition device shown in Figure 1 generating the first sequence. For example, the data acquisition device acquires a large amount of data and generates the first sequence based on this data; in other words, it determines which of the acquired data are the multiple data included in the first sequence, or which of the acquired data constitute the first sequence. One example of constituting is concatenation. In this implementation, the first data sequence can be obtained by the data acquisition device, the database, or the training device based on the first sequence.

[0166] In some implementations, obtaining the first sequence includes the database shown in Figure 1 generating the first sequence. For example, a data acquisition device collects a large amount of data and sends this data to the database, which then generates the first sequence. In this implementation, the database or the training device can obtain the first data sequence based on the first sequence.

[0167] In some implementations, obtaining the first sequence involves generating it using the training device shown in Figure 1. For example, a data acquisition device collects a large amount of data and sends it to a database. The database periodically sends this data to the training device, which then generates the first sequence. In this implementation, the training device can obtain the first data sequence based on the first sequence.

[0168] In some implementations of this embodiment, obtaining the first sequence includes: receiving the first sequence.

[0169] In some implementations, the first sequence is received by a database as shown in Figure 1. For example, the first sequence is generated by a data acquisition device, and the database receives the first sequence from the data acquisition device. In this implementation, the first data sequence can be obtained by the database or a training device based on the first sequence.

[0170] In some implementations, obtaining the first sequence includes the training device shown in Figure 1 receiving the first sequence. For example, the training device receives the first sequence from the database after the database generates the first sequence or receives it from the data acquisition device. In this implementation, the training device can obtain the first data sequence based on the first sequence.

[0171] In some implementations of this embodiment, the method further includes: sending the first sequence.

[0172] For example, after acquiring the first sequence, the data acquisition device can send the first sequence to the database. Here, the data acquisition device can be the one that generates the first sequence. In this implementation, the database or training device can acquire the first data sequence based on the first sequence.

[0173] For example, after the database obtains the first sequence, it can send the first sequence to the training device. Here, the database obtaining the first sequence could be either generating the first sequence or receiving the first sequence from the data acquisition device. In this implementation, the training device can obtain the first data sequence based on the first sequence.

[0174] In some implementations of this embodiment, the method further includes: obtaining auxiliary information associated with the first sequence, wherein the auxiliary information associated with the first sequence includes: the initial value of the model state corresponding to the first sequence, and / or, first sequence indication information, which is used to indicate the first sequence. The purpose of this indication may be to generate the first sequence, or it may be to identify the first sequence. Identification can be understood as defining or distinguishing.

[0175] The initial value of the model state corresponding to the first sequence can be understood as: the initial value of the model state of the neural network model when the first sequence is used as the input of the neural network model and the actual output sequence of the neural network model is expected to be the label sequence associated with the first sequence.

[0176] It can be understood that when the first sequence is input into the neural network model, the model state value of the neural network model is the initial value of the model state corresponding to the first sequence.

[0177] In this implementation, the initial value of the model state is associated with the first sequence, which can improve the model training performance.

[0178] In this implementation, the auxiliary information associated with the first sequence is obtained. The auxiliary information associated with the first sequence includes the initial value of the model state corresponding to the first sequence and / or the indication information of the first sequence. This can be understood as: obtaining the initial value of the model state corresponding to the first sequence, and / or obtaining the indication information of the first sequence.

[0179] In some implementations of this embodiment, obtaining the initial value of the model state corresponding to the first sequence includes: determining the initial value of the model state corresponding to the first sequence. Determining the initial value of the model state corresponding to the first sequence can be understood as: associating the initial value of the model state with the first sequence.

[0180] The initial value of the model state can be predefined, randomly generated, or generated by the device that generates the first data sequence.

[0181] For example, when the data acquisition device generates a first sequence, it determines the initial value of the model state corresponding to the first sequence. In this case, the data acquisition device can send not only the first sequence but also the initial value of the model state corresponding to the first sequence.

[0182] For example, when the database generates a first sequence or receives a first sequence from a data acquisition device, the database can determine the initial value of the model state for the first sequence. In this case, the database can send not only the first sequence but also the initial value of the model state corresponding to the first sequence.

[0183] For example, when the training device generates a first sequence or receives a first sequence from a database, the training device can determine the initial value of the model state for the first sequence.

[0184] In some implementations of this embodiment, obtaining the initial value of the model state corresponding to the first sequence includes: receiving the initial value of the model state corresponding to the first sequence.

[0185] For example, when the database receives the first sequence from the data acquisition device, the database receives the initial value of the model state from the data acquisition device.

[0186] For example, when the training device receives the first sequence from the database or the data acquisition device, the training device receives the initial value of the model state from the data acquisition device or the database.

[0187] In some implementations of this embodiment, obtaining the first sequence indication information may include: generating the first sequence indication information. This implementation can be understood as the first sequence indication information being generated along with the generation of the first sequence.

[0188] For example, when a database, data acquisition device, or training device generates a first sequence, the device also generates first sequence indication information. This generated first sequence indication information can be used by other devices to identify the first sequence. In this case, it may also include: sending the first sequence indication information.

[0189] In some implementations of this embodiment, obtaining the first sequence indication information may include: receiving the first sequence indication information.

[0190] In some implementations, the received first sequence indication information can be used to generate a first sequence. This implementation can be understood as generating a first sequence according to the instructions in the first sequence indication information. In this case, the first sequence indication information is used by the current device to generate the first sequence and by other devices to identify the first sequence.

[0191] For example, a data acquisition device, database, or training device receives first sequence indication information and generates a first sequence under the instruction of the first sequence indication information.

[0192] In some implementations, when the database or data acquisition device generates the first sequence indication information, the database or data acquisition device also sends the first sequence indication information so that other devices can identify the first sequence based on the first sequence indication information.

[0193] In some implementations, the received first sequence indication information is used to identify the first sequence. This implementation can be understood as: identifying the first sequence according to the indication of the first sequence indication information.

[0194] For example, when a training device needs to train a neural network model, it can identify the first sequence based on the first sequence indication information.

[0195] In some implementations, the first sequence indication information can be referred to as the first sequence pattern.

[0196] In some implementations, the first sequence indication information may include at least one of the following: a first representation of the first sequence, a start data identifier of the first sequence, the length of the first sequence, or an end data identifier of the first sequence.

[0197] The start data identifier and the end data identifier of the first sequence can be the same or different.

[0198] In some implementations, when the start data identifier and the end data identifier of the first sequence are the same, the start data identifier and the end data identifier of the first sequence can be collectively referred to as the data identifier of the sequence.

[0199] In some implementations of this embodiment, at least some of the information in the first sequence indication information may be predefined or have a default configuration. In this implementation, obtaining the first sequence indication information for this part of the information may include: obtaining predefined first sequence indication information.
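The relationship between received and predefined indication information can be illustrated with a minimal sketch. The class name `SequenceIndication`, its field names, the `PREDEFINED` values, and the helper `fill_from_predefined` are all hypothetical and chosen only for illustration; the application does not prescribe any particular encoding.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SequenceIndication:
    """Hypothetical container for first sequence indication information."""
    representation: Optional[str] = None  # e.g. "scalar", "vector", or "basis"
    start_id: Optional[int] = None        # start data identifier
    length: Optional[int] = None          # length of the sequence
    end_id: Optional[int] = None          # end data identifier


# Assumed predefined (default) configuration; the values are illustrative only.
PREDEFINED = SequenceIndication(representation="vector", length=6)


def fill_from_predefined(received: SequenceIndication,
                         predefined: SequenceIndication = PREDEFINED) -> SequenceIndication:
    """Fill every field the received indication leaves unset with the predefined value."""
    updates = {
        name: getattr(predefined, name)
        for name in ("representation", "start_id", "length", "end_id")
        if getattr(received, name) is None
    }
    return replace(received, **updates)


# Only the identifiers are signalled; representation and length come from the defaults.
received = SequenceIndication(start_id=10, end_id=15)
full = fill_from_predefined(received)
```

In this sketch only the fields that are actually signalled override the predefined configuration, which corresponds to the case where part of the indication information is predefined.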

[0200] In some implementations, the first representation includes: each data point in the sequence is represented by a scalar, each data point in the sequence is represented by a vector, or each data point in the sequence is represented by a basis and a coefficient of each data point relative to the basis. Representing each data point in the sequence by a basis and a coefficient of each data point relative to the basis can be called a basis representation.

[0201] The codebook representation is one example of the basis method.

[0202] In some implementations, the basis method can include: each data point in the sequence is represented by its own basis and the coefficients of each data point relative to its own basis, called a unique basis method; or, each data point in the sequence is represented by a common basis of the data in the sequence and the coefficients of each data point relative to the common basis, called a shared basis method. A unique basis method means that the basis can be unique to each data point, while a shared basis method means that multiple data points share a basis.
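The difference between the unique basis method and the shared basis method can be sketched as follows, assuming that each data point is recovered as a linear combination of basis vectors weighted by its coefficients. The function names `combine`, `decode_shared`, and `decode_unique`, and all numerical values, are illustrative assumptions rather than part of the application.

```python
from typing import List, Sequence

Vector = List[float]


def combine(basis: Sequence[Vector], coeffs: Sequence[float]) -> Vector:
    """Reconstruct one data point as sum_k coeffs[k] * basis[k]."""
    dim = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(dim)]


def decode_shared(basis: Sequence[Vector],
                  coeffs_per_point: Sequence[Sequence[float]]) -> List[Vector]:
    """Shared basis method: all data points in the sequence use the same basis."""
    return [combine(basis, coeffs) for coeffs in coeffs_per_point]


def decode_unique(bases_per_point: Sequence[Sequence[Vector]],
                  coeffs_per_point: Sequence[Sequence[float]]) -> List[Vector]:
    """Unique basis method: each data point carries its own basis."""
    return [combine(basis, coeffs)
            for basis, coeffs in zip(bases_per_point, coeffs_per_point)]


# Shared basis: one basis, one coefficient vector per data point.
shared = decode_shared([[1.0, 0.0], [0.0, 1.0]],
                       [[2.0, 3.0], [0.5, -1.0]])

# Unique basis: one basis and one coefficient vector per data point.
unique = decode_unique(
    [[[1.0, 1.0], [1.0, -1.0]], [[2.0, 0.0], [0.0, 2.0]]],
    [[1.0, 1.0], [0.5, 0.5]],
)
```

With a shared basis only the coefficients vary between data points, so the basis needs to be stored or transmitted once per sequence; with a unique basis both the basis and the coefficients accompany every data point.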

[0203] In some implementations, the first sequence indication information only indicates that the first representation is a basis representation. Whether the basis is a unique basis or a shared basis can be predefined.

[0204] In some implementations, the first sequence indication information directly indicates whether the first representation is a unique basis representation or a shared basis representation.

[0205] In some implementations, the T data sequences contain a third sequence, which in turn contains multiple data points. This third sequence is used as the input data sequence, and the multiple data points in the third sequence are sequentially fed into the neural network model. The third sequence is also used to associate labels. Therefore, obtaining the T data sequences may include obtaining the third sequence.

[0206] In this implementation, the relevant content of the third sequence can be referenced from the relevant content of the first sequence, and will not be repeated here.

[0207] Some implementations of this embodiment also include: obtaining auxiliary information associated with the third sequence.

[0208] In some implementations, the auxiliary information associated with the third sequence can be referenced from the auxiliary information associated with the first sequence.

[0209] For the sake of simplicity, the auxiliary information associated with the first sequence is referred to as the first auxiliary information, and the auxiliary information associated with the third sequence is referred to as the second auxiliary information.

[0210] In some implementations, the first auxiliary information and the second auxiliary information are represented in one of the following forms:

[0211] (1) Common auxiliary information + first sequence + third sequence, where "+" indicates combination or splicing. The first auxiliary information and the second auxiliary information are identical, and this single piece of auxiliary information is called common auxiliary information (abbreviated as common information).

[0212] For example, the common auxiliary information includes at least one of the following: representation method, start data identifier, sequence length, or end data identifier;

[0213] (2) Common auxiliary information + first difference auxiliary information + first sequence + second difference auxiliary information + third sequence, where "+" indicates combination or splicing, the first difference auxiliary information is auxiliary information specific to the first sequence, the first auxiliary information includes the first difference auxiliary information and common auxiliary information, the second difference auxiliary information is auxiliary information specific to the third sequence, the second auxiliary information includes the second difference auxiliary information and common auxiliary information;

[0214] For example, the common auxiliary information includes at least one of the following: representation method, start data identifier, sequence length, or end data identifier; the first difference auxiliary information includes at least one of the following: representation method, start data identifier, sequence length, or end data identifier; the second difference auxiliary information includes at least one of the following: representation method, start data identifier, sequence length, or end data identifier.

[0215] In some implementations, information included in the common auxiliary information is not repeated in the difference auxiliary information.

[0216] In some implementations, information contained in the common auxiliary information is also contained in the difference auxiliary information. In this case, when the value in the difference auxiliary information differs from the value in the common auxiliary information, optionally, the value in the difference auxiliary information can be prioritized, or the value in the common auxiliary information can be prioritized. Whether the difference auxiliary information or the common auxiliary information is prioritized can be predefined, or it can be indicated by priority or indication information.

[0217] (3) Common auxiliary information + first sequence + second difference auxiliary information + third sequence. The difference between this form and form (2) is that the first auxiliary information is the common auxiliary information, and the second auxiliary information includes the common auxiliary information and the second difference auxiliary information.

[0218] (4) First auxiliary information + first sequence + second auxiliary information + third sequence;

[0219] (5) First sequence + second auxiliary information + third sequence, wherein the first auxiliary information can be predefined;

[0220] (6) First sequence + third sequence, where the first auxiliary information and the second auxiliary information can be predefined.

[0221] It is understood that the aforementioned sequential relationship between auxiliary information and data sequence is merely an example, and this embodiment does not impose any restrictions on the order of transmission or storage of auxiliary information and data sequence.
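Form (2) above can be sketched as an ordered list of tagged segments. This is only an illustration: the segment tags, the dictionary keys, and the helpers `pack_form_2` and `auxiliary_of` are hypothetical, and the sketch assumes the case in which information carried in the common auxiliary information is not repeated in the difference auxiliary information.

```python
from typing import Any, Dict, List, Tuple

Segment = Tuple[str, Any]


def pack_form_2(common: Dict[str, Any],
                first_diff: Dict[str, Any], first_seq: List[float],
                second_diff: Dict[str, Any], third_seq: List[float]) -> List[Segment]:
    """Splice the parts in the order: common + diff1 + seq1 + diff2 + seq3."""
    return [
        ("common_aux", common),
        ("first_diff_aux", first_diff),
        ("first_sequence", first_seq),
        ("second_diff_aux", second_diff),
        ("third_sequence", third_seq),
    ]


def auxiliary_of(packed: List[Segment], diff_tag: str) -> Dict[str, Any]:
    """A sequence's auxiliary information = common part plus its own difference part."""
    parts = dict(packed)
    return {**parts["common_aux"], **parts.get(diff_tag, {})}


common = {"representation": "vector", "sequence_length": 4}
packed = pack_form_2(common,
                     {"start_id": 0}, [1.0, 2.0, 3.0, 4.0],
                     {"start_id": 4}, [5.0, 6.0, 7.0, 8.0])

first_aux = auxiliary_of(packed, "first_diff_aux")
second_aux = auxiliary_of(packed, "second_diff_aux")
```

Form (3) corresponds to calling the same helper with an empty first difference part, so that the first auxiliary information reduces to the common auxiliary information.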

[0222] The following section uses one of the Q data sequences, denoted as the fourth sequence, as an example to illustrate how to obtain the Q data sequences.

[0223] The fourth sequence contains at least one piece of data for training a neural network model, wherein the fourth sequence is used as a label for the first sequence.

[0224] In this embodiment, the method for obtaining the fourth sequence can refer to the method for obtaining the first sequence in the previous embodiment, and will not be repeated here.

[0225] In some implementations of this embodiment, it may also include: obtaining auxiliary information associated with the fourth sequence.

[0226] The method for obtaining auxiliary information associated with the fourth sequence data can refer to the method for obtaining auxiliary information associated with the first sequence in the previous embodiment.

[0227] For the purpose of differentiation, in this embodiment, the representation of the fourth sequence is denoted as the second representation.

[0228] The method of using the fourth sequence to train the neural network model can be referred to the relevant content of using the label sequence of the first sequence to train the neural network model in the previous embodiment, which will not be repeated here.

[0229] In some implementations, the auxiliary information associated with the fourth sequence may indicate at least one of the following: the initial value of the model state corresponding to the fourth sequence, the second representation of the fourth sequence, the sequence length, the start data identifier of the fourth sequence, or the end data identifier of the fourth sequence, wherein the sequence length is one of the following: the length of the fourth sequence; the minimum of the lengths of the first sequence and the fourth sequence; the maximum of the lengths of the first sequence and the fourth sequence; or the length of the fourth sequence, where the length of the fourth sequence is equal to the length of the first sequence.

[0230] That is, the sequence length is the minimum of the length of the first sequence and the length of the fourth sequence, the maximum of the length of the first sequence and the length of the fourth sequence, or the length of the fourth sequence. When the length of the fourth sequence is equal to the length of the first sequence, the sequence length in the auxiliary information associated with the first sequence is the same as the sequence length in the auxiliary information associated with the fourth sequence.

[0231] One application scenario where the sequence length is the maximum value is as follows: For channel state information prediction, the input sequence contains channel state information at N time points, and the channel state information at each time point is denoted as a data point. The length of the input sequence is therefore N. The trained neural network model is used to predict the channel state information at the next M time points, that is, the output sequence contains M data points, and the length of the output sequence is M. The value of M can be a predefined value, so only the value of N needs to be indicated. In this case, the value of N can be the maximum of the length of the input sequence and the length of the output sequence.
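The channel state information prediction scenario can be sketched as follows, assuming that the history is a simple list of per-time-point values and that M is predefined. The names `PREDEFINED_M`, `split_csi`, and `indicated_length`, and the placeholder data points H1 to H6, are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

PREDEFINED_M = 2  # assumed predefined number of predicted time points


def split_csi(samples: Sequence[T], n: int,
              m: int = PREDEFINED_M) -> Tuple[List[T], List[T]]:
    """Split a CSI history into an input sequence of N points and a label sequence of M points."""
    if n <= 0 or m <= 0 or len(samples) < n + m:
        raise ValueError("need at least n + m samples with n, m > 0")
    return list(samples[:n]), list(samples[n:n + m])


def indicated_length(input_seq: Sequence[T], output_seq: Sequence[T],
                     rule: str = "max") -> int:
    """Sequence length carried in the auxiliary information under the chosen rule."""
    lengths = (len(input_seq), len(output_seq))
    if rule == "max":
        return max(lengths)
    if rule == "min":
        return min(lengths)
    raise ValueError("rule must be 'max' or 'min'")


csi_history = ["H1", "H2", "H3", "H4", "H5", "H6"]
inputs, labels = split_csi(csi_history, n=4)
length_field = indicated_length(inputs, labels, rule="max")
```

Here N = 4 and M = 2, so the indicated sequence length under the maximum rule equals N, and a receiver that knows the predefined M can recover both lengths.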

[0232] In some implementations, channel information can be used to determine one or more of the following configurations for the downlink or uplink data channel of the scheduling terminal equipment: resources, modulation and coding scheme (MCS), and precoding. Channel information reflects channel characteristics and quality. It can also be called channel state information (CSI) or channel environment information. It is understood that the CSI in this application is not limited to traditional CSIs such as channel quality indication (CQI), precoding matrix indicator (PMI), rank indicator (RI), or channel state information reference signal resource indicator (CSI-RS CRI), and can also be channel response information such as the channel response matrix, reference signal receiving power (RSRP), or signal to interference plus noise ratio (SINR).

[0233] One application scenario where the sequence length is the minimum value is as follows: The input sequence contains H1, H2, H3, H4, H5, and H6, with a length of 6. By applying a sliding window with a predefined length of 4 (as an example), 6 sequences of length 4 are obtained, denoted as: H1 to H4, H2 to H5, H3 to H6, H4 to H6 with one 0, H5 to H6 with two 0s, and H6 with three 0s.

[0234] The output sequence is a sequence consisting of 6 sets of sequences of length 4. These 6 sets of output sequences are: C1_1, C1_2, C1_3, C1_4 (corresponding to H1 to H4), C2_2, C2_3, C2_4, C2_5 (corresponding to H2 to H5), C3_3, C3_4, C3_5, C3_6 (corresponding to H3 to H6), C4_4, C4_5, C4_6, 0 (corresponding to H4 to H6 and 0), C5_5, C5_6, 0, 0 (corresponding to H5 to H6 and two 0s), and C6_6, 0, 0, 0 (corresponding to H6 and three 0s). The length of this sequence consisting of these 6 sets is 24.

[0235] In this example, the sequence length in the auxiliary information can be the minimum of 6 and 24, which is 6.
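The sliding-window example can be reproduced with a short sketch. The function name `sliding_windows` is hypothetical, and the label names C1_1 and so on are generated only to mirror the naming used in the example above.

```python
from typing import Any, List, Sequence


def sliding_windows(seq: Sequence[Any], window: int, pad: Any = 0) -> List[List[Any]]:
    """One window per starting position; windows that run past the end are zero-padded."""
    windows = []
    for start in range(len(seq)):
        chunk = list(seq[start:start + window])
        chunk += [pad] * (window - len(chunk))
        windows.append(chunk)
    return windows


inputs = [f"H{k}" for k in range(1, 7)]   # H1 .. H6, length 6
windows = sliding_windows(inputs, window=4)

# Label for window i and input Hj is named Ci_j; padding positions stay 0.
labels = [
    [f"C{i}_{h[1:]}" if h != 0 else 0 for h in w]
    for i, w in enumerate(windows, start=1)
]
output_sequence = [x for w in labels for x in w]   # 6 windows of length 4

length_field = min(len(inputs), len(output_sequence))
```

The input has length 6 and the flattened output has length 24, so the minimum rule yields a sequence length of 6.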

[0236] The starting data identifier of the first sequence and the starting data identifier of the fourth sequence may be the same or different.

[0237] In some implementations, the starting data identifiers of the first sequence and the fourth sequence are the same, and can be collectively referred to as the starting data identifiers of the sequences.

[0238] The end data identifier of the first sequence and the end data identifier of the fourth sequence can be the same or different.

[0239] In some implementations, the end data identifier of the first sequence and the end data identifier of the fourth sequence are the same, and can be collectively referred to as the end data identifier of the sequence.

[0240] In some implementations of this embodiment, the Q sequences further include a fifth sequence, which contains at least one piece of data and is used as a label for the third sequence.

[0241] The content of the fifth sequence is similar to that of the fourth sequence, and will not be repeated here.

[0242] In some implementations of this embodiment, the method may further include obtaining auxiliary information associated with the fifth sequence. The method for obtaining auxiliary information associated with the fifth sequence data can refer to the method for obtaining auxiliary information associated with the fourth sequence, and will not be repeated here.

[0243] In some implementations, the first sequence, the third sequence, the fourth sequence, and the fifth sequence can be represented in one of the following ways:

[0244] (1) First sequence + fourth sequence + third sequence + fifth sequence, that is, one training data is represented as one unit, and one training data contains the input sequence and the label sequence of the input sequence;

[0245] (2) First sequence + third sequence + fourth sequence + fifth sequence, that is, the input data sequence is a unit and the label sequence is a unit.
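The two orderings can be sketched as follows, with each sequence stood in for by a placeholder string. The function names `per_sample_layout` and `grouped_layout` are hypothetical.

```python
from typing import Any, List, Sequence


def per_sample_layout(inputs: Sequence[Any], labels: Sequence[Any]) -> List[Any]:
    """Way (1): each input sequence is immediately followed by its label sequence."""
    out: List[Any] = []
    for x, y in zip(inputs, labels):
        out.extend([x, y])
    return out


def grouped_layout(inputs: Sequence[Any], labels: Sequence[Any]) -> List[Any]:
    """Way (2): all input sequences first, then all label sequences."""
    return list(inputs) + list(labels)


input_sequences = ["first", "third"]    # input data sequences
label_sequences = ["fourth", "fifth"]   # labels of the first and third sequences

way_1 = per_sample_layout(input_sequences, label_sequences)
way_2 = grouped_layout(input_sequences, label_sequences)
```

Way (1) keeps each training sample contiguous, while way (2) keeps all inputs and all labels contiguous; both carry the same sequences.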

[0246] Auxiliary information, along with the input and output sequences, can be represented in one of the following ways:

[0247] (1) Common auxiliary information + difference auxiliary information of input sequence + auxiliary information of output sequence + input sequence + output sequence;

[0248] (2) Common auxiliary information + difference auxiliary information of output sequence + input sequence + output sequence;

[0249] (3) Common auxiliary information + difference auxiliary information of input sequence + input sequence + output sequence;

[0250] (4) Auxiliary information of input sequence + auxiliary information of output sequence + input sequence + output sequence;

[0251] (5) Auxiliary information of the input sequence + input sequence + output sequence;

[0252] (6) Auxiliary information of the output sequence + input sequence + output sequence;

[0253] (7) Common auxiliary information + input sequence + output sequence;

[0254] (8) Common auxiliary information + input sequence difference auxiliary information + input sequence + output sequence difference auxiliary information + output sequence;

[0255] (9) Input sequence + auxiliary information of output sequence + output sequence;

[0256] (10) Common auxiliary information + input sequence + output sequence difference auxiliary information + output sequence;

[0257] (11) Auxiliary information of input sequence + input sequence + auxiliary information of output sequence + output sequence.

[0258] The input sequence includes the first sequence and the third sequence, and the output sequence includes the fourth sequence and the fifth sequence.

[0259] It is understandable that a sequence may not have corresponding difference auxiliary information, but when there is common auxiliary information, the common auxiliary information can be used; when a sequence has neither corresponding difference auxiliary information nor common auxiliary information, predefined auxiliary information can be used; when there is both common auxiliary information and corresponding difference auxiliary information, either common auxiliary information or difference auxiliary information can be used, and the choice between using common auxiliary information or difference auxiliary information can be determined based on predefined priority, indication priority, or other indication information.
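The selection rules in the preceding paragraph can be sketched as a small resolver. The function `resolve_aux`, the `prefer` parameter, and the `PREDEFINED_AUX` values are hypothetical; in particular, returning the difference auxiliary information when no common auxiliary information exists is an assumption of this sketch, since that case is not spelled out above.

```python
from typing import Any, Dict, Optional

# Assumed predefined auxiliary information; the values are illustrative only.
PREDEFINED_AUX: Dict[str, Any] = {"representation": "vector", "sequence_length": 4}


def resolve_aux(common: Optional[Dict[str, Any]],
                difference: Optional[Dict[str, Any]],
                prefer: str = "difference",
                predefined: Dict[str, Any] = PREDEFINED_AUX) -> Dict[str, Any]:
    """Pick the auxiliary information that applies to one sequence."""
    if prefer not in ("difference", "common"):
        raise ValueError("prefer must be 'difference' or 'common'")
    if not common and not difference:
        return dict(predefined)      # neither present: fall back to predefined
    if not difference:
        return dict(common)          # only common information is available
    if not common:
        return dict(difference)      # assumption: use the difference part alone
    # Both present: the preferred source wins wherever the two overlap.
    low, high = (common, difference) if prefer == "difference" else (difference, common)
    return {**low, **high}


common_aux = {"sequence_length": 6, "start_id": 0}
diff_aux = {"sequence_length": 8}

by_difference = resolve_aux(common_aux, diff_aux, prefer="difference")
by_common = resolve_aux(common_aux, diff_aux, prefer="common")
fallback = resolve_aux(None, None)
```

The `prefer` argument stands in for the predefined priority or the indicated priority; how that priority is signalled is left open here.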

[0260] The system architecture and related methods provided in this application can be applied to communication networks such as the 3rd Generation Partnership Project (3GPP), Zigbee, Long Range Radio (Lora), Bluetooth (BT), and Wireless Fidelity (Wi-Fi).

[0261] The technical solutions provided in this application can be applied to various communication systems, such as: 5th generation (5G) or new radio (NR) systems, long term evolution (LTE) systems, LTE frequency division duplex (FDD) systems, LTE time division duplex (TDD) systems, wireless local area network (WLAN) systems, satellite communication systems, future communication systems such as 6th generation (6G) mobile communication systems, or integrated systems of multiple systems. The technical solutions provided in this application can also be applied to device-to-device (D2D) communication, vehicle-to-everything (V2X) communication, machine-to-machine (M2M) communication, machine-type communication (MTC), and Internet of Things (IoT) communication systems or other communication systems.

[0262] In a communication system, a device can send signals to or receive signals from another device. These signals can include information, signaling, or data. The device can also be replaced by an entity, network entity, communication equipment, communication module, node, communication node, etc.; this disclosure uses a device as an example. For instance, a communication system can include at least one terminal device and at least one network device. The network device can send downlink signals to the terminal device, and / or the terminal device can send uplink signals to the network device. It is understood that the terminal device in this disclosure can be replaced by a first device, and the network device can be replaced by a second device, both performing the corresponding communication methods described in this disclosure.

[0263] In the embodiments of this application, the terminal device may also be referred to as user equipment (UE), access terminal, user unit, user station, mobile station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication device, user agent, or user apparatus.

[0264] Terminal devices can be devices that provide voice / data connectivity, such as handheld devices with wireless connectivity, in-vehicle devices, etc. Currently, examples of terminals include: mobile phones, tablets, laptops, mobile internet devices (MIDs), wearable devices, virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, wireless terminals in self-driving vehicles, wireless terminals in remote medical surgery, wireless terminals in smart grids, wireless terminals in transportation safety, wireless terminals in smart cities, wireless terminals in smart homes, cellular phones, cordless phones, session initiation protocol (SIP) phones, wireless local loop (WLL) stations, personal digital assistants (PDAs), handheld devices with wireless communication capabilities, computing devices or other processing devices connected to wireless modems, terminal devices in 5G networks, terminal devices in a future public land mobile network (PLMN), devices in a Zigbee network, devices in a LoRa network, Bluetooth slaves, Bluetooth Low Energy (BLE) slaves, Wi-Fi stations (STAs), etc. The embodiments of this application are not limited to these examples.

[0265] By way of example and not limitation, in this embodiment, the terminal device can also be a wearable device. Wearable devices, also known as wearable smart devices, are a general term for devices that utilize wearable technology to intelligently design and develop everyday wearables, such as glasses, gloves, watches, clothing, and shoes. Wearable devices are portable devices that are worn directly on the body or integrated into the user's clothing or accessories. Wearable devices are not merely hardware devices, but also achieve powerful functions through software support, data interaction, and cloud interaction. Broadly speaking, wearable smart devices include those that are feature-rich, large in size, and can achieve complete or partial functions without relying on a smartphone, such as smartwatches or smart glasses, as well as those that focus on a specific type of application function and require the use of other devices such as smartphones, such as various smart bracelets and smart jewelry for vital sign monitoring.

[0266] Terminal devices can also be terminal devices in an IoT system, also known as IoT nodes. IoT is an important component of future information technology development. Its main technical characteristic is connecting objects to networks through communication technologies, thereby realizing an intelligent network that enables human-machine interconnection and machine-to-machine interconnection. Connectivity can be achieved through broadband or narrowband technologies. IoT technology, for example, can achieve massive connectivity, deep coverage, and low terminal power consumption through narrowband (NB) technology. IoT technologies include reflective communication technology, spread spectrum technology, and ultra-wideband (UWB), which will not be elaborated further.

[0267] In this embodiment, the device for implementing the functions of the terminal device can be the terminal device itself, or it can be any device capable of supporting the terminal device in implementing those functions, such as a chip system. This device can be installed in or used in conjunction with the terminal device. In this embodiment, the chip system can be composed of chips or may include chips and other discrete components. This embodiment only uses the terminal device as an example to illustrate the device for implementing the functions of the terminal device, and does not constitute a limitation on the solution of this embodiment.

[0268] The network device in this application embodiment may include a device for communicating with a terminal device. This network device may include an access network device or a radio access network device, such as a base station. The network device in this application embodiment may also include a radio access network (RAN) node (or device) for connecting the terminal device to a wireless network.

[0269] The radio access network (RAN) device in this application is a device with wireless transceiver capabilities. The RAN device can provide wireless communication services, enabling terminal devices to access the wireless network. In the embodiments of this application, the network device can refer to a radio access network (RAN) node (or device) used in a cellular network (or mobile network) to connect terminal devices to the wireless network; it can also be a Zigbee base station, a Bluetooth master, a Bluetooth Low Energy (BLE) master, a LoRa base station, or a Wi-Fi access point.

[0270] A base station can broadly encompass, or be replaced by, various names including: NodeB, evolved NodeB (eNB), next-generation NodeB (gNB), relay station, transmitting and receiving point (TRP), transmitting point (TP), master station, secondary station, multi-standard radio (MSR) node, home base station, network controller, access node, wireless node, access point (AP), transmission node, transceiver node, baseband unit (BBU), remote radio unit (RRU), active antenna unit (AAU), remote radio head (RRH), central unit (CU), distributed unit (DU), radio unit (RU), positioning node, etc. A base station can be a macro base station, micro base station, relay node, donor node, or similar entity, or a combination thereof. A base station can also refer to a communication module, modem, or chip installed in the aforementioned equipment or device. A base station can also be a mobile switching center, equipment performing base station functions in D2D, V2X, and M2M communications, network-side equipment in 6G networks, or equipment performing base station functions in future communication systems. A base station can support networks with the same or different access technologies. Optionally, a RAN node can also be a server, wearable device, vehicle, or in-vehicle equipment. For example, the access network equipment in vehicle-to-everything (V2X) technology can be a roadside unit (RSU). The embodiments of this application do not limit the specific technology or equipment form used in the network equipment. In some deployments, the network equipment mentioned in the embodiments of this application can be equipment including a CU, or a DU, or equipment including both a CU and a DU, or equipment including a control plane CU node (central unit-control plane (CU-CP)), a user plane CU node (central unit-user plane (CU-UP)), and a DU node. For example, the network equipment can include gNB-CU-CP, gNB-CU-UP, and gNB-DU.

[0271] In some deployments, multiple RAN nodes collaborate to assist terminals in achieving wireless access, with different RAN nodes each implementing some of the base station's functions. For example, RAN nodes can be CUs, DUs, CU-CPs, CU-UPs, or RUs. CUs and DUs can be configured separately or included in the same network element, such as a BBU. RUs can be included in radio frequency equipment or radio frequency units, such as RRUs, AAUs, or RRHs.

[0272] RAN nodes can support one or more types of fronthaul interfaces, each corresponding to a DU and RU with different functions. If the fronthaul interface between the DU and RU is a common public radio interface (CPRI), the DU is configured to implement one or more baseband functions, and the RU is configured to implement one or more radio frequency functions. If the fronthaul interface between the DU and RU is another type of interface, then, relative to CPRI, some downlink and / or uplink baseband functions are moved from the DU to the RU: for the downlink, one or more of precoding, digital beamforming (BF), or inverse fast Fourier transform (IFFT) / cyclic prefix (CP) addition; and for the uplink, one or more of digital beamforming (BF) or fast Fourier transform (FFT) / cyclic prefix (CP) removal. In one possible implementation, the interface can be an enhanced common public radio interface (eCPRI). Under the eCPRI architecture, the functional split between the DU and RU differs, corresponding to different categories (Cat) of eCPRI, such as eCPRI Cat A, B, C, D, E, F.

[0273] Taking eCPRI Cat A as an example, for downlink transmission, the DU is configured to implement layer mapping and one or more functions before it (i.e., one or more of coding, rate matching, scrambling, modulation, and layer mapping), while other functions after layer mapping (e.g., one or more of resource element (RE) mapping, digital beamforming (BF), or inverse fast Fourier transform (IFFT) / cyclic prefix (CP) addition) are moved to the RU. For uplink transmission, the DU is configured to implement demapping and one or more functions before it (i.e., one or more of decoding, rate de-matching, descrambling, demodulation, inverse discrete Fourier transform (IDFT), channel equalization, and demapping), while other functions after demapping (e.g., one or more of digital BF or fast Fourier transform (FFT) / CP removal) are moved to the RU. It is understandable that the functional descriptions of the DU and RU corresponding to the various categories of eCPRI can be found in the eCPRI protocol, and will not be elaborated here.

[0274] In one possible design, the processing unit in the BBU used to implement baseband functions is called the baseband high (BBH) unit, and the processing unit in the RRU / AAU / RRH used to implement baseband functions is called the baseband low (BBL) unit.

[0275] In different systems, CU (or CU-CP and CU-UP), DU, or RU may have different names, but those skilled in the art will understand their meaning. For example, in an open radio access network (ORAN / O-RAN) system, CU can also be called O-CU (open CU), DU can also be called O-DU, CU-CP can also be called O-CU-CP, CU-UP can also be called O-CU-UP, and RU can also be called O-RU. Any of the units among CU (or CU-CP, CU-UP), DU, and RU in this application can be implemented through software modules, hardware modules, or a combination of software modules and hardware modules.

[0276] In this embodiment, the apparatus for implementing the functions of a network device can be the network device itself; it can also be an apparatus capable of supporting the network device in implementing those functions, such as a chip system, hardware circuit, software module, or a hardware circuit plus a software module. This apparatus can be installed in the network device or used in conjunction with the network device. This embodiment only uses the network device as an example to illustrate the apparatus for implementing the functions of the network device, and this does not constitute a limitation on the solutions described in this embodiment.

[0277] Network devices and / or terminal devices can be deployed on land, including indoors or outdoors, handheld or vehicle-mounted; they can also be deployed on water; and they can also be deployed in the air on airplanes, balloons, and satellites. This application does not limit the scenario in which the network devices and terminal devices are located. Furthermore, terminal devices and network devices can be hardware devices, or software functions running on dedicated hardware or general-purpose hardware, such as virtualization functions instantiated on a platform (e.g., a cloud platform), or entities that include dedicated or general-purpose hardware devices and software functions. This application does not limit the specific form of the terminal devices and network devices.

[0278] To support artificial intelligence (AI) technology in wireless networks, AI nodes can be introduced. AI nodes can be AI network elements or AI modules.

[0279] Optionally, the AI node can be deployed in one or more of the following locations within the communication system: access network devices, terminal devices, or core network devices, etc. Alternatively, the AI node can be deployed independently, for example, in a location other than any of the aforementioned devices, such as in the host or cloud server of an over-the-top (OTT) system. The AI node can communicate with other devices in the communication system, which can be one or more of the following: network devices, terminal devices, or core network elements, etc.

[0280] It is understood that this application does not limit the number of AI nodes. For example, when there are multiple AI nodes, these nodes can be divided based on function, such as different AI nodes being responsible for different functions.

[0281] It can also be understood that AI nodes can be independent devices, or they can be integrated into the same device to achieve different functions. Alternatively, they can be network elements in hardware devices, software functions running on dedicated hardware, or virtualization functions instantiated on a platform (e.g., a cloud platform). This application does not limit the specific form of the aforementioned AI nodes.

[0282] The AI node mentioned here can be used as a data acquisition device, database, training device, or execution device as shown in Figure 1.

[0283] Figure 5 illustrates a possible application framework in a communication system. As shown in Figure 5, network elements in the communication system are connected via interfaces (e.g., NG, Xn) or air interfaces. These network element nodes, such as core network equipment, access network nodes (RAN nodes), terminals, or one or more devices in operation administration and maintenance (OAM), are equipped with one or more AI modules (only one is shown in Figure 5 for clarity). The access network node can be a single RAN node or can include multiple RAN nodes, such as CU and DU, or gNB and / or ng-eNB. The CU and / or DU can also be equipped with one or more AI modules. Optionally, the CU can be further divided into CU-CP and CU-UP. One or more AI models are configured in the CU-CP and / or CU-UP.

[0284] The AI module is used to implement corresponding AI functions. AI modules deployed in different network elements can be the same or different. Depending on the parameter configuration, the AI module can implement different functions. The model of the AI module can be configured based on one or more of the following parameters: structural parameters (e.g., at least one of the following: number of neural network layers, neural network width, inter-layer connections, neuron weights, neuron activation function, or bias in the activation function), input parameters (e.g., type and / or dimension of input parameters), or output parameters (e.g., type and / or dimension of output parameters). The bias in the activation function can also be referred to as the neural network bias.
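As a purely illustrative sketch of the parameter groups listed above, a model configuration could be held in a small record. The class name `AIModelConfig`, its field names, and all values below are assumptions made for illustration; they are not defined in this application.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class AIModelConfig:
    """Hypothetical container for the parameter groups named in the text."""
    # Structural parameters
    num_layers: int
    layer_width: int
    activation: str = "relu"
    use_bias: bool = True
    # Input parameters: type and dimension
    input_type: str = "channel_state_information"
    input_dim: Tuple[int, ...] = (32, 13)
    # Output parameters: type and dimension
    output_type: str = "compressed_information"
    output_dim: Tuple[int, ...] = (64,)


def describe(cfg: AIModelConfig) -> str:
    """Summarize a configuration in one line."""
    return (f"{cfg.num_layers}x{cfg.layer_width} {cfg.activation}: "
            f"{cfg.input_dim} -> {cfg.output_dim}")


# Two differently configured modules implement different functions.
csi_compressor = AIModelConfig(num_layers=4, layer_width=128)
csi_predictor = AIModelConfig(
    num_layers=2,
    layer_width=64,
    output_type="predicted_channel_state_information",
    output_dim=(32, 13),
)
```

Changing only the configuration, as in the two instances above, is one way the same module code could serve different functions.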

[0285] An AI module can have one or more models. A model can infer an output, which includes one or more parameters. The learning, training, or inference processes of different models can be deployed on different nodes or devices, or they can be deployed on the same node or device.

[0286] Figure 6 illustrates a possible application framework in a communication system. As shown in Figure 6, the communication system includes a RAN intelligent controller (RIC). For example, the RIC can be the AI module shown in Figure 5, used to implement AI-related functions. The RIC includes near-real-time RICs (near-RT RICs) and non-real-time RICs (non-RT RICs). Non-real-time RICs primarily process non-real-time information, such as data that is not sensitive to latency, with latency in the order of seconds. Near-real-time RICs primarily process near-real-time information, such as data that is relatively sensitive to latency, with latency in the order of tens of milliseconds.

[0287] The near real-time RIC is used for model training and inference. For example, it can be used to train an AI model and then use that AI model for inference. The near real-time RIC can obtain network-side and / or terminal-side information from RAN nodes (e.g., CU, CU-CP, CU-UP, DU, and / or RU) and / or terminals. This information can be used as training data or inference data. Optionally, the near real-time RIC can deliver inference results to RAN nodes and / or terminals. Optionally, inference results can be exchanged between CU and DU, and / or between DU and RU. For example, the near real-time RIC delivers the inference result to the DU, and the DU sends it to the RU.

[0288] The non-real-time RIC is also used for model training and inference. For example, it can be used to train an AI model and then use that model for inference. The non-real-time RIC can obtain network-side and / or terminal-side information from RAN nodes (e.g., CU, CU-CP, CU-UP, DU, and / or RU) and / or terminals. This information can be used as training data or inference data, and the inference results can be delivered to the RAN nodes and / or terminals. Optionally, inference results can be exchanged between CU and DU, and / or between DU and RU. For example, the non-real-time RIC delivers the inference results to the DU, which then forwards them to the RU.

[0289] The near real-time RIC and non-real-time RIC can also be set up as separate network elements. Optionally, the near real-time RIC and non-real-time RIC can also be part of other devices. For example, the near real-time RIC can be set in the RAN node (e.g., in CU, DU), while the non-real-time RIC can be set in the OAM, cloud server, core network device, or other network device.

[0290] The RIC here can be used as a data acquisition device, database, training device, or execution device as shown in Figure 1.

[0291] Figure 7 is a schematic diagram of a communication system applicable to the method of this application embodiment. As shown in Figure 7, the communication system 1400 may include at least one network device, such as network device 1410 shown in Figure 7; the communication system 1400 may also include at least one terminal device, such as terminal device 1420 and terminal device 1430 shown in Figure 7. Network device 1410 and terminal devices (such as terminal devices 1420 and 1430) can communicate via a wireless link. The communication devices in this communication system, for example, network device 1410 and terminal device 1420, can communicate via multi-antenna technology.

[0292] The network devices in this system can be used as at least one of the following: data acquisition devices, databases, training devices, and execution devices. The terminals can also be used as at least one of the following: data acquisition devices, databases, and execution devices.

[0293] Figure 8 is a schematic diagram of a communication system applicable to the method of this application embodiment. Compared with the communication system 1400 shown in Figure 7, the communication system 1500 shown in Figure 8 also includes an AI entity 1440.

[0294] AI entity 1440 is used to perform AI-related operations, such as building training datasets or training AI models.

[0295] In one possible implementation, network device 1410 can send data related to the training of the AI model to AI entity 1440, whereby AI entity 1440 constructs a training dataset and trains the AI model. For example, the data related to the training of the AI model may include data reported by the terminal device. AI entity 1440 can send the results of operations related to the AI model to network device 1410, which then forwards them to the terminal device. For example, the results of operations related to the AI model may include at least one of the following: a trained AI model, model evaluation results, or test results. Exemplarily, a portion of the trained AI model may be deployed on network device 1410, and another portion on the terminal device. Alternatively, the trained AI model may be deployed on network device 1410. Or, the trained AI model may be deployed on the terminal device.

[0296] It should be understood that Figure 8 only shows, as an example, AI entity 1440 being directly connected to network device 1410. In other scenarios, AI entity 1440 can also be connected to a terminal device. Alternatively, AI entity 1440 can be connected to both network device 1410 and a terminal device simultaneously. Alternatively, AI entity 1440 can also be connected to network device 1410 through a third-party network element. This application embodiment does not limit the connection relationship between the AI entity and other network elements.

[0297] AI entity 1440 can also be configured as a module in at least one of the following devices: network devices, terminal devices, and core network elements, for example, in network device 1410 or a terminal device shown in Figure 8; or, the AI entity can be located inside an AMF or location management function (LMF).

[0298] It should be noted that Figures 7 and 8 are simplified schematic diagrams for ease of understanding. For example, the communication system may also include other devices, such as wireless relay devices and / or wireless backhaul devices, and may also include one or more core network elements, which are not shown in Figures 7 and 8. In practical applications, the communication system may include multiple network devices or multiple terminal devices. The embodiments of this application do not limit the number of network devices and terminal devices included in the communication system.

[0299] The AI entity in this system can be used as at least one of the following: data acquisition device, database, training device, and execution device.

[0300] The core network elements in this application can be used to implement one or more of the following functions: access and mobility management function (AMF), session management function (SMF), policy control function (PCF), user plane function (UPF), network data analytics function (NWDAF), charging function (CHF), and unified data management (UDM).

[0301] AMF is mainly used for terminal attachment, mobility management, and tracking area update procedures in mobile networks. The access management network element terminates non-access stratum (NAS) messages, completes registration management, connection management, and reachability management, allocates tracking area lists (TA lists), and performs mobility management, and transparently routes session management (SM) messages to the session management network element.

[0302] SMF is primarily used for session management in mobile networks, such as session establishment, modification, and release. For example, it can be used to assign Internet Protocol (IP) addresses to terminals and select user plane network elements that provide packet forwarding functionality.

[0303] PCF primarily provides user subscription data management, policy control, billing policy control, and quality of service (QoS) control.

[0304] UPF is primarily responsible for processing user messages, such as forwarding, billing, and legality monitoring.

[0305] NWDAF provides intelligent analysis services, using artificial intelligence and big data analytics to output analysis results in a standardized format. The output generally includes two forms: statistical analysis of historical data and predictions of future data. Service processing network elements adjust the network based on the NWDAF output to optimize network operation. For example, when a service processing network element detects, through the statistical analysis results output by NWDAF, that the packet loss rate of ongoing service message transmissions at a terminal exceeds a threshold, impacting service experience, it will take corresponding measures, such as increasing the proportion of retransmitted packets, to attempt to improve the service experience.

[0306] CHF mainly collects billing events from various service processing network elements such as SMF, AMF, or UPF, and bills users' services according to the rates determined in advance through negotiation.

[0307] UDM network elements are used to store and manage user terminal network and service subscription data.

[0308] The communication system of this application may also include application functions, which may be referred to as third-party applications.

[0309] In this application, a function can also be a network element or an entity. For example, an AMF can also be an AMF network element or an AMF entity.

[0310] When the technical solution of this application is applied to a communication system, in some implementations, the data in the first data sequence is channel state information. For example, H1, H2...Hn in Figure 3 are channel state information, respectively.

[0311] In some implementations, the trained neural network model is used to compress the channel state information. For example, in Figure 3, C1, C2...Cn are the compressed channel state information H1, H2...Hn, respectively, and C1', C2'...Cn' are the true values of the compressed channel state information H1, H2...Hn, respectively.

[0312] In some implementations, the trained neural network model is used to predict channel state information. For example, in Figure 3, C1, C2...Cm are the channel state information predicted based on channel state information H1, H2...Hn, and C1', C2'...Cm' are the true values of the channel state information predicted based on channel state information H1, H2...Hn.

[0313] In some implementations, the trained neural network model is used to predict and compress channel state information. For example, in Figure 3, C1, C2...Cm are the channel state information predicted and compressed based on channel state information H1, H2...Hn, and C1', C2'...Cm' are the true values of the channel state information predicted and compressed based on channel state information H1, H2...Hn.

[0314] When the data in the first data sequence is channel state information, in some implementations, the channel state information is denoted as H, and its dimension is $(N_{port}, N_{band})$, where $N_{port}$ represents the number of antenna ports and $N_{band}$ represents the number of subbands. When H is the original channel information, $N_{port} = N_T \times N_R$, where $N_T$ represents the number of transmit antenna ports and $N_R$ represents the number of receive antenna ports. When H is a channel eigenvector, $N_{port} = N_T$.

[0315] The first data sequence can be represented as $H(1:T)$ with dimension $(T, N_{port}, N_{band})$, that is, a channel sequence with a sequence length of T.
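The dimensions described above can be checked with a small sketch. The helper names `make_csi_sequence` and `shape`, and the example sizes (4 transmit ports, 2 receive ports, 13 subbands, sequence length 5), are assumptions chosen only for illustration.

```python
def make_csi_sequence(T, n_port, n_band):
    """Build a placeholder channel sequence of shape (T, n_port, n_band).

    Entries are complex zeros; real channel state information would be
    measured or simulated.
    """
    return [[[0j for _ in range(n_band)] for _ in range(n_port)]
            for _ in range(T)]


def shape(x):
    """Return the dimensions of a non-empty nested list."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)


N_T, N_R, N_BAND, T = 4, 2, 13, 5

# Original channel information: N_port = N_T * N_R
raw_sequence = make_csi_sequence(T, N_T * N_R, N_BAND)

# Channel eigenvector: N_port = N_T
eig_sequence = make_csi_sequence(T, N_T, N_BAND)

print(shape(raw_sequence), shape(eig_sequence))
```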

[0316] In some implementations, the data in the second data sequence is compressed channel state information, i.e., compressed information.

[0317] In some implementations, the dimension of the compressed information is $N_c$, indicating that the compressed information is a real vector, a complex vector, or a binary vector of length $N_c$.

[0318] In some implementations, the second data sequence can be represented as $C(1:T)$ with dimension $(T, N_c)$, that is, a compressed information sequence with a sequence length of T.

[0319] When the data in the first data sequence is represented using a basis method, in some possible implementations, the i-th data is denoted as $H_i$, with $H_i = V_{p1} \cdot Coeff_{i1} \cdot V_{b1}$, where $V_{p1}$ and $V_{b1}$ are the bases and $Coeff_{i1}$ is the coefficient of the i-th data on the bases. In this case, the i-th data can be represented by the identifier of the basis and the coefficients in the basis.
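A minimal numeric sketch of this two-sided basis representation follows. It assumes real-valued, orthonormal bases (2x2 rotation matrices), which this application does not require; under that assumption the coefficient can be obtained as the transpose of the first basis times H_i times the transpose of the second basis. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def rotation(theta):
    """2x2 rotation matrix, used here as an example orthonormal basis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


V_p1 = rotation(0.3)  # assumed port-domain basis
V_b1 = rotation(1.1)  # assumed band-domain basis

H_i = [[1.0, 2.0], [3.0, 4.0]]

# Coefficient on the bases (valid because both bases are orthonormal).
coeff_i1 = matmul(matmul(transpose(V_p1), H_i), transpose(V_b1))

# Reconstruction: H_i = V_p1 * Coeff_i1 * V_b1
H_rebuilt = matmul(matmul(V_p1, coeff_i1), V_b1)
```

When both sides already know the bases, only a basis identifier and `coeff_i1` need to be exchanged to recover `H_i`.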

[0320] When the data in the first data sequence is represented using a common basis, in some possible implementations, the i-th data is denoted as $H_i$, with $h_i = V_{p3} \cdot Coeff_{i3}$, where $V_{p3}$ is the basis, $Coeff_{i3}$ is the coefficient of the i-th data on the basis, and $h_i$, of dimension $(N_{port} \cdot N_{band})$, is the long vector obtained by flattening $H_i$.
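The common-basis case can be sketched in the same spirit: each H_i is flattened into a long vector, and every element of the sequence shares one basis so that only the coefficients differ. The normalized 4x4 Hadamard matrix used as the basis, and the helper names, are assumptions for illustration only.

```python
def flatten(H):
    """Flatten an N_port x N_band matrix row by row into a long vector."""
    return [x for row in H for x in row]


# Normalized Hadamard matrix: symmetric, with orthonormal columns.
_SIGNS = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
V_p3 = [[0.5 * s for s in row] for row in _SIGNS]


def coefficients(h):
    """Coefficient vector of h on the common basis (basis transposed times h)."""
    n = len(h)
    return [sum(V_p3[r][k] * h[r] for r in range(n)) for k in range(n)]


def rebuild(coeff):
    """Reconstruct h = V_p3 * coeff."""
    n = len(coeff)
    return [sum(V_p3[r][k] * coeff[k] for k in range(n)) for r in range(n)]


# A sequence of two 2x2 data items sharing the same basis.
sequence = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
coeffs = [coefficients(flatten(H)) for H in sequence]
restored = [rebuild(c) for c in coeffs]
```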

[0321] When the data in the second data sequence is represented using a basis method, in some possible implementations, the i-th data is denoted as $C_i$, with $C_i = V_{p2} \cdot Coeff_{i2}$, where $V_{p2}$ is the basis and $Coeff_{i2}$ is the coefficient of the i-th data on the basis. In this case, the i-th data can be represented by the identifier of the basis and the coefficients in the basis.

[0322] In this application, the neural network model used for compression (and prediction) can be called an encoder neural network model, or simply an encoder; the neural network model used for decompression or recovery can be called a decoder neural network model, or simply a decoder.

[0323] As an example, the data acquisition device is a network-side device, such as a base station, and the training device is a terminal-side device, such as a terminal or a server that communicates with the terminal.

[0324] For example, after a base station trains an encoder and decoder, it inputs a set of input data sequences into the trained encoder to obtain the corresponding labels. The base station then sends the input sequences and labels to the terminal-side device, which trains its own encoder, enabling it to adapt to the base station-side decoder. The terminal-side device can be a terminal device, a module (such as a chip) within the terminal device, software containing terminal device functions (such as a control subsystem), or other devices communicating with the terminal device, such as AI network elements, which can be servers like OTT devices or cloud servers.

[0325] Taking the application of a neural network model in the scenario of a base station feeding back downlink reference signal channel state information to a terminal as an example, in some implementations, the base station acquires N channel state information sequences, sequentially inputs these N sequences into a trained encoder to obtain N compressed information sequences, and then sends these N compressed information sequences (compressed information sequences or label sequences) along with the corresponding N channel state information sequences (channel state information sequences or input sequences) to the terminal for model training, model monitoring, or model fine-tuning. Alternatively, the base station acquires N channel state information sequences, sequentially inputs them into a trained encoder to obtain N compressed information sequences, sequentially inputs these N compressed information sequences into a trained decoder to obtain N recovered channel state information sequences, and then feeds back these N compressed information sequences along with the corresponding N recovered channel state information sequences to the terminal.

[0326] After receiving the N compressed information sequences and the N channel state information sequences, the terminal inputs the N channel state information sequences into its encoder in sequence to obtain N compressed information sequences. By comparing the differences between the N output compressed information sequences and the N compressed information sequences sent by the base station, the encoder is trained so that the trained encoder can match the base station's decoder.
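The training step described above can be sketched as follows. This is a toy under stated assumptions, not the model of this application: the encoder is a stateless linear map, `W_ref` stands in for the base station's trained encoder, the loss is a mean squared difference accumulated over each whole sequence, and all names and sizes are hypothetical.

```python
import random


def encode(W, h):
    """Apply a linear encoder W (rows = outputs) to one CSI vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


random.seed(0)
IN_DIM, OUT_DIM, N, T = 3, 2, 8, 4

# Stand-in for the base station's trained encoder (assumed values).
W_ref = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.5]]

# N input sequences of length T, and the labels the base station derives.
inputs = [[[random.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(T)]
          for _ in range(N)]
labels = [[encode(W_ref, h) for h in seq] for seq in inputs]


def sequence_loss(W, seq, lab):
    """Squared difference accumulated over the whole sequence, averaged over T."""
    return sum((c - t) ** 2
               for h, l in zip(seq, lab)
               for c, t in zip(encode(W, h), l)) / len(seq)


def train(W, inputs, labels, lr=0.1, epochs=300):
    """Gradient descent on the per-sequence loss; updates W in place."""
    for _ in range(epochs):
        for seq, lab in zip(inputs, labels):
            grad = [[0.0] * IN_DIM for _ in range(OUT_DIM)]
            for h, l in zip(seq, lab):  # elements are fed in sequence order
                out = encode(W, h)
                for i in range(OUT_DIM):
                    err = out[i] - l[i]
                    for j in range(IN_DIM):
                        grad[i][j] += 2 * err * h[j] / len(seq)
            for i in range(OUT_DIM):
                for j in range(IN_DIM):
                    W[i][j] -= lr * grad[i][j]
    return W


# Terminal-side encoder, trained from scratch on the received pairs.
W_ue = train([[0.0] * IN_DIM for _ in range(OUT_DIM)], inputs, labels)
final_loss = sum(sequence_loss(W_ue, s, l) for s, l in zip(inputs, labels)) / N
```

Because the labels come from the base station's encoder, driving this loss toward zero makes the terminal's encoder produce outputs that the base station's decoder was trained to consume.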

[0327] As an example, the data acquisition device is a terminal-side device, such as a terminal or a server communicating with a terminal. The training device is a network-side device, such as a base station or a server communicating with a base station. This network-side device can be a network device, a module (such as a chip) within a network device, software containing network device functions (such as a control subsystem), or other devices communicating with the network device, such as an AI network element, which is a server, such as an OTT device or a cloud server.

[0328] For example, after the terminal trains the encoder and decoder, it inputs a set of data sequences into the trained encoder to obtain the output sequence; then it inputs the output sequence into the trained decoder to obtain the final output sequence, which is denoted as the label of the encoder output sequence. The terminal sends the encoder output sequence and its label to the base station, and the base station uses them to train its own decoder, so that the decoder can be adapted to the encoder on the terminal (UE) side.

[0329] Taking the scenario of applying a neural network model to the feedback of channel state information (CSI) of downlink reference signals from a terminal to a base station as an example, in some implementations, the terminal device acquires N CSI sequences. The terminal then sequentially inputs these N CSI sequences into a trained neural network model to obtain N compressed information sequences. It then feeds back these N compressed information sequences (compressed information sequences or input sequences) and the corresponding N CSI sequences (channel state information sequences or label sequences) to the base station for model training, monitoring, or fine-tuning. Alternatively, the terminal device acquires N CSI sequences, sequentially inputs them into a trained encoder to obtain N compressed information sequences, and then sequentially inputs these N compressed information sequences into a decoder trained on the terminal side to obtain N recovered CSI sequences. Finally, it feeds back these N compressed information sequences and the corresponding N recovered CSI sequences to the base station.
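The two ways of building the data fed back to the base station can be sketched as one helper. The function `build_decoder_dataset` and the toy encoder and decoder (truncate to two entries, pad with a zero) are hypothetical stand-ins for trained models.

```python
def build_decoder_dataset(csi_sequences, encoder, decoder=None):
    """Return (input_sequences, label_sequences) for training a peer decoder.

    Inputs are the compressed sequences. Labels are the original CSI
    sequences, or, when a local decoder is supplied, the recovered CSI
    sequences produced by that decoder.
    """
    inputs, labels = [], []
    for seq in csi_sequences:
        compressed = [encoder(h) for h in seq]  # fed in sequence order
        inputs.append(compressed)
        if decoder is None:
            labels.append([list(h) for h in seq])
        else:
            labels.append([decoder(c) for c in compressed])
    return inputs, labels


def toy_encoder(h):
    """Placeholder compression: keep the first two entries."""
    return list(h[:2])


def toy_decoder(c):
    """Placeholder recovery: pad the compressed vector with a zero."""
    return list(c) + [0.0]


csi = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]  # N = 1 sequence, T = 2

x_a, y_a = build_decoder_dataset(csi, toy_encoder)               # labels: original CSI
x_b, y_b = build_decoder_dataset(csi, toy_encoder, toy_decoder)  # labels: recovered CSI
```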

[0330] After receiving these N compressed information sequences, the base station inputs them into the decoder in sequence to obtain N channel state information sequences. By comparing the differences between the N output channel state information sequences and the N channel state information sequences sent by the terminal, the decoder is trained so that the trained decoder can match the encoder of the terminal.

[0331] As an example, the data acquisition device is a third party, the training device is a terminal-side device, such as a terminal or a server communicating with the terminal; and / or, a network-side device, such as a base station or a server communicating with the base station. The third party can be equipment provided or operated by a non-network equipment manufacturer or terminal manufacturer, such as a server.

[0332] For example, if a third party sends the input sequence and label corresponding to the encoder to the terminal device for the terminal device to train its own encoder, or sends the input sequence and label corresponding to the decoder to the network device for the network device to train its own decoder, then the encoder can be adapted to the decoder on the network side.

[0333] Taking the application of a neural network model to the feedback of channel state information (CSI) of downlink reference signals from a terminal to a base station as an example, in some implementations, a third party obtains N CSI sequences. This third party then sequentially inputs these N CSI sequences into a trained encoder to obtain N compressed sequences. These compressed sequences (compressed sequences or label sequences) and the corresponding N CSI sequences (channel state information sequences or input sequences) are then fed back to the terminal or base station for model training, monitoring, or fine-tuning. Alternatively, the third party obtains N CSI sequences, sequentially inputs them into a trained encoder to obtain N compressed sequences, and then sequentially inputs these N compressed sequences into a decoder trained by the third party to obtain N recovered CSI sequences. The N compressed sequences and the corresponding N recovered CSI sequences are then fed back to the terminal or base station for model training, monitoring, or fine-tuning of the decoder.

[0334] After receiving the N compressed information and the corresponding N channel state information, the terminal uses an encoder to compress the N channel state information, thereby obtaining N compressed information. The encoder is trained by comparing the difference between the N compressed information output by the encoder and the received N compressed information.

[0335] After receiving the N compressed information and the corresponding N channel state information, the base station inputs the N compressed information into the decoder in sequence to obtain the N channel state information. By comparing the differences between the N channel state information output by the decoder and the received N channel state information, the decoder is trained.

[0336] Ultimately, this allows the decoder trained at the base station to match the encoder trained at the terminal.

[0337] It should be understood that in this application, indication includes direct indication (also known as explicit indication) and implicit indication. Directly indicating information A means that information A itself is included; implicitly indicating information A means that information A is indicated through the correspondence between information A and information B, together with a direct indication of information B. The correspondence between information A and information B can be predefined, pre-stored, pre-burned, or pre-configured.
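The distinction between the two kinds of indication can be illustrated with a lookup table. The message layout, the table `B_TO_A`, and its contents are assumptions for illustration; this application does not define a message format here.

```python
# Hypothetical predefined correspondence between information B and information A.
B_TO_A = {
    0: "do not concatenate",
    1: "concatenate with the first sequence",
}


def resolve_information_a(message):
    """Recover information A from a received message.

    Direct (explicit) indication: the message carries A itself.
    Implicit indication: the message carries B, and A follows from the
    predefined correspondence between B and A.
    """
    if "A" in message:
        return message["A"]
    return B_TO_A[message["B"]]


explicit = resolve_information_a({"A": "concatenate with the first sequence"})
implicit = resolve_information_a({"B": 1})
```

Both calls yield the same information A; only what is carried in the message differs.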

[0338] It should be understood that in this application, information C is used to determine information D, including both situations where information D is determined solely based on information C and situations where it is determined based on information C and other information. Furthermore, information C can also be used to determine information D indirectly, for example, where information D is determined based on information E, and information E is determined based on information C.

[0339] Figure 9 is a schematic diagram of a training data acquisition device according to an embodiment of this application. As shown in Figure 9, the device 1600 may include a processing module 1601 and a communication module 1602.

[0340] As a first example, device 1600 can be used in the training data acquisition method of any of the foregoing embodiments. For example, processing module 1601 is used to implement processing-related steps such as acquisition, generation, and determination, and communication module 1602 is used to implement steps such as sending and / or receiving.

[0341] Figure 10 is a schematic diagram of a training data acquisition device provided in another embodiment of this application. As shown in Figure 10, the device 1700 includes a processing circuit 1701. The device 1700 may also include a communication circuit 1702. The processing circuit 1701 and the communication circuit 1702 are coupled to each other.

[0342] It is understood that the processing circuit can be one or more processors, or it can be all or part of the processing functions of one or more processors.

[0343] Understandably, the communication circuit 1702 can be a transceiver or an input / output interface.

[0344] Optionally, the device 1700 may further include a memory 1703 for storing instructions executed by the processing circuit 1701, or storing input data required by the processing circuit 1701 to run the instructions, or storing data generated after the processing circuit 1701 runs the instructions.

[0345] It is understood that the memory 1703 may be located outside the processing circuit 1701, or inside the processing circuit 1701.

[0346] As an example, the processing circuit 1701 is used to implement the functions of the processing module 1601, and the communication circuit 1702 is used to implement the functions of the communication module 1602.

[0347] As an example, device 1700 may be a data acquisition device, a database, or a training device, or it may be a chip applied in a data acquisition device, database, or training device.

[0348] As an example, device 1700 can be a communication device or a chip used in a communication device.

[0349] When device 1700 is a communication device, the communication circuit can be a transceiver; when device 1700 is a chip, the communication circuit can be an input / output circuit, a bus, pins, or other types of communication interfaces. The input circuit in the input / output circuit can be used for receiving, and the output circuit can be used for transmitting.

[0350] In some embodiments of this application, a computer-readable storage medium is also provided, which contains computer instructions that, when executed on a processor, can implement the methods implemented by the communication device in any of the above embodiments.

[0351] In some embodiments of this application, a system is also provided that can implement the methods implemented by one or more of the data acquisition device, database, training device, and execution device in any of the above embodiments. As an example, this system may be a communication system.

[0352] It is understood that the processor in the embodiments of this application may be any of the following devices or all or part of the circuitry used for processing functions: a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSPs), field-programmable gate arrays (FPGAs), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. A general-purpose processor may be a microprocessor or any conventional processor.

[0353] The method steps in the embodiments of this application can be implemented in hardware or by a processor executing software instructions. The software instructions can consist of corresponding software modules, which can be stored in random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, registers, hard disks, portable hard disks, CD-ROMs, or any other form of storage medium known in the art. An exemplary storage medium is coupled to a processor, enabling the processor to read information from and write information to the storage medium. Of course, the storage medium can also be a component of the processor. The processor and storage medium can reside in a chip, such as an application-specific integrated circuit (ASIC), or a system-on-a-chip (SoC). Additionally, the chip or SoC can reside in a network device or terminal device. Alternatively, the processor and storage medium can exist as discrete components in the network device or terminal device.

[0354] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are performed entirely or partially. The computer can be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user equipment, or other programmable device. The computer program or instructions can be stored in a computer-readable storage medium or transferred from one computer-readable storage medium to another. For example, the computer program or instructions can be transferred from one website, computer, server, or data center to another website, computer, server, or data center via wired or wireless means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium, such as a floppy disk, hard disk, or magnetic tape; it can also be an optical medium, such as a digital video optical disc; or it can be a semiconductor medium, such as a solid-state drive.

[0355] In the various embodiments of this application, unless otherwise specified or in case of logical conflict, the terminology and/or descriptions of different embodiments are consistent and may be cross-referenced. The technical features of different embodiments can be combined to form new embodiments according to their inherent logical relationship.

[0356] It is understood that the various numerical designations used in the embodiments of this application are merely for descriptive convenience and are not intended to limit the scope of the embodiments of this application. The order of the process numbers described above does not imply the order of execution; the execution order of each process should be determined by its function and internal logic.

Claims

1. A method for acquiring training data, characterized in that the method comprises: obtaining T data sequences, wherein each of the T data sequences contains at least one data point, and T is an integer greater than 1; obtaining first information, wherein the first information indicates P data sequences, among the T data sequences, that are used to obtain a first data sequence, and P is a positive integer less than or equal to T; and determining the first data sequence based on the first information and the T data sequences, wherein the first data sequence contains M data points, M is an integer greater than 1, the first data sequence is used as an input data sequence of a neural network model and/or as a label of the input data sequence of the neural network model, and the data in the input data sequence is to be input into the neural network model sequentially.
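For illustration only, and not as part of the claims, the steps of claim 1 can be sketched as follows. The sketch assumes that the first information is a list of zero-based indices into the T data sequences and that the first data sequence is formed by concatenating the selected sequences in order; the function name `build_first_sequence` and both assumptions are hypothetical, since claim 1 itself does not fix how the first data sequence is determined.

```python
from typing import List, Sequence


def build_first_sequence(
    sequences: Sequence[Sequence[float]],
    selected: Sequence[int],
) -> List[float]:
    """Assemble a first data sequence from P of the T data sequences.

    `selected` plays the role of the first information (assumed here to be
    zero-based indices); concatenation is an assumed combination rule.
    """
    t = len(sequences)
    if t <= 1:
        raise ValueError("T must be an integer greater than 1")
    if any(len(seq) == 0 for seq in sequences):
        raise ValueError("each data sequence must contain at least one data point")
    if not 1 <= len(selected) <= t:
        raise ValueError("P must satisfy 1 <= P <= T")

    first_sequence: List[float] = []
    for index in selected:
        first_sequence.extend(sequences[index])

    if len(first_sequence) <= 1:
        raise ValueError("M must be an integer greater than 1")
    return first_sequence


# Example with T = 3 and P = 2: sequences 0 and 2 are selected.
data_sequences = [[1.0, 2.0], [3.0], [4.0, 5.0]]
first = build_first_sequence(data_sequences, [0, 2])
```

The resulting list would then be fed to the model one data point at a time, or used as the label of another input data sequence.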

2. The method according to claim 1, characterized in that the T data sequences include a first sequence, and the first information includes first indication information, the first indication information indicating the data sequence, among the T data sequences, that is concatenated with the first sequence.

3. The method according to claim 1, characterized in that the T data sequences include a first sequence and a second sequence, and the first information includes second indication information, the second indication information indicating whether the second sequence is concatenated with the first sequence.

4. The method according to claim 1, characterized in that the first information includes at least one of the following: a start sequence identifier of the first data sequence, P, or an end sequence identifier of the first data sequence.
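As a non-authoritative sketch of how the fields listed in claim 4 could be resolved into a concrete selection, the example below assumes that sequence identifiers are zero-based indices and that the P data sequences are contiguous. The function `resolve_selected` and its default behavior when only one field is present (start at the first sequence, end at the last) are assumptions made for illustration, not rules stated in the application.

```python
from typing import List, Optional


def resolve_selected(
    total: int,
    start: Optional[int] = None,
    count: Optional[int] = None,
    end: Optional[int] = None,
) -> List[int]:
    """Turn (start identifier, P, end identifier) into a list of indices.

    `total` is T. Any missing field is derived from the others, or filled
    with an assumed default when it cannot be derived.
    """
    if start is None and count is None and end is None:
        raise ValueError("at least one of start, count (P), or end is required")

    if start is None:
        if end is not None and count is not None:
            start = end - count + 1
        else:
            start = 0  # assumed default: begin at the first sequence
    if end is None:
        # Derive the end from P if given; otherwise assume the last sequence.
        end = start + count - 1 if count is not None else total - 1

    if count is not None and end - start + 1 != count:
        raise ValueError("start, count (P), and end are inconsistent")
    if not 0 <= start <= end < total:
        raise ValueError("selection falls outside the T data sequences")
    return list(range(start, end + 1))
```

For example, with T = 5, a start identifier of 1 and P = 2 select sequences 1 and 2, while P = 2 and an end identifier of 4 select sequences 3 and 4.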

5. The method according to claim 1, characterized in that the first information indicating the P data sequences comprises: the first information indicating characteristics of the P data sequences.

6. The method according to claim 5, characterized in that the first information includes identification information of each of the P data sequences.

7. The method according to any one of claims 2 to 6, characterized in that the T data sequences include a first sequence, and obtaining the first sequence in the T data sequences comprises: obtaining second information, wherein the second information indicates at least one of the following: a start data identifier of the first sequence, a length of the first sequence, or an end data identifier of the first sequence.

8. The method according to claim 7, characterized in that the second information further indicates an initial value of a model state corresponding to the first sequence.
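One possible reading of claims 7 and 8, sketched under stated assumptions: the first sequence is cut out of a longer recording of data points, and the second information carries a start data identifier, a length, and optionally an initial model state value. The names `SecondInformation` and `cut_first_sequence`, the zero-based indexing, and the scalar state are all hypothetical choices made for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class SecondInformation:
    """Hypothetical container for the second information."""
    start: int                             # start data identifier (assumed zero-based)
    length: int                            # length of the first sequence
    initial_state: Optional[float] = None  # initial model state value, if indicated


def cut_first_sequence(
    recording: Sequence[float],
    info: SecondInformation,
) -> List[float]:
    """Extract the first sequence delimited by the second information."""
    end = info.start + info.length
    if info.start < 0 or info.length < 1 or end > len(recording):
        raise ValueError("second information does not fit the recording")
    return list(recording[info.start:end])


recording = [0.1, 0.2, 0.3, 0.4, 0.5]
info = SecondInformation(start=1, length=3, initial_state=0.0)
first_sequence = cut_first_sequence(recording, info)
```

An end data identifier could replace either field, since any two of start, length, and end determine the third.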

9. The method according to any one of claims 1 to 8, characterized in that the data in the first data sequence contains channel state information.

10. The method according to any one of claims 1 to 9, characterized in that the method further comprises: obtaining first network state information, wherein the first network state information indicates an initial value of the model state used by the neural network model when obtaining an output data sequence corresponding to the first data sequence.
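To make the role of the initial model state concrete, the following minimal sketch feeds a data sequence into a toy recurrent cell one data point at a time, starting from a supplied initial state. The single-weight `tanh` cell, the function name `run_sequentially`, and the weight values are assumptions chosen for brevity; the application does not specify the model architecture here.

```python
import math
from typing import List, Sequence


def run_sequentially(
    inputs: Sequence[float],
    initial_state: float,
    w_in: float = 0.5,
    w_state: float = 0.9,
) -> List[float]:
    """Process the inputs in order with a toy scalar recurrent cell.

    The state after each step is also emitted as that step's output, so the
    output data sequence depends on the initial state value.
    """
    state = initial_state
    outputs: List[float] = []
    for x in inputs:
        state = math.tanh(w_in * x + w_state * state)
        outputs.append(state)
    return outputs


# The same input sequence yields different outputs for different initial states.
out_zero = run_sequentially([1.0, 0.5], initial_state=0.0)
out_warm = run_sequentially([1.0, 0.5], initial_state=1.0)
```

This is why an initial state value needs to accompany a sequence: without it, two parties processing the same data could obtain different output sequences.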

11. The method according to any one of claims 1 to 10, characterized in that the neural network model is used for compression and/or prediction of channel state information.

12. The method according to any one of claims 1 to 11, characterized in that the T data sequences include a third sequence, and obtaining the third sequence in the T data sequences comprises: obtaining third information, wherein the third information indicates at least one of the following: a start data identifier of the third sequence, a length of the third sequence, or an end data identifier of the third sequence.

13. The method according to claim 12, characterized in that the third information further indicates an initial value of a model state corresponding to the third sequence.

14. The method according to any one of claims 1 to 13, characterized in that, when the first data sequence is used as the input data sequence of the neural network model, the method further comprises: obtaining S data sequences, wherein each of the S data sequences contains at least one data point, and S is an integer greater than 1; and determining a second data sequence based on the S data sequences, wherein the second data sequence contains Q data points, Q is an integer greater than 1, the second data sequence is used as a label of the first data sequence, and multiple data points in the first data sequence are to be input into the neural network model sequentially.

15. The method according to claim 14, characterized in that each data point in each of the S data sequences is associated with at least one data point in the T data sequences.
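The pairing of an input data sequence with its label sequence in claims 14 and 15 can be sketched as below. The sketch assumes, purely for illustration, that S equals T, that the same indices select sequences on both sides, and that the association is one-to-one so that Q equals M; the application permits looser associations, and the function name `make_training_pair` is hypothetical.

```python
from typing import List, Sequence, Tuple


def make_training_pair(
    input_sequences: Sequence[Sequence[float]],
    label_sequences: Sequence[Sequence[float]],
    selected: Sequence[int],
) -> Tuple[List[float], List[float]]:
    """Build (first data sequence, second data sequence) as input and label.

    Both sides are concatenated using the same selection, which is an
    assumed way of keeping each label aligned with its input data point.
    """
    inputs = [x for i in selected for x in input_sequences[i]]
    labels = [y for i in selected for y in label_sequences[i]]
    if len(inputs) != len(labels):
        raise ValueError("this sketch assumes one label per input data point")
    return inputs, labels


t_sequences = [[1.0, 2.0], [3.0]]
s_sequences = [[10.0, 20.0], [30.0]]
inputs, labels = make_training_pair(t_sequences, s_sequences, [0, 1])
```

Training would then compare the model's output data sequence against `labels` as a whole, rather than one data point in isolation.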

16. The method according to claim 14 or 15, characterized in that the S data sequences include a fourth sequence, and obtaining the fourth sequence in the S data sequences comprises: obtaining fourth information, wherein the fourth information indicates at least one of the following: a start data identifier of the fourth sequence, a length of the fourth sequence, or an end data identifier of the fourth sequence.

17. The method according to claim 16, characterized in that the fourth information further indicates an initial value of a model state corresponding to the fourth sequence.

18. The method according to claim 16 or 17, characterized in that the S data sequences further include a fifth sequence, and obtaining the fifth sequence in the S data sequences comprises: obtaining fifth information, wherein the fifth information indicates at least one of the following: a start data identifier of the fifth sequence, a length of the fifth sequence, or an end data identifier of the fifth sequence.

19. The method according to claim 18, characterized in that the fifth information further indicates an initial value of a model state corresponding to the fifth sequence.

20. A device for acquiring training data, characterized in that the device includes functional modules for implementing the method according to any one of claims 1 to 19.

21. A device for acquiring training data, characterized in that the device comprises one or more processors, the one or more processors being configured to implement the method according to any one of claims 1 to 19.

22. A training system for a neural network model, characterized in that the system includes a training data acquisition device for implementing the method according to any one of claims 1 to 19.

23. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method according to any one of claims 1 to 19.

24. A computer program product, characterized in that the computer program product includes a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 19.