Artificial intelligence network model training method, positioning method, device and communication equipment

CN117574951B · Active · Publication Date: 2026-06-26 · VIVO MOBILE COMM CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
VIVO MOBILE COMM CO LTD
Filing Date
2022-08-03
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

In existing technologies, training AI network models relies on a large number of labeled training samples, which leads to high collection costs and limits the practical application of AI technology in areas such as localization.

Method used

A Siamese network model is trained using unlabeled sample data, and a unary network model with the same structure as a branch of the Siamese network model is trained using labeled sample data. The parameters of the Siamese network model are calibrated through an iterative process, reducing the dependence on labeled samples.

Benefits of technology

This reduces the complexity of collecting labeled sample data, enabling AI technology to be applied in scenarios where it is difficult to obtain a large number of labeled samples, and improving the training efficiency and accuracy of AI network models.



Abstract

The application discloses an artificial intelligence (AI) network model training method and device, a positioning method and device, and communication equipment. The method comprises the following steps: obtaining a first sample data set and a second sample data set, wherein the first sample data set comprises at least two first sample data without labels, and the second sample data set comprises at least two second sample data with labels; performing a target iteration process based on the first sample data set and the second sample data set; when an iteration termination condition is met, determining that a first AI network model or a second AI network model obtained by the target iteration process is a target AI network model; the target iteration process comprises the following steps: performing a first training process on the first AI network model based on the first sample data set, and performing a second training process on the second AI network model based on the second sample data set, wherein the first training process is used for training first parameters of each branch in the first AI network model, and the second training process is used for calibrating the first parameters.

Description

Technical Field

[0001] This application belongs to the field of communication technology, and specifically relates to an artificial intelligence (AI) network model training method, positioning method, device, and communication equipment.

Background Technology

[0002] In related technologies, research has been conducted on using artificial intelligence (AI) network models in communication networks to perform the functions of certain modules.

[0003] AI network models need to be trained based on training samples. During training, the accuracy of the AI network model largely depends on the size and quality of the training sample dataset. Obtaining a large number of labeled training samples usually requires considerable human resources and time. Therefore, the limited number of labeled training samples restricts the practical application of AI technology in many fields, including positioning.

Summary of the Invention

[0004] This application provides an AI network model training method, localization method, apparatus, and communication device, which can train AI network models based on a training sample dataset that is partly unlabeled and partly labeled. This can greatly reduce the number of labeled training samples required in the process of training AI network models and expand the application scope of AI technology.

[0005] In a first aspect, an AI network model training method is provided, which includes:

[0006] Obtain a first sample dataset and a second sample dataset, wherein the first sample dataset includes at least two first sample data and the first sample data has no label, and the second sample dataset includes at least two second sample data and the second sample data has a label;

[0007] Based on the first sample dataset and the second sample dataset, execute the target iteration process;

[0008] If the iteration termination condition is met, the target iteration process is terminated, and the first AI network model or the second AI network model obtained by the target iteration process is determined as the target AI network model.

[0009] The target iteration process includes: a first training process for the first AI network model based on the first sample dataset, and a second training process for the second AI network model based on the second sample dataset. The first AI network model is a Siamese network model, and the second AI network model is a unary network model. The unary network model has the same structure as a branch of the Siamese network model. The first training process is used to train the first parameters of each branch in the first AI network model, and the second training process is used to calibrate the first parameters.

[0010] In a second aspect, an AI network model training device is provided, the device comprising:

[0011] The first acquisition module is used to acquire a first sample dataset and a second sample dataset, wherein the first sample dataset includes at least two first sample data and the first sample data has no label, and the second sample dataset includes at least two second sample data and the second sample data has a label;

[0012] The execution module is used to execute the target iteration process based on the first sample dataset and the second sample dataset;

[0013] The first determining module is used to terminate the target iteration process when the iteration termination condition is met, and to determine the first AI network model or the second AI network model obtained by the target iteration process as the target AI network model.

[0014] The target iteration process includes: a first training process for the first AI network model based on the first sample dataset, and a second training process for the second AI network model based on the second sample dataset. The first AI network model is a Siamese network model, and the second AI network model is a unary network model. The unary network model has the same structure as a branch of the Siamese network model. The first training process is used to train the first parameters of each branch in the first AI network model, and the second training process is used to calibrate the first parameters.

[0015] In a third aspect, a positioning method is provided, which includes:

[0016] Obtain the target channel information of the target terminal;

[0017] The target channel information is processed based on the target AI network model to obtain the positioning information of the target terminal. The target AI network model is obtained by training a Siamese network model based on unlabeled first sample data and training a unary network model based on labeled second sample data. The unary network model has the same structure as a branch of the Siamese network model.

[0018] In a fourth aspect, a positioning device is provided, the device comprising:

[0019] The second acquisition module is used to acquire the target channel information of the target terminal;

[0020] The processing module is used to process the target channel information based on the target AI network model to obtain the positioning information of the target terminal. The target AI network model is obtained by training a Siamese network model based on unlabeled first sample data and training a unary network model based on labeled second sample data. The unary network model has the same structure as a branch of the Siamese network model.

[0021] In a fifth aspect, a communication device is provided, including a processor and a memory, where the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the method described in the first or third aspect.

[0022] In a sixth aspect, a communication device is provided, including a processor and a communication interface. The communication interface is used to acquire a first sample dataset and a second sample dataset, wherein the first sample dataset includes at least two first sample data and the first sample data has no label, and the second sample dataset includes at least two second sample data and the second sample data has a label; the processor is used to execute a target iteration process based on the first sample dataset and the second sample dataset, to terminate the target iteration process when an iteration termination condition is met, and to determine the first AI network model or the second AI network model obtained by the target iteration process as the target AI network model; wherein the target iteration process includes: performing a first training process on the first AI network model based on the first sample dataset, and performing a second training process on the second AI network model based on the second sample dataset, wherein the first AI network model is a Siamese network model, the second AI network model is a unary network model, the unary network model has the same structure as a branch of the Siamese network model, the first training process is used to train the first parameters of each branch in the first AI network model, and the second training process is used to calibrate the first parameters; or

[0023] The communication interface is used to acquire target channel information of the target terminal; the processor is used to process the target channel information based on the target AI network model to obtain the positioning information of the target terminal, wherein the target AI network model is obtained by training a Siamese network model based on unlabeled first sample data and training a unary network model based on labeled second sample data, and the unary network model has the same structure as a branch of the Siamese network model.

[0024] In a seventh aspect, a readable storage medium is provided, on which a program or instructions are stored, which, when executed by a processor, implement the steps of the method described in the first or third aspect.

[0025] In an eighth aspect, a chip is provided, the chip including a processor and a communication interface coupled to the processor, the processor being used to run programs or instructions to implement the methods described in the first or third aspect.

[0026] In a ninth aspect, a computer program/program product is provided, the computer program/program product being stored in a storage medium and executed by at least one processor to implement the steps of the AI network model training method described in the first aspect, or the steps of the positioning method described in the third aspect.

[0027] In this embodiment, a Siamese network model is trained based on unlabeled sample data, and a unary network model with the same structure as a branch of the Siamese network model is trained using labeled sample data. The training result of the unary network model based on labeled sample data can be used to calibrate the training result of the Siamese network model. Thus, after at least one iteration of the Siamese network model and the unary network model, the goal of training an AI network model using unlabeled sample data and calibrating the AI network model using labeled sample data can be achieved. Part of the sample data used in this iteration process is unlabeled. Compared with related technologies that train AI network models entirely on labeled sample data, the amount of labeled sample data used in this embodiment is greatly reduced, thereby reducing the complexity of collecting labeled sample data and enabling AI technology to be applied to scenarios where it is difficult to collect a large amount of labeled sample data.

Description of the Drawings

[0028] Figure 1 is a schematic diagram of the structure of a wireless communication system to which the embodiments of this application can be applied;

[0029] Figure 2a is a schematic diagram of a neural network model architecture;

[0030] Figure 2b is a schematic diagram of a neuron;

[0031] Figure 3 is a flowchart of an AI network model training method provided in an embodiment of this application;

[0032] Figure 4 is a schematic diagram of the Siamese network model in the embodiments of this application;

[0033] Figure 5 is a schematic diagram of the framework of the target iteration process in the embodiments of this application;

[0034] Figure 6 is a schematic diagram of the structure of an AI network model training device provided in an embodiment of this application;

[0035] Figure 7 is a flowchart of a positioning method provided in an embodiment of this application;

[0036] Figure 8 is a schematic diagram of the structure of a positioning device provided in an embodiment of this application;

[0037] Figure 9 is a schematic diagram of the structure of a communication device provided in an embodiment of this application.

Detailed Description of the Embodiments

[0038] The technical solutions of the embodiments of this application will be clearly described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those skilled in the art based on the embodiments of this application are within the scope of protection of this application.

[0039] The terms "first," "second," etc., used in the specification and claims of this application are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that such terms can be used interchangeably where appropriate so that embodiments of this application can be implemented in orders other than those illustrated or described herein, and the objects distinguished by "first" and "second" are generally of the same class, not limited in number; for example, a first object can be one or more. Furthermore, in the specification and claims, "and / or" indicates at least one of the connected objects, and the character " / " generally indicates that the preceding and following objects are in an "or" relationship.

[0040] It is worth noting that the technologies described in this application are not limited to Long Term Evolution (LTE) / LTE-Advanced (LTE-A) systems, but can also be used in other wireless communication systems, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single-carrier Frequency Division Multiple Access (SC-FDMA), and other systems. The terms "system" and "network" in this application are often used interchangeably, and the described technologies can be used with the systems and radio technologies mentioned above, as well as with other systems and radio technologies. The following description describes New Radio (NR) systems for illustrative purposes, and NR terminology is used in most of the following description; however, these technologies can also be applied to systems other than NR systems, such as 6th Generation (6G) communication systems.

[0041] Figure 1 illustrates a block diagram of a wireless communication system applicable to embodiments of this application. The wireless communication system includes a terminal 11 and a network-side device 12. The terminal 11 can be a mobile phone, tablet computer, laptop computer, personal digital assistant (PDA), handheld computer, netbook, ultra-mobile personal computer (UMPC), mobile internet device (MID), augmented reality (AR) / virtual reality (VR) device, robot, wearable device, vehicle-mounted device (VUE), pedestrian terminal (PUE), smart home device (a home device with wireless communication capabilities, such as a refrigerator, television, washing machine, or furniture), game console, personal computer (PC), ATM, or self-service machine, etc. Wearable devices include smartwatches, smart bracelets, smart headphones, smart glasses, smart jewelry (smart bangles, smart chains, smart rings, smart necklaces, smart anklets, etc.), smart wristbands, smart clothing, etc. It should be noted that the specific type of terminal 11 is not limited in this embodiment. The network-side device 12 may include access network equipment or core network equipment. Access network equipment 12 may also be referred to as radio access network equipment, a radio access network (RAN), a radio access network function, or a radio access network unit. Access network equipment 12 may include base stations, WLAN access points, or WiFi nodes, etc. A base station may be referred to as a Node B, evolved Node B (eNB), access point, base transceiver station (BTS), radio base station, radio transceiver, Basic Service Set (BSS), Extended Service Set (ESS), Home Node B, Home evolved Node B, Transmission and Reception Point (TRP), or any other suitable term in the field; as long as the same technical effect is achieved, the base station is not limited to a specific technical term. It should be noted that this application embodiment only uses a base station in an NR system as an example for description and does not limit the specific type of base station.

[0042] Artificial intelligence (AI) has been widely applied in various fields. Integrating AI into wireless communication networks to improve throughput, increase user capacity, and reduce latency is an important task for future wireless communication networks. AI network models can be implemented in various ways, such as neural networks, decision trees, support vector machines, and Bayesian classifiers. This application uses neural networks as an example for illustration, but does not limit the specific type of AI network model.

[0043] Generally, the AI algorithms and network models selected vary depending on the type of problem to be solved. The main method for improving 5G network performance using AI network models is to enhance or replace existing algorithms or processing modules with neural network-based algorithms and models. In specific scenarios, neural network-based algorithms and models can achieve better performance than deterministic algorithms. Commonly used neural networks include deep neural networks, convolutional neural networks, and recurrent neural networks. Existing AI tools can be used to build, train, and validate neural networks.

[0044] In applications, AI network models need to be trained based on a large amount of real labeled sample data.

[0045] For example, the neural network model shown in Figure 2a includes an input layer, hidden layers, and an output layer, and can predict the possible output (Y) based on the input information (X_1 to X_n) obtained from the input layer. A neural network model consists of a large number of neurons. As shown in Figure 2b, the parameters of a neuron include the inputs a_1 to a_K, the weights w, the bias b, and the activation function σ(z), from which the output value a is obtained. Common activation functions include the sigmoid function, the hyperbolic tangent (tanh) function, and the rectified linear unit (ReLU) function. The z in the function σ(z) can be calculated by the following formula:

[0046] $z = a_1 w_1 + \cdots + a_k w_k + \cdots + a_K w_K + b$

[0047] Where K represents the total number of input parameters.
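The neuron computation in [0045] to [0047] can be sketched in a few lines of Python. This is a minimal illustration rather than anything specified by the application: the function names `neuron_output`, `sigmoid`, and `relu` and the numeric values are assumptions chosen for the example.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    """Rectified linear unit activation."""
    return max(0.0, z)


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Compute a = sigma(z) with z = a_1*w_1 + ... + a_K*w_K + b."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length K")
    z = sum(a_k * w_k for a_k, w_k in zip(inputs, weights)) + bias
    return activation(z)


# Example with K = 3 inputs (values are illustrative only).
# Here z = 1.0*0.5 + 2.0*(-0.25) + 3.0*0.0 + 0.0 = 0.0, so sigmoid(z) = 0.5.
a = neuron_output([1.0, 2.0, 3.0], [0.5, -0.25, 0.0], bias=0.0)
```

Passing `activation=relu` or `activation=math.tanh` swaps in one of the other activation functions named in [0045].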

[0048] The parameters of a neural network are optimized using optimization algorithms. An optimization algorithm is a type of algorithm that minimizes or maximizes an objective function (sometimes called a loss function). The objective function is often a mathematical combination of model parameters and data. For example, given data X and its corresponding label Y, we construct a neural network model f(·). With this model, we can obtain the predicted output f(x) from the input x and calculate the difference between the predicted value and the true value, (f(x) - Y), which is the loss function. The goal is to find suitable w and b that minimize the value of the loss function. The smaller the loss value, the closer the model is to reality.

[0049] Most common optimization algorithms are based on backpropagation. The basic idea of backpropagation is that the learning process consists of two parts: forward propagation of the signal and backward propagation of the error. During forward propagation, input samples are introduced from the input layer, processed layer by layer by the hidden layers, and then transmitted to the output layer. If the actual output of the output layer does not match the expected output, the process transitions to backpropagation. Backpropagation involves propagating the output error back to the input layer through the hidden layers in a certain form, distributing the error to all units in each layer, thus obtaining the error signal for each unit. This error signal serves as the basis for adjusting the weights of each unit. This process of adjusting the weights through forward and backward propagation is cyclical. This continuous adjustment of weights is the learning and training process of the network. This process continues until the error in the network output is reduced to an acceptable level, or until the predetermined number of learning iterations is reached.
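The forward and backward passes described in [0049] can be illustrated with a deliberately tiny network. The sketch below assumes one scalar input, one tanh hidden unit, a linear output, a squared-error loss, and plain gradient descent; none of these choices comes from the application, and the names `forward`, `backward`, and `train` are hypothetical.

```python
import math


def forward(params, x):
    """Forward pass: input -> one tanh hidden unit -> linear output."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    y = w2 * h + b2
    return h, y


def loss_fn(params, x, target):
    """Squared error between the network output and the expected output."""
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backward(params, x, target):
    """Backward pass: propagate the output error back through the layers."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - target              # error signal at the output unit
    dw2, db2 = dy * h, dy        # gradients for the output layer
    dh = dy * w2                 # error passed back to the hidden unit
    dz = dh * (1.0 - h * h)      # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    dw1, db1 = dz * x, dz        # gradients for the hidden layer
    return [dw1, db1, dw2, db2]


def train(params, x, target, lr=0.1, max_steps=200, tol=1e-8):
    """Repeat forward/backward passes until the loss is small or steps run out."""
    params = list(params)
    for _ in range(max_steps):
        if loss_fn(params, x, target) < tol:
            break
        grads = backward(params, x, target)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


# Illustrative starting weights and a single training sample.
initial = [0.3, -0.1, 0.8, 0.2]
trained = train(initial, x=1.5, target=1.0)
```

The two stopping conditions in `train` mirror the two in the text: the error falls to an acceptable level (`tol`), or a predetermined number of iterations (`max_steps`) is reached.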

[0050] Common optimization algorithms include gradient descent, stochastic gradient descent (SGD), mini-batch gradient descent, the momentum method, Nesterov accelerated gradient (a momentum-based variant of gradient descent), adaptive gradient descent (Adagrad), adaptive learning rate adjustment (Adadelta), root mean square propagation (RMSprop), and adaptive moment estimation (Adam).

[0051] During error backpropagation, these optimization algorithms take the error/loss given by the loss function, compute its derivative/partial derivative with respect to the current neuron, combine it with factors such as the learning rate and previous gradients/derivatives/partial derivatives to obtain the gradient, and then pass the gradient to the previous layer.
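As one concrete instance of combining the current gradient with the learning rate and previous gradients, the classical momentum update can be sketched as follows. The function name `sgd_momentum_step`, the hyperparameter values, and the toy objective f(p) = p^2 are assumptions made for illustration only.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """One update: v <- momentum * v - lr * g, then p <- p + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Minimise f(p) = p^2 (gradient 2p) as a stand-in for a loss function.
params, velocity = [1.0], [0.0]
for _ in range(200):
    grads = [2.0 * p for p in params]
    params, velocity = sgd_momentum_step(params, grads, velocity, lr=0.05)
```

The `velocity` list is what carries information from previous gradients into the current step; setting `momentum=0.0` reduces the update to plain gradient descent.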

[0052] In related technologies, AI network models must be trained on a large amount of labeled sample data, and collecting labeled sample data requires substantial human and financial resources. By comparison, unlabeled sample data is easier to obtain.

[0053] In this embodiment, AI network models can be trained using a combination of unlabeled and labeled sample data, which greatly reduces the amount of labeled sample data used when training AI network models.

[0054] The AI network model training method, AI network model training device, positioning method, positioning device, and communication equipment provided in this application will be described in detail below with reference to the accompanying drawings and through some embodiments and application scenarios.

[0055] Referring to Figure 3, an embodiment of this application provides an AI network model training method. As shown in Figure 3, the AI network model training method may include the following steps:

[0056] Step 301: Obtain a first sample dataset and a second sample dataset, wherein the first sample dataset includes at least two first sample data and the first sample data has no label, and the second sample dataset includes at least two second sample data and the second sample data has a label.

[0057] Taking as an example the case where the target AI network model is used to position terminals in a wireless communication network, the sample data in the first and second sample datasets may include at least one of the following: channel state information and channel parameters of Transmission and Reception Points (TRPs). For example, the sample data includes the channel impulse responses of at least two TRPs, as well as channel parameters such as delay, angle, and received power. The target AI network model can estimate the terminal's location information based on this sample data, or estimate intermediate features that assist in calculating the terminal's location. In other words, the target AI network model can convert high-dimensional input information into low-dimensional (e.g., 2D or 3D) location information.

[0058] At this time, the label of the second sample data can be a location label, for example: the label can be the actual location information of the terminal that detected the channel state information in the second sample data.

[0059] Optionally, the first AI network model, the second AI network model, and the target AI network model are localization AI network models;

[0060] The first sample data includes at least one of the following:

[0061] Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum;

[0062] Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer;

[0063] And / or,

[0064] The label of the second sample data includes the actual location corresponding to the second sample data, and the second sample data includes at least one of the following:

[0065] Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum;

[0066] Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

[0067] The aforementioned channel state information can be high-dimensional channel state information. During the training of the first and second AI network models based on this channel state information, the channel state information can be input into the first and second AI network models, enabling them to map the high-dimensional channel state information to a low-dimensional space. For the principle of this mapping, reference can be made to positioning AI network models that determine the terminal location from channel state information in wireless communication networks. For example, based on the idea of manifold learning, high-dimensional channel state information (CSI) is mapped to a low-dimensional manifold space (such as a two-dimensional space, which can be a virtual space) with the same dimension as the location. This mapping is considered to preserve proximity: nearby locations in the actual space have similar CSI, and similar CSI is mapped to nearby points in the low-dimensional manifold space. Subsequent location-based services can then use locations in the manifold space in place of actual locations, which will not be elaborated further here.

[0068] In related technologies, to ensure the positioning accuracy of a positioning AI network model meets requirements, a large amount of readily available training data with real location labels is needed. This necessitates significant human and financial resources to collect location labels for each data point. However, in this embodiment, a small amount of training data with real location labels can be collected, along with a large amount of unlabeled training data. Relatively speaking, unlabeled data is easier to obtain, such as collecting only channel state information without location labels.

[0069] Of course, in practice, the target AI network model mentioned above can also be an AI network model used to implement other functions in wireless communication networks. In this case, the content of the sample data in the first and second sample datasets can be adjusted as needed, without specific limitations.

[0070] Step 302: Based on the first sample dataset and the second sample dataset, execute a target iteration process, wherein the target iteration process includes: performing a first training process on the first AI network model based on the first sample dataset, and performing a second training process on the second AI network model based on the second sample dataset, wherein the first AI network model is a Siamese network model, the second AI network model is a unary network model, the unary network model has the same structure as a branch of the Siamese network model, the first training process is used to train the first parameters of each branch in the first AI network model, and the second training process is used to calibrate the first parameters.

[0071] The Siamese network model can be a network model comprising at least two branches, wherein the at least two branches have identical structures and parameters, and each branch can independently convert the input high-dimensional channel information into low-dimensional location information. For example, the Siamese network model shown in Figure 4 comprises two branches with the same structure and parameters. During the first training process, the Siamese network receives two input samples, namely sample X_j and sample X_k. Sample X_j is input to the first branch, which outputs the result Y_j, and sample X_k is input to the second branch, which outputs the result Y_k.
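The two-branch structure with shared parameters can be sketched as a single branch function applied to both inputs. This is an illustrative simplification: the branch here is a plain linear map, whereas the application does not specify the branch architecture, and the names `branch` and `siamese_forward` and all numeric values are hypothetical.

```python
def branch(params, x):
    """One branch: a linear map from a high-dimensional input to a low-dimensional output.

    `params` is a list of rows, one row of weights per output dimension.
    """
    return [sum(w * xi for w, xi in zip(row, x)) for row in params]


def siamese_forward(shared_params, x_j, x_k):
    """Apply the same branch, with the same parameters, to both inputs."""
    y_j = branch(shared_params, x_j)
    y_k = branch(shared_params, x_k)
    return y_j, y_k


# Hypothetical 4-dimensional channel features mapped to 2-D outputs.
shared = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0]]
y_j, y_k = siamese_forward(shared, [0.5, 1.0, 2.0, 3.0], [0.1, 0.2, 0.3, 0.4])
```

Because both outputs come from one `shared` parameter set, a unary model with the same structure can reuse those parameters directly by calling `branch(shared, x)` on a single input.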

[0072] The unary network model described above has only one branch, and its structure is the same as that of one branch of the Siamese network model. Thus, the physical meaning of the unary network model is the same as that of each branch of the Siamese network model. For example, the unary network model and each branch of the Siamese network model can convert the input high-dimensional channel information into low-dimensional location information.

[0073] In this step, the first training process is used to train the model parameters (i.e., the first parameters) of the first AI network model, and the second training process is used to train the model parameters (i.e., the second parameters) of the second AI network model.

[0074] In one possible implementation, during the first iteration of the target iteration process, the initial value of the first parameter can be determined based on the second parameter trained by the second training process, so that the first parameter is calibrated based on the second training process. During the second iteration of the target iteration process, the initial value of the second parameter can be updated based on the first parameter from the first iteration, and the second training process can be executed again to optimize the value of the second parameter. The second parameter trained by this second training process is then used as the initial value of the first parameter for the second iteration, and this process is repeated until the iteration termination condition is met. At this point, the final target AI network model can be determined based on either the first AI network model or the second AI network model from the last iteration.

[0075] As an optional implementation, the step of performing the target iteration process based on the first sample dataset and the second sample dataset includes:

[0076] The second training process is performed on the loss function of the second AI network model based on H second sample data to determine the value of the second parameter in the second AI network model. The loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model, where H is a positive integer.

[0077] The first training process is performed on the loss function of the first AI network model based on N first sample data groups to determine the values of the first parameters of the two branches in the first AI network model. The initial values of the first parameters of the two branches in the first AI network model are equal to the values of the second parameters of the second AI network model. Each first sample data group includes two first sample data, which are respectively used as inputs to the two branches to obtain two first output results. The loss function of the first AI network model is the distance between the first difference information and the second difference information. The first difference information indicates the distance between the two first sample data in the same first sample data group, and the second difference information indicates the distance between the two first output results corresponding to those two first sample data. N is a positive integer.

[0078] Based on the value of the first parameter of the target branch in the first AI network model, update the value of the second parameter in the second AI network model.

[0079] The number of iterations in the target iteration process can be one or at least two. Thus, when the number of iterations in the target iteration process is greater than one, the iteration order in the target iteration process can be represented as: second training process → first training process → second training process...

[0080] For example, as shown in the schematic diagram of the target iteration process in Figure 5, a second training process can first be performed on the unary network model based on the second training sample data to obtain the second parameters. Parameters are then assigned to the Siamese network model based on these second parameters. A first training process is then performed on the Siamese network model with the assigned parameters based on the first training sample data, and the first parameters after the first training process are extracted. Parameters are then assigned to the unary network model based on these first parameters, and this process is iterated until a preset iteration termination condition is met. Parameter assignment refers to assigning parameters between network models with the same structure. For example, if there are at least two second parameters, then because the unary network model and each branch of the Siamese network model have the same structure, the first parameters and the second parameters can be mapped one-to-one. Therefore, when parameters are assigned to the Siamese network model based on the second parameters, the values of the second parameters replace the values of their corresponding first parameters.
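
The alternating procedure described above can be sketched with a deliberately tiny model. The sketch assumes a one-parameter branch y = w * x, squared-error losses, plain gradient steps, and a stand-in termination test; the function names, learning rate, and data are all hypothetical and are not taken from this application.

```python
from typing import Dict, List, Tuple

Params = Dict[str, float]


def train_unary(params: Params, labeled: List[Tuple[float, float]], lr: float = 0.05) -> Params:
    """Second training process: gradient steps on (w * x - label) ** 2."""
    w = params["w"]
    for x, label in labeled:
        w -= lr * 2.0 * (w * x - label) * x
    return {"w": w}


def train_siamese(params: Params, pairs: List[Tuple[float, float]], lr: float = 0.05) -> Params:
    """First training process: both branches share w; loss is (|x_j - x_k| - |y_j - y_k|) ** 2."""
    w = params["w"]
    for x_j, x_k in pairs:
        d_in = abs(x_j - x_k)            # first difference information
        d_out = abs(w * x_j - w * x_k)   # second difference information
        sign = 1.0 if w >= 0 else -1.0
        w -= lr * 2.0 * (d_in - d_out) * (-d_in * sign)
    return {"w": w}


def target_iteration(
    labeled: List[Tuple[float, float]],
    pairs: List[Tuple[float, float]],
    max_iterations: int = 20,
    tolerance: float = 1e-3,
) -> Params:
    unary: Params = {"w": 0.0}
    for _ in range(max_iterations):
        unary = train_unary(unary, labeled)      # second training process
        siamese = dict(unary)                    # assign second parameters to the Siamese branches
        siamese = train_siamese(siamese, pairs)  # first training process
        unary = dict(siamese)                    # assign first parameters back to the unary model
        worst = max(abs(unary["w"] * x - label) for x, label in labeled)
        if worst < tolerance:                    # stand-in iteration termination condition
            break
    return unary


labeled_samples = [(1.0, 1.0), (2.0, 2.0)]              # (input, label)
unlabeled_pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 2.5)]  # groups of two unlabeled inputs
result = target_iteration(labeled_samples, unlabeled_pairs)
print(result)
```

In this toy setting the unlabeled pairs only constrain the magnitude of w, since distances are preserved for both w = 1 and w = -1, while the labeled samples fix its sign. This mirrors the calibrating role the text assigns to the second training process.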

[0081] It is worth noting that the value of H can be equal to 1 or greater than or equal to 2, and the number of times the second AI network model is trained in the second training process can be 1 or at least 2. For example, when H is equal to 2 and the number of iterations in the target iteration process is greater than 1, if the second training process trains the second AI network model twice, and uses one second sample data each time, then the iteration order in the target iteration process can be expressed as: train the second AI network model -> train the second AI network model -> train the first AI network model -> train the second AI network model -> train the second AI network model...

[0082] Of course, when H is greater than or equal to 2, the second training process may also train the second AI network model only once, using all H second sample data in that single training. No specific limitation is made here.

[0083] Similar to the second training process, the value of N can also be equal to 1 or greater than or equal to 2. The number of times the first AI network model is trained in the first training process can be 1 or at least 2, and each training can use one or at least two sets of first sample data, which will not be elaborated on here.
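
The orderings discussed in the preceding paragraphs can be enumerated with a small helper. This is only an illustrative sketch: `split_steps`, `iteration_schedule`, and their parameters are hypothetical names, and it assumes the variant in which the second training process runs first.

```python
from typing import List, Tuple

Step = Tuple[str, int]  # (model being trained, number of samples or groups used in the step)


def split_steps(total: int, per_step: int) -> List[int]:
    """Split `total` samples into training steps of at most `per_step` samples."""
    if total < 1 or per_step < 1:
        raise ValueError("total and per_step must be positive integers")
    full, rest = divmod(total, per_step)
    return [per_step] * full + ([rest] if rest else [])


def iteration_schedule(
    iterations: int, h: int, h_per_step: int, n: int, n_per_step: int
) -> List[Step]:
    """List the training steps when each iteration runs the second process, then the first."""
    schedule: List[Step] = []
    for _ in range(iterations):
        schedule += [("second", k) for k in split_steps(h, h_per_step)]
        schedule += [("first", k) for k in split_steps(n, n_per_step)]
    return schedule


# H = 2 with one labeled sample per training step: second, second, first, second, second, ...
one_by_one = iteration_schedule(iterations=2, h=2, h_per_step=1, n=1, n_per_step=1)
# H = 2 with both labeled samples used in a single training step.
all_at_once = iteration_schedule(iterations=1, h=2, h_per_step=2, n=1, n_per_step=1)
print(one_by_one)
print(all_at_once)
```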

[0084] The purpose of the second training process is to make the loss function of the second AI network model as small as possible or less than a preset threshold.

[0085] The loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model. In other words, the second AI network model is used to transform the second sample data into the second output result. The closer the position reflected by the second output result is to the real position corresponding to the second sample data, the higher the accuracy of the second AI network model.
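
A minimal sketch of such a loss, assuming the distance is Euclidean and is averaged over the labeled samples, is given below. The function names and the numeric threshold are hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math
from typing import Sequence

Position = Sequence[float]


def supervised_loss(outputs: Sequence[Position], labels: Sequence[Position]) -> float:
    """Mean Euclidean distance between second output results and their labels."""
    if len(outputs) != len(labels) or not outputs:
        raise ValueError("outputs and labels must be non-empty and of equal length")
    return sum(math.dist(o, l) for o, l in zip(outputs, labels)) / len(outputs)


def training_goal_met(loss: float, threshold: float) -> bool:
    """True once the loss is below the preset threshold."""
    return loss < threshold


predicted = [(1.0, 1.0), (4.0, 6.0)]       # positions output by the model
true_positions = [(1.0, 1.0), (1.0, 2.0)]  # labels of the second sample data
loss = supervised_loss(predicted, true_positions)
print(loss, training_goal_met(loss, threshold=3.0))
```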

[0086] Optionally, the dimension of the second output result of the second AI network model is smaller than the dimension of the second sample data.

[0087] In other words, the second AI network model can project sample data from a high-dimensional space to a low-dimensional space. For example, it can determine the two-dimensional or three-dimensional location information of the terminal based on the channel state information between the terminal and multiple TRPs.

[0088] Furthermore, in this embodiment, the first AI network model includes two branches, each with its own input and output. In this case, each group of first sample data includes two first sample data, and each of the two serves as the input to its corresponding branch. The purpose of the first training process is to make the loss function of the first AI network model as small as possible or less than a preset threshold.

[0089] Optionally, the dimension of the first output result of each branch of the first AI network model is smaller than the dimension of the first sample data.

[0090] In other words, each branch of the first AI network model can project sample data from a high-dimensional space to a low-dimensional space.

[0091] The loss function of the first AI network model is the distance between the first difference information and the second difference information. The first difference information represents the difference between two first sample data in a set of first sample data. This difference is positively correlated with the distance between the positions corresponding to these two first sample data. That is, the greater the difference between the two sample data, the greater the distance between the high-dimensional spatial positions reflected by these two sample data.

[0092] Of course, the aforementioned first difference information can also be the distance between the features of two first sample data in a set of first sample data. The distance between the features of two first sample data is more conducive to data processing than the difference between two first sample data.

[0093] For example, the loss function of the first AI network model can include a class of functions that should make the distance D1 between two first sample data in high-dimensional space as close as possible to the distance D2 between two first output results mapped to low-dimensional space, and satisfy that D1-D2 is less than M. Considering that the distance between samples in high-dimensional space is not easy to calculate, the distance between the features of the first sample data can be used to replace the distance between the first sample data when calculating the distance, as shown in the formula L of the loss function below:

[0094]

[0095] Where: t(x) represents the feature of the first sample data x; the two inputs of the n-th term are the two samples in the n-th group of first sample data in a batch; f_θ(·) represents the first AI network model, which has the first parameter θ; the minimization over θ means finding the value of θ that minimizes the loss function; the two corresponding outputs are the first output results for those two samples; N represents the total number of first sample data included in the first sample dataset; M represents a hyperparameter; and the scaling function scales the distance between the first output results so that it is aligned in scale with the distance in the feature space, where the parameters of the scaling function are trainable, for example in a linear scaling. The optimal values of M and the other hyperparameters can be obtained through ablation experiments.
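
As a rough illustration of the kind of loss described in paragraphs [0093] and [0095], and not the formula of this application, the sketch below penalizes a pair only when the feature-space distance and the scaled output distance differ by more than the margin M. The hinge form, the linear scaling a * d + b, the identity feature function, and all names are assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def pairwise_loss(
    pairs: List[Tuple[Vector, Vector]],
    model: Callable[[Vector], Vector],
    feature: Callable[[Vector], Vector],
    margin: float,
    scale_a: float = 1.0,
    scale_b: float = 0.0,
) -> float:
    """Mean hinge penalty on |D1 - g(D2)| exceeding the margin, with g(d) = a * d + b."""
    total = 0.0
    for x_j, x_k in pairs:
        d1 = math.dist(feature(x_j), feature(x_k))  # distance between sample features
        d2 = math.dist(model(x_j), model(x_k))      # distance between branch outputs
        gap = abs(d1 - (scale_a * d2 + scale_b))
        total += max(gap - margin, 0.0)             # no penalty inside the margin
    return total / len(pairs)


# Hypothetical branch that keeps the first two coordinates of a 3-dimensional input.
keep_first_two = lambda x: list(x[:2])
identity = lambda x: list(x)

groups = [
    ([0.0, 0.0, 0.0], [3.0, 4.0, 0.0]),  # distance preserved by the branch
    ([0.0, 0.0, 0.0], [0.0, 0.0, 2.0]),  # distance collapsed by the branch
]
value = pairwise_loss(groups, keep_first_two, identity, margin=0.5)
print(value)
```

The first group contributes nothing because its distance is preserved exactly, while the second contributes its excess over the margin.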

[0096] The second difference information represents the distance between the low-dimensional positional information output by the two branches of the first AI network model, that is, the distance between the two first output results corresponding to a set of first sample data. Thus, based on the principle that a larger difference between the high-dimensional space samples of two terminals reflects a greater distance between them, and a smaller difference reflects a closer distance, the problem of comparing the difference between the input and output of the first AI network model can be transformed into comparing the distance between two inputs and the distance between two outputs. This simplifies the loss function of the first AI network model, thereby simplifying its training process.

[0097] It should be noted that, in implementation, the first sample data can include at least two types of data, such as the first-path angle, first-path delay, and first-path power of the CIR. In this case, the loss function of the first AI network model can be expressed as:

[0098]

[0099]

[0100]

[0101] Where D1 represents the second difference information; D2 represents the first difference information; and I represents the total number of features included in a first sample data.

[0102] It should be noted that the distances in the embodiments of this application, such as the distance between the first difference information and the second difference information, the distance between two first sample data, the distance between two first output results, and the distance between the second output result of the second AI network model and the corresponding label, can each be the mean absolute distance, the Euclidean distance, or the like, and are not specifically limited here.
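
For reference, the two distance measures named above can be written as follows. This is a generic sketch; the function names are illustrative.

```python
import math
from typing import Sequence


def mean_absolute_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Average of the absolute coordinate differences."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the sum of squared coordinate differences."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


origin, point = (0.0, 0.0), (3.0, 4.0)
print(mean_absolute_distance(origin, point), euclidean_distance(origin, point))
```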

[0103] In one possible implementation, during the first iteration of the target iteration process, the initial value of the second parameter can be determined based on the first parameter trained in the first training process. This allows the second parameter to be calibrated based on the first training process. During the second iteration of the target iteration process, the initial value of the first parameter can be updated based on the second parameter from the first iteration, and the first training process can be executed again to optimize the value of the first parameter. Then, the first parameter trained in the first training process is used as the initial value of the second parameter for the second iteration, and this process is repeated until the iteration termination condition is met. At this point, the final target AI network model can be determined based on either the first AI network model or the second AI network model from the last iteration.

[0104] As an optional implementation, the step of performing the target iteration process based on the first sample dataset and the second sample dataset includes:

[0105] The first training process is performed on the loss function of the first AI network model based on N first sample data groups to determine the values of the first parameters of the two branches in the first AI network model. Each first sample data group includes two first sample data, which are respectively used as inputs to the two branches to obtain two first output results. The loss function of the first AI network model is the distance between the first difference information and the second difference information. The first difference information indicates the distance between the two first sample data in the same first sample data group, and the second difference information indicates the distance between the two first output results corresponding to those two first sample data. N is a positive integer.

[0106] The second training process is performed on the loss function of the second AI network model based on H second sample data to determine the value of the second parameter in the second AI network model. The initial value of the second parameter is equal to the value of the first parameter of the target branch of the first AI network model. The loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model. H is a positive integer.

[0107] Based on the value of the second parameter, update the value of the first parameter of each branch in the first AI network model.

[0108] Where N can be an integer greater than 1, the iteration order in the target iteration process can be represented as: first training process -> second training process -> first training process...

[0109] For example, as shown in the schematic diagram of the target iteration process in Figure 5, the Siamese network model can first be trained based on the first training sample data to obtain the first parameters of each branch. Then, parameters are assigned to the unary network model according to the first parameters of one of the branches. The unary network model with the assigned parameters is then trained based on the second training sample data. The second parameters after the second training process are extracted, and parameters are assigned to the Siamese network model based on these second parameters. This process is repeated iteratively until the preset iteration termination condition is met.

[0110] In this embodiment, the first training process can train the first parameters of each branch in the first AI network model. The initial value of the second parameter of the second AI network model in the second training process can be equal to the first parameter after training in the first training process (e.g., the first parameter of one branch of the first AI network model, or the average of the first parameters of all branches of the first AI network model), or the second parameter can be related to the first parameter, for example, the second parameter can be the weighted value of the first parameters of all branches of the first AI network model, or the second parameter can be positively correlated with the first parameter of one branch of the first AI network model, etc. In this way, after training the second parameter through the second training process, the second parameter can be trained more accurately based on labeled data. Then, the second parameter after training in the second training process can be used to calibrate the first parameter, for example, updating the first parameter to the second parameter after training in the second training process, and re-executing the first training process -> second training process based on the updated first parameter.
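
The options listed above for deriving the initial second parameter from the branch parameters, and for writing the calibrated value back, can be sketched as follows. The sketch assumes each branch's parameters are stored as a flat list of floats and normalizes the weights in the weighted case; all function names are hypothetical.

```python
from typing import List, Sequence

Params = List[float]


def from_branch(branches: Sequence[Params], index: int = 0) -> Params:
    """Use the first parameters of one branch."""
    return list(branches[index])


def average_of_branches(branches: Sequence[Params]) -> Params:
    """Use the element-wise average of the first parameters of all branches."""
    return [sum(values) / len(branches) for values in zip(*branches)]


def weighted_of_branches(branches: Sequence[Params], weights: Sequence[float]) -> Params:
    """Use a weighted combination of the branches, normalized by the total weight."""
    if len(weights) != len(branches):
        raise ValueError("one weight per branch is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [sum(w * v for w, v in zip(weights, values)) / total for values in zip(*branches)]


def calibrate_branches(branches: Sequence[Params], second: Params) -> List[Params]:
    """Update every branch to the trained second parameters."""
    return [list(second) for _ in branches]


branch_params = [[1.0, 2.0], [3.0, 4.0]]
print(from_branch(branch_params))
print(average_of_branches(branch_params))
print(weighted_of_branches(branch_params, [1.0, 3.0]))
print(calibrate_branches(branch_params, [0.5, 0.5]))
```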

[0111] It should be noted that this embodiment is similar to the previous optional embodiment of this application, except that in this embodiment, a first training process is performed first, and the parameters of the unary network model are assigned based on the first parameters trained in the first training process, and then a second training process is performed. In contrast, in the previous optional embodiment of this application, a second training process is performed first, and the parameters of the Siamese network model are assigned based on the second parameters trained in the second training process, and then a first training process is performed. For a detailed explanation of the first training process, the second training process, the loss function of the first AI network model, and the loss function of the second AI network model, please refer to the relevant descriptions in the previous optional embodiment of this application, which will not be repeated here.

[0112] It should be noted that during the first iteration of the target iteration process, the initial value of the first or second parameter mentioned above can also be determined in other ways, such as determining a random initial value based on a Gaussian distribution, or training the first or second parameter based on another AI network model and labeled sample data to obtain the initial value of the first or second parameter mentioned above. No specific limitation is made here.

[0113] As an optional implementation, the AI network model training method provided in this application embodiment further includes:

[0114] The value of the third parameter is obtained by training the third AI network model based on the second sample dataset. The third AI network model has the same structure as the second AI network model.

[0115] The initial value of the first parameter or the second parameter is determined based on the value of the third parameter.

[0116] The third AI network model can be a branch of the first AI network model or an AI network model with the same structure as the second AI network model. The third AI network model is trained based on the second sample dataset to determine its model parameters (i.e., the third parameters). In this way, the approximate range of values of the first or second parameters can be estimated, and the initial values of the first or second parameters can be determined accordingly. For example, the initial value of the first or second parameter can be determined to be equal to the third parameter; or, if the third AI network model has undergone at least two iterations of training based on the second sample dataset, the initial value of the first or second parameter can be determined to be equal to the third parameter of the last iteration; or, if at least two third parameters are obtained during the training of the third AI network model based on the second sample dataset, the initial value of the first or second parameter can be determined to be equal to the average or weighted average of the at least two third parameters.

[0117] In practice, if the first training process is performed first in the first iteration, the initial value of the first parameter can be determined based on the third parameter; if the second training process is performed first in the first iteration, the initial value of the second parameter can be determined based on the third parameter.

[0118] In this embodiment, the approximate range of the first parameter or the second parameter can be roughly determined based on the training process of another AI network model, which can reduce the number of iterations in the target iteration process.

[0119] Step 303: If the iteration termination condition is met, terminate the target iteration process and determine the first AI network model or the second AI network model obtained by the target iteration process as the target AI network model.

[0120] The iteration termination condition may include at least one of the following:

[0121] The accuracy of the second AI network model is greater than or equal to the first preset accuracy;

[0122] The accuracy of the first AI network model is greater than or equal to the second preset accuracy.

[0123] The accuracy of the second AI network model can be tested based on sample data in the test set. For example, if the test set includes second sample data, the degree of matching between the labels of the second sample data and the second output results of the second AI network model can be used to determine the accuracy. For example, if 100 second sample data are input into the second AI network model, and 90 of the 100 second output results of the second AI network model match the labels of the corresponding second sample data, then the accuracy of the second AI network model is determined to be 90%.
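
The accuracy calculation in this example can be sketched as below. Because the outputs are positions, the sketch assumes that an output "matches" its label when the Euclidean error does not exceed a tolerance; that tolerance, the function name, and the synthetic data are assumptions made for illustration. `math.dist` requires Python 3.8 or later.

```python
import math
from typing import Sequence

Position = Sequence[float]


def accuracy(outputs: Sequence[Position], labels: Sequence[Position], tolerance: float) -> float:
    """Fraction of outputs whose distance to the label is within the tolerance."""
    if len(outputs) != len(labels) or not outputs:
        raise ValueError("outputs and labels must be non-empty and of equal length")
    hits = sum(1 for o, l in zip(outputs, labels) if math.dist(o, l) <= tolerance)
    return hits / len(outputs)


# 100 labeled test samples; the last 10 outputs are deliberately off by 5 units.
test_labels = [(float(i), 0.0) for i in range(100)]
test_outputs = [(x, y) if i < 90 else (x + 5.0, y) for i, (x, y) in enumerate(test_labels)]

acc = accuracy(test_outputs, test_labels, tolerance=1.0)
print(acc)
```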

[0124] The accuracy of the first AI network model can also be determined based on the degree of matching between the labels of the second sample data and the second output of the second AI network model, which will not be elaborated here.

[0125] As one possible implementation, the accuracy of the second AI network model can be simultaneously verified during the second training process based on the second training sample data. If the accuracy of the second AI network model is greater than or equal to the first preset accuracy, the iteration process can be terminated immediately, and the second AI network model can be determined as the target AI network model.

[0126] As one possible implementation, the accuracy of the first AI network model and the second AI network model can be periodically checked. When at least one of them meets the corresponding accuracy, the iteration is terminated, and the target AI network model is determined based on the AI ​​network model that meets the corresponding accuracy.

[0127] Of course, in practice, the iteration termination conditions may also include: the number of iterations reaches the preset number of iterations; all sample data in the first sample dataset or the second sample dataset has been used, etc., which are not specifically limited here.
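
The termination conditions listed above can be combined into a single check, as in the following sketch. The `IterationState` container, its field names, and the threshold values are hypothetical; the text only states which conditions may be used.

```python
from dataclasses import dataclass


@dataclass
class IterationState:
    second_accuracy: float   # accuracy of the second AI network model
    first_accuracy: float    # accuracy of the first AI network model
    iterations_done: int     # iterations completed so far
    samples_exhausted: bool  # all first or second sample data have been used


def should_terminate(
    state: IterationState,
    first_preset_accuracy: float,
    second_preset_accuracy: float,
    max_iterations: int,
) -> bool:
    """True when at least one termination condition holds."""
    return (
        state.second_accuracy >= first_preset_accuracy
        or state.first_accuracy >= second_preset_accuracy
        or state.iterations_done >= max_iterations
        or state.samples_exhausted
    )


converged = IterationState(second_accuracy=0.92, first_accuracy=0.80,
                           iterations_done=3, samples_exhausted=False)
running = IterationState(second_accuracy=0.50, first_accuracy=0.50,
                         iterations_done=3, samples_exhausted=False)
print(should_terminate(converged, 0.9, 0.9, 10), should_terminate(running, 0.9, 0.9, 10))
```

Note that, following the text, the accuracy of the second AI network model is compared against the first preset accuracy and the accuracy of the first AI network model against the second preset accuracy.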

[0128] It is worth mentioning that, in implementation, the number of first sample data can be set to be greater than the number of second sample data. For example, the first sample dataset includes U first sample data, and the second sample dataset includes V second sample data, where U and V are integers greater than 1, and U is greater than V.

[0129] Furthermore, during the target iteration process, the number of first sample data used in the first training process can be greater than the number of second sample data used in the second training process. For example, after using 100 sets of first sample data to perform the first training process on the first AI network model, the parameters of the second AI network model are assigned according to the first parameters of the trained first AI network model. Then, 10 sets of second sample data are used to perform the second training process on the second AI network model after the parameter assignment, in order to calibrate the model parameters.

[0130] This increases the proportion of unlabeled sample data used in the target iteration process, thereby further reducing the amount of labeled sample data used in the target iteration process.

[0131] For example, the simulation results are shown in Table 1 below:

[0132] Table 1

[0133]

[0134] As shown in Table 1 above, semi-supervised learning refers to the AI ​​network model training method based on partially labeled sample data and partially unlabeled sample data provided in the embodiments of this application, while the supervised learning method refers to the AI ​​network model training method based on labeled sample data in related technologies.

[0135] As shown in Table 1 above, the simulation results indicate that, compared with the supervised learning methods in related technologies, the semi-supervised learning method provided in this application embodiment can improve the localization accuracy of the trained AI network model when using the same number (1000) of labeled sample data. Furthermore, compared with the supervised learning methods in related technologies, the number of labeled sample data used when training AI network models with similar localization accuracy is greatly reduced.

[0136] In implementation, the device executing the AI network model training method provided in this application embodiment can be a terminal in a wireless communication network, such as any of the types of terminal 11 listed in the embodiment shown in Figure 1. Alternatively, the device executing the AI network model training method provided in this application embodiment can be a network-side device, such as the network-side device 12 listed in the embodiment shown in Figure 1, or may be a core network device, which is not specifically limited herein.

[0137] The AI network model training method provided in this application can be executed by an AI network model training device. This application uses an AI network model training device executing the AI network model training method as an example to illustrate the AI network model training device provided in this application.

[0138] Referring to Figure 6, an embodiment of this application provides an AI network model training device. As shown in Figure 6, the AI network model training device 600 may include the following modules:

[0139] The first acquisition module 601 is used to acquire a first sample dataset and a second sample dataset, wherein the first sample dataset includes at least two first sample data and the first sample data has no label, and the second sample dataset includes at least two second sample data and the second sample data has a label;

[0140] Execution module 602 is used to execute a target iteration process based on the first sample dataset and the second sample dataset;

[0141] The first determining module 603 is used to terminate the target iteration process when the iteration termination condition is met, and to determine the first AI network model or the second AI network model obtained by the target iteration process as the target AI network model.

[0142] The target iteration process includes: a first training process for the first AI network model based on the first sample dataset, and a second training process for the second AI network model based on the second sample dataset. The first AI network model is a Siamese network model, and the second AI network model is a unary network model. The unary network model has the same structure as a branch of the Siamese network model. The first training process is used to train the first parameters of each branch in the first AI network model, and the second training process is used to calibrate the first parameters.

[0143] Optionally, execution module 602 includes:

[0144] The first iteration unit is used to perform a first training process on the loss function of the first AI network model based on N first sample data groups to determine the values ​​of the first parameters of the two branches in the first AI network model. The first sample data group includes two first sample data, which are respectively used as inputs to the two branches to obtain two first output results. The loss function of the first AI network model is the distance between the first difference information and the second difference information. The first difference information indicates the distance between two first sample data in the same first sample data group, and the second difference information indicates the distance between the two first output results corresponding to the two first sample data. N is a positive integer.

[0145] The second iteration unit is used to perform a second training process on the loss function of the second AI network model based on H second sample data, so as to determine the value of the second parameter in the second AI network model. The initial value of the second parameter is equal to the value of the first parameter of the target branch of the first AI network model. The loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model. H is a positive integer.

[0146] The first update unit is used to update the value of the first parameter of each branch in the first AI network model according to the value of the second parameter.

[0147] Optionally, execution module 602 includes:

[0148] The third iteration unit is used to perform a second training process on the loss function of the second AI network model based on H second sample data, so as to determine the value of the second parameter in the second AI network model. Here, the loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model, and H is a positive integer.

[0149] The fourth iteration unit is used to perform a first training process on the loss function of the first AI network model based on N first sample data groups to determine the values ​​of the first parameters of the two branches in the first AI network model. The initial values ​​of the first parameters of the two branches in the first AI network model are equal to the values ​​of the second parameters of the second AI network model. The first sample data group includes two first sample data, which are respectively used as inputs to the two branches to obtain two first output results. The loss function of the first AI network model is the distance between the first difference information and the second difference information. The first difference information indicates the distance between two first sample data in the same first sample data group, and the second difference information indicates the distance between the two first output results corresponding to the two first sample data. N is a positive integer.

[0150] The second update unit is used to update the value of the second parameter in the second AI network model according to the value of the first parameter of the target branch in the first AI network model.

[0151] Optionally, the structure of the second AI network model is the same as that of a branch of the first AI network model.

[0152] Optionally, the iteration termination condition includes at least one of the following:

[0153] The accuracy of the second AI network model is greater than or equal to the first preset accuracy;

[0154] The accuracy of the first AI network model is greater than or equal to the second preset accuracy.

[0155] Optionally, the AI ​​network model training device 600 also includes:

[0156] The training module is used to train the third AI network model based on the second sample dataset to obtain the values ​​of the third parameter. The third AI network model has the same structure as the second AI network model.

[0157] The second determining module is used to determine the initial value of the first parameter or the second parameter based on the value of the third parameter.

[0158] Optionally, the first AI network model, the second AI network model, and the target AI network model are localization AI network models;

[0159] The first sample data includes at least one of the following:

[0160] Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum;

[0161] Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

[0162] And / or,

[0163] The label of the second sample data includes the actual location corresponding to the second sample data, and the second sample data includes at least one of the following:

[0164] Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum;

[0165] Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

[0166] Optionally, the dimension of the first output result of each branch of the first AI network model is smaller than the dimension of the first sample data;

[0167] And / or,

[0168] The dimension of the second output result of the second AI network model is smaller than the dimension of the second sample data.
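
The following sketch shows a branch whose output has fewer dimensions than its input, together with the pairwise loss defined in the claims (the distance between the input-space distance and the output-space distance of a sample pair). It assumes Euclidean distance, an absolute difference between the two distances, and a linear branch; none of these choices is specified by the text. All function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def branch(weights, sample):
    """Hypothetical linear branch: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, sample)) for row in weights]


def siamese_pair_loss(weights, sample_a, sample_b):
    """Distance between the first and second difference information.

    First difference: distance between the two unlabeled inputs.
    Second difference: distance between the two branch outputs.
    Both branches share the same weights in this sketch.
    """
    first_difference = euclidean(sample_a, sample_b)
    second_difference = euclidean(branch(weights, sample_a),
                                  branch(weights, sample_b))
    return abs(first_difference - second_difference)


# A 4-dimensional input mapped to a 2-dimensional output (toy values).
weights = [[1, 0, 0, 0], [0, 1, 0, 0]]
loss = siamese_pair_loss(weights, [0, 0, 0, 0], [3, 4, 0, 0])
output = branch(weights, [3, 4, 0, 0])
```

A loss of zero here means the branch preserved the distance between the pair while reducing the dimension from 4 to 2.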

[0169] Optionally, the first sample dataset includes U first sample data, and the second sample dataset includes V second sample data, where U and V are integers greater than 1, and U is greater than V.

[0170] The AI network model training device 600 provided in this embodiment of the application can implement the processes implemented by the first device in the method embodiment shown in Figure 3 and can achieve the same beneficial effects. To avoid repetition, details are not described again here.

[0171] The AI network model training device in this embodiment of the application can be an electronic device, such as an electronic device with an operating system, or a component in an electronic device, such as an integrated circuit or a chip. The electronic device can be a terminal, a network-side device, or another device. For example, the terminal can include, but is not limited to, the types of terminal 11 listed above; the network-side device can include, but is not limited to, the types of network-side device 12 listed above; other devices can be servers, network-attached storage (NAS), etc., and this embodiment of the application imposes no specific limitation.

[0172] Referring to Figure 7, an embodiment of this application provides a positioning method. As shown in Figure 7, the positioning method may include the following steps:

[0173] Step 701: Obtain the target channel information of the target terminal.

[0174] In practice, the entity executing the positioning method provided in this application embodiment can be the target terminal or other communication devices, such as network-side devices or other terminals.

[0175] In one possible implementation, the execution subject of the positioning method provided in this application embodiment is the target terminal. In this case, the target terminal can obtain the above-mentioned target channel information by detecting the reference signal and performing channel estimation.

[0176] In another possible implementation, the execution subject of the positioning method provided in this application embodiment is a network-side device or other terminal. In this case, the network-side device or other terminal can obtain the target channel information by detecting the measurement information of the channel related to the target terminal and based on the principle of channel reciprocity, or receive the target channel information obtained by the target terminal through the detection of reference signals and the execution of channel estimation.

[0177] As an optional implementation, the target channel information includes at least one of the following:

[0178] Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum;

[0179] Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

[0180] The meaning, function, and acquisition method of each item in the above target channel information are the same as those of the corresponding channel information in the method embodiment shown in Figure 3, and are not repeated here.

[0181] Step 702: Process the target channel information based on the target AI network model to obtain the positioning information of the target terminal. The target AI network model is obtained by training a Siamese network model based on unlabeled first sample data and training a unary network model based on labeled second sample data. The unary network model has the same structure as one branch of the Siamese network model.

[0182] For the training process of the target AI network model, refer to the process of training the target AI network model in the method embodiment shown in Figure 3; it is not described in detail here.

[0183] In related technologies, for positioning AI network models, in order to meet the positioning accuracy requirements, it is necessary to use a large amount of available training data with real location labels to train the AI ​​network model, which requires a lot of human and financial resources to collect the location labels of each data point.

[0184] In this embodiment, a small amount of training data with real location labels and a large amount of unlabeled training data can be collected. Relatively speaking, unlabeled data is easier to obtain, such as collecting only channel state information without location labels. This reduces the complexity of collecting training sample data in the process of training the localization AI network model.

[0185] After the target AI network model is trained according to the method embodiment shown in Figure 3, it can be deployed on the terminal or on a network-side device. When the target AI network model is deployed on the terminal, the terminal can process the channel information it detects based on the target AI network model to obtain the location information of the terminal. When the target AI network model is deployed on the network-side device, the network-side device can obtain the channel information of the target terminal through its own detection or through reporting by the target terminal, and process that channel information based on the target AI network model to obtain the location information of the terminal.

[0186] In applications, this location information can provide data support for location-based functions, such as selecting network-side devices, paths, and beams based on location information, as well as navigation based on location information.

[0187] In one possible implementation, after determining its own location, the terminal can report this location information to network-side devices or send it to other terminals. Similarly, after determining the terminal's location information, the network-side devices can report this location information to core network devices, send it to target terminals, or send it to other network-side devices or other terminals. This interaction with the terminal's location information enhances the flexibility of its application.

[0188] As an optional implementation, prior to step 702 above, the positioning method further includes:

[0189] Obtain relevant information about the target AI network model from the first node, wherein the first node is the node that trained the target AI network model.

[0190] In this embodiment, even if the executing entity of the positioning method provided in this application is not the same device as the node that trains the target AI network model, the AI network model can still be transferred, so that relevant information about the target AI network model is obtained from the node that trained it. This relevant information may include the target AI network model's structural information, parameter information, model files, etc. Based on this relevant information, model inference with the target AI network model can be performed, i.e., the target AI network model can be used to estimate the terminal location.
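
Model transfer followed by inference can be sketched as below. The example assumes that the structural and parameter information is exchanged as a JSON document and that the model is a single linear layer; the text does not specify either the transfer format or the model architecture. The field names (`structure`, `parameters`, `input_dim`, `weights`, `bias`) and both function names are hypothetical.

```python
import json


def export_model_info(structure, parameters):
    """Training node: serialize structural and parameter information."""
    return json.dumps({"structure": structure, "parameters": parameters})


def load_and_infer(model_info_json, channel_features):
    """Positioning node: rebuild the toy model and estimate a position."""
    info = json.loads(model_info_json)
    if len(channel_features) != info["structure"]["input_dim"]:
        raise ValueError("channel information does not match the model input")
    weights = info["parameters"]["weights"]
    bias = info["parameters"]["bias"]
    # One output coordinate per weight row (linear layer, illustrative only).
    return [sum(w * x for w, x in zip(row, channel_features)) + b
            for row, b in zip(weights, bias)]


# The first node exports the trained model; another device uses it.
payload = export_model_info(
    {"input_dim": 3, "output_dim": 2},
    {"weights": [[1, 0, 0], [0, 1, 0]], "bias": [10.0, 20.0]},
)
position = load_and_infer(payload, [1.5, 2.5, 9.0])
```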

[0191] Of course, in practical applications, the execution subject of the positioning method provided in this application embodiment and the node that trains the target AI network model can be the same device, which will not be elaborated here.

[0192] This embodiment of the application determines the location information of the terminal based on the target AI network model trained with the AI network model training method provided in the embodiment shown in Figure 3.

[0193] The positioning method provided in this application can be executed by a positioning device. This application uses the example of a positioning device executing the positioning method to illustrate the positioning device provided in this application.

[0194] Referring to Figure 8, an embodiment of this application provides a positioning device. As shown in Figure 8, the positioning device 800 may include the following modules:

[0195] The second acquisition module 801 is used to acquire the target channel information of the target terminal;

[0196] The processing module 802 is used to process the target channel information based on the target AI network model to obtain the positioning information of the target terminal. The target AI network model is obtained by training a Siamese network model based on unlabeled first sample data and training a unary network model based on labeled second sample data. The unary network model has the same structure as a branch of the Siamese network model.

[0197] Optionally, the target channel information includes at least one of the following:

[0198] Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum;

[0199] Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

[0200] Optionally, the positioning device 800 further includes:

[0201] The third acquisition module is used to acquire relevant information about the target AI network model from the first node, wherein the first node is the node that trained the target AI network model.

[0202] The positioning device 800 provided in this embodiment of the application can implement the processes of the method embodiment shown in Figure 7 and can achieve the same beneficial effects. To avoid repetition, details are not described again here.

[0203] The positioning device in this application embodiment can be an electronic device, such as an electronic device with an operating system, or a component in an electronic device, such as an integrated circuit or a chip. The electronic device can be a terminal, a network-side device, or other devices. For example, the terminal can include, but is not limited to, the type of terminal 11 listed above; the network-side device can include, but is not limited to, the type of network-side device 12 listed above; other devices can be servers, network-attached storage (NAS), etc., and this application embodiment does not impose specific limitations.

[0204] Optionally, as shown in Figure 9, an embodiment of this application also provides a communication device 900, including a processor 901 and a memory 902. The memory 902 stores a program or instructions that can run on the processor 901. For example, when the communication device 900 serves as the first device, the program or instructions, when executed by the processor 901, implement the steps of the method embodiment shown in Figure 3 or Figure 7 and can achieve the same technical effect. To avoid repetition, details are not described again here.

[0205] This application also provides a communication device, including a processor and a communication interface.

[0206] In one optional implementation, where the communication device is used to perform the AI network model training method provided in the embodiment shown in Figure 3, the communication interface is used to acquire a first sample dataset and a second sample dataset. The first sample dataset includes at least two first sample data, and the first sample data are unlabeled. The second sample dataset includes at least two second sample data, and the second sample data are labeled. The processor is used to execute a target iteration process based on the first sample dataset and the second sample dataset, to terminate the target iteration process when an iteration termination condition is met, and to determine the first AI network model or the second AI network model obtained by the target iteration process as the target AI network model. The target iteration process includes: performing a first training process on the first AI network model based on the first sample dataset, and performing a second training process on the second AI network model based on the second sample dataset. The first AI network model is a Siamese network model, and the second AI network model is a unary network model. The unary network model has the same structure as a branch of the Siamese network model. The first training process is used to train the first parameters of each branch in the first AI network model, and the second training process is used to calibrate the first parameters.
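
The control flow of the target iteration process can be sketched as follows. This is a structural skeleton only: the two training processes and the accuracy evaluation are passed in as callables, and the toy callables at the bottom are stand-ins that do not train a real network. The name `run_target_iteration`, the `max_rounds` safeguard, and the single accuracy threshold are assumptions not taken from the text.

```python
def run_target_iteration(first_training, second_training, evaluate,
                         preset_accuracy, initial_params, max_rounds=100):
    """Alternate the two training processes until the accuracy condition holds.

    first_training:  trains the shared branch parameters on unlabeled pairs.
    second_training: calibrates those parameters on labeled samples.
    evaluate:        returns the accuracy of the calibrated model.
    Returns the final parameters and the number of rounds executed.
    """
    params = initial_params
    for round_number in range(1, max_rounds + 1):
        params = first_training(params)
        params = second_training(params)
        if evaluate(params) >= preset_accuracy:
            return params, round_number
    return params, max_rounds


# Toy stand-ins: the parameter is one float and the ideal value is 1.0.
def toy_first_training(p):
    return p                      # unlabeled step leaves the toy value unchanged


def toy_second_training(p):
    return p + 0.5 * (1.0 - p)    # labeled step moves halfway to the target


def toy_accuracy(p):
    return 1.0 - abs(1.0 - p)


final_params, rounds = run_target_iteration(
    toy_first_training, toy_second_training, toy_accuracy,
    preset_accuracy=0.9, initial_params=0.0,
)
```

The order of the two training processes inside each round can be swapped, which corresponds to the two variants described in claims 2 and 3.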

[0207] In another optional implementation, where the communication device is used to perform the positioning method provided in the embodiment shown in Figure 7, the communication interface is used to obtain the target channel information of the target terminal; the processor is used to process the target channel information based on the target AI network model to obtain the positioning information of the target terminal, wherein the target AI network model is obtained by training a Siamese network model based on unlabeled first sample data and training a unary network model based on labeled second sample data, and the unary network model has the same structure as a branch of the Siamese network model.

[0208] This communication device embodiment corresponds to the method embodiment shown in Figure 3 or Figure 7. The implementation processes and implementation manners of the method embodiment shown in Figure 3 or Figure 7 are all applicable to this communication device embodiment and can achieve the same technical effects.

[0209] An embodiment of this application also provides a readable storage medium storing a program or instructions that, when executed by a processor, implement the processes of the method embodiment shown in Figure 3 or Figure 7 and can achieve the same technical effect. To avoid repetition, details are not described again here.

[0210] The processor is the processor in the terminal described in the above embodiments. The readable storage medium includes computer-readable storage media, such as computer read-only memory (ROM), random access memory (RAM), magnetic disk, or optical disk.

[0211] An embodiment of this application also provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the processes of the method embodiment shown in Figure 3 or Figure 7 and can achieve the same technical effect. To avoid repetition, details are not described again here.

[0212] It should be understood that the chip mentioned in the embodiments of this application may also be referred to as a system-level chip, system chip, chip system, or system-on-a-chip, etc.

[0213] An embodiment of this application also provides a computer program / program product, which is stored in a storage medium and executed by at least one processor to implement the processes of the method embodiment shown in Figure 3 or Figure 7 and can achieve the same technical effect. To avoid repetition, details are not described again here.

[0214] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. Furthermore, it should be noted that the scope of the methods and apparatuses in the embodiments of this application is not limited to performing functions in the order shown or discussed, but may also include performing functions substantially simultaneously or in the reverse order, depending on the functions involved. For example, the described methods may be performed in a different order than described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.

[0215] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a computer software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0216] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of this application without departing from the spirit and scope of the claims, and all of these forms are within the protection scope of this application.

Claims

1. A method for training an artificial intelligence (AI) network model, characterized in that the method includes: Obtaining a first sample dataset and a second sample dataset, wherein the first sample dataset includes at least two first sample data and the first sample data has no label, and the second sample dataset includes at least two second sample data and the second sample data has a label; both the first sample data and the second sample data include at least one of channel state information and channel parameters, and the label of the second sample data includes the actual location information corresponding to the second sample data; Executing the target iteration process based on the first sample dataset and the second sample dataset; If the iteration termination condition is met, terminating the target iteration process and determining the first AI network model or the second AI network model obtained by the target iteration process as the target AI network model. The target iteration process includes: a first training process for the first AI network model based on the first sample dataset, and a second training process for the second AI network model based on the second sample dataset. The first AI network model is a Siamese network model, and the second AI network model is a unary network model. The unary network model has the same structure as a branch of the Siamese network model. The first training process is used to train the first parameters of each branch in the first AI network model, and the second training process is used to calibrate the first parameters.

2. The method according to claim 1, characterized in that, The step of performing the target iteration process based on the first sample dataset and the second sample dataset includes: The first training process is performed based on N first sample data groups and the loss function of the first AI network model to determine the values of the first parameters of the two branches in the first AI network model. Each first sample data group includes two first sample data, which are respectively used as inputs to the two branches to obtain two first output results. The loss function of the first AI network model is the distance between the first difference information and the second difference information. The first difference information indicates the distance between the two first sample data in the same first sample data group, and the second difference information indicates the distance between the two first output results corresponding to the two first sample data. N is a positive integer. The second training process is performed based on H second sample data and the loss function of the second AI network model to determine the value of the second parameter in the second AI network model. The initial value of the second parameter is equal to the value of the first parameter of the target branch of the first AI network model. The loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model. H is a positive integer. Based on the value of the second parameter, the value of the first parameter of each branch in the first AI network model is updated.

3. The method according to claim 1, characterized in that, The step of performing the target iteration process based on the first sample dataset and the second sample dataset includes: The second training process is performed based on H second sample data and the loss function of the second AI network model to determine the value of the second parameter in the second AI network model. The loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model, where H is a positive integer. The first training process is performed based on N first sample data groups and the loss function of the first AI network model to determine the values of the first parameters of the two branches in the first AI network model. The initial values of the first parameters of the two branches in the first AI network model are equal to the value of the second parameter of the second AI network model. Each first sample data group includes two first sample data, which are respectively used as inputs to the two branches to obtain two first output results. The loss function of the first AI network model is the distance between the first difference information and the second difference information. The first difference information indicates the distance between the two first sample data in the same first sample data group, and the second difference information indicates the distance between the two first output results corresponding to the two first sample data. N is a positive integer. Based on the value of the first parameter of the target branch in the first AI network model, the value of the second parameter in the second AI network model is updated.

4. The method according to claim 1, characterized in that, The iteration termination condition includes at least one of the following: The accuracy of the second AI network model is greater than or equal to the first preset accuracy; The accuracy of the first AI network model is greater than or equal to the second preset accuracy.

5. The method according to claim 2 or 3, characterized in that, The method further includes: The value of the third parameter of the third AI network model is obtained by training based on the second sample dataset. The third AI network model has the same structure as the second AI network model. The initial value of the first parameter or the second parameter is determined based on the value of the third parameter.

6. The method according to any one of claims 1 to 3, characterized in that, The first AI network model, the second AI network model, and the target AI network model are localization AI network models; The first sample data includes at least one of the following: Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum; Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer; And / or, The label of the second sample data includes the actual location corresponding to the second sample data, and the second sample data includes at least one of the following: Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum; Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

7. The method according to claim 6, characterized in that, The dimension of the first output result of each branch of the first AI network model is smaller than the dimension of the first sample data; And / or, The dimension of the second output result of the second AI network model is smaller than the dimension of the second sample data.

8. The method according to any one of claims 1 to 3, characterized in that, The first sample dataset includes U first sample data points, and the second sample dataset includes V second sample data points, where U and V are integers greater than 1, and U is greater than V.

9. A positioning method, characterized in that the method includes: Obtaining the target channel information of the target terminal; Processing the target channel information based on the target AI network model to obtain the positioning information of the target terminal, wherein the target AI network model is trained based on the method described in any one of claims 1 to 8.

10. The method according to claim 9, characterized in that, The target channel information includes at least one of the following: Channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum; Channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

11. The method according to claim 9, characterized in that, Before processing the target channel information based on the target AI network model to obtain the location information of the target terminal, the method further includes: Obtain relevant information about the target AI network model from the first node, wherein the first node is the node that trained the target AI network model.

12. An artificial intelligence (AI) network model training device, characterized in that, Applied to a first device, the device includes: The first acquisition module is used to acquire a first sample dataset and a second sample dataset, wherein the first sample dataset includes at least two first sample data and the first sample data has no label, the second sample dataset includes at least two second sample data and the second sample data has a label; both the first sample data and the second sample data include at least one of channel state information and channel parameters, and the label of the second sample data includes the actual location information corresponding to the second sample data; The execution module is used to execute the target iteration process based on the first sample dataset and the second sample dataset; The first determining module is used to terminate the target iteration process when the iteration termination condition is met, and to determine the first AI network model or the second AI network model obtained by the target iteration process as the target AI network model. The target iteration process includes: a first training process for the first AI network model based on the first sample dataset, and a second training process for the second AI network model based on the second sample dataset. The first AI network model is a Siamese network model, and the second AI network model is a unary network model. The unary network model has the same structure as a branch of the Siamese network model. The first training process is used to train the first parameters of each branch in the first AI network model, and the second training process is used to calibrate the first parameters.

13. The apparatus according to claim 12, characterized in that, The execution module includes: The first iteration unit is used to perform a first training process on the loss function of the first AI network model based on N first sample data groups to determine the values ​​of the first parameters of the two branches in the first AI network model. The first sample data group includes two first sample data, which are respectively used as inputs to the two branches to obtain two first output results. The loss function of the first AI network model is the distance between the first difference information and the second difference information. The first difference information indicates the distance between two first sample data in the same first sample data group, and the second difference information indicates the distance between the two first output results corresponding to the two first sample data. N is a positive integer. The second iteration unit is used to perform a second training process on the loss function of the second AI network model based on H second sample data, so as to determine the value of the second parameter in the second AI network model. The initial value of the second parameter is equal to the value of the first parameter of the target branch of the first AI network model. The loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model. H is a positive integer. The first update unit is used to update the value of the first parameter of each branch in the first AI network model according to the value of the second parameter.

14. The device according to claim 12, characterized in that the execution module includes: a third iteration unit, used to perform the second training process using the loss function of the second AI network model and H second sample data, so as to determine the value of the second parameter in the second AI network model, wherein the loss function of the second AI network model is the distance between the second output result of the second AI network model and the label corresponding to the second sample data input to the second AI network model, and H is a positive integer; a fourth iteration unit, used to perform the first training process using the loss function of the first AI network model and N first sample data groups, so as to determine the values of the first parameters of the two branches in the first AI network model, wherein the initial values of the first parameters of the two branches in the first AI network model are equal to the value of the second parameter of the second AI network model; each first sample data group includes two first sample data, which are respectively used as inputs to the two branches to obtain two first output results; the loss function of the first AI network model is the distance between first difference information and second difference information, the first difference information indicating the distance between the two first sample data in the same first sample data group, and the second difference information indicating the distance between the two first output results corresponding to those two first sample data; and N is a positive integer; and a second update unit, used to update the value of the second parameter in the second AI network model according to the value of the first parameter of the target branch in the first AI network model.
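
Claim 14 reverses the order of claim 13: the supervised step runs first and its result seeds both Siamese branches. The following is a hedged sketch of that parameter handoff only, with the two training steps passed in as callables. The names `train_unary` and `train_siamese`, the `target_branch` index, and the list-of-floats parameter representation are hypothetical placeholders, not part of the claim.

```python
from typing import Callable, List, Tuple

Params = List[float]

def iteration_claim_14(
    unary_params: Params,
    train_unary: Callable[[Params], Params],
    train_siamese: Callable[[Params, Params], Tuple[Params, Params]],
    target_branch: int = 0,
) -> Tuple[Params, Tuple[Params, Params]]:
    """Run one iteration: supervised step, Siamese step, then write-back."""
    # Second training process: fit the unary model on the labeled samples.
    unary_params = train_unary(list(unary_params))
    # Both branches start from the freshly trained second parameter values.
    branches = train_siamese(list(unary_params), list(unary_params))
    # Update the second parameter from the target branch of the Siamese model.
    unary_params = list(branches[target_branch])
    return unary_params, branches
```

Passing the training steps as arguments keeps the sketch independent of any particular model or optimizer, which the claim does not specify.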

15. The device according to claim 12, characterized in that the iteration termination condition includes at least one of the following: the accuracy of the second AI network model is greater than or equal to a first preset accuracy; the accuracy of the first AI network model is greater than or equal to a second preset accuracy.
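
The termination test of claim 15 can be sketched as a loop guard. The default threshold values, the `step` callable, the `max_iters` safety cap, and the choice to stop as soon as any supplied accuracy reaches its preset value are all assumptions for illustration; the claim only requires that at least one of the two accuracy conditions be part of the termination condition.

```python
from typing import Callable, Optional

def should_stop(
    acc_second: Optional[float] = None,
    acc_first: Optional[float] = None,
    first_preset: float = 0.9,
    second_preset: float = 0.9,
) -> bool:
    """Return True when any supplied accuracy reaches its preset value."""
    # The second AI network model is compared with the first preset accuracy.
    if acc_second is not None and acc_second >= first_preset:
        return True
    # The first AI network model is compared with the second preset accuracy.
    if acc_first is not None and acc_first >= second_preset:
        return True
    return False

def run_until_done(step: Callable[[], float], max_iters: int = 100) -> int:
    """Call `step` (one target iteration that returns the second model's
    accuracy) until the guard fires; return the number of iterations run."""
    for i in range(1, max_iters + 1):
        if should_stop(acc_second=step()):
            return i
    return max_iters
```

How accuracy is measured (for example, the share of validation samples whose position error falls below a bound) is left open here, as it is in the claim.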

16. The device according to claim 13 or 14, characterized in that it further includes: a training module, used to train a third AI network model based on the second sample dataset to obtain the value of a third parameter, wherein the third AI network model has the same structure as the second AI network model; and a second determining module, used to determine the initial value of the first parameter or the second parameter based on the value of the third parameter.
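
The pre-training step of claim 16 can be illustrated with a small NumPy sketch. Here a closed-form least-squares fit of a linear model stands in for "training the third AI network model"; that substitution, the linear structure, and the function names are assumptions made for brevity and are not taken from the claim.

```python
import numpy as np

def pretrain_third_model(X, Y):
    """Fit a linear third model Y ~ X @ W.T on the labeled set by least squares.

    X has shape (V, d_in): one labeled sample per row.
    Y has shape (V, d_out): the matching location labels.
    Returns W with shape (d_out, d_in), the value of the third parameter.
    """
    solution, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
    return solution.T

def init_from_third(W_third):
    """Return independent copies usable as the initial first parameter
    (Siamese branches) or the initial second parameter (unary model)."""
    return W_third.copy(), W_third.copy()
```

Starting both models from a supervised fit gives the iteration a starting point that is already aligned with the label space, which is one plausible reading of why the claim derives the initial values from the third parameter.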

17. The device according to any one of claims 12 to 14, characterized in that the first AI network model, the second AI network model, and the target AI network model are positioning AI network models; the first sample data includes at least one of the following: channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum; channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer; and/or the label of the second sample data includes the actual location corresponding to the second sample data, and the second sample data includes at least one of the following: channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum; channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

18. The device according to claim 17, characterized in that the dimension of the first output result of each branch of the first AI network model is smaller than the dimension of the first sample data; and/or the dimension of the second output result of the second AI network model is smaller than the dimension of the second sample data.

19. The device according to any one of claims 12 to 14, characterized in that the first sample dataset includes U first sample data and the second sample dataset includes V second sample data, where U and V are integers greater than 1, and U is greater than V.

20. A positioning device, characterized in that it includes: a second acquisition module, used to acquire target channel information of a target terminal; and a processing module, used to process the target channel information based on a target AI network model to obtain positioning information of the target terminal, wherein the target AI network model is trained based on the method of any one of claims 1 to 8.
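
A minimal sketch of the processing module of claim 20 follows, assuming the target channel information is a complex channel impulse response and the target AI network model is a single linear map `W`. The feature layout (real parts followed by imaginary parts), the linear model, and the function names are illustrative assumptions, not the claimed structure.

```python
import numpy as np

def cir_to_features(cir):
    """Flatten a complex CIR into a real vector: real parts, then imaginary parts."""
    cir = np.asarray(cir, dtype=np.complex128).ravel()
    return np.concatenate([cir.real, cir.imag])

def locate(W, cir):
    """Apply the (assumed linear) target model to the channel features
    and return the estimated position of the target terminal."""
    return W @ cir_to_features(cir)
```

For a CIR with K taps, `W` would need 2K columns under this layout, and its number of rows sets the dimension of the returned position.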

21. The device according to claim 20, characterized in that the target channel information includes at least one of the following: channel state information, which includes at least one of the following: channel impulse response (CIR), frequency domain channel information, spatial domain channel information, and time delay power spectrum; channel parameters, which include at least one of the following: first path delay, first path power, first path phase, first path angle, maximum Q path delay, maximum Q path power, maximum Q path phase, maximum Q path angle, time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), angle of departure (AOD), and reference signal received power (RSRP), where Q is a positive integer.

22. The device according to claim 20, characterized in that it further includes: a third acquisition module, used to acquire relevant information about the target AI network model from a first node, wherein the first node is the node that trained the target AI network model.

23. A communication device, characterized in that it includes a processor and a memory, the memory storing programs or instructions that can run on the processor, wherein the programs or instructions, when executed by the processor, implement the steps of the artificial intelligence (AI) network model training method as described in any one of claims 1 to 8, or implement the steps of the positioning method as described in any one of claims 9 to 11.

24. A readable storage medium, characterized in that the readable storage medium stores a program or instructions that, when executed by a processor, implement the steps of the AI network model training method as described in any one of claims 1 to 8, or the steps of the positioning method as described in any one of claims 9 to 11.