Method and apparatus used in node for wireless communication
By monitoring the differences in intermediate feature distributions of AI/ML models during the training and inference phases, this technology overcomes the limitations of existing technologies in monitoring the performance of AI/ML models in wireless communication, achieving efficient performance monitoring without real location data and reducing system costs and latency.
Patent Information
- Authority / Receiving Office: WO
- Patent Type: Applications
- Current Assignee / Owner: QUECTEL WIRELESS SOLUTIONS CO LTD
- Filing Date: 2024-12-26
- Publication Date: 2026-07-02
Abstract
Description
Methods and apparatus for use in nodes for wireless communication
Technical Field
[0001] This application relates to the field of communication technology, and more specifically, to a method and apparatus for use in a node for wireless communication.
Background Technology
[0002] With the development of communication technology, certain positioning scenarios require the introduction of artificial intelligence (AI) / machine learning (ML) models to improve positioning accuracy and computational efficiency. After deploying AI / ML models to terminal devices or network equipment, how to monitor the model's performance has become a pressing technical problem.
Summary of the Invention
[0003] This application provides a method and apparatus for use in a node for wireless communication. The various aspects of this application will be described below.
[0004] In a first aspect, a method for a first node in wireless communication is provided, comprising: monitoring the performance of a first model based on the feature distribution difference between a first data feature and a second data feature; wherein the first data feature is an intermediate feature of the first model during the training phase, and the second data feature is an intermediate feature of the first model during the inference phase.
[0005] In a second aspect, a method for a second node in wireless communication is provided, comprising: training a first model; wherein the feature distribution difference between a first data feature and a second data feature is used to monitor the performance of the first model; the first data feature is an intermediate feature of the first model during the training phase, and the second data feature is an intermediate feature of the first model during the inference phase.
[0006] In a third aspect, a first node for wireless communication is provided, comprising: a first processor, configured to monitor the performance of a first model based on the feature distribution difference between a first data feature and a second data feature; wherein the first data feature is an intermediate feature of the first model during the training phase, and the second data feature is an intermediate feature of the first model during the inference phase.
[0007] In a fourth aspect, a second node for wireless communication is provided, comprising: a second processor, configured to train a first model; wherein the feature distribution difference between a first data feature and a second data feature is used to monitor the performance of the first model; the first data feature is an intermediate feature of the first model during the training phase, and the second data feature is an intermediate feature of the first model during the inference phase.
[0008] In a fifth aspect, a first node for wireless communication is provided, comprising a transceiver, a memory, and a processor, wherein the memory stores a program, the processor invokes the program in the memory, and controls the transceiver to receive or transmit signals to cause the first node to perform the method as described in the first aspect.
[0009] In a sixth aspect, a second node for wireless communication is provided, comprising a transceiver, a memory, and a processor, wherein the memory stores a program, the processor invokes the program in the memory, and controls the transceiver to receive or transmit signals to cause the second node to perform the method as described in the second aspect.
[0010] In a seventh aspect, embodiments of this application provide a communication system including the aforementioned first node and / or second node. In another possible design, the system may further include other devices that interact with the first node or second node as described in the embodiments of this application.
[0011] In an eighth aspect, embodiments of this application provide a computer-readable storage medium storing a computer program that causes a computer to perform some or all of the steps in the methods described in the foregoing aspects.
[0012] In a ninth aspect, embodiments of this application provide a computer program product, wherein the computer program product includes a non-transitory computer-readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of the methods described in the foregoing aspects. In some implementations, the computer program product may be a software installation package.
[0013] In a tenth aspect, embodiments of this application provide a chip including a memory and a processor, the processor being able to call and run a computer program from the memory to implement some or all of the steps described in the methods of the foregoing aspects.
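The monitoring idea in the first aspect can be illustrated with a minimal sketch. The text above does not name a specific distance measure, so the choice here is an assumption: each feature dimension is summarized by a Gaussian, and the squared 2-Wasserstein distance between the training-phase and inference-phase summaries is compared with a threshold. The function names, the sample values, and the threshold are hypothetical.

```python
import statistics

def feature_distribution_difference(train_feats, infer_feats):
    """Squared 2-Wasserstein distance between per-dimension Gaussian fits.

    Each argument is a list of feature vectors taken from the same
    intermediate layer (assumed metric, not specified by the source).
    """
    total = 0.0
    for d in range(len(train_feats[0])):
        a = [row[d] for row in train_feats]
        b = [row[d] for row in infer_feats]
        mean_gap = statistics.fmean(a) - statistics.fmean(b)
        std_gap = statistics.pstdev(a) - statistics.pstdev(b)
        total += mean_gap ** 2 + std_gap ** 2
    return total

def performance_degraded(train_feats, infer_feats, threshold):
    """Flag possible degradation when the feature distributions drift apart."""
    return feature_distribution_difference(train_feats, infer_feats) > threshold

# Hypothetical intermediate features: two samples, two dimensions each.
train_feats = [[0.0, 0.0], [2.0, 2.0]]
infer_feats = [[1.0, 1.0], [3.0, 3.0]]  # same spread, mean shifted by 1
drift = feature_distribution_difference(train_feats, infer_feats)
print(drift)
```

No location labels appear anywhere in the computation, which is the property the application relies on; any other distribution distance could be substituted for the one assumed here.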
[0014] In this embodiment, by using the intermediate features of the first model, the first node can determine the feature distribution differences between the first and second data features during the training and inference phases (also known as the deployment phase), thereby identifying potential performance degradation of the first model based on its internal features. Therefore, performance monitoring of the first model does not rely on real-world location data and avoids the limitations of comparing input or output data. Consequently, this method ensures timely detection and response to changes in model performance even without direct location information, making the monitoring of the first model more sensitive and effective, and reducing the overall cost of system maintenance and operation.
Attached Figure Description
[0015] Figure 1 is a system architecture example diagram of a wireless communication system applicable to embodiments of this application.
[0016] Figure 2 is a schematic diagram of a network architecture applicable to embodiments of this application.
[0017] Figures 3A and 3B are schematic diagrams of wireless protocol stack structures applicable to embodiments of this application.
[0018] Figure 4 is a schematic diagram of neurons in a neural network applicable to embodiments of this application.
[0019] Figure 5 is a schematic diagram of a neural network applicable to embodiments of this application.
[0020] Figure 6 is a schematic diagram of a convolutional neural network that can be applied to embodiments of this application.
[0021] Figure 7 is a flowchart illustrating a method for a first node in wireless communication according to an embodiment of this application.
[0022] Figure 8 is a schematic diagram of the first intermediate layer in the method shown in Figure 7.
[0023] Figure 9 is a flowchart illustrating one possible implementation of the method shown in Figure 7.
[0024] Figure 10 is a flowchart illustrating another possible implementation of the method shown in Figure 7.
[0025] Figure 11 is a flowchart illustrating one possible implementation of the method shown in Figure 9.
[0026] Figure 12 is a flowchart illustrating one possible implementation of the method shown in Figure 10.
[0027] Figure 13 is a schematic diagram of the structure of a first node for wireless communication provided in an embodiment of this application.
[0028] Figure 14 is a schematic diagram of the structure of a second node for wireless communication provided in an embodiment of this application.
[0029] Figure 15 is a schematic diagram of the structure of the device provided in the embodiment of this application.
[0030] Figure 16 is a schematic diagram of the hardware module of the communication device provided in the embodiment of this application.
Detailed Implementation
[0031] Communication system architecture
[0032] Figure 1 is a system architecture example diagram of a wireless communication system 100 to which embodiments of this application can be applied. The wireless communication system 100 may include a network device 110 and a terminal device 120. The network device 110 may be a device that communicates with the terminal device 120. The network device 110 may provide communication coverage for a specific geographical area and may communicate with the terminal device 120 located within that coverage area.
[0033] Figure 1 illustrates, by way of example, one network device and multiple terminal devices, such as terminal devices 120a to 120j in the figure. Optionally, the wireless communication system 100 may include multiple network devices, and the coverage area of each network device may contain a different number of terminal devices; the embodiments of this application do not limit this.
[0034] Optionally, the wireless communication system 100 may also include other network entities such as a network controller and a mobility management entity, which is not limited in this embodiment.
[0035] It should be understood that the technical solutions of the embodiments of this application can be applied to various communication systems, such as: 5th-generation (5G) systems or new radio (NR) systems, long-term evolution (LTE) systems, LTE frequency division duplex (FDD) systems, LTE time division duplex (TDD) systems, advanced long-term evolution (LTE-A) systems, enhanced 5G (5G advanced) systems, etc. The technical solutions provided in this application can also be applied to future communication systems, such as 6th-generation (6G) mobile communication systems, satellite communication systems, etc.
[0036] The terminal device in this application embodiment can also be referred to as user equipment (UE), access terminal, subscriber unit, subscriber station, mobile station (MS), mobile terminal (MT), remote station, remote terminal, mobile device, user terminal, terminal, wireless communication device, user agent, or user device. The terminal device in this application embodiment can be a device that provides voice and / or data connectivity to a user, and can be used to connect people, objects, and machines, such as a handheld device with wireless connectivity, vehicle-mounted device, etc. The terminal device in the embodiments of this application may be a mobile phone, tablet computer, laptop computer, handheld computer, camera equipment, mobile internet device (MID), wearable device, virtual reality (VR) device, augmented reality (AR) device, wireless terminal in industrial control, wireless terminal in self-driving, wireless terminal in remote medical surgery, wireless terminal in smart grid, wireless terminal in transportation safety, wireless terminal in smart city, wireless terminal in smart home, etc. Optionally, the terminal device may be used to act as a base station. For example, the terminal device may act as a scheduling entity, providing sidelink signals between UEs in vehicle-to-everything (V2X) or device-to-device (D2D) connections. For example, cellular phones and cars can communicate with each other using sidelink signals, and cellular phones and smart home devices can communicate without relaying communication signals through base stations.
[0037] The network device in this application embodiment can be a device for communicating with terminal devices. This network device can also be called an access network device or a radio access network device, such as a base station (BS). In this application embodiment, the network device can refer to a radio access network (RAN) node or a next-generation RAN (NG-RAN) node (or device) that connects user equipment to a wireless network. A base station can broadly encompass, or be replaced by, various names including: NodeB, evolved NodeB (eNB), next-generation NodeB (gNB), relay station, transmission and reception point (TRP), transmission point (TP), master eNB (MeNB), secondary eNB (SeNB), multi-standard radio (MSR) node, home base station, network controller, access node, wireless node, access point (AP), transmission node, transceiver node, baseband unit (BBU), remote radio unit (RRU), active antenna unit (AAU), remote radio head (RRH), central unit (CU), distributed unit (DU), positioning node, etc. A base station can be a macro base station, micro base station, relay node, donor node, or similar, or a combination thereof. A base station can also refer to a communication module, modem, or chip installed within the aforementioned equipment or apparatus. Base stations can also be mobile switching centers, devices that perform base station functions in D2D, V2X, and machine-to-machine (M2M) communications, network-side devices in 6G networks, and devices that perform base station functions in future communication systems. Base stations can support networks using the same or different access technologies. The embodiments of this application do not limit the specific technologies or device forms used in the network equipment.
[0038] Base stations can be fixed or mobile. For example, a helicopter or drone can be configured to act as a mobile base station, and one or more cells can move depending on the location of the mobile base station. In other examples, a helicopter or drone can be configured as a device to communicate with another base station.
[0039] In some deployments, the network device in this application embodiment may refer to a CU or a DU, or the network device may include both a CU and a DU. The gNB may also include an AAU.
[0040] Network devices and terminal devices can be deployed on land, including indoors or outdoors, handheld or vehicle-mounted; they can also be deployed on water; and they can also be deployed in the air on airplanes, balloons, and satellites. This application does not limit the scenario in which the network devices and terminal devices are located.
[0041] It should be understood that all or part of the functions of the communication device in this application can also be implemented by software functions running on hardware, or by virtualization functions instantiated on a platform (e.g., a cloud platform).
[0042] Figure 2 illustrates a schematic diagram of a network architecture 200 according to an embodiment of this application. This network architecture 200 describes the network architecture of a 5G NR / LTE / LTE-A system, which can also be referred to as a 5G system (5GS) / evolved packet system (EPS) network architecture. The network architecture 200 includes at least one of the following: network device 110, terminal device 120, 5G core network (5GC) / evolved packet core (EPC) 210, home subscriber server (HSS) / unified data management (UDM) 220, and Internet service 230. The network device and terminal device in Figure 2 are illustrated using RAN and UE as examples, respectively.
[0043] As shown in Figure 2, network device 110 provides user plane and control plane protocol termination to terminal device 120. Network device 110 is connected to 5GC / EPC 210 via an S1 / NG interface. 5GC / EPC 210 includes a mobility management entity (MME) / access and mobility management function (AMF) / session management function (SMF) 211, other MMEs / AMFs / SMFs 214, a serving gateway (S-GW) / user plane function (UPF) 212, and a packet data network gateway (P-GW) / UPF 213. MME / AMF / SMF 211 is the control node that handles signaling between terminal device 120 and 5GC / EPC 210. Generally, MME / AMF / SMF 211 provides bearer and connection management. All user Internet Protocol (IP) packets are transmitted through the S-GW / UPF 212, which is itself connected to the P-GW / UPF 213. The P-GW provides UE IP address allocation and other functions. The P-GW / UPF 213 is connected to Internet service 230. Internet service 230 includes the operator's Internet Protocol services, specifically including the Internet, intranet, IP multimedia subsystem (IMS), and packet-switched streaming services. Network architecture 200 thus provides packet-switched services; however, those skilled in the art will readily understand that the various concepts presented herein can be extended to networks providing circuit-switched services or other cellular networks.
[0044] Figures 3A and 3B respectively illustrate a schematic diagram of a wireless protocol stack structure according to an embodiment of this application. Figures 3A and 3B use a 5G wireless protocol stack as an example for illustration. The 5G wireless protocol stack is divided into two planes: the user plane (UP) protocol stack and the control plane (CP) protocol stack. The user plane protocol stack is the protocol suite used for user data transmission, and the control plane protocol stack is the protocol suite used for control signaling transmission in the 5G system. The specific names of each protocol stack layer are as follows:
[0045] As shown in Figure 3A, the user plane protocol stack includes, from top to bottom, the following layers: Service Data Adaptation Protocol (SDAP) layer, Packet Data Convergence Protocol (PDCP) layer, Radio Link Control (RLC) layer, Medium Access Control (MAC) layer, and Physical (PHY) layer.
[0046] As shown in Figure 3B, the control plane protocol stack includes, from top to bottom, the following layers: non-access stratum (NAS) layer, radio resource control (RRC) layer, PDCP layer, RLC layer, MAC layer, and PHY layer.
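The two stacks described above can be written down as a small data structure, which makes it easy to see which layers the planes share. This is only an illustrative sketch; the constant and function names are hypothetical, and the layer order follows the top-to-bottom listing in the text.

```python
# Layers listed from top to bottom, as described for Figures 3A and 3B.
USER_PLANE = ["SDAP", "PDCP", "RLC", "MAC", "PHY"]
CONTROL_PLANE = ["NAS", "RRC", "PDCP", "RLC", "MAC", "PHY"]

def shared_layers(stack_a, stack_b):
    """Return the layers of stack_a that also appear in stack_b, in order."""
    return [layer for layer in stack_a if layer in stack_b]

common = shared_layers(USER_PLANE, CONTROL_PLANE)
print(common)
```

The planes differ only in their upper layers (SDAP on the user plane, NAS and RRC on the control plane) and share everything from PDCP downward.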
[0047] It should be understood that the different layers in the above protocol stack have different functions, and they work together through inter-layer interaction to achieve communication between terminal devices and network devices. With the development of artificial intelligence technology, AI-assisted computing is increasingly used in implementing the processing of the above protocol stack. For example, the scheduling algorithm of the MAC layer and the encoding / decoding algorithm of the PHY layer can apply artificial intelligence algorithms to improve the performance of communication algorithms.
[0048] As an example, the wireless protocol architecture in Figures 3A and 3B is applicable to the first node in this application.
[0049] As an example, the wireless protocol architecture in Figures 3A and 3B is applicable to the second node in this application.
[0050] It should be understood that the interpretation of the terminology in the embodiments of this application may refer to the TS36, TS37 and TS38 series of specifications of the 3rd generation partnership project (3GPP), but may also refer to the specifications of the Institute of Electrical and Electronics Engineers (IEEE).
[0051] Positioning technology
[0052] In advanced 5G and the upcoming 6G network systems, positioning technology will play a crucial role. As an example, positioning technology needs to support a variety of application scenarios that demand high accuracy, low latency, and wide coverage. These scenarios include autonomous driving, the Internet of Things (IoT), augmented reality (AR), smart cities, and industrial automation. Traditional positioning methods primarily rely on measurement techniques such as multilateration, Doppler shift, and angle of arrival (AOA). Related positioning algorithms include time difference of arrival (TDOA), which underlies multilateration, and frequency difference of arrival (FDOA), which exploits Doppler shift.
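To make the TDOA technique mentioned above concrete, the following is a minimal sketch of conventional (non-AI) positioning from time differences of arrival. It is not taken from this application: the anchor coordinates, the 100 m search area, and the brute-force grid search are all assumptions chosen for readability, and practical systems use closed-form or iterative solvers on noisy measurements.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def tdoa(position, anchors):
    """Time differences of arrival relative to the first anchor, in seconds."""
    dists = [math.dist(position, a) for a in anchors]
    return [(d - dists[0]) / C for d in dists[1:]]

def locate_by_grid_search(measured, anchors, size=100):
    """Return the 1 m grid point whose predicted TDOAs best match the measurement."""
    best, best_err = None, float("inf")
    for x in range(size + 1):
        for y in range(size + 1):
            predicted = tdoa((x, y), anchors)
            err = sum((p - m) ** 2 for p, m in zip(predicted, measured))
            if err < best_err:
                best, best_err = (x, y), err
    return best

# Hypothetical base stations at the corners of a 100 m x 100 m area.
anchors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
measured = tdoa((30.0, 40.0), anchors)  # noise-free measurement of a device at (30, 40)
estimate = locate_by_grid_search(measured, anchors)
print(estimate)
```

Every candidate position requires a full evaluation against all anchors, which hints at why such methods become computationally expensive as the search space and the number of users grow.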
[0053] However, with the increasing complexity of 5G / 6G networks, especially in high-density user environments, non-line-of-sight (NLoS) scenarios, and complex urban terrain, traditional positioning methods require significant computing resources and cannot meet the demands of large-scale IoT and ultra-low latency applications. At this juncture, the introduction of AI / ML technologies provides strong support for overcoming these limitations.
[0054] AI / ML technologies can improve positioning accuracy by processing large amounts of nonlinear, multidimensional data and learning effective features and patterns from complex wireless environments. As an example, channel feature extraction based on deep learning helps improve positioning accuracy. Deep neural networks can capture the spatial and temporal characteristics of wireless signals and automatically learn complex nonlinear changes during signal propagation. In non-line-of-sight environments, AI models can learn complex characteristics such as multipath reflection and attenuation, enhancing their understanding of signal paths and thus improving positioning accuracy. As another example, AI / ML technologies can integrate different measurement data, such as time, angle, and power information, to improve the accuracy of location estimation through multimodal data fusion. Especially in environments with strong NLoS or multipath effects, traditional methods may fail, while AI / ML can infer more precise locations by fusing this information.
[0055] AI / ML technologies also help solve the problem of limited coverage. In 5G / 6G, the use of millimeter-wave frequencies makes signals susceptible to obstruction, leading to limited coverage. AI / ML can extend network coverage through techniques such as predicting and compensating for obstruction effects and dynamically adjusting positioning strategies. As an example, by analyzing historical data and environmental information, AI / ML can predict potential signal obstruction and even continue to provide high-precision positioning through compensation techniques when signals are blocked. For instance, ML models can utilize cooperative information from surrounding base stations, combined with prior knowledge, to predict the signal path and location. As another example, in a wide range of indoor and outdoor environments, AI / ML models can dynamically adjust positioning strategies based on changes in the current environment. For instance, in densely populated urban areas, AI models can dynamically select the optimal base station or network configuration based on complex environmental changes and user density to maximize positioning accuracy and coverage.
[0056] Furthermore, AI / ML technologies can improve computational efficiency through intelligent data processing and resource management. The high data volume transmission of 5G / 6G networks demands higher computational efficiency, and AI / ML can reduce computational resource consumption through distributed AI models, model adaptation, and compression techniques. For example, deploying AI / ML models at the network edge (such as edge computing nodes and user devices) can reduce the load on central servers and achieve distributed processing. Edge AI models can process local data in real time, perform rapid location estimation, and collaborate with central models to improve overall performance when needed. As another example, AI / ML models can adaptively adjust based on network conditions and computational resources. Model compression and quantization techniques can reduce computational complexity and energy consumption while maintaining positioning accuracy, making them more suitable for resource-constrained user devices or network environments.
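As an illustration of the quantization technique mentioned above, the sketch below applies symmetric 8-bit quantization to a list of weights. The scheme (one scale per tensor, integer range -127 to 127) is an assumption chosen for simplicity and is not specified by this application; the weight values are made up.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]

weights = [0.82, -1.94, 0.03, 1.10, -0.47]  # hypothetical model weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, max_error)
```

Each weight is stored in one byte instead of four or eight, and the reconstruction error per weight stays within about half of the scale factor, which is the accuracy-for-size trade the paragraph describes.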
[0057] The previous section introduced AI / ML support for positioning technology. The following section introduces various positioning methods based on AI / ML.
[0058] Direct positioning and AI-assisted positioning are two different approaches to positioning technology applications, differing in how they achieve positioning and their reliance on AI technology.
[0059] Direct positioning refers to using an AI model to infer the device's location directly from raw signal data. In other words, the AI model itself handles the entire transformation process from input signal to location coordinates. In direct positioning, the AI model directly generates the location output by processing the raw input (such as received signal strength, time of arrival, channel state information, etc.). This approach integrates signal processing and location inference into a single model. Direct positioning fully leverages the powerful feature extraction capabilities of AI models, extracting location-related information from complex signal patterns through techniques such as deep learning, typically providing high positioning accuracy. Taking UE, gNB, and location management function (LMF) positioning as examples, typical application scenarios for direct positioning include the following.
[0060] Scenario 1: UE-based positioning, using the UE-side model for direct AI / ML positioning.
[0061] Scenario 2 (Case 2b): UE-assisted / LMF-based positioning, using the LMF-side model for direct AI / ML positioning.
[0062] Scenario 3 (Case 3b): NG-RAN node-assisted positioning, using the LMF-side model for direct AI / ML positioning.
[0063] AI-assisted positioning refers to using AI technology to enhance or optimize certain aspects of traditional positioning methods, rather than directly inferring location from signals. In this context, AI is typically used to improve or supplement existing positioning algorithms, rather than replace them. In AI-assisted positioning, AI is usually used to improve certain aspects of the positioning system, such as increasing the accuracy of signal processing, optimizing algorithm parameters, and performing error correction, while the final positioning calculation still relies on traditional methods (such as geometric algorithms and triangulation). AI-assisted positioning focuses more on optimizing existing systems, is highly adaptable, and can be fine-tuned using AI technology in different scenarios to adapt to environmental changes.
[0064] Taking UE, gNB, and LMF positioning as examples, typical application scenarios for assisted positioning include the following.
[0065] Scenario 5 (Case 2a): UE-assisted / LMF-based positioning, using the UE-side model for AI / ML-assisted positioning.
[0066] Scenario 6 (Case 3a): NG-RAN node-assisted positioning, using the gNB-side model for AI / ML-assisted positioning.
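The structural difference between direct and AI-assisted positioning can be sketched as two small pipelines. Everything below is hypothetical: the function names and the toy stand-ins for the model and the conventional solver exist only to show where the AI model sits in each flow, and none of it is taken from this application.

```python
from typing import Callable, List, Sequence, Tuple

Position = Tuple[float, float]

def direct_positioning(raw: Sequence[float],
                       model: Callable[[Sequence[float]], Position]) -> Position:
    """Direct positioning: the model maps raw measurements straight to coordinates."""
    return model(raw)

def assisted_positioning(raw: Sequence[float],
                         model: Callable[[Sequence[float]], List[float]],
                         solver: Callable[[Sequence[float]], Position]) -> Position:
    """Assisted positioning: the model refines measurements, a conventional solver locates."""
    refined = model(raw)
    return solver(refined)

# Toy stand-ins (assumptions for illustration only).
def toy_direct_model(raw):
    return (raw[0] + raw[1], raw[0] - raw[1])

def toy_refiner(raw):
    return [r - 0.5 for r in raw]  # e.g. removing a fixed measurement bias

def toy_solver(measurements):
    return (measurements[0], measurements[1])

direct_result = direct_positioning([3.0, 1.0], toy_direct_model)
assisted_result = assisted_positioning([3.0, 1.0], toy_refiner, toy_solver)
print(direct_result, assisted_result)
```

In the direct flow the model output is the location itself; in the assisted flow the model output is an intermediate quantity and the final calculation remains a traditional one.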
[0067] The following sections will introduce the AI-driven network-side positioning system architecture and the UE-side positioning system architecture, respectively.
[0068] In advanced 5G and the upcoming 6G networks, AI-driven positioning architectures typically consist of two main parts: the network side (such as LMF, gNB, etc.) and the user equipment (UE) side. These two parts each play different roles in the positioning system and perform different tasks to collaboratively achieve high-precision, high-efficiency positioning services.
[0069] Network-side positioning systems typically include key network entities such as LMFs, next-generation base stations (e.g., gNBs), and core network nodes. The network side is primarily responsible for centralized computing and data management, handling large-scale network information and complex operations related to positioning tasks. The network side collects a large amount of positioning-related data, which may come from multiple base stations, reference signals (such as positioning reference signals (PRS)), and the UE. Network-side nodes uniformly collect various measurements, such as time of arrival (TOA) and angle of arrival (AOA).
[0070] Optionally, at the AI / ML model deployment and inference level, the network side can deploy AI / ML models in nodes such as LMF or gNB. These models perform location inference based on collected measurement data. The network side has powerful computing resources, capable of processing large amounts of input data and inferring the precise location of the UE in real time.
[0071] As an example, when an AI model is deployed on the LMF side, it is responsible for executing complex localization algorithms and integrating signals from multiple measurement points, applying deep learning or other AI technologies to improve localization accuracy. Especially in direct AI / ML localization scenarios, the LMF is responsible for performing localization inference independently without UE assistance.
[0072] As an example, when the AI model is deployed on the gNB side, in AI / ML-assisted positioning scenarios, the model on the gNB side works in collaboration with the model on the UE side to improve positioning accuracy and coverage by optimizing signal processing and data transmission paths.
[0073] Optionally, at the signaling control layer, the network side needs to perform a large number of control tasks, including coordinating communication between the UE and the gNB, sending auxiliary data (such as location calculation auxiliary data) to the UE, and managing measurement reports. These control tasks ensure that the UE and the network-side positioning system can cooperate efficiently.
[0074] In summary, the network-side architecture has the advantage of stronger computing power, enabling it to handle complex AI / ML models, which supports large-scale user localization. Furthermore, the network side can access global network status and topology information, combining global data to optimize localization inference.
[0075] The UE-side positioning system is primarily responsible for performing distributed positioning tasks, especially in low-latency and high-dynamic scenarios. The UE needs to calculate or assist the network in completing the positioning task in real time based on local signals and some network-side data. The UE is responsible for processing local measurement data in real time, such as signal arrival time, signal strength, and angle. Based on this data, the UE can perform some preprocessing tasks and, in certain scenarios, independently perform position estimation (e.g., direct positioning using the UE-side model).
[0076] Optionally, at the AI / ML model deployment and inference level, lightweight AI / ML models can be deployed on the UE side. These models can perform UE location estimation based on local measurement data and auxiliary data received from the network side. For example, in direct positioning using the UE-side AI model, the AI / ML model performs inference based on received signals and local data to generate real-time location estimation results. Furthermore, the UE communicates with the network side via signaling to provide measurement data or request auxiliary information, such as location calculation auxiliary data. This collaborative positioning architecture enables the UE to perform positioning inference more flexibly.
[0077] Compared to the network-side architecture, the UE-side architecture has the advantage of being able to perform local inference, reducing the latency of data upload and processing, making it particularly suitable for application scenarios that require real-time feedback (such as augmented reality and vehicle-to-everything).
[0078] Neural Networks
[0079] AI research, exemplified by neural networks, has achieved significant results in many fields and will continue to impact people's lives and work for a long time to come. A neural network can be understood as a computational model composed of multiple interconnected neurons. In a neural network, the connection strength between nodes can be represented as the weighted values corresponding to the input signals, also known as parameters. Each neuron performs a weighted summation of different input signals and outputs the result through a specific activation function. Neurons can achieve nonlinear mappings depending on the activation function.
[0080] Taking the neuron shown in Figure 4 as an example, the input of the neuron can be denoted as A, each dimension of the input can be denoted as a_j, and the corresponding weighted value is denoted as w_j, where j takes values of 1, 2, ..., n. The neuron's input can also be set with a bias term to adjust the output, as shown by the constant 1 in Figure 4 (corresponding to the weighting value denoted as b). The weighting value, together with the summation units (SU), enhances or weakens the input. The output of the SU can be input into the activation function f to obtain the output t.
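Written as a formula, the neuron described above computes the following. This is the standard form implied by the description, using the symbols from the text, with the bias b weighting the constant input 1; the intermediate symbol s for the SU output is introduced here for readability.

```latex
s = \sum_{j=1}^{n} w_j a_j + b \cdot 1, \qquad t = f(s) = f\!\left(\sum_{j=1}^{n} w_j a_j + b\right)
```

For instance, with n = 2, inputs (a_1, a_2) = (2, 1), weights (w_1, w_2) = (1, -1), b = 0, and f chosen as the rectified linear unit f(s) = max(0, s), the SU output is s = 1 and the neuron output is t = 1.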
[0081] Common neural networks include convolutional neural networks (CNN), recurrent neural networks (RNN), and deep neural networks (DNN).
[0082] The neural network applicable to the embodiments of this application is described below with reference to Figure 5. The neural network shown in Figure 5 can be divided into three categories according to the position of different layers: input layer 510, hidden layer 520, and output layer 530. Generally, the first layer is the input layer 510, the last layer is the output layer 530, and the intermediate layers between the first and last layers are hidden layers 520.
[0083] The input layer 510 is used to input data, which may be, for example, a signal received by a receiver. The hidden layer 520 is used to process the input data, for example, to decompress the received signal. The hidden layer may also be called an intermediate layer. The output layer 530 is used to output the processed data, for example, to output the decompressed signal.
[0084] Referring to Figure 5, the neural network consists of multiple layers, each containing multiple neurons. Neurons between layers can be fully connected or partially connected. For connected neurons, the output of a neuron in one layer can serve as the input to a neuron in the next layer.
[0085] To facilitate understanding, the following uses a CNN as an example, with reference to Figure 6, to illustrate the multiple layers in a neural network. A CNN is a deep neural network with convolutional structures. As shown in Figure 6, the structure of a CNN may include an input layer 610, a convolutional layer 620, a pooling layer 630, a fully connected layer 640, and an output layer 650. The convolutional layer 620, pooling layer 630, and fully connected layer 640 are the intermediate layers of this CNN.
[0086] Each convolutional layer 620 can contain multiple convolution operators, also known as kernels. These operators can be viewed as filters that extract specific information from the input signal. Essentially, a convolution operator is a parameter matrix, which is usually predefined. The parameter values in these matrices need to be obtained through extensive training in practical applications to help the CNN make correct predictions. When a CNN has multiple convolutional layers, the initial convolutional layers tend to extract more general features, which can also be called low-level features. As the CNN depth increases, the features extracted by later convolutional layers become increasingly complex.
[0087] Pooling layers 630 are often introduced periodically after convolutional layers to reduce the number of training parameters and the space required for information extraction. Pooling layers can be introduced in various ways; for example, as shown in Figure 6, a pooling layer can follow a convolutional layer, or multiple convolutional layers can be followed by one or more pooling layers.
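To illustrate how a convolution operator acts as a filter and how pooling reduces the amount of data, the following sketch applies a one-dimensional kernel to a short signal and then max-pools the result. The one-dimensional setting, the kernel values [-1, 1], and the function names are assumptions chosen for brevity; the CNN of Figure 6 is not limited to this form.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal (no padding, stride 1)."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, size):
    """Keep the maximum of each non-overlapping window of the given size."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

signal = [0, 0, 1, 1, 1, 0, 0, 0]
edge_kernel = [-1, 1]            # responds to rising and falling edges
feature_map = conv1d(signal, edge_kernel)
pooled = max_pool1d(feature_map, 2)
```

Here the kernel produces a nonzero response only where the signal changes, and the pooling step keeps the strongest response in each window while shrinking the feature map.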
[0088] A fully connected layer 640 is used to generate the final output information. Since the convolutional layer 620 and pooling layer 630 are only responsible for extracting features and reducing parameters introduced by the input data, their processing is insufficient to generate the required output information; therefore, a fully connected layer 640 is introduced. Typically, the fully connected layer 640 may also include multiple hidden layers, the parameters of which can be pre-trained using training data relevant to the specific task type. For example, this task type may include decoding data signals received by a receiver. Another example is channel estimation based on pilot signals received by the receiver.
[0089] Following the multiple hidden layers in the fully connected layer 640, the final layer of the entire CNN, the output layer 650, is used to output the result. Typically, the output layer 650 is equipped with a loss function (e.g., a loss function similar to classification cross-entropy) to calculate the prediction error, or to evaluate the degree of difference between the output of the CNN model (also known as the predicted value) and the ideal result (also known as the true value).
[0090] It should be noted that the CNN shown in Figure 6 is only an example of a convolutional neural network. In specific applications, convolutional neural networks can also exist in the form of other network models, and this application does not limit this.
[0091] The preceding text, with reference to Figures 4 to 6, introduced the various layers of a neural network and their importance within the network. As can be seen, a neural network model includes an input layer, an output layer, and intermediate layers. Each layer of the neural network has its specific responsibilities and functions. The input layer is responsible for receiving and preprocessing data; the intermediate layers are responsible for feature extraction and processing; and the output layer is responsible for transforming the processed data into the desired output result. Through the interaction of multiple layers, the neural network can process and analyze complex data and extract useful information from it.
[0092] Optionally, the input or output of any intermediate layer in a neural network model can be called an intermediate feature of the model. That is, the intermediate feature of the model can be the output of the input layer or any intermediate layer, or it can be the input of any intermediate layer or output layer.
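A minimal sketch of how intermediate features might be collected is given below. It assumes a small fully connected network with ReLU activations in the hidden layers; the layer sizes, weights, and function names are hypothetical and serve only to show that the output of each intermediate layer can be recorded during a forward pass.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]

def dense(x, weights, biases):
    """Fully connected layer: one row of weights per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def forward_with_features(x, layers):
    """Run x through the layers and record every intermediate-layer output."""
    features = []
    h = x
    for index, (weights, biases) in enumerate(layers):
        h = dense(h, weights, biases)
        if index < len(layers) - 1:      # not the output layer
            h = relu(h)
            features.append(h)           # intermediate feature of the model
    return h, features

# Hypothetical 2-2-1 network.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),   # hidden (intermediate) layer
    ([[1.0, 2.0]], [0.5]),                     # output layer
]
output, intermediate = forward_with_features([2.0, 1.0], layers)
```

The same recording step can be applied during both training and inference, which yields two sets of intermediate features that can later be compared.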
[0093] In the field of AI, one of the core functions of neural networks is to automatically extract features from input data. Neural networks extract features from input data through a hierarchical structure. Each layer is responsible for extracting features at different levels, from simple to complex. The feature extraction process is accomplished through the network's weights, biases, and non-linear activation functions. The features extracted by neural networks include low-level features and high-level features.
[0094] Optionally, the initial layers of a network can extract low-level features from the data. For example, for an image, the initial layers might extract simple features such as edges and textures. In communication signal processing, low-level features might include basic signal features such as frequency, phase, and amplitude.
[0095] Optionally, as the number of network layers increases, the features extracted by the neural network become increasingly complex and abstract, capable of representing higher-level patterns. For example, for images, the network may identify shapes and objects. In communication, it may extract the modulation schemes, coding structures, and even patterns of interference signals from complex signals.
[0096] Model monitoring
[0097] Currently, monitoring the performance of AI / ML models is a topic of discussion. Taking AI models as an example, some solutions attempt to monitor performance by comparing the statistical characteristics of the AI model's inputs or outputs. These solutions involve comparing the statistical characteristics of the AI model's training data with those of wireless signal data collected in actual deployment scenarios.
[0098] Commonly used statistical methods include kernel density estimation (KDE) and maximum mean discrepancy (MMD). These will be described in detail below.
[0099] Multidimensional kernel density estimation is based on statistics from multidimensional data. Assuming the data x is d-dimensional, the sample data points can be denoted as $\{x_1, x_2, \ldots, x_n\}$, where each $x_i \in \mathbb{R}^d$, $1 \le i \le n$, and $\mathbb{R}^d$ represents the d-dimensional real space. The goal of multidimensional KDE is to estimate the probability density function in the data space using a kernel function K(·). The estimation formula is:
$$\hat{f}(x) = \frac{1}{n h^{d}} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$
[0100] Where x is the feature point to be estimated; h is the bandwidth parameter (smoothing parameter); d is the dimension of the data x; and K(·) is the multidimensional kernel function, for which a multidimensional Gaussian kernel function is usually chosen.
[0101] Optionally, in the multidimensional case, the Gaussian kernel function, as a commonly used kernel function, takes the form:
$$K(u) = \frac{1}{(2\pi)^{d/2}} \exp\!\left(-\frac{1}{2} u^{\top} u\right)$$
[0102] Optionally, the bandwidth $h$ is a key parameter affecting the estimation performance of KDE. A larger bandwidth results in a smoother estimated distribution; a smaller bandwidth results in a "sharper" estimated distribution. In multidimensional KDE, the bandwidth can be a scalar or a diagonal matrix $H$, used to select different smoothing parameters for each dimension. For each dimension $j$, the corresponding bandwidth parameter $h_j$ affects the distribution estimation along that dimension.
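The following sketch evaluates a multidimensional KDE at a single point, assuming the commonly used estimator form with a scalar bandwidth h and a multidimensional Gaussian kernel. The function names and the sample values are assumptions made for illustration; a per-dimension bandwidth would require scaling each coordinate separately.

```python
import math

def gaussian_kernel(u):
    """Multidimensional Gaussian kernel K(u) = (2*pi)^(-d/2) * exp(-|u|^2 / 2)."""
    d = len(u)
    squared_norm = sum(c * c for c in u)
    return math.exp(-0.5 * squared_norm) / (2.0 * math.pi) ** (d / 2.0)

def kde(x, samples, h):
    """Estimate the density at point x from d-dimensional samples, bandwidth h."""
    n = len(samples)
    d = len(x)
    total = 0.0
    for x_i in samples:
        u = [(a - b) / h for a, b in zip(x, x_i)]
        total += gaussian_kernel(u)
    return total / (n * h ** d)

samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
sharp = kde([0.0, 0.0], samples, h=0.5)    # smaller bandwidth, sharper estimate
smooth = kde([0.0, 0.0], samples, h=2.0)   # larger bandwidth, smoother estimate
```

Evaluated at a sample point, the smaller bandwidth gives a higher peak than the larger one, which matches the smoothing behavior described above.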
[0103] In summary, kernel density estimation (KDE) is a nonparametric method for estimating the probability density function of a random variable. This method locally smooths each data point across the entire data space to estimate the probability density function of the random variable. It provides a detailed analysis of the data distribution characteristics and is particularly suitable for handling high-dimensional data. Compared to MMD, KDE can more accurately describe the distributional variations of features in high-dimensional data.
[0104] In certain scenarios, after obtaining different probability density functions using the KDE method, the Kullback-Leibler divergence (KL divergence, also known as relative entropy) can be used to quantify the differences between them. KL divergence is an indicator that measures the difference between two probability distributions; a larger value indicates a more significant difference between the two distributions, while a smaller value indicates a smaller difference.
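As a hedged illustration, the KL divergence between two estimated densities can be approximated by evaluating both densities on a common grid and summing over the grid points. The discretization, the normalization step, and the small constant eps used to avoid division by zero are assumptions of this sketch.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(P || Q) of two non-negative weight vectors."""
    if len(p) != len(q):
        raise ValueError("p and q must be evaluated on the same grid")
    sum_p, sum_q = sum(p), sum(q)
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / sum_p, q_i / sum_q     # normalize to probabilities
        if p_i > 0.0:
            total += p_i * math.log(p_i / max(q_i, eps))
    return total

same = kl_divergence([0.5, 0.5], [0.5, 0.5])
different = kl_divergence([0.5, 0.5], [0.9, 0.1])
reverse = kl_divergence([0.9, 0.1], [0.5, 0.5])
```

Identical distributions give a divergence of zero, and the value grows as the distributions move apart. Note that the KL divergence is not symmetric, so the order of the two distributions matters.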
[0105] MMD is a tool in statistics and machine learning used to measure the difference between two probability distributions. MMD is a distance metric based on kernel methods used to determine whether two probability distributions are identical.
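[0106] Suppose there are two sample sets $X = \{x_1, x_2, \ldots, x_m\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$, drawn from two distributions P and Q, respectively. MMD can determine whether P and Q are the same by calculating the difference in mean embeddings of the two distributions in the reproducing kernel Hilbert space (RKHS).
[0107] MMD is defined based on the kernel mean embedding of the samples. Its mathematical expression is as follows:
$$\mathrm{MMD}(P, Q) = \sup_{f \in \mathcal{F}} \left( \mathbb{E}_{x \sim P}[f(x)] - \mathbb{E}_{y \sim Q}[f(y)] \right)$$
[0108] Where $\mathcal{F}$ is the set of functions in the RKHS; f is the feature mapping function defined on X; and $\mathbb{E}$ represents the expected value. This difference can be calculated using the kernel trick by introducing a kernel function k(x,y). The unbiased estimate of the squared MMD is:
$$\widehat{\mathrm{MMD}}_u^{2}(X, Y) = \frac{1}{m(m-1)} \sum_{i=1}^{m} \sum_{j \ne i} k(x_i, x_j) + \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j \ne i} k(y_i, y_j) - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j)$$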
[0109] Here, k(x,y) is used as the kernel function, and a Gaussian kernel or a polynomial kernel is usually chosen.
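A minimal sketch of the unbiased squared-MMD estimate with a Gaussian kernel is shown below. The kernel width sigma, the function names, and the sample values are assumptions for illustration. Because the estimate is unbiased, it can be slightly negative when the two sample sets come from very similar distributions.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-|x - y|^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def mmd2_unbiased(xs, ys, kernel=gaussian_kernel):
    """Unbiased estimate of the squared MMD between sample sets xs and ys."""
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("each sample set needs at least two points")
    # Within-set terms exclude the diagonal (i == j).
    k_xx = sum(kernel(xs[i], xs[j]) for i in range(m) for j in range(m) if i != j)
    k_yy = sum(kernel(ys[i], ys[j]) for i in range(n) for j in range(n) if i != j)
    k_xy = sum(kernel(x, y) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)

xs = [[0.0], [0.1], [0.2], [0.3]]
ys_close = [[0.05], [0.15], [0.25], [0.35]]
ys_far = [[3.0], [3.1], [3.2], [3.3]]
mmd_close = mmd2_unbiased(xs, ys_close)
mmd_far = mmd2_unbiased(xs, ys_far)
```

No density is estimated explicitly; only pairwise kernel evaluations are needed, at a cost that grows with the product of the sample-set sizes.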
[0110] In summary, MMD is a nonparametric statistical technique based on kernel methods. The core of MMD is to compare the means of the characteristic distributions of two sample sets to measure the difference between them, thereby determining whether the two sample sets originate from the same probability distribution. The advantage of this method is that it does not require explicit estimation of the probability density function; instead, it calculates the difference in means between samples using a kernel function, making the calculation process more efficient and direct.
[0111] The preceding text introduced various statistical methods for monitoring AI models. Regardless of the method used, monitoring schemes that compare differences in the statistical characteristics of AI model inputs or outputs have many limitations, as detailed below. It should be noted that the embodiments in this application are based on the following analysis, which is not prior art and should be considered part of this application's contribution to the prior art.
[0112] The statistical characteristics of AI model input data typically refer to the distribution characteristics of the raw data received by the AI model (such as multidimensional channel state information (CSI), time of arrival, angle of arrival, etc.), including mean and variance. These monitoring methods detect the consistency and potential bias of the model input by comparing changes in the distribution characteristics of the input data between the training and deployment phases. However, designing an AI model performance monitoring scheme based solely on the differences in the statistical characteristics of input data has several limitations. Specifically, AI model performance monitoring schemes based on differences in the statistical characteristics of input data will have the following problems.
[0113] First, the sensitivity of input data characteristics to model performance is insufficient. Statistical features of input data, such as mean and variance, are low-dimensional attributes that typically only reveal the overall trend of the data. However, the performance of an AI model, especially in terms of prediction accuracy, actually depends on its ability to perform in-depth analysis and extract complex features from the input data. In many cases, changes in the statistical features of the input data do not directly equate to corresponding changes in model performance, especially when data changes are subtle or confined to a specific area. For example, in localization tasks, the statistical features of certain wireless signals (such as the average value of the received signal strength indication (RSSI)) may only fluctuate slightly, but this does not guarantee that the deep features extracted by the AI model are not significantly affected. Furthermore, model performance may be slow to react to overall changes in the input data or be difficult to detect. In complex wireless environments, such as those with multipath propagation or non-line-of-sight propagation, even if the statistical features of the input data appear unchanged, small changes in the environment can have a significant impact on the AI model's decision-making process. These complex patterns often cannot be fully captured simply by analyzing the low-dimensional statistical characteristics of the input data.
[0114] Second, it leads to information loss. AI models typically process high-dimensional input data. For example, in the field of wireless communication, the data processed by AI models may include multidimensional CSI, time of arrival, angle of arrival, etc. When analyzing this data, statistical properties often only capture overall trends or basic numerical features, such as mean and variance, which may lead to the omission of complex patterns and subtle differences contained in the data. In other words, when input data is compressed into low-dimensional statistical indicators (such as mean and variance), the complex information in the original high-dimensional data (such as multipath effects and signal attenuation) may be lost, thus failing to accurately reflect the internal state and potential problems of the model when processing the data. In addition, this compression may also make it difficult for AI models to extract details and complex patterns from the input data, because relying solely on statistical properties may fail to detect subtle changes that affect the model's decision-making ability, such as interference or noise fluctuations in specific frequency bands. Therefore, in order to fully understand and optimize the performance of AI models, it is necessary to go beyond traditional low-dimensional statistical analysis and delve deeper into and utilize the high-dimensional information in the input data.
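The information loss described above can be illustrated with a small constructed example: two data sets that share the same mean and variance but have clearly different distributions. The data values, the variable names, and the "fraction near zero" check are assumptions made purely for illustration and are not measurement data.

```python
import statistics

# Two-peak data: all values sit at -1 or +1.
bimodal = [-1.0, 1.0] * 50

# Data concentrated at 0 with occasional larger excursions.
c = 2.0 ** 0.5
centered = [-c, 0.0, 0.0, c] * 25

def summary(values):
    """Low-dimensional statistics: (mean, population variance)."""
    return statistics.mean(values), statistics.pvariance(values)

def fraction_near_zero(values, radius=0.5):
    """Share of samples within the given radius of zero."""
    return sum(1 for v in values if abs(v) < radius) / len(values)

mean_a, var_a = summary(bimodal)
mean_b, var_b = summary(centered)
near_a = fraction_near_zero(bimodal)
near_b = fraction_near_zero(centered)
```

Both sets have a mean of 0 and a variance of 1 (up to floating-point rounding), yet none of the first set lies near zero while half of the second set does. A monitor that compares only the mean and variance would treat the two sets as equivalent.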
[0115] Third, it exhibits poor robustness to anomalous inputs. AI models may encounter anomalies when processing input data, such as equipment malfunctions, malicious attacks, and carefully crafted adversarial examples. These anomalous data may not show significant statistical abnormalities on the surface, but they are sufficient to cause the AI model to output significantly erroneous results. The adversarial example problem is particularly challenging because, although these samples are statistically very similar to normal inputs and rarely trigger alerts from statistically based monitoring systems, they can induce the model to make extremely incorrect decisions. Furthermore, input data may contain defects due to equipment malfunctions or errors during data acquisition. If monitoring relies solely on statistical characteristics, these errors may be overlooked because they may not necessarily affect the overall statistical properties of the data.
[0116] Fourth, there is a lack of insight into the model's internal workings. Monitoring based solely on the statistical properties of input data is essentially a peripheral monitoring method; it focuses only on data changes without delving into how the model processes that data. The limitation of this approach is that it fails to reflect the internal state and operational mechanisms of the AI model. AI models progressively extract and transform features through the multi-layered structure of neural networks, and input data is merely the starting point of this complex process. Therefore, even if the statistical properties of the input data remain unchanged, the model may have already incorrectly analyzed and processed certain inputs in subsequent processing. Such internal problems are difficult to detect in a timely manner using traditional statistical monitoring methods. Furthermore, merely monitoring the statistical properties of input data offers little help in understanding the model's decision-making process, especially when there are complex nonlinear relationships between different layers of the model. This limitation can lead to difficulties in accurately locating and effectively resolving problems, as it is impossible to clearly determine whether the problem originates from the input data itself or from a malfunction in the model's internal processing.
[0117] Fifth, it is insensitive to environmental changes. In communication scenarios, environmental changes such as multipath propagation and building obstruction can significantly impact wireless signals, but these effects are often not directly reflected in the statistical characteristics of the input data. This insensitivity means that even if the communication environment changes, such as a user entering a densely built-up area, the statistical characteristics of the input signal may remain unchanged, while the channel conditions and signal propagation paths may have fundamentally changed. In this case, the AI model's adaptability to the new environment may decrease, but these problems may not be detected in time based solely on the statistical characteristics of the input data. Furthermore, monitoring methods that rely on the characteristics of the input data cannot effectively cope with dynamic changes in the scenario, especially when the model needs to demonstrate its generalization ability in new environments. This may lead to delayed detection or even complete failure to detect performance issues in dynamic environments, thus affecting the overall performance and reliability of the model.
[0118] The statistical characteristics of output data typically refer to the distribution of the final prediction results (such as location, classification labels, etc.) of an AI model. Monitoring methods of this type detect changes in model performance by comparing changes in the model's output distribution between the training and deployment phases. While this method is somewhat intuitive, it also has several limitations. Specifically, AI model performance monitoring schemes based on differences in the statistical characteristics of output data suffer from the following problems.
[0119] First, the output data is not sensitive enough to changes in the input. AI model output data is typically low-dimensional, especially in tasks like localization, where the output is often a single physical quantity, such as location coordinates. This low-dimensional output does not always fully reflect the details and feature changes in the input data. Due to the output data's insensitivity to input changes, changes in its statistical properties often only become apparent when the input data undergoes significant changes. For example, in a wireless signal environment, even subtle changes in multipath effects or channel conditions may not be immediately reflected in the model's output. Therefore, relying on the statistical properties of the output data for performance monitoring may delay the detection of model performance anomalies. Furthermore, since model inputs are typically high-dimensional data, such as CSI and time of arrival, while the output is a simplified numerical value, subtle changes in the input data may be "masked" or "compressed" at the output, failing to accurately reflect the true complexity of the input. This conversion from high-dimensional input to low-dimensional output may prevent subtle changes in the model's processing of input data from being fully reflected in the output, thus affecting the accurate evaluation and timely monitoring of model performance.
[0120] Second, the statistical properties of the output data suffer from information loss. This loss primarily stems from the fact that the output data is the result of multiple layers of processing by the AI model, containing far less information than the original input data. During the operation of the AI model, high-dimensional input data (such as signal data containing complex multipath propagation information and reflection interference) is mapped and simplified into low-dimensional output results (e.g., location coordinates). This high-dimensional to low-dimensional transformation inevitably leads to information compression and loss, thus failing to retain all key environmental patterns in the output data. Furthermore, in wireless communication environments, the influence of buildings, obstacles, and other reflection factors on signal characteristics may manifest as subtle changes in the input data, but these subtle changes are often not reflected in the low-dimensional output data. Therefore, relying solely on the statistical properties of the output data for monitoring may fail to capture changes in complex environments or reveal whether anomalies have occurred in the model's processing of the input data. This means that output statistical properties can only reflect changes in the final result and cannot provide detailed information about potential problems in the model's internal processing.
[0121] Third, the ability to detect anomalous inputs is limited. AI models may encounter anomalous inputs during operation, including adversarial examples, data errors, or noise interference. These anomalous inputs may not immediately lead to significant changes in the output. Therefore, relying solely on the statistical properties of the output data for monitoring may not be effective in detecting these anomalous inputs. For example, adversarial examples are specifically designed to interfere with the AI model's decision-making process without significantly altering the statistical properties of the input data. Adversarial examples may cause significant changes at the model's internal feature extraction level, but because the statistical properties of the output data are not significantly different from normal data, it is difficult to detect such potential threats through output monitoring. Furthermore, the complexity of wireless communication environments can lead to various errors during signal transmission, such as equipment malfunctions or signal interference. While these errors may be reflected at the input data level, they may not be immediately reflected in the output. For example, in a localization task, the model's output location information may have only a slight deviation, while the input data may have been severely interfered with. In such cases, relying solely on the statistical properties of the output data is insufficient to effectively identify and address these problems, thus highlighting the limited ability to detect anomalous inputs.
[0122] Fourth, robustness to environmental changes is poor. In communication environments, particularly mobile communication and positioning scenarios, environmental changes significantly impact signal propagation paths, strength, and delays. These changes may not be immediately or clearly reflected in the statistical properties of the model output, resulting in poor robustness to environmental changes. For example, when the environment changes, such as entering an area with severe signal attenuation, the statistical properties of the output results may not quickly reflect this change, making it difficult for monitoring systems that rely on output data to detect the decline in model adaptability in a timely manner. Similarly, in positioning tasks, even though the wireless signal has undergone significant changes, the output coordinates may still fluctuate within a small range. Furthermore, environmental changes are usually gradual; the statistical properties of the model's output may not show significant changes in the short term, but this does not mean that the model performance is unaffected. Long-term cumulative environmental changes may lead to gradual degradation of model performance, eventually causing sudden and significant deviations in the output results. In such cases, by the time monitoring of the output data reveals the problem, the optimal intervention opportunity has often been missed.
[0123] In addition to the issues mentioned above, some solutions for real-time monitoring of AI / ML model performance also require real-world location data, i.e., ground truth. However, in real-world wireless communication scenarios, due to the instability of wireless channels and the variability of environmental factors, directly obtaining real-world location data is both costly and technically difficult. In other words, real-world location data for terminal devices is difficult to obtain during the model deployment phase. Therefore, in wireless communication positioning scenarios, effectively monitoring the performance changes of deployed AI / ML models in the absence of real-world location data becomes a key technical challenge that needs to be addressed.
[0124] Based on this, this application proposes a method for use in nodes of wireless communication. This method achieves real-time monitoring of a first model by comparing the feature differences between data collected during the training and inference phases, reducing the cost and complexity of obtaining ground truth. Furthermore, the data features used for difference comparison are intermediate features of the first model, overcoming various limitations of monitoring by comparing input or output data. Taking an AI model for positioning as an example, this performance monitoring method based on extracting intermediate features from the AI model solves the challenge of real-time monitoring of AI model performance in wireless communication environments lacking real location data, providing stable and reliable technical support for positioning services in wireless communication, and is particularly suitable for high-precision positioning tasks in 5G / 6G networks.
[0125] It should be understood that the embodiments of this application can not only solve the real-time monitoring problem of AI / ML models used for positioning, but also solve the real-time monitoring problem of AI / ML models in other scenarios. Furthermore, the method of the embodiments of this application is not limited to UE-side models, but can also be used for NW-side models, or for dual-side models on both the UE and NW sides.
[0126] For ease of understanding, the method for a first node in wireless communication proposed in this application embodiment will be described below with reference to Figure 7. The method shown in Figure 7 includes step S710, which can be executed by the first node.
[0127] In some implementations, the first node can be any of the communication devices described above. In some implementations, the first node can support the deployment of the first model and support real-time monitoring of the first model.
[0128] As an example, the first node can be any type of terminal device, such as the terminal device 120 shown in Figure 1.
[0129] As an example, the first node can be a relay, such as a relay terminal.
[0130] As an example, the first node can be any type of network device, such as network device 110 shown in Figure 1.
[0131] Referring to Figure 7, in step S710, the performance of the first model is monitored based on the difference in feature distribution between the first data feature and the second data feature.
[0132] The first model can be an AI / ML model, or a model with similar functionality. In some implementations, the first model can be used to locate the first node. In some implementations, the first model can be used for beam prediction.
[0133] As an example, the first model is deployed on the first node side. When the first node is a terminal device, the first model is a UE-side model. When the first node is a network device, the first model is a network-side model. For example, in scenario 1 described above, the first model is deployed on the UE side, and the first model can directly output location information.
[0134] In the above embodiments, the deployment of the first model on the first node side can mean that the first model is deployed directly on the first node, or it can mean that the first model is deployed on a server connected to the first node.
[0135] In the above embodiments, the first model can be deployed on the first node side before training or after training is completed. In this scenario, the inference phase in which the first node uses the first model for prediction can also be called the deployment phase.
[0136] In some embodiments, when the first node is a terminal device, the first node can train a first model independently, and then perform model inference and report the results. For example, the first node can collect training data or historical measurement data to train the first model. After completing the training of the first model, the first node can send the training data and the trained first model to the second node.
[0137] In some embodiments, when the first node is a network device, the first node can train the first model based on the data reported by the terminal device, and then perform model inference and result distribution. After completing the training of the first model, the first node can also send the training data and the trained first model to the second node.
[0138] In some embodiments, the first model can be trained by various devices communicating with the first node, and then deployed to the first node after training. For example, a second node can train the first model and then send the trained first model to the first node. Alternatively, a server connected to the first node can train the first model and then send the trained first model to the first node. Or, when the first node is a terminal device, the first model can be trained by a network device providing services to the area where the first node is located.
[0139] In the above embodiment, the server connected to the first node can send the training data of the first model to the second node.
[0140] The second node can be any network-side device that communicates with the first node and supports the deployment of the first model, such as a network entity in the core network. When the first model is used for positioning, the second node can be a location management function (LMF). When the first model is used for beam prediction, the second node can be a corresponding network entity or network device.
[0141] As one example, after the second node completes training the first model, it can distribute the first model to the first node. After receiving the first model from the second node, the first node can deploy the first model on its side. Taking the first node as the UE and the second node as the LMF as an example, during the system initialization phase, the first model is first trained on the device where the LMF is located. The training data includes historical wireless signals and their corresponding location information. After training is completed, the first model is distributed to the UE for deployment.
[0142] As an example, the LMF also takes on the responsibility of model updates. Once the periodic training and optimization of the first model is complete, the LMF will distribute the updated first model to the UE. When the first model is used for localization, the UE can use the updated model to perform localization tasks.
[0143] As an example, the UE can trigger the LMF to train or update the first model. Once training or updating is complete, the LMF transmits the new first model to the UE via signaling. This signaling could be, for example, radio resource control (RRC) signaling.
[0144] In some embodiments, the second node can communicate directly with the first node or through other devices. As an example, when the first node is a terminal device, the second node can communicate with the first node through a network device. For instance, when the first node is a UE, the network device is a gNB, and the second node is an LMF, the gNB can act as a data transmission relay. That is, the gNB is responsible for reliably forwarding data and signaling from the LMF to the UE, and collecting feedback information from the UE.
[0145] Once the first model is deployed on the first node, the first node can perform performance monitoring on the first model based on event triggers, or it can perform periodic monitoring of the first model's performance according to settings. Therefore, the model monitoring method is used in the runtime phase after model deployment.
[0146] In some embodiments, when the first node periodically monitors the first model, the monitoring period can be defined as T. That is, the monitoring period for the first model is T. As an example, the first node can monitor the performance of the first model based on the period T. As an example, the monitoring period T can be adjusted according to system settings, the complexity of the communication environment, and the type of service.
[0147] Performance monitoring of the first model can be achieved based on the first data feature and the second data feature. Both the first and second data features are outputs extracted from the intermediate layers of the first model. The difference is that the first data feature is the intermediate layer output of the first model during the training phase, while the second data feature is the intermediate layer output of the first model during the inference phase. As mentioned earlier, the inference phase can also be called the deployment phase.
[0148] The intermediate layer output of the first model can be referred to as the intermediate feature of the first model. An intermediate feature is a type of data feature. As mentioned earlier, the intermediate features of a model are data features generated during the model processing. In this embodiment, the intermediate features of the first model are data features obtained after feature extraction and / or processing of any intermediate layer of the first model.
[0149] As an example, the first data feature is the intermediate feature of the first model during the training phase. The intermediate feature during the training phase can be replaced with: the intermediate layer output during the training phase, the intermediate layer feature output based on the training data, and the intermediate layer feature extraction from the training data.
[0150] In the example above, training data can be stored in the second node. For instance, when the second node is an LMF, the training data is stored on the device where the LMF is located. The training data can include wireless signal features collected during the model training phase, providing benchmark data for subsequent feature difference calculations.
[0151] As an example, the second data feature is the intermediate feature of the first model during the inference phase. The intermediate feature during the inference phase can be replaced with: the intermediate layer output during the inference phase, the intermediate layer feature output based on the inference data, and the intermediate layer feature extraction of the inference data.
[0152] In the above example, the inference phase is the application or operation phase of the first model, and the inference data can also be called real-time data or wireless signal (channel) data for the current time period. That is, the first node can perform real-time feature acquisition. For example, when the first node is a terminal device, the terminal device can acquire wireless signal data in the current environment in real time. Optionally, the first model can directly use this data for location prediction. Simultaneously, the first model on the terminal device can also extract features from the wireless signal data to obtain second data features.
[0153] In the example above, the real-time data used as inference data can be randomly input wireless signal data in the current time period, and there is no limitation here.
[0154] In the example above, when the first node is a UE, the UE can wirelessly communicate with the network device and collect wireless signal characteristic data in real time based on the communication status. This wireless signal data can be used by the UE to extract the second data feature.
[0155] As an example, when the first node is a terminal device, it can communicate with network devices. The communication data between the first node and the network devices can serve as input data for the first model to perform relevant predictions. For instance, the first node can collect wireless signal data from communications with other nodes. When the first model is used for positioning, this wireless signal data can serve as input data for positioning and can also be used to monitor the positioning performance of the first model in real time. When the first model is used for beam prediction, this wireless signal data can serve as input data for beam prediction and can also be used to monitor the prediction performance of the first model in real time.
[0156] In the example above, the first node can determine the wireless signal data for the current time period and input the wireless signal data into the first model to determine the second data feature corresponding to the wireless signal data for the current time period. As mentioned earlier, the performance monitoring of the first model can be periodic. The wireless signal data for the current time period can be real-time data for the current monitoring period, and the second data feature also corresponds to the current monitoring period.
[0157] Optionally, the current time period can correspond to any of the multiple monitoring periods to perform model monitoring based on real-time data.
[0158] Optionally, the first node can collect real-time wireless signal data based on wireless communication and perform feature extraction using a first model. For example, the first node can obtain second data features for the current time period based on the output of the first intermediate layer.
[0159] It should be noted that the first model used by the second node to obtain the first data feature is the same as the first model used by the first node to obtain the second data feature. Once the first model is adjusted or updated, it needs to be synchronized to both the first and second nodes. For example, the device where the LMF is located promptly sends the updated first model to the UE, ensuring that the two models used for feature extraction are identical.
[0160] In some embodiments, the first data feature and the second data feature may also be obtained by the same node. For example, the first node may directly obtain the first data feature based on the stored training data, and then obtain the second data feature based on the real-time data.
[0161] In some embodiments, the intermediate layer of the first model can be any processing layer capable of feature extraction, other than the input and output layers. That is, the first model can include multiple intermediate layers. For example, when the first model is an AI model, in a multi-layered neural network of the AI model, each layer is responsible for extracting features at different levels. Based on feature extraction from intermediate layers, model monitoring can capture subtle changes in the input data that may not be easily detected in traditional low-dimensional statistical properties.
[0162] The intermediate layer used to extract the first and second data features at different stages is always the first intermediate layer, facilitating the monitoring of performance changes in the first model. In other words, both the first and second data features are intermediate features output by the first intermediate layer of the first model. In some embodiments, the first intermediate layer can be one of multiple intermediate layers included in the first model. For example, the first model includes K intermediate layers, where K is a positive integer, and the first intermediate layer is one of the K intermediate layers.
[0163] As an example, the first data feature is the intermediate feature output by the first intermediate layer of the first model during the training phase. During the training phase, the first model extracts features from the input data through a multi-layer neural network; these features reflect the complex structure and patterns of the data. For example, the first data feature can be represented as a sample set X = {x1, x2, ..., xm}, where m represents the number of samples for the feature.
[0164] In some scenarios, during the training of the first model, the first model can extract and store the feature X using the training data. In other scenarios, the first data can be input into the first model during monitoring to determine the first data features.
[0165] As an example, the second data feature is the intermediate feature output by the first intermediate layer of the first model during the inference phase. During the inference phase, the first model extracts features from real-time acquired data to reflect the state of the first model at the current time period. For example, the second data feature can be represented as a sample set Y = {y1, y2, ..., yn}, where n represents the number of samples for the feature.
[0166] In the above embodiments, the first node can collect wireless signal data in real time during the current period or under the current environment and extract feature Y. In some scenarios, the first node can perform periodic data collection according to the monitoring period T.
[0167] In some embodiments, the first intermediate layer is an intermediate layer relatively close to the output layer. Features closer to the output layer can be used to provide a deep, highly abstract representation of the data. By comparing the high-dimensional features extracted by the first model during the training and inference phases, the performance of the first model in practical applications can be more reasonably evaluated.
[0168] As an example, the first intermediate layer can be directly connected to the output layer of the first model. For instance, when the first model includes K intermediate layers that sequentially output features starting from the input layer, the first intermediate layer can be the Kth intermediate layer that outputs features. That is, the first intermediate layer can be the penultimate layer in a multi-layer network. For example, the first data feature and the second data feature are the outputs of the penultimate layer of the neural network. The output features of the penultimate layer retain important information from the input data while also possessing a certain degree of abstraction, making it a reasonable choice for monitoring model performance.
[0169] As an example, the first intermediate layer can be the penultimate intermediate layer among multiple intermediate layers that performs feature output.
[0170] For ease of understanding, the following description is based on Figure 8. The model in Figure 8 includes five processing layers for feature extraction: an input layer 801, intermediate layers 802-804, and an output layer 805. The data states processed in the five processing layers are different. The first intermediate layer can be intermediate layer 804, which is close to the output layer 805; that is, the output of intermediate layer 804 can be used as the intermediate feature of this application.
[0171] In some embodiments, the first intermediate layer can be an intermediate layer relatively close to the input layer to improve sensitivity to input changes and reduce information loss. For example, the first intermediate layer can be intermediate layer 802, which is close to the input layer 801 in Figure 8.
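The choice of the first intermediate layer described above can be illustrated with a minimal sketch. It assumes a toy fully connected network whose weights, layer sizes, and `tanh` activation are hypothetical and chosen only for illustration; the names `dense`, `forward_with_features`, and `tap_index` do not appear in this application. The sketch runs one input through the network and also returns the output of one selected intermediate layer, which plays the role of the intermediate feature.

```python
import math

def dense(x, weights, bias):
    """One fully connected layer followed by tanh (hypothetical activation)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def forward_with_features(x, layers, tap_index):
    """Run x through every layer and also return the output of layers[tap_index]."""
    tapped = None
    for i, (weights, bias) in enumerate(layers):
        x = dense(x, weights, bias)
        if i == tap_index:
            tapped = x
    return x, tapped

# Toy network loosely mirroring Figure 8: three intermediate layers
# (802, 803, 804) followed by an output layer (805). All numbers are made up.
layers = [
    ([[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]], [0.0, 0.1, -0.1]),  # 802: 2 -> 3
    ([[0.3, 0.1, -0.2], [0.2, -0.5, 0.4]], [0.05, 0.0]),         # 803: 3 -> 2
    ([[0.6, -0.1], [-0.3, 0.2]], [0.0, 0.0]),                    # 804: 2 -> 2
    ([[1.0, -1.0]], [0.0]),                                      # 805: 2 -> 1
]

K = 3  # number of intermediate layers
sample = [0.2, -0.7]

# Tap the K-th intermediate layer (804), the one closest to the output layer.
prediction, late_feature = forward_with_features(sample, layers, tap_index=K - 1)

# Tap the first intermediate layer (802), the one closest to the input layer.
_, early_feature = forward_with_features(sample, layers, tap_index=0)
```

The same `tap_index` would have to be used in the training phase and in the inference phase so that the two feature sets come from the same layer.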
[0172] The preceding text, with reference to Figures 7 and 8, describes a method for performance monitoring of the first model based on intermediate layer data features (intermediate features). This model monitoring method does not rely on real-world location data, thereby reducing costs and technical difficulty, and making performance monitoring more efficient. It also improves the model's adaptability to environmental changes, effectively detecting performance fluctuations caused by environmental changes and making timely adjustments. Therefore, the model monitoring method proposed in this application is particularly suitable for processing high-dimensional input data, such as CSI and time of arrival data.
[0173] The following is an exemplary description of a specific method for model monitoring based on the first data feature and the second data feature.
[0174] In some embodiments, performance monitoring of a first model based on a first data feature and a second data feature includes comparing the first data feature and the second data feature to monitor the performance of the first model, or performing calculations or post-processing on the first data feature and the second data feature to monitor the performance of the first model.
[0175] In some embodiments, the first node directly monitors the performance of the first model based on the feature distribution differences between the first and second data features, i.e., model monitoring. For example, the first node can determine the feature distribution differences between the first and second data features to evaluate the performance of the first model. The performance evaluation result determined based on the feature distribution differences is used to complete the model monitoring. The method of model monitoring through feature distribution differences can also be called a performance monitoring method based on feature distribution differences, i.e., a feature distribution difference monitoring method. In this method, when the performance of the first model deviates beyond a preset state, the first node or system can determine, based on the feature distribution differences, that there is a problem with the performance of the first model, thereby triggering a correction (also called adjustment) mechanism or update mechanism for the first model.
[0176] As an example, the update mechanism can refer to updating the first model. Optionally, the update mechanism for the first model can achieve a complete update of the first model. Optionally, when the first model is divided into multiple parts, the update mechanism can also be an update of a specific part. For example, a node can update the current AI model based on model monitoring results to obtain an updated AI model.
[0177] As an example, the correction mechanism can refer to adjusting the first model. Optionally, the correction mechanism for the first model can achieve local correction of the first model. Local correction can include parameter adjustment or model adjustment, which is not limited here. For example, a node can adjust certain parameters of the first model according to environmental changes to obtain a corrected AI model.
[0178] In some embodiments, the first node may determine whether to trigger an update or correction of the first model based on the feature distribution difference and a preset threshold. The preset threshold may be a preset error threshold. The preset threshold may be predefined or determined based on higher-level signaling.
[0179] As an example, when the feature distribution difference is greater than or equal to a first preset threshold, the update mechanism of the first model is triggered; or, when the feature distribution difference is greater than or equal to a second preset threshold, the correction mechanism of the first model is triggered.
[0180] As an example, when the feature distribution difference is less than a first preset threshold, the update mechanism of the first model is not triggered; or, when the feature distribution difference is less than a second preset threshold, the correction mechanism of the first model is not triggered.
[0181] As an example, when the difference in feature distribution is greater than a first preset threshold, the update mechanism of the first model is triggered; or, when the difference in feature distribution is greater than a second preset threshold, the correction mechanism of the first model is triggered.
[0182] As an example, when the feature distribution difference is less than or equal to a first preset threshold, the update mechanism of the first model is not triggered; or, when the feature distribution difference is less than or equal to a second preset threshold, the correction mechanism of the first model is not triggered.
[0183] In the above embodiments, the first preset threshold is the same as the second preset threshold; or, the first preset threshold is different from the second preset threshold.
[0184] In the above embodiments, both the update mechanism and the correction mechanism are designed to optimize the first model in order to improve its accuracy.
[0185] In the above embodiments, when the update mechanism is triggered, the second node can receive the update request for the first model triggered by the first node. When the correction mechanism is triggered, the second node can receive the correction request for the first model triggered by the first node.
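The threshold comparisons in the preceding paragraphs can be sketched as follows. This is only an illustration under stated assumptions: the function name `monitoring_decision`, the `inclusive` flag (which switches between the "greater than or equal to" and the "greater than" variants), and the choice to evaluate the update and correction conditions independently are all hypothetical and not specified by this application.

```python
def exceeds(value, threshold, inclusive):
    """Return True when value reaches (inclusive) or strictly exceeds the threshold."""
    return value >= threshold if inclusive else value > threshold

def monitoring_decision(diff, first_threshold, second_threshold, inclusive=True):
    """Decide which mechanism, if any, a feature distribution difference triggers.

    first_threshold  -> update mechanism of the first model
    second_threshold -> correction mechanism of the first model
    The two thresholds may be equal or different.
    """
    return {
        "trigger_update": exceeds(diff, first_threshold, inclusive),
        "trigger_correction": exceeds(diff, second_threshold, inclusive),
    }

# Hypothetical values: a difference of 0.3 with thresholds 0.5 (update) and 0.3 (correction).
inclusive_result = monitoring_decision(0.3, 0.5, 0.3, inclusive=True)
strict_result = monitoring_decision(0.3, 0.5, 0.3, inclusive=False)
```

With the inclusive comparison the difference of 0.3 reaches the second threshold and would trigger the correction mechanism; with the strict comparison it would not.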
[0186] In some embodiments, the feature distribution difference may include at least one of the following: a data distribution error of the first data feature and the second data feature determined by the Maximum Mean Discrepancy (MMD) algorithm; and a KL divergence of the first data feature and the second data feature determined by the Kernel Density Estimation (KDE) algorithm. For example, the first node can determine the data distribution error, i.e., the feature distribution difference, using an MMD-based feature difference calculation scheme. Alternatively, the first node can determine the KL divergence, i.e., the feature distribution difference, using a KDE-based feature distribution difference calculation scheme. KL divergence is as described above. Furthermore, the first node can determine the data distribution error and the KL divergence using the MMD algorithm and the KDE algorithm, respectively, at different time periods or at the same time period.
[0187] As an example, the data distribution error is related to the calculation result of the MMD algorithm. For example, the data distribution error can represent the calculation result of the MMD algorithm. For example, the data distribution error can be the MMD value obtained by the MMD algorithm. Alternatively, the data distribution error can be determined based on the MMD value obtained by the MMD algorithm. For simplicity, the calculation result of the MMD algorithm will be simplified to the MMD result in the following text.
[0188] As an example, when the feature distribution difference is the data distribution error of the first data feature and the second data feature determined by the MMD algorithm, the first preset threshold or the second preset threshold can be a preset error threshold.
[0189] As an example, when the feature distribution difference is the KL divergence of the first data feature and the second data feature determined by the KDE algorithm, the first preset threshold or the second preset threshold can be a threshold preset for the KL divergence.
[0190] In the above embodiments, the data distribution error determined by the MMD algorithm can measure the difference in probability distribution between the first data feature and the second data feature; the KL divergence determined by the KDE algorithm can indicate the difference in probability distribution between the first data feature and the second data feature.
[0191] The following section uses the first model used for localization as an example to introduce the computation schemes based on MMD and KDE, respectively, in conjunction with two implementation examples.
[0192] Example 1: Feature Difference Calculation Scheme Based on MMD
[0193] In this embodiment, the feature distribution difference between the first data feature and the second data feature is the data distribution error. Through the feature difference calculation and monitoring mechanism based on MMD, the system can flexibly and efficiently evaluate the performance changes of the first model, reduce the cost of relying on real data, and ensure the positioning accuracy of the model in complex wireless environments.
[0194] In this embodiment, the MMD algorithm can measure the difference between two distributions based on a kernel function (usually a Gaussian kernel).
[0195] As an example, the first data feature determined by the training data can be represented as X = {x1, x2, ..., xm}, which is extracted by the first model on the second node and stored on the second node or the device where the second node is located. The second data feature determined by the real-time acquired wireless signal data can be represented as Y = {y1, y2, ..., yn}, which is extracted by the first model at the first node and stored at the first node.
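[0196] The formula for calculating the MMD result for the first and second data features mentioned above, in its standard squared form, is as follows:
$$\mathrm{MMD}^2(X, Y) = \mathbb{E}_{x, x'}[k(x, x')] + \mathbb{E}_{y, y'}[k(y, y')] - 2\,\mathbb{E}_{x, y}[k(x, y)]$$
[0197] Where k(x, y) is the kernel function. A Gaussian kernel function can be used, as shown in the following formula:
$$k(x, y) = \exp\left(-\frac{\lVert x - y \rVert^2}{2\sigma^2}\right)$$
[0198] Where σ is the bandwidth parameter of the kernel function, which controls the measurement scale of distance in the feature space; E[·] represents the expected value.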
[0199] This formula allows the MMD result to measure the difference in mean between the distributions of the first and second data features in the kernel function mapping space. A small MMD result indicates that the two feature distributions are very similar, and the first model's performance in the new environment is close to that during the training phase. A large MMD result indicates a significant deviation between the two feature distributions, and the first model may need adjustment or updating.
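[0200] In practical calculations, the MMD result is approximated using a finite set of samples. The discrete form of the MMD result (formula ①) can be:
$$\mathrm{MMD}^2(X, Y) \approx \underbrace{\frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} k(y_i, y_j)}_{(1)} + \underbrace{\frac{1}{m^2}\sum_{i=1}^{m}\sum_{j=1}^{m} k(x_i, x_j)}_{(2)} - \underbrace{\frac{2}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n} k(x_i, y_j)}_{(3)}$$
[0201] Where m represents the number of feature samples extracted from the training data; n represents the number of feature samples of the wireless signal collected in real time within a model performance monitoring period T; k(x_i, x_j), k(y_i, y_j), and k(x_i, y_j) represent the kernel function calculation results between feature samples.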
[0202] Using the discrete form of Formula ①, the system can quickly calculate the MMD value (i.e., data distribution error), thereby measuring the difference in feature distribution between the data features of the training data and the data features of the new / current environment.
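As a hedged illustration of the discrete calculation, the sketch below computes a squared MMD estimate with a Gaussian kernel. It assumes the simple biased estimator (plain averages over all sample pairs, including each sample with itself) and a default bandwidth of 1.0; this application does not state which estimator or bandwidth is used, and the names `gaussian_kernel`, `mean_kernel`, and `mmd_squared` are hypothetical.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel k(a, b) = exp(-||a - b||^2 / (2 * sigma^2))."""
    squared_distance = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def mean_kernel(A, B, sigma):
    """Average kernel value over every pair (a, b) with a in A and b in B."""
    total = sum(gaussian_kernel(a, b, sigma) for a in A for b in B)
    return total / (len(A) * len(B))

def mmd_squared(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between feature sets X (training) and Y (real time)."""
    return (mean_kernel(Y, Y, sigma)          # real-time features among themselves
            + mean_kernel(X, X, sigma)        # training features among themselves
            - 2.0 * mean_kernel(X, Y, sigma)) # training versus real-time features

# Made-up two-dimensional feature samples.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
Y_similar = [[0.05, 0.0], [1.05, 0.0], [0.05, 1.0]]
Y_shifted = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]]

small_error = mmd_squared(X, Y_similar)
large_error = mmd_squared(X, Y_shifted)
```

In this toy data, `Y_shifted` lies far from `X`, so its estimate is larger than the one for `Y_similar`, which matches the interpretation of small and large MMD results given above.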
[0203] In this embodiment, the data distribution error is used to compare the feature distribution of the training data stored in the LMF (second node) with the feature distribution of the wireless signal collected in real time by the UE (first node). This comparison allows us to detect whether changes in the deployment environment have affected the performance of the first model. If the feature distribution difference obtained by the MMD algorithm exceeds a preset threshold (either a first or second preset threshold), it indicates that the first model may need adjustment or optimization to adapt to the new environmental conditions.
[0204] To facilitate understanding, the process of Embodiment 1 is illustrated below with reference to Figure 9. Figure 9 is presented from the perspective of the interaction between the first node and the second node. The monitoring in Figure 9 can be periodic monitoring to optimize the first model.
[0205] Referring to Figure 9, in step S910, the first node sends the second data feature to the second node. The first node can determine the second data feature beforehand. Optionally, the first node can periodically collect wireless signal data in real time based on the monitoring period setting and determine the second data feature. In this embodiment, the first node can upload the second data feature to the second node via signaling for subsequent feature difference calculation.
[0206] In step S920, the second node determines the data distribution error based on the first data feature and the second data feature. The second node can determine the first data feature based on the training data, and then determine the data distribution error according to formula ①. When the second node is an LMF, the LMF end can receive the second data feature from the first node, combine it with the first data feature determined based on the training data, and then use the MMD algorithm to calculate the difference between the two feature distributions. For example, the LMF can calculate, in sequence, term (1) of formula ① (the mean kernel value among the real-time acquired features), term (2) (the mean kernel value among the training data features), and term (3) (the mean kernel value between the training data features and the real-time features), and finally obtain the calculation result of the MMD algorithm.
[0207] In step S930, the first node receives the data distribution error from the second node. After the MMD calculation is completed, the second node will transmit the calculated MMD result (data distribution error) back to the first node via signaling. This process also incurs additional signaling overhead.
[0208] After receiving the data distribution error, the first node compares it with a preset threshold (either a first or second preset threshold). If the data distribution error is less than the preset threshold, it indicates that the feature distribution of the first model in the current environment is largely consistent with the features of the training data, and the model performance has not deviated significantly. If the data distribution error exceeds the threshold, it indicates that the model performance may have been affected by environmental changes, and the system can trigger an update of the first model or other correction mechanisms.
[0209] It should be noted that, without new model updates, the determination of the first data feature at the LMF end and the calculation of equation (2) in formula ① only need to be performed once until the next update. In other words, under normal model performance, some calculations in formula ① only need to be performed once, until the first model update. This mechanism effectively reduces computational burden and signaling overhead.
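The reuse described above can be sketched as a small class on the second node side. This is an assumption-laden illustration: the class `SecondNodeMmd`, its method names, and the counter `training_term_evaluations` are hypothetical, and the biased Gaussian-kernel estimator is assumed. The term that depends only on the training features is computed when the model is loaded or updated and is then reused for every monitoring period.

```python
import math

def mean_kernel(A, B, sigma):
    """Average Gaussian kernel value over all pairs (a, b), a in A and b in B."""
    total = 0.0
    for a in A:
        for b in B:
            squared_distance = sum((u - v) ** 2 for u, v in zip(a, b))
            total += math.exp(-squared_distance / (2.0 * sigma ** 2))
    return total / (len(A) * len(B))

class SecondNodeMmd:
    """Keeps the training-only term cached until the first model is updated."""

    def __init__(self, training_features, sigma=1.0):
        self.sigma = sigma
        self.training_term_evaluations = 0
        self.on_model_update(training_features)

    def on_model_update(self, training_features):
        # Recompute the training-only term once per model version.
        self.X = [list(x) for x in training_features]
        self.training_term = mean_kernel(self.X, self.X, self.sigma)
        self.training_term_evaluations += 1

    def data_distribution_error(self, realtime_features):
        # Only the terms involving real-time features are recomputed each period.
        Y = [list(y) for y in realtime_features]
        realtime_term = mean_kernel(Y, Y, self.sigma)
        cross_term = mean_kernel(self.X, Y, self.sigma)
        return realtime_term + self.training_term - 2.0 * cross_term

node = SecondNodeMmd([[0.0, 0.0], [1.0, 0.0]])
error_period_1 = node.data_distribution_error([[0.0, 0.0], [1.0, 0.0]])
error_period_2 = node.data_distribution_error([[5.0, 5.0], [6.0, 5.0]])
```

After two monitoring periods the training-only term has still been evaluated once; calling `on_model_update` with new training features would evaluate it again.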
[0210] In Example 1, the additional signaling overhead mainly manifests in feature transmission and MMD result transmission. In feature transmission, the first node needs to periodically send the second data features of the data collected in the current time period to the second node. In MMD result transmission, after the second node completes its calculation, it sends the MMD result back to the first node for performance evaluation of the first model. Example 1 optimizes signaling overhead. When the first node is the UE and the second node is the LMF, the MMD algorithm calculation process is mainly performed at the LMF end; the UE only needs to send the collected data features to the LMF end via signaling. This design reduces the computational burden on the UE and lowers the overall signaling overhead. Especially when the data distribution error does not trigger the threshold, the frequency of data transmission can be reduced, thereby further conserving network resources.
[0211] Furthermore, Example 1 enables periodic monitoring and rapid response to environmental changes. The system periodically performs monitoring tasks, collecting wireless signal characteristic data from the UE in the current environment, and uses the MMD algorithm to analyze the distribution differences between these data and the characteristic data from the training phase. Once the data distribution error determined by the MMD algorithm exceeds a set threshold, it indicates that the model's performance in the new environment may have deviated, potentially affecting the accuracy of positioning. Given the complexity and constantly changing nature of wireless communication environments, the MMD algorithm, as a detection tool, can quickly capture subtle changes in the environment. This feature-difference-based monitoring method can issue timely warnings, ensuring that the first model can adjust accordingly to actual environmental changes, thereby maintaining positioning accuracy and system stability.
[0212] Example 2: Feature Distribution Difference Calculation Scheme Based on KDE
[0213] In this embodiment, the difference in feature distribution between the first and second data features is represented by the KL divergence. KDE can be used to accurately estimate the data feature distribution during the training and deployment phases. By comparing the distribution differences between the two sets of features, the performance of the first model can be evaluated.
[0214] Similar to Example 1, the feature distribution of the training data is estimated from the training data stored on the second node, while the first node collects wireless signal data in real time and estimates its feature distribution. The difference is that after obtaining the two sets of feature distributions, Example 2 uses KL divergence to measure the difference between them. When the KL divergence exceeds a set threshold, it may indicate a change in the model deployment environment, causing a significant shift in the model's input feature distribution. In this case, the model may need adjustment or optimization to adapt to the new environmental conditions.
[0215] As one embodiment, the first node can receive the probability density distribution of a first data feature from the second node. After determining the probability density distribution of the second data feature, the first node can calculate the KL divergence based on the two probability density distributions, thereby obtaining the feature distribution difference between the first and second data features. Optionally, the probability density distribution can be a list of multiple kernel density estimates.
[0216] In the above embodiment, the first node can receive multiple discrete point coordinates from the second node. Based on the multiple discrete point coordinates, the first node can determine multiple discrete points from the second data features, and then calculate the probability density distribution of the second data features corresponding to the multiple discrete points. Since the discrete point coordinates are the same, the probability density distribution of the second data features can be compared with the probability density distribution of the first data features.
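A minimal sketch of this comparison on shared discrete points is given below. It makes several assumptions that this application does not fix: the features are treated as one-dimensional scalars, a Gaussian kernel is used, the density values on the grid are normalized into discrete probabilities before the KL divergence is taken, and the divergence is computed from the training distribution to the real-time distribution. The names `gaussian_kde`, `kl_divergence`, and `eps` are hypothetical.

```python
import math

def gaussian_kde(samples, grid, bandwidth):
    """Kernel density estimate of one-dimensional samples, evaluated at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
            for g in grid]

def kl_divergence(p_density, q_density, eps=1e-12):
    """KL divergence D(P || Q) between two densities sampled on the same grid."""
    p_sum, q_sum = sum(p_density), sum(q_density)
    total = 0.0
    for p, q in zip(p_density, q_density):
        p, q = p / p_sum, q / q_sum
        if p > 0.0:
            # eps guards against division by a vanishing probability.
            total += p * math.log(p / max(q, eps))
    return total

# Shared discrete point coordinates, as would be sent by the second node (made-up values).
grid = [-1.0 + 0.1 * i for i in range(41)]
bandwidth = 0.3

training = [0.0, 0.1, -0.1, 0.2, -0.2]
realtime_similar = [0.05, 0.15, -0.05, 0.25, -0.15]
realtime_shifted = [2.0, 2.1, 1.9, 2.2, 1.8]

p_training = gaussian_kde(training, grid, bandwidth)       # computed at the second node
kl_similar = kl_divergence(p_training, gaussian_kde(realtime_similar, grid, bandwidth))
kl_shifted = kl_divergence(p_training, gaussian_kde(realtime_shifted, grid, bandwidth))
```

Because both densities are evaluated at the same grid points, the two lists can be compared element by element, which is the reason the discrete point coordinates are shared between the nodes.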
[0217] As an example, when the first node is the UE and the second node is the LMF, the core of the KDE-based computation scheme is that the LMF obtains the feature distribution of the training data through KDE and sends the necessary kernel density estimate, kernel function table index and other information to the UE. The UE then completes the calculation of KL divergence and uses KL divergence to evaluate the difference in feature distribution, thereby monitoring the performance of the first model.
[0218] In the above embodiments, the KDE process requires selecting appropriate kernel functions and bandwidth parameters. Table indexes can also be used to send these parameters to reduce transmission load. For example, the table indexes can be predefined, and the LMF sends the kernel function and bandwidth parameters to the UE through the table indexes. The UE can find the corresponding kernel function parameters by looking up the table using the index and reuse them, without needing to transmit the complete function and parameters each time.
[0219] In one implementation, the first node can receive a table index from the second node to determine the kernel function and bandwidth parameters. The kernel function and bandwidth parameters can be used to determine the probability density function for calculating the probability density distribution. Based on the same kernel function and bandwidth parameters, the first and second nodes use the same algorithmic rules to calculate the probability density distributions of the training data and real-time data, respectively, for comparison.
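The table-index mechanism described above can be sketched as follows. This is a minimal illustration only: the table contents, the name `KERNEL_TABLE`, and the helper `resolve_kde_params` are hypothetical and are not defined by this application; the only point shown is that both nodes resolve the same index to the same kernel function and bandwidth.

```python
import math

# Hypothetical predefined table known to both nodes (illustrative values only).
# Each index maps to a kernel name and a bandwidth parameter.
KERNEL_TABLE = {
    0: ("gaussian", 0.1),
    1: ("gaussian", 0.5),
    2: ("gaussian", 1.0),
}


def gaussian_kernel(u):
    """Standard one-dimensional Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


# Kernel names used in the table, resolved to callables.
KERNELS = {"gaussian": gaussian_kernel}


def resolve_kde_params(table_index):
    """Map a signaled table index to (kernel function, bandwidth)."""
    name, bandwidth = KERNEL_TABLE[table_index]
    return KERNELS[name], bandwidth


# The receiving node only needs the index, not the full function and parameters.
kernel, h = resolve_kde_params(1)
```

Signaling a small integer instead of the kernel definition keeps the per-update payload constant, at the cost of requiring both nodes to hold the same table.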
[0220] In the above embodiments, the LMF side can be responsible for storing the dataset used to train the first model, which is the basis for KDE computation. Through KDE, the LMF can compute the probability density function of the feature distribution of the training set. The calculation of the probability density function involves choosing an appropriate kernel function and bandwidth parameter.
[0221] As an example, the first data feature determined by the training data can be represented as {x1, x2, ..., xn}. The first data feature can be extracted from the first model on the second node or the device where the second node is located, and stored on the second node or the device where the second node is located, where n represents the number of feature samples. The second data feature of the wireless signal data collected in the current time period can be represented as {y1, y2, ..., ym}. The second data feature is extracted from the first model at the first node and stored at the first node, where m represents the number of feature samples.
[0222] At the second node, the training dataset {x1, x2, ..., xn} is used with the Gaussian kernel function (probability density function) to estimate the distribution of the features, i.e., the probability density distribution of the first data feature. For each sample point of the first data feature, the density distribution can be estimated using the Gaussian kernel function (probability density function):
[0223] During system operation, the first node collects real-time features {y1, y2, ..., ym}. The probability density distribution of the second data feature also needs to be estimated using KDE. The first node also uses a kernel function to estimate its density distribution.
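As an illustration of the kernel density estimation used by both nodes, the following sketch evaluates a Gaussian KDE at a single point. It assumes an isotropic Gaussian kernel with one scalar bandwidth h shared across all d feature dimensions; this is the textbook form and may differ from the exact expression used in this application. The function name `gaussian_kde` is hypothetical.

```python
import math


def gaussian_kde(samples, point, h):
    """Estimate the density at `point` from d-dimensional `samples`.

    Assumes an isotropic Gaussian kernel with scalar bandwidth h:
    p(z) = 1 / (n * h^d * (2*pi)^(d/2)) * sum_i exp(-||z - x_i||^2 / (2 h^2))
    """
    n = len(samples)
    d = len(point)
    norm = n * (h ** d) * (2.0 * math.pi) ** (d / 2.0)
    total = 0.0
    for s in samples:
        sq_dist = sum((p - q) ** 2 for p, q in zip(point, s))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / norm


# The same function can serve the training features {x1..xn} at the second node
# and the real-time features {y1..ym} at the first node.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
density_at_origin = gaussian_kde(train, (0.0, 0.0), h=0.5)
```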
[0224] As mentioned above, this embodiment can evaluate the difference in feature distributions using KL divergence; that is, the difference in feature distributions is represented by KL divergence. As an example, KL divergence can be calculated based on discrete sampling.
[0225] As an example, the coordinates of multiple discrete points can be a matrix of size k×d, where d is the dimension of the feature space.
[0226] In this embodiment, the second node can sample discrete points based on multiple discrete point coordinates. These discrete points can be selected from the feature space corresponding to the training data. This feature space can be represented by the first data features. For example, when the second node is an LMF (location management function), the LMF can select a set of discrete points {z1, z2, ..., zk} from the feature space (the first data features), where k is the number of discrete points.
[0227] After sampling at discrete points, the kernel density estimate can be calculated separately at each discrete point to obtain the set of density values for all discrete points. For example, the corresponding kernel density estimate for each discrete point can be calculated as follows: A list is formed by the set of density values at all discrete points. This list represents the density estimate after discretization, also known as the probability density distribution.
[0228] Furthermore, after the second node determines the discrete points, it can transmit the coordinates of the discrete points and the kernel density estimate (or probability density distribution) to the first node. As an example, the LMF needs to transmit the coordinates of the discrete points and the kernel density estimate to the UE. The UE can determine the probability density distribution of the first data feature based on the received kernel density estimate. The kernel density estimate can be a vector of size k, where each value corresponds to the kernel density estimate of a discrete point. The kernel density estimate can also be called the probability density value.
[0229] The first node can calculate the kernel density estimate based on the received discrete point data. In other words, after receiving the discrete point coordinates and the kernel density estimate, the first node can calculate the kernel density estimate of the discrete point at the corresponding location. For example, the UE receives multiple discrete point coordinates zi and corresponding kernel density estimates from the LMF. Furthermore, the UE collects real-time wireless signals, whose feature space can be represented by the second data feature. Discrete points corresponding to these discrete point coordinates are selected in the feature space, and the kernel density estimates of these discrete points are calculated.
[0230] Based on the above calculation results, the first node can determine the feature distribution difference by calculating the KL divergence:
[0231] In Example 2, the first node can determine the difference in feature distribution between training data and real-time data by calculating the KL divergence. If the KL divergence value exceeds a preset threshold (either a first or second preset threshold), it indicates a significant difference between the distribution of real-time data and the distribution of training data, which may mean that the model performance has deviated or is abnormal.
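A discrete-sampling KL divergence of the kind described above can be sketched as follows. Several choices here are assumptions rather than details taken from this application: the two density lists are normalized to sum to 1 before comparison, a small floor `eps` guards against division by zero, and the divergence is taken from the training distribution to the real-time distribution.

```python
import math


def kl_divergence(p_density, q_density, eps=1e-12):
    """Discrete KL(P || Q) over density values sampled at the same k points.

    p_density: kernel density estimates of the first data feature (training).
    q_density: kernel density estimates of the second data feature (real time).
    """
    if len(p_density) != len(q_density):
        raise ValueError("both lists must be evaluated at the same discrete points")
    p_sum = sum(p_density)
    q_sum = sum(q_density)
    if p_sum <= 0.0 or q_sum <= 0.0:
        raise ValueError("density lists must have positive mass")
    kl = 0.0
    for p, q in zip(p_density, q_density):
        p_n = p / p_sum
        q_n = max(q / q_sum, eps)  # floor to avoid log of a ratio with zero
        if p_n > 0.0:
            kl += p_n * math.log(p_n / q_n)
    return kl


same = kl_divergence([0.2, 0.5, 0.3], [0.2, 0.5, 0.3])
shifted = kl_divergence([0.2, 0.5, 0.3], [0.6, 0.3, 0.1])
```

Identical lists give a divergence of zero, and the value grows as the real-time density moves away from the training density, which is what the threshold comparison relies on.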
[0232] To facilitate understanding, the process of Embodiment 2 is illustrated below with reference to Figure 10. Figure 10 is presented from the perspective of the interaction between the first node and the second node. The monitoring in Figure 10 can be periodic monitoring to optimize the first model.
[0233] In step S1010, the first node receives the probability density distribution of the first data feature from the second node. For example, the second node can first obtain the probability density function of the first data feature based on the training data, and then determine multiple kernel density estimates based on the collected discrete points. The list formed by multiple kernel density estimates is the probability density distribution of the first data feature.
[0234] In step S1020, the first node determines the KL divergence based on the probability density distributions of the first and second data features. Before determining the KL divergence, the first node may first determine the probability density distribution of the second data feature. Optionally, the first node may receive a table index and / or multiple discrete point coordinates from the second node to determine the probability density distribution of the second data feature. For example, the second node sends the kernel density estimates and, at the same time, the multiple discrete point coordinates. The first node can determine multiple discrete points from the second data feature based on the multiple discrete point coordinates, and then determine multiple kernel density estimates corresponding to the second data feature based on these discrete points. This yields the probability density distribution of the second data feature. The first node can then determine the KL divergence based on the two probability density distributions.
[0235] In Example 2, the additional signaling overhead mainly consists of the discrete point coordinates and their corresponding kernel density estimate vector, which need to be sent to the first node via signaling. Furthermore, when the first model needs to be updated, the LMF will reissue the new first model to the UE, including the new kernel density estimation parameters. If the first model does not need to be updated, the kernel density estimates and / or table indexes only need to be sent once, and are not resent until the next model update, thus reducing signaling overhead.
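The size of this one-time payload can be estimated with a short calculation. The sketch below counts the k×d coordinate matrix and the size-k density vector described earlier; the 32-bit quantization per value is purely an assumption for illustration, and the function names are hypothetical.

```python
def kde_signaling_values(k, d):
    """Number of scalar values sent: k*d coordinates plus k density values."""
    return k * d + k


def kde_signaling_bits(k, d, bits_per_value=32):
    """Payload size in bits, assuming a fixed width per value (assumption)."""
    return kde_signaling_values(k, d) * bits_per_value


# Example: 64 discrete points in an 8-dimensional feature space.
values = kde_signaling_values(64, 8)
bits = kde_signaling_bits(64, 8)
```

Because the payload scales with k and d rather than with the number of feature samples, it stays fixed between model updates.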
[0236] As shown in Figures 9 and 10, the system can continuously collect input data and extract data features. Then, it periodically uses statistical methods such as the MMD algorithm or KDE algorithm to compare the differences between these feature distributions and the feature distributions during training. If the difference exceeds a preset threshold, the system will determine that the performance of the first model may be problematic and trigger an adjustment or update mechanism for the first model, thereby ensuring that the first model maintains stable and efficient performance in dynamic environments.
[0237] The above sections introduced two embodiments where the feature distribution difference is data distribution error and KL divergence. In other embodiments, the feature distribution difference between the first data feature and the second data feature may include data distribution error and KL divergence. For example, when the first node has strong computational power, the performance of the first model can be monitored based on two-dimensional data composed of data distribution error and KL divergence.
[0238] As one implementation, the first node can determine the feature distribution difference between the first and second data features using different methods at different time periods. For example, the data distribution error of the first and second data features can be determined in the first time period, and the KL divergence of the first and second data features can be determined in the second time period. The first and second time periods can be different time periods within a monitoring cycle. The first node can perform performance monitoring based on the data distribution error and KL divergence determined at different time periods.
[0239] As another implementation, the first node can determine the feature distribution difference between the first data feature and the second data feature using two different methods within the same time period. That is, the first node can calculate the data distribution error of the first data feature and the second data feature, as well as the KL divergence of the first data feature and the second data feature, within the same time period, and then perform performance monitoring on the first model based on the two parameters.
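One possible way to monitor on the basis of the two parameters is sketched below. The combination rule (flag the model when either metric reaches its own threshold) is an assumption; this application does not specify how the two values are combined, and the function name and threshold arguments are hypothetical.

```python
def monitor_two_metrics(mmd_error, kl_div, mmd_threshold, kl_threshold):
    """Evaluate both feature-distribution differences for one time period.

    Assumed rule: the first model is treated as abnormal when either the
    data distribution error or the KL divergence reaches its threshold.
    """
    mmd_alarm = mmd_error >= mmd_threshold
    kl_alarm = kl_div >= kl_threshold
    return {
        "mmd_alarm": mmd_alarm,
        "kl_alarm": kl_alarm,
        "abnormal": mmd_alarm or kl_alarm,
    }


result = monitor_two_metrics(mmd_error=0.2, kl_div=0.05,
                             mmd_threshold=0.1, kl_threshold=0.5)
```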
[0240] In other embodiments, the first node may also determine the feature distribution difference between the first data feature and the second data feature in one or more other ways for performance monitoring.
[0241] The methods for real-time performance monitoring of the first model based on MMD and KDE, respectively, have been described above with reference to Figures 7 to 10. The embodiments of this application are described in more detail below with reference to specific examples, Figures 11 and 12. It should be noted that the examples in Figures 7 to 10 are merely to help those skilled in the art understand the embodiments of this application, and are not intended to limit the embodiments of this application to the specific numerical values or scenarios illustrated. Those skilled in the art can obviously make various equivalent modifications or variations based on the examples in Figures 7 to 10, and such modifications or variations also fall within the scope of the embodiments of this application.
[0242] Figure 11 illustrates the signaling flow of the MMD-based feature difference calculation scheme. This scheme compares the feature distribution differences between the inference and training phases to evaluate the performance of the first model. Figure 11 is presented from the perspective of the interaction between three entities: gNB, UE, and LMF. The UE is the first node, and the LMF is the second node.
[0243] Referring to Figure 11, in step S1102, the LMF stores the training data. The LMF can then train the first model based on the training data.
[0244] In step S1104, the LMF distributes the first model. When the first model needs to be updated, the updated first model is distributed.
[0245] In step S1106, the UE receives the RRC signaling sent by the gNB to collect radio signal characteristics as input data for the first model.
[0246] In step S1108, LMF obtains feature X based on the training data. Feature X is the first data feature.
[0247] In step S1110, LMF calculates and obtains equation (2) in formula ①.
[0248] In step S1112, the UE obtains feature Y based on the real-time collected wireless signal data. Feature Y is the second data feature.
[0249] In step S1114, the UE sends RRC signaling back to the gNB.
[0250] In step S1116, the UE uploads feature Y to the LMF.
[0251] In step S1118, LMF calculates and obtains equations (1) and (3) in formula ①.
[0252] In step S1120, LMF performs feature difference calculation to obtain the MMD result, which is the data distribution error.
[0253] In step S1122, the LMF sends the MMD result to the UE.
[0254] In step S1124, the UE obtains the MMD result.
[0255] In step S1126, the UE compares a preset error threshold with the MMD result. Based on the comparison between the threshold and the MMD result, the UE determines whether the first model needs to be updated or corrected. The preset error threshold can be either the first preset threshold or the second preset threshold mentioned above.
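The MMD result exchanged in the flow above can be illustrated with the standard biased estimator of the squared MMD under a Gaussian kernel. This is a sketch under assumptions: the kernel choice, the bandwidth `sigma`, and the function names are hypothetical, and the correspondence between the three terms below and the numbered parts of formula ① is not asserted here. The term that depends only on the training features X can be computed before feature Y arrives.

```python
import math


def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-sq / (2.0 * sigma * sigma))


def mean_kernel(A, B, sigma):
    """Average kernel value over all pairs drawn from A and B."""
    return sum(rbf(a, b, sigma) for a in A for b in B) / (len(A) * len(B))


def mmd_squared(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between feature sets X and Y."""
    xx = mean_kernel(X, X, sigma)  # depends only on training features X
    yy = mean_kernel(Y, Y, sigma)  # depends only on real-time features Y
    xy = mean_kernel(X, Y, sigma)  # cross term, needs both X and Y
    return xx + yy - 2.0 * xy


X = [(0.0,), (1.0,)]
Y = [(5.0,), (6.0,)]
error = mmd_squared(X, Y)
```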
[0256] Figure 12 shows the signaling flow of the feature distribution difference calculation scheme based on KDE. Similar to Figure 11, Figure 12 is also presented from the perspective of the interaction between the three entities: gNB, UE, and LMF.
[0257] Referring to Figure 12, steps S1202 to S1206 and S1214 are the same as steps S1102 to S1106 and S1114 in Figure 11, and will not be described again.
[0258] In step S1208, LMF uses KDE to estimate the feature distribution of the training set data to obtain the probability density function.
[0259] In step S1210, LMF selects a set of discrete points {z1, z2, ..., zk} from the first data features corresponding to the training set data, where k is the number of discrete points, resulting in k discrete points zi.
[0260] In step S1212, LMF calculates the kernel density estimate for each discrete point zi. The kernel density estimates at these discrete points are assembled into a list. This list represents the probability density distribution of the first data feature.
[0261] In step S1216, the LMF sends the coordinates and kernel density estimates of these discrete points zi to the UE.
[0262] In step S1218, the LMF sends a table index including kernel functions and bandwidth parameters to the UE via RRC signaling.
[0263] In step S1220, the UE obtains the kernel density estimation formula by looking up a table.
[0264] In step S1222, the UE uses the coordinates of these discrete points zi to determine multiple discrete points with corresponding coordinates from the wireless channel data features (also known as the feature space), and calculates the kernel density estimate of these discrete points according to the kernel density estimation formula. The kernel density estimates at these discrete points are assembled into a list. This list represents the probability density distribution of the second data feature. The wireless channel data feature is the second data feature based on the wireless signal data collected by the UE in real time for the current time period.
[0265] In step S1224, the UE calculates the KL divergence based on the probability density distributions of the first and second data features, i.e., the KL divergence between the training set data and the wireless signals collected in real time by the UE. If the KL divergence exceeds a preset threshold, the model performance is abnormal, and the model needs to be retrained or adjusted. The preset threshold can be either the first or second preset threshold mentioned above.
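The KDE-based flow of Figure 12 can be sketched end to end for a one-dimensional feature. The sketch makes several assumptions that are not stated in this application: the discrete points are an equally spaced grid over the range of the training features (k must be at least 2), the density lists are normalized before the KL divergence is computed, and the function names `lmf_prepare` and `ue_monitor` are hypothetical.

```python
import math


def kde_1d(samples, z, h):
    """Gaussian KDE of one-dimensional samples, evaluated at point z."""
    c = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-0.5 * ((z - s) / h) ** 2) for s in samples)


def lmf_prepare(train_features, k, h):
    """LMF side: choose k discrete points and evaluate the training density."""
    lo, hi = min(train_features), max(train_features)
    step = (hi - lo) / (k - 1)  # assumes k >= 2
    z = [lo + i * step for i in range(k)]
    p = [kde_1d(train_features, zi, h) for zi in z]
    return z, p  # coordinates and density list sent to the UE


def ue_monitor(z, p, live_features, h, threshold):
    """UE side: evaluate the live density at the same points and compare."""
    q = [kde_1d(live_features, zi, h) for zi in z]
    ps, qs = sum(p), sum(q)
    kl = 0.0
    for pi, qi in zip(p, q):
        pn = pi / ps
        qn = max(qi / qs, 1e-12)  # floor to keep the logarithm finite
        if pn > 0.0:
            kl += pn * math.log(pn / qn)
    return kl, kl > threshold


train = [0.0, 0.1, 0.2, 0.3, 0.4]
z, p = lmf_prepare(train, k=5, h=0.2)
kl_same, flag_same = ue_monitor(z, p, train, h=0.2, threshold=0.1)
shifted = [2.0 + x for x in train]
kl_shift, flag_shift = ue_monitor(z, p, shifted, h=0.2, threshold=0.1)
```

When the live features match the training features, the divergence is zero and no alarm is raised; a shifted feature set produces a large divergence and crosses the threshold.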
[0266] Figures 11 and 12 illustrate the feature distribution difference calculation methods based on MMD and KDE, respectively. Table 1 shows the advantages and disadvantages of the two methods and their respective applicable scenarios.
[0267] Table 1
[0268] In summary, this embodiment evaluates the performance of the first model by comparing the distribution of the features extracted by the first model in actual deployment with the feature distribution during the model training phase. Specifically, the system can employ techniques such as the Maximum Mean Discrepancy (MMD) or Kernel Density Estimation (KDE) algorithms to calculate the distribution differences between the feature data collected from the deployment environment and the training data. Once these differences exceed a preset threshold, the system automatically determines that the performance of the first model may be abnormal and then initiates an update or adjustment mechanism for the first model.
[0269] When the first model is used to locate the first node, the model monitoring method proposed in this application significantly reduces the dependence on real location data, provides a more sensitive and effective performance monitoring means, and can quickly adapt to environmental changes to ensure that the first model maintains stability and efficiency in complex and dynamic scenarios.
[0270] The following takes Scenario 1 of the positioning methods described above as an example to illustrate the feature difference calculation scheme based on MMD. In this scenario, the feature difference calculation scheme based on MMD is used in the runtime phase after model deployment, i.e., the inference phase. By comparing the real-time wireless signal characteristics in the deployment environment with the data characteristics from the training phase, it is possible to evaluate whether the positioning performance of the first model has deviated.
[0271] First, during the system initialization phase, the first model can be trained on the second node. During training, the first data feature X can be extracted. After training is complete, the first model is deployed to the first node via signaling.
[0272] Secondly, the first node can collect features in real time based on the wireless signal data in the current environment to obtain the second data feature Y.
[0273] Thirdly, the first node can transmit data features. To evaluate the differences between the deployment environment and the training environment, the first node uploads the currently extracted second data feature Y to the second node via signaling. The upload can be performed periodically, and the frequency can be adjusted according to system settings and the complexity of the communication environment.
[0274] Next, the second node performs relevant calculations based on the MMD algorithm to evaluate the performance of the first model. The second node can calculate the distribution difference between the real-time acquired second data feature Y and the first data feature X of the training data according to formula ①.
[0275] Finally, the second node returns the result to the first node, initiating a model adaptation mechanism. After the second node completes its calculation, it feeds back the MMD result to the first node via signaling. The first node compares the data distribution error representing the MMD result with a preset threshold. If the data distribution error is less than the preset threshold, it indicates that the feature distribution of the data collected from the real-time environment is largely consistent with the data from the training phase, and the localization performance of the first model is normal. If the data distribution error exceeds the preset threshold, it indicates that environmental changes have led to a decline in model performance. In this case, the system can trigger model updates or other corrective mechanisms, such as parameter fine-tuning or retraining.
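The threshold comparison at the node that receives the MMD result can be sketched as a small decision function. The ordering of the two thresholds is an assumption made only for this illustration: the threshold that triggers a model update is taken to be larger than the one that triggers a correction such as fine-tuning. This application does not state which of the first and second preset thresholds is larger, and the function name is hypothetical.

```python
def select_action(distribution_difference, update_threshold, correction_threshold):
    """Choose a follow-up action from the feature distribution difference.

    Assumption: update_threshold > correction_threshold, so a large shift
    triggers a model update and a moderate shift triggers a correction.
    """
    if distribution_difference >= update_threshold:
        return "update"
    if distribution_difference >= correction_threshold:
        return "correct"
    return "none"


action_large = select_action(0.9, update_threshold=0.8, correction_threshold=0.3)
action_moderate = select_action(0.5, update_threshold=0.8, correction_threshold=0.3)
action_small = select_action(0.1, update_threshold=0.8, correction_threshold=0.3)
```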
[0276] During the implementation of the above method, periodic monitoring and system optimization of the first model can be achieved. When the model performance remains normal, the system only needs to periodically collect wireless signal data, extract features, and calculate the data distribution error between the first and second data features using the MMD algorithm. Frequent model updates are unnecessary, effectively reducing signaling overhead and computational burden. Only when significant environmental changes are detected will the system perform necessary model adjustments and optimizations to ensure positioning accuracy and system stability.
[0277] Through this embodiment, the system can effectively evaluate the performance of the first model deployed at the first node in scenario 1, and monitor the impact of environmental changes on model performance in real time through MMD calculation, ensuring that the location information output by the first node always has high accuracy and robustness.
[0278] The method embodiments of this application have been described in detail above with reference to Figures 1 to 12. The apparatus embodiments of this application will be described in detail below with reference to Figures 13 to 16. It should be understood that the descriptions of the method embodiments correspond to the descriptions of the apparatus embodiments; therefore, any parts not described in detail can be referred to the preceding method embodiments.
[0279] Figure 13 illustrates a first node for wireless communication provided in an embodiment of this application. The first node can be a terminal device or a network device. As shown in Figure 13, the first node 1300 includes a first processor 1310.
[0280] The first processor 1310 can be used to monitor the performance of a first model based on the difference in feature distribution between a first data feature and a second data feature; wherein the first data feature is an intermediate feature of the first model during the training phase, and the second data feature is an intermediate feature of the first model during the inference phase.
[0281] As an example, the first model includes K intermediate layers, where K is a positive integer. The first data feature and the second data feature are both intermediate features output by the first intermediate layer of the first model, and the first intermediate layer is one of the K intermediate layers.
[0282] As an example, the first intermediate layer is directly connected to the output layer of the first model.
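To make the notion of an intermediate feature concrete, the sketch below runs a toy fully connected network and returns, alongside the model output, the activation of the intermediate layer directly connected to the output layer. The network structure, the ReLU activation, the weights, and all function names are hypothetical; the actual architecture of the first model is not specified here.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def dense(v, W, b):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def forward_with_feature(x, layers):
    """Run the model and also return the last intermediate layer's output.

    layers: list of (W, b) pairs; the final pair is the output layer.
    """
    h = x
    for W, b in layers[:-1]:
        h = relu(dense(h, W, b))
    W_out, b_out = layers[-1]
    return dense(h, W_out, b_out), h  # (model output, intermediate feature)


layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # intermediate layer
    ([[1.0, 1.0]], [0.0]),                    # output layer
]
output, feature = forward_with_feature([2.0, 1.0], layers)
```

The same extraction applied to training data yields the first data feature, and applied to real-time wireless signal data it yields the second data feature.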
[0283] As an example, the first processor 1310 is further configured to trigger the update mechanism of the first model when the feature distribution difference is greater than or equal to a first preset threshold; or, trigger the correction mechanism of the first model when the feature distribution difference is greater than or equal to a second preset threshold.
[0284] As an example, the feature distribution difference includes at least one of the following: the data distribution error of the first data feature and the second data feature determined by the maximum mean discrepancy (MMD) algorithm; and the KL divergence of the first data feature and the second data feature determined by the kernel density estimation (KDE) algorithm.
[0285] As an example, the feature distribution difference is the data distribution error. The first processor 1310 is also used to determine the second data feature. The first node 1300 also includes a first transceiver, which can be used to send the second data feature to the second node. The second data feature is used by the second node to determine the data distribution error. The first transceiver is also used to receive the data distribution error from the second node.
[0286] As an example, the feature distribution difference is the KL divergence. The first node 1300 further includes a second transceiver, which can be used to receive the probability density distribution of the first data feature from the second node; the first processor 1310 is also used to determine the probability density distribution of the second data feature; and to determine the KL divergence based on the probability density distribution of the first data feature and the probability density distribution of the second data feature.
[0287] As an example, the second transceiver is further configured to receive a table index from the second node for determining a kernel function and a bandwidth parameter, the kernel function and the bandwidth parameter being used to determine the probability density distribution of the first data feature and the probability density distribution of the second data feature.
[0288] As one embodiment, the second transceiver is further configured to receive multiple discrete point coordinates from the second node; the first processor 1310 is further configured to: determine multiple discrete points from the second data features based on the multiple discrete point coordinates; and calculate the probability density distribution of the second data features corresponding to the multiple discrete points.
[0289] As an example, the first processor 1310 is further configured to determine the wireless signal data for the current time period; and input the wireless signal data into the first model to determine the second data feature corresponding to the wireless signal data for the current time period.
[0290] As an example, the first node 1300 further includes a third transceiver, which can be used to receive the first model from the second node; wherein the first model is used to locate the first node.
[0291] As an example, the first model is an AI / ML model.
[0292] As one embodiment, the first processor 1310 may be a processor 1510. The first node 1300 may also include a memory 1520 and a transceiver 1530, as shown in Figure 15.
[0293] Figure 14 illustrates a second node for wireless communication according to an embodiment of this application. The second node can be a device or entity used for positioning on the network side, such as an LMF. As shown in Figure 14, the second node 1400 includes a second processor 1410.
[0294] The second processor 1410 can be used to train the first model; wherein the feature distribution difference between the first data feature and the second data feature is used to monitor the performance of the first model; the first data feature is the intermediate feature of the first model during the training phase, and the second data feature is the intermediate feature of the first model during the inference phase.
[0295] As an example, the first model includes K intermediate layers, where K is a positive integer. The first data feature and the second data feature are both intermediate features output by the first intermediate layer of the first model, and the first intermediate layer is one of the K intermediate layers.
[0296] As an example, the first intermediate layer is directly connected to the output layer of the first model.
[0297] As an example, the second processor 1410 is further configured to receive an update request for the first model triggered by the first node when the feature distribution difference is greater than or equal to a first preset threshold; or, when the feature distribution difference is greater than or equal to a second preset threshold, receive a correction request for the first model triggered by the first node.
[0298] As an example, the feature distribution difference includes at least one of the following: the data distribution error of the first data feature and the second data feature determined by the maximum mean discrepancy (MMD) algorithm; and the KL divergence of the first data feature and the second data feature determined by the kernel density estimation (KDE) algorithm.
[0299] As an example, the feature distribution difference is the data distribution error. The second node 1400 further includes a fourth transceiver, which can be used to receive the second data feature from the first node. The second processor 1410 is also used to determine the data distribution error based on the first data feature and the second data feature. The fourth transceiver is also used to send the data distribution error to the first node.
[0300] As an example, the feature distribution difference is the KL divergence, and the second processor 1410 is also used to determine the probability density distribution of the first data feature; the second node 1400 also includes a fifth transceiver, which can be used to send the probability density distribution of the first data feature to the first node.
[0301] As an example, the fifth transceiver is further configured to send a table index to the first node for determining a kernel function and a bandwidth parameter, wherein the kernel function and the bandwidth parameter are used to determine the probability density distribution of the first data feature and the probability density distribution of the second data feature.
[0302] As an example, the fifth transceiver is further configured to send multiple discrete point coordinates to the first node, the multiple discrete point coordinates being used by the first node to determine multiple discrete points from the second data features, the multiple discrete points being used to calculate the probability density distribution of the second data features.
[0303] As an example, the second processor 1410 is also used to input training data into the first model to determine the first data features.
[0304] As an example, the second node 1400 further includes a sixth transceiver, which can be used to send the first model to the first node; wherein the first model is used to locate the first node.
[0305] As an example, the first model is an AI / ML model.
[0306] As one embodiment, the second processor 1410 may be a processor 1510. The second node 1400 may also include a memory 1520 and a transceiver 1530, as shown in Figure 15.
[0307] Figure 15 is a schematic structural diagram of a communication device according to an embodiment of this application. The dashed lines in Figure 15 indicate that the unit or module is optional. This device 1500 can be used to implement the methods described in the above method embodiments. Device 1500 can be a chip, user equipment, or network device.
[0308] Apparatus 1500 may include one or more processors 1510. The processor 1510 may support apparatus 1500 in implementing the methods described in the preceding method embodiments. The processor 1510 may be a general-purpose processor or a special-purpose processor. For example, the processor may be a central processing unit (CPU). Alternatively, the processor may be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor.
[0309] The apparatus 1500 may further include one or more memories 1520. The memories 1520 store a program that can be executed by the processor 1510, causing the processor 1510 to perform the methods described in the preceding method embodiments. The memories 1520 may be independent of the processor 1510 or integrated into the processor 1510.
[0310] The device 1500 may also include a transceiver 1530. The processor 1510 can communicate with other devices or chips via the transceiver 1530. For example, the processor 1510 can send and receive data with other devices or chips via the transceiver 1530.
[0311] Figure 16 is a schematic diagram of the hardware modules of a communication device provided in an embodiment of this application. Specifically, Figure 16 shows a block diagram of a first communication device 1650 and a second communication device 1610 communicating with each other in the access network.
[0312] The first communication device 1650 includes a controller / processor 1659, a memory 1660, a data source 1667, a transmitting processor 1668, a receiving processor 1656, a multi-antenna transmitting processor 1657, a multi-antenna receiving processor 1658, a transmitter / receiver 1654, and an antenna 1652.
[0313] The second communication device 1610 includes a controller / processor 1675, a memory 1676, a data source 1677, a receiver processor 1670, a transmitter processor 1616, a multi-antenna receiver processor 1672, a multi-antenna transmitter processor 1671, a transmitter / receiver 1618, and an antenna 1620.
[0314] In the transmission from the second communication device 1610 to the first communication device 1650, at the second communication device 1610, upper-layer data packets from the core network or from the data source 1677 are provided to the controller / processor 1675. The core network and data source 1677 represent all protocol layers above the L2 layer. The controller / processor 1675 implements the functionality of the L2 layer. In the transmission from the second communication device 1610 to the first communication device 1650, the controller / processor 1675 provides header compression, encryption, packet segmentation and reordering, multiplexing between logical and transport channels, and radio resource allocation for the first communication device 1650 based on various priority metrics. The controller / processor 1675 is also responsible for retransmitting lost packets and signaling to the first communication device 1650. The transmit processor 1616 and the multi-antenna transmit processor 1671 implement various signal processing functions for the L1 layer (i.e., the physical layer). The transmit processor 1616 performs encoding and interleaving to facilitate forward error correction at the second communication device 1610, and mapping to signal constellations based on various modulation schemes (e.g., binary phase shift keying, quadrature phase shift keying, M-ary phase shift keying, M-ary quadrature amplitude modulation). The multi-antenna transmit processor 1671 performs digital spatial precoding on the encoded and modulated symbols, including codebook-based precoding and non-codebook-based precoding, and beamforming processing to generate one or more spatial streams. The transmit processor 1616 then maps each spatial stream to subcarriers, multiplexes it with a reference signal (e.g., a pilot) in the time and / or frequency domains, and subsequently uses an inverse fast Fourier transform to generate a physical channel carrying the time-domain multicarrier symbol stream.
The multi-antenna transmit processor 1671 then performs transmit analog precoding / beamforming operations on the time-domain multicarrier symbol stream. Each transmitter 1618 converts the baseband multicarrier symbol stream provided by the multi-antenna transmit processor 1671 into a radio frequency stream, which is then provided to different antennas 1620.
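The last steps of the transmit chain described above (constellation mapping, subcarrier mapping with a pilot, and the inverse Fourier transform) can be illustrated with a minimal sketch. It is not the implementation of this application: the QPSK bit labelling, the eight-subcarrier grid, the pilot positions, and the function names `qpsk_map`, `idft`, and `ofdm_symbol` are all assumptions chosen for illustration, and channel coding, interleaving, and precoding are omitted.

```python
import cmath
import math

def qpsk_map(bits):
    """Map bit pairs to unit-energy QPSK constellation points."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    scale = 1 / math.sqrt(2)
    # Assumed labelling: bit 0 -> positive axis, bit 1 -> negative axis.
    return [complex((1 - 2 * b0) * scale, (1 - 2 * b1) * scale)
            for b0, b1 in zip(bits[0::2], bits[1::2])]

def idft(freq):
    """Naive O(N^2) inverse DFT; a real transmitter would use an IFFT."""
    n = len(freq)
    return [sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def ofdm_symbol(bits, n_sub=8, pilot_idx=(0, 4), pilot=1 + 0j):
    """Place pilots and QPSK data on subcarriers, then go to the time domain."""
    symbols = qpsk_map(bits)
    if len(symbols) != n_sub - len(pilot_idx):
        raise ValueError("bit count does not fill the data subcarriers")
    data = iter(symbols)
    # Frequency-domain grid: pilots at fixed (hypothetical) positions, data elsewhere.
    grid = [pilot if k in pilot_idx else next(data) for k in range(n_sub)]
    return grid, idft(grid)

bits = [0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1]  # 12 bits -> 6 QPSK symbols
grid, time_samples = ofdm_symbol(bits)
print(len(time_samples), "time-domain samples")
```

The naive transform keeps the sketch dependency-free; it computes the same result as an inverse fast Fourier transform of the same length, only more slowly.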
[0315] In the transmission from the second communication device 1610 to the first communication device 1650, at the first communication device 1650, each receiver 1654 receives a signal through its corresponding antenna 1652. Each receiver 1654 recovers the information modulated onto the radio frequency carrier and converts the radio frequency stream into a baseband multicarrier symbol stream, which is then provided to the receiver processor 1656. The receiver processor 1656 and the multi-antenna receiver processor 1658 implement various signal processing functions of Layer 1. The multi-antenna receiver processor 1658 performs receive analog precoding / beamforming operations on the baseband multicarrier symbol stream from the receiver 1654. The receiver processor 1656 uses a fast Fourier transform to convert the baseband multicarrier symbol stream after the receive analog precoding / beamforming operations from the time domain to the frequency domain. In the frequency domain, the physical layer data signal and the reference signal are demultiplexed by the receiver processor 1656, where the reference signal is used for channel estimation, and the data signal is recovered in the multi-antenna receiver processor 1658 after multi-antenna detection to recover any spatial stream destined for the first communication device 1650. Symbols on each spatial stream are demodulated and recovered in the receive processor 1656, generating soft decisions. The receive processor 1656 then decodes and deinterleaves the soft decisions to recover the upper-layer data and control signals transmitted by the second communication device 1610 over the physical channel. The upper-layer data and control signals are then provided to the controller / processor 1659. The controller / processor 1659 implements the functions of Layer 2. The controller / processor 1659 may be associated with a memory 1660 storing program code and data. The memory 1660 may be referred to as computer-readable media. 
In the transmission from the second communication device 1610 to the first communication device 1650, the controller / processor 1659 provides demultiplexing between transport and logical channels, packet reassembly, decryption, header decompression, and control signal processing to recover the upper-layer data packets from the second communication device 1610. The upper-layer data packets are then provided to all protocol layers above Layer 2. Various control signals may also be provided to Layer 3 for Layer 3 processing.
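The receive-side steps described above (Fourier transform to the frequency domain, separation of reference signal and data, channel estimation from the reference signal, and symbol recovery) can be sketched in the same spirit. This is a simplified illustration under stated assumptions: a single flat complex channel gain `h`, hypothetical pilot positions, and hard decisions in place of the soft decisions and decoding that the description mentions. All function names are illustrative.

```python
import cmath
import math

def idft(freq):
    """Naive inverse DFT, used here only to build a test signal."""
    n = len(freq)
    return [sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def dft(samples):
    """Naive forward DFT; a real receiver would use an FFT."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def qpsk_demap(symbols):
    """Hard-decision demapping: the sign of each axis gives one bit."""
    bits = []
    for s in symbols:
        bits += [0 if s.real >= 0 else 1, 0 if s.imag >= 0 else 1]
    return bits

def receive(samples, pilot_idx, pilot):
    """Demultiplex pilots and data, estimate the channel, equalize, demap."""
    grid = dft(samples)
    # Channel estimate: average ratio of received to known pilot symbols.
    h_est = sum(grid[k] / pilot for k in pilot_idx) / len(pilot_idx)
    data = [grid[k] / h_est for k in range(len(grid)) if k not in pilot_idx]
    return h_est, qpsk_demap(data)

# Build a test signal with the assumed mapping (bit 0 -> +, bit 1 -> -).
pilot, pilot_idx = 1 + 0j, (0, 4)
tx_bits = [0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1]
a = 1 / math.sqrt(2)
tx_data = iter([complex((1 - 2 * b0) * a, (1 - 2 * b1) * a)
                for b0, b1 in zip(tx_bits[0::2], tx_bits[1::2])])
tx_grid = [pilot if k in pilot_idx else next(tx_data) for k in range(8)]

h = 0.5 * cmath.exp(0.3j)                      # assumed flat channel gain
rx_samples = [h * x for x in idft(tx_grid)]    # noiseless for clarity
h_est, rx_bits = receive(rx_samples, pilot_idx, pilot)
print("bits recovered:", rx_bits == tx_bits)
```

With noise or a frequency-selective channel, a per-subcarrier estimate with interpolation between pilots would be needed; the single-gain estimate is used only to keep the example short.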
[0316] In the transmission from the first communication device 1650 to the second communication device 1610, at the first communication device 1650, upper-layer data packets are provided to the controller / processor 1659 using a data source 1667. The data source 1667 represents all protocol layers above Layer 2. Similar to the transmission functions at the second communication device 1610 described in the transmission from the second communication device 1610 to the first communication device 1650, the controller / processor 1659 implements header compression, encryption, packet segmentation and reordering, and multiplexing between logical and transport channels, implementing Layer 2 functions for the user plane and control plane. The controller / processor 1659 is also responsible for retransmitting lost packets and signaling to the second communication device 1610. The transmit processor 1668 performs modulation mapping and channel coding processing, while the multi-antenna transmit processor 1657 performs digital multi-antenna spatial precoding, including codebook-based and non-codebook-based precoding, and beamforming processing. Subsequently, the transmit processor 1668 modulates the generated spatial stream into a multi-carrier / single-carrier symbol stream. After analog precoding / beamforming operations in the multi-antenna transmit processor 1657, the stream is provided to different antennas 1652 via the transmitter 1654. Each transmitter 1654 first converts the baseband symbol stream provided by the multi-antenna transmit processor 1657 into a radio frequency symbol stream before providing it to the antenna 1652.
[0317] In the transmission from the first communication device 1650 to the second communication device 1610, the function at the second communication device 1610 is similar to the receiving function at the first communication device 1650 described in the transmission from the second communication device 1610 to the first communication device 1650. Each receiver 1618 receives radio frequency signals through its corresponding antenna 1620, converts the received radio frequency signals into baseband signals, and provides the baseband signals to the multi-antenna receiving processor 1672 and the receiving processor 1670. The receiving processor 1670 and the multi-antenna receiving processor 1672 jointly implement the L1 layer function. The controller / processor 1675 implements the L2 layer function. The controller / processor 1675 may be associated with a memory 1676 that stores program code and data. The memory 1676 may be referred to as computer-readable media. In the transmission from the first communication device 1650 to the second communication device 1610, the controller / processor 1675 provides demultiplexing between transport and logical channels, packet reassembly, decryption, header decompression, and control signal processing to recover the upper-layer data packets from the first communication device 1650. The upper-layer data packets from the controller / processor 1675 can be provided to the core network or all protocol layers above Layer 2, and various control signals can also be provided to the core network or Layer 3 for Layer 3 processing.
[0318] As one embodiment, the first communication device 1650 includes: at least one processor and at least one memory, the at least one memory including computer program code; the at least one memory and the computer program code are configured to be used with the at least one processor, and the first communication device 1650 at least: performs performance monitoring on a first model based on the feature distribution difference between a first data feature and a second data feature; wherein the first data feature is an intermediate feature of the first model during the training phase, and the second data feature is an intermediate feature of the first model during the inference phase.
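To make the notion of an intermediate feature concrete, the following sketch runs a tiny fully connected model and returns, alongside the model output, the activations of the last hidden layer, that is, the intermediate layer directly connected to the output layer as in the claims. The network shape, the weights, the ReLU activation, and the names `dense` and `forward_with_features` are hypothetical; the application does not prescribe a model architecture.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def dense(weights, bias, x):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def forward_with_features(layers, x):
    """Return (model output, activations of the last hidden layer).

    `layers` is a list of (weights, bias) pairs; the final pair is the
    output layer and every earlier pair is an intermediate layer.
    """
    *hidden, (out_w, out_b) = layers
    h = x
    for w, b in hidden:
        h = relu(dense(w, b, h))
    # `h` now holds the intermediate feature fed to the output layer.
    return dense(out_w, out_b, h), h

# Hypothetical model: 3 signal measurements -> 2 hidden units -> (x, y) position.
layers = [
    ([[1.0, 0.0, -1.0], [0.5, 0.5, 0.0]], [0.0, 0.1]),  # intermediate layer
    ([[1.0, 2.0], [0.0, -1.0]], [0.0, 0.0]),            # output layer
]
position, feature = forward_with_features(layers, [1.0, 2.0, 3.0])
print("position:", position, "intermediate feature:", feature)
```

Collecting `feature` over the training set would give samples of the first data feature, and collecting it over live inputs would give samples of the second data feature; no ground-truth position is needed for either.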
[0319] As one embodiment, the first communication device 1650 includes: a memory storing a program of computer-readable instructions that, when executed by at least one processor, produces actions including: monitoring the performance of a first model based on the feature distribution difference between a first data feature and a second data feature; wherein the first data feature is an intermediate feature of the first model during the training phase, and the second data feature is an intermediate feature of the first model during the inference phase.
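One of the feature distribution differences named in the claims is a data distribution error determined by a maximum mean discrepancy (MMD) algorithm. A minimal sketch of a biased squared-MMD estimate follows. The Gaussian kernel, the bandwidth `sigma`, the two-dimensional synthetic features, and the function names are assumptions made for illustration; the application does not fix these choices.

```python
import math
import random

def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, sigma) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)

def sample_features(n, shift):
    """Synthetic 2-D intermediate features centred at (shift, shift)."""
    return [(random.gauss(shift, 1.0), random.gauss(shift, 1.0)) for _ in range(n)]

random.seed(0)
train_features = sample_features(100, 0.0)    # stands in for the first data feature
infer_same = sample_features(100, 0.0)        # inference data, unchanged environment
infer_shifted = sample_features(100, 2.0)     # inference data after a distribution shift

mmd_same = mmd_squared(train_features, infer_same)
mmd_shifted = mmd_squared(train_features, infer_shifted)
print(f"MMD^2 unchanged: {mmd_same:.4f}, shifted: {mmd_shifted:.4f}")
```

The estimate stays small when inference features follow the training distribution and grows when they drift, which is the property the monitoring relies on.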
[0320] As an example, the first communication device 1650 corresponds to the first node in this application.
[0321] As one embodiment, the second communication device 1610 corresponds to the second node in this application.
[0322] As an example, the first communication device 1650 is a user equipment that can act as a relay node.
[0323] As an example, the first communication device 1650 is a network-controlled repeater (NCR).
[0324] As an example, the first communication device 1650 is a relay wireless repeater.
[0325] As an example, the first communication device 1650 is a relay.
[0326] As one embodiment, the second communication device 1610 is a Location Management Function (LMF).
[0327] As an example, the first communication device 1650 corresponds to the first node in this application, and the controller / processor 1659 is used to monitor the performance of the first model based on the feature distribution difference between the first data feature and the second data feature.
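The KL-divergence variant in the claims can be sketched as follows: probability densities of the training-phase and inference-phase features are estimated by kernel density estimation at a shared set of discrete points, the KL divergence is computed from them, and the result is compared against preset thresholds. Everything concrete here is an assumption for illustration: the contents of `KERNEL_TABLE`, the one-dimensional features, the direction KL(training || inference), the threshold values, and the independent evaluation of the two thresholds.

```python
import math
import random

# Hypothetical lookup table: table index -> (kernel name, bandwidth).
KERNEL_TABLE = {0: ("gaussian", 0.3), 1: ("gaussian", 0.5)}

def gaussian_kde(samples, points, bandwidth):
    """Gaussian kernel density estimate evaluated at the given points."""
    norm = 1 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((p - s) / bandwidth) ** 2) for s in samples)
            for p in points]

def kl_divergence(p_density, q_density, eps=1e-12):
    """KL(P || Q) after normalizing both densities over the discrete points."""
    p_total, q_total = sum(p_density), sum(q_density)
    kl = 0.0
    for p, q in zip(p_density, q_density):
        p_n, q_n = p / p_total, q / q_total
        if p_n > 0:
            kl += p_n * math.log(p_n / max(q_n, eps))  # eps guards against log(x/0)
    return kl

def triggered_actions(difference, update_threshold, correction_threshold):
    """Return the mechanisms whose preset threshold is met or exceeded."""
    actions = []
    if difference >= update_threshold:
        actions.append("update")
    if difference >= correction_threshold:
        actions.append("correction")
    return actions

random.seed(1)
_, bandwidth = KERNEL_TABLE[1]                      # selected via a table index
points = [-4 + 0.5 * i for i in range(20)]          # shared discrete points
train = [random.gauss(0.0, 1.0) for _ in range(300)]
p_train = gaussian_kde(train, points, bandwidth)    # density of the first data feature

infer_same = [random.gauss(0.0, 1.0) for _ in range(200)]
infer_shifted = [random.gauss(1.5, 1.0) for _ in range(200)]
kl_same = kl_divergence(p_train, gaussian_kde(infer_same, points, bandwidth))
kl_shifted = kl_divergence(p_train, gaussian_kde(infer_shifted, points, bandwidth))

# Threshold values below are placeholders, not values from this application.
print("unchanged:", triggered_actions(kl_same, 0.3, 0.8))
print("shifted:", triggered_actions(kl_shifted, 0.3, 0.8))
```

Sharing the kernel, bandwidth, and discrete points between the two nodes means only a short density vector needs to be exchanged, rather than the raw training features.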
[0328] This application also provides a computer-readable storage medium for storing a program. This computer-readable storage medium can be applied to a terminal or network device provided in this application embodiment, and the program causes a computer to execute the methods performed by the terminal device, network device, or core network entity in the various embodiments of this application.
[0329] This application also provides a computer program product. The computer program product includes a program. This computer program product can be applied to a terminal or network device provided in this application embodiment, and the program causes a computer to execute the methods performed by the terminal device, network device, or core network entity in the various embodiments of this application.
[0330] This application also provides a computer program. This computer program can be applied to the terminal or network device provided in this application, and causes the computer to execute the methods performed by the terminal device, network device, or core network entity in the various embodiments of this application.
[0331] It should be understood that the terms "system" and "network" in this application can be used interchangeably. Furthermore, the terminology used in this application is only for explaining specific embodiments of the application and is not intended to limit the application. The terms "first," "second," "third," and "fourth," etc., in the specification, claims, and accompanying drawings of this application are used to distinguish different objects, not to describe a specific order. In addition, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion.
[0332] In the embodiments of this application, the term "indication" can be a direct indication, an indirect indication, or an indication of an association. For example, A indicating B can mean that A directly indicates B, such that B can be obtained through A; it can also mean that A indirectly indicates B, such as A indicating C, so that B can be obtained through C; or it can mean that there is an association between A and B.
[0333] In the embodiments of this application, "B corresponding to A" means that B is associated with A, and B can be determined based on A. However, it should also be understood that determining B based on A does not mean that B is determined solely based on A; B can also be determined based on A and / or other information.
[0334] In the embodiments of this application, the term "correspondence" can indicate a direct or indirect correspondence between two things, an association between two things, or a relationship such as indicating and being indicated, or configuring and being configured.
[0335] In the embodiments of this application, "predefined" or "preconfigured" can be implemented by pre-storing, in a device (e.g., user equipment or a network device), corresponding codes, tables, or other means that can be used to indicate the relevant information. This application does not limit the specific implementation method. For example, "predefined" can refer to what is defined in a protocol.
[0336] In the embodiments of this application, the "protocol" may refer to a standard protocol in the field of communication, such as the LTE protocol, the NR protocol, and related protocols applied to future communication systems. This application does not limit this.
[0337] In the embodiments of this application, the term "and / or" is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, the character " / " in this document generally indicates that the preceding and following related objects have an "or" relationship.
[0338] In the various embodiments of this application, the order of the above-mentioned processes does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
[0339] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
[0340] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0341] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0342] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that a computer can read, or a data storage device, such as a server or data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., digital video discs (DVDs)), or semiconductor media (e.g., solid-state disks (SSDs)), etc.
[0343] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A method in a first node for wireless communication, characterized by comprising: monitoring the performance of a first model based on a feature distribution difference between a first data feature and a second data feature; wherein the first data feature is an intermediate feature of the first model during a training phase, and the second data feature is an intermediate feature of the first model during an inference phase.
2. The method of claim 1, wherein the first model comprises K intermediate layers, K being a positive integer; the first data feature and the second data feature are both intermediate features output by a first intermediate layer of the first model, the first intermediate layer being one of the K intermediate layers.
3. The method of claim 2, wherein the first intermediate layer is directly connected to an output layer of the first model.
4. The method according to any one of claims 1 to 3, characterized in that the method further comprises: when the feature distribution difference is greater than or equal to a first preset threshold, triggering an update mechanism of the first model; or, when the feature distribution difference is greater than or equal to a second preset threshold, triggering a correction mechanism of the first model.
5. The method according to any one of claims 1-4, characterized in that the feature distribution difference comprises at least one of the following: a data distribution error of the first data feature and the second data feature determined by a maximum mean discrepancy algorithm; a KL divergence of the first data feature and the second data feature determined by a kernel density estimation algorithm.
6. The method of claim 5, wherein the feature distribution difference is the data distribution error, and the method further comprises: determining the second data feature; sending the second data feature to a second node, the second data feature being used by the second node to determine the data distribution error; and receiving the data distribution error from the second node.
7. The method of claim 5, wherein the feature distribution difference is the KL divergence, and the method further comprises: receiving a probability density distribution of the first data feature from a second node; determining a probability density distribution of the second data feature; and determining the KL divergence based on the probability density distribution of the first data feature and the probability density distribution of the second data feature.
8. The method of claim 7, wherein the method further comprises: receiving, from the second node, a table index for determining a kernel function and a bandwidth parameter, the kernel function and the bandwidth parameter being used to determine the probability density distribution of the first data feature and the probability density distribution of the second data feature.
9. The method according to claim 7 or 8, characterized in that the method further comprises: receiving coordinates of a plurality of discrete points from the second node; determining a plurality of discrete points from the second data feature based on the coordinates of the plurality of discrete points; and calculating the probability density distribution of the second data feature corresponding to the plurality of discrete points.
10. The method according to any one of claims 1-9, characterized in that the method further comprises: determining wireless signal data of a current time period; and inputting the wireless signal data into the first model to determine the second data feature corresponding to the wireless signal data of the current time period.
11. The method according to any one of claims 1-10, characterized in that the method further comprises: receiving the first model from a second node; wherein the first model is used to locate the first node.
12. The method according to any one of claims 1-11, characterized in that the first model is an AI / ML model.
13. A method in a second node for wireless communication, the method comprising: training a first model; wherein a feature distribution difference between a first data feature and a second data feature is used to monitor the performance of the first model; the first data feature is an intermediate feature of the first model during a training phase, and the second data feature is an intermediate feature of the first model during an inference phase.
14. The method of claim 13, wherein the first model comprises K intermediate layers, K being a positive integer, and the first data feature and the second data feature are intermediate features output by a first intermediate layer of the first model, the first intermediate layer being one of the K intermediate layers.
15. The method of claim 14, wherein the first intermediate layer is directly connected to an output layer of the first model.
16. The method according to any one of claims 13-15, characterized in that the method further comprises: when the feature distribution difference is greater than or equal to a first preset threshold, receiving an update request of the first model triggered by the first node; or when the feature distribution difference is greater than or equal to a second preset threshold, receiving a correction request of the first model triggered by the first node.
17. The method according to any one of claims 13-16, characterized in that the feature distribution difference comprises at least one of: a data distribution error of the first data feature and the second data feature determined by a maximum mean discrepancy algorithm; a KL divergence of the first data feature and the second data feature determined by a kernel density estimation algorithm.
18. The method of claim 17, wherein the feature distribution difference is the data distribution error, and the method further comprises: receiving the second data feature from the first node; determining the data distribution error according to the first data feature and the second data feature; and sending the data distribution error to the first node.
19. The method of claim 17, wherein the feature distribution difference is the KL divergence, and the method further comprises: determining a probability density distribution of the first data feature; and sending the probability density distribution of the first data feature to the first node.
20. The method of claim 19, wherein the method further comprises: sending a table index for determining a kernel function and a bandwidth parameter to the first node, the kernel function and the bandwidth parameter being used to determine the probability density distribution of the first data feature and the probability density distribution of the second data feature.
21. The method of claim 19 or 20, wherein the method further comprises: sending a plurality of discrete point coordinates to the first node, the plurality of discrete point coordinates being used by the first node to determine a plurality of discrete points from the second data feature, the plurality of discrete points being used to calculate the probability density distribution of the second data feature.
22. The method of any one of claims 13-21, wherein the method further comprises: inputting training data into the first model to determine the first data feature.
23. The method according to any one of claims 13-22, characterized in that the method further comprises: sending the first model to the first node; wherein the first model is used to locate the first node.
24. The method of any one of claims 13-23, wherein the first model is an AI / ML model.
25. A first node for wireless communication, comprising: a first processor configured to monitor performance of a first model according to a feature distribution difference between a first data feature and a second data feature; wherein the first data feature is an intermediate feature of the first model in a training phase, and the second data feature is an intermediate feature of the first model in an inference phase.
26. The first node of claim 25, wherein the first model comprises K intermediate layers, K being a positive integer, and the first data feature and the second data feature are intermediate features output by a first intermediate layer of the first model, the first intermediate layer being one of the K intermediate layers.
27. The first node of claim 26, wherein the first intermediate layer is directly connected to an output layer of the first model.
28. The first node of any of claims 25-27, wherein the first processor is further configured to: when the feature distribution difference is greater than or equal to a first preset threshold, trigger an update mechanism of the first model; or when the feature distribution difference is greater than or equal to a second preset threshold, trigger a correction mechanism of the first model.
29. The first node of any of claims 25-28, wherein the feature distribution difference comprises at least one of the following: a data distribution error of the first data feature and the second data feature determined by a maximum mean discrepancy algorithm; a KL divergence of the first data feature and the second data feature determined by a kernel density estimation algorithm.
30. The first node of claim 29, wherein the feature distribution difference is the data distribution error, and the first processor is further configured to determine the second data feature; the first node further comprises: a first transceiver configured to send the second data feature to a second node, the second data feature being used by the second node to determine the data distribution error; the first transceiver being further configured to receive the data distribution error from the second node.
31. The first node of claim 29, wherein the feature distribution difference is the KL divergence, and the first node further comprises: a second transceiver configured to receive a probability density distribution of the first data feature from a second node; the first processor being further configured to determine a probability density distribution of the second data feature, and to determine the KL divergence according to the probability density distribution of the first data feature and the probability density distribution of the second data feature.
32. The first node of claim 31, characterized in that the second transceiver is further configured to receive a table index for determining a kernel function and a bandwidth parameter from the second node, the kernel function and the bandwidth parameter being used to determine the probability density distribution of the first data feature and the probability density distribution of the second data feature.
33. The first node of claim 31 or 32, characterized in that the second transceiver is further configured to receive a plurality of discrete point coordinates from the second node; the first processor is further configured to: determine a plurality of discrete points from the second data feature according to the plurality of discrete point coordinates; and calculate the probability density distribution of the second data feature corresponding to the plurality of discrete points.
34. The first node of any of claims 25-33, wherein the first processor is further configured to: determine wireless signal data of a current time period; and input the wireless signal data into the first model to determine the second data feature corresponding to the wireless signal data of the current time period.
35. The first node of any of claims 25-34, wherein the first node further comprises: a third transceiver configured to receive the first model from a second node; wherein the first model is used for positioning the first node.
36. The first node of any of claims 25-35, wherein the first model is an AI / ML model.
37. A second node for wireless communication, the second node comprising: Comprise: A second processor configured to train a first model; The feature distribution difference of the first data feature and the second data feature is used for performance monitoring of the first model; the first data feature is an intermediate feature of the first model in a training stage, and the second data feature is an intermediate feature of the first model in an inference stage.
38. The second node of claim 37, wherein, The first model comprises K intermediate layers, K being a positive integer, and the first data feature and the second data feature are both intermediate features output by a first intermediate layer of the first model, the first intermediate layer being one of the K intermediate layers.
39. The second node of claim 38, wherein, The first intermediate layer is directly connected to an output layer of the first model.
40. The second node of any of claims 37-39, wherein, The second processor is further configured to: When the feature distribution difference is greater than or equal to a first preset threshold, receive an update request of the first model triggered by the first node; or When the feature distribution difference is greater than or equal to a second preset threshold, receive a correction request of the first model triggered by the first node.
41. The second node of any of claims 37-40, wherein the feature distribution difference comprises at least one of: a data distribution error between the first data feature and the second data feature determined by a maximum mean discrepancy algorithm; or a KL divergence between the first data feature and the second data feature determined by a kernel density estimation algorithm.
42. The second node of claim 41, wherein the feature distribution difference is the data distribution error, and the second node further comprises a fourth transceiver configured to receive the second data feature from the first node; the second processor is further configured to determine the data distribution error according to the first data feature and the second data feature; and the fourth transceiver is further configured to send the data distribution error to the first node.
43. The second node of claim 41, wherein the feature distribution difference is the KL divergence; the second processor is further configured to determine a probability density distribution of the first data feature; and the second node further comprises a fifth transceiver configured to send the probability density distribution of the first data feature to the first node.
44. The second node of claim 43, wherein the fifth transceiver is further configured to send, to the first node, a table index for determining a kernel function and a bandwidth parameter, the kernel function and the bandwidth parameter being used to determine the probability density distribution of the first data feature and the probability density distribution of the second data feature.
45. The second node of claim 43 or 44, wherein the fifth transceiver is further configured to send a plurality of discrete point coordinates to the first node, the plurality of discrete point coordinates being used by the first node to determine a plurality of discrete points from the second data feature, the plurality of discrete points being used to calculate the probability density distribution of the second data feature.
46. The second node of any of claims 37-45, wherein the second processor is further configured to input training data into the first model to determine the first data feature.
47. The second node of any of claims 37-46, wherein the second node further comprises a sixth transceiver configured to send the first model to the first node, the first model being used for positioning the first node.
48. The second node of any of claims 37-47, wherein the first model is an AI / ML model.
49. A node for wireless communication, comprising a transceiver, a memory, and a processor, the memory being configured to store a program, and the processor being configured to invoke the program in the memory and control the transceiver to receive or send signals, so that the node performs the method of any one of claims 1-12 or 13-24.
50. An apparatus, comprising a processor configured to invoke a program from a memory, so that the apparatus performs the method of any one of claims 1-12 or 13-24.
51. A chip, comprising a processor configured to invoke a program from a memory, so that a device in which the chip is installed performs the method of any one of claims 1-12 or 13-24.
52. A computer-readable storage medium having a computer program stored thereon, the program causing a computer to perform the method of any one of claims 1-12 or 13-24.
53. A computer program product, comprising a computer program that causes a computer to perform the method of any one of claims 1-12 or 13-24.
54. A computer program, wherein the computer program causes a computer to perform the method of any one of claims 1-12 or 13-24.
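
For illustration only, the two feature distribution differences named in claim 41 might be computed along the following lines. This is a minimal sketch and not the claimed implementation: the Gaussian kernel, the bandwidth values, the biased MMD estimator, the reduction of the KL divergence to a single feature dimension, and all function names (rbf_mmd2, gaussian_kde, kl_on_points) are assumptions that the claims do not specify. The points argument plays the role of the discrete point coordinates of claim 45, and the fixed kernel and bandwidth stand in for whatever the table index of claim 44 would select.

```python
import numpy as np


def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between x (n, d) and y (m, d).

    Assumption: Gaussian (RBF) kernel with width sigma.
    """
    def gram(a, b):
        # Pairwise squared Euclidean distances, then the Gaussian kernel.
        d2 = (np.sum(a**2, axis=1)[:, None]
              + np.sum(b**2, axis=1)[None, :]
              - 2.0 * a @ b.T)
        return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

    return float(gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean())


def gaussian_kde(samples, points, bandwidth):
    """Gaussian kernel density estimate of 1-D samples at the given points."""
    z = (points[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


def kl_on_points(p, q, eps=1e-12):
    """Discrete KL(p || q) after normalising both densities over shared points."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


# Synthetic stand-ins for intermediate features (hypothetical data).
rng = np.random.default_rng(0)
train_feat = rng.normal(0.0, 1.0, size=(200, 4))   # "first data feature"
infer_feat = rng.normal(2.0, 1.0, size=(200, 4))   # "second data feature", drifted

mmd2 = rbf_mmd2(train_feat, infer_feat, sigma=2.0)

points = np.linspace(-5.0, 7.0, 61)                # shared discrete points
p_train = gaussian_kde(train_feat[:, 0], points, bandwidth=0.5)
p_infer = gaussian_kde(infer_feat[:, 0], points, bandwidth=0.5)
kl = kl_on_points(p_train, p_infer)

print(f"MMD^2 = {mmd2:.4f}, KL = {kl:.4f}")
```

Either value could then be compared against a preset threshold of the kind recited in claim 40 to decide whether a request concerning the first model is triggered. Evaluating both densities at the same shared points is what makes the discrete KL sum well defined.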