Distributed indoor positioning method, device and storage medium based on federated learning

By employing a federated learning-based distributed indoor positioning method, utilizing the ResNet model and a distributed training strategy, the issues of indoor positioning accuracy and privacy protection are addressed, achieving higher positioning accuracy and model generalization performance.

CN116861993B · Active · Publication Date: 2026-06-16 · UNIV OF ELECTRONICS SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
UNIV OF ELECTRONICS SCI & TECH OF CHINA
Filing Date
2023-06-14
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing indoor positioning technologies struggle to achieve accurate, effective, and reliable positioning in complex indoor environments, especially when GNSS signals are blocked, resulting in suboptimal positioning accuracy. Furthermore, centralized training can lead to data privacy issues and problems such as overfitting or underfitting.

Method used

We employ a distributed indoor positioning method based on federated learning. An indoor positioning scenario is constructed, and a ResNet model is trained with AI / ML deep learning on CIR samples estimated for both LoS and NLoS paths. Training is distributed across nodes, with parameter aggregation at a server. Distributed nodes collect small-scale local data for positioning, and a personalized federated learning strategy provides model personalization and privacy protection.

Benefits of technology

It improves positioning accuracy and model generalization, reduces the risk of overfitting and underfitting, protects user privacy, and reduces data transmission latency and path loss.

✦ Generated by Eureka AI based on patent content.


Abstract

The present application relates to the technical field of indoor positioning, and in particular to a distributed indoor positioning method based on federated learning, a device, and a storage medium. The method comprises: constructing an indoor positioning scene and generating a training data set from measurement samples based on the DL PRS received in a time slot; performing AI / ML deep learning on the training data set with a ResNet model; and predicting the UE position in the indoor positioning scene with the trained AI / ML model under a federated learning strategy. Distributed multi-point cooperative positioning is adopted and the data set is trained in a distributed manner, so that a personalized deep neural network model can be provided to users and the overall performance of the algorithm is improved. A strategy combining the federated averaging algorithm with meta-learning, i.e., personalized federated learning, is adopted. Its goal is to find an initial shared model that a current or new user can adapt by performing one or several steps of gradient descent, thereby providing a more personalized model for distributed users.

Description

Technical Field

[0001] This invention belongs to the technical field of indoor positioning, specifically relating to a distributed indoor positioning method, device, and storage medium based on federated learning.

Background Technology

[0002] With the proliferation of location-based services such as the Internet of Things (IoT) and machine-type communication (MTC) on mobile devices, precise positioning technology has received considerable attention in recent years. Indoor positioning technologies include algorithms based on information such as wireless signals, visible light, and sound. Among these technologies, wireless-based methods are the most popular because wireless communication technology is relatively mature and requires no additional components on mobile devices.

[0003] Indoor positioning plays a fundamental role in a wide range of Internet of Things (IoT) applications, such as indoor emergency rescue, precision marketing in shopping malls, smart factory asset management and tracking, mobile healthcare services, virtual reality gaming, and location-based social media. Despite the ever-growing market demand, providing a viable indoor positioning solution is not easy in many cases. Global Navigation Satellite Systems (GNSS), the most popular positioning technology, have achieved great success in open outdoor environments, with sub-meter accuracy achieved through various enhancement technologies. However, because of their low power, GNSS signals are not well received indoors, making continuous and reliable positioning impossible. In many situations, especially in deeper indoor areas, GNSS signals may be completely blocked.

[0004] Extensive research has been conducted on indoor positioning, leading to the development of various technologies based on WiFi, Bluetooth, ultra-wideband (UWB), pseudolites, geomagnetism, acoustic / ultrasonic methods, or pedestrian dead reckoning (PDR). While each technology has its advantages, achieving accurate, effective, reliable, and real-time positioning for indoor applications remains highly challenging because of the complex layout, topology, and signal propagation environment of indoor spaces.

[0005] Generally speaking, localization methods can be divided into two main categories: geometry-based methods and feature matching-based methods. Geometry-based methods can be further divided into triangulation, trilateration, and joint estimation methods, while feature matching-based methods are mainly referred to as fingerprint recognition methods.

[0006] Chinese patent CN116095600A discloses an indoor positioning method based on 5G spatiotemporal big data collaboration. It uses 5G spatiotemporal big data as a benchmark and relies on the wide spatial coverage of 5G networks for collaborative data positioning. However, building positioning models with machine learning requires a large amount of labeled training data, and the collection process may lead to serious data privacy issues. At the same time, the signal is constrained by the indoor environment and is easily reflected, refracted, and scattered by various indoor objects. As a result, the signal emitted from one transmitter reaches the receiver through many different propagation paths, resulting in unsatisfactory positioning accuracy.

Summary of the Invention

[0007] The purpose of this invention is to provide a distributed indoor positioning method, device, and storage medium based on federated learning, addressing the aforementioned problems.

[0008] To achieve the above objectives, the technical solution adopted by this invention is: a distributed indoor positioning method based on federated learning, comprising:

[0009] Constructing an indoor positioning scenario, and generating a training dataset from measurement samples based on the DL PRS received within a time slot;

[0010] Performing AI / ML deep learning with the ResNet model based on the training dataset;

[0011] Predicting the UE position in the indoor positioning scenario with the trained AI / ML model, based on a federated learning strategy.

[0012] Furthermore, the indoor positioning scenario includes a client and a server, wherein the client uploads the trained model to the server;

[0013] The server averages the parameters and then sends the model parameters to each client.

[0014] Furthermore, the training dataset includes CIR samples.

[0015] Furthermore, CIR samples are estimated using both LoS and NLoS paths;

[0016] In the NLoS path, the CIR sample from the transmitting antenna element s to the receiving antenna element u is:

[0017] $$H_{u,s}^{NLOS}(\tau,t)=\sum_{n=1}^{2}\sum_{i=1}^{3}\sum_{m\in R_i}H_{u,s,n,m}^{NLOS}(t)\,\delta(\tau-\tau_{n,i})+\sum_{n=3}^{N}H_{u,s,n}^{NLOS}(t)\,\delta(\tau-\tau_n)$$

[0018] Where $\tau$, $\tau_n$, and $\tau_{n,i}$ are delay parameters, and $R_i$ is the set of rays mapped to sub-cluster i∈{1,2,3};

[0019] δ(·) is the Dirac delta function;

[0020] $H_{u,s,n}^{NLOS}(t)$ is the NLoS channel coefficient of cluster n∈{3,4,…,N} from the transmitting antenna element s to the receiving antenna element u at time t;

[0021] $H_{u,s,n,m}^{NLOS}(t)$ is the NLoS channel coefficient on the m-th ray of cluster n∈{1,2}, from the transmitting antenna element s to the receiving antenna element u at time t;

[0022] In the LoS path, the LoS channel coefficient is added to the NLoS channel impulse response, and the two terms are scaled according to the desired K-factor:

[0023] $$H_{u,s}^{LOS}(\tau,t)=\sqrt{\frac{1}{K_R+1}}\,H_{u,s}^{NLOS}(\tau,t)+\sqrt{\frac{K_R}{K_R+1}}\,H_{u,s,1}^{LOS}(t)\,\delta(\tau-\tau_1)$$

[0024] Where $K_R$ is the Rician K-factor;

[0025] $H_{u,s,1}^{LOS}(t)$ is the LoS channel coefficient from the transmitting antenna element s to the receiving antenna element u at time t.

[0026] Optionally, the ResNet model extracts features from the three-dimensional CIR sample data, whose dimensions are A×256×2;

[0027] Where A is the number of base stations in a client whose CIR information is received by a UE;

[0028] "256" is the number of fast Fourier transform (FFT) sampling points of the impulse response waveform;

[0029] "2" represents the two parts of the data: the real part and the imaginary part.

[0030] Furthermore, in the ResNet model, convolutional and pooling layers are used to reduce the dimensionality of the input 3D data features, and normalization and ReLU activation functions are added after each convolutional layer.

[0031] After reshaping, AI / ML training is performed by residual blocks. The weight layers in the residual blocks are convolutional layers. The dimensionality is increased after the second and fourth residual blocks. When the dimensions change between residual blocks, the input to the residual block must first be brought to the new dimensions before it can be added directly to the output.

[0032] In the output layer operation, average pooling is used to reduce the data dimensionality. The number of channels is transformed from 256 to 2 through linear layer transformation, so that the model output has the same dimension as the UE coordinates, thereby obtaining the predicted UE coordinates.
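The output stage described in [0032] can be illustrated with a minimal pure-Python sketch. It assumes a 256-channel feature map with a 3×3 spatial size and random, untrained weights; the function names `global_average_pool` and `linear` are hypothetical and only show how average pooling followed by a 256-to-2 linear layer yields a two-dimensional coordinate prediction.

```python
import random

def global_average_pool(feature_maps):
    """Average each channel's H x W map down to one value per channel."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

def linear(features, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]

random.seed(0)
channels, height, width = 256, 3, 3  # spatial size is an assumption
feature_maps = [[[random.random() for _ in range(width)]
                 for _ in range(height)]
                for _ in range(channels)]
# Hypothetical, untrained weights: 2 outputs (x, y) from 256 pooled features.
weights = [[random.uniform(-0.1, 0.1) for _ in range(channels)]
           for _ in range(2)]
bias = [0.0, 0.0]

pooled = global_average_pool(feature_maps)    # 256 values, one per channel
predicted_xy = linear(pooled, weights, bias)  # 2 values, same size as UE coordinates
```

In a trained model the weights would come from the federated training procedure rather than from a random generator.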

[0033] Furthermore, each client receives the global model parameters sent by the server and trains its local ResNet model. Based on the RSRP dataset generated in the scenario simulation, each client obtains the data of the users closest to each of its base stations for training. Each client trains on the same amount of data, and the parameters of the ResNet model are fed back to the server after training.

[0034] The server distributes global model parameters to the client, receives ResNet model parameters from the client, and updates the global model parameters.

[0035] Furthermore, the client receives the global model parameters $w_t$ from the server, sets the client model parameters $w_k = w_t$, and performs two local iterations of stochastic gradient descent:

[0036]

[0037]

[0038] Where b represents a local batch of the training dataset;

[0039] k denotes the k-th client node;

[0040] The client then sends the weight difference before and after training, $g_k = w_k - w_t$, to the server, where t denotes the t-th iteration;

[0041] The server receives the weight differences $g_1, g_2, \ldots, g_K$ from the K client nodes and updates the global model parameters by weighted averaging.
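One communication round of the procedure in [0035] to [0041] might look like the following sketch. It is not the patented implementation: a one-parameter linear model with squared error stands in for the ResNet, the learning rate is arbitrary, and weighting the average by local dataset size is an assumption, since the text only says "weighted average". All function names are hypothetical.

```python
def sgd_step(w, batch, lr):
    """One SGD step for a linear model y = w . x with mean squared error."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(batch)
    return [wi - lr * gi for wi, gi in zip(w, grad)]

def client_update(w_t, batches, lr=0.1):
    """Set w_k = w_t, run one SGD step per batch, return g_k = w_k - w_t."""
    w_k = list(w_t)
    for b in batches:  # two batches give the two local iterations
        w_k = sgd_step(w_k, b, lr)
    return [wk - wt for wk, wt in zip(w_k, w_t)]

def server_update(w_t, deltas, sizes):
    """Add the weighted average of the client differences to w_t.

    Weighting by local dataset size is an assumption of this sketch.
    """
    total = sum(sizes)
    return [wt + sum(n * g[j] for g, n in zip(deltas, sizes)) / total
            for j, wt in enumerate(w_t)]

w_t = [0.0]
client_batches = [
    [[([1.0], 2.0)], [([2.0], 4.0)]],  # client 1: two local batches
    [[([1.0], 2.0)], [([1.0], 2.0)]],  # client 2: two local batches
]
deltas = [client_update(w_t, batches) for batches in client_batches]
w_next = server_update(w_t, deltas, sizes=[2, 2])
```

Only the differences `deltas` leave the clients, which matches the point made elsewhere in the description that each node transmits parameter information rather than raw data.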

[0042] Meanwhile, the present invention also provides a distributed indoor positioning device based on federated learning, comprising:

[0043] A processor and a memory, the memory being used to store computer programs, the processor being used to call and run the computer programs stored in the memory to execute a federated learning-based distributed indoor positioning method.

[0044] Furthermore, the present invention also provides a storage medium for distributed indoor positioning based on federated learning, the storage medium storing a computer program that causes a computer to execute a distributed indoor positioning method based on federated learning.

[0045] In summary, due to the adoption of the above technical solutions, the beneficial effects of the present invention include at least one of the following:

[0046] 1. This invention employs distributed multi-point collaborative localization and performs distributed training on the dataset, which can effectively prevent overfitting and / or underfitting, thereby improving the overall performance of the algorithm.

[0047] 2. This invention adopts a strategy of federated averaging plus meta-learning, i.e., personalized federated learning. The goal is to find an initial shared model that current or new users can easily adapt to their local datasets by performing one or more steps of gradient descent on their own data. This retains all the benefits of the federated learning architecture and, through its structure, provides a more personalized model for each user. Furthermore, the federated framework makes it possible to achieve the same localization performance as large-scale data-driven learning models while protecting user privacy.

[0048] 3. In this invention, each mobile user / intelligent agent collects a small local dataset and approximates a global machine learning model collaboratively. Compared to centralized training, this data collection method reduces excessive path loss and multipath fading caused by long distances when transmitting data to the data processing center, as well as reducing latency during transmission. Furthermore, since each distributed node only transmits parameter information, the number of bits transmitted is far less than in centralized training, which improves transmission speed and reduces the memory burden at the data processing nodes.

[0049] 4. This invention uses distributed nodes to collect small-scale user data from nearby geographical locations for training and positioning, avoiding interference that is easily caused by long-distance transmission, reducing multipath effects, and thus improving positioning accuracy.

[0050] 5. This invention aggregates the model parameters of distributed nodes to achieve multi-point collaboration, which gives the model better generalization performance. In terms of positioning accuracy, distributed training on datasets with regional features provides a personalized coordinate-prediction solution for each user, yielding smaller positioning errors and faster convergence.

Attached Figure Description

[0051] Figure 1 is a schematic diagram of an indoor positioning scenario;

[0052] Figure 2 is a diagram illustrating the channel coefficient generation process;

[0053] Figure 3 shows the ResNet model;

[0054] Figure 4 is a schematic diagram of the residual block;

[0055] Figure 5 is the parameter table for the indoor positioning scenario;

[0056] Figure 6 is the positioning error curve of the distributed algorithm;

[0057] Figure 7 is the positioning error curve of the centralized algorithm;

[0058] Figure 8 is a comparison chart of the positioning errors of the centralized and distributed algorithms;

[0059] Figure 9 is a CDF graph comparing the distributed and centralized algorithms;

[0060] Figure 10 is a graph showing that consistently low prediction error is achieved across multiple epochs;

[0061] Figure 11 is the positioning error map for each epoch;

[0062] Figure 12 is a flowchart of the distributed indoor positioning method based on federated learning.

Detailed Implementation

[0063] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. The components of the embodiments of the present invention described and shown in the accompanying drawings can generally be arranged and designed in various different configurations.

[0064] Therefore, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely to illustrate selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort are within the scope of protection of the invention.

[0065] It should be noted that, unless otherwise specified, the embodiments and features described in this invention can be combined with each other.

[0066] This invention discloses a distributed indoor positioning method based on federated learning, comprising:

[0067] Constructing an indoor positioning scenario, and generating a training dataset from measurement samples based on the DL PRS received within a time slot;

[0068] Performing AI / ML deep learning with the ResNet model based on the training dataset;

[0069] Predicting the UE position in the indoor positioning scenario with the trained AI / ML model, based on a federated learning strategy.

[0070] The purpose of this design is as follows. Distributed multi-point collaborative localization with distributed training on the dataset effectively prevents overfitting and / or underfitting, thereby improving the overall performance of the algorithm. The strategy of federated averaging plus meta-learning, i.e., personalized federated learning, aims to find an initial shared model that current or new users can easily adapt to their local datasets by performing one or more steps of gradient descent on their own data. This retains all the benefits of the federated learning architecture while giving each user a more personalized model. The federated framework also makes it possible to achieve the same localization performance as large-scale data-driven learning models while protecting user privacy. Each mobile user / intelligent agent collects a small local dataset, and the agents collaboratively approximate the global machine learning model. Compared to centralized training, this reduces the excessive path loss and multipath fading caused by transmitting data over long distances to a data processing center, and also reduces transmission latency. Because each distributed node transmits only parameter information, far fewer bits are transmitted than in centralized training, which improves transmission speed and reduces the memory burden at the data processing nodes. Collecting small-scale user data from nearby geographic locations through distributed nodes for training and localization avoids the interference caused by long-distance transmission and reduces multipath effects, thus improving localization accuracy. Aggregating the model parameters of distributed nodes achieves multi-point collaboration and improves model generalization. Distributed training on datasets with regional characteristics provides a personalized coordinate-prediction solution for each user, resulting in smaller localization errors and faster convergence. The specific procedure is shown in Figure 12.
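The personalization step, in which a user adapts the shared model with one or a few gradient steps on local data, can be sketched as below. This is an illustration only: a one-parameter linear model with squared error replaces the ResNet, and `personalize`, the learning rate, and the sample data are all hypothetical.

```python
def mse(w, data):
    """Mean squared error of the linear model y = w . x on `data`."""
    return sum((sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
               for x, y in data) / len(data)

def mse_grad(w, data):
    """Gradient of the mean squared error with respect to w."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(data)
    return grad

def personalize(w_shared, local_data, steps=1, lr=0.1):
    """Adapt the shared model with a few gradient steps on local data."""
    w = list(w_shared)
    for _ in range(steps):
        g = mse_grad(w, local_data)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

w_shared = [1.0]               # hypothetical shared initial model
local_data = [([1.0], 3.0)]    # this user's data follows y = 3x
w_user = personalize(w_shared, local_data, steps=1)
```

After a single step the local error drops from 4.0 to about 2.56, which shows how a good shared initialization lets each user move toward a personal model cheaply.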

[0071] In practical implementation, indoor positioning scenarios include a client and a server, whereby the client uploads the trained model to the server.

[0072] The server averages the parameters and then sends the model parameters to each client.

[0073] It should be noted that a client typically consists of 18 base stations.

[0074] In practice, the training dataset includes CIR samples and RSRP datasets generated from scenario simulations.

[0075] It should be noted that the CIR samples are used for deep model training, where CIR stands for channel impulse response, which is generated by simulation of a general channel model.

[0076] Specifically, in a general channel model, taking a frequency-domain channel representation as an example, the frequency domain is divided into N samples of spacing Δf, and the total bandwidth occupied by the samples is W = NΔf. This representation is commonly used for orthogonal frequency division multiplexing (OFDM) signals, but it is not limited to them.

[0077] The channel between a transmitter (Tx) with $N_{Tx}$ antennas and a receiver (Rx) with $N_{Rx}$ antennas, at frequency index n∈{0, ..., N-1} and sequence index k∈{0, ..., K-1}, can be approximated by the following model:

[0078] $$H_{n,k}=\sum_{l=1}^{L}\alpha_l\,a_{Rx}(\theta_l)\,a_{Tx}^{\top}(\phi_l)\,e^{-j2\pi n\Delta f\tau_l}\,e^{j2\pi kT_s v_l}$$

[0079] Where L is the number of physical propagation paths (e.g., given by a ray tracer);

[0080] $\alpha_l$ is the complex channel gain of path l;

[0081] $a_{Rx}(\theta)$ is the response of the Rx array as a function of the angle of arrival (AoA) $\theta\in\mathbb{R}^2$, which comprises azimuth and elevation angles;

[0082] $a_{Tx}(\phi)$ is the response of the Tx array as a function of the angle of departure (AoD) $\phi\in\mathbb{R}^2$, which comprises azimuth and elevation angles;

[0083] $\tau_l$ represents the time of arrival (ToA);

[0084] $v_l$ represents the Doppler frequency shift;

[0085] $T_s$ indicates the duration of a sequence;

[0086] AoA is defined in the Rx reference frame and AoD in the Tx reference frame; therefore, these angles depend on the respective orientations. Below 6 GHz, explicit geometric information in a single channel is difficult to exploit because of limited delay and angular resolution, coupled with a weak connection between the paths and the environment geometry. In contrast, at mmWave and above, the paths are more closely tied to the environment geometry and can be resolved more easily. Therefore, we assume that each path in Equation 1 corresponds to a single physical object.

[0087] The signal observation at Rx can be expressed as:

[0088] $$y_{n,k}=W_k^{H}H_{n,k}f_{n,k}+n_{n,k}$$

[0089] Where $H_{n,k}$ is the channel of Formula 1, $W_k$ is an orthonormal analog Rx combiner that uses $M_{Rx}\le N_{Rx}$ radio-frequency chains, $f_{n,k}$ is the k-th Tx signal across the Tx array, and $n_{n,k}$ is the noise after combining. $P_{tx}$ is the average transmit power, and $N_0$ is the noise power spectral density. The transmitted signal $f_{n,k}$ is usually known (pilot signals in localization or bistatic sensing, or known data in monostatic sensing), but may be partially unknown in semi-blind estimation.

[0090] As shown in Figure 1, in the indoor localization problem each UE has an unknown state s, which can be inferred from the observations in Formula 2. The state information, usually represented by parameters, includes the location $x\in\mathbb{R}^3$, orientation $o\in\mathbb{R}^3$, time delay $\tau\in\mathbb{R}$, power $p\in\mathbb{R}$, and so on.

[0091] The base stations (BSs, $i\in\{1,\ldots,N_B\}$) have known state information, namely position $x_i\in\mathbb{R}^3$ and orientation $o_i\in\mathbb{R}^3$, and are time-synchronized. In indoor scenarios, positioning is user-centric downlink (DL) positioning. In DL, each BS i transmits signals on orthogonal subcarriers, generating a channel that the UE observes at its Rx.

[0092] Formula 1 is widely used in communication systems, but for positioning problems the channel is generally divided into a line-of-sight path and non-line-of-sight paths, i.e.

[0093]

[0094] Determining the NLoS channel coefficients:

[0095] As shown in Figure 2, the state and geometric information at the UE can be obtained from the parameters in the channel coefficients. For the N-2 weakest clusters, i.e., n = 3, 4, ..., N, the channel coefficients can be written as:

[0096]

[0097] Where $F_{rx,u,\theta}$ and $F_{rx,u,\phi}$ are the field patterns of the receiving antenna element u in the directions of the spherical basis vectors, and $F_{tx,s,\theta}$ and $F_{tx,s,\phi}$ are the field patterns of the transmitting antenna element s in the directions of the spherical basis vectors. Note that the patterns are given in the global coordinate system (GCS), so their transformation depends on the configured antenna orientation. $\hat{r}_{rx,n,m}$ is the spherical unit vector with azimuth arrival angle $\phi_{n,m,AOA}$ and elevation arrival angle $\theta_{n,m,ZOA}$, given by Equation 5.

[0098] $$\hat{r}_{rx,n,m}=\begin{bmatrix}\sin\theta_{n,m,ZOA}\cos\phi_{n,m,AOA}\\ \sin\theta_{n,m,ZOA}\sin\phi_{n,m,AOA}\\ \cos\theta_{n,m,ZOA}\end{bmatrix}$$

[0099] Where n denotes a cluster and m denotes a ray within cluster n. $\hat{r}_{tx,n,m}$ is the spherical unit vector with azimuth departure angle $\phi_{n,m,AOD}$ and elevation departure angle $\theta_{n,m,ZOD}$, given by Equation 6.

[0100] $$\hat{r}_{tx,n,m}=\begin{bmatrix}\sin\theta_{n,m,ZOD}\cos\phi_{n,m,AOD}\\ \sin\theta_{n,m,ZOD}\sin\phi_{n,m,AOD}\\ \cos\theta_{n,m,ZOD}\end{bmatrix}$$

[0101] Where n denotes a cluster and m denotes a ray within cluster n. $\bar{d}_{rx,u}$ is the position vector of the receiving antenna element u,

[0102] $\bar{d}_{tx,s}$ is the position vector of the transmitting antenna element s, $\kappa_{n,m}$ is the cross-polarization power ratio on a linear scale, and $\lambda_0$ is the wavelength of the carrier frequency. If polarization is neglected, the 2×2 polarization matrix can be replaced by the scalar $\exp(j\Phi_{n,m})$, and only the vertically polarized field pattern is applied.

[0103] Meanwhile, the NLoS path loss in the InF-DH scenario is $PL = 18.6 + 35.7\log_{10}(d_{3D}) + 20\log_{10}(f_c)$, and the shadow fading is $\sigma_{SF} = 7.2$. By applying path loss and shadow fading to the channel parameters, a channel model for indoor scenarios can be obtained.

[0104] Determining the LoS channel coefficients:

[0105] The general expression for the LoS channel model can be written as follows:

[0106]

[0107] The general mathematical expression of the complex channel power gain can be written as

[0108]

[0109] Where λ is the carrier wavelength, and the two antenna terms are the antenna element responses at Tx and Rx, respectively; the power gain is determined by the path loss.

[0110] When applied to real-world scenarios, the mathematical expression is elaborated in more detail. In indoor positioning scenarios, i.e., InF-DH scenarios, according to 3GPP Release 17, the channel parameters in the channel model are expressed as follows:

[0111]

[0112] The path loss and shadow fading are $PL_{LOS} = 31.84 + 21.50\log_{10}(d_{3D}) + 19.00\log_{10}(f_c)$ and $\sigma_{SF} = 4.3$, respectively. After applying path loss and shadow fading to the channel coefficients, the channel model of the line-of-sight (LoS) path is obtained.
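The two InF-DH path loss expressions given in [0103] and [0112] can be evaluated with a small helper. The sketch assumes the usual 3GPP convention of d_3D in metres and f_c in GHz, which the text does not state explicitly, and treats the shadow fading values as standard deviations in dB; the function name is hypothetical.

```python
import math

# Shadow fading values from the text, assumed to be standard deviations in dB.
SIGMA_SF_LOS_DB = 4.3
SIGMA_SF_NLOS_DB = 7.2

def path_loss_inf_dh_db(d3d_m, fc_ghz, los):
    """Path loss in dB for the InF-DH scenario.

    d3d_m: 3D Tx-Rx distance (assumed metres); fc_ghz: carrier (assumed GHz).
    """
    if d3d_m <= 0 or fc_ghz <= 0:
        raise ValueError("distance and carrier frequency must be positive")
    if los:
        return 31.84 + 21.50 * math.log10(d3d_m) + 19.00 * math.log10(fc_ghz)
    return 18.6 + 35.7 * math.log10(d3d_m) + 20.0 * math.log10(fc_ghz)

pl_los = path_loss_inf_dh_db(10.0, 1.0, los=True)    # 31.84 + 21.50
pl_nlos = path_loss_inf_dh_db(10.0, 1.0, los=False)  # 18.6 + 35.7
```

At 10 m and 1 GHz the logarithm of the frequency vanishes, so the two results reduce to the constant plus the distance slope, which makes the formulas easy to check by hand.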

[0113] After establishing the general channel model, the channel impulse response must be calculated separately for the LoS and NLoS paths in the localization problem:

[0114] In an NLoS path, assume there are N clusters, each with 20 rays. For the two strongest clusters, i.e., n = 1 and 2, the rays are spread in delay into three sub-clusters with fixed delay offsets. The number of rays and the ray power differ among the three sub-clusters.

[0115] For sub-cluster i∈{1,2,3}, the set of rays mapped to it is denoted $R_i$. For the N-2 weakest clusters, i.e., n = 3, 4, ..., N, the power of the different rays in each cluster is equal. In the NLoS case, the CIR from the transmitting antenna element s to the receiving antenna element u is:

[0116] $$H_{u,s}^{NLOS}(\tau,t)=\sum_{n=1}^{2}\sum_{i=1}^{3}\sum_{m\in R_i}H_{u,s,n,m}^{NLOS}(t)\,\delta(\tau-\tau_{n,i})+\sum_{n=3}^{N}H_{u,s,n}^{NLOS}(t)\,\delta(\tau-\tau_n)$$

[0117] Where $\tau$, $\tau_n$, and $\tau_{n,i}$ are delay parameters, and δ(·) is the Dirac delta function, i.e., the unit impulse. $H_{u,s,n}^{NLOS}(t)$ is the NLoS channel coefficient of cluster n∈{3,4,…,N} from the transmitting antenna element s to the receiving antenna element u at time t. $H_{u,s,n,m}^{NLOS}(t)$ is the NLoS channel coefficient on the m-th ray of cluster n∈{1,2}, from the transmitting antenna element s to the receiving antenna element u at time t.

[0118] In the LoS path, the LoS channel coefficient is added to the NLoS channel impulse response, and the two terms are scaled according to the desired K-factor:

[0119] $$H_{u,s}^{LOS}(\tau,t)=\sqrt{\frac{1}{K_R+1}}\,H_{u,s}^{NLOS}(\tau,t)+\sqrt{\frac{K_R}{K_R+1}}\,H_{u,s,1}^{LOS}(t)\,\delta(\tau-\tau_1)$$

[0120] Where $K_R$ is the Rician K-factor, and $H_{u,s,1}^{LOS}(t)$ is the LoS channel coefficient from the transmitting antenna element s to the receiving antenna element u at time t.
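The K-factor scaling of [0118] can be sketched on a tapped-delay representation of the CIR. The sketch assumes the common Rician convention in which the NLoS taps are scaled by sqrt(1/(K_R+1)) and the LoS tap by sqrt(K_R/(K_R+1)), with K_R on a linear scale; the function name and the tap values are hypothetical.

```python
import math

def los_cir(nlos_taps, los_coeff, los_delay, k_factor):
    """Combine NLoS taps with a LoS tap, scaled by the Rician K-factor.

    nlos_taps: list of (delay_seconds, complex_coefficient) pairs.
    k_factor: Rician K-factor on a linear scale (assumption of this sketch).
    """
    nlos_scale = math.sqrt(1.0 / (k_factor + 1.0))
    los_scale = math.sqrt(k_factor / (k_factor + 1.0))
    taps = [(delay, nlos_scale * h) for delay, h in nlos_taps]
    taps.append((los_delay, los_scale * los_coeff))
    return sorted(taps, key=lambda tap: tap[0])  # order taps by delay

# One unit-power NLoS tap at 10 ns and one unit-power LoS tap at 0 ns.
taps = los_cir([(10e-9, 1 + 0j)], los_coeff=1 + 0j, los_delay=0.0, k_factor=3.0)
total_power = sum(abs(h) ** 2 for _, h in taps)
```

With unit-power inputs the two scale factors keep the total power at 1, and K_R = 3 puts three quarters of it on the LoS tap.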

[0121] After completing the above process, the measurement samples are generated based on the DL PRS received within a time slot. For each DL PRS reception, the UE performs the corresponding measurements, such as DL-RSTD, CIR, and RSRP. The label associated with each sample is the known location of the target UE. To evaluate each AI / ML-based scheme, 80,000 samples and their associated labels are generated. The evaluation setup is as follows:

[0122] Because of the requirements for indoor user deployment in Release 16, dataset generation must consider the number of drops (i.e., the number of times users are deployed) and the user deployment distribution. In this dataset, we consider the case where 80,000 users are all deployed indoors at once (1 drop), with a random distribution.

[0123] When performing indoor positioning, the users in the scene are represented as $U = \{u_i\}_{i=1,\ldots,N}$. User $u_k$ in the k-th drop has a true global coordinate $P_k$ relative to the indoor scene, whose components are the east and north coordinates in the horizontal plane of the indoor scene. $P_k$ is valid only when $u_k \in U$. Assuming that the coordinates predicted for each user by the deep model are valid, the problem becomes how to use environmental information and a convolutional neural network to approximate the global coordinates $\{P_k \mid u_k \in U\}$.

[0124] In AI / ML deep learning on CIR samples, the ResNet model is used as the AI / ML localization scheme because the structure of the CIR input is similar to that of an image input. The ResNet model, frequently used in image processing, extracts features from the 3D CIR data. For each CIR sample, complex values in the time domain represent its complete features, with an input dimension of "18×256×2". The 3D input is generated from the CIR information of 18 base stations received by a UE and 256 FFT sampling points, where 2 represents the real and imaginary parts of the data. The ResNet structure is shown in Figure 4. In the first four reshape layers, the CIR input is transformed to a size of 18×18×64, and then 12 Conv2D layers with 3×3 convolution kernels perform the convolution operations. At the dashed lines, a shortcut operation is performed, in which the data is downsampled using a stride of 2.

[0125] AI / ML deep learning using the ResNet model is mainly based on residual networks and convolutional neural networks. In mathematics, the convolution of two functions is defined as:

[0126] $$(f*g)(x)=\int f(z)\,g(x-z)\,dz$$ (Formula 14)

[0127] In other words, convolution measures the overlap between f and g when one function is "flipped" and shifted by x. When dealing with discrete objects, the integral becomes a sum. For example, for vectors from the set of square-summable, infinite-dimensional vectors with index running over $\mathbb{Z}$:

[0128] $$(f*g)(i)=\sum_{a}f(a)\,g(i-a)$$ (Formula 15)

[0129] For two-dimensional tensors, the sum runs over the indices (a, b) of f and (i-a, j-b) of g:

[0130] $$(f*g)(i,j)=\sum_{a}\sum_{b}f(a,b)\,g(i-a,j-b)$$

[0131] In a neural network, the operation performed by a convolutional layer is actually a cross-correlation operation, not a convolution operation. In a convolutional layer, the input tensor and the kernel tensor are cross-correlated to produce the output tensor. The convolutional layer performs a cross-correlation operation on the input and the kernel weights, and then adds a scalar bias to produce the output. Therefore, the two trained parameters in a convolutional layer are the kernel weights and the scalar bias. Just like initializing a fully connected layer, the kernel weights are randomly initialized when training a model based on convolutional layers.
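The distinction drawn in [0131] between cross-correlation and convolution can be made concrete with a short pure-Python sketch. The function names are hypothetical, and the example uses no padding and a stride of 1, which the text does not specify.

```python
def cross_correlate2d(x, k, bias=0.0):
    """Slide kernel k over input x (no padding, stride 1) and add a bias."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(kh) for b in range(kw)) + bias
             for j in range(out_w)]
            for i in range(out_h)]

def flip(k):
    """Reverse the kernel along both axes."""
    return [row[::-1] for row in k[::-1]]

def convolve2d(x, k, bias=0.0):
    """True convolution equals cross-correlation with the flipped kernel."""
    return cross_correlate2d(x, flip(k), bias)

x = [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8]]
k = [[0, 1],
     [2, 3]]

corr = cross_correlate2d(x, k)  # what a convolutional layer computes
conv = convolve2d(x, k)         # the mathematical convolution
```

Here `corr` is [[19, 25], [37, 43]] and `conv` is [[5, 11], [23, 29]]. Because the kernel weights are learned, a layer trained with either operation reaches an equivalent result, with the learned kernel simply flipped.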

[0132] In a residual network, as shown in Figure 4, the original input is x and the desired mapping to be learned is f(x) (which serves as the input to the activation function at the top of the figure). The portion within the dashed box in the left half of Figure 4 must directly fit the mapping f(x), while the portion within the dashed box in the right half must fit the residual mapping f(x)-x. In practice, the residual mapping is often easier to optimize. The right half shows the basic building block of ResNet, the residual block. In the residual block, the input can propagate forward more quickly through the cross-layer data path.
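The idea of fitting the residual mapping f(x)-x and adding the input back through a shortcut can be sketched as follows. This is a simplification: small dense layers stand in for the convolutional weight layers of the actual residual block, normalization is omitted, and all names are hypothetical.

```python
def relu(v):
    """Element-wise ReLU."""
    return [max(x, 0.0) for x in v]

def dense(v, weights):
    """Matrix-vector product; stands in for a convolutional weight layer."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """Compute relu(g(x) + x), where g is the learned residual f(x) - x."""
    g = dense(relu(dense(x, w1)), w2)                # two weight layers
    return relu([gi + xi for gi, xi in zip(g, x)])   # shortcut adds x back

zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

passthrough = residual_block([1.0, 2.0], zeros, zeros)      # residual is 0
combined = residual_block([1.0, -1.0], identity, identity)  # residual added
```

With all-zero weights the block returns its non-negative input unchanged, which illustrates why learning the residual is easier than learning the full mapping when the desired mapping is close to the identity.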

[0133] For feature extraction, the three-dimensional CIR data has dimensions 2×K×256, where K is the number of base stations, 256 is the number of fast Fourier transform (FFT) sampling points of the impulse response waveform, and 2 corresponds to the real and imaginary parts of the data. That is, a single CIR sample contains the real and imaginary parts of the channel impulse responses received by a single UE in the indoor environment from the K base stations (18 in this scenario). ResNet was originally designed to process multi-channel image data; in this design, the input can be regarded as a 2-channel image of size K×256.

[0134] In the first four layers of the network, convolutional and pooling layers reduce the input to 64 channels of size K×K, so that it can be processed by 3×3 convolution kernels when it enters the residual blocks. A normalization operation is added after each convolutional layer: the BatchNorm2d operation normalizes the samples of each input batch. A ReLU activation function, σ(x) = ReLU(x) = max(x, 0), follows each normalization operation to introduce nonlinearity between the neural network layers.
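The normalization and activation steps can be sketched for a single channel as follows. The sketch assumes the values of one channel have been flattened over the batch and spatial positions, and that the scale `gamma`, shift `beta`, and stabilizer `eps` take the usual default values; these names are illustrative and are not specified in the patent.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


def relu(values):
    """ReLU(x) = max(x, 0), applied element-wise."""
    return [max(v, 0.0) for v in values]


activations = relu(batch_norm([1.0, 2.0, 3.0, 4.0]))
```

ReLU is nonlinear: ReLU(-1 + 2) = 1, whereas ReLU(-1) + ReLU(2) = 2. Without such an activation, stacked convolutional layers would collapse into a single linear map.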

[0135] After this reshaping, the network is trained using 6 residual blocks. The weight layers in the residual blocks are implemented as convolutional layers; the number of channels and the kernel size of each convolution are annotated in detail in Figure 4. The dimension is increased after the second and fourth residual blocks, where the number of channels is doubled and the convolution stride is set to 2. Because of this dimension change, the input carried across layers in the residual block must be downsampled and raised to the new dimensionality before it can be added directly to the output. This operation is marked with dashed lines in the design diagram.

[0136] In the output layer operations, average pooling is used to reduce the data dimensionality, transforming the output data size to 1×1.

[0137] Finally, a linear layer transformation is used to reduce the number of channels from 256 to 2, making the model output the same dimension as the UE coordinates, thus obtaining the predicted UE coordinates.
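The tensor shapes through the residual stage and output layers can be traced with a small sketch. Several details here are assumptions not stated in the text: the stem output is taken as 64×K×K with K = 18, each convolution uses a 3×3 kernel with padding 1, and a stride of 2 halves the spatial size when the channels are doubled. The function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(k=18, channels=64, num_blocks=6, widen_after=(2, 4)):
    """Return the (channels, height, width) shape after each residual block,
    after global average pooling, and after the final linear layer."""
    side = k
    shapes = []
    for block in range(1, num_blocks + 1):
        stride = 1
        if (block - 1) in widen_after:  # first block after the 2nd and 4th blocks
            channels *= 2               # channel count doubles
            stride = 2                  # assumed to halve the spatial size
        side = conv_out(side, stride=stride)
        shapes.append((channels, side, side))
    shapes.append((channels, 1, 1))     # average pooling to 1x1
    shapes.append((2,))                 # linear layer: 256 channels -> (x, y)
    return shapes


shapes = trace_shapes()
```

Under these assumptions the channel count goes 64, 128, 256, which agrees with the final linear layer mapping 256 channels to the 2 UE coordinates.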

[0138] To implement the federated learning strategy, the horizontal federated learning model has the central node distribute the training task to the local terminals. Each local terminal is primarily responsible for updating its local model parameters, calculating the loss value, and calculating the gradients of the model weights. The central node collects the gradient information uploaded by the local terminals and uses an aggregation algorithm to fuse all gradients and update the global model. The central node then passes the new model weights back to the local terminals. These steps are repeated until the model converges.

[0139] In the initial phase of training, the weights of the global model are assigned random values. The global model then sends these weights to each local terminal, which updates its local model accordingly.

[0140] During local training, each local terminal feeds its local private data into the local model to obtain predictions, and calculates the error value using the mean squared error (MSE) loss function, as shown in the following equation:

[0141]

[0142] θ = {W, b}
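The loss formula itself is not reproduced in this text. As a sketch, the code below assumes one common form of the MSE for two-dimensional coordinates: the squared Euclidean distance between predicted and true positions, averaged over the N local samples. The name `mse_loss` and the exact normalization are assumptions.

```python
def mse_loss(predicted, true):
    """Mean over samples of the squared distance between predicted and true (x, y)."""
    n = len(predicted)
    return sum(
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(predicted, true)
    ) / n


loss = mse_loss([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (4.0, 5.0)])
```

Here the first sample is predicted exactly and the second is off by 3 m and 4 m along the two axes, so the loss is (0 + 25) / 2.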

[0143] Then, the weight gradient of the last convolutional layer is calculated as follows, where u and v represent the input and output of the convolutional layer:

[0144]

[0145]

[0146] Here ⊙ denotes the element-wise product. According to the chain rule, for the output v of a layer l other than the last, the gradient of J can be expressed as:

[0147]

[0148] This can be generalized to all convolutional layers, where the gradient of each parameter can be expressed as:

[0149]

[0150]

[0151] Finally, the gradients of all parameters can be collected into a single set [expression missing]. In horizontal federated learning, the parameter input of each local model is the average of the local parameter gradients, as issued by the global model.

[0152] During the global process, the global model collects gradients from each local model in order to update the model weights. In each iteration of the global model, the local models provide their updated gradients, i.e., the gradient set, in which the gradient of each parameter is a list of length [length missing].

[0153] The global model then uses an aggregation algorithm to integrate all gradient information to update itself. After S rounds of training on the global model, the parameters are sent to the local models. The local model parameters are updated, and the next iteration begins.
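The aggregation step at the central node can be sketched as follows, assuming that the aggregation algorithm is a plain average of the client gradients and that the global model is updated by one gradient descent step. The learning rate `lr` and the function names are hypothetical; the patent does not specify them at this point.

```python
def average_gradients(client_grads):
    """Element-wise mean of the gradient lists uploaded by the local terminals."""
    n = len(client_grads)
    length = len(client_grads[0])
    return [sum(g[i] for g in client_grads) / n for i in range(length)]


def apply_update(weights, grad, lr):
    """One gradient descent step on the global model weights."""
    return [w - lr * g for w, g in zip(weights, grad)]


avg = average_gradients([[1.0, 2.0], [3.0, 4.0]])
new_weights = apply_update([1.0, 1.0], avg, lr=0.5)
```

Only gradients travel between the terminals and the central node; the local private data never leaves the terminal that collected it.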

[0154] In federated learning, the objective function can typically be written as:

[0155]

[0156]
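For reference, the federated learning objective is commonly written in the form below. This is a sketch under assumptions: the patent's own formulas are not reproduced in this text, and the symbols n (number of users), f_i (local loss of user i), p_i (local data distribution of user i), and ℓ_i (per-sample loss) are introduced here for illustration only.

```latex
\min_{w \in \mathbb{R}^d} f(w) := \frac{1}{n} \sum_{i=1}^{n} f_i(w),
\qquad
f_i(w) := \mathbb{E}_{(x, y) \sim p_i}\left[ \ell_i(w; x, y) \right]
```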

[0157] The goal of federated meta-learning is to design a personalized variant of the federated learning problem using the fundamental idea behind the Model-Agnostic Meta-Learning (MAML) framework. The basic concept can be explained as follows: in MAML, assuming that a new task arrives and that only a limited computational budget is available to update the model, the goal is to find an initialization that allows the model to perform well on the new task after being updated through one or more gradient descent steps. Therefore, the objective function of meta-learning can be written as:

[0158]

[0159] Here α ≥ 0 is the step size. The advantage of this formulation is that it not only retains the strengths of federated learning (FL) but also accounts for the differences between users. Users can take the solution of this objective as an initial point and update it slightly with respect to their own loss using their own data. In other words, each user obtains an initialization and then personalizes it by performing one or more gradient descent steps on its own data.
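The meta-learning objective can be illustrated on a toy problem. The sketch assumes the common MAML-style form F(w) = (1/n) Σ f_i(w - α∇f_i(w)) and a hypothetical scalar loss f_i(w) = ½(w - c_i)² for each user, where c_i stands for that user's own optimum. None of these names or the toy loss come from the patent.

```python
def make_user(c):
    """Hypothetical user with loss f(w) = 0.5 * (w - c)^2 and its gradient."""
    def loss(w):
        return 0.5 * (w - c) ** 2

    def grad(w):
        return w - c

    return loss, grad


def meta_objective(w, users, alpha):
    """Average loss of each user after one gradient step of size alpha from w."""
    total = 0.0
    for loss, grad in users:
        adapted = w - alpha * grad(w)  # one personalization step
        total += loss(adapted)
    return total / len(users)


def personalize(w, grad, alpha, steps=1):
    """Adapt the shared initialization w to one user with a few gradient steps."""
    for _ in range(steps):
        w = w - alpha * grad(w)
    return w


users = [make_user(0.0), make_user(2.0)]
shared = 1.0
after_one_step = meta_objective(shared, users, alpha=0.5)
without_adaptation = meta_objective(shared, users, alpha=0.0)
```

In this toy case a single step with α = 0.5 lowers the average loss from 0.5 to 0.125, even though the two users disagree on the best parameter value.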

[0160] The client receives the global model parameters w_t from the server, sets w_k = w_t, and performs stochastic gradient descent for E local iterations:

[0161]

[0162]

[0163] Where b represents a batch of local data partitioned from the training dataset;

[0164] k is the kth client node;

[0165] The weight difference before and after training, g_k = w_k - w_t, is then sent to the server, where t denotes the t-th iteration;

[0166] The server receives the weight differences g_1, g_2, ..., g_K from the K client nodes and updates the global model parameters by weighted averaging.
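The client and server steps above can be sketched as follows. The update formulas are not reproduced in this text, so the sketch makes assumptions: the local step is plain SGD with learning rate `lr`, and the weighted average uses each client's number of local samples as its weight. The names `local_sgd`, `aggregate`, and `grad_fn` are hypothetical.

```python
def local_sgd(w_t, batches, grad_fn, lr, local_iters):
    """Start from w_k = w_t, run SGD over the local batches for local_iters
    iterations, and return the weight difference g_k = w_k - w_t."""
    w_k = list(w_t)
    for _ in range(local_iters):
        for b in batches:
            g = grad_fn(w_k, b)
            w_k = [w - lr * gi for w, gi in zip(w_k, g)]
    return [wk - wt for wk, wt in zip(w_k, w_t)]


def aggregate(w_t, deltas, sizes):
    """Update the global weights with the sample-weighted mean of g_1, ..., g_K."""
    total = sum(sizes)
    return [
        wt + sum(n * d[i] for n, d in zip(sizes, deltas)) / total
        for i, wt in enumerate(w_t)
    ]


# Toy example: each batch is a scalar target b with loss 0.5 * (w - b)^2.
grad_fn = lambda w, b: [w[0] - b]
g_1 = local_sgd([0.0], [2.0], grad_fn, lr=0.5, local_iters=1)
w_next = aggregate([0.0, 0.0], [[1.0, 2.0], [3.0, 4.0]], sizes=[1, 3])
```

Sending the difference g_k instead of the full weights w_k carries the same information, since the server already holds w_t.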

[0167] In simulation testing, the positioning evaluation is performed using the following methods:

[0168] In the indoor positioning scenario, the predicted coordinates P_pre are inaccurate and contain errors. A deep learning model F is trained to predict these coordinate errors in the region. The coordinate error is the difference between the true coordinates P_k and the predicted coordinates P_pre; that is, the optimization objective is to minimize the MSE between the two.

[0169] As the accuracy evaluation index, the 50th, 67th, 80th, and 90th percentiles of the cumulative distribution function (CDF) of the error are used to measure the accuracy of the algorithm.
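The percentile evaluation can be sketched as follows. The sketch assumes that the per-sample error is the Euclidean distance between predicted and true coordinates and that percentiles are read off the empirical CDF with the nearest-rank rule; the patent does not state which percentile convention it uses, and the function names are hypothetical.

```python
import math


def positioning_errors(predicted, true):
    """Euclidean distance between each predicted and true (x, y) position."""
    return [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(predicted, true)
    ]


def cdf_percentile(errors, q):
    """Smallest error e such that at least q percent of samples have error <= e."""
    ordered = sorted(errors)
    rank = max(1, math.ceil(q * len(ordered) / 100))  # nearest-rank rule
    return ordered[rank - 1]


errors = [float(i) for i in range(1, 11)]  # ten synthetic errors: 1.0 ... 10.0
summary = {q: cdf_percentile(errors, q) for q in (50, 67, 80, 90)}
```

A value reported at the 90th percentile therefore means that 90% of the test positions were located with an error no larger than that value.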

[0170] The parameter settings of the indoor positioning scenario are shown in Figure 5. For the model using the single-frequency dataset, the positioning errors (in meters) of the different positioning schemes at the 50th, 67th, 80th, and 90th percentiles are listed in Table 1:

[0171]
| Method | 50% | 67% | 80% | 90% |
| --- | --- | --- | --- | --- |
| Traditional methods | 11.89 | 13.62 | 14.78 | 16.36 |
| AI+CIR | 0.27 | 0.36 | 0.43 | 0.54 |

[0172] (Table 1)

[0173] AI / ML-based methods significantly improve positioning accuracy. When traditional positioning algorithms are used for UE coordinate prediction, the positioning error exceeds 10 m at the 90th percentile of the CDF, whereas AI / ML-based methods reduce the positioning error to within 1 m.

[0174] In the multi-point collaborative localization federated learning simulation, with a single-frequency dataset as input, the parameter settings of the personalized federated learning algorithm are shown in Table 2:

[0175]
| Parameter | Value |
| --- | --- |
| Total number of clients | 18 |
| Global iteration count (epochs) | 20 |
| Local iteration count (epochs) | 30 |
| Number of clients selected for training in each round | 10 |
| Number of samples per round of local training (batch size) | 32 |
| Local training learning rate | 0.0001 |
| SGD optimizer parameter (momentum) | 0.0001 |

[0176] (Table 2)
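The settings in Table 2 can be captured in a small configuration object, together with the per-round client selection. The field names and the `sample_clients` helper are hypothetical, and uniform random selection without replacement is an assumption; the patent states only how many clients are selected per round.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FedConfig:
    """Hyperparameters from Table 2 (field names are illustrative)."""
    total_clients: int = 18
    global_epochs: int = 20
    local_epochs: int = 30
    clients_per_round: int = 10
    batch_size: int = 32
    learning_rate: float = 1e-4
    momentum: float = 1e-4


def sample_clients(cfg, rng):
    """Pick the clients that take part in one global round (assumed uniform)."""
    return sorted(rng.sample(range(cfg.total_clients), cfg.clients_per_round))


cfg = FedConfig()
round_clients = sample_clients(cfg, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the selection reproducible across simulation runs.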

[0177] Figure 6 shows the performance of the distributed algorithm with 10 clients selected in each round of global training. Over 50 epochs, the best positioning error at the 90th percentile is 0.205 m, reached at the 44th epoch. After the 25th epoch the positioning error remains stably below 0.3 m, so the algorithm is considered to have converged after the 25th epoch.

[0178] Figure 7 shows the performance of the centralized algorithm. According to the positioning error curve, over 600 iterations the centralized algorithm converges after about 140 epochs, with an optimal positioning error of 0.54 m. In comparison, the distributed algorithm converges significantly faster than the centralized algorithm.

[0179] Figure 8 compares the performance of the distributed and centralized algorithms. Over 50 epochs, the distributed algorithm outperforms the centralized algorithm in convergence, positioning accuracy, and stability of positioning performance.

[0180] The optimal positioning error distributions (in meters) of the distributed and centralized algorithms are compared in Figure 9 and Table 3:

[0181]
| Method | 50% | 67% | 80% | 90% |
| --- | --- | --- | --- | --- |
| Traditional methods | 11.89 | 13.62 | 14.78 | 16.36 |
| CIR input + centralized algorithm | 0.27 | 0.36 | 0.43 | 0.54 |
| CIR input + distributed algorithm | 0.13 | 0.16 | 0.18 | 0.21 |

[0182] (Table 3)

[0183] As can be seen from the CDF curve, for most users, the positioning error of the distributed algorithm is smaller than that of the centralized algorithm. It can be considered that the positioning effect has been significantly improved in terms of accuracy through the distributed algorithm. It should be noted that the traditional method mentioned above refers to the traditional positioning method used in the background technology.

[0184] As shown in Figure 10, different numbers of clients were selected in each round of global training to observe the effect on the positioning error. The number of participating clients was set to 10, 12, 14, 16, and 18 in turn. Positioning accuracy is judged by whether the prediction error remains consistently low and stable over multiple epochs. With 16 clients, the positioning error of the predicted coordinates is consistently the lowest; with 10 clients, it is consistently the highest. With 12, 14, and 18 clients, the positioning errors differ little from one another and fall in the middle range. The positioning performance is therefore not linearly related to the number of clients but peaks at a certain value; in this experiment, the positioning accuracy is best with 16 clients.

[0185] As shown in Figure 11, because the positioning error differs from epoch to epoch, convergence is defined as the positioning error remaining stable and below 0.3 m. Different numbers of clients were selected for training, and the number of iterations needed to reach the convergence condition was recorded. The more clients there are, the fewer iterations are needed to reach the convergence condition; in the simulation, therefore, more participating clients means faster convergence.
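The convergence criterion can be made concrete with a short sketch. It assumes that "stable and below 0.3 m" means that the error stays below the threshold in every remaining epoch of the recorded curve; the function name `convergence_epoch` and the sample curve are hypothetical.

```python
def convergence_epoch(errors, threshold=0.3):
    """Return the first 1-based epoch from which every error stays below the
    threshold, or None if the curve never settles below it."""
    last_bad = 0
    for epoch, err in enumerate(errors, start=1):
        if err >= threshold:
            last_bad = epoch
    return last_bad + 1 if last_bad < len(errors) else None


curve = [0.5, 0.4, 0.25, 0.35, 0.2, 0.1]  # synthetic per-epoch errors in meters
epoch = convergence_epoch(curve)
```

In the synthetic curve the error dips below 0.3 m at epoch 3 but rises again at epoch 4, so convergence is only declared from epoch 5 onward.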

[0186] This embodiment also provides a distributed indoor positioning device based on federated learning, including:

[0187] A processor and a memory, the memory being used to store computer programs, the processor being used to call and run the computer programs stored in the memory to execute a federated learning-based distributed indoor positioning method.

[0188] The processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor can be a microprocessor or any conventional processor.

[0189] The memory can be an internal storage unit or an external storage device, such as a plug-in hard drive, a smart media card (SMC), a secure digital card (SD), or a flash card. Furthermore, the memory may include both internal storage units and external storage devices. The memory is used to store the computer program and other programs and data, and can also be used to temporarily store data that has been output or will be output.

[0190] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional units and modules is merely an example. In practical applications, the above functions can be assigned to different functional units and modules as needed, that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit. Furthermore, the specific names of the functional units and modules are only for easy differentiation and are not intended to limit the scope of protection of this application. The specific working process of the units and modules in the above system can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0191] In the above embodiments, the descriptions of each embodiment have different focuses. For parts that are not described in detail or recorded in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0192] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0193] In the embodiments provided by this invention, it should be understood that the disclosed apparatus / terminal devices and methods can be implemented in other ways. For example, the apparatus / terminal device embodiments described above are merely illustrative. For instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or other forms.

[0194] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0195] Furthermore, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0196] If the integrated module / unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present invention can also be implemented by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms. The computer-readable medium can include: any entity or device capable of carrying the computer program code, recording media, USB flash drives, portable hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, and software distribution media, etc.

[0197] In this embodiment, a computer storage medium is also provided, which stores a computer program. The computer storage medium can be one of magnetic random access memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, flash memory, magnetic surface memory, and optical disk. It can also be any of various devices that include one or any combination of the above-mentioned memories, such as mobile phones, computers, and tablet devices. The computer program can drive a system to resolve conflicts between log data of different formats, and, when run by a processor, the computer program executes the federated learning-based distributed indoor positioning method.

[0198] Finally, it should be noted that the above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A federated learning-based distributed indoor positioning method, characterized in that it comprises: constructing an indoor positioning scenario, and generating measurement samples from the DL PRS received within a time slot to form a training dataset, the training dataset including CIR samples; performing AI / ML deep learning with a ResNet model based on the training dataset, wherein the ResNet model extracts the three-dimensional data features of the CIR samples, which are represented as A×256×2, where A is the number of base stations in a client from which a UE receives CIR information, "256" is the number of fast Fourier transform sampling points of the impulse response waveform, and "2" denotes the two kinds of data content, the real part and the imaginary part; in the ResNet model, convolutional and pooling layers reduce the dimensionality of the input three-dimensional data features, and normalization and a ReLU activation function are added after each convolutional layer; after this reshaping, AI / ML training is performed by residual blocks, the weight layers in which are implemented as convolutional layers; the dimension is increased after the second and fourth residual blocks, and when the dimensions change between residual blocks, the input of the residual block must be raised to the new dimension before it can be added directly to the output; in the output layer, average pooling reduces the data dimensionality, and a linear layer transforms the number of channels from 256 to 2, so that the model output has the same dimension as the UE coordinates, thereby obtaining the predicted UE coordinates; and predicting, based on a federated learning strategy, the UE position in the indoor positioning scenario using the AI / ML model after deep learning.

2. The federated learning based distributed indoor positioning method according to claim 1, characterized in that: The indoor positioning scenario includes a client and a server, wherein the client uploads the trained model to the server. The server averages the parameters and then sends the model parameters to each client.

3. The federated learning-based distributed indoor positioning method according to claim 1, characterized in that: the CIR samples are estimated for both LoS and NLoS paths; in the NLoS path, the CIR sample from transmitting antenna element s to receiving antenna element u is: [formula missing]; wherein [symbol missing] is a delay parameter; δ(·) is the Dirac delta function; [symbol missing] is the NLoS channel coefficient of a cluster from transmitting antenna element s to receiving antenna element u at time t; [symbol missing] is the NLoS channel coefficient from transmitting antenna element s to receiving antenna element u at time t on the m-th ray of the cluster; in the LoS path, the LoS channel coefficient is added to the NLoS channel impulse response, and the two terms are scaled according to the desired K-factor: [formula missing]; wherein [symbol missing] is the Ricean K-factor, and [symbol missing] is the LoS channel coefficient from transmitting antenna element s to receiving antenna element u at time t.

4. The distributed indoor positioning method based on federated learning according to claim 1, characterized in that: The client receives global model parameters sent by the server and trains each client using the ResNet model. Based on the RSRP dataset generated in the scenario simulation, it obtains the user data closest to each base station for training. Each client obtains the same number of datasets for training and feeds back the parameters of the trained ResNet model to the server. The server distributes global model parameters to the client, receives ResNet model parameters from the client, and updates the global model parameters.

5. The distributed indoor positioning method based on federated learning according to claim 4, characterized in that: the client receives the global model parameters w_t from the server, sets the client model parameters w_k = w_t, and performs stochastic gradient descent for two local iterations: [formula missing]; wherein b is the local data partitioned from the training dataset, and k denotes the k-th client node; the weight difference before and after training, g_k = w_k - w_t, is then sent to the server, where t denotes the t-th iteration; the server receives the weight differences g_1, g_2, ..., g_K from the K client nodes and updates the global model parameters by weighted averaging.

6. A distributed indoor positioning device based on federated learning, characterized in that it comprises: a processor and a memory, the memory being used to store a computer program, and the processor being used to call and run the computer program stored in the memory to perform the method according to any one of claims 1 to 5.

7. A storage medium for distributed indoor positioning based on federated learning, characterized in that it is used to store a computer program that causes a computer to perform the method according to any one of claims 1 to 5.