Model training procedure with privacy preserving

The privacy-preserving model training procedure addresses multi-vendor operability and privacy issues in federated learning by using knowledge distillation, enhancing security and efficiency in model updates across diverse communication devices.

WO2026124819A1 · PCT designated stage · Publication Date: 2026-06-18 · NOKIA TECHNOLOGIES OY

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: NOKIA TECHNOLOGIES OY
Filing Date: 2025-10-01
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

Existing model training methods in communication networks face challenges such as lack of support for multi-vendor operability, privacy concerns, and inefficiencies in data exchange, particularly in federated learning scenarios, leading to vulnerabilities and suboptimal model performance.

Method used

Implement a privacy-preserving model training procedure using knowledge distillation algorithms to enable secure and flexible model updates between communication devices, allowing for local training and knowledge transfer without exposing raw data, while supporting diverse model architectures and resource management.

Benefits of technology

Enhances security and reliability of model training by preserving privacy, improving flexibility and efficiency in data exchange, and ensuring optimal model performance across heterogeneous devices.

✦ Generated by Eureka AI based on patent content.

Figure EP2025078139_18062026_PF_FP_ABST

Abstract

Example embodiments of the present disclosure are directed to a model training procedure with privacy-preserving. In a method, a first apparatus receives, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus. The first apparatus updates a second model based on the first information. The second model is comprised in the first apparatus. The first apparatus updates the first model based on a training result of the updated second model. The first apparatus transmits, to the second apparatus, second information of the updated first model. In this way, the privacy of the model training procedure is enhanced, and the flexibility and effectiveness of the model training procedure are improved.

Description

MODEL TRAINING PROCEDURE WITH PRIVACY PRESERVING

FIELD

[0001] Various example embodiments of the present disclosure generally relate to the field of telecommunication and in particular, to methods, devices, apparatuses and computer readable storage medium for model training procedure with privacy-preserving.

BACKGROUND

[0002] A communication network may serve as a facility that enables communications between two or more communication devices or provides communication devices access to a data network. A mobile or wireless communication network is one example of a communication network. A communication device may be provided with a service by an application server.

[0003] The communication network may operate in accordance with standards such as those provided by Third Generation Partnership Project (3GPP) or European Telecommunications Standards Institute (ETSI). Examples of standards provided by 3GPP are the so-called 3GPP standards for cellular technology generations, such as 3GPP standards for 4G technology, 5G technology, 6G technology, etc.

SUMMARY

[0004] In a first aspect of the present disclosure, there is provided a first apparatus. The first apparatus comprises at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the first apparatus to: receive, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus; update a second model based on the first information, wherein the second model is comprised in the first apparatus; update the first model based on a training result of the updated second model; and transmit, to the second apparatus, second information of the updated first model.

[0005] In some embodiments, the first information may comprise at least one of information indicative of the first model or an architecture of the first model, a design rule for transferring information between the first model and the second model, or an indication indicating a first set of parameters of the first model.

[0006] In some embodiments, the first apparatus may receive, from the second apparatus, third information of a third model for privacy-preserving, wherein the third information is determined based on the second information, and the third model is associated with the first model.

[0007] In some embodiments, the third information may comprise at least one of information of the third model, a design rule for transferring information between the third model and the updated second model, or an indication indicating a third set of parameters of the third model.

[0008] In some embodiments, the first apparatus may determine the training result of the updated second model based on local data of the first apparatus.

[0009] In some embodiments, the second information may comprise at least one of information of the updated first model, or a performance report indicating at least one of performance of the first model or a difference between the updated first model and the updated second model.

[0010] In some embodiments, the first apparatus may transmit, to the second apparatus, a message for privacy-preserving, the message comprising at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

[0011] In some embodiments, the first apparatus may receive, from the second apparatus, an indication indicating at least one of an allowance of using the second model for updating the first model, or that the first information of the first model is to be received.

[0012] In some embodiments, the first apparatus may comprise a terminal device, and the second apparatus may comprise at least one of a further terminal device, a network device in a core network, or a network device in a radio access network.

[0013] In a second aspect of the present disclosure, there is provided a second apparatus. The second apparatus comprises at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the second apparatus to: transmit, to a first apparatus, first information of a first model for privacy-preserving of the first apparatus; and receive, from the first apparatus, second information of an updated first model, wherein the updated first model is associated with a training result of an updated second model, the updated second model is comprised in the first apparatus, and the updated second model is associated with the first information.

[0014] In some embodiments, the second apparatus may determine third information of a third model for privacy-preserving based on the second information, wherein the third model is associated with the first model; and transmit the third information to the first apparatus.

[0015] In some embodiments, the second apparatus may receive, from the first apparatus, a message for privacy-preserving, the message comprising at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

[0016] In some embodiments, the second apparatus may determine, based on the message, whether the first apparatus is allowed to use the privacy-preserving; and in accordance with a determination that the first apparatus is allowed, determine the first model based on at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.
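The admission and model-selection step in this embodiment can be illustrated with a minimal sketch. The disclosure does not specify field names, thresholds, or model descriptors, so everything below (`PrivacyRequest`, `select_first_model`, the memory and sample thresholds, the "mlp" descriptor) is a hypothetical assumption chosen only to make the decision flow concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrivacyRequest:
    """Message from the first apparatus; all field names are hypothetical."""
    free_memory_mb: int      # resource information of the first apparatus
    num_local_samples: int   # information of local data
    privacy_level: int       # assumed scale: 0 (none) to 2 (strict)


def select_first_model(msg: PrivacyRequest) -> Optional[dict]:
    """Return a description of the first model, or None if not admitted."""
    # Assumed admission rule: privacy must be requested and enough local data must exist.
    if msg.privacy_level == 0 or msg.num_local_samples < 100:
        return None
    # Assumed sizing rule: a smaller architecture for resource-constrained devices.
    hidden_units = 16 if msg.free_memory_mb < 256 else 64
    return {
        "architecture": "mlp",
        "hidden_units": hidden_units,
        "privacy_level": msg.privacy_level,
    }


admitted = select_first_model(PrivacyRequest(128, 500, 2))
rejected = select_first_model(PrivacyRequest(1024, 50, 1))
```

A returned dictionary here stands in for the first information that would then be transmitted; a `None` result stands in for the case where the first apparatus is not allowed.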

[0017] In some embodiments, the second apparatus may transmit, to the first apparatus, an indication indicating at least one of an allowance of using a second model for updating the first model, or that the first information of the first model is to be received.

[0018] In a third aspect of the present disclosure, there is provided a method. The method comprises: receiving, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus; updating a second model based on the first information, wherein the second model is comprised in the first apparatus; updating the first model based on a training result of the updated second model; and transmitting, to the second apparatus, second information of the updated first model.
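The four steps of this method can be sketched as follows. This is an illustration only: it assumes that both models are one-dimensional linear regressors and that information is transferred between them by fitting one model to the other's outputs (a simple form of knowledge distillation). The names `LinearModel`, `fit`, `first_apparatus_round`, and the `model_gap` report field are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    w: float = 0.0
    b: float = 0.0

    def predict(self, x: float) -> float:
        return self.w * x + self.b


def fit(model, xs, targets, lr=0.1, epochs=500):
    """Plain gradient descent on the mean squared error between model(xs) and targets."""
    n = len(xs)
    for _ in range(epochs):
        errs = [model.predict(x) - t for x, t in zip(xs, targets)]
        model.w -= lr * 2 * sum(e * x for e, x in zip(errs, xs)) / n
        model.b -= lr * 2 * sum(errs) / n
    return model


def first_apparatus_round(first_info, second_model, local_xs, local_ys):
    # Step 1: rebuild the first model from the received first information.
    first_model = LinearModel(first_info["w"], first_info["b"])
    # Step 2: update the local second model from the first model's outputs.
    fit(second_model, local_xs, [first_model.predict(x) for x in local_xs])
    # Step 3a: train the updated second model on local data (never transmitted).
    fit(second_model, local_xs, local_ys)
    # Step 3b: update the first model from the training result of the second model.
    soft_targets = [second_model.predict(x) for x in local_xs]
    fit(first_model, local_xs, soft_targets)
    # Step 4: second information = updated first model plus a performance report.
    gap = max(abs(first_model.predict(x) - s) for x, s in zip(local_xs, soft_targets))
    return {"w": first_model.w, "b": first_model.b, "report": {"model_gap": gap}}


local_xs = [0.0, 0.5, 1.0, 1.5, 2.0]
local_ys = [2.0 * x + 1.0 for x in local_xs]  # private local data, y = 2x + 1
second_info = first_apparatus_round({"w": 0.0, "b": 0.0}, LinearModel(), local_xs, local_ys)
```

Only the parameters of the first model and a summary report leave the function; the local samples and the second model stay with the first apparatus, which is the property the method is aiming for.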

[0019] In a fourth aspect of the present disclosure, there is provided a method. The method comprises: transmitting, to a first apparatus, first information of a first model for privacy-preserving of the first apparatus; receiving, from the first apparatus, second information of an updated first model, wherein the updated first model is associated with a training result of an updated second model, the updated second model is comprised in the first apparatus, and the updated second model is associated with the first information.

[0020] In a fifth aspect of the present disclosure, there is provided a first apparatus. The first apparatus comprises means for receiving, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus; means for updating a second model based on the first information, wherein the second model is comprised in the first apparatus; means for updating the first model based on a training result of the updated second model; and means for transmitting, to the second apparatus, second information of the updated first model.

[0021] In a sixth aspect of the present disclosure, there is provided a second apparatus. The second apparatus comprises means for transmitting, to a first apparatus, first information of a first model for privacy-preserving of the first apparatus; means for receiving, from the first apparatus, second information of an updated first model, wherein the updated first model is associated with a training result of an updated second model, the updated second model is comprised in the first apparatus, and the updated second model is associated with the first information.

[0022] In a seventh aspect of the present disclosure, there is provided a computer readable medium. The computer readable medium comprises instructions stored thereon for causing an apparatus to perform at least the method according to the third aspect.

[0023] In an eighth aspect of the present disclosure, there is provided a computer readable medium. The computer readable medium comprises instructions stored thereon for causing an apparatus to perform at least the method according to the fourth aspect.

[0024] It is to be understood that the Summary section is not intended to identify key or essential features of embodiments of the present disclosure, nor is it intended to be used to limit the scope of the present disclosure. Other features of the present disclosure will become easily comprehensible through the following description.

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] Some example embodiments will now be described with reference to the accompanying drawings, where:

[0026] FIG. 1 illustrates an example communication environment in which example embodiments of the present disclosure can be implemented;

[0027] FIG. 2 illustrates a signaling flow of a model training procedure with privacy-preserving in accordance with some embodiments of the present disclosure;

[0028] FIG. 3 illustrates a schematic diagram of federated learning in accordance with some embodiments of the present disclosure;

[0029] FIG. 4 illustrates an example flowchart of a model training procedure with privacy-preserving in accordance with some embodiments of the present disclosure;

[0030] FIG. 5 illustrates an example signaling flow of a model training procedure with privacy-preserving in accordance with some embodiments of the present disclosure;

[0031] FIG. 6 illustrates a flowchart of a method implemented at a first apparatus in accordance with some example embodiments of the present disclosure;

[0032] FIG. 7 illustrates a flowchart of a method implemented at a second apparatus in accordance with some example embodiments of the present disclosure;

[0033] FIG. 8 illustrates a simplified block diagram of a device that is suitable for implementing example embodiments of the present disclosure; and

[0034] FIG. 9 illustrates a block diagram of an example computer readable medium in accordance with some example embodiments of the present disclosure.

[0035] Throughout the drawings, the same or similar reference numerals represent the same or similar element.

DETAILED DESCRIPTION

[0036] Principles of the present disclosure will now be described with reference to some example embodiments. It is to be understood that these embodiments are described only for the purpose of illustration and to help those skilled in the art to understand and implement the present disclosure, without suggesting any limitation as to the scope of the disclosure. Embodiments described herein can be implemented in various manners other than the ones described below.

[0037] In the following description and claims, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

[0038] References in the present disclosure to “one embodiment,” “an embodiment,” “an example embodiment,” and the like indicate that the embodiment described may include a particular feature, structure, or characteristic, but it is not necessary that every embodiment includes the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.

[0039] It shall be understood that although the terms “first,” “second,”..., etc. in front of noun(s) and the like may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another and they do not limit the order of the noun(s). For example, a first element could be termed a second element, and similarly, a second element could be termed a first element, without departing from the scope of example embodiments. As used herein, the term “and / or” includes any and all combinations of one or more of the listed terms.

[0040] As used herein, “at least one of the following: ” and “at least one of ” and similar wording, where the list of two or more elements are joined by “and” or “or”, mean at least any one of the elements, or at least any two or more of the elements, or at least all the elements.

[0041] As used herein, unless stated explicitly, performing a step “in response to A” does not indicate that the step is performed immediately after “A” occurs and one or more intervening steps may be included.

[0042] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises”, “comprising”, “has”, “having”, “includes” and / or “including”, when used herein, specify the presence of stated features, elements, and / or components etc., but do not preclude the presence or addition of one or more other features, elements, components and / or combinations thereof.

[0043] As used in this application, the term “circuitry” may refer to one or more or all of the following:
(a) hardware-only circuit implementations (such as implementations in only analog and / or digital circuitry); and
(b) combinations of hardware circuits and software, such as (as applicable):
(i) a combination of analog and / or digital hardware circuit(s) with software / firmware; and
(ii) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions; and
(c) hardware circuit(s) and / or processor(s), such as a microprocessor(s) or a portion of a microprocessor(s), that requires software (e.g., firmware) for operation, but the software may not be present when it is not needed for operation.

[0044] This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and / or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in server, a cellular network device, or other computing or network device.

[0045] As used herein, the term “communication network” refers to a network following any suitable communication standards, such as New Radio (NR), Long Term Evolution (LTE), LTE-Advanced (LTE-A), Wideband Code Division Multiple Access (WCDMA), High-Speed Packet Access (HSPA), Narrow Band Internet of Things (NB-IoT) and so on. Furthermore, the communications between a terminal device and a network device in the communication network may be performed according to any suitable generation communication protocols, including, but not limited to, the first generation (1G), the second generation (2G), 2.5G, 2.75G, the third generation (3G), the fourth generation (4G), 4.5G, the fifth generation (5G), 5.5G, the sixth generation (6G) communication protocols, and / or any other protocols either currently known or to be developed in the future. Embodiments of the present disclosure may be applied in various communication systems. Given the rapid development in communications, there will of course also be future type communication technologies and systems with which the present disclosure may be embodied. It should not be seen as limiting the scope of the present disclosure to only the aforementioned system.

[0046] As used herein, the term “network device” refers to a node in a communication network via which a terminal device accesses the network and receives services therefrom. The network device may refer to a base station (BS) or an access point (AP), for example, a node B (NodeB or NB), an evolved NodeB (eNodeB or eNB), an NR NB (also referred to as a gNB), a Remote Radio Unit (RRU), a radio header (RH), a remote radio head (RRH), a relay, an Integrated Access and Backhaul (IAB) node, a low power node such as a femto, a pico, a non-terrestrial network (NTN) or non-ground network device such as a satellite network device, a low earth orbit (LEO) satellite and a geosynchronous earth orbit (GEO) satellite, an aircraft network device, and so forth, depending on the applied terminology and technology. In some example embodiments, radio access network (RAN) split architecture comprises a Centralized Unit (CU) and a Distributed Unit (DU) at an IAB donor node. An IAB node comprises a Mobile Terminal (IAB-MT) part that behaves like a UE toward the parent node, and a DU part of an IAB node behaves like a base station toward the next-hop IAB node.

[0047] The term “terminal device” refers to any end device that may be capable of wireless communication. By way of example rather than limitation, a terminal device may also be referred to as a communication device, user equipment (UE), a Subscriber Station (SS), a Portable Subscriber Station, a Mobile Station (MS), or an Access Terminal (AT). The terminal device may include, but is not limited to, a mobile phone, a cellular phone, a smart phone, voice over IP (VoIP) phones, wireless local loop phones, a tablet, a wearable terminal device, a personal digital assistant (PDA), portable computers, desktop computer, image capture terminal devices such as digital cameras, gaming terminal devices, music storage and playback appliances, vehicle-mounted wireless terminal devices, wireless endpoints, mobile stations, laptop-embedded equipment (LEE), laptop-mounted equipment (LME), USB dongles, smart devices, wireless customer-premises equipment (CPE), an Internet of Things (IoT) device, a watch or other wearable, a head-mounted display (HMD), a vehicle, a drone, a medical device and applications (e.g., remote surgery), an industrial device and applications (e.g., a robot and / or other wireless devices operating in an industrial and / or an automated processing chain contexts), a consumer electronics device, a device operating on commercial and / or industrial wireless networks, and the like. The terminal device may also correspond to a Mobile Termination (MT) part of an IAB node (e.g., a relay node). In the following description, the terms “terminal device”, “communication device”, “terminal”, “user equipment” and “UE” may be used interchangeably.

[0048] As used herein, the term “resource,” “transmission resource,” “resource block,” “physical resource block” (PRB), “uplink resource,” or “downlink resource” may refer to any resource for performing a communication, for example, a communication between a terminal device and a network device, such as a resource in time domain, a resource in frequency domain, a resource in space domain, a resource in code domain, or any other combination of the time, frequency, space and / or code domain resource enabling a communication, and the like. In the following, unless explicitly stated, a resource in both frequency domain and time domain will be used as an example of a transmission resource for describing some example embodiments of the present disclosure. It is noted that example embodiments of the present disclosure are equally applicable to other resources in other domains.

[0049] A core network function as described herein may be implemented as a core network entity that includes a combination of hardware processing circuit and software and / or firmware comprising machine-readable instructions, or software comprising machine-readable instructions that are executable by at least one processor of hardware processing circuit of an apparatus. A hardware processing circuit includes at least one processor and at least one memory storing machine-readable instructions that are executable by the at least one processor of the hardware processing circuit. A processor includes any or some combination of an accelerator, a microprocessor, a core of a multi-core microprocessor, a microcontroller, a programmable integrated circuit, a programmable gate array, a digital signal processor, a central processing unit, a graphic processing unit, a tensor processing unit. Memory includes any or some combination of volatile or non-volatile memory (e.g., a flash memory, cache, a random-access memory (RAM), and / or a read-only memory (ROM)). The memory stores the machine-readable instructions of the software and / or firmware for execution by the at least one processor of the hardware processing circuit. The machine-readable instructions, when executed by the at least one processor of the hardware processing circuit, cause the hardware processing circuit to perform the actions or operations of the methods described herein. For example, the session management function described herein may be implemented as a session management entity and the session management policy control function described herein may be implemented as a session management policy control entity, respectively.

[0050] FIG. 1 illustrates an example communication environment 100 in which example embodiments of the present disclosure can be implemented. In the communication environment 100, there are a plurality of communication devices, including first apparatus 110-1, 110-2, and 110-3 (which are collectively referred to as the first apparatus 110) and a second apparatus 120. The communication environment 100 also shows a model 140 which is accessible to the second apparatus 120. The first apparatus 110 may communicate with the second apparatus 120 bidirectionally. In the example of FIG. 1, the first apparatus 110 may be a UE and the second apparatus 120 may be a base station serving the UE.

[0051] As shown, in the communication environment 100, the first apparatus 110-1 may have data 130-1 (also referred to as local data 130-1), which may be accessible by the first apparatus 110-1 and may need to be protected locally. The local data 130-1 may be used for model training by the first apparatus 110-1. Additionally, the first apparatus 110-2 may have data 130-2 (also referred to as local data 130-2), and the first apparatus 110-3 may have data 130-3 (also referred to as local data 130-3).

[0052] The model 140 may be associated with the second apparatus 120. The model 140 may be implemented as a ML model and may be used for model training. Additionally, the model 140 may be used for federated learning (FL). For example, the second apparatus 120 may transmit data of the second apparatus 120 to the model 140 as an input.

[0053] For model training procedure with FL, a transmission from the first apparatus 110 to the second apparatus 120 may be data of a partial model. Correspondingly, a transmission from the second apparatus 120 to the first apparatus 110 may be data of an aggregated model. In these cases, the second apparatus 120 may be implemented as an aggregator, and the model 140 may be an aggregated full model.

[0054] It is to be understood that the number of devices and their connections shown in FIG. 1 are only for the purpose of illustration without suggesting any limitation. The communication environment 100 may include any suitable number of devices configured to implement example embodiments of the present disclosure.

[0055] Communications in the communication environment 100 may be implemented according to any proper communication protocol(s), comprising, but not limited to, cellular communication protocols, wireless local network communication protocols such as Institute of Electrical and Electronics Engineers (IEEE) 802.11 and the like, and / or any other protocols currently known or to be developed in the future. Moreover, the communication may utilize any proper wireless communication technology, comprising but not limited to: Code Division Multiple Access (CDMA), Frequency Division Multiple Access (FDMA), Time Division Multiple Access (TDMA), Frequency Division Duplex (FDD), Time Division Duplex (TDD), Multiple-Input Multiple-Output (MIMO), Orthogonal Frequency Division Multiplexing (OFDM), Discrete Fourier Transform spread OFDM (DFT-s-OFDM) and / or any other technologies currently known or to be developed in the future.

[0056] Load balancing, network energy saving, and mobility optimization are three main use cases of artificial intelligence (AI) in RAN. However, the implementations for AI face significant limitations, for example, a lack of support for multi-vendor operability in model training, transfer, and deployment. FL is used to overcome these challenges. Collaborative AI model training such as federated learning across multiple network nodes may be used to maintain data privacy and reduce communication overhead. Distributed / federated learning is particularly valuable in the context of RAN because it enables AI capabilities to be extended to the edge of the network, including the gNB-DU level.

[0057] Moreover, network functionality and interface procedures to support distributed / federated machine learning among NG-RAN nodes are to be studied. More efficient use of network resources may be enabled. The adaptability of the RAN to varying conditions may be improved. Furthermore, federated learning between the core network device implementing network data analytics function (NWDAF) in the core network and the NG-RAN may be explored, allowing for more comprehensive network optimization. For example, in the 6G communication network, the distributed learning and / or federated learning may be used.

[0058] In some implementations, horizontal federated learning (HFL) implementations between different core network devices implementing NWDAF (also referred to as “NWDAF”) may not be able to address a multi-vendor scenario, or issues related to heterogeneous features, models and privacy. In these cases, vertical federated learning (VFL) may be used to address some of the problems of the HFL.

[0059] In the VFL, multiple parties perform training on data sets that share the same sample space but differ in feature space. An alignment in sample and feature spaces among participating entities is usually required before applying VFL. Moreover, different from the HFL, the VFL allows joint training to be performed without exposing raw data, with each entity owning its own model but not needing the same model architecture.
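The sample alignment mentioned in this paragraph can be illustrated with a minimal sketch. It assumes two parties that each hold a feature table keyed by a sample identifier; the table contents, the identifiers, and the function name `align_samples` are hypothetical. A plain set intersection is used only for clarity: it reveals identifiers to whoever computes it, so a real deployment would more likely rely on a privacy-preserving technique such as private set intersection.

```python
def align_samples(party_a: dict, party_b: dict) -> list:
    """Return the sorted sample IDs that are present at both parties."""
    return sorted(party_a.keys() & party_b.keys())


# Hypothetical feature tables: same kind of sample (a UE), different features.
nwdaf_features = {
    "ue1": {"load": 0.7},
    "ue2": {"load": 0.2},
    "ue3": {"load": 0.9},
}
af_features = {
    "ue2": {"app_latency_ms": 35},
    "ue3": {"app_latency_ms": 80},
    "ue4": {"app_latency_ms": 12},
}

shared_ids = align_samples(nwdaf_features, af_features)

# Each party keeps only its own feature columns for the shared samples;
# the two views are row-aligned but are never merged into one table.
a_view = [nwdaf_features[i] for i in shared_ids]
b_view = [af_features[i] for i in shared_ids]
```

After alignment, row k of `a_view` and row k of `b_view` describe the same sample, so each party can train its own local model on its own features.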

[0060] In 5G core network (5GC), the VFL for analytics derivation leveraging sample and optionally feature alignment between the entities participating in VFL may be supported. Furthermore, the main entity facilitating the HFL operation may be NWDAF and other entities may be other NWDAF instances. The VFL may be performed between the NWDAF and an application function (also referred to as “AF”). In the multi-vendor scenario, the participating NWDAF instances are allowed to collaborate in the VFL without a need for model sharing.

[0061] Additionally, the VFL may support distributed inference. Moreover, in the VFL, vendor-specific local models and features may be deployed in each participant. Each participant may therefore select or configure the local model to be used, so that vendor- or operator-specific local models and features, including non-standardized features, are simpler to implement compared with HFL.

[0062] For the VFL, unlike the FL, where different parties collaborate with models trained on the same features but different samples (horizontal FL), the VFL involves multiple parties with different features for the same set of users. In the VFL, the participants may collaborate to build a global model that leverages diverse feature sets. The participants may share the model parameters after local training. Each participant may train their model on their unique feature set and then share the model parameters (e.g., weights) with a central aggregator.

[0063] In a public land mobile network (PLMN) with multiple NWDAFs, each NWDAF instance may perform data collection according to available data sources. Depending on the analytics identification and the deployment scenario, the different NWDAF instances may share the same sample space or train on different sample spaces.

[0064] Many applications in mobile networks require a large amount of verified and labelled data from multiple distributed sources like UEs or distributed gNBs or other entities such as NWDAF to be used to train a single common model. To minimize or even remove the training data exchange among collaborating network units, the FL may be applied. FL is a form of collaborative machine learning. In the FL, unlike centralized learning for which model training occurs at a single node, different versions of the global (FL) model are locally trained at the different distributed hosts. Additionally, only local models (as a whole or only model updates) are exchanged in an iterative fashion. Different from distributed machine learning, in the FL, a single ML model is trained at distributed nodes to use computation power of different nodes.

[0065] In other words, FL is different from distributed learning in the sense that each distributed node in an FL scenario has its own local training data, which may not come from the same distribution as the data at other similar peer nodes. Moreover, each node computes parameters for its local ML model and shares them with the aggregator. Furthermore, the central host does not compute a version or part of the model but combines parameters of all the distributed local models to generate a global model.

[0066] After training a local ML model, each individual learner may transfer its local model parameters (e.g., weight and bias values of a neural network), instead of a raw training dataset, to an aggregating unit (e.g., centralized or peer aggregation entity). Alternatively, or in addition, each learner may transmit its whole model to the aggregator.

[0067] The aggregating unit may utilize the local model parameters of all involved learners (also referred to as “participants”) to update a global model. The global model may eventually be fed back to the local learners for further local model updating iterations until the global model converges per a stopping criterion (e.g., set by a predefined number of iterations or a maximum tolerable value of the loss function to be minimized). As a result, each local learner may benefit from the datasets of the other local learners only through the global model, shared by the aggregator, without explicitly accessing the high volume of privacy-sensitive data available at each of the other local learners.

[0068] For example, in a communication network, the UEs may serve as local learners and the gNB may serve as an aggregator node. Note that both the partial (or locally updated) model and the aggregated model are transmitted on regular data communication links.

[0069] For centralized FL, the training process may include, for example, but not limited to, initialization, client selection, reporting and aggregation, and termination. For example, the initialization may include a machine learning model (e.g., linear regression or neural network) chosen to be trained on local nodes and initialized at a central server side. Moreover, in the client selection, all or a fraction of local nodes may be selected to start training on local data. The selected nodes may acquire a current statistical model from the central server, while the others wait for a next federated round.

[0070] In the reporting and aggregation, each selected node may send its locally updated version of the model to the central server for aggregation. The central server may aggregate the received models (e.g., by means of averaging corresponding model parameters) and send back the model updates to the nodes. The reporting and aggregation may be repeated iteratively.

[0071] For the termination of the FL, once a pre-defined termination criterion is met (e.g., a maximum number of iterations is reached), the central server may aggregate the updates and finalize the global model.
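
The centralized FL phases described in paragraphs [0069] to [0071] can be sketched as follows. This is only a minimal illustration under stated assumptions: a linear model trained by stochastic gradient descent, unweighted parameter averaging as the aggregation rule, and a fixed number of rounds as the termination criterion. The names `local_update`, `aggregate`, and `run_federated_rounds` are hypothetical and are not taken from the disclosure.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (feature vector, target value)


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.05, epochs: int = 5) -> List[float]:
    """Local training at one node: SGD on squared error (assumed model)."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def aggregate(models: Sequence[Sequence[float]]) -> List[float]:
    """Average corresponding parameters of the reported local models."""
    return [sum(column) / len(models) for column in zip(*models)]


def run_federated_rounds(node_data: Sequence[Sequence[Sample]], dim: int,
                         max_rounds: int = 50, fraction: float = 1.0,
                         seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    global_w = [0.0] * dim  # initialization at the central server
    for _ in range(max_rounds):  # termination: maximum number of rounds
        k = max(1, int(fraction * len(node_data)))
        selected = rng.sample(range(len(node_data)), k)  # client selection
        # Reporting: each selected node trains locally and returns its model.
        reported = [local_update(global_w, node_data[i]) for i in selected]
        global_w = aggregate(reported)  # aggregation, then the next round
    return global_w
```

In this sketch only parameter vectors leave a node; the samples in `node_data[i]` are read only inside `local_update`, which mirrors the statement that the training dataset is kept where it is generated.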

[0072] In the FL, the training dataset may be kept where it is generated. Moreover, the model training may be performed locally at each individual learner in the federation. Model aggregation may be performed by a centralized entity. Alternatively, the aggregation may be performed by specific peer entities that may collect models from their neighbors for aggregation in case inter-learner communication is feasible.

[0073] Moreover, the FL may require a large amount of data transmission over the air interface. With improvement in UE capabilities and better algorithms, it is increasingly realistic to train, re-train, or finetune models locally at the UE side. In these cases, the amount of data transmitted over the air interface may be reduced by using local training data. However, when the UE is in a fast-changing radio and non-radio environment from which the input data is collected, the FL may not be suitable. Additionally, every change in the environment may potentially require a new ML model version for optimal inference results.

[0074] Continual learning solutions for (resource constrained) edge devices are emerging and being developed in applied ML research community. Application of these novel techniques to 3GPP UE functionalities is to be studied.

[0075] In some implementations, an FL model may be trained based on contributions of local FL models from a larger group of UEs (e.g., throughout a cell) and may be generalized to more unknown situations (inputs). In these cases, the aggregator may be placed at the network side, or one of the UEs may host the aggregator if device-to-device (e.g., sidelink) based relaying is activated for the UEs. Moreover, the global model may be implemented in the aggregator.

[0076] The local FL model may be an FL model trained with local data in UEs. Additionally, the local FL model may have the same architecture as the global FL model. Furthermore, the UEs may have UE own local models. The UE own local model may be trained with local data in UEs. The model architecture of the UE own local model may differ from other UEs’ local FL models as well as from the global model. The model may be mostly designed by the UE or third-party vendors.

[0077] Note that in all current FL implementations, it is assumed that the local FL model and the UE own local models are the same, but they may be different under some scenarios. Following the above notions, some drawbacks may be associated with the conventional FL approach within the context of network-UE collaboration.

[0078] Although the FL provides improved privacy since raw data never leaves the UE, it does not provide a guarantee. The FL involves each UE sending unaudited gradient updates to the aggregator, which is problematic since deep neural networks may memorize individual training examples. In these cases, the local data of the UE may be completely exposed. The UEs may be vulnerable to gradient-based and white-box inference attacks. Some solutions are proposed based on differential privacy, but it is shown that there may be a tradeoff between privacy budget and model performance. Furthermore, in the FL framework, the same model is shared among the aggregator and the UEs, potentially disclosing sensitive information in a multi-vendor scenario.

[0079] Additionally, the UEs may seek to engage only in some of the FL training iterations to facilitate the development of their own local models suitable for broader geographic coverage and varying data distributions. However, in the FL framework, the UEs involved in FL iterations use the same architecture, which is provided by the aggregator. Hence, the conventional FL may not support such a use case, where the UE has its own local model and would like to be involved in only some of the FL iterations, without a lot of signaling, to improve its own local model.

[0080] Moreover, in the context of participating in FL training, the UEs may encounter a delay as they await the arrival of the trained global FL model before incorporating it to train the local FL model for the next iteration. The delivery of the global FL model may be slow, leading to a mismatch between the evolving data distribution and the model update, thereby resulting in diminished inference performance. If the UEs are powerful enough to perform retraining or finetuning, they may leave the FL and use their own local models for inference. However, the inability and unwillingness of UEs to rejoin the FL may be challenges for the FL if their own local models already perform well. Consequently, this divergence in model architectures may hamper the UEs from rejoining the FL process, as their own local models differ from the shared FL model, subsequently depriving the UEs of the advantages conferred by the FL framework.

[0081] In some implementations, in the FL framework, not all UEs may be eligible to be included in the FL training procedure since the same model is shared between aggregator and local trainers. If a UE does not have enough computational or energy resources to dedicate to the FL task, the UE may not be included in the training process and its data may not be leveraged to improve the global model.

[0082] In the case of synchronous FL, slower UEs may not be able to respect synchronization timing requirements due to limited resources dedicated to the FL task. In these cases, their updates may be missed.

[0083] Furthermore, the UEs may have varying data granularity, which may lead to different models being trained, depending on whether they focus on fine-grained, detailed data or coarser, more aggregated patterns. When these models are aggregated to create a global model, this heterogeneity may cause significant issues. The UEs with fine-grained data may learn highly localized patterns, while those with coarser data may capture more global trends. Aggregating these distinct updates may result in a global model that either overfits to local patterns or underfits, failing to capture critical details, thereby reducing overall model performance and generalizability. Additionally, aggregation strategies may struggle to balance the contributions of these UEs, causing a mismatch in the resulting model's alignment with the underlying data distributions of each UE.

[0084] In accordance with some example embodiments of the present disclosure, there is provided a solution for a model training procedure with privacy-preserving. In the solution, an aggregator transmits a model for privacy-preserving to a participant. The participant updates a local model of the participant with the model for privacy-preserving. Then, the participant updates the model for privacy-preserving with the updated local model. The participant transmits information of the model for privacy-preserving to the aggregator. In this way, the security and reliability of the model training procedure with privacy-preserving are improved.

[0085] Example embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.

[0086] As briefly described, in a solution of the present disclosure, the model for privacy-preserving may be used for the FL. FIG. 2 illustrates a signaling flow 200 of a model training procedure with privacy-preserving in accordance with some embodiments of the present disclosure. The signaling flow 200 involves the first apparatus 110 and the second apparatus 120 in FIG. 1. For the purposes of discussion, the signaling flow 200 will be discussed with reference to FIG. 1. In the FL, the first apparatus 110 may be the participant, and the second apparatus 120 may be the aggregator. Additionally, the model for privacy-preserving may be used in VFL, HFL, or other non-FL scenarios for knowledge sharing and knowledge transferring from a party to the requested party. In some embodiments, the model for privacy-preserving may enable the VFL and may act as an intermediate data exchange.

[0087] In some implementations, the first apparatus 110 may include a terminal device, and the second apparatus 120 may include at least one of a further terminal device, a network device in a core network, or a network device in a radio access network.

[0088] In operation, the second apparatus 120 transmits (2010), to the first apparatus 110, information (referred to as “first information”) of a model (referred to as “first model”) for privacy-preserving of the first apparatus 110. Correspondingly, the first apparatus 110 receives the first information from the second apparatus 120. In some implementations, the first model may be implemented as the privacy-preserving model, and the second model may be implemented as the local model of the UE.

[0089] Specifically, the first information may include, for example, but not limited to, information indicative of the first model or an architecture of the first model, a design rule for transferring information between the first model and the second model, or an indication indicating a first set of parameters of the first model. In some embodiments, the design rule may be determined by the first apparatus 110 by selecting a knowledge distillation algorithm. In some implementations, the design rule may include a knowledge transfer mechanism such as knowledge distillation or co-distillation procedures, for example, data-free knowledge distillation or noise-engineered knowledge distillation.

[0090] Additionally, the first apparatus 110 may transmit, to the second apparatus 120, a message for privacy-preserving. The second apparatus 120 may receive the message from the first apparatus 110. Specifically, the message may include at least one of resource information of the first apparatus 110, information of local data of the first apparatus 110, or a level of privacy of the first apparatus 110. In some implementations, the message may be transmitted before the first information of the first model is transmitted to the first apparatus 110. The message may be considered as a subscription message.

[0091] In some implementations, the second apparatus 120 may determine, based on the message, whether the first apparatus 110 is allowed to participate in the privacy-preserving procedure. If the first apparatus 110 is allowed, the second apparatus 120 may determine the first model based on at least one of resource information of the first apparatus 110, information of local data of the first apparatus 110, or a level of privacy of the first apparatus 110.

[0092] For example, there may be a lookup table for a relationship between a set of first models and a set of the participants or a set of groups of participants. Specifically, an architecture of the first model may be related to a participant or a group of participants. Moreover, the participants may be divided into a plurality of groups based on the message.
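
The admission decision and the lookup table of paragraphs [0091] and [0092] might be realized along the following lines. This is a rough sketch, not the disclosed implementation: the message fields, the group names, the threshold `MIN_SAMPLES`, and the architecture values are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SubscriptionMessage:
    compute_score: float  # hypothetical normalized resource indicator in [0, 1]
    num_samples: int      # hypothetical summary of the local data


@dataclass(frozen=True)
class Architecture:
    layers: int
    neurons_per_layer: int


# Lookup table: participant group -> first-model architecture (made-up values).
LOOKUP_TABLE: Dict[str, Architecture] = {
    "constrained": Architecture(layers=2, neurons_per_layer=16),
    "standard": Architecture(layers=4, neurons_per_layer=64),
}

MIN_SAMPLES = 100  # hypothetical admission threshold


def assign_group(msg: SubscriptionMessage) -> Optional[str]:
    """Return the group of the sender, or None if it is not admitted."""
    if msg.num_samples < MIN_SAMPLES:
        return None
    return "constrained" if msg.compute_score < 0.5 else "standard"


def select_first_model(msg: SubscriptionMessage) -> Optional[Architecture]:
    """Map the message to a first-model architecture via the lookup table."""
    group = assign_group(msg)
    return None if group is None else LOOKUP_TABLE[group]
```

Keeping the table keyed by group rather than by individual participant means a newly admitted participant only needs a group assignment to receive an architecture.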

[0093] In some embodiments, the second apparatus 120 may transmit, to the first apparatus 110, an indication indicating at least one of: an allowance of using a second model for updating the first model, or that the first information of the first model is to be received. The first apparatus 110 may receive the indication. Upon receiving the indication, the first apparatus 110 may use the second model and prepare to receive the first information of the first model.

[0094] The first apparatus 110 updates (2030) a second model based on the first information. The second model is comprised in the first apparatus 110. In these cases, the first apparatus 110 may use the design rule. Specifically, the second model may be a local model of the first apparatus 110. In some embodiments, the first apparatus 110 updates the second model by transferring knowledge of the first model to the second model, for example, by using a knowledge distillation algorithm.

[0095] The first apparatus 110 updates (2040) the first model based on a training result of the updated second model. In some embodiments, the training result of the updated second model may be determined based on local data of the first apparatus 110 by the first apparatus 110. For example, the updated second model may be trained with the local data of the first apparatus 110. In addition, the first model may be updated by transferring knowledge of the updated second model to the first model, for example, by using a knowledge distillation algorithm.

[0096] In some embodiments, the first apparatus 110 transmits (2050), to the second apparatus 120, second information of the updated first model. The second apparatus 120 receives (2060) the second information from the first apparatus 110. In these cases, the second information may include information of the updated first model, or a performance report indicating at least one of performance of the first model or a difference between the updated first model and the updated second model. For example, the information of the updated first model may include the parameters of the updated first model, and the difference between the updated first model and the updated second model may be a deviation between the two models.
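
One participant-side round covering paragraphs [0094] to [0096] can be sketched as below. The sketch makes several assumptions that are not in the disclosure: both models are tiny linear regressors that differ only in whether they have a bias term, knowledge transfer is done by fitting the student to the teacher's outputs on synthetic inputs, and the reported deviation is the largest output gap on those inputs. `LinearModel`, `distill`, and `participant_round` are hypothetical names.

```python
import random
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[List[float], float]


class LinearModel:
    """Tiny regressor; `use_bias` lets the two models differ in architecture."""

    def __init__(self, dim: int, use_bias: bool) -> None:
        self.w = [0.0] * dim
        self.b = 0.0
        self.use_bias = use_bias

    def predict(self, x: Sequence[float]) -> float:
        bias = self.b if self.use_bias else 0.0
        return sum(wi * xi for wi, xi in zip(self.w, x)) + bias

    def fit(self, pairs: Sequence[Sample], lr: float = 0.1,
            epochs: int = 100) -> None:
        """Plain SGD on squared error."""
        for _ in range(epochs):
            for x, y in pairs:
                err = self.predict(x) - y
                self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]
                if self.use_bias:
                    self.b -= lr * err


def distill(teacher: LinearModel, student: LinearModel,
            inputs: Sequence[Sequence[float]]) -> None:
    """Fit the student to the teacher's outputs on the transfer inputs."""
    student.fit([(list(x), teacher.predict(x)) for x in inputs])


def participant_round(first_model: LinearModel, second_model: LinearModel,
                      local_data: Sequence[Sample],
                      seed: int = 0) -> Dict[str, object]:
    rng = random.Random(seed)
    dim = len(first_model.w)
    # Synthetic transfer inputs, so distillation itself uses no local sample.
    transfer = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(40)]
    distill(first_model, second_model, transfer)  # update the second model
    second_model.fit(local_data)                  # train on local data
    distill(second_model, first_model, transfer)  # update the first model
    deviation = max(abs(first_model.predict(x) - second_model.predict(x))
                    for x in transfer)
    # Second information: updated first-model parameters plus the deviation.
    return {"parameters": list(first_model.w), "deviation": deviation}
```

Only the returned dictionary would be transmitted in this sketch; `local_data` and the second model's parameters stay at the participant.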

[0097] For example, the performance of the first model in the performance report may include an accuracy, a precision, a recall, an area under a receiver operating characteristic (ROC) curve, an F-score, a mean squared error (MSE), a root mean squared error (RMSE), or a mean absolute error (MAE) of the first model. It is noted that the metrics mentioned herein are only for the purpose of illustration, without suggesting any limitation. Embodiments of the present disclosure are not limited here.
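
For illustration, most of the metrics listed in paragraph [0097] can be computed as in the following sketch. It assumes binary labels encoded as 0 and 1 for the classification metrics and uses the F1 variant of the F-score; the area under the ROC curve is omitted for brevity. The function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def classification_report(y_true: Sequence[int],
                          y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def regression_report(y_true: Sequence[float],
                      y_pred: Sequence[float]) -> Dict[str, float]:
    """MSE, RMSE, and MAE of the predictions."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}
```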

[0098] In some implementations, the model training procedure may include an iterative process. In these cases, the iterative process may include determining further information based on the second information, and using the further information as further first information in the next iteration.

[0099] Specifically, the further information (referred to as “third information”) of a third model may be determined based on the second information. In these cases, the third model may be associated with the first model. In addition, the second apparatus 120 may transmit the third information to the first apparatus 110. For example, the second apparatus 120 may aggregate the third information based on the performance report of the first apparatus 110.

[0100] Furthermore, the second apparatus 120 may transmit, to the first apparatus 110, the third information of the third model for privacy-preserving. The third information may be determined based on the second information, and the third model is associated with the first model. Correspondingly, the first apparatus 110 may receive the third information of the third model from the second apparatus 120.

[0101] For example, the second apparatus 120 may aggregate the second information from one or more first apparatuses 110 to determine the third information. In some embodiments, the third information may be determined by inputting the second information from one or more first apparatuses 110 to a global model of the second apparatus 120. In these cases, the third information may be the output of the global model.
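
A possible aggregation step at the second apparatus, combining the reported parameters with the performance report, is sketched below. The weighting rule (weights proportional to the reported accuracy) is an assumption made for illustration and is not specified in the disclosure; the report layout and the name `aggregate_reports` are likewise hypothetical. The sketch further assumes that all reports come from participants sharing the same first-model architecture, so that their parameter vectors have equal length.

```python
from typing import Dict, List, Sequence


def aggregate_reports(reports: Sequence[Dict[str, object]]) -> List[float]:
    """Accuracy-weighted average of the reported first-model parameters."""
    total = sum(float(r["accuracy"]) for r in reports)
    if total <= 0.0:
        raise ValueError("at least one report needs a positive accuracy")
    dim = len(reports[0]["parameters"])
    return [
        sum(float(r["accuracy"]) * r["parameters"][i] for r in reports) / total
        for i in range(dim)
    ]
```

The returned vector would play the role of the parameters carried by the third information in the next iteration.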

[0102] Specifically, the third information may include, for example, but not limited to, information of the third model, a design rule for transferring information between the third model and the updated second model, or an indication indicating a third set of parameters of the third model. For example, the information of the third model may include parameters of the third model. In some implementations, the third information may be used for the privacy-preserving model in the next iteration.

[0103] Furthermore, at least a part of the procedures in FIG. 2 may be executed iteratively. In some embodiments, at the end of the iteration, the second apparatus 120 may use the information of the first model, for example, to determine the global model, and the global model may be used to generate an output for the network environment.

[0104] In this way, the first model for privacy-preserving may be used to transfer the knowledge associated with the local data of the first apparatus 110 to the second apparatus 120 without transmitting the local data. Additionally, the second model of the first apparatus 110 may be updated. Moreover, the first apparatus 110 may be allowed to use the second model in a flexible way. Thus, the privacy and the flexibility of the model training procedure are enhanced. Furthermore, the efficiency of the model training procedure is improved.

[0105] Reference is made to FIG. 3, which illustrates a schematic diagram 300 of federated learning. The schematic diagram 300 involves UEs 310-1, 310-2, and 310-3 (which are collectively referred to as the UE 310), and an aggregator 320. The UE 310 may be an implementation of the first apparatus 110 in FIG. 1, and the aggregator 320 may be the second apparatus 120 in FIG. 1.

[0106] It is to be understood that the number of devices and their connections shown in FIG. 3 are only for the purpose of illustration without suggesting any limitation. The schematic diagram 300 may include any suitable number of devices configured to implement example embodiments of the present disclosure. Although not shown, it would be appreciated that one or more additional devices may be located in the schematic diagram 300.

[0107] Moreover, the schematic diagram 300 involves UE owned local models 340-1, 340-2, and 340-3 (which are collectively referred to as the UE owned local model 340), local privacy-preserving models 330-1, 330-2, and 330-3 (which are collectively referred to as the local privacy-preserving model 330), and privacy-preserving model parameters 350-1, 350-2, and 350-3 (which are collectively referred to as the privacy-preserving model parameters 350).

[0108] As illustrated, the UE owned local model 340-1 may be implemented in the UE 310-1. Additionally, the UE owned local model 340-1 may be associated with the local privacy-preserving model 330-1, which means knowledge may be transferred between the UE owned local model 340-1 and the local privacy-preserving model 330-1. Furthermore, the UE 310-1 may transmit the privacy-preserving model parameters 350-1 to the aggregator 320.

[0109] Correspondingly, the UE owned local model 340-2 may be implemented in the UE 310-2. Additionally, the UE owned local model 340-2 may be associated with the local privacy-preserving model 330-2, which means the knowledge may be transferred between the UE owned local model 340-2 and the local privacy-preserving model 330-2. Furthermore, the UE 310-2 may transmit the privacy-preserving model parameters 350-2 to the aggregator 320.

[0110] Moreover, the UE owned local model 340-3 may be implemented in the UE 310-3. Additionally, the UE owned local model 340-3 may be associated with the local privacy-preserving model 330-3, which means the knowledge may be transferred between the UE owned local model 340-3 and the local privacy-preserving model 330-3. Furthermore, the UE 310-3 may transmit the privacy-preserving model parameters 350-3 to the aggregator 320.

[0111] Furthermore, a mapping function of the relationship between the UE owned local model 340 and the local privacy-preserving model 330 may be used for transferring the knowledge. In some embodiments, the mapping function may be an implementation of the design rule.

[0112] In the embodiments of FIG. 3, the UE owned local model 340 and the local privacy-preserving model 330 are updated by the UE 310 in each or several iteration(s) of the FL training. In some implementations, the UE owned local model 340 may have a different architecture from the local FL models.

[0113] In this arrangement, the UEs 310 may be categorized into distinct groups according to their resource attributes, such as computational capabilities and unique local datasets. Subsequently, the aggregator 320, which may be the gNB, the UE, or the NWDAF, may designate and send a tailored privacy-preserving model for every UE group. Upon receiving the diverse local privacy-preserving models trained by the UEs 310, the aggregator 320 may aggregate the received local privacy-preserving model parameters 350. The global privacy-preserving models may then be transmitted back to the UEs 310 for further iterations until the model converges.

[0114] Furthermore, the aggregator 320 may employ generic models as the privacy-preserving models tailored to the capabilities of each UE group. For machine vision tasks, a convolutional neural network may present a viable choice, while tasks involving temporal sequential data, like mobility scenarios, may benefit from a memory-based approach such as long short-term memory (LSTM).

[0115] In each iteration of the FL training, every UE may undertake a knowledge transfer mechanism such as knowledge distillation or co-distillation procedures (e.g., data-free knowledge distillation, noise-engineered knowledge distillation, etc.). The design rule that the UE 310 must follow is that the UE 310 may use its own local model with any knowledge distillation algorithm for preparing the privacy-preserving model weights. In other words, each UE may have the freedom to use its own local model and to select any specific algorithm for knowledge transfer between its own local model and the privacy-preserving model based on its required level of privacy. If the UE 310 requires a high level of privacy, data-free knowledge distillation approaches may be applied.
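
How the required privacy level might steer the choice of transfer data is sketched below. This is a deliberately crude stand-in: published data-free distillation methods typically synthesize inputs with a trained generator, whereas this sketch simply draws Gaussian noise. The privacy level values `"high"` and `"low"` and the function name `build_transfer_inputs` are assumptions for illustration.

```python
import random
from typing import List, Sequence


def build_transfer_inputs(local_inputs: Sequence[Sequence[float]],
                          privacy_level: str, dim: int,
                          count: int = 32, seed: int = 0) -> List[List[float]]:
    """Choose the inputs on which knowledge is transferred between models."""
    if privacy_level == "high":
        # Data-free variant (simplified): random noise, no local sample used.
        rng = random.Random(seed)
        return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                for _ in range(count)]
    # Lower privacy requirement: distill directly on the local inputs.
    return [list(x) for x in local_inputs]
```

The selected inputs would then be fed to whichever knowledge distillation algorithm the UE has chosen; that choice stays local to the UE.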

[0116] In this way, the UE 310 may be able to use the UE owned local models. The local privacy-preserving model 330 may be used to transfer knowledge between the received local privacy-preserving model parameter 350 and a memory-based approach. Thus, the flexibility and effectiveness of the training are improved.

[0117] To realize the FL with a privacy-preserving mechanism in a cellular network scenario, the procedures for the FL may involve the privacy-preserving model. Reference is made to FIG. 4, which illustrates an example flowchart 400 of a model training procedure with privacy-preserving in accordance with some embodiments of the present disclosure. In the approach of FIG. 4, the UE may be an implementation of the first apparatus 110, and the aggregator may be an implementation of the second apparatus 120 in FIG. 1. Additionally, the embodiment of FIG. 4 may involve one or more UEs, and for the purposes of discussion, one of the UEs may be described herein.

[0118] As illustrated, at 410, the aggregator may send the design rule and assistance information to the UE for preparing privacy-preserving models. In some embodiments, the aggregator may have means to indicate to the UE whether the sent model is a privacy-preserving model or an actual model. The design rule sent by the aggregator may inform the UEs that they may transfer knowledge to the privacy-preserving model based on their own algorithm. Selection of the algorithm may be based on their level of required data privacy. Furthermore, the aggregator should have means to send the architecture and parameters of the UE group-specific privacy-preserving models to each group of participating UEs so that they may update the parameters.

[0119] At 420, the UE trains local models and prepares privacy-preserving models based on the configured rules. Furthermore, at 430, the UE may send privacy-preserving models to the aggregator. Each participating UE may have means to update the privacy-preserving model parameters based on its own local model.

[0120] Moreover, each participating UE may have means to update the privacy-preserving model. For example, the UE may update the privacy-preserving model based on the UE owned local model. Furthermore, each participating UE may have means to report on the performance of the local privacy-preserving model and its own local model to the aggregator so that the aggregator may tweak the privacy-preserving model architectures.

[0121] In some embodiments, the model evaluation may be done via a variety of metrics, such as an accuracy, a precision, a recall, an area under a receiver operating characteristic (ROC) curve, an F-score, an MSE, an RMSE, or an MAE of the model.

[0122] At 440, the aggregator may aggregate the privacy-preserving models.

[0123] At 450, the aggregator may send the aggregated model to the UE based on the privacy-preserving models. At 460, the UE may convert the aggregated model to the local model based on the design rule. Subsequently, the procedure may return to 420 of the flowchart 400 for the next iteration.

[0124] In the embodiments of FIG. 4, the UE may train its own local model with a different architecture and parameters compared to the privacy-preserving model. The UE may be allowed to join the FL training procedure using, e.g., a lighter model compared to the global model. Using lighter models allows all UEs to be eligible for the FL training procedure, even in the case of synchronous FL. In the FL, a UE with limited computational resources may also perform the calculation. The UE may train its own local model with its local data and based on a private rule (not disclosed to the aggregator), and may update the privacy-preserving model parameters based on the local model parameters.

[0125] In this way, privacy is guaranteed while the global model is still updated using updates from local trainers. The privacy-preserving model may be useful for multi-vendor scenarios in which the UEs from different vendors may not be willing to share sensitive information about the architecture of their local models. Thus, the reliability and flexibility of the model training procedure are improved.

[0126] FIG. 5 illustrates an example signaling flow 500 of a model training procedure with privacy-preserving in accordance with some embodiments of the present disclosure. As illustrated, the signaling flow 500 involves more than one participant 510 and an aggregator 520. At least one of the participants may be an implementation of the first apparatus 110 in FIG. 1, and the aggregator 520 may be an implementation of the second apparatus 120 in FIG. 1.

[0127] For example, the participant 510 may be implemented as a UE, and the aggregator 520 may be implemented as a gNB, a further UE, or a core network device implementing NWDAF. In some implementations, the aggregator 520 may be the core network device implementing NWDAF. In some embodiments, the participant 510 may be implemented as a DU and the aggregator 520 may be implemented as a CU connecting with the DU. Additionally, the aggregator 520 and the participants 510 may be implemented as network devices in different RANs. The participants 510 may be implemented as network devices and the aggregator 520 may be implemented as a core network device implementing a core network function. The aggregator 520 and the participants 510 may be implemented as core network devices implementing NWDAF.

[0128] It is noted that the implementations of the participant 510 and the aggregator 520 mentioned herein are only for the purpose of illustration, without suggesting any limitation. Embodiments of the present disclosure are not limited here. Any other distributed entities in the network may be implementations of the participant 510 and the aggregator 520. At 5010, one of the participants (referred to as “the participant 510”) may transmit, to the aggregator 520, a subscription message for a service associated with an FL task. Specifically, the participant 510 may subscribe to the aggregator 520 for the Nnwdaf_Aggregation service, to join the FL task to improve the local model of the participant 510.

[0129] Furthermore, the participant 510 may transmit metadata to the aggregator 520. The metadata may include, for example, but not limited to, a report on the resource profile that the participant 510 may dedicate to the FL task, or local training data set characteristics of the participant 510. In some embodiments, the resource profile may indicate at least one of computational resources, memory resources, networking resources, power resources, or energy resources. Additionally, the local training data set characteristics may include data sample size, number of features, or amount of missing data samples.

[0130] At 5020, the aggregator 520 may divide the participants into a set of groups with similar characteristics based on the information received in the subscription message. For example, the aggregator 520 may create a score map along with the local dataset characteristics and the resource profiles. Subsequently, the aggregator 520 may group the participants into different clusters based on the assigned score. In these cases, the aggregator 520 may also select a part of the participants for the FL task.
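
The score map and score-based clustering of step 5020 could look roughly like the sketch below. Everything numeric here is an assumption made for illustration: the metadata fields are normalized to the range 0 to 1, resources and data quality are weighted equally, groups are equal-width score bands, and participants scoring below `min_score` are not selected. The names `Metadata`, `score`, and `group_participants` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Metadata:
    participant_id: str
    compute: float           # normalized resource indicators in [0, 1]
    memory: float
    energy: float
    sample_size: int         # local training data set characteristics
    missing_fraction: float  # share of missing data samples in [0, 1]


def score(m: Metadata, full_size: int = 10000) -> float:
    """Combine the resource profile and data characteristics into one score."""
    resource = (m.compute + m.memory + m.energy) / 3
    data = min(m.sample_size / full_size, 1.0) * (1.0 - m.missing_fraction)
    return 0.5 * resource + 0.5 * data


def group_participants(metas: Sequence[Metadata], num_groups: int = 3,
                       min_score: float = 0.1) -> Dict[int, List[str]]:
    """Bucket participants into score bands; low scorers are not selected."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for m in metas:
        s = score(m)
        if s < min_score:
            continue  # participant not selected for the FL task
        band = min(int(s * num_groups), num_groups - 1)
        groups[band].append(m.participant_id)
    return dict(groups)
```

Each resulting group index could then be associated with one privacy-preserving model architecture in the following step.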

[0131] Furthermore, at 5030, the aggregator 520 may determine, for each group, the architecture of the privacy-preserving model that needs to be distributed to all participants belonging to that group. For example, one architecture of the privacy-preserving models may be associated with one participant, or one architecture of the privacy-preserving models may be associated with a group of the participants. The architecture (e.g., number of layers in a neural network (NN), number of neurons, etc.) of the privacy-preserving model may be designed and evaluated by performing empirical experimentation on simulated data, or on a part of the real data accessible within the application at hand or similar applications.

[0132] In some embodiments, the aggregator 520 may build several privacy-preserving models with different complexity levels. In these cases, each privacy-preserving model may be assigned to a group of participants. Once the optimal (or near optimal) privacy-preserving model is determined, it may be fixed for the different participants within a group, even for new participants that may join the group in the future. In another approach, the privacy-preserving models may be provided by the operator to the network to be used in this framework.

[0133] In this manner, the participant 510 may utilize a lighter model compared with the privacy-preserving model, thereby making every participant eligible to join the FL procedure. This may also improve energy efficiency, since less energy may be consumed to train or update the local model. Moreover, a slower participant 510 may be enabled to join the synchronization. The FL may use a lighter local model to fulfill timing requirements.

[0134] Moreover, at 5040, the aggregator 520 transmits, to the participant 510, information for the privacy-preserving. Specifically, the information may include the privacy-preserving model, information indicative of the privacy-preserving model or an architecture of the privacy-preserving model, a design rule for transferring information between the privacy-preserving model and the local model, or an indication indicating a set of parameters of the privacy-preserving model.

[0135] For example, the aggregator 520 may respond to the participant 510 by sending the privacy-preserving model. The aggregator 520 may specify the design rule for the participant 510. Moreover, the design rule may be derived from the indication of the privacy-preserving model. The aggregator 520 may indicate to the participant 510 that the model architecture and parameters sent in the next step are related to the privacy-preserving model. When the participant 510 receives this indication, it may know that a knowledge transfer algorithm specific to the participant 510 may be applied on the local model of the participant 510 to convert the local model into the privacy-preserving model and vice versa. The aggregator 520 may also send an indication about the privacy-preserving model parameters (architecture, input / output size, etc.).
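As a minimal sketch of the information sent at 5040, the container below carries the optional elements listed above and rejects a message in which none of them is present. The class name, field names, and the validation rule are hypothetical; the disclosure does not define a message encoding.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PrivacyPreservingInfo:
    """Information for the privacy-preserving (all names are hypothetical)."""
    model_params: Optional[list] = None          # the privacy-preserving model itself
    architecture: Optional[dict] = None          # e.g. layer count, input / output size
    design_rule: Optional[str] = None            # rule for transferring knowledge between models
    parameter_indication: Optional[str] = None   # indication of a set of model parameters

    def __post_init__(self):
        # Assumed rule: a message with no information element is not meaningful.
        if all(getattr(self, f.name) is None for f in fields(self)):
            raise ValueError("at least one information element must be present")


info = PrivacyPreservingInfo(
    architecture={"layers": 2, "input_size": 16, "output_size": 4},
    design_rule="participant-selected knowledge distillation",
)
```

Keeping every element optional mirrors the "at least one of" wording, so an aggregator could send only an architecture description in one round and full parameters in another.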

[0136] At 5050, the participant 510 updates the local model of the participant 510 by transferring the knowledge from the received privacy-preserving model using the design rule (also referred to as “private rule”).

[0137] Furthermore, at 5060, the participant 510 may train the privacy-preserving model by transferring the knowledge from the local model using its private rule.

[0138] Moreover, at 5070, the participant 510 updates the privacy-preserving model by transferring the knowledge from the local model using the private rule.
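The knowledge transfer used at 5050 to 5070 is left to the participant's private rule. As one possible illustration, the sketch below computes a soft-target distillation loss (Kullback-Leibler divergence between temperature-softened outputs), a widely used form of knowledge distillation. The function names and the temperature value are assumptions, not part of the disclosure.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) computed on temperature-softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# At 5050 the local model is the student and the privacy-preserving model the teacher.
loss_5050 = distillation_loss(student_logits=[0.2, 1.1, -0.3],
                              teacher_logits=[0.0, 2.0, -1.0])
# At 5070 the roles are swapped.
loss_5070 = distillation_loss(student_logits=[0.0, 2.0, -1.0],
                              teacher_logits=[0.2, 1.1, -0.3])
```

Because the loss depends only on model outputs, the two models may have entirely different architectures, which is what allows each participant to keep its local model private.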

[0139] In some embodiments, at 5080, the participant 510 uploads the privacy-preserving model to the aggregator 520. The participant 510 may transmit, to the aggregator 520, the updated privacy-preserving model and the performance report.

[0140] For example, the performance report may indicate at least one of performance of the privacy-preserving model or a difference between the updated privacy-preserving model and the updated local model.

[0141] Specifically, the privacy-preserving model and the performance of the local model may be reported to the aggregator 520.
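A performance report of this kind could be built as in the following sketch. Since the two models may differ in architecture, the difference is measured here on their predictions over a shared validation set rather than on parameters. This metric, the function name, and the dictionary keys are assumptions made for illustration.

```python
def performance_report(proxy_predictions, local_predictions, labels):
    """Summarize proxy accuracy and proxy/local disagreement on a validation set."""
    n = len(labels)
    accuracy = sum(p == y for p, y in zip(proxy_predictions, labels)) / n
    disagreement = sum(
        p != l for p, l in zip(proxy_predictions, local_predictions)
    ) / n
    return {"proxy_accuracy": accuracy, "model_difference": disagreement}


report = performance_report(
    proxy_predictions=[1, 0, 1, 1],
    local_predictions=[1, 1, 1, 1],
    labels=[1, 0, 0, 1],
)
```

Only these aggregate figures leave the participant; neither the local model's weights nor its architecture appear in the report.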

[0142] At 5090, the aggregator 520 may aggregate the updated privacy-preserving model received from the participant 510. Specifically, the aggregator 520 may receive the updated parameters of the updated privacy-preserving models from all participants belonging to a group. The aggregator 520 may modify the assigned privacy-preserving model of each group of participants based on the difference between the local model and the privacy-preserving model. In some implementations, the participant 510 may join and leave the FL framework without extensive signaling.

[0143] For example, if the accuracy of the local model is lower than a threshold, a more complex privacy-preserving model may be assigned to that group of participants. Additionally, the aggregator 520 may determine weights for the aggregation of the updated privacy-preserving models. In these cases, the weight may be negatively correlated with the difference between the updated privacy-preserving model and the updated local model.
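The weighting described above can be sketched as follows. The mapping 1 / (1 + difference) is only one of many functions that decrease as the reported difference grows, and it is an assumption here, as are the function names. Parameters are represented as flat lists of floats for simplicity.

```python
def aggregation_weights(differences):
    """Normalized weights that shrink as the reported model difference grows."""
    raw = [1.0 / (1.0 + d) for d in differences]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_average(param_vectors, weights):
    """Element-wise weighted mean of same-length parameter vectors."""
    length = len(param_vectors[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, param_vectors))
        for i in range(length)
    ]


# Two participants of one group; the second reports a larger difference.
weights = aggregation_weights([0.0, 1.0])
merged = weighted_average([[3.0, 0.0], [0.0, 3.0]], weights)
```

With these inputs the first participant receives twice the weight of the second, so the merged parameters lean toward the update whose privacy-preserving model tracks its local model more closely.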

[0144] In this way, no sensitive information about the local model architecture may be shared between the participants and the aggregator 520, enabling the use of the FL paradigm in multi-vendor scenarios as well.

[0145] Moreover, as illustrated, the embodiment of FIG. 5 may include an iterative process. Specifically, the iterative process may include one or more iterations of the steps 5040, 5050, 5060, 5070, 5080, and 5090. For the purpose of discussion, the iteration may be also referred to as a loop. The loop may be executed multiple times until the predetermined number of iterations is reached or a pre-defined termination criterion is met.
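The loop control described above can be sketched as below. The termination criterion used here (the change in a reported metric falling below a tolerance) is an assumed example, since the disclosure does not specify one, and `run_one_round` is a hypothetical callable standing in for steps 5040 to 5090.

```python
def run_rounds(run_one_round, max_rounds, tolerance):
    """Repeat one FL round until the metric stabilizes or max_rounds is reached."""
    previous = None
    for round_index in range(1, max_rounds + 1):
        metric = run_one_round(round_index)   # stands in for steps 5040 to 5090
        if previous is not None and abs(previous - metric) < tolerance:
            return round_index, metric        # termination criterion met
        previous = metric
    return max_rounds, previous               # predetermined round count reached


# Toy round function whose metric (e.g. a loss) decays as 1 / round.
rounds_used, final_metric = run_rounds(lambda r: 1.0 / r, max_rounds=10, tolerance=0.06)
```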

[0146] In some embodiments, for transferring the knowledge from a model to a further model, the participant 510 may select any knowledge distillation algorithm to train the privacy-preserving model and the local model jointly. The aggregator 520 may determine that the participant 510 may apply any kind of knowledge distillation algorithm to transfer the knowledge to the privacy-preserving model.

[0147] As only the privacy-preserving model is shared between the participant 510 and the aggregator 520, to share the knowledge from the privacy-preserving model to the local model of the participant 510, the objective function of the local model may be modified. The local model may learn from the privacy-preserving model, as well as from the available data.
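One common way to write such a modified objective, given here only as an illustration because the disclosure leaves the distillation algorithm open, adds a distillation term to the ordinary task loss. All symbols are assumptions: \theta denotes the local model parameters, D the local data, z_{\theta} and z_{\mathrm{pp}} the output logits of the local and privacy-preserving models, \sigma the softmax function, T a temperature, and \lambda a weighting factor.

```latex
\mathcal{L}_{\mathrm{local}}(\theta)
  = \mathcal{L}_{\mathrm{task}}(\theta; D)
  + \lambda \, T^{2} \,
    \mathrm{KL}\!\left(
      \sigma\!\left( z_{\mathrm{pp}} / T \right)
      \,\middle\|\,
      \sigma\!\left( z_{\theta} / T \right)
    \right)
```

The first term lets the local model learn from the available data, and the second lets it learn from the privacy-preserving model. Setting \lambda = 0 recovers purely local training.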

[0148] Specifically, the local model may be trained with the local data (e.g., using standard stochastic gradient descent). Then the local model knowledge may be transferred to the privacy-preserving model through any knowledge distillation mechanism. Subsequently, the participant 510 may send the parameters of the updated privacy-preserving model to the aggregator 520. Furthermore, the weights of the privacy-preserving models from the same group (with the same architecture) may be updated by averaging (or weighted averaging using the performance report) over the received parameters at the aggregator 520. The aggregated privacy-preserving model parameters may be sent back to each related group for further updates.

[0149] In this way, the weights of the local model of the participant 510 may not be shared with the aggregator 520, and the risk of a white-box attack may be decreased. Additionally, the participant 510 may have the freedom to use the local model or the FL model for inference during the FL training iterations without leaving the FL. Moreover, the participant 510 may have the freedom to apply a knowledge distillation mechanism based on the level of required data privacy, which may be modified or improved even when the FL rounds are ongoing.

[0150] Additionally, with the privacy-preserving model, the participant 510 may be able to enhance the local model for the ML fallback scenario, if the FL model is not the optimized model because of dynamic situations and changes in the dataset. If the network requests payment for the designed models from the participant 510, some participants 510 may be reluctant to join the FL framework. With the privacy-preserving model, the participant 510 may use its own local model without additional payment.

[0151] Moreover, in the FL, a participant 510 with lower computational capacity may not be involved because the global model is designed to be too complex. With the privacy-preserving model, even such a participant 510 may join the FL while using its own local model. Only information about the resources of the participant 510 dedicated to the FL task may be shared, and not information about the overall capabilities of the participant.

[0152] Thus, the privacy of the model training procedure is ensured. Additionally, the flexibility and effectiveness of the model training are improved.

[0153] FIG. 6 shows a flowchart of an example method 600 implemented at a first apparatus in accordance with some example embodiments of the present disclosure. For the purpose of discussion, the method 600 will be described from the perspective of the first apparatus 110 in FIG. 2.

[0154] At block 610, the first apparatus 110 receives, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus.

[0155] At block 620, the first apparatus 110 updates a second model based on the first information, wherein the second model is comprised in the first apparatus.

[0156] At block 630, the first apparatus 110 updates the first model based on a training result of the updated second model.

[0157] At block 640, the first apparatus 110 transmits, to the second apparatus, second information of the updated first model.
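Blocks 610 to 640 can be read as a short control flow, sketched below with injected callables so that no particular transport, model type, or distillation algorithm is implied. The function name `run_method_600`, its parameters, and the toy stand-ins in the usage part are all hypothetical.

```python
def run_method_600(receive, distill, train, transmit, second_model, local_data):
    """One pass of the first-apparatus flow; every callable is supplied by the caller."""
    first_model = receive()                                             # block 610
    second_model = distill(teacher=first_model, student=second_model)   # block 620
    trained = train(second_model, local_data)                           # training result
    first_model = distill(teacher=trained, student=first_model)         # block 630
    transmit(first_model)                                               # block 640
    return first_model, trained


# Toy stand-ins: each "model" is a single float, distillation is a midpoint,
# and training nudges the model toward the mean of the local data.
sent = []
updated_first, updated_second = run_method_600(
    receive=lambda: 1.0,
    distill=lambda teacher, student: 0.5 * (teacher + student),
    train=lambda model, data: model + 0.1 * (sum(data) / len(data) - model),
    transmit=sent.append,
    second_model=3.0,
    local_data=[5.0, 5.0],
)
```

Note that only the updated first model is passed to `transmit`; the second model and the local data never leave the function's caller.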

[0158] In some example embodiments, the first information may include at least one of information indicative of the first model or an architecture of the first model, a design rule for transferring information between the first model and the second model, or an indication indicating a first set of parameters of the first model.

[0159] In some example embodiments, the first apparatus 110 may receive, from the second apparatus, third information of a third model for privacy-preserving. The third information may be determined based on the second information, and the third model may be associated with the first model.

[0160] In some example embodiments, the third information may include at least one of the third model, a design rule for transferring information between the third model and the updated second model, or an indication indicating a third set of parameters of the third model.

[0161] In some example embodiments, the first apparatus 110 may determine the training result of the updated second model based on local data of the first apparatus 110.

[0162] In some example embodiments, the second information may include at least one of information of the updated first model, or a performance report indicating at least one of performance of the first model or a difference between the updated first model and the updated second model.

[0163] In some example embodiments, the first apparatus 110 may transmit, to the second apparatus, a message for privacy-preserving. The message may include at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus 110.

[0164] In some example embodiments, the first apparatus 110 may receive, from the second apparatus, an indication indicating at least one of an allowance of using the second model for updating the first model, or that the first information of the first model is to be received.

[0165] In some example embodiments, the first apparatus 110 may include a terminal device, and the second apparatus may include at least one of a further terminal device, a network device in a core network, or a network device in a radio access network.

[0166] FIG. 7 shows a flowchart of an example method 700 implemented at a second apparatus in accordance with some example embodiments of the present disclosure. For the purpose of discussion, the method 700 will be described from the perspective of the second apparatus 120 in FIG. 2.

[0167] At block 710, the second apparatus 120 transmits, to a first apparatus, first information of a first model for privacy-preserving of the first apparatus.

[0168] At block 720, the second apparatus 120 receives, from the first apparatus, second information of an updated first model.

[0169] The updated first model is associated with a training result of an updated second model. The updated second model is comprised in the first apparatus. The updated second model is associated with the first information.

[0170] In some example embodiments, the first information may include at least one of information indicative of the first model or an architecture of the first model, a design rule for transferring information between the first model and the second model, or an indication indicating a first set of parameters of the first model.

[0171] In some example embodiments, the second information may include at least one of information of the updated first model, or a performance report indicating at least one of performance of the first model or a difference between the updated first model and the updated second model.

[0172] In some example embodiments, the second apparatus 120 may determine third information of a third model for privacy-preserving based on the second information. The third model may be associated with the first model. The second apparatus 120 may transmit the third information to the first apparatus.

[0173] In some example embodiments, the third information may include at least one of information of the third model, a design rule for transferring information between the third model and the updated second model, or an indication indicating a third set of parameters of the third model.

[0174] In some example embodiments, the second apparatus 120 may receive, from the first apparatus, a message for privacy-preserving. The message may include at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

[0175] In some example embodiments, the second apparatus 120 may determine whether the first apparatus is allowed with the privacy-preserving based on the message. If the first apparatus is allowed, the second apparatus 120 may determine the first model based on at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.
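The admission decision and model selection described above might look like the following sketch. The catalog, the message keys, and the rule "pick the most complex model that fits the reported resource budget" are assumptions. The disclosure only states that the determination is based on at least one of the resource information, the local data information, or the privacy level.

```python
from typing import Optional

# Hypothetical catalog of privacy-preserving models, ordered from simplest to
# most complex, with an assumed parameter count for each.
CATALOG = [("small", 10_000), ("medium", 100_000), ("large", 1_000_000)]


def is_allowed(message: dict) -> bool:
    """Assumed admission rule: some local data and room for the simplest model."""
    return message["sample_count"] > 0 and message["resource_budget"] >= CATALOG[0][1]


def determine_first_model(message: dict) -> Optional[str]:
    """Return the most complex catalog model that fits, or None if not admitted."""
    if not is_allowed(message):
        return None
    fitting = [name for name, size in CATALOG if size <= message["resource_budget"]]
    return fitting[-1]


choice = determine_first_model({"resource_budget": 150_000, "sample_count": 500})
rejected = determine_first_model({"resource_budget": 5_000, "sample_count": 500})
```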

[0176] In some example embodiments, the second apparatus 120 may transmit, to the first apparatus, an indication indicating at least one of: an allowance of using a second model for updating the first model, or that the first information of the first model is to be received.

[0177] In some example embodiments, the first apparatus may include a terminal device, and the second apparatus 120 may include at least one of a further terminal device, a network device in a core network, or a network device in a radio access network.

[0178] In some example embodiments, a first apparatus capable of performing any of the method 600 (for example, the first apparatus 110 in FIG. 2) may comprise means for performing the respective operations of the method 600. The means may be implemented in any suitable form. For example, the means may be implemented in a circuitry or software module. The first apparatus may be implemented as or included in the first apparatus 110 in FIG. 2.

[0179] In some example embodiments, the first apparatus comprises means for receiving, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus; means for updating a second model based on the first information, wherein the second model is comprised in the first apparatus; means for updating the first model based on a training result of the updated second model; and means for transmitting, to the second apparatus, second information of the updated first model.

[0180] In some example embodiments, the first information may include at least one of information indicative of the first model or an architecture of the first model, a design rule for transferring information between the first model and the second model, or an indication indicating a first set of parameters of the first model.

[0181] In some example embodiments, the first apparatus may further include: means for receiving, from the second apparatus, third information of a third model for privacy-preserving, wherein the third information is determined based on the second information, and the third model is associated with the first model.

[0182] In some example embodiments, the third information may include at least one of information of the third model, a design rule for transferring information between the third model and the updated second model, or an indication indicating a third set of parameters of the third model.

[0183] In some example embodiments, the first apparatus may further include: means for determining the training result of the updated second model based on local data of the first apparatus.

[0184] In some example embodiments, the second information may include at least one of: information of the updated first model, or a performance report indicating at least one of performance of the first model or a difference between the updated first model and the updated second model.

[0185] In some example embodiments, the first apparatus may further include: means for transmitting, to the second apparatus, a message for privacy-preserving, the message comprising at least one of: resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

[0186] In some example embodiments, the first apparatus may further include: means for receiving, from the second apparatus, an indication indicating at least one of: means for an allowance of using the second model for updating the first model, or means for that the first information of the first model is to be received.

[0187] In some example embodiments, the first apparatus may include a terminal device, and the second apparatus may include at least one of a further terminal device, a network device in a core network, or a network device in a radio access network.

[0188] In some example embodiments, a second apparatus capable of performing any of the method 700 (for example, the second apparatus 120 in FIG. 2) may comprise means for performing the respective operations of the method 700. The means may be implemented in any suitable form. For example, the means may be implemented in a circuitry or software module. The second apparatus may be implemented as or included in the second apparatus 120 in FIG. 2.

[0189] In some example embodiments, the second apparatus may include means for transmitting, to a first apparatus, first information of a first model for privacy-preserving of the first apparatus; means for receiving, from the first apparatus, second information of an updated first model, wherein the updated first model is associated with a training result of an updated second model, the updated second model is comprised in the first apparatus, and the updated second model is associated with the first information.

[0190] In some example embodiments, the first information may include at least one of: information indicative of the first model or an architecture of the first model, a design rule for transferring information between the first model and the second model, or an indication indicating a first set of parameters of the first model.

[0191] In some example embodiments, the second information may include at least one of information of the updated first model, or a performance report indicating at least one of performance of the first model or a difference between the updated first model and the updated second model.

[0192] In some example embodiments, the second apparatus may further include: means for determining third information of a third model for privacy-preserving based on the second information. The third model may be associated with the first model. The second apparatus may include means for transmitting the third information to the first apparatus.

[0193] In some example embodiments, the third information may include at least one of information of the third model, a design rule for transferring information between the third model and the updated second model, or an indication indicating a third set of parameters of the third model.

[0194] In some example embodiments, the second apparatus may further include: means for receiving, from the first apparatus, a message for privacy-preserving. The message may include at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

[0195] In some example embodiments, the second apparatus may further include: means for determining whether the first apparatus is allowed with the privacy-preserving based on the message; and means for in accordance with a determination that the first apparatus is allowed, determining the first model based on at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

[0196] In some example embodiments, the second apparatus may further include: means for transmitting, to the first apparatus, an indication indicating at least one of: means for an allowance of using a second model for updating the first model, or means for that the first information of the first model is to be received.

[0197] In some example embodiments, the first apparatus may include a terminal device, and the second apparatus may include at least one of a further terminal device, a network device in a core network, or a network device in a radio access network.

[0198] FIG. 8 is a simplified block diagram of a device 800 that is suitable for implementing example embodiments of the present disclosure. The device 800 may be provided to implement a communication device, for example, the first apparatus 110 or the second apparatus 120 as shown in FIG. 1. As shown, the device 800 includes one or more processors 810, one or more memories 820 coupled to the processor 810, and one or more communication modules 840 coupled to the processor 810.

[0199] The communication module 840 is for bidirectional communications. The communication module 840 has one or more communication interfaces to facilitate communication with one or more other modules or devices. The communication interfaces may represent any interface that is necessary for communication with other network elements. In some example embodiments, the communication module 840 may include at least one antenna.

[0200] The processor 810 may be of any type suitable to the local technical network and may include one or more of the following: general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs) and processors based on multicore processor architecture, as non-limiting examples. The device 800 may have multiple processors, such as an application specific integrated circuit chip that is slaved in time to a clock which synchronizes the main processor.

[0201] The memory 820 may include one or more non-volatile memories and one or more volatile memories. Examples of the non-volatile memories include, but are not limited to, a Read Only Memory (ROM) 824, an electrically programmable read only memory (EPROM), a flash memory, a hard disk, a compact disc (CD), a digital video disk (DVD), an optical disk, a laser disk, and other magnetic storage and / or optical storage. Examples of the volatile memories include, but are not limited to, a random-access memory (RAM) 822 and other volatile memories that do not retain their contents while powered down.

[0202] A computer program 830 includes computer executable instructions that are executed by the associated processor 810. The instructions of the program 830 may include instructions for performing operations / acts of some example embodiments of the present disclosure. The program 830 may be stored in the memory, e.g., the ROM 824. The processor 810 may perform any suitable actions and processing by loading the program 830 into the RAM 822.

[0203] The example embodiments of the present disclosure may be implemented by means of the program 830 so that the device 800 may perform any process of the disclosure as discussed with reference to FIG. 2 to FIG. 7. The example embodiments of the present disclosure may also be implemented by hardware or by a combination of software and hardware.

[0204] In some example embodiments, the program 830 may be tangibly contained in a computer readable medium which may be included in the device 800 (such as in the memory 820) or other storage devices that are accessible by the device 800. The device 800 may load the program 830 from the computer readable medium to the RAM 822 for execution. In some example embodiments, the computer readable medium may include any type of non-transitory storage medium, such as ROM, EPROM, a flash memory, a hard disk, CD, DVD, and the like. The term “non-transitory,” as used herein, is a limitation of the medium itself (i.e., tangible, not a signal) as opposed to a limitation on data storage persistency (e.g., RAM vs. ROM).

[0205] FIG. 9 shows an example of the computer readable medium 900 which may be in form of CD, DVD or other optical storage disk. The computer readable medium 900 has the program 830 stored thereon.

[0206] Generally, various embodiments of the present disclosure may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. Some aspects may be implemented in hardware, and other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device. Although various aspects of embodiments of the present disclosure are illustrated and described as block diagrams, flowcharts, or using some other pictorial representations, it is to be understood that the block, apparatus, system, technique or method described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.

[0207] Some example embodiments of the present disclosure also provide at least one computer program product tangibly stored on a computer readable medium, such as a non-transitory computer readable medium. The computer program product includes computer-executable instructions, such as those included in program modules, being executed in a device on a target physical or virtual processor, to carry out any of the methods as described above. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, or the like that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various embodiments. Machine-executable instructions for program modules may be executed within a local or distributed device. In a distributed device, program modules may be located in both local and remote storage media.

[0208] Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions / operations specified in the flowcharts and / or block diagrams to be implemented. The program code may execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.

[0209] In the context of the present disclosure, the computer program code or related data may be carried by any suitable carrier to enable the device, apparatus or processor to perform various processes and operations as described above. Examples of the carrier include a signal, computer readable medium, and the like.

[0210] The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of the computer readable storage medium would include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

[0211] Further, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are contained in the above discussions, these should not be construed as limitations on the scope of the present disclosure, but rather as descriptions of features that may be specific to particular embodiments. Unless explicitly stated, certain features that are described in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, unless explicitly stated, various features that are described in the context of a single embodiment may also be implemented in a plurality of embodiments separately or in any suitable sub-combination.

[0212] Although the present disclosure has been described in languages specific to structural features and / or methodological acts, it is to be understood that the present disclosure defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims

CLAIMS:

1. A first apparatus comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the first apparatus to: receive, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus; update a second model based on the first information, wherein the second model is comprised in the first apparatus; update the first model based on a training result of the updated second model; and transmit, to the second apparatus, second information of the updated first model.

2. The first apparatus of claim 1, wherein the first information comprises at least one of: information indicative of the first model or an architecture of the first model, a design rule for transferring information between the first model and the second model, or an indication indicating a first set of parameters of the first model.

3. The first apparatus of claim 1 or claim 2, wherein the first apparatus is caused to: receive, from the second apparatus, third information of a third model for privacy-preserving, wherein the third information is determined based on the second information, and the third model is associated with the first model.

4. The first apparatus of claim 3, wherein the third information comprises at least one of: information of the third model, a design rule for transferring information between the third model and the updated second model, or an indication indicating a third set of parameters of the third model.

5. The first apparatus of any of claims 1 to 4, wherein the first apparatus is caused to: determine the training result of the updated second model based on local data of the first apparatus.

6. The first apparatus of any of claims 1 to 5, wherein the second information comprises at least one of: information of the updated first model, or a performance report indicating at least one of performance of the first model or a difference between the updated first model and the updated second model.

7. The first apparatus of any of claims 1 to 6, wherein the first apparatus is caused to: transmit, to the second apparatus, a message for privacy-preserving, the message comprising at least one of: resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

8. The first apparatus of any of claims 1 to 7, wherein the first apparatus is caused to: receive, from the second apparatus, an indication indicating at least one of: an allowance of using the second model for updating the first model, or that the first information of the first model is to be received.

9. The first apparatus of any of claims 1 to 8, wherein the first apparatus comprises a terminal device, and the second apparatus comprises at least one of a further terminal device, a network device in a core network, or a network device in a radio access network.

10. A second apparatus comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the second apparatus to: transmit, to a first apparatus, first information of a first model for privacy-preserving of the first apparatus; and receive, from the first apparatus, second information of an updated first model, wherein the updated first model is associated with a training result of an updated second model, the updated second model is comprised in the first apparatus, and the updated second model is associated with the first information.

11. The second apparatus of claim 10, wherein the first information comprises at least one of the first model, information indicative of the first model or an architecture of the first model, a design rule for transferring information between the first model and the second model, or an indication indicating a first set of parameters of the first model.

12. The second apparatus of claim 10 or claim 11, wherein the second information comprises at least one of information of the updated first model, or a performance report indicating at least one of performance of the first model or a difference between the updated first model and the updated second model.

13. The second apparatus of any of claims 10 to 12, wherein the second apparatus is caused to: determine third information of a third model for privacy-preserving based on the second information, wherein the third model is associated with the first model; and transmit the third information to the first apparatus.

14. The second apparatus of claim 13, wherein the third information comprises at least one of: information of the third model, a design rule for transferring information between the third model and the updated second model, or an indication indicating a third set of parameters of the third model.

15. The second apparatus of any of claims 10 to 14, wherein the second apparatus is caused to: receive, from the first apparatus, a message for privacy-preserving, the message comprising at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

16. The second apparatus of claim 15, wherein the second apparatus is caused to: determine whether the first apparatus is allowed with the privacy-preserving based on the message; and in accordance with a determination that the first apparatus is allowed, determine the first model based on at least one of resource information of the first apparatus, information of local data of the first apparatus, or a level of privacy of the first apparatus.

17. The second apparatus of any of claims 10 to 16, wherein the second apparatus is caused to: transmit, to the first apparatus, an indication indicating at least one of: an allowance of using a second model for updating the first model, or that the first information of the first model is to be received.

18. The second apparatus of any of claims 10 to 17, wherein the first apparatus comprises a terminal device, and the second apparatus comprises at least one of a further terminal device, a network device in a core network, or a network device in a radio access network.

19. A method comprising: receiving, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus; updating a second model based on the first information, wherein the second model is comprised in the first apparatus; updating the first model based on a training result of the updated second model; and transmitting, to the second apparatus, second information of the updated first model.

20. A method comprising: transmitting, to a first apparatus, first information of a first model for privacy-preserving of the first apparatus; and receiving, from the first apparatus, second information of an updated first model, wherein the updated first model is associated with a training result of an updated second model, the updated second model is comprised in the first apparatus, and the updated second model is associated with the first information.

21. A first apparatus comprising: means for receiving, from a second apparatus, first information of a first model for privacy-preserving of the first apparatus; means for updating a second model based on the first information, wherein the second model is comprised in the first apparatus; means for updating the first model based on a training result of the updated second model; and means for transmitting, to the second apparatus, second information of the updated first model.

22. A second apparatus comprising: means for transmitting, to a first apparatus, first information of a first model for privacy-preserving of the first apparatus; and means for receiving, from the first apparatus, second information of an updated first model, wherein the updated first model is associated with a training result of an updated second model, the updated second model is comprised in the first apparatus, and the updated second model is associated with the first information.

23. A computer readable medium comprising instructions stored thereon for causing an apparatus at least to perform the method of claim 19 or the method of claim 20.
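
Illustrative sketch (not part of the claims). The four steps recited in claim 19 can be pictured as one round on the first apparatus. The Python below is a minimal sketch under loud assumptions: both models are toy one-dimensional linear regressors, the "first information" is simply the first model's two parameters, and knowledge transfer in either direction is done by fitting one model to the other's predictions on local inputs. The names `LinearModel`, `fit`, `first_apparatus_round`, and the `max_gap` report field are hypothetical and do not come from the patent text; the claims do not prescribe any particular distillation algorithm or message layout.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """Toy stand-in for a model (assumption); a real system would use a larger network."""
    w: float = 0.0
    b: float = 0.0

    def predict(self, x: float) -> float:
        return self.w * x + self.b


def fit(model, xs, targets, lr=0.05, epochs=200):
    """Plain SGD on squared error; reused for local training and for distillation."""
    for _ in range(epochs):
        for x, t in zip(xs, targets):
            err = model.predict(x) - t
            model.w -= lr * err * x
            model.b -= lr * err


def first_apparatus_round(first_info, local_model, xs, ys):
    """One round on the first apparatus, following the order of claim 19."""
    # Receive: rebuild the first (shared) model from the first information.
    shared = LinearModel(w=first_info["w"], b=first_info["b"])
    # Update the second (local) model based on the first information:
    # here, the local model is fitted to the shared model's predictions.
    fit(local_model, xs, [shared.predict(x) for x in xs])
    # Obtain a training result of the updated second model from local data.
    fit(local_model, xs, ys)
    # Update the first model based on that training result (local -> shared).
    fit(shared, xs, [local_model.predict(x) for x in xs])
    # Second information: updated first model plus a small performance report.
    gap = max(abs(shared.predict(x) - local_model.predict(x)) for x in xs)
    return {"w": shared.w, "b": shared.b, "report": {"max_gap": gap}}


# Hypothetical local data following y = 2x + 1; it never leaves this function call.
xs = [i / 10 - 1 for i in range(21)]
ys = [2 * x + 1 for x in xs]
second_info = first_apparatus_round({"w": 0.0, "b": 0.0}, LinearModel(), xs, ys)
```

In this sketch only the parameters of the updated first model and the report are returned for transmission; the local samples and the second model stay on the first apparatus. The report's gap value loosely mirrors the "difference between the updated first model and the updated second model" mentioned in claim 6.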