Systems and methods for federated learning and client devices
The federated machine learning system, which utilizes Bayesian nonparametric weight factorization and kernel factorization, solves the model applicability problem caused by data distribution skewness, achieves efficient personalized training and enhanced data privacy protection, and is suitable for client devices in federated machine learning systems.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: SAMSUNG ELECTRONICS CO LTD
- Filing Date: 2021-05-31
- Publication Date: 2026-06-12
AI Technical Summary
Existing federated machine learning systems struggle to effectively personalize models when dealing with skewed data distributions across different clients, resulting in poor performance on specific clients and posing data privacy and security risks.
A Bayesian nonparametric weight factorization method decomposes the global model into a client-personalized part and a server-aggregated part. Only a subset of the training parameters is transmitted. Kernel factorization reduces the data communication volume, and variational inference optimizes factor selection and updates. Data is kept on the local device to improve security.
The approach enables more efficient personalized model training, reduces data communication volume, strengthens data privacy protection, lowers the risk of the model being attacked and data being reconstructed, and improves the applicability and security of the model on various clients.
[Figure CN113762524B_ABST: abstract drawing]
Description
[0001] This application claims priority to U.S. Provisional Application No. 63/033,747, filed June 2, 2020, and U.S. Non-Provisional Patent Application No. 17/148,557, filed January 13, 2021, the entire disclosures of which are incorporated herein by reference.
Technical Field
[0002] The subject matter disclosed herein relates to federated machine learning. More specifically, it relates to systems and methods for federated machine learning.
Background
[0003] The growth of the Internet of Things (IoT), the proliferation of smartphones, and the digitization of records have enabled modern systems that generate massive amounts of data. This generated data can provide a wealth of information about individuals, leading to highly personalized intelligent applications, but it can also be sensitive and should be kept private. Examples of such private data include, but are not limited to, images of faces, typing history, medical records, and survey responses.
Summary of the Invention
[0004] An example embodiment provides a client device in a federated machine learning system, the client device including at least one computing device, a communication interface, and a processor. The processor can be connected to the at least one computing device and the communication interface. The processor can select a parameter set for the client device from a global parameter set, train a model using a dataset from the client device and the parameter set selected by the client device, the dataset being formed from the output of the at least one computing device, update a weight factor dictionary and factor strength vectors after training the model, send the client-updated weight factor dictionary and client-updated factor strength vectors to a global server via the communication interface, receive a globally updated weight factor dictionary and globally updated factor strength vectors from the global server via the communication interface, and retrain the model using the client device's dataset, the parameter set selected by the client device, and the globally updated weight factor dictionary and globally updated factor strength vectors. In one embodiment, the client device can be part of a group of N client devices, where N is an integer. In another embodiment, the processor can select a parameter set from the global parameter set using three variational parameters that may include a seed value, and minimize the difference between supervised learning of the dataset and the regularization of the selected parameter set and the global parameter set. The processor can select a parameter set from a global parameter set by receiving a global parameter set already sent from a global server to a first subset of the N client devices, where the client devices are part of the first subset. 
The client devices can receive a globally updated weight factor dictionary and a globally updated factor strength vector sent from the global server to a second subset of the N client devices, where the client devices may be part of the second subset. In another embodiment, the processor can send a request for the current version of the global parameter set to the global server via a communication interface, update the model using the current version of the global parameter set, and evaluate the updated model using the current version of the global parameter set to form inference based on the dataset of the client devices.
[0005] An example embodiment provides a federated machine learning system that may include a global server and N client devices. The global server can receive updates to a weight factor dictionary and factor strength vectors from the N client devices and can generate globally updated weight factor dictionaries and globally updated factor strength vectors, where N is an integer. At least one client device may include at least one computing device, a communication interface, and a processor. The processor can be connected to the at least one computing device and the communication interface. The processor can select a parameter set from a global parameter set, train a model using the dataset from the client devices and the parameter set selected by the client devices, update the weight factor dictionary and factor strength vectors after training the model, send the client-updated weight factor dictionary and client-updated factor strength vectors via the communication interface, receive the globally updated weight factor dictionary and globally updated factor strength vectors from the global server via the communication interface, and retrain the model using the client devices' dataset, the parameter set selected by the client devices, and the globally updated weight factor dictionary and globally updated factor strength vectors. In one embodiment, the processor can select a parameter set from the global parameter set using three variational parameters that may include a seed value, and minimize the difference between supervised learning of the dataset and the regularization of the selected parameter set and the global parameter set. In another embodiment, the processor can select a parameter set from a global parameter set by receiving a global parameter set that has been sent from a global server to a first subset of the N client devices, where the client devices may be part of the first subset of client devices. 
In another embodiment, a client device can receive a globally updated weight factor dictionary and a globally updated factor strength vector that have been sent from a global server to a second subset of the N client devices, where the client devices may be part of the second subset of client devices. In one embodiment, the processor can send a request for the current version of the global parameter set to the global server via a communication interface, update the model using the current version of the global parameter set, and evaluate the updated model using the current version of the global parameter set to form inference based on the dataset of the client devices.
[0006] An example embodiment provides a method for federated machine learning, the method comprising: selecting a parameter set from a global parameter set at a client device, the global parameter set including a dictionary of weight factors and factor strength vectors; training a model at the client device using the client device's dataset and the parameter set selected by the client device; updating the weight factor dictionary and factor strength vectors after training the model; sending the client-updated weight factor dictionary and client-updated factor strength vectors from the client device to a global server; receiving a globally updated weight factor dictionary and globally updated factor strength vectors from the global server at the client device; and retraining the model at the client device using the client device's dataset, the parameter set selected by the client device, and the globally updated weight factor dictionary and globally updated factor strength vectors. In one embodiment, the client device may be part of a group of N client devices, where N is an integer. In another embodiment, the step of selecting a parameter set from the global parameter set may include: selecting the parameter set using three variational parameters including a seed value; and minimizing the difference between supervised learning of the dataset and regularization of the selected parameter set and the global parameter set. In another embodiment, the step of selecting a parameter set from a global parameter set may include: a client device receiving a global parameter set that has been sent from a global server to a first subset of the N client devices, the client device being part of the first subset of client devices. 
In another embodiment, the step of the client device receiving a globally updated weight factor dictionary and a globally updated factor strength vector from a global server may include: a client device receiving a globally updated weight factor dictionary and a globally updated factor strength vector that have been sent from a global server to a second subset of the N client devices, the client device being part of the second subset of client devices. In one embodiment, the method may further include: the client device requesting a current version of the global parameter set from a global server; receiving the current version of the global parameter set; updating the model using the current version of the global parameter set; and evaluating the model updated using the current version of the global parameter set to form inference based on the client device's dataset.
Brief Description of the Drawings
[0007] In the following sections, aspects of the subject matter disclosed herein will be described with reference to exemplary embodiments shown in the accompanying drawings, wherein:
[0008] Figure 1 depicts a functional block diagram of an example embodiment of a federated learning system according to the subject matter disclosed herein;
[0009] Figures 2A and 2B respectively depict functional block diagrams of example embodiments of a global server and a client according to the subject matter disclosed herein;
[0010] Figure 3 is a flowchart of an example embodiment of a method for federated machine learning at a client device according to the subject matter disclosed herein; and
[0011] Figure 4 depicts an electronic device that includes functionality for federated machine learning according to the subject matter disclosed herein.
Detailed Description
[0012] In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the disclosure. However, those skilled in the art will understand that the aspects of the disclosure may be practiced without these specific details. In other instances, well-known methods, processes, components, and circuits have not been described in detail so as not to obscure the subject matter disclosed herein.
[0013] Throughout this specification, reference to "an embodiment" indicates that a particular feature, structure, or characteristic described in connection with that embodiment may be included in at least one embodiment disclosed herein. Therefore, the appearance of the phrases "in one embodiment," "in an embodiment," or "according to an embodiment" (or other phrases with similar meanings) in different places throughout this specification does not necessarily refer to the same embodiment. Furthermore, in one or more embodiments, particular features, structures, or characteristics may be combined in any suitable manner. In this regard, as used herein, the word "exemplary" means "serving as an example, instance, or illustration." Any embodiment described herein as "exemplary" should not be construed as necessarily preferred or advantageous over other embodiments. Additionally, depending on the context discussed herein, singular terms may include corresponding plural forms, and plural terms may include corresponding singular forms. Similarly, hyphenated terms (e.g., "two-dimensional", "pre-determined", "pixel-specific", etc.) may occasionally be used interchangeably with their non-hyphenated versions (e.g., "two dimensional", "predetermined", "pixel specific", etc.), and capitalized terms (e.g., "Counter Clock", "Row Select", "PIXOUT", etc.) may be used interchangeably with their non-capitalized versions (e.g., "counter clock", "row select", "pixout", etc.). Such occasional interchangeable uses should not be considered inconsistent with each other.
[0014] Furthermore, depending on the context of this discussion, singular terms may include their corresponding plural forms, and plural terms may include their corresponding singular forms. It should also be noted that the various figures shown and discussed herein (including component diagrams) are for illustrative purposes only and are not drawn to scale. Similarly, various waveform and timing diagrams are shown for illustrative purposes only. For example, for clarity, the dimensions of some components may be exaggerated relative to others. Additionally, reference numerals are repeated in the figures where appropriate to indicate corresponding and / or similar components.
[0015] The terminology used herein is for the purpose of describing some exemplary embodiments only and is not intended to limit the claimed subject matter. As used herein, the singular form is intended to include the plural form as well, unless the context clearly indicates otherwise. It will also be understood that the term "comprising," when used herein, indicates the presence of stated features, integers, steps, operations, elements, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, unless expressly defined as such, the terms "first," "second," etc., are used as labels for the nouns that follow them and do not indicate any type of order (e.g., spatial, temporal, logical, etc.). Furthermore, the same reference numerals may be used across two or more figures to denote parts, components, blocks, circuits, units, or modules having the same or similar functions. However, such use is merely for simplification and ease of discussion and does not imply that the construction or architectural details of such components or units are identical across all embodiments, or that such commonly referenced parts/modules are the only way to implement some of the exemplary embodiments disclosed herein.
[0016] It will be understood that when an element or layer is referred to as being "on" another element or layer, "connected to," or "bonded to" another element or layer, it may be directly on, directly connected to, or directly bonded to the other element or layer, or there may be intermediate elements or layers present. Conversely, when an element is referred to as being "directly on" another element or layer, "directly connected to," or "directly bonded to" another element or layer, there are no intermediate elements or layers present. The same reference numerals always denote the same element. As used herein, the term "and / or" includes any and all combinations of one or more of the associated listed items.
[0017] As used herein, unless explicitly defined as such, the terms "first," "second," etc., are used as labels for nouns that follow them and do not indicate any type of order (e.g., spatial, temporal, logical, etc.). Furthermore, the same reference numerals may be used between two or more figures to denote parts, components, blocks, circuits, units, or modules having the same or similar functions. However, such use is merely for simplification and ease of discussion and does not imply that the construction or architectural details of such components or units are identical across all embodiments, or that such commonly referenced parts / modules are the only way to implement some of the exemplary embodiments disclosed herein.
[0018] Unless otherwise defined, all terms used herein (including technical and scientific terms) shall have the same meaning as commonly understood by one of ordinary skill in the art to which this subject pertains. It will also be understood that, unless expressly defined herein, terms (such as those defined in a general dictionary) shall be interpreted as having a meaning consistent with their meaning in the context of the relevant field and shall not be interpreted in an idealized or overly formalized manner.
[0019] As used herein, the term "module" means any combination of software, firmware, and/or hardware configured to provide the functionality described herein in connection with a module. For example, software may be embodied as a software package, code, and/or an instruction set or instructions, and the term "hardware" as used in any implementation described herein may include, for example, assemblies, hardwired circuitry, programmable circuitry, state machine circuitry, and/or firmware storing instructions executed by programmable circuitry, either individually or in any combination. Modules may be implemented collectively or individually as circuitry (e.g., but not limited to, integrated circuits (ICs), systems-on-a-chip (SoCs), assemblies, etc.) forming part of a larger system.
[0020] Federated learning (also known as collaborative learning) has been proposed as a machine learning approach that preserves the privacy of personal data by keeping user data local on each client device and sharing only model updates with a global server. Federated learning therefore represents a viable strategy for training machine learning models on heterogeneous distributed networks in a privacy-preserving manner.
[0021] While the federated machine learning paradigm offers a way to maintain privacy for private data, many challenges remain for federated machine learning systems. For example, current federated machine learning systems consist of a single global model used by each client. However, because skewed data distributions can exist across different clients, single-model approaches may not work well for specific subgroups.
[0022] To illustrate this, consider N client devices, where the i-th client device has a data distribution $\mathcal{D}_i$ that differs from those of the other client devices as a function of i. In a traditional federated machine learning setup, a single learnable global model is deployed across all N client devices. For illustration, assume a multilayer perceptron (MLP) architecture with layers $l = 1, \dots, L$ shared across all client devices and a weight set $\theta = \{W^l\}_{l=1:L}$. To satisfy the global objective, the weight set θ is learned to minimize the average loss across all clients. For example, a traditional federated machine learning system minimizes the following objective f(θ):
[0023] $$\min_{\theta} f(\theta) = \sum_{i=1}^{N} p_i F_i(\theta)$$
[0024] where i is the index of the client device, N is the number of clients, $F_i(\theta)$ is the local objective function of client i, and $p_i \ge 0$ is the weight of device i.
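As a minimal sketch of this weighted objective, the following Python snippet evaluates $f(\theta) = \sum_i p_i F_i(\theta)$ for a scalar parameter. The quadratic local losses, the client sizes, and the choice of weights proportional to client dataset size are illustrative assumptions, not details taken from this disclosure.

```python
from typing import Callable, Sequence


def global_objective(theta: float,
                     local_objectives: Sequence[Callable[[float], float]],
                     weights: Sequence[float]) -> float:
    """Return f(theta) = sum_i p_i * F_i(theta) for non-negative weights p_i."""
    if len(local_objectives) != len(weights):
        raise ValueError("need one weight per client")
    if any(p < 0 for p in weights):
        raise ValueError("weights p_i must be non-negative")
    return sum(p * F(theta) for p, F in zip(weights, local_objectives))


# Hypothetical setup: three clients whose quadratic losses have different optima.
centers = [0.0, 1.0, 4.0]
sizes = [10, 30, 60]                       # assumed local dataset sizes
weights = [n / sum(sizes) for n in sizes]  # assumed p_i = n_i / n
local_objectives = [lambda t, c=c: (t - c) ** 2 for c in centers]

# For quadratic losses the minimiser of f is the weighted mean of the optima.
theta_star = sum(p * c for p, c in zip(weights, centers))
loss_global = global_objective(theta_star, local_objectives, weights)
loss_client_0 = local_objectives[0](theta_star)
```

In this toy setting the single global optimum sits close to the largest client's optimum, so the smallest client sees a much higher local loss than the weighted average, which is the kind of mismatch the next paragraph describes.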
[0025] However, given statistical heterogeneity, a one-size-fits-all approach can lead to a global model that performs poorly on specific clients. Typically, performance depends on how closely the local distribution of a particular client approximates the distribution of the entire population. As a result, the model in this example of a traditional federated machine learning system can be unsuitable for clients whose data characteristics are less common in the population.
[0026] The subject matter disclosed herein improves the suitability of models in federated learning by using Bayesian nonparametric weight factorization, which provides a personalized federated learning solution that can achieve higher local model performance across many clients.
[0027] Compared to traditional federated learning systems, the federated machine learning system disclosed herein includes at least three improved features. The first improvement is that the network on which federated learning occurs is divided into two parts: the first part provides server aggregation, and the second part is used for client personalization. The second improvement is a reduced amount of data communication between the global server and the client devices. That is, because kernel factorization is used in the client devices and only a subset of the parameters used for training is transmitted, data communication between the global server and the client devices is more efficient. The third improvement is an additional layer of security, which arises because kernel factorization is used and only a subset of the parameters used for training is transmitted.
[0028] The federated machine learning system disclosed herein provides a federated learning system that efficiently uses data from a global model to train neural networks in N local models via factorization. Each client model can be personalized based on the client's local distribution, and all client models share components that are learned collectively.
[0029] Figure 1 depicts a functional block diagram of an example embodiment of a federated learning system 100 according to the subject matter disclosed herein. The federated learning system 100 may include a global server 101 and N clients (i.e., local devices) $102_1$ to $102_N$. The global server 101 may be located in a single location or may be distributed in the cloud. As used herein, the term "global server" refers to any server device configured to communicate with two or more client devices (by wire and/or wirelessly) via a wide area network (e.g., the Internet), and may be any server device configured to communicate directly with two or more client devices in a federated machine learning system. The clients $102_1$ to $102_N$ are communicatively connected to the global server 101 via communication link 103. Communication link 103 can be a wired communication link and/or a wireless communication link.
[0030] Figures 2A and 2B respectively depict functional block diagrams of example embodiments of a global server 101 and a client 102 according to the subject matter disclosed herein. The global server 101 may include a processing device 201 (such as a central processing unit (CPU)) communicatively connected to a memory 202 and a communication interface 203. The memory 202 may include non-volatile memory and/or volatile memory. The communication interface 203 may be configured to communicate with a network fabric (such as, but not limited to, the Internet). The communication interface 203 may be a wired and/or wireless communication interface. Other configurations of the global server 101 are possible. The global server 101 may be configured to provide federated machine learning capabilities as described herein. In one embodiment, the federated machine learning capabilities provided by the global server 101 may be provided by one or more modules, which may be any combination of software, firmware, and/or hardware configured to provide the capabilities described herein.
[0031] Client 102 may include a processing device 251 (such as a CPU) communicatively connected to memory 252, communication interface 253, and one or more computing devices 254. One or more computing devices 254 may include the ability to sense or collect information relating to, but not limited to, motion, one or more images, biometrics and / or medical conditions of humans and / or non-human animals and / or plants, sound, voice, location, metadata, application usage (i.e., browsing history), and / or survey responses. In one embodiment, the output of at least one computing device 254 may form a dataset or data distribution. In one embodiment, at least one computing device 254 is a sensing device. Other configurations of the client device 102 are possible. The client 102 may be configured to provide federated machine learning capabilities as described herein. In one embodiment, the federated machine learning capabilities provided by the client 102 may be provided by one or more modules, which may be any combination of software, firmware, and / or hardware configured to provide the capabilities described herein.
[0032] Client $102_i$ may have a usable data distribution $\mathcal{D}_i$, which can be used to train a local model with weight matrices $\theta_i = \{W_i^l\}_{l=1:L}$ for the L layers. Each weight set $\theta_i$ can be made as specific as possible to the data distribution $\mathcal{D}_i$ of client i. However, each client typically has limited data, which may be insufficient to train the entire model without overfitting. In addition, the total number of parameters that must be learned across all clients is proportional to the number of clients, and learning N individual models may not take advantage of the similarity between client data distributions or of shared learning tasks. To use data more efficiently, the federated machine learning system 100 provides a balance between a single global model and N local models. That is, each client model can be personalized to its local data distribution, and all models share components that are learned jointly. For this purpose, the weight matrix $W_i^l$ of client i is factorized as:
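[0033] $$W_i^l = W_a^l \, \Lambda_i^l \, W_b^l$$
[0034] $$\Lambda_i^l = \operatorname{diag}(\lambda_i^l)$$
[0035] where $W_a^l$ and $W_b^l$ form a dictionary of rank-1 weight factors that can be shared across clients, $\Lambda_i^l$ is a diagonal personalization matrix for each client i, and diag(·) constructs a diagonal matrix from its vector argument.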
[0036] The factorization can be equivalently represented as:
[0037] $$W_i^l = \sum_{k=1}^{F} \lambda_{i,k}^l \left( w_{a,k}^l \otimes w_{b,k}^l \right)$$
[0038] where $w_{a,k}^l$ is the k-th column of $W_a^l$, $w_{b,k}^l$ is the k-th row of $W_b^l$, F is the number of factors (e.g., weight factors), and ⊗ denotes the outer product. Written this way, the interpretation of the corresponding columns of $W_a^l$ and rows of $W_b^l$ as weight factors becomes clearer. $W_a^l$ and $W_b^l$ together form a global dictionary of weight factors, and $\lambda_i^l$ can be regarded as the factor scores of client i. Differences in $\lambda_i^l$ between clients allow the model to be customized for each client's data distribution, while sharing the underlying factors $W_a^l$ and $W_b^l$ enables learning from the data of all clients.
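The equivalence between the matrix form (shared factors multiplied through a per-client diagonal matrix) and the sum of scaled rank-1 outer products can be checked numerically. The sketch below uses plain Python lists; the helper names and the small example matrices are hypothetical and chosen only for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


def diag(v):
    """Build a diagonal matrix from a vector."""
    return [[v[i] if i == j else 0.0 for j in range(len(v))]
            for i in range(len(v))]


def client_weight_matrix_form(W_a, lam, W_b):
    """Client weight as W_a @ diag(lam) @ W_b."""
    return matmul(matmul(W_a, diag(lam)), W_b)


def client_weight_outer_form(W_a, lam, W_b):
    """Client weight as sum_k lam[k] * outer(column k of W_a, row k of W_b)."""
    rows, cols = len(W_a), len(W_b[0])
    W = [[0.0] * cols for _ in range(rows)]
    for k in range(len(lam)):
        for r in range(rows):
            for c in range(cols):
                W[r][c] += lam[k] * W_a[r][k] * W_b[k][c]
    return W


# Hypothetical shared dictionary with F = 3 factors and one client's scores.
W_a = [[1.0, 2.0, 0.0],
       [0.0, 1.0, 3.0]]       # 2 x 3
W_b = [[1.0, 0.0],
       [2.0, 1.0],
       [0.0, 4.0]]            # 3 x 2
lam = [0.5, 0.0, 2.0]         # the second factor is switched off for this client

W_matrix = client_weight_matrix_form(W_a, lam, W_b)
W_outer = client_weight_outer_form(W_a, lam, W_b)
```

Because the second score is zero, the second column of `W_a` and the second row of `W_b` contribute nothing to this client's weights, even though they remain in the shared dictionary for other clients.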
[0039] Each client factor score $\lambda_i^l$ can be formed as an element-wise product:
[0040] $$\lambda_i^l = r_i^l \odot b_i^l$$
[0041] where $r_i^l \in \mathbb{R}^F$ indicates the strength of each factor, $b_i^l \in \{0, 1\}^F$ is a binary vector indicating the active factors, and ⊙ denotes the element-wise product. As described below, $b_i^l$ is typically sparse, so each client usually uses only a small subset of the available weight factors. As used herein, a symbol without the superscript l (e.g., $\lambda_i$) denotes the entire set across all L layers on which the factorization is performed. Point estimates can be learned for $W_a$, $W_b$, and the factor strengths $r_i$.
[0042] In the context of federated machine learning with statistical heterogeneity, there are several properties that the client factor scores should collectively possess. As mentioned earlier, $b_i$ is typically sparse, so $\lambda_i$ is also sparse, which facilitates the consolidation of related knowledge while minimizing interference. That is, client A should be able to update global factors during training without disrupting client B's ability to perform client B's tasks. On the other hand, factors should be reused among clients. Although data can be distributed across clients in a non-independent and non-identically distributed (non-IID) manner, some similarity or overlap is often present. Sharing factors so that learning is distributed across all client data avoids the scenario of N independent models. Furthermore, in the distributed settings considered for federated machine learning, the total number of nodes is rarely predefined. Therefore, the system should be scalable enough to accommodate new clients without re-initializing the entire model. This property includes increasing server-side capacity (if necessary) and initializing new clients.
[0043] To promote sparsity of the personalized diagonal matrices $\Lambda_i^l$, the diagonal vectors can be regularized using a prior based on the Indian Buffet Process (IBP). Variational inference can encourage the posterior distribution of the diagonal vectors to stay as close as possible to the prior. Using Bayesian nonparametric methods lets the data determine client-side factor assignments, factor reuse, and server-side model expansion. The stick-breaking construction of the IBP can be used as the prior distribution for factor selection, as follows:
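[0044] $$b_{i,k}^l \sim \mathrm{Bernoulli}(\pi_{i,k}^l)$$
[0045] $$\pi_{i,k}^l = \prod_{j=1}^{k} v_{i,j}^l$$
[0046] $$v_{i,j}^l \sim \mathrm{Beta}(\alpha, 1)$$
[0047] Here, α is a hyperparameter controlling the expected number of active factors and the rate at which new factors are incorporated, and k indexes the factors. Furthermore, $v_{i,j}^l$ and $b_{i,k}^l$ follow a Beta distribution and a Bernoulli distribution, respectively.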
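The stick-breaking construction of the IBP is commonly written as $v_j \sim \mathrm{Beta}(\alpha, 1)$, $\pi_k = \prod_{j \le k} v_j$, $b_k \sim \mathrm{Bernoulli}(\pi_k)$. Assuming that form, a prior draw of the activation probabilities and the binary factor mask can be sketched as follows; the function name and the example values of α and F are illustrative.

```python
import random


def sample_ibp_stick_breaking(alpha: float, num_factors: int, rng: random.Random):
    """Draw activation probabilities pi_k and a binary mask b_k from the prior.

    Assumed form: v_j ~ Beta(alpha, 1), pi_k = prod_{j<=k} v_j, b_k ~ Bernoulli(pi_k).
    """
    pis, mask = [], []
    pi = 1.0
    for _ in range(num_factors):
        v = rng.betavariate(alpha, 1.0)   # stick fraction in [0, 1]
        pi *= v                           # pi_k can only shrink as k grows
        pis.append(pi)
        mask.append(1 if rng.random() < pi else 0)
    return pis, mask


rng = random.Random(0)
pis, mask = sample_ibp_stick_breaking(alpha=3.0, num_factors=8, rng=rng)
```

Because each $\pi_k$ is a product of fractions, the activation probabilities are non-increasing in k, so later factors are switched on less often. A larger α keeps the probabilities high for more factors, which matches its role as a control on the expected number of active factors.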
[0048] A posterior distribution is learned for the random variables $b_i$ and $v_i$. Exact posterior inference can be intractable; therefore, variational inference with a mean-field approximation can be used to determine the active factors for each client device, using a variational distribution learned via Bayesian backpropagation. The variational distribution has parameters $\{\pi_i, c_i, d_i\}$:
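[0049] $$q(b_i^l, v_i^l) = q(b_i^l)\, q(v_i^l)$$
[0050] $$q(b_{i,k}^l) = \mathrm{Bernoulli}(\pi_{i,k}^l)$$
[0051] $$q(v_{i,k}^l) = \mathrm{Beta}(c_{i,k}^l, d_{i,k}^l)$$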
[0052] To keep the parameters differentiable, the Kumaraswamy distribution can be used in place of the Beta distribution, and a soft relaxation can be used in place of the Bernoulli distribution. The objective for each client is to maximize the variational lower bound:
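Both substitutions have simple sampling formulas driven by a uniform random number, which is what makes them usable with gradient-based training. The sketch below uses the standard inverse-CDF sampler for the Kumaraswamy distribution and the binary Concrete (Gumbel-sigmoid) relaxation of a Bernoulli variable. The specific relaxation and the temperature parameter are assumptions, since this text does not say which soft relaxation is used.

```python
import math
import random

_EPS = 1e-6  # keeps logarithms finite at the edges of (0, 1)


def _clamp(x: float) -> float:
    return min(max(x, _EPS), 1.0 - _EPS)


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_kumaraswamy(c: float, d: float, u: float) -> float:
    """Inverse-CDF sample: x = (1 - (1 - u)^(1/d))^(1/c) for u ~ Uniform(0, 1)."""
    u = _clamp(u)
    return (1.0 - (1.0 - u) ** (1.0 / d)) ** (1.0 / c)


def relaxed_bernoulli(pi: float, temperature: float, u: float) -> float:
    """Binary Concrete sample in (0, 1); approaches a hard 0/1 draw as temperature -> 0."""
    pi, u = _clamp(pi), _clamp(u)
    logit = math.log(pi) - math.log(1.0 - pi)
    logistic_noise = math.log(u) - math.log(1.0 - u)
    return _sigmoid((logit + logistic_noise) / temperature)


rng = random.Random(0)
v_sample = sample_kumaraswamy(c=2.0, d=3.0, u=rng.random())
b_soft = relaxed_bernoulli(pi=0.7, temperature=0.5, u=rng.random())
```

In an automatic-differentiation framework the same expressions would be written with tensor operations so that gradients flow from the sample back to c, d, and π; plain floats are used here only to keep the example dependency-free.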
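[0053] $$\mathcal{L}_i = \sum_{n=1}^{|\mathcal{D}_i|} \mathbb{E}_{q}\left[ \log p\left( y_i^{(n)} \mid x_i^{(n)}, b_i, v_i \right) \right] - \mathrm{KL}\left( q(b_i, v_i) \,\|\, p(b_i, v_i) \right)$$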
[0054] where $|\mathcal{D}_i|$ is the number of training examples at client i. The first term provides label supervision, and the second term regularizes the posterior distribution so that it does not deviate from the IBP prior distribution.
[0055] The mean-field approximation allows the second term to be expanded as follows:
[0056]
[0057] Before training begins, the global weight factors $\{W_a, W_b\}$ and the factor strengths r can be initialized by server 101. Once they are initialized, each training round begins with $\{W_a, W_b, r\}$ being sent to a selected subset of all clients 102. Each selected (sampled) client then trains the model for E epochs using its own private data distribution $\mathcal{D}_i$, updating not only the weight factor dictionary $\{W_a, W_b\}$ and the factor strengths r, but also the variational parameters $\{\pi_i, c_i, d_i\}$ that control which factors the client uses. The data distribution $\mathcal{D}_i$ may include information related to biometric data, medical data, image data, voice data, location data, application usage data, thermal data, atmospheric data, survey data, and/or audio data.
[0058] Once local training is complete, each client sends $\{W_a, W_b, r\}$ back to the server, but does not send back the variational parameters $\{\pi_i, c_i, d_i\}$, which are retained on the client together with the data distribution $\mathcal{D}_i$. After server 101 has received updates from all sampled clients, server 101 can aggregate the various new values of $\{W_a, W_b, r\}$ using an averaging step. In one embodiment, the averaging step can be a simple average. This process is then repeated, with the server selecting a new subset of clients for sampling, sending the newly updated set of global parameters to the new subset, and so on, until the desired number of communication rounds has occurred. This process is summarized by the pseudocode of Algorithm 1.
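To give a rough sense of the communication volume, the following sketch compares the number of values in one dense layer with the number of values in that layer's shared factors and strength vector. The layer dimensions and factor count are hypothetical, and the factor shapes (one `rows x F` matrix, one `F x cols` matrix, one length-F strength vector per layer) are an assumption based on the factorization described earlier.

```python
def dense_param_count(rows: int, cols: int) -> int:
    """Values sent per layer when the full weight matrix is transmitted."""
    return rows * cols


def factorized_param_count(rows: int, cols: int, num_factors: int) -> int:
    """Values sent per layer for W_a (rows x F), W_b (F x cols), and r (F)."""
    return num_factors * (rows + cols) + num_factors


# Hypothetical layer: 512 x 512 weights represented with F = 64 factors.
dense = dense_param_count(512, 512)
factorized = factorized_param_count(512, 512, 64)
ratio = factorized / dense
```

With these numbers the factorized update is about a quarter of the dense one. The saving depends on F being small relative to the layer dimensions; for a large enough F the factorized form would transmit more values than the dense matrix.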
[0059] Algorithm 1
[0060]
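The round structure described in [0057] and [0058] can be sketched as follows. This is a minimal sketch under stated assumptions and is not a reproduction of Algorithm 1: parameters are flattened into plain lists, `local_train` is a toy stand-in for E epochs of training on the variational objective, and all names (`run_federated_training`, `average_params`, the `target` field) are hypothetical.

```python
import copy
import random


def average_params(updates):
    """Simple elementwise mean of each named parameter across client updates."""
    keys = updates[0].keys()
    return {
        k: [sum(vals) / len(updates) for vals in zip(*(u[k] for u in updates))]
        for k in keys
    }


def local_train(global_params, client, epochs, lr=0.1):
    """Toy stand-in for E epochs of local training on private data.

    Returns updated copies of the shared parameters. The client's
    variational parameters are updated in place and never returned.
    """
    params = copy.deepcopy(global_params)
    for _ in range(epochs):
        for key, values in params.items():
            # Toy update: move each shared value toward a client-specific target.
            params[key] = [v + lr * (client["target"] - v) for v in values]
        # Toy update of the local-only variational parameters.
        client["variational"]["pi"] = [
            min(1.0, p + 0.01) for p in client["variational"]["pi"]
        ]
    return params


def run_federated_training(global_params, clients, rounds, clients_per_round,
                           epochs, rng):
    """Server loop: sample clients, collect their updates, average, repeat."""
    for _ in range(rounds):
        sampled = rng.sample(clients, clients_per_round)
        updates = [local_train(global_params, c, epochs) for c in sampled]
        global_params = average_params(updates)
    return global_params


rng = random.Random(0)
initial = {"W_a": [0.0] * 4, "W_b": [0.0] * 4, "r": [1.0] * 2}
clients = [{"target": float(i), "variational": {"pi": [0.5, 0.5]}}
           for i in range(5)]
final = run_federated_training(initial, clients, rounds=3,
                               clients_per_round=2, epochs=2, rng=rng)
```

Only the entries of the shared dictionary pass between `local_train` and the server loop; the `variational` entry of each client is modified locally and never appears in an update.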
[0061] When client 102 enters evaluation mode, the client can request the current version of the global parameters {W_a, W_b} from the server. If the client has previously been sampled for federated training, the local model includes the aggregated global parameters together with the binary vector generated from the client's local variational parameters {π_i, c_i, d_i}. Otherwise, the client uses only the aggregated {W_a, W_b}. Note that if the client has been previously sampled, a recently cached copy of the global parameters on the client may be an option if the network connection is unavailable or too expensive. Typically, the client is able to request the latest parameters. In one embodiment, client 102 (e.g., processing device 251) may receive the current version of the global parameter set, use it to update the model, and evaluate the updated model to form inferences based on the client's dataset.
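The evaluation-mode behavior can be sketched as follows. This is an illustration under assumptions, not the disclosed implementation: the composition W = W_a · diag(r ⊙ mask) · W_b is one plausible way to combine a factor dictionary, factor strengths, and a binary selection vector, and the names `compose_weights`, `load_eval_weights`, and `fetch_global` are hypothetical.

```python
def compose_weights(W_a, W_b, r, mask=None):
    """Return W = W_a @ diag(r * mask) @ W_b as nested lists.

    W_a is rows x F, W_b is F x cols, r has length F. A client that was
    never sampled has no local mask and uses every factor (mask of ones).
    """
    num_factors = len(r)
    if mask is None:
        mask = [1] * num_factors
    scale = [r_k * m_k for r_k, m_k in zip(r, mask)]
    rows, cols = len(W_a), len(W_b[0])
    return [
        [sum(W_a[i][k] * scale[k] * W_b[k][j] for k in range(num_factors))
         for j in range(cols)]
        for i in range(rows)
    ]


def load_eval_weights(fetch_global, cached_global=None, local_mask=None):
    """Prefer the server's current parameters; fall back to a cached copy.

    fetch_global is a caller-supplied callable that returns a dict with
    keys "W_a", "W_b", "r" or raises ConnectionError.
    """
    try:
        params = fetch_global()
    except ConnectionError:
        if cached_global is None:
            raise
        params = cached_global
    return compose_weights(params["W_a"], params["W_b"], params["r"], local_mask)


current = {"W_a": [[1.0, 2.0]], "W_b": [[1.0], [10.0]], "r": [0.5, 2.0]}
unsampled_client = load_eval_weights(lambda: current)
sampled_client = load_eval_weights(lambda: current, local_mask=[1, 0])
```

In this toy case the unsampled client uses both factors, while the previously sampled client applies its locally held binary vector and keeps only the first factor.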
[0062] Data security is one of the central aspects of federated machine learning. If all data were first aggregated on a central server, a simpler, more standard approach to training models could be used. The real possibility that sensitive client data could be intercepted during transmission, or that the data repository of server 101 could be compromised by an attacker, is a major concern and motivates keeping data on local device 102 for federated machine learning. On the other hand, simply keeping data on the client side may not be sufficient for security purposes. Just as data can be stolen during transmission or from a central database in a non-federated setting, federated training updates are equally vulnerable. For example, in one example federated machine learning approach, updates include the entire set of parameters of the model. In effect, not handing over the data directly may be traded for relinquishing white-box access to the model, which could expose the model to a wide range of malicious activities, including exposing the very data that federated machine learning is intended to protect.
[0063] For the federated machine learning system disclosed herein, the client sends the entire weight factor dictionary {W_a, W_b} and factor strength r to server 101, but does not transmit {π_i, c_i, d_i}. Therefore, information regarding which specific factors the client uses is kept local. In other words, neither the client data nor the client's factor selection leaves the local device. Therefore, even if the message is intercepted, the attacker cannot fully reconstruct the model, thus hindering their ability to execute an attack to recover the data.
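The split between what is transmitted and what stays on the device can be sketched as follows. This is a hypothetical illustration: the `ClientState` container, the `SHARED_FIELDS` tuple, and `build_upload` are names introduced here and do not come from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class ClientState:
    """Everything a client holds after local training (hypothetical layout)."""
    W_a: list
    W_b: list
    r: list
    variational: dict                      # pi_i, c_i, d_i: never transmitted
    dataset: list = field(default_factory=list)  # private data: never transmitted


# Only the weight factor dictionary and factor strengths are shared.
SHARED_FIELDS = ("W_a", "W_b", "r")


def build_upload(state):
    """Build the message for the server from an explicit allow-list of fields."""
    return {name: getattr(state, name) for name in SHARED_FIELDS}


state = ClientState(
    W_a=[[0.1, 0.2]],
    W_b=[[0.3], [0.4]],
    r=[1.0, 0.5],
    variational={"pi": [0.9, 0.1], "c": [1, 0], "d": [0.0, 0.0]},
    dataset=[("x0", "y0")],
)
upload = build_upload(state)
```

Using an allow-list rather than removing private fields means that any field added to the client state later stays local unless it is deliberately marked as shared.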
[0064] Figure 3 is a flowchart of an example embodiment of a federated machine learning method 300 for a client device according to the subject matter disclosed herein. The method begins at 301. Global parameters (i.e., a global weight factor dictionary and factor strengths) may be initialized by a global server 101 and sent to a selected subset of all clients 102 before training begins. At 302, the client device selects its own parameter set from the global parameter set. In one embodiment, the client uses variational parameters to form the client's parameter selection. At 303, the client device trains a model using its own dataset and the parameter set selected by the client device. At 304, after training, the client device (e.g., via communication interface 253) sends the client-updated weight factor dictionary and the client-updated factor strength vector to the global server 101, but does not send the client's own dataset or the variational parameters used by the client to form the client's parameter selection. The global server 101 may use an averaging step to aggregate the client-updated dictionary components and factor strength vectors. The global server 101 may select a new subset of clients for sampling and send the new updated global parameter set to the new subset of clients. In an example embodiment of method 300, the client is selected as part of the new subset of clients. At 305, the client device (e.g., via communication interface 253) receives a globally updated weight factor dictionary and a globally updated factor strength vector from global server 101. At 306, the client device retrains the model using the client's dataset, the parameter set selected by the client device, the globally updated weight factor dictionary, and the globally updated factor strength vector. The method may continue until the desired number of training epochs has occurred. The method ends at 307.
[0065] Figure 4 depicts an electronic device 400 that includes functionality for federated machine learning according to the subject matter disclosed herein. In one embodiment, the electronic device 400 may be a global server operating to provide federated machine learning as disclosed herein. In another embodiment, the electronic device 400 may be a client device operating to provide federated machine learning as disclosed herein. The electronic device 400 (whether a global server or a client device) may also be implemented as, but is not limited to, a computing device, a personal digital assistant (PDA), a laptop computer, a mobile computer, a network tablet, a wireless phone, a cellular phone, a smartphone, a digital music player, or a wired or wireless electronic device. The electronic device 400 may include a controller 410, input / output devices 420 (such as, but not limited to, a keypad, keyboard, display, touchscreen display, camera and / or image sensor), memory 430, interface 440, graphics processing unit (GPU) 450, and imaging processor 460 connected to each other via a bus 470. The controller 410 may include, for example, at least one microprocessor, at least one digital signal processor, at least one microcontroller, etc. The memory 430 may be configured to store command codes or user data to be used by the controller 410.
[0066] Interface 440 may be configured to include a wireless interface configured to transmit data to or receive data from a wireless communication network using radio frequency (RF) signals. Wireless interface 440 may include, for example, an antenna. Electronic device 400 may also use the communication interface protocols of communication systems such as, but not limited to, Code Division Multiple Access (CDMA), Global System for Mobile Communications (GSM), North American Digital Communications (NADC), Extended Time Division Multiple Access (E-TDMA), Wideband CDMA (WCDMA), CDMA2000, Wi-Fi, Municipal Wi-Fi (Muni Wi-Fi), Bluetooth, Digital Enhanced Cordless Telecommunications (DECT), Wireless Universal Serial Bus (Wireless USB), Fast low-latency access with seamless handoff Orthogonal Frequency Division Multiplexing (Flash-OFDM), IEEE 802.20, General Packet Radio Service (GPRS), iBurst, Wireless Broadband (WiBro), WiMAX, WiMAX-Advanced, Universal Mobile Telecommunications System - Time Division Duplex (UMTS-TDD), High-Speed Packet Access (HSPA), Evolution-Data Optimized (EVDO), Long Term Evolution Advanced (LTE-Advanced), Multichannel Multipoint Distribution Service (MMDS), 5G, etc.
[0067] Embodiments of the subject matter and operation described in this specification may be implemented as digital electronic circuits, or as computer software, firmware, or hardware (including the structures disclosed in this specification and their structural equivalents), or a combination of one or more of these. Embodiments of the subject matter described in this specification may be implemented as one or more computer programs (i.e., one or more modules of computer program instructions) encoded on a computer storage medium to execute or control the operation of a data processing device. Optionally or additionally, the program instructions may be encoded on artificially generated propagation signals (e.g., machine-generated electrical, optical, or electromagnetic signals) generated to encode information for transmission into a suitable receiver device for execution by the data processing device. The computer storage medium may be or be included in a computer-readable storage device, a computer-readable storage substrate, a random or serial access memory array or apparatus, or a combination thereof. Furthermore, when the computer storage medium is not a propagation signal, it may be a source or destination of computer program instructions encoded in an artificially generated propagation signal. The computer storage medium may also be or be included in one or more separate physical components or media (e.g., multiple CDs, disks, or other storage devices). Furthermore, the operations described in this specification can be implemented as operations performed by a data processing device on data stored on one or more computer-readable storage devices or received from other sources.
[0068] While this specification may contain numerous specific implementation details, these details should not be construed as limiting the scope of any claimed subject matter, but rather as descriptions of specific features of particular embodiments. Specific features described in the context of individual embodiments in this specification may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented individually or in any suitable sub-combination in multiple embodiments. Furthermore, although features may be described above as functioning in a particular combination, and even initially claimed in this way, in some cases one or more features from a claimed combination may be removed from that combination, and the claimed combination may involve sub-combinations or variations thereof.
[0069] Similarly, although operations are described in a specific order in the accompanying drawings, this should not be construed as requiring the operations to be performed in the specific order shown or sequentially, or to perform all shown operations to obtain the desired result. In certain situations, multitasking and parallel processing may be advantageous. Furthermore, the separation of various system components in the above embodiments should not be construed as requiring such separation in all embodiments; it should be understood that the described program components and systems can generally be integrated together in a single software product or encapsulated in multiple software products.
[0070] Therefore, specific embodiments of the subject matter have been described herein. Other embodiments are within the scope of the appended claims. In some cases, the actions set forth in the claims may be performed in a different order and still achieve the desired result. Furthermore, the processes described in the drawings do not necessarily require the specific order or sequence shown to obtain the desired result. In certain embodiments, multitasking and parallel processing may be advantageous.
[0071] As those skilled in the art will recognize, the innovative concepts described herein can be modified and altered in a wide range of applications. Therefore, the scope of the claimed subject matter should not be limited to any specific exemplary teachings discussed above, but is defined by the appended claims.
Claims
1. A client device in a federated machine learning system, the client device comprising: at least one computing device; a communication interface; and a processor connected to the at least one computing device and the communication interface, wherein the processor: selects a parameter set for the client device from a global parameter set, wherein the global parameter set includes a weight factor dictionary and a factor strength vector; trains a model in a factorized manner using a dataset of the client device and the parameter set selected by the client device, wherein the dataset is formed from the output of the at least one computing device, and training the model includes updating the weight factor dictionary and the factor strength vector by minimizing the difference between supervised learning of the dataset and the regularization of the selected parameter set and the global parameter set; obtains, after training the model, a client-updated weight factor dictionary and a client-updated factor strength vector; sends the client-updated weight factor dictionary and the client-updated factor strength vector to a global server via the communication interface; receives a globally updated weight factor dictionary and a globally updated factor strength vector from the global server via the communication interface; and retrains the model using the dataset of the client device, the parameter set selected by the client device, the globally updated weight factor dictionary, and the globally updated factor strength vector.
2. The client device of claim 1, wherein the client device is part of a group of N client devices, where N is a positive integer.
3. The client device according to claim 2, wherein the processor selects the parameter set from the global parameter set by using three variational parameters, including a seed value.
4. The client device according to claim 3, wherein the processor selects the parameter set from the global parameter set by receiving the global parameter set that has been sent from the global server to a first subset of the N client devices, the client device being part of the first subset of client devices.
5. The client device according to claim 4, wherein the client device receives the globally updated weight factor dictionary and the globally updated factor strength vector as sent by the global server to a second subset of the N client devices, the client device being part of the second subset of client devices.
6. The client device according to claim 4, wherein the processor: sends a request for the current version of the global parameter set to the global server via the communication interface; updates the model using the current version of the global parameter set; and evaluates the model updated using the current version of the global parameter set to form inferences based on the dataset of the client device.
7. The client device according to any one of claims 1 to 6, wherein the dataset includes information related to at least one of biometric data, medical data, image data, voice data, location data, application usage data, thermal data, atmospheric data, audio data, and survey data.
8. A federated machine learning system, comprising: a global server that receives updates to a weight factor dictionary and a factor strength vector from N client devices and generates a globally updated weight factor dictionary and a globally updated factor strength vector, where N is a positive integer; and the N client devices, at least one of which includes: at least one computing device; a communication interface; and a processor connected to the at least one computing device and the communication interface, wherein the processor: selects a parameter set from a global parameter set, wherein the global parameter set includes the weight factor dictionary and the factor strength vector; trains a model in a factorized manner using a dataset of the client device and the parameter set selected by the client device, wherein training the model includes updating the weight factor dictionary and the factor strength vector by minimizing the difference between supervised learning of the dataset and the regularization of the selected parameter set and the global parameter set; obtains, after training the model, a client-updated weight factor dictionary and a client-updated factor strength vector; sends the client-updated weight factor dictionary and the client-updated factor strength vector to the global server via the communication interface; receives the globally updated weight factor dictionary and the globally updated factor strength vector from the global server via the communication interface; and retrains the model using the dataset of the client device, the parameter set selected by the client device, the globally updated weight factor dictionary, and the globally updated factor strength vector.
9. The federated machine learning system of claim 8, wherein the processor selects the parameter set from the global parameter set by using three variational parameters, including a seed value.
10. The federated machine learning system of claim 9, wherein the processor selects the parameter set from the global parameter set by receiving the global parameter set that has been sent from the global server to a first subset of the N client devices, the client device being part of the first subset of client devices.
11. The federated machine learning system of claim 10, wherein the client device receives the globally updated weight factor dictionary and the globally updated factor strength vector as sent by the global server to a second subset of the N client devices, the client device being part of the second subset of client devices.
12. The federated machine learning system of claim 10, wherein the processor: sends a request for the current version of the global parameter set to the global server via the communication interface; updates the model using the current version of the global parameter set; and evaluates the model updated using the current version of the global parameter set to form inferences based on the dataset of the client device.
13. The federated machine learning system according to any one of claims 8 to 12, wherein the dataset includes information related to at least one of biometric data, medical data, image data, voice data, location data, application usage data, thermal data, atmospheric data, audio data, and survey data.
14. A method for federated machine learning, the method comprising: selecting, by a client device, a parameter set from a global parameter set, wherein the global parameter set includes a weight factor dictionary and a factor strength vector; training, by the client device, a model in a factorized manner using a dataset of the client device and the parameter set selected by the client device, wherein training the model includes updating the weight factor dictionary and the factor strength vector by minimizing the difference between supervised learning of the dataset and the regularization of the selected parameter set and the global parameter set; obtaining, after training the model, a client-updated weight factor dictionary and a client-updated factor strength vector; sending the client-updated weight factor dictionary and the client-updated factor strength vector from the client device to a global server; receiving, by the client device, a globally updated weight factor dictionary and a globally updated factor strength vector from the global server; and retraining, by the client device, the model using the dataset of the client device, the parameter set selected by the client device, the globally updated weight factor dictionary, and the globally updated factor strength vector.
15. The method according to claim 14, wherein the client device is part of a group of N client devices, where N is a positive integer.
16. The method according to claim 15, wherein selecting the parameter set from the global parameter set further includes using three variational parameters, including a seed value, to select the parameter set.
17. The method according to claim 16, wherein selecting the parameter set from the global parameter set further includes the client device receiving the global parameter set that has been sent from the global server to a first subset of the N client devices, the client device being part of the first subset of client devices.
18. The method according to claim 17, wherein the client device receiving the globally updated weight factor dictionary and the globally updated factor strength vector from the global server further includes the client device receiving the globally updated weight factor dictionary and the globally updated factor strength vector as sent by the global server to a second subset of the N client devices, the client device being part of the second subset of client devices.
19. The method of claim 17, further comprising: requesting, by the client device, the current version of the global parameter set from the global server; receiving the current version of the global parameter set; updating the model using the current version of the global parameter set; and evaluating the model updated using the current version of the global parameter set to form inferences based on the dataset of the client device.
20. The method according to any one of claims 14 to 19, wherein the dataset includes information related to at least one of biometric data, medical data, image data, voice data, location data, application usage data, thermal data, atmospheric data, audio data, and survey data.