A privacy-preserving distributed optimization method and system based on optimal control

By reconstructing the least squares problem into a discrete-time optimal control problem and introducing a regularization matrix, the computational collapse problem caused by the singularity of the Hessian matrix is solved, achieving fast convergence and stability of federated learning and improving the overall performance of federated learning.

CN121503734B · Active · Publication Date: 2026-06-12 · SHANDONG UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SHANDONG UNIV
Filing Date: 2026-01-12
Publication Date: 2026-06-12

Application Information


IPC: G06N20/00; G06F21/62; G06F17/16

CPC: G06N20/00; G06F21/6245; G06F17/16

AI Tagging

Application Domain

Digital data protection; Machine learning


AI Technical Summary

⚠Technical Problem

Existing distributed second-order optimization methods suffer computational collapse in federated learning scenarios due to the singularity of the Hessian matrix, failing to guarantee training stability and efficiency.

⚗Method used

The least squares problem is reconstructed into a discrete-time optimal control problem. By introducing a control regularization matrix, the direct inversion of the Hessian matrix is avoided, and the model parameters are updated using the optimal control input.

🎯Benefits of technology

It maintains robustness under singular or ill-conditioned matrix conditions, achieves fast convergence and computational stability, and improves the overall performance of federated learning.



Abstract

The application discloses a privacy protection type distributed optimization method and system based on optimal control, relates to the technical field of privacy protection, and comprises the following steps: dividing model parameters of each client into local private components and global shared components; constructing an optimal control model; at the beginning of each round of communication, a client receives the global shared components, combines the local private components to obtain a current state variable; performing local updating, calculating a gradient and a Hessian matrix, correcting the Hessian matrix according to a regularization matrix, solving updated model parameters by using the optimal control model; uploading the updated global shared components to a server, so that the server performs an aggregation operation to generate a global model until a termination condition is reached. The original least square problem is reconstructed into a discrete-time optimal control problem, the model parameters are updated by solving optimal control inputs, the direct inversion of the Hessian matrix is avoided by introducing a control regularization matrix, and the robustness under the condition that a matrix is singular or ill-conditioned is ensured.


Description

Technical Field

[0001] This invention relates to the field of privacy protection technology, and in particular to a privacy-preserving distributed optimization method and system based on optimal control.

Background Technology

[0002] The statements in this section are merely background information related to the present invention and do not necessarily constitute prior art.

[0003] Federated learning, an emerging paradigm that enables distributed collaborative model training under privacy constraints, is commonly used in data-sensitive fields such as smart healthcare and intelligent transportation. In these applications, multiple clients need to collaboratively optimize the global model by interacting with a central server to update information without sharing their local raw data.

[0004] Currently, optimization algorithms based on first-order gradients (such as the FedAvg algorithm and its variants) have the advantage of simple communication and computation. However, first-order algorithms only utilize gradient information and have a slow convergence speed (usually sublinear convergence). When facing high-dimensional, large-scale optimization problems, a large number of communication rounds are required to achieve the expected accuracy, resulting in a serious waste of communication and computational resources.

[0005] Second-order optimization methods (such as Newton's method) can theoretically achieve faster linear or superlinear convergence than first-order methods by utilizing the curvature information of the objective function. This makes second-order methods more promising than traditional first-order methods in scenarios that pursue high efficiency and high accuracy. However, existing distributed second-order optimization schemes have significant limitations in federated learning scenarios. Traditional second-order algorithms usually rely directly on the explicit inversion of the Hessian matrix to calculate the update step. In the literature and in existing technologies, the Hessian matrix often becomes singular when faced with uneven data distribution, ill-conditioned local loss functions, or high-noise data. This singularity makes the matrix non-invertible, so existing algorithms fail or diverge outright and cannot guarantee training stability. Existing technologies have not effectively solved how to retain the speedup advantage of second-order methods while avoiding the computational collapse caused by the singularity of the Hessian matrix, which makes it difficult to improve overall performance.

Summary of the Invention

[0006] To address the aforementioned issues, this invention proposes a privacy-preserving distributed optimization method and system based on optimal control. The original least squares problem is reconstructed into a discrete-time optimal control problem. The model parameters are updated by solving for the optimal control input. By introducing a control regularization matrix, the direct inversion of the Hessian matrix is avoided, ensuring robustness under singular or ill-conditioned matrix conditions.

[0007] To achieve the above objectives, the present invention adopts the following technical solution:

[0008] In a first aspect, the present invention provides a privacy-preserving distributed optimization method based on optimal control, applied to a client, comprising:

[0009] The model parameters for each client are divided into local private components and globally shared components;

[0010] Using model parameters as state variables, update step size as control input, and minimizing local least squares loss function as objective, construct an optimal control model;

[0011] At the start of each round of communication between the client and the server, the client receives the globally shared component broadcast by the server, combines it with the local private component, and assembles the current state variable; performs local update, calculates the gradient and Hessian matrix based on the current state variable, corrects the Hessian matrix according to the regularization matrix, and then uses the optimal control model to solve for the updated model parameters.

[0012] The globally shared components in the updated model parameters are uploaded to the server so that the server can perform aggregation operations to generate a global model until the termination condition is met, at which point the optimal model is obtained.

[0013] As an alternative implementation, the cost function $J_i^t$ of the optimal control model of the $i$-th client in the $t$-th round of communication is:

[0014] $$J_i^t = \sum_{k=0}^{K-1}\left[f_i(x_{i,k}^t) + \frac{1}{2}(u_{i,k}^t)^\top R\,u_{i,k}^t\right] + f_i(x_{i,K}^t);$$

[0015] where $f_i$ is the least squares loss function; $u_{i,k}^t$ is the control input to be solved for the $i$-th client in the $k$-th local update of the $t$-th round of communication; $R$ is a preset positive definite diagonal matrix; $x_{i,K}^t$ is the model parameter of the $i$-th client after the $K$-th local update in the $t$-th round of communication; $K$ is the maximum number of local update iterations; and $x_{i,k}^t$ is the model parameter of the $i$-th client after the $k$-th local update in the $t$-th round of communication.

[0016] As an alternative implementation, the process of modifying the Hessian matrix based on the regularization matrix involves adding a positive definite diagonal matrix to the original Hessian matrix.
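
The effect of this correction can be illustrated with a minimal NumPy sketch. It assumes a hypothetical two-parameter least squares problem whose Hessian is rank-deficient; the names `A`, `b`, `R`, and `regularized_newton_step`, and the value 0.1 on the diagonal of `R`, are illustrative and not taken from the patent.

```python
import numpy as np

def regularized_newton_step(grad, hess, R):
    """Solve (hess + R) d = grad for the update direction d.

    R is assumed to be a positive definite diagonal matrix, so hess + R
    is positive definite whenever hess is positive semidefinite.
    """
    return np.linalg.solve(hess + R, grad)

# Hypothetical rank-deficient problem: both columns of A are identical.
A = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])
x = np.zeros(2)

grad = A.T @ (A @ x - b)      # gradient of 0.5 * ||A x - b||^2
hess = A.T @ A                # Hessian of the same loss, singular here
R = 0.1 * np.eye(2)           # assumed regularization matrix

rank_plain = np.linalg.matrix_rank(hess)                 # rank of the raw Hessian
min_eig_corrected = np.linalg.eigvalsh(hess + R).min()   # smallest eigenvalue after correction
step = regularized_newton_step(grad, hess, R)
```

The raw Hessian has rank 1 and cannot be inverted, while the corrected matrix has its smallest eigenvalue lifted to roughly the diagonal value of `R`, so the linear solve is well defined.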

[0017] As an alternative implementation, the local update process includes recursively calculating approximate control inputs to update the model parameters:

[0018] $$x_{i,k+1}^t = x_{i,k}^t - g_{i,k}^t;$$

[0019] $$g_{i,k}^t = \left(R + \nabla^2 f_i(x_{i,k}^t)\right)^{-1}\left(\nabla f_i(x_{i,k}^t) + R\,g_{i,k-1}^t\right);$$

[0020] $$g_{i,0}^t = \left(R + \nabla^2 f_i(x_{i,0}^t)\right)^{-1}\nabla f_i(x_{i,0}^t);$$

[0021] where $x_{i,k}^t$ is the model parameter of the $i$-th client after the $k$-th local update in the $t$-th round of communication; $x_{i,k+1}^t$ is the model parameter of the $i$-th client after the $(k+1)$-th local update in the $t$-th round of communication; $g_{i,k}^t$ is the approximate control input obtained from the $k$-th recursive calculation for the $i$-th client in the $t$-th round of communication; $g_{i,k-1}^t$ is the approximate control input obtained from the $(k-1)$-th recursive calculation; $g_{i,0}^t$ is the initial value of the recursion for the $i$-th client in the $t$-th round of communication; $x_{i,0}^t$ is the initial model parameter of the $i$-th client in the $t$-th round of communication; $R$ is a preset positive definite diagonal matrix; and $f_i$ is the least squares loss function.

[0022] As an alternative implementation, after completing $K$ local updates, the client separates the updated globally shared component $s_{i,K}^t$ and uploads it to the server, which uses an average aggregation strategy to generate the next-round model $s^{t+1} = \frac{1}{n}\sum_{i=1}^{n} s_{i,K}^t$, where $n$ is the number of clients.

[0023] As an alternative implementation, the termination condition is reaching a preset number of communication rounds or the global loss function converging to a preset threshold.

[0024] In a second aspect, the present invention provides a privacy-preserving distributed optimization system based on optimal control, applied to a client, comprising:

[0025] The decoupling module is configured to divide the model parameters of each client into local private components and globally shared components;

[0026] The modeling module is configured to construct an optimal control model with model parameters as state variables, update step size as control input, and minimizing the local least squares loss function as the objective.

[0027] The update module is configured to, at the beginning of each round of communication between the client and the server, have the client receive the globally shared component broadcast by the server, combine it with the local private component, assemble the current state variable, perform a local update, calculate the gradient and Hessian matrix based on the current state variable, correct the Hessian matrix according to the regularization matrix, and then use the optimal control model to solve for the updated model parameters.

[0028] The upload module is configured to upload the globally shared components of the updated model parameters to the server, so that the server can perform an aggregation operation to generate a global model until the termination condition is met, and then obtain the optimal model.

[0029] In a third aspect, the present invention provides an electronic device including a memory and a processor, and computer instructions stored in the memory and running on the processor, wherein the computer instructions, when executed by the processor, perform the method described in the first aspect.

[0030] In a fourth aspect, the present invention provides a computer-readable storage medium for storing computer instructions, which, when executed by a processor, perform the method described in the first aspect.

[0031] In a fifth aspect, the present invention provides a computer program product, including a computer program that, when executed by a processor, implements the method described in the first aspect.

[0032] Compared with the prior art, the beneficial effects of the present invention are as follows:

[0033] To overcome the reliance on explicit inversion of the Hessian matrix in traditional second-order algorithms and address their instability under ill-conditioned problems, this invention proposes a federated learning framework based on optimal control. The original least squares problem is reconstructed into a discrete-time optimal control problem, transforming the model parameter update process into finding the optimal control input sequence. Simultaneously, by introducing a control regularization matrix, direct inversion of the Hessian matrix is avoided, ensuring robustness under matrix singularity or ill-conditioned conditions. This addresses the shortcomings of existing federated learning algorithms in handling least squares problems, such as slow convergence, unstable training, and computational failures caused by ill-conditioned local loss functions or singular Hessian matrices.

[0034] This invention proposes a federated learning framework based on optimal control. It reconstructs the least squares problem in federated learning using optimal control theory, transforming the local optimization tasks of each client into optimal control problems. The model parameters are updated by solving for the optimal control input, providing a novel perspective for solving federated optimization problems.

[0035] This invention introduces a control regularization matrix technique, which accelerates convergence by utilizing second-order curvature information and avoids the requirement for explicit inversion of the Hessian matrix in the traditional Newton method from a mathematical perspective. This effectively solves the problem of algorithm failure caused by the singularity of the Hessian matrix and significantly improves the numerical stability of the algorithm in ill-conditioned and high-noise environments.

[0036] Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Attached Figure Description

[0037] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.

[0038] Figure 1 is a flowchart of the privacy-preserving distributed optimization method based on optimal control provided in Embodiment 1 of the present invention;

[0039] Figure 2 is a schematic diagram of the privacy-preserving distributed optimization method based on optimal control provided in Embodiment 1 of the present invention;

[0040] Figure 3 is an experimental result graph showing the curve of the global loss function value provided in Embodiment 1 of the present invention;

[0041] Figure 4 is an experimental result graph showing the L1 residual curve provided in Embodiment 1 of the present invention;

[0042] Figure 5 is an experimental result graph showing the L2 residual curve provided in Embodiment 1 of the present invention.

Detailed Implementation

[0043] The present invention will be further described below with reference to the accompanying drawings and embodiments.

[0044] It should be noted that the following detailed descriptions are exemplary and intended to provide further illustration of the invention. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains.

[0045] It should be noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of exemplary embodiments according to the invention. As used herein, unless the context clearly indicates otherwise, the singular form is intended to include the plural form as well. Furthermore, it should be understood that the terms “comprising” and “including”, and any variations thereof, are intended to cover non-exclusive inclusion, for example, a process, method, system, product, or apparatus that includes a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0046] Where there is no conflict, the embodiments and features in the embodiments of the present invention can be combined with each other.

[0047] Embodiment 1

[0048] This embodiment provides a privacy-preserving distributed optimization method based on optimal control, applicable to a distributed system containing a server and multiple clients;

[0049] As shown in Figure 1, the method includes:

[0050] The model parameters for each client are divided into local private components and globally shared components;

[0051] Using model parameters as state variables, update step size as control input, and minimizing local least squares loss function as objective, construct an optimal control model;

[0052] At the start of each round of communication between the client and the server, the client receives the globally shared component broadcast by the server, combines it with the local private component, and assembles the current state variable; performs local update, calculates the gradient and Hessian matrix based on the current state variable, corrects the Hessian matrix according to the regularization matrix, and then uses the optimal control model to solve for the updated model parameters.

[0053] The globally shared components in the updated model parameters are uploaded to the server so that the server can perform aggregation operations to generate a global model until the termination condition is met, at which point the optimal model is obtained.

[0054] The above method is explained in detail below with reference to Figure 2.

[0055] Step 1: Initialization and variable decoupling.

[0056] The server initializes the global model parameters $s^0$ and sets the total number of communication rounds $T$, the number of local iterations $K$, and the positive definite control regularization matrix $R$.

[0057] To balance global consensus and personalized adaptation, the model parameters are decoupled and divided into local private components and globally shared components for each client. The local private components are updated only on the client to adapt to data characteristics and protect privacy, while the globally shared components are used for cross-client collaborative optimization.

[0058] Specifically, for the $i$-th client, the model parameter $x_{i,k}^t$ (the state variable) after the $k$-th local update in the $t$-th round of communication is:

[0059] $$x_{i,k}^t = \begin{bmatrix} p_{i,k}^t \\ s_{i,k}^t \end{bmatrix} \tag{1}$$

[0060] where $p_{i,k}^t$ is the local private component (such as a personalized prediction head) of the $i$-th client after the $k$-th local update in the $t$-th round of communication, which is retained only during local iteration; and $s_{i,k}^t$ is the globally shared component (such as the feature extractor) of the $i$-th client after the $k$-th local update in the $t$-th round of communication, which needs to be uploaded to the server for aggregation.
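
The decoupling can be sketched as a pair of helper functions. This is a minimal illustration that assumes the private component is stored first and the shared component second in a flat NumPy vector; `assemble_state`, `split_state`, and the example values are hypothetical.

```python
import numpy as np

def assemble_state(private, shared):
    """Stack the private and shared components into one state vector."""
    return np.concatenate([private, shared])

def split_state(state, n_private):
    """Inverse of assemble_state: return copies of (private, shared)."""
    return state[:n_private].copy(), state[n_private:].copy()

private = np.array([0.5, -1.0])         # e.g. a personalized prediction head
shared = np.array([1.0, 2.0, 3.0])      # e.g. feature-extractor weights
state = assemble_state(private, shared)
p_back, s_back = split_state(state, n_private=private.size)
```

Only the second return value of `split_state` would ever leave the client; the first stays in local memory between rounds.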

[0061] Step 2: Construct an optimal control model for local updates, reconstructing the client's local loss optimization process into an optimal control problem for a discrete-time linear dynamic system.

[0062] Specifically:

[0063] This embodiment abandons the traditional gradient descent perspective and models local updates as a dynamic control process. Using model parameters as state variables, the update step size as control input, and minimizing the local least squares loss function as the objective, an optimal control model is constructed. The state equation is defined as $x_{i,k+1}^t = x_{i,k}^t + u_{i,k}^t$, where $u_{i,k}^t$ is the control input to be solved for the $i$-th client in the $k$-th local update of the $t$-th round of communication.

[0064] To find the optimal $u_{i,k}^t$, the optimization problem is reformulated as a discrete-time optimal control problem. Within this framework, the cost function $J_i^t$ of the optimal control model is defined as:

[0065] $$J_i^t = \sum_{k=0}^{K-1}\left[f_i(x_{i,k}^t) + \frac{1}{2}(u_{i,k}^t)^\top R\,u_{i,k}^t\right] + f_i(x_{i,K}^t) \tag{2}$$

[0066] where $f_i$ is the least squares loss function; $\frac{1}{2}(u_{i,k}^t)^\top R\,u_{i,k}^t$ is the control energy regularization term; and $R$ is a preset positive definite diagonal matrix. Introducing the control energy regularization term not only constrains the update step size but, more importantly, allows the positive definiteness of $R$ to be used to correct the eigenvalue distribution of the Hessian matrix. $x_{i,k}^t$ is the model parameter of the $i$-th client after the $k$-th local update in the $t$-th round of communication, and $K$ is the maximum number of local update iterations.
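
To make the structure of such a cost concrete, the following sketch evaluates it along a trajectory. It assumes the cost is a running loss plus a quadratic control-energy term with a factor of one half, followed by a terminal loss, and that the state evolves as state plus control; the function `rollout_cost` and the toy problem are hypothetical.

```python
import numpy as np

def rollout_cost(x0, controls, loss, R):
    """Accumulate loss(x_k) + 0.5 * u_k^T R u_k along x_{k+1} = x_k + u_k,
    then add the terminal loss at the final state."""
    x = np.asarray(x0, dtype=float)
    total = 0.0
    for u in controls:
        total += loss(x) + 0.5 * u @ R @ u
        x = x + u
    return total + loss(x), x

# Toy least squares loss 0.5 * ||A x - b||^2 with minimizer x = (1, 1).
A = np.array([[1.0, 0.0], [0.0, 2.0]])
b = np.array([1.0, 2.0])
loss = lambda x: 0.5 * np.sum((A @ x - b) ** 2)

R = np.diag([0.5, 0.5])                                   # assumed regularization matrix
controls = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]   # two equal steps toward the minimizer
cost, x_final = rollout_cost(np.zeros(2), controls, loss, R)
```

A larger `R` makes every step more expensive, which is how the regularization term limits the step size in this formulation.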

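[0067] To handle the aforementioned dynamic constraint, a sequence of costate variables (Lagrange multipliers) $\lambda_{i,k}^t$ is introduced to construct the augmented objective function:

[0068] $$\bar{J}_i^t = \sum_{k=0}^{K-1}\left[f_i(x_{i,k}^t) + \frac{1}{2}(u_{i,k}^t)^\top R\,u_{i,k}^t + (\lambda_{i,k}^t)^\top\left(x_{i,k}^t + u_{i,k}^t - x_{i,k+1}^t\right)\right] + f_i(x_{i,K}^t) \tag{3}$$

[0069] Applying the variational method, the partial derivative of $\bar{J}_i^t$ with respect to the control input $u_{i,k}^t$ is set to zero, which yields the optimality condition $R\,u_{i,k}^t + \lambda_{i,k}^t = 0$.

[0070] By the backward recursive property of the costate equation, $\lambda_{i,k}^t$ essentially represents the cumulative effect of the gradients at all future moments, i.e. $\lambda_{i,k}^t = \sum_{j=k+1}^{K}\nabla f_i(x_{i,j}^t)$, where $x_{i,j}^t$ is the model parameter of the $i$-th client after the $j$-th local update in the $t$-th round of communication.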

[0071] Substituting this result back into the control equation gives the theoretically optimal control law:

[0072] $$u_{i,k}^t = -R^{-1}\sum_{j=k+1}^{K}\nabla f_i(x_{i,j}^t) \tag{4}$$
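
The steps leading to a control law of this kind can be written out as follows. This is a sketch with the client and round indices dropped. It assumes a cost made of running losses, a control-energy term $\frac{1}{2}u_k^\top R\,u_k$ and a terminal loss, the state equation $x_{k+1} = x_k + u_k$, and one particular sign convention for the multipliers $\lambda_k$.

```latex
\begin{aligned}
\bar{J} &= \sum_{k=0}^{K-1}\Big[f(x_k) + \tfrac{1}{2}u_k^\top R\,u_k
          + \lambda_k^\top\,(x_k + u_k - x_{k+1})\Big] + f(x_K),\\
\frac{\partial \bar{J}}{\partial u_k} = 0
  &\;\Rightarrow\; R\,u_k + \lambda_k = 0,\\
\frac{\partial \bar{J}}{\partial x_k} = 0
  &\;\Rightarrow\; \lambda_{k-1} = \lambda_k + \nabla f(x_k), \qquad 1 \le k \le K-1,\\
\frac{\partial \bar{J}}{\partial x_K} = 0
  &\;\Rightarrow\; \lambda_{K-1} = \nabla f(x_K),\\
\lambda_k &= \sum_{j=k+1}^{K}\nabla f(x_j),
\qquad
u_k = -R^{-1}\sum_{j=k+1}^{K}\nabla f(x_j).
\end{aligned}
```

Unrolling the backward recursion for $\lambda_k$ from the terminal condition gives the sum of future gradients, and the stationarity condition in $u_k$ then turns that sum into the control input.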

[0073] However, this theoretical formula is computationally infeasible: the current control input $u_{i,k}^t$ depends on the future states $x_{i,j}^t$ ($j > k$), while those future states are in turn determined by the current control input. This creates a non-causal circular dependency.

[0074] To break this dependency cycle, the future gradient term is approximated by a first-order Taylor expansion:

[0075] $$\nabla f_i(x_{i,k+1}^t) \approx \nabla f_i(x_{i,k}^t) + \nabla^2 f_i(x_{i,k}^t)\left(x_{i,k+1}^t - x_{i,k}^t\right) \tag{5}$$

[0076] where $x_{i,k+1}^t$ is the model parameter of the $i$-th client after the $(k+1)$-th local update in the $t$-th round of communication.

[0077] Through algebraic derivation, the original reliance on future states is transformed into the use of historical information, ultimately yielding a fully forward, explicit iterative process.
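
One way to carry out this algebra is sketched below, with the client and round indices dropped. It assumes the state equation $x_{k+1} = x_k + u_k$ and an optimal control law of the form $u_k = -R^{-1}\sum_{j=k+1}^{K}\nabla f(x_j)$; the symbols $H_k$ and $g_k$ are introduced here for illustration.

```latex
\begin{aligned}
R\,u_k &= -\nabla f(x_{k+1}) + R\,u_{k+1}
  && \text{(split off the first term of the sum; } u_K := 0\text{)}\\
\nabla f(x_{k+1}) &\approx \nabla f(x_k) + H_k\,u_k,
  \quad H_k = \nabla^2 f(x_k)
  && \text{(first-order Taylor expansion, } x_{k+1} - x_k = u_k\text{)}\\
(R + H_k)\,u_k &\approx -\nabla f(x_k) + R\,u_{k+1}\\
g_k := -u_k \;\Rightarrow\; g_k &\approx (R + H_k)^{-1}\bigl(\nabla f(x_k) + R\,g_{k+1}\bigr)
\end{aligned}
```

Only $R + H_k$ is inverted, never $H_k$ alone. The neighbouring direction that remains on the right-hand side is the coupling that a forward implementation would replace with the direction already computed in the previous local update; that index arrangement is an assumption of this sketch.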

[0078] Specifically, the update direction is recursively calculated using the gradient at the current time step, the Hessian matrix, and the update direction at the previous time step.

[0079] Let $g_{i,k}^t$ be the approximate control input of the $i$-th client obtained from the $k$-th recursive calculation in the $t$-th round of communication. The update formulas are:

[0080] $$x_{i,k+1}^t = x_{i,k}^t - g_{i,k}^t \tag{6}$$

[0081] $$g_{i,k}^t = \left(R + \nabla^2 f_i(x_{i,k}^t)\right)^{-1}\left(\nabla f_i(x_{i,k}^t) + R\,g_{i,k-1}^t\right) \tag{7}$$

[0082] $$g_{i,0}^t = \left(R + \nabla^2 f_i(x_{i,0}^t)\right)^{-1}\nabla f_i(x_{i,0}^t) \tag{8}$$

[0083] where $x_{i,k}^t$ is the model parameter of the $i$-th client after the $k$-th local update in the $t$-th round of communication; $x_{i,k+1}^t$ is the model parameter of the $i$-th client after the $(k+1)$-th local update in the $t$-th round of communication; $g_{i,k}^t$ is the approximate control input obtained from the $k$-th recursive calculation; $g_{i,k-1}^t$ is the approximate control input obtained from the $(k-1)$-th recursive calculation; $g_{i,0}^t$ is the initial value of the recursion for the $i$-th client in the $t$-th round of communication; and $x_{i,0}^t$ is the initial model parameter of the $i$-th client in the $t$-th round of communication.

[0084] In this step, because $R$ is positive definite, even if $\nabla^2 f_i(x_{i,k}^t)$ is a singular (non-invertible) matrix, the matrix $R + \nabla^2 f_i(x_{i,k}^t)$ remains strictly positive definite and invertible, fundamentally ensuring the stability of the numerical calculation.

[0085] This derivation process transforms the theoretical optimal solution, which depends on the sum of future gradients, into a practical numerical algorithm that incorporates second-order information and utilizes historical data.
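
The recursion can be sketched in NumPy as follows. The sketch assumes the update has the form "new state equals old state minus direction", where the first direction solves the regularized system with the gradient alone and each later direction adds `R` times the previous direction to the right-hand side. The function name `local_update`, the value of `R`, and the small rank-deficient problem are hypothetical.

```python
import numpy as np

def local_update(x0, grad_fn, hess_fn, R, K):
    """Run K local updates x_{k+1} = x_k - g_k, where
    g_0 = (R + H(x_0))^{-1} grad(x_0) and
    g_k = (R + H(x_k))^{-1} (grad(x_k) + R g_{k-1}) for k >= 1.
    """
    x = np.asarray(x0, dtype=float)
    g_prev = None
    for _ in range(K):
        rhs = grad_fn(x) if g_prev is None else grad_fn(x) + R @ g_prev
        g = np.linalg.solve(R + hess_fn(x), rhs)   # never inverts H alone
        x = x - g
        g_prev = g
    return x

# Hypothetical rank-deficient least squares problem 0.5 * ||A x - b||^2.
A = np.array([[1.0, 1.0], [2.0, 2.0]])
b = np.array([2.0, 4.0])
grad_fn = lambda x: A.T @ (A @ x - b)
hess_fn = lambda x: A.T @ A          # singular: rank 1
loss = lambda x: 0.5 * np.sum((A @ x - b) ** 2)

R = 0.5 * np.eye(2)                  # assumed regularization matrix
x_start = np.zeros(2)
x_end = local_update(x_start, grad_fn, hess_fn, R, K=5)
```

A plain Newton step would need the inverse of `A.T @ A`, which does not exist here; the regularized solve stays well defined and the loss still drops by several orders of magnitude over the five updates.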

[0086] Step 3: Federated learning iterative process.

[0087] At the start of each round of communication between the client and server, the client receives the globally shared components broadcast by the server and assembles them with its local private components to obtain the current state variables. Then, a local update is performed, calculating the gradient and Hessian matrix based on the current state variables, and solving for the model parameters using an iterative method based on optimal control. During this calculation process, the Hessian matrix is corrected using a regularization matrix to ensure matrix invertibility. Finally, the client uploads the updated globally shared components to the server, and the server performs an aggregation operation to generate a new global model until the model converges.

[0088] As shown in Figure 2, each round of communication includes the following sub-steps:

[0089] 1) Broadcast and state assembly: the server broadcasts the current globally shared component $s^t$ to all clients. Each client combines it with the local private component $p_{i,K}^{t-1}$ retained from the previous training round and assembles the current state variable $x_{i,0}^t$.

[0090] 2) Local optimization based on optimal control: the clients perform $K$ local updates in parallel. In the $k$-th update, the gradient $\nabla f_i(x_{i,k}^t)$ and the Hessian matrix $\nabla^2 f_i(x_{i,k}^t)$ at the current state variable are calculated first. Then, to avoid the possible singularity of the Hessian matrix and reduce computational complexity, the control input is calculated recursively using equations (6)-(8).

[0091] 3) Global aggregation: after completing $K$ local updates, the client separates the updated globally shared component $s_{i,K}^t$ and uploads it to the server. The server uses an average aggregation strategy to generate the next-round model $s^{t+1} = \frac{1}{n}\sum_{i=1}^{n} s_{i,K}^t$, where $n$ is the number of clients and $s_{i,K}^t$ is the globally shared component of the $i$-th client after the $K$-th local update in the $t$-th round of communication.
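
The aggregation sub-step amounts to an unweighted mean over the uploaded shared components. A minimal sketch, with the hypothetical helper `aggregate_shared` and made-up upload values:

```python
import numpy as np

def aggregate_shared(shared_components):
    """Average the shared components uploaded by the n clients."""
    return np.mean(np.stack(shared_components), axis=0)

# Three hypothetical clients, each uploading a two-element shared component.
uploads = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
next_shared = aggregate_shared(uploads)
```

The private components never appear in `uploads`, so the server only ever sees the part of the model that is meant to be shared.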

[0092] Step 4: Determine the termination condition.

[0093] Repeat Step 3 until the preset number of communication rounds $T$ is reached or the global loss function converges to a preset threshold, then output the final optimized model.
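
Steps 3 and 4 together can be sketched as a single loop with both stop rules. This is an illustration under simplifying assumptions: each client holds a linear least squares problem, the private component comes first in the state vector, and the clients are simulated sequentially in one process. The names `local_update`, `run_rounds`, and all numeric values are hypothetical.

```python
import numpy as np

def local_update(x, A, b, R, K):
    """K regularized second-order steps on 0.5 * ||A x - b||^2."""
    g_prev = np.zeros_like(x)         # zero start makes the first step use the gradient alone
    H = A.T @ A
    for _ in range(K):
        grad = A.T @ (A @ x - b)
        g_prev = np.linalg.solve(R + H, grad + R @ g_prev)
        x = x - g_prev
    return x

def run_rounds(clients, n_private, shared0, R, K, max_rounds, tol):
    """Repeat broadcast, local update and averaging until a stop rule fires."""
    shared = shared0.copy()
    privates = [np.zeros(n_private) for _ in clients]
    history = []
    for t in range(max_rounds):                           # round-count rule
        uploads = []
        for i, (A, b) in enumerate(clients):
            x = np.concatenate([privates[i], shared])     # state assembly
            x = local_update(x, A, b, R, K)
            privates[i] = x[:n_private]                   # stays on the client
            uploads.append(x[n_private:])                 # only this is uploaded
        shared = np.mean(uploads, axis=0)                 # average aggregation
        loss = sum(
            0.5 * np.sum((A @ np.concatenate([p, shared]) - b) ** 2)
            for (A, b), p in zip(clients, privates)
        )
        history.append(loss)
        if loss < tol:                                    # loss-threshold rule
            break
    return shared, privates, history

# Two hypothetical clients that agree on the shared target (3.0) but not the private one.
clients = [(np.eye(2), np.array([1.0, 3.0])), (np.eye(2), np.array([-2.0, 3.0]))]
shared, privates, history = run_rounds(
    clients, n_private=1, shared0=np.zeros(1),
    R=0.5 * np.eye(2), K=5, max_rounds=20, tol=1e-6,
)
```

On this toy problem the loss threshold fires well before the round limit, so the loop exits through the `break` rather than by exhausting `max_rounds`.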

[0094] Through the above implementation method, when processing high-noise data (such as cryo-electron microscopy image alignment tasks), second-order information can be effectively utilized to achieve fast convergence, while the computational stability of the entire process is guaranteed by the control regularization mechanism.

[0095] The aforementioned method uses the decentralized data mechanism of federated learning: each client processes its raw data locally and exchanges only optimization-related parameters, forming a collaborative computing approach with privacy protection. During computation, each client's least squares optimization problem is reconstructed as an optimal control problem for a discrete-time linear dynamic system. The local loss function is minimized by finding the optimal control input, and a regularization matrix is introduced into the control cost to avoid explicit inversion of the Hessian matrix, resulting in a second-order optimization method that is stable and efficient in multi-node environments.

[0096] To verify the effectiveness and superiority of the Federated Optimal Control (FedOC) algorithm proposed in this embodiment in practical applications, numerical experiments were conducted on a cryo-electron microscopy (Cryo-EM) image alignment task. The experiments used a bundle adjustment framework as the basic model, formulated the alignment task as a nonlinear least squares optimization problem, and were run on a real Vibrio dataset (containing 121 camera views and 80 labeled 3D points). The Fed-Grad algorithm, which uses standard gradient descent, and the FedNewton algorithm, which uses the standard Newton's method, were selected as baselines for comparison. As for the experimental parameter settings, the regularization matrix $R$ of the method in this embodiment was set to [value missing in source], and the learning rate of the comparison method Fed-Grad was fixed at 1. Both methods perform 5 local iterations in each round of federated optimization.

[0097] The experimental results are as follows. As shown by the global cost curve in Figure 3, compared with the gradient-descent-based Fed-Grad method, the objective function value of the FedOC method of this embodiment drops sharply and rapidly converges to a lower level, which supports the algorithm's advantage in convergence speed. As shown by the L1 residual curve in Figure 4 and the L2 residual curve in Figure 5, the method in this embodiment also exhibits a steeper descent and a better final value on the indicators that reflect alignment accuracy, indicating that it reaches high-precision alignment results faster.

[0098] Specifically, the numerical results show that the method in this embodiment reduces the L1 residual to 1.1228, compared with 4.9678 for the Fed-Grad method, and reduces the L2 residual to 1.8185, compared with 8.2453 for the Fed-Grad method. The global cost is also reduced to the order of [value missing in source], whereas the comparison method remains at a higher order of magnitude of [value missing in source]. Meanwhile, in the experiment, the FedNewton algorithm failed to produce effective optimization results because the Hessian matrix became singular during the calculation. In contrast, the method in this embodiment avoids the computational failure caused by a singular or ill-conditioned Hessian matrix by introducing a regularization control term, which shows that this embodiment is more robust in handling such complex optimization problems.

[0099] Thus, this embodiment achieves a balance between superlinear convergence and global stability, and in theory has a superlinear convergence rate under standard assumptions. Furthermore, in high-noise practical tasks such as cryo-electron microscopy image alignment, it significantly reduces alignment error and maintains robust convergence compared with existing first-order and second-order baseline algorithms.

[0100] Embodiment 2

[0101] This embodiment provides a privacy-preserving distributed optimization system based on optimal control, applied to a client, including:

[0102] The decoupling module is configured to divide the model parameters of each client into local private components and globally shared components;

[0103] The modeling module is configured to construct an optimal control model with model parameters as state variables, update step size as control input, and minimizing the local least squares loss function as the objective.

[0104] The update module is configured to, at the beginning of each round of communication between the client and the server, have the client receive the globally shared component broadcast by the server, combine it with the local private component, assemble the current state variable, perform a local update, calculate the gradient and Hessian matrix based on the current state variable, correct the Hessian matrix according to the regularization matrix, and then use the optimal control model to solve for the updated model parameters.

[0105] The upload module is configured to upload the globally shared components of the updated model parameters to the server, so that the server can perform an aggregation operation to generate a global model until the termination condition is met, and then obtain the optimal model.
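The interaction of the four modules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: each client holds a linear least squares loss (the embodiment targets a nonlinear one), the local solve is replaced by a plain regularized second-order step `x - (H + R)^{-1} g`, and all function names, dimensions, and data are hypothetical.

```python
import numpy as np

def local_update(x, A, b, R, K):
    """Update module: K regularized second-order steps on 0.5*||A x - b||^2."""
    for _ in range(K):
        grad = A.T @ (A @ x - b)                  # gradient at the current state
        hess = A.T @ A                            # Hessian at the current state
        x = x - np.linalg.solve(hess + R, grad)   # Hessian corrected by R
    return x

def run_round(shared, privates, data, R, K):
    """One communication round; only shared components leave the clients."""
    n_shared = shared.size
    uploads = []
    for i, (A, b) in enumerate(data):
        x = np.concatenate([shared, privates[i]])  # assemble the state variable
        x = local_update(x, A, b, R, K)
        privates[i] = x[n_shared:]                 # private part stays local
        uploads.append(x[:n_shared])               # upload module: shared part
    return np.mean(uploads, axis=0)                # server-side averaging

def global_loss(shared, privates, data):
    total = 0.0
    for p, (A, b) in zip(privates, data):
        r = A @ np.concatenate([shared, p]) - b
        total += 0.5 * float(r @ r)
    return total

# Hypothetical setup: 3 clients, 2 shared parameters, 1 private parameter each.
rng = np.random.default_rng(0)
true_shared = np.array([1.0, -2.0])
data, privates = [], []
for _ in range(3):
    A = rng.normal(size=(20, 3))
    true_private = rng.normal(size=1)
    data.append((A, A @ np.concatenate([true_shared, true_private])))
    privates.append(np.zeros(1))

shared = np.zeros(2)
R = 0.1 * np.eye(3)          # assumed positive definite diagonal matrix
for t in range(50):          # termination: round limit or loss threshold
    shared = run_round(shared, privates, data, R, K=3)
    final_loss = global_loss(shared, privates, data)
    if final_loss < 1e-12:
        break

print("rounds used:", t + 1, "recovered shared component:", shared)
```

The decoupling appears in the slicing `x[:n_shared]` and `x[n_shared:]`: the server only ever sees the first slice, while each client's private slice is carried over to the next round.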

[0106] It should be noted that the above modules correspond to the steps described in Embodiment 1. The examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should also be noted that the above modules, as part of the system, can be executed in a computer system, for example as a set of computer-executable instructions.

[0107] In further embodiments, the following is also provided:

[0108] An electronic device includes a memory and a processor, as well as computer instructions stored in the memory and running on the processor, wherein the computer instructions, when executed by the processor, perform the method described in Embodiment 1. For brevity, further details are omitted here.

[0109] It should be understood that in this embodiment, the processor can be a central processing unit (CPU), or it can be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor can be a microprocessor or any conventional processor.

[0110] Memory may include read-only memory and random access memory, and provides instructions and data to the processor. A portion of memory may also include non-volatile random access memory. For example, memory may also store information about the device type.

[0111] A computer-readable storage medium for storing computer instructions, which, when executed by a processor, perform the method described in Embodiment 1.

[0112] The method in Embodiment 1 can be implemented directly by a hardware processor, or by a combination of hardware and software modules within the processor. The software modules can reside in storage media that are well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory; the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method. To avoid repetition, a detailed description is not provided here.

[0113] A computer program product includes a computer program that, when executed by a processor, implements the method described in Embodiment 1.

[0114] The present invention also provides at least one computer program product tangibly stored on a non-transitory computer-readable storage medium. The computer program product includes computer-executable instructions, such as instructions included in program modules, which execute in a device on a target real or virtual processor to perform the processes / methods described above. Typically, program modules include routines, programs, libraries, objects, classes, components, data structures, etc., that perform specific tasks or implement specific abstract data types. In various embodiments, the functionality of program modules can be combined or divided among program modules as needed. The machine-executable instructions for the program modules can execute within a local or distributed device. In a distributed device, the program modules can reside in both local and remote storage media.

[0115] The computer program code used to implement the methods of the present invention may be written in one or more programming languages. This computer program code may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, such that when executed by the computer or other programmable data processing device, the program code causes the functions / operations specified in the flowcharts and / or block diagrams to be implemented. The program code may be executed entirely on a computer, partially on a computer, as a stand-alone software package, partially on a computer and partially on a remote computer, or entirely on a remote computer or server.

[0116] In the context of this invention, computer program code or related data may be carried by any suitable carrier to enable a device, apparatus, or processor to perform the various processes and operations described above. Examples of carriers include signals, computer-readable media, and the like. Examples of signals may include electrical, optical, radio, sound, or other forms of propagation signals, such as carrier waves, infrared signals, etc.

[0117] Those skilled in the art will recognize that the units and algorithm steps described in connection with the various examples of this embodiment can be implemented in electronic hardware or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this invention.

[0118] It should be noted that all data acquisition is conducted in accordance with laws and regulations and with user consent, and the data is used legally.

[0119] While the specific embodiments of the present invention have been described above in conjunction with the accompanying drawings, this is not intended to limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications or variations that can be made without creative effort on the basis of the technical solutions of the present invention still fall within the scope of protection of the present invention.

Claims

1. A privacy-preserving distributed optimization method based on optimal control, characterized in that it is applied to a distributed system comprising a server and multiple clients, and includes:
the server initializes the global model parameters and sets the total number of communication rounds and the number of local iterations;
for the cryo-electron microscopy image alignment task, the model parameters of the cryo-electron microscopy image alignment model are decoupled, and the model parameters of each client are divided into local private components and globally shared components;
at the start of each round of communication between the client and the server, the server broadcasts the current globally shared component to all clients;
the client receives the globally shared component broadcast by the server, combines it with the local private component retained from the previous training round, and assembles the current state variable;
all clients execute the local update process in parallel, the local update process including: calculating the gradient and the Hessian matrix based on the current state variable, correcting the Hessian matrix according to the regularization matrix, and then solving for the updated model parameters using the constructed optimal control model; wherein the optimal control model is constructed by taking the model parameters as state variables and the update step size as the control input, and by formulating the cryo-electron microscopy image alignment task as a nonlinear least squares optimization problem, with the goal of minimizing the local least squares loss function;
after completing K local updates, the client separates the updated globally shared component and uploads it to the server;
the server aggregates the globally shared components from all clients to generate the global model for the next round, until the termination condition is met, at which point the optimal cryo-electron microscopy image alignment model is obtained;
wherein the cost function $J_i^t$ of the optimal control model of the $i$-th client in the $t$-th round of communication is:
$$J_i^t = \sum_{k=0}^{K-1}\left[ f_i\!\left(x_{i,k}^t\right) + \tfrac{1}{2}\left(u_{i,k}^t\right)^{\top} R\, u_{i,k}^t \right] + f_i\!\left(x_{i,K}^t\right);$$
where $f_i$ is the least squares loss function; $u_{i,k}^t$ is the control input to be solved for the $i$-th client in the $k$-th local update of the $t$-th round of communication; $R$ is a pre-defined positive definite diagonal matrix; $x_{i,K}^t$ is the model parameter of the $i$-th client after the $K$-th local update in the $t$-th round of communication; $K$ is the maximum number of iterations for local updates; $x_{i,k}^t$ is the model parameter of the $i$-th client after the $k$-th local update in the $t$-th round of communication; and the process of correcting the Hessian matrix according to the regularization matrix comprises adding the positive definite diagonal matrix to the original Hessian matrix.

2. The privacy-preserving distributed optimization method based on optimal control as described in claim 1, characterized in that the local update process includes recursively calculating approximate control inputs to update the model parameters:
$$x_{i,k+1}^t = x_{i,k}^t - g_{i,k}^t\!\left(x_{i,k}^t\right);$$
$$g_{i,k}^t\!\left(x_{i,k}^t\right) = \left(R + \nabla^2 f_i\!\left(x_{i,k}^t\right)\right)^{-1}\left[\nabla f_i\!\left(x_{i,k}^t\right) + R\, g_{i,k-1}^t\!\left(x_{i,k}^t\right)\right];$$
$$g_{i,0}^t\!\left(x_{i,0}^t\right) = \left(R + \nabla^2 f_i\!\left(x_{i,0}^t\right)\right)^{-1}\nabla f_i\!\left(x_{i,0}^t\right);$$
where $x_{i,k}^t$ is the model parameter of the $i$-th client after the $k$-th local update in the $t$-th round of communication; $x_{i,k+1}^t$ is the model parameter of the $i$-th client after the $(k+1)$-th local update in the $t$-th round of communication; $g_{i,k}^t$ is the $k$-th approximate control input obtained by recursive calculation for the $i$-th client in the $k$-th local update of the $t$-th round of communication; $g_{i,k-1}^t$ is the $(k-1)$-th approximate control input obtained by recursive calculation for the $i$-th client in the $t$-th round of communication; $g_{i,0}^t$ is the initial value of the recursion for the $i$-th client in the $t$-th round of communication; $x_{i,0}^t$ is the initial model parameter of the $i$-th client in the $t$-th round of communication; $R$ is a pre-defined positive definite diagonal matrix; and $f_i$ is the least squares loss function.

3. The privacy-preserving distributed optimization method based on optimal control as described in claim 1, characterized in that, after completing $K$ local updates, the client separates the updated globally shared component $w_{i,K}^t$ and uploads it to the server, and the server uses an average aggregation strategy to generate the globally shared component for the next round:
$$w^{t+1} = \frac{1}{n}\sum_{i=1}^{n} w_{i,K}^t;$$
where $n$ represents the number of clients.

4. The privacy-preserving distributed optimization method based on optimal control as described in claim 1, characterized in that the termination condition is that a preset number of communication rounds is reached or that the global loss function converges to a preset threshold.

5. A privacy-preserving distributed optimization system based on optimal control, characterized in that it is used to implement the privacy-preserving distributed optimization method based on optimal control as described in any one of claims 1-4, is applied to a client, and includes:
a decoupling module, configured to divide the model parameters of each client into local private components and globally shared components, wherein the model parameters are the model parameters of the cryo-electron microscopy image alignment model;
a modeling module, configured to construct an optimal control model with the model parameters as state variables, the update step size as the control input, and minimization of the local least squares loss function as the objective, wherein the cryo-electron microscopy image alignment task is formulated as a nonlinear least squares optimization problem;
an update module, configured to, at the beginning of each round of communication between the client and the server, have the client receive the globally shared component broadcast by the server, combine it with the local private component, assemble the current state variable, perform a local update, calculate the gradient and the Hessian matrix based on the current state variable, correct the Hessian matrix according to the regularization matrix, and then use the optimal control model to solve for the updated model parameters;
an upload module, configured to upload the globally shared component of the updated model parameters to the server, so that the server performs an aggregation operation to generate a global model until the termination condition is met, whereupon the optimal cryo-electron microscopy image alignment model is obtained.

6. An electronic device, characterized in that it includes a memory and a processor, as well as computer instructions stored in the memory and run on the processor, which, when executed by the processor, perform the method according to any one of claims 1-4.

7. A computer-readable storage medium, characterized in that it is used to store computer instructions which, when executed by a processor, perform the method described in any one of claims 1-4.

8. A computer program product, characterized in that it includes a computer program which, when executed by a processor, implements the method described in any one of claims 1-4.