Variance reduction technology-based distributed projection method considering communication delay

A distributed projection algorithm technology, applied in the field of intelligent communication, which addresses the problems of heavy gradient computation, low computational and communication efficiency in multi-agent systems, and the heavy computational burden placed on individual agents.

Pending Publication Date: 2020-12-11
SOUTHWEST UNIVERSITY

AI-Extracted Technical Summary

Problems solved by technology

However, when the existing distributed optimization algorithms face large-scale convex optimization problems with relatively complex local constraints, the amount of gradient calculation is large, and the c...

Method used

Step 3: propose a distributed projection algorithm (3) based on variance reduction technology to solve the constrained convex optimization problem model (2); that is, adopt a local stochastic ...

Abstract

The invention discloses a variance reduction technology-based distributed projection method considering communication time delay. The method comprises the following steps: 1, proposing an original optimization problem model (1) for a multi-agent system with both local set constraints and local equation constraints; 2, equivalently converting the original optimization problem model (1) obtained in step 1 into a convex optimization problem model (2) convenient for distributed processing; 3, proposing a variance reduction technology-based distributed projection algorithm (3) to solve the constrained convex optimization problem model (2), i.e., estimating the local full gradient in an unbiased manner by adopting a local stochastic average gradient, so as to reduce the heavy calculation burden caused by computing the full gradients of all local objective functions at each iteration; and 4, performing convergence analysis. According to the invention, the calculation cost of all agents in the network can be greatly reduced, so that the communication and calculation pressure of the whole multi-agent system is reduced, and the practicability is relatively high.

Application Domain

Geometric CAD, Design optimisation/simulation, +2

Technology Topic

Optimization problem, Engineering, +5


Examples

  • Experimental program (2)

Example Embodiment

[0100] The specific embodiment of the invention is as follows:
[0101] A distributed projection method based on variance reduction technology considering communication delay includes the following steps:
[0102] 1, putting forward a primal optimization problem model (1) for a multi-agent system with both local set constraints and local equality constraints;
[0103] 2, equivalently converting the original optimization problem model (1) obtained in step 1 into a convex optimization problem model (2) which is convenient for distributed processing;
[0104] 3, proposing a distributed projection algorithm (3) based on variance reduction technology to solve the constrained convex optimization problem model (2); that is, using the local stochastic average gradient to estimate the local full gradient without bias, so as to reduce the heavy computational burden caused by computing the full gradient of all local objective functions at each iteration;
[0105] 4, analyzing the convergence of the distributed projection algorithm (3) based on variance reduction technology proposed in step 3;
[0106] The concrete construction process and form of the original optimization problem model (1) in step 1 are as follows:
[0107] Firstly, define an agent set V = {1, ..., m}, a communication-network edge set E, and an adjacency matrix A. The communication network is the undirected graph G based on V, E and A, and the simple network G has no self-loops;
[0108] When (i, j) ∈ E, a_ij = a_ji > 0; otherwise a_ij = a_ji = 0;
[0109] The degree of agent i is expressed as d_i = Σ_{j=1}^{m} a_ij;
[0110] For the degree matrix D = diag{d_1, d_2, ..., d_m}, the Laplacian matrix of the undirected network G is defined as L = D - A;
[0111] If the undirected network G is connected, then the Laplacian matrix L is symmetric and positive semi-definite;
[0112] Secondly, the specific form of the original optimization problem model (1) is as follows
[0113]
[0114] In the above formula, the objective function f_i represents the samples to be processed in a practical problem, x denotes the decision vector, and q_i denotes the total number of local samples assigned to agent i;
[0115] At the same time, the local objective function is further decomposed as f_i = Σ_{h=1}^{q_i} f_i^h, where f_i^h, h ∈ {1, ..., q_i}, is the h-th sub-function of the local objective function;
[0116] Based on the above formula, define each X_i as a closed convex set whose intersection X is non-empty, and define a column-full-rank matrix B_i and a vector b_i; the optimal solution of the constrained convex optimization problem (1) is denoted x*;
[0117] The specific form of the convex optimization problem model (2) in step 2 is as follows:
[0118]
[0119] where x_i is agent i's estimate of the decision vector;
[0120] The matrix B is defined as a full-rank block-diagonal matrix with diagonal blocks {B_1, ..., B_m}, that is, B = diag{B_1, ..., B_m};
[0121] Stack the local vectors b_i into b = [(b_1)^T, ..., (b_m)^T]^T;
[0122] Let X = X_1 × ... × X_m denote the Cartesian product;
[0123] Let
[0124] The maximum and minimum values of q_i are denoted q_max and q_min respectively (where q_min ≥ 1, that is, each agent processes at least one sample);
[0125] According to the above, λ_min(B^T B) q_min > 0;
[0126] Based on the above convex optimization problem model (2), the following assumptions and definitions are made:
[0127] Assumption 1: Each local sub-objective function f_i^h is strongly convex and has a Lipschitz continuous gradient; that is, for all i ∈ V and h ∈ {1, ..., q_i}, the following hold:
[0128]
[0129]
[0130] where 0 < μ ≤ L_f;
[0131] Then, under Assumption 1, the global optimal solution of the constrained convex optimization problem (2) is unique and is denoted x*;
[0132] Assumption 2: The undirected network G is connected;
[0133] Assumption 3: For all i, j ∈ V and every iteration k, the communication delay is bounded; that is, there exists
[0134]
[0135] where b_0 is a positive integer.
[0136] Definition 1: Define global vectors collecting the local variables x_{i,k}, y_{i,k}, w_{i,k} and g_{i,k} as follows:
[0137]
[0138]
[0139]
[0140]
[0141]
[0142] and the delayed versions x_k[i] and w_k[i] of the global vectors x_k and w_k:
[0143]
[0144]
[0145] Then, at the k-th iteration, the communication delay between agents i, j ∈ V is determined by agents i and j simultaneously; therefore, the global delayed vectors x_k[i] and w_k[i] are held only by agent i.
[0146] The specific iterative process of the distributed projection algorithm (3) based on variance reduction technology in step 3 is as follows:
[0147] Initialization: for all agents i ∈ V, initialize x_{i,0},
[0148] Set k = 0
[0149] For agent i = 1, ..., m, execute
[0150] 1: Randomly select a sample from the set {1, ..., q_i};
[0151] 2: Calculate the local stochastic average gradient as follows
[0152]
[0153] 3: Update and store the gradient of the selected sample;
[0154] 4: Update the variable x_{i,k+1} as follows
[0155]
[0156] 5: Update the variable y_{i,k+1} as follows
[0157] y_{i,k+1} = y_{i,k} + B_i x_{i,k+1} - b_i
[0158] 6: Update the variable w_{i,k+1} as follows
[0159] w_{i,k+1} = w_{i,k} + β x_{i,k+1}
[0160] End of loop
[0161] Set k = k+1, and repeat the above cycle until the stop condition is met;
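The loop above can be sketched for a single agent as follows. Since the exact form of the x-update formula [0155] is not shown above, the projected step below (a projected move along the variance-reduced gradient plus the multiplier term B_i^T y) is only a placeholder assumption; the y and w updates follow [0157] and [0159], and `grad_fn`, `project`, and all parameter values are illustrative.

```python
import numpy as np

def agent_iteration(x, y, w, table, grad_fn, q, B_i, b_i, alpha, beta, project, rng):
    """One loop body for a single agent i (communication/delay terms omitted)."""
    # Step 1: randomly select a sample from {1, ..., q_i}
    h = rng.integers(q)
    # Step 2: local stochastic average gradient (unbiased estimate of the local full gradient)
    g = grad_fn(h, x) - table[h] + table.mean(axis=0)
    # Step 3: update and store the selected sample's gradient
    table[h] = grad_fn(h, x)
    # Step 4 (placeholder form, not the patent's exact formula):
    # projected step along g plus the equality-constraint multiplier term
    x_new = project(x - alpha * (g + B_i.T @ y))
    # Step 5: y_{i,k+1} = y_{i,k} + B_i x_{i,k+1} - b_i
    y_new = y + B_i @ x_new - b_i
    # Step 6: w_{i,k+1} = w_{i,k} + beta * x_{i,k+1}
    w_new = w + beta * x_new
    return x_new, y_new, w_new
```

Because the x-update projects onto the local set, every iterate automatically satisfies the local set constraint X_i.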
[0162] wherein the stored quantity is the value of the sub-function f_i^h, h ∈ {1, ..., q_i}, of the local objective function at the k-th iteration, and R^n represents the set of n-dimensional real column vectors.
[0163] Its iteration rule is as follows:
[0164]
[0165] At iteration k, for agent i, the local stochastic average gradient is defined as:
[0166]
[0167] where the following iteration can be used for its calculation:
[0168]
[0169] Let F_k denote the σ-algebra generated by the local stochastic average gradients up to iteration k; then the following equation can be obtained:
[0170]
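The key property of the estimator in step 3, that its conditional expectation equals the local full (average) gradient, can be checked numerically. The quadratic sub-functions below are illustrative stand-ins (not the patent's samples), and the stored-gradient table is deliberately stale:

```python
import numpy as np

rng = np.random.default_rng(0)
q, n = 8, 5
A = rng.standard_normal((q, n))
b = rng.standard_normal(q)

def grad(h, x):
    # gradient of the stand-in sub-function f^h(x) = 0.5 * (a_h^T x - b_h)^2
    return A[h] * (A[h] @ x - b[h])

x = rng.standard_normal(n)
# stale stored gradients, as left over from earlier iterations
table = np.stack([grad(h, rng.standard_normal(n)) for h in range(q)])

def saga_estimate(h, x, table):
    # local stochastic average gradient: fresh gradient of the chosen sample,
    # minus its stored copy, plus the average of the stored table
    return grad(h, x) - table[h] + table.mean(axis=0)

# Averaging over all equally likely sample choices h recovers the local
# full (average) gradient, i.e. the estimator is conditionally unbiased.
expected = np.mean([saga_estimate(h, x, table) for h in range(q)], axis=0)
full = np.mean([grad(h, x) for h in range(q)], axis=0)
assert np.allclose(expected, full)
```

Note that only one fresh sample gradient is computed per iteration, which is exactly the calculation saving the abstract claims.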
[0171] The convergence analysis process in step 4 is as follows:
[0172] First of all, in the practical application process, this embodiment adopts the following seven lemmas in the convergence analysis:
Lemma 1: For any non-empty closed convex set X, the following two inequalities hold:
[0173]
[0174]
[0175] where P_X[·] is the projection operator onto X;
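The two inequalities of Lemma 1 are not reproduced above; assuming they are the standard projection properties (non-expansiveness of P_X and the variational inequality at the projected point), they can be checked numerically for a box-shaped X, which is also the local set constraint used in the application example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
proj = lambda v: np.clip(v, -1.0, 1.0)  # P_X for the box X = [-1, 1]^n

u = 3.0 * rng.standard_normal(n)
v = 3.0 * rng.standard_normal(n)
x = rng.uniform(-1.0, 1.0, n)           # an arbitrary point of X

# Non-expansiveness: ||P_X[u] - P_X[v]|| <= ||u - v||
assert np.linalg.norm(proj(u) - proj(v)) <= np.linalg.norm(u - v) + 1e-12

# Variational inequality: (v - P_X[v])^T (x - P_X[v]) <= 0 for every x in X
assert (v - proj(v)) @ (x - proj(v)) <= 1e-12
```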
[0176] Lemma 2: Under Assumption 1, the global optimal solution of the constrained convex optimization problem (2) exists and is unique, and satisfies:
[0177]
[0178] where the constant step size α > 0 and the parameter β > 0;
[0179] Lemma 3: Under Assumptions 1-2, consider the sequences generated by the variance-reduction-based distributed projection algorithm (3) and {g_k}_{k≥0}; then we have
[0180]
[0181] where the auxiliary sequence {p_k}_{k≥0} is defined as:
[0182]
[0183] The sequence {p_k}_{k≥0} is non-negative under Assumption 1;
[0184] Lemma 4: Considering the variance-reduction-based distributed projection algorithm (3) and the sequence (13) under Assumption 1, we have
[0185]
[0186] Lemma 5: Under Assumption 3, consider the global vector v_k = [(v_{1,k})^T, ..., (v_{m,k})^T]^T and its delayed version v_k[i]; then:
[0187]
[0188] where, for a given sequence {v_t}_{t≥0}, we define
[0189]
[0190] Here, l and d are two non-negative scalars; then, summing over k from 0 to n, we obtain:
[0191]
[0192] Lemma 6: Under Assumptions 1-3, for the variance-reduction-based distributed projection algorithm (3), the following inequality holds:
[0193]
[0194] where W = I - αL, and φ and η are positive constants;
[0195] The concrete proof process of the above conclusion is as follows:
[0196] According to definition 1, we give the shorthand form of distributed projection algorithm (3) based on variance reduction technology as follows:
[0197]
[0198] y_{i,k+1} = y_{i,k} + B_i x_{i,k+1} - b_i (9b)
[0199] w_{i,k+1} = w_{i,k} + β x_{i,k+1} (9c)
[0200] where v_{i,k} is defined as follows:
[0201]
[0202] According to (9a), we have:
[0203]
[0204]
[0205] Among them, the inequality uses the following formula:
[0206] (i) Note that x_{k+1} = P_X[v_k]; then, according to Lemma 1, the following holds:
[0207]
[0208] among and
[0209] (ii) Similar to [12], we have
[0210]
[0211] Next, continue the analysis.
[0212]
[0213] where η is a positive constant; the first inequality applies Young's inequality, and the second uses the fact that the function f is strongly convex with Lipschitz continuous gradient. Substituting the result of (27) into (24) gives:
[0214]
[0215] Next, we process the term 2α(x_{k+1} - x*)^T B^T B (x_{k+1} - x_k):
[0216]
[0217] Substitute the result of formula (29) into formula (28), and take the expectation to get:
[0218]
[0219]
[0220] According to formula (8), the estimator is conditionally unbiased; therefore, we process the corresponding term as follows:
[0221]
[0222] where p_k is defined in (13); the first equation in (31) uses the standard variance decomposition E[||a - E[a|F_k]||^2 | F_k] = E[||a||^2 | F_k] - ||E[a|F_k]||^2, and the inequality uses the strong convexity of f and the Lipschitz continuity of its gradient. Next, substituting the conclusion of (31) into (30) gives:
[0223]
[0224] Next, we introduce an important relation, in which V is a positive semi-definite matrix. According to this relation, we obtain the following three formulas:
[0225]
[0226] Finally, substitute the result of formula (33) into formula (32).
[0227] Lemma 7: Under Assumption 3, the following two inequalities hold:
[0228]
[0229]
[0230] where ξ_1, ξ_2 are two arbitrary positive constants; it is worth noting that once the undirected network is determined, the corresponding constant is determined accordingly.
[0231] The specific proof process is as follows:
[0232] Let's first prove (19a) in Lemma 7.
[0233]
[0234] The second inequality uses Lemma 5, the last inequality uses Young's inequality, and ξ_1 is a positive constant; the proof of (19b) is similar to that of (19a) and is not repeated here;
[0235] Secondly, for the convenience of analysis, the following definitions are made:
[0236] Definition 2: For 0 < α < 1/λ_max(L), define a positive semi-definite matrix P as:
[0237]
[0238] where W = I - αL is a positive definite matrix, so:
[0239]
[0240] where the vector u_k = [(x_k)^T, (y_k)^T, (w_k)^T]^T and u* = [(x*)^T, (y*)^T, (w*)^T]^T;
[0241] Then the following conclusions can be obtained by combining assumptions 1-3 and definitions 1-2:
[0242] Under Assumptions 1-3, consider the variance-reduction-based distributed projection algorithm (3) and u_k, u* in Definition 2; if the parameters η, φ and ξ satisfy:
[0243]
[0244] 0 < φ < 2μ (21b)
[0245]
[0246] and the constant step size α and the algorithm parameter β satisfy:
[0247]
[0248]
[0249] then the sequence {u_k}_{k≥0} is bounded and convergent, and the sequence {x_k}_{k≥0} converges to the unique optimal solution x*.
[0250] The specific proof process is as follows:
[0251] For α > 0 and β > 0, substituting the result of Lemma 7 into Lemma 6 gives:
[0252]
[0253]
[0254] where the relevant constant is defined in Lemma 6. Next, according to Lemma 4, we add c(E[p_{k+1}|F_k] - p_k) to both sides of (35)
[0255] to obtain:
[0256]
[0257] According to Lemma 3, the sequence p_k ≥ 0; therefore, if η > 2L_f[L_f q_max + q_min(L_f - μ)]/(λ_min(B^T B) q_min) and 4α q_max L_f/η ≤ c, then (36) can be rewritten as:
[0258]
[0259] According to Definition 2, if 0 < α < 1/λ_max(L) and 0 < β < 1, we have
[0260]
[0261] To deal with the first term on the right-hand side of the inequality in (38), we set ξ_1 = ξ_2 = ξ, 0 < φ < 2μ, and 0 < ξ; based on this, we can rewrite formula (38) as:
[0262]
[0263] Summing (39) over k from 0 to n gives:
[0264]
[0265] Under conditions (21) and (22), we define a positive semi-definite matrix:
[0266]
[0267] Therefore, inequality (40) can be rewritten as:
[0268]
[0269] When n approaches infinity, we have
[0270]
[0271] This shows that the right-hand side of formula (39) is summable. Therefore, the sequence {u_k}_{k≥0} is quasi-Fejér monotone with respect to the inner product induced by P, so it is bounded and convergent; finally, the sequence {x_k}_{k≥0} converges to x*, and under Assumption 1 the global optimal solution x* is unique.

Example

[0272] Application example 1
[0273] To demonstrate the effectiveness of the proposed algorithm, we consider a multi-agent network with m = 10 agents solving the following least-squares optimization problem:
[0274]
[0275] where the abscissa (in the figures) indicates one computation over all samples. We set n = 10, p_i = 1, and the total number of samples q = 1000; the samples are randomly and evenly distributed among the agents in the network, so each agent i ∈ V needs to process q_i = q/m samples; the local parameters are randomly selected from [-1, 1] and [-n, n] respectively; the equality constraint is defined such that the j-th element of B_i is 1 when j = i and 0 otherwise, and b_i is the constant 1; the local set constraint of agent i is defined as X_i = [-1_n, 1_n], where 1_n represents a column vector whose n dimensions are all 1s.
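The problem data of this example can be generated as follows. The array names `C` and `d` for the local sample parameters, and the quadratic form of the sub-functions, are assumptions (the patent's own symbols and least-squares formula are not reproduced above):

```python
import numpy as np

rng = np.random.default_rng(2020)
m, n, q = 10, 10, 1000          # agents, decision dimension, total samples
qi = q // m                     # q_i = q/m = 100 samples per agent

# Local least-squares sample parameters, drawn from [-1, 1] and [-n, n]
# (hypothetical names C, d; shapes follow from p_i = 1)
C = rng.uniform(-1.0, 1.0, (m, qi, n))
d = rng.uniform(-float(n), float(n), (m, qi))

# Equality constraint B_i x = b_i: B_i is 1 x n with j-th entry 1 iff j = i
B = np.zeros((m, 1, n))
for i in range(m):
    B[i, 0, i] = 1.0
b = np.ones((m, 1))

# Local set constraint X_i = [-1_n, 1_n]
project = lambda v: np.clip(v, -1.0, 1.0)

def grad(i, h, x):
    # gradient of the assumed sub-function f_i^h(x) = 0.5 * (C[i,h]^T x - d[i,h])^2
    return C[i, h] * (C[i, h] @ x - d[i, h])
```

With this data, each agent holds q_i = 100 sub-functions, so a full local gradient would cost 100 sample-gradient evaluations per iteration, while the variance-reduced estimator costs one.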
[0276] The application network and the results of the above embodiment are shown in Figures 1-4, specifically:
[0277] Figure 1 shows the communication network used in the experiment, where the connectivity rate of the network is 0.5;
[0278] Figure 2 is a performance comparison between the algorithm of the present invention and a prior-art algorithm, where the prior-art algorithm adopts the method disclosed in Q. Liu, S. Yang, and Y. Hong, "Constrained consensus algorithms with fixed step size for distributed convex optimization over multi-agent networks," IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 4259-4265, 2017. It can be clearly seen from Figure 2 that the algorithm proposed by the invention has the best performance, i.e., the fastest convergence rate;
[0279] Figure 3 shows the transient behavior of agents 2, 4, 6, 8 and 10 under the present invention without communication delay;
[0280] Figure 4 shows the transient behavior of agents 2, 4, 6, 8 and 10 under the present invention with communication delay (where the maximum communication delay per iteration is 10);
[0281] Combining Figures 3 and 4, it can be seen that communication delay has a great influence on the transient behavior of the agents.

