TY - JOUR
T1 - Push-Pull Gradient Methods for Distributed Optimization in Networks
AU - Pu, Shi
AU - Shi, Wei
AU - Xu, Jinming
AU - Nedic, Angelia
N1 - Funding Information:
Manuscript received August 20, 2019; accepted January 31, 2020. Date of publication February 10, 2020; date of current version December 24, 2020. This work was supported in part by the NSF under Grant CCF-1717391, in part by the ONR under Grant N000141612245, and in part by the SRIBD Research Startup Fund under Grant J00120190011. This paper was presented in part at the Proceedings of the 57th IEEE Conference on Decision and Control, Miami Beach, FL, USA, December 2018 [1]. Recommended by Associate Editor S. Grammatico. (Shi Pu and Wei Shi contributed equally to this work.) (Corresponding authors: Shi Pu; Jinming Xu.) Shi Pu is with the School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen 518172, China (e-mail: pushi@cuhk.edu.cn).
Publisher Copyright:
© 2020 IEEE.
PY - 2021/1
Y1 - 2021/1
N2 - In this article, we focus on solving a distributed convex optimization problem in a network, where each agent has its own convex cost function and the goal is to minimize the sum of the agents' cost functions while obeying the network connectivity structure. In order to minimize the sum of the cost functions, we consider new distributed gradient-based methods where each node maintains two estimates, namely an estimate of the optimal decision variable and an estimate of the gradient for the average of the agents' objective functions. From the viewpoint of an agent, the information about the gradients is pushed to the neighbors, whereas the information about the decision variable is pulled from the neighbors, hence giving the name 'push-pull gradient methods.' The methods utilize two different graphs for the information exchange among agents and, as such, unify the algorithms with different types of distributed architecture, including decentralized (peer-to-peer), centralized (master-slave), and semicentralized (leader-follower) architectures. We show that the proposed algorithms and their many variants converge linearly for strongly convex and smooth objective functions over a network (possibly with unidirectional data links) in both synchronous and asynchronous random-gossip settings. In particular, under the random-gossip setting, 'push-pull' is the first class of algorithms for distributed optimization over directed graphs. Moreover, we numerically evaluate our proposed algorithms in both scenarios, and show that they outperform other existing linearly convergent schemes, especially for ill-conditioned problems and networks that are not well balanced.
AB - In this article, we focus on solving a distributed convex optimization problem in a network, where each agent has its own convex cost function and the goal is to minimize the sum of the agents' cost functions while obeying the network connectivity structure. In order to minimize the sum of the cost functions, we consider new distributed gradient-based methods where each node maintains two estimates, namely an estimate of the optimal decision variable and an estimate of the gradient for the average of the agents' objective functions. From the viewpoint of an agent, the information about the gradients is pushed to the neighbors, whereas the information about the decision variable is pulled from the neighbors, hence giving the name 'push-pull gradient methods.' The methods utilize two different graphs for the information exchange among agents and, as such, unify the algorithms with different types of distributed architecture, including decentralized (peer-to-peer), centralized (master-slave), and semicentralized (leader-follower) architectures. We show that the proposed algorithms and their many variants converge linearly for strongly convex and smooth objective functions over a network (possibly with unidirectional data links) in both synchronous and asynchronous random-gossip settings. In particular, under the random-gossip setting, 'push-pull' is the first class of algorithms for distributed optimization over directed graphs. Moreover, we numerically evaluate our proposed algorithms in both scenarios, and show that they outperform other existing linearly convergent schemes, especially for ill-conditioned problems and networks that are not well balanced.
KW - Convex optimization
KW - directed graph
KW - distributed optimization
KW - linear convergence
KW - network structure
KW - random-gossip algorithm
KW - spanning tree
UR - http://www.scopus.com/inward/record.url?scp=85098326167&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098326167&partnerID=8YFLogxK
U2 - 10.1109/TAC.2020.2972824
DO - 10.1109/TAC.2020.2972824
M3 - Article
AN - SCOPUS:85098326167
SN - 0018-9286
VL - 66
SP - 1
EP - 16
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
IS - 1
M1 - 8988200
ER -
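Note: the abstract above outlines a two-graph gradient-tracking scheme in which decision estimates are pulled over one graph and gradient estimates are pushed over another. The following is a minimal NumPy sketch of one synchronous push-pull-style iteration, included only to illustrate that idea; the names R (a row-stochastic mixing matrix used to pull decision estimates), C (a column-stochastic mixing matrix used to push gradient estimates), the step size alpha, and the helper grad(i, x) are assumptions made for this sketch, not the authors' implementation.

import numpy as np

def push_pull_step(X, Y, R, C, grad, alpha):
    # X: (n, d) array; row i is agent i's estimate of the decision variable.
    # Y: (n, d) array; row i is agent i's estimate of the average gradient.
    n = X.shape[0]
    X_new = R @ X - alpha * Y  # pull neighbors' decision estimates, then take a gradient-tracking step
    G_old = np.vstack([grad(i, X[i]) for i in range(n)])
    G_new = np.vstack([grad(i, X_new[i]) for i in range(n)])
    Y_new = C @ Y + G_new - G_old  # push gradient information and track the average gradient
    return X_new, Y_new

Repeating this step is intended to drive every row of X toward the minimizer of the sum of the agents' cost functions, assuming a sufficiently small step size and graphs satisfying suitable connectivity conditions.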