TY - GEN
T1 - Curvature-aided incremental aggregated gradient method
AU - Wai, Hoi To
AU - Shi, Wei
AU - Nedich, Angelia
AU - Scaglione, Anna
N1 - Funding Information:
This work is supported by NSF CCF-BSF 1714672.
Publisher Copyright:
© 2017 IEEE.
PY - 2018/1/17
Y1 - 2018/1/17
N2 - We propose a new algorithm for finite sum optimization which we call the curvature-aided incremental aggregated gradient (CIAG) method. Motivated by the problem of training a classifier for a d-dimensional problem, where the number of training data is m and m ≫ d ≫ 1, the CIAG method seeks to accelerate incremental aggregated gradient (IAG) methods with the aid of curvature (or Hessian) information, while avoiding the evaluation of matrix inverses required by the incremental Newton (IN) method. Specifically, our idea is to exploit the incrementally aggregated Hessian matrix to trace the full gradient vector at every incremental step, thereby achieving an improved linear convergence rate over the state-of-the-art IAG methods. For strongly convex problems, the fast linear convergence rate requires the objective function to be close to quadratic, or the initial point to be close to the optimal solution. Importantly, we show that running one iteration of the CIAG method yields the same improvement in the optimality gap as running one iteration of the full gradient method, while the complexity is O(d²) for CIAG and O(md) for the full gradient method. Overall, the CIAG method strikes a balance between the high computational complexity of incremental Newton-type methods and the slow convergence of the IAG method. Our numerical results support the theoretical findings and show that the CIAG method often converges in far fewer iterations than IAG, and requires much shorter running time than IN when the problem dimension is high.
AB - We propose a new algorithm for finite sum optimization which we call the curvature-aided incremental aggregated gradient (CIAG) method. Motivated by the problem of training a classifier for a d-dimensional problem, where the number of training data is m and m ≫ d ≫ 1, the CIAG method seeks to accelerate incremental aggregated gradient (IAG) methods with the aid of curvature (or Hessian) information, while avoiding the evaluation of matrix inverses required by the incremental Newton (IN) method. Specifically, our idea is to exploit the incrementally aggregated Hessian matrix to trace the full gradient vector at every incremental step, thereby achieving an improved linear convergence rate over the state-of-the-art IAG methods. For strongly convex problems, the fast linear convergence rate requires the objective function to be close to quadratic, or the initial point to be close to the optimal solution. Importantly, we show that running one iteration of the CIAG method yields the same improvement in the optimality gap as running one iteration of the full gradient method, while the complexity is O(d²) for CIAG and O(md) for the full gradient method. Overall, the CIAG method strikes a balance between the high computational complexity of incremental Newton-type methods and the slow convergence of the IAG method. Our numerical results support the theoretical findings and show that the CIAG method often converges in far fewer iterations than IAG, and requires much shorter running time than IN when the problem dimension is high.
KW - Newton method
KW - empirical risk minimization
KW - incremental gradient method
KW - linear convergence
UR - http://www.scopus.com/inward/record.url?scp=85047912758&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047912758&partnerID=8YFLogxK
U2 - 10.1109/ALLERTON.2017.8262782
DO - 10.1109/ALLERTON.2017.8262782
M3 - Conference contribution
AN - SCOPUS:85047912758
T3 - 55th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2017
SP - 526
EP - 532
BT - 55th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 55th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2017
Y2 - 3 October 2017 through 6 October 2017
ER -