On stochastic and deterministic quasi-Newton methods for nonstrongly convex optimization: Asymptotic convergence and rate analysis

Farzad Yousefian; Angelia Nedić; Uday V. Shanbhag

doi:10.1137/17M1152474

On stochastic and deterministic quasi-Newton methods for nonstrongly convex optimization: Asymptotic convergence and rate analysis

Farzad Yousefian, Angelia Nedić, Uday V. Shanbhag

Electrical Engineering

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Motivated by applications arising from large-scale optimization and machine learning, we consider stochastic quasi-Newton (SQN) methods for solving unconstrained convex optimization problems. Much of the convergence analysis of SQN methods, in both full and limited-memory regimes, requires the objective function to be strongly convex. However, this assumption is fairly restrictive and does not hold in many applications. To the best of our knowledge, no rate state- ments currently exist for SQN methods in the absence of such an assumption. Furthermore, among the existing first-order methods for addressing stochastic optimization problems with merely convex objectives, techniques equipped with provable convergence rates employ averaging. However, this averaging technique has a detrimental impact on inducing sparsity. Motivated by these gaps, we consider optimization problems with non-strongly convex objectives with Lipschitz but possibly un- bounded gradients. The main contributions of the paper are as follows: (i) To address large-scale stochastic optimization problems, we develop an iteratively regularized stochastic limited-memory BFGS (IRS-LBFGS) algorithm, where the step size, regularization parameter, and the Hessian in- verse approximation are updated iteratively. We establish convergence of the iterates (with no averaging) to an optimal solution of the original problem both in an almost-sure sense and in a mean sense. The convergence rate is derived in terms of the objective function value and is shown to be O(1/k^(1/3-∈)), where ∈ is an arbitrary small positive scalar. (ii) In deterministic regimes, we show that the algorithm displays a rate O(1/k^1-∈). We present numerical experiments performed on a large-scale text classification problem and compare IRS-LBFGS with standard SQN methods as well as first-order methods such as SAGA and IAG.

Original language	English (US)
Pages (from-to)	1144-1172
Number of pages	29
Journal	SIAM Journal on Optimization
Volume	30
Issue number	2
DOIs	https://doi.org/10.1137/17M1152474
State	Published - 2020

Keywords

Large scale optimization
Quasi-Newton
Regularization
Stochastic optimization

ASJC Scopus subject areas

Software
Theoretical Computer Science
Applied Mathematics

Access to Document

10.1137/17M1152474

Cite this

@article{d8a606395738414b896b4e01692068a8,

title = "On stochastic and deterministic quasi-Newton methods for nonstrongly convex optimization: Asymptotic convergence and rate analysis",

abstract = "Motivated by applications arising from large-scale optimization and machine learning, we consider stochastic quasi-Newton (SQN) methods for solving unconstrained convex optimization problems. Much of the convergence analysis of SQN methods, in both full and limited-memory regimes, requires the objective function to be strongly convex. However, this assumption is fairly restrictive and does not hold in many applications. To the best of our knowledge, no rate state- ments currently exist for SQN methods in the absence of such an assumption. Furthermore, among the existing first-order methods for addressing stochastic optimization problems with merely convex objectives, techniques equipped with provable convergence rates employ averaging. However, this averaging technique has a detrimental impact on inducing sparsity. Motivated by these gaps, we consider optimization problems with non-strongly convex objectives with Lipschitz but possibly un- bounded gradients. The main contributions of the paper are as follows: (i) To address large-scale stochastic optimization problems, we develop an iteratively regularized stochastic limited-memory BFGS (IRS-LBFGS) algorithm, where the step size, regularization parameter, and the Hessian in- verse approximation are updated iteratively. We establish convergence of the iterates (with no averaging) to an optimal solution of the original problem both in an almost-sure sense and in a mean sense. The convergence rate is derived in terms of the objective function value and is shown to be O(1/k(1/3-∈)), where ∈ is an arbitrary small positive scalar. (ii) In deterministic regimes, we show that the algorithm displays a rate O(1/k1-∈). We present numerical experiments performed on a large-scale text classification problem and compare IRS-LBFGS with standard SQN methods as well as first-order methods such as SAGA and IAG.",

keywords = "Large scale optimization, Quasi-Newton, Regularization, Stochastic optimization",

author = "Farzad Yousefian and Angelia Nedi{\'c} and Shanbhag, {Uday V.}",

note = "Publisher Copyright: {\textcopyright} 2020 Society for Industrial and Applied Mathematics.",

year = "2020",

doi = "10.1137/17M1152474",

language = "English (US)",

volume = "30",

pages = "1144--1172",

journal = "SIAM Journal on Optimization",

issn = "1052-6234",

publisher = "Society for Industrial and Applied Mathematics Publications",

number = "2",

}

TY - JOUR

T1 - On stochastic and deterministic quasi-Newton methods for nonstrongly convex optimization

T2 - Asymptotic convergence and rate analysis

AU - Yousefian, Farzad

AU - Nedić, Angelia

AU - Shanbhag, Uday V.

PY - 2020

Y1 - 2020

N2 - Motivated by applications arising from large-scale optimization and machine learning, we consider stochastic quasi-Newton (SQN) methods for solving unconstrained convex optimization problems. Much of the convergence analysis of SQN methods, in both full and limited-memory regimes, requires the objective function to be strongly convex. However, this assumption is fairly restrictive and does not hold in many applications. To the best of our knowledge, no rate state- ments currently exist for SQN methods in the absence of such an assumption. Furthermore, among the existing first-order methods for addressing stochastic optimization problems with merely convex objectives, techniques equipped with provable convergence rates employ averaging. However, this averaging technique has a detrimental impact on inducing sparsity. Motivated by these gaps, we consider optimization problems with non-strongly convex objectives with Lipschitz but possibly un- bounded gradients. The main contributions of the paper are as follows: (i) To address large-scale stochastic optimization problems, we develop an iteratively regularized stochastic limited-memory BFGS (IRS-LBFGS) algorithm, where the step size, regularization parameter, and the Hessian in- verse approximation are updated iteratively. We establish convergence of the iterates (with no averaging) to an optimal solution of the original problem both in an almost-sure sense and in a mean sense. The convergence rate is derived in terms of the objective function value and is shown to be O(1/k(1/3-∈)), where ∈ is an arbitrary small positive scalar. (ii) In deterministic regimes, we show that the algorithm displays a rate O(1/k1-∈). We present numerical experiments performed on a large-scale text classification problem and compare IRS-LBFGS with standard SQN methods as well as first-order methods such as SAGA and IAG.

AB - Motivated by applications arising from large-scale optimization and machine learning, we consider stochastic quasi-Newton (SQN) methods for solving unconstrained convex optimization problems. Much of the convergence analysis of SQN methods, in both full and limited-memory regimes, requires the objective function to be strongly convex. However, this assumption is fairly restrictive and does not hold in many applications. To the best of our knowledge, no rate state- ments currently exist for SQN methods in the absence of such an assumption. Furthermore, among the existing first-order methods for addressing stochastic optimization problems with merely convex objectives, techniques equipped with provable convergence rates employ averaging. However, this averaging technique has a detrimental impact on inducing sparsity. Motivated by these gaps, we consider optimization problems with non-strongly convex objectives with Lipschitz but possibly un- bounded gradients. The main contributions of the paper are as follows: (i) To address large-scale stochastic optimization problems, we develop an iteratively regularized stochastic limited-memory BFGS (IRS-LBFGS) algorithm, where the step size, regularization parameter, and the Hessian in- verse approximation are updated iteratively. We establish convergence of the iterates (with no averaging) to an optimal solution of the original problem both in an almost-sure sense and in a mean sense. The convergence rate is derived in terms of the objective function value and is shown to be O(1/k(1/3-∈)), where ∈ is an arbitrary small positive scalar. (ii) In deterministic regimes, we show that the algorithm displays a rate O(1/k1-∈). We present numerical experiments performed on a large-scale text classification problem and compare IRS-LBFGS with standard SQN methods as well as first-order methods such as SAGA and IAG.

KW - Large scale optimization

KW - Quasi-Newton

KW - Regularization

KW - Stochastic optimization

UR - http://www.scopus.com/inward/record.url?scp=85085251665&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85085251665&partnerID=8YFLogxK

U2 - 10.1137/17M1152474

DO - 10.1137/17M1152474

M3 - Article

AN - SCOPUS:85085251665

SN - 1052-6234

VL - 30

SP - 1144

EP - 1172

JO - SIAM Journal on Optimization

JF - SIAM Journal on Optimization

IS - 2

ER -

On stochastic and deterministic quasi-Newton methods for nonstrongly convex optimization: Asymptotic convergence and rate analysis

Abstract

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this