A new class of incremental gradient methods for least squares problems

Research output: Contribution to journal › Article › peer-review

224 Scopus citations

Abstract

The least mean squares (LMS) method for linear least squares problems differs from the steepest descent method in that it processes data blocks one-by-one, with intermediate adjustment of the parameter vector under optimization. This mode of operation often leads to faster convergence when far from the eventual limit and to slower (sublinear) convergence when close to the optimal solution. We embed both LMS and steepest descent, as well as other intermediate methods, within a one-parameter class of algorithms, and we propose a hybrid class of methods that combine the faster early convergence rate of LMS with the faster ultimate linear convergence rate of steepest descent. These methods are well suited for neural network training problems with large data sets. Furthermore, these methods allow the effective use of scaling based, for example, on diagonal or other approximations of the Hessian matrix.
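To make the contrast in the abstract concrete, the sketch below compares a full steepest-descent step with an LMS-style incremental pass over data blocks for a block-structured linear least squares objective, plus a diagonally scaled variant of the incremental pass. This is a minimal illustration under stated assumptions, not the paper's one-parameter class or its exact update rule: the block data, the step size `alpha`, and the particular diagonal scaling used here are all hypothetical choices made for the example.

```python
# Minimal sketch (not the paper's exact parameterization) comparing full
# steepest descent with LMS-style incremental updates for the objective
# f(x) = (1/2) * sum_i ||A_i x - b_i||^2, where the (A_i, b_i) are data blocks.
import numpy as np

def steepest_descent_step(x, blocks, alpha):
    """One full-gradient step: accumulate the gradients of all blocks, then update once."""
    g = sum(A.T @ (A @ x - b) for A, b in blocks)
    return x - alpha * g

def incremental_step(x, blocks, alpha):
    """One LMS-style pass: adjust x after processing each data block in turn."""
    for A, b in blocks:
        x = x - alpha * A.T @ (A @ x - b)
    return x

def diagonally_scaled_incremental_step(x, blocks, alpha):
    """Incremental pass scaled by the diagonal of sum_i A_i^T A_i, an
    illustrative version of the Hessian-based scaling the abstract mentions."""
    d = sum(np.sum(A * A, axis=0) for A, _ in blocks)  # diag of sum_i A_i^T A_i
    d = np.maximum(d, 1e-12)                            # guard against division by zero
    for A, b in blocks:
        x = x - alpha * (A.T @ (A @ x - b)) / d
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_true = rng.standard_normal(5)
    # Ten data blocks of 20 rows each (illustrative sizes).
    blocks = []
    for _ in range(10):
        A = rng.standard_normal((20, 5))
        blocks.append((A, A @ x_true + 0.01 * rng.standard_normal(20)))
    x = np.zeros(5)
    for _ in range(50):
        x = incremental_step(x, blocks, alpha=1e-3)
    print("incremental estimate error:", np.linalg.norm(x - x_true))
```

The hybrid methods proposed in the paper interpolate between these two endpoints via a single parameter, so that early iterations behave like the incremental pass and later iterations approach the full-gradient step; the exact interpolation is given in the article itself.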

Original language: English (US)
Pages (from-to): 913-926
Number of pages: 14
Journal: SIAM Journal on Optimization
Volume: 7
Issue number: 4
DOIs
State: Published - Nov 1997
Externally published: Yes

Keywords

  • Gradient methods
  • Least squares
  • Neural networks
  • Nonlinear programming

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
