A boundedness result for the direct heuristic dynamic programming

Feng Liu; Jian Sun; Jennie Si; Wentao Guo; Shengwei Mei

doi:10.1016/j.neunet.2012.02.005

Research output: Contribution to journal › Article › peer-review

105 Scopus citations

Abstract

Approximate/adaptive dynamic programming (ADP) has been studied extensively in recent years for its potential scalability to problems with large state and control spaces, including those involving continuous states and continuous controls. The applicability of ADP algorithms, especially the adaptive critic designs, has been demonstrated in several case studies. Direct heuristic dynamic programming (direct HDP) is one of the ADP algorithms inspired by the adaptive critic designs. It has been shown to be applicable to industrial-scale, realistic, and complex control problems. In this paper, we provide a uniform ultimate boundedness (UUB) result for the direct HDP learning controller under mild and intuitive conditions. Using a Lyapunov approach, we show that the estimation errors of the learning parameters, or weights, in the action and critic networks remain UUB. This result provides, for the first time, a useful controller convergence guarantee for the direct HDP design.

Original language	English (US)
Pages (from-to)	229-235
Number of pages	7
Journal	Neural Networks
Volume	32
DOIs	https://doi.org/10.1016/j.neunet.2012.02.005
State	Published - Aug 2012

Keywords

Approximate dynamic programming (ADP)
Direct heuristic dynamic programming (direct HDP)
Lyapunov stability
Uniformly ultimately boundedness (UUB)

ASJC Scopus subject areas

Cognitive Neuroscience
Artificial Intelligence

Cite this

@article{2006208a51c2424ba54e36826c3e4859,
  title = "A boundedness result for the direct heuristic dynamic programming",
  abstract = "Approximate/adaptive dynamic programming (ADP) has been studied extensively in recent years for its potential scalability to solve large state and control space problems, including those involving continuous states and continuous controls. The applicability of ADP algorithms, especially the adaptive critic designs has been demonstrated in several case studies. Direct heuristic dynamic programming (direct HDP) is one of the ADP algorithms inspired by the adaptive critic designs. It has been shown applicable to industrial scale, realistic and complex control problems. In this paper, we provide a uniformly ultimately boundedness (UUB) result for the direct HDP learning controller under mild and intuitive conditions. By using a Lyapunov approach we show that the estimation errors of the learning parameters or the weights in the action and critic networks remain UUB. This result provides a useful controller convergence guarantee for the first time for the direct HDP design.",
  keywords = "Approximate dynamic programming (ADP), Direct heuristic dynamic programming (direct HDP), Lyapunov stability, Uniformly ultimately boundedness (UUB)",
  author = "Feng Liu and Jian Sun and Jennie Si and Wentao Guo and Shengwei Mei",
  note = "Funding Information: The research was supported by the National Natural Science Foundation of China under Cooperative Research Funds (Number: 50828701). The third author is also supported by the US NSF under grant ECCS-0702057.",
  year = "2012",
  month = aug,
  doi = "10.1016/j.neunet.2012.02.005",
  language = "English (US)",
  volume = "32",
  pages = "229--235",
  journal = "Neural Networks",
  issn = "0893-6080",
  publisher = "Elsevier Limited",
}

TY - JOUR
T1 - A boundedness result for the direct heuristic dynamic programming
AU - Liu, Feng
AU - Sun, Jian
AU - Si, Jennie
AU - Guo, Wentao
AU - Mei, Shengwei
N1 - Funding Information: The research was supported by the National Natural Science Foundation of China under Cooperative Research Funds (Number: 50828701). The third author is also supported by the US NSF under grant ECCS-0702057.
PY - 2012/8
Y1 - 2012/8
N2 - Approximate/adaptive dynamic programming (ADP) has been studied extensively in recent years for its potential scalability to solve large state and control space problems, including those involving continuous states and continuous controls. The applicability of ADP algorithms, especially the adaptive critic designs has been demonstrated in several case studies. Direct heuristic dynamic programming (direct HDP) is one of the ADP algorithms inspired by the adaptive critic designs. It has been shown applicable to industrial scale, realistic and complex control problems. In this paper, we provide a uniformly ultimately boundedness (UUB) result for the direct HDP learning controller under mild and intuitive conditions. By using a Lyapunov approach we show that the estimation errors of the learning parameters or the weights in the action and critic networks remain UUB. This result provides a useful controller convergence guarantee for the first time for the direct HDP design.
AB - Approximate/adaptive dynamic programming (ADP) has been studied extensively in recent years for its potential scalability to solve large state and control space problems, including those involving continuous states and continuous controls. The applicability of ADP algorithms, especially the adaptive critic designs has been demonstrated in several case studies. Direct heuristic dynamic programming (direct HDP) is one of the ADP algorithms inspired by the adaptive critic designs. It has been shown applicable to industrial scale, realistic and complex control problems. In this paper, we provide a uniformly ultimately boundedness (UUB) result for the direct HDP learning controller under mild and intuitive conditions. By using a Lyapunov approach we show that the estimation errors of the learning parameters or the weights in the action and critic networks remain UUB. This result provides a useful controller convergence guarantee for the first time for the direct HDP design.
KW - Approximate dynamic programming (ADP)
KW - Direct heuristic dynamic programming (direct HDP)
KW - Lyapunov stability
KW - Uniformly ultimately boundedness (UUB)
UR - http://www.scopus.com/inward/record.url?scp=84862811991&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862811991&partnerID=8YFLogxK
U2 - 10.1016/j.neunet.2012.02.005
DO - 10.1016/j.neunet.2012.02.005
M3 - Article
C2 - 22397949
AN - SCOPUS:84862811991
SN - 0893-6080
VL - 32
SP - 229
EP - 235
JO - Neural Networks
JF - Neural Networks
ER -
