Continuous-Time Reinforcement Learning Control: A Review of Theoretical Results, Insights on Performance, and Needs for New Designs

Brent A. Wallace; Jennie Si

doi:10.1109/TNNLS.2023.3245980


Engineering, Ira A. Fulton Schools of (IAFSE)

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

This exposition discusses continuous-time reinforcement learning (CT-RL) for the control of affine nonlinear systems. We review four seminal methods that are the centerpieces of the most recent results on CT-RL control. We survey the theoretical results of the four methods, highlighting their fundamental importance and successes by including discussions on problem formulation, key assumptions, algorithm procedures, and theoretical guarantees. Subsequently, we evaluate the performance of the control designs to provide analyses and insights on the feasibility of these design methods for applications from a control designer’s point of view. Through systematic evaluations, we point out when theory diverges from practical controller synthesis. We, furthermore, introduce a new quantitative analytical framework to diagnose the observed discrepancies. Based on the analyses and the insights gained through quantitative evaluations, we point out potential future research directions to unleash the potential of CT-RL control algorithms in addressing the identified challenges.

Original language	English (US)
Pages (from-to)	1-21
Number of pages	21
Journal	IEEE Transactions on Neural Networks and Learning Systems
DOIs	https://doi.org/10.1109/TNNLS.2023.3245980
State	Accepted/In press - 2023

Keywords

Adaptive/approximate dynamic programming (ADP)
Convergence
Heuristic algorithms
Mathematical models
Optimal control
Power system stability
Recurrent neural networks
Tuning
continuous-time (CT)
optimal control
policy iteration (PI)
reinforcement learning (RL)
value iteration (VI)

ASJC Scopus subject areas

Software
Computer Science Applications
Computer Networks and Communications
Artificial Intelligence

Access to Document

10.1109/TNNLS.2023.3245980

Cite this

@article{6e1bf54600a84bd8932d5b9854053547,
title = "Continuous-Time Reinforcement Learning Control: A Review of Theoretical Results, Insights on Performance, and Needs for New Designs",
abstract = "This exposition discusses continuous-time reinforcement learning (CT-RL) for the control of affine nonlinear systems. We review four seminal methods that are the centerpieces of the most recent results on CT-RL control. We survey the theoretical results of the four methods, highlighting their fundamental importance and successes by including discussions on problem formulation, key assumptions, algorithm procedures, and theoretical guarantees. Subsequently, we evaluate the performance of the control designs to provide analyses and insights on the feasibility of these design methods for applications from a control designer{\textquoteright}s point of view. Through systematic evaluations, we point out when theory diverges from practical controller synthesis. We, furthermore, introduce a new quantitative analytical framework to diagnose the observed discrepancies. Based on the analyses and the insights gained through quantitative evaluations, we point out potential future research directions to unleash the potential of CT-RL control algorithms in addressing the identified challenges.",
keywords = "Adaptive/approximate dynamic programming (ADP), Convergence, Heuristic algorithms, Mathematical models, Optimal control, Power system stability, Recurrent neural networks, Tuning, continuous-time (CT), optimal control, policy iteration (PI), reinforcement learning (RL), value iteration (VI)",
author = "Wallace, {Brent A.} and Jennie Si",
note = "Publisher Copyright: IEEE",
year = "2023",
doi = "10.1109/TNNLS.2023.3245980",
language = "English (US)",
pages = "1--21",
journal = "IEEE Transactions on Neural Networks and Learning Systems",
issn = "2162-237X",
publisher = "IEEE Computational Intelligence Society",
}

TY - JOUR
T1 - Continuous-Time Reinforcement Learning Control
T2 - A Review of Theoretical Results, Insights on Performance, and Needs for New Designs
AU - Wallace, Brent A.
AU - Si, Jennie
N1 - Publisher Copyright: IEEE
PY - 2023
Y1 - 2023
N2 - This exposition discusses continuous-time reinforcement learning (CT-RL) for the control of affine nonlinear systems. We review four seminal methods that are the centerpieces of the most recent results on CT-RL control. We survey the theoretical results of the four methods, highlighting their fundamental importance and successes by including discussions on problem formulation, key assumptions, algorithm procedures, and theoretical guarantees. Subsequently, we evaluate the performance of the control designs to provide analyses and insights on the feasibility of these design methods for applications from a control designer’s point of view. Through systematic evaluations, we point out when theory diverges from practical controller synthesis. We, furthermore, introduce a new quantitative analytical framework to diagnose the observed discrepancies. Based on the analyses and the insights gained through quantitative evaluations, we point out potential future research directions to unleash the potential of CT-RL control algorithms in addressing the identified challenges.
AB - This exposition discusses continuous-time reinforcement learning (CT-RL) for the control of affine nonlinear systems. We review four seminal methods that are the centerpieces of the most recent results on CT-RL control. We survey the theoretical results of the four methods, highlighting their fundamental importance and successes by including discussions on problem formulation, key assumptions, algorithm procedures, and theoretical guarantees. Subsequently, we evaluate the performance of the control designs to provide analyses and insights on the feasibility of these design methods for applications from a control designer’s point of view. Through systematic evaluations, we point out when theory diverges from practical controller synthesis. We, furthermore, introduce a new quantitative analytical framework to diagnose the observed discrepancies. Based on the analyses and the insights gained through quantitative evaluations, we point out potential future research directions to unleash the potential of CT-RL control algorithms in addressing the identified challenges.
KW - Adaptive/approximate dynamic programming (ADP)
KW - Convergence
KW - Heuristic algorithms
KW - Mathematical models
KW - Optimal control
KW - Power system stability
KW - Recurrent neural networks
KW - Tuning
KW - continuous-time (CT)
KW - optimal control
KW - policy iteration (PI)
KW - reinforcement learning (RL)
KW - value iteration (VI)
UR - http://www.scopus.com/inward/record.url?scp=85149412333&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85149412333&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2023.3245980
DO - 10.1109/TNNLS.2023.3245980
M3 - Article
AN - SCOPUS:85149412333
SN - 2162-237X
SP - 1
EP - 21
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
ER -
