Direct Learning by Reinforcement IEEE Transactions on Neural Networks 2

Jennie Si

doi:10.1016/B978-012170960-0/50090-6

Electrical Engineering

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Scopus citation

Abstract

This chapter introduces a learning control scheme that can be implemented in real time. The scheme can be viewed as a model-independent approach to adaptive critic designs. The chapter demonstrates the implementation details and learning results using two illustrative examples. It focuses on a systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming. The online learning system improves its performance over time in two ways. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process, so that in the future similar states are positively associated with a control action that leads to positive reinforcement. The discussion also introduces a successful candidate for online learning control design. © 2005

Original language	English (US)
Title of host publication	The Electrical Engineering Handbook
Publisher	Elsevier Inc.
Pages	1151-1159
Number of pages	9
ISBN (Print)	9780121709600
DOIs	https://doi.org/10.1016/B978-012170960-0/50090-6
State	Published - 2005

ASJC Scopus subject areas

General Computer Science

Cite this

@inbook{441f2556637b476b896e5bcafed87915,
title = "Direct Learning by Reinforcement IEEE Transactions on Neural Networks 2",
abstract = "This chapter introduces a learning control scheme that can be implemented in real time. The scheme can be viewed as a model-independent approach to adaptive critic designs. The chapter demonstrates the implementation details and learning results using two illustrative examples. It focuses on a systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming. The online learning system improves its performance over time in two ways. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process, so that in the future similar states are positively associated with a control action that leads to positive reinforcement. The discussion also introduces a successful candidate for online learning control design. {\textcopyright} 2005",
author = "Jennie Si",
year = "2005",
doi = "10.1016/B978-012170960-0/50090-6",
language = "English (US)",
isbn = "9780121709600",
pages = "1151--1159",
booktitle = "The Electrical Engineering Handbook",
publisher = "Elsevier Inc.",
}

TY - CHAP
T1 - Direct Learning by Reinforcement IEEE Transactions on Neural Networks 2
AU - Si, Jennie
PY - 2005
Y1 - 2005
N2 - This chapter introduces a learning control scheme that can be implemented in real time. The scheme can be viewed as a model-independent approach to adaptive critic designs. The chapter demonstrates the implementation details and learning results using two illustrative examples. It focuses on a systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming. The online learning system improves its performance over time in two ways. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process, so that in the future similar states are positively associated with a control action that leads to positive reinforcement. The discussion also introduces a successful candidate for online learning control design. © 2005
AB - This chapter introduces a learning control scheme that can be implemented in real time. The scheme can be viewed as a model-independent approach to adaptive critic designs. The chapter demonstrates the implementation details and learning results using two illustrative examples. It focuses on a systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming. The online learning system improves its performance over time in two ways. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process, so that in the future similar states are positively associated with a control action that leads to positive reinforcement. The discussion also introduces a successful candidate for online learning control design. © 2005
UR - http://www.scopus.com/inward/record.url?scp=84882483350&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84882483350&partnerID=8YFLogxK
U2 - 10.1016/B978-012170960-0/50090-6
DO - 10.1016/B978-012170960-0/50090-6
M3 - Chapter
AN - SCOPUS:84882483350
SN - 9780121709600
SP - 1151
EP - 1159
BT - The Electrical Engineering Handbook
PB - Elsevier Inc.
ER -
