Direct Learning by Reinforcement

Jennie Si

doi:10.1016/B978-012170960-0/50090-6

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter introduces a learning control scheme that can be implemented in real time. The scheme can be viewed as a model-independent approach to adaptive critic designs. The chapter demonstrates the implementation details and learning results using two illustrative examples. It focuses on a systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming. This online learning system improves its performance over time in two ways. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process, so that in the future similar states are positively associated with a control action leading to positive reinforcement. The discussion also introduces a successful candidate for online learning control design.

Original language	English (US)
Title of host publication	The Electrical Engineering Handbook
Publisher	Elsevier
Pages	1151-1159
Number of pages	9
ISBN (Electronic)	9780121709600
DOIs	https://doi.org/10.1016/B978-012170960-0/50090-6
State	Published - Jan 1 2004
Externally published	Yes

ASJC Scopus subject areas

General Engineering

Access to Document

10.1016/B978-012170960-0/50090-6

Cite this

@inbook{8ff529835c9b4913b8ef425193763d33,
  title = "Direct Learning by Reinforcement",
  abstract = "This chapter is an introduction to a learning control scheme that can be implemented in real time. It is viewed as a model independent approach to the adaptive critic designs. The chapter demonstrates the implementation details and learning results using two illustrative examples. Systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming is focused in this chapter. This online learning system improves its performance over time in two aspects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its action to improve future performance. Second, system states associated with the positive reinforcement are memorized through a network learning process where in the future, similar states are positively associated with a control action leading to a positive reinforcement. This discussion also introduces a successful candidate of online learning control design.",
  author = "Jennie Si",
  year = "2004",
  month = jan,
  day = "1",
  doi = "10.1016/B978-012170960-0/50090-6",
  language = "English (US)",
  pages = "1151--1159",
  booktitle = "The Electrical Engineering Handbook",
  publisher = "Elsevier",
}

TY - CHAP
T1 - Direct Learning by Reinforcement
AU - Si, Jennie
PY - 2004/1/1
Y1 - 2004/1/1
N2 - This chapter is an introduction to a learning control scheme that can be implemented in real time. It is viewed as a model independent approach to the adaptive critic designs. The chapter demonstrates the implementation details and learning results using two illustrative examples. Systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming is focused in this chapter. This online learning system improves its performance over time in two aspects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its action to improve future performance. Second, system states associated with the positive reinforcement are memorized through a network learning process where in the future, similar states are positively associated with a control action leading to a positive reinforcement. This discussion also introduces a successful candidate of online learning control design.
AB - This chapter is an introduction to a learning control scheme that can be implemented in real time. It is viewed as a model independent approach to the adaptive critic designs. The chapter demonstrates the implementation details and learning results using two illustrative examples. Systematic treatment for developing a generic online learning control system based on the fundamental principle of reinforcement learning or, more specifically, neural dynamic programming is focused in this chapter. This online learning system improves its performance over time in two aspects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its action to improve future performance. Second, system states associated with the positive reinforcement are memorized through a network learning process where in the future, similar states are positively associated with a control action leading to a positive reinforcement. This discussion also introduces a successful candidate of online learning control design.
UR - http://www.scopus.com/inward/record.url?scp=85148774700&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85148774700&partnerID=8YFLogxK
U2 - 10.1016/B978-012170960-0/50090-6
DO - 10.1016/B978-012170960-0/50090-6
M3 - Chapter
AN - SCOPUS:85148774700
SP - 1151
EP - 1159
BT - The Electrical Engineering Handbook
PB - Elsevier
ER -
