On-line learning control by association and reinforcement

Jennie Si; Yu Tsung Wang

Electrical Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

This paper focuses on a systematic treatment for developing a generic on-line learning control system based on the fundamental principle of reinforcement learning or, more specifically, neuro-dynamic programming. This real-time learning system improves its performance over time in two respects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process so that, in the future, similar states will be more positively associated with a control action leading to positive reinforcement. Two successful candidate on-line learning control designs will be introduced. Real-time learning algorithms will be derived for the individual components of the learning system. Some analytical insight will be provided to give guidelines for the entire on-line learning control system. The performance of the on-line learning controller is measured by its learning speed, its success rate of learning, and the degree to which it meets the learning control objective. The overall learning control system performance will be tested on a single cart-pole balancing problem and on a more complex problem of balancing a triple-link inverted pendulum.

Original language	English (US)
Title of host publication	Proceedings of the International Joint Conference on Neural Networks
Place of Publication	Piscataway, NJ, United States
Publisher	IEEE
Pages	221-226
Number of pages	6
Volume	3
State	Published - 2000
Event	International Joint Conference on Neural Networks (IJCNN'2000), Como, Italy; Duration: Jul 24 2000 → Jul 27 2000

ASJC Scopus subject areas

Software

Cite this

@inproceedings{73994ec5ff80409f99b76ab166c5722d,

title = "On-line learning control by association and reinforcement",

abstract = "This paper focuses on a systematic treatment for developing a generic on-line learning control system based on the fundamental principle of reinforcement learning or, more specifically, neuro-dynamic programming. This real-time learning system improves its performance over time in two respects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process so that, in the future, similar states will be more positively associated with a control action leading to positive reinforcement. Two successful candidate on-line learning control designs will be introduced. Real-time learning algorithms will be derived for the individual components of the learning system. Some analytical insight will be provided to give guidelines for the entire on-line learning control system. The performance of the on-line learning controller is measured by its learning speed, its success rate of learning, and the degree to which it meets the learning control objective. The overall learning control system performance will be tested on a single cart-pole balancing problem and on a more complex problem of balancing a triple-link inverted pendulum.",

author = "Si, Jennie and Wang, {Yu Tsung}",

year = "2000",

language = "English (US)",

volume = "3",

pages = "221--226",

booktitle = "Proceedings of the International Joint Conference on Neural Networks",

publisher = "IEEE",

note = "International Joint Conference on Neural Networks (IJCNN'2000) ; Conference date: 24-07-2000 Through 27-07-2000",

}

TY - GEN

T1 - On-line learning control by association and reinforcement

AU - Si, Jennie

AU - Wang, Yu Tsung

PY - 2000

Y1 - 2000

N2 - This paper focuses on a systematic treatment for developing a generic on-line learning control system based on the fundamental principle of reinforcement learning or, more specifically, neuro-dynamic programming. This real-time learning system improves its performance over time in two respects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process so that, in the future, similar states will be more positively associated with a control action leading to positive reinforcement. Two successful candidate on-line learning control designs will be introduced. Real-time learning algorithms will be derived for the individual components of the learning system. Some analytical insight will be provided to give guidelines for the entire on-line learning control system. The performance of the on-line learning controller is measured by its learning speed, its success rate of learning, and the degree to which it meets the learning control objective. The overall learning control system performance will be tested on a single cart-pole balancing problem and on a more complex problem of balancing a triple-link inverted pendulum.

AB - This paper focuses on a systematic treatment for developing a generic on-line learning control system based on the fundamental principle of reinforcement learning or, more specifically, neuro-dynamic programming. This real-time learning system improves its performance over time in two respects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process so that, in the future, similar states will be more positively associated with a control action leading to positive reinforcement. Two successful candidate on-line learning control designs will be introduced. Real-time learning algorithms will be derived for the individual components of the learning system. Some analytical insight will be provided to give guidelines for the entire on-line learning control system. The performance of the on-line learning controller is measured by its learning speed, its success rate of learning, and the degree to which it meets the learning control objective. The overall learning control system performance will be tested on a single cart-pole balancing problem and on a more complex problem of balancing a triple-link inverted pendulum.

UR - http://www.scopus.com/inward/record.url?scp=0033698504&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0033698504&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0033698504

VL - 3

SP - 221

EP - 226

BT - Proceedings of the International Joint Conference on Neural Networks

PB - IEEE

CY - Piscataway, NJ, United States

T2 - International Joint Conference on Neural Networks (IJCNN'2000)

Y2 - 24 July 2000 through 27 July 2000

ER -
