Abstract
This paper presents a systematic treatment for developing a generic on-line learning control system based on the fundamental principle of reinforcement learning or, more specifically, neuro-dynamic programming. This real-time learning system improves its performance over time in two respects. First, it learns from its own mistakes through the reinforcement signal from the external environment and tries to reinforce its actions to improve future performance. Second, system states associated with positive reinforcement are memorized through a network learning process, so that in the future similar states will be more positively associated with a control action leading to positive reinforcement. Two successful candidate on-line learning control designs are introduced. Real-time learning algorithms are derived for the individual components of the learning system. Some analytical insight is provided to give guidelines for the entire on-line learning control system. The performance of the on-line learning controller is measured by its learning speed, its success rate of learning, and the degree to which it meets the learning control objective. The overall learning control system is tested on a single cart-pole balancing problem and on a more complex problem of balancing a triple-link inverted pendulum.
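To make the cart-pole test problem and the notion of an external reinforcement signal concrete, the following is a minimal sketch, not the paper's implementation. It assumes the classic frictionless cart-pole benchmark dynamics and the commonly used parameter values (cart mass 1.0 kg, pole mass 0.1 kg, pole half-length 0.5 m, 0.02 s time step, failure at 12 degrees or 2.4 m), and it assumes a reinforcement of 0 while the pole is balanced and -1 on failure. None of these values, nor the names `step`, `reinforcement`, and `run_trial`, appear in the abstract; they are illustrative only.

```python
import math

# Assumed benchmark constants (not taken from the abstract).
GRAVITY = 9.8             # m/s^2
CART_MASS = 1.0           # kg
POLE_MASS = 0.1           # kg
POLE_HALF_LENGTH = 0.5    # m
TIME_STEP = 0.02          # s
ANGLE_LIMIT = math.radians(12.0)  # failure threshold on pole angle
TRACK_LIMIT = 2.4         # failure threshold on cart position, m


def step(state, force):
    """Advance the frictionless cart-pole one Euler step under a horizontal force."""
    x, x_dot, theta, theta_dot = state
    total_mass = CART_MASS + POLE_MASS
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + POLE_MASS * POLE_HALF_LENGTH * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass)
    )
    x_acc = temp - POLE_MASS * POLE_HALF_LENGTH * theta_acc * cos_t / total_mass
    return (
        x + TIME_STEP * x_dot,
        x_dot + TIME_STEP * x_acc,
        theta + TIME_STEP * theta_dot,
        theta_dot + TIME_STEP * theta_acc,
    )


def reinforcement(state):
    """Assumed binary signal: 0 while balanced, -1 once a limit is exceeded."""
    x, _, theta, _ = state
    failed = abs(x) > TRACK_LIMIT or abs(theta) > ANGLE_LIMIT
    return -1.0 if failed else 0.0


def run_trial(policy, initial_state, max_steps=1000):
    """Run one trial and return the number of steps taken before failure or timeout."""
    state = initial_state
    for t in range(max_steps):
        state = step(state, policy(state))
        if reinforcement(state) < 0:
            return t + 1
    return max_steps
```

Under these assumptions, a learning controller would supply `policy` and adjust it between trials using only the signal from `reinforcement`; the number of trials needed before `run_trial` reaches `max_steps` is one way to read the "learning speed" measure mentioned above.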
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the International Joint Conference on Neural Networks |
Place of Publication | Piscataway, NJ, United States |
Publisher | IEEE |
Pages | 221-226 |
Number of pages | 6 |
Volume | 3 |
State | Published - 2000 |
Event | International Joint Conference on Neural Networks (IJCNN'2000) - Como, Italy. Duration: Jul 24 2000 → Jul 27 2000 |
Other
Other | International Joint Conference on Neural Networks (IJCNN'2000) |
---|---|
City | Como, Italy |
Period | 7/24/00 → 7/27/00 |
ASJC Scopus subject areas
- Software