Neuro-dynamic programming based on self-organized patterns

Jennie Si, Yu Tsung Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

This paper introduces a real-time learning control mechanism as a robust and efficient scheme for neuro-dynamic programming. The objective of the learning controller is to optimize a given performance measure by learning to generate appropriate control actions through interaction with the environment. The controller starts with no prior knowledge of the system and is designed to improve its performance over time. No complete model of the system's behavior is available; instead, the designer has access to real-time sampled measurements. Feedback from the environment is sparse: only indicative signals, such as a binary-valued signal marking success or failure, are available at the end of a task. The study proposes pattern-based neuro-dynamic programming techniques. State measurements are first analyzed for similarity and organized by proximity. Control actions are then generated according to the resulting state patterns. A critic network 'monitors' the performance of the controller so that a given optimality criterion can be achieved. The paper provides a detailed implementation and performance evaluation of this learning controller on a cart-pole balancing problem.
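
As a rough illustration of the pipeline outlined in the abstract, the Python sketch below groups state measurements into patterns, selects an action per pattern, and lets a critic adjust its estimate from a failure signal. It is not the authors' implementation. Everything in it is an assumption made for illustration: the class names `PatternOrganizer` and `PatternActorCritic`, the nearest-prototype update used as a stand-in for the paper's self-organizing step, the tabular critic used in place of a critic network, the learning rates, and the coding of failure as a reward of -1.

```python
import numpy as np


class PatternOrganizer:
    """Groups states by proximity using online nearest-prototype updates.

    Assumption: a simple stand-in for the self-organizing step; the
    paper's actual scheme is not specified in the abstract.
    """

    def __init__(self, n_patterns, state_dim, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.prototypes = rng.uniform(-1.0, 1.0, size=(n_patterns, state_dim))
        self.lr = lr

    def assign(self, state, update=True):
        """Return the index of the closest prototype, optionally adapting it."""
        state = np.asarray(state, dtype=float)
        dists = np.linalg.norm(self.prototypes - state, axis=1)
        idx = int(np.argmin(dists))
        if update:
            # Move the winning prototype a small step toward the measurement.
            self.prototypes[idx] += self.lr * (state - self.prototypes[idx])
        return idx


class PatternActorCritic:
    """Per-pattern action preferences (actor) and value estimates (critic).

    Assumption: tabular values replace the critic network for brevity.
    """

    def __init__(self, n_patterns, n_actions=2, gamma=0.95,
                 alpha_critic=0.5, alpha_actor=0.5, seed=0):
        self.value = np.zeros(n_patterns)
        self.prefs = np.zeros((n_patterns, n_actions))
        self.gamma = gamma
        self.alpha_critic = alpha_critic
        self.alpha_actor = alpha_actor
        self.rng = np.random.default_rng(seed)

    def act(self, pattern):
        """Sample an action from a softmax over the pattern's preferences."""
        z = self.prefs[pattern] - self.prefs[pattern].max()
        probs = np.exp(z) / np.exp(z).sum()
        return int(self.rng.choice(len(probs), p=probs))

    def learn(self, pattern, action, reward, next_pattern, done):
        """Temporal-difference update of the critic and the chosen action."""
        bootstrap = 0.0 if done else self.gamma * self.value[next_pattern]
        td_error = reward + bootstrap - self.value[pattern]
        self.value[pattern] += self.alpha_critic * td_error
        self.prefs[pattern, action] += self.alpha_actor * td_error
        return td_error


# Usage: one transition that ends in failure (reward -1, assumed coding).
organizer = PatternOrganizer(n_patterns=4, state_dim=4)
agent = PatternActorCritic(n_patterns=4)
state = [0.0, 0.0, 0.2, 0.5]  # hypothetical cart-pole measurement
pattern = organizer.assign(state)
action = agent.act(pattern)
agent.learn(pattern, action, reward=-1.0, next_pattern=pattern, done=True)
```

In this sketch the reward would be zero on every step except the last one of a failed trial, which mirrors the end-of-task binary feedback described in the abstract. Connecting the two classes to a cart-pole simulator is left out to keep the example short.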

Original language: English (US)
Title of host publication: IEEE International Symposium on Intelligent Control - Proceedings
Place of Publication: Piscataway, NJ, United States
Publisher: IEEE
Pages: 120-125
Number of pages: 6
State: Published - 1999
Event: Proceedings of the 1999 IEEE International Symposium on Intelligent Control - Intelligent Systems and Semiotics - Cambridge, MA, USA
Duration: Sep 15 1999 → Sep 17 1999

Other

Other: Proceedings of the 1999 IEEE International Symposium on Intelligent Control - Intelligent Systems and Semiotics
City: Cambridge, MA, USA
Period: 9/15/99 → 9/17/99

ASJC Scopus subject areas

  • Hardware and Architecture
  • Control and Systems Engineering
