Online Reinforcement Learning Control for the Personalization of a Robotic Knee Prosthesis

Yue Wen, Jennie Si, Andrea Brandt, Xiang Gao, He Helen Huang

Research output: Contribution to journal › Article › peer-review

97 Scopus citations

Abstract

Robotic prostheses deliver greater function than passive prostheses, but personalizing the device for an individual amputee user requires tuning a large number of control parameters. This problem is not easily solved by traditional control designs or the latest robotic technology. Reinforcement learning (RL) is naturally appealing: the recent, unprecedented success of AlphaZero demonstrated RL as a feasible, large-scale problem solver. However, the prosthesis-tuning problem raises several unaddressed issues: it has no known, stable model; its continuous states and controls may result in a curse of dimensionality; and the human-prosthesis system is constantly subject to measurement noise, environmental change, and variations caused by the human body. In this paper, we demonstrated the feasibility of direct heuristic dynamic programming (dHDP), an approximate dynamic programming (ADP) approach, to automatically tune the 12 robotic knee prosthesis parameters to meet individual human users' needs. We tested the ADP-tuner on two subjects (one able-bodied subject and one amputee subject) walking at a fixed speed on a treadmill. The ADP-tuner learned to reach target gait kinematics in an average of 300 gait cycles, or 10 min of walking. We observed improved ADP tuning performance when we transferred a previously learned ADP controller to a new learning session with the same subject. To the best of our knowledge, our approach to personalizing robotic prostheses is the first implementation of online ADP learning control for a clinical problem involving human subjects.

Original language: English (US)
Article number: 8613842
Pages (from-to): 2346-2356
Number of pages: 11
Journal: IEEE Transactions on Cybernetics
Volume: 50
Issue number: 6
State: Published - Jun 2020

Keywords

  • Approximate dynamic programming (ADP)
  • direct heuristic dynamic programming (dHDP)
  • reinforcement learning (RL)
  • robotic knee prosthesis

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering
