TY - JOUR
T1 - Reinforcement learning based recloser control for distribution cables with degraded insulation level
AU - Cui, Qiushi
AU  - Hashmy, Syed Muhammad Yousaf
AU - Weng, Yang
AU - Dyer, Michael
N1 - Funding Information:
Manuscript received January 3, 2020; revised March 20, 2020 and May 10, 2020; accepted June 9, 2020. Date of publication June 15, 2020; date of current version March 24, 2021. This work was supported in part by the National Science Foundation of the United States under Grant 1810537, in part by the Salt River Project on “An Investigation of the Effects of Reclosing on Distribution Systems with Underground Cables”, and in part by the ARPA-E Project on “Sensor Enabled Modeling of Future Distribution Systems with Distributed Energy Resources”. Paper no. TPWRD-00012-2020. (Corresponding author: Yang Weng.) Qiushi Cui, Syed Muhammad Yousaf Hashmy, and Yang Weng are with the Department of Electrical and Computer Engineering, Arizona State University, Tempe, AZ 85281 USA (e-mail: qiushi.cui@asu.edu; shashmy@asu.edu; yang.weng@asu.edu).
Publisher Copyright:
© 2020 IEEE.
PY - 2021/4
Y1 - 2021/4
N2 - Utilities continually observe failures on aged cables whose basic insulation level (BIL) has degraded to an unknown extent. One of the root causes is the transient overvoltage (TOV) associated with circuit-breaker reclosing. To address this problem, researchers have proposed a series of controlled-switching methods, most of which are deterministic. In power systems, however, and especially in distribution networks, the switching transient is inherently stochastic. Because transient overvoltage is too complex to model accurately, we propose a model-free stochastic control method for reclosers that operates in the presence of uncertainty and noise. Concretely, to capture high-dimensional dynamic patterns, we formulate the recloser control problem by incorporating a temporal-sequence reward mechanism into a deep Q-network (DQN). Meanwhile, we embed our physical understanding of the problem into the action-probability allocation and develop an infeasible-action-space-elimination algorithm. Through PSCAD simulation, we first reveal the impact of load types on cables' TOVs. Then, to reduce the training burden of the proposed reinforcement learning (RL) control method across different applications, we establish a post-learning knowledge-transfer method. After validation with our industrial partner, we present several learning curves that demonstrate the enhanced performance. The proposed temporal-sequence reward mechanism and infeasible-action elimination method yield outstanding learning efficiency. Moreover, the knowledge-transfer results demonstrate the method's ability to generalize. Finally, a comparison with conventional methods shows that the proposed method is the most effective of the three in mitigating the TOV phenomenon.
AB - Utilities continually observe failures on aged cables whose basic insulation level (BIL) has degraded to an unknown extent. One of the root causes is the transient overvoltage (TOV) associated with circuit-breaker reclosing. To address this problem, researchers have proposed a series of controlled-switching methods, most of which are deterministic. In power systems, however, and especially in distribution networks, the switching transient is inherently stochastic. Because transient overvoltage is too complex to model accurately, we propose a model-free stochastic control method for reclosers that operates in the presence of uncertainty and noise. Concretely, to capture high-dimensional dynamic patterns, we formulate the recloser control problem by incorporating a temporal-sequence reward mechanism into a deep Q-network (DQN). Meanwhile, we embed our physical understanding of the problem into the action-probability allocation and develop an infeasible-action-space-elimination algorithm. Through PSCAD simulation, we first reveal the impact of load types on cables' TOVs. Then, to reduce the training burden of the proposed reinforcement learning (RL) control method across different applications, we establish a post-learning knowledge-transfer method. After validation with our industrial partner, we present several learning curves that demonstrate the enhanced performance. The proposed temporal-sequence reward mechanism and infeasible-action elimination method yield outstanding learning efficiency. Moreover, the knowledge-transfer results demonstrate the method's ability to generalize. Finally, a comparison with conventional methods shows that the proposed method is the most effective of the three in mitigating the TOV phenomenon.
KW - Cable failure
KW - Controlled switching
KW - Post-learning knowledge
KW - Reinforcement learning
KW - Transient overvoltage
UR - http://www.scopus.com/inward/record.url?scp=85103394070&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103394070&partnerID=8YFLogxK
U2 - 10.1109/TPWRD.2020.3002503
DO - 10.1109/TPWRD.2020.3002503
M3 - Article
AN - SCOPUS:85103394070
SN - 0885-8977
VL - 36
SP - 1118
EP - 1127
JO - IEEE Transactions on Power Delivery
JF - IEEE Transactions on Power Delivery
IS - 2
M1 - 9117181
ER -