PnP-DRL: A Plug-and-Play Deep Reinforcement Learning Approach for Experience-Driven Networking

Zhiyuan Xu; Kun Wu; Weiyi Zhang; Jian Tang; Yanzhi Wang; Guoliang Xue

doi:10.1109/JSAC.2021.3087270

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

While Deep Reinforcement Learning has emerged as a de facto approach to many complex experience-driven networking problems, it remains challenging to deploy DRL into real systems. Due to the random exploration or half-trained deep neural networks during the online training process, the DRL agent may make unexpected decisions, which may lead to system performance degradation or even system crash. In this paper, we propose PnP-DRL, an offline-trained, plug and play DRL solution, to leverage the batch reinforcement learning approach to learn the best control policy from pre-collected transition samples without interacting with the system. After being trained without interaction with systems, our Plug and Play DRL agent will start working seamlessly, without additional exploration or possible disruption of the running systems. We implement and evaluate our PnP-DRL solution on a prevalent experience-driven networking problem, Dynamic Adaptive Streaming over HTTP (DASH). Extensive experimental results manifest that 1) The existing batch reinforcement learning method has its limits; 2) Our approach PnP-DRL significantly outperforms classical adaptive bitrate algorithms in average user Quality of Experience (QoE); 3) PnP-DRL, unlike the state-of-the-art online DRL methods, can be off and running without learning gaps, while achieving comparable performances.

Original language	English (US)
Article number	9454317
Pages (from-to)	2476-2486
Number of pages	11
Journal	IEEE Journal on Selected Areas in Communications
Volume	39
Issue number	8
DOIs	https://doi.org/10.1109/JSAC.2021.3087270
State	Published - Aug 2021

Keywords

Experience-driven networking
batch reinforcement learning
deep reinforcement learning

ASJC Scopus subject areas

Computer Networks and Communications
Electrical and Electronic Engineering

Access to Document

10.1109/JSAC.2021.3087270

Cite this

@article{be78fdcbc558443eb8957a87930614ee,
title = "PnP-DRL: A Plug-and-Play Deep Reinforcement Learning Approach for Experience-Driven Networking",
abstract = "While Deep Reinforcement Learning has emerged as a de facto approach to many complex experience-driven networking problems, it remains challenging to deploy DRL into real systems. Due to the random exploration or half-trained deep neural networks during the online training process, the DRL agent may make unexpected decisions, which may lead to system performance degradation or even system crash. In this paper, we propose PnP-DRL, an offline-trained, plug and play DRL solution, to leverage the batch reinforcement learning approach to learn the best control policy from pre-collected transition samples without interacting with the system. After being trained without interaction with systems, our Plug and Play DRL agent will start working seamlessly, without additional exploration or possible disruption of the running systems. We implement and evaluate our PnP-DRL solution on a prevalent experience-driven networking problem, Dynamic Adaptive Streaming over HTTP (DASH). Extensive experimental results manifest that 1) The existing batch reinforcement learning method has its limits; 2) Our approach PnP-DRL significantly outperforms classical adaptive bitrate algorithms in average user Quality of Experience (QoE); 3) PnP-DRL, unlike the state-of-the-art online DRL methods, can be off and running without learning gaps, while achieving comparable performances.",
keywords = "Experience-driven networking, batch reinforcement learning, deep reinforcement learning",
author = "Zhiyuan Xu and Kun Wu and Weiyi Zhang and Jian Tang and Yanzhi Wang and Guoliang Xue",
note = "Publisher Copyright: {\textcopyright} 1983-2012 IEEE.",
year = "2021",
month = aug,
doi = "10.1109/JSAC.2021.3087270",
language = "English (US)",
volume = "39",
pages = "2476--2486",
journal = "IEEE Journal on Selected Areas in Communications",
issn = "0733-8716",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "8",
}

TY - JOUR
T1 - PnP-DRL
T2 - A Plug-and-Play Deep Reinforcement Learning Approach for Experience-Driven Networking
AU - Xu, Zhiyuan
AU - Wu, Kun
AU - Zhang, Weiyi
AU - Tang, Jian
AU - Wang, Yanzhi
AU - Xue, Guoliang
PY - 2021/8
Y1 - 2021/8
N2 - While Deep Reinforcement Learning has emerged as a de facto approach to many complex experience-driven networking problems, it remains challenging to deploy DRL into real systems. Due to the random exploration or half-trained deep neural networks during the online training process, the DRL agent may make unexpected decisions, which may lead to system performance degradation or even system crash. In this paper, we propose PnP-DRL, an offline-trained, plug and play DRL solution, to leverage the batch reinforcement learning approach to learn the best control policy from pre-collected transition samples without interacting with the system. After being trained without interaction with systems, our Plug and Play DRL agent will start working seamlessly, without additional exploration or possible disruption of the running systems. We implement and evaluate our PnP-DRL solution on a prevalent experience-driven networking problem, Dynamic Adaptive Streaming over HTTP (DASH). Extensive experimental results manifest that 1) The existing batch reinforcement learning method has its limits; 2) Our approach PnP-DRL significantly outperforms classical adaptive bitrate algorithms in average user Quality of Experience (QoE); 3) PnP-DRL, unlike the state-of-the-art online DRL methods, can be off and running without learning gaps, while achieving comparable performances.
AB - While Deep Reinforcement Learning has emerged as a de facto approach to many complex experience-driven networking problems, it remains challenging to deploy DRL into real systems. Due to the random exploration or half-trained deep neural networks during the online training process, the DRL agent may make unexpected decisions, which may lead to system performance degradation or even system crash. In this paper, we propose PnP-DRL, an offline-trained, plug and play DRL solution, to leverage the batch reinforcement learning approach to learn the best control policy from pre-collected transition samples without interacting with the system. After being trained without interaction with systems, our Plug and Play DRL agent will start working seamlessly, without additional exploration or possible disruption of the running systems. We implement and evaluate our PnP-DRL solution on a prevalent experience-driven networking problem, Dynamic Adaptive Streaming over HTTP (DASH). Extensive experimental results manifest that 1) The existing batch reinforcement learning method has its limits; 2) Our approach PnP-DRL significantly outperforms classical adaptive bitrate algorithms in average user Quality of Experience (QoE); 3) PnP-DRL, unlike the state-of-the-art online DRL methods, can be off and running without learning gaps, while achieving comparable performances.
KW - Experience-driven networking
KW - batch reinforcement learning
KW - deep reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85110621965&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85110621965&partnerID=8YFLogxK
U2 - 10.1109/JSAC.2021.3087270
DO - 10.1109/JSAC.2021.3087270
M3 - Article
AN - SCOPUS:85110621965
SN - 0733-8716
VL - 39
SP - 2476
EP - 2486
JO - IEEE Journal on Selected Areas in Communications
JF - IEEE Journal on Selected Areas in Communications
IS - 8
M1 - 9454317
ER -
