TY - GEN
T1 - IBIS
T2 - 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2016
AU - Xu, Yiqi
AU - Zhao, Ming
PY - 2016/5/31
Y1 - 2016/5/31
N2 - Big-data systems are increasingly shared by diverse, data-intensive applications from different domains. However, existing systems lack the support for I/O management, and the performance of bigdata applications degrades in unpredictable ways when they contend for I/Os. To address this challenge, this paper proposes IBIS, an Interposed Big-data I/O Scheduler, to provide I/O performance differentiation for competing applications in a shared big-data system. IBIS transparently intercepts, isolates, and schedules an application's different phases of I/Os via an I/O interposition layer on every datanode of the big-data system. It provides a new proportionalshare I/O scheduler, SFQ(D2), to allow applications to share the I/O service of each datanode with good fairness and resource utilization. It enables the distributed I/O schedulers to coordinate with one another and to achieve proportional sharing of the big-data system's total I/O service in a scalable manner. Finally, it supports the shared use of big-data resources by diverse frameworks and manages the I/Os from different types of big-data workloads (e.g., batch jobs vs. queries) across these frameworks. The prototype of IBIS is implemented in Hadoop/YARN, a widely used big-data system. Experiments based on a variety of representative applications (WordCount, TeraSort, Facebook, TPC-H) show that IBIS achieves good total-service proportional sharing with low overhead in both application performance and resource usages. IBIS is also shown to support various performance policies: it can deliver stronger performance isolation than native Hadoop/YARN (99% better for WordCount and 15% better for TPC-H queries) with good resource utilization; and it can also achieve perfect proportional slowdown with better application performance (30% better than native Hadoop).
AB - Big-data systems are increasingly shared by diverse, data-intensive applications from different domains. However, existing systems lack the support for I/O management, and the performance of bigdata applications degrades in unpredictable ways when they contend for I/Os. To address this challenge, this paper proposes IBIS, an Interposed Big-data I/O Scheduler, to provide I/O performance differentiation for competing applications in a shared big-data system. IBIS transparently intercepts, isolates, and schedules an application's different phases of I/Os via an I/O interposition layer on every datanode of the big-data system. It provides a new proportionalshare I/O scheduler, SFQ(D2), to allow applications to share the I/O service of each datanode with good fairness and resource utilization. It enables the distributed I/O schedulers to coordinate with one another and to achieve proportional sharing of the big-data system's total I/O service in a scalable manner. Finally, it supports the shared use of big-data resources by diverse frameworks and manages the I/Os from different types of big-data workloads (e.g., batch jobs vs. queries) across these frameworks. The prototype of IBIS is implemented in Hadoop/YARN, a widely used big-data system. Experiments based on a variety of representative applications (WordCount, TeraSort, Facebook, TPC-H) show that IBIS achieves good total-service proportional sharing with low overhead in both application performance and resource usages. IBIS is also shown to support various performance policies: it can deliver stronger performance isolation than native Hadoop/YARN (99% better for WordCount and 15% better for TPC-H queries) with good resource utilization; and it can also achieve perfect proportional slowdown with better application performance (30% better than native Hadoop).
UR - http://www.scopus.com/inward/record.url?scp=84978520087&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84978520087&partnerID=8YFLogxK
U2 - 10.1145/2907294.2907319
DO - 10.1145/2907294.2907319
M3 - Conference contribution
AN - SCOPUS:84978520087
T3 - HPDC 2016 - Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing
SP - 111
EP - 122
BT - HPDC 2016 - Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing
PB - Association for Computing Machinery, Inc
Y2 - 31 May 2016 through 4 June 2016
ER -