TY - GEN
T1 - Decentralized scheduling with data locality for data-parallel computation on peer-to-peer networks
AU - Wang, Weina
AU - Barnard, Matthew
AU - Ying, Lei
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2016/4/4
Y1 - 2016/4/4
AB - Despite being distributed in computation and data storage, current data-parallel computing systems are centralized in task scheduling, which results in hierarchies that create a single point of failure, limit scalability, and increase administration costs. In this paper, we propose a fully decentralized scheduling algorithm for data-parallel computing systems on peer-to-peer (P2P) networks. Our scheduling algorithm eliminates the centralized scheduler by letting each node in the network make scheduling decisions. To achieve good performance, data locality, which stresses the efficiency of colocating tasks with their input data, and load balancing should be considered jointly and in a decentralized fashion. Using a backpressure-based approach, the proposed task scheduling algorithm strikes the right balance between data locality and load balancing, with each node knowing the status information of only part of the nodes in the network, and is proven to maximize the throughput.
UR - http://www.scopus.com/inward/record.url?scp=84969894797&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84969894797&partnerID=8YFLogxK
U2 - 10.1109/ALLERTON.2015.7447024
DO - 10.1109/ALLERTON.2015.7447024
M3 - Conference contribution
AN - SCOPUS:84969894797
T3 - 2015 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015
SP - 337
EP - 344
BT - 2015 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015
Y2 - 29 September 2015 through 2 October 2015
ER -