Two-level throughput and latency IO control for parallel file systems

Yiqi Xu, Ming Zhao

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation

Abstract

Existing parallel file systems cannot provide both throughput and response time guarantees for concurrent parallel applications. This limitation prevents competing applications from getting their desired performance as high-performance computing (HPC) systems continue to scale up and are used in shared environments. This paper presents a new two-level scheduler for parallel storage systems that addresses this challenge, built on a distributed performance virtualization layer for parallel file systems (vPFS). It provides both proportional bandwidth sharing and response time guarantees by addressing them at different levels of the scheduler in a cooperative manner. The utility and performance of this scheduler are studied on PVFS2, a widely used parallel file system. An experimental evaluation using a typical HPC benchmark (IOR) shows that when the storage is not overloaded, requests complete within the 95th-percentile response time bound 90% of the time. When the storage is overloaded, the scheduler can further favor the more latency-sensitive application.

Original language: English (US)
State: Published - 2013
Externally published: Yes
Event: 8th International Workshop on Feedback Computing - San Jose, United States
Duration: Jun 25 2013 → …

Conference

Conference: 8th International Workshop on Feedback Computing
Country/Territory: United States
City: San Jose
Period: 6/25/13 → …

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Software
  • Artificial Intelligence
  • Modeling and Simulation
