DA placement: A dual-aware data placement in a deduplicated and erasure-coded storage system

Mingzhu Deng, Ming Zhao, Fang Liu, Zhiguang Chen, Nong Xiao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Modern storage systems increasingly combine deduplication with erasure coding to gain both storage efficiency and economical data reliability. However, a naive combination suffers from the “read imbalance problem”, in which parallel data accesses are throttled by overloaded storage nodes. The problem stems from uneven data placement that is unaware of both deduplication and erasure coding, each of which alters the order of data if left unattended. This paper proposes a systematic design and implementation of a Dual-Aware (DA) placement for a combined storage system that achieves deduplication awareness and erasure-coding awareness at the same time. DA records the node number of each unique piece of data for quick reference, and dynamically tracks the nodes already used by each write request. This deduplication awareness lets DA skip inconvenient placement locations. In addition, DA serializes the placement of parity blocks within a stripe and across stripes. This erasure-coding awareness keeps data and parity separate and maintains data sequentiality at the borders between stripes. DA can also be extended with further load balancing through an innovative use of the deduplication level, which intuitively predicts future accesses to a piece of data. In short, DA boosts system performance at little memory or computation cost. Extensive experiments using both real-world traces and synthesized workloads show that DA achieves better read performance. For example, under CAFTL traces on a default cluster of 12 nodes with RS(8,4), DA improves average latency by 30.86% and 29.63% over the baseline rolling placement (BA) and random placement (RA), respectively.
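
The deduplication-aware step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `place_request`, the `index` dictionary (fingerprint to node number), and the rolling `cursor` are all hypothetical, and parity placement is left out entirely. It only shows the general idea of recording where each unique chunk lives and skipping nodes that the current write request already touches.

```python
def place_request(fingerprints, index, num_nodes, cursor):
    """Place the chunks of one write request (illustrative sketch only).

    fingerprints: chunk fingerprints in request order
    index:        dict mapping fingerprint -> node number (updated in place)
    num_nodes:    number of storage nodes
    cursor:       next node a plain rolling placement would use
    Returns (list of node numbers per chunk, new cursor).
    """
    # Nodes this request already touches through previously stored duplicates.
    used = {index[fp] for fp in fingerprints if fp in index}
    placement = []
    for fp in fingerprints:
        if fp in index:
            # Duplicate chunk: reference the recorded node, store nothing.
            placement.append(index[fp])
            continue
        # Unique chunk: roll forward, skipping nodes used by this request.
        for step in range(num_nodes):
            node = (cursor + step) % num_nodes
            if node not in used:
                break
        else:
            # Every node is already used: fall back to plain rolling placement.
            node = cursor % num_nodes
        index[fp] = node
        used.add(node)
        placement.append(node)
        cursor = node + 1
    return placement, cursor % num_nodes


if __name__ == "__main__":
    index = {"a": 1}  # chunk "a" was stored earlier on node 1
    print(place_request(["x", "a", "y"], index, num_nodes=4, cursor=0))
```

In this example a plain rolling placement would put "y" on node 1, next to the duplicate "a", so reading the request back would hit node 1 twice. The sketch skips ahead to node 2 instead, which spreads the request over three distinct nodes.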

Original language: English (US)
Title of host publication: Algorithms and Architectures for Parallel Processing - 18th International Conference, ICA3PP 2018, Proceedings
Editors: Jaideep Vaidya, Jin Li
Publisher: Springer Verlag
Pages: 358-377
Number of pages: 20
ISBN (Print): 9783030050504
State: Published - 2018
Event: 18th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2018 - Guangzhou, China
Duration: Nov 15 2018 → Nov 17 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11334 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2018
Country/Territory: China
City: Guangzhou
Period: 11/15/18 → 11/17/18

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
