GFCache: A greedy failure cache considering failure recency and failure frequency for an erasure-coded storage system

Mingzhu Deng, Fang Liu, Ming Zhao, Zhiguang Chen, Nong Xiao

Research output: Contribution to journal (Article)

Abstract

In the big data era, data unavailability, whether temporary or permanent, has become a daily occurrence. Unlike permanent data failures, which are repaired by a background job, temporarily unavailable data is recovered on the fly to serve the ongoing read request. However, the newly recovered data is discarded after serving the request, on the assumption that data experiencing a temporary failure may become available again later. Discarding recovered data prevents clients from sharing failure information and leads to many unnecessary data recovery processes (e.g., caused by recurring unavailability of the same data or by multiple data failures in one stripe), thereby straining system performance. To this end, this paper proposes GFCache, which caches corrupted data for the dual purposes of sharing failure information and eliminating unnecessary data recovery processes. GFCache employs a greedy, opportunistic caching approach that promotes not only the failed data but also sequential failure-likely data in the same stripe. Additionally, GFCache includes a FARC (Failure ARC) cache replacement algorithm, which balances failure recency and failure frequency to accommodate data corruption with a good hit ratio. The data stored in GFCache can also serve fast reads during normal data access. Furthermore, since GFCache is a generic failure cache, it can be used wherever erasure coding is deployed, with any coding scheme and parameters. Evaluations show that GFCache achieves a good hit ratio with its caching algorithm and significantly boosts system performance by keeping vulnerable data in the cache and thereby reducing unnecessary data recoveries.
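
The abstract does not give the details of FARC or of the greedy promotion step, so the following is only a minimal Python sketch of the general idea, not the authors' algorithm. It assumes a simplified two-segment cache (one segment for blocks that failed once, one for blocks that failed repeatedly) rather than a full adaptive ARC, and it assumes that "sequential failure-likely data" means the blocks that follow the failed block in the same stripe. The names `TwoSegmentFailureCache`, `record_failure`, `greedy_recover`, and `decode` are hypothetical.

```python
from collections import OrderedDict


class TwoSegmentFailureCache:
    """Illustrative failure cache (an assumption, not the paper's FARC).

    Blocks that failed once live in `recent`; blocks that failed again
    are promoted to `frequent`. Both segments are ordered oldest first.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.recent = OrderedDict()    # failure recency segment
        self.frequent = OrderedDict()  # failure frequency segment

    def __len__(self):
        return len(self.recent) + len(self.frequent)

    def get(self, block_id):
        """Return the cached data for a block, or None on a miss."""
        if block_id in self.frequent:
            return self.frequent[block_id]
        return self.recent.get(block_id)

    def record_failure(self, block_id, data):
        """Cache a block that has just been recovered after a failure."""
        if block_id in self.frequent:
            # Repeated failure of an already frequent block: refresh it.
            self.frequent[block_id] = data
            self.frequent.move_to_end(block_id)
        elif block_id in self.recent:
            # Second failure: promote from the recency to the frequency segment.
            del self.recent[block_id]
            self.frequent[block_id] = data
        else:
            # First failure: insert at the newest end of the recency segment.
            self.recent[block_id] = data
        self._evict()

    def _evict(self):
        # Evict the oldest once-failed block first; fall back to the
        # oldest frequently failed block only when `recent` is empty.
        while len(self) > self.capacity:
            victim = self.recent if self.recent else self.frequent
            victim.popitem(last=False)


def greedy_recover(cache, stripe, failed_index, decode):
    """Cache the failed block and the blocks after it in the same stripe.

    `stripe` is a list of block ids and `decode` is a caller-supplied
    function that reconstructs one block (both are assumptions here).
    """
    for block_id in stripe[failed_index:]:
        cache.record_failure(block_id, decode(block_id))
```

In this sketch a block that fails twice survives eviction longer than a block that failed only once, which is one simple way to weigh failure frequency against failure recency; a real ARC-style policy would also adapt the relative sizes of the two segments.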

Original language: English (US)
Pages (from-to): 153-167
Number of pages: 15
Journal: Computers, Materials and Continua
Volume: 58
Issue number: 1
DOI: 10.32604/cmc.2019.03585
State: Published - Jan 1 2019

Fingerprint

Storage System
Cache
Recovery
Caching
Hits
System Performance
Coding
Information Sharing

Keywords

  • Erasure coding
  • Failure cache
  • Failure frequency
  • Failure recency
  • Greedy recovery

ASJC Scopus subject areas

  • Biomaterials
  • Modeling and Simulation
  • Mechanics of Materials
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

GFCache: A greedy failure cache considering failure recency and failure frequency for an erasure-coded storage system. / Deng, Mingzhu; Liu, Fang; Zhao, Ming; Chen, Zhiguang; Xiao, Nong.

In: Computers, Materials and Continua, Vol. 58, No. 1, 01.01.2019, p. 153-167.

@article{efb26fb2a00e4befa2c120d63b452142,
title = "GFCache: A greedy failure cache considering failure recency and failure frequency for an erasure-coded storage system",
abstract = "In the big data era, data unavailability, whether temporary or permanent, has become a daily occurrence. Unlike permanent data failures, which are repaired by a background job, temporarily unavailable data is recovered on the fly to serve the ongoing read request. However, the newly recovered data is discarded after serving the request, on the assumption that data experiencing a temporary failure may become available again later. Discarding recovered data prevents clients from sharing failure information and leads to many unnecessary data recovery processes (e.g., caused by recurring unavailability of the same data or by multiple data failures in one stripe), thereby straining system performance. To this end, this paper proposes GFCache, which caches corrupted data for the dual purposes of sharing failure information and eliminating unnecessary data recovery processes. GFCache employs a greedy, opportunistic caching approach that promotes not only the failed data but also sequential failure-likely data in the same stripe. Additionally, GFCache includes a FARC (Failure ARC) cache replacement algorithm, which balances failure recency and failure frequency to accommodate data corruption with a good hit ratio. The data stored in GFCache can also serve fast reads during normal data access. Furthermore, since GFCache is a generic failure cache, it can be used wherever erasure coding is deployed, with any coding scheme and parameters. Evaluations show that GFCache achieves a good hit ratio with its caching algorithm and significantly boosts system performance by keeping vulnerable data in the cache and thereby reducing unnecessary data recoveries.",
keywords = "Erasure coding, Failure cache, Failure frequency, Failure recency, Greedy recovery",
author = "Mingzhu Deng and Fang Liu and Ming Zhao and Zhiguang Chen and Nong Xiao",
year = "2019",
month = "1",
day = "1",
doi = "10.32604/cmc.2019.03585",
language = "English (US)",
volume = "58",
pages = "153--167",
journal = "Computers, Materials and Continua",
issn = "1546-2218",
publisher = "Tech Science Press",
number = "1",

}

TY - JOUR

T1 - GFCache

T2 - A greedy failure cache considering failure recency and failure frequency for an erasure-coded storage system

AU - Deng, Mingzhu

AU - Liu, Fang

AU - Zhao, Ming

AU - Chen, Zhiguang

AU - Xiao, Nong

PY - 2019/1/1

Y1 - 2019/1/1

AB - In the big data era, data unavailability, whether temporary or permanent, has become a daily occurrence. Unlike permanent data failures, which are repaired by a background job, temporarily unavailable data is recovered on the fly to serve the ongoing read request. However, the newly recovered data is discarded after serving the request, on the assumption that data experiencing a temporary failure may become available again later. Discarding recovered data prevents clients from sharing failure information and leads to many unnecessary data recovery processes (e.g., caused by recurring unavailability of the same data or by multiple data failures in one stripe), thereby straining system performance. To this end, this paper proposes GFCache, which caches corrupted data for the dual purposes of sharing failure information and eliminating unnecessary data recovery processes. GFCache employs a greedy, opportunistic caching approach that promotes not only the failed data but also sequential failure-likely data in the same stripe. Additionally, GFCache includes a FARC (Failure ARC) cache replacement algorithm, which balances failure recency and failure frequency to accommodate data corruption with a good hit ratio. The data stored in GFCache can also serve fast reads during normal data access. Furthermore, since GFCache is a generic failure cache, it can be used wherever erasure coding is deployed, with any coding scheme and parameters. Evaluations show that GFCache achieves a good hit ratio with its caching algorithm and significantly boosts system performance by keeping vulnerable data in the cache and thereby reducing unnecessary data recoveries.

KW - Erasure coding

KW - Failure cache

KW - Failure frequency

KW - Failure recency

KW - Greedy recovery

UR - http://www.scopus.com/inward/record.url?scp=85064840711&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064840711&partnerID=8YFLogxK

U2 - 10.32604/cmc.2019.03585

DO - 10.32604/cmc.2019.03585

M3 - Article

VL - 58

SP - 153

EP - 167

JO - Computers, Materials and Continua

JF - Computers, Materials and Continua

SN - 1546-2218

IS - 1

ER -