Server recovery using naturally replicated state: a case study

Murthy Devarakonda, Bill Kish, Ajay Mohindra

Research output: Contribution to conference (Paper)

1 Scopus citation

Abstract

This paper describes the design and preliminary measurements of a file server recovery scheme that uses state naturally replicated among clients. This scheme, implemented in the Calypso file system, is truly transparent to the user and avoids the overhead of explicit replication. A three-phase protocol reconstructs the server state either on a backup node (if disks are multi-ported) or on the rebooted server node. Measurements show that the recovery time is about 21 seconds for a busy 10-node cluster. However, rebuilding the distributed state takes only about 1.5 seconds; most of the recovery time is spent replaying the write-ahead log of the underlying file system. Fortunately, the log redo time is bounded by the log size.
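The core idea in the abstract, rebuilding server state from copies that clients already hold, can be illustrated with a minimal sketch. This is not the Calypso three-phase protocol, whose details the abstract does not give. The token model ("read" and "write" tokens, with write assumed exclusive), the function name rebuild_server_state, and the sample client reports are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def rebuild_server_state(client_reports):
    """Merge per-client token lists into one server-side table.

    client_reports maps a client id to the list of (file_id, mode) pairs
    that the client still holds, where mode is "read" or "write".
    Returns (table, conflicts): table maps file_id -> {client_id: mode},
    and conflicts lists file ids whose reports cannot all be valid at once.
    """
    table = defaultdict(dict)
    for client_id, tokens in client_reports.items():
        for file_id, mode in tokens:
            table[file_id][client_id] = mode

    conflicts = []
    for file_id, holders in table.items():
        writers = [c for c, m in holders.items() if m == "write"]
        # Assumption: a write token is exclusive, so a writer must be
        # the only holder of any token on that file.
        if writers and len(holders) > 1:
            conflicts.append(file_id)
    return dict(table), sorted(conflicts)


# Hypothetical reports gathered from three clients after a server restart.
reports = {
    "client-a": [("f1", "read"), ("f2", "write")],
    "client-b": [("f1", "read")],
    "client-c": [("f3", "write"), ("f1", "write")],
}
table, conflicts = rebuild_server_state(reports)
print(table["f1"])
print(conflicts)
```

In this sketch the recovering server keeps no durable copy of its token table; the table is derived entirely from what clients report, which is the sense in which the state is "naturally" replicated. How a real system resolves the conflicting reports is outside what the abstract describes.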

Original language: English (US)
Pages: 213-220
Number of pages: 8
State: Published - Jan 1 1995
Event: Proceedings of the 15th International Conference on Distributed Computing Systems - Vancouver, Canada
Duration: May 30 1995 to Jun 2 1995

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Devarakonda, M., Kish, B., & Mohindra, A. (1995). Server recovery using naturally replicated state: a case study. 213-220. Paper presented at Proceedings of the 15th International Conference on Distributed Computing Systems, Vancouver, Canada.