Abstract
In this paper we consider rollback propagation in, and the performance of, a fault-tolerant multiprocessor with a rollback recovery mechanism (FTMR2M) [1], which was designed to tolerate hardware failures with minimum time overhead. Rollback propagation between cooperating processes is usually required to ensure correct recovery from a failure. To minimize the wasted processor time and the storage overhead required for handling sophisticated rollback propagations, the FTMR2M always keeps one recoverable state. Approaches for evaluating the recovery overhead and analyzing the performance of the FTMR2M are presented. Two methods for detecting rollback propagations and multi-step rollbacks between cooperating processes are also proposed.
Original language | English (US) |
---|---|
Pages (from-to) | 171-180 |
Number of pages | 10 |
Journal | Proceedings - International Symposium on Computer Architecture |
State | Published - Apr 26 1982 |
Externally published | Yes |
Event | 9th Annual Symposium on Computer Architecture, ISCA 1982 - Austin, United States; Duration: Apr 26 1982 → Apr 29 1982 |
ASJC Scopus subject areas
- Hardware and Architecture