Fairly redistributing failed server load in a distributed system

Venkatesh Sangam, Christopher B. Mayer, Kasim Candan

Research output: Contribution to journal > Article

1 Citation (Scopus)

Abstract

We recently proposed a novel method for large-object replication and load balancing. Our method is particularly well-suited to data grids, data warehousing providers, and hosting of dynamic web sites. The method attempts to distribute object request load fairly to servers according to server capacity so that the likelihood of them overloading, and hence failing, is reduced. Unfortunately, server failures cannot be eliminated entirely. When a server fails, the load carried by that server must be absorbed by the rest of the system. Unless this load is distributed fairly across the remaining servers, they may also overload, creating a cascade of failures and reduced quality of service. In this paper, we propose an efficient method for fairly redistributing the load of a failed server or set of failed servers within our replication system. We also report on experimental results that verify the validity of our approach.
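The abstract does not spell out the redistribution algorithm, so the following is only a minimal sketch of one plausible reading of "fair" redistribution: the load of the failed servers is split among the survivors in proportion to each survivor's spare capacity. The function name `redistribute`, the dictionary-based representation, and the proportional-to-headroom rule are all assumptions made for illustration; they are not taken from the paper.

```python
def redistribute(capacity, load, failed):
    """Spread the load of failed servers over the survivors.

    Hypothetical illustration only (not the paper's algorithm):
    each survivor receives a share of the orphaned load that is
    proportional to its spare capacity (capacity minus current load).
    """
    survivors = [s for s in capacity if s not in failed]
    if not survivors:
        raise ValueError("no surviving servers to absorb the load")

    # Total load that the failed servers were carrying.
    orphaned = sum(load[s] for s in failed)

    # Spare capacity of each survivor, never negative.
    headroom = {s: max(capacity[s] - load[s], 0.0) for s in survivors}
    total_headroom = sum(headroom.values())

    if total_headroom == 0:
        # Assumed fallback: nobody has spare room, so split by raw capacity.
        weights = {s: float(capacity[s]) for s in survivors}
    else:
        weights = headroom
    total_weight = sum(weights.values())

    return {s: load[s] + orphaned * weights[s] / total_weight
            for s in survivors}


if __name__ == "__main__":
    capacity = {"a": 100, "b": 200, "c": 100}
    load = {"a": 50, "b": 100, "c": 50}
    new_load = redistribute(capacity, load, failed={"c"})
    for server, value in sorted(new_load.items()):
        print(server, round(value, 2), round(value / capacity[server], 3))
```

In this example the two survivors end up at the same utilization, which is one way to read "fairly ... according to server capacity". The sketch does not prevent overload: if the orphaned load exceeds the total spare capacity, every survivor is pushed past its capacity by the same relative amount.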

Original language: English (US)
Pages (from-to): 871-884
Number of pages: 14
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2889
State: Published - 2003

Fingerprint

Computer Communication Networks
Distributed Systems
Servers
Server
Reproducibility of Results
Replication
Data Warehousing
Data Grid
Data warehouses
Overload
Load Balancing
Resource allocation
Cascade
Quality of Service
Websites
Likelihood
Quality of service
Verify
Experimental Results

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

@article{ecc2ae75f3d449ffb4ee57c53d2aedc0,
title = "Fairly redistributing failed server load in a distributed system",
abstract = "We recently proposed a novel method for large-object replication and load balancing. Our method is particularly well-suited to data grids, data warehousing providers, and hosting of dynamic web sites. The method attempts to distribute object request load fairly to servers according to server capacity so that the likelihood of them overloading, and hence failing, is reduced. Unfortunately, server failures cannot be eliminated entirely. When a server fails, the load carried by that server must be absorbed by the rest of the system. Unless this load is distributed fairly across the remaining servers, they may also overload, creating a cascade of failures and reduced quality of service. In this paper, we propose an efficient method for fairly redistributing the load of a failed server or set of failed servers within our replication system. We also report on experimental results that verify the validity of our approach.",
author = "Venkatesh Sangam and Mayer, {Christopher B.} and Kasim Candan",
year = "2003",
language = "English (US)",
volume = "2889",
pages = "871--884",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",
}

TY - JOUR

T1 - Fairly redistributing failed server load in a distributed system

AU - Sangam, Venkatesh

AU - Mayer, Christopher B.

AU - Candan, Kasim

PY - 2003

Y1 - 2003

N2 - We recently proposed a novel method for large-object replication and load balancing. Our method is particularly well-suited to data grids, data warehousing providers, and hosting of dynamic web sites. The method attempts to distribute object request load fairly to servers according to server capacity so that the likelihood of them overloading, and hence failing, is reduced. Unfortunately, server failures cannot be eliminated entirely. When a server fails, the load carried by that server must be absorbed by the rest of the system. Unless this load is distributed fairly across the remaining servers, they may also overload, creating a cascade of failures and reduced quality of service. In this paper, we propose an efficient method for fairly redistributing the load of a failed server or set of failed servers within our replication system. We also report on experimental results that verify the validity of our approach.

UR - http://www.scopus.com/inward/record.url?scp=4344571650&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=4344571650&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:4344571650

VL - 2889

SP - 871

EP - 884

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

ER -