The advantages of distributed systems can be realized only when their resources are "optimally" (in some sense) controlled and utilized. For example, distributed systems must be reconfigured dynamically to cope with component failures and workload changes. Owing to the inherent difficulty of formulating and solving resource control problems, the strategies currently proposed or used for distributed systems are largely ad hoc. Our purpose in this paper is to 1) quantitatively formulate the problem of controlling resources in a distributed system so as to optimize a reward function, and 2) derive optimal control strategies using Markov decision theory. The control variables treated here are quite general: for example, they could be control decisions related to system configuration, repair, diagnostics, files, or data. Two algorithms for resource control in distributed systems are derived, for time-invariant and periodic environments, respectively. A detailed example is provided to demonstrate the power and usefulness of our approach.
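To make the Markov-decision-theoretic formulation concrete, the sketch below runs value iteration on a hypothetical two-state resource-control MDP (a node that is "healthy" or "degraded", with "continue" and "repair" actions). The states, transition probabilities, rewards, and discount factor are illustrative assumptions only, not the paper's actual model; the paper's algorithms for time-invariant and periodic environments are more general.

```python
# Toy value-iteration sketch for a resource-control MDP.
# All states, actions, probabilities, and rewards here are
# hypothetical illustrations, not taken from the paper.

STATES = ["healthy", "degraded"]
ACTIONS = ["continue", "repair"]

# P[s][a] = list of (next_state, probability); R[s][a] = expected one-step reward
P = {
    "healthy":  {"continue": [("healthy", 0.9), ("degraded", 0.1)],
                 "repair":   [("healthy", 1.0)]},
    "degraded": {"continue": [("degraded", 1.0)],
                 "repair":   [("healthy", 0.8), ("degraded", 0.2)]},
}
R = {
    "healthy":  {"continue": 10.0, "repair": 5.0},
    "degraded": {"continue": 2.0,  "repair": 1.0},
}

def value_iteration(gamma=0.9, eps=1e-6):
    """Iterate the Bellman optimality update until the value function converges,
    then extract the greedy (optimal) stationary policy."""
    V = {s: 0.0 for s in STATES}
    while True:
        V_new = {
            s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                   for a in ACTIONS)
            for s in STATES
        }
        converged = max(abs(V_new[s] - V[s]) for s in STATES) < eps
        V = V_new
        if converged:
            break
    policy = {
        s: max(ACTIONS,
               key=lambda a: R[s][a] + gamma * sum(p * V[s2]
                                                   for s2, p in P[s][a]))
        for s in STATES
    }
    return V, policy
```

Under these assumed parameters, the optimal policy keeps a healthy node running and repairs a degraded one; the same machinery scales to richer state spaces covering configuration, diagnostics, or file-placement decisions.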