Modeling virtualized applications using machine learning techniques

Sajib Kundu, Raju Rangaswami, Ajay Gulati, Ming Zhao, Kaushik Dutta

Research output: Contribution to journal, Article, peer-review

59 Scopus citations


With the growing adoption of virtualized datacenters and cloud hosting services, the allocation and sizing of resources such as CPU, memory, and I/O bandwidth for virtual machines (VMs) is becoming increasingly important. Accurate performance modeling of an application would help users size VMs better, thus reducing costs. It can also benefit cloud service providers, who can offer a new charging model based on the VMs' performance instead of their configured sizes. In this paper, we present techniques to model the performance of a VM-hosted application as a function of the resources allocated to the VM and the resource contention it experiences. To address this multi-dimensional modeling problem, we propose and refine the use of two machine learning techniques: artificial neural network (ANN) and support vector machine (SVM). We evaluate these modeling techniques using five virtualized applications from the RUBiS and Filebench benchmark suites and demonstrate that their median and 90th percentile prediction errors are within 4.36% and 29.17%, respectively. These results are substantially better than regression-based approaches as well as direct applications of machine learning techniques without our refinements. We also present a simple and effective approach to VM sizing and empirically demonstrate that it delivers optimal results for 65% of the sizing problems that we studied and close-to-optimal sizes for the remaining 35%.
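The abstract summarizes model accuracy as median and 90th percentile prediction errors. As a rough illustration of how such a summary might be computed, the sketch below assumes the error is the absolute percentage difference between measured and predicted performance, and uses a nearest-rank percentile; the abstract does not state the exact definitions the authors used. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math
from statistics import median


def relative_errors(actual, predicted):
    """Absolute percentage error for each (actual, predicted) pair."""
    return [abs(p - a) / abs(a) * 100.0 for a, p in zip(actual, predicted)]


def percentile(values, q):
    """Nearest-rank percentile: the smallest value such that at least
    q percent of the data is less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Made-up measurements (e.g. requests/sec) and model predictions,
# for illustration only; these are not results from the paper.
actual = [100.0, 200.0, 400.0, 500.0, 800.0]
predicted = [110.0, 190.0, 400.0, 600.0, 820.0]

errors = relative_errors(actual, predicted)
median_error = median(errors)
p90_error = percentile(errors, 90)

print(f"median error: {median_error:.2f}%")
print(f"90th percentile error: {p90_error:.2f}%")
```

Reporting both figures shows typical accuracy (the median) alongside how badly the model can miss on its harder cases (the 90th percentile), which a single mean error would blur together.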

Original language: English (US)
Pages (from-to): 3-14
Number of pages: 12
Journal: ACM SIGPLAN Notices
Issue number: 7
State: Published - Sep 1 2012
Externally published: Yes


  • Cloud data centers
  • Machine learning
  • Performance modeling
  • VM sizing
  • Virtualization

ASJC Scopus subject areas

  • Computer Science (all)


