Using synthetic data to evaluate multiple regression and principal component analyses for statistical modeling of daily building energy consumption

T Agami Reddy, D. E. Claridge

Research output: Contribution to journal › Article

41 Citations (Scopus)

Abstract

Multiple regression modeling of monitored building energy use data is often faulted as a reliable means of predicting energy use on the grounds that multicollinearity between the regressor variables can lead both to improper interpretation of the relative importance of the various physical regressor parameters and to a model with unstable regressor coefficients. Principal component analysis (PCA) has the potential to overcome such drawbacks. While a few case studies have already attempted to apply this technique to building energy data, the objectives of this study were to make a broader evaluation of PCA and multiple regression analysis (MRA) and to establish guidelines under which one approach is preferable to the other. Four geographic locations in the US with different climatic conditions were selected and synthetic data sequences representative of daily energy use in large institutional buildings were generated in each location using a linear model with outdoor temperature, outdoor specific humidity and solar radiation as the three regression variables. MRA and PCA approaches were then applied to these data sets and their relative performances were compared. Conditions under which PCA seems to perform better than MRA were identified and preliminary recommendations on the use of either modeling approach formulated.
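The comparison described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' procedure: it uses principal component regression as one common way of applying PCA to a regression problem, and every number in it (seasonal amplitudes, noise levels, the "true" coefficients, the 365-day record length) is a hypothetical value chosen for the demonstration rather than taken from the paper. The function names `make_synthetic`, `fit_mra`, `fit_pcr`, `predict`, and `rmse` are likewise invented here.

```python
import numpy as np


def make_synthetic(n_days=365, seed=0):
    """Generate hypothetical daily data with three collinear regressors."""
    rng = np.random.default_rng(seed)
    day = np.arange(n_days)
    # A shared seasonal signal makes the three regressors collinear.
    season = np.sin(2.0 * np.pi * (day - 100) / 365.0)
    temp = 18.0 + 10.0 * season + rng.normal(0.0, 2.0, n_days)       # deg C
    humid = 0.009 + 0.004 * season + rng.normal(0.0, 0.001, n_days)  # kg/kg
    solar = 180.0 + 80.0 * season + rng.normal(0.0, 30.0, n_days)    # W/m^2
    X = np.column_stack([temp, humid, solar])
    true_coef = np.array([2.0, 1500.0, 0.05])  # hypothetical values
    # Linear model plus noise stands in for daily energy use.
    y = 100.0 + X @ true_coef + rng.normal(0.0, 5.0, n_days)
    return X, y


def fit_mra(X, y):
    """Ordinary least squares; returns [intercept, b1, b2, b3]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def fit_pcr(X, y, n_components):
    """Regress y on the leading principal components of standardized X.

    Coefficients are mapped back to the original regressor units so they
    can be compared directly with the output of fit_mra.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    Z = (X - mean) / std
    # Rows of Vt are the principal directions of the standardized data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    scores = Z @ V
    A = np.column_stack([np.ones(len(Z)), scores])
    gamma, *_ = np.linalg.lstsq(A, y, rcond=None)
    coef = (V @ gamma[1:]) / std
    intercept = gamma[0] - mean @ coef
    return np.concatenate([[intercept], coef])


def predict(beta, X):
    return beta[0] + X @ beta[1:]


def rmse(beta, X, y):
    return float(np.sqrt(np.mean((y - predict(beta, X)) ** 2)))


if __name__ == "__main__":
    X, y = make_synthetic()
    print("MRA coefficients:       ", fit_mra(X, y))
    print("PCR (2 comp.) coefficients:", fit_pcr(X, y, 2))
```

When all three components are retained, the two fits give the same predictions, so any difference between them comes entirely from the components that are dropped. Refitting with different values of `seed` gives a rough sense of how much each set of coefficients varies from one synthetic sample to the next.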

Original language: English (US)
Pages (from-to): 35-44
Number of pages: 10
Journal: Energy and Buildings
Volume: 21
Issue number: 1
DOI: https://doi.org/10.1016/0378-7788(94)90014-0
State: Published - 1994
Externally published: Yes

Fingerprint

Principal component analysis
Energy utilization
Regression analysis
Solar radiation
Atmospheric humidity
Temperature

ASJC Scopus subject areas

  • Renewable Energy, Sustainability and the Environment
  • Civil and Structural Engineering

Cite this

@article{21fbc1aaf4674bc78f5de1b4aa278252,
title = "Using synthetic data to evaluate multiple regression and principal component analyses for statistical modeling of daily building energy consumption",
abstract = "Multiple regression modeling of monitored building energy use data is often faulted as a reliable means of predicting energy use on the grounds that multicollinearity between the regressor variables can lead both to improper interpretation of the relative importance of the various physical regressor parameters and to a model with unstable regressor coefficients. Principal component analysis (PCA) has the potential to overcome such drawbacks. While a few case studies have already attempted to apply this technique to building energy data, the objectives of this study were to make a broader evaluation of PCA and multiple regression analysis (MRA) and to establish guidelines under which one approach is preferable to the other. Four geographic locations in the US with different climatic conditions were selected and synthetic data sequences representative of daily energy use in large institutional buildings were generated in each location using a linear model with outdoor temperature, outdoor specific humidity and solar radiation as the three regression variables. MRA and PCA approaches were then applied to these data sets and their relative performances were compared. Conditions under which PCA seems to perform better than MRA were identified and preliminary recommendations on the use of either modeling approach formulated.",
author = "Reddy, {T Agami} and Claridge, {D. E.}",
year = "1994",
doi = "10.1016/0378-7788(94)90014-0",
language = "English (US)",
volume = "21",
pages = "35--44",
journal = "Energy and Buildings",
issn = "0378-7788",
publisher = "Elsevier BV",
number = "1",
}

TY - JOUR
T1 - Using synthetic data to evaluate multiple regression and principal component analyses for statistical modeling of daily building energy consumption
AU - Reddy, T Agami
AU - Claridge, D. E.
PY - 1994
Y1 - 1994

N2 - Multiple regression modeling of monitored building energy use data is often faulted as a reliable means of predicting energy use on the grounds that multicollinearity between the regressor variables can lead both to improper interpretation of the relative importance of the various physical regressor parameters and to a model with unstable regressor coefficients. Principal component analysis (PCA) has the potential to overcome such drawbacks. While a few case studies have already attempted to apply this technique to building energy data, the objectives of this study were to make a broader evaluation of PCA and multiple regression analysis (MRA) and to establish guidelines under which one approach is preferable to the other. Four geographic locations in the US with different climatic conditions were selected and synthetic data sequences representative of daily energy use in large institutional buildings were generated in each location using a linear model with outdoor temperature, outdoor specific humidity and solar radiation as the three regression variables. MRA and PCA approaches were then applied to these data sets and their relative performances were compared. Conditions under which PCA seems to perform better than MRA were identified and preliminary recommendations on the use of either modeling approach formulated.

AB - Multiple regression modeling of monitored building energy use data is often faulted as a reliable means of predicting energy use on the grounds that multicollinearity between the regressor variables can lead both to improper interpretation of the relative importance of the various physical regressor parameters and to a model with unstable regressor coefficients. Principal component analysis (PCA) has the potential to overcome such drawbacks. While a few case studies have already attempted to apply this technique to building energy data, the objectives of this study were to make a broader evaluation of PCA and multiple regression analysis (MRA) and to establish guidelines under which one approach is preferable to the other. Four geographic locations in the US with different climatic conditions were selected and synthetic data sequences representative of daily energy use in large institutional buildings were generated in each location using a linear model with outdoor temperature, outdoor specific humidity and solar radiation as the three regression variables. MRA and PCA approaches were then applied to these data sets and their relative performances were compared. Conditions under which PCA seems to perform better than MRA were identified and preliminary recommendations on the use of either modeling approach formulated.

UR - http://www.scopus.com/inward/record.url?scp=0028593310&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0028593310&partnerID=8YFLogxK
U2 - 10.1016/0378-7788(94)90014-0
DO - 10.1016/0378-7788(94)90014-0
M3 - Article
AN - SCOPUS:0028593310
VL - 21
SP - 35
EP - 44
JO - Energy and Buildings
JF - Energy and Buildings
SN - 0378-7788
IS - 1
ER -