Heteroscedastic BART via Multiplicative Regression Trees

M. T. Pratola, H. A. Chipman, E. I. George, R. E. McCulloch

Research output: Contribution to journal › Article

Abstract

Bayesian additive regression trees (BART) has become increasingly popular as a flexible and scalable nonparametric regression approach for modern applied statistics problems. For the practitioner dealing with large and complex nonlinear response surfaces, its advantages include a matrix-free formulation and the lack of a requirement to prespecify a confining regression basis. Although flexible in fitting the mean, BART has been limited by its reliance on a constant variance error model. Alleviating this limitation, we propose HBART, a nonparametric heteroscedastic elaboration of BART. In BART, the mean function is modeled with a sum of trees, each of which determines an additive contribution to the mean. In HBART, the variance function is further modeled with a product of trees, each of which determines a multiplicative contribution to the variance. Like the mean model, this flexible, multidimensional variance model is entirely nonparametric with no need for the prespecification of a confining basis. Moreover, with this enhancement, HBART can provide insights into the potential relationships of the predictors with both the mean and the variance. Practical implementations of HBART with revealing new diagnostic plots are demonstrated with simulated and real data on used car prices and song year of release. Supplementary materials for this article are available online.
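
The structure described in the abstract, a sum of trees for the mean and a product of trees for the variance, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `stump` helper, the split points, and the leaf values are hypothetical, and the trees are fixed by hand rather than inferred from data as HBART does.

```python
import math
import random


def stump(var, cut, left, right):
    """Hypothetical one-split tree: `left` if x[var] < cut, else `right`."""
    def tree(x):
        return left if x[var] < cut else right
    return tree


# Illustrative, hand-picked trees (not fitted to any data).
mean_trees = [stump(0, 0.5, -1.0, 1.0), stump(1, 0.3, 0.2, -0.2)]
# Variance trees must return positive leaf values so the product stays positive.
var_trees = [stump(0, 0.5, 0.5, 2.0), stump(1, 0.7, 1.0, 1.5)]


def mean_fn(x):
    # Sum of trees: each tree makes an additive contribution to the mean.
    return sum(tree(x) for tree in mean_trees)


def var_fn(x):
    # Product of trees: each tree makes a multiplicative contribution
    # to the variance.
    return math.prod(tree(x) for tree in var_trees)


def simulate(x, rng):
    # One draw of y = f(x) + s(x) * Z with Z ~ N(0, 1).
    return mean_fn(x) + math.sqrt(var_fn(x)) * rng.gauss(0.0, 1.0)


x_low, x_high = (0.2, 0.1), (0.9, 0.9)
rng = random.Random(0)
for x in (x_low, x_high):
    print(x, "mean:", mean_fn(x), "variance:", var_fn(x), "draw:", simulate(x, rng))
```

In this toy setup the variance changes with the predictors because each variance tree rescales it, which is the heteroscedastic behavior that a constant-variance error model cannot represent.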

Original language: English (US)
Journal: Journal of Computational and Graphical Statistics
DOI: 10.1080/10618600.2019.1677243
State: Accepted/In press - Jan 1 2019

Fingerprint

Regression Tree
Multiplicative
Diagnostic Plot
Variance Function
Error Model
Response Surface
Nonlinear Response
Nonparametric Regression
Predictors
Enhancement
Regression
Statistics
Formulation
Requirements
Model

Keywords

  • Applied statistical inference
  • Big data
  • Nonparametric regression
  • Uncertainty quantification

ASJC Scopus subject areas

  • Statistics and Probability
  • Discrete Mathematics and Combinatorics
  • Statistics, Probability and Uncertainty

Cite this

Heteroscedastic BART via Multiplicative Regression Trees. / Pratola, M. T.; Chipman, H. A.; George, E. I.; McCulloch, R. E.

In: Journal of Computational and Graphical Statistics, 01.01.2019.

@article{17b54bb3044a49c0b469b31f77f52c6f,
title = "Heteroscedastic BART via Multiplicative Regression Trees",
keywords = "Applied statistical inference, Big data, Nonparametric regression, Uncertainty quantification",
author = "Pratola, {M. T.} and Chipman, {H. A.} and George, {E. I.} and McCulloch, {R. E.}",
year = "2019",
month = "1",
day = "1",
doi = "10.1080/10618600.2019.1677243",
language = "English (US)",
journal = "Journal of Computational and Graphical Statistics",
issn = "1061-8600",
publisher = "American Statistical Association",

}

TY - JOUR

T1 - Heteroscedastic BART via Multiplicative Regression Trees

AU - Pratola, M. T.

AU - Chipman, H. A.

AU - George, E. I.

AU - McCulloch, R. E.

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Bayesian additive regression trees (BART) has become increasingly popular as a flexible and scalable nonparametric regression approach for modern applied statistics problems. For the practitioner dealing with large and complex nonlinear response surfaces, its advantages include a matrix-free formulation and the lack of a requirement to prespecify a confining regression basis. Although flexible in fitting the mean, BART has been limited by its reliance on a constant variance error model. Alleviating this limitation, we propose HBART, a nonparametric heteroscedastic elaboration of BART. In BART, the mean function is modeled with a sum of trees, each of which determines an additive contribution to the mean. In HBART, the variance function is further modeled with a product of trees, each of which determines a multiplicative contribution to the variance. Like the mean model, this flexible, multidimensional variance model is entirely nonparametric with no need for the prespecification of a confining basis. Moreover, with this enhancement, HBART can provide insights into the potential relationships of the predictors with both the mean and the variance. Practical implementations of HBART with revealing new diagnostic plots are demonstrated with simulated and real data on used car prices and song year of release. Supplementary materials for this article are available online.

KW - Applied statistical inference

KW - Big data

KW - Nonparametric regression

KW - Uncertainty quantification

UR - http://www.scopus.com/inward/record.url?scp=85075380404&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85075380404&partnerID=8YFLogxK

U2 - 10.1080/10618600.2019.1677243

DO - 10.1080/10618600.2019.1677243

M3 - Article

AN - SCOPUS:85075380404

JO - Journal of Computational and Graphical Statistics

JF - Journal of Computational and Graphical Statistics

SN - 1061-8600

ER -