Stochastic Tree Ensembles for Regularized Nonlinear Regression

Jingyu He; P. Richard Hahn

doi:10.1080/01621459.2021.1942012

School of Mathematical and Statistical Sciences (SoMSS)

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

This article develops a novel stochastic tree ensemble method for nonlinear regression, referred to as accelerated Bayesian additive regression trees, or XBART. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning algorithms, XBART attains state-of-the-art performance at prediction and function estimation. Simulation studies demonstrate that XBART provides accurate point-wise estimates of the mean function and does so faster than popular alternatives, such as BART, XGBoost, and neural networks (using Keras) on a variety of test functions. Additionally, it is demonstrated that using XBART to initialize the standard BART MCMC algorithm considerably improves credible interval coverage and reduces total run-time. Finally, two basic theoretical results are established: the single tree version of the model is asymptotically consistent and the Markov chain produced by the ensemble version of the algorithm has a unique stationary distribution.

Original language	English (US)
Pages (from-to)	551-570
Number of pages	20
Journal	Journal of the American Statistical Association
Volume	118
Issue number	541
DOIs	https://doi.org/10.1080/01621459.2021.1942012
State	Published - 2023

Keywords

Bayesian
Machine learning
Markov chain Monte Carlo
Regression trees
Supervised learning

ASJC Scopus subject areas

Statistics and Probability
Statistics, Probability and Uncertainty

Cite this

@article{9050cf0899ca4e80b06aed0f252fe1b4,
  title = "Stochastic Tree Ensembles for Regularized Nonlinear Regression",
  abstract = "This article develops a novel stochastic tree ensemble method for nonlinear regression, referred to as accelerated Bayesian additive regression trees, or XBART. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning algorithms, XBART attains state-of-the-art performance at prediction and function estimation. Simulation studies demonstrate that XBART provides accurate point-wise estimates of the mean function and does so faster than popular alternatives, such as BART, XGBoost, and neural networks (using Keras) on a variety of test functions. Additionally, it is demonstrated that using XBART to initialize the standard BART MCMC algorithm considerably improves credible interval coverage and reduces total run-time. Finally, two basic theoretical results are established: the single tree version of the model is asymptotically consistent and the Markov chain produced by the ensemble version of the algorithm has a unique stationary distribution.",
  keywords = "Bayesian, Machine learning, Markov chain Monte Carlo, Regression trees, Supervised learning",
  author = "He, Jingyu and Hahn, {P. Richard}",
  note = "Publisher Copyright: {\textcopyright} 2021 American Statistical Association.",
  year = "2023",
  doi = "10.1080/01621459.2021.1942012",
  language = "English (US)",
  volume = "118",
  pages = "551--570",
  journal = "Journal of the American Statistical Association",
  issn = "0162-1459",
  publisher = "Taylor and Francis Ltd.",
  number = "541",
}

TY  - JOUR
T1  - Stochastic Tree Ensembles for Regularized Nonlinear Regression
AU  - He, Jingyu
AU  - Hahn, P. Richard
PY  - 2023
Y1  - 2023
N2  - This article develops a novel stochastic tree ensemble method for nonlinear regression, referred to as accelerated Bayesian additive regression trees, or XBART. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning algorithms, XBART attains state-of-the-art performance at prediction and function estimation. Simulation studies demonstrate that XBART provides accurate point-wise estimates of the mean function and does so faster than popular alternatives, such as BART, XGBoost, and neural networks (using Keras) on a variety of test functions. Additionally, it is demonstrated that using XBART to initialize the standard BART MCMC algorithm considerably improves credible interval coverage and reduces total run-time. Finally, two basic theoretical results are established: the single tree version of the model is asymptotically consistent and the Markov chain produced by the ensemble version of the algorithm has a unique stationary distribution.
KW  - Bayesian
KW  - Machine learning
KW  - Markov chain Monte Carlo
KW  - Regression trees
KW  - Supervised learning
UR  - http://www.scopus.com/inward/record.url?scp=85112688498&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85112688498&partnerID=8YFLogxK
U2  - 10.1080/01621459.2021.1942012
DO  - 10.1080/01621459.2021.1942012
M3  - Article
AN  - SCOPUS:85112688498
SN  - 0162-1459
VL  - 118
SP  - 551
EP  - 570
JO  - Journal of the American Statistical Association
JF  - Journal of the American Statistical Association
IS  - 541
ER  -
