Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package

Rodney Sparapani, Charles Spanbauer, Robert McCulloch

Research output: Contribution to journal › Article › peer-review

46 Scopus citations

Abstract

In this article, we introduce the BART R package; BART is an acronym for Bayesian additive regression trees. BART is a Bayesian nonparametric, machine learning, ensemble predictive modeling method for continuous, binary, categorical and time-to-event outcomes. Furthermore, BART is a tree-based, black-box method which fits the outcome to an arbitrary random function, f, of the covariates. The BART technique is relatively computationally efficient compared to its competitors, but large sample sizes can be demanding. Therefore, the BART package includes efficient state-of-the-art implementations for continuous, binary, categorical and time-to-event outcomes that can take advantage of modern off-the-shelf hardware and software multi-threading technology. The BART package is written in C++ for both programmer and execution efficiency. It takes advantage of multi-threading via forking, as provided by the parallel package, and via OpenMP when available and supported by the platform. The ensemble of binary trees produced by a BART fit can be stored and re-used later via the R predict function. In addition to being an R package, the installed BART routines can be called directly from C++. The BART package provides the tools for your BART toolbox.
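
For orientation, the sum-of-trees form commonly used for BART with a continuous outcome can be sketched as below. The notation is an assumption drawn from the general BART literature, not from this abstract, which names only f: m is the number of trees, T_j is the structure of tree j, M_j is its set of leaf values, and the error is taken to be Gaussian.

```latex
% Assumed notation (not given in the abstract): m trees, T_j = tree structure,
% M_j = leaf values of tree j, Gaussian error with variance sigma^2.
\begin{align}
  y_i  &= f(x_i) + \epsilon_i, \qquad \epsilon_i \sim \mathrm{N}(0, \sigma^2), \\
  f(x) &= \sum_{j=1}^{m} g(x;\, T_j, M_j).
\end{align}
```

Here g(x; T_j, M_j) returns the leaf value of tree j for the region containing x, so f is a sum of m piecewise-constant functions. Under this sketch, f is random because a prior is placed on each pair (T_j, M_j).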

Original language: English (US)
Pages (from-to): 1-66
Number of pages: 66
Journal: Journal of Statistical Software
Volume: 97
Issue number: 1
State: Published - 2021

Keywords

  • Binary trees
  • Black-box
  • Categorical
  • Competing risks
  • Continuous
  • Ensemble predictive model
  • Forking
  • Multi-threading
  • Multinomial
  • OpenMP
  • Recurrent events
  • Survival analysis

ASJC Scopus subject areas

  • Software
  • Statistics and Probability
  • Statistics, Probability and Uncertainty
