Boost-R: Gradient boosted trees for recurrence data

Xiao Liu, Rong Pan

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Recurrence data arise in many domains, including reliability, cyber security, healthcare, and online retailing. This paper investigates an additive-tree-based approach, known as Boost-R (Boosting for Recurrence Data), for recurrent event data with both static and dynamic features. Boost-R constructs an ensemble of gradient boosted additive trees to estimate the cumulative intensity function of the recurrent event process, where each new tree is added to the ensemble by minimizing the regularized L2 distance between the observed and predicted cumulative intensity. Unlike conventional regression trees, which assign a constant to each leaf, Boost-R constructs a time-dependent function on each tree leaf. The sum of these functions over all trees yields the ensemble estimator of the cumulative intensity. The divide-and-conquer nature of tree-based methods is appealing when hidden sub-populations exist within a heterogeneous population. The non-parametric nature of regression trees avoids parametric assumptions on the complex interactions between event processes and features. Critical insights and advantages of Boost-R are investigated through comprehensive numerical examples. Datasets and computer code for Boost-R are made available on GitHub. To the best of our knowledge, Boost-R is the first gradient boosted additive-tree-based approach for modeling large-scale recurrent event data with both static and dynamic feature information.
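
The boosting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' released code, and it makes several simplifying assumptions: one static feature per unit, no dynamic features, cumulative event counts observed on a common time grid, depth-one trees, and a leaf function stored as one value per grid point. The function names (`fit_stump`, `boost`, `predict`), the ridge penalty `lam`, and the learning rate `lr` are hypothetical and chosen only for illustration. Under a squared (L2) loss the negative gradient is the residual, so each new tree is fitted to the residual cumulative counts.

```python
import numpy as np

def fit_stump(x, resid, lam):
    """Find the best single split on x (assumes at least two distinct values).

    Each leaf holds a time-dependent function: one value per time-grid point.
    """
    best = None
    for thr in np.unique(x)[:-1]:  # every split leaves both sides non-empty
        left = x <= thr
        loss, leaves = 0.0, []
        for mask in (left, ~left):
            # Minimizer of sum_i (r_i(t) - g(t))^2 + lam * g(t)^2 at each t
            g = resid[mask].sum(axis=0) / (mask.sum() + lam)
            loss += ((resid[mask] - g) ** 2).sum() + lam * (g ** 2).sum()
            leaves.append(g)
        if best is None or loss < best[0]:
            best = (loss, thr, leaves[0], leaves[1])
    return best[1:]

def boost(x, counts, n_trees=50, lr=0.1, lam=1.0):
    """Add stumps one at a time, each fitted to the current residuals."""
    pred = np.zeros_like(counts, dtype=float)
    trees = []
    for _ in range(n_trees):
        thr, g_left, g_right = fit_stump(x, counts - pred, lam)
        pred += lr * np.where((x <= thr)[:, None], g_left, g_right)
        trees.append((thr, g_left, g_right))
    return trees

def predict(trees, lr, x_new):
    """Ensemble estimate: the shrunken sum of leaf functions over all trees."""
    x_new = np.asarray(x_new, dtype=float)
    out = np.zeros((x_new.shape[0], trees[0][1].shape[0]))
    for thr, g_left, g_right in trees:
        out += lr * np.where((x_new <= thr)[:, None], g_left, g_right)
    return out

# Synthetic illustration (made-up data): two hidden sub-populations
# with different event rates, separated by the feature x.
rng = np.random.default_rng(0)
n_units, n_times = 200, 10
x = rng.uniform(0.0, 1.0, n_units)
rate = np.where(x < 0.5, 0.5, 2.0)
increments = rng.poisson(np.repeat(rate[:, None], n_times, axis=1))
counts = np.cumsum(increments, axis=1)  # observed cumulative counts per unit

trees = boost(x, counts, n_trees=50, lr=0.1, lam=1.0)
curves = predict(trees, 0.1, [0.1, 0.9])  # estimated curves for two units
```

In this sketch the first split tends to separate the two sub-populations, which mirrors the divide-and-conquer argument in the abstract. A fuller treatment would replace the per-grid-point leaf values with a smooth function of time and would handle dynamic features, neither of which is attempted here.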

Original language: English (US)
Pages (from-to): 545-565
Number of pages: 21
Journal: Journal of Quality Technology
Volume: 53
Issue number: 5
State: Published - 2021
Externally published: Yes

Keywords

  • additive trees
  • feature selection
  • gradient boosting
  • recurrent event data
  • reliability

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Strategy and Management
  • Management Science and Operations Research
  • Industrial and Manufacturing Engineering
