TY - GEN
T1 - Zero-inflated boosted ensembles for rare event counts
AU - Borisov, Alexander
AU - Runger, George
AU - Tuv, Eugene
AU - Lurponglukana-Strand, Nuttha
N1 - Funding Information: This material is based upon work supported by the National Science Foundation under Grant No. 0743160.
PY - 2009
Y1 - 2009
AB - Two linked ensembles are used for a supervised learning problem with rare-event counts. With many target instances of zero, more traditional loss functions (such as squared error and class error) are often not relevant and a statistical model leads to a likelihood with two related parameters from a zero-inflated Poisson (ZIP) distribution. In a new approach, a linked pair of gradient boosted tree ensembles are developed to handle the multiple parameters in a manner that can be generalized to other problems. The result is a unique learner that extends machine learning methods to data with nontraditional structures. We empirically compare to two real data sets and two artificial data sets versus a single-tree approach (ZIP-tree) and a statistical generalized linear model.
UR - http://www.scopus.com/inward/record.url?scp=70349855115&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349855115&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-03915-7_20
DO - 10.1007/978-3-642-03915-7_20
M3 - Conference contribution
AN - SCOPUS:70349855115
SN - 3642039146
SN - 9783642039140
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 225
EP - 236
BT - Advances in Intelligent Data Analysis VIII - 8th International Symposium on Intelligent Data Analysis, IDA 2009, Proceedings
T2 - 8th International Symposium on Intelligent Data Analysis, IDA 2009
Y2 - 31 August 2009 through 2 September 2009
ER -