Abstract

Public health surveillance is a special case of the general problem of monitoring counts (or rates) of events for changes. Modern data complement event counts with many additional measurements (geographic, demographic, and others) that form high-dimensional covariates. This raises an important challenge: detecting a change that occurs only within a region of covariate space that is not specified in advance. Current methods for handling covariate information are limited to low-dimensional data. The approach presented in this article transforms the problem into one of supervised learning, so that an appropriate learner and signal criteria can be defined. A feature selection algorithm identifies the covariates that contribute to the model (either individually or through interactions), and the result is used to generate a signal based on formal statistical inference. A measure of statistical significance is also included to control false alarms. Graphical plots are used to isolate change locations in covariate space. Results on a variety of simulated examples are provided.
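The transformation to supervised learning can be illustrated with a minimal sketch. It is not the article's method: a single decision stump stands in for the ensemble learner and feature selection, and a label-permutation test stands in for the article's significance measure. The function names (`best_stump`, `detect_change`) and the simulated data are hypothetical. Events from a reference period are labeled 0 and events from the monitored period are labeled 1; if a learner separates the two classes better than chance using the covariates, that suggests a change localized in covariate space.

```python
import random


def best_stump(X, y):
    """Best single-split classifier: (training accuracy, feature index, threshold)."""
    n = len(y)
    best = (0.0, None, None)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            # Each side predicts its majority class; minority points are errors.
            errors = (min(sum(left), len(left) - sum(left))
                      + min(sum(right), len(right) - sum(right)))
            accuracy = 1.0 - errors / n
            if accuracy > best[0]:
                best = (accuracy, j, t)
    return best


def detect_change(reference, monitored, n_perm=99, seed=0):
    """Label reference events 0 and monitored events 1, then test separability."""
    rng = random.Random(seed)
    X = reference + monitored
    y = [0] * len(reference) + [1] * len(monitored)
    observed, feature, threshold = best_stump(X, y)

    # Permutation test (an assumption of this sketch): shuffle the period
    # labels, refit, and count how often chance matches the observed accuracy.
    exceed = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        if best_stump(X, shuffled)[0] >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_perm)
    return {"accuracy": observed, "feature": feature,
            "threshold": threshold, "p_value": p_value}


# Simulated example (hypothetical data): three covariates on [0, 1].
# The monitored period has excess events where covariate 0 exceeds 0.8.
rng = random.Random(1)
reference = [[rng.random() for _ in range(3)] for _ in range(60)]
monitored = [[rng.random() for _ in range(3)] for _ in range(20)]
monitored += [[rng.uniform(0.8, 1.0), rng.random(), rng.random()]
              for _ in range(40)]

result = detect_change(reference, monitored)
print(result)
```

A small p-value would trigger a signal, and the selected feature and threshold give a rough indication of where in covariate space the change lies. A stump can only find one axis-aligned split, so a region defined by interactions among covariates would need a richer learner, such as the tree ensembles named in the keywords.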

Original language: English (US)
Pages (from-to): 770-789
Number of pages: 20
Journal: IIE Transactions (Institute of Industrial Engineers)
Volume: 46
Issue number: 8
State: Published - Aug 3 2014

Keywords

  • Epidemiology
  • data mining
  • decision trees
  • ensembles
  • feature selection

ASJC Scopus subject areas

  • Industrial and Manufacturing Engineering

Fingerprint Dive into the research topics of 'Public health surveillance with ensemble-based supervised learning'. Together they form a unique fingerprint.
