Abstract
A common goal of most public health surveillance programs is to detect disease outbreaks before they become a threat to the public. In this work, we propose a novel and computationally feasible approach to this problem. By tackling public health surveillance with a supervised learner that can handle high-dimensional, mixed-type data, and even missing values, we developed a method that can accurately detect changes in disease incidence rates, even in high dimensions. We use probability estimates from random forests to develop an alternative signal criterion that can detect when there is a concentration of disease incidences within a particular geographic region and/or subpopulation that is unlikely to have occurred by chance. A series of simulated experiments suggests that this method accurately detects the presence of disease clusters, on average, 88% of the time. Simulated results also suggest a feasible combination of the method's parameters that can significantly reduce the method's computational cost, to an average system time of 1.9 minutes (s = 0.48 minutes) for a data set containing 1,000 incidences running on an Intel Core i5 processor.
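The abstract does not spell out the signal criterion, so the following is only an illustrative sketch of the general idea of signaling on a concentration that is "unlikely to have occurred by chance". It assumes the per-record probability estimates have already been produced by a classifier such as a random forest. The names `signal` and `binomial_upper_tail`, the cutoff `prob_cutoff`, the `baseline_rate`, and the binomial tail test itself are all hypothetical choices made for this example, not the criterion developed in the paper.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def signal(prob_estimates, baseline_rate, prob_cutoff=0.5, alpha=0.01):
    """Hypothetical signal rule for one region or subpopulation.

    prob_estimates: per-record class-probability estimates, assumed to come
        from an already-fitted classifier (e.g. a random forest).
    baseline_rate: assumed chance that a record exceeds prob_cutoff when
        no cluster is present.
    Returns (signaled, p_value).
    """
    n = len(prob_estimates)
    # Count the records the classifier considers likely disease incidences.
    k = sum(p > prob_cutoff for p in prob_estimates)
    # How surprising is a count this large if only chance were at work?
    p_value = binomial_upper_tail(k, n, baseline_rate)
    return p_value < alpha, p_value


# Illustrative inputs only: 8 of 20 records with high estimated probability.
clustered = [0.9] * 8 + [0.2] * 12
# Only 2 of 20 records with high estimated probability.
scattered = [0.9] * 2 + [0.2] * 18

signaled_clustered, _ = signal(clustered, baseline_rate=0.1)
signaled_scattered, _ = signal(scattered, baseline_rate=0.1)
```

With these made-up inputs, the first region signals and the second does not. In practice the cutoff, the baseline rate, and `alpha` would all need to be calibrated, for example by simulation.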
Original language | English (US) |
---|---|
Title of host publication | IIE Annual Conference and Expo 2013 |
Publisher | Institute of Industrial Engineers |
Pages | 2551-2560 |
Number of pages | 10 |
State | Published - 2013 |
Event | IIE Annual Conference and Expo 2013 - San Juan, Puerto Rico; Duration: May 18 2013 → May 22 2013 |
Other
Other | IIE Annual Conference and Expo 2013 |
---|---|
Country/Territory | Puerto Rico |
City | San Juan |
Period | 5/18/13 → 5/22/13 |
ASJC Scopus subject areas
- Industrial and Manufacturing Engineering