Integrating Bioinformatics and Clustering Analysis for Disease Surveillance

Project: Research project

Project Details


Bioinformatics has focused heavily on translating data from the bench into information and knowledge for clinical decision-making, including the analysis of human genetics for personalized medicine and treatment. However, much less attention has been paid to translational bioinformatics for public health practice, such as surveillance of emerging and re-emerging viruses. This practice involves acquiring, integrating, and analyzing viral genetic data to infer the origin, spread, and evolution of a virus, including the emergence of new strains. The relevant scientific fields include aspects of molecular epidemiology and phylogeography. Recent attention has focused on viruses of zoonotic origin, defined as pathogens that are transmissible between animals and humans. In addition to seasonal influenza and West Nile virus, this class of pathogens includes novel viruses such as Middle East respiratory syndrome coronavirus (MERS-CoV) and influenza A (H7N9). Despite the successes highlighted in the literature, state public health, agriculture, and wildlife agencies have made little use of bioinformatics resources and tools for zoonotic surveillance. To date, use of these resources has been restricted primarily to academia.

While bioinformatics has been sparsely used for surveillance of zoonotic viruses, other applications such as geographic information systems (GIS) have been employed by state health agencies to analyze spatial patterns of infection. These include software that produces disease maps from an array of data types, such as clinical, geographical, or human mobility data, for tasks such as geocoding, clustering, or outbreak detection. In addition, advances in geospatial statistics have enabled health agencies to perform more powerful space-time analyses to infer spatiotemporal patterns. However, these GIS consider only traditional epidemiological data, such as the location and timing of reported cases, and not the genetics of the virus that causes the disease. This prevents health agencies from understanding how changes in the viral genome, and in the environment in which the virus disseminates, affect disease risk.

The long-term goal of this proposal is to enhance the identification of geospatial hotspots of zoonotic viruses by applying bioinformatics principles to access, integrate, and analyze viral genetics and spatiotemporal reportable disease data. This project will draw on approaches from bioinformatics, genetics, spatial statistics, GIS, and epidemiology. To do this, I will first measure the utilization of bioinformatics resources and tools, as well as the current approaches and limitations identified by state agencies of public health, agriculture, and wildlife, for detecting and predicting hotspots (clusters) of zoonotic viruses (Aim 1). I will then use this feedback to develop a spatial decision support system for detecting and predicting zoonotic hotspots that applies bioinformatics principles to access, integrate, and analyze viral genetics, environmental, and spatiotemporal reportable disease data (Aim 2). In Aim 3, I will evaluate my system for cluster detection and prediction against a system that does not consider viral genetics and relies on traditional spatiotemporal data, and I will validate its predictive capability. I will also evaluate user satisfaction and system usability.
Effective start/end date: 9/20/15 to 12/20/18


  • HHS: National Institutes of Health (NIH): $116,003.00
