

Rapidly rising healthcare costs represent one of the major issues plaguing the healthcare system. Data from the Arizona Health Care Cost Containment System, Arizona's Medicaid program, provide a unique opportunity to exploit state-of-the-art machine learning and data mining algorithms to analyze data and yield actionable findings that can aid cost containment. Our work addresses specific challenges in this real-life healthcare application with respect to data imbalance in the process of building predictive risk models for forecasting high-cost patients. We survey the literature and propose novel data mining approaches customized for this compelling application, with a specific focus on non-random sampling. Our empirical study indicates that the proposed approach is highly effective and can benefit further research on cost containment in the healthcare industry.

Original language: English (US)
Title of host publication: Biomedical Engineering Systems and Technologies - International Joint Conference, BIOSTEC 2008, Revised Selected Papers
Number of pages: 14
State: Published - Dec 1 2008
Event: 1st International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2008 - Funchal, Madeira, Portugal
Duration: Jan 28 2008 to Jan 31 2008

Publication series

Name: Communications in Computer and Information Science
Volume: 25 CCIS
ISSN (Print): 1865-0929


Other: 1st International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2008
City: Funchal, Madeira


  • Medicaid
  • Predictive risk modeling
  • data mining
  • future high-cost patients
  • health care expenditures
  • imbalanced data classification
  • non-random sampling
  • risk adjustment
  • skewed data

ASJC Scopus subject areas

  • Computer Science (all)
  • Mathematics (all)


Title: Understanding the effects of sampling on healthcare risk modeling for the prediction of future high-cost patients
