Applying machine learning to understand water security and water access inequality in underserved colonia communities

Zhining Gu, Wenwen Li, Michael Hanemann, Yushiou Tsai, Amber Wutich, Paul Westerhoff, Laura Landes, Anais D. Roque, Madeleine Zheng, Carmen A. Velasco, Sarah Porter

Research output: Contribution to journal › Article › peer-review


This paper explores the application of machine learning to enhance our understanding of water accessibility issues in underserved communities called colonias located along the northern part of the United States–Mexico border. We analyzed >2000 such communities using data from the Rural Community Assistance Partnership (RCAP) and applied hierarchical clustering and the adaptive affinity propagation algorithm to automatically group colonias into clusters with different water access conditions. The Gower distance was introduced to make the algorithm capable of processing complex datasets containing both categorical and numerical attributes. To better understand and explain the clustering results derived from the machine learning process, we further applied a decision tree analysis algorithm to associate the input data with the derived clusters, to identify and rank the importance of factors that characterize different water access conditions in each cluster. Our results complement experts' priority rankings of water infrastructure needs, providing a more in-depth view of the water insecurity challenges that the colonias suffer from. As an automated and reproducible workflow combining a series of tools, the proposed machine learning pipeline represents an operationalized solution for conducting data-driven analysis to understand water access inequality. This pipeline can be adapted to analyze different datasets and decision scenarios.
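The Gower distance mentioned above can be illustrated with a minimal sketch. It assumes the common formulation in which numerical attributes contribute a range-scaled absolute difference, categorical attributes contribute 0 for a match and 1 for a mismatch, and all attributes are weighted equally. The function name, the attribute names, and the three records are hypothetical and are not taken from the RCAP data or from the authors' implementation.

```python
def gower_distance_matrix(rows, numeric_cols):
    """Pairwise Gower distances for records with mixed attribute types.

    rows         -- list of equal-length tuples (one tuple per community)
    numeric_cols -- indices of the numerical attributes; all others are
                    treated as categorical
    """
    n = len(rows)
    n_cols = len(rows[0])

    # Range of each numerical column, used to scale differences into [0, 1].
    ranges = {}
    for c in numeric_cols:
        col = [float(r[c]) for r in rows]
        ranges[c] = max(col) - min(col)

    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for c in range(n_cols):
                if c in ranges:
                    span = ranges[c]
                    # A constant column contributes no dissimilarity.
                    if span:
                        total += abs(float(rows[i][c]) - float(rows[j][c])) / span
                else:
                    # Categorical attribute: simple mismatch indicator.
                    total += 0.0 if rows[i][c] == rows[j][c] else 1.0
            # Equal weights: average the per-attribute dissimilarities.
            dist[i][j] = dist[j][i] = total / n_cols
    return dist


# Hypothetical records: (population, water source, wastewater service)
records = [
    (120, "public system", "yes"),
    (480, "private well", "no"),
    (300, "public system", "no"),
]
demo = gower_distance_matrix(records, numeric_cols=[0])
for row in demo:
    print([round(d, 3) for d in row])
```

A matrix of this kind can in principle be passed to any clustering routine that accepts precomputed dissimilarities, which is what allows categorical and numerical attributes to be clustered together.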

Original language: English (US)
Article number: 101969
Journal: Computers, Environment and Urban Systems
State: Published - Jun 2023
Externally published: Yes


Keywords

  • Adaptive affinity propagation
  • Hierarchical clustering
  • Machine learning
  • Underserved communities
  • Water security

ASJC Scopus subject areas

  • Geography, Planning and Development
  • Ecological Modeling
  • Environmental Science (all)
  • Urban Studies

