Machine learning to predict rapid progression of carotid atherosclerosis in patients with impaired glucose tolerance

Xia Hu, Peter D. Reaven, Aramesh Saremi, Ninghao Liu, Mohammad Ali Abbasi, Huan Liu, Raymond Q. Migrino, the ACT NOW Study Investigators

Research output: Contribution to journal › Article › peer-review

16 Scopus citations


Objectives: Prediabetes is a major epidemic and is associated with adverse cardio-cerebrovascular outcomes. Early identification of patients who will develop rapid progression of atherosclerosis could improve risk stratification. In this paper, we use several machine learning methods to investigate the factors that matter most for predicting rapid progression of carotid intima-media thickness (CIMT) in participants with impaired glucose tolerance (IGT).

Methods: In the Actos Now for Prevention of Diabetes (ACT NOW) study, 382 participants with IGT underwent CIMT ultrasound evaluation at baseline and at 15–18 months, and were divided into rapid progressors (RP, n = 39, 58 ± 17.5 μm change) and non-rapid progressors (NRP, n = 343, 5.8 ± 20 μm change, p < 0.001 versus RP). To deal with complex multi-modal data consisting of demographic, clinical, and laboratory variables, we propose a general data-driven framework to investigate the ACT NOW dataset. In particular, we first employed a Fisher Score-based feature selection method to identify the most effective variables and then proposed a probabilistic Bayes-based learning method for the prediction. Methods and factors were compared using area under the receiver operating characteristic curve (AUC) analyses and the Brier score.

Results: The experimental results show that the proposed learning methods performed well in identifying or predicting RP. Among the methods, Naïve Bayes performed best (AUC 0.797, Brier score 0.085), compared to multilayer perceptron (0.729, 0.086) and random forest (0.642, 0.10). The results also show that feature selection has a significant positive impact on prediction performance.

Conclusions: By dealing with multi-modal data, the proposed learning methods show effectiveness in identifying people with prediabetes who are at risk for rapid atherosclerosis progression. The proposed framework demonstrated utility in outcome prediction in a typical multidimensional clinical dataset with a relatively small number of subjects, extending the potential utility of machine learning approaches beyond extremely large-scale datasets.
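
The pipeline named in the abstract (Fisher score feature selection, a Naïve Bayes classifier, and evaluation by AUC and Brier score) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the commonly used Fisher score (between-class scatter divided by within-class scatter, per feature) and a Gaussian Naïve Bayes model, and it runs on synthetic data rather than the ACT NOW dataset. All function names, the class balance, and the choice of keeping two features are hypothetical.

```python
import math
import random

def fisher_scores(X, y):
    """Fisher score per feature: between-class over within-class scatter."""
    scores = []
    for col in zip(*X):
        overall_mean = sum(col) / len(col)
        between = within = 0.0
        for c in set(y):
            vals = [v for v, label in zip(col, y) if label == c]
            mean_c = sum(vals) / len(vals)
            between += len(vals) * (mean_c - overall_mean) ** 2
            within += sum((v - mean_c) ** 2 for v in vals)
        scores.append(between / within if within > 0 else 0.0)
    return scores

def fit_gaussian_nb(X, y, eps=1e-9):
    """Return {class: (log prior, per-feature means, per-feature variances)}."""
    model = {}
    for c in set(y):
        rows = [row for row, label in zip(X, y) if label == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / n + eps
                     for col, m in zip(zip(*rows), means)]
        model[c] = (math.log(n / len(y)), means, variances)
    return model

def predict_proba(model, row, positive=1):
    """Posterior probability of the positive class for one sample."""
    log_joint = {}
    for c, (log_prior, means, variances) in model.items():
        total = log_prior
        for v, m, var in zip(row, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_joint[c] = total
    top = max(log_joint.values())  # subtract the max for numerical stability
    norm = sum(math.exp(v - top) for v in log_joint.values())
    return math.exp(log_joint[positive] - top) / norm

def auc(y_true, scores):
    """AUC as the chance a random positive outranks a random negative."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier(y_true, probs):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - t) ** 2 for p, t in zip(probs, y_true)) / len(y_true)

# Synthetic stand-in data (assumption): about 10% positives, feature 0 is
# informative, features 1-4 are pure noise.
rng = random.Random(0)

def make_data(n):
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.1 else 0
        X.append([rng.gauss(2.0 * label, 1.0)] + [rng.gauss(0, 1) for _ in range(4)])
        y.append(label)
    return X, y

X_train, y_train = make_data(400)
X_test, y_test = make_data(400)

# Rank features on the training split only, then keep the top two.
scores = fisher_scores(X_train, y_train)
top_k = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:2]

def select(X):
    return [[row[j] for j in top_k] for row in X]

model = fit_gaussian_nb(select(X_train), y_train)
probs = [predict_proba(model, row) for row in select(X_test)]
print("selected features:", top_k)
print("AUC:", auc(y_test, probs), "Brier:", brier(y_test, probs))
```

In this sketch the features are ranked on the training split only, so the held-out AUC and Brier score are not influenced by the selection step. Lower Brier scores and higher AUC values indicate better performance, which is how the three classifiers in the abstract are compared.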

Original language: English (US)
Article number: 14
Journal: Eurasip Journal on Bioinformatics and Systems Biology
Issue number: 1
State: Published - Dec 1 2016


  • Atherosclerosis
  • Diabetes
  • Machine learning
  • Model
  • Prognosis

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Computer Science Applications
  • Computational Mathematics

