Combining NMR and LC/MS using backward variable elimination

Metabolomics analysis of colorectal cancer, polyps, and healthy controls

Lingli Deng, Haiwei Gu, Jiangjiang Zhu, G. A. Nagana Gowda, Danijel Djukovic, E. Gabriela Chiorean, Daniel Raftery

Research output: Contribution to journal › Article

23 Citations (Scopus)

Abstract

Both nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) play important roles in metabolomics. The complementary features of NMR and MS make their combination very attractive; however, currently the vast majority of metabolomics studies use either NMR or MS separately, and variable selection that combines NMR and MS for biomarker identification and statistical modeling is still not well developed. In this study focused on methodology, we developed a backward variable elimination partial least-squares discriminant analysis algorithm embedded with Monte Carlo cross validation (MCCV-BVE-PLSDA), to combine NMR and targeted liquid chromatography (LC)/MS data. Using the metabolomics analysis of serum for the detection of colorectal cancer (CRC) and polyps as an example, we demonstrate that variable selection is vitally important in combining NMR and MS data. The combined approach was better than using NMR or LC/MS data alone in providing significantly improved predictive accuracy in all the pairwise comparisons among CRC, polyps, and healthy controls. Using this approach, we selected a subset of metabolites responsible for the improved separation for each pairwise comparison, and we achieved a comprehensive profile of altered metabolite levels, including those in glycolysis, the TCA cycle, amino acid metabolism, and other pathways that were related to CRC and polyps. MCCV-BVE-PLSDA is straightforward, easy to implement, and highly useful for studying the contribution of each individual variable to multivariate statistical models. On the basis of these results, we recommend using an appropriate variable selection step, such as MCCV-BVE-PLSDA, when analyzing data from multiple analytical platforms to obtain improved statistical performance and a more accurate biological interpretation, especially for biomarker discovery. 
Importantly, the approach described here is relatively universal and can be easily expanded for combination with other analytical technologies.
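
The abstract names the ingredients of MCCV-BVE-PLSDA but does not list its steps. As a rough illustration of the general idea only, the sketch below wraps a PLS-DA classifier in Monte Carlo cross validation (repeated random train/test splits) and greedily removes one variable at a time. Everything specific here is an assumption rather than the authors' implementation: the single-component PLS model, the 0/1 class coding with a 0.5 decision threshold, the rule of dropping whichever variable leaves the highest cross-validated accuracy, the function names, and the synthetic data (one informative column followed by three noise columns).

```python
import random


def fit_pls1(X, y):
    """Fit a one-component PLS regression of 0/1 labels y on X (list of rows)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: unit-length direction of maximal covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5 or 1.0
    w = [v / norm for v in w]
    # Latent scores, then the regression coefficient of y on those scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / (sum(ti * ti for ti in t) or 1.0)
    return x_mean, y_mean, w, q


def predict(model, X):
    """Classify rows as 1 when the predicted response exceeds 0.5 (assumed cutoff)."""
    x_mean, y_mean, w, q = model
    preds = []
    for row in X:
        t = sum((x - m) * wj for x, m, wj in zip(row, x_mean, w))
        preds.append(1 if y_mean + q * t > 0.5 else 0)
    return preds


def mccv_accuracy(X, y, cols, n_splits=30, test_frac=0.3, seed=0):
    """Mean test accuracy over repeated random splits, using only columns `cols`."""
    rng = random.Random(seed)  # fixed seed so every variable subset sees the same splits
    idx = list(range(len(X)))
    n_test = max(1, int(len(X) * test_frac))
    accs = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        model = fit_pls1([[X[i][j] for j in cols] for i in train],
                         [y[i] for i in train])
        pred = predict(model, [[X[i][j] for j in cols] for i in test])
        accs.append(sum(p == y[i] for p, i in zip(pred, test)) / n_test)
    return sum(accs) / n_splits


def backward_elimination(X, y, min_vars=1):
    """Greedy backward elimination scored by Monte Carlo cross validation."""
    cols = list(range(len(X[0])))
    best_cols, best_acc = cols[:], mccv_accuracy(X, y, cols)
    while len(cols) > min_vars:
        # Score the model with each remaining variable left out in turn.
        trials = [(mccv_accuracy(X, y, [c for c in cols if c != d]), d) for d in cols]
        acc, drop = max(trials)
        cols.remove(drop)
        if acc >= best_acc:  # prefer the smaller subset on ties
            best_cols, best_acc = cols[:], acc
    return best_cols, best_acc


# Synthetic two-class data: column 0 separates the classes, columns 1-3 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[4.0 * label + data_rng.gauss(0, 0.5)] + [data_rng.gauss(0, 1) for _ in range(3)]
     for label in y]

selected, acc = backward_elimination(X, y)
print("kept columns:", selected, "MCCV accuracy:", round(acc, 3))
```

To combine platforms in this sketch, the NMR and LC/MS variables for each sample would simply be concatenated into one row of `X` before elimination. Real data would normally also need scaling and more than one latent component, both of which are omitted here for brevity.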

Original language: English (US)
Pages (from-to): 7975-7983
Number of pages: 9
Journal: Analytical Chemistry
Volume: 88
Issue number: 16
DOI: 10.1021/acs.analchem.6b00885
State: Published - Aug 16 2016
Externally published: Yes

Fingerprint

Liquid chromatography
Mass spectrometry
Nuclear magnetic resonance
Biomarkers
Metabolites
Discriminant analysis
Metabolomics
Metabolism
Nuclear magnetic resonance spectroscopy
Amino Acids

ASJC Scopus subject areas

  • Analytical Chemistry

Cite this

Combining NMR and LC/MS using backward variable elimination: Metabolomics analysis of colorectal cancer, polyps, and healthy controls. / Deng, Lingli; Gu, Haiwei; Zhu, Jiangjiang; Nagana Gowda, G. A.; Djukovic, Danijel; Chiorean, E. Gabriela; Raftery, Daniel.

In: Analytical Chemistry, Vol. 88, No. 16, 16.08.2016, p. 7975-7983.

@article{3bcedf4373054bea8d1123515f1b8f9f,
title = "Combining NMR and LC/MS using backward variable elimination: Metabolomics analysis of colorectal cancer, polyps, and healthy controls",
author = "Lingli Deng and Haiwei Gu and Jiangjiang Zhu and {Nagana Gowda}, {G. A.} and Danijel Djukovic and Chiorean, {E. Gabriela} and Daniel Raftery",
year = "2016",
month = "8",
day = "16",
doi = "10.1021/acs.analchem.6b00885",
language = "English (US)",
volume = "88",
pages = "7975--7983",
journal = "Analytical Chemistry",
issn = "0003-2700",
publisher = "American Chemical Society",
number = "16",

}

TY - JOUR
T1 - Combining NMR and LC/MS using backward variable elimination
T2 - Metabolomics analysis of colorectal cancer, polyps, and healthy controls
AU - Deng, Lingli
AU - Gu, Haiwei
AU - Zhu, Jiangjiang
AU - Nagana Gowda, G. A.
AU - Djukovic, Danijel
AU - Chiorean, E. Gabriela
AU - Raftery, Daniel
PY - 2016/8/16
Y1 - 2016/8/16
UR - http://www.scopus.com/inward/record.url?scp=84983242613&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84983242613&partnerID=8YFLogxK
U2 - 10.1021/acs.analchem.6b00885
DO - 10.1021/acs.analchem.6b00885
M3 - Article
VL - 88
SP - 7975
EP - 7983
JO - Analytical Chemistry
JF - Analytical Chemistry
SN - 0003-2700
IS - 16
ER -