Principal component directed partial least squares analysis for combining nuclear magnetic resonance and mass spectrometry data in metabolomics: Application to the detection of breast cancer

Haiwei Gu, Zhengzheng Pan, Bowei Xi, Vincent Asiago, Brian Musselman, Daniel Raftery

Research output: Contribution to journal › Article

97 Citations (Scopus)

Abstract

Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, ¹H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data, resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology.
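
The core idea of the abstract, using the PC1 scores from one data block as a continuous response for PLS on the other block, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the OSC pretreatment is omitted, and the names `pc1_scores`, `pls1_fit`, `severity`, and `nuisance` are hypothetical. The synthetic "MS" block is built so that a nuisance factor carries more variance than the disease-related signal, which is one way an unsupervised PCA can fail to show clustering.

```python
import numpy as np


def pc1_scores(X):
    """Return the PC1 scores of X after column mean-centering."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0] * s[0]


def pls1_fit(X, y, n_components=2):
    """Fit single-response PLS (PLS1) with the NIPALS algorithm."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, T, q = [], [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                # weight: direction of max covariance with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                   # X scores (latent variable)
        tt = t @ t
        p = Xr.T @ t / tt            # X loadings
        c = (yr @ t) / tt            # regression of y on the scores
        Xr = Xr - np.outer(t, p)     # deflate X
        yr = yr - c * t              # deflate y
        W.append(w); P.append(p); T.append(t); q.append(c)
    W, P, T, q = np.array(W).T, np.array(P).T, np.array(T).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # coefficients in original X space
    return T, q, coef, x_mean, y_mean


# Synthetic stand-ins for the two blocks (assumed sizes, not the study data).
rng = np.random.default_rng(0)
n = 40
labels = np.repeat([0.0, 1.0], n // 2)            # 0 = control, 1 = disease
severity = labels + 0.3 * rng.normal(size=n)      # continuous disease axis
nmr = 3.0 * np.outer(severity, rng.normal(size=30)) + rng.normal(size=(n, 30))
nuisance = rng.normal(size=n)                     # unrelated source of variance
ms = (np.outer(severity, rng.normal(size=200))
      + 2.0 * np.outer(nuisance, rng.normal(size=200))
      + rng.normal(size=(n, 200)))

# Step 1: PCA of the "NMR" block; PC1 scores become the response Y.
y = pc1_scores(nmr)

# Step 2: PLS of the "MS" block against those scores (in-sample fit only).
T, q, coef, x_mean, y_mean = pls1_fit(ms, y, n_components=2)
y_hat = (ms - x_mean) @ coef + y_mean

r_pls = np.corrcoef(y_hat, y)[0, 1]
r_pca = abs(np.corrcoef(pc1_scores(ms), y)[0, 1])
print(f"|corr| of MS PC1 with NMR PC1:      {r_pca:.2f}")
print(f"corr of PLS fit with NMR PC1:       {r_pls:.2f}")
```

The printed correlations are in-sample values; with far more variables than samples, PLS fits can be optimistic, so any real use of this idea would need cross-validation or an independent test set.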

Original language: English (US)
Pages (from-to): 57-63
Number of pages: 7
Journal: Analytica Chimica Acta
Volume: 686
Issue number: 1-2
DOIs: 10.1016/j.aca.2010.11.040
State: Published - Feb 7 2011
Externally published: Yes

Fingerprint

Metabolomics
Least-Squares Analysis
Mass spectrometry
nuclear magnetic resonance
cancer
Mass Spectrometry
Magnetic Resonance Spectroscopy
mass spectrometry
Nuclear magnetic resonance
Breast Neoplasms
Metabolome
Principal Component Analysis
Principal component analysis
Nuclear magnetic resonance spectroscopy
principal component analysis
spectroscopy
detection
analysis
Cluster Analysis
Toxicity

Keywords

  • Breast cancer
  • Direct analysis in real time
  • Human serum
  • Mass spectrometry
  • Metabolomics
  • Nuclear magnetic resonance
  • Orthogonal signal correction
  • Partial least squares

ASJC Scopus subject areas

  • Analytical Chemistry
  • Environmental Chemistry
  • Biochemistry
  • Spectroscopy

Cite this

Principal component directed partial least squares analysis for combining nuclear magnetic resonance and mass spectrometry data in metabolomics: Application to the detection of breast cancer. / Gu, Haiwei; Pan, Zhengzheng; Xi, Bowei; Asiago, Vincent; Musselman, Brian; Raftery, Daniel.

In: Analytica Chimica Acta, Vol. 686, No. 1-2, 07.02.2011, p. 57-63.

@article{47312e70b4074ad48e343c309b7da033,
title = "Principal component directed partial least squares analysis for combining nuclear magnetic resonance and mass spectrometry data in metabolomics: Application to the detection of breast cancer",
abstract = "Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, 1H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology.",
keywords = "Breast cancer, Direct analysis in real time, Human serum, Mass spectrometry, Metabolomics, Nuclear magnetic resonance, Orthogonal signal correction, Partial least squares",
author = "Haiwei Gu and Zhengzheng Pan and Bowei Xi and Vincent Asiago and Brian Musselman and Daniel Raftery",
year = "2011",
month = "2",
day = "7",
doi = "10.1016/j.aca.2010.11.040",
language = "English (US)",
volume = "686",
pages = "57--63",
journal = "Analytica Chimica Acta",
issn = "0003-2670",
publisher = "Elsevier",
number = "1-2",

}

TY - JOUR

T1 - Principal component directed partial least squares analysis for combining nuclear magnetic resonance and mass spectrometry data in metabolomics

T2 - Application to the detection of breast cancer

AU - Gu, Haiwei

AU - Pan, Zhengzheng

AU - Xi, Bowei

AU - Asiago, Vincent

AU - Musselman, Brian

AU - Raftery, Daniel

PY - 2011/2/7

Y1 - 2011/2/7

N2 - Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, 1H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology.

AB - Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, 1H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology.

KW - Breast cancer

KW - Direct analysis in real time

KW - Human serum

KW - Mass spectrometry

KW - Metabolomics

KW - Nuclear magnetic resonance

KW - Orthogonal signal correction

KW - Partial least squares

UR - http://www.scopus.com/inward/record.url?scp=78651355465&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78651355465&partnerID=8YFLogxK

U2 - 10.1016/j.aca.2010.11.040

DO - 10.1016/j.aca.2010.11.040

M3 - Article

C2 - 21237308

AN - SCOPUS:78651355465

VL - 686

SP - 57

EP - 63

JO - Analytica Chimica Acta

JF - Analytica Chimica Acta

SN - 0003-2670

IS - 1-2

ER -