Identification of optimal classification functions for biological sample and state discrimination from metabolic profiling data

Kyongbum Lee, Daehee Hwang, Tadaaki Yokoyama, George Stephanopoulos, Gregory N. Stephanopoulos, Martin L. Yarmush

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

Motivation: Classification of biological samples for diagnostic purposes is a difficult task because of the many decisions involved on the number, type and functional manipulations of the input variables. This study presents a generally applicable strategy for systematic formulation of optimal diagnostic indexes. To this end, we develop a novel set of computational tools by integrating regression optimization, stepwise variable selection and cross-validation algorithms.

Results: The proposed discrimination methodology was applied to plasma and tissue (liver) metabolic profiling data describing the time progression of liver dysfunction in a rat model of acute hepatic failure generated by D-galactosamine (GalN) injection. From the plasma data, our methodology identified seven (out of a total of 23) metabolites, and the corresponding transform functions, as the best inputs to the optimal diagnostic index. This index showed better time resolution and increased noise robustness compared with an existing metabolic index, Fischer's BCAA/AAA molar ratio, as well as indexes generated using other commonly used discriminant analysis tools. Comparison of plasma and liver indexes found two consensus metabolites, lactate and glucose, which implicate glycolysis and/or gluconeogenesis in mediating the metabolic effects of GalN.
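The abstract names three ingredients (regression, stepwise variable selection over metabolites and their transform functions, and cross-validation) without giving the algorithm itself. The following is only a minimal sketch of how such pieces can fit together, not the authors' method: it assumes ordinary least squares as the regression, leave-one-out as the cross-validation scheme, and a small hypothetical transform set (`identity`, `log`, `square`). The names `TRANSFORMS`, `loo_cv_error` and `forward_stepwise` are illustrative and do not come from the paper.

```python
import numpy as np

# Hypothetical candidate transforms (assumption: the abstract does not list
# the transform functions actually considered in the paper).
TRANSFORMS = {
    "identity": lambda x: x,
    "log": lambda x: np.log(x),
    "square": lambda x: x ** 2,
}


def loo_cv_error(X, y):
    """Leave-one-out mean squared error of an ordinary least-squares fit."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    errors = []
    for i in range(n):
        mask = np.arange(n) != i  # hold out sample i
        coef, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        errors.append((A[i] @ coef - y[i]) ** 2)
    return float(np.mean(errors))


def forward_stepwise(data, y, max_terms=3, tol=1e-9):
    """Greedily add (variable index, transform name) pairs while the
    leave-one-out error keeps improving by more than `tol`."""
    n = len(y)
    selected = []  # chosen (variable index, transform name) pairs
    columns = []   # transformed columns for the chosen pairs
    best_err = loo_cv_error(np.empty((n, 0)), y)  # intercept-only baseline
    while len(selected) < max_terms:
        best_candidate = None
        for j in range(data.shape[1]):
            for name, fn in TRANSFORMS.items():
                if (j, name) in selected:
                    continue
                col = fn(data[:, j])
                err = loo_cv_error(np.column_stack(columns + [col]), y)
                if err < best_err - tol:
                    best_err, best_candidate = err, ((j, name), col)
        if best_candidate is None:
            break  # no remaining candidate improves the cross-validated error
        selected.append(best_candidate[0])
        columns.append(best_candidate[1])
    return selected, best_err
```

Calling `forward_stepwise(data, y)` with a samples-by-metabolites array of positive concentrations and a numeric target (for example, time since injection) returns the chosen (metabolite index, transform) pairs and their cross-validated error. Scoring candidates by held-out error rather than training error is what keeps the greedy search from simply adding every variable.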

Original language: English (US)
Pages (from-to): 959-969
Number of pages: 11
Journal: Bioinformatics
Volume: 20
Issue number: 6
DOI: 10.1093/bioinformatics/bth015
State: Published - Apr 12 2004
Externally published: Yes

Fingerprint

Galactosamine
Profiling
Liver
Discrimination
Metabolites
Diagnostics
Gluconeogenesis
Plasma
Acute Liver Failure
Discriminant Analysis
Glycolysis
Glucose
Noise
Rats
Liver Diseases
Lactic Acid
Tissue
Noise Robustness

ASJC Scopus subject areas

  • Statistics and Probability
  • Medicine(all)
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Identification of optimal classification functions for biological sample and state discrimination from metabolic profiling data. / Lee, Kyongbum; Hwang, Daehee; Yokoyama, Tadaaki; Stephanopoulos, George; Stephanopoulos, Gregory N.; Yarmush, Martin L.

In: Bioinformatics, Vol. 20, No. 6, 12.04.2004, p. 959-969.

@article{af45f61898f64e3c8186e69e81473148,
title = "Identification of optimal classification functions for biological sample and state discrimination from metabolic profiling data",
abstract = "Motivation: Classification of biological samples for diagnostic purposes is a difficult task because of the many decisions involved on the number, type and functional manipulations of the input variables. This study presents a generally applicable strategy for systematic formulation of optimal diagnostic indexes. To this end, we develop a novel set of computational tools by integrating regression optimization, stepwise variable selection and cross-validation algorithms. Results: The proposed discrimination methodology was applied to plasma and tissue (liver) metabolic profiling data describing the time progression of liver dysfunction in a rat model of acute hepatic failure generated by D-galactosamine (GalN) injection. From the plasma data, our methodology identified seven (out of a total of 23) metabolites, and the corresponding transform functions, as the best inputs to the optimal diagnostic index. This index showed better time resolution and increased noise robustness compared with an existing metabolic index, Fischer's BCAA/AAA molar ratio, as well as indexes generated using other commonly used discriminant analysis tools. Comparison of plasma and liver indexes found two consensus metabolites, lactate and glucose, which implicate glycolysis and/or gluconeogenesis in mediating the metabolic effects of GalN.",
author = "Kyongbum Lee and Daehee Hwang and Tadaaki Yokoyama and George Stephanopoulos and Stephanopoulos, {Gregory N.} and Yarmush, {Martin L.}",
year = "2004",
month = "4",
day = "12",
doi = "10.1093/bioinformatics/bth015",
language = "English (US)",
volume = "20",
pages = "959--969",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "6",

}

TY - JOUR

T1 - Identification of optimal classification functions for biological sample and state discrimination from metabolic profiling data

AU - Lee, Kyongbum

AU - Hwang, Daehee

AU - Yokoyama, Tadaaki

AU - Stephanopoulos, George

AU - Stephanopoulos, Gregory N.

AU - Yarmush, Martin L.

PY - 2004/4/12

Y1 - 2004/4/12

AB - Motivation: Classification of biological samples for diagnostic purposes is a difficult task because of the many decisions involved on the number, type and functional manipulations of the input variables. This study presents a generally applicable strategy for systematic formulation of optimal diagnostic indexes. To this end, we develop a novel set of computational tools by integrating regression optimization, stepwise variable selection and cross-validation algorithms. Results: The proposed discrimination methodology was applied to plasma and tissue (liver) metabolic profiling data describing the time progression of liver dysfunction in a rat model of acute hepatic failure generated by D-galactosamine (GalN) injection. From the plasma data, our methodology identified seven (out of a total of 23) metabolites, and the corresponding transform functions, as the best inputs to the optimal diagnostic index. This index showed better time resolution and increased noise robustness compared with an existing metabolic index, Fischer's BCAA/AAA molar ratio, as well as indexes generated using other commonly used discriminant analysis tools. Comparison of plasma and liver indexes found two consensus metabolites, lactate and glucose, which implicate glycolysis and/or gluconeogenesis in mediating the metabolic effects of GalN.

UR - http://www.scopus.com/inward/record.url?scp=2342418957&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=2342418957&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bth015

DO - 10.1093/bioinformatics/bth015

M3 - Article

C2 - 14751977

AN - SCOPUS:2342418957

VL - 20

SP - 959

EP - 969

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 6

ER -