caCORE: A common infrastructure for cancer informatics

Peter A. Covitz; Frank Hartel; Carl Schaefer; Sherri De Coronado; Gilberto Fragoso; Himanso Sahni; Scott Gustafson; Kenneth H. Buetow

doi:10.1093/bioinformatics/btg335

Research output: Contribution to journal › Article › peer-review

131 Scopus citations

Abstract

Motivation: Sites with substantive bioinformatics operations are challenged to build data processing and delivery infrastructure that provides reliable access and enables data integration. Locally generated data must be processed and stored such that relationships to external data sources can be presented. Consistency and comparability across data sets requires annotation with controlled vocabularies and, further, metadata standards for data representation. Programmatic access to the processed data should be supported to ensure the maximum possible value is extracted. Confronted with these challenges at the National Cancer Institute Center for Bioinformatics, we decided to develop a robust infrastructure for data management and integration that supports advanced biomedical applications.

Results: We have developed an interconnected set of software and services called caCORE. Enterprise Vocabulary Services (EVS) provide controlled vocabulary, dictionary and thesaurus services. The Cancer Data Standards Repository (caDSR) provides a metadata registry for common data elements. Cancer Bioinformatics Infrastructure Objects (caBIO) implements an object-oriented model of the biomedical domain and provides Java, Simple Object Access Protocol and HTTP-XML application programming interfaces. caCORE has been used to develop scientific applications that bring together data from distinct genomic and clinical science sources.

Original language	English (US)
Pages (from-to)	2404-2412
Number of pages	9
Journal	Bioinformatics
Volume	19
Issue number	18
DOIs	https://doi.org/10.1093/bioinformatics/btg335
State	Published - Dec 12 2003
Externally published	Yes

ASJC Scopus subject areas

Statistics and Probability
Biochemistry
Molecular Biology
Computer Science Applications
Computational Theory and Mathematics
Computational Mathematics

Cite this

@article{fd92229df52d46068fed00ba3085d0fa,

title = "caCORE: A common infrastructure for cancer informatics",

abstract = "Motivation: Sites with substantive bioinformatics operations are challenged to build data processing and delivery infrastructure that provides reliable access and enables data integration. Locally generated data must be processed and stored such that relationships to external data sources can be presented. Consistency and comparability across data sets requires annotation with controlled vocabularies and, further, metadata standards for data representation. Programmatic access to the processed data should be supported to ensure the maximum possible value is extracted. Confronted with these challenges at the National Cancer Institute Center for Bioinformatics, we decided to develop a robust infrastructure for data management and integration that supports advanced biomedical applications. Results: We have developed an interconnected set of software and services called caCORE. Enterprise Vocabulary Services (EVS) provide controlled vocabulary, dictionary and thesaurus services. The Cancer Data Standards Repository (caDSR) provides a metadata registry for common data elements. Cancer Bioinformatics Infrastructure Objects (caBIO) implements an object-oriented model of the biomedical domain and provides Java, Simple Object Access Protocol and HTTP-XML application programming interfaces. caCORE has been used to develop scientific applications that bring together data from distinct genomic and clinical science sources.",

author = "Covitz, {Peter A.} and Frank Hartel and Carl Schaefer and {De Coronado}, Sherri and Gilberto Fragoso and Himanso Sahni and Scott Gustafson and Buetow, {Kenneth H.}",

note = "Funding Information: Unigene (NCBI); LocusLink (NCBI); Homologene (NCBI) Cancer Genome Anatomy Project (NCI); NCI60 project (NCI, Stanford); Director{\textquoteright}s Challenge Initiative (NCI) Genetic Annotation Initiative (NCI) Genetic Annotation Initiative (Washington University, Incyte, Agencourt) Cancer Genome Anatomy Project (NCI); dbEST (NCBI); IMAGE Consortium Cancer Molecular Analysis Project (NCI) Unigene (NCBI) Golden Path via DAS (UCSC) BioCarta Pathways (BioCarta) NCI Metathesaurus (NCI); NCI Thesaurus (NCI); CMAP Ontology (NCI); Gene Ontology (Gene Ontology Consortium) Cancer Data Standards Repository (NCI) Cancer Therapy Evaluation Program (NCI); Developmental Therapeutics Program (NCI); Division of Cancer Prevention (NCI) Cancer Therapy Evaluation Program (NCI); Special Programs of Research Excellence (NCI); Division of Cancer Prevention (NCI) Funding Information: We thank S. Settnek and M. Connelly for their contributions to caBIO; J.-J. Maurer, L. Chatterjee, R. Chilukuri and P. Aggarwal for their work on the caDSR; M. Haber, L. Wright, J. Oberthaler and F. Rosenberg for contributions to EVS; D. Zimmerman for technical documentation; J. Silva, D. Warzel, B. Meadows and J. Abrams for their fundamental role in launching the CDE project. This work was supported by the National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services.",

year = "2003",

month = dec,

day = "12",

doi = "10.1093/bioinformatics/btg335",

language = "English (US)",

volume = "19",

pages = "2404--2412",

journal = "Bioinformatics",

issn = "1367-4803",

publisher = "Oxford University Press",

number = "18",

}

TY - JOUR

T1 - caCORE

T2 - A common infrastructure for cancer informatics

AU - Covitz, Peter A.

AU - Hartel, Frank

AU - Schaefer, Carl

AU - De Coronado, Sherri

AU - Fragoso, Gilberto

AU - Sahni, Himanso

AU - Gustafson, Scott

AU - Buetow, Kenneth H.

N1 - Funding Information: Unigene (NCBI); LocusLink (NCBI); Homologene (NCBI) Cancer Genome Anatomy Project (NCI); NCI60 project (NCI, Stanford); Director’s Challenge Initiative (NCI) Genetic Annotation Initiative (NCI) Genetic Annotation Initiative (Washington University, Incyte, Agencourt) Cancer Genome Anatomy Project (NCI); dbEST (NCBI); IMAGE Consortium Cancer Molecular Analysis Project (NCI) Unigene (NCBI) Golden Path via DAS (UCSC) BioCarta Pathways (BioCarta) NCI Metathesaurus (NCI); NCI Thesaurus (NCI); CMAP Ontology (NCI); Gene Ontology (Gene Ontology Consortium) Cancer Data Standards Repository (NCI) Cancer Therapy Evaluation Program (NCI); Developmental Therapeutics Program (NCI); Division of Cancer Prevention (NCI) Cancer Therapy Evaluation Program (NCI); Special Programs of Research Excellence (NCI); Division of Cancer Prevention (NCI) Funding Information: We thank S. Settnek and M. Connelly for their contributions to caBIO; J.-J. Maurer, L. Chatterjee, R. Chilukuri and P. Aggarwal for their work on the caDSR; M. Haber, L. Wright, J. Oberthaler and F. Rosenberg for contributions to EVS; D. Zimmerman for technical documentation; J. Silva, D. Warzel, B. Meadows and J. Abrams for their fundamental role in launching the CDE project. This work was supported by the National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services.

PY - 2003/12/12

Y1 - 2003/12/12

AB - Motivation: Sites with substantive bioinformatics operations are challenged to build data processing and delivery infrastructure that provides reliable access and enables data integration. Locally generated data must be processed and stored such that relationships to external data sources can be presented. Consistency and comparability across data sets requires annotation with controlled vocabularies and, further, metadata standards for data representation. Programmatic access to the processed data should be supported to ensure the maximum possible value is extracted. Confronted with these challenges at the National Cancer Institute Center for Bioinformatics, we decided to develop a robust infrastructure for data management and integration that supports advanced biomedical applications. Results: We have developed an interconnected set of software and services called caCORE. Enterprise Vocabulary Services (EVS) provide controlled vocabulary, dictionary and thesaurus services. The Cancer Data Standards Repository (caDSR) provides a metadata registry for common data elements. Cancer Bioinformatics Infrastructure Objects (caBIO) implements an object-oriented model of the biomedical domain and provides Java, Simple Object Access Protocol and HTTP-XML application programming interfaces. caCORE has been used to develop scientific applications that bring together data from distinct genomic and clinical science sources.

UR - http://www.scopus.com/inward/record.url?scp=0346252358&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0346252358&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btg335

DO - 10.1093/bioinformatics/btg335

M3 - Article

C2 - 14668224

AN - SCOPUS:0346252358

SN - 1367-4803

VL - 19

SP - 2404

EP - 2412

JO - Bioinformatics

JF - Bioinformatics

IS - 18

ER -
