Using multiple data features improved the validity of osteoporosis case ascertainment from administrative databases

Lisa M. Lix; Marina S. Yogendran; William D. Leslie; Souradet Y. Shaw; Richard Baumgartner; Christopher Bowman; Colleen Metge; Abba Gumel; Janet Hux; Robert C. James

doi:10.1016/j.jclinepi.2008.02.002

Research output: Contribution to journal › Article › peer-review

62 Scopus citations

Abstract

Objectives: The aim was to construct and validate algorithms for osteoporosis case ascertainment from administrative databases and to estimate the population prevalence of osteoporosis for these algorithms. Study Design and Setting: Artificial neural networks, classification trees, and logistic regression were applied to hospital, physician, and pharmacy data from Manitoba, Canada. Discriminative performance and calibration (i.e., error) were compared for algorithms defined from different sets of diagnosis, prescription drug, comorbidity, and demographic variables. Algorithms were validated against a regional bone mineral density testing program. Results: Discriminative performance and calibration were poorer and sensitivity was generally lower for algorithms based on diagnosis codes alone than for algorithms based on an expanded set of data features that included osteoporosis prescriptions and age. Validation measures were similar for neural networks and classification trees, but prevalence estimates were lower for the former model. Conclusion: Multiple features of administrative data generally resulted in improved sensitivity of osteoporosis case-detection algorithm without loss of specificity. However, prevalence estimates using an expanded set of features were still slightly lower than estimates from a population-based study with primary data collection. The classification methods developed in this study can be extended to other chronic diseases for which there may be multiple markers in administrative data.

Original language	English (US)
Pages (from-to)	1250-1260
Number of pages	11
Journal	Journal of Clinical Epidemiology
Volume	61
Issue number	12
DOIs	https://doi.org/10.1016/j.jclinepi.2008.02.002
State	Published - Dec 2008
Externally published	Yes

Keywords

Classification trees
Logistic regression
Neural networks
Osteoporosis
Prevalence
Sensitivity
Specificity

ASJC Scopus subject areas

Epidemiology

Cite this

@article{7056b7498e184e63b8c6adcf5d3a5717,
title = "Using multiple data features improved the validity of osteoporosis case ascertainment from administrative databases",
abstract = "Objectives: The aim was to construct and validate algorithms for osteoporosis case ascertainment from administrative databases and to estimate the population prevalence of osteoporosis for these algorithms. Study Design and Setting: Artificial neural networks, classification trees, and logistic regression were applied to hospital, physician, and pharmacy data from Manitoba, Canada. Discriminative performance and calibration (i.e., error) were compared for algorithms defined from different sets of diagnosis, prescription drug, comorbidity, and demographic variables. Algorithms were validated against a regional bone mineral density testing program. Results: Discriminative performance and calibration were poorer and sensitivity was generally lower for algorithms based on diagnosis codes alone than for algorithms based on an expanded set of data features that included osteoporosis prescriptions and age. Validation measures were similar for neural networks and classification trees, but prevalence estimates were lower for the former model. Conclusion: Multiple features of administrative data generally resulted in improved sensitivity of osteoporosis case-detection algorithm without loss of specificity. However, prevalence estimates using an expanded set of features were still slightly lower than estimates from a population-based study with primary data collection. The classification methods developed in this study can be extended to other chronic diseases for which there may be multiple markers in administrative data.",

keywords = "Classification trees, Logistic regression, Neural networks, Osteoporosis, Prevalence, Sensitivity, Specificity",
author = "Lix, {Lisa M.} and Yogendran, {Marina S.} and Leslie, {William D.} and Shaw, {Souradet Y.} and Baumgartner, Richard and Bowman, Christopher and Metge, Colleen and Gumel, Abba and Hux, Janet and James, {Robert C.}",
note = "Funding Information: This research was supported by a grant from the Canadian Institutes of Health Research to the first and sixth authors, and by a Canadian Institutes of Health Research New Investigator Award to the first author. The authors are indebted to Manitoba Health for the provision of data. The results and conclusions are those of the authors, and no official endorsement by Manitoba Health is intended or should be inferred.",
year = "2008",
month = dec,
doi = "10.1016/j.jclinepi.2008.02.002",
language = "English (US)",
volume = "61",
pages = "1250--1260",
journal = "Journal of Clinical Epidemiology",
issn = "0895-4356",
publisher = "Elsevier USA",
number = "12",
}

TY - JOUR
T1 - Using multiple data features improved the validity of osteoporosis case ascertainment from administrative databases
AU - Lix, Lisa M.
AU - Yogendran, Marina S.
AU - Leslie, William D.
AU - Shaw, Souradet Y.
AU - Baumgartner, Richard
AU - Bowman, Christopher
AU - Metge, Colleen
AU - Gumel, Abba
AU - Hux, Janet
AU - James, Robert C.

N1 - Funding Information: This research was supported by a grant from the Canadian Institutes of Health Research to the first and sixth authors, and by a Canadian Institutes of Health Research New Investigator Award to the first author. The authors are indebted to Manitoba Health for the provision of data. The results and conclusions are those of the authors, and no official endorsement by Manitoba Health is intended or should be inferred.

PY - 2008/12

Y1 - 2008/12

AB - Objectives: The aim was to construct and validate algorithms for osteoporosis case ascertainment from administrative databases and to estimate the population prevalence of osteoporosis for these algorithms. Study Design and Setting: Artificial neural networks, classification trees, and logistic regression were applied to hospital, physician, and pharmacy data from Manitoba, Canada. Discriminative performance and calibration (i.e., error) were compared for algorithms defined from different sets of diagnosis, prescription drug, comorbidity, and demographic variables. Algorithms were validated against a regional bone mineral density testing program. Results: Discriminative performance and calibration were poorer and sensitivity was generally lower for algorithms based on diagnosis codes alone than for algorithms based on an expanded set of data features that included osteoporosis prescriptions and age. Validation measures were similar for neural networks and classification trees, but prevalence estimates were lower for the former model. Conclusion: Multiple features of administrative data generally resulted in improved sensitivity of osteoporosis case-detection algorithm without loss of specificity. However, prevalence estimates using an expanded set of features were still slightly lower than estimates from a population-based study with primary data collection. The classification methods developed in this study can be extended to other chronic diseases for which there may be multiple markers in administrative data.

KW - Classification trees
KW - Logistic regression
KW - Neural networks
KW - Osteoporosis
KW - Prevalence
KW - Sensitivity
KW - Specificity
UR - http://www.scopus.com/inward/record.url?scp=55249113158&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=55249113158&partnerID=8YFLogxK
U2 - 10.1016/j.jclinepi.2008.02.002
DO - 10.1016/j.jclinepi.2008.02.002
M3 - Article
C2 - 18619800
AN - SCOPUS:55249113158
SN - 0895-4356
VL - 61
SP - 1250
EP - 1260
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
IS - 12
ER -
