TY - JOUR
T1 - Using multiple data features improved the validity of osteoporosis case ascertainment from administrative databases
AU - Lix, Lisa M.
AU - Yogendran, Marina S.
AU - Leslie, William D.
AU - Shaw, Souradet Y.
AU - Baumgartner, Richard
AU - Bowman, Christopher
AU - Metge, Colleen
AU - Gumel, Abba
AU - Hux, Janet
AU - James, Robert C.
PY - 2008/12/1
Y1 - 2008/12/1
N2 - Objectives: The aim was to construct and validate algorithms for osteoporosis case ascertainment from administrative databases and to estimate the population prevalence of osteoporosis for these algorithms. Study Design and Setting: Artificial neural networks, classification trees, and logistic regression were applied to hospital, physician, and pharmacy data from Manitoba, Canada. Discriminative performance and calibration (i.e., error) were compared for algorithms defined from different sets of diagnosis, prescription drug, comorbidity, and demographic variables. Algorithms were validated against a regional bone mineral density testing program. Results: Discriminative performance and calibration were poorer and sensitivity was generally lower for algorithms based on diagnosis codes alone than for algorithms based on an expanded set of data features that included osteoporosis prescriptions and age. Validation measures were similar for neural networks and classification trees, but prevalence estimates were lower for the former model. Conclusion: Multiple features of administrative data generally resulted in improved sensitivity of osteoporosis case-detection algorithms without loss of specificity. However, prevalence estimates using an expanded set of features were still slightly lower than estimates from a population-based study with primary data collection. The classification methods developed in this study can be extended to other chronic diseases for which there may be multiple markers in administrative data.
AB - Objectives: The aim was to construct and validate algorithms for osteoporosis case ascertainment from administrative databases and to estimate the population prevalence of osteoporosis for these algorithms. Study Design and Setting: Artificial neural networks, classification trees, and logistic regression were applied to hospital, physician, and pharmacy data from Manitoba, Canada. Discriminative performance and calibration (i.e., error) were compared for algorithms defined from different sets of diagnosis, prescription drug, comorbidity, and demographic variables. Algorithms were validated against a regional bone mineral density testing program. Results: Discriminative performance and calibration were poorer and sensitivity was generally lower for algorithms based on diagnosis codes alone than for algorithms based on an expanded set of data features that included osteoporosis prescriptions and age. Validation measures were similar for neural networks and classification trees, but prevalence estimates were lower for the former model. Conclusion: Multiple features of administrative data generally resulted in improved sensitivity of osteoporosis case-detection algorithms without loss of specificity. However, prevalence estimates using an expanded set of features were still slightly lower than estimates from a population-based study with primary data collection. The classification methods developed in this study can be extended to other chronic diseases for which there may be multiple markers in administrative data.
KW - Classification trees
KW - Logistic regression
KW - Neural networks
KW - Osteoporosis
KW - Prevalence
KW - Sensitivity
KW - Specificity
UR - http://www.scopus.com/inward/record.url?scp=55249113158&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=55249113158&partnerID=8YFLogxK
U2 - 10.1016/j.jclinepi.2008.02.002
DO - 10.1016/j.jclinepi.2008.02.002
M3 - Article
C2 - 18619800
AN - SCOPUS:55249113158
VL - 61
SP - 1250
EP - 1260
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
SN - 0895-4356
IS - 12
ER -