The cost-effectiveness of targeted delinquency prevention programs for children depends on the accuracy of the screening process. Screening accuracy is often poor, resulting in wasted resources and missed opportunities to avert negative outcomes. This study examined whether screening approaches based on logistic regression or machine learning algorithms could improve accuracy relative to traditional sum-score approaches when identifying boys in the 5th grade (N = 1012) who would be repeatedly arrested for violent and serious crimes from ages 13 to 30. Screening algorithms were developed that incorporated facets of teacher-reported externalizing problems and other known risk factors (e.g., peer rejection). The predictive performance of these algorithms was evaluated and compared in holdout (i.e., test) data using the area under the receiver operating characteristic curve (AUROC) and the Brier score. Both the logistic and machine learning methods yielded higher AUROC than traditional sum-score screening approaches when a broad set of risk factors for future delinquency was considered. However, this improvement was modest and was not present when using item-level information from a composite scale assessing externalizing problems. Contrary to expectations, machine learning algorithms performed no better than simple logistic models. There was a large apparent advantage of machine learning that disappeared after appropriate cross-validation, underscoring the importance of careful evaluation of these methods. Results suggest that screening using logistic regression could improve the cost-effectiveness of targeted delinquency prevention programs in some cases, but screening using machine learning would confer no marginal benefit under currently realistic conditions.
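The two holdout metrics named in the abstract, AUROC and the Brier score, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the study's code: the function names `auroc` and `brier_score` and the toy labels and probabilities are hypothetical, and the AUROC is computed through its pairwise-ranking interpretation rather than any particular library routine.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Lower is better; 0 is a perfect forecast.
    """
    if len(y_true) != len(p_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auroc(y_true, scores):
    """Probability that a random positive case scores above a random negative case.

    Ties count as half a win. 0.5 is chance level; 1.0 is perfect ranking.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical holdout data: 1 = outcome occurred, 0 = it did not.
y_holdout = [0, 0, 1, 0, 1, 1, 0, 0]
# Hypothetical predicted probabilities from some screening model.
p_holdout = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.05]

print("AUROC:", auroc(y_holdout, p_holdout))
print("Brier:", brier_score(y_holdout, p_holdout))
```

AUROC depends only on how cases are ranked, so it can be computed directly from a raw sum score. The Brier score requires predicted probabilities and also penalizes poor calibration, which is one reason the two metrics are reported together.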
- Machine learning
ASJC Scopus subject areas
- Public Health, Environmental and Occupational Health