Abstract
Building clinical speech analytics models that translate reliably to the clinic requires a realistic characterization of their performance. How well, then, does the published literature estimate model accuracy? We evaluate the relationship between sample size and reported accuracy across 77 journal publications that use speech to classify between healthy controls and patients with dementia. The studies are drawn from three meta-analyses conducted under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocol. The results show that reported accuracy declines as sample size increases, with small-sample studies yielding overoptimistic accuracy estimates. For correctly trained models this is unexpected, as a machine learning model's ability to predict group membership ought to remain the same or improve with additional training data. We posit that the overoptimism results from a combination of publication bias and overfitting, and we suggest mitigation strategies.
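The overfitting-plus-publication-bias mechanism can be made concrete with a small simulation. The sketch below (ours, not the authors'; all parameter choices such as the sample sizes, the number of simulated studies, and the 20-feature noise design are illustrative assumptions) generates data with *no* real class signal, applies a flawed but common protocol in which feature selection sees the full dataset before cross-validation, and then "publishes" only the best-performing quartile of studies. The combination inflates accuracy well above chance, and the inflation shrinks as the sample size grows, mirroring the trend the paper reports.

```python
"""
Minimal simulation (not from the paper) of how small samples plus
selective reporting can inflate reported accuracy on data with no
real signal. Parameter values are illustrative assumptions.
"""
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
N_FEATURES = 20   # stand-ins for acoustic/linguistic features; pure noise
N_STUDIES = 200   # simulated studies per sample size

for n in (20, 40, 80, 160, 320):
    accs = []
    for _ in range(N_STUDIES):
        X = rng.normal(size=(n, N_FEATURES))
        # Balanced random labels: true accuracy is chance (0.5).
        y = rng.permutation(np.repeat([0, 1], n // 2))
        # Flawed protocol: select the features most correlated with the
        # labels on the FULL dataset, then cross-validate -- the test
        # folds have already leaked into feature selection.
        corr = np.abs(np.corrcoef(X.T, y)[-1, :-1])
        top = np.argsort(corr)[-5:]
        acc = cross_val_score(
            LogisticRegression(max_iter=1000), X[:, top], y, cv=5
        ).mean()
        accs.append(acc)
    accs = np.sort(accs)
    # Publication bias: only the best quartile of studies is reported.
    published = accs[-N_STUDIES // 4:].mean()
    print(f"n={n:4d}  mean CV acc={np.mean(accs):.3f}  "
          f"'published' acc={published:.3f}")
```

Under these assumptions, the smallest samples show the largest gap between chance and the "published" accuracy, and the gap narrows monotonically with n, which is one plausible account of the declining-accuracy trend in the meta-analytic data.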
Original language | English (US) |
---|---|
Pages (from-to) | 2453-2457 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Volume | 2022-September |
DOIs | |
State | Published - 2022 |
Event | 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 - Incheon, Republic of Korea. Duration: Sep 18 2022 → Sep 22 2022 |
Keywords
- clinical speech analytics
- dementia
- MCI
- natural language processing
- robust machine learning
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modeling and Simulation