Abstract

State-of-the-art automatic speech recognition (ASR) engines perform well on healthy speech; however, recent studies show that their performance on dysarthric speech is highly variable. This is because of the acoustic variability associated with the different dysarthria subtypes. This paper aims to develop a better understanding of how perceptual disturbances in dysarthric speech relate to ASR performance. Accurate ratings of a representative set of 32 dysarthric speakers along different perceptual dimensions are obtained, and the performance of a representative ASR algorithm on the same set of speakers is analyzed. This work explores the relationship between these ratings and ASR performance and reveals that ASR performance can be predicted from perceptual disturbances in dysarthric speech, with articulatory precision contributing the most to the prediction, followed by prosody.

Original language: English (US)
Pages (from-to): EL416-EL422
Journal: Journal of the Acoustical Society of America
Volume: 140
Issue number: 5
DOI: 10.1121/1.4967208
State: Published - Nov 1 2016

Fingerprint

speech recognition
disturbances
ratings
engines
Automatic Speech Recognition
acoustics
predictions
Rating

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics

Cite this

The relationship between perceptual disturbances in dysarthric speech and automatic speech recognition performance. / Tu, Ming; Wisler, Alan; Berisha, Visar; Liss, Julie.

In: Journal of the Acoustical Society of America, Vol. 140, No. 5, 01.11.2016, p. EL416-EL422.

Research output: Contribution to journal (Article)

@article{a86a19ed59f74dd4afd13b0db76f9735,
title = "The relationship between perceptual disturbances in dysarthric speech and automatic speech recognition performance",
abstract = "State-of-the-art automatic speech recognition (ASR) engines perform well on healthy speech; however recent studies show that their performance on dysarthric speech is highly variable. This is because of the acoustic variability associated with the different dysarthria subtypes. This paper aims to develop a better understanding of how perceptual disturbances in dysarthric speech relate to ASR performance. Accurate ratings of a representative set of 32 dysarthric speakers along different perceptual dimensions are obtained and the performance of a representative ASR algorithm on the same set of speakers is analyzed. This work explores the relationship between these ratings and ASR performance and reveals that ASR performance can be predicted from perceptual disturbances in dysarthric speech with articulatory precision contributing the most to the prediction followed by prosody.",
author = "Ming Tu and Alan Wisler and Visar Berisha and Julie Liss",
year = "2016",
month = "11",
day = "1",
doi = "10.1121/1.4967208",
language = "English (US)",
volume = "140",
pages = "EL416--EL422",
journal = "Journal of the Acoustical Society of America",
issn = "0001-4966",
publisher = "Acoustical Society of America",
number = "5",
}

TY - JOUR

T1 - The relationship between perceptual disturbances in dysarthric speech and automatic speech recognition performance

AU - Tu, Ming

AU - Wisler, Alan

AU - Berisha, Visar

AU - Liss, Julie

PY - 2016/11/1

Y1 - 2016/11/1

AB - State-of-the-art automatic speech recognition (ASR) engines perform well on healthy speech; however recent studies show that their performance on dysarthric speech is highly variable. This is because of the acoustic variability associated with the different dysarthria subtypes. This paper aims to develop a better understanding of how perceptual disturbances in dysarthric speech relate to ASR performance. Accurate ratings of a representative set of 32 dysarthric speakers along different perceptual dimensions are obtained and the performance of a representative ASR algorithm on the same set of speakers is analyzed. This work explores the relationship between these ratings and ASR performance and reveals that ASR performance can be predicted from perceptual disturbances in dysarthric speech with articulatory precision contributing the most to the prediction followed by prosody.

UR - http://www.scopus.com/inward/record.url?scp=84996563969&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84996563969&partnerID=8YFLogxK

U2 - 10.1121/1.4967208

DO - 10.1121/1.4967208

M3 - Article

C2 - 27908075

AN - SCOPUS:84996563969

VL - 140

SP - EL416

EP - EL422

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

SN - 0001-4966

IS - 5

ER -