TY - GEN
T1 - Objective Measures of Plosive Nasalization in Hypernasal Speech
AU - Saxon, Michael
AU - Liss, Julie
AU - Berisha, Visar
N1 - Funding Information:
∗This work was funded in part by NIH RO1 grant R01DC006859 (MPIs: Liss, Berisha).
Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - Hypernasal speech is a common symptom across several neurological disorders; however it has a variable acoustic signature, making it difficult to quantify acoustically or perceptually. In this paper, we propose the nasal cognate distinctiveness features as an objective proxy for hypernasal speech. Our method is motivated by the observation that incomplete velopharyngeal closure changes the acoustics of the resultant speech such that alveolar stops /t/ and /d/ map to the alveolar nasal /n/ and bilabial stops /b/ and /p/ map to bilabial nasal /m/. We propose a new family of features based on likelihood ratios between the plosives and their respective nasal cognates. These features are based on an acoustic model that is trained only on healthy speech, and evaluated on a set of 75 speakers diagnosed with different dysarthria subtypes and exhibiting varying levels of hypernasality. Our results show that the family of features compares favorably with the clinical perception of speech-language pathologists subjectively evaluating hypernasality.
AB - Hypernasal speech is a common symptom across several neurological disorders; however it has a variable acoustic signature, making it difficult to quantify acoustically or perceptually. In this paper, we propose the nasal cognate distinctiveness features as an objective proxy for hypernasal speech. Our method is motivated by the observation that incomplete velopharyngeal closure changes the acoustics of the resultant speech such that alveolar stops /t/ and /d/ map to the alveolar nasal /n/ and bilabial stops /b/ and /p/ map to bilabial nasal /m/. We propose a new family of features based on likelihood ratios between the plosives and their respective nasal cognates. These features are based on an acoustic model that is trained only on healthy speech, and evaluated on a set of 75 speakers diagnosed with different dysarthria subtypes and exhibiting varying levels of hypernasality. Our results show that the family of features compares favorably with the clinical perception of speech-language pathologists subjectively evaluating hypernasality.
KW - automatic speech recognition
KW - dysarthria
KW - hypernasality
KW - speech
KW - velopha-ryngeal dysfunction
UR - http://www.scopus.com/inward/record.url?scp=85068956796&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068956796&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8682339
DO - 10.1109/ICASSP.2019.8682339
M3 - Conference contribution
AN - SCOPUS:85068956796
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 6520
EP - 6524
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Y2 - 12 May 2019 through 17 May 2019
ER -