Modeling pathological speech perception from data with similarity labels

Visar Berisha; Julie Liss; Steven Sandoval; Rene Utianski; Andreas Spanias

doi:10.1109/ICASSP.2014.6853730

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Scopus citations

Abstract

The current state of the art in judging pathological speech intelligibility is subjective assessment performed by trained speech-language pathologists (SLPs). These tests, however, are inconsistent and costly, and they often suffer from poor intra- and inter-judge reliability. As such, consistent, reliable, and perceptually relevant objective evaluations of pathological speech are critical. Here, we propose a data-driven approach to this problem. We propose new cost functions for examining data from a series of experiments in which we ask certified SLPs to rate pathological speech along the perceptual dimensions that contribute to decreased intelligibility. We consider qualitative feedback from SLPs in the form of comparisons such as 'Is Speaker A's rhythm more similar to Speaker B's or Speaker C's?' Data of this form are common in behavioral research but differ from the traditional data structures expected in supervised (data matrix + class labels) or unsupervised (data matrix) machine learning. The proposed method identifies relevant acoustic features that correlate with the ordinal data collected during the experiment. Using these features, we show that we are able to develop objective measures of speech signal degradation that correlate well with SLP responses.
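To make the data format concrete, the sketch below shows one way similarity judgments of the form "A is more like B than like C" could be stored as triplets and used to weight acoustic features. It is an illustration only: the abstract does not specify the paper's cost functions, so the triplet hinge cost, the weighted squared distance, the projected subgradient update, and the toy feature values are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple

# A triplet (a, b, c) encodes the judgment "speaker a is more similar to b than to c".
Triplet = Tuple[int, int, int]

# Toy data (hypothetical): 4 speakers, 2 acoustic features each.
# Feature 0 is built to agree with the judgments; feature 1 mostly does not.
FEATURES: List[List[float]] = [[0.0, 0.9], [0.1, 0.0], [1.0, 1.0], [0.9, 0.1]]
TRIPLETS: List[Triplet] = [(0, 1, 2), (2, 3, 1), (1, 0, 3), (3, 2, 0)]


def weighted_sq_dist(x: Sequence[float], y: Sequence[float], w: Sequence[float]) -> float:
    """Squared distance between two speakers with one non-negative weight per feature."""
    return sum(wk * (xk - yk) ** 2 for xk, yk, wk in zip(x, y, w))


def triplet_hinge_cost(features, triplets, w, margin: float = 1.0) -> float:
    """Mean hinge penalty; a triplet costs nothing once d(a,b) + margin <= d(a,c)."""
    total = 0.0
    for a, b, c in triplets:
        d_ab = weighted_sq_dist(features[a], features[b], w)
        d_ac = weighted_sq_dist(features[a], features[c], w)
        total += max(0.0, margin + d_ab - d_ac)
    return total / len(triplets)


def triplet_agreement(features, triplets, w) -> float:
    """Fraction of judgments that the weighted distance reproduces."""
    hits = sum(
        weighted_sq_dist(features[a], features[b], w)
        < weighted_sq_dist(features[a], features[c], w)
        for a, b, c in triplets
    )
    return hits / len(triplets)


def fit_weights(features, triplets, steps: int = 200, lr: float = 0.05,
                margin: float = 1.0) -> List[float]:
    """Projected subgradient descent on the hinge cost, starting from equal weights."""
    n_feat = len(features[0])
    w = [1.0] * n_feat
    for _ in range(steps):
        grad = [0.0] * n_feat
        for a, b, c in triplets:
            d_ab = weighted_sq_dist(features[a], features[b], w)
            d_ac = weighted_sq_dist(features[a], features[c], w)
            if margin + d_ab - d_ac > 0:  # only violated triplets contribute
                for k in range(n_feat):
                    grad[k] += ((features[a][k] - features[b][k]) ** 2
                                - (features[a][k] - features[c][k]) ** 2)
        # Gradient step, then clip at zero so weights stay non-negative.
        w = [max(0.0, wk - lr * gk / len(triplets)) for wk, gk in zip(w, grad)]
    return w


learned = fit_weights(FEATURES, TRIPLETS)
print("learned weights:", learned)
print("agreement with judgments:", triplet_agreement(FEATURES, TRIPLETS, learned))
```

In this toy setup the learned weight on a feature acts as a rough relevance score: features whose distances agree with the listener judgments are weighted up, and the rest are weighted down.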

Original language	English (US)
Title of host publication	2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	915-919
Number of pages	5
ISBN (Print)	9781479928927
DOIs	https://doi.org/10.1109/ICASSP.2014.6853730
State	Published - 2014
Event	2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, Florence, Italy (May 4 2014 → May 9 2014)

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)	1520-6149

Other

Other	2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country/Territory	Italy
City	Florence
Period	5/4/14 → 5/9/14

ASJC Scopus subject areas

Software
Signal Processing
Electrical and Electronic Engineering

Access to Document

10.1109/ICASSP.2014.6853730

Cite this

Berisha, V., Liss, J., Sandoval, S., Utianski, R., & Spanias, A. (2014). Modeling pathological speech perception from data with similarity labels. In 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 (pp. 915-919). Article 6853730 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2014.6853730

Modeling pathological speech perception from data with similarity labels. / Berisha, Visar ; Liss, Julie; Sandoval, Steven et al.
2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014. Institute of Electrical and Electronics Engineers Inc., 2014. p. 915-919 6853730 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

Berisha, V, Liss, J, Sandoval, S, Utianski, R & Spanias, A 2014, Modeling pathological speech perception from data with similarity labels. in 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, 6853730, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Institute of Electrical and Electronics Engineers Inc., pp. 915-919, 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, Florence, Italy, 5/4/14. https://doi.org/10.1109/ICASSP.2014.6853730

Berisha V, Liss J, Sandoval S, Utianski R, Spanias A. Modeling pathological speech perception from data with similarity labels. In 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014. Institute of Electrical and Electronics Engineers Inc. 2014. p. 915-919. 6853730. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). doi: 10.1109/ICASSP.2014.6853730

Berisha, Visar ; Liss, Julie ; Sandoval, Steven et al. / Modeling pathological speech perception from data with similarity labels. 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 915-919 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

@inproceedings{f813c53c37d142bcade1cdff7364b7b6,

title = "Modeling pathological speech perception from data with similarity labels",

abstract = "The current state of the art in judging pathological speech intelligibility is subjective assessment performed by trained speech pathologists (SLP). These tests, however, are inconsistent, costly and, oftentimes suffer from poor intra- and inter-judge reliability. As such, consistent, reliable, and perceptually-relevant objective evaluations of pathological speech are critical. Here, we propose a data-driven approach to this problem. We propose new cost functions for examining data from a series of experiments, whereby we ask certified SLPs to rate pathological speech along the perceptual dimensions that contribute to decreased intelligibility. We consider qualitative feedback from SLPs in the form of comparisons similar to statements 'Is Speaker A's rhythm more similar to Speaker B or Speaker C?' Data of this form is common in behavioral research, but is different from the traditional data structures expected in supervised (data matrix + class labels) or unsupervised (data matrix) machine learning. The proposed method identifies relevant acoustic features that correlate with the ordinal data collected during the experiment. Using these features, we show that we are able to develop objective measures of the speech signal degradation that correlate well with SLP responses.",

author = "Visar Berisha and Julie Liss and Steven Sandoval and Rene Utianski and Andreas Spanias",

year = "2014",

doi = "10.1109/ICASSP.2014.6853730",

language = "English (US)",

isbn = "9781479928927",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "915--919",

booktitle = "2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014",

note = "2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 ; Conference date: 04-05-2014 Through 09-05-2014",

}

TY - GEN

T1 - Modeling pathological speech perception from data with similarity labels

AU - Berisha, Visar

AU - Liss, Julie

AU - Sandoval, Steven

AU - Utianski, Rene

AU - Spanias, Andreas

PY - 2014

Y1 - 2014

AB - The current state of the art in judging pathological speech intelligibility is subjective assessment performed by trained speech pathologists (SLP). These tests, however, are inconsistent, costly and, oftentimes suffer from poor intra- and inter-judge reliability. As such, consistent, reliable, and perceptually-relevant objective evaluations of pathological speech are critical. Here, we propose a data-driven approach to this problem. We propose new cost functions for examining data from a series of experiments, whereby we ask certified SLPs to rate pathological speech along the perceptual dimensions that contribute to decreased intelligibility. We consider qualitative feedback from SLPs in the form of comparisons similar to statements 'Is Speaker A's rhythm more similar to Speaker B or Speaker C?' Data of this form is common in behavioral research, but is different from the traditional data structures expected in supervised (data matrix + class labels) or unsupervised (data matrix) machine learning. The proposed method identifies relevant acoustic features that correlate with the ordinal data collected during the experiment. Using these features, we show that we are able to develop objective measures of the speech signal degradation that correlate well with SLP responses.

UR - http://www.scopus.com/inward/record.url?scp=84905215533&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84905215533&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2014.6853730

DO - 10.1109/ICASSP.2014.6853730

M3 - Conference contribution

AN - SCOPUS:84905215533

SN - 9781479928927

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 915

EP - 919

BT - 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014

Y2 - 4 May 2014 through 9 May 2014

ER -
