Models for objective evaluation of dysarthric speech from data annotated by multiple listeners

Ming Tu, Yishan Jiao, Visar Berisha, Julie Liss

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

In subjective evaluation of dysarthric speech, inter-rater agreement between clinicians can be low. Disagreement among clinicians stems from differences in their perceptual assessment abilities, familiarity with a client, clinical experience, etc. Recently, there has been interest in developing signal processing and machine learning models for objective evaluation of subjective speech quality. In this paper, we propose a new method that addresses this problem by collecting subjective ratings from multiple evaluators and modeling the reliability of each annotator within a machine learning framework. In contrast to previous work, our model explicitly captures the dependence of an evaluator's reliability on the speaker being rated. We evaluate the model in a series of experiments on a dysarthric speech database and show that it outperforms other similar approaches.
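
The abstract does not spell out the model, so the following is a minimal illustrative sketch (in Python/NumPy) of the general idea it describes: estimating a latent "true" score per utterance from multiple annotators' ratings, with each annotator's noise variance allowed to depend on the speaker. The Gaussian-noise formulation, the alternating update scheme, and all names here (estimate_scores, speaker_of, etc.) are assumptions for illustration, not the authors' actual method.

# Illustrative sketch only -- the Gaussian-noise model and the alternating
# update scheme are assumptions, not the model from the paper.
import numpy as np

def estimate_scores(ratings, speaker_of, n_iters=20, eps=1e-4):
    """ratings: (n_utterances, n_annotators) matrix of subjective scores.
    speaker_of: length-n_utterances array of speaker ids.
    Returns (latent score per utterance, per-(speaker, annotator) variance)."""
    n_utt, n_ann = ratings.shape
    speakers = np.unique(speaker_of)
    z = ratings.mean(axis=1)                # init: plain average rating
    var = np.ones((len(speakers), n_ann))   # init: all annotators equally reliable
    for _ in range(n_iters):
        # Update each annotator's speaker-dependent noise variance from
        # the residuals against the current latent scores.
        for i, s in enumerate(speakers):
            rows = speaker_of == s
            resid = ratings[rows] - z[rows, None]
            var[i] = np.maximum((resid ** 2).mean(axis=0), eps)
        # Update latent scores as a precision-weighted average, so ratings
        # from annotators who are unreliable on this speaker count less.
        prec = 1.0 / var[np.searchsorted(speakers, speaker_of)]
        z = (prec * ratings).sum(axis=1) / prec.sum(axis=1)
    return z, var

# Toy usage: 8 speakers with 5 utterances each, rated by 4 annotators.
rng = np.random.default_rng(0)
speaker_of = np.repeat(np.arange(8), 5)
ratings = rng.normal(3.0, 1.0, size=(40, 4))
z, var = estimate_scores(ratings, speaker_of)

Contrast this with a plain average across annotators, which assumes every rater is equally reliable on every speaker; the speaker-dependent weighting is what lets an annotator's ratings count more for the speakers they judge consistently.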

Original language: English (US)
Title of host publication: Conference Record of the 50th Asilomar Conference on Signals, Systems and Computers, ACSSC 2016
Editors: Michael B. Matthews
Publisher: IEEE Computer Society
Pages: 827-830
Number of pages: 4
ISBN (Electronic): 9781538639542
DOIs
State: Published - Mar 1 2017
Event: 50th Asilomar Conference on Signals, Systems and Computers, ACSSC 2016 - Pacific Grove, United States
Duration: Nov 6 2016 - Nov 9 2016

Publication series

Name: Conference Record - Asilomar Conference on Signals, Systems and Computers
ISSN (Print): 1058-6393


ASJC Scopus subject areas

  • Signal Processing
  • Computer Networks and Communications

