The impact of forced-alignment errors on automatic pronunciation evaluation

Vikram C. Mathad, Tristan J. Mahr, Nancy Scherer, Kathy Chapman, Katherine C. Hustad, Julie Liss, Visar Berisha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

Automatic evaluation of phone-level pronunciation scores typically involves two stages: (1) automatic phonetic segmentation via text-constrained phoneme alignment and (2) quantification of acoustic deviation for each phoneme relative to a database of correctly pronounced speech. The second stage depends on the first: if the segmentation is misaligned, the measured acoustic deviation is affected as well. In this paper, we analyzed the impact of alignment error on a measure of goodness of pronunciation. We computed (1) automatic pronunciation scores using force-aligned samples, (2) the forced-alignment error rate, and (3) acoustic deviation using manually aligned samples. We used a bivariate linear regression model to characterize the contributions of forced-alignment errors and acoustic deviation to the automatic pronunciation scores. This was done across two databases of children's speech, namely children with cleft lip/palate and typically developing children between the ages of 3 and 6 years. The analysis shows that, for speech from typically developing children, most of the variation in the automatic pronunciation scores is explained by acoustic deviation, with errors in forced alignment playing a relatively minor role. For children with cleft lip/palate, forced-alignment errors have a small but significant downstream impact on pronunciation assessment.
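The bivariate regression described in the abstract can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of the form score = b0 + b1 * alignment_error + b2 * acoustic_deviation; the abstract does not give the exact model specification, so the function name, the variable names, and the synthetic data below are hypothetical and do not reproduce the authors' implementation or results.

```python
import numpy as np


def fit_bivariate_regression(alignment_error, acoustic_deviation, score):
    """Least-squares fit of score ~ b0 + b1*alignment_error + b2*acoustic_deviation.

    Returns the coefficients (b0, b1, b2) and the R^2 of the fit.
    All argument names are illustrative, not taken from the paper.
    """
    x1 = np.asarray(alignment_error, dtype=float)
    x2 = np.asarray(acoustic_deviation, dtype=float)
    y = np.asarray(score, dtype=float)

    # Design matrix with an intercept column and the two predictors.
    X = np.column_stack([np.ones_like(x1), x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Proportion of variance in the scores explained by the two predictors.
    residuals = y - X @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r_squared = 1.0 - ss_res / ss_tot
    return coef, r_squared


# Synthetic data for illustration only (NOT values from the paper).
rng = np.random.default_rng(0)
n = 200
alignment_error = rng.uniform(0.0, 1.0, n)
acoustic_deviation = rng.normal(0.0, 1.0, n)
score = (0.5 + 0.2 * alignment_error + 1.0 * acoustic_deviation
         + rng.normal(0.0, 0.05, n))

coef, r_squared = fit_bivariate_regression(alignment_error, acoustic_deviation, score)
print("intercept, b_alignment, b_acoustic:", coef)
print("R^2:", r_squared)
```

To compare the relative contributions of the two predictors, they would typically be standardized before fitting so that the coefficients are on a common scale; whether the authors did so is not stated in the abstract.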

Original language: English (US)
Title of host publication: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Publisher: International Speech Communication Association
Pages: 176-180
Number of pages: 5
ISBN (Electronic): 9781713836902
State: Published - 2021
Event: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 - Brno, Czech Republic
Duration: Aug 30 2021 → Sep 3 2021

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 1
ISSN (Print): 2308-457X
ISSN (Electronic): 1990-9772

Conference

Conference: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Country/Territory: Czech Republic
City: Brno
Period: 8/30/21 → 9/3/21

Keywords

  • Alignment error
  • Force alignments
  • Goodness of pronunciation
  • Manual alignments

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
