Acoustic cues to lexical segmentation

A study of resynthesized speech

Stephanie M. Spitzer, Julie Liss, Sven L. Mattys

Research output: Contribution to journal, Article

43 Citations (Scopus)

Abstract

It has been posited that the role of prosody in lexical segmentation is elevated when the speech signal is degraded or unreliable. Using predictions from Cutler and Norris' [J. Exp. Psychol. Hum. Percept. Perform. 14, 113-121 (1988)] metrical segmentation strategy hypothesis as a framework, this investigation examined how individual suprasegmental and segmental cues to syllabic stress contribute differentially to the recognition of strong and weak syllables for the purpose of lexical segmentation. Syllabic contrastivity was reduced in resynthesized phrases by systematically (i) flattening the fundamental frequency (F0) contours, (ii) equalizing vowel durations, (iii) weakening strong vowels, (iv) combining the two suprasegmental cues, i.e., F0 and duration, and (v) combining the manipulation of all cues. Results indicated that, despite similar decrements in overall intelligibility, F0 flattening and the weakening of strong vowels had a greater impact on lexical segmentation than did equalizing vowel duration. Both combined-cue conditions resulted in greater decrements in intelligibility, but with no additional negative impact on lexical segmentation. The results support the notion of F0 variation and vowel quality as primary conduits for stress-based segmentation and suggest that the effectiveness of stress-based segmentation with degraded speech must be investigated relative to the suprasegmental and segmental impoverishments occasioned by each particular degradation.
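
As a rough illustration of what the two suprasegmental manipulations do to a phrase, the sketch below flattens F0 and equalizes vowel durations on a toy syllable representation. It is not the study's resynthesis procedure, which the abstract does not describe. The `Syllable` type, the function names, and the choice of the phrase mean as the target value are all assumptions made for this example. Vowel weakening (manipulation iii) is left out because it involves spectral changes that this toy representation cannot express.

```python
from dataclasses import dataclass, replace
from statistics import mean
from typing import List, Tuple


@dataclass(frozen=True)
class Syllable:
    """Hypothetical per-syllable summary; not a structure from the paper."""
    label: str
    strong: bool                 # strong vs. weak syllable
    f0_hz: Tuple[float, ...]     # sampled F0 contour across the vowel
    vowel_ms: float              # vowel duration in milliseconds


def flatten_f0(phrase: List[Syllable]) -> List[Syllable]:
    """Replace every F0 sample with the phrase-wide mean (assumed target)."""
    target = mean(f for s in phrase for f in s.f0_hz)
    return [replace(s, f0_hz=tuple(target for _ in s.f0_hz)) for s in phrase]


def equalize_durations(phrase: List[Syllable]) -> List[Syllable]:
    """Set every vowel duration to the phrase-wide mean (assumed target)."""
    target = mean(s.vowel_ms for s in phrase)
    return [replace(s, vowel_ms=target) for s in phrase]


def flatten_and_equalize(phrase: List[Syllable]) -> List[Syllable]:
    """Combine both suprasegmental manipulations, as in condition (iv)."""
    return equalize_durations(flatten_f0(phrase))
```

After either function runs, the strong and weak syllables no longer differ on the manipulated cue, which is the sense in which syllabic contrastivity is reduced; any remaining contrast must come from the cues left untouched.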

Original language: English (US)
Pages (from-to): 3678-3687
Number of pages: 10
Journal: Journal of the Acoustical Society of America
Volume: 122
Issue number: 6
DOI: 10.1121/1.2801545
State: Published - 2007

Fingerprint

vowels
cues
acoustics
intelligibility
flattening
syllables
manipulators
Acoustic Cues
Lexical Segmentation
degradation
Suprasegmentals
Segmentation
predictions
Intelligibility
Vowel Duration

ASJC Scopus subject areas

  • Acoustics and Ultrasonics

Cite this

Acoustic cues to lexical segmentation: A study of resynthesized speech. / Spitzer, Stephanie M.; Liss, Julie; Mattys, Sven L.

In: Journal of the Acoustical Society of America, Vol. 122, No. 6, 2007, p. 3678-3687.

Spitzer, Stephanie M.; Liss, Julie; Mattys, Sven L. / Acoustic cues to lexical segmentation: A study of resynthesized speech. In: Journal of the Acoustical Society of America. 2007; Vol. 122, No. 6. pp. 3678-3687.
@article{df8bf73beaaa4fb0b723637be8e69ccb,
title = "Acoustic cues to lexical segmentation: A study of resynthesized speech",
abstract = "It has been posited that the role of prosody in lexical segmentation is elevated when the speech signal is degraded or unreliable. Using predictions from Cutler and Norris' [J. Exp. Psychol. Hum. Percept. Perform. 14, 113-121 (1988)] metrical segmentation strategy hypothesis as a framework, this investigation examined how individual suprasegmental and segmental cues to syllabic stress contribute differentially to the recognition of strong and weak syllables for the purpose of lexical segmentation. Syllabic contrastivity was reduced in resynthesized phrases by systematically (i) flattening the fundamental frequency (F0) contours, (ii) equalizing vowel durations, (iii) weakening strong vowels, (iv) combining the two suprasegmental cues, i.e., F0 and duration, and (v) combining the manipulation of all cues. Results indicated that, despite similar decrements in overall intelligibility, F0 flattening and the weakening of strong vowels had a greater impact on lexical segmentation than did equalizing vowel duration. Both combined-cue conditions resulted in greater decrements in intelligibility, but with no additional negative impact on lexical segmentation. The results support the notion of F0 variation and vowel quality as primary conduits for stress-based segmentation and suggest that the effectiveness of stress-based segmentation with degraded speech must be investigated relative to the suprasegmental and segmental impoverishments occasioned by each particular degradation.",
author = "Spitzer, {Stephanie M.} and Liss, Julie and Mattys, {Sven L.}",
year = "2007",
doi = "10.1121/1.2801545",
language = "English (US)",
volume = "122",
pages = "3678--3687",
journal = "Journal of the Acoustical Society of America",
issn = "0001-4966",
publisher = "Acoustical Society of America",
number = "6",
}

TY - JOUR

T1 - Acoustic cues to lexical segmentation

T2 - A study of resynthesized speech

AU - Spitzer, Stephanie M.

AU - Liss, Julie

AU - Mattys, Sven L.

PY - 2007

Y1 - 2007

N2 - It has been posited that the role of prosody in lexical segmentation is elevated when the speech signal is degraded or unreliable. Using predictions from Cutler and Norris' [J. Exp. Psychol. Hum. Percept. Perform. 14, 113-121 (1988)] metrical segmentation strategy hypothesis as a framework, this investigation examined how individual suprasegmental and segmental cues to syllabic stress contribute differentially to the recognition of strong and weak syllables for the purpose of lexical segmentation. Syllabic contrastivity was reduced in resynthesized phrases by systematically (i) flattening the fundamental frequency (F0) contours, (ii) equalizing vowel durations, (iii) weakening strong vowels, (iv) combining the two suprasegmental cues, i.e., F0 and duration, and (v) combining the manipulation of all cues. Results indicated that, despite similar decrements in overall intelligibility, F0 flattening and the weakening of strong vowels had a greater impact on lexical segmentation than did equalizing vowel duration. Both combined-cue conditions resulted in greater decrements in intelligibility, but with no additional negative impact on lexical segmentation. The results support the notion of F0 variation and vowel quality as primary conduits for stress-based segmentation and suggest that the effectiveness of stress-based segmentation with degraded speech must be investigated relative to the suprasegmental and segmental impoverishments occasioned by each particular degradation.

UR - http://www.scopus.com/inward/record.url?scp=38849155298&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38849155298&partnerID=8YFLogxK

U2 - 10.1121/1.2801545

DO - 10.1121/1.2801545

M3 - Article

VL - 122

SP - 3678

EP - 3687

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

SN - 0001-4966

IS - 6

ER -