Human Frequency Following Responses to Vocoded Speech

Saradha Ananthakrishnan; Xin Luo; Ananthanarayan Krishnan

doi:10.1097/AUD.0000000000000432

College of Health Solutions (CHS)

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

OBJECTIVES: Vocoders offer an effective platform to simulate the effects of cochlear implant speech processing strategies in normal-hearing listeners. Several behavioral studies have examined the effects of varying spectral and temporal cues on vocoded speech perception; however, little is known about the neural indices of vocoded speech perception. Here, the scalp-recorded frequency following response (FFR) was used to study how varying spectral and temporal cues affects the brainstem neural representation of two specific acoustic cues: the temporal envelope periodicity related to fundamental frequency (F0), and the temporal fine structure (TFS) related to formant and formant-related frequencies, as reflected in phase-locked neural activity in response to vocoded speech.

DESIGN: In experiment 1, FFRs were measured in 12 normal-hearing adult listeners in response to a steady-state English back vowel /u/ presented in an unprocessed condition and in six sine-vocoder conditions with varying numbers of channels (1, 2, 4, 8, 16, and 32), while the temporal envelope cutoff frequency was fixed at 500 Hz. In experiment 2, FFRs were obtained from 14 normal-hearing adult listeners in response to the same English vowel /u/, presented in an unprocessed condition and in four vocoded conditions in which the temporal envelope cutoff frequency (50 versus 500 Hz) and the carrier type (sine wave versus noise band) were varied separately, with the number of channels fixed at 8. A fast Fourier transform was applied to the FFR time waveforms to analyze the strength of brainstem neural representation of temporal envelope periodicity (F0) and TFS-related peaks (formant structure).
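The spectral analysis step described above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: it evaluates a single-bin discrete Fourier transform rather than a full FFT, and the sampling rate, F0 value, window length, and synthetic waveform are all assumptions chosen only for illustration (the abstract does not report them).

```python
import cmath
import math


def dft_magnitude(signal, fs, freq):
    """Return the amplitude of the `freq` Hz component of `signal` (single-bin DFT)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


FS = 8000    # sampling rate in Hz (assumed)
F0 = 100.0   # fundamental frequency in Hz (assumed, not taken from the abstract)
N = 1600     # 200 ms analysis window, i.e. 5 Hz frequency resolution (assumed)

# Synthetic stand-in for an averaged response: a strong F0 component
# plus a weaker component at 3 * F0.
waveform = [0.5 * math.sin(2 * math.pi * F0 * k / FS)
            + 0.1 * math.sin(2 * math.pi * 3 * F0 * k / FS)
            for k in range(N)]

f0_strength = dft_magnitude(waveform, FS, F0)
h3_strength = dft_magnitude(waveform, FS, 3 * F0)
print(f"F0: {f0_strength:.3f}, 3*F0: {h3_strength:.3f}")
```

Because both components complete a whole number of cycles within the window, the sketch should recover amplitudes close to 0.5 and 0.1. With real recordings, windowing and noise-floor estimation would also need to be considered.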

RESULTS: Brainstem neural representation of both temporal envelope and TFS cues improved when the number of channels increased from 1 to 4, followed by a plateau with 8 and 16 channels, and a reduction in phase-locking strength with 32 channels. For the sine vocoders, peaks in the FFR_TFS spectra corresponded with the low-frequency sine-wave carriers and side band frequencies in the stimulus spectra. When the temporal envelope cutoff frequency increased from 50 to 500 Hz, an improvement was observed in brainstem F0 representation with no change in brainstem representation of spectral peaks proximal to the first formant frequency (F1). There was no significant effect of carrier type (sine- versus noise-vocoder) on brainstem neural representation of F0 cues when the temporal envelope cutoff frequency was 500 Hz.
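The side band frequencies mentioned above arise whenever a sine-wave carrier is amplitude modulated by a periodic envelope. The following sketch, under assumed values, modulates a hypothetical 400 Hz carrier with a 100 Hz envelope and measures the resulting spectral components. The carrier frequency, F0, modulation depth, and sampling rate are illustrative assumptions and are not stimulus parameters reported in the abstract.

```python
import cmath
import math


def component_amplitude(signal, fs, freq):
    """Amplitude of the `freq` Hz component of `signal` (single-bin DFT)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


FS = 8000         # sampling rate in Hz (assumed)
N = 1600          # 200 ms of signal (assumed)
CARRIER = 400.0   # sine-wave carrier frequency in Hz (hypothetical)
F0 = 100.0        # envelope periodicity in Hz (hypothetical)
DEPTH = 0.8       # modulation depth (hypothetical)

# One vocoder-like channel: a sine carrier multiplied by a periodic envelope.
channel = [(1.0 + DEPTH * math.cos(2 * math.pi * F0 * k / FS))
           * math.sin(2 * math.pi * CARRIER * k / FS)
           for k in range(N)]

lower = component_amplitude(channel, FS, CARRIER - F0)
center = component_amplitude(channel, FS, CARRIER)
upper = component_amplitude(channel, FS, CARRIER + F0)
print(f"{CARRIER - F0:.0f} Hz: {lower:.3f}")
print(f"{CARRIER:.0f} Hz: {center:.3f}")
print(f"{CARRIER + F0:.0f} Hz: {upper:.3f}")
```

In this idealized case the carrier keeps its amplitude and each side band, at the carrier frequency plus or minus F0, carries half the modulation depth. A real vocoded vowel has several channels and a richer envelope, so its spectrum contains more components than this single-channel sketch.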

CONCLUSIONS: While the improvement in neural representation of temporal envelope and TFS cues with up to 4 vocoder channels is consistent with the behavioral literature, the reduced neural phase-locking strength noted with even more channels may reflect the narrower bandwidth of each channel as the number of channels increases. Stronger neural representation of temporal envelope cues with higher temporal envelope cutoff frequencies likely reflects brainstem neural phase-locking to F0-related periodicity fluctuations that are preserved in the 500-Hz temporal envelopes but unavailable in the 50-Hz temporal envelopes. No effect of temporal envelope cutoff frequency was seen for neural representation of TFS cues, suggesting that spectral side band frequencies created by the 500-Hz temporal envelopes did not improve neural representation of F1 cues over the 50-Hz temporal envelopes. Finally, brainstem F0 representation was not significantly affected by carrier type with a temporal envelope cutoff frequency of 500 Hz, which is inconsistent with previous results of behavioral studies examining pitch perception of vocoded stimuli.
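The point about channel bandwidth can be made concrete with a small sketch of how an analysis range might be divided as the channel count grows. The 80 to 6000 Hz range and the logarithmic spacing are assumptions made for illustration only; the abstract does not state the filter bank layout used in the study.

```python
def channel_edges(n_channels, f_low, f_high):
    """Band edges (Hz) for `n_channels` bands equally spaced on a log-frequency axis."""
    ratio = f_high / f_low
    return [f_low * ratio ** (i / n_channels) for i in range(n_channels + 1)]


F_LOW, F_HIGH = 80.0, 6000.0   # overall analysis range in Hz (assumed)

# Bandwidth of the lowest-frequency channel for each channel count in experiment 1.
lowest_bandwidths = {}
for n in (1, 2, 4, 8, 16, 32):
    edges = channel_edges(n, F_LOW, F_HIGH)
    lowest_bandwidths[n] = edges[1] - edges[0]
    print(f"{n:2d} channels: lowest band is {lowest_bandwidths[n]:8.1f} Hz wide")
```

Under these assumptions the lowest band shrinks steadily as channels are added, and with 32 channels it is far narrower than a typical voice F0. This is offered only as one way to picture the bandwidth explanation proposed in the conclusions, not as a reconstruction of the study's vocoder.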

Original language	English (US)
Pages (from-to)	e256-e267
Journal	Ear and Hearing
Volume	38
Issue number	5
DOIs	https://doi.org/10.1097/AUD.0000000000000432
State	Published - Sep 1 2017

ASJC Scopus subject areas

Otorhinolaryngology
Speech and Hearing

Cite this

@article{c00d48b553ac4c0a8e9e8f2959fa63ee,

title = "Human Frequency Following Responses to Vocoded Speech",

author = "Saradha Ananthakrishnan and Xin Luo and Ananthanarayan Krishnan",

year = "2017",

month = sep,

day = "1",

doi = "10.1097/AUD.0000000000000432",

language = "English (US)",

volume = "38",

pages = "e256--e267",

journal = "Ear and hearing",

issn = "0196-0202",

publisher = "Lippincott Williams and Wilkins",

number = "5",

}

TY - JOUR

T1 - Human Frequency Following Responses to Vocoded Speech

AU - Ananthakrishnan, Saradha

AU - Luo, Xin

AU - Krishnan, Ananthanarayan

PY - 2017/9/1

Y1 - 2017/9/1

UR - http://www.scopus.com/inward/record.url?scp=85016606650&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85016606650&partnerID=8YFLogxK

U2 - 10.1097/AUD.0000000000000432

DO - 10.1097/AUD.0000000000000432

M3 - Article

C2 - 28362674

AN - SCOPUS:85016606650

SN - 0196-0202

VL - 38

SP - e256-e267

JO - Ear and hearing

JF - Ear and hearing

IS - 5

ER -
