Speaker normalization for Chinese vowel recognition in cochlear implants

Xin Luo, Qian Jie Fu

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Because of the limited spectro-temporal resolution associated with cochlear implants, implant patients often have greater difficulty with multitalker speech recognition. The present study investigated whether multitalker speech recognition can be improved by applying speaker normalization techniques to cochlear implant speech processing. Multitalker Chinese vowel recognition was tested with normal-hearing Chinese-speaking subjects listening to a 4-channel cochlear implant simulation, with and without speaker normalization. For each subject, speaker normalization was referenced to the speaker who produced the best recognition performance under conditions without speaker normalization. To match the remaining speakers to this "optimal" output pattern, the overall frequency range of the analysis filter bank was adjusted for each speaker according to the ratio of the mean third formant frequency values between the specific speaker and the reference speaker. Results showed that speaker normalization provided a small but significant improvement in subjects' overall recognition performance. After speaker normalization, subjects' patterns of recognition performance across speakers changed, demonstrating the potential for speaker-dependent effects with the proposed normalization technique.
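The filter-bank adjustment described in the abstract can be illustrated with a minimal sketch. It is one plausible reading, not the authors' implementation: the abstract does not state the analysis frequency range, the band spacing, or the direction of the ratio. The 100-6000 Hz range, the logarithmic spacing, the mean F3 values, and the function names `log_spaced_edges` and `normalized_edges` are all assumptions made for illustration.

```python
def log_spaced_edges(f_low, f_high, n_channels):
    """Return n_channels + 1 band edges spaced evenly on a log-frequency axis.

    Log spacing is an assumption; the abstract does not specify the spacing.
    """
    ratio = f_high / f_low
    return [f_low * ratio ** (i / n_channels) for i in range(n_channels + 1)]


def normalized_edges(ref_edges, f3_speaker_mean, f3_reference_mean):
    """Scale every band edge by the speaker-to-reference mean-F3 ratio.

    Assumed direction: a speaker whose mean F3 is higher than the
    reference speaker's gets a proportionally higher analysis range.
    """
    scale = f3_speaker_mean / f3_reference_mean
    return [edge * scale for edge in ref_edges]


# Hypothetical values, not taken from the paper.
ref_edges = log_spaced_edges(100.0, 6000.0, 4)      # 4-channel reference bank
speaker_edges = normalized_edges(ref_edges, 3000.0, 2500.0)

for lo, hi in zip(speaker_edges[:-1], speaker_edges[1:]):
    print(f"{lo:8.1f} Hz to {hi:8.1f} Hz")
```

Under these assumptions the whole analysis range shifts by a single factor, so a speaker's formants would tend to fall in the same channels as the reference speaker's, while the carrier or electrode assignment stays unchanged.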

Original language: English (US)
Pages (from-to): 1358-1361
Number of pages: 4
Journal: IEEE Transactions on Biomedical Engineering
Volume: 52
Issue number: 7
DOIs: 10.1109/TBME.2005.847530
State: Published - Jul 2005
Externally published: Yes

Fingerprint

Cochlear implants
Speech recognition
Speech processing
Filter banks
Audition

Keywords

  • Cochlear implants
  • Speaker normalization
  • Vowel recognition

ASJC Scopus subject areas

  • Biomedical Engineering

Cite this

Speaker normalization for Chinese vowel recognition in cochlear implants. / Luo, Xin; Fu, Qian Jie.

In: IEEE Transactions on Biomedical Engineering, Vol. 52, No. 7, 07.2005, p. 1358-1361.

@article{4a4f3a5f0bb44a6da6fb664db6358de4,
title = "Speaker normalization for Chinese vowel recognition in cochlear implants",
abstract = "Because of the limited spectro-temporal resolution associated with cochlear implants, implant patients often have greater difficulty with multitalker speech recognition. The present study investigated whether multitalker speech recognition can be improved by applying speaker normalization techniques to cochlear implant speech processing. Multitalker Chinese vowel recognition was tested with normal-hearing Chinese-speaking subjects listening to a 4-channel cochlear implant simulation, with and without speaker normalization. For each subject, speaker normalization was referenced to the speaker who produced the best recognition performance under conditions without speaker normalization. To match the remaining speakers to this {"}optimal{"} output pattern, the overall frequency range of the analysis filter bank was adjusted for each speaker according to the ratio of the mean third formant frequency values between the specific speaker and the reference speaker. Results showed that speaker normalization provided a small but significant improvement in subjects' overall recognition performance. After speaker normalization, subjects' patterns of recognition performance across speakers changed, demonstrating the potential for speaker-dependent effects with the proposed normalization technique.",
keywords = "Cochlear implants, Speaker normalization, Vowel recognition",
author = "Luo, Xin and Fu, {Qian Jie}",
year = "2005",
month = "7",
doi = "10.1109/TBME.2005.847530",
language = "English (US)",
volume = "52",
pages = "1358--1361",
journal = "IEEE Transactions on Biomedical Engineering",
issn = "0018-9294",
publisher = "IEEE Computer Society",
number = "7",

}

TY - JOUR

T1 - Speaker normalization for Chinese vowel recognition in cochlear implants

AU - Luo, Xin

AU - Fu, Qian Jie

PY - 2005/7

Y1 - 2005/7

AB - Because of the limited spectro-temporal resolution associated with cochlear implants, implant patients often have greater difficulty with multitalker speech recognition. The present study investigated whether multitalker speech recognition can be improved by applying speaker normalization techniques to cochlear implant speech processing. Multitalker Chinese vowel recognition was tested with normal-hearing Chinese-speaking subjects listening to a 4-channel cochlear implant simulation, with and without speaker normalization. For each subject, speaker normalization was referenced to the speaker who produced the best recognition performance under conditions without speaker normalization. To match the remaining speakers to this "optimal" output pattern, the overall frequency range of the analysis filter bank was adjusted for each speaker according to the ratio of the mean third formant frequency values between the specific speaker and the reference speaker. Results showed that speaker normalization provided a small but significant improvement in subjects' overall recognition performance. After speaker normalization, subjects' patterns of recognition performance across speakers changed, demonstrating the potential for speaker-dependent effects with the proposed normalization technique.

KW - Cochlear implants

KW - Speaker normalization

KW - Vowel recognition

UR - http://www.scopus.com/inward/record.url?scp=21844459236&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=21844459236&partnerID=8YFLogxK

U2 - 10.1109/TBME.2005.847530

DO - 10.1109/TBME.2005.847530

M3 - Article

C2 - 16042003

AN - SCOPUS:21844459236

VL - 52

SP - 1358

EP - 1361

JO - IEEE Transactions on Biomedical Engineering

JF - IEEE Transactions on Biomedical Engineering

SN - 0018-9294

IS - 7

ER -