Improved approach to robust speech recognition using minimum error classification

Min Tau Lin, Andreas Spanias, Philipos Loizou

Research output: Contribution to journal (Article)

5 Citations (Scopus)

Abstract

An effective way of applying minimum error classification (MEC) to improve robustness in speech recognition is presented in this paper. In contrast to the traditional maximum likelihood (ML) training procedure that attempts to maximize the a priori probability of generating the training data set, MEC training attempts to minimize a function of the recognition error on the given training data set. In the MEC training procedure, the N-best algorithm is used to maximize the separation between the correct and competing models over confusable training tokens. The main focus of this paper is to investigate the effectiveness of MEC training when combined with four existing speech recognition algorithms under noisy and telephone mismatched environments. These algorithms are the weighted projection measure (WPM), the minimax approach (MA), the cepstral mean subtraction (CMS) method and the stochastic matching algorithms (SMAs). Experiments were performed using the Texas Instruments isolated digits database and the E-set words from the OGI Spelled and Spoken Telephone Corpus. The average word error rate reduction due to MEC training was 22.5% for isolated digit recognition and 8% for E-set word recognition.
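The contrast drawn in the abstract between likelihood-based training and error-based training can be illustrated with a small sketch. The code below is not the authors' formulation, which the abstract does not give. It assumes a commonly used minimum-classification-error style setup: each training token has a log-likelihood score for the correct model and scores for the competing models from an N-best list, a misclassification measure compares them, and a sigmoid turns that measure into a smooth error count that could be minimized. The function names and the parameters `eta` and `gamma` are hypothetical.

```python
import math


def misclassification_measure(correct_score, competitor_scores, eta=1.0):
    """Compare the correct model's score with a soft maximum of its competitors.

    Assumed form (not taken from the paper):
        d = (1/eta) * log(mean(exp(eta * s_j))) - correct_score
    A positive d means the competitors outscore the correct model.
    """
    # Subtract the largest competitor score before exponentiating (log-sum-exp trick).
    m = max(competitor_scores)
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitor_scores) / len(competitor_scores)
    soft_max = m + math.log(mean_exp) / eta
    return soft_max - correct_score


def smoothed_error(d, gamma=1.0):
    """Sigmoid of the misclassification measure: a differentiable stand-in for a 0/1 error."""
    return 1.0 / (1.0 + math.exp(-gamma * d))


# Hypothetical log-likelihood scores for two training tokens.
well_separated = misclassification_measure(-10.0, [-12.0, -15.0])
confusable = misclassification_measure(-15.0, [-10.0, -12.0])

print(smoothed_error(well_separated))  # below 0.5: token counted as mostly correct
print(smoothed_error(confusable))      # above 0.5: token counted as mostly in error
```

Under these assumptions, training would adjust model parameters to lower the average smoothed error, which pushes the correct model's score away from its N-best competitors on confusable tokens rather than raising the likelihood of the data alone.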

Original language: English (US)
Pages (from-to): 27-36
Number of pages: 10
Journal: Speech Communication
Volume: 30
Issue number: 1
DOI: 10.1016/S0167-6393(99)00027-8
State: Published - Jan 2000

Fingerprint

Robust Speech Recognition
Speech Recognition
Telephone
Digit
Maximise
Training
Databases
Stochastic Algorithms
Recognition Algorithm
Subtraction
Maximum Likelihood
Matching Algorithm
Minimax
Error Rate
Projection
Robustness

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Experimental and Cognitive Psychology
  • Linguistics and Language

Cite this

Improved approach to robust speech recognition using minimum error classification. / Lin, Min Tau; Spanias, Andreas; Loizou, Philipos.

In: Speech Communication, Vol. 30, No. 1, 01.2000, p. 27-36.

@article{d81c9d1c1ce345998b9ce8f80e9e2fcf,
title = "Improved approach to robust speech recognition using minimum error classification",
abstract = "An effective way of applying minimum error classification (MEC) to improve robustness in speech recognition is presented in this paper. In contrast to the traditional maximum likelihood (ML) training procedure that attempts to maximize the a priori probability of generating the training data set, MEC training attempts to minimize a function of the recognition error on the given training data set. In the MEC training procedure, the N-best algorithm is used to maximize the separation between the correct and competing models over confusable training tokens. The main focus of this paper is to investigate the effectiveness of MEC training when combined with four existing speech recognition algorithms under noisy and telephone mismatched environments. These algorithms are the weighted projection measure (WPM), the minimax approach (MA), the cepstral mean subtraction (CMS) method and the stochastic matching algorithms (SMAs). Experiments were performed using the Texas Instruments isolated digits database and the E-set words from the OGI Spelled and Spoken Telephone Corpus. The average word error rate reduction due to MEC training was 22.5{\%} for isolated digit recognition and 8{\%} for E-set word recognition.",
author = "Lin, {Min Tau} and Andreas Spanias and Philipos Loizou",
year = "2000",
month = "1",
doi = "10.1016/S0167-6393(99)00027-8",
language = "English (US)",
volume = "30",
pages = "27--36",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",
number = "1",

}

TY - JOUR
T1 - Improved approach to robust speech recognition using minimum error classification
AU - Lin, Min Tau
AU - Spanias, Andreas
AU - Loizou, Philipos
PY - 2000/1
Y1 - 2000/1
AB - An effective way of applying minimum error classification (MEC) to improve robustness in speech recognition is presented in this paper. In contrast to the traditional maximum likelihood (ML) training procedure that attempts to maximize the a priori probability of generating the training data set, MEC training attempts to minimize a function of the recognition error on the given training data set. In the MEC training procedure, the N-best algorithm is used to maximize the separation between the correct and competing models over confusable training tokens. The main focus of this paper is to investigate the effectiveness of MEC training when combined with four existing speech recognition algorithms under noisy and telephone mismatched environments. These algorithms are the weighted projection measure (WPM), the minimax approach (MA), the cepstral mean subtraction (CMS) method and the stochastic matching algorithms (SMAs). Experiments were performed using the Texas Instruments isolated digits database and the E-set words from the OGI Spelled and Spoken Telephone Corpus. The average word error rate reduction due to MEC training was 22.5% for isolated digit recognition and 8% for E-set word recognition.
UR - http://www.scopus.com/inward/record.url?scp=0033889906&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033889906&partnerID=8YFLogxK
U2 - 10.1016/S0167-6393(99)00027-8
DO - 10.1016/S0167-6393(99)00027-8
M3 - Article
AN - SCOPUS:0033889906
VL - 30
SP - 27
EP - 36
JO - Speech Communication
JF - Speech Communication
SN - 0167-6393
IS - 1
ER -