This paper presents an effective way of applying minimum error classification (MEC) training to improve robustness in speech recognition. In contrast to the traditional maximum likelihood (ML) training procedure, which attempts to maximize the likelihood of the training data set, MEC training attempts to minimize a function of the recognition error on that training data set. In the MEC training procedure, the N-best algorithm is used to identify competing models so that the separation between the correct and competing models is maximized over confusable training tokens. The main focus of this paper is to investigate the effectiveness of MEC training when combined with four existing speech recognition algorithms under noisy and telephone-mismatched environments. These algorithms are the weighted projection measure (WPM), the minimax approach (MA), the cepstral mean subtraction (CMS) method and the stochastic matching algorithms (SMAs). Experiments were performed using the Texas Instruments isolated digits database and the E-set words from the OGI Spelled and Spoken Telephone Corpus. The average word error rate reduction due to MEC training was 22.5% for isolated digit recognition and 8% for E-set word recognition.
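To make the contrast with ML training concrete, the following is a minimal sketch of a smoothed error function of the kind commonly used in minimum-classification-error training. It is an illustration under stated assumptions, not the formulation used in this paper, which the abstract does not give. The function names, the soft-maximum over N-best competitor scores, and the parameters `eta` and `gamma` are all hypothetical choices made for the example.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def misclassification_measure(correct_score, competitor_scores, eta=1.0):
    """Soft-maximum of the competitor scores minus the correct-model score.

    Assumption: scores are log-likelihoods, and competitor_scores holds the
    scores of the N-best competing models for one training token.
    A positive value suggests the token would be misrecognized.
    """
    m = max(competitor_scores)
    n = len(competitor_scores)
    # (1/eta) * log[(1/N) * sum_j exp(eta * g_j)], shifted by m for stability.
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitor_scores) / n
    soft_max = m + math.log(mean_exp) / eta
    return soft_max - correct_score


def smoothed_error(correct_score, competitor_scores, gamma=1.0, eta=1.0):
    """Differentiable stand-in for the 0/1 recognition error on one token."""
    d = misclassification_measure(correct_score, competitor_scores, eta)
    return _sigmoid(gamma * d)


def empirical_loss(tokens, gamma=1.0, eta=1.0):
    """Average smoothed error over (correct_score, competitor_scores) pairs."""
    return sum(smoothed_error(c, comp, gamma, eta) for c, comp in tokens) / len(tokens)
```

In this sketch, lowering `empirical_loss` pushes the correct-model score above the scores of its closest competitors, which is one way to read "maximizing the separation" over confusable tokens. As `eta` grows, the soft-maximum approaches the single best competitor.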
ASJC Scopus subject areas
- Modeling and Simulation
- Language and Linguistics
- Linguistics and Language
- Computer Vision and Pattern Recognition
- Computer Science Applications