A procedure for estimating the parameters of a sinusoidal model from speech corrupted by additive noise is described. An approximate harmonic representation is used wherein voiced speech is represented by a set of sine waves at multiples of the fundamental frequency and several additional components at frequencies near each harmonic. Amplitudes and phases of the sinusoidal components are estimated using a state-based technique that employs hidden Markov models (HMMs) to classify speech and noise spectra. Voicing and fundamental frequency are determined using an analysis-by-synthesis approach. Simulation results are presented, comparing the performance of the proposed algorithm to that of the standard HMM-based minimum mean square error (MMSE) estimator. The proposed method was found to reduce the structured residual noise associated with HMM-based algorithms.
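The approximate harmonic representation can be illustrated with a minimal sketch of the synthesis side only: a frame is built as a sum of sine waves at multiples of the fundamental frequency, plus a few components near each harmonic. The function names, the ±20 Hz offsets, the amplitudes, and the sampling rate below are illustrative assumptions, not values from the paper, and the sketch does not attempt the HMM-based amplitude and phase estimation or the analysis-by-synthesis pitch search.

```python
import math


def harmonic_frequencies(f0, num_harmonics, offsets_hz=(-20.0, 0.0, 20.0)):
    """Return frequencies (Hz) at each harmonic k*f0 plus nearby components.

    offsets_hz is a hypothetical choice for the "additional components
    near each harmonic"; the source does not specify their spacing.
    """
    return [k * f0 + d for k in range(1, num_harmonics + 1) for d in offsets_hz]


def synthesize(freqs, amps, phases, fs, n_samples):
    """Sum of sinusoids: s[n] = sum_i A_i * cos(2*pi*f_i*n/fs + phi_i)."""
    return [
        sum(a * math.cos(2.0 * math.pi * f * n / fs + p)
            for f, a, p in zip(freqs, amps, phases))
        for n in range(n_samples)
    ]


# Illustrative values only (assumed, not taken from the paper).
fs = 8000.0                                  # sampling rate in Hz
freqs = harmonic_frequencies(100.0, 3)       # 3 harmonics of a 100 Hz pitch
amps = [0.1, 1.0, 0.1] * 3                   # side components weaker than harmonics
phases = [0.0] * len(freqs)
frame = synthesize(freqs, amps, phases, fs, 160)  # one 20 ms frame
```

In an estimator of the kind described, the amplitudes and phases passed to `synthesize` would come from the noisy-speech analysis rather than being fixed by hand as they are here.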
ASJC Scopus subject areas
- Arts and Humanities (miscellaneous)
- Acoustics and Ultrasonics