Speech enhancement using state-based estimation and sinusoidal modeling

Michael E. Deisher, Andreas Spanias

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

A procedure for estimating the parameters of a sinusoidal model from speech corrupted by additive noise is described. An approximate harmonic representation is used wherein voiced speech is represented by a set of sine waves at multiples of the fundamental frequency and several additional components at frequencies near each harmonic. Amplitudes and phases of the sinusoidal components are estimated using a state-based technique that employs hidden Markov models (HMMs) to classify speech and noise spectra. Voicing and fundamental frequency are determined using an analysis-by-synthesis approach. Simulation results are presented, comparing the performance of the proposed algorithm to that of the standard HMM-based minimum mean square error (MMSE) estimator. The proposed method was found to reduce the structured residual noise associated with HMM-based algorithms.
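As a rough illustration of the harmonic part of the representation described in the abstract, the sketch below synthesizes one voiced frame as a sum of sine waves at integer multiples of a fundamental frequency. It is not the authors' method: it omits the additional components near each harmonic, the HMM-based amplitude and phase estimation, and the analysis-by-synthesis pitch search. The function name, the parameter names, and the example values (100 Hz fundamental, 8 kHz sampling rate, three harmonics) are all assumptions chosen for illustration.

```python
import math


def synthesize_harmonic_frame(f0_hz, amplitudes, phases, sample_rate_hz, num_samples):
    """Return num_samples of a sum of cosines at k * f0_hz, k = 1, 2, ...

    amplitudes[k-1] and phases[k-1] belong to the k-th harmonic.
    Hypothetical helper for illustration; not taken from the paper.
    """
    frame = []
    for n in range(num_samples):
        t = n / sample_rate_hz  # time of sample n in seconds
        sample = 0.0
        for k, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.cos(2.0 * math.pi * k * f0_hz * t + phase)
        frame.append(sample)
    return frame


# Assumed example values: 20 ms frame at 8 kHz with a 100 Hz fundamental.
frame = synthesize_harmonic_frame(
    f0_hz=100.0,
    amplitudes=[1.0, 0.5, 0.25],
    phases=[0.0, 0.0, 0.0],
    sample_rate_hz=8000.0,
    num_samples=160,
)
```

With these values the waveform repeats every 80 samples, one pitch period at 100 Hz and 8 kHz. In an enhancement setting the amplitudes and phases would come from an estimator applied to the noisy signal rather than being fixed by hand as they are here.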

Original language: English (US)
Pages (from-to): 1141-1149
Number of pages: 9
Journal: Journal of the Acoustical Society of America
Volume: 102
Issue number: 2 pt 1
DOIs: 10.1121/1.419866
State: Published - Aug 1997

Fingerprint

augmentation
harmonics
sine waves
noise spectra
estimators
estimating
Enhancement
Hidden Markov Model
Modeling
synthesis
Harmonics
Fundamental Frequency
simulation
Simulation
Voicing
Waves

ASJC Scopus subject areas

  • Acoustics and Ultrasonics

Cite this

Speech enhancement using state-based estimation and sinusoidal modeling. / Deisher, Michael E.; Spanias, Andreas.

In: Journal of the Acoustical Society of America, Vol. 102, No. 2 pt 1, 08.1997, p. 1141-1149.


@article{7e0308eed86647b8bad0e92df68a5ac9,
title = "Speech enhancement using state-based estimation and sinusoidal modeling",
abstract = "A procedure for estimating the parameters of a sinusoidal model from speech corrupted by additive noise is described. An approximate harmonic representation is used wherein voiced speech is represented by a set of sine waves at multiples of the fundamental frequency and several additional components at frequencies near each harmonic. Amplitudes and phases of the sinusoidal components are estimated using a state-based technique that employs hidden Markov models (HMMs) to classify speech and noise spectra. Voicing and fundamental frequency are determined using an analysis-by-synthesis approach. Simulation results are presented, comparing the performance of the proposed algorithm to that of the standard HMM-based minimum mean square error (MMSE) estimator. The proposed method was found to reduce the structured residual noise associated with HMM-based algorithms.",
author = "Deisher, Michael E. and Spanias, Andreas",
year = "1997",
month = "8",
doi = "10.1121/1.419866",
language = "English (US)",
volume = "102",
pages = "1141--1149",
journal = "Journal of the Acoustical Society of America",
issn = "0001-4966",
publisher = "Acoustical Society of America",
number = "2 pt 1",

}

TY - JOUR

T1 - Speech enhancement using state-based estimation and sinusoidal modeling

AU - Deisher, Michael E.

AU - Spanias, Andreas

PY - 1997/8

Y1 - 1997/8

N2 - A procedure for estimating the parameters of a sinusoidal model from speech corrupted by additive noise is described. An approximate harmonic representation is used wherein voiced speech is represented by a set of sine waves at multiples of the fundamental frequency and several additional components at frequencies near each harmonic. Amplitudes and phases of the sinusoidal components are estimated using a state-based technique that employs hidden Markov models (HMMs) to classify speech and noise spectra. Voicing and fundamental frequency are determined using an analysis-by-synthesis approach. Simulation results are presented, comparing the performance of the proposed algorithm to that of the standard HMM-based minimum mean square error (MMSE) estimator. The proposed method was found to reduce the structured residual noise associated with HMM-based algorithms.

AB - A procedure for estimating the parameters of a sinusoidal model from speech corrupted by additive noise is described. An approximate harmonic representation is used wherein voiced speech is represented by a set of sine waves at multiples of the fundamental frequency and several additional components at frequencies near each harmonic. Amplitudes and phases of the sinusoidal components are estimated using a state-based technique that employs hidden Markov models (HMMs) to classify speech and noise spectra. Voicing and fundamental frequency are determined using an analysis-by-synthesis approach. Simulation results are presented, comparing the performance of the proposed algorithm to that of the standard HMM-based minimum mean square error (MMSE) estimator. The proposed method was found to reduce the structured residual noise associated with HMM-based algorithms.

UR - http://www.scopus.com/inward/record.url?scp=0031214234&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0031214234&partnerID=8YFLogxK

U2 - 10.1121/1.419866

DO - 10.1121/1.419866

M3 - Article

AN - SCOPUS:0031214234

VL - 102

SP - 1141

EP - 1149

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

SN - 0001-4966

IS - 2 pt 1

ER -