HMM-based speech enhancement using harmonic modeling

Michael E. Deisher, Andreas Spanias

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

12 Citations (Scopus)

Abstract

This paper describes a technique for reduction of non-stationary noise in electronic voice communication systems. Removal of noise is needed in many such systems, particularly those deployed in harsh mobile or otherwise dynamic acoustic environments. The proposed method employs state-based statistical models of both speech and noise, and is thus capable of tracking variations in noise during sustained speech. This work extends the hidden Markov model (HMM) based minimum mean square error (MMSE) estimator to incorporate a ternary voicing state, and applies it to a harmonic representation of voiced speech. Noise reduction during voiced sounds is thereby improved. Performance is evaluated using speech and noise from standard databases. The extended algorithm is demonstrated to improve speech quality as measured by informal preference tests and objective measures, to preserve speech intelligibility as measured by informal Diagnostic Rhyme Tests, and to improve the performance of a low bit-rate speech coder and a speech recognition system when used as a pre-processor.
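As a rough illustration of the state-based MMSE idea summarized above, the sketch below forms a clean-speech estimate for one spectral frame as a posterior-weighted sum of per-state Wiener filters, taken over every pairing of a speech state with a noise state. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mmse_enhance_frame`, the zero-mean complex Gaussian likelihood, the fixed priors (used in place of HMM forward probabilities), and the toy power spectra are all hypothetical. The three speech states stand in loosely for a ternary voicing state (silence, unvoiced, voiced), and the harmonic model of voiced speech is not reproduced here.

```python
import numpy as np


def mmse_enhance_frame(noisy_spec, speech_psds, noise_psds, speech_prior, noise_prior):
    """Posterior-weighted sum of per-state Wiener filters for one frame.

    noisy_spec:   complex array, shape (F,)
    speech_psds:  array (K, F), speech power spectrum for each speech state
    noise_psds:   array (M, F), noise power spectrum for each noise state
    speech_prior: array (K,), noise_prior: array (M,), strictly positive
    Returns the enhanced spectrum (F,) and the state-pair posterior (K, M).
    """
    # Variance of the noisy signal for every (speech, noise) pair: (K, M, F).
    total = speech_psds[:, None, :] + noise_psds[None, :, :]
    power = np.abs(noisy_spec) ** 2

    # Log-likelihood of the frame under a zero-mean complex Gaussian (assumed).
    loglik = -np.sum(np.log(np.pi * total) + power / total, axis=-1)

    # Combine with the priors and normalize in the log domain for stability.
    logpost = loglik + np.log(speech_prior)[:, None] + np.log(noise_prior)[None, :]
    logpost -= logpost.max()
    post = np.exp(logpost)
    post /= post.sum()

    # Wiener gain for each pair, then average the gains under the posterior.
    gains = speech_psds[:, None, :] / total
    gain = np.sum(post[:, :, None] * gains, axis=(0, 1))
    return gain * noisy_spec, post


# Toy setup (hypothetical values): 3 speech states, 2 noise states, 8 bins.
F = 8
speech_psds = np.array([
    np.full(F, 1e-3),            # "silence": almost no speech energy
    np.linspace(0.5, 2.0, F),    # "unvoiced": energy rising with frequency
    np.tile([4.0, 0.2], F // 2), # "voiced": energy concentrated in alternate bins
])
noise_psds = np.array([np.full(F, 0.1), np.full(F, 1.0)])

# A frame whose power equals the (voiced, louder-noise) variance in every bin.
noisy = np.sqrt(speech_psds[2] + noise_psds[1]).astype(complex)
enhanced, posterior = mmse_enhance_frame(
    noisy, speech_psds, noise_psds, np.full(3, 1 / 3), np.full(2, 1 / 2)
)
print(np.round(posterior, 3))
```

Because the combined gain is a convex combination of Wiener gains, the enhanced magnitude never exceeds the noisy magnitude in any bin. A full HMM-based estimator would replace the fixed priors with state probabilities propagated from frame to frame, which is what lets the noise model follow changes during sustained speech.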

Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Editors: Anon
Publisher: IEEE
Pages: 1175-1178
Number of pages: 4
Volume: 2
State: Published - 1997
Externally published: Yes
Event: Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 5) - Munich, Germany
Duration: Apr 21 1997 to Apr 24 1997

Other

Other: Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 5)
City: Munich, Germany
Period: 4/21/97 to 4/24/97

Fingerprint

Speech enhancement
Hidden Markov models
harmonics
augmentation
Speech intelligibility
voice communication
Speech communication
intelligibility
Noise abatement
acoustics
Speech recognition
Mean square error
coders
noise reduction
estimators
Communication systems
central processing units
telecommunication

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

Deisher, M. E., & Spanias, A. (1997). HMM-based speech enhancement using harmonic modeling. In Anon (Ed.), ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 2, pp. 1175-1178). IEEE.

@inproceedings{2387485df8464359a8774b73ff983e41,
title = "HMM-based speech enhancement using harmonic modeling",
author = "Deisher, Michael E. and Spanias, Andreas",
year = "1997",
language = "English (US)",
volume = "2",
pages = "1175--1178",
editor = "Anon",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "IEEE",

}

TY - GEN
T1 - HMM-based speech enhancement using harmonic modeling
AU - Deisher, Michael E.
AU - Spanias, Andreas
PY - 1997
Y1 - 1997
N2 - This paper describes a technique for reduction of non-stationary noise in electronic voice communication systems. Removal of noise is needed in many such systems, particularly those deployed in harsh mobile or otherwise dynamic acoustic environments. The proposed method employs state-based statistical models of both speech and noise, and is thus capable of tracking variations in noise during sustained speech. This work extends the hidden Markov model (HMM) based minimum mean square error (MMSE) estimator to incorporate a ternary voicing state, and applies it to a harmonic representation of voiced speech. Noise reduction during voiced sounds is thereby improved. Performance is evaluated using speech and noise from standard databases. The extended algorithm is demonstrated to improve speech quality as measured by informal preference tests and objective measures, to preserve speech intelligibility as measured by informal Diagnostic Rhyme Tests, and to improve the performance of a low bit-rate speech coder and a speech recognition system when used as a pre-processor.

UR - http://www.scopus.com/inward/record.url?scp=0030643942&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0030643942&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0030643942
VL - 2
SP - 1175
EP - 1178
BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
A2 - Anon
PB - IEEE
ER -