HMM-based speech enhancement using harmonic modeling

Michael E. Deisher; Andreas Spanias

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper describes a technique for reduction of non-stationary noise in electronic voice communication systems. Removal of noise is needed in many such systems, particularly those deployed in harsh mobile or otherwise dynamic acoustic environments. The proposed method employs state-based statistical models of both speech and noise, and is thus capable of tracking variations in noise during sustained speech. This work extends the hidden Markov model (HMM) based minimum mean square error (MMSE) estimator to incorporate a ternary voicing state, and applies it to a harmonic representation of voiced speech. Noise reduction during voiced sounds is thereby improved. Performance is evaluated using speech and noise from standard databases. The extended algorithm is demonstrated to improve speech quality as measured by informal preference tests and objective measures, to preserve speech intelligibility as measured by informal Diagnostic Rhyme Tests, and to improve the performance of a low bit-rate speech coder and a speech recognition system when used as a pre-processor.
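The abstract does not give the estimator's equations. As a rough illustration only, the sketch below shows the general form an HMM-based MMSE spectral estimator commonly takes in the literature: a sum of per-state Wiener gains, weighted by the posterior probability of each (speech state, noise state) pair. It assumes the posteriors and per-state power spectra are already available, and it does not reproduce the paper's ternary voicing state or harmonic representation. All function and parameter names are hypothetical.

```python
from typing import List, Sequence


def wiener_gain(speech_psd: Sequence[float], noise_psd: Sequence[float]) -> List[float]:
    """Per-bin Wiener gain S / (S + N) for one (speech state, noise state) pair."""
    return [s / (s + n) if (s + n) > 0.0 else 0.0
            for s, n in zip(speech_psd, noise_psd)]


def mmse_gain(posteriors: Sequence[Sequence[float]],
              speech_psds: Sequence[Sequence[float]],
              noise_psds: Sequence[Sequence[float]]) -> List[float]:
    """Posterior-weighted sum of Wiener gains.

    posteriors[i][j] is assumed to be the probability of speech state i and
    noise state j given the noisy observations (how it is computed is outside
    this sketch). speech_psds[i] and noise_psds[j] are per-state power spectra.
    """
    n_bins = len(speech_psds[0])
    gain = [0.0] * n_bins
    for i, s_psd in enumerate(speech_psds):
        for j, n_psd in enumerate(noise_psds):
            weight = posteriors[i][j]
            for k, g in enumerate(wiener_gain(s_psd, n_psd)):
                gain[k] += weight * g
    return gain


def enhance_frame(noisy_spectrum, posteriors, speech_psds, noise_psds):
    """Apply the combined gain to one frame of noisy spectral coefficients."""
    gain = mmse_gain(posteriors, speech_psds, noise_psds)
    return [g * y for g, y in zip(gain, noisy_spectrum)]
```

With two speech states, one noise state, and equal posteriors, the combined gain is simply the average of the two per-state Wiener gains. Because the noise model also has states, the weights can shift toward a different noise spectrum from frame to frame, which is the property the abstract refers to as tracking noise variations during sustained speech.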

Original language	English (US)
Title of host publication	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Editors	Anon
Publisher	IEEE
Pages	1175-1178
Number of pages	4
Volume	2
State	Published - 1997
Externally published	Yes
Event	Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 5), Munich, Germany. Duration: Apr 21 1997 → Apr 24 1997

ASJC Scopus subject areas

Signal Processing
Electrical and Electronic Engineering
Acoustics and Ultrasonics

Cite this

@inproceedings{2387485df8464359a8774b73ff983e41,

title = "HMM-based speech enhancement using harmonic modeling",

abstract = "This paper describes a technique for reduction of non-stationary noise in electronic voice communication systems. Removal of noise is needed in many such systems, particularly those deployed in harsh mobile or otherwise dynamic acoustic environments. The proposed method employs state-based statistical models of both speech and noise, and is thus capable of tracking variations in noise during sustained speech. This work extends the hidden Markov model (HMM) based minimum mean square error (MMSE) estimator to incorporate a ternary voicing state, and applies it to a harmonic representation of voiced speech. Noise reduction during voiced sounds is thereby improved. Performance is evaluated using speech and noise from standard databases. The extended algorithm is demonstrated to improve speech quality as measured by informal preference tests and objective measures, to preserve speech intelligibility as measured by informal Diagnostic Rhyme Tests, and to improve the performance of a low bit-rate speech coder and a speech recognition system when used as a pre-processor.",

author = "Deisher, Michael E. and Spanias, Andreas",

year = "1997",

language = "English (US)",

volume = "2",

pages = "1175--1178",

editor = "Anon",

booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

publisher = "IEEE",

note = "Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 5); Conference date: 21-04-1997 through 24-04-1997",

}

TY - GEN

T1 - HMM-based speech enhancement using harmonic modeling

AU - Deisher, Michael E.

AU - Spanias, Andreas

PY - 1997

Y1 - 1997

AB - This paper describes a technique for reduction of non-stationary noise in electronic voice communication systems. Removal of noise is needed in many such systems, particularly those deployed in harsh mobile or otherwise dynamic acoustic environments. The proposed method employs state-based statistical models of both speech and noise, and is thus capable of tracking variations in noise during sustained speech. This work extends the hidden Markov model (HMM) based minimum mean square error (MMSE) estimator to incorporate a ternary voicing state, and applies it to a harmonic representation of voiced speech. Noise reduction during voiced sounds is thereby improved. Performance is evaluated using speech and noise from standard databases. The extended algorithm is demonstrated to improve speech quality as measured by informal preference tests and objective measures, to preserve speech intelligibility as measured by informal Diagnostic Rhyme Tests, and to improve the performance of a low bit-rate speech coder and a speech recognition system when used as a pre-processor.

UR - http://www.scopus.com/inward/record.url?scp=0030643942&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0030643942&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0030643942

VL - 2

SP - 1175

EP - 1178

BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

A2 - Anon

PB - IEEE

T2 - Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 5)

Y2 - 21 April 1997 through 24 April 1997

ER -
