Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm

Sassan Ahmadi, Andreas Spanias

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper addresses the design, development, evaluation, and implementation of efficient low bit rate speech coding algorithms based on the sinusoidal model. A series of algorithms have been developed for pitch frequency determination and voicing detection, simultaneous modeling of the sinusoidal amplitudes and phases, and mid-frame interpolation. An improved sinusoidal phase matching algorithm is presented, where short-time sinusoidal phases are approximated using an elaborate combination of linear prediction, spectral sampling, delay compensation, and phase correction techniques. A voicing-dependent perceptual split vector quantization scheme is used to encode the sinusoidal amplitudes. The perceptual properties of the human auditory system are effectively exploited in the developed algorithms. The algorithms have been successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder has been evaluated in terms of subjective tests such as the mean opinion score and the diagnostic rhyme test, as well as some perceptually-motivated objective distortion measures. Performance analysis on a large speech database indicates that the use of the proposed algorithms resulted in considerable improvement in temporal and spectral signal matching, as well as improved subjective quality of the reproduced speech.
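As a rough illustration of the sinusoidal model the abstract refers to, the sketch below synthesizes one voiced frame as a sum of harmonics of the pitch frequency. It is not the authors' coder: the function name `synthesize_frame`, the 8 kHz sampling rate, the 20 ms frame, and the example amplitudes and phases are all assumptions chosen for illustration, and the phase matching, quantization, and interpolation steps described in the paper are omitted.

```python
import math

def synthesize_frame(amplitudes, phases, f0_hz, fs_hz, num_samples):
    """Harmonic sinusoidal model: s[n] = sum_k A_k * cos(k * w0 * n + phi_k).

    amplitudes[k-1] and phases[k-1] belong to the k-th harmonic of f0_hz.
    All names and values here are illustrative assumptions.
    """
    if len(amplitudes) != len(phases):
        raise ValueError("amplitudes and phases must have the same length")
    w0 = 2.0 * math.pi * f0_hz / fs_hz  # pitch frequency in radians per sample
    frame = []
    for n in range(num_samples):
        sample = 0.0
        for k, (amp, phi) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.cos(k * w0 * n + phi)
        frame.append(sample)
    return frame

# Hypothetical example: a 100 Hz pitch, three harmonics, 20 ms at 8 kHz.
frame = synthesize_frame([1.0, 0.5, 0.25], [0.0, 0.0, 0.0], 100.0, 8000.0, 160)
```

In a coder of this kind, the decoder would run a synthesis step like this on amplitudes and phases reconstructed from the transmitted parameters rather than on exact values.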

Original language: English (US)
Title of host publication: Conference Record of the Asilomar Conference on Signals, Systems and Computers
Editors: M.P. Farques, R.D. Hippenstiel
Publisher: IEEE Comp Soc
Pages: 1075-1079
Number of pages: 5
Volume: 2
State: Published - 1998
Externally published: Yes
Event: Proceedings of the 1997 31st Asilomar Conference on Signals, Systems & Computers. Part 1 (of 2) - Pacific Grove, CA, USA
Duration: Nov 2 1997 to Nov 5 1997

Other

Other: Proceedings of the 1997 31st Asilomar Conference on Signals, Systems & Computers. Part 1 (of 2)
City: Pacific Grove, CA, USA
Period: 11/2/97 to 11/5/97

Fingerprint

Speech coding
Phase matching
Vector quantization
Interpolation
Sampling

ASJC Scopus subject areas

  • Hardware and Architecture
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Ahmadi, S., & Spanias, A. (1998). Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm. In M. P. Farques, & R. D. Hippenstiel (Eds.), Conference Record of the Asilomar Conference on Signals, Systems and Computers (Vol. 2, pp. 1075-1079). IEEE Comp Soc.

Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm. / Ahmadi, Sassan; Spanias, Andreas.

Conference Record of the Asilomar Conference on Signals, Systems and Computers. ed. / M.P. Farques; R.D. Hippenstiel. Vol. 2 IEEE Comp Soc, 1998. p. 1075-1079.

Ahmadi, S & Spanias, A 1998, Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm. in MP Farques & RD Hippenstiel (eds), Conference Record of the Asilomar Conference on Signals, Systems and Computers. vol. 2, IEEE Comp Soc, pp. 1075-1079, Proceedings of the 1997 31st Asilomar Conference on Signals, Systems & Computers. Part 1 (of 2), Pacific Grove, CA, USA, 11/2/97.

Ahmadi S, Spanias A. Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm. In Farques MP, Hippenstiel RD, editors, Conference Record of the Asilomar Conference on Signals, Systems and Computers. Vol. 2. IEEE Comp Soc. 1998. p. 1075-1079

Ahmadi, Sassan ; Spanias, Andreas. / Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm. Conference Record of the Asilomar Conference on Signals, Systems and Computers. editor / M.P. Farques ; R.D. Hippenstiel. Vol. 2 IEEE Comp Soc, 1998. pp. 1075-1079

@inproceedings{cc33d5746bf747a6b8744cb5bb074567,
title = "Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm",
abstract = "This paper addresses the design, development, evaluation, and implementation of efficient low bit rate speech coding algorithms based on the sinusoidal model. A series of algorithms have been developed for pitch frequency determination and voicing detection, simultaneous modeling of the sinusoidal amplitudes and phases, and mid-frame interpolation. An improved sinusoidal phase matching algorithm is presented, where short-time sinusoidal phases are approximated using an elaborate combination of linear prediction, spectral sampling, delay compensation, and phase correction techniques. A voicing-dependent perceptual split vector quantization scheme is used to encode the sinusoidal amplitudes. The perceptual properties of the human auditory system are effectively exploited in the developed algorithms. The algorithms have been successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder has been evaluated in terms of subjective tests such as the mean opinion score and the diagnostic rhyme test, as well as some perceptually-motivated objective distortion measures. Performance analysis on a large speech database indicates that the use of the proposed algorithms resulted in considerable improvement in temporal and spectral signal matching, as well as improved subjective quality of the reproduced speech.",
author = "Sassan Ahmadi and Andreas Spanias",
year = "1998",
language = "English (US)",
volume = "2",
pages = "1075--1079",
editor = "M.P. Farques and R.D. Hippenstiel",
booktitle = "Conference Record of the Asilomar Conference on Signals, Systems and Computers",
publisher = "IEEE Comp Soc"
}

TY - GEN
T1 - Sinusoidal speech coding at 2.4 kbps using an improved phase matching algorithm
AU - Ahmadi, Sassan
AU - Spanias, Andreas
PY - 1998
Y1 - 1998
N2 - This paper addresses the design, development, evaluation, and implementation of efficient low bit rate speech coding algorithms based on the sinusoidal model. A series of algorithms have been developed for pitch frequency determination and voicing detection, simultaneous modeling of the sinusoidal amplitudes and phases, and mid-frame interpolation. An improved sinusoidal phase matching algorithm is presented, where short-time sinusoidal phases are approximated using an elaborate combination of linear prediction, spectral sampling, delay compensation, and phase correction techniques. A voicing-dependent perceptual split vector quantization scheme is used to encode the sinusoidal amplitudes. The perceptual properties of the human auditory system are effectively exploited in the developed algorithms. The algorithms have been successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder has been evaluated in terms of subjective tests such as the mean opinion score and the diagnostic rhyme test, as well as some perceptually-motivated objective distortion measures. Performance analysis on a large speech database indicates that the use of the proposed algorithms resulted in considerable improvement in temporal and spectral signal matching, as well as improved subjective quality of the reproduced speech.

UR - http://www.scopus.com/inward/record.url?scp=0031648999&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0031648999&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0031648999
VL - 2
SP - 1075
EP - 1079
BT - Conference Record of the Asilomar Conference on Signals, Systems and Computers
A2 - Farques, M.P.
A2 - Hippenstiel, R.D.
PB - IEEE Comp Soc
ER -