Low bit-rate speech coding based on an improved sinusoidal model

Sassan Ahmadi; Andreas Spanias

doi:10.1016/S0167-6393(00)00057-1

Electrical Engineering

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

This paper addresses the design, implementation and evaluation of efficient low bit-rate speech coding algorithms based on an improved sinusoidal model. A series of algorithms were developed for speech classification and pitch frequency determination, modeling of sinusoidal amplitudes and phases, and frame interpolation. An improved paradigm for sinusoidal phase coding is presented, where short-time sinusoidal phases are modeled using a combination of linear prediction, spectral sampling, linear phase alignment and all-pass phase error correction components. A class-dependent split vector quantization scheme is used to encode the sinusoidal amplitudes. The masking properties of the human auditory system are effectively exploited in the algorithms. The algorithms were successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder was evaluated in terms of informal subjective tests such as the mean opinion score (MOS) and the diagnostic rhyme test (DRT), as well as some perceptually motivated objective distortion measures. Performance analysis on a large speech database indicates considerable improvement in short-time signal matching both in the time and the spectral domains. In addition, subjective quality of the reproduced speech is considerably improved.
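Two of the ingredients named in the abstract, linear prediction and spectral sampling, can be illustrated with a short Python sketch. It is not the authors' algorithm: it evaluates an all-pole (LP) envelope at the pitch harmonics to obtain one amplitude and one phase per sinusoid, then sums the sinusoids for a single frame. The function names, the predictor coefficients, the gain, the pitch and the 8 kHz sampling rate are all hypothetical values chosen for illustration, and the linear phase alignment, all-pass phase error correction, quantization and frame interpolation stages mentioned in the abstract are left out.

```python
import cmath
import math


def lp_envelope(a, gain, omega):
    """Evaluate H(e^{j*omega}) = gain / A(e^{j*omega}),
    where A(z) = 1 + sum_k a[k-1] * z^(-k) (hypothetical sign convention)."""
    denom = 1.0 + sum(c * cmath.exp(-1j * omega * (k + 1)) for k, c in enumerate(a))
    return gain / denom


def harmonic_params(a, gain, f0_hz, fs_hz):
    """Sample the LP envelope at every pitch harmonic strictly below Nyquist.

    Returns a list of (omega, amplitude, phase) tuples, omega in rad/sample.
    """
    params = []
    k = 1
    while k * f0_hz < fs_hz / 2:
        omega = 2 * math.pi * k * f0_hz / fs_hz
        h = lp_envelope(a, gain, omega)
        params.append((omega, abs(h), cmath.phase(h)))
        k += 1
    return params


def synthesize(params, n_samples):
    """Sum of sinusoids for one frame: s[n] = sum_k A_k * cos(omega_k * n + phi_k)."""
    return [
        sum(amp * math.cos(omega * n + phi) for omega, amp, phi in params)
        for n in range(n_samples)
    ]


# Illustrative values only (assumptions, not taken from the paper).
a = [-0.9]            # one-pole predictor, gives a low-pass envelope
params = harmonic_params(a, gain=1.0, f0_hz=100.0, fs_hz=8000.0)
frame = synthesize(params, n_samples=160)   # 20 ms at 8 kHz
```

In this simplified setup a frame is described by the predictor coefficients, a gain and a pitch value rather than by every amplitude and phase individually, which is the kind of parameter reduction that low bit-rate sinusoidal coders rely on.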

Original language	English (US)
Pages (from-to)	369-390
Number of pages	22
Journal	Speech Communication
Volume	34
Issue number	4
DOIs	https://doi.org/10.1016/S0167-6393(00)00057-1
State	Published - Jul 2001

Keywords

Frame interpolation
Linear prediction
Phase modeling
Sinusoidal model
Speech classification
Speech coding

ASJC Scopus subject areas

Software
Modeling and Simulation
Communication
Language and Linguistics
Linguistics and Language
Computer Vision and Pattern Recognition
Computer Science Applications

Cite this

@article{f0762cc25c3e47a2a0808fe603a32673,

title = "Low bit-rate speech coding based on an improved sinusoidal model",

abstract = "This paper addresses the design, implementation and evaluation of efficient low bit-rate speech coding algorithms based on an improved sinusoidal model. A series of algorithms were developed for speech classification and pitch frequency determination, modeling of sinusoidal amplitudes and phases, and frame interpolation. An improved paradigm for sinusoidal phase coding is presented, where short-time sinusoidal phases are modeled using a combination of linear prediction, spectral sampling, linear phase alignment and all-pass phase error correction components. A class-dependent split vector quantization scheme is used to encode the sinusoidal amplitudes. The masking properties of the human auditory system are effectively exploited in the algorithms. The algorithms were successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder was evaluated in terms of informal subjective tests such as the mean opinion score (MOS) and the diagnostic rhyme test (DRT), as well as some perceptually motivated objective distortion measures. Performance analysis on a large speech database indicates considerable improvement in short-time signal matching both in the time and the spectral domains. In addition, subjective quality of the reproduced speech is considerably improved.",

keywords = "Frame interpolation, Linear prediction, Phase modeling, Sinusoidal model, Speech classification, Speech coding",

author = "Sassan Ahmadi and Andreas Spanias",

year = "2001",

month = jul,

doi = "10.1016/S0167-6393(00)00057-1",

language = "English (US)",

volume = "34",

pages = "369--390",

journal = "Speech Communication",

issn = "0167-6393",

publisher = "Elsevier",

number = "4",

}

TY - JOUR

T1 - Low bit-rate speech coding based on an improved sinusoidal model

AU - Ahmadi, Sassan

AU - Spanias, Andreas

PY - 2001/7

Y1 - 2001/7

N2 - This paper addresses the design, implementation and evaluation of efficient low bit-rate speech coding algorithms based on an improved sinusoidal model. A series of algorithms were developed for speech classification and pitch frequency determination, modeling of sinusoidal amplitudes and phases, and frame interpolation. An improved paradigm for sinusoidal phase coding is presented, where short-time sinusoidal phases are modeled using a combination of linear prediction, spectral sampling, linear phase alignment and all-pass phase error correction components. A class-dependent split vector quantization scheme is used to encode the sinusoidal amplitudes. The masking properties of the human auditory system are effectively exploited in the algorithms. The algorithms were successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder was evaluated in terms of informal subjective tests such as the mean opinion score (MOS) and the diagnostic rhyme test (DRT), as well as some perceptually motivated objective distortion measures. Performance analysis on a large speech database indicates considerable improvement in short-time signal matching both in the time and the spectral domains. In addition, subjective quality of the reproduced speech is considerably improved.

AB - This paper addresses the design, implementation and evaluation of efficient low bit-rate speech coding algorithms based on an improved sinusoidal model. A series of algorithms were developed for speech classification and pitch frequency determination, modeling of sinusoidal amplitudes and phases, and frame interpolation. An improved paradigm for sinusoidal phase coding is presented, where short-time sinusoidal phases are modeled using a combination of linear prediction, spectral sampling, linear phase alignment and all-pass phase error correction components. A class-dependent split vector quantization scheme is used to encode the sinusoidal amplitudes. The masking properties of the human auditory system are effectively exploited in the algorithms. The algorithms were successfully integrated into a 2.4 kbps sinusoidal coder. The performance of the 2.4 kbps coder was evaluated in terms of informal subjective tests such as the mean opinion score (MOS) and the diagnostic rhyme test (DRT), as well as some perceptually motivated objective distortion measures. Performance analysis on a large speech database indicates considerable improvement in short-time signal matching both in the time and the spectral domains. In addition, subjective quality of the reproduced speech is considerably improved.

KW - Frame interpolation

KW - Linear prediction

KW - Phase modeling

KW - Sinusoidal model

KW - Speech classification

KW - Speech coding

UR - http://www.scopus.com/inward/record.url?scp=0035400321&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035400321&partnerID=8YFLogxK

U2 - 10.1016/S0167-6393(00)00057-1

DO - 10.1016/S0167-6393(00)00057-1

M3 - Article

AN - SCOPUS:0035400321

SN - 0167-6393

VL - 34

SP - 369

EP - 390

JO - Speech Communication

JF - Speech Communication

IS - 4

ER -
