A hybrid model for speech synthesis

Andreas Spanias

Electrical Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

A hybrid model for speech analysis/synthesis is proposed. It relies on a time-varying autoregressive moving-average (ARMA) model and the short-time Fourier transform (STFT). The model is hybrid in that the periodic (narrowband) component in speech is represented in the frequency domain by a harmonic-based STFT, while the random component in speech is represented by a random noise sequence, appropriately shaped by the ARMA model. The time-varying ARMA model has a dual function: it creates a spectral envelope that accurately fits the harmonic STFT components, and it provides the spectral shaping of the random noise. This hybrid model essentially incorporates the benefits of waveform coders by employing the STFT and the benefits of traditional vocoders by using an appropriately shaped noise sequence; thus, it is expected to yield robust speech synthesis at low data rates.
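The two-component structure described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the coefficient values, and the frame settings below are all hypothetical, the ARMA coefficients are held fixed within one frame rather than varied over time, and the harmonic amplitudes are supplied directly instead of being fitted to an ARMA spectral envelope. The sketch only shows a periodic part built from harmonics of a fundamental frequency, added to white noise shaped by an ARMA difference equation.

```python
import math
import random


def harmonic_component(f0, amps, phases, fs, n_samples):
    """Sum of sinusoids at integer multiples of the fundamental f0 (Hz)."""
    out = []
    for n in range(n_samples):
        t = n / fs
        out.append(sum(a * math.cos(2.0 * math.pi * (k + 1) * f0 * t + p)
                       for k, (a, p) in enumerate(zip(amps, phases))))
    return out


def arma_filter(b, a, x):
    """ARMA difference equation:
    a[0]*y[n] = sum_i b[i]*x[n-i] - sum_{j>=1} a[j]*y[n-j].
    """
    y = []
    for n in range(len(x)):
        acc = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0)
        y.append(acc / a[0])
    return y


def synthesize_frame(f0, amps, phases, b, a, noise_gain, fs, n_samples, seed=0):
    """One frame = harmonic (periodic) part + ARMA-shaped noise (random part)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    shaped = arma_filter(b, a, noise)
    periodic = harmonic_component(f0, amps, phases, fs, n_samples)
    return [p + noise_gain * s for p, s in zip(periodic, shaped)]


# Hypothetical settings: a 20 ms frame at 8 kHz, three harmonics of 100 Hz,
# and a first-order ARMA shaping filter (assumed values, for illustration only).
frame = synthesize_frame(
    f0=100.0, amps=[1.0, 0.5, 0.25], phases=[0.0, 0.0, 0.0],
    b=[1.0, 0.5], a=[1.0, -0.9], noise_gain=0.1, fs=8000.0, n_samples=160,
)
```

In a time-varying version, `b` and `a` would be re-estimated from frame to frame; how the paper estimates them is not stated in the abstract, so the sketch leaves that step out.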

Original language	English (US)
Title of host publication	Proceedings - IEEE International Symposium on Circuits and Systems
Publisher	Publ by IEEE
Pages	1521-1524
Number of pages	4
Volume	2
State	Published - 1990
Event	1990 IEEE International Symposium on Circuits and Systems Part 3 (of 4) - New Orleans, LA, USA; Duration: May 1 1990 → May 3 1990

Other

Other	1990 IEEE International Symposium on Circuits and Systems Part 3 (of 4)
City	New Orleans, LA, USA
Period	5/1/90 → 5/3/90

ASJC Scopus subject areas

Electrical and Electronic Engineering
Electronic, Optical and Magnetic Materials

Cite this

@inproceedings{55eaf348bd684795997434949afcc52d,
  title = "A hybrid model for speech synthesis",
  abstract = "A hybrid model for speech analysis/synthesis is proposed. It relies on a time-varying autoregressive moving-average (ARMA) model and the short-time Fourier transform (STFT). The model is hybrid in that the periodic (narrowband) component in speech is represented in the frequency domain by a harmonic-based STFT, while the random component in speech is represented by a random noise sequence, appropriately shaped by the ARMA model. The time-varying ARMA model has a dual function: it creates a spectral envelope that accurately fits the harmonic STFT components, and it provides the spectral shaping of the random noise. This hybrid model essentially incorporates the benefits of waveform coders by employing the STFT and the benefits of traditional vocoders by using an appropriately shaped noise sequence; thus, it is expected to yield robust speech synthesis at low data rates.",
  author = "Andreas Spanias",
  year = "1990",
  language = "English (US)",
  volume = "2",
  pages = "1521--1524",
  booktitle = "Proceedings - IEEE International Symposium on Circuits and Systems",
  publisher = "Publ by IEEE",
  note = "1990 IEEE International Symposium on Circuits and Systems Part 3 (of 4); Conference date: 01-05-1990 Through 03-05-1990",
}

TY  - GEN
T1  - A hybrid model for speech synthesis
AU  - Spanias, Andreas
PY  - 1990
Y1  - 1990
N2  - A hybrid model for speech analysis/synthesis is proposed. It relies on a time-varying autoregressive moving-average (ARMA) model and the short-time Fourier transform (STFT). The model is hybrid in that the periodic (narrowband) component in speech is represented in the frequency domain by a harmonic-based STFT, while the random component in speech is represented by a random noise sequence, appropriately shaped by the ARMA model. The time-varying ARMA model has a dual function: it creates a spectral envelope that accurately fits the harmonic STFT components, and it provides the spectral shaping of the random noise. This hybrid model essentially incorporates the benefits of waveform coders by employing the STFT and the benefits of traditional vocoders by using an appropriately shaped noise sequence; thus, it is expected to yield robust speech synthesis at low data rates.
UR  - http://www.scopus.com/inward/record.url?scp=0025592604&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0025592604&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:0025592604
VL  - 2
SP  - 1521
EP  - 1524
BT  - Proceedings - IEEE International Symposium on Circuits and Systems
PB  - Publ by IEEE
T2  - 1990 IEEE International Symposium on Circuits and Systems Part 3 (of 4)
Y2  - 1 May 1990 through 3 May 1990
ER  - 
