Perceptual segmentation and component selection for sinusoidal representations of audio

Ted Painter, Andreas Spanias

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

This paper presents two fundamental enhancements in a hybrid audio signal model consisting of sinusoidal, transient, and noise (STN) components. The first enhancement involves a novel application of a perceptual metric for optimal time segmentation for the analysis of transients. In particular, Moore and Glasberg's model of partial loudness is modified for use with general signals and then integrated into a novel time segmentation scheme. The second, and perhaps more significant, STN enhancement is concerned with a new methodology for ranking and selection of the most perceptually relevant sinusoids. A systematic procedure is developed for the selection of a compact set of sinusoids and comparative results are given to demonstrate the merit of this method.

Original language: English (US)
Pages (from-to): 149-161
Number of pages: 13
Journal: IEEE Transactions on Speech and Audio Processing
Volume: 13
Issue number: 2
DOI: 10.1109/TSA.2004.841050
State: Published - Mar 2005

Fingerprint

sine waves
augmentation
audio signals
loudness
ranking
methodology

Keywords

  • Audio coding
  • Psychoacoustics
  • Segmentation
  • Sinusoidal models

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

Perceptual segmentation and component selection for sinusoidal representations of audio. / Painter, Ted; Spanias, Andreas.

In: IEEE Transactions on Speech and Audio Processing, Vol. 13, No. 2, 03.2005, p. 149-161.

@article{45ff41038bd249f4a48142eab92f460b,
title = "Perceptual segmentation and component selection for sinusoidal representations of audio",
abstract = "This paper presents two fundamental enhancements in a hybrid audio signal model consisting of sinusoidal, transient, and noise (STN) components. The first enhancement involves a novel application of a perceptual metric for optimal time segmentation for the analysis of transients. In particular, Moore and Glasberg's model of partial loudness is modified for use with general signals and then integrated into a novel time segmentation scheme. The second, and perhaps more significant STN enhancement is concerned with a new methodology for ranking and selection of the most perceptually relevant sinusoids. A systematic procedure is developed for the selection of a compact set of sinusoids and comparative results are given to demonstrate the merit of this method.",
keywords = "Audio coding, Psychoacoustics, Segmentation, Sinusoidal models",
author = "Ted Painter and Andreas Spanias",
year = "2005",
month = mar,
doi = "10.1109/TSA.2004.841050",
language = "English (US)",
volume = "13",
pages = "149--161",
journal = "IEEE Transactions on Speech and Audio Processing",
issn = "1558-7916",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "2",

}

TY - JOUR

T1 - Perceptual segmentation and component selection for sinusoidal representations of audio

AU - Painter, Ted

AU - Spanias, Andreas

PY - 2005/3

Y1 - 2005/3

AB - This paper presents two fundamental enhancements in a hybrid audio signal model consisting of sinusoidal, transient, and noise (STN) components. The first enhancement involves a novel application of a perceptual metric for optimal time segmentation for the analysis of transients. In particular, Moore and Glasberg's model of partial loudness is modified for use with general signals and then integrated into a novel time segmentation scheme. The second, and perhaps more significant STN enhancement is concerned with a new methodology for ranking and selection of the most perceptually relevant sinusoids. A systematic procedure is developed for the selection of a compact set of sinusoids and comparative results are given to demonstrate the merit of this method.

KW - Audio coding

KW - Psychoacoustics

KW - Segmentation

KW - Sinusoidal models

UR - http://www.scopus.com/inward/record.url?scp=14644436350&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=14644436350&partnerID=8YFLogxK

U2 - 10.1109/TSA.2004.841050

DO - 10.1109/TSA.2004.841050

M3 - Article

VL - 13

SP - 149

EP - 161

JO - IEEE Transactions on Speech and Audio Processing

JF - IEEE Transactions on Speech and Audio Processing

SN - 1558-7916

IS - 2

ER -