A scalable feature learning and tag prediction framework for natural environment sounds

P. Sattigeri, J. J. Thiagarajan, M. Shah, K. N. Ramamurthy, Andreas Spanias

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

Building feature extraction approaches that can effectively characterize natural environment sounds is challenging due to their dynamic nature. In this paper, we develop a framework for extracting features and obtaining semantic inferences from such data. In particular, we propose a new pooling strategy for deep architectures that can preserve the temporal dynamics in the resulting representation. We construct an ensemble of semantic embeddings and employ an l1-reconstruction-based prediction algorithm to estimate the relevant tags. We evaluate our approach on challenging environmental sound recognition datasets and show that the proposed features outperform traditional spectral features.

Original language: English (US)
Title of host publication: Conference Record of the 48th Asilomar Conference on Signals, Systems and Computers
Editors: Michael B. Matthews
Publisher: IEEE Computer Society
Pages: 1779-1783
Number of pages: 5
ISBN (Electronic): 9781479982974
DOIs: https://doi.org/10.1109/ACSSC.2014.7094773
State: Published - Apr 24 2015
Event: 48th Asilomar Conference on Signals, Systems and Computers, ACSSC 2015 - Pacific Grove, United States
Duration: Nov 2 2014 to Nov 5 2014

Publication series

Name: Conference Record - Asilomar Conference on Signals, Systems and Computers
Volume: 2015-April
ISSN (Print): 1058-6393

Other

Other: 48th Asilomar Conference on Signals, Systems and Computers, ACSSC 2015
Country: United States
City: Pacific Grove
Period: 11/2/14 to 11/5/14

ASJC Scopus subject areas

  • Signal Processing
  • Computer Networks and Communications

  • Cite this

    Sattigeri, P., Thiagarajan, J. J., Shah, M., Ramamurthy, K. N., & Spanias, A. (2015). A scalable feature learning and tag prediction framework for natural environment sounds. In M. B. Matthews (Ed.), Conference Record of the 48th Asilomar Conference on Signals, Systems and Computers (pp. 1779-1783). [7094773] (Conference Record - Asilomar Conference on Signals, Systems and Computers; Vol. 2015-April). IEEE Computer Society. https://doi.org/10.1109/ACSSC.2014.7094773