Robust multi-feature segmentation and indexing for natural sound environments

Gordon Wichern, Harvey Thornburg, Brandon Mechtley, Alex Fink, Kai Tu, Andreas Spanias

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

11 Scopus citations

Abstract

Creating an audio database from continuous long-term recordings allows sounds to be linked not only by the time and place in which they were recorded, but also to other sounds with similar acoustic characteristics. Of paramount importance in this application is the accurate segmentation of sound events, which enables realistic navigation of these recordings. We first propose a novel feature set of specific relevance to environmental sounds, and then develop a Bayesian framework for sound segmentation that fuses dynamics across multiple features. This probabilistic model can account for non-instantaneous sound onsets and for absent or delayed responses among individual features, providing flexibility in defining exactly what constitutes a sound event. Example recordings demonstrate the diversity of our feature set and the utility of our probabilistic segmentation model in extracting sound events from both indoor and outdoor environments.

Original language: English (US)
Title of host publication: CBMI'2007 - 2007 International Workshop on Content-Based Multimedia Indexing, Proceedings
Pages: 69-76
Number of pages: 8
State: Published - 2007
Event: CBMI'2007 - 2007 International Workshop on Content-Based Multimedia Indexing - Bordeaux, France
Duration: Jun 25 2007 → Jun 27 2007

Publication series

Name: CBMI'2007 - 2007 International Workshop on Content-Based Multimedia Indexing, Proceedings

Other

Other: CBMI'2007 - 2007 International Workshop on Content-Based Multimedia Indexing
Country/Territory: France
City: Bordeaux
Period: 6/25/07 → 6/27/07

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Information Systems
  • Information Systems and Management
