An ontological framework for retrieving environmental sounds using semantics and acoustic content

Gordon Wichern, Brandon Mechtley, Alex Fink, Harvey Thornburg, Andreas Spanias

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Organizing a database of user-contributed environmental sound recordings allows sound files to be linked not only by the semantic tags and labels applied to them, but also to other sounds with similar acoustic characteristics. Of paramount importance in navigating these databases are the problems of retrieving similar sounds using text- or sound-based queries, and automatically annotating unlabeled sounds. We propose an integrated system, which can be used for text-based retrieval of unlabeled audio, content-based query-by-example, and automatic annotation of unlabeled sound files. To this end, we introduce an ontological framework where sounds are connected to each other based on the similarity between acoustic features specifically adapted to environmental sounds, while semantic tags and sounds are connected through link weights that are optimized based on user-provided tags. Furthermore, tags are linked to each other through a measure of semantic similarity, which allows for efficient incorporation of out-of-vocabulary tags, that is, tags that do not yet exist in the database. Results on two freely available databases of environmental sounds contributed and labeled by nonexpert users demonstrate effective recall, precision, and average precision scores for both the text-based retrieval and annotation tasks.
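
The linked structure described in the abstract can be pictured as a weighted graph whose nodes are sounds and tags. The following is a minimal sketch, not the authors' implementation: the abstract does not say how links are traversed at query time, so the sketch assumes that items are ranked by shortest-path cost and that a smaller weight means a closer relation. All node names, weights, and function names are hypothetical.

```python
import heapq
from collections import defaultdict


def add_link(graph, a, b, weight):
    """Add an undirected link; a smaller weight means more closely related."""
    graph[a][b] = weight
    graph[b][a] = weight


def shortest_distances(graph, source):
    """Dijkstra's algorithm: cheapest path cost from source to each reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


def rank(graph, query, kind):
    """Rank nodes of the given kind ("sound" or "tag") by path cost from query."""
    dist = shortest_distances(graph, query)
    hits = [(d, n) for n, d in dist.items() if n[0] == kind and n != query]
    return [n for _, n in sorted(hits)]


# Hypothetical toy data: nodes are (kind, name) tuples, weights are made up.
graph = defaultdict(dict)
# Sound-to-sound links (assumed to come from acoustic feature similarity).
add_link(graph, ("sound", "rain.wav"), ("sound", "shower.wav"), 0.2)
add_link(graph, ("sound", "rain.wav"), ("sound", "traffic.wav"), 1.5)
add_link(graph, ("sound", "shower.wav"), ("sound", "traffic.wav"), 1.4)
# Tag-to-sound links (assumed to come from user-provided tags).
add_link(graph, ("tag", "rain"), ("sound", "rain.wav"), 0.1)
add_link(graph, ("tag", "car"), ("sound", "traffic.wav"), 0.1)
# Tag-to-tag link (assumed semantic similarity); "drizzle" labels no sound.
add_link(graph, ("tag", "rain"), ("tag", "drizzle"), 0.3)

# Text-based retrieval, including a tag that is attached to no sound.
print(rank(graph, ("tag", "drizzle"), "sound"))
# Automatic annotation of a sound that carries no tag of its own.
print(rank(graph, ("sound", "shower.wav"), "tag"))
# Content-based query-by-example.
print(rank(graph, ("sound", "traffic.wav"), "sound"))
```

In this toy graph, one traversal routine serves all three tasks named in the abstract: only the type of the query node and the type of the returned nodes change. The tag "drizzle" reaches sounds solely through its link to "rain", which illustrates how a tag-to-tag link could let an out-of-vocabulary tag be used as a query.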

Original language: English (US)
Article number: 192363
Journal: Eurasip Journal on Audio, Speech, and Music Processing
Volume: 2010
DOI: 10.1155/2010/192363
State: Published - 2010

Fingerprint

semantics
Acoustics
Acoustic waves
annotations
files
Sound recording
retrieval
Labels
organizing
recording

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

An ontological framework for retrieving environmental sounds using semantics and acoustic content. / Wichern, Gordon; Mechtley, Brandon; Fink, Alex; Thornburg, Harvey; Spanias, Andreas.

In: Eurasip Journal on Audio, Speech, and Music Processing, Vol. 2010, 192363, 2010.

@article{4d24f9c475904f2890072fe9fead51c8,
title = "An ontological framework for retrieving environmental sounds using semantics and acoustic content",
abstract = "Organizing a database of user-contributed environmental sound recordings allows sound files to be linked not only by the semantic tags and labels applied to them, but also to other sounds with similar acoustic characteristics. Of paramount importance in navigating these databases are the problems of retrieving similar sounds using text- or sound-based queries, and automatically annotating unlabeled sounds. We propose an integrated system, which can be used for text-based retrieval of unlabeled audio, content-based query-by-example, and automatic annotation of unlabeled sound files. To this end, we introduce an ontological framework where sounds are connected to each other based on the similarity between acoustic features specifically adapted to environmental sounds, while semantic tags and sounds are connected through link weights that are optimized based on user-provided tags. Furthermore, tags are linked to each other through a measure of semantic similarity, which allows for efficient incorporation of out-of-vocabulary tags, that is, tags that do not yet exist in the database. Results on two freely available databases of environmental sounds contributed and labeled by nonexpert users demonstrate effective recall, precision, and average precision scores for both the text-based retrieval and annotation tasks.",
author = "Gordon Wichern and Brandon Mechtley and Alex Fink and Harvey Thornburg and Andreas Spanias",
year = "2010",
doi = "10.1155/2010/192363",
language = "English (US)",
volume = "2010",
journal = "Eurasip Journal on Audio, Speech, and Music Processing",
issn = "1687-4714",
publisher = "Springer Publishing Company",
}

TY - JOUR
T1 - An ontological framework for retrieving environmental sounds using semantics and acoustic content
AU - Wichern, Gordon
AU - Mechtley, Brandon
AU - Fink, Alex
AU - Thornburg, Harvey
AU - Spanias, Andreas
PY - 2010
Y1 - 2010
AB - Organizing a database of user-contributed environmental sound recordings allows sound files to be linked not only by the semantic tags and labels applied to them, but also to other sounds with similar acoustic characteristics. Of paramount importance in navigating these databases are the problems of retrieving similar sounds using text- or sound-based queries, and automatically annotating unlabeled sounds. We propose an integrated system, which can be used for text-based retrieval of unlabeled audio, content-based query-by-example, and automatic annotation of unlabeled sound files. To this end, we introduce an ontological framework where sounds are connected to each other based on the similarity between acoustic features specifically adapted to environmental sounds, while semantic tags and sounds are connected through link weights that are optimized based on user-provided tags. Furthermore, tags are linked to each other through a measure of semantic similarity, which allows for efficient incorporation of out-of-vocabulary tags, that is, tags that do not yet exist in the database. Results on two freely available databases of environmental sounds contributed and labeled by nonexpert users demonstrate effective recall, precision, and average precision scores for both the text-based retrieval and annotation tasks.

UR - http://www.scopus.com/inward/record.url?scp=79251537747&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79251537747&partnerID=8YFLogxK
U2 - 10.1155/2010/192363
DO - 10.1155/2010/192363
M3 - Article
AN - SCOPUS:79251537747
VL - 2010
JO - Eurasip Journal on Audio, Speech, and Music Processing
JF - Eurasip Journal on Audio, Speech, and Music Processing
SN - 1687-4714
M1 - 192363
ER -