An evaluation of attention models for use in SLAM

Samuel Dodge, Lina Karam

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

In this paper we study the application of visual saliency models to the simultaneous localization and mapping (SLAM) problem. We consider visual SLAM, where the location of the camera and a map of the environment can be generated using images from a single moving camera. In visual SLAM, the interest point detector is of key importance. This detector must be invariant to certain image transformations so that features can be matched across different frames. Recent work has used a model of human visual attention to detect interest points; however, it is unclear which attention model is best for this purpose. To this end, we compare the performance of interest points from four saliency models (Itti, GBVS, RARE, and AWS) with the performance of four traditional interest point detectors (Harris, Shi-Tomasi, SIFT, and FAST). We evaluate these detectors under several different types of image transformation and find that the Itti saliency model, in general, achieves the best performance in terms of keypoint repeatability.
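The abstract does not spell out how keypoint repeatability is computed, so the following is only a minimal sketch under a commonly used definition: the fraction of keypoints from a reference image that, once mapped through the known image transformation, land within a small pixel tolerance of some keypoint detected in the transformed image. The function names, the homography matrix H, and the tolerance eps are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def project(points, H):
    """Map N x 2 pixel coordinates through a 3 x 3 homography H."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    # Divide by the third coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]


def repeatability(kp_ref, kp_warped, H, eps=2.0):
    """Fraction of reference keypoints re-detected after the transformation.

    kp_ref    : N x 2 keypoints detected in the reference image
    kp_warped : M x 2 keypoints detected in the transformed image
    H         : 3 x 3 homography taking reference pixels to transformed pixels
    eps       : pixel tolerance (hypothetical default, not from the paper)
    """
    kp_ref = np.asarray(kp_ref, dtype=float)
    kp_warped = np.asarray(kp_warped, dtype=float)
    if len(kp_ref) == 0 or len(kp_warped) == 0:
        return 0.0
    projected = project(kp_ref, H)
    # Pairwise Euclidean distances, shape N x M.
    dists = np.linalg.norm(projected[:, None, :] - kp_warped[None, :, :], axis=2)
    # A reference keypoint counts as repeated if any detection is close enough.
    repeated = dists.min(axis=1) <= eps
    return float(repeated.mean())


if __name__ == "__main__":
    # Toy transformation: shift the image 5 pixels to the right.
    H = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    ref = np.array([[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]])
    det = np.array([[15.0, 10.0], [25.0, 20.0], [100.0, 100.0]])
    # Two of the three reference points are recovered in this toy case.
    print(repeatability(ref, det, H))
```

Under this definition, a detector such as Harris or a saliency-based detector would be run on both images, and the score compared across detectors for each transformation type. Published evaluations often also restrict the count to keypoints visible in both images, which this sketch omits.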

Original language: English (US)
Title of host publication: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 9025
DOIs: https://doi.org/10.1117/12.2043042
State: Published - 2014
Event: Intelligent Robots and Computer Vision XXXI: Algorithms and Techniques - San Francisco, CA, United States
Duration: Feb 4 2014 to Feb 6 2014

Fingerprint

Simultaneous Localization and Mapping
Saliency
Detector
Image Transformation
Evaluation
Camera
Visual Attention
Model
Scale Invariant Feature Transform
Repeatability
Invariant
Vision

Keywords

  • Interest Points
  • Saliency
  • SLAM
  • Visual Attention

ASJC Scopus subject areas

  • Applied Mathematics
  • Computer Science Applications
  • Electrical and Electronic Engineering
  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics

Cite this

Dodge, S., & Karam, L. (2014). An evaluation of attention models for use in SLAM. In Proceedings of SPIE - The International Society for Optical Engineering (Vol. 9025). [90250M] https://doi.org/10.1117/12.2043042

@inproceedings{8c3ae742779e427da7468542b725bcd3,
title = "An evaluation of attention models for use in SLAM",
abstract = "In this paper we study the application of visual saliency models to the simultaneous localization and mapping (SLAM) problem. We consider visual SLAM, where the location of the camera and a map of the environment can be generated using images from a single moving camera. In visual SLAM, the interest point detector is of key importance. This detector must be invariant to certain image transformations so that features can be matched across different frames. Recent work has used a model of human visual attention to detect interest points; however, it is unclear which attention model is best for this purpose. To this end, we compare the performance of interest points from four saliency models (Itti, GBVS, RARE, and AWS) with the performance of four traditional interest point detectors (Harris, Shi-Tomasi, SIFT, and FAST). We evaluate these detectors under several different types of image transformation and find that the Itti saliency model, in general, achieves the best performance in terms of keypoint repeatability.",
keywords = "Interest Points, Saliency, SLAM, Visual Attention",
author = "Samuel Dodge and Lina Karam",
year = "2014",
doi = "10.1117/12.2043042",
language = "English (US)",
isbn = "9780819499424",
volume = "9025",
booktitle = "Proceedings of SPIE - The International Society for Optical Engineering",

}
