Towards geospatial semantic search

Exploiting latent semantic relations in geospatial data

Research output: Contribution to journal (Article)

28 Citations (Scopus)

Abstract

This paper reports our efforts to address the grand challenge of the Digital Earth vision: intelligent data discovery from vast quantities of geo-referenced data. We propose LSATTR, an algorithm that combines latent semantic analysis (LSA) with a Two-Tier Ranking algorithm based on a revised cosine similarity, to build a more efficient search engine, Semantic Indexing and Ranking (SIR), for semantic-enabled, more effective data discovery. Beyond subject-based search, we propose a mechanism that combines a geospatial taxonomy with Yahoo! GeoPlanet to automatically identify location information in a spatial query and filter out datasets that are not spatially related. The corpus of SIR is the ISO 19115 metadata set from NASA's Socioeconomic Data and Applications Center (SEDAC). Results show that SIR, built on LSATTR, outperforms existing keyword-matching techniques such as Lucene in both recall and precision. Moreover, the method discovers semantic associations among all words in the corpus, which provide substantial support for automating the population of spatial ontologies. We expect this work to support the operationalization of the Digital Earth vision by advancing semantic-based geospatial data discovery.

Original language: English (US)
Pages (from-to): 17-37
Number of pages: 21
Journal: International Journal of Digital Earth
Volume: 7
Issue number: 1
DOI: 10.1080/17538947.2012.674561
State: Published - 2014

Keywords

  • Digital Earth
  • geospatial semantics
  • ontology
  • search effectiveness
  • search engine
  • similarity

ASJC Scopus subject areas

  • Earth and Planetary Sciences (all)
  • Computer Science Applications
  • Software

Cite this

Towards geospatial semantic search: Exploiting latent semantic relations in geospatial data. / Li, WenWen; Goodchild, Michael; Raskin, Robert.

In: International Journal of Digital Earth, Vol. 7, No. 1, 2014, p. 17-37.

@article{8272a27177e24dd9bdde818df44a14f7,
title = "Towards geospatial semantic search: Exploiting latent semantic relations in geospatial data",
abstract = "This paper reports our efforts to address the grand challenge of the Digital Earth vision in terms of intelligent data discovery from vast quantities of geo-referenced data. We propose an algorithm combining LSA and a Two-Tier Ranking (LSATTR) algorithm based on revised cosine similarity to build a more efficient search engine - Semantic Indexing and Ranking (SIR) - for a semantic-enabled, more effective data discovery. In addition to its ability to handle subject-based search, we propose a mechanism to combine geospatial taxonomy and Yahoo! GeoPlanet for automatic identification of location information from a spatial query and automatic filtering of datasets that are not spatially related. The metadata set, in the format of ISO19115, from NASA's SEDAC (Socio-Economic Data Application Center) is used as the corpus of SIR. Results show that our semantic search engine SIR built on LSATTR methods outperforms existing keyword-matching techniques, such as Lucene, in terms of both recall and precision. Moreover, the semantic associations among all existing words in the corpus are discovered. These associations provide substantial support for automating the population of spatial ontologies. We expect this work to support the operationalization of the Digital Earth vision by advancing the semantic-based geospatial data discovery.",
keywords = "Digital Earth, geospatial semantics, ontology, search effectiveness, search engine, similarity",
author = "WenWen Li and Michael Goodchild and Robert Raskin",
year = "2014",
doi = "10.1080/17538947.2012.674561",
language = "English (US)",
volume = "7",
pages = "17--37",
journal = "International Journal of Digital Earth",
issn = "1753-8947",
publisher = "Taylor and Francis Ltd.",
number = "1",
}

TY - JOUR

T1 - Towards geospatial semantic search

T2 - Exploiting latent semantic relations in geospatial data

AU - Li, WenWen

AU - Goodchild, Michael

AU - Raskin, Robert

PY - 2014

Y1 - 2014

AB - This paper reports our efforts to address the grand challenge of the Digital Earth vision in terms of intelligent data discovery from vast quantities of geo-referenced data. We propose an algorithm combining LSA and a Two-Tier Ranking (LSATTR) algorithm based on revised cosine similarity to build a more efficient search engine - Semantic Indexing and Ranking (SIR) - for a semantic-enabled, more effective data discovery. In addition to its ability to handle subject-based search, we propose a mechanism to combine geospatial taxonomy and Yahoo! GeoPlanet for automatic identification of location information from a spatial query and automatic filtering of datasets that are not spatially related. The metadata set, in the format of ISO19115, from NASA's SEDAC (Socio-Economic Data Application Center) is used as the corpus of SIR. Results show that our semantic search engine SIR built on LSATTR methods outperforms existing keyword-matching techniques, such as Lucene, in terms of both recall and precision. Moreover, the semantic associations among all existing words in the corpus are discovered. These associations provide substantial support for automating the population of spatial ontologies. We expect this work to support the operationalization of the Digital Earth vision by advancing the semantic-based geospatial data discovery.

KW - Digital Earth

KW - geospatial semantics

KW - ontology

KW - search effectiveness

KW - search engine

KW - similarity

UR - http://www.scopus.com/inward/record.url?scp=84892760062&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84892760062&partnerID=8YFLogxK

U2 - 10.1080/17538947.2012.674561

DO - 10.1080/17538947.2012.674561

M3 - Article

VL - 7

SP - 17

EP - 37

JO - International Journal of Digital Earth

JF - International Journal of Digital Earth

SN - 1753-8947

IS - 1

ER -