Indexing the pickup and drop-off locations of NYC taxi trips in PostgreSQL – Lessons from the road

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

In this paper, we present our experience in indexing the drop-off and pick-up locations of taxi trips in New York City. The paper presents a comprehensive experimental analysis of classic and state-of-the-art spatial database indexing schemes. The paper evaluates a popular spatial tree indexing scheme (i.e., GIST-Spatial), a Block Range Index (BRIN-Spatial) provided by PostgreSQL, as well as a new indexing scheme, namely Hippo-Spatial. In the experiments, the paper considers five evaluation metrics to compare and contrast the performance of the three indexing schemes: storage overhead, index initialization time, query response time, maintenance overhead, and throughput. Furthermore, the benchmark takes into account parameters that affect index performance, which include, but are not limited to, data size, spatial query selectivity, and spatial area density. The paper finally analyzes the experimental evaluation results and highlights the key insights and lessons learned. The results emphasize that there is no one-size-fits-all solution when it comes to indexing massive-scale spatial data. The results also show that modern database systems can maintain a lightweight index (in terms of storage and maintenance overhead) that is also fast enough for spatial data analytics applications. The source code for the experiments presented in the paper is available here: https://github.com/DataSystemsLab/hippo-postgresql.

Original language: English (US)
Title of host publication: Advances in Spatial and Temporal Databases - 15th International Symposium, SSTD 2017, Proceedings
Publisher: Springer Verlag
Pages: 145-162
Number of pages: 18
Volume: 10411 LNCS
ISBN (Print): 9783319643663
DOIs: https://doi.org/10.1007/978-3-319-64367-0_8
State: Published - 2017
Event: 15th International Symposium on Spatial and Temporal Databases, SSTD 2017 - Arlington, United States
Duration: Aug 21 2017 → Aug 23 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10411 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th International Symposium on Spatial and Temporal Databases, SSTD 2017
Country: United States
City: Arlington
Period: 8/21/17 → 8/23/17


ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Yu, J., & Elsayed, M. (2017). Indexing the pickup and drop-off locations of NYC taxi trips in PostgreSQL – Lessons from the road. In Advances in Spatial and Temporal Databases - 15th International Symposium, SSTD 2017, Proceedings (Vol. 10411 LNCS, pp. 145-162). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10411 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-64367-0_8