TY - GEN
T1 - GeoSparkViz
T2 - 30th International Conference on Scientific and Statistical Database Management, SSDBM 2018
AU - Yu, Jia
AU - Zhang, Zongsi
AU - Elsayed, Mohamed
N1 - Funding Information: This work is supported in part by the National Science Foundation (NSF) under Grant 1654861, the Salt River Project Agricultural Improvement and Power District (SRP), and the DOD-ARMY Training and Doctrine Command (TRADOC).
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/7/9
Y1 - 2018/7/9
N2 - Data visualization allows users to summarize, analyze and reason about data. A map visualization tool first loads the designated geospatial data, processes the data and then applies the map visualization effect. Guaranteeing detailed and accurate geospatial map visualization (e.g., at multiple zoom levels) requires extremely high-resolution maps. Classic solutions suffer from limited computation resources and hence take a tremendous amount of time to generate maps for large-scale geospatial data. The paper presents GeoSparkViz, a large-scale geospatial map visualization framework. GeoSparkViz extends a cluster computing system (Apache Spark in our case) to provide native support for general cartographic design. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark. It provides the data scientist with a holistic system that allows her to perform data management and visualization on spatial data, and it reduces the overhead of loading the intermediate spatial data generated during the data management phase into the designated map visualization tool. GeoSparkViz also proposes a map tile data partitioning method that achieves load balancing for the map visualization workloads among all nodes in the cluster. Extensive experiments show that GeoSparkViz can generate a high-resolution (i.e., gigapixel image) heatmap of 1.7 billion OpenStreetMap objects and 1.3 billion NYC taxi trips in ≈4 and 5 minutes, respectively, on a four-node commodity cluster.
AB - Data visualization allows users to summarize, analyze and reason about data. A map visualization tool first loads the designated geospatial data, processes the data and then applies the map visualization effect. Guaranteeing detailed and accurate geospatial map visualization (e.g., at multiple zoom levels) requires extremely high-resolution maps. Classic solutions suffer from limited computation resources and hence take a tremendous amount of time to generate maps for large-scale geospatial data. The paper presents GeoSparkViz, a large-scale geospatial map visualization framework. GeoSparkViz extends a cluster computing system (Apache Spark in our case) to provide native support for general cartographic design. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark. It provides the data scientist with a holistic system that allows her to perform data management and visualization on spatial data, and it reduces the overhead of loading the intermediate spatial data generated during the data management phase into the designated map visualization tool. GeoSparkViz also proposes a map tile data partitioning method that achieves load balancing for the map visualization workloads among all nodes in the cluster. Extensive experiments show that GeoSparkViz can generate a high-resolution (i.e., gigapixel image) heatmap of 1.7 billion OpenStreetMap objects and 1.3 billion NYC taxi trips in ≈4 and 5 minutes, respectively, on a four-node commodity cluster.
KW - Big spatial data
KW - Distributed computation
KW - Spatial visualization
UR - http://www.scopus.com/inward/record.url?scp=85054936441&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054936441&partnerID=8YFLogxK
U2 - 10.1145/3221269.3223040
DO - 10.1145/3221269.3223040
M3 - Conference contribution
AN - SCOPUS:85054936441
T3 - ACM International Conference Proceeding Series
BT - Scientific and Statistical Database Management - 30th International Conference, SSDBM 2018, Proceedings
A2 - Bohlen, Michael
A2 - Gamper, Johann
A2 - Kroger, Peer
A2 - Sacharidis, Dimitris
PB - Association for Computing Machinery
Y2 - 9 July 2018 through 11 July 2018
ER -