A MapReduce algorithm to create contiguity weights for spatial analysis of Big data

Xun Li, WenWen Li, Luc Anselin, Sergio Rey, Julia Koschinsky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Spatial analysis of Big data is a key component of CyberGIS. However, how to utilize existing cyberinfrastructure (e.g., large computing clusters) to perform parallel and distributed spatial analysis on Big data remains a major challenge. Problems such as efficient spatial weights creation, spatial statistics, and spatial regression on Big data still need investigation. In this research, we propose a MapReduce algorithm for creating contiguity-based spatial weights. The algorithm creates spatial weights from very large spatial datasets efficiently by using computing resources organized in the Hadoop framework. It works in the MapReduce paradigm: mappers are distributed across a computing cluster to find contiguous neighbors in parallel, and reducers then collect the results and generate the weights matrix. To test the performance of the algorithm, we design an experiment that creates a contiguity-based weights matrix from artificial spatial data with up to 190 million polygons using Amazon's Hadoop framework, Elastic MapReduce. The experiment demonstrates the scalability of this parallel algorithm, which utilizes large computing clusters to solve the problem of creating contiguity weights on Big data.
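The abstract outlines the pipeline only at a high level. A common way to express contiguity detection in MapReduce, and a plausible reading of the approach, is to key polygons by their boundary vertices: a mapper emits one (vertex, polygon-id) pair per vertex, and a reducer marks any two polygons that share a key as queen-contiguous (keying on edges instead would give rook contiguity). The sketch below simulates such a job in plain Python; the function names, the queen-contiguity rule, and the in-memory shuffle are illustrative assumptions, not the paper's actual implementation.

from collections import defaultdict
from itertools import combinations

def map_phase(polygon):
    """Mapper: emit (vertex, polygon_id) for every boundary vertex.
    Keying on vertices yields queen contiguity; keying on edges
    (vertex pairs) would yield rook contiguity instead."""
    pid, vertices = polygon
    for v in vertices:
        yield v, pid

def reduce_phase(vertex, polygon_ids):
    """Reducer: polygons that emitted the same vertex are neighbors."""
    for a, b in combinations(sorted(set(polygon_ids)), 2):
        yield a, b

def contiguity_weights(polygons):
    """Simulate the full MapReduce job in memory (hypothetical driver;
    a real job would run the two phases on a Hadoop/EMR cluster)."""
    # Shuffle: group mapper output by key, as the framework would.
    shuffle = defaultdict(list)
    for polygon in polygons:
        for vertex, pid in map_phase(polygon):
            shuffle[vertex].append(pid)
    # Reduce: build a sparse neighbor structure (the weights matrix).
    weights = defaultdict(set)
    for vertex, pids in shuffle.items():
        for a, b in reduce_phase(vertex, pids):
            weights[a].add(b)
            weights[b].add(a)
    return weights

# A 2x2 grid of unit squares: all four corners meet at (1, 1), so
# every square is queen-contiguous with every other square.
grid = [
    ("A", [(0, 0), (1, 0), (1, 1), (0, 1)]),
    ("B", [(1, 0), (2, 0), (2, 1), (1, 1)]),
    ("C", [(0, 1), (1, 1), (1, 2), (0, 2)]),
    ("D", [(1, 1), (2, 1), (2, 2), (1, 2)]),
]
print(dict(contiguity_weights(grid)))
# {'A': {'B', 'C', 'D'}, 'B': {'A', 'C', 'D'}, ...}

In a real Hadoop or Elastic MapReduce job, the shuffle and group-by-key step above is performed by the framework itself, and the reducer output would be written out as a sparse weights file rather than held in memory.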

Original language: English (US)
Title of host publication: Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2014
Publisher: Association for Computing Machinery, Inc
Pages: 50-53
Number of pages: 4
ISBN (Print): 9781450331326
DOI: https://doi.org/10.1145/2676536.2676543
State: Published - Nov 4 2014
Event: 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2014 - Dallas, United States
Duration: Nov 4 2014 → …

Other

Other: 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2014
Country: United States
City: Dallas
Period: 11/4/14 → …

Keywords

  • Big data
  • MapReduce
  • Spatial weights

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Computer Vision and Pattern Recognition

Cite this

Li, X., Li, W., Anselin, L., Rey, S., & Koschinsky, J. (2014). A MapReduce algorithm to create contiguity weights for spatial analysis of Big data. In Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2014 (pp. 50-53). Association for Computing Machinery, Inc. https://doi.org/10.1145/2676536.2676543

