A general-purpose framework for parallel processing of large-scale LiDAR data

Zhenlong Li; Michael E. Hodgson; WenWen Li

doi:10.1080/17538947.2016.1269842

Research output: Contribution to journal › Article › peer-review

32 Scopus citations

Abstract

Light detection and ranging (LiDAR) data are essential for scientific discoveries such as Earth and ecological sciences, environmental applications, and responding to natural disasters. While collecting LiDAR data over large areas is quite possible, the subsequent processing steps typically involve large computational demands. Efficiently storing, managing, and processing LiDAR data are the prerequisite steps for enabling these LiDAR-based applications. However, handling LiDAR data poses grand geoprocessing challenges due to data and computational intensity. To tackle such challenges, we developed a general-purpose scalable framework coupled with a sophisticated data decomposition and parallelization strategy to efficiently handle ‘big’ LiDAR data collections. The contributions of this research were (1) a tile-based spatial index to manage big LiDAR data in the scalable and fault-tolerant Hadoop distributed file system, (2) two spatial decomposition techniques to enable efficient parallelization of different types of LiDAR processing tasks, and (3) by coupling existing LiDAR processing tools with Hadoop, a variety of LiDAR data processing tasks can be conducted in parallel in a highly scalable distributed computing environment using an online geoprocessing application. A proof-of-concept prototype is presented here to demonstrate the feasibility, performance, and scalability of the proposed framework.

Original language	English (US)
Pages (from-to)	26-47
Number of pages	22
Journal	International Journal of Digital Earth
Volume	11
Issue number	1
DOIs	https://doi.org/10.1080/17538947.2016.1269842
State	Published - Jan 2 2018

Keywords

Big data
Hadoop MapReduce
LAStools
online geoprocessing
parallel
spatial decomposition

ASJC Scopus subject areas

Software
Computer Science Applications
General Earth and Planetary Sciences

Cite this

@article{119feffc6e3945bf9f117285c1c81ae5,

title = "A general-purpose framework for parallel processing of large-scale LiDAR data",

abstract = "Light detection and ranging (LiDAR) data are essential for scientific discoveries such as Earth and ecological sciences, environmental applications, and responding to natural disasters. While collecting LiDAR data over large areas is quite possible the subsequent processing steps typically involve large computational demands. Efficiently storing, managing, and processing LiDAR data are the prerequisite steps for enabling these LiDAR-based applications. However, handling LiDAR data poses grand geoprocessing challenges due to data and computational intensity. To tackle such challenges, we developed a general-purpose scalable framework coupled with a sophisticated data decomposition and parallelization strategy to efficiently handle {\textquoteleft}big{\textquoteright} LiDAR data collections. The contributions of this research were (1) a tile-based spatial index to manage big LiDAR data in the scalable and fault-tolerable Hadoop distributed file system, (2) two spatial decomposition techniques to enable efficient parallelization of different types of LiDAR processing tasks, and (3) by coupling existing LiDAR processing tools with Hadoop, a variety of LiDAR data processing tasks can be conducted in parallel in a highly scalable distributed computing environment using an online geoprocessing application. A proof-of-concept prototype is presented here to demonstrate the feasibility, performance, and scalability of the proposed framework.",

keywords = "Big data, Hadoop MapReduce, LAStools, online geoprocessing, parallel, spatial decomposition",

author = "Zhenlong Li and Hodgson, {Michael E.} and WenWen Li",

note = "Funding Information: This study was funded by University of South Carolina through the ASPIRE (Advanced Support for Innovative Research Excellence) program [13540-16-41796]. Additional funding was provided by the South Carolina Department of Transportation under contract to the University of South Carolina [SPR \#707 or USC 13540FB11], USGS [G15AC00085], and NSF-BCS [1455349]. We thank the three anonymous reviewers for their insightful comments that greatly improved the manuscript. Publisher Copyright: {\textcopyright} 2017 Informa UK Limited, trading as Taylor \& Francis Group.",

year = "2018",

month = jan,

day = "2",

doi = "10.1080/17538947.2016.1269842",

language = "English (US)",

volume = "11",

pages = "26--47",

journal = "International Journal of Digital Earth",

issn = "1753-8947",

publisher = "Taylor and Francis Ltd.",

number = "1",

}

TY - JOUR

T1 - A general-purpose framework for parallel processing of large-scale LiDAR data

AU - Li, Zhenlong

AU - Hodgson, Michael E.

AU - Li, WenWen

N1 - Funding Information: This study was funded by University of South Carolina through the ASPIRE (Advanced Support for Innovative Research Excellence) program [13540-16-41796]. Additional funding was provided by the South Carolina Department of Transportation under contract to the University of South Carolina [SPR #707 or USC 13540FB11], USGS [G15AC00085], and NSF-BCS [1455349]. We thank the three anonymous reviewers for their insightful comments that greatly improved the manuscript. Publisher Copyright: © 2017 Informa UK Limited, trading as Taylor & Francis Group.

PY - 2018/1/2

Y1 - 2018/1/2

N2 - Light detection and ranging (LiDAR) data are essential for scientific discoveries such as Earth and ecological sciences, environmental applications, and responding to natural disasters. While collecting LiDAR data over large areas is quite possible the subsequent processing steps typically involve large computational demands. Efficiently storing, managing, and processing LiDAR data are the prerequisite steps for enabling these LiDAR-based applications. However, handling LiDAR data poses grand geoprocessing challenges due to data and computational intensity. To tackle such challenges, we developed a general-purpose scalable framework coupled with a sophisticated data decomposition and parallelization strategy to efficiently handle ‘big’ LiDAR data collections. The contributions of this research were (1) a tile-based spatial index to manage big LiDAR data in the scalable and fault-tolerable Hadoop distributed file system, (2) two spatial decomposition techniques to enable efficient parallelization of different types of LiDAR processing tasks, and (3) by coupling existing LiDAR processing tools with Hadoop, a variety of LiDAR data processing tasks can be conducted in parallel in a highly scalable distributed computing environment using an online geoprocessing application. A proof-of-concept prototype is presented here to demonstrate the feasibility, performance, and scalability of the proposed framework.

AB - Light detection and ranging (LiDAR) data are essential for scientific discoveries such as Earth and ecological sciences, environmental applications, and responding to natural disasters. While collecting LiDAR data over large areas is quite possible the subsequent processing steps typically involve large computational demands. Efficiently storing, managing, and processing LiDAR data are the prerequisite steps for enabling these LiDAR-based applications. However, handling LiDAR data poses grand geoprocessing challenges due to data and computational intensity. To tackle such challenges, we developed a general-purpose scalable framework coupled with a sophisticated data decomposition and parallelization strategy to efficiently handle ‘big’ LiDAR data collections. The contributions of this research were (1) a tile-based spatial index to manage big LiDAR data in the scalable and fault-tolerable Hadoop distributed file system, (2) two spatial decomposition techniques to enable efficient parallelization of different types of LiDAR processing tasks, and (3) by coupling existing LiDAR processing tools with Hadoop, a variety of LiDAR data processing tasks can be conducted in parallel in a highly scalable distributed computing environment using an online geoprocessing application. A proof-of-concept prototype is presented here to demonstrate the feasibility, performance, and scalability of the proposed framework.

KW - Big data

KW - Hadoop MapReduce

KW - LAStools

KW - online geoprocessing

KW - parallel

KW - spatial decomposition

UR - http://www.scopus.com/inward/record.url?scp=85008349838&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85008349838&partnerID=8YFLogxK

U2 - 10.1080/17538947.2016.1269842

DO - 10.1080/17538947.2016.1269842

M3 - Article

AN - SCOPUS:85008349838

SN - 1753-8947

VL - 11

SP - 26

EP - 47

JO - International Journal of Digital Earth

JF - International Journal of Digital Earth

IS - 1

ER -
