TY - GEN
T1 - Building a large-scale microscopic road network traffic simulator in Apache Spark
AU - Fu, Zishan
AU - Yu, Jia
AU - Sarwat, Mohamed
N1 - Funding Information:
This work is supported in part by the National Science Foundation (NSF) under Grant 1845789, the Salt River Project Agricultural Improvement and Power District (SRP), and the DOD-ARMY Training and Doctrine Command (TRADOC).
Publisher Copyright:
© 2019 IEEE.
PY - 2019/6
Y1 - 2019/6
AB - Road network traffic data has been widely studied by researchers and practitioners in areas such as urban planning, traffic prediction, and spatio-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale, high-quality urban traffic data requires tremendous effort because participating vehicles must install GPS receivers and administrators must continuously monitor these devices. A number of urban traffic simulators attempt to generate such data with different features. However, they suffer from two critical issues: (1) scalability: most of them offer only a single-machine solution, which is inadequate for producing large-scale data, and the simulators that can generate traffic in parallel do not balance the load well among the machines in a cluster; (2) granularity: many simulators do not consider microscopic traffic situations such as traffic lights, lane changing, and car following. In this paper, we propose GeoSparkSim, a scalable traffic simulator that extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze, and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method that partitions vehicles among different machines such that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 200 thousand vehicles over a very large road network (250 thousand road junctions and 300 thousand road segments).
KW - Apache Spark
KW - Microscopic traffic simulation
KW - Spatio-Temporal Data
KW - Traffic model
UR - http://www.scopus.com/inward/record.url?scp=85070992171&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85070992171&partnerID=8YFLogxK
U2 - 10.1109/MDM.2019.00-42
DO - 10.1109/MDM.2019.00-42
M3 - Conference contribution
AN - SCOPUS:85070992171
T3 - Proceedings - IEEE International Conference on Mobile Data Management
SP - 320
EP - 328
BT - Proceedings - 2019 20th International Conference on Mobile Data Management, MDM 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 20th International Conference on Mobile Data Management, MDM 2019
Y2 - 10 June 2019 through 13 June 2019
ER -