Building a large-scale microscopic road network traffic simulator in Apache Spark

Zishan Fu, Jia Yu, Mohamed Elsayed

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Road network traffic data has been widely studied by researchers and practitioners in areas such as urban planning, traffic prediction, and spatio-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale, high-quality urban traffic data requires tremendous effort because participating vehicles must install GPS receivers and administrators must continuously monitor these devices. A number of urban traffic simulators have tried to generate such data with different features. However, they suffer from two critical issues: (1) scalability: most of them offer only a single-machine solution, which is not adequate to produce large-scale data, and some simulators can generate traffic in parallel but do not balance the load well among the machines in a cluster; (2) granularity: many simulators do not consider microscopic traffic situations, including traffic lights, lane changing, and car following. In this paper, we propose GeoSparkSim, a scalable traffic simulator that extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze, and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method to partition vehicles among different machines such that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 200 thousand vehicles over a very large road network (250 thousand road junctions and 300 thousand road segments).

Original language: English (US)
Title of host publication: Proceedings - 2019 20th International Conference on Mobile Data Management, MDM 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 320-328
Number of pages: 9
ISBN (Electronic): 9781728133638
DOIs: https://doi.org/10.1109/MDM.2019.00-42
State: Published - Jun 1 2019
Event: 20th International Conference on Mobile Data Management, MDM 2019 - Hong Kong, Hong Kong
Duration: Jun 10 2019 to Jun 13 2019

Publication series

Name: Proceedings - IEEE International Conference on Mobile Data Management
Volume: 2019-June
ISSN (Print): 1551-6245

Conference

Conference: 20th International Conference on Mobile Data Management, MDM 2019
Country: Hong Kong
City: Hong Kong
Period: 6/10/19 to 6/13/19

Keywords

  • Apache Spark
  • Microscopic traffic simulation
  • Spatio-Temporal Data
  • Traffic model

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Fu, Z., Yu, J., & Elsayed, M. (2019). Building a large-scale microscopic road network traffic simulator in Apache Spark. In Proceedings - 2019 20th International Conference on Mobile Data Management, MDM 2019 (pp. 320-328). [8788796] (Proceedings - IEEE International Conference on Mobile Data Management; Vol. 2019-June). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/MDM.2019.00-42

@inproceedings{5ae3f1ea3dd5432fb422ea340a799c56,
title = "Building a large-scale microscopic road network traffic simulator in {Apache Spark}",
keywords = "Apache Spark, Microscopic traffic simulation, Spatio-Temporal Data, Traffic model",
author = "Zishan Fu and Jia Yu and Mohamed Elsayed",
year = "2019",
month = "6",
day = "1",
doi = "10.1109/MDM.2019.00-42",
language = "English (US)",
series = "Proceedings - IEEE International Conference on Mobile Data Management",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "320--328",
booktitle = "Proceedings - 2019 20th International Conference on Mobile Data Management, MDM 2019",

}

TY - GEN

T1 - Building a large-scale microscopic road network traffic simulator in Apache Spark

AU - Fu, Zishan

AU - Yu, Jia

AU - Elsayed, Mohamed

PY - 2019/6/1

Y1 - 2019/6/1

KW - Apache Spark

KW - Microscopic traffic simulation

KW - Spatio-Temporal Data

KW - Traffic model

UR - http://www.scopus.com/inward/record.url?scp=85070992171&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070992171&partnerID=8YFLogxK

U2 - 10.1109/MDM.2019.00-42

DO - 10.1109/MDM.2019.00-42

M3 - Conference contribution

T3 - Proceedings - IEEE International Conference on Mobile Data Management

SP - 320

EP - 328

BT - Proceedings - 2019 20th International Conference on Mobile Data Management, MDM 2019

PB - Institute of Electrical and Electronics Engineers Inc.

ER -