TY - JOUR
T1 - An Online Reinforcement Learning Approach for User-Optimal Parking Searching Strategy Exploiting Unique Problem Property and Network Topology
AU - Xiao, Jun
AU - Lou, Yingyan
N1 - Funding Information:
This work was supported by the National Science Foundation through the projects "Collaborative Research: Modeling and Analysis of Advanced Parking Management for Congestion Mitigation" (CMMI-1363244) and "EAGER: A Living Lab for Smartphone-Based Parking Management Services" (CMMI-1643175).
Publisher Copyright:
© 2000-2011 IEEE.
PY - 2022/7/1
Y1 - 2022/7/1
AB - This paper investigates the idea of introducing learning algorithms into parking guidance and information systems that employ a central server, in order to provide estimated optimal parking searching strategies to travelers. The parking searching process on a network with uncertain parking availability can naturally be modeled as a Markov Decision Process (MDP). Such an MDP with full information can easily be solved by dynamic programming approaches. However, the probabilities of finding parking are difficult to define and calculate. Learning algorithms are suitable for addressing this issue. We propose an algorithm based on Q-learning, where a unique property of the parking searching MDP and the topology of the underlying transportation network are incorporated and utilized to improve its performance. This modification allows us to reduce the size of the learning problem dramatically, and thus the amount of data required to learn the optimal strategy. Numerical experiments conducted on a toy network with fixed parking probabilities show that the proposed learning algorithm outperforms the original Q-learning algorithm and three greedy heuristics in terms of the quality of the approximated optimal solution as well as the amount of training data required. Our numerical experiments on a real network with time-dependent underlying probabilities show that effective searching strategies can be achieved by the proposed algorithm, even though the learning algorithms treat the parking probabilities as constant during each exploration-exploitation cycle. The results again demonstrate that the proposed modified Q-learning algorithm significantly outperforms the original Q-learning with the same amount of training data. The results also provide insights into how the length and the split of the exploration-exploitation cycle affect the effectiveness of the proposed learning algorithm.
KW - Markov decision process
KW - parking search strategy
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85105846056&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105846056&partnerID=8YFLogxK
U2 - 10.1109/TITS.2021.3076408
DO - 10.1109/TITS.2021.3076408
M3 - Article
AN - SCOPUS:85105846056
SN - 1524-9050
VL - 23
SP - 8157
EP - 8169
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
IS - 7
ER -