13th Asia Pacific Transportation Development Conference
Deep Recurrent Q-Learning Method for Single Intersection Signal Control
Publication: Resilience and Sustainable Transportation Systems
ABSTRACT
In recent years, reinforcement learning has been applied to traffic control as an emerging technique and has attracted increasing research attention. In this paper, a deep recurrent Q-learning agent was implemented for traffic signal control in order to improve the efficiency of highway transportation while maintaining a significant degree of realism. The reinforcement learning agent was designed with a state representation that identifies the positions of vehicles in the environment, an action set defined by traffic light configurations of fixed duration, and a reward function that captures, at different magnitudes, the change in vehicle waiting times between actions. In particular, the elements of the agent were designed to be meaningful for possible real-world devices. The learning approach applied for the agent's training is the deep Q-network combined with a recurrent neural network: Q-learning updates the action values as the agent's experience grows, and the neural network predicts the Q-values, thereby approximating the state-action value function. SUMO was used to replicate a four-way intersection with multiple lanes and to reproduce various traffic scenarios with different traffic distributions. The reward was calculated from the simulated waiting time of vehicles, making the agent aware of the consequences of its actions in different situations. Results indicate that the proposed agent adapts to several traffic situations and outperforms a static traffic light system under low, medium, and high traffic densities, improving overall efficiency by more than 50%.
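The abstract describes two core mechanics: a Q-network augmented with a recurrent layer approximates the state-action value function over sequences of observations, and the reward is the change in cumulative vehicle waiting time between consecutive actions. The sketch below illustrates these two elements, assuming PyTorch; the layer sizes, the choice of an LSTM cell, and all names (DRQN, waiting_time_reward, state_dim=80, n_actions=4) are illustrative assumptions, not the authors' published architecture.

```python
# Minimal sketch of a recurrent Q-network for signal control (assumed design,
# not the paper's exact architecture).
import torch
import torch.nn as nn

class DRQN(nn.Module):
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        # Encode the vehicle-position state vector for one time step.
        self.encoder = nn.Linear(state_dim, hidden)
        # Recurrent layer carries memory across successive observations,
        # which is the "deep recurrent" part of deep recurrent Q-learning.
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        # One Q-value per traffic light configuration (action).
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, states, hidden_state=None):
        # states: (batch, seq_len, state_dim) sequence of observations.
        x = torch.relu(self.encoder(states))
        x, hidden_state = self.lstm(x, hidden_state)
        return self.head(x), hidden_state  # Q-values: (batch, seq_len, n_actions)

def waiting_time_reward(prev_total_wait: float, curr_total_wait: float) -> float:
    # Reward as the difference in cumulative waiting time between actions:
    # positive when the chosen phase reduced total waiting.
    return prev_total_wait - curr_total_wait

# Example: Q-values for a sequence of 5 observations of an 80-cell state.
net = DRQN(state_dim=80, n_actions=4)
q_values, hidden = net(torch.zeros(1, 5, 80))
```

Defining the reward as a waiting-time difference, rather than an absolute waiting time, gives the agent a per-action signal that is directly comparable across traffic densities, which is consistent with the abstract's claim that the agent learns the consequences of actions in different situations.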
ACKNOWLEDGEMENT
This research is sponsored by the National Key Research and Development Program of China (Grant 2018YFB1601101) and the National Natural Science Foundation of China (Grant 71971116).
Published In
Resilience and Sustainable Transportation Systems
Pages: 148–156
Editors: Fengxiang Qiao, Ph.D., Texas Southern University, Yong Bai, Ph.D., Marquette University, Pei-Sung Lin, Ph.D., University of South Florida, Steven I-Jy Chien, Ph.D., New Jersey Institute of Technology, Yongping Zhang, Ph.D., California State Polytechnic University, and Lin Zhu, Ph.D., Shanghai University of Engineering Science
ISBN (Online): 978-0-7844-8290-2
Copyright
© 2020 American Society of Civil Engineers.
History
Published online: Jun 29, 2020
Published in print: Jun 29, 2020