Technical Papers
Jan 3, 2024

An Attention Reinforcement Learning–Based Strategy for Large-Scale Adaptive Traffic Signal Control System

Publication: Journal of Transportation Engineering, Part A: Systems
Volume 150, Issue 3

Abstract

This paper proposes a reinforcement learning (RL)-based traffic control strategy integrated with an attention mechanism for large-scale adaptive traffic signal control (ATSC) systems. The proposed attention RL integrates an attention mechanism into a multiagent RL model, namely multiagent proximal policy optimization (MAPPO), to enable more effective, scalable, and stable learning in complex ATSC environments. In the attention RL, decentralized policies are trained with a centrally computed critic that shares an attention model; the attention model selects the intersections relevant to each agent when estimating the global critic. This framework reduces computational complexity and stabilizes training, enhancing the ability of RL agents to control large-scale traffic networks. The proposed control strategy is tested in both a large synthetic traffic grid and a large real-world traffic network of Yangzhou city using the microscopic traffic simulation tool SUMO. Experimental results demonstrate that the proposed approach learns stable and sustainable policies that achieve lower congestion levels and faster recovery, outperforming other state-of-the-art RL-based approaches as well as a gap-based actuated controller.
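
As a concrete illustration of the framework described above, the following minimal Python/PyTorch sketch (not the authors' code) shows how a critic shared by all agents can encode each intersection's observation and use scaled dot-product attention, so that each agent's value estimate is driven mainly by the intersections most relevant to it. The class name, layer sizes, and the use of a single attention head are illustrative assumptions rather than details from the paper.

import torch
import torch.nn as nn

class AttentionCritic(nn.Module):
    """Hypothetical centralized critic shared by all agents via an attention model."""

    def __init__(self, obs_dim: int, hidden_dim: int = 64):
        super().__init__()
        # Per-intersection observation encoder (dimensions are assumptions)
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.query = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.key = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.value = nn.Linear(hidden_dim, hidden_dim, bias=False)
        # Maps an agent's own encoding plus attended neighbour context to a scalar value
        self.head = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim),
                                  nn.ReLU(),
                                  nn.Linear(hidden_dim, 1))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_agents, obs_dim) -> values: (batch, n_agents)
        h = torch.relu(self.encoder(obs))
        q, k, v = self.query(h), self.key(h), self.value(h)
        # Attention weights indicate which intersections are relevant to each agent
        scores = torch.matmul(q, k.transpose(-2, -1)) / (h.shape[-1] ** 0.5)
        attn = torch.softmax(scores, dim=-1)
        context = torch.matmul(attn, v)
        return self.head(torch.cat([h, context], dim=-1)).squeeze(-1)

if __name__ == "__main__":
    critic = AttentionCritic(obs_dim=12)
    values = critic(torch.randn(4, 25, 12))  # e.g., a batch of 5x5 grids of intersections
    print(values.shape)                      # torch.Size([4, 25])

Consistent with the abstract, such an attention-weighted critic would be used only during centralized training; at execution time each intersection would act on its own decentralized MAPPO policy.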

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors thank Prof. Ljubo Vlacic of the School of Engineering, Griffith University, for reviewing and editing the first draft of this manuscript. This work was supported in part by the National Key R&D Program of China (2022ZD0115600), in part by the National Natural Science Foundation of China (52302405), and in part by the Natural Science Foundation of Jiangsu Province (BK20210249).

Information & Authors

Published In

Journal of Transportation Engineering, Part A: Systems
Volume 150, Issue 3, March 2024

History

Received: Sep 1, 2023
Accepted: Nov 9, 2023
Published online: Jan 3, 2024
Published in print: Mar 1, 2024
Discussion open until: Jun 3, 2024

Authors

Affiliations

Graduate Student, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Graduate Student, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China. ORCID: https://orcid.org/0009-0007-3831-2084. Email: [email protected]
Xiaohan Liu [email protected]
Graduate Student, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Graduate Student, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China; Hikvision Digital Technology Company Limited, No.555 Qianmo Rd., Binjiang District, Hangzhou 310051, PR China. Email: [email protected]
Professor, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Professor, Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Professor, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China (corresponding author). ORCID: https://orcid.org/0000-0001-7961-7588. Email: [email protected]
Changyin Dong [email protected]
Professor, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Professor, Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Professor, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China. Email: [email protected]
Professor, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Professor, Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Professor, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China. Email: [email protected]

Cited by

  • Advancing Traffic Simulation Precision and Scalability: A Data-Driven Approach Utilizing Deep Neural Networks, Sustainability, 10.3390/su16072666, 16, 7, (2666), (2024).
