Technical Papers
Jan 3, 2024

An Attention Reinforcement Learning–Based Strategy for Large-Scale Adaptive Traffic Signal Control System

Publication: Journal of Transportation Engineering, Part A: Systems
Volume 150, Issue 3

Abstract

This paper proposes a reinforcement learning (RL)-based traffic control strategy integrated with an attention mechanism for large-scale adaptive traffic signal control (ATSC) systems. The proposed attention RL integrates an attention mechanism into a multiagent RL model, namely multiagent proximal policy optimization (MAPPO), to enable more effective, scalable, and stable learning in complex ATSC environments. In the attention RL, decentralized policies are trained with a centrally computed critic that shares an attention model; the attention model selects the intersections relevant to each agent when estimating the global critic. This framework reduces computational complexity and stabilizes training, enhancing the ability of RL agents to control large-scale traffic networks. The proposed control strategy is tested in both a large synthetic traffic grid and a large real-world traffic network of Yangzhou city using the microscopic traffic simulation tool SUMO. Experimental results demonstrate that the proposed approach learns stable and sustainable policies that achieve lower congestion levels and faster recovery, outperforming other state-of-the-art RL-based approaches as well as a gap-based actuated controller.
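
As a concrete illustration of the framework described above, the following minimal Python/PyTorch sketch (not the authors' code) shows how a critic shared by all agents can encode each intersection's observation and use scaled dot-product attention, so that each agent's value estimate is driven mainly by the intersections most relevant to it. The class name, layer sizes, and the use of a single attention head are illustrative assumptions rather than details from the paper.

import torch
import torch.nn as nn

class AttentionCritic(nn.Module):
    """Hypothetical centralized critic shared by all agents via an attention model."""

    def __init__(self, obs_dim: int, hidden_dim: int = 64):
        super().__init__()
        # Per-intersection observation encoder (dimensions are assumptions)
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.query = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.key = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.value = nn.Linear(hidden_dim, hidden_dim, bias=False)
        # Maps an agent's own encoding plus attended neighbour context to a scalar value
        self.head = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim),
                                  nn.ReLU(),
                                  nn.Linear(hidden_dim, 1))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_agents, obs_dim) -> values: (batch, n_agents)
        h = torch.relu(self.encoder(obs))
        q, k, v = self.query(h), self.key(h), self.value(h)
        # Attention weights indicate which intersections are relevant to each agent
        scores = torch.matmul(q, k.transpose(-2, -1)) / (h.shape[-1] ** 0.5)
        attn = torch.softmax(scores, dim=-1)
        context = torch.matmul(attn, v)
        return self.head(torch.cat([h, context], dim=-1)).squeeze(-1)

if __name__ == "__main__":
    critic = AttentionCritic(obs_dim=12)
    values = critic(torch.randn(4, 25, 12))  # e.g., a batch of 5x5 grids of intersections
    print(values.shape)                      # torch.Size([4, 25])

Consistent with the abstract, such an attention-weighted critic would be used only during centralized training; at execution time each intersection would act on its own decentralized MAPPO policy.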

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors thank Prof. Ljubo Vlacic of the School of Engineering, Griffith University, for reviewing and editing the first draft of this manuscript. This work was supported in part by the National Key R&D Program of China (2022ZD0115600), in part by the National Natural Science Foundation of China (52302405), and in part by the Natural Science Foundation of Jiangsu Province (BK20210249).

Information & Authors

Published In

Journal of Transportation Engineering, Part A: Systems
Volume 150, Issue 3, March 2024

History

Received: Sep 1, 2023
Accepted: Nov 9, 2023
Published online: Jan 3, 2024
Published in print: Mar 1, 2024
Discussion open until: Jun 3, 2024

Authors

Affiliations

Graduate Student, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Graduate Student, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China. ORCID: https://orcid.org/0009-0007-3831-2084. Email: [email protected]
Xiaohan Liu [email protected]
Graduate Student, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Graduate Student, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China; Hikvision Digital Technology Company Limited, No.555 Qianmo Rd., Binjiang District, Hangzhou 310051, PR China. Email: [email protected]
Professor, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Professor, Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Professor, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China (corresponding author). ORCID: https://orcid.org/0000-0001-7961-7588. Email: [email protected]
Changyin Dong [email protected]
Professor, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Professor, Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Professor, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China. Email: [email protected]
Professor, School of Transportation, Southeast Univ., Nanjing 211189, PR China; Professor, Jiangsu Key Laboratory of Urban ITS, Southeast Univ., Nanjing 210096, PR China; Professor, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Southeast Univ., Nanjing 210096, PR China. Email: [email protected]

Cited by

  • Advancing Traffic Simulation Precision and Scalability: A Data-Driven Approach Utilizing Deep Neural Networks, Sustainability, 10.3390/su16072666, 16, 7, (2666), (2024).
