Reinforcement Learning for True Adaptive Traffic Signal Control
Publication: Journal of Transportation Engineering
Volume 129, Issue 3
Abstract
The ability to exert real-time, adaptive control over transportation processes is at the core of many intelligent transportation systems decision support tools. Reinforcement learning, an artificial intelligence approach under active development in the machine-learning community, offers key advantages in this regard. A control agent that learns the relationships between control actions and their effects on the environment while pursuing a goal is a distinct improvement over prespecified models of the environment. Such prespecified models are a prerequisite of conventional control methods, and their accuracy limits the performance of control agents. This paper introduces Q-learning, a simple yet powerful reinforcement learning algorithm, and presents a case study of its application to traffic signal control. Encouraging results for an isolated traffic signal, particularly under variable traffic conditions, are presented. A broader research effort is outlined, including extension to linear and networked signal systems and integration with dynamic route guidance. The research objective is optimal control of heavily congested traffic across a two-dimensional road network—a challenging task for conventional traffic signal control methodologies.
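The Q-learning approach described above can be sketched for a toy isolated two-phase signal. Everything below—the queue dynamics, parameter values, and function names—is an illustrative assumption, not the paper's actual simulation: the agent picks which approach gets a green, and the standard tabular Q-update learns from a delay-proxy reward (negative total queue length).

```python
import random

# Tabular Q-learning sketch for a two-phase signal at one intersection.
# The environment is a deliberately simple queue model (illustrative only).

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration
MAX_QUEUE = 5  # queues are capped/discretized to keep the state space small

def step(queues, action, arrivals):
    """Serve up to 2 vehicles on the green approach; return (next_state, reward)."""
    q = list(queues)
    q[action] = max(0, q[action] - 2)                    # departures on green
    q = [min(MAX_QUEUE, qi + a) for qi, a in zip(q, arrivals)]
    return tuple(q), -sum(q)                             # reward: negative total queue

def train(episodes=200, horizon=50, seed=0):
    rng = random.Random(seed)
    Q = {}  # Q[(state, action)] -> estimated value
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(horizon):
            if rng.random() < EPSILON:
                action = rng.randrange(2)                # explore
            else:                                        # exploit: greedy action
                action = max(range(2), key=lambda a: Q.get((state, a), 0.0))
            arrivals = (rng.randint(0, 2), rng.randint(0, 1))  # asymmetric demand
            nxt, reward = step(state, action, arrivals)
            # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            best_next = max(Q.get((nxt, a), 0.0) for a in range(2))
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
            state = nxt
    return Q
```

Because the reward needs no model of the traffic dynamics—only observed queues—the agent learns its control policy directly from experience, which is the property the abstract contrasts with prespecified-model control.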
Copyright
Copyright © 2003 American Society of Civil Engineers.
History
Received: Oct 30, 2001
Accepted: May 21, 2002
Published online: Apr 15, 2003
Published in print: May 2003