Technical Papers
Feb 7, 2022

Assessing the Performance of Gradient-Boosting Models for Predicting the Travel Mode Choice Using Household Survey Data

Publication: Journal of Urban Planning and Development
Volume 148, Issue 2

Abstract

Analyzing travel mode choice is crucial for understanding travel behavior in urban areas and for formulating mobility policies, especially given the growing concern to reduce the use of private transport modes in favor of sustainable ones. Although multinomial logit models have been widely used in travel behavior research, machine learning models are becoming an interesting alternative for this task. Among them, tree-based ensemble models, such as random forest and gradient boosting models, have demonstrated superior performance, although the two have not been compared using household survey data. This paper compares different logit and machine learning models, with a specific emphasis on gradient boosting, random forest, and multinomial logit models, to predict travel mode choice and to identify the determinants of travel behavior in an urban area for three transport modes (public transport, private transport, and walking/bike), using household survey data. Although the methodology is defined following the case and features of the metropolitan area of the Aburrá Valley, Colombia (MAAV), it can be applied to any urban area. The results show that an optimized gradient boosting model is able to predict travel mode choice in an urban area using household survey data, outperforming the other compared models. In addition, travel time, parking type at the destination, the number of motorized vehicles in the household (cars and motorbikes), age, and gender are features that explain the travel mode choice in the MAAV. The optimized gradient boosting model presented in this paper can be employed as a policy tool to study and analyze strategies to reduce the use of private transport modes in the MAAV and increase the use of more sustainable transport modes.
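The kind of comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data as a stand-in for household survey records; the features, sample size, and default hyperparameters are hypothetical, and the optimization step applied to the gradient boosting model in the paper is not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for survey records (hypothetical data): the three
# classes play the role of public transport, private transport, and
# walking/bike.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Three model families named in the abstract, with default settings.
# LogisticRegression fits a multinomial (softmax) model for a
# multiclass target when used with its default lbfgs solver.
models = {
    "multinomial_logit": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.3f}")

# Impurity-based feature importances from the boosted model, one value
# per input feature; a rough analogue of asking which features explain
# the predicted mode.
importances = models["gradient_boosting"].feature_importances_
```

On synthetic data the ranking of the three models carries no meaning; the sketch only shows the shape of the workflow (one shared train/test split, one metric, several model families).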

Acknowledgments

The authors express their gratitude to two reviewers for their constructive suggestions to improve this study.

Published In

Journal of Urban Planning and Development
Volume 148, Issue 2, June 2022

History

Received: Dec 16, 2020
Accepted: Dec 7, 2021
Published online: Feb 7, 2022
Published in print: Jun 1, 2022
Discussion open until: Jul 7, 2022

Authors

Affiliations

Research Associate, Dept. of Engineering, Univ. of Luxembourg, 6 Avenue de la Fonte, Maison du Nombre, office 0435010, 4364 Esch-sur-Alzette, Luxembourg (corresponding author). ORCID: https://orcid.org/0000-0002-4657-7521.
Ph.D. Candidate, Dept. of Civil Engineering, National Univ. of Colombia, Cr. 80 #65-223, Campus Robledo, Medellín 050034, Colombia. ORCID: https://orcid.org/0000-0003-2080-9045.

Cited by

  • Travel Mode Determining Factors for Residents within the Catchment Areas of Urban Rail Transit Stations: Evidence from Nanjing, China, Journal of Transportation Engineering, Part A: Systems, 10.1061/JTEPBS.TEENG-8210, 150, 7, (2024).
  • Accessibility Evaluation of Public Service Facilities in Villages and Towns Based on POI Data: A Case Study of Suining County, Xuzhou, China, Journal of Urban Planning and Development, 10.1061/JUPDDM.UPENG-4222, 149, 3, (2023).
  • Missions and factors determining the demand for affordable mass space tourism in the United States: A machine learning approach, Acta Astronautica, 10.1016/j.actaastro.2023.01.006, 204, (307-320), (2023).
  • Application of Machine Learning to Child Mode Choice with a Novel Technique to Optimize Hyperparameters, International Journal of Environmental Research and Public Health, 10.3390/ijerph192416844, 19, 24, (16844), (2022).
  • Modeling Sharjah's Travel Mode Choice Using Multinomial Logit Regression Model, 2022 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS), 10.1109/ICETSIS55481.2022.9888851, (91-95), (2022).
