Technical Papers
Jul 24, 2020

Evaluating the Nonlinear Correlation between Vertical Curve Features and Crash Frequency on Highways Using Random Forests

Publication: Journal of Transportation Engineering, Part A: Systems
Volume 146, Issue 10

Abstract

Vertical curve features on interstate highways greatly affect traffic operations and vehicle performance and thus could influence the occurrence of traffic crashes. Most studies to date have considered only linear relationships. Although some researchers have considered nonlinearity, the preassumed data distribution may not fit the true distribution well. The primary objective of this study is therefore to develop a nonparametric approach, based on the random forest (RF) algorithm, for evaluating the nonlinear correlation between vertical curve features and crash frequency on interstate highways. Elevation data along interstate centerlines were extracted from Google Earth for two interstates in Washington State, and 5-year crash data were collected to estimate RF models for crash count prediction. A random effect negative binomial (RENB) model was employed as the baseline for evaluating predictive performance. Analysis of variable importance shows that the proposed RF models captured the nonlinear correlation between crash count and annual average daily traffic (AADT), the elevation and grade of road segments, median lane width, left shoulder width, ratio of horizontal curve, the standard deviation of grade in 1- and 2-mi road segments, the standard deviation of elevation in 1- and 2-mi road segments, and lane width. Other variables, such as right shoulder width and the number of lanes on the highway, were also important in the proposed RF models. By better capturing the nonlinearity, the proposed RF model outperformed the baseline RENB model on the predictive performance measures. The findings of this research can help improve highway geometric design and inform countermeasures to reduce the crash count on interstate highways.
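To make the vertical curve features named in the abstract more concrete, the following is a minimal sketch of how grade and the standard deviation of grade over 1-mi windows could be derived from an elevation profile. It is not the authors' procedure: the fixed sample spacing, the non-overlapping windows, the use of the population standard deviation, and the function names `grades_percent` and `windowed_std` are all assumptions made for illustration, and the elevation profile is synthetic.

```python
from statistics import pstdev

FEET_PER_MILE = 5280.0


def grades_percent(elevations_ft, spacing_ft):
    """Percent grade between consecutive elevation samples (hypothetical helper)."""
    return [
        100.0 * (nxt - cur) / spacing_ft
        for cur, nxt in zip(elevations_ft, elevations_ft[1:])
    ]


def windowed_std(values, spacing_ft, window_mi):
    """Population standard deviation of values in consecutive,
    non-overlapping windows of window_mi miles (an assumed windowing scheme)."""
    per_window = int(round(window_mi * FEET_PER_MILE / spacing_ft))
    if per_window < 1:
        raise ValueError("window is shorter than the sample spacing")
    return [
        pstdev(values[i:i + per_window])
        for i in range(0, len(values) - per_window + 1, per_window)
    ]


# Synthetic profile sampled every 528 ft (0.1 mi): one flat mile followed by
# one mile that alternates between 5% upgrades and 5% downgrades.
spacing = 528.0
elevations = [100.0] * 11 + [100.0 + 26.4 * (i % 2) for i in range(1, 11)]

grades = grades_percent(elevations, spacing)
sd_grade_1mi = windowed_std(grades, spacing, window_mi=1.0)

for mile, sd in enumerate(sd_grade_1mi, start=1):
    print(f"mile {mile}: std of grade = {sd:.2f}%")
```

The flat mile yields a standard deviation of zero, while the rolling mile yields a clearly positive value, which is the kind of contrast such a feature is meant to capture. Segment-level values like these could then be supplied, alongside AADT and cross-section variables, as predictors to a random forest regressor.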

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request. The crash data and the road elevation data used in the current research are available upon reasonable request.

Acknowledgments

This research was supported in part by project No. 1905 of the Center for Safety Equity in Transportation (CSET), a US Department of Transportation Tier 1 University Transportation Center. The authors also thank Dr. Yinsong Wang, Xianzhe Chen, and Ben Wright for help with data extraction and reduction and Chris Gottsacker for language editing.

Information & Authors

Information

Published In

Journal of Transportation Engineering, Part A: Systems
Volume 146, Issue 10, October 2020

History

Received: Feb 7, 2019
Accepted: Apr 7, 2020
Published online: Jul 24, 2020
Published in print: Oct 1, 2020
Discussion open until: Dec 24, 2020

Authors

Affiliations

Research Associate, Smart Transportation Application and Research Laboratory, Dept. of Civil and Environmental Engineering, Univ. of Washington, 101 More Hall, Seattle, WA 98195. ORCID: https://orcid.org/0000-0002-9488-9175.
Zhibin Li, Ph.D.
Professor, School of Transportation, Southeast Univ., No. 2 Southeast University Rd., Nanjing, Jiangsu 211189.
Ph.D. Candidate, Smart Transportation Application and Research Laboratory, Dept. of Civil and Environmental Engineering, Univ. of Washington, 101 More Hall, Seattle, WA 98195. ORCID: https://orcid.org/0000-0001-9139-6765.
Xuedong Hua, Ph.D.
Assistant Professor, Jiangsu Key Laboratory of Urban Intelligent Transportation Systems, School of Transportation, Southeast Univ., No. 2 Southeast University Rd., Nanjing, Jiangsu 211189.
Yinhai Wang, Ph.D., F.ASCE
Professor, Smart Transportation Application and Research Lab, Dept. of Civil and Environmental Engineering, Univ. of Washington, 121F More Hall, Seattle, WA 98195 (corresponding author).
