Technical Papers
May 20, 2024

Unsupervised Approach to Investigate Urban Traffic Crashes Based on Crash Unit, Crash Severity, and Manner of Collision

Publication: Journal of Transportation Engineering, Part A: Systems
Volume 150, Issue 8

Abstract

Both crash frequency analysis (CFA) and real-time crash prediction models (RTCPMs) divide a highway into small segments with a constant length [typically 0.161 km (0.10 mi)] for data aggregation. Many previous studies refer to this constant length as the segment length for data aggregation, but this paper adopts the term fragment size to avoid confusion with aggregation based on highway geometric features. Several studies have shown that segmentation length affects study results and recommend not using a length smaller than 0.161 km (0.10 mi) or greater than 0.402 km (0.25 mi) to segment and aggregate traffic data for urban/suburban highways and freeways. Despite the significant impact of the segmentation length on traffic crash aggregation, no specific recommendation exists for selecting or determining the segmentation length for crash data aggregation. This research investigates the impact of segmentation length on traffic crash data aggregation. It establishes a methodology for determining a recommended fragment size (RFS) using hidden heterogeneity in traffic crash data. The study defines featured traffic crash rates using three major traffic crash characteristics: number of vehicles in the crash, manner of collision, and crash severity. The analysis uses the Laplacian score with a distance-based entropy measure and K-means to cluster highway segments based on the featured crash rates (FCRs) and total crash rates (TCRs) for fragment sizes ranging from 0.161 to 0.402 km (0.10 to 0.25 mi) with an increment of 0.016 km (0.01 mi). The clustering results are compared using their silhouette coefficients. The sample results show that FCR-based clustering outperforms TCR-based clustering by providing important traffic crash groups within a highway and the RFS to segment and aggregate traffic crash data.
The proposed method provides a data-driven comparison of different fragment sizes, revealing the pattern of traffic crashes and a standardized approach for RFS, which reduces the likelihood of fragment misclassification and benefits traffic studies depending on segmentation length.
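
The comparison of fragment sizes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the crash records are synthetic, the rate is taken as crashes per mile with no traffic-volume exposure term, the number of clusters (k = 3) is arbitrary, the Laplacian score feature-selection step is omitted, and a small pure-Python K-means (Lloyd's algorithm) stands in for a library implementation. All function names are hypothetical.

```python
import math
import random
from collections import Counter

def fragment_rates(crashes, length_mi, size_mi, categories):
    """Crashes per mile, by category, for each fixed-length fragment.
    ASSUMPTION: no traffic-volume exposure term is applied."""
    n = math.ceil(length_mi / size_mi - 1e-9)
    counts = [Counter() for _ in range(n)]
    for milepost, category in crashes:
        idx = min(int(milepost / size_mi), n - 1)  # clamp the end point
        counts[idx][category] += 1
    return [[c[cat] / size_mi for cat in categories] for c in counts]

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm; returns one cluster label per point.
    Requires len(points) >= k."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0  # sketch convention: undefined for a single cluster
    total = 0.0
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[labels[i]]
        if a is None:  # singleton cluster contributes 0
            continue
        b = min(v for c, v in mean_dist.items() if c != labels[i])
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(points)

def compare_fragment_sizes(crashes, length_mi, categories, k=3):
    """Silhouette score for each fragment size from 0.10 to 0.25 mi."""
    scores = {}
    for hundredths in range(10, 26):  # 0.01 mi increments
        size = hundredths / 100
        rates = fragment_rates(crashes, length_mi, size, categories)
        scores[size] = silhouette(rates, kmeans(rates, k))
    return max(scores, key=scores.get), scores

# Synthetic example: 300 crashes on a hypothetical 10 mi highway.
rng = random.Random(42)
CATEGORIES = ["single_vehicle", "rear_end", "sideswipe"]
demo = [(rng.uniform(0, 10), rng.choice(CATEGORIES)) for _ in range(300)]
best_size, scores = compare_fragment_sizes(demo, 10.0, CATEGORIES)
print(f"highest silhouette at {best_size:.2f} mi")
```

In this sketch the fragment size with the highest mean silhouette coefficient plays the role of the recommended fragment size; swapping the category vector for a single total count per fragment would give the TCR-based counterpart for comparison.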

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

This work depended on traffic count data provided by TxDOT Traffic Analysis and System Support. The authors would like to thank Laura Dabalain and Eric Oeding for supporting this research by providing the traffic count data. The authors would also like to acknowledge partial support from the National Institute for Transportation and Communities (NITC; Grant No. 1578).
Author contributions: The authors confirm contribution to the paper as follows: conceptualization: Farzin Maniei and Stephen P. Mattingly; formal analysis: Farzin Maniei; resources: Farzin Maniei; data curation: Farzin Maniei; methodology: Farzin Maniei and Stephen P. Mattingly; programming and implementation of supporting algorithms: Farzin Maniei; interpretation of results: Farzin Maniei; writing–original draft: Farzin Maniei; writing–review and editing: Stephen P. Mattingly; draft manuscript preparation: Farzin Maniei; project administration: Stephen P. Mattingly; and funding acquisition: Stephen P. Mattingly. All authors reviewed the results and approved the final version of the manuscript.

Information & Authors

Information

Published In

Journal of Transportation Engineering, Part A: Systems
Volume 150, Issue 8, August 2024

History

Received: Dec 12, 2022
Accepted: Feb 9, 2024
Published online: May 20, 2024
Published in print: Aug 1, 2024
Discussion open until: Oct 20, 2024

Authors

Affiliations

Dept. of Civil Engineering, Univ. of Texas at Arlington, Box 19308, Arlington, TX 76019 (corresponding author). ORCID: https://orcid.org/0000-0002-2071-2043. Email: [email protected]
Stephen P. Mattingly, Ph.D., A.M.ASCE
Professor, Dept. of Civil Engineering, Univ. of Texas at Arlington, Box 19308, Arlington, TX 76019. ORCID: https://orcid.org/0000-0001-6515-6813. Email: [email protected]