Construction-Accident Narrative Classification Using Shallow and Deep Learning

Qiao, Jianfeng; Wang, Changfeng; Guan, Shuang; Shuran, Lv

doi:10.1061/(ASCE)CO.1943-7862.0002354

Technical Papers

Jul 4, 2022


Authors: Jianfeng Qiao (https://orcid.org/0000-0002-4379-0810), Changfeng Wang, Shuang Guan, and Lv Shuran

Publication: Journal of Construction Engineering and Management

Volume 148, Issue 9

https://doi.org/10.1061/(ASCE)CO.1943-7862.0002354


Abstract

It is crucial to extract knowledge from past accidents to prevent future ones. To this end, text mining requires classifying accident narratives. This autocoding process can be seen as a multiclass classification problem with an imbalanced data set. We evaluated the performance of several state-of-the-art machine learning methods, including 10 shallow learning methods [Rocchio, k-nearest neighbors, linear regression, naive Bayes, decision tree, random forest, gradient boosting, bootstrap aggregating, support vector machine (SVM), and shallow neural network] and five deep learning methods [deep neural network, convolutional neural network (CNN), recurrent neural network with long short-term memory, recurrent neural network with a gated recurrent unit, and recurrent CNN]. The input data set contained 4,770 construction accident reports from the Occupational Safety and Health Administration (OSHA). After the narratives were relabeled based on the Occupational Injury and Illness Classification System (OIICS), the accuracy of all shallow classifiers was significantly improved compared with that reported in previous studies. SVM achieved the highest accuracy among the shallow learning methods (0.91), and CNN achieved the highest among the deep learning methods (0.90). Misclassifications occur because the training data lack diversity for minority classes, some cases belong to multiple classes, and some divisions share the same key feature words. In the future, when a new data set becomes available, the learned patterns can be used to classify its narratives with high accuracy in practice.
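As a rough illustration of what narrative autocoding involves, the following minimal sketch implements a Rocchio (nearest-centroid) classifier, the first shallow method listed above, over TF-IDF vectors in plain Python. It is not the authors' code: the tokenizer, the smoothed IDF weighting, the toy narratives, and the class names (fall, electrocution, struck_by) are assumptions made up for illustration and are not drawn from the OSHA data set or from OIICS.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-letters (a deliberately crude tokenizer)."""
    word, tokens = [], []
    for ch in text.lower():
        if ch.isalpha():
            word.append(ch)
        elif word:
            tokens.append("".join(word))
            word = []
    if word:
        tokens.append("".join(word))
    return tokens


def fit_idf(docs):
    """Smoothed inverse document frequency over tokenized training documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unseen terms are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def train_rocchio(texts, labels):
    """Return the IDF table and one unit-length centroid per class."""
    docs = [tokenize(t) for t in texts]
    idf = fit_idf(docs)
    sizes = Counter(labels)
    sums = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for term, weight in tfidf(doc, idf).items():
            sums[label][term] += weight
    centroids = {}
    for label, total in sums.items():
        # Averaging per class keeps a large class from dominating by size alone.
        mean = {t: w / sizes[label] for t, w in total.items()}
        norm = math.sqrt(sum(w * w for w in mean.values()))
        centroids[label] = {t: w / norm for t, w in mean.items()} if norm else {}
    return idf, centroids


def predict(text, idf, centroids):
    """Assign the class whose centroid has the highest cosine similarity."""
    vec = tfidf(tokenize(text), idf)

    def cosine(centroid):
        return sum(w * centroid.get(t, 0.0) for t, w in vec.items())

    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(centroids), key=lambda label: cosine(centroids[label]))


# Hypothetical, imbalanced toy data (three, two, and one example per class).
TRAIN_TEXTS = [
    "Employee fell from scaffold and fractured his leg",
    "Worker fell from the roof while installing shingles",
    "Employee fell from a ladder and struck the ground",
    "Worker was electrocuted after contacting an overhead power line",
    "Employee received an electric shock from a live wire",
    "Worker was struck by a falling steel beam",
]
TRAIN_LABELS = ["fall", "fall", "fall", "electrocution", "electrocution", "struck_by"]

idf, centroids = train_rocchio(TRAIN_TEXTS, TRAIN_LABELS)
print(predict("Employee fell off the roof edge", idf, centroids))
```

Because each class is reduced to a single centroid, a minority class with only a handful of training narratives is represented by very few feature words, which is one way to picture the lack of diversity for minority classes noted above.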


Data Availability Statement

The code is available upon reasonable request. The data are available at https://github.com/qiao77/Injury-Narratives.

Acknowledgments

This research was supported by the Beijing Key Laboratory of Megaregions Sustainable Development Modeling, Capital University of Economics and Business (CUEB) fund. The authors would like to thank the anonymous reviewers for their detailed and valuable comments, which have significantly improved the quality of the paper.

