Convolutional Neural Network Algorithm–Based Novel Automatic Text Classification Framework for Construction Accident Reports
Publication: Journal of Construction Engineering and Management
Volume 149, Issue 12
Abstract
Construction sites remain among the most hazardous workplaces globally. To improve workplace safety in the construction industry and reduce the personal injuries and socioeconomic impacts of workplace accidents, tacit knowledge containing the fundamental causes of accidents or specific contextual factors can be extracted from past accident narrative reports. However, manually analyzing the unstructured or semistructured text stored in these records is a daunting task, so automated and intelligent technologies are needed to acquire knowledge rapidly and accurately. Therefore, this paper proposes a text self-classification model based on deep learning natural language processing (NLP) technology that automatically classifies construction site accident cases by accident type. First, two statistical measures, mutual information and information entropy, were combined to segment the preprocessed text data into phrases, identifying more complete and accurate accident precursor information without human intervention. Then a complete multilayer, multisize convolutional neural network (CNN) model was constructed using pretrained Word2Vec word embeddings for the text self-classification task. Finally, the test results of the CNN classification algorithm were compared with the practical application results of three shallow learning algorithms, and the performance of the different types of classification algorithms was evaluated. The results showed that the CNN-based deep learning algorithm developed in this paper demonstrated excellent feature extraction and learning abilities in automatic text classification. This demonstrated that reliable accident prevention knowledge can be obtained from textual descriptions of construction accidents, and it also provided a novel reference model for document archiving and information retrieval.
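The phrase segmentation step named in the abstract combines mutual information with information entropy. The abstract does not give the exact formulas, thresholds, or tokenization, so the following Python sketch is only an illustration under stated assumptions: it scores adjacent token pairs by pointwise mutual information (how strongly the two words co-occur) and by left and right branching entropy (how varied the surrounding context is). The function name `score_bigrams` and the toy reports are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(counter):
    """Shannon entropy (in bits) of the distribution implied by a Counter."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counter.values())


def score_bigrams(sentences):
    """Map each adjacent token pair to (pmi, left_entropy, right_entropy).

    Assumption: a candidate phrase is any pair of adjacent tokens; the
    paper's actual candidate generation may differ.
    """
    unigrams = Counter()
    bigrams = Counter()
    left_neighbors = defaultdict(Counter)   # tokens seen just before the pair
    right_neighbors = defaultdict(Counter)  # tokens seen just after the pair

    for tokens in sentences:
        unigrams.update(tokens)
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            bigrams[pair] += 1
            if i > 0:
                left_neighbors[pair][tokens[i - 1]] += 1
            if i + 2 < len(tokens):
                right_neighbors[pair][tokens[i + 2]] += 1

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for pair, count in bigrams.items():
        w1, w2 = pair
        p_pair = count / n_bi
        p1 = unigrams[w1] / n_uni
        p2 = unigrams[w2] / n_uni
        pmi = math.log2(p_pair / (p1 * p2))
        scores[pair] = (
            pmi,
            entropy(left_neighbors[pair]),
            entropy(right_neighbors[pair]),
        )
    return scores


# Toy, made-up accident narratives (already tokenized).
reports = [
    ["worker", "fell", "from", "scaffold"],
    ["worker", "fell", "from", "ladder"],
    ["crane", "struck", "worker"],
]
scores = score_bigrams(reports)
pmi, left_h, right_h = scores[("fell", "from")]
```

Under this kind of scheme, a pair with high mutual information and high boundary entropy is a plausible candidate to merge into a single phrase. Any cutoff values would need to be tuned on real data.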
Data Availability Statement
Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgments
This work was supported by the Major Program of Philosophy and Social Science Research in Jiangsu University (Grant No. 2020SJZDA085), the China Postdoctoral Science Foundation (Grant No. 2019M662003), and the Jiangsu Planned Projects for Postdoctoral Research Funds (Grant No. 2019K171).
Copyright
© 2023 American Society of Civil Engineers.
History
Received: Jan 4, 2023
Accepted: Jul 25, 2023
Published online: Sep 29, 2023
Published in print: Dec 1, 2023
Discussion open until: Feb 29, 2024