Technical Papers
Apr 11, 2024

Using Text Mining and Bayesian Network to Identify Key Risk Factors for Safety Accidents in Metro Construction

Publication: Journal of Construction Engineering and Management
Volume 150, Issue 6

Abstract

Complex risk factors make metro construction prone to safety accidents of many types. Accident reports record detailed information about these accidents in text form, but effectively utilizing such unstructured data is a significant challenge. Text mining (TM) provides a viable foundation for addressing this challenge, yet related studies are limited in risk feature extraction and lack in-depth analysis capability. To address these deficiencies and provide a feasible strategy for identifying key risk factors in the metro construction domain, this paper proposes an integrated model combining TM and machine learning–based Bayesian networks. First, the term frequency-inverse document frequency (TF-IDF) algorithm was used to extract the direct and indirect cause factors separately from the accident reports, and the TextRank algorithm was used to supplement missing factors. Then an improved naive Bayesian network (NBN), which assumes conditional independence between factors, and a tree-augmented naive Bayesian network (TAN), which relaxes that assumption, were built from the extracted factors and the corresponding accident types for further in-depth analysis. Finally, the data set was divided into training and test sets to train and evaluate the two network models, and sensitivity analysis was used to identify the key risk factors. In an application example using 162 accident reports from China, TAN achieved a higher average accuracy on the test set (79.62%) than the improved NBN (71.75%), and TAN was used to rank the importance of risk factors for different accident types from multiple perspectives. The analysis also yielded new insights into metro accidents in China that can support decision-making for accident prevention and control. In conclusion, this paper addresses the identified limitations of accident text utilization and presents a novel approach for metro construction safety management.
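
The TF-IDF step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `tf_idf` and `top_terms` and the three tokenized snippets are hypothetical, and the sketch assumes the reports have already been segmented into words and stripped of stop words. It scores each term by its frequency within one report multiplied by the logarithm of the inverse fraction of reports that contain it, so terms specific to a report rank above terms common to many.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(score, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical, already-tokenized report snippets (not data from the paper).
reports = [
    ["excavation", "collapse", "support", "failure", "rain"],
    ["crane", "overturn", "operator", "error", "training"],
    ["excavation", "collapse", "monitoring", "missing", "training"],
]
scores = tf_idf(reports)
print(top_terms(scores[0]))
```

In this toy corpus, terms that occur in only one report outrank "excavation" and "collapse", which occur in two. The paper applies the extraction separately to direct and indirect causes and supplements it with TextRank, neither of which this sketch attempts.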

Practical Applications

Analyzing accident texts helps safety managers draw insights from objective historical data. However, accident texts are often unstructured and contain much irrelevant content. How to quickly extract valid information from accident text and use it to analyze accidents in depth is of continuing interest to safety managers, particularly for models that offer real-time decision support in addition to theoretical insights. This paper proposes an integrated model that combines text mining and machine-learning Bayesian networks. The model provides comprehensive textual feature extraction and multifaceted accident causation analysis, and it allows safety managers to input current accident information to obtain real-time decision support for accident prevention and control. Although the proposed model was developed for metro construction, it can be adapted to similar domains by incorporating the characteristics of their accident texts, yielding an integrated model suited to those domains and helping to control the occurrence of safety accidents.
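
The idea of entering currently observed factors to obtain decision support can be sketched with a plain naive Bayes classifier. This is a simplified stand-in, not the improved NBN or the TAN from the paper: it assumes binary factors, conditional independence given the accident type, Laplace smoothing with a hypothetical parameter `alpha`, and that any factor not listed as observed is absent. The factor names, accident types, and four training records are invented for illustration.

```python
from collections import Counter, defaultdict


def train(records, alpha=1.0):
    """Count factor occurrences per accident type.

    records: list of (set_of_present_factors, accident_type) pairs.
    """
    type_counts = Counter(label for _, label in records)
    factor_counts = defaultdict(Counter)
    factors = set()
    for present, label in records:
        factors |= present
        for factor in present:
            factor_counts[label][factor] += 1
    return type_counts, factor_counts, sorted(factors), alpha


def posterior(model, observed):
    """Return P(accident type | observed factors) under naive Bayes."""
    type_counts, factor_counts, factors, alpha = model
    n_records = sum(type_counts.values())
    joint = {}
    for label, n_label in type_counts.items():
        p = n_label / n_records  # prior probability of the accident type
        for factor in factors:
            # Laplace-smoothed P(factor present | accident type).
            p_present = (factor_counts[label][factor] + alpha) / (n_label + 2 * alpha)
            # Assumption: factors not in `observed` are treated as absent.
            p *= p_present if factor in observed else 1.0 - p_present
        joint[label] = p
    norm = sum(joint.values())
    return {label: p / norm for label, p in joint.items()}


# Hypothetical training records (not data from the paper).
records = [
    ({"weak_support", "heavy_rain"}, "collapse"),
    ({"weak_support"}, "collapse"),
    ({"operator_error"}, "crane_accident"),
    ({"operator_error", "poor_training"}, "crane_accident"),
]
model = train(records)
post = posterior(model, {"weak_support"})
print(post)
```

With only "weak_support" observed, the posterior favors "collapse" over "crane_accident". A TAN would additionally let each factor depend on one other factor, which this sketch does not model.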

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

Information & Authors

Information

Published In

Journal of Construction Engineering and Management
Volume 150, Issue 6, June 2024

History

Received: Jun 8, 2023
Accepted: Jan 30, 2024
Published online: Apr 11, 2024
Published in print: Jun 1, 2024
Discussion open until: Sep 11, 2024

Authors

Affiliations

Professor, School of Management Engineering, Qingdao Univ. of Technology, Qingdao 266520, China. ORCID: https://orcid.org/0009-0009-6415-4424.
Graduate Student, School of Management Engineering, Qingdao Univ. of Technology, Qingdao 266520, China (corresponding author). ORCID: https://orcid.org/0009-0003-6345-1140.
Graduate Student, School of Management Engineering, Qingdao Univ. of Technology, Qingdao 266520, China.
