Modeling Pipe Break Data Using Survival Analysis with Machine Learning Imputation Methods
Publication: Journal of Performance of Constructed Facilities
Volume 35, Issue 5
Abstract
The development of asset life estimation tools based on historical data is essential to the effective management of pipeline assets. One tool that may assist with asset management is survival analysis. However, left-truncated break records pose a challenge in the practice of survival analysis to obtain sound inferences and predictions. In this study, we propose a data-driven approach that integrates machine learning imputation methods with survival analysis. To demonstrate the proposed methodology, we perform a case study using ductile iron (DI) water distribution pipes from an anonymized utility in the midwestern United States. Two artificial neural network (ANN) models are developed as imputation methods to calibrate the survival curves and mean time to first failure (MTTF) estimates from the Weibull proportional hazards model (WPHM). Results show that the MTTF estimation bias is reduced from 14.3% to 2.1% by using imputation as a preceding procedure. Empirical findings show that despite the limited accuracy of imputation models, the use of imputation methods can still improve the survival analysis results and mitigate the impact of left-truncated break records.
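For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how a survival curve and an MTTF can be computed from a Weibull proportional hazards model. It assumes the common parameterization S(t | x) = exp(-(t/scale)^shape * exp(beta·x)), which the abstract does not spell out. The function names, the covariate handling, and the percentage-bias helper are hypothetical and are not taken from the article.

```python
import math


def _linear_predictor(beta, x):
    # beta . x, the covariate effect on the log-hazard scale
    return sum(b * xi for b, xi in zip(beta, x))


def wphm_survival(t, shape, scale, beta, x):
    """Survival probability at age t under an assumed Weibull PH form:
    S(t | x) = exp(-(t / scale)**shape * exp(beta . x))."""
    eta = _linear_predictor(beta, x)
    return math.exp(-((t / scale) ** shape) * math.exp(eta))


def wphm_mttf(shape, scale, beta, x):
    """Mean time to first failure for the same assumed form.

    The covariates rescale the Weibull scale parameter to
    scale * exp(-eta / shape), so the mean is that value times
    Gamma(1 + 1 / shape)."""
    eta = _linear_predictor(beta, x)
    return scale * math.exp(-eta / shape) * math.gamma(1.0 + 1.0 / shape)


def percent_bias(estimate, reference):
    """Absolute relative deviation of an estimate from a reference, in percent."""
    return abs(estimate - reference) / reference * 100.0


if __name__ == "__main__":
    # Illustrative values only; they are not results from the study.
    shape, scale = 1.5, 80.0
    beta, x = [0.3], [1.0]
    print(wphm_survival(40.0, shape, scale, beta, x))
    print(wphm_mttf(shape, scale, beta, x))
```

Under this form, a positive coefficient raises the hazard and shortens the MTTF. A helper such as `percent_bias` could then compare an MTTF estimate against a reference value, which is one plausible way to express figures like the reported 14.3% and 2.1%.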
Data Availability Statement
Some or all data, models, or code generated or used during the study are proprietary or confidential in nature and may only be provided with restrictions. The data sets used in this study were provided by water utilities that wish to remain anonymous, so the data sets cannot be shared.
Acknowledgments
We gratefully acknowledge the financial support from the United States Bureau of Reclamation (USBR) for this research work. We thank the participating water utilities across the United States for providing their data used in this study. We also thank the four anonymous reviewers for their constructive comments, which greatly improved the quality of the article.
Copyright
© 2021 American Society of Civil Engineers.
History
Received: Feb 16, 2021
Accepted: Jun 23, 2021
Published online: Aug 11, 2021
Published in print: Oct 1, 2021
Discussion open until: Jan 11, 2022