Technical Papers
Dec 27, 2019

Improving Urban Water Security through Pipe-Break Prediction Models: Machine Learning or Survival Analysis

Publication: Journal of Environmental Engineering
Volume 146, Issue 3

Abstract

North America’s water distribution systems are aging and experiencing more pipe breaks. These breaks pose a serious threat to urban drinking water security, leading to service interruptions, loss of revenue, and an increased risk of water contamination. Prediction models have been developed to help identify when individual underground water pipes are expected to break, helping utilities develop pipe renewal projects and avoid costly pipe breaks that impact water supply reliability. This paper provides an in-depth comparison of the two leading statistical pipe-break modeling methods: machine-learning and survival-analysis algorithms. A gradient-boosting decision tree machine-learning model and a Weibull proportional hazard survival-analysis model are used to predict time to next break for cast-iron pipes in a major Canadian water distribution system. Results indicate that removing censored events from the machine-learning model biases it toward predicting pipe breaks earlier than they actually occur. Overall, water utilities concerned with the short-term impacts of pipe breaks on water security may favor the machine-learning approach, but the survival-analysis model’s ability to incorporate right-censored data makes it more appropriate for long-term asset management planning.
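
The censoring effect described in the abstract can be illustrated with a small synthetic simulation. The sketch below is not the paper's data, model, or code: the Weibull shape and scale, the 40-year observation window, and the function name simulate_break_times are all assumptions chosen for illustration. It draws break times from a Weibull distribution, right-censors every pipe that has not broken by the end of the window, and compares the mean break time of the full population with the mean over the uncensored pipes only.

```python
import random
import statistics


def simulate_break_times(n_pipes, shape, scale, window, seed=0):
    """Draw Weibull break times and right-censor them at a fixed window.

    All parameter values passed in are illustrative assumptions, not
    values taken from the paper.
    """
    rng = random.Random(seed)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    true_times = [rng.weibullvariate(scale, shape) for _ in range(n_pipes)]
    # A pipe still intact at the end of the window is right-censored:
    # only a lower bound on its break time is known.
    observed = [min(t, window) for t in true_times]
    broke = [t <= window for t in true_times]
    return true_times, observed, broke


true_times, observed, broke = simulate_break_times(
    n_pipes=5000, shape=1.5, scale=60.0, window=40.0
)

# Mean break time over every pipe (unknowable in practice, known here
# because the data are simulated).
true_mean = statistics.mean(true_times)

# Mean break time when censored pipes are simply dropped, which is what a
# regression model sees if it is trained on observed breaks only.
uncensored_times = [t for t, b in zip(observed, broke) if b]
uncensored_mean = statistics.mean(uncensored_times)

censored_share = 1 - sum(broke) / len(broke)

print(f"share of pipes censored:        {censored_share:.2f}")
print(f"mean break time, all pipes:     {true_mean:.1f} years")
print(f"mean break time, breaks only:   {uncensored_mean:.1f} years")
```

Because every retained break time is at most the window length, the breaks-only mean is necessarily lower than the population mean, so a model fitted to that subset would tend to predict breaks too early. A survival-analysis likelihood avoids this by letting censored pipes contribute the probability of surviving past the window.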

Data Availability Statement

All data used during the study are confidential in nature and cannot be provided by agreement with the municipalities due to their concern with the security of their distribution system.

Acknowledgments

This research was funded by the Natural Sciences and Engineering Research Council (NSERC). The authors are grateful for the help and data provided by the utility described in the case study of this report. In this research, data were processed in RStudio using the R language.

Published In

Journal of Environmental Engineering
Volume 146, Issue 3, March 2020

History

Received: Apr 22, 2019
Accepted: Jul 30, 2019
Published online: Dec 27, 2019
Published in print: Mar 1, 2020
Discussion open until: May 27, 2020

Authors

Affiliations

Ph.D. Candidate and Research Assistant, School of Engineering, Univ. of Guelph, Guelph, ON, Canada N1G 2W1 (corresponding author). ORCID: https://orcid.org/0000-0003-4883-6045.
Professor, School of Engineering, Univ. of Guelph, Guelph, ON, Canada N1G 2W1. ORCID: https://orcid.org/0000-0002-4701-5467.
