Statistical Modeling in Absence of System Specific Data: Exploratory Empirical Analysis for Prediction of Water Main Breaks
Publication: Journal of Infrastructure Systems
Volume 25, Issue 2
Abstract
The replacement of deteriorating distribution pipes is an important process for water utilities. It helps reduce capital spending on water main breaks and improves customer satisfaction. To assist with the development of an effective renewal plan, statistical models that forecast future breakage rates have been used to guide planning for asset management. However, this process is difficult for older utilities that lack readily available pipe network data. We examined whether accurate and useful predictive models can be built in the absence of pipe-feature data. Using the historical break record from a mid-Atlantic utility, two data sets at different spatial scales were created using publicly available demographic and environmental information. Empirical results suggest that although accuracy suffers from the lack of pipe-level details, it is still possible to create a model that provides useful information for prioritization of high-risk regions for management.
Acknowledgments
The authors would like to thank the University of Michigan for funding this research. The opinions and views expressed are those of the researchers and do not necessarily reflect those of the sponsors.
Copyright
©2019 American Society of Civil Engineers.
History
Received: Jun 28, 2017
Accepted: Oct 22, 2018
Published online: Feb 27, 2019
Published in print: Jun 1, 2019
Discussion open until: Jul 27, 2019