Improving Subsurface Asset Failure Predictions for Utility Operators: A Unique Case Study on Cable and Pipe Failures Resulting from Excavation Work
Publication: ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering
Volume 6, Issue 2
Abstract
Utility operators rely on predictive analyses of the availability of their subsurface assets, which depends strongly on damage caused by an increasing amount of excavation work. However, straightforward use of standard statistical techniques, such as logistic regression or Bayesian logistic regression, does not yield accurate predictions of these rare events. This paper therefore investigates alternative approaches: weighting the likelihood, and over- and undersampling the data. These methods were found to substantially improve the accuracy of predicting rare failure events. More specifically, in an application to real data from a Dutch water utility operator, undersampling and weighting improved the balanced accuracy, with values ranging between 0.61 and 0.66, and the proposed methods produced failure predictions for between 38% and 58% of the validation data set. Hence, the proposed methods will enable utility operators to arrive at more accurate forecasts, enhancing their asset operation decision-making.
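The abstract's main technical ideas (balanced accuracy, undersampling, and weighting) can be illustrated with a minimal Python sketch. This is not the authors' code, which is available only on request. The function names, the toy data, and the inverse-frequency weighting scheme are all assumptions made for illustration, and the weighting actually used in the paper may differ.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on failures) and specificity (recall on non-failures)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (tp / pos + tn / neg)


def undersample(X, y, seed=0):
    """Randomly drop majority-class rows until both classes are equally frequent."""
    rng = random.Random(seed)
    pos_idx = [i for i, label in enumerate(y) if label == 1]
    neg_idx = [i for i, label in enumerate(y) if label == 0]
    minority, majority = sorted((pos_idx, neg_idx), key=len)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [X[i] for i in kept], [y[i] for i in kept]


def class_weights(y):
    """Inverse-frequency weights (an assumed scheme): each class gets equal total weight."""
    n = len(y)
    pos = sum(y)
    return {1: n / (2 * pos), 0: n / (2 * (n - pos))}


# Hypothetical toy data: 2 failures among 100 excavation records.
y_true = [1, 1] + [0] * 98
X = [[float(i)] for i in range(100)]

# A model that never predicts failure looks accurate but is uninformative.
always_no_failure = [0] * 100
plain_accuracy = sum(t == p for t, p in zip(y_true, always_no_failure)) / len(y_true)
bal_acc = balanced_accuracy(y_true, always_no_failure)

X_bal, y_bal = undersample(X, y_true)
weights = class_weights(y_true)
```

On this toy data the never-predict-failure model reaches a plain accuracy of 0.98 but a balanced accuracy of only 0.5, which is why plain accuracy is a poor guide for rare events. Undersampling discards non-failure records to balance the classes, whereas weighting keeps every record and scales its contribution to the likelihood instead.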
Data Availability Statement
All data and models are proprietary or confidential in nature. All statistical code used during this study is available from the corresponding author upon request.
Acknowledgments
The authors would like to thank Evides Waterbedrijf for providing the data set and its contribution during preparation of the data set used in this study. The contribution of the municipality of Rotterdam that provided the data (Rotterdam3D) is deeply appreciated.
Copyright
©2020 American Society of Civil Engineers.
History
Received: Jan 9, 2019
Accepted: Dec 12, 2019
Published online: Mar 24, 2020
Published in print: Jun 1, 2020
Discussion open until: Aug 24, 2020