Practical Considerations in Statistical Modeling of Count Data for Infrastructure Systems
Publication: Journal of Infrastructure Systems
Volume 15, Issue 3
Abstract
Count data arise in many infrastructure assessment problems, such as modeling traffic accidents, pipe breaks in water distribution systems, and electric power outages. A common goal in these problems is to model the number of future occurrences of an event of interest based on past data. The past data are usually highly variable, but a considerable amount of other information is often available to help inform the models. A number of statistical models have been proposed and used for modeling count data in infrastructure assessment, including linear regression and generalized linear models. This paper summarizes these approaches and their past uses in infrastructure assessment. It then gives an overview of a class of models called generalized additive models, which can flexibly incorporate nonlinear relationships between explanatory variables and counts of events. Throughout the paper, the focus is on the practical usefulness of the models, and an actual data set is used to demonstrate each of them.
Copyright
© 2009 ASCE.
History
Received: Jan 10, 2007
Accepted: Sep 2, 2008
Published online: Aug 14, 2009
Published in print: Sep 2009