Binary Building Attribute Imputation, Evaluation, and Comparison Approaches for Hurricane Damage Data Sets
Publication: Journal of Performance of Constructed Facilities
Volume 34, Issue 3
Abstract
Missing building attributes are problematic for the development of data-based fragility models. Relative to other disciplines, the application of imputation techniques is limited in the field of engineering. Current imputation techniques used to replace missing building attributes lack evaluations of imputation model performance, which are needed to ensure the accuracy and validity of the imputed data. This paper presents two imputation approaches, along with imputation diagnostic and comparison approaches, for binary building attribute data with missing observations. Predictive mean matching (PMM) and multiple imputation (MI) are used to impute the foundation type and number of stories attributes. The diagnostic approach, based on the logistic regression goodness-of-fit test, is used to evaluate the imputation model fit. The comparison approach, based on the percentage of correctly imputed observations, is used to evaluate the imputation model performance. A data set of single-family homes damaged by Hurricane Katrina in 2005 is used to demonstrate implementation of the methodology. Based on the comparison approach, PMM models showed 9% and 2% greater accuracy than MI models in imputing foundation type and number of stories, respectively.
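The comparison approach named in the abstract, the percentage of correctly imputed observations, can be illustrated with a minimal sketch. It is not the authors' code. It assumes that some known values of a binary attribute are held out, imputed, and then compared with the true values. The function name `percent_correctly_imputed`, the 0/1 coding of foundation type, and all data values below are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def percent_correctly_imputed(observed: Sequence[int], imputed: Sequence[int]) -> float:
    """Return the percentage of held-out binary values that the imputation reproduced."""
    if len(observed) != len(imputed):
        raise ValueError("observed and imputed must have the same length")
    if len(observed) == 0:
        raise ValueError("need at least one held-out observation")
    matches = sum(1 for o, i in zip(observed, imputed) if o == i)
    return 100.0 * matches / len(observed)


# Hypothetical held-out foundation types (assumed coding: 0 and 1 are the two
# categories of the binary attribute) and the values two imputation models
# produced for the same homes. None of these numbers come from the paper.
held_out = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
imputed_model_a = [1, 0, 0, 1, 0, 0, 1, 0, 1, 1]
imputed_model_b = [1, 1, 0, 1, 0, 0, 0, 0, 1, 1]

accuracy_a = percent_correctly_imputed(held_out, imputed_model_a)
accuracy_b = percent_correctly_imputed(held_out, imputed_model_b)

# Difference in accuracy between the two hypothetical models, in percentage points.
difference = accuracy_a - accuracy_b
print(f"Model A: {accuracy_a:.1f}%, Model B: {accuracy_b:.1f}%, difference: {difference:.1f}")
```

Under these assumptions, two imputation models are compared simply by the difference between their percentages on the same held-out observations.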
Data Availability Statement
Some or all data, models, or code generated or used during the study are available from the corresponding author by request, including data and code that are used to develop the imputation models.
Acknowledgments
The first author gratefully acknowledges funding from the Louisiana Board of Regents Graduate Fellowship in Engineering Grant No. LEQSF(2008-13)GF-01, the Donald W. Clayton Graduate Ph.D. Assistantship in Engineering at Louisiana State University, and the Chevron Engineering Graduate Student Fellowship at Louisiana State University. Hurricane Katrina reconnaissance videos were provided by MCEER.
Copyright
©2020 American Society of Civil Engineers.
History
Received: Apr 30, 2019
Accepted: Nov 6, 2019
Published online: Mar 27, 2020
Published in print: Jun 1, 2020
Discussion open until: Aug 27, 2020