Technical Papers
Jul 4, 2022

AdaLN: A Vision Transformer for Multidomain Learning and Predisaster Building Information Extraction from Images

Publication: Journal of Computing in Civil Engineering
Volume 36, Issue 5

Abstract

Satellite and street view images are widely used across disciplines as a source of information for understanding the built environment. In natural hazard engineering, high-quality building inventory data sets are crucial for simulating hazard impacts and supporting decision-making. Screening the building stock to gather the information needed for simulation and to detect structural defects that make buildings vulnerable to natural hazards is a time-consuming and labor-intensive task. This paper presents an automated method for extracting building information from satellite and street view images. The method is built upon a novel transformer-based deep neural network we developed. Specifically, a multidomain learning approach is employed to develop a single compact model for multiple image-based information extraction tasks using multiple data sources (e.g., satellite and street view images). Our multidomain Vision Transformer is designed as a unified architecture that can be deployed effectively for multiple classification tasks. The effectiveness of the proposed approach is demonstrated in a case study in which pretrained models are used to collect regional-scale building information related to natural hazard risks.
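To make the multidomain idea concrete, the sketch below shows one plausible reading of the approach described in the abstract: a transformer encoder block whose attention and MLP weights are shared across all image domains, while each domain keeps its own lightweight layer-normalization parameters (a common interpretation of "AdaLN"). The abstract does not specify the exact design, so the class name, domain labels, and dimensions here are illustrative assumptions, not the paper's implementation.

# Hedged sketch (PyTorch): a shared encoder block with per-domain LayerNorm.
# All names, domains, and hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn

class MultiDomainEncoderBlock(nn.Module):
    def __init__(self, dim, num_heads, domains, mlp_ratio=4):
        super().__init__()
        # Attention and MLP weights are shared across all domains.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )
        # Each domain gets its own LayerNorm scale/shift parameters.
        self.norm1 = nn.ModuleDict({d: nn.LayerNorm(dim) for d in domains})
        self.norm2 = nn.ModuleDict({d: nn.LayerNorm(dim) for d in domains})

    def forward(self, x, domain):
        h = self.norm1[domain](x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.norm2[domain](x))
        return x

# Usage: tokens from a satellite-image task and a street-view task pass
# through the same shared weights, differing only in the normalization used.
block = MultiDomainEncoderBlock(dim=192, num_heads=3,
                                domains=["satellite", "street_view"])
tokens = torch.randn(2, 197, 192)   # (batch, patches + CLS token, dim)
out_satellite = block(tokens, domain="satellite")
out_street = block(tokens, domain="street_view")

In such a design, per-task classification heads would sit on top of the shared backbone in the same spirit: one small head per task, with the bulk of the parameters reused across domains.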


Data Availability Statement

Testing data, trained models, and the code that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

This study is based in part on work supported by the National Science Foundation under Grant No. 1612843. Opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Information & Authors

Information

Published In

Journal of Computing in Civil Engineering
Volume 36, Issue 5, September 2022

History

Received: Nov 11, 2021
Accepted: Mar 26, 2022
Published online: Jul 4, 2022
Published in print: Sep 1, 2022
Discussion open until: Dec 4, 2022


Authors

Affiliations

Yunhui Guo
Postdoctoral Scholar, International Computer Science Institute, Univ. of California, Berkeley, Berkeley, CA 94704.
Assistant Professor, M.E. Rinker, Sr. School of Construction Management, College of Design, Construction and Planning, Univ. of Florida, Gainesville, FL 32603 (corresponding author). ORCID: https://orcid.org/0000-0001-8534-9276. Email: [email protected]
Director, International Computer Science Institute, Univ. of California, Berkeley, Berkeley, CA 94704. ORCID: https://orcid.org/0000-0002-3507-5761
Frank McKenna
Research Engineer, Dept. of Civil and Environmental Engineering, Univ. of California, Berkeley, Berkeley, CA 94720.
Kincho H. Law, F.ASCE
Professor, Dept. of Civil and Environmental Engineering, Stanford Univ., Stanford, CA 94305.

Metrics & Citations


Cited by

  • A GAN-Augmented CNN Approach for Automated Roadside Safety Assessment of Rural Roadways, Journal of Computing in Civil Engineering, 10.1061/JCCEE5.CPENG-5406, 38, 2, (2024).
  • Building Façade Style Classification from UAV Imagery Using a Pareto-Optimized Deep Learning Network, Electronics, 10.3390/electronics11213450, 11, 21, (3450), (2022).
