Technical Papers
Apr 28, 2023

Deep Learning-Based RGB-D Fusion for Multimodal Condition Assessment of Civil Infrastructure

Publication: Journal of Computing in Civil Engineering
Volume 37, Issue 4

Abstract

Recent advances in computer vision and deep learning have broadened the scope of vision-based autonomous condition assessment of civil infrastructure. However, a review of the available literature suggests that most existing vision-based inspection techniques rely only on color information, owing to the ready availability of inexpensive high-resolution cameras. Conventional cameras project a 3D scene onto a 2D image plane, discarding distance and scale information, which prevents vision-based techniques from realizing their full potential. In this regard, the structural health monitoring community has yet to benefit from the new opportunities that commercially available low-cost depth sensors offer. This study fills this knowledge gap by incorporating depth fusion into an encoder-decoder-based semantic segmentation model. Advanced computer graphics approaches are exploited to generate a database of paired RGB and depth images representing damage categories commonly observed in reinforced concrete (RC) buildings, namely, spalling, spalling with exposed rebars, and severely buckled rebars. A number of encoding techniques are explored for representing the depth data. Additionally, various schemes for the data-level, feature-level, and decision-level fusion of RGB and depth data are investigated to identify the best fusion strategy. Overall, feature-level fusion was found to be the most effective, improving the performance of deep learning-based damage segmentation algorithms by up to 25% without any appreciable increase in computation time. Moreover, a novel volumetric damage quantification approach is introduced that is robust against perspective distortion. These contributions are expected to advance the frontiers of infrastructure resilience and maintenance.
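
To make the fusion terminology concrete, the sketch below illustrates feature-level fusion of RGB and depth streams in an encoder-decoder segmentation network. It is a minimal illustration under stated assumptions, not the paper's architecture: the two-stream encoder, element-wise summation as the fusion operator (in the spirit of FuseNet), the channel widths, and the four-class output (three damage categories plus background, assumed here) are all choices made for this example.

```python
# Minimal sketch of feature-level RGB-D fusion for semantic segmentation.
# Illustrative only, NOT the paper's implementation: backbone depth, channel
# widths, summation fusion, and class count are assumptions for this example.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, then 2x2 max pooling (halves size)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


class FeatureLevelFusionNet(nn.Module):
    """Two-stream encoder (RGB + depth) whose feature maps are summed at
    matching scales before a shared decoder; this is the generic idea of
    feature-level fusion described in the abstract."""

    def __init__(self, n_classes=4):  # background + 3 damage classes (assumed)
        super().__init__()
        self.rgb_enc = nn.ModuleList([conv_block(3, 64), conv_block(64, 128)])
        self.depth_enc = nn.ModuleList([conv_block(1, 64), conv_block(64, 128)])
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(inplace=True),
            nn.Conv2d(32, n_classes, 1),  # per-pixel class logits
        )

    def forward(self, rgb, depth):
        f_rgb, f_d = rgb, depth
        for enc_rgb, enc_d in zip(self.rgb_enc, self.depth_enc):
            f_rgb, f_d = enc_rgb(f_rgb), enc_d(f_d)
            f_rgb = f_rgb + f_d  # fuse depth features into the RGB stream
        return self.decoder(f_rgb)


if __name__ == "__main__":
    model = FeatureLevelFusionNet()
    rgb = torch.randn(1, 3, 128, 128)    # color image
    depth = torch.randn(1, 1, 128, 128)  # single-channel encoded depth map
    print(model(rgb, depth).shape)       # -> torch.Size([1, 4, 128, 128])
```

By contrast, data-level fusion would simply concatenate depth as a fourth input channel, and decision-level fusion would run two independent networks and merge their per-pixel class scores; the abstract reports that the feature-level variant performed best.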

Data Availability Statement

All data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

This study was supported in part by funding from Bentley Systems, Inc.

Information & Authors

Published In

Journal of Computing in Civil Engineering
Volume 37, Issue 4, July 2023

History

Received: Sep 26, 2022
Accepted: Mar 2, 2023
Published online: Apr 28, 2023
Published in print: Jul 1, 2023
Discussion open until: Sep 28, 2023

Authors

Affiliations

Postdoctoral Fellow, Dept. of Civil, Architectural and Environmental Engineering, Missouri Univ. of Science and Technology, Rolla, MO 65409 (corresponding author). ORCID: https://orcid.org/0000-0003-2091-7046. Email: [email protected]
Associate Professor, Lyles School of Civil Engineering, Elmore Family School of Electrical and Computer Engineering, Purdue Univ., West Lafayette, IN 47907. ORCID: https://orcid.org/0000-0001-6583-3087
Zheng Yi Wu
Director of Applied Research, Bentley Systems, Inc., 27 Siemon Company Dr., Suite 200W, Watertown, CT 06795.

Metrics & Citations

Metrics

Citations

Download citation

If you have the appropriate software installed, you can download article citation data to the citation manager of your choice. Simply select your manager software from the list below and click Download.

View Options

Get Access

Access content

Please select your options to get access

Log in/Register Log in via your institution (Shibboleth)
ASCE Members: Please log in to see member pricing

Purchase

Save for later Information on ASCE Library Cards
ASCE Library Cards let you download journal articles, proceedings papers, and available book chapters across the entire ASCE Library platform. ASCE Library Cards remain active for 24 months or until all downloads are used. Note: This content will be debited as one download at time of checkout.

Terms of Use: ASCE Library Cards are for individual, personal use only. Reselling, republishing, or forwarding the materials to libraries or reading rooms is prohibited.
ASCE Library Card (5 downloads)
$105.00
Add to cart
ASCE Library Card (20 downloads)
$280.00
Add to cart
Buy Single Article
$35.00
Add to cart

Get Access

Access content

Please select your options to get access

Log in/Register Log in via your institution (Shibboleth)
ASCE Members: Please log in to see member pricing

Purchase

Save for later Information on ASCE Library Cards
ASCE Library Cards let you download journal articles, proceedings papers, and available book chapters across the entire ASCE Library platform. ASCE Library Cards remain active for 24 months or until all downloads are used. Note: This content will be debited as one download at time of checkout.

Terms of Use: ASCE Library Cards are for individual, personal use only. Reselling, republishing, or forwarding the materials to libraries or reading rooms is prohibited.
ASCE Library Card (5 downloads)
$105.00
Add to cart
ASCE Library Card (20 downloads)
$280.00
Add to cart
Buy Single Article
$35.00
Add to cart

Media

Figures

Other

Tables

Share

Share

Copy the content Link

Share with email

Email a colleague

Share