Chapter
Mar 18, 2024

Autonomous Detection and Assessment of Indoor Building Defects Using Multimodal Learning and GPT

Publication: Construction Research Congress 2024

ABSTRACT

Buildings deteriorate over their service life. Early detection of defects such as cracking, spalling, corrosion, and moisture supports preventive maintenance of building systems. Autonomous robotic systems have considerable potential for automating indoor building defect inspections, but they face the challenges of inaccurate prediction and unorganized information. Using state-of-the-art multimodal learning methods and large language model (LLM) techniques, we present a workflow composed of image captioning, landmark documentation, and real-time, on-site, human–machine interactive path planning. Compared with previous vision-and-language navigation (VLN) algorithms, our workflow introduces defect prompts to improve captioning performance for indoor inspection. These defect features are extracted by YOLO (You Only Look Once) v5, a PyTorch-based deep learning model. Because the robotic system recognizes its environment clearly, inspectors can provide target-oriented instructions to control the survey path. Using the large language model GPT-3, spoken and textual instructions are passed to the robotic localization system, and a brief inspection report is summarized. In this way, with the assistance of GPT, inspections that previously demanded substantial effort can be conducted efficiently.
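
As a rough illustration of the defect-prompt idea described in the abstract, the sketch below turns a list of detector outputs into a short text prompt that could be passed to a captioning model. It does not reproduce the authors' implementation: the `Detection` record, the `build_defect_prompt` helper, the 0.5 confidence threshold, and the prompt wording are all assumptions made for illustration, and the detections are hard-coded stand-ins for YOLOv5 output.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detector output (hypothetical record, not a YOLOv5 type)."""
    label: str                      # defect class, e.g. "crack"
    confidence: float               # score in [0, 1]
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def build_defect_prompt(detections: List[Detection],
                        min_confidence: float = 0.5) -> str:
    """Summarize confident detections as a text prompt.

    The threshold and sentence format are illustrative assumptions.
    """
    kept = [d for d in detections if d.confidence >= min_confidence]
    if not kept:
        return "No defects detected."
    # Count instances per defect class, preserving first-seen order.
    counts = {}
    for d in kept:
        counts[d.label] = counts.get(d.label, 0) + 1
    parts = [f"{n} x {label}" for label, n in counts.items()]
    return "Detected defects: " + ", ".join(parts) + "."


if __name__ == "__main__":
    sample = [
        Detection("crack", 0.90, (10, 20, 110, 60)),
        Detection("crack", 0.70, (200, 40, 260, 90)),
        Detection("moisture", 0.30, (0, 0, 50, 50)),   # below threshold
        Detection("spalling", 0.60, (300, 300, 380, 360)),
    ]
    print(build_defect_prompt(sample))
    # prints: Detected defects: 2 x crack, 1 x spalling.
```

In a full pipeline, the resulting string would be supplied alongside the image so that the caption mentions the detected defects; how the prompt is actually injected into the captioning model is not stated in the abstract.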

Information & Authors

Information

Published In

Construction Research Congress 2024
Pages: 1001 - 1009

History

Published online: Mar 18, 2024

Authors

Affiliations

1 Dept. of Civil, Construction, and Environmental Engineering, Univ. of Alabama, Tuscaloosa, AL.
Kaiwen Chen
2 Assistant Professor, Dept. of Civil, Construction, and Environmental Engineering, Univ. of Alabama, Tuscaloosa, AL.
