Chapter
Mar 18, 2024

Natural Language Navigation for Robotic Systems: Integrating GPT and Dense Captioning Models with Object Detection in Autonomous Inspections

Publication: Construction Research Congress 2024

ABSTRACT

Autonomous unmanned aerial vehicles (UAVs) are rapidly transforming industries that require inspection and surveillance. However, conventional UAV systems often require complex control schemes and lack adaptability, which limits their efficacy in variable environments such as indoor inspections. This paper introduces a system that integrates Generative Pretrained Transformer (GPT) models and dense captioning models for autonomous navigation and fault detection in indoor environments. The approach allows the drone to interpret and respond to natural language commands, making it more accessible and user-friendly. At the same time, the drone uses object dictionaries derived from dense captioning of its captured images to build an understanding of its surroundings. These capabilities equip the drone to adapt its behavior and handle unexpected scenarios, improving the efficiency and accuracy of indoor inspections. This research contributes to making building inspections more user-friendly and accessible to a broader user base.
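The abstract does not specify how the object dictionary is structured or how it reaches the language model, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes a dense captioning model returns a caption and a pixel bounding box per region, groups captions by coarse horizontal position, and formats them with the operator's command into a text prompt. The names `Region`, `build_object_dictionary`, and `build_prompt`, the left/center/right binning, and the action vocabulary are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Region:
    """One assumed dense-captioning result: a caption plus a pixel box."""
    caption: str
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def build_object_dictionary(regions: List[Region], image_width: int) -> Dict[str, List[str]]:
    """Group captions by coarse horizontal position (hypothetical binning)."""
    objects = defaultdict(list)
    for region in regions:
        x_min, _, x_max, _ = region.box
        center_x = (x_min + x_max) / 2
        if center_x < image_width / 3:
            position = "left"
        elif center_x < 2 * image_width / 3:
            position = "center"
        else:
            position = "right"
        objects[position].append(region.caption)
    return dict(objects)


def build_prompt(command: str, objects: Dict[str, List[str]]) -> str:
    """Combine the operator's command and the object dictionary into one prompt."""
    lines = [f"Operator command: {command}", "Objects currently in view:"]
    for position in ("left", "center", "right"):
        for caption in objects.get(position, []):
            lines.append(f"- {position}: {caption}")
    # The action vocabulary below is an assumption made for illustration.
    lines.append("Reply with one action: FORWARD, TURN_LEFT, TURN_RIGHT, or STOP.")
    return "\n".join(lines)


# Example input standing in for real dense-captioning output.
regions = [
    Region("a red fire extinguisher", (10, 40, 90, 200)),
    Region("a white door", (500, 0, 630, 400)),
]
objects = build_object_dictionary(regions, image_width=640)
prompt = build_prompt("fly to the door", objects)
```

The resulting `prompt` string could then be sent to a language model of choice. Parsing the reply and translating it into flight commands is outside the scope of this sketch.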



Information & Authors

Information

Published In

Construction Research Congress 2024
Pages: 972–980

History

Published online: Mar 18, 2024

Authors

Affiliations

Nilay R. Choudhury
1 Dept. of Mechanical Engineering, Univ. of Alabama, Tuscaloosa, AL.
2 Dept. of Civil, Construction, and Environmental Engineering, Univ. of Alabama, Tuscaloosa, AL.
Kaiwen Chen
3 Dept. of Civil, Construction, and Environmental Engineering, Univ. of Alabama, Tuscaloosa, AL. ORCID: https://orcid.org/0000-0003-4330-5376.
