Natural Language Navigation for Robotic Systems: Integrating GPT and Dense Captioning Models with Object Detection in Autonomous Inspections
Publication: Construction Research Congress 2024
ABSTRACT
Autonomous Unmanned Aerial Vehicles (UAVs) are rapidly transforming industries that require inspection and surveillance. However, conventional UAV systems often require complex control schemes and lack adaptability, which limits their efficacy in variable environments such as indoor inspections. This paper introduces a system that integrates Generative Pretrained Transformer (GPT) models and dense captioning models for autonomous navigation and fault detection in indoor environments. The approach allows the drone to interpret and respond to natural language commands with human-like flexibility, making it more accessible and user-friendly. At the same time, the drone uses object dictionaries derived from dense captioning of its captured images to build a richer understanding of its surroundings. These capabilities equip the drone to adapt its behavior and handle unexpected scenarios, improving the efficiency and accuracy of indoor inspections. This research contributes to making building inspections more user-friendly and accessible to a broader user base.
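The abstract does not give implementation details, so the following is only a minimal sketch of the idea of an object dictionary built from dense captions and queried by a natural language command. The `Region` structure, the function names, the `min_score` threshold, and the sample captions are all hypothetical. Simple keyword overlap stands in for the GPT model that interprets commands in the described system; it is not the authors' method.

```python
from dataclasses import dataclass

@dataclass
class Region:
    """Hypothetical dense-captioning output for one image region."""
    caption: str   # free-text description of the region
    box: tuple     # (x_min, y_min, x_max, y_max) in pixels
    score: float   # captioner confidence in [0, 1]

# Command words that carry no information about the target object (assumed list).
STOPWORDS = {"a", "an", "the", "to", "and", "of", "on", "in", "at",
             "go", "fly", "inspect", "check"}

def tokenize(text):
    """Lowercase the text, strip basic punctuation, and drop stopwords."""
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    return {word for word in cleaned.split() if word not in STOPWORDS}

def build_object_dictionary(regions, min_score=0.5):
    """Map each sufficiently confident caption to the boxes where it was seen."""
    objects = {}
    for region in regions:
        if region.score < min_score:
            continue  # skip low-confidence captions
        objects.setdefault(region.caption.lower(), []).append(region.box)
    return objects

def select_target(command, objects):
    """Return the caption sharing the most words with the command, or None."""
    command_words = tokenize(command)
    best_caption, best_overlap = None, 0
    for caption in objects:
        overlap = len(command_words & tokenize(caption))
        if overlap > best_overlap:
            best_caption, best_overlap = caption, overlap
    return best_caption

# Illustrative data only; not taken from the paper.
regions = [
    Region("white air vent on ceiling", (40, 10, 200, 90), 0.91),
    Region("exposed pipe along wall", (220, 60, 620, 140), 0.84),
    Region("blurry shape", (0, 0, 10, 10), 0.20),
]
objects = build_object_dictionary(regions)
target = select_target("Fly to the air vent and inspect it", objects)
print(target, objects[target])
```

In a real pipeline the selected caption and its bounding box would be handed to the navigation layer; that step depends on the drone platform and is left out here.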
Published online: Mar 18, 2024
ASCE Technical Topics:
- Aerospace engineering
- Aircraft and spacecraft
- Automation and robotics
- Business management
- Construction engineering
- Construction management
- Engineering fundamentals
- Equipment and machinery
- Geomatics
- Industries
- Inspection
- Integrated systems
- Navigation (geomatic)
- Organizations
- Practice and Profession
- Systems engineering
- Systems management
- Unmanned vehicles