
A Systematic Review of Speech Understanding Studies for Human-Robot Collaborative Construction

Publication: Computing in Civil Engineering 2023

ABSTRACT

Robots are increasingly incorporated into construction to team up with human workers and relieve them of physically demanding and hazardous work. Compared with traditional communication tools (e.g., control interfaces), speech is a natural and powerful medium for efficiently communicating workers' intents to robots without interrupting the work at hand. However, few studies have systematically investigated existing speech understanding technologies for construction applications. Following a systematic review methodology, this paper (1) gives an overview of the hot research topics of the past 10 years, (2) presents a taxonomy of speech understanding studies, and (3) discusses the research gaps and future work for deploying speech-enabled collaborative robots in noisy and unstructured construction contexts. The findings are expected to reveal development trends in speech understanding technologies, direct future research efforts, and advance human-robot collaborative construction applications.
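
To make the speech-to-intent idea concrete, below is a minimal, hypothetical sketch of a rule-based intent parser in Python. It assumes an off-the-shelf ASR system has already transcribed the worker's utterance; the RobotIntent schema, the keyword patterns, and the parse_utterance helper are illustrative inventions for this sketch, not methods taken from the reviewed studies.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class RobotIntent:
    action: str            # coarse command, e.g., "pick", "place", "stop"
    target: Optional[str]  # object phrase the action applies to, if any

# Toy keyword patterns mapping construction-site utterances to coarse commands.
PATTERNS = [
    (re.compile(r"\b(stop|halt|hold on)\b"), "stop"),
    (re.compile(r"\b(pick up|grab|lift)\b"), "pick"),
    (re.compile(r"\b(put|place|set)\b"), "place"),
]

def parse_utterance(transcript: str) -> Optional[RobotIntent]:
    """Map an ASR transcript to a coarse robot intent (rule-based sketch)."""
    text = transcript.lower()
    for pattern, action in PATTERNS:
        match = pattern.search(text)
        if match:
            # Naively treat everything after the matched phrase as the target;
            # real systems ground this phrase against scene perception.
            target = text[match.end():].strip() or None
            return RobotIntent(action=action, target=target)
    return None  # unrecognized command: the robot should ask for clarification

if __name__ == "__main__":
    print(parse_utterance("Grab the rebar bundle near the east column"))
    # RobotIntent(action='pick', target='the rebar bundle near the east column')
    print(parse_utterance("Hold on"))
    # RobotIntent(action='stop', target=None)
```

The literature surveyed in this review replaces such keyword rules with statistical or neural language understanding and grounds the target phrase against perception of the work environment, but the input/output contract is similar: a transcript goes in, a structured, executable intent comes out.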



Information & Authors

Published In

Computing in Civil Engineering 2023
Pages: 437–444

History

Published online: Jan 25, 2024

Authors

Affiliations

School of Civil Engineering, Purdue Univ., West Lafayette, IN. Email: [email protected]
Hubo Cai, Ph.D., P.E., M.ASCE, School of Civil Engineering, Purdue Univ., West Lafayette, IN. Email: [email protected]
