Abstract

Advances in video technology have made it possible for instructors to provide students with applied knowledge of construction practice. While videos can stimulate students’ interest in the construction domain by providing opportunities to observe real-life construction work, they can also contain extraneous information that distracts learners from essential learning content. Computer vision techniques can be used to detect important learning concepts in videos and direct learners’ focus to them. This study presents the design and usability evaluation of an annotated video-based learning environment intended to direct learners’ attention to significant learning content. A Faster R-CNN model with a VGG16 backbone was trained on 21,595 images to detect practice concepts within the videos. A Visual Translational Embedding Network was trained with the object detector and 8,004 images to predict interactions between the subjects and objects of the practice concepts. The object detection model detected all instances of subjects and objects, making it sufficient for interaction detection. Usability evaluation was conducted using questionnaires, verbal feedback, and eye-tracking data. The evaluation revealed that cues such as bounding boxes, texts, and color highlights drew learners’ attention to the practice concepts; however, students allocated more attention to the signaled images than to the texts. The study contributes to dual-coding theory and the cognitive theory of multimedia learning through the use of cues to select, organize, and direct learners’ attention to noteworthy information within videos. It also provides insights into the key features of cues that can facilitate learning construction practice concepts from videos.
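To make the detection and interaction-prediction pipeline concrete, the minimal sketch below shows one way to wire a Faster R-CNN detector with a VGG16 backbone in torchvision and to score subject–predicate–object interactions with a VTransE-style translational head. This is not the authors’ released code; the class and predicate counts, module names, and dimensions are illustrative placeholders, and only the architecture wiring is shown, not the training on the 21,595 detection images or 8,004 interaction images reported in the study.

```python
# Hedged sketch: Faster R-CNN with a VGG16 backbone (torchvision) plus a
# VTransE-style translational relation scorer. Placeholder values are marked.
import torch
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

# VGG16 convolutional features act as the backbone; torchvision's FasterRCNN
# only requires the backbone to expose an `out_channels` attribute.
backbone = torchvision.models.vgg16(weights="DEFAULT").features
backbone.out_channels = 512  # channels of VGG16's last conv block

anchor_generator = AnchorGenerator(
    sizes=((32, 64, 128, 256, 512),),
    aspect_ratios=((0.5, 1.0, 2.0),),
)
roi_pooler = MultiScaleRoIAlign(featmap_names=["0"], output_size=7, sampling_ratio=2)

detector = FasterRCNN(
    backbone,
    num_classes=11,  # placeholder: subject/object classes + background
    rpn_anchor_generator=anchor_generator,
    box_roi_pool=roi_pooler,
)


class TranslationalRelation(torch.nn.Module):
    """VTransE-style relation head (illustrative): project detected subject
    and object features into a shared relation space and treat each
    predicate as a translation vector, so that subject + predicate ≈ object."""

    def __init__(self, feat_dim=512, rel_dim=128, num_predicates=6):
        super().__init__()
        self.subj_proj = torch.nn.Linear(feat_dim, rel_dim)
        self.obj_proj = torch.nn.Linear(feat_dim, rel_dim)
        self.predicates = torch.nn.Embedding(num_predicates, rel_dim)

    def forward(self, subj_feat, obj_feat):
        # Score each predicate by how well it explains the translation from
        # the subject embedding to the object embedding.
        translation = self.obj_proj(obj_feat) - self.subj_proj(subj_feat)
        return translation @ self.predicates.weight.T  # (N, num_predicates)
```

In use, pooled box features for each detected subject–object pair would be passed through `TranslationalRelation` to rank candidate interaction predicates; the highest-scoring predicate then drives the cue (bounding box, text, and color highlight) overlaid on the video.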

Data Availability Statement

All data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.


Information & Authors

Published In

Journal of Computing in Civil Engineering
Volume 37, Issue 6, November 2023

History

Received: Oct 20, 2022
Accepted: Feb 28, 2023
Published online: Aug 22, 2023
Published in print: Nov 1, 2023
Discussion open until: Jan 22, 2024

Authors

Affiliations

Ph.D. Candidate, Myers Lawson School of Construction, Virginia Tech, 1345 Perry St., Blacksburg, VA 24061. ORCID: https://orcid.org/0000-0003-2795-6195. Email: [email protected]
Associate Professor, Myers Lawson School of Construction, Virginia Tech, 1345 Perry St., Blacksburg, VA 24061 (corresponding author). ORCID: https://orcid.org/0000-0001-9145-4865. Email: [email protected]
Assistant Professor, Myers Lawson School of Construction, Virginia Tech, 1345 Perry St., Blacksburg, VA 24061. ORCID: https://orcid.org/0000-0002-3531-8137. Email: [email protected]
Assistant Professor, Dept. of Engineering Education, Virginia Tech, 635 Prices Fork Rd., Blacksburg, VA 24061. ORCID: https://orcid.org/0000-0003-3849-2947. Email: [email protected]
Assistant Professor, Myers Lawson School of Construction, Virginia Tech, 1345 Perry St., Blacksburg, VA 24061. ORCID: https://orcid.org/0000-0002-8110-8683. Email: [email protected]
