Hand Signal Recognition of Workers on Construction Sites Using Deep Learning Networks
Publication: Computing in Civil Engineering 2021
ABSTRACT
Hand signals are a common means of communication on construction sites because they are simple and effective. However, they are not always noticed in time or interpreted correctly in the field, which can lead to worker injuries and fatalities as well as work interruptions and stoppages. This paper investigates whether construction hand signals can be captured and interpreted automatically with deep learning networks. A new data set containing 11 classes of hand signals for instructing tower crane operations was created under different scenes. The data set is used to compare the hand signal recognition performance of two state-of-the-art 3D convolutional neural networks (CNNs), namely ResNeXt-101 and Res3D+ConvLSTM+MobileNet. The comparison indicates that a high classification accuracy (99.0%) and a short inference time (0.21 s/gesture) can be achieved, which illustrates the feasibility of using deep learning networks for hand signal recognition in construction.
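The two figures reported in the abstract, classification accuracy and inference time per gesture, could be measured along the following lines. This is only a minimal sketch and not the authors' evaluation code: the `evaluate` function, the `classify` callable, and the integer class labels are hypothetical names introduced for illustration, and any trained network that maps one video clip to one of the 11 class labels is assumed to be wrapped as `classify`.

```python
import time
from typing import Any, Callable, Sequence, Tuple


def evaluate(
    classify: Callable[[Any], int],
    clips: Sequence[Any],
    labels: Sequence[int],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Return (accuracy, mean seconds per gesture) over a labeled test set.

    `classify` is a hypothetical wrapper around a trained model: it takes
    one gesture clip and returns a predicted class label.
    """
    if not clips or len(clips) != len(labels):
        raise ValueError("clips and labels must be non-empty and equal in length")

    correct = 0
    elapsed = 0.0
    for clip, label in zip(clips, labels):
        start = clock()
        predicted = classify(clip)       # time only the inference call
        elapsed += clock() - start
        correct += int(predicted == label)

    n = len(clips)
    return correct / n, elapsed / n


if __name__ == "__main__":
    # Stand-in classifier for demonstration only: it treats each "clip"
    # as an integer and maps it onto one of 11 class labels (0..10).
    def dummy(clip: int) -> int:
        return clip % 11

    accuracy, seconds_per_gesture = evaluate(dummy, list(range(22)), [i % 11 for i in range(22)])
    print(f"accuracy={accuracy:.3f}, time={seconds_per_gesture:.6f} s/gesture")
```

Timing only the `classify` call keeps data loading out of the per-gesture figure. Whether the reported 0.21 s/gesture includes preprocessing is not stated in the abstract, so that choice is an assumption of this sketch.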
Published online: May 24, 2022