Technical Papers
Jun 24, 2022

Synthetic Image Dataset Development for Vision-Based Construction Equipment Detection

Publication: Journal of Computing in Civil Engineering
Volume 36, Issue 5

Abstract

This paper presents a systematic method for creating universally applicable synthetic training image datasets for computer vision-based construction object detection. Synthetic images, created by inserting a virtual object of interest into a real site image, minimize the time and effort required for training image collection and annotation. In addition, synthetic training images can be easily customized for a target construction site by considering the context of the site (e.g., different background scenes, camera positions, and angles) and the possible variability of the target objects to be detected (e.g., different sizes, locations, rotation angles, and postures). The automated approach proposed in this study systematically creates such synthetic images using the Unity game engine, in which context- and variability-related parameters can be controlled. The proposed method was validated by training a deep learning-based object detection algorithm [i.e., a faster region-based convolutional neural network (Faster R-CNN) model] with synthetic images and testing it on real images from earthwork construction sites to detect an excavator. The models trained with synthetic images achieved average precision values above 90%; in particular, the classifier trained on synthetic images outperformed the one trained on real site images. The detection results also demonstrated an improved capability to capture the high appearance variability of construction objects when context customization and variability randomization techniques were applied. These findings demonstrate the feasibility and practicality of using synthetic images for vision-based approaches in the construction domain. Ultimately, the proposed approach serves as an alternative way to build comprehensive image datasets of construction entities, facilitating vision-based research in construction.
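
To make the generation step concrete, the following is a minimal, hypothetical Python sketch of the core idea: compositing a cut-out of the object of interest onto a real site background while randomizing its size, location, and rotation, and recording the bounding box so the label comes for free. The paper itself renders 3D equipment models in the Unity game engine; this 2D paste-based analogue, along with all file paths and parameter ranges, is an illustrative assumption, not the authors' implementation.

```python
# Hypothetical sketch (not the paper's Unity pipeline): paste an RGBA
# cut-out of an excavator onto a real background image at randomized
# scale, position, and rotation, and record bounding-box labels.
import random
from pathlib import Path

from PIL import Image  # pip install pillow


def generate_synthetic_images(background_path, cutout_path, out_dir,
                              n_samples=100, seed=0):
    rng = random.Random(seed)
    background = Image.open(background_path).convert("RGB")
    cutout = Image.open(cutout_path).convert("RGBA")  # transparent outside object
    bg_w, bg_h = background.size
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    labels = []  # (filename, (x_min, y_min, x_max, y_max))
    for i in range(n_samples):
        # Variability randomization: object size, in-plane rotation, location.
        scale = rng.uniform(0.2, 0.5)  # object width relative to frame width
        w = int(bg_w * scale)
        h = int(w * cutout.height / cutout.width)
        inst = cutout.resize((w, h)).rotate(rng.uniform(-15, 15), expand=True)

        x = rng.randint(0, max(0, bg_w - inst.width))
        y = rng.randint(0, max(0, bg_h - inst.height))

        frame = background.copy()
        frame.paste(inst, (x, y), inst)  # alpha channel masks the silhouette

        name = f"synthetic_{i:04d}.jpg"
        frame.save(Path(out_dir) / name)
        # Box is the extent of the pasted patch (loose after rotation).
        labels.append((name, (x, y, x + inst.width, y + inst.height)))
    return labels
```

On the training side, the paper fine-tunes a Faster R-CNN detector on such images. A common way to set that up, assuming torchvision (the paper does not tie itself to a specific framework), looks like this:

```python
# Hypothetical sketch: a two-class (background + excavator) Faster R-CNN,
# assuming torchvision; the paper does not specify its training framework.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor


def build_detector(num_classes=2):
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model
```

In the paper's Unity-based pipeline, the same context parameters (background scene, camera position and angle) and variability parameters (size, location, rotation, posture) are controlled in 3D before rendering, which is what allows the resulting dataset to be customized to a target site.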

Data Availability Statement

Some data, models, or code that support the findings of this study are available from the corresponding author on reasonable request, including (1) samples of the image data used in the experiment; and (2) related information about the object detection model and code (Unity).

Acknowledgments

This research study was supported by a grant (20CTAP-C151784-02) from the Technology Advancement Research Program funded by the Ministry of Land, Infrastructure and Transport of the Korean government, and by the Early Career Scheme (PolyU 25210917) from the Research Grants Council, Hong Kong.


Information & Authors

Published In

Journal of Computing in Civil Engineering
Volume 36, Issue 5, September 2022

History

Received: Aug 13, 2020
Accepted: Mar 30, 2022
Published online: Jun 24, 2022
Published in print: Sep 1, 2022
Discussion open until: Nov 24, 2022

Authors

Affiliations

Assistant Professor, Division of Architecture, College of Engineering, Sun Moon Univ., Asan, Chungcheongnam-do 31460, Republic of Korea. ORCID: https://orcid.org/0000-0002-0006-2653. Email: [email protected]
Jeongbin Hwang [email protected]
Graduate Research Assistant, Dept. of Civil and Environmental Engineering, Seoul National Univ., 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea. Email: [email protected]
Associate Professor, Dept. of Civil and Environmental Engineering, Seoul National Univ., 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea; Adjunct Professor, Institute of Construction and Environmental Engineering, Seoul National Univ., 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Republic of Korea. ORCID: https://orcid.org/0000-0002-0409-5268. Email: [email protected]
Associate Professor, Dept. of Building and Real Estate, Hong Kong Polytechnic Univ., Kowloon, Hong Kong (corresponding author). Email: [email protected]

Cited by

  • Requirements for Parametric Design of Physics-Based Synthetic Data Generation for Learning and Inference of Defect Conditions. In Construction Research Congress 2024, 436–445. https://doi.org/10.1061/9780784485262.045 (2024).
