TableGraph: An Image Segmentation–Based Table Knowledge Interpretation Model for Civil and Construction Inspection Documentation
Publication: Journal of Construction Engineering and Management
Volume 148, Issue 10
Abstract
Many manuals and codes standardize procedures in civil and construction engineering projects. Data tables in these codes offer a variety of reference information and play an increasingly valuable role in knowledge management. However, existing research has focused on detecting regular table structures. For nonconventional tables, especially nested tables, there is no efficient method for automatic interpretation. This paper proposes an automatic table knowledge interpretation model (TableGraph) that extracts table data from table images and then transforms the data into table cell graphs to facilitate table information querying. TableGraph treats a table image as composed of three semantic pixel classes: background, table border, and table cell contents. Because TableGraph considers only the semantic meaning of pixels rather than structural rules or form features, it can handle nonconventional and complex nested tables. In addition, a cross-hit algorithm was designed to enable fast content queries on the generated table cell graphs. The model is validated on a real case of automatic interpretation of table data from an inspection manual. The results show that the proposed TableGraph model can interpret the structure and contents of table images.
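The pipeline outlined in the abstract (pixel classes, then cells, then a cell graph, then a query) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify how cells are recovered from the segmentation or how the cross-hit algorithm works, so the sketch assumes that a cell is any region fully enclosed by border pixels, and that a cross-hit query intersects the cells to the right of a row header with the cells below a column header. The segmentation network is omitted; the mask is synthetic, the cell texts are invented for illustration, and all function names (`grid_mask`, `find_cells`, `build_graph`, `cross_hit`) are hypothetical.

```python
from collections import deque

# The three semantic pixel classes named in the abstract.
BACKGROUND, BORDER, CONTENT = 0, 1, 2


def grid_mask(rows, cols, size=2):
    """Synthetic mask (assumption): regular grid, 1-pixel borders, 1-pixel margin."""
    step = size + 1
    h, w = rows * step + 3, cols * step + 3
    mask = [[BACKGROUND] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if (y - 1) % step == 0 or (x - 1) % step == 0:
                mask[y][x] = BORDER
    return mask


def find_cells(mask):
    """Flood-fill non-border pixels; regions that never reach the image edge are cells.

    Returns bounding boxes (top, left, bottom, right) in raster-scan order.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or mask[r][c] == BORDER:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            touches_edge = False
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                if y in (0, h - 1) or x in (0, w - 1):
                    touches_edge = True  # page background, not a table cell
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and mask[ny][nx] != BORDER):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if not touches_edge:
                boxes.append((top, left, bottom, right))
    return boxes


def build_graph(boxes):
    """Link each cell to every cell to its right (same rows) or below it (same columns)."""
    graph = {i: {"right": [], "down": []} for i in range(len(boxes))}
    for i, (t1, l1, b1, r1) in enumerate(boxes):
        for j, (t2, l2, b2, r2) in enumerate(boxes):
            if i == j:
                continue
            if l2 > r1 and t2 <= b1 and b2 >= t1:
                graph[i]["right"].append(j)
            if t2 > b1 and l2 <= r1 and r2 >= l1:
                graph[i]["down"].append(j)
    return graph


def cross_hit(graph, row_header, col_header):
    """Assumed reading of a cross-hit query: cells right of the row header AND below the column header."""
    return sorted(set(graph[row_header]["right"]) & set(graph[col_header]["down"]))


mask = grid_mask(3, 3)
mask[2][2] = CONTENT  # content pixels sit inside a cell and do not split it
boxes = find_cells(mask)
graph = build_graph(boxes)

# Invented cell texts, in the same raster order as `boxes`.
texts = ["", "Min", "Max", "Depth", "10", "20", "Width", "30", "40"]
hits = cross_hit(graph, texts.index("Width"), texts.index("Max"))
print([texts[i] for i in hits])  # prints ['40']
```

Because the relations are derived from pixel regions rather than from a fixed row and column grid, a header cell that spans several columns would simply be linked to every cell beneath it. In practice, the hand-written flood fill would likely be replaced by a library connected-component routine, and the cell texts would come from an OCR step.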
Data Availability Statement
Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgments
This work was jointly supported by the National Natural Science Foundation of China (Grant No. 51978677) and the Shenzhen Science and Technology Innovation Committee Grant (Grant No. JCYJ20180507181647320). The conclusions herein are those of the authors and do not necessarily reflect the views of the sponsoring agencies.
Copyright
© 2022 American Society of Civil Engineers.
History
Received: Oct 19, 2021
Accepted: Apr 19, 2022
Published online: Jul 22, 2022
Published in print: Oct 1, 2022
Discussion open until: Dec 22, 2022
ASCE Technical Topics:
- Automation and robotics
- Business management
- Construction engineering
- Construction management
- Data collection
- Engineering fundamentals
- Information management
- Inspection
- Knowledge management
- Methodology (by type)
- Models (by type)
- Practice and Profession
- Research methods (by type)
- Standards and codes
- Structural models
- Systems engineering
Cited by
- Haoxi Wang, Sheng Xu, Dongdong Cui, Hong Xu, Hanbin Luo, Information Integration of Regulation Texts and Tables for Automated Construction Safety Knowledge Mapping, Journal of Construction Engineering and Management, 10.1061/JCEMD4.COENG-14436, 150, 5, (2024).