Open access
Technical Papers
Feb 28, 2020

Metaresearching Structural Engineering Using Text Mining: Trend Identifications and Knowledge Gap Discoveries

Publication: Journal of Structural Engineering
Volume 146, Issue 5


The significant increase in journal paper submissions and publications over recent decades has been paralleled by a shift to (mainly) online publication and digital archiving of past research articles. This situation has created an opportunity to metaresearch (conduct research on research) structural engineering by using emerging computational techniques such as data mining to track historical and current research focuses and trends, identify evolving research themes, and discover possible cross-cutting knowledge gaps. Such metaresearch can benefit all structural engineering community stakeholders (e.g., researchers, designers, and funding agencies) in multiple ways, including realigning and optimizing research resources to meet current and future research needs. The current study uses text mining, a class of data mining, to analyze published structural engineering research over 26 years. The dataset comprises more than 11,000 articles published in the two leading structural engineering journals (Journal of Structural Engineering and Engineering Structures) from 1991 to 2016. Following the collection and preparation of the training and testing datasets, the latent Dirichlet allocation (LDA) topic modeling technique is used to identify, classify, and categorize articles in terms of their topics, characterized by relevant technical terms. Subsequently, quantitative analyses are used to evaluate the temporal inclusion trends within the 11,000-article dataset. The LDA technique is also reapplied to only the articles published between 2012 and 2016 to identify recent research topic developments and investigate the correlation between these topics and their counterparts covering the entire 26-year study period. Finally, a word co-occurrence network and a topic interlinkage matrix are developed, providing visual tools to rapidly evaluate structural engineering research subfield co-occurrences and linkage strengths. The overarching aim of this metaresearch is to identify understudied intersections of structural engineering subfields and highlight Blue Ocean opportunities at the interfaces of structural engineering and other established fields and emerging technologies.
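To make the topic modeling and trend steps described in the abstract more concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, followed by a per-year average of topic shares. It is an illustration only: the toy corpus, the publication years, the function names `lda_gibbs` and `yearly_share`, the hyperparameters `alpha` and `beta`, and the choice of Gibbs sampling as the inference method are all assumptions made here, not details taken from the study.

```python
import random
from collections import Counter, defaultdict


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents.

    alpha and beta are assumed symmetric Dirichlet priors (illustrative values).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per (document, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic
    assign = []                                        # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(n_topics) for _ in doc]
        for w, k in zip(doc, zs):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions (each row sums to 1).
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    # Most frequent words currently assigned to each topic.
    top_words = [[w for w, c in tw.most_common(3) if c > 0] for tw in topic_word]
    return theta, top_words


def yearly_share(theta, years):
    """Average topic proportions of the documents published in each year."""
    grouped = defaultdict(list)
    for row, year in zip(theta, years):
        grouped[year].append(row)
    return {
        year: [sum(col) / len(rows) for col in zip(*rows)]
        for year, rows in sorted(grouped.items())
    }


# Hypothetical mini-corpus standing in for preprocessed article text.
docs = [
    "seismic shear wall masonry seismic response".split(),
    "masonry shear wall seismic design".split(),
    "sensor monitoring damage detection sensor".split(),
    "damage detection monitoring fiber sensor".split(),
]
years = [1991, 1991, 2016, 2016]  # hypothetical publication years

theta, top_words = lda_gibbs(docs, n_topics=2)
trend = yearly_share(theta, years)
for year, shares in trend.items():
    print(year, [round(s, 2) for s in shares])
print(top_words)
```

On a real corpus, an established library implementation would normally replace this hand-written sampler. The sketch only shows how per-document topic proportions can be aggregated by publication year to follow how a topic's share changes over time.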



The authors would like to acknowledge the support from the INViSiONLab. The second author is grateful to Ms. Lucy El-Sherif, who first introduced him to the origin of Blue Ocean Strategy in 1999, and the authors are thankful for her feedback on the article.



Information & Authors


Published In

Journal of Structural Engineering
Volume 146, Issue 5, May 2020


Received: Oct 30, 2018
Accepted: Jul 15, 2019
Published online: Feb 28, 2020
Published in print: May 1, 2020
Discussion open until: Jul 28, 2020



Mohamed Ezzeldin, A.M.ASCE
Assistant Professor, Dept. of Civil Engineering, McMaster Univ., Hamilton, ON, Canada L8S 4L7.
Professor and Director of the INViSiONLab, Dept. of Civil Engineering, McMaster Univ., Hamilton, ON, Canada L8S 4L7 (corresponding author).
