Technical Papers
Mar 6, 2023

Developing a Construction Domain–Specific Artificial Intelligence Language Model for NCDOT’s CLEAR Program to Promote Organizational Innovation and Institutional Knowledge

Publication: Journal of Computing in Civil Engineering
Volume 37, Issue 3


Transportation agency personnel gain valuable knowledge through their work, but such knowledge is lost when the worker leaves the organization if it has not been documented properly. The risk of losing institutional knowledge is a current problem at state departments of transportation, including the North Carolina Department of Transportation (NCDOT), due to high personnel turnover. State transportation agencies have implemented knowledge repositories in the form of lessons learned/best practices databases to address this problem. However, motivating end-users to use such databases is challenging. This paper addresses this challenge through novel artificial intelligence technology whereby a neural network–based language model is implemented as part of the NCDOT’s new knowledge management program: Communicate Lessons, Exchange Advice, Record (CLEAR). The CLEAR program encompasses a database of lessons learned/best practices and a website to access and search the database. The developed methodology involves training a language model on transportation construction texts and using that trained model in a novel algorithm enabling users to search the CLEAR database easily. The developed language-processing model provides an easily accessible interface to suggest the most relevant CLEAR data based on the end-user’s searched keywords. The model learns an inference model of construction domain–specific vocabulary extracted from various sources, such as contract documents, textbooks, and specifications, to make meaningful connections between lessons learned/best practices in the CLEAR database and project-specific knowledge. The developed model has been validated by project managers for projects at various life cycle stages. The automation of information retrieval is intended to encourage NCDOT personnel to use and embrace the CLEAR program as part of their routine work to improve project workflow. In the long run, the NCDOT will benefit from consistent usage of the CLEAR program and its high-quality content, thereby leading to enhanced institutional knowledge and organizational innovation.
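The abstract does not spell out the search algorithm, so the following is only a minimal sketch of the general idea of keyword search over lessons using vocabulary learned from domain texts. It assumes simple co-occurrence counts as a stand-in for the neural network–based language model, and cosine similarity for ranking. The corpus sentences, lesson texts, and function names (`train_vectors`, `embed`, `search`) are hypothetical and are not taken from the CLEAR program.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for transportation construction texts.
CORPUS = [
    "asphalt paving requires compaction and temperature control",
    "pavement compaction affects asphalt density",
    "bridge deck concrete curing prevents cracking",
    "concrete cracking on bridge deck delays schedule",
]

# Hypothetical lessons learned, keyed by an invented identifier.
LESSONS = {
    "L1": "monitor temperature during paving",
    "L2": "extend curing time for deck concrete",
}

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def train_vectors(corpus, window=2):
    """Build a co-occurrence vector for each word from a +/- window context."""
    vectors = defaultdict(Counter)
    for sentence in corpus:
        tokens = tokenize(sentence)
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def embed(text, vectors):
    """Sum the vectors of the words in text, plus the words themselves."""
    total = Counter()
    for word in tokenize(text):
        total.update(vectors.get(word, {}))
        total[word] += 1  # lets exact keyword matches contribute too
    return total

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query, lessons, vectors, top_k=3):
    """Return lesson ids ranked by similarity to the query keywords."""
    q = embed(query, vectors)
    scored = [(cosine(q, embed(text, vectors)), lid)
              for lid, text in lessons.items()]
    scored.sort(reverse=True)
    return [lid for _, lid in scored[:top_k]]

vectors = train_vectors(CORPUS)
print(search("asphalt", LESSONS, vectors))
```

In this toy setup the query "asphalt" ranks lesson L1 first even though the word does not appear in that lesson, because "asphalt" and "paving" share context in the corpus. That is the kind of domain-vocabulary connection the abstract describes, here shown with a much simpler model.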

Practical Applications

The construction industry, with a particular emphasis on transportation construction, currently faces tremendous challenges in retaining and retraining existing personnel to ensure business continuity on projects. Knowledge gained on projects by project personnel can be lost forever if not properly documented. While knowledge repositories are effective for storing and retrieving past knowledge, extant literature underlines the need to ensure continued participation by the end-users for the success of such repositories. This research effort uses natural language processing, a subfield of artificial intelligence that deals specifically with text sources, as a means to quickly and accurately enhance the quality of search results displayed to end-users within the North Carolina Department of Transportation’s recently commissioned knowledge management program called CLEAR. As a result, end-users can stay motivated and embrace the CLEAR program, thereby ensuring its long-term success. In the long run, the consistent usage of the CLEAR program and the high-quality content that is input to the CLEAR database by the NCDOT end-users will lead to enhanced institutional knowledge and internal organizational innovation.

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request. The CLEAR Lessons Learned/Best Practices data and other information used in this paper, such as contract documents and feasibility study reports, are proprietary to the NCDOT and can be obtained directly from the NCDOT Value Management Office. In addition, the training model and code used to develop the CD-SAIL model are currently copyrighted to the NCDOT and the North Carolina State University (NCSU) research team through a tech-transfer agreement, but can be provided to anyone interested upon submitting a reasonable request to the corresponding author.


The authors would like to thank Clare E. Fullerton and Alyson W. Tamer with NCDOT’s Value Management Unit for providing proprietary text data sources such as contract documents that helped us in preparing the transportation construction corpus of words. The authors would also like to thank the NCDOT project managers who helped us validate and iteratively fine-tune the CD-SAIL model. Finally, the authors thank the anonymous reviewers of this manuscript for providing constructive feedback to improve its quality.


Information & Authors


Published In

Journal of Computing in Civil Engineering
Volume 37, Issue 3, May 2023


Received: Feb 19, 2022
Accepted: Dec 21, 2022
Published online: Mar 6, 2023
Published in print: May 1, 2023
Discussion open until: Aug 6, 2023


Siddharth Banerjee, Ph.D., A.M.ASCE
Assistant Professor in Residence, Dept. of Civil Engineering and Construction, Bradley Univ., Peoria, IL 61625 (corresponding author).
Postdoctoral Associate, Dept. of Computer Science, North Carolina State Univ., Raleigh, NC 27606.
Arnav H. Jhala, Ph.D.
Associate Professor, Dept. of Computer Science, North Carolina State Univ., Raleigh, NC 27606.
Edward J. Jaselskis, Ph.D., A.M.ASCE
E.I. Clancy Distinguished Professor, Dept. of Civil, Construction and Environmental Engineering, North Carolina State Univ., Raleigh, NC 27695.

