Chapter
Mar 7, 2022

Mining and Visualizing Cost and Schedule Risks from News Articles with NLP and Network Analysis

Publication: Construction Research Congress 2022

ABSTRACT

Cost overruns and schedule delays in US transit projects have been a growing concern for years. However, limited data availability and small sample sizes have restricted quantitative analysis of the risks that lead to overruns. Innovative data sources and collection methods are needed to complement traditional surveys and case studies. News articles report on the issues and risk events that lead to overruns as projects progress, but they have not yet been explored in the construction domain because the data are difficult to compile and analyze. To fill this gap, this paper tests combinations of natural language processing (NLP) and machine learning methods to automatically identify risk narratives in news articles. The risk sentences are classified into 5 categories and 26 subcategories through a content analysis approach. The risks are then ranked and analyzed using a co-occurrence network. The research demonstrates that NLP and network analysis can be integrated to explore publicly available textual documents and explain project performance issues. The approach serves as a baseline for future studies that develop more intelligent models to examine a wide range of media data and other textual reports in the construction domain.
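
The co-occurrence ranking step can be illustrated with a minimal sketch. The abstract does not state how the network is constructed or which ranking metric is used, so this example assumes that two risk labels are linked whenever they appear in the same article and that risks are ranked by weighted degree (the sum of their edge weights). The labels and the toy articles are hypothetical and are not taken from the paper's 5 categories or 26 subcategories.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(articles):
    """Count how often each pair of risk labels appears in the same article."""
    edges = Counter()
    for labels in articles:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(labels)), 2):
            edges[(a, b)] += 1
    return edges


def rank_by_weighted_degree(edges):
    """Rank labels by the total weight of the edges they take part in."""
    strength = Counter()
    for (a, b), weight in edges.items():
        strength[a] += weight
        strength[b] += weight
    # Highest strength first; ties are broken alphabetically.
    return sorted(strength.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical input: one list of risk labels per news article.
articles = [
    ["funding", "design change", "utility relocation"],
    ["funding", "design change"],
    ["utility relocation", "design change"],
    ["funding"],
]

edges = build_cooccurrence(articles)
ranking = rank_by_weighted_degree(edges)

for label, strength in ranking:
    print(f"{label}: {strength}")
```

Under these assumptions, a label that frequently appears alongside many others rises to the top of the ranking. Other centrality measures could be substituted in `rank_by_weighted_degree` without changing how the edges are built.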

Information & Authors

Information

Published In

Construction Research Congress 2022
Pages: 314 - 324

History

Published online: Mar 7, 2022

Authors

Affiliations

Nan Gao, S.M.ASCE
Graduate Student, Dept. of Civil and Environmental Engineering, Northeastern Univ., Boston, MA.
Ali Touran, F.ASCE
Professor, Dept. of Civil and Environmental Engineering, Northeastern Univ., Boston, MA.
Qi Wang, M.ASCE
Assistant Professor, Dept. of Civil and Environmental Engineering, Northeastern Univ., Boston, MA.
