Technical Papers
Sep 28, 2020

Text Mining of the Securities and Exchange Commission Financial Filings of Publicly Traded Construction Firms Using Deep Learning to Identify and Assess Risk

Publication: Journal of Construction Engineering and Management
Volume 146, Issue 12

Abstract

Risk factor identification is a critical topic in the construction industry. Construction firms and industry stakeholders need to understand the types of risk that affect their businesses and financial bottom lines. This research created a systematic methodology that implements a new set of text mining methods to identify and classify the risk types affecting publicly traded construction companies by leveraging the 10-K reports they file with the Securities and Exchange Commission (SEC). A structured procedure was developed to apply advancements in text mining and natural language processing (NLP) to extract information from textual disclosures. A state-of-the-art deep learning algorithm named FastText was implemented to identify risk patterns and classify the text into appropriate risk types. Key findings showed that operational and financial risks associated with doing business are the risks most commonly disclosed in the filings of publicly traded construction firms. The average number of total risk disclosures increased steadily and monotonically from 2006 to 2018. Over the same period, the proportions of technology risks, reputation/intangible assets risks, financial markets risks, and third-party risks grew. The primary contributions of this research are (1) the development of a new methodology that serves as a risk thermometer for identifying and quantifying risk at the individual company, subindustry, and overall industry levels; and (2) minimization of existing information asymmetry in risk studies through the use of a data source that construction researchers have not previously used. It is anticipated that the developed methodology and its results can be used by (1) publicly traded construction companies to understand the risks affecting themselves and their peers; (2) surety bond companies and insurance providers to supplement their risk pricing models; and (3) equity investors and capital financial institutions to make more-informed, risk-based decisions about their investments in the construction business.
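
The two trend measures reported in the abstract (the average number of risk disclosures per firm and the proportion of each risk type per year) can be illustrated with a minimal sketch. It assumes the disclosures have already been classified into risk types. The record layout, the firm names, and the counts below are hypothetical toy values chosen for illustration, not data or code from the study.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (firm, fiscal_year, risk_type).
# In practice each record would be one classified risk disclosure.
records = [
    ("FirmA", 2006, "operational"),
    ("FirmA", 2006, "financial"),
    ("FirmB", 2006, "operational"),
    ("FirmA", 2018, "operational"),
    ("FirmA", 2018, "financial"),
    ("FirmA", 2018, "technology"),
    ("FirmB", 2018, "operational"),
    ("FirmB", 2018, "technology"),
    ("FirmB", 2018, "third-party"),
]


def summarize(records):
    """Return, per year, the average disclosures per firm and risk-type shares."""
    counts = defaultdict(Counter)  # year -> risk_type -> number of disclosures
    firms = defaultdict(set)       # year -> firms that filed that year
    for firm, year, risk_type in records:
        counts[year][risk_type] += 1
        firms[year].add(firm)

    summary = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        summary[year] = {
            "avg_per_firm": total / len(firms[year]),
            "share": {rt: n / total for rt, n in counts[year].items()},
        }
    return summary


summary = summarize(records)
for year, stats in summary.items():
    print(year, stats["avg_per_firm"], stats["share"])
```

The same aggregation could be run on a single firm or on a subset of firms to obtain company-level or subindustry-level views.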

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

Information & Authors

Information

Published In

Journal of Construction Engineering and Management
Volume 146, Issue 12, December 2020

History

Received: Jan 31, 2020
Accepted: Jun 16, 2020
Published online: Sep 28, 2020
Published in print: Dec 1, 2020
Discussion open until: Feb 28, 2021

Authors

Affiliations

Yashovardhan Jallan, S.M.ASCE
Ph.D. Candidate, School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332.
Professor, School of Civil and Environmental Engineering and School of Building Construction, Georgia Institute of Technology, Atlanta, GA 30332-0680 (corresponding author). ORCID: https://orcid.org/0000-0002-4320-1035.
