TECHNICAL PAPERS
Apr 30, 2009

Quantitative Research: Preparation of Incongruous Economic Data Sets for Archival Data Analysis

Publication: Journal of Construction Engineering and Management
Volume 136, Issue 1

Abstract

In construction engineering and management, archival data sets are not always as correct and consistent as would be desirable. Data sets may differ in format or content between the sources studied, e.g., companies, and even within a single source they may be incongruous and require substantial preparation. This makes examining theories and extracting trends from historical data more difficult than working with carefully controlled experimental studies or newly collected data. The purpose of this paper is not to review the regression models that the writers developed during their research, but to focus on the data preparation that had to be applied before those analyses. The objective is to outline techniques that can be applied to archival data related to construction engineering and management, giving researchers a set of best practices for data preparation that can assist them in gleaning truths from such data.

Acknowledgments

The first writer thanks Joseph D. Lombardo of Learning Seed and Justin P. Molineaux of Computech for their advice on creating effective pseudocode.

Information & Authors

Information

Published In

Journal of Construction Engineering and Management
Volume 136, Issue 1, January 2010
Pages: 49 - 57

History

Received: Jul 24, 2008
Accepted: Apr 3, 2009
Published online: Apr 30, 2009
Published in print: Jan 2010

Authors

Affiliations

Gunnar Lucko, Ph.D., A.M.ASCE
Assistant Professor, Director, Construction Engineering and Management Program, Dept. of Civil Engineering, The Catholic Univ. of America, 620 Michigan Ave., NE, Washington, D.C. 20064 (corresponding author).
Zane W. Mitchell Jr., Ph.D., M.ASCE
Associate Professor, Chair, Dept. of Engineering, Univ. of Southern Indiana, Evansville, IN 47712.
