Multiple Imputation Scheme for Overcoming the Missing Values and Variability Issues in ITS Data
Publication: Journal of Transportation Engineering
Volume 131, Issue 12
Abstract
Traffic engineering studies, such as validating Highway Capacity Manual (HCM) models, require complete and reliable field data. However, the wealth of intelligent transportation systems (ITS) data is sometimes rendered useless for these purposes by missing values. Many imputation techniques have been developed, and virtually all of them impute a single value for each missing datum. Although single imputation provides relatively simple and fast estimates, it does not eliminate the possibility of biased results, and it fails to account for the uncertainty introduced by the missing data. To overcome these limitations, a multiple imputation scheme is developed that provides multiple estimates for each missing value, simulating multiple draws from a population to estimate the unknown parameter. This paper also develops a framework of imputation that offers a broad perspective, so that imputation methods can be related to one another.
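To illustrate the general idea of multiple imputation described in the abstract, the following is a minimal Python sketch; it is not the scheme developed in the paper. It assumes a hypothetical series of 5-minute detector volumes with gaps marked as None, fills the gaps with an approximate Bayesian bootstrap (a generic technique swapped in for illustration), and combines the per-imputation estimates of the mean using Rubin's combining rules. The function names, the data, and the choice of m = 5 imputations are all illustrative assumptions.

```python
import random
import statistics


def impute_once(values, rng):
    """Return one completed copy of `values` (hypothetical helper).

    Uses an approximate Bayesian bootstrap, assumed here for illustration:
    the observed values are resampled first, so that repeated imputations
    differ in a way that reflects uncertainty about the data distribution.
    """
    observed = [v for v in values if v is not None]
    # Step 1: resample the observed values with replacement (donor pool).
    donors = [rng.choice(observed) for _ in observed]
    # Step 2: fill each gap with a random draw from the donor pool.
    return [v if v is not None else rng.choice(donors) for v in values]


def pool_mean(values, m=5, seed=0):
    """Estimate the mean from m completed datasets and pool the results.

    Returns (pooled estimate, within-imputation variance,
    between-imputation variance, total variance).
    """
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(values, rng)
        estimates.append(statistics.fmean(completed))
        # Sampling variance of the mean for this completed dataset.
        variances.append(statistics.variance(completed) / len(completed))
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # variance across imputations
    total = within + (1 + 1 / m) * between     # Rubin's total variance
    return q_bar, within, between, total


if __name__ == "__main__":
    # Hypothetical 5-minute volume counts; None marks a missing observation.
    volumes = [112, 98, None, 105, 120, None, 101, 95, None, 110]
    q_bar, within, between, total = pool_mean(volumes, m=5, seed=0)
    print(f"pooled mean = {q_bar:.2f}, total variance = {total:.3f}")
```

In this sketch the between-imputation variance is the term that a single imputed value cannot supply: it measures how much the estimate changes from one plausible completion of the data to the next, and it is added to the ordinary sampling variance to form the total.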
Copyright
© 2005 ASCE.
History
Received: Mar 25, 2004
Accepted: Jan 31, 2005
Published online: Dec 1, 2005
Published in print: Dec 2005