Outlier Detection in Multivariate Hydrologic Data
Publication: Journal of Hydrologic Engineering
Volume 13, Issue 7
Abstract
Extreme events in a data set can distort computed statistics. Some analysts favor censoring outliers, while others oppose censoring any measured value. Detection methods are available for the univariate case, but methods for identifying outliers in multivariate data are limited. Past research has provided detection methods, but they are restricted by the number of outliers, the number of predictor variables, or the levels of significance covered. A method for detecting outliers in multivariate data that is valid for as many as five outliers is presented. Critical values for levels of significance for sample sizes from 10 to 100 are presented. The method is tested on actual hydrologic data and compared with Rosner’s univariate outlier test. The differences between the two methods, their effectiveness, and their limitations are examined.
References
Barrett, B. E., and Ling, R. F. (1992). “General classes of influence measures for multivariate regression.” J. Am. Stat. Assoc., 87(417), 184–191.
Beckman, R. J., and Cook, R. D. (1983). “Outlier..........s.” Technometrics, 25, 119–145.
Benson, M. A. (1964). “Factors affecting the occurrence of floods in the southwest.” U.S. Geological Survey Water Supply Paper No. 1580-D, Washington, D.C.
Davis, L. G. (1974). “Floods in Indiana: Technical manual for estimating their magnitude and frequency.” Geological Survey Circular No. 710, USGS, Reston, Va.
Dixon, W. J. (1951). “Ratios involving extreme values.” Ann. Math. Stat., 22, 68–78.
Flaxman, E. M. (1972). “Predicting sediment yield in the western United States.” J. Hydr. Div., 98(12), 2073–2085.
Grubbs, F. E. (1950). “Sample criteria for testing outlying observations.” Ann. Math. Stat., 21, 27–58.
Hadi, A. S. (1992). “Identifying multiple outliers in multivariate data.” J. R. Stat. Soc. Ser. B (Methodol.), 54(3), 761–771.
Interagency Advisory Committee on Water Data (IACWD). (1982). Guidelines for determining flood flow frequency, Bulletin No. 17 B, Hydrology Committee, Reston, Va.
Knapp, J. W., and Rawls, W. J. (1969). “Prediction models for investment in urban drainage systems.” Water Resources Center Bulletin 24, Virginia Polytechnic Institute and State Univ., Blacksburg, Va.
McCuen, R. H. (1985). Statistical methods for engineers, Prentice-Hall, Upper Saddle River, N.J.
McCuen, R. H., Rawls, W. J., and Whaley, B. L. (1979). “Comprehensive evaluation of statistical methods for water supply forecasting.” Water Resour. Bull., 15(4), 935–947.
Pearson, E. S., and Chandra Sekar, C. (1936). “The efficiency of statistical tools and a criterion for the rejection of outlying observations.” Biometrika, 28, 308–320.
Rohlf, F. J. (1975). “Generalization of the gap test for the detection of multivariate outliers.” Biometrics, 31, 93–101.
Rosner, B. (1975). “On the detection of many outliers.” Technometrics, 17, 221–227.
Rosner, B. (1983). “Percentage points for a generalized ESD many-outlier procedure.” Technometrics, 25, 165–172.
Rousseeuw, P. J., and van Zomeren, B. C. (1990). “Unmasking multivariate outliers and leverage points.” J. Am. Stat. Assoc., 85(411), 633–639.
Siotani, M. (1959). “The extreme value of the generalized distances of the individual points in the multivariate normal sample.” Ann. Inst. Stat. Math., 10, 183–208.
Sobol, I. M. (1994). A primer on the Monte Carlo method, CRC, Boca Raton, Fla.
Spencer, C. S., and McCuen, R. H. (1996). “Detection of outliers in Pearson type III data.” J. Hydrol. Eng., 1, 2–10.
Srikantan, K. S. (1961). “Testing for a single outlier in a regression model.” Sankhya, Ser. A, 23, 251–260.
Thompson, W. R. (1935). “On a criterion for the rejection of observations and the distribution of the ratio of deviation to sample standard deviation.” Ann. Math. Stat., 6, 214–219.
Whitaker, G. A., McCuen, R. H., and Brush, J. (1979). “Channel modification and macroinvertebrate diversity in small streams.” Water Resour. Bull., 15(3), 864–879.
Wilks, S. S. (1963). “Multivariate statistical outliers.” Sankhya, Ser. A, 25, 407–426.
Copyright
© 2008 ASCE.
History
Received: Sep 19, 2006
Accepted: Sep 5, 2007
Published online: Jul 1, 2008
Published in print: Jul 2008