Replacing Outliers and Missing Values from Activated Sludge Data Using Kohonen Self-Organizing Map
Publication: Journal of Environmental Engineering
Volume 133, Issue 9
Abstract
Modeling an activated sludge wastewater treatment plant plays an important role in improving its performance. However, the data available for model identification, calibration, and verification have many limitations, such as missing values and outliers. Because the available data records are generally short, these gaps and outliers cannot simply be discarded but must be replaced by more reasonable estimates. The aim of this study is to use the Kohonen self-organizing map (KSOM), an unsupervised neural network, to predict the missing values and replace outliers in time series data for an activated sludge wastewater treatment plant in Edinburgh, U.K. The method is simple, computationally efficient, and highly accurate. The results demonstrated that the KSOM is an excellent tool for replacing outliers and missing values in a high-dimensional data set. A comparison of the KSOM with multiple regression analysis and back-propagation artificial neural networks showed that the KSOM is superior in performance to both of these approaches.
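The general idea of filling gaps with a self-organizing map can be sketched as follows. This is a minimal illustration under stated assumptions, not the writers' implementation (the reference list points to the SOM Toolbox for Matlab). The map size, learning-rate schedule, synthetic data, and the function names `train_som` and `impute` are all hypothetical choices made for this sketch. The map is trained with distances computed only over the observed components of each sample, and each missing component is then read off the prototype vector of the best-matching unit (BMU).

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=30, seed=0):
    """Train a small SOM on data that may contain NaNs (masked distances)."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    # Grid coordinates of every map unit, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise prototypes near the column means (NaNs ignored).
    mean, std = np.nanmean(data, axis=0), np.nanstd(data, axis=0)
    weights = mean + 0.1 * std * rng.standard_normal((rows * cols, d))
    total, step = epochs * n, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            x = data[i]
            mask = ~np.isnan(x)
            frac = step / total
            step += 1
            if not mask.any():
                continue  # a fully missing sample carries no information
            lr = 0.5 * (1.0 - frac) + 0.01                    # decaying learning rate
            sigma = max(rows, cols) / 2 * (1.0 - frac) + 0.5  # shrinking neighborhood
            # Best-matching unit, using only the observed components.
            bmu = np.argmin(((weights[:, mask] - x[mask]) ** 2).sum(axis=1))
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
            # Update only the observed components of each prototype.
            weights[:, mask] += lr * h[:, None] * (x[mask] - weights[:, mask])
    return weights

def impute(data, weights):
    """Fill each NaN with the matching component of the sample's BMU prototype."""
    filled = data.copy()
    for i, x in enumerate(data):
        mask = ~np.isnan(x)
        if mask.all() or not mask.any():
            continue
        bmu = np.argmin(((weights[:, mask] - x[mask]) ** 2).sum(axis=1))
        filled[i, ~mask] = weights[bmu, ~mask]
    return filled

# Synthetic, correlated three-variable data (purely illustrative).
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 1.0, 200)
data = np.column_stack([t,
                        2 * t + 0.05 * rng.standard_normal(200),
                        -t + 0.05 * rng.standard_normal(200)])
holed = data.copy()
holed[rng.choice(200, size=20, replace=False), 1] = np.nan  # knock out 20 values

weights = train_som(holed)
filled = impute(holed, weights)
```

In practice the variables would normally be standardized before training so that no single variable dominates the distance. One possible way to treat outliers with the same machinery (again an assumption, not necessarily the procedure used in the study) is to flag suspect values as NaN first and let `impute` replace them.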
Acknowledgments
The writers are grateful to Thames Water, and especially Mr. Duncan Taylor, for their cooperation in providing the plant operation data used in this study. They would also like to thank the two anonymous reviewers, whose comments helped improve the manuscript.
Copyright
© 2007 ASCE.
History
Received: Apr 12, 2006
Accepted: Mar 16, 2007
Published online: Sep 1, 2007
Published in print: Sep 2007