River Flow Prediction Using Dynamic Method for Selecting and Prioritizing K-Nearest Neighbors Based on Data Features
Publication: Journal of Hydrologic Engineering
Volume 25, Issue 5
Abstract
River flow prediction is an important part of robust water resources planning and of operating flood warning systems. Data-driven approaches have proven efficient for this purpose, and k-nearest neighbors (KNN), a lazy learning method, is one of them. This study introduces a new neighbor-selection method, the dynamic number of k-nearest neighbor (DKNN), which uses an optimized distance to select a different number of neighbors for each instance of the predictors instead of the fixed k used in the classic method. The particle swarm optimization (PSO) algorithm is used for the optimization. Three techniques for prioritizing the contributing neighbors are applied: (1) using the pattern of the predictor data, (2) considering the date of the predictor data, and (3) using both of these features in the prediction procedure. The proposed method and techniques are tested on 2 years of daily inflow to the Gheshlagh reservoir in Iran and compared with classic KNN, artificial neural networks (ANN), random forest regression (RFR), and support vector machines (SVM). The results indicate that the proposed method improved prediction accuracy by 4.9%, measured as the reduction in root-mean-square error (RMSE) relative to classic KNN. Using the recorded date of the predictor performs best of the three proposed techniques, improving RMSE by 49%, 38%, 31%, and 24% over classic KNN, ANN, RFR, and SVM, respectively. Considering the pattern of the predictor and the combined technique reduced RMSE by 12% and 35%, respectively, compared to classic KNN.
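The contrast between a fixed k and a per-instance number of neighbors can be illustrated with a minimal Python sketch. The abstract does not give the exact selection rule, so this sketch assumes that a neighbor contributes when it lies within a distance threshold of the query. The function names, the hand-picked `radius`, the unweighted averaging, and the fallback to the single nearest neighbor are all illustrative assumptions, not the authors' implementation. The PSO step that tunes the distance and the three prioritization techniques are not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two predictor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classic_knn_predict(train_x, train_y, query, k):
    """Classic KNN regression: average the targets of the k closest instances."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: euclidean(train_x[i], query))
    chosen = ranked[:k]
    return sum(train_y[i] for i in chosen) / len(chosen)


def radius_knn_predict(train_x, train_y, query, radius):
    """Assumed dynamic-k variant: use every instance within `radius`.

    The number of neighbors therefore differs from query to query.
    If no instance is close enough, fall back to the single nearest one
    (an assumption made here so that a prediction always exists).
    """
    dists = [euclidean(x, query) for x in train_x]
    chosen = [i for i, d in enumerate(dists) if d <= radius]
    if not chosen:
        chosen = [min(range(len(dists)), key=dists.__getitem__)]
    return sum(train_y[i] for i in chosen) / len(chosen)


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def rmse_reduction_percent(baseline_rmse, candidate_rmse):
    """Percentage reduction in RMSE of a candidate relative to a baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Made-up toy data (not from the study): one predictor, one target.
train_x = [[1.0], [2.0], [3.0], [10.0]]
train_y = [10.0, 20.0, 30.0, 100.0]

dense_query = radius_knn_predict(train_x, train_y, [2.0], radius=1.0)   # 3 neighbors
sparse_query = radius_knn_predict(train_x, train_y, [9.0], radius=1.0)  # 1 neighbor
fixed_k = classic_knn_predict(train_x, train_y, [9.0], k=3)             # always 3
```

In this toy data the query near the isolated point at 10.0 uses one neighbor under the threshold rule, while a fixed k of 3 also averages in the distant low-flow instances. In the study the distance is optimized rather than set by hand, and percentage improvements like those reported would be computed from RMSE values as in `rmse_reduction_percent`.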
Copyright
©2020 American Society of Civil Engineers.
History
Received: Jan 10, 2019
Accepted: Nov 1, 2019
Published online: Feb 21, 2020
Published in print: May 1, 2020
Discussion open until: Jul 21, 2020