Technical Papers
Apr 16, 2024

Hypothesis Testing for the Difference between Two Nash–Sutcliffe Efficiencies for Comparing Hydrological Model Performance

Publication: Journal of Hydrologic Engineering
Volume 29, Issue 4

Abstract

The Nash–Sutcliffe efficiency (NSE) is the most widely used criterion for measuring the goodness of fit between a hydrological model simulation and the corresponding observations. Because hydrological simulations and observations carry substantial sampling uncertainty, the NSE is a random variable. A probability density function (PDF) of the NSE variable was derived under the assumption of a simple linear regression model between the observations and the hydrological model simulations. To avoid a subjective interpretation of hydrological model performance, the confidence interval of the NSE variable was determined from its PDF. The difference in NSE (Φ) between two time periods or two hydrological models is often used to compare their performances and is itself a random variable, so hypothesis testing should be used to determine whether the difference is statistically significant or merely a chance variation. Because the PDF of the difference can be derived from the joint PDF of the two NSE random variables, a hypothesis testing procedure for the difference between two NSE variables is proposed for comparing performance between hydrological models or between different time periods for one hydrological model. The proposed test was applied to the abcd and dynamic water balance model (DWBM) hydrological models as case studies, illustrating how to assess the performance of a hydrological model and justify its superiority over another model according to both the estimated NSE value and its confidence level under sampling uncertainty. The proposed hypothesis testing can therefore provide hydrological model end users with a rational model assessment.
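
To make the quantities in the abstract concrete, the sketch below computes the NSE with its standard definition (one minus the ratio of the sum of squared errors to the sum of squared deviations of the observations from their mean) and estimates a confidence interval for the difference Φ between two models. It is not the article's method: the article derives the PDF of Φ analytically, whereas this sketch substitutes a generic paired percentile bootstrap. The function names `nse` and `bootstrap_nse_difference`, the parameters `n_boot`, `alpha`, and `seed`, and the synthetic data are all hypothetical, and resampling time steps independently ignores the autocorrelation typical of streamflow series.

```python
import random
from typing import List, Sequence, Tuple


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of obs."""
    if len(obs) != len(sim) or len(obs) < 2:
        raise ValueError("obs and sim must have equal length of at least 2")
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    if sst == 0.0:
        raise ValueError("observations are constant; NSE is undefined")
    return 1.0 - sse / sst


def bootstrap_nse_difference(
    obs: Sequence[float],
    sim_a: Sequence[float],
    sim_b: Sequence[float],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (phi_hat, lower, upper) for phi = NSE(model A) - NSE(model B).

    Assumption: a paired percentile bootstrap over time steps, used here as a
    stand-in for the analytical PDF of the difference derived in the article.
    """
    rng = random.Random(seed)
    n = len(obs)
    phi_hat = nse(obs, sim_a) - nse(obs, sim_b)
    diffs: List[float] = []
    while len(diffs) < n_boot:
        # Resample the same time steps for observations and both simulations.
        idx = [rng.randrange(n) for _ in range(n)]
        o = [obs[i] for i in idx]
        if max(o) == min(o):
            continue  # skip degenerate resamples where NSE is undefined
        a = [sim_a[i] for i in idx]
        b = [sim_b[i] for i in idx]
        diffs.append(nse(o, a) - nse(o, b))
    diffs.sort()
    lower = diffs[int((alpha / 2.0) * (n_boot - 1))]
    upper = diffs[int((1.0 - alpha / 2.0) * (n_boot - 1))]
    return phi_hat, lower, upper


if __name__ == "__main__":
    # Synthetic monthly-style series (illustrative only, not data from the article).
    obs = [float(i % 12) + 1.0 for i in range(60)]
    sim_a = [o + 0.1 * (-1) ** i for i, o in enumerate(obs)]  # small errors
    sim_b = [o + 1.0 * (-1) ** i for i, o in enumerate(obs)]  # larger errors
    phi_hat, lower, upper = bootstrap_nse_difference(obs, sim_a, sim_b)
    print(f"phi = {phi_hat:.4f}, 95% interval = [{lower:.4f}, {upper:.4f}]")
```

Under these resampling assumptions, an interval that excludes zero suggests the difference between the two NSE values is unlikely to be a chance variation, while an interval that contains zero gives no grounds for preferring one model over the other.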

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

Acknowledgments

The author gratefully acknowledges financial support from the National Key Research and Development Project of China (2022YFC3202803) and the National Natural Science Foundation of China (No. 52379022). This work was also partly funded by the Ministry of Foreign Affairs of Denmark and administered by the Danida Fellowship Centre (File No. 18-M01-DTU).

Information & Authors

Information

Published In

Journal of Hydrologic Engineering
Volume 29, Issue 4, August 2024

History

Received: Mar 15, 2023
Accepted: Jan 18, 2024
Published online: Apr 16, 2024
Published in print: Aug 1, 2024
Discussion open until: Sep 16, 2024

Authors

Affiliations

Dedi Liu, Ph.D.
Professor, State Key Laboratory of Water Resources Engineering and Management, Hubei Key Laboratory of Water System Science for Sponge City Construction, Wuhan Univ., Wuhan 430072, China. Email: [email protected]
