Introduction
Large-scale hydrological models have emerged in recent years as tools, e.g., for flood forecasting (e.g.,
Kauffeldt et al. 2016), climate impact analyses (e.g.,
Krysanova et al. 2018), and assessments of human alterations in hydromorphology (
Arheimer et al. 2017) or pollution transport from land to sea (e.g.,
Brack et al. 2015;
Bartosova et al. 2019). The riverine transport of sediment represents an important pathway in the global geochemical cycle (
Martin and Meybeck 1979;
Ludwig et al. 1996). However, ongoing sediment fluxes to the oceans are measured for less than 10% of the Earth’s rivers (
Syvitski et al. 2005), and intrabasin measurements are even scarcer (
Kettner et al. 2010). There have been several attempts to provide global estimates of sediment, focusing on long-term load and mostly using empirical relationships between basin characteristics and the sediment load or relationships between stream flow and sediment load (e.g.,
Milliman and Syvitski 1992;
Syvitski and Milliman 2007;
Cohen et al. 2013). Many other important drivers of sediment yield are lacking at the global scale despite the important impacts pointed out by several studies, e.g., the effect of extreme events (
Li and Fang 2016), or gully erosion and landslide erosion (
Tan et al. 2018). Large-scale catchment-based hydrological models can provide a consistent way to estimate sediment generation and transport from land to coasts.
Many of the traditional global hydrological models that are applied on the large scale are gridded water-balance or water-allocation models (
Bierkens et al. 2015;
Sood and Smakhtin 2015). The scientific community uses a variety of model types to evaluate the impact of a changing climate on streamflow, including land-surface models and global hydrological models forced by global circulation models (GCMs) (e.g.,
Haddeland et al. 2014) or using runoff directly from GCMs (e.g.,
Koirala et al. 2014;
Nohara et al. 2006). However, traditional catchment-based models are also increasingly applied over larger spatial domains with advancing computer capabilities, increasing availability of data, and increasing need for integrated assessments (e.g.,
Archfield et al. 2015;
Arheimer et al. 2020). Catchment modeling techniques, such as those presented in this study, enable determination of the balances and fluxes within water divides and linking of parameters to physiographic properties without aggregation to a grid. Such models also provide an opportunity to consider catchment-specific properties, which have coevolved over time from interactions within the system of water divides (
Sivapalan 2005), both for water and transported materials.
There are many factors determining the confidence in the model results and the impact analyses that follow, for instance, the input data and assumptions incorporated into data processing (e.g.,
Hutton et al. 2016), model performance and calibration procedure (e.g.,
Hundecha et al. 2020), choice of the hydrological model (e.g.,
Karlsson et al. 2016), and spatial and temporal scales (e.g.,
Mertz et al. 2009). Not only are different hydrological processes dominant over different regions (
Kuentz et al. 2017), which may be difficult to capture in a model of a large domain, but also many simplifications and assumptions are inherently incorporated within the available data sets used as model inputs or for evaluation of results. Such data can be difficult to find with high resolution at a large scale, especially when linked to water management, as regulations (e.g.,
Arheimer et al. 2017), or as point-source discharges and emissions of pollutants (e.g.,
Donnelly et al. 2013).
This study’s main objective is to explore how global-scale catchment models can be used and interpreted at global, continental, and national scales, especially under a changing climate, not only for streamflow or runoff but also for other variables available from the model outputs including sediment concentrations. This is illustrated by (1) providing examples of hydrological variables at a global scale and their projected changes with climate, (2) using nested models to compare these global variables with those derived at a continental scale for the European domain, and (3) comparing them with those derived at a continental and national scale for Sweden. This study also presents the first sediment modeling results from the authors’ large-scale models based on the Hydrological Predictions for the Environment (HYPE) model (
Lindström et al. 2010) applied worldwide (
Arheimer et al. 2020), across Europe (E-HYPE;
Donnelly et al. 2016;
Hundecha et al. 2016), and in Sweden (S-HYPE;
Strömqvist et al. 2011). The main science questions are as follows:
• How do large-scale model applications of nested domains compare for various variables?
• What are plausible causes for differences in results?
• When should one use large-scale model results?
Materials and Methodology
Impact Modeling Tool
Hydrological and water quality variables were simulated with the HYPE model (
Lindström et al. 2010) for all nested domains. HYPE is a semidistributed process-based hydrological model capable of simulating selected water quality constituents together with the rainfall-runoff generation processes at various scales. HYPE was originally designed to aid in evaluating water quality status in Sweden for the EU Water Framework Directive reporting. It has since been used in many different applications such as in flood and drought forecasting (e.g.,
Pechlivanidis et al. 2014), climate impact analyses (e.g.,
Bartosova et al. 2019;
Gelfan et al. 2017;
Donnelly et al. 2017), or evaluation of nutrient mitigation measures (e.g.,
Hankin et al. 2019;
Arheimer et al. 2015) to name a few. HYPE source code and full documentation are freely available (
SMHI 2020b).
HYPE simulates major water pathways and fluxes in a catchment with mass conservation at each time step using precipitation and temperature as forcing data. Model parameters linked to the catchments’ physiographic properties determine the storages and fluxes of water and water quality constituents among the model components. Specific routines account for snow, glacier, reservoir regulations, floodplains, and deep aquifer processes, although the deep aquifer routine was not implemented in the specific applications used in this study due to a lack of large-scale information. The floodplain routine was implemented only in the worldwide application (WW-HYPE;
Arheimer et al. 2020), where temporary inundation plays a significant role, e.g., in inland deltas (
Andersson et al. 2017).
A number of potential evapotranspiration algorithms are available. WW-HYPE uses three evapotranspiration algorithms: Jensen-Haise (
Jensen and Haise 1963) in temperate areas (same as E-HYPE and S-HYPE), modified Hargreaves (
Hargreaves and Samani 1982) in arid and equatorial areas, and Priestley-Taylor (
Priestley and Taylor 1972) in polar and snow-/ice-dominated areas. The algorithms were selected based on their applicability in different climate conditions, which was tested prior to the model calibration. Evapotranspiration parameters were calibrated using the moderate resolution imaging spectroradiometer (MODIS) global evapotranspiration product (MOD16) by Mu et al. (
2011).
There are two erosion modules in HYPE. The default option used in S-HYPE is based on the Morgan-Morgan-Finney erosion model (
Morgan et al. 1984) and calculates particles mobilized by rainfall energy and surface runoff, taking into account the impact of vegetation or snow on the energy. The second model is based on the Hydrologiska Byråns Vattenbalansavdelning Sediment (HBV-SED) model (
Lidén 1999;
Lidén et al. 2001) and calculates particles mobilized by rain using a simpler index-based approach. Both WW-HYPE and E-HYPE use the HBV-SED sediment module. For both options, the mobilized particles are retained in a temporary storage pool and released over time based on the simulated runoff. The sediment results presented here represent the first version of the calibration; both the sediment and hydrology components are undergoing continued revision and development as more observations are acquired and further calibration is carried out.
A catchment is simulated using the concept of hydrological response units (HRUs) with up to three soil layers. Runoff combined from the individual soil layers within all HRUs in each catchment is first routed to local streams (i.e., local tributaries within the catchment) that flow into the catchment’s main stream. River flow in main streams is routed from upstream to downstream, accounting for branching of flow where needed. Additional retention (lakes and reservoirs) can be defined both for local and main streams.
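The upstream-to-downstream ordering described above can be illustrated with a minimal sketch. It is not HYPE's routing routine: it assumes each catchment drains to at most one downstream catchment (no branching), ignores retention in lakes and reservoirs and any travel time, and simply accumulates local contributions along the network. The function name `accumulate_flow` and the dictionary layout are hypothetical.

```python
from collections import deque

def accumulate_flow(downstream, local_flow):
    """Accumulate local flow contributions from upstream to downstream.

    downstream: dict mapping catchment id -> downstream catchment id (or None at an outlet)
    local_flow: dict mapping catchment id -> flow generated within that catchment
    Assumption: one downstream neighbour per catchment, no retention, no delay.
    """
    # Count how many catchments drain directly into each catchment.
    n_upstream = {c: 0 for c in downstream}
    for c, d in downstream.items():
        if d is not None:
            n_upstream[d] += 1

    total = dict(local_flow)
    # Start at headwater catchments, which receive nothing from upstream.
    queue = deque(c for c, n in n_upstream.items() if n == 0)
    while queue:
        c = queue.popleft()
        d = downstream[c]
        if d is not None:
            total[d] += total[c]
            n_upstream[d] -= 1
            # A catchment is processed only after all its upstream inflows are known.
            if n_upstream[d] == 0:
                queue.append(d)
    return total

network = {"A": "C", "B": "C", "C": "D", "D": None}
flows = accumulate_flow(network, {"A": 1, "B": 2, "C": 3, "D": 4})
```

In this toy network, catchments A and B are headwaters draining to C, which drains to the outlet D.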
Nested Domain Models
Here, results from three separate HYPE applications in nested domains are used: (1) global, WW-HYPE version 1.3.5 (
Arheimer et al. 2020); (2) the continent of Europe, E-HYPE version 3.1.6 (
Donnelly et al. 2016); and (3) the country of Sweden, S-HYPE version 16D (
Strömqvist et al. 2011). All three applications have been previously used in operational forecasting and various impact analyses (
SMHI 2020a,
c).
Although all these applications are based on the HYPE model code, their internal structure in terms of land use and soil types, as well as their spatial resolution, differs (Table
1). WW-HYPE simulates the largest area (most land surface of planet Earth, except Antarctica) and has the largest number of catchments and unique HRUs (130,000 and 169, respectively). However, per simulated area, the national S-HYPE has the highest number of catchments and unique HRUs and is thus the most detailed large-scale application. E-HYPE, with a continental scale, stands between the global and national models in the catchment size but has a similar number of catchments as S-HYPE (about 35,000). The national model, S-HYPE, with its highest spatial resolution has also the highest number of flow-gauging and sediment-monitoring sites per unit area of the three models. In addition, WW-HYPE’s HRUs are determined from a combination of land use, vegetation type, and elevation. The variability in soil properties is expressed through assigning hydrologically active soil depth based on the variability in vegetation, climate, and elevation (
Arheimer et al. 2020). HRUs for E-HYPE and S-HYPE consider land use and soil type directly, with additional differentiation of agriculture land use by crop types.
The national and continental models were calibrated to daily flows and daily sediment concentrations using a stepwise parameter estimation method where each step focuses on different processes (
Arheimer et al. 2020;
Donnelly et al. 2016;
Strömqvist et al. 2011). The global model was calibrated to monthly flows, daily sediment concentrations, and long-term sediment loads mostly near river outlets to sea. The calibration was a combination of automatic calibration toward the in-stream measurements and manual calibration. Model performance for streamflow expressed as relative error (RE), Nash-Sutcliffe efficiency (NSE), and Kling-Gupta efficiency (KGE) decreases with the increasing area of the model domain and decreasing spatial resolution (Table
1; Figs.
S2 and
S7). Details of model performances for streamflow are available elsewhere (
SMHI 2020a,
c).
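For readers unfamiliar with the three streamflow criteria, the following sketch shows one common way to compute them for a single gauging site. The source does not state the exact formulations used, so the definitions here are assumptions: RE is taken as the percent difference between simulated and observed means, NSE is the standard Nash-Sutcliffe form, and KGE follows the original three-component form (correlation, variability ratio, bias ratio).

```python
import math

def relative_error(sim, obs):
    """Relative error of the mean in percent (assumed definition)."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 minus squared-error sum over observed variance sum."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def kge(sim, obs):
    """Kling-Gupta efficiency from correlation r, variability ratio alpha, bias ratio beta."""
    n = len(obs)
    mean_sim, mean_obs = sum(sim) / n, sum(obs) / n
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs)) / n
    r = cov / (std_sim * std_obs)
    alpha = std_sim / std_obs
    beta = mean_sim / mean_obs
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

observed = [1.0, 2.0, 3.0, 4.0]
simulated = [0.9 * q for q in observed]  # a uniform 10% underestimation
scores = (relative_error(simulated, observed), nse(simulated, observed), kge(simulated, observed))
```

A uniform underestimation leaves the correlation at 1 but lowers both the variability and bias ratios, which is why KGE and RE can respond differently from a correlation coefficient alone.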
The pattern for sediment model performances across the nested domains is not that clear. The average correlation coefficient for sediment concentrations decreases with the increasing area of the domain, but the same pattern is not present for average RE. However, the largest proportion of sites with RE within 25% and 50% is found for S-HYPE (36% and 70%, respectively). The proportions are lower for E-HYPE and WW-HYPE but comparable between these two models.
The nested models used in this study were developed for their specific domain using the best data sources and approaches appropriate for their spatial extent and purpose at the time of their development. Many of the differences among the models are inherently related to the model domain and resolution and are a part of the term scale as used in this study, i.e., when discussing global, continental, and national scales. The comparisons in this study are thus conducted from a practitioners’ point of view, evaluating a range of tools that may be available to decision makers rather than focusing on a single aspect of scale, such as model resolution.
Study Setup
In order to understand how to interpret results from a large-scale model, a series of model outputs at selected scales will be presented. Four hydrologic variables representing different processes were selected for analyses and output from the models: (1) local water runoff generated within each catchment (), (2) soil moisture calculated as the catchment mean annual soil moisture in a root zone as a fraction of field capacity, (3) actual aridity index calculated as mean annual values of the ratio between actual evapotranspiration and precipitation (sometimes also called evaporation ratio or evaporative index), and (4) suspended sediment concentrations (mg/L). The first three variables describe local conditions within each catchment and are not affected by routing through rivers, streams, lakes, or reservoirs.
The selected variables represent both primary outputs used in calibration (albeit streamflow after routing was used for calibration instead of water runoff directly) as well as derived model outputs. The first three variables were calculated for two periods: the present period represented as mean values during 1971–2000 and the future midcentury period represented as percent change from the mean present values projected for 2041–2070. Mean sediment concentrations were calculated for a more recent period of 2001–2010. All four variables are evaluated at the global and continental scales with WW-HYPE and E-HYPE, respectively.
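The derived quantities described above reduce to simple arithmetic. The sketch below assumes that the actual aridity index is the mean of the annual ratios of actual evapotranspiration to precipitation (the text could also be read as a ratio of long-term means) and that the future change is expressed relative to the present-period mean. Function names and the sample values are illustrative only.

```python
def actual_aridity_index(annual_aet_mm, annual_precip_mm):
    """Mean of annual ratios of actual evapotranspiration to precipitation (dimensionless)."""
    ratios = [aet / p for aet, p in zip(annual_aet_mm, annual_precip_mm)]
    return sum(ratios) / len(ratios)

def percent_change(present_mean, future_mean):
    """Future change as a percentage of the present-period mean."""
    return 100.0 * (future_mean - present_mean) / present_mean

# Two illustrative years for one catchment (values are made up).
index = actual_aridity_index([300.0, 400.0], [600.0, 800.0])
change = percent_change(present_mean=200.0, future_mean=230.0)
```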
Then, a series of comparisons was conducted to compare the results of these models developed at two different scales for two nested spatial domains. Spatial variabilities of the present mean values, and the projected future changes for water runoff were graphically compared for E-HYPE and WW-HYPE outputs for Europe. In addition, the cumulative distributions were compared over the respective domain areas by sorting the values in ascending order and calculating a percentage of area with runoff or changes at or below each value. The confidence in the model predictions was represented by calculating a percentage of GCMs agreeing on the sign of the change at each catchment for two different impact thresholds where changes within 1% and within 5% were not considered significant, respectively.
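The area-based cumulative distribution described above can be sketched as follows. The function sorts catchment values in ascending order and reports the percentage of the domain area with values at or below each value; the name `area_cumulative_distribution` and the sample numbers are hypothetical.

```python
def area_cumulative_distribution(values, areas):
    """Return (value, percent of total area with values at or below it), ascending.

    values: one value per catchment (e.g., mean runoff or its projected change)
    areas:  catchment areas in any consistent unit
    """
    total_area = sum(areas)
    cumulative = 0.0
    curve = []
    for value, area in sorted(zip(values, areas)):
        cumulative += area
        percent = 100.0 * cumulative / total_area
        if curve and curve[-1][0] == value:
            # Catchments sharing a value are merged so "at or below" holds exactly.
            curve[-1] = (value, percent)
        else:
            curve.append((value, percent))
    return curve

curve = area_cumulative_distribution(values=[3, 1, 2], areas=[10, 30, 60])
```

Weighting by area rather than by catchment count keeps the curves comparable between models whose catchment sizes differ, which is the situation when E-HYPE and WW-HYPE are compared over the same domain.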
Finally, mean runoff and sediment concentrations were exported from all three models for Sweden for years 2001–2010. The shorter time period better reflects the calibration data used at the national scale. Here, spatial variability of both variables and cumulative distribution functions for catchment-average sediment concentrations are compared. At this time, the impact of climate change was not evaluated with S-HYPE, and none of these models have been used to assess the impact of a changing climate on sediments.
Future Impact Scenarios
Aside from structural differences in the model directly related to the scale (e.g., catchment size), other differences arise due to different data sets being available or appropriate for each scale, domain, and purpose. This includes climate data for future impact scenarios because a different set of climate models may be available for each model domain. An ensemble with three coupled model intercomparison project phase 5 (CMIP5) GCMs was used as forcing data for both WW-HYPE and E-HYPE in this study, presenting results for the representative concentration pathway (RCP) RCP8.5 for illustration (Table
2). RCP8.5 represents a high-emission scenario with an end-of-century radiative forcing of 8.5 W/m². Due to the higher spatial resolution of E-HYPE, the climate forcing data were taken from an ensemble of all regional climate models (RCMs) available for each GCM. Results from the individual RCMs were evaluated in order to assess the impact from regional downscaling via assessing the variability among the RCMs. Results from other GCMs/RCMs and RCPs have also been calculated but are not presented here to simplify the analyses and to limit them to the same GCMs for both models, but the full set of results is available elsewhere (
SMHI 2020a).
The forcing data were bias-adjusted to a reference data set, HydroGFD v2 (
Berg et al. 2018), using a distribution-based scaling (DBS) method (
Yang et al. 2010). The DBS method matches observed and simulated frequency distributions by assuming variable-dependent theoretical distributions. The reference period for the calibration of the bias-adjustment parameters was set to 1971–2000. The same bias-adjustment method was used for both WW-HYPE and E-HYPE forcing data.
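The idea of matching frequency distributions through theoretical distributions can be illustrated with a deliberately simplified sketch. It assumes a single normal distribution for both the reference data and the climate model output, which might be plausible for a variable such as temperature; the source does not state which distributions were used for which variable, and this is not the DBS implementation itself. The helper name `fit_normal_scaling` is hypothetical.

```python
from statistics import NormalDist

def fit_normal_scaling(reference_obs, reference_sim):
    """Fit normal distributions over the reference period and return an adjustment function.

    Assumption: both series are approximately normal. Each simulated value is mapped
    to its cumulative probability under the simulated distribution and then to the
    value with the same probability under the observed distribution.
    """
    obs_dist = NormalDist.from_samples(reference_obs)
    sim_dist = NormalDist.from_samples(reference_sim)

    def adjust(value):
        p = sim_dist.cdf(value)
        # Keep the probability strictly inside (0, 1), as inv_cdf requires.
        p = min(max(p, 1e-9), 1.0 - 1e-9)
        return obs_dist.inv_cdf(p)

    return adjust

# Simulated reference data with a warm bias of 2 units and the same spread.
adjust = fit_normal_scaling(reference_obs=[8.0, 10.0, 12.0], reference_sim=[10.0, 12.0, 14.0])
adjusted_future = [adjust(v) for v in [12.0, 14.0, 15.0]]
```

The parameters are fitted once on the reference period (1971–2000 in this study) and then applied unchanged to the future period, so the projected change signal of the climate model is retained while the distributional bias is reduced.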
Results and Discussion
Present and Future Global Hydrological Conditions
With the use of a hydrological model, one can explore not only streamflow and runoff but also other derived hydrological variables that can provide additional information and aid the interpretation. There is a great range of conditions occurring across the continents with a large variability even within each continent (Fig.
1). Water runoff [Fig.
1(a)] shows large global patterns resulting from grouping areas with similar runoff. General underestimation of runoff in the Western Plains and Rocky Mountains in the US and Brazilian Highlands (see
SMHI 2020a) is contributing to the low simulated runoff in these areas. The changes in water runoff [Fig.
1(b)] are dominated by an increase projected with a medium to high confidence in the direction of the projected change across the domain (Fig.
S1). A decrease in average runoff is projected for large parts of Europe, Central and South America, eastern Australia, and parts of the US as well as Asia’s mountain region. Comparing the confidence maps for projected decreases and increases can identify areas where GCMs do not agree on direction, e.g., surrounding the northern part of South America with projected decrease above 5%.
In general, the spatial pattern of projected decrease in runoff is mostly consistent with findings of other studies, e.g., Koirala et al. (
2014), who evaluated changes in mean and extreme streamflow using runoff projected by 11 GCMs (RCP8.5) for 2071–2100, or Schewe et al. (
2014), who used an ensemble of 11 hydrological, land-surface, and vegetation models with five GCMs (RCP8.5). There are also some important differences because the extent of the projected decrease is overall much larger in Schewe et al. (
2014), e.g., for the US, where WW-HYPE projects a much larger area with runoff increase. However, the different numbers of GCMs used in different studies, together with other assumptions, e.g., on the reference period of bias-adjustment methods, make the comparison of the impacts difficult.
Merks et al. (
2020) found large differences between both current discharges and future changes projected with WW-HYPE using the full ensemble of 18 GCMs and those simulated with the variable infiltration capacity (VIC) model (
Liang et al. 1994;
van Vliet et al. 2015). The differences were partly attributed to propagation of change in areas with low annual precipitation and discharge, where it is difficult to predict runoff, and partly to differences between the models themselves (hydrological versus land-surface model), the evapotranspiration routines, and GCM ensembles.
Present soil moisture and actual aridity [Figs.
1(c and e)] largely follow runoff patterns. However, many areas with projected increase in runoff have a projected decrease in soil moisture [Figs.
1(b and d)]. This may indicate possible changes projected for evapotranspiration, with further implications for local water balance and availability. Examining projected changes in actual aridity [Fig.
1(f)] supports this conclusion. Here, the projected changes are of lower magnitude, and the pattern slightly differs. Slight increases are projected for areas in North America, the northern part of South America, most of Europe, central Africa, and parts of Asia.
Global distribution of sediment concentrations partly follows the distribution of runoff at a first look (Fig.
2). However, closer examination shows low sediment concentrations present in areas of both low and medium runoff (e.g., northern Europe and Asia). In other places, such as the western US, Australia, or Central Asia, areas with low runoff but high sediment concentrations can be found. Areas with low runoff typically sustain less vegetation, and as such, certain substrates may be prone to higher erosion rates during the runoff events. Because the erosion capabilities of runoff further vary with vegetation, land use contributes to deviations in the runoff-sediment relationship.
Global Model at Continental Domain for Europe
When examining outputs from WW-HYPE for Europe and E-HYPE, the differences in spatial resolution are immediately noticeable with the global model showing larger patches of uniform colors, whereas the continental model highlights numerous smaller areas that differ from the general trends in the same regions (Fig.
3). The patterns for historical values of runoff agree between both models [Figs.
3(a and b)] with the highest runoff shown at the outer Atlantic coast and in mountainous regions (Alps, Pyrenees, and Dinarides) and the lowest runoff values in southern and eastern parts of the E-HYPE domain. WW-HYPE simulates mean flows over the European domain with better accuracy than over the global domain despite the underestimation bias (Fig.
S2).
Although the choice of climate forcing data does not significantly affect results for the historical time period, E-HYPE produces a higher range of runoff values (Fig.
4). The largest discrepancy between the simulated runoff is for runoff values of 5–10 and
, where the proportion between the two models differs by 8.1% and 2.8%, respectively. E-HYPE also produces 12.6% more area with runoff above
, of which about half (6.8%) is for runoff between 30 and
. The proportion of areas with runoff below 5 or within
is comparable between the models.
The difference between the simulated ranges of values is even more significant for soil moisture and the actual aridity index. E-HYPE simulates a wider range and a higher proportion of medium values for soil moisture and actual aridity index. A larger catchment can represent a larger variety of physiographical, geospatial, and hydroclimatic conditions that are averaged over the catchment area, providing a smoother response than a number of smaller catchments would.
Ensemble-averaged changes in runoff projected by WW-HYPE and E-HYPE [Figs.
3(c and d)] agree for the southern and northern portions of the E-HYPE domain but show opposite trends for a large central portion. Investigation of results for individual GCMs and RCMs shows E-HYPE consistently projects overall decrease of runoff in this area (Figs.
S3–
S6), whereas the direction of the projected change varies for WW-HYPE: two GCMs result in projecting an increase and one GCM in a decrease. It is also noteworthy that relatively small changes are projected for a large portion of the area by both models (approximately 30% of the European domain area shows a change within 5% for either model) (Fig.
S6).
The cumulative distribution curves [Figs.
4(b and d)] for projected changes in runoff and soil moisture show an overall agreement between the two models despite some differences in the historical simulations. The variability in the runoff change (and for the most part, the variability in the soil moisture change) projected by WW-HYPE due to GCM choice was much larger than that projected by E-HYPE for the same GCMs. This is consistent with general expectations of larger uncertainties in the changes projected by the global model. However, it has serious implications for interpretation of the changes projected by a global model, highlighting the need for using the confidence in the direction of the change together with the mean ensemble projections. Larger uncertainties can then be reflected in an increased threshold that must be reached before indicating a change is projected.
The distribution of changes projected for actual aridity index varied significantly between the two models both in the magnitude of the change over the European domain as well as direction of the change in some areas [Figs.
3(g and h)]. This implies that the differences in the model structures and parameters play a significant role, possibly through differences in land use, elevation, and vegetation characteristics, and their impact on actual evapotranspiration.
Comparison of sediment concentrations from WW-HYPE and E-HYPE reveals large differences in the model results (Fig.
5). Both spatial patterns and concentration values varied significantly between the two models despite an overall comparable model performance over their respective domains (Fig.
S2). However, WW-HYPE’s performance worsened when only the European domain was considered (Table
1). All stations with observed data are displayed in Fig.
S2, and the model was calibrated using a subset of sites carefully selected to equally represent a range of conditions with respect to climate, physiographic properties, streamflow, and sediment concentrations present in the full domain. Large differences were found between the cumulative distribution functions for the simulated and observed sediment concentrations for all three models at their original domain as well as for WW-HYPE at the European domain (Fig.
6, dashed lines).
Global and Continental Models at National Domain for Sweden
The effects of model scale become even more obvious when the global- and continental-scale models are compared with the Swedish national model, S-HYPE, that was developed at a much higher resolution (Fig.
7). All three models capture the general pattern of the highest runoff being present in the mountains on the western edge of Sweden [Figs.
7(a–c)]. However, the magnitude as well as the spatial extent of these high-runoff areas increase with an increasing model resolution. WW-HYPE produces lower runoff overall and shows several areas with runoff below
that are not present in the two other models.
The continental-scale model, E-HYPE, simulates mean sediment concentrations higher than either S-HYPE or WW-HYPE for Sweden [Figs.
5(d–f)]. Despite the differences in the runoff, WW-HYPE simulates sediment concentrations at a similar level as S-HYPE and with a similar spatial variability, even if it fails to display the highest concentrations in a band in south-central Sweden. Generally, sediment concentrations in Sweden are quite low, with most catchment-mean concentrations at or below
[Fig.
6(b)], significantly lower than typical sediment concentrations in many other countries. Nevertheless, sediment concentrations from WW-HYPE also match the probability distribution of sediment concentrations from S-HYPE much more closely than those from E-HYPE (Fig.
6). WW-HYPE, however, fails to reproduce the highest concentrations from S-HYPE. Large differences were found between cumulative distribution functions for the observed and simulated sediment concentrations even at the Swedish domain (Fig.
6, solid lines), although the closest match was found for E-HYPE at the Swedish domain.
The difference in results is rather interesting because both E-HYPE and WW-HYPE use the same erosion module and have very similar model performance indicators for 2001–2010 at their respective domains (Table
1). It is also of interest to note that although the overall model performance is similar for the two models, median and average REs for S-HYPE and WW-HYPE are negative, whereas E-HYPE has negative median RE but positive average RE. A thorough investigation of the average simulated and observed values across stations for the nested models shows (1) a severe lack of low sediment concentrations among observed data for both E-HYPE and WW-HYPE (less than 1% compared with 18% for S-HYPE), (2) a decreasing correlation between the simulated and observed values across stations with increasing model scale, and (3) a decreasing correlation between the simulated and observed values across stations for WW-HYPE with decreasing size of the nested domains (Figs.
S2 and
S7; Table
1). This illustrates that different model performance criteria can play very different roles and raises the question of which model performance indicator or indicators are best used when evaluating the usefulness of large-scale sediment models.
These results can potentially mean that the observed data overrepresented certain conditions and physiographic characteristics. For example, 89% and 65% of S-HYPE catchments showed simulated sediment concentrations at or below 5 and , whereas the same concentrations were found at only 51% and 18% of the catchments with observed data, respectively. At the same time, the information contained in the data observed at the calibration sites was not necessarily sufficient to fully calibrate the models across all investigated domains. In this case, the three models can be interpreted as an ensemble of models, and their overall range used to represent the uncertainty of model predictions.
Usability and Applicability of Large-Scale Models
Large-scale models are needed to provide information to global and regional stakeholders and decision makers. The challenges are (1) deciding when one should use the large-scale models and (2) interpreting the results considering the uncertainty of the model results and quality of data, especially at the global scale. Differences among the available large-scale approaches and the selected ensemble of GCMs/RCMs contribute, among others, to inconsistencies in the projected climate change impacts. Efforts to simplify the message can then lead to conflicting messages and consequently to a lack of trust from decision makers. Robustness of the projected change thus needs to be an integral part of the assessment.
The authors join the call for collaborative efforts at the global scale in order to assess uncertainty arising from hydrological models and confidence in using them in impact assessments, especially from changing climate. Each model is a plausible representation of reality. Harmonizing the GCM ensemble, the reference period, and the type of model outputs would allow not only for creation of a global ensemble of large-scale hydrological models well-suited for impact analyses but also a systematic assessment of modeling uncertainties (e.g.,
Hagemann et al. 2013;
Schewe et al. 2014). This would complement and build upon the existing studies reporting on uncertainties in climate models (e.g.,
Kumar et al. 2014) and runoff and evapotranspiration approaches (e.g.,
Hagemann et al. 2013;
Haddeland et al. 2011). It is important that the scientific inquiry into differences continues both with models that have similar representation of the hydrological processes, such as in the present study, and with models with different representation in order to expand the ensemble results.
The use of a single large-scale hydrological model can still be justified when, e.g., a harmonized approach is needed over the large domain, as may be a case for some global users, as long as robustness of the projection is assessed and conveyed. Open-access hydrological models such as WW-HYPE can play an important role, e.g., in a screening process or where users do not have the resources in terms of time, finances, data, or skills needed for detailed local analyses. Such models can be used to identify critical regions or regions with high uncertainty and prioritize them for more careful assessments. After such a screening, it is advised to apply a more detailed modeling approach for that specific area to provide decision support for water management or planning of adaptation measures.
The differences in variables derived from primary hydrological outputs deserve further investigation in order to determine whether the more detailed model should be considered more reliable or whether the differences point to a larger uncertainty in general. Evaluating large-scale model performance for sediment needs further consideration because the model comparisons were inconclusive. Models of comparable performance, such as those presented in this study, can produce very different average concentration values, including their spatial patterns. This raises the question of not only which performance indicators should be used but also which conditions dominate the sediment generation and transport processes at these large scales. Given the limited amount of observed sediment data available for surface waters, other complementary data might be useful to reduce the equifinality of the modeling setups. Further model calibration with a better representation of the range of the simulated conditions and a re-evaluation of the scale impact are needed.
Modeling a Change with a Large-Scale Hydrological Model
A systematic approach is needed to select a suite of hydrological models appropriate for the hydrological model ensemble at large scales in order to avoid having many similar models and masking the uncertainty in key processes (
Gosling et al. 2017). A variety of models can be strategically chosen to explore these key processes as well as a parameter space for the model calibration. Model performance also affects the agreement among the hydrological models (
Merks et al. 2020). A standardized evaluation, e.g., over smaller regions, would further help in these evaluations.
Understanding the climate model ensemble and the confidence in the projected changes is critical; this is where a multiple-evidence approach using multiple models is especially useful. An ensemble of GCMs larger than that used in this study is typically needed to explore uncertainty due to climate forcing data. Conclusions should then not be made only from the ensemble averages; the confidence expressed, e.g., as the overall agreement on the direction of the change across the GCMs in the ensemble, should accompany the results. At the same time, a larger threshold should be considered for large-scale models during the interpretation of the results with respect to whether a change is projected to occur.
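As an illustration of how the ensemble average could be accompanied by a measure of agreement on the direction of the change, a minimal Python sketch is given below. It is not the procedure used in this study: the function name, the input values, and the `threshold` parameter (a hypothetical stand-in for the larger threshold suggested for large-scale models) are assumptions made only for this example.

```python
from statistics import mean


def direction_agreement(changes, threshold=0.0):
    """Summarize projected changes from an ensemble of climate models.

    changes:   one projected relative change per GCM (e.g., % change in runoff);
               the values used below are made up for illustration.
    threshold: hypothetical minimum absolute change that counts as a change.
    """
    n = len(changes)
    n_up = sum(1 for c in changes if c > threshold)
    n_down = sum(1 for c in changes if c < -threshold)

    if n_up > n_down:
        direction, agreement = "increase", n_up / n
    elif n_down > n_up:
        direction, agreement = "decrease", n_down / n
    else:
        direction, agreement = "no clear change", max(n_up, n_down) / n

    # Report the ensemble mean together with the share of members
    # that support the dominant direction of change.
    return {"mean": mean(changes), "direction": direction, "agreement": agreement}


# Hypothetical projected changes (%) from five GCMs.
example_changes = [12.0, 8.0, -3.0, 15.0, 4.0]
loose = direction_agreement(example_changes)                  # any nonzero change counts
strict = direction_agreement(example_changes, threshold=5.0)  # only changes above 5% count
```

In this made-up example the ensemble mean is the same in both cases, but the share of ensemble members supporting an increase drops when the stricter threshold is applied, which is the kind of information that would be lost if only the ensemble average were reported.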
It is worth considering the type of the model output when judging the importance of the model scale. Output variables that are calibrated, e.g., runoff calibrated using observed stream flows, were shown to result in more reliable and consistent projected changes among the different model scales than derived variables such as the actual aridity index (
Merks et al. 2020).
Projected impacts due to changing climate should be considered together with changing socioeconomic conditions in order to interpret the magnitude of the projected changes with respect to past or future anthropogenic impacts such as land-use changes, agricultural practices, and municipal discharges. For example, Arheimer and Lindström (
2019) determined that regulation of river flow for hydropower production affected streamflow in Sweden more than land-use change or climate change. The impact of socioeconomic conditions can be rather significant for the transported materials and even comparable in magnitude to the changes projected from the changing climate (
Bartosova et al. 2019;
Haddeland et al. 2014;
Eriksson Hägg et al. 2014).
Conclusions
WW-HYPE, a global catchment-based hydrological model, can be a useful tool to advance understanding of global hydrological patterns and to assess projected impacts from changing climate at a large scale for certain hydrological variables. Comparison of the projected changes with the nested large-scale models showed that caution needs to be exercised when drawing conclusions about the direction of the change because results may vary among the different models, especially for a limited number of GCMs. The differences between WW-HYPE, E-HYPE, and S-HYPE as used in this comparison could arise from a number of sources, such as differences in model structures, process descriptions, spatial resolutions, data sources, regional downscaling of GCMs, or model performance criteria, to name a few. Many of these differences are directly or indirectly related to scale and cannot be separated from it from the practitioners’ point of view.
These differences need to be considered when interpreting the results. The larger the scale, the smaller the expected variability in the outputs because the values are averaged over larger areas in terms of model inputs, process descriptions, and model outputs. The scale of the model and the simulated area can also drive other choices aside from the input data resolution. This is especially the case for climate models, whose selection and availability vary with scale and location.
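The effect of averaging over larger areas can be illustrated with a minimal Python sketch. It uses synthetic, statistically independent values rather than output from any of the models discussed here, and the block size and all names are assumptions made for the example. Real hydrological fields are spatially correlated, so the reduction in variability would typically be smaller than for independent values.

```python
import random
from statistics import mean, pstdev


def aggregate(values, block):
    """Average consecutive groups of `block` values.

    A stand-in for moving from fine to coarse model units; trailing values
    that do not fill a complete block are dropped.
    """
    return [
        mean(values[i:i + block])
        for i in range(0, len(values) - block + 1, block)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic "fine-scale" values (e.g., runoff in arbitrary units).
fine = [rng.gauss(100.0, 30.0) for _ in range(1024)]

# Each coarse unit averages 16 fine units.
coarse = aggregate(fine, 16)

spread_fine = pstdev(fine)      # variability among fine units
spread_coarse = pstdev(coarse)  # variability among coarse units
```

Comparing `spread_fine` with `spread_coarse` shows the smaller spread among the aggregated units, while the overall mean is preserved because the blocks are of equal size.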
This study’s findings show that hydrological results from the global model were comparable to those at the continental or national scale despite the lower model resolution. This demonstrates that large-scale hydrological models can provide useful information on current and future hydrological states across various domains.