Open access
Technical Papers
Jun 11, 2019

New Flood Early Warning and Forecasting Method Based on Similarity Theory

This article has a reply.
Publication: Journal of Hydrologic Engineering
Volume 24, Issue 8

Abstract

The challenge of achieving reliable flood forecasting results in semiarid regions remains stark. We developed a flood early warning and forecasting method based on similarity theory and a hydrological model to extend the lead time and achieve dynamic rolling forecasting. A multimeasure-based rainfall event similarity analysis (MRSA) method was proposed for rainfall forecasting based on the similarity evaluation between two rainstorms from multiple perspectives (including the quantity similarity, pattern similarity, earth mover’s distance, and rainstorm spatial distribution similarity). Moreover, an ideal sample experiment was conducted to verify the method’s rationality and feasibility. The MRSA method for rainfall prediction and the vertically mixed runoff model were applied to the Beiniuchuan River, located in the middle Yellow River basin. Results showed that the flood forecast is continuously updated and that the prediction accuracy gradually increases as more rainstorm and flood information becomes available. Therefore, the proposed flood forecasting method based on similarity analysis is effective and applicable.

Introduction

In recent years, catastrophic rainfall events have led to flash floods and caused extensive social and economic losses (Li et al. 2019). To reduce the effects of floods on public safety and the economy, early and accurate flood warning systems are required (Thielen et al. 2009). These systems rely on watershed hydrological models, which are mainly dependent on rainfall input data, especially the forecasted rainfall during the model lead time. Consequently, there is a need to implement forecasting systems to predict the rainfall within the lead time period.
Short-term rainfall forecasting is an ongoing challenge in hydrology, and tremendous efforts have been made with regard to related issues (Grimes and Diop 2003; Manzato 2007; Liguori et al. 2012; Gagnon et al. 2017). Currently, short-term rainfall forecasts are commonly obtained from radar-based nowcasting methods, numerical weather prediction models, and statistically based techniques (Liguori et al. 2012; Wang et al. 2015). Radar-based nowcasting methods can provide very-short-term forecasts based on radar image extrapolation. However, the associated forecasting results have considerable uncertainties regarding rainfall intensities (Krzysztofowicz 1995), and further studies are required for the optimal estimation and fusion of multisensor data (Bocchiola and Rosso 2006). With rapid developments in atmospheric science and computer technology, numerical weather prediction models can now provide short-term rainfall forecasts. These models can generate 1–3-day-ahead valid rainfall forecasts based on the current weather conditions on a considerable spatial scale (Sirangelo et al. 2007; Shahrban et al. 2016). However, these prediction models cannot reliably describe and simulate the small-scale evolution of the atmosphere because of spin-up problems and coarse spatial resolutions (Mecklenburg et al. 2000). To produce short-term rainfall forecasts for small basins, statistically based techniques are generally most appropriate (Burlando et al. 1993; Luk et al. 2000; Nikam and Gupta 2014) because these approaches can be employed to effectively model nonlinear rainfall events with limited assumptions. Similarity searching constitutes one of the core tasks in statistically based methods; that is, different types of similar sequence pairs can be identified in a hydrological sequence library. Accordingly, research on similarity searching methods can benefit rainfall-runoff forecasting, analyses of environmental evolution, and studies in other fields. 
The most direct application of similarity searching in hydrology is to determine whether a current hydrological process is similar or equivalent to a process in a historical period. Therefore, similarity searches of hydrological series have significant potential for flood control and forecasting applications.
Over the past few decades, considerable research in the literature has focused on similarity analysis in the field of hydrology. Veitzer and Gupta (2001) applied the random self-similar network (RSN) model to search for width function maximums with implications for floods. Zhu et al. (2008) proposed a similarity measure and performed similarity mining based on hydrological time series. Ouyang et al. (2010) used dynamic time warping (DTW) to identify similarities in a discharge process. Mishra et al. (2013) employed a similarity matrix based on DTW to search discharge data under given climatic conditions. Sharma and Bose (2014) predicted rainfall totals using a K-nearest neighbor (NN)–based similarity measure and a historical data set. Dilmi et al. (2017) presented a modified DTW technique to quantitatively estimate the similarity among rainfall time series.
In humid areas, most watershed hydrological models, such as the Stanford model, Xinanjiang model, and Sacramento model, can achieve good flood forecasting results. However, achieving reliable results still constitutes an ongoing challenge for flood forecasting in semiarid regions. The main sources of difficulty are the high temporal and spatial variability of rainfall, the large and variable transmission losses, and the seasonal variability of vegetation and its impact on runoff (Al-Qurashi et al. 2008). In the Loess Plateau of North China, a typical semiarid region, the performance of traditional hydrological models is especially limited. Floods in this area commonly have a high flood peak and a short duration. Coupled with the uneven distribution of the underlying surface in the area, the runoff-generation mechanism in the middle Yellow River basin is complicated. Traditionally, flood peak forecasting in this area is based on the correlation between the upstream and downstream flow, and the accuracy is usually not good. Hence, in China, it is widely acknowledged that flood forecasting for this area is a difficult task.
In this study, we conducted research on flood peak forecasting with a longer lead time based on similarity theory and a watershed hydrological model, thereby providing an effective approach to the previously mentioned problem. To extend the lead time, it is essential to identify the actual flood-causing rainfall event as early as possible. The rainfall is predicted with a multimeasure-based rainfall event similarity analysis (MRSA) method. Both the observed rainfall and the predicted rainfall form the inputs to a rainfall-runoff model that is used to produce real-time early warnings for floods and perform flood forecasting. The MRSA method offers a comprehensive rainfall similarity measure based on the quantity similarity index, pattern similarity index, earth mover’s distance, and distribution similarity index. Even if floods in the semiarid region cannot be predicted with high accuracy, the proposed method can estimate flood peaks in advance, providing valuable information for reducing the risks associated with flood control.

Study Area and Data

The Beiniuchuan River is a first-order tributary of the left bank of the Kuye River and a secondary tributary of the Yellow River (Fig. 1). The length of the main stream is 109 km, and the river covers an area of approximately 2,274 km². The Beiniuchuan River mainly flows through hilly loess areas. The Beiniuchuan River basin receives an average of 425.1 mm of precipitation annually, mostly in the summer (70%–80%). The annual mean temperature in the basin is 7.9°C, and the annual runoff totals 1.15×10⁹ m³, most of which is snowmelt and precipitation. The Xinmiao hydrological station is the key station on the Beiniuchuan River.
Fig. 1. Beiniuchuan River basin.
The data used in this study include the rainstorm and flood records of 12 rain gauge stations from 1980 to 2007. The precipitation and runoff data were provided by the Hydrology Bureau, Yellow River Conservancy of the Ministry of Water Resources, China.

Methodology

Multimeasure-Based Rainfall Event Similarity Analysis Method

Variations in basin-scale runoff are influenced by the amount of rainfall and its spatiotemporal distribution. In this case, four basic measures were used to evaluate the similarity between two rainstorm events: the quantity similarity, pattern similarity, earth mover’s distance index, and rainstorm spatial distribution similarity. The first three measures compare the similarity of area-averaged rainfall in the total rainfall amount and temporal variations, while the fourth measure evaluates the similarity of two rainstorms at various stations. Then a comprehensive similarity measure was constructed through a weighting method based on the preceding measures, which can be used as an index to evaluate the overall similarity between two rainstorm events. Notably, several computational methods, such as the arithmetic mean method, Thiessen polygon method, and isohyetal method, can be used to obtain the area-averaged precipitation values. In this paper, the arithmetic mean method was adopted to calculate the area-averaged precipitation.

Quantity Similarity

Suppose there are N rainfall stations in the studied basin. The observed precipitation from each gauging station and the area-averaged precipitation at time t from one rainfall event are represented as $X_t^k$ and $X_t$, respectively. For other events, these values are $Y_t^k$ and $Y_t$, respectively, where $k=1,2,\ldots,N$ and $t=1,2,\ldots,T$. The difference between two rainstorm events based on the total precipitation can be defined as the quantity similarity
$$\mathrm{quantity}(X,Y)=\left|\sum_{t=1}^{T}X_t-\sum_{t=1}^{T}Y_t\right|$$
(1)
where X denotes the current rainfall event; Y denotes a historical rainfall event; and T represents the period over which the current rainstorm has been ongoing. Here, the current rainfall event refers to rainfall that has occurred at and before moment T. The quantity similarity index describes the similarity degree of two rainstorm events based on the cumulative precipitation. Specifically, lower values of quantity(X,Y) indicate greater similarity between two rainstorm events.

Pattern Similarity

The similarity of two rainstorms in temporal variation is defined as the pattern similarity. Let $\mathrm{con}(t)=(X_t-X_{t+1})/(Y_t-Y_{t+1})$. The pattern similarity at time t can then be expressed by a unit step function
$$\mathrm{Score}(t)=\begin{cases}0, & \mathrm{con}(t)>0\\ 1, & \mathrm{con}(t)\le 0\end{cases}$$
(2)
where con(t) represents the consistency of their trends at time t. If the two rainstorms exhibit the same trend at time t (both are increasing or both are decreasing), the value of con(t) will be positive, and Score(t) will be assigned a value of 0. In contrast, Score(t) will be assigned a value of 1 if the opposite trend is observed. The cumulative unit step function $\sum_t \mathrm{Score}(t)$ is employed to estimate the pattern similarity of two rainstorm events. Because Score(t) counts the time steps with opposing trends, the lower the value of this index, the more similar the trend over time, and vice versa.
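The two indicators above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the quantity measure is taken as the absolute difference of the totals, and a flat step in the historical series (where con(t) would require division by zero) is scored as 1, which is an assumption the text does not address.

```python
def quantity_similarity(x, y):
    """Eq. (1): absolute difference of the cumulative area-averaged rainfall."""
    return abs(sum(x) - sum(y))


def pattern_similarity(x, y):
    """Sum of Score(t) from Eq. (2) over consecutive time steps.

    A lower value means fewer steps with opposing trends.
    """
    total = 0
    for t in range(len(x) - 1):
        dx = x[t] - x[t + 1]
        dy = y[t] - y[t + 1]
        # con(t) = dx / dy is positive exactly when dx and dy share a sign.
        # Testing the product avoids dividing by zero when dy == 0; scoring
        # such a step as 1 is an assumption of this sketch.
        total += 0 if dx * dy > 0 else 1
    return total


current = [1.0, 3.0, 2.0, 0.0]      # illustrative values, not basin data
historical = [2.0, 5.0, 1.0, 1.0]
print(quantity_similarity(current, historical))
print(pattern_similarity(current, historical))
```

In this toy case the two series rise and fall together for the first two steps and differ only in the last one, where the historical series is flat.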

Earth Mover’s Distance

To address the limitations of the simplistic indicators described previously, we propose the use of the earth mover’s distance (Rubner et al. 2000), which can be intuitively regarded as the minimum cost that must be paid to transform one rainfall process into a different process to evaluate the dissimilarity between two rainstorms. The earth mover’s distance between rainfall events X and Y is defined by the solution to a transportation problem: the least expensive flow $f_{ij}$ that moves rainfall from X to Y. Our goal is to minimize $\sum_{i=1}^{T}\sum_{j=1}^{T}d_{ij}f_{ij}$ subject to the following conditions
$$f_{ij}\ge 0\quad\text{for }1\le i\le T,\ 1\le j\le T$$
(3)
$$\sum_{j=1}^{T}f_{ij}\le X_i\quad\text{for }1\le i\le T$$
(4)
$$\sum_{i=1}^{T}f_{ij}\le Y_j\quad\text{for }1\le j\le T$$
(5)
$$\sum_{i=1}^{T}\sum_{j=1}^{T}f_{ij}=\min\left(\sum_{i=1}^{T}X_i,\ \sum_{j=1}^{T}Y_j\right)$$
(6)
where $d_{ij}$ is a descriptor of the similarity between the ith (at time i) and jth (at time j) rainfall amounts associated with processes X and Y, respectively. Concretely, $d_{ij}=|i-j|$ is adopted. Thus, if the rainfall events are moved at the same time (i=j), the distance $d_{ij}$ will be relatively small, and vice versa. For instance, $d_{12}$ will be assigned a value of 1 if a single unit during the first time step of the current rainfall event is moved to the second time step during a historical rainfall event; additionally, when a single unit during the first time step of the current rainfall event is moved to the third time step of a historical rainfall event, $d_{13}$ will be assigned a value of 2. Moreover, $f_{ij}$ represents the cost of transporting a single unit of the current rainfall event X to a historical rainfall event Y. In this study, we employed $f_{ij}=|X_i-Y_j|$ in the computations. After a search is performed to determine the optimal flow, the earth mover’s distance is defined as the resulting work normalized by the total flow
$$\mathrm{EMD}(X,Y)=\frac{\sum_{i=1}^{T}\sum_{j=1}^{T}d_{ij}f_{ij}}{\sum_{i=1}^{T}\sum_{j=1}^{T}f_{ij}}$$
(7)
A low earth mover’s distance value indicates a high similarity degree between two rainfall events, and vice versa.
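For intuition, the transportation problem has a simple closed form in one special case. When both events carry the same total rainfall, the supply and demand constraints hold with equality, and with the ground distance |i − j| the minimum work equals the sum of the absolute cumulative differences between the two series. The sketch below covers only that equal-total case. It solves for the optimal flow directly rather than setting the flow to |X_i − Y_j| as reported in the text, and the function name is hypothetical. Unequal totals would need a general linear-programming solver, which is not shown.

```python
def emd_equal_mass(x, y):
    """Earth mover's distance on a time axis with d_ij = |i - j|.

    Assumes sum(x) == sum(y) > 0 (equal-total special case only).
    """
    total = sum(x)
    if total <= 0 or abs(total - sum(y)) > 1e-9 * max(total, 1.0):
        raise ValueError("this sketch needs equal, positive totals")
    work = 0.0
    carried = 0.0  # net rainfall still to be shifted across the next step
    for xt, yt in zip(x, y):
        carried += xt - yt
        work += abs(carried)  # every carried unit moves one time step
    return work / total       # normalize the work by the total flow


# One unit moved from the first to the third time step: distance 2.
print(emd_equal_mass([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))
```

Here a single unit of rainfall has to travel two time steps, which matches the $d_{13}=2$ example in the text.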

Rainstorm Spatial Distribution Similarity

The similarity of two rainstorms at various stations is defined as the rainstorm spatial distribution similarity, which is measured based on the Euclidean distance (ED)
$$\mathrm{ED}=\sum_{t=1}^{T}\left(\sum_{k=1}^{N}\left(X_t^k-Y_t^k\right)^2\right)^{1/2}$$
(8)
where $X_t^k$ and $Y_t^k$ represent the rainfall amounts at the kth gauging station at time t for the current and historical rainstorm events, respectively. In the formula, $\left(\sum_{k=1}^{N}(X_t^k-Y_t^k)^2\right)^{1/2}$ represents the Euclidean distance between the previous two rainfall events at time t, and the summed Euclidean distance across all periods is regarded as the rainstorm spatial distribution similarity. When the Euclidean distance is small, their similarity degree is high, and vice versa.
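A minimal sketch of the spatial measure, assuming each event is stored as a list of time steps, with each time step holding one value per station (the data layout and function name are illustrative assumptions):

```python
import math


def spatial_distance(x, y):
    """Eq. (8): sum over time of the station-wise Euclidean distance.

    x[t][k] and y[t][k] hold the rainfall at time step t and station k.
    """
    total = 0.0
    for x_t, y_t in zip(x, y):
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(x_t, y_t)))
    return total


current = [[3.0, 0.0], [1.0, 1.0]]      # two time steps, two stations
historical = [[0.0, 4.0], [1.0, 1.0]]
print(spatial_distance(current, historical))
```

The first time step contributes a distance of 5 (a 3-4-5 triangle) and the second contributes nothing, because the two events agree at every station.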

Comprehensive Similarity Measure

The four previously mentioned indicators are used to evaluate the similarity between two rainstorms from different perspectives. Therefore, it is necessary to build a comprehensive index for their overall similarity. First, to eliminate the influence of the magnitude, a min-max normalization method is employed to standardize the range of each indicator to [0,1]. Assuming that there are R historical rainfall events, the standardized similarity index for the rth ($r=1,2,\ldots,R$) historical rainstorm $Y_r$ compared to the current rainstorm X is calculated by Eq. (9)
$$x_s^{*}(r)=\frac{x_s(r)-x_s^{\min}(r)}{x_s^{\max}(r)-x_s^{\min}(r)}$$
(9)
where $x_s(r)$ represents the sth (s=1, 2, 3, and 4) unstandardized similarity index of the rth historical rainstorm $Y_r$ for the current rainstorm X; $x_s^{*}(r)$ denotes the standardized value of $x_s(r)$; and $x_s^{\max}$ and $x_s^{\min}$ = maximum and minimum values of the sth index, respectively. The overall similarity index W(r) can be calculated based on the average of $x_s^{*}(r)$ (s=1, 2, 3, and 4)
$$W(r)=\frac{1}{M}\sum_{s=1}^{M}x_s^{*}(r)$$
(10)
where $x_s^{*}(r)$ denotes the standardized similarity indicator; and M denotes the number of indicators. In this study, we have four indicators, and thus M is equal to 4.
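The normalization and equal-weight averaging can be sketched as follows. The sketch assumes that the minimum and maximum are taken across the R historical events, and that all four indicators behave as dissimilarities, so the smallest W(r) marks the closest historical event. The function names and the sample numbers are hypothetical.

```python
def min_max(values):
    """Eq. (9): rescale one indicator to [0, 1] across the historical events."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # degenerate case: all events tie
    return [(v - lo) / (hi - lo) for v in values]


def comprehensive_index(indicators):
    """Eq. (10): equal-weight mean of the standardized indicators.

    indicators[s][r] is the raw value of indicator s for historical event r.
    """
    scaled = [min_max(row) for row in indicators]
    m = len(scaled)
    n_events = len(scaled[0])
    return [sum(scaled[s][r] for s in range(m)) / m for r in range(n_events)]


def most_similar(indicators):
    """Index of the event with the smallest W(r) (assumed: lower is closer)."""
    w = comprehensive_index(indicators)
    return min(range(len(w)), key=w.__getitem__)


raw = [
    [3.0, 1.0, 5.0],     # quantity similarity for three historical events
    [20.0, 18.0, 22.0],  # pattern similarity
    [0.4, 0.1, 0.7],     # earth mover's distance
    [2.0, 1.0, 3.0],     # spatial distribution similarity
]
print(comprehensive_index(raw), most_similar(raw))
```

In this made-up example the second event has the smallest value on every indicator, so it receives W = 0 and is selected.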
According to the comprehensive similarity measure, we can find a historical rainstorm that is the most similar to the current rainstorm. Referring to the design flood calculation method, the most similar historical rainfall event identified by the MRSA method is regarded as a typical rainfall event, which is then scaled according to the amount of rainfall to perform rainfall forecasting
$$P(t)=\frac{P_{\mathrm{current}}}{P_{\mathrm{historical}}}\,P_{\mathrm{historical}}(t)$$
(11)
where $P_{\mathrm{current}}$ and $P_{\mathrm{historical}}$ = total rainfall amounts of the current rainstorm and the typical rainstorm, respectively; and $P_{\mathrm{historical}}(t)$ represents the rainfall process of the typical rainfall event.
The preceding rainfall forecasting method based on similarity analysis is simplistic but convenient, and it aims to provide estimates rather than perfectly accurate predictions of future rainfall events. This precipitation forecasting method can extend the lead time as much as possible, and can be efficiently implemented in early warning systems.
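As an illustration of the scaling step in Eq. (11), the sketch below rescales the remaining part of the typical event by the ratio of rainfall totals. The text does not state the period over which the two totals are taken; the sketch assumes both are accumulated over the time steps observed so far. The function name and numbers are hypothetical.

```python
def forecast_rainfall(current, typical):
    """Eq. (11): scale the typical event to forecast the remaining steps.

    current: rainfall observed so far (first T steps of the ongoing event).
    typical: full rainfall series of the most similar historical event.
    Assumption: both totals are accumulated over the first T steps.
    """
    t_now = len(current)
    p_current = sum(current)
    p_historical = sum(typical[:t_now])
    if p_historical <= 0:
        raise ValueError("typical event has no rainfall in the matched period")
    ratio = p_current / p_historical
    return [ratio * p for p in typical[t_now:]]


observed_so_far = [2.0, 4.0]
typical_event = [1.0, 2.0, 3.0, 1.0]
print(forecast_rainfall(observed_so_far, typical_event))
```

The current event has delivered twice the rainfall of the typical event over the matched period, so the remaining typical steps are doubled.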

Vertically Mixed Runoff Model

The Beiniuchuan River basin is located in a semiarid area. The runoff generation mechanism in the basin is complex; consequently, numerous hydrologists have researched this topic (Güntner and Bronstert 2003; Rödiger et al. 2014; Li et al. 2018). Among the developed hydrological models, the vertically mixed runoff model developed by Bao and Wang (1997) is the most suitable for simulating the rainfall-runoff relationship in the study region. This model has been widely applied for flood forecasting in many basins throughout semiarid regions in China (Wang and Ren 2009). In the vertically mixed runoff model, surface runoff depends on the rainfall intensity and infiltration; this runoff mechanism is known as infiltration excess. Moreover, subsurface runoff is dependent on the soil water deficit and infiltration water, which corresponds to the mechanism of runoff formation on repletion of storage. The core of the vertically mixed runoff model is its combined runoff generation theory, which postulates that two runoff generation mechanisms may exist simultaneously in the vertical direction.
After subtracting losses due to evaporation, the net rainfall (PE) is separated into two components, namely, surface runoff (RS) and infiltration flow (FA), according to the infiltration capacity distribution curve upon reaching the ground. During downward infiltration, FA replenishes the soil water content, and runoff is not produced in areas with large soil water deficits. In areas with enough soil water, subsurface runoff (RR) is produced after FA eliminates the soil water deficit. The surface runoff (RS) is calculated based on the improved Green-Ampt infiltration curve as follows (Bao and Zhao 2014):
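$$FM=FC\left(1+KF\,\frac{WM-W}{WM}\right)$$
(12)
$$FA=\begin{cases}FM-FM\left[1-\dfrac{PE}{FM(1+BF)}\right]^{1+BF}, & PE<FM(1+BF)\\[2ex] FM, & PE\ge FM(1+BF)\end{cases}$$
(13)
$$RS=PE-FA$$
(14)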
where FM represents the areal-averaged infiltration capacity (mm); FC represents the stable infiltration rate (mm/day); KF = osmotic coefficient, indicating the sensitivity coefficient of the influence of soil water shortage quantity to the infiltration rate; WM denotes the area-averaged tension water capacity (mm); W denotes the area-averaged tension water storage (mm); and BF = exponent of the infiltration capacity distribution curve, indicating the spatial distribution characteristics of the infiltration capacity.
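The surface runoff calculation in Eqs. (12)–(14) can be sketched as below. The argument names mirror the symbols in the text, and the parameter values in the example are arbitrary illustrations, not the calibrated values. Returning zero runoff when PE is not positive is an assumption of the sketch.

```python
def surface_runoff(pe, w, wm, fc, kf, bf):
    """Return (FA, RS) for one time step following Eqs. (12)-(14).

    pe: net rainfall; w: tension water storage; wm: tension water capacity;
    fc: stable infiltration rate; kf: osmotic coefficient;
    bf: exponent of the infiltration capacity distribution curve.
    """
    if pe <= 0:
        return 0.0, 0.0  # assumption: no infiltration or runoff without net rain
    fm = fc * (1.0 + kf * (wm - w) / wm)                   # Eq. (12)
    limit = fm * (1.0 + bf)
    if pe < limit:                                         # Eq. (13)
        fa = fm - fm * (1.0 - pe / limit) ** (1.0 + bf)
    else:
        fa = fm
    rs = pe - fa                                           # Eq. (14)
    return fa, rs


# Illustrative values only: FM = 10 * (1 + 1 * 0.5) = 15, threshold = 30.
print(surface_runoff(pe=15.0, w=50.0, wm=100.0, fc=10.0, kf=1.0, bf=1.0))
print(surface_runoff(pe=40.0, w=50.0, wm=100.0, fc=10.0, kf=1.0, bf=1.0))
```

In the first call the net rainfall is below the threshold FM(1+BF), so only part of the basin produces surface runoff; in the second call infiltration is capped at FM and the remainder runs off.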
The subsurface runoff (RR) is generated according to the antecedent soil moisture content and the infiltration flow, which belongs to the mechanism of the runoff formation on repletion of storage. It can be calculated by Eq. (15)
$$RR=\begin{cases}FA-WM+W+WM\left(1-\dfrac{a+FA}{WMM}\right)^{1+B}, & FA+a<WMM\\[2ex] FA-WM+W, & FA+a\ge WMM\end{cases}$$
(15)
$$a=WMM\left[1-\left(1-\frac{W}{WM}\right)^{1/(1+B)}\right]$$
(16)
where a = ordinate value of the spatial distribution curve of tension water storage capacity under the condition that the tension water storage is W (mm); $WMM=WM(1+B)$ is the maximum ordinate value of the spatial distribution curve of tension water storage capacity (mm); and B represents the exponent of the spatial distribution curve of the tension water storage capacity.
Therefore, the total generated runoff can be expressed by the following equation:
R=RS+RR
(17)
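The subsurface component and the total in Eq. (17) can be sketched in the same way. The argument names mirror the symbols in the text, the numbers are arbitrary, and the surface runoff value used for the total is simply assumed for the illustration.

```python
def subsurface_runoff(fa, w, wm, b):
    """Eqs. (15)-(16): runoff generated on repletion of storage.

    fa: infiltration flow; w: tension water storage;
    wm: tension water capacity; b: exponent of the storage capacity curve.
    """
    wmm = wm * (1.0 + b)
    a = wmm * (1.0 - (1.0 - w / wm) ** (1.0 / (1.0 + b)))    # Eq. (16)
    if fa + a < wmm:                                          # Eq. (15)
        return fa - wm + w + wm * (1.0 - (a + fa) / wmm) ** (1.0 + b)
    return fa - wm + w


# Illustrative values: WM = 100, B = 1 gives WMM = 200; W = 75 gives a = 100.
rr = subsurface_runoff(fa=20.0, w=75.0, wm=100.0, b=1.0)
rs = 3.75            # assumed surface runoff for the same time step
total = rs + rr      # Eq. (17)
print(rr, total)
```

With no infiltration (FA = 0) the first branch of Eq. (15) returns zero, which is a quick sanity check on the reconstruction of the formula.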
The inputs of the vertically mixed runoff model are rainfall and evaporation, and the outputs are surface runoff (RS) and subsurface flow (RR). The latter variable can be further divided into interflow (RI) and groundwater (RG) according to the free water tank model. Moreover, the runoff in each subbasin is simulated first based on the linear reservoir method, and runoff is then routed down channels to the main basin outlet according to the Muskingum method. In the confluence module, the linear reservoir method is employed, and it contains three parameters: the recession constant of channel network storage CS, the recession constant of groundwater storage CG, and the recession coefficient of river network CR.

Ideal Sample Experiment

An ideal sample experiment is presented to further demonstrate the feasibility of the proposed MRSA method. The employed ideal sample can be generated from any actual rainfall time series, and the specific generation procedure is as follows.
Suppose there are N precipitation stations in the studied basin. The observed rainfall amount and the simulated rainfall amount of the kth station at time t are denoted respectively by $P_t^k$ and $S_t^k$, where $k=1,2,\ldots,N$ and $t=1,2,\ldots,T$. Then, a pseudorandom numerical matrix $A_{T\times N}$ is generated that is uniformly distributed over [0,1], that is, $A_{T\times N}\sim U[0,1]$. If the random number $A_t^k$ is greater than or equal to 0.5, an increased proportion α will be assigned to the simulated rainfall $S_t^k$ based on the observed rainfall $P_t^k$. Similarly, if the random number $A_t^k$ is less than 0.5, a decreased proportion α based on the observed rainfall $P_t^k$ will be assigned to the simulated rainfall $S_t^k$
$$S_t^k=\begin{cases}P_t^k\times(1+\alpha) & \text{for }A_t^k\ge 0.5\\ P_t^k\times(1-\alpha) & \text{for }A_t^k<0.5\end{cases}$$
(18)
where $S_t^k$ denotes the simulated rainfall for the kth precipitation station at time t; $A_t^k$ represents the random number of the kth precipitation station at time t; and α represents the proportional change with values of 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 3, 4, or 5.
In this method, 15 simulated rainfall events are randomly generated that vary from 5% to 500% of the total rainfall of the actual time series, and these simulated series form the ideal sample. Using the ideal sample as a library of historical rainfall events, the proposed MRSA method is employed to search for the most similar rainfall events in the database of ideal samples. This ideal sample experiment can be used to verify the theoretical feasibility of the MRSA method proposed in this paper.
To avoid the influence of accidental error, 1,000 experiments were conducted for each α value, and the average of the 1,000 resulting values was calculated (Table 1). As α increases, the difference between the corresponding simulated rainfall event and the actual rainfall event will increase; thus, the similarity will decrease. Table 1 shows that the values of all five indicators increase with increasing α; therefore, the similarity gradually decreases, which is consistent with the results of the previous analysis. Therefore, the MRSA method is effective and feasible.
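The generation of one ideal sample can be sketched as follows. The function name, the data layout (a list of time steps, each holding one value per station), and the sample values are assumptions made for illustration. The perturbation is applied exactly as written in Eq. (18), without any clipping.

```python
import random

# Proportional changes listed in the text (15 values, 5% to 500%).
ALPHAS = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 3, 4, 5]


def simulate_event(observed, alpha, rng):
    """Eq. (18): raise or lower each station value by the proportion alpha.

    observed[t][k] is the observed rainfall at time step t and station k.
    rng is a random.Random instance, so that runs can be reproduced.
    """
    simulated = []
    for row in observed:
        new_row = []
        for p in row:
            a = rng.random()  # uniform draw playing the role of A_t^k
            # Applied as written: for alpha > 1 the decreased branch
            # produces a negative factor; no clipping is done here.
            factor = 1.0 + alpha if a >= 0.5 else 1.0 - alpha
            new_row.append(p * factor)
        simulated.append(new_row)
    return simulated


observed = [[10.0, 20.0], [0.0, 5.0]]   # illustrative, not basin data
sample = simulate_event(observed, 0.2, random.Random(0))
print(sample)
```

Repeating the call for every value in `ALPHAS` yields the 15 simulated events that make up the ideal sample; repeating the whole procedure many times and averaging corresponds to the 1,000 experiments described above.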
Table 1. Results of the similarity indicators in the ideal sample experiment
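| α (%) | Quantity similarity | Pattern similarity | Earth mover’s distance index | Rainstorm distribution similarity | Comprehensive measure |
|---|---|---|---|---|---|
| 5 | 0.261 | 19 | 0.022 | 0.168 | 0.036 |
| 10 | 0.530 | 20 | 0.045 | 0.336 | 0.057 |
| 20 | 1.067 | 20 | 0.080 | 0.672 | 0.073 |
| 30 | 1.567 | 20 | 0.111 | 1.007 | 0.095 |
| 40 | 2.129 | 21 | 0.140 | 1.343 | 0.106 |
| 50 | 2.655 | 21 | 0.173 | 1.679 | 0.126 |
| 60 | 3.171 | 21 | 0.199 | 2.015 | 0.140 |
| 70 | 3.785 | 22 | 0.229 | 2.351 | 0.156 |
| 80 | 4.082 | 22 | 0.259 | 2.686 | 0.169 |
| 90 | 4.816 | 22 | 0.287 | 3.022 | 0.181 |
| 100 | 5.095 | 22 | 0.316 | 3.358 | 0.189 |
| 200 | 19.342 | 22 | 0.620 | 5.295 | 0.260 |
| 300 | 38.413 | 22 | 1.060 | 7.444 | 0.346 |
| 400 | 58.991 | 22 | 1.525 | 9.770 | 0.433 |
| 500 | 78.712 | 22 | 2.006 | 11.915 | 0.509 |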

Results and Discussion

Model Calibration and Validation

To calibrate and validate the vertically mixed runoff model for the target region, data from 24 rainfall events and corresponding flood data collected at Xinmiao station from 1984 to 2006 were selected to determine the model parameters. These events were arranged in descending order according to their flood peaks. The first eight events were classified as large floods with flood peaks of more than 900 m³/s, the last eight events as small floods with flood peaks of less than 600 m³/s, and the rest as medium floods. Because the runoff generation mechanism is different for floods with different magnitudes, three sets of model parameters were calibrated for large floods, medium floods, and small floods, respectively, as shown in Table 2. In each group, the first five events were used for calibration and the rest for validation. The simulation results for all events are shown in Table 3.
Table 2. Calibrated parameters of the vertically mixed runoff model in the Beiniuchuan River basin
| Parameter | Definition | Large flood | Medium flood | Small flood |
|---|---|---|---|---|
| KC | Ratio of the potential evapotranspiration to the pan evaporation (mm) | 1.0 | 1.0 | 1.0 |
| WM | Area-averaged tension water capacity (mm) | 155.5 | 113.5 | 170 |
| W0 | Initial soil moisture content (mm) | 18.638.331.31 | | |
| FC | Stable infiltration rate (mm/0.5 h) | 57.3756.5447.158 | | |
| KF | Osmotic coefficient | 2.88 | 0.45 | 1.48 |
| BF | Exponent of the infiltration capacity distribution curve | 2.82 | 1.18 | 1.74 |
| B | Exponent of the spatial distribution curve of the tension water storage capacity | 0.29 | 0.205 | 0.34 |
| CS | Recession constant of channel network storage (days) | 0.15 | 0.67 | 0.1 |
| CG | Recession constant of groundwater storage (days) | 0.8 | 0.79 | 0.97 |
| CR | Recession coefficient of river network (days) | 0.82 | 0.83 | 0.68 |
Table 3. Simulated results of the vertically mixed runoff model with calibrated parameters
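| Period | Date of event | Total rainfall (mm) | Duration (h) | Max intensity (mm/0.5 h) | Flood type | Observed flood peak (m³/s) | Forecasted flood peak (m³/s) | Error (m³/s) | Relative error (%) | Qualified or not |
|---|---|---|---|---|---|---|---|---|---|---|
| Calibration | July 30, 1984 | 43.1 | 20 | 7.65 | Large flood | 1,820 | 1,994 | 174 | 10 | Yes |
| Calibration | August 26, 1984 | 42.3 | 21 | 3.21 | Large flood | 924 | 1,030 | 106 | 12 | Yes |
| Calibration | August 24, 1985 | 39.3 | 25 | 4.11 | Large flood | 1,050 | 1,085 | 35 | 3 | Yes |
| Calibration | August 4, 1988 | 47.9 | 23 | 5.91 | Large flood | 1,820 | 1,637 | 183 | 10 | Yes |
| Calibration | June 10, 1991 | 28.2 | 20 | 4.84 | Large flood | 1,280 | 1,299 | 19 | 1 | Yes |
| Validation | August 8, 1992 | 47.1 | 22 | 6.38 | Large flood | 2,010 | 1,856 | 154 | 8 | Yes |
| Validation | August 4, 1994 | 50.6 | 31 | 3.6 | Large flood | 1,220 | 983 | 237 | 19 | Yes |
| Validation | July 30, 2003 | 61.8 | 21 | 8.26 | Large flood | 1,650 | 2,159 | 509 | 31 | No |
| Calibration | August 23, 1985 | 47.4 | 19 | 3.04 | Medium flood | 720 | 865 | 145 | 20 | Yes |
| Calibration | June 24, 1988 | 14.6 | 23 | 2.36 | Medium flood | 692 | 520 | 172 | 25 | No |
| Calibration | August 27, 1990 | 30.2 | 27 | 3.57 | Medium flood | 782 | 863 | 81 | 10 | Yes |
| Calibration | July 28, 1992 | 27.2 | 29 | 2.89 | Medium flood | 600 | 644 | 44 | 7 | Yes |
| Calibration | July 29, 1993 | 14.2 | 19 | 2.49 | Medium flood | 639 | 521 | 118 | 18 | Yes |
| Validation | July 22, 1994 | 20.8 | 23 | 2.67 | Medium flood | 850 | 631 | 219 | 26 | No |
| Validation | July 28, 1995 | 33.1 | 19 | 2.9 | Medium flood | 833 | 734 | 99 | 12 | Yes |
| Validation | July 31, 1997 | 27.9 | 19 | 3.7 | Medium flood | 845 | 930 | 85 | 10 | Yes |
| Calibration | August 23, 1987 | 7.1 | 17 | 1.36 | Small flood | 372 | 400 | 28 | 7 | Yes |
| Calibration | July 25, 1992 | 9.7 | 16 | 2.51 | Small flood | 377 | 386 | 9 | 2 | Yes |
| Calibration | August 26, 1992 | 5.2 | 20 | 1.37 | Small flood | 347 | 377 | 30 | 9 | Yes |
| Calibration | July 26, 1994 | 15.9 | 17 | 2.54 | Small flood | 421 | 498 | 77 | 18 | Yes |
| Calibration | August 2, 1994 | 9.3 | 17 | 1.66 | Small flood | 484 | 480 | 4 | 1 | Yes |
| Validation | July 12, 1996 | 7.6 | 20 | 1.55 | Small flood | 565 | 471 | 94 | 17 | Yes |
| Validation | August 29, 2000 | 6.2 | 20 | 1.47 | Small flood | 406 | 408 | 2 | 1 | Yes |
| Validation | July 14, 2006 | 3.2 | 13 | 1.42 | Small flood | 389 | 312 | 77 | 20 | Yes |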
In this paper, the relative error was selected to evaluate the performance of the vertically mixed runoff model. In accordance with the accuracy standard established by GB/T 22482 (MWR 2008), the prediction result is qualified relative to the flood peak if the absolute value of the relative error is less than 20%. That is, if the relative error of a flood peak is within the range from −20% to +20%, it is considered qualified for prediction, and vice versa. As shown in Table 3, among the 24 floods, 21 are qualified; thus the qualification rate is 87.5%. The average absolute relative error of the 24 floods is 12%, and the relative errors for all events are shown in Fig. 2. It can be concluded from the figure that the relative errors of 10 events are within ±10% and the relative errors of 11 events are between ±10% and ±20%. Among them, the mean relative errors for the large floods, medium floods, and small floods are 12%, 16%, and 9%, respectively, and the bias for all events is 13 m³/s. The average flood peak is 879 m³/s, and the ratio of the bias to the average flood peak is approximately 1.2%. The root-mean-square error (RMSE) of all events in flood peaks is 155 m³/s.
Fig. 2. Relative absolute errors for simulated results of the vertically mixed runoff model with calibrated parameters.
In addition, the bias, relative bias, RMSE, and relative RMSE (RRMSE) measures were also applied for evaluating the performance of the vertically mixed runoff model, as shown in Table 4. It can be concluded that the relative bias values are approximately between −12% and 7% and the RRMSE varies from 9% to 20.7%. Overall, the vertically mixed runoff model is applicable for flood forecasting at Xinmiao station and meets the relevant precision requirements. For real-time application, there is a need to recalibrate the final model with all of the data.
Table 4. Fitting statistics for vertically mixed runoff model
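| Statistics | L:Cal | L:Ver | M:Cal | M:Ver | S:Cal | S:Ver |
|---|---|---|---|---|---|---|
| Bias (m³/s) | 30.2 | 39.3 | −4 | −77.6 | 28 | −56.3 |
| Relative bias (%) | 2.2 | 2.4 | −0.6 | −9.2 | 7.0 | −12.4 |
| RMSE (m³/s) | 123.8 | 336.1 | 120.9 | 147.2 | 39.3 | 70.2 |
| RRMSE (%) | 9.0 | 20.7 | 17.6 | 17.5 | 9.8 | 15.5 |

Note: L = large; M = medium; S = small; Cal = calibration; and Ver = verification.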
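The fitting statistics can be checked against the flood peaks listed in Table 3. The sketch below assumes that the bias is the mean of forecast minus observed, and that the relative bias and RRMSE are the bias and RMSE divided by the mean observed peak. These definitions are not spelled out in the text, but they reproduce the L:Cal column of Table 4 (30.2, 2.2, 123.8, and 9.0). The function name is hypothetical.

```python
import math


def fit_statistics(observed, forecast):
    """Bias, relative bias, RMSE, and RRMSE for paired flood peaks."""
    n = len(observed)
    errors = [f - o for o, f in zip(observed, forecast)]
    mean_obs = sum(observed) / n
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {
        "bias": bias,
        "relative_bias_pct": 100.0 * bias / mean_obs,   # assumed definition
        "rmse": rmse,
        "rrmse_pct": 100.0 * rmse / mean_obs,           # assumed definition
    }


# Large-flood calibration events from Table 3 (flood peaks in m3/s).
observed = [1820, 924, 1050, 1820, 1280]
forecast = [1994, 1030, 1085, 1637, 1299]
stats = fit_statistics(observed, forecast)
print({name: round(value, 1) for name, value in stats.items()})
```

Applying the same function to the other five groups of Table 3 gives negative biases for the medium-flood groups and the small-flood verification group, that is, the model underestimates those peaks on average.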

Flood Forecasting Results

In this paper, an early warning and flood forecasting method is developed based on the following procedure. First, with the MRSA method, the most similar rainfall event in the historical rainstorm event data set is identified based on the known information of the current event. Second, under the assumption that similar rainfall events lead to similar floods, the selected typical rainfall event is scaled according to the rainfall amount to obtain a forecasted rainfall event. Finally, the observed rainfall and forecasted rainfall are input to the vertically mixed runoff model to make a flood prediction. Over time, more rainstorm and flood information is added to the database, and the flood forecasting process is constantly updated. The procedure is applied to Xinmiao hydrological station on the Beiniuchuan River. In this example, there are 85 historical rainstorm events and corresponding flood events, and the time step is 0.5 h (ΔT=0.5 h). Therefore, the flood forecasting calculation has a half-hourly temporal resolution. The rainfall events and corresponding flood events are numbered according to the times at which the rainstorms occurred. Data from three floods are used to illustrate the procedure, and the forecasting results are shown in Table 5. The relative errors for these flood events are 11%, 19%, and 4%, respectively, and the average absolute relative error is 11%. Table 5 shows that the flood peak predictions for these events are all qualified.
Table 5. Flood peak forecasting results
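| Date of event | Observed flood peak (m³/s) | Forecasted flood peak (m³/s) | Bias (m³/s) | Relative error (%) | Qualified or not |
|---|---|---|---|---|---|
| July 27, 1981 | 1,320 | 1,174 | 146 | 11 | Yes |
| August 26, 1984 | 924 | 1,082 | 158 | 19 | Yes |
| July 31, 1997 | 845 | 805 | 40 | 4 | Yes |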
Taking the first three time steps (3ΔT) of the July 31, 1997, rainfall event as an example and comparing them with the historical precipitation events, the comprehensive similarity index rankings are determined. The June 13, 1984, rainfall event is the most similar to the July 31, 1997, rainfall event according to the rainfall information from the first 3ΔT. The hydrograph of the rainfall events is shown in Fig. 3.
Fig. 3. Rainfall forecasting result for July 31, 1997, rainstorm event based on the first three ΔT.
Similarly, taking the first 6ΔT and 9ΔT of the July 31, 1997, rainfall event as examples and comparing each with the historical precipitation events, the comprehensive similarity index rankings are determined. Notably, the August 24, 1985, rainfall event is the most similar to the July 31, 1997, rainfall event according to the rainfall information from both the first 6ΔT and the first 9ΔT. The hydrographs of the rainfall events are shown in Figs. 4 and 5.
Fig. 4. Rainfall forecasting result for July 31, 1997, rainstorm event based on the first six ΔT.
Fig. 5. Rainfall forecasting result for July 31, 1997, rainstorm event based on the first nine ΔT.
Next, the information from the preceding three forecasted rainfall events is input into the vertically mixed runoff model, and the associated flood peaks are predicted based on the different rainfall forecasting results, as shown in Table 6.
Table 6. Forecasted flood peaks with different input rainfall information for example of July 31, 1997, rainfall event
| Input rainfall information | Observed flood peak (m³/s) | Forecasted flood peak (m³/s) | Relative error (%) |
|---|---|---|---|
| 3ΔT | 845 | 701 | 17 |
| 6ΔT | 845 | 733 | 13 |
| 9ΔT | 845 | 805 | 4 |
When only the first 3ΔT of the current rainstorm information are used as inputs, the rainfall of the July 31, 1997, event is most similar to that of the June 13, 1984, historical event. However, in this case the forecasted flood peak is lower than the actual flood peak, and the relative error is 17%. When the first 6ΔT of the current rainstorm information are used as inputs, the rainfall of the July 31, 1997, event is most similar to that of the historical August 24, 1985, event, and the relative error is 13%. When the first 9ΔT are used, the forecasted flood peak is closer to the actual flood peak than in the preceding searches, and the relative error is 4%. According to these results, as the amount of rainstorm and flood information used in a search increases, the flood peak forecasting accuracy increases. Therefore, the proposed flood forecasting method based on rainfall similarity analysis is effective and can provide early estimates of flood peaks.

Further Discussion of Weights in MRSA

In the proposed MRSA method, four similarity measures are used to evaluate the degree of similarity between two rainstorm events: the quantity similarity, the pattern similarity, the earth mover’s distance, and the rainstorm spatial distribution similarity. The first three measures are constructed for the area-averaged rainfall, and the fourth involves the spatial distribution of the rainstorm events. Among the measures, the quantity similarity compares the accumulated rainfall amount, while the pattern similarity compares the temporal variation of the rainstorm events. The earth mover’s distance evaluates the visual differences between two rainstorm sequences and combines aspects of the quantity similarity and the pattern similarity. These three measures evaluate the temporal similarity between two rainstorms, and both the accumulated rainfall amount (quantity similarity) and the trend of the rainstorm (pattern similarity) have a significant impact on the runoff yield process.
In this paper, a dynamic rolling forecasting method is put forward based on similarity theory (the MRSA method) and a watershed hydrological model (the vertically mixed runoff model), and its application is illustrated. Weight optimization is not the focus of this study, but it merits further discussion. Equal weights are assigned to the four similarity measures here; however, a specific application will generally have its own optimal weights. The average Euclidean distance (AED) between two rainstorm sequences is employed to assess the similarity between two rainstorm events. Taking the main rain process of the actual July 24, 1981, rainstorm event as an example (which includes 30ΔT), the results of the MRSA method based on each single similarity measure and on the comprehensive similarity measure are provided in Table 7, along with the characteristics of each rainstorm event (total rainfall amount and maximum intensity) and its AED from the original rainstorm, defined as
$$\mathrm{AED} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(X_t - Y_t\right)^2} \tag{19}$$
where $X_t$ = precipitation of one rainstorm event in time step $t$; $Y_t$ = precipitation of the other rainstorm event in time step $t$; $T$ = total duration (number of time steps); and AED = average Euclidean distance.
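A minimal sketch of the AED calculation is given here for illustration. It assumes that Eq. (19) is the root-mean-square form, that is, the square root of the mean squared difference between the two sequences over the T time steps; the function name and the two short sequences are hypothetical and are not data from the study.

```python
import math


def average_euclidean_distance(x, y):
    """AED between two equal-length rainfall sequences (assumed RMS form)."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    total_steps = len(x)
    squared_sum = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.sqrt(squared_sum / total_steps)


# Hypothetical rainfall depths per time step (mm).
storm_x = [0.0, 1.0, 3.0, 2.0]
storm_y = [0.0, 2.0, 1.0, 2.0]

aed_value = average_euclidean_distance(storm_x, storm_y)
print(f"AED = {aed_value:.3f} mm")
```

Under this assumed form the AED has the same units as the rainfall depth per time step, so a value can be read against the rainfall intensities of the events being compared.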
Table 7. Results of MRSA method based on a single similarity measure and the comprehensive similarity measure for the July 24, 1981, rainfall event
| Criterion | Date of event | Total rainfall (mm) | Max intensity (mm/0.5 h) | Average Euclidean distance |
| --- | --- | --- | --- | --- |
| Original rainstorm | July 24, 1981 | 26.6 | 4.7 | |
| Quantity similarity | July 7, 2000 | 28.5 | 4.4 | 1.7 |
| Pattern similarity | August 24, 1985 | 37.9 | 4.1 | 2.7 |
| Earth mover’s distance | July 8, 1985 | 34.6 | 5.5 | 2.7 |
| Rainstorm distribution similarity | July 30, 2003 | 61.7 | 8.3 | 4.7 |
| Comprehensive measure | August 11, 2005 | 35.6 | 4.2 | 0.6 |
Results based on a single similarity measure reflect the similarity between two rainstorm events from only one aspect. For example, even when the quantity similarity of two rainstorms is high, their temporal distributions may still be very different. The comprehensive measure therefore gives a more balanced result that accounts for several aspects. Table 7 shows that the similar rainstorm identified by the comprehensive measure yielded the lowest average Euclidean distance (0.6), while the rainstorm identified by the rainstorm spatial distribution similarity yielded the highest (4.7).
Additionally, different combinations of weights were tested for the MRSA method. The results in Table 8 show that the August 11, 2005, rainstorm event remains the most similar event, with the lowest average Euclidean distance. The weight combinations that select the August 11, 2005, event were collected, and their upper bounds, lower bounds, and average values were calculated. The average weights for the quantity similarity, the pattern similarity, the earth mover’s distance, and the rainstorm spatial distribution similarity are 0.18, 0.30, 0.24, and 0.28, respectively, in this area. The determination of optimal weights in the MRSA method is a relatively complex optimization problem that needs further study.
Table 8. Rainstorms similar to the July 24, 1981, rainstorm event and corresponding average Euclidean distances under different weights in MRSA
| Date of event | Average Euclidean distance |
| --- | --- |
| August 23, 1985 | 2.06 |
| August 24, 1985 | 1.70 |
| June 10, 1991 | 0.87 |
| August 8, 1992 | 2.78 |
| July 7, 2000 | 2.73 |
| July 3, 2002 | 1.03 |
| July 30, 2003 | 4.67 |
| August 11, 2005 | 0.65 |
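The weight test described above can be sketched as a simple grid search. This is an illustration under several assumptions, not the procedure used in the study: the comprehensive index is taken to be a weighted sum of the four measures, smaller values are taken to mean greater similarity, weights are sampled on a 0.1 grid, and the event names and measure values are hypothetical.

```python
import itertools


def comprehensive_index(measures, weights):
    """Weighted sum of the four similarity measures (assumed form)."""
    return sum(w * m for w, m in zip(weights, measures))


def most_similar(candidates, weights):
    """Event with the smallest comprehensive index under the given weights."""
    return min(candidates, key=lambda name: comprehensive_index(candidates[name], weights))


def weight_grid(steps=10, n_measures=4):
    """Yield all weight vectors on a 1/steps grid whose components sum to 1."""
    for combo in itertools.product(range(steps + 1), repeat=n_measures - 1):
        if sum(combo) <= steps:
            last = steps - sum(combo)
            yield tuple(c / steps for c in combo + (last,))


# Hypothetical normalized measures per event:
# (quantity, pattern, earth mover's distance, spatial distribution).
candidates = {
    "event_A": (0.10, 0.40, 0.30, 0.20),
    "event_B": (0.30, 0.15, 0.20, 0.25),
    "event_C": (0.50, 0.50, 0.10, 0.60),
}

# Event selected with equal weights, used here as the target of the search.
target = most_similar(candidates, (0.25, 0.25, 0.25, 0.25))

# Keep every weight vector that still selects the target event.
all_weights = list(weight_grid())
selected = [w for w in all_weights if most_similar(candidates, w) == target]

lower = [min(w[i] for w in selected) for i in range(4)]
upper = [max(w[i] for w in selected) for i in range(4)]
mean = [sum(w[i] for w in selected) / len(selected) for i in range(4)]
print("target:", target)
print("lower bounds:", lower)
print("upper bounds:", upper)
print("mean weights:", [round(m, 2) for m in mean])
```

Because every retained weight vector sums to 1, the averaged weights also sum to 1, as the reported averages of 0.18, 0.30, 0.24, and 0.28 do.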

Conclusions

Most watershed hydrological models, such as the Stanford model, Xinanjiang model, and Sacramento model, can achieve good flood forecasting results in humid areas. However, reliable flood forecasting in semiarid regions remains a challenge because of the high temporal and spatial variability of rainfall and the large and variable transmission losses in these areas. In this study, a flood early warning and forecasting method was developed based on similarity theory and a watershed hydrological model to extend the lead time and achieve dynamic rolling flood forecasting. The main conclusions are as follows:
1. The rainstorm process is closely related to its corresponding flood event. Under the assumption that similar rainstorms produce similar floods, the MRSA method was developed for short-term rainfall forecasting. The forecasted rainfall and observed rainfall were input into the vertically mixed runoff model for flood early warning and forecasting. The entire procedure was applied to the Xinmiao hydrological station on the Beiniuchuan River. Floods here are usually characterized by a short duration, with steep rising and falling limbs. Therefore, the goal of flood forecasting in this kind of catchment is to obtain the flood peak as early as possible; to this end, the proposed MRSA method can provide an effective and reliable approach for early warning and flood forecasting in semiarid regions. Taking typical flood events as examples, the relative error of flood peaks was within ±20%, and the mean absolute relative error was 11%. Thus, the proposed method can provide a new way to achieve real-time flood warning and forecasting.
2. Based on an actual rainstorm, an ideal sample experiment was carried out to verify the rationality of the proposed MRSA method. One thousand experiments were conducted for each changing rate α to avoid the influence of accidental error. As α increases, the values of all four similarity indicators and of the comprehensive similarity indicator gradually increase, implying a larger difference between the corresponding simulated rainfall event and the actual rainfall event. Thus, the MRSA method is reasonable.
3. In this paper, equal weights were assigned to the quantity similarity, the pattern similarity, the earth mover’s distance, and the rainstorm spatial distribution similarity. The weights are closely related to the characteristics of rainstorms in the study area, and the determination of optimal weights in the MRSA method needs further consideration.
4. Some large-scale soil and water conservation projects have been built in recent years to retain water for agricultural irrigation and industrial production; however, these structures have changed the runoff generation mechanism and have consequently introduced additional challenges to real-time flood forecasting in the region. Furthermore, because of the increasing influence of human activities, the underlying surface conditions are changing greatly, so that the same rainfall can cause different floods in different years. In the future, it will be necessary to further study the joint similarity between the underlying surface conditions and rainstorm processes.
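The logic of the ideal sample experiment summarized in conclusion 2 can be sketched as follows. This is an illustration under stated assumptions rather than the experiment itself: each rainfall value is perturbed by a random relative change of at most α, the root-mean-square distance stands in for the similarity indicators, and the rainfall sequence and the function names are hypothetical.

```python
import math
import random


def rms_distance(x, y):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def perturb(rain, alpha, rng):
    """Change each value by a random relative amount in [-alpha, alpha] (assumed scheme)."""
    return [r * (1.0 + rng.uniform(-alpha, alpha)) for r in rain]


def mean_distance(rain, alpha, n_trials=1000, seed=0):
    """Average distance between the rainstorm and n_trials perturbed copies."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = sum(rms_distance(rain, perturb(rain, alpha, rng)) for _ in range(n_trials))
    return total / n_trials


# Hypothetical rainfall depths per time step (mm); not data from the study.
actual = [0.2, 1.1, 3.4, 4.7, 2.9, 1.3, 0.6, 0.1]
alphas = [0.1, 0.2, 0.3, 0.4, 0.5]

results = [mean_distance(actual, a) for a in alphas]
for a, d in zip(alphas, results):
    print(f"alpha = {a:.1f}: mean distance = {d:.3f} mm")
```

Averaging over 1,000 perturbed copies plays the role of the repeated experiments for each α, and the mean distance grows with α in this sketch.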

Acknowledgments

This work was supported by the National Key R&D Program of China (2016YFC0402706), the National Natural Science Foundation of China (41730750, 41877147), and the Special Scientific Research Fund of Public Welfare Industry of the Ministry of Water Resources, China (201501004), and was sponsored by the Qing Lan Project. The authors gratefully acknowledge the helpful review comments and suggestions on an earlier version of the manuscript from the editor, associate editor, and reviewers.

References

Al-Qurashi, A., N. McIntyre, H. Wheater, and C. Unkrich. 2008. “Application of the Kineros2 rainfall-runoff model to an arid catchment in Oman.” J. Hydrol. 355 (1–4): 91–105. https://doi.org/10.1016/j.jhydrol.2008.03.022.
Bao, W., and L. Zhao. 2014. “Application of linearized calibration method for vertically mixed runoff model parameters.” J. Hydrol. Eng. 19 (8): 04014007. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000984.
Bao, W. M., and C. L. Wang. 1997. “Vertically-mixed runoff model and its application.” Hydrology 3: 18–21.
Bocchiola, D., and R. Rosso. 2006. “The use of scale recursive estimation for short term quantitative precipitation forecast.” Phys. Chem. Earth 31 (18): 1228–1239. https://doi.org/10.1016/j.pce.2006.03.019.
Burlando, P., R. Rosso, L. G. Cadavid, and J. D. Salas. 1993. “Forecasting of short-term rainfall using ARMA models.” J. Hydrol. 144 (1–4): 193–211. https://doi.org/10.1016/0022-1694(93)90172-6.
Dilmi, D., L. Barthès, C. Mallet, and A. Chazottes. 2017. “Modified DTW for a quantitative estimation of the similarity between rainfall time series.” In Proc., EGU General Assembly Conf. Munich, Germany: European Geosciences Union.
Gagnon, P., A. N. Rousseau, D. Charron, V. Fortin, and R. Audet. 2017. “The added value of stochastic spatial disaggregation for short-term rainfall forecasts currently available in Canada.” J. Hydrol. 554 (Nov): 507–516. https://doi.org/10.1016/j.jhydrol.2017.08.023.
Grimes, D. I. F., and M. Diop. 2003. “Satellite-based rainfall estimation for river flow forecasting in Africa. Part I: Rainfall estimates and hydrological forecasts.” Hydrol. Sci. J. 48 (4): 567–584. https://doi.org/10.1623/hysj.48.4.567.51410.
Güntner, A., and A. Bronstert. 2003. “Large-scale hydrological modeling of a semiarid environment: Model development, validation and application, global change and regional impacts.” In Global change and regional impacts, 217–228. Berlin: Springer. https://doi.org/10.1007/978-3-642-55659-3_17.
Krzysztofowicz, R. 1995. “Recent advances associated with flood forecast and warning systems.” Supplement, Rev. Geophys. 33 (S2): 1139–1147. https://doi.org/10.1029/95RG00873.
Li, B., Z. Liang, Z. Bao, J. Wang, and Y. Hu. 2019. “Changes in streamflow and sediment for a planned large reservoir in the middle Yellow River.” Land Degrad. Dev. 30 (7): 878–893. https://doi.org/10.1002/ldr.3274.
Li, B., Z. Liang, J. Zhang, G. Wang, W. Zhao, H. Zhang, J. Wang, and Y. Hu. 2018. “Attribution analysis of runoff decline in a semiarid region of the Loess Plateau, China.” Theor. Appl. Climatol. 131 (1–2): 845–855. https://doi.org/10.1007/s00704-016-2016-2.
Liguori, S., M. A. Rico-Ramirez, A. N. A. Schellart, and A. J. Saul. 2012. “Using probabilistic radar rainfall nowcasts and NWP forecasts for flow prediction in urban catchments.” Atmos. Res. 103 (3): 80–95. https://doi.org/10.1016/j.atmosres.2011.05.004.
Luk, K. C., J. E. Ball, and A. Sharma. 2000. “A study of optimal model lag and spatial inputs to artificial neural network for rainfall forecasting.” J. Hydrol. 227 (1–4): 56–65. https://doi.org/10.1016/S0022-1694(99)00165-1.
Manzato, A. 2007. “Sounding-derived indices for neural network based short-term thunderstorm and rainfall forecasts.” Atmos. Res. 83 (2): 349–365. https://doi.org/10.1016/j.atmosres.2005.10.021.
Mecklenburg, S., J. Joss, and W. Schmid. 2000. “Improving the nowcasting of precipitation in an alpine region with an enhanced radar echo tracking algorithm.” J. Hydrol. 239 (1–4): 46–68. https://doi.org/10.1016/S0022-1694(00)00352-8.
Mishra, S., V. K. Dwivedi, C. Sarvanan, and K. K. Pathak. 2013. “Pattern discovery in hydrological time series data mining during the monsoon period of the high flood years in Brahmaputra River basin.” Int. J. Comput. Appl. 67 (6): 7–14. https://doi.org/10.5120/11397-6698.
MWR (Ministry of Water Resources of the People’s Republic of China). 2008. Standard for hydrological information and hydrological forecasting. GB/T 22482. [In Chinese.] Beijing: MWR.
Nikam, V., and K. Gupta. 2014. “SVM-based model for short-term rainfall forecasts at a local scale in the Mumbai urban area, India.” J. Hydrol. Eng. 19 (5): 1048–1052. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000875.
Ouyang, R., L. Ren, W. Cheng, and C. Zhou. 2010. “Similarity search and pattern discovery in hydrological time series data mining.” Hydrol. Processes 24 (9): 1198–1210. https://doi.org/10.1002/hyp.7583.
Rödiger, T., S. Geyer, U. Mallast, R. Merz, P. Krause, C. Fischer, and C. Siebert. 2014. “Multi-response calibration of a conceptual hydrological model in the semiarid catchment of Wadi al Arab, Jordan.” J. Hydrol. 509 (2): 193–206. https://doi.org/10.1016/j.jhydrol.2013.11.026.
Rubner, Y., C. Tomasi, and L. J. Guibas. 2000. “The earth mover’s distance as a metric for image retrieval.” Int. J. Comput. Vision 40 (2): 99–121. https://doi.org/10.1023/A:1026543900054.
Shahrban, M., J. P. Walker, Q. J. Wang, A. Seed, and P. Steinle. 2016. “An evaluation of numerical weather prediction based rainfall forecasts.” Hydrol. Sci. J. 61 (15): 2704–2717. https://doi.org/10.1080/02626667.2016.1170131.
Sharma, A., and M. Bose. 2014. “Rainfall prediction using k-NN based similarity measure.” In Recent advances in information technology, Vol. 226 of Advances in intelligent systems and computing, 125–132. New Delhi, India: Springer. https://doi.org/10.1007/978-81-322-1856-2_14.
Sirangelo, B., P. Versace, and D. L. De Luca. 2007. “Rainfall nowcasting by at site stochastic model P.R.A.I.S.E.” Hydrol. Earth Syst. Sci. 11 (4): 1341–1351. https://doi.org/10.5194/hess-11-1341-2007.
Thielen, J., J. Bartholmes, M. H. Ramos, and A. D. Roo. 2009. “The European Flood Alert System. Part I: Concept and development.” Hydrol. Earth Syst. Sci. 13 (2): 125–140. https://doi.org/10.5194/hess-13-125-2009.
Veitzer, S. A., and V. K. Gupta. 2001. “Statistical self-similarity of width function maxima with implications to floods.” Adv. Water Resour. 24 (9–10): 955–965. https://doi.org/10.1016/S0309-1708(01)00030-6.
Wang, G., and L. Ren. 2009. “A contrastive study of simulation results between GWSC-VMR and hybrid runoff model in Dianzi basin.” In Proc., Int. Conf. on Environmental Science and Information Application Technology, 583–588. New York: IEEE. https://doi.org/10.1109/ESIAT.2009.302.
Wang, G., W. K. Wong, Y. Hong, L. Liu, J. Dong, and M. Xue. 2015. “Improvement of forecast skill for severe weather by merging radar-based extrapolation and storm-scale NWP corrected forecast.” Atmos. Res. 154: 14–24. https://doi.org/10.1016/j.atmosres.2014.10.021.
Zhu, Y., L. Shijin, W. Dingsheng, and Z. Xiaohua. 2008. “A novel approach to the similarity analysis of multivariate time series and its application in hydrological data mining.” In Proc., Int. Conf. on Computer Science and Software Engineering, 730–734. New York: IEEE.

Information & Authors

Information

Published In

Journal of Hydrologic Engineering
Volume 24, Issue 8, August 2019

History

Received: Jul 26, 2018
Accepted: Mar 8, 2019
Published online: Jun 11, 2019
Published in print: Aug 1, 2019
Discussion open until: Nov 11, 2019

Authors

Affiliations

Zhangling Xiao
Ph.D. Student, College of Hydrology and Water Resources, Hohai Univ., Nanjing 210098, China.
Zhongmin Liang
Professor, College of Hydrology and Water Resources, Hohai Univ., Nanjing 210098, China.
Associate Professor, State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai Univ., Nanjing 210098, China (corresponding author).
Bo Hou
Engineer, Hydrology Bureau, Yellow River Conservancy Commission, East Chengbei Rd., Zhengzhou 450004, China.
Yiming Hu
Associate Professor, College of Hydrology and Water Resources, Hohai Univ., Nanjing 210098, China.
Jun Wang
Associate Professor, State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai Univ., Nanjing 210098, China.
