Open access
Technical Papers
Dec 4, 2020

Urban Water Demand: Statistical Optimization Approach to Modeling Daily Demand

Publication: Journal of Water Resources Planning and Management
Volume 147, Issue 2

Abstract

Reliable forecasts of water demand that account for factors that drive demand are imperative to understanding future urban water needs. The effects of meteorological dynamics and sociocultural settings are expressed weakly in many published municipal water demand models, limiting their utility for high-accuracy urban water demand modeling. To fill this gap, this paper presents an empirical daily urban water demand model based on a 365-day trailing average per capita demand that incorporates functions and factors for meteorological, seasonal, policy, and cultural driving forces. A nonlinear iterative regression model of daily water demand was calibrated and validated with historical data (2005–2015) for El Paso, Texas, a major urban area in the American southwest which had a consistent water conservation policy during the study period. The model includes daily temperature and precipitation response functions (which modify demand by as much as ±20% relative to the annual average), as well as factors that capture effects of month of the year, day of the week, and special holidays (which modify demand within ±15% relative to the annual average). For the validation period (2011–2015), the model performed well, with a coefficient of determination (R2) of 0.95, a Nash–Sutcliffe efficiency of 0.94, a mean absolute-value relative error of 4.38%, a relative standard error of estimate of 5.82%, a relative RMS error of 5.71%, and a mean absolute-value peak-day error of 2.38%. The use of these site-specific demand variables and response curves facilitates parsimonious urban water demand forecast modeling for regional water security.

Introduction

Modeling is fundamental to modern water resource management in the face of growing concerns about water security. Many major cities in the US, especially those in arid regions, face water scarcity issues and continually are trying to keep up with the growing demand caused by increasing population, changing societal and political factors, and in the longer term, increasing temperature and changing precipitation patterns. Accurate simulations of the annual peak-day demand assist in the decision-making process for reliable operation and expansion of water utilities. Peak-day demand, or the greatest expected daily demand for the year, is one of the most important factors for sizing and expansion of water supply facilities or taking emergency management measures because urban water treatment and storage capacities are designed to ensure that the system can meet this demand (Beal and Stewart 2014).
Urban water management based on a nuanced quantitative understanding of water demand is essential for addressing water security issues facing arid and semiarid environments (Falkenmark et al. 1989). Urban water demand modeling approaches that facilitate accounting for climatic and sociocultural dimensions of water demand, particularly annual peak-day demand, advance the ability of water utilities to provide reliable service. Therefore, this research developed a parsimonious modeling methodology that uses basic data to provide an accurate daily time-scale municipal water demand model that is sensitive to the dynamics of climate and culture. An application of the statistical water demand modeling methodology was presented by simulating daily water demand, including annual peak-day demand, in the city of El Paso, Texas, which exemplifies the challenges of meeting urban water demand in a desert environment.

Literature Review

Predictions of urban water demand progressed over the last four decades, with some of the earliest research focusing on time-series analysis with linear regression and trend-extrapolation (House-Peters and Chang 2011; Donkor et al. 2014). Maidment and Parzen (1984) trained a monthly predictive cascade model using a Box–Jenkins method that used temperature, precipitation, and evaporation data, and subsequently improved it to generate daily time series for nine cities in Texas (Maidment et al. 1985; Maidment and Miaou 1986). Homwongs et al. (1994) showed weekly municipal flow patterns based on weekday and weekend cycles, whereas Zhou et al. (2000) demonstrated a clear separation in characteristic behaviors of winter and summer water use. Those early works demonstrated the underlying factors of seasonality, trend, and noise functions that constitute municipal water demand. Other models adopted econometric end-use methods, presenting a multitude of other approaches to modeling urban water demand (Billings and Agthe 1998; Bennett et al. 2013).
Analyzing water demand time-series data has taken many forms, such as regression, autoregression, autoregressive integrated moving average (ARIMA), exponential smoothing, and machine learning methods [e.g., support vector machines (SVMs), artificial neural networks (ANNs), classification and regression trees, and random forests (Villarin and Rodriguez-Galiano 2019)]. SVM and ANN are among the most popular methods (Msiza et al. 2008). Adamowski et al. (2012) developed a forecast with an ANN based on temperature and precipitation data with relatively low errors; optimally performing SVM and ANN models produced a mean absolute relative error (MARE) of 5.47% and 2.95%, respectively. Sardinha-Lourenço et al. (2018) evaluated the performance of a combined ANN-SVM method using ARIMA, ARIMA clustering, heuristic forecasting, mean weighting, mean squared error (MSE) weighting, and the parallel adaptive weighting strategy (PAWS). They reported R2 values of 0.69 for the worst performing models to 0.97 for the best model, which produced a mean relative error (MRE) of 8.56%. Other notable applications include developing a web-based forecasting model [Nash–Sutcliffe Efficiency (NSE) of 0.73 and MRE of 15.6%] based on multiple water-use factors for municipal-scale environments, including forecasting for policy, social, potable reuse, and irrigation changes (Sharvelle et al. 2017).
Geospatial approaches to urban water demand modeling typically are associated with national-level analyses, river basins, and grid-scale hydrological simulations [e.g., 0.5° longitude × 0.5° latitude (Alcamo et al. 2003)]. For example, WaterGAP (Alcamo et al. 1997, 2003) attempts to cover municipal-scale modeling within global hydrology with an emphasis on climate, econometric, technological, and population factors, producing modest estimates of urban water demand with Nash–Sutcliffe efficiency coefficients ranging from 0.5 to 0.7. PCR-GLOBWB 2 (Sutanudjaja et al. 2018) is another notable global hydrological model that incorporates daily municipal-scale water balance. That model’s primary method of handling domestic (household) and industrial water demand is the application of a recycling ratio based on gross domestic product (GDP) per capita assuming that the nonconsumptive portion of the delivered water returns to usable sources. The spatial scale of water demand modeling creates a trade-off between detailed representations of municipal water demand and larger-scale river basin models.
Accurate prediction of daily demand at municipal level has remained challenging due to the nonlinear nature of urban water demand (Guo et al. 2018). Improved predictions are achievable by relying heavily on the accuracy of a multitude of data sets (Ghiassi et al. 2008), some of which typically may not be available in many municipal settings (Lee and Derrible 2020). Although it is possible to achieve accurate water demand predictions within niche communities (e.g., residential, commercial, and industrial) using end-use (i.e., bottom-up) models of water demand that include consumption behavior and policy factors, these models may be limited when applied to large-scale general urban settings over longer forecast horizons. A good example is an end-use model developed by Blokker et al. (2010) which uses a comprehensive set of high-resolution data of residential water features (e.g., toilet, bathroom sink, kitchen sink, washer, lawn watering, and so forth) to quantify the magnitude and use pattern of household level water demand.
This paper advances parsimonious municipal-level water demand modeling to predict daily demand with high accuracy to support management decisions at water utilities, taking into account major factors that affect water demand. A significant contribution of the presented statistical modeling approach is the high-accuracy prediction of annual peak-day demand to lower the risks of system failure through demand management measures (e.g., issuing advisories to curtail outdoor water use) in the short run, and planning system expansions in the long run. High-accuracy water demand modeling is defined as simulations that offer an R2 greater than 0.95 (Pearson 1909), and a relative RMS error (RRMSE) (Levinson 1946) less than 7.5%, for short-, medium-, and long-term validation periods (Zhou et al. 2000; Al-Zahrani and Abo-Monasar 2015). We also evaluated the results based on other common indicators such as Nash–Sutcliffe efficiency coefficient, 95th or 5th percentile errors, and standard errors, which characterize different aspects of the model performance based on the variability of the data.

Methodology

Fig. 1 shows the conceptual framework of the water demand model. As a general overview, the daily water demand prediction is based on the annual-average water demand (megaliters per day), which is calculated based on the day-of population estimate (capita) and the preceding 365-day trailing average of unit consumption (liters per day per capita). Meteorological and social factors capture effects of month of the year, day of the week, special holidays, daily average temperature, and preceding 3-day precipitation. Model calibration was improved by minimizing the RMS error (RMSE) of daily water demand predictions. This process is explained in greater detail in the following sections.
Fig. 1. Conceptual framework for daily water demand modeling.

Data Requirements

One of the primary objectives of this work was to develop a predictive water demand model that requires minimal data for calibration and validation, both for use in simple computational platforms as well as for broad applicability among municipalities. This model requires four daily time-series data sets for calibration: historical water demand, population, average daily temperature, and daily precipitation. For the model calibration in this paper, 5 years (2006–2010) of daily time-series data were used for temperature and precipitation, and an additional preceding year (2005) was required for water demand and population (i.e., 2005–2010, a total of 6 years of daily data) to calculate the 365-day trailing average for the first model calibration year (2006). The model then was validated with the following 5 years of data (2011–2015).

Mathematical Formulation

The model uses known meteorological and social factors to represent the forces that affect water demand trends based on historical water demand data decomposed into three constituents: long-term trends, seasonal patterns, and noise (Dokumentov and Hyndman 2015). The goal of the model was to create sets of fitting parameters that could explain these constituents to be combined to model the daily water demand. From the observed water demand, population, daily average temperature, and daily precipitation data, an equation was developed to synthesize the three constituents into mathematical terms. The daily water demand is calculated using Eq. (1), where per-capita unit consumption, U, and population, P, describe the trend; monthly, day of the week, and holiday scaling coefficients, CM, CDotW, and CH, respectively, describe seasonal patterns; and temperature and precipitation functions, fT and fP, respectively, describe the noise due to daily meteorological conditions. The weighting factors, W1 and W2, attenuate the amplitude of daily fluctuations
$$Q(i,m,d,c) = W_1\left[U_{i-1}\,P\,C_M(m)\,C_{DotW}(m,d)\,C_H(c)\,f_{Temp}(T)\,f_{Precip}(\bar{p})\right] + W_2\,\bar{Q} \tag{1}$$
where $Q(i,m,d,c)$ = predicted water demand (L/day) for the $i$th day of the model, where $m$ = month, $d$ = day of week, and $c$ = calendar day; $W_1$ and $W_2$ = weighting factors; $U_{i-1}$ = 365-day average unit consumption lagged by one day (L/day/person); $P$ = population for the $i$th model day (persons); $C_M(m)$ = monthly constant for month $m$; $C_{DotW}(m,d)$ = weekly constant for month $m$ and day of week $d$; $C_H(c)$ = holiday constant for calendar day $c$; $f_{Temp}(T)$ = temperature factor, where $T$ = average daily temperature (°C); $f_{Precip}(\bar{p})$ = precipitation factor, where $\bar{p}$ = 3-day average precipitation of the $i$th day of the model and the two previous days (cm); and $\bar{Q}$ = average predicted water demand for the previous 3 days (L/day).
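As a minimal sketch of how Eq. (1) combines its terms, the following Python function multiplies the base consumption by the seasonal and meteorological factors and blends the result with the mean of recent predictions. The function name, argument names, and the input values in the usage lines are hypothetical illustrations; only the default weighting factors (0.592 and 0.408) are taken from the calibration results reported later in this paper.

```python
def predict_daily_demand(u_lag, population, c_month, c_dotw, c_holiday,
                         f_temp, f_precip, q_prev3_mean,
                         w1=0.592, w2=0.408):
    """Evaluate Eq. (1) for one day; demand is returned in L/day."""
    # Base (annual-average) demand scaled by seasonal and meteorological factors
    factor_estimate = (u_lag * population * c_month * c_dotw
                       * c_holiday * f_temp * f_precip)
    # Blend with the mean of the previous three predictions (memory term)
    return w1 * factor_estimate + w2 * q_prev3_mean


# Hypothetical inputs: 500 L/day/person, 800,000 people, a month coefficient
# of 1.12, all other factors at unity, and a recent mean of 400 ML/day.
q = predict_daily_demand(500.0, 800_000, 1.12, 1.0, 1.0, 1.0, 1.0,
                         q_prev3_mean=4.0e8)
print(f"{q / 1e6:.1f} ML/day")
```

With these illustrative inputs the factor-based estimate is 448 ML/day, and the memory term pulls the prediction back toward the recent 400 ML/day average, giving about 428.4 ML/day.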
The model was calibrated and validated using a common split sample approach (e.g., Bakker et al. 2014) and model performance evaluation. An optimization-based approach was implemented to improve model calibration by minimizing the RMSE of daily water demand using the generalized reduced gradient (GRG) nonlinear solver in Microsoft Excel on a 5-year calibration period (2006–2010). Subsequently, the model was validated against the next 5 years (2011–2015) of water demand data. This combination of calibration and validation periods was intended to prevent overfitting by calibrating the model with data within the calibration period and observing the accuracy of the model during the validation period.
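The calibration idea (choose parameters that minimize the RMSE over the calibration period) can be illustrated with a deliberately reduced sketch. It fits only the weighting factor W1, with W2 = 1 − W1, using a golden-section search in place of the Excel GRG solver used in this study; the full model adjusts many more parameters at once. All data here are synthetic, and the function names are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def blend(raw, w1, seed):
    """Apply the Eq. (1) weighting: w1 * raw + (1 - w1) * mean of last 3 predictions."""
    preds = list(seed)  # three seed values stand in for earlier predictions
    for value in raw:
        preds.append(w1 * value + (1.0 - w1) * sum(preds[-3:]) / 3.0)
    return preds[len(seed):]


def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2.0


# Synthetic factor-only estimates and synthetic "observations" (not real data);
# the observations are generated with a known W1 of 0.6.
raw = [400.0 + 50.0 * math.sin(i / 5.0) + 20.0 * ((i * 7) % 5 - 2)
       for i in range(200)]
seed = [400.0, 400.0, 400.0]
observed = blend(raw, 0.6, seed)

w1_hat = golden_section_min(lambda w: rmse(blend(raw, w, seed), observed), 0.0, 1.0)
w2_hat = 1.0 - w1_hat  # constraint: W1 + W2 = 1
print(round(w1_hat, 3), round(w2_hat, 3))
```

In a split-sample setting, the same objective would be evaluated only on the calibration years, and the fitted parameters would then be held fixed while the error is computed on the validation years.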

Weighting Factors

The weighting factors, W1 and W2, also referred to as memory factors, serve to smooth the predicted daily water demand, preventing extreme changes from one day to the next. A rapid meteorological change such as a cold front or a heat wave could change daily temperature or precipitation substantially, but historically, actual water demand changes at a slower rate than meteorological fluctuations. Thus, the weighting factors smooth the demand prediction for day i, and the values of W1 and W2 were constrained to sum to unity.

Per Capita Consumption and Population

Per capita (unit) water demand (liters per day per capita) was calculated for each day as the water demand, Q, divided by the 365-day trailing average population, P365¯ (linearly interpolated for daily values based on annual data). The unit consumption, U [Eq. (1)], was calculated as a 365-day trailing average
$$U = \frac{\overline{Q_{365}}}{\overline{P_{365}}} \tag{2}$$
where $\overline{Q_{365}}$ = average daily demand for the previous 365 days (L/day); and $\overline{P_{365}}$ = average population for the previous 365 days (persons).
The combination of unit consumption and population forms the base consumption rate of the prediction model. Several models in the literature start with the winter consumption as the baseline for the model and then add further factors for prediction (Berkes et al. 1998; Wong et al. 2010; Willuweit and O’Sullivan 2013; Donkor et al. 2014). However, the method described in this paper uses the unit consumption and population to predict the average annual consumption, and then other factors amplify or attenuate it for seasonal and daily conditions.
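The trailing-average calculation in Eq. (2) can be sketched as follows. This is an illustrative implementation, not the authors' spreadsheet; the function name is hypothetical, and the example uses a 2-day window with made-up numbers so that the arithmetic is easy to follow.

```python
def trailing_unit_consumption(demand, population, window=365):
    """Return U for each day: mean demand over the previous `window` days
    divided by mean population over the same days (Eq. 2).

    Days without a full window of history get None.
    """
    if len(demand) != len(population):
        raise ValueError("demand and population must have the same length")
    units = []
    for i in range(len(demand)):
        if i < window:
            units.append(None)  # not enough history yet
            continue
        mean_q = sum(demand[i - window:i]) / window
        mean_p = sum(population[i - window:i]) / window
        units.append(mean_q / mean_p)
    return units


# Tiny hypothetical series with a 2-day window
example = trailing_unit_consumption([10.0, 20.0, 30.0, 40.0],
                                    [2.0, 2.0, 2.0, 2.0], window=2)
print(example)
```

The leading None entries show why one additional year of demand and population data is needed before the first calibration year.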

Meteorological Effects

The temperature factor, fT, is a significant contributor to the accuracy of this model. Theoretically, the effect of temperature on demand is bounded quantitatively between minimum and maximum values, with a sigmoidal behavior between. Several types of sigmoidal functions were tested, and the hyperbolic tangent function was selected for its superior coefficient of determination (R2) in this model across a wide range of temperatures. Eq. (3) describes the sensitivity of daily water demand to average daily temperature as a sigmoidal hyperbolic tangent function and contains four site-specific fitting constants, which were assumed not to change over the duration of the modeled period (10 years)
$$f_{Temp}(T) = C_{T1} + C_{T2}\tanh\left(\frac{T - C_{T3}}{C_{T4}}\right) \tag{3}$$
where $T$ = average daily temperature (°C); $C_{T1}$ and $C_{T2}$ = dimensionless location-specific temperature constants; and $C_{T3}$ and $C_{T4}$ = location-specific temperature constants (°C).
The sensitivity of daily water demand to precipitation was modeled with Eq. (4), which is characterized as exponential decay with increasing precipitation. The average precipitation of the previous 2 days and the day-of precipitation, p¯, is used as the argument of the function because the demand sensitivity was observed to have a lagged association with precipitation events. For example, people may be more influenced by precipitation on a previous day than day-of precipitation in their decision to turn off scheduled lawn irrigation. Zhou et al. (2000) showed that the response to precipitation events varies significantly by location, due to factors including lawn sizes, automated sprinkler systems, and population behaviors. The values of fitting parameters CP1 and CP2 were constrained such that for a 3-day precipitation value of zero, fP equals unity
$$f_{Precip}(\bar{p}) = 1 - C_{P1}\left(1 - e^{-C_{P2}\,\bar{p}}\right) \tag{4}$$
where $\bar{p}$ = 3-day average precipitation of the $i$th model day and the previous 2 days (cm/day); $C_{P1}$ = dimensionless location-specific constant; and $C_{P2}$ = location-specific constant (day/cm).
Pan evaporation rate was considered as a possible factor in Eq. (1) but was rejected because pan evaporation is correlated highly with temperature (Piri et al. 2009). Thus, because the effects of temperature were included in this model, pan evaporation was excluded, even though it is used in other models.

Seasonal Analysis

The seasonal or repeating pattern of the system was decomposed into three distinct chronological coefficients for the month (m) of the year, CM(m), the day of the week (d) of each month, CDotW(m,d), and several popular holidays that recur on a certain calendar day (c), CH(c). Each of these factors was created to represent independent driving forces such as annual consumptive uses, day of the week lawn irrigation restrictions, and holiday celebrations.
The monthly coefficient, CM(m), has one numerical value for each month of the year (January–December), and during model calibration, the 12 values are constrained to average to unity. Thus, this coefficient represents how each month of the year compares with the annual average, so this coefficient represents the effects of seasonal patterns such as lawn irrigation, evaporative cooling, and filling swimming pools.
The day of the week coefficient, CDotW(m,d), has one numerical value for each day of the week (Sunday–Saturday) for each month of the year (creating 12 sets of 7 values, for a total of 84 values), and for each month, the seven values are constrained to average to unity. This type of day of the week factor is common among models reported in the literature. This term arises from the policy restrictions of water demands within urban environments. For example, some water utilities restrict lawn irrigation for residential or industrial sectors to certain days of the week, which can lead to a noticeable pattern in water demand throughout the week. Bakker et al. (2014) used a 10-week trailing average to predict the demand in the next week. Wong et al. (2010) noted a strong correlation between demand and day of the week in their model of urban water demand trends for Hong Kong, but the model failed to recognize changes in magnitude of the CDotW(m,d) with different seasons throughout the year.
Identifying the holidays for the target location and including them in the model can change the results significantly (Wong et al. 2010). The holiday coefficient, CH(c), accounts for the average bias of each of the 366 possible calendar days within a year. The value of this coefficient is assumed to be unity for all nonholiday days, and a nonunity value is determined by evaluating the average error of each day of the year after optimization of the month coefficients and day of the week coefficients. This approach allows for automatically identifying the holidays and accounting for them in the model, which is different from selecting the holidays a priori, as performed by Wong et al. (2010).
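One way to picture the automatic holiday identification is the sketch below. It groups the ratio of observed to predicted demand by calendar day, averages it across years, and keeps a nonunity coefficient only where the average bias exceeds a threshold. The use of a ratio as the error measure, the 5% threshold, and all names and numbers are assumptions for illustration; the paper does not state these details.

```python
from collections import defaultdict
from datetime import date


def holiday_coefficients(records, threshold=0.05):
    """records: iterable of (date, observed, predicted_without_holiday_term).

    Returns {(month, day): coefficient}, with unity for calendar days whose
    average bias is within the (hypothetical) threshold.
    """
    ratios = defaultdict(list)
    for day, observed, predicted in records:
        ratios[(day.month, day.day)].append(observed / predicted)
    coefficients = {}
    for key, values in ratios.items():
        mean_ratio = sum(values) / len(values)
        # Keep the bias only if it is large enough to look like a holiday effect
        coefficients[key] = mean_ratio if abs(mean_ratio - 1.0) > threshold else 1.0
    return coefficients


# Made-up records for two calendar days in two years
records = [
    (date(2006, 12, 25), 87.0, 100.0),
    (date(2007, 12, 25), 174.0, 200.0),
    (date(2006, 12, 26), 101.0, 100.0),
    (date(2007, 12, 26), 99.0, 100.0),
]
coeffs = holiday_coefficients(records)
print(coeffs)
```

In this toy input, December 25 receives a coefficient of about 0.87, and December 26 stays at unity because its average bias is negligible.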

Study Area

The modeling framework proposed in this study was applied to the city of El Paso, Texas, the 6th largest city in Texas and the 22nd largest city in the US, with a population of approximately 682,000 in 2019 (US Census Bureau 2019). The city is located at the far west corner of Texas where it borders the state of New Mexico and the country of Mexico. El Paso is sustained by three primary sources of water: the Rio Grande (Rio Bravo in Mexico), a river that borders the city, and two aquifers that lie on either side of the Franklin Mountain Range, the Mesilla Bolson (west) and Hueco Bolson (east). Based on temperature data from 1994 to 2015 (NOAA 2020), the area is classified as a BWh dry arid desert according to the revised Köppen–Geiger system (Peel et al. 2007). The average daily temperature for this period was 19°C, and El Paso had average daily temperatures below 12°C and as high as 37°C. Since the mid-1900s, drought conditions, increased farming activity (USDA 2012), and national and international agreements have caused El Paso to rely heavily on the aquifers, especially during low flow in the river. Fortunately, since the 1980s, El Paso leaders have strategically advanced regional water sustainability by promoting residential water conservation, reclamation of treated wastewater effluent (purple pipe) for irrigation of municipal parks, indirect potable reuse of treated municipal wastewater through aquifer recharge from the Fred Hervey Water Reclamation Plant, and brackish groundwater desalination at the Kay Bailey Hutchison Desalination Plant. These factors have made El Paso an interesting case to study for water modeling and forecasting, because effective water management is essential to the sustainability of this region.

Water Demand and Population Data

Daily water demand data for the period 2005–2015 were obtained from El Paso Water (the urban water management agency in the city of El Paso) and were used for model calibration (2005–2010) and validation (2011–2015). The water demand data were checked for inconsistencies; there were four outliers beyond 3.5 standard deviations of the mean of daily values. A representative from El Paso Water confirmed the accuracy of these values, so all data were kept as provided. Daily water quantity data initially came in two sets: daily production/pumping volumes, and daily ground/elevated storage volume changes. Because the goal was to model the daily demand of water, the change in storage was subtracted from the production value to calculate the daily water demand. Figs. 2(a and b) show the general water-use trends, chronologically and monthly, respectively, from 2005 to 2015. Historic annual population estimates for the County of El Paso were acquired from the US Census Bureau (2020), and daily population values were interpolated linearly from these annual data. Fig. 2(c) shows the daily population and 365-day trailing-average unit water consumption for the El Paso study area.
Fig. 2. (a) Historic water demand for El Paso (2005–2015); (b) box and whisker plot of average daily demand by month (2005–2015); and (c) historical unit consumption and population (2005–2016).
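The two data-preparation steps described above (deriving demand from production and storage change, and interpolating annual population to daily values) can be sketched as follows. The function names and numbers are hypothetical, and the sketch assumes 365-day years for simplicity, so leap days are ignored.

```python
def demand_from_production(production, storage_change):
    """Daily demand = production minus the change in stored volume."""
    return [p - ds for p, ds in zip(production, storage_change)]


def interpolate_daily_population(annual_estimates, days_per_year=365):
    """Linearly interpolate consecutive annual estimates to daily values."""
    daily = []
    for start, end in zip(annual_estimates, annual_estimates[1:]):
        step = (end - start) / days_per_year
        daily.extend(start + step * k for k in range(days_per_year))
    daily.append(annual_estimates[-1])  # close the series on the last estimate
    return daily


# Made-up values: storage rose by 10 on day 1 and fell by 5 on day 2
demand = demand_from_production([400.0, 410.0], [10.0, -5.0])
population = interpolate_daily_population([1000.0, 1365.0])
print(demand, population[0], population[100], population[-1])
```

A day on which storage fills therefore shows a demand lower than production, and a day on which storage drains shows a demand higher than production.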

Meteorological Data

The historic daily temperature and precipitation data were collected for the study period from the National Oceanic and Atmospheric Administration (NOAA 2020). Average daily temperature and average monthly precipitation are shown in Fig. 3. Missing or incomplete data, signified by the value 9999, were replaced by values from the almanac service from Weather Underground (2014). To check consistency between the two sources, 12 random days with known values were selected from each year; for these days, the Weather Underground almanac temperature and precipitation values were identical to the NOAA values. The meteorological data were checked for outliers, and only three values exceeded the 3.5 standard deviation window. These values occurred on February 2–4, 2011, during an extreme snowstorm event, so those data were not removed from the data set. On average, 57% of the rainfall for the entire year occurs during July, August, and September.
Fig. 3. El Paso, Texas: (a) average daily temperature (2005–2015); (b) average monthly precipitation (2005–2015).

Results and Discussion

Meteorological Effects

For the selected study area and time period, the four temperature fitting constants (CT1, CT2, CT3, and CT4) were determined to be 1.204, 0.377, 28.009°C, and 9.791°C, respectively, so that the values of the temperature factor, fT, ranged from a minimum of 0.827 in cold weather to a maximum of 1.58 in impracticably hot weather [Fig. 4(a)]. At an average daily temperature of 22.1°C, the temperature factor evaluated to unity and had no effect in Eq. (1). Over the study period, the minimum temperature factor value of 0.827 was observed each year, and the greatest temperature factor value observed was 1.435. An important feature of using this particular method to describe behaviors based on temperature is that for many regions, wintertime consumption is not temperature sensitive, but certain locations are temperature sensitive (Berkes et al. 1998; Donkor et al. 2014), and this hyperbolic tangent function can accommodate both conditions. The El Paso fit in Fig. 4 had little temperature sensitivity for average daily temperatures less than 15°C.
Fig. 4. Modeling results for El Paso, Texas (2006–2010): (a) temperature factor, fT; and (b) precipitation factor, fp.
The two precipitation fitting constants, CP1 and CP2, were determined to be 0.24 and 2.12 day/cm, respectively, such that for increasing precipitation in the previous 3 days, the values for the precipitation factor, fp, range from a value of 1 at no precipitation in the previous 3 days to the minimum value asymptote of 0.75 for an average precipitation greater than 2.5 cm/day in the preceding 3 days [Fig. 4(b)]. The lowest observed precipitation factor value within the model was 0.753.
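The response curves of Eqs. (3) and (4) can be evaluated directly from the fitted constants reported above. The sketch below is illustrative only; the function names are hypothetical, and the constants are used as rounded in the text.

```python
import math

# Fitted constants for El Paso as reported in the text (rounded values)
C_T1, C_T2, C_T3, C_T4 = 1.204, 0.377, 28.009, 9.791
C_P1, C_P2 = 0.24, 2.12


def f_temp(t_celsius):
    """Temperature factor, Eq. (3)."""
    return C_T1 + C_T2 * math.tanh((t_celsius - C_T3) / C_T4)


def f_precip(p_bar):
    """Precipitation factor, Eq. (4); p_bar is 3-day mean precipitation (cm/day)."""
    return 1.0 - C_P1 * (1.0 - math.exp(-C_P2 * p_bar))


for t in (0.0, 15.0, 22.1, 35.0):
    print(f"T = {t:5.1f} C  ->  f_temp = {f_temp(t):.3f}")
for p in (0.0, 0.5, 2.5):
    print(f"p = {p:4.1f} cm/day  ->  f_precip = {f_precip(p):.3f}")
```

The temperature factor is bounded by C_T1 − C_T2 = 0.827 and C_T1 + C_T2 = 1.581 and passes through approximately unity at 22.1°C, in line with the values quoted above. The precipitation factor equals unity at zero precipitation and approaches 1 − C_P1; with the constants as rounded here, that limit evaluates to 0.76, and the small difference from the reported 0.75 may reflect rounding of the published constants.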

Seasonal Analysis

The monthly CM coefficient was used to capture the general annual cycle, and varied from 0.86 in December to 1.12 in May and June [Fig. 5(a)]. Furthermore, weekly watering restrictions in El Paso [Fig. 5(b)] affected the day of the week constant [shown in Fig. 5(c) for 4 months that represent the four seasons]. The value of CDotW ranged from 0.854 to 1.138 and had greater amplitude during the spring and summer (greater lawn irrigation) than the winter. Day of the week values for all 12 months are listed in Table 1.
Fig. 5. Modeling results for El Paso, Texas (2006–2010): (a) monthly coefficient, CM; (b) irrigation schedule; and (c) day of week coefficient, CDotW.
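Table 1. Day of week coefficient (CDotW)

| Month | Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|---|---|
| January | 1.048 | 0.981 | 0.976 | 1.013 | 1.039 | 0.985 | 0.959 |
| February | 1.027 | 1.006 | 0.961 | 1.024 | 1.018 | 1.010 | 0.953 |
| March | 1.048 | 0.957 | 0.917 | 1.072 | 1.047 | 1.000 | 0.959 |
| April | 0.970 | 1.008 | 0.882 | 1.095 | 1.082 | 0.970 | 0.994 |
| May | 0.959 | 0.960 | 0.896 | 1.076 | 1.090 | 1.026 | 0.993 |
| June | 0.964 | 0.886 | 0.854 | 1.090 | 1.097 | 1.058 | 1.050 |
| July | 0.899 | 0.946 | 0.943 | 1.138 | 1.081 | 1.003 | 0.989 |
| August | 0.955 | 0.989 | 0.925 | 1.031 | 1.089 | 1.027 | 0.984 |
| September | 0.979 | 0.960 | 0.907 | 1.075 | 1.074 | 1.001 | 1.003 |
| October | 0.976 | 0.956 | 0.936 | 1.021 | 1.040 | 1.037 | 1.035 |
| November | 1.019 | 1.017 | 0.951 | 1.047 | 1.057 | 0.956 | 0.952 |
| December | 0.993 | 1.001 | 0.970 | 1.019 | 1.058 | 0.995 | 0.964 |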
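As a quick consistency check on Table 1, the short script below confirms that the seven values for each month average to unity, as required by the calibration constraint, and recovers the reported range of CDotW. The variable names are hypothetical; the numbers are copied from the table (Sunday through Saturday).

```python
C_DOTW = {
    "January":   [1.048, 0.981, 0.976, 1.013, 1.039, 0.985, 0.959],
    "February":  [1.027, 1.006, 0.961, 1.024, 1.018, 1.010, 0.953],
    "March":     [1.048, 0.957, 0.917, 1.072, 1.047, 1.000, 0.959],
    "April":     [0.970, 1.008, 0.882, 1.095, 1.082, 0.970, 0.994],
    "May":       [0.959, 0.960, 0.896, 1.076, 1.090, 1.026, 0.993],
    "June":      [0.964, 0.886, 0.854, 1.090, 1.097, 1.058, 1.050],
    "July":      [0.899, 0.946, 0.943, 1.138, 1.081, 1.003, 0.989],
    "August":    [0.955, 0.989, 0.925, 1.031, 1.089, 1.027, 0.984],
    "September": [0.979, 0.960, 0.907, 1.075, 1.074, 1.001, 1.003],
    "October":   [0.976, 0.956, 0.936, 1.021, 1.040, 1.037, 1.035],
    "November":  [1.019, 1.017, 0.951, 1.047, 1.057, 0.956, 0.952],
    "December":  [0.993, 1.001, 0.970, 1.019, 1.058, 0.995, 0.964],
}

# Mean of the seven day-of-week values for each month
row_means = {month: sum(vals) / len(vals) for month, vals in C_DOTW.items()}
max_deviation = max(abs(mean - 1.0) for mean in row_means.values())

all_values = [v for vals in C_DOTW.values() for v in vals]
print(f"largest deviation of a monthly mean from unity: {max_deviation:.4f}")
print(f"range of C_DotW: {min(all_values)} to {max(all_values)}")
```

Each monthly mean is within rounding error of unity, so the day of the week coefficient redistributes demand within a week without changing the monthly level set by CM.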
Selected values of the holiday constant, CH, are listed in Table 2, and holidays with values farthest from unity include regional holidays such as Cinco de Mayo. The unconstrained average of this factor was unity, which indicates that because the holiday constant was determined after the other factors and constants, no significant bias was corrected or introduced by the CH constant.
Table 2. Holiday coefficient (CH)

| Holiday | Date | CH |
|---|---|---|
| New Year’s Day | January 1 | 0.91 |
| Unknown | March 5 | 0.94 |
| Unknown | March 7 | 0.94 |
| Unknown | March 9 | 0.93 |
| Easter | April 21 | 1.07 |
| Cinco De Mayo | May 5 | 1.07 |
| Unknown | May 16 | 0.94 |
| Unknown | May 18 | 1.06 |
| Unknown | July 1 | 1.08 |
| Independence Day | July 4 | 1.06 |
| Unknown | July 30 | 0.94 |
| Unknown | August 1 | 0.93 |
| Unknown | September 2 | 0.94 |
| Halloween | October 31 | 0.94 |
| Thanksgiving | November 22 | 0.93 |
| Thanksgiving | November 23 | 0.94 |
| Thanksgiving | November 24 | 0.94 |
| Thanksgiving | November 27 | 0.94 |
| Thanksgiving | November 29 | 0.94 |
| Thanksgiving | November 30 | 0.94 |
| Christmas Eve | December 24 | 0.92 |
| Christmas | December 25 | 0.87 |
Following the meteorological and seasonal regressions, the weighting factors, W1 and W2, were determined to be 0.592 and 0.408, respectively.

Statistical Analyses

Fig. 6 displays historical and predicted daily water demand values, and model performance evaluation results are listed in Table 3. One way to verify the results of a regression model is to check the relative errors for normality. Following the rule of thumb by Bulmer (1979), skewness of errors between −0.5 and 0.5 can be interpreted as approximately symmetric, and an excess kurtosis between −2 and 2 is acceptable (George and Mallery 2009). Visual inspection of the error distribution gave no indication of nonnormality, so the errors were assumed to be normally distributed, and the skewness and excess kurtosis values (0.12 and 0.94, respectively) gave no reason to reject this assumption. Fig. 7 shows the nonexceedance curve for errors for the validation period of 2011–2015; more than 90% of the errors were less than 10%.
Fig. 6. Historical and modeled daily water demand for validation period (2011–2015).
Table 3. Characterization of model performance with respect to predicting daily water demand

| Performance parameter | Calibration (2006–2010) | Validation (2011–2015) |
|---|---|---|
| Nash–Sutcliffe efficiency | 0.94 | 0.945 |
| Correlation coefficient (r) | 0.97 | 0.98 |
| Coefficient of determination (R2) | 0.95 | 0.952 |
| Upper-limit relative error, overpredicted (%) | 22.48 | 26.15 |
| Lower-limit relative error, underpredicted (%) | 16.92 | 20.82 |
| Mean relative error (%) | 0.31 | 1.30 |
| Mean absolute-value relative error (%) | 4.44 | 4.38 |
| Standard error of estimate (ML/day) | 22.17 | 22.13 |
| Relative standard error of estimate (%) | 5.83 | 5.82 |
| RMS error (ML/day) | 22.27 | 22.71 |
| Mean relative squared error (%) | 0.32 | 0.32 |
| Relative RMS error (%) | 5.68 | 5.71 |
| Mean absolute-value peak-day error (%) | 2.70 | 2.38 |
| Max absolute-value peak-day error (%) | 6.58 | 5.98 |
| 5th percentile daily demand error (ML/day) | 40.91 | 28.46 |
| 25th percentile daily demand error (ML/day) | 14.44 | 8.93 |
| 50th percentile daily demand error (ML/day) | 1.84 | 3.65 |
| 75th percentile daily demand error (ML/day) | 10.72 | 18.06 |
| 95th percentile daily demand error (ML/day) | 34.51 | 43.23 |
| 5th percentile relative error (%) | 9.68 | 7.04 |
| 25th percentile relative error (%) | 3.93 | 2.48 |
| 50th percentile relative error (%) | 0.54 | 1.03 |
| 75th percentile relative error (%) | 3.21 | 4.73 |
| 95th percentile relative error (%) | 9.35 | 10.46 |
| 5th percentile absolute value relative error (%) | 0.30 | 0.31 |
| 25th percentile absolute value relative error (%) | 1.75 | 1.71 |
| 50th percentile absolute value relative error (%) | 3.56 | 3.54 |
| 75th percentile absolute value relative error (%) | 6.31 | 6.07 |
| 95th percentile absolute value relative error (%) | 11.50 | 11.25 |
Fig. 7. Nonexceedance curve of errors for validation period (2011–2015).
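Several of the indicators in Table 3 can be computed with a few lines of code. The sketch below uses common textbook definitions, which are assumptions here because the paper does not spell out its formulas: relative error is taken as (predicted − observed)/observed, and the relative RMS error is taken as the square root of the mean squared relative error. The latter is at least consistent with Table 3, where the square root of 0.32% is about 5.7%. Function names and the sample numbers are hypothetical.

```python
import math


def relative_errors(predicted, observed):
    """Signed relative errors; positive values mean overprediction (assumed)."""
    return [(p - o) / o for p, o in zip(predicted, observed)]


def nash_sutcliffe(predicted, observed):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for p, o in zip(predicted, observed))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def mean_abs_relative_error(predicted, observed):
    errs = relative_errors(predicted, observed)
    return sum(abs(e) for e in errs) / len(errs)


def relative_rms_error(predicted, observed):
    errs = relative_errors(predicted, observed)
    return math.sqrt(sum(e * e for e in errs) / len(errs))


# Made-up demand values (ML/day) for illustration only
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
nse = nash_sutcliffe(predicted, observed)
mare = mean_abs_relative_error(predicted, observed)
rrmse = relative_rms_error(predicted, observed)
print(f"NSE = {nse:.3f}, MARE = {mare:.2%}, RRMSE = {rrmse:.2%}")
```

Because the relative RMS error squares each error before averaging, it is never smaller than the mean absolute-value relative error, which matches the ordering of those two rows in Table 3.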

Discussion

During the validation period, the model demonstrated acceptable performance compared with the goodness-of-fit factors in the literature for similar modeling applications. Commonly reported R2 values within recent literature ranged between 0.75 and 0.85, and some models produced R2 values greater than 0.9 (most of which employ ANN or SVM). When using ANN or SVM models, units may not be conserved because they might use exponential beta values, log functions, or other methods that result in the loss of units, whereas the model presented here conserves units by identifying and solving for scalar terms and using unit consumption and population to derive the units of megaliters per day. Another difference is that ANN or SVM methods might output only the desired forecasted values, whereas the parameters used in this model provide insight into the behavior of the system. For example, the temperature and precipitation response curves used in this model might be used for other applications and decision-making policies, such as considering a new building code standard for smart sprinkler systems that automatically will decrease irrigation based on precipitation. If landscape irrigation represents a significant fraction of total demand in that community, and this model shows that the demand historically has been insensitive to precipitation events, then enforcing such a building code likely would help reduce demand.
One of the key features of the presented model is its ability to predict annual peak-day demand. Other studies focused primarily on average model error, but additional information is important to water utility management. The annual peak-day demand dictates the minimum capacity of system water treatment and storage infrastructure. For prediction of the peak-day demand during the validation period, the mean absolute-value peak-day relative error was 2.78% and the maximum absolute-value peak-day relative error was 5.98%. The statistical modeling approach to estimating the annual peak-day demand advances the capability to quantify this critical factor, which currently is represented at some water utilities using a simple average or rule-of-thumb approach (e.g., City of Oxnard 2017; Vallecitos Water District 2018). A municipal water utility that calibrates this model with its own data could, in principle, use population growth projections to estimate peak-day demand in support of capital improvements planning for plant expansions and distribution storage capacity.
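A minimal sketch of how annual peak-day relative errors could be tabulated is given here. It assumes that the peak-day error compares the largest predicted daily demand in each year with the largest observed daily demand in that year, which need not fall on the same day. That definition, the function names, and the example values are illustrative assumptions rather than the study's procedure.

```python
import numpy as np

def peak_day_relative_errors(years, observed, predicted):
    """Relative error (%) of the annual peak-day demand for each year.

    Assumption: the predicted peak is the annual maximum of the predicted series,
    compared with the annual maximum of the observed series.
    """
    years = np.asarray(years)
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = {}
    for year in np.unique(years):
        mask = years == year
        obs_peak = observed[mask].max()
        pred_peak = predicted[mask].max()
        errors[int(year)] = 100.0 * (pred_peak - obs_peak) / obs_peak
    return errors

def summarize(errors):
    """Mean and maximum absolute-value peak-day relative error (%)."""
    abs_err = np.abs(list(errors.values()))
    return abs_err.mean(), abs_err.max()

if __name__ == "__main__":
    # Hypothetical two-year example with two days per year (ML/day).
    years = [2011, 2011, 2012, 2012]
    observed = [100.0, 200.0, 100.0, 400.0]
    predicted = [110.0, 190.0, 420.0, 100.0]
    errs = peak_day_relative_errors(years, observed, predicted)
    print(errs, summarize(errs))
```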
Furthermore, a municipal water utility that calibrates this model with its own data could, in principle, use short-term weather forecast data to optimize the timing of water production. For example, if short-term weather forecasts called for cooler weather, a water utility could reduce its electrical consumption by shifting more production to shoulder and off-peak hours.

Limitations

All models are sensitive to the accuracy of their source data, so the input data introduce some uncertainty into this model. One such limitation is the accuracy and resolution of population data for the study area. The US Census Bureau is one of the most reliable sources for population counts, but it releases data only as annual values. Therefore, it was necessary to interpolate the data from one year to the next, which introduces uncertainty about what occurs from day to day. For example, winter snowbirds or rapid changes in the population of Fort Bliss are not necessarily captured in US Census Bureau data. Higher-resolution data have made models increasingly accurate, but the public availability of population figures appears to have lagged in this respect. Similarly, the lack of spatial modeling methods such as those used with WaterGAP (Eicker et al. 2014) also can contribute uncertainty. The model described herein treats the El Paso water demand system as a single entity even though multiple water treatment facilities serve different parts of the city. Spatial modeling methods could reduce uncertainty by distinguishing sections of the city that use water in different ways.
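The interpolation step described above can be sketched as follows. This is a minimal illustration that assumes each annual estimate refers to July 1 of its year and that population varies linearly between estimates. The reference date, the function name, and the example populations are hypothetical and are not taken from the study.

```python
from datetime import date
import numpy as np

def daily_population(annual_estimates, start, end):
    """Linearly interpolate annual population estimates to daily values.

    annual_estimates: dict {year: population}. Each estimate is assumed to
    refer to July 1 of that year (an assumption, not from the paper).
    Days outside the range of the estimates take the nearest estimate,
    because np.interp holds the end values constant.
    """
    years = sorted(annual_estimates)
    anchor_days = np.array([date(y, 7, 1).toordinal() for y in years], dtype=float)
    anchor_pop = np.array([annual_estimates[y] for y in years], dtype=float)
    days = np.arange(start.toordinal(), end.toordinal() + 1)
    return days, np.interp(days, anchor_days, anchor_pop)

if __name__ == "__main__":
    # Hypothetical annual estimates (persons).
    estimates = {2010: 650_000, 2011: 660_000, 2012: 668_000}
    days, pop = daily_population(estimates, date(2010, 7, 1), date(2012, 7, 1))
    print(len(days), pop[0], pop[-1])
```

A smooth interpolation of this kind cannot represent seasonal residents or abrupt changes at a military installation, which is the source of the day-to-day uncertainty noted above.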

Conclusions

Demand modeling is essential to water utilities for reliable water supply management. By accounting for the long-term trend, seasonal patterns, and daily meteorological variability, a robust statistical methodology for modeling urban water demand was developed. It predicts daily water demand from the 365-day trailing average per capita consumption and performs comparably to reported ANN and SVM models. The temperature and precipitation factors adjust demand for daily meteorological conditions, and the monthly and day-of-the-week constants account for seasonal and weekly patterns. Furthermore, the model includes a factor for holidays, which affect urban water demand.
The developed methodology was calibrated and validated with data from the city of El Paso, Texas, requiring only four data sets (daily water demand, population, temperature, and precipitation). For the validation period (2011–2015), the model performed well, with a coefficient of determination (R2) of 0.95, a Nash–Sutcliffe efficiency of 0.94, a mean absolute-value relative error of 4.38%, a relative standard error of estimate of 5.82%, a relative RMS error of 5.71%, and a mean absolute-value peak-day error of 2.78%.
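For readers who wish to compute comparable statistics for their own calibration, a minimal sketch of several of these goodness-of-fit measures is given here. It assumes that R2 is the squared Pearson correlation coefficient and that the relative RMS error is the RMS error divided by the mean observed demand. The normalization and the function name are assumptions, and the relative standard error of estimate is omitted because its degrees-of-freedom correction is not specified here.

```python
import numpy as np

def goodness_of_fit(observed, predicted):
    """Common goodness-of-fit statistics for a daily demand model.

    Assumptions: R2 is the squared Pearson correlation; relative RMS error is
    RMSE normalized by the mean observed value.
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    resid = pred - obs
    r = np.corrcoef(obs, pred)[0, 1]
    # Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance.
    nse = 1.0 - np.sum(resid**2) / np.sum((obs - obs.mean())**2)
    # Mean absolute-value relative error (%).
    mare = 100.0 * np.mean(np.abs(resid) / obs)
    # Relative RMS error (%).
    rel_rmse = 100.0 * np.sqrt(np.mean(resid**2)) / obs.mean()
    return {"r2": r**2, "nse": nse, "mare_pct": mare, "rel_rmse_pct": rel_rmse}

if __name__ == "__main__":
    # Hypothetical three-day example (ML/day).
    print(goodness_of_fit([100.0, 200.0, 300.0], [110.0, 200.0, 290.0]))
```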
The presented methodology improves the capability to estimate annual peak-day demand in a systematic way compared with arbitrary rule-of-thumb approaches. Meteorological sensitivity was a key objective for this model, and by including two weather variables (daily temperature and precipitation), subsequent modeling efforts can simulate the effects of future climate scenarios on El Paso’s municipal water demand. Considering the relatively simple data requirements for calibration, and considering the relatively high accuracy of predictions, this generic predictive water demand modeling approach is transferable to other cities seeking to characterize daily water demand. In future work, we intend to apply this modeling methodology to other cities in arid and semiarid regions of the world.

Notation

The following symbols are used in this paper:
$C_{DotW}(m,d)$ = day-of-week constant for month $m$ on weekday $d$;
$C_H(c)$ = holiday constant for calendar day $c$;
$C_M(m)$ = month constant for month $m$;
$C_{p1}$ = location-specific precipitation function constant;
$C_{p2}$ = location-specific precipitation function constant (day/cm);
$C_{T1}$, $C_{T2}$ = location-specific temperature function constants;
$C_{T3}$, $C_{T4}$ = location-specific temperature function constants (°C);
$f_{precip}(\bar{p}_3)$ = precipitation function;
$f_{temp}(T)$ = temperature function;
$\bar{p}_3$ = 3-day average precipitation (cm/day);
$P$ = population (day-of) (persons);
$\bar{P}_{365}$ = average population for previous 365 days (persons);
$Q_{mdc}$ = predicted demand in month $m$, on weekday $d$, on calendar day $c$ (ML/day);
$\bar{Q}$ = average predicted demand of previous 3 days (ML/day);
$\bar{Q}_{365}$ = average demand for previous 365 days (ML/day);
$R^2$ = Pearson coefficient of determination;
$T$ = average daily temperature (°C);
$U_{i-1}$ = previous-day unit consumption, 365-day trailing average (L/person/day); and
$W_1$, $W_2$ = weighting factors.

Subscripts

$c$ = calendar day (1–366);
$d$ = day of week (Su, M, T, W, R, F, Sa); and
$m$ = month of year (1–12).
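As an illustration of how the trailing-average quantities in the notation relate to one another, the sketch below computes a 365-day trailing unit consumption from daily demand and population series. It assumes that unit consumption is the ratio of the trailing average demand to the trailing average population, converted from ML to L, and that the trailing window covers the previous 365 days and excludes the current day. Both the relation and the function names are assumptions for illustration and are not the calibrated model.

```python
import numpy as np

def trailing_mean(values, window=365):
    """Mean of the previous `window` values for each day i (day i itself excluded).

    Entries without a full window of history are returned as NaN.
    """
    values = np.asarray(values, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(values)))
    out = np.full(values.size, np.nan)
    # For day i >= window: sum(values[i - window:i]) = csum[i] - csum[i - window].
    out[window:] = (csum[window:-1] - csum[:-window - 1]) / window
    return out

def unit_consumption(demand_ml_per_day, population, window=365):
    """Trailing-average unit consumption in L/person/day (assumed relation)."""
    q_trailing = trailing_mean(demand_ml_per_day, window)  # ML/day
    p_trailing = trailing_mean(population, window)         # persons
    return 1.0e6 * q_trailing / p_trailing                 # 1 ML = 1e6 L

if __name__ == "__main__":
    # Hypothetical constant series: 100 ML/day serving 500,000 persons.
    demand = np.full(400, 100.0)
    population = np.full(400, 500_000.0)
    u = unit_consumption(demand, population)
    print(u[365])  # first day with a full 365-day history
```

Multiplying such a unit consumption by the day-of population gives a baseline demand in L/day, which the meteorological and calendar factors would then adjust.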

Data Availability Statement

Some data used during the study were provided by a third party. Direct requests for these materials may be made to the provider as indicated in the Acknowledgments. All data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

This research was funded by the United States Dept. of Agriculture (USDA) under Grant No. 2015-68007-23130 to work with regional stakeholders on developing a shared understanding of future scenarios of water availability and use in the Middle Rio Grande Valley of southern New Mexico, west Texas, and northern Chihuahua. We sincerely thank El Paso Water for sharing the daily water demand data and answering questions about the data; this work would not have been possible without their support.

References

Adamowski, J., H. Fung Chan, S. O. Prasher, B. Ozga-Zielinski, and A. Sliusarieva. 2012. “Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada.” Water Resour. Res. 48 (1): 1–14. https://doi.org/10.1029/2010WR009945.
Alcamo, J., P. Döll, T. Henrichs, F. Kaspar, B. Lehner, T. Rösch, and S. Siebert. 2003. “Development and testing of the WaterGAP 2 global model of water use and availability.” Hydrol. Sci. J. 48 (3): 317–337. https://doi.org/10.1623/hysj.48.3.317.45290.
Alcamo, J., P. Döll, F. Kaspar, and S. Siebert. 1997. Global change and global scenarios of water use and availability: An application of WaterGAP 1.0. Kassel, Germany: Center for Environmental Systems Research, Univ. of Kassel.
Al-Zahrani, M. A., and A. Abo-Monasar. 2015. “Urban residential water demand prediction based on artificial neural networks and time series models.” Water Resour. Manage. 29 (10): 3651–3662. https://doi.org/10.1007/s11269-015-1021-z.
Bakker, M., H. van Duist, K. van Schagen, J. Vreeburg, and L. Rietveld. 2014. “Improving the performance of water demand forecasting models by using weather input.” Procedia Eng. 70 (Jan): 93–102. https://doi.org/10.1016/j.proeng.2014.02.012.
Beal, C. D., and R. A. Stewart. 2014. “Identifying residential water end uses underpinning peak day and peak hour demand.” J. Water Resour. Plann. Manage. 140 (7). https://doi.org/10.1061/(ASCE)WR.1943-5452.0000357.
Bennett, N. D., et al. 2013. “Characterising performance of environmental models.” Environ. Modell. Software 40 (Feb): 1–20. https://doi.org/10.1016/J.ENVSOFT.2012.09.011.
Berkes, F., C. Folke, and J. Colding. 1998. Linking social and ecological systems: Management practices and social mechanisms for building resilience. Cambridge, UK: Cambridge University Press.
Billings, R. B., and D. E. Agthe. 1998. “State-space versus multiple regression for forecasting urban water demand.” J. Water Resour. Plann. Manage. 124 (2). https://doi.org/10.1061/(ASCE)0733-9496(1998)124:2(113).
Blokker, E. J. M., J. H. G. Vreeburg, and J. C. Van Dijk. 2010. “Simulating residential water demand with a stochastic end-use model.” J. Water Resour. Plann. Manage. 136 (1): 19–26. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000002.
Bulmer, M. 1979. “Concepts in the analysis of qualitative data.” Sociol. Rev. 27 (4): 651–677. https://doi.org/10.1111/j.1467-954X.1979.tb00354.x.
City of Oxnard. 2017. “Public works integrated master plan.” Accessed July 8, 2020. https://www.oxnard.org/wp-content/uploads/2017/09/PM-2.2.pdf.
Dokumentov, A., and R. J. Hyndman. 2015. “STR: A seasonal-trend decomposition procedure based on regression.” Accessed November 3, 2020. https://robjhyndman.com/papers/wp13-15.pdf.
Donkor, E. A., T. A. Mazzuchi, R. Soyer, and J. Alan Roberson. 2014. “Urban water demand forecasting: Review of methods and models.” J. Water Resour. Plann. Manage. 140 (2): 146–159. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000314.
Eicker, A., M. Schumacher, J. Kusche, P. Döll, and H. M. Schmied. 2014. “Calibration/data assimilation approach for integrating GRACE data into the WaterGAP global hydrology model (WGHM) using an ensemble Kalman filter: First results.” Surv. Geophys. 35 (6): 1285–1309. https://doi.org/10.1007/s10712-014-9309-8.
Falkenmark, M., J. Lundqvist, and C. Widstrand. 1989. “Macro-scale water scarcity requires micro-scale approaches. Aspects of vulnerability in semi-arid development.” Nat. Resour. Forum 13 (4): 258–267. https://doi.org/10.1111/j.1477-8947.1989.tb00348.x.
George, D., and M. Mallery. 2009. SPSS for windows step by step: A simple guide and reference, 17.0 update. Boston: Allyn & Bacon.
Ghiassi, M., D. K. Zimbra, and H. Saidane. 2008. “Urban water demand forecasting with a dynamic artificial neural network model.” J. Water Resour. Plann. Manage. 134 (2): 138–146. https://doi.org/10.1061/(ASCE)0733-9496(2008)134:2(138).
Guo, G., S. Liu, Y. Wu, J. Li, R. Zhou, and X. Zhu. 2018. “Short-term water demand forecast based on deep learning method.” J. Water Resour. Plann. Manage. 144 (12): 04018076. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000992.
Homwongs, C., T. Sastri, and J. W. Foster III. 1994. “Adaptive forecasting of hourly municipal water consumption.” J. Water Resour. Plann. Manage. 120 (6): 888–905. https://doi.org/10.1061/(ASCE)0733-9496(1994)120:6(888).
House-Peters, L. A., and H. Chang. 2011. “Urban water demand modeling: Review of concepts, methods, and organizing principles.” Water Resour. Res. 47 (5): 1–15. https://doi.org/10.1029/2010WR009624.
Lee, D., and S. Derrible. 2020. “Predicting residential water demand with machine-based statistical learning.” J. Water Resour. Plann. Manage. 146 (1): 04019067. https://doi.org/10.1061/(ASCE)WR.1943-5452.0001119.
Levinson, N. 1946. “The Wiener RMS (root mean square) error criterion in filter design and prediction.” J. Math. Phys. 25 (1–4): 261–278. https://doi.org/10.1002/sapm1946251261.
Maidment, D. R., and S. P. Miaou. 1986. “Daily water use in nine cities.” Water Resour. Res. 22 (6): 845–851. https://doi.org/10.1029/WR022i006p00845.
Maidment, D. R., S. P. Miaou, and M. M. Crawford. 1985. “Transfer function models of daily urban water use.” Water Resour. Res. 21 (4): 425–432. https://doi.org/10.1029/WR021i004p00425.
Maidment, D. R., and E. Parzen. 1984. “Time patterns of water use in six Texas cities.” J. Water Resour. Plann. Manage. 110 (1): 90–106. https://doi.org/10.1061/(ASCE)0733-9496(1984)110:1(90).
Msiza, I. S., F. V. Nelwamondo, and T. Marwala. 2008. “Water demand prediction using artificial neural networks and support vector regression.” J. Comput. 3 (11): 1–8.
NOAA (National Oceanic and Atmospheric Administration). 2020. “Data tools: Find a station.” Accessed November 3, 2020. https://www.ncdc.noaa.gov/cdo-web/datatools/findstation.
Pearson, K. 1909. “Determination of the coefficient of correlation.” Science 30 (757): 23–25. https://doi.org/10.1126/science.30.757.23.
Peel, M. C., B. L. Finlayson, and T. A. McMahon. 2007. “Updated world map of the Köppen-Geiger climate classification.” Hydrol. Earth Syst. Sci. Discuss. 4 (2): 439–473. https://doi.org/10.5194/hessd-4-439-2007.
Piri, J., S. Amin, A. Moghaddamnia, A. Keshavarz, D. Han, R. Remesan, and O. Kisi. 2009. “Daily pan evaporation modeling in a hot and dry climate.” J. Hydrol. Eng. 14 (8): 803. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000056.
Sardinha-Lourenço, A., A. Andrade-Campos, A. Antunes, and M. S. Oliveira. 2018. “Increased performance in the short-term water demand forecasting through the use of a parallel adaptive weighting strategy.” J. Hydrol. 558 (Mar): 392–404. https://doi.org/10.1016/j.jhydrol.2018.01.047.
Sharvelle, S., A. Dozier, M. Arabi, and B. Reichel. 2017. “A geospatially-enabled web tool for urban water demand forecasting and assessment of alternative urban water management strategies.” Environ. Modell. Software 97 (Nov): 213–228. https://doi.org/10.1016/j.envsoft.2017.08.009.
Sutanudjaja, E. H., et al. 2018. “PCR-GLOBWB 2: A 5 arcmin global hydrological and water resources model.” Geosci. Model Dev. 11 (6): 2429–2453. https://doi.org/10.5194/gmd-11-2429-2018.
US Census Bureau. 2019. “American Community Survey (ACS), Total population, El Paso city, Texas, 2019 estimate.” Accessed November 3, 2020. https://data.census.gov/cedsci/table?q=El%20Paso%20city,%20Texas%26tid=ACSDT1Y2019.B01003.
US Census Bureau. 2020. “QuickFacts – El Paso County, Texas.” Accessed November 2, 2020. https://www.census.gov/quickfacts/fact/table/elpasocountytexas#.
USDA. 2012. “Census of agriculture: 2012 state and county profiles.” Accessed April 5, 2019. https://www.nass.usda.gov/Publications/AgCensus/2012/Online_Resources/County_Profiles/.
Vallecitos Water District. 2018. “2018 water, wastewater, and recycled water master plan.” Accessed July 8, 2020. http://www.vwd.org/home/showdocument?id=10656.
Villarin, M. C., and V. F. Rodriguez-Galiano. 2019. “Machine learning for modeling water demand.” J. Water Resour. Plann. Manage. 145 (5). https://doi.org/10.1061/(ASCE)WR.1943-5452.0001067.
Weather Underground. 2014. “El Paso, TX history.” Accessed April 5, 2019. https://www.wunderground.com/history/daily/KELP/date/2014-6-5.
Willuweit, L., and J. J. O’Sullivan. 2013. “A decision support tool for sustainable planning of urban water systems: Presenting the dynamic urban water simulation model.” Water Res. 47 (20): 7206–7220. https://doi.org/10.1016/j.watres.2013.09.060.
Wong, J. S., Q. Zhang, and Y. D. Chen. 2010. “Statistical modeling of daily urban water consumption in Hong Kong: Trend, changing patterns, and forecast.” Water Resour. Res. 46 (3): 1–10. https://doi.org/10.1029/2009WR008147.
Zhou, S. L., T. A. McMahon, A. Walton, and J. Lewis. 2000. “Forecasting daily urban water demand: A case study of Melbourne.” J. Hydrol. 236 (3–4): 153–164. https://doi.org/10.1016/S0022-1694(00)00287-0.

Published In

Journal of Water Resources Planning and Management
Volume 147, Issue 2, February 2021

History

Received: Apr 15, 2020
Accepted: Aug 21, 2020
Published online: Dec 4, 2020
Published in print: Feb 1, 2021
Discussion open until: May 4, 2021

Authors

Affiliations

Tallen Capt
Dept. of Civil Engineering, Univ. of Texas at El Paso, 500 W University Ave., El Paso, TX 79968.
Assistant Professor, Dept. of Biosystems and Agricultural Engineering, Oklahoma State Univ., 111 Agriculture Hall, Stillwater, OK 74078. ORCID: https://orcid.org/0000-0002-9649-2964.
Saurav Kumar, M.ASCE
Assistant Professor, Dept. of Biological and Agricultural Engineering, Texas A&M Univ., AgriLife Research, 1380 A&M Circle, El Paso, TX 79927.
Director of the Center for Inland Desalination Systems and Associate Professor, Dept. of Civil Engineering, Univ. of Texas at El Paso, 500 W University Ave., El Paso, TX 79968 (corresponding author). ORCID: https://orcid.org/0000-0002-4136-8499.
