Technical Papers
May 21, 2021

Deep Reinforcement Learning for Optimal Hydropower Reservoir Operation

Publication: Journal of Water Resources Planning and Management
Volume 147, Issue 8

Abstract

Optimal operation of hydropower reservoir systems is a classical optimization problem of high dimensionality and stochastic nature. A key challenge lies in improving the interpretability of operation strategies, i.e., the cause–effect relationships between system outputs (or actions) and contributing variables such as states and inputs. This paper reports for the first time a new deep reinforcement learning (DRL) framework for optimal operation of reservoir systems based on deep Q-networks (DQNs), which provides a significant advance in understanding the performance of optimal operations. A DQN combines Q-learning with two deep artificial neural networks (ANNs) and acts as the agent that interacts with the reservoir system, learning its states and providing actions. Three knowledge forms of learning, considering the states, actions, and rewards, were constructed to improve the interpretability of operation strategies, and the impacts of these knowledge forms and of the DRL learning parameters on operation performance were analyzed. The DRL framework was tested on the Huanren hydropower system in China, using 400 years of synthetic flow data for training and 30 years of observed flow data for verification. The discretization levels of reservoir water level and energy output had contrasting effects: finer discretization of the water level improved performance in terms of annual hydropower generation and hydropower production reliability, whereas finer discretization of hydropower production reduced search efficiency and thus degraded the resulting DRL performance. Compared with benchmark algorithms, including dynamic programming, stochastic dynamic programming, and decision trees, the proposed DRL approach can effectively factor in future inflow uncertainties when determining optimal operations and can generate markedly higher hydropower. This study provides new knowledge of the performance of DRL in the context of hydropower system characteristics and data input features, and shows promise for practical implementation to derive operation policies that can be updated automatically by learning from new data.
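
As a rough illustration of the DQN setup the abstract describes (Q-learning coupled with two deep ANNs, an online network and a periodically synchronized target network, with discretized reservoir water levels as states and discretized energy outputs as actions), a minimal Python sketch follows. The toy reservoir dynamics, reward, inflow model, network sizes, and all hyperparameters below are illustrative assumptions, not the authors' implementation.

```python
# Minimal DQN sketch for discretized reservoir operation.
# Assumptions (not from the paper): toy transition and reward model,
# synthetic inflows, network sizes, and all hyperparameters.
import random
from collections import deque

import torch
import torch.nn as nn

N_LEVELS, N_ACTIONS = 50, 10        # discretization of water level / energy output
GAMMA, EPS, BATCH, SYNC = 0.99, 0.1, 32, 200

def q_net() -> nn.Module:           # small fully connected Q-network
    return nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))

online, target = q_net(), q_net()
target.load_state_dict(online.state_dict())        # the two ANNs of a DQN
opt = torch.optim.Adam(online.parameters(), lr=1e-3)
memory: deque = deque(maxlen=10_000)               # experience replay buffer

def simulate(level: int, inflow: float, action: int):
    """Toy stand-in for the reservoir: releases generate power, inflow refills."""
    release = 5.0 * action / (N_ACTIONS - 1)
    power = release * level / N_LEVELS             # crude flow-times-head proxy
    nxt = min(N_LEVELS - 1, max(0, round(level + inflow - release)))
    return nxt, power - (1.0 if nxt == 0 else 0.0) # penalize emptying the store

def features(level: int, inflow: float) -> torch.Tensor:
    return torch.tensor([level / N_LEVELS, inflow / 5.0])

level, inflow = N_LEVELS // 2, 2.5
for t in range(5_000):
    s = features(level, inflow)
    if random.random() < EPS:                      # epsilon-greedy exploration
        a = random.randrange(N_ACTIONS)
    else:
        with torch.no_grad():
            a = online(s).argmax().item()
    nxt_level, reward = simulate(level, inflow, a)
    nxt_inflow = random.uniform(0.0, 5.0)          # synthetic stochastic inflow
    memory.append((s, a, reward, features(nxt_level, nxt_inflow)))
    level, inflow = nxt_level, nxt_inflow

    if len(memory) >= BATCH:
        batch = random.sample(memory, BATCH)
        s_b = torch.stack([b[0] for b in batch])
        a_b = torch.tensor([b[1] for b in batch])
        r_b = torch.tensor([b[2] for b in batch])
        s2_b = torch.stack([b[3] for b in batch])
        with torch.no_grad():                      # Q-learning target, bootstrapped
            y = r_b + GAMMA * target(s2_b).max(dim=1).values
        q = online(s_b).gather(1, a_b.unsqueeze(1)).squeeze(1)
        loss = nn.functional.mse_loss(q, y)
        opt.zero_grad(); loss.backward(); opt.step()
    if t % SYNC == 0:                              # sync target network
        target.load_state_dict(online.state_dict())
```

Note that a larger N_ACTIONS enlarges both the Q-network's output layer and the action search space, which is one way to read the abstract's finding that overly fine discretization of hydropower production can reduce search efficiency.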

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request. Data include the synthetic and observed flow time series. The code that was used for the deep reinforcement learning also is available.

Acknowledgments

This research is supported by the National Natural Science Foundation of China (Grant No. 51609025), the UK Royal Society through an industry fellowship to Guangtao Fu (Ref: IF160108) and an international collaboration project (Ref: IEC\NSFC\170249), the Open Fund Approval (SKHL1713, 2017), and a Chongqing technology innovation and application demonstration project (cstc2018jscx-msybX0274 and cstc2016shmszx30002). Guangtao Fu and Weisi Guo also are supported by The Alan Turing Institute under the Engineering and Physical Sciences Research Council (EPSRC) (Grant No. EP/N510129/1). Special thanks are given to the Hun River Cascade Hydropower Development Company and Dalian University of Technology for the case study data.

Information & Authors

Published In

Journal of Water Resources Planning and Management
Volume 147, Issue 8, August 2021

History

Received: Apr 15, 2020
Accepted: Feb 21, 2021
Published online: May 21, 2021
Published in print: Aug 1, 2021
Discussion open until: Oct 21, 2021

Authors

Affiliations

Associate Professor, College of River and Ocean Engineering, Chongqing Jiaotong Univ., No. 66 Xuefu Rd., Nan’an District, Chongqing 400074, China. Email: [email protected]
Fanlin Meng [email protected]
Research Fellow, Center for Water Systems, Univ. of Exeter, Exeter EX4 4QF, UK. Email: [email protected]
Professor, School of Aerospace, Transport and Manufacturing, Cranfield Univ., College Road, Bedford, Bedfordshire MK43 0AL, UK. Email: [email protected]
Associate Professor, College of River and Ocean Engineering, Chongqing Jiaotong Univ., No. 66 Xuefu Rd., Nan’an District, Chongqing 400074, China. Email: [email protected]
Professor, Center for Water Systems, Univ. of Exeter, Exeter EX4 4QF, UK; Turing Fellow, Alan Turing Institute, 96 Euston Rd., London NW1 2DB, UK (corresponding author). ORCID: https://orcid.org/0000-0003-1045-9125. Email: [email protected]
