Technical Papers
May 21, 2021

Deep Reinforcement Learning for Optimal Hydropower Reservoir Operation

Publication: Journal of Water Resources Planning and Management
Volume 147, Issue 8

Abstract

Optimal operation of hydropower reservoir systems is a classical optimization problem of high dimensionality and stochastic nature. A key challenge lies in improving the interpretability of operation strategies, i.e., the cause–effect relationships between system outputs (or actions) and contributing variables such as states and inputs. This paper reports for the first time a new deep reinforcement learning (DRL) framework for optimal operation of reservoir systems based on deep Q-networks (DQNs), which provides a significant advance in understanding the performance of optimal operations. A DQN combines Q-learning with two deep artificial neural networks (ANNs) and acts as the agent that interacts with the reservoir system, learning its states and providing actions. Three knowledge forms of learning, considering the states, actions, and rewards, were constructed to improve the interpretability of operation strategies, and the impacts of these knowledge forms and of the DRL learning parameters on operation performance were analyzed. The DRL framework was tested on the Huanren hydropower system in China, using 400 years of synthetic flow data for training and 30 years of observed flow data for verification. The discretization levels of reservoir water level and energy output had contrasting effects: finer discretization of the water level improved performance in terms of annual hydropower generation and hydropower production reliability, whereas finer discretization of hydropower production reduced search efficiency and thus degraded the resulting DRL performance. Compared with benchmark algorithms, including dynamic programming, stochastic dynamic programming, and decision trees, the proposed DRL approach can effectively factor in future inflow uncertainties when determining optimal operations and can generate markedly higher hydropower. This study provides new knowledge of the performance of DRL in the context of hydropower system characteristics and data input features, and shows promise for practical implementation to derive operation policies that can be updated automatically by learning from new data.
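
As a rough illustration of the DQN setup the abstract describes (Q-learning coupled with two deep ANNs, an online network and a periodically synchronized target network, with discretized reservoir water levels as states and discretized energy outputs as actions), a minimal Python sketch follows. The toy reservoir dynamics, reward, inflow model, network sizes, and all hyperparameters below are illustrative assumptions, not the authors' implementation.

```python
# Minimal DQN sketch for discretized reservoir operation.
# Assumptions (not from the paper): toy transition and reward model,
# synthetic inflows, network sizes, and all hyperparameters.
import random
from collections import deque

import torch
import torch.nn as nn

N_LEVELS, N_ACTIONS = 50, 10        # discretization of water level / energy output
GAMMA, EPS, BATCH, SYNC = 0.99, 0.1, 32, 200

def q_net() -> nn.Module:           # small fully connected Q-network
    return nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))

online, target = q_net(), q_net()
target.load_state_dict(online.state_dict())        # the two ANNs of a DQN
opt = torch.optim.Adam(online.parameters(), lr=1e-3)
memory: deque = deque(maxlen=10_000)               # experience replay buffer

def simulate(level: int, inflow: float, action: int):
    """Toy stand-in for the reservoir: releases generate power, inflow refills."""
    release = 5.0 * action / (N_ACTIONS - 1)
    power = release * level / N_LEVELS             # crude flow-times-head proxy
    nxt = min(N_LEVELS - 1, max(0, round(level + inflow - release)))
    return nxt, power - (1.0 if nxt == 0 else 0.0) # penalize emptying the store

def features(level: int, inflow: float) -> torch.Tensor:
    return torch.tensor([level / N_LEVELS, inflow / 5.0])

level, inflow = N_LEVELS // 2, 2.5
for t in range(5_000):
    s = features(level, inflow)
    if random.random() < EPS:                      # epsilon-greedy exploration
        a = random.randrange(N_ACTIONS)
    else:
        with torch.no_grad():
            a = online(s).argmax().item()
    nxt_level, reward = simulate(level, inflow, a)
    nxt_inflow = random.uniform(0.0, 5.0)          # synthetic stochastic inflow
    memory.append((s, a, reward, features(nxt_level, nxt_inflow)))
    level, inflow = nxt_level, nxt_inflow

    if len(memory) >= BATCH:
        batch = random.sample(memory, BATCH)
        s_b = torch.stack([b[0] for b in batch])
        a_b = torch.tensor([b[1] for b in batch])
        r_b = torch.tensor([b[2] for b in batch])
        s2_b = torch.stack([b[3] for b in batch])
        with torch.no_grad():                      # Q-learning target, bootstrapped
            y = r_b + GAMMA * target(s2_b).max(dim=1).values
        q = online(s_b).gather(1, a_b.unsqueeze(1)).squeeze(1)
        loss = nn.functional.mse_loss(q, y)
        opt.zero_grad(); loss.backward(); opt.step()
    if t % SYNC == 0:                              # sync target network
        target.load_state_dict(online.state_dict())
```

Note that a larger N_ACTIONS enlarges both the Q-network's output layer and the action search space, which is one way to read the abstract's finding that overly fine discretization of hydropower production can reduce search efficiency.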

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request. Data include the synthetic and observed flow time series. The code that was used for the deep reinforcement learning also is available.

Acknowledgments

This research is supported by the National Natural Science Foundation of China (Grant No. 51609025), the UK Royal Society through an industry fellowship to Guangtao Fu (Ref: IF160108) and an international collaboration project (Ref: IEC\NSFC\170249), the Open Fund Approval (SKHL1713, 2017), and a Chongqing technology innovation and application demonstration project (cstc2018jscx-msybX0274 and cstc2016shmszx30002). Guangtao Fu and Weisi Guo also are supported by The Alan Turing Institute under the Engineering and Physical Sciences Research Council (EPSRC) (Grant No. EP/N510129/1). Special thanks are given to the Hun River Cascade Hydropower Development Company and Dalian University of Technology for the case study data.

Information & Authors

Published In

Journal of Water Resources Planning and Management
Volume 147, Issue 8, August 2021

History

Received: Apr 15, 2020
Accepted: Feb 21, 2021
Published online: May 21, 2021
Published in print: Aug 1, 2021
Discussion open until: Oct 21, 2021

Authors

Affiliations

Associate Professor, College of River and Ocean Engineering, Chongqing Jiaotong Univ., No. 66 Xuefu Rd., Nan’an District, Chongqing 400074, China. Email: [email protected]
Fanlin Meng [email protected]
Research Fellow, Center for Water Systems, Univ. of Exeter, Exeter EX4 4QF, UK. Email: [email protected]
Professor, School of Aerospace, Transport and Manufacturing, Cranfield Univ., College Road, Bedford, Bedfordshire MK43 0AL, UK. Email: [email protected]
Associate Professor, College of River and Ocean Engineering, Chongqing Jiaotong Univ., No. 66 Xuefu Rd., Nan’an District, Chongqing 400074, China. Email: [email protected]
Professor, Center for Water Systems, Univ. of Exeter, Exeter EX4 4QF, UK; Turing Fellow, Alan Turing Institute, 96 Euston Rd., London NW1 2DB, UK (corresponding author). ORCID: https://orcid.org/0000-0003-1045-9125. Email: [email protected]
