*BC Hydro is investigating the use of several stochastic modeling techniques to determine water value functions for stored water in its two multi-year storage reservoirs. The marginal values of water in these reservoirs will help the utility perform more robust operations analyses, which account for uncertainties in inflows, market prices and domestic load.*

**By Ziming Guan, Ziad Shawwash, Alaa Abdalla, Amr Ayad and Joel Evans**

BC Hydro is an electric utility owned by the province of British Columbia, Canada, that serves about 95% of the province's population. It owns and operates a large hydroelectric power system with total installed capacity of about 11,000 MW. Out of its 35 hydro plants, five facilities on two main rivers account for more than 70% of the installed capacity in the system and 65% of the total energy generated. They are:

- Peace River System, consisting of the 2,730-MW G.M. Shrum powerhouse downstream of the Williston Reservoir (39.46 billion cubic meters of active storage) and the 694-MW Peace Canyon powerhouse downstream of the Dinosaur Reservoir (0.02 billion cubic meters of storage); and

- Columbia River System, consisting of 1,805-MW Mica powerhouse downstream of the Kinbasket Reservoir (14.80 billion cubic meters of storage), 2,480-MW Revelstoke powerhouse downstream of the Revelstoke Reservoir (1.85 billion cubic meters of storage) and Hugh Keenleyside Dam impounding Arrow Lakes (8.77 billion cubic meters of storage).

The Williston and Kinbasket reservoirs are typically used for multi-year strategic operations planning. The main challenge in determining the value of water stored in these two reservoirs is handling uncertainties in future inflows to these reservoirs, market prices and domestic load.

BC Hydro uses the Generalized Optimization Model developed by a research team from BC Hydro and the University of British Columbia. GOM is a deterministic linear optimization model and is used for medium-term (five to 10 years) operations planning studies. To further enhance this planning model's capabilities, a capital project called the Water Value Project is being undertaken to incorporate uncertainty.

The project's objective is to investigate stochastic optimization techniques that can be used to solve the multi-reservoir operations planning problem and generate water value functions to quantify the marginal values of water stored in the two main reservoirs. A stochastic optimization technique uses a mathematical programming algorithm to solve an optimization problem over a defined future planning horizon broken down into multiple time steps and that includes uncertainty.

These techniques will enable BC Hydro to assess the impact of uncertainties on integrated reservoir operations and achieve more optimal operating policies for these multi-year reservoirs. Prototype models applying these techniques are being developed and will be assessed based on specific performance criteria. The appropriate technique(s) will be further developed and implemented into a decision support tool for use by BC Hydro in reservoir operations planning. This article overviews the techniques investigated and summarizes the main outcomes of this project.

## Stochastic variables and scenario generation

There are three stochastic variables modeled: major inflows to the Peace and Columbia river systems, electricity prices in the Alberta and mid-Columbia (Pacific Northwest) markets, and electrical demand in British Columbia.

A set of monthly inflow data was used to construct a model that generates inflow scenarios. Because of the uncertainty of inflows in each month, the random variables in the model need to be conditionally sampled from discrete probability distributions, which form a scenario tree with finite outcomes.^{1} BC Hydro's energy planning group provided the market price and domestic load forecast scenarios. Each scenario spans the planning horizon of five to 20 years, implying that prices and load follow the same trace once a scenario has been realized. This assumption is considered acceptable because government policies and economic growth, the main drivers behind the electricity prices and load, usually do not change drastically or frequently in this amount of time.

One commonly used method for generating inflow scenarios is autoregressive (AR) modeling, which captures the serial correlation of variables and assumes these variables are sampled from the same distribution over all stages. Furthermore, the principal component approach was employed to capture the correlation among the AR random variables and to generate them from their independent random variables.2,3

Another method for generating inflows is a Markov chain. This method constructs bins for the outcome of random variables representing the combinations of intervals in the discretized spaces of these variables and uses transition probabilities for conditional sampling of the bins. This method results in stage-dependent random variables, making the water value function more difficult to calculate.4

The methods for moment matching5 and scenario reduction6 yield a set of scenarios under which random variables are not conditionally sampled but a set of scenario fans capture the properties of the random variables. A class of clustering methods can be used to generate or build a scenario tree using these fans,7 but the scenarios generated are not guaranteed to be a good approximation of the originals.

After finding that the serial correlation between two successive monthly inflows was strong, the lag-1 AR model with the principal component approach was chosen to generate a scenario tree for inflows. The model was constructed using the following steps:

1. A linear regression model was created based on the histori cal record of monthly inflows for each reservoir. This split the inflows into a linear component and a residual component with a mean of zero and normalized standard deviation of one.

2. A lag-1 AR model was developed. The correlation coefficient was estimated by the maximum likelihood estimation method, and the white noises were calculated from the residuals.

3. A log-transformation was used on the white noises so that they are normally distributed.

4. Using the principal component approach, each transformed white noise was calculated as the weighted sum of independent random variables, with the weights calculated from the principal components that were calculated from the correlation matrix of the transformed white noises.

5. Each independent random variable was sampled from a discrete distribution that approximates the standard normal distribution, resulting in a finite scenario tree.

Inflow from April to November exhibits large variations, is treated as uncertain and is generated by using the lag-1 AR model developed above. The variability in inflows from December through March is muted due to freezing conditions in the basin, thus inflows are treated as certain. Figure 1 compares the variability in inflow scenarios generated using the lag-1 AR model vs. the variability in the historical record of inflows.

## Stochastic optimization techniques

The techniques that can be used to solve the multistage stochastic optimization problem include stochastic dynamic programming (SDP),8 stochastic dual dynamic programming (SDDP),9 and reinforcement learning (RL).7,11

SDP is the classical technique to solve this class of problems, and it uses a grid of points for the initial state variables in each stage, computes the future value at each point representing the expected value of the objective function values of the stage problems over all outcomes of the random variables, and then recursively constructs the future value function used in the previous stage using the grid of points and their future values. This approach suffers from two curses: dimensionality and modeling. The number of stage problems in each stage to be solved increases exponentially as the numbers of state variables and grid of points increase, and so does the computational time.

SDDP constructs an outer approximation to the future value function of the state variables using Bender's cuts, which are linear functions of state variables computed using the dual values of state variables. The outer approximation gives an upper bound to the future value function. Figure 2 illustrates the outer approximation to a future value function as function of the storage in a reservoir. The cuts are constructed in passes, with each pass consisting of a forward and a backward pass. In a forward pass, from stage 2 to stage T-1, random variables (known and fixed in stage 1) are sampled to form scenarios, and the stage problems from stage 1 to stage T-1 are solved to give a trajectory for the state variables. In a backward pass, from stage T to stage 2, the stage problems for all outcomes of random variables are solved, and cuts are generated for the future value function in the previous stage.

The convergence criterion of SDDP is based on statistical estimation criteria.10 The future value function at the first stage is approximated by using the SDDP to generate a number of Bender's cuts. Then, a simulation is performed by running a large number of forward passes and calculating a 95% confidence interval for the estimated expected value of total benefits in each stage plus the future value at the last stage. If the objective function value in stage one is in that confidence interval, then SDDP is considered to have converged. If it does not pass this test, then SDDP to generate additional cuts to further refine the approximate future value function. This process is repeated until the SDDP converges.

Reinforcement learning (RL) was originally developed for robotics and artificial intelligence. The main idea is based on incremental learning procedures where a credit is assigned to the difference between temporally successive predictions. The method uses a sampling technique while updating the action-value function rather than computing the expected rewards and transition probabilities as in an SDP technique. Accordingly, learning the optimal policy from experience does not require constructing a model of the environment's dynamics. This technique has been used to solve multi-reservoir operations problems.

## Prototype model for SDDP

In the Water Value Project, we are testing a number of prototype models for SDP, SDDP and RL. Below we present results of our work on developing the prototype for the SDDP algorithm.

The planning horizon consists of three years, from October 2025 to September 2028, with monthly decisions on reservoir storage and energy trades. This planning horizon is extended to eight years to reduce the impact of assumed future value function at the end of the planning horizon.^{3} To save computational time, the time steps in each of these extended years are aggregated to be three individual months in the freshet period (summer season) and three seasons (fall, winter and spring) accounting for the remainder of the year.

The operation of the five main hydro plants in the BC Hydro system is optimized to maximize the value of the hydroelectric system while meeting domestic demand and interacting in external electricity markets. Energy coming from other resources - like small hydro and run-of-river plants, thermal plants and wind power - is assumed to be non-dispatchable and is netted from the forecasted domestic demand. Trade in the Alberta and mid-Columbia electricity markets are subject to transmission path limits. More load and price scenarios, capturing the variability and uncertainty in those variables, will be included in future development of the prototype.

The SDDP algorithm solves many optimization problems that maximize the value of BC Hydro resources from energy trades at each stage in the planning horizon. To calculate the present value of water in the first stage, the value functions in future stages are discounted to the start of the planning horizon. The marginal values of stored water in the Williston, Kinbasket and Arrow Lakes reservoirs are calculated by evaluating the derivative of the water value functions.

## Results from prototype model for SDDP

The SDDP model converges according to a 95% confidence interval, established by 10,000 forward pass simulations, with 2,000 Bender's cuts.

To test the impact of the assumed future value functions at the end of the extended planning horizon, two alternatives were tested: a linear function and a piecewise linear function. The results show that the marginal values of generated future value functions in each stage of the two alternatives are becoming similar as they approach the first stage. The expected values for the original planning horizon (total benefits in each stage and future value at the last stage in the original planning horizon) in simulation for the two alternatives are calculated, and the difference between them is very small.

To assess the impact of inflow uncertainty, the problem is solved for the original planning horizon and the uncertainty in inflows is ignored. The SDDP model is then used to generate future value functions from this problem, and these functions are used in the simulation with the original inflow model. The expected value in simulation is lower than that in the first result by 2%. This shows that for this problem, ignoring inflow uncertainty in generating future value functions will result in a moderate decrease in expected value.

## Conclusion and future work

Results show that by extending the planning horizon, the impact of the accuracy of approximating the future value function at the end of the extended planning horizon on the generated future value functions can be reduced. The results also show that the uncertainty in inflows needs to be taken into account in generating future value functions.

For planning purposes, the future value function of the first stage needs to be accurately determined for the space of state variables. Because SDDP explores the optimal corridor of trajectories of the state variables and only covers a limited range of state variable space, our implementation of SDDP for planning studies will initialize runs with a total of 48 starting reservoir levels. This will be done to generate more accurate cuts and to better approximate the future value function that covers the full range of storage state variables in the first stage.

The other two major sources of uncertainty, market prices and domestic load, will be added as stochastic variables in the future based on forecast scenarios.

BC Hydro uses seasonal snowpack and inflow forecasts that we are incorporating into the inflow generation model. This will enable us to better model the Columbia River Treaty constraints on storage and release decisions at Mica and Arrow Lakes reservoirs.

## Notes

^{1}Kaut, M. and S.W. Wallace, Evaluation of Scenario-Generation Methods for Stochastic Programming, Stochastic Programming E-Print Series, 14, University of Humboldt, Berlin, Germany, 2003.

^{2}Guan, Z. and A.B. Philpott, "A Multistage Stochastic Programming Model for the New Zealand Dairy Industry," Robust Supply Chain Management, International Journal of Production Economics, Volume 134, No. 2, December 2011, pages 289-299.

^{3}Maceira, M.E.P., "Ten Years of Application of Stochastic Dual Dynamic Programming in Official and Agent Studies in Brazil - Description of the NEWAVE Program," Proceedings of Power Systems Computation Conference 2008, Institute for Research in Technology, Madrid, Spain, 2008.

^{4}Gjelsvik, A. and S.W. Wallace, "Methods for Stochastic Medium-Term Scheduling in Hydro-Dominated Power Systems," Report EFI TR A4438, Norwegian Electric Power Research Institute, Trondheim, Norway, 1996.

^{5}Hoyland K., M. Kaut, and S.W. Wallace, "A Heuristic for Moment-Matching Scenario Generation," Computational Optimization and Applications, Volume 24, 2003, pages 169-185.

^{6}Dupacova, J., N. Growe-Kuska, and W. Romisch, "Scenario Reduction in Stochastic Programming: An Approach using Probability Metrics," Mathematical Programming, Series A, Volume 95, 2003, pages 493-511.

^{7}Lee, J.H., and J.W. Labadie, "Stochastic Optimization of Multireservoir Systems via Reinforcement Learning," Water Resources Research, Volume 43, No. 11, November 2007.

^{8}Archibald, T., K. McKinnon, and L. Thomas, "Modeling the Operation of Multireservoir Systems using Decomposition and Stochastic Dynamic Programming," Naval Research Logistics, Volume 53, No. 3, 2006, pages 217-225.

^{9}Pereira, M.V.F. and L.M.V.G. Pinto, "Multi-Stage Stochastic Optimization Applied to Energy Planning," Mathematical Programming, Volume 52, 1991, pages 359-375.

^{10}Hindsberger, M. and A.B. Philpott, "Stopping Criteria in Sampling Strategies for Solving Multistage Stochastic SLP-Problem," Proceedings of Applied Mathematical Programming and Modelling, INFORMS, Hanover, Maryland, USA, 2002.

^{11}Abdalla, A.E., "A Reinforcement Learning Algorithm for Operations Planning of a Hydroelectric Power Multireservoir System," PhD thesis, University of British Columbia, Vancouver, British Columbia, Canada, 2007.

## Acknowledgments

BC Hydro and the Natural Sciences and Engineering Research Council of Canada (NSERC) provided research funding for the first two authors. Many others at BC Hydro provided guidance, and the authors hereby acknowledge their support and contributions.

**Ziming Guan is a postdoctoral fellow and Ziad Shawwash is an adjunct professor at the University of British Columbia, Canada. Alaa Abdalla is manager of the reliability and planning group in BC Hydro. Amr Ayad is a PhD student at the University of British Columbia. Joel Evans is a senior engineer in BC Hydro's reliability and planning group.**

*This article has been evaluated and edited in accordance with reviews conducted by two or more professionals who have relevant expertise. These peer reviewers judge manuscripts for technical accuracy, usefulness, and overall importance within the hydroelectric industry.*