Forecasting System Imbalance Volumes in Competitive Electricity Markets



Article  in  IEEE Transactions on Power Systems · March 2006


DOI: 10.1109/TPWRS.2005.860924 · Source: IEEE Xplore



240 IEEE TRANSACTIONS ON POWER SYSTEMS, VOL. 21, NO. 1, FEBRUARY 2006

Forecasting System Imbalance Volumes in Competitive Electricity Markets
Maria P. Garcia and Daniel S. Kirschen, Senior Member, IEEE

Abstract—Forecasting in power systems has been made considerably more complex by the introduction of competitive electricity markets. Furthermore, new variables need to be predicted by various market participants. This paper shows how a new methodology that combines classical and data mining techniques can be used to forecast the system imbalance volume, a key variable for the system operator in the market of England and Wales under the New Electricity Trading Arrangements (NETA).

Index Terms—Data mining, electricity markets, multidimensional forecasting, neural networks, time series.

I. INTRODUCTION

FORECASTING in power systems used to deal mostly with predicting future values of the load. With the introduction of competitive electricity markets, forecasting has become not only a much larger but also a much more complex topic. While predicting future prices is obviously an important issue, it is not the only question for which market participants seek answers. Their commercial performance is also strongly affected by their ability to predict other variables such as the volumes that will be traded in different segments of the market. Since in a competitive environment all participants have the freedom to operate independently, the overall level of uncertainty in the operation of the power system increases, and the variables that might be relevant proliferate. Choosing, among the dozens of variables that are recorded in a typical market, those that are the best predictors of the quantity of interest becomes a major challenge.

This paper explores how data mining can be used to select the most promising predictors and how new and traditional techniques can be combined to develop more accurate forecasting tools. It adopts the perspective of a system operator that must procure the energy needed to maintain the balance between generation and load. If system operators can accurately forecast the amount of balancing energy needed during each market period, they can purchase this energy on the forward market rather than on the spot market. Such an approach usually helps minimize the balancing cost. In the New Electricity Trading Arrangements (NETA), which are in effect in England and Wales, this key variable is called the net imbalance volume (NIV).

Traditional time series forecasting methods are first compared with neural network techniques. The advantages of multidimensional forecasting over one-dimensional approaches are then highlighted. This paper also shows how exploratory analysis can be used to describe the structure of the data series and determine the relationships between the different variables involved in the forecasting process. Results based on actual NETA market data demonstrate the accuracy that can be achieved.

Manuscript received December 20, 2004; revised July 6, 2005. This work was supported in part by the National Grid Transco and in part by Statsoft. Paper no. TPWRS-00661-2004.
The authors are with the Department of Electrical Engineering and Electronics, University of Manchester, Manchester M60 1QD, U.K. (e-mail: Daniel.kirschen@manchester.ac.uk).
Digital Object Identifier 10.1109/TPWRS.2005.860924
0885-8950/$20.00 © 2006 IEEE

II. FORECASTING IN ELECTRICITY MARKETS

Dash et al. [1], Bunn [2], Sfetsos [3], and Kermanshahi et al. [4] provide overviews of the progress that has been made recently in the broad field of forecasting in power systems. The introduction of competitive electricity markets has considerably increased not only the complexity of this task but also the breadth of this field.
• Forecasting is no longer an activity performed only by the system operator. All market participants must do some forecasting to maximize their profitability and control their exposure to risk.
• Load is no longer the only uncertain variable that must be forecasted. Market participants are interested in prices [5]–[10], traded volumes, and market length.
• Market variables are much more “noisy” than the system load.
• The values of these variables are driven in complex ways by many interacting factors. It is thus important to expand previous one-dimensional approaches (see, for example, [6]) to multidimensional inputs.
• The amount of data to be considered is huge and involves not only the market clearing data but also the positions that the participants took for every market period and synthetic indicators of market activity.
• Changes in market rules affect the way some variables are calculated and influence the behavior of market participants. These changes reduce the amount of historical data that can reliably be used for forecasting.

One could try to adapt to electricity markets the techniques that are used for forecasting the behavior of other markets. Unfortunately, these techniques cannot easily be applied to electricity because it significantly differs from other physical commodities, such as corn or oil, and financial instruments, such as stocks and bonds. Furthermore, the rules and characteristics of electricity markets are quite different from those of these other markets [11].

III. NETA AND THE NIV

Since March 2001, electricity trading in England and Wales has been governed by NETA. Unlike its predecessor, the Electricity Pool of England and Wales, NETA does not dictate how
GARCIA AND KIRSCHEN: FORECASTING SYSTEM IMBALANCE VOLUMES 241

electrical energy is to be bought and sold. Instead, it establishes a framework for bilateral trading between generators, suppliers, traders, and consumers [12]. Participants can choose the timing and the instruments they use for trading. NETA only provides mechanisms for keeping the system in balance and for settling the imbalances that inevitably arise between the physical and contractual positions of the market participants.

Fig. 1. NETA timeline.

Fig. 1 illustrates the timeline for market operation under NETA. At gate closure, one hour ahead of real time, bilateral trading comes to an end for the current half-hourly trading period, and all participants must notify the system operator of their expected production or consumption. They may also submit to the system operator bids and offers expressing their willingness to deviate from these levels. These bids and offers in the balancing mechanism involve a quantity, a price, and technical parameters indicating the speed at which these adjustments can be made. The system operator chooses the bids and offers needed to keep the generation and the load in balance and to maintain the security of the system. If the system is short, the operator accepts offers from generators to increase their production or bids from the demand side to reduce their consumption. On the other hand, if the system is long, the system operator accepts bids to reduce the output of generators or offers from the demand side to increase the load.

Keeping the system in balance thus has a cost that is ultimately passed on to the consumers. To keep this cost under control, the regulator and the system operator agree each year on an annual target cost. If the system operator manages to operate the system for less than this target cost, it is rewarded by being allowed to retain part of the difference [13]. On the other hand, if it exceeds the target, it must pay part of the excess. This scheme gives the system operator, which is a for-profit company, a strong incentive to minimize the balancing cost. Being able to forecast accurately the amount of balancing energy that it will need to buy or sell during each half-hourly market period helps the system operator meet this goal. Instead of accepting some of the bids and offers that are made by market participants in the balancing mechanism, the system operator also has the option to buy or sell energy in the forward market. Since the prices it can get through this advance trading in the forward market are often better than those that it can achieve through the balancing mechanism, this strategy can save a significant amount of money as long as the forecast is sufficiently accurate. If the forecast is incorrect, the system operator might indeed have to compensate for excessive trades it made in the forward market by buying or selling energy in the balancing mechanism.

The system operator is thus very interested in forecasting as accurately as possible the NIV, which is defined as the algebraic sum of the imbalances of all the individual market participants. This variable represents the total net energy that it must buy or sell in the forward market or through the balancing mechanism. Fig. 2 shows the values of NIV over a seven-month period. Observation of this figure suggests that this variable does not display any obvious seasonality, such as the daily and weekly patterns that one can observe in the demand for electrical energy.

Fig. 2. NIV detail from 01/04/2001 to 21/11/2004 in megawatts.

IV. COMPARISON OF TRADITIONAL TIME SERIES ANALYSIS AND DATA MINING TECHNIQUES

Exploratory time series analysis aims to identify the nature and the structure of an event through observation of its past behavior. Time series analysis can also be used to predict future values of a variable based on these past values. Traditional forecasting techniques [e.g., autoregressive integrated moving average (ARIMA), exponential smoothing] are limited to predicting values for one variable based on its previous values [14]. ARIMA also assumes that the time series is stationary, that the values are normally distributed, and that the residuals are independent. Many of the time series relevant to electricity markets do not satisfy these conditions. More complex models must be used to forecast nonperiodic, nonstationary, and noisy series. Emerging data mining modeling techniques can be adapted to uncover nonlinear relations in a priori irregular data. This can lead to more accurate forecasting techniques [15].

The Cross-Industry Standard Process for Data Mining (CRISP-DM) was developed in the mid 1990s to organize the process of transforming raw data into useful information [16]. This data mining process can be adapted to data analysis and forecasting in electricity markets. The basic steps of this process are defined as follows.
• Business understanding: In the context of this paper, this step involves understanding the rules of the market.
• Data understanding: Once all the relevant data have been collected, their structure must be analyzed. Exploratory time series analysis techniques (such as correlation analysis, singular spectrum decomposition, and distributed lags analysis) are used to determine the statistical characteristics of the series.
• Data preparation: This step involves data selection, data cleaning, data construction, data integration, and data formatting [17]. While these tasks may appear mundane, they are critical to the success of the whole process.

• Modeling: Various techniques are applied in this phase, and the corresponding parameters are adjusted to their optimal values. Wehenkel [18] provides a complete description of data mining modeling techniques and their application to power systems. Nowadays, modeling for forecasting is most commonly based on neural networks (NNs).
• Evaluation: Once the model has been created, its output must be assessed. For forecasting, this step involves the computation of error measurements to quantify the accuracy that has been achieved and some sensitivity analysis to appraise the robustness of the process.
• Deployment: In forecasting, this phase involves the automation of the collection of real-time data and the timely processing of these data.
These steps are not necessarily completed in sequential order. It may be necessary to iterate between them.

Traditional time series analysis and data mining are not incompatible techniques. Classical exploratory time series techniques are a useful tool not only to identify the time structure of a series but also to select the input variables and analyze the relations that might exist between them. In practice, they are complementary techniques, with time series analysis aiding in the critical steps of the data mining process.

V. ONE-DIMENSIONAL ANALYSIS OF NIV

One-dimensional analysis uses time series to identify the “time structure” of the data and then to produce a medium-term forecast (one week ahead) of this variable [19].

A. Analysis of the Time Structure of NIV Data

The objective of this analysis is to identify NIV’s time structure and to separate its main temporal components: trend, seasonality, and noise. The data set is preprocessed to better expose the embedded information while preserving the original nature of the pattern [20], [21]. The two steps involved in this preprocessing are as follows.
• Filtering and smoothing: Moving medians of different window lengths (8 and 48 periods) are applied to the original series.
• Variable transformations: Mean subtraction, normalization, linear trend subtraction, and autocorrelation correction [22]. The transformed series are the inputs used in the modeling phase.
The modeling tools applied to uncover the time structure of the data are as follows.
• Autocorrelation and partial autocorrelation analysis [23] is used to detect NIV’s seasonal patterns. The partial autocorrelation is an extension of the autocorrelation function that clarifies the existence of seasonal effects by removing the effect of the correlation of the intermediate elements within a specific lag.
• Spectrum Fourier analysis [24] is a “mathematical prism” that decomposes the data into its sinusoidal components to detect seasonal and cyclical components.
• Caterpillar decomposition [25] decomposes the original series into independent additive time series. This decomposition is used to identify, extract, and isolate the trend, oscillatory, and noise components of the transformed series.

Fig. 3. Autocorrelation and partial autocorrelation of the current value of NIV with previous values (i.e., consecutive lags).

Fig. 3 shows the autocorrelation and partial autocorrelation results obtained with NIV data. The autocorrelation is high at lag 1 and decays slowly thereafter. The partial autocorrelation results show that, despite the strong correlation at lag 1, none of the partial autocorrelations are important. The spectrum Fourier analysis gives consistent results for all the transformed series, confirming the noisy structure of the NIV data. Similarly, the caterpillar decomposition does not detect strong seasonal components but can be used as a filter to eliminate the high-frequency components that are present in NIV. The combination of these results for the original and differenced NIV series shows that the NIV time series has no seasonality, no constant mean, and a constant noisy structure [26].

B. NIV Forecasting Using Time Series Techniques

This section compares the performance of different time series forecasting methods when applied to NIV and explores the effect of the size of the training data set on the quality of the forecast. The following three techniques have been investigated [14], [22]:
• ARIMA, which combines the facts that elements of time series are serially dependent (autoregressive process) and that each of these elements is affected by past errors not included in the autoregressive process (moving average process);
• exponential smoothing, which operates like a moving average process where the recent observations are given a higher weight than older ones;
• caterpillar forecasting, which is based on the linear recurrent formula decomposition of the same name.
The data analyzed consist of a series smoothed on the basis of an eight-period window. The original series is divided into seen (training) and unseen data blocks. Unseen data sets correspond to a one-week period (42 observations). Two different unseen data sets are selected. In the first one, NIV presents an increasing trend and, in the second one, a decreasing trend. For each of these unseen data sets, the seen data sets consist of series of 500, 1000, and 1500 observations. Fig. 4 illustrates a detail of the forecasted values for the decreasing trend data set.
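
As a rough illustration of the experimental setup described in this section, the following Python sketch smooths a series with an eight-period moving median, holds out the last 42 observations as the unseen block, and forecasts them from seen blocks of 500, 1000, and 1500 observations. It is not the authors' implementation: the series is synthetic noise rather than NETA data, all function names are invented for this example, and Holt's linear exponential smoothing with arbitrary parameters `alpha` and `beta` is assumed as a stand-in, since the text does not say which exponential smoothing variant was used.

```python
import random
from statistics import median


def moving_median(series, window):
    """Trailing moving median; the first window-1 points use a shorter window."""
    return [median(series[max(0, i - window + 1): i + 1])
            for i in range(len(series))]


def holt_forecast(seen, horizon, alpha=0.3, beta=0.1):
    """Holt's linear exponential smoothing (assumed variant, arbitrary parameters).

    Returns `horizon` forecasts that extrapolate the last level and trend.
    """
    level, trend = seen[0], seen[1] - seen[0]
    for y in seen[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


random.seed(0)
# Synthetic stand-in for an NIV series in MW (NOT market data).
niv = [random.gauss(0.0, 800.0) for _ in range(1542)]
smoothed = moving_median(niv, window=8)

# Last 42 observations form the unseen block; the rest is available as seen data.
unseen_len = 42
seen_all, unseen = smoothed[:-unseen_len], smoothed[-unseen_len:]

# One forecast per size of the seen block, always using the most recent cases.
forecasts = {n: holt_forecast(seen_all[-n:], unseen_len)
             for n in (500, 1000, 1500)}

# Differences between consecutive forecast values for the 500-case base.
steps = [b - a for a, b in zip(forecasts[500], forecasts[500][1:])]
```

Because the multi-step forecast only extrapolates the final level and trend, the values in `steps` are all equal up to rounding: the forecast is a straight line whose slope depends on the seen block, whatever the dynamics of the unseen data.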

Fig. 4 shows how the forecasts converge quickly either to an almost constant value [for a (1,0,6) ARIMA model] or to a constant slope (exponential smoothing). Only the caterpillar forecasting method is capable of predicting an oscillatory behavior. With an average prediction error of 500 MW (50%) [19], the numerical error measurements do not show any consistent advantage for any one method over the other two. Not only is the accuracy of the forecast poor, but there are also substantial differences between the “dynamics” of NIV and the various forecasts.

Fig. 4. NIV forecasted solutions (in megawatts) for different forecasting bases (500, 1000, and 1500 cases).

The number of cases included in the training data sets has a different effect on each method. For ARIMA, it only affects the final value and, for exponential smoothing, the final slope. On the other hand, for the caterpillar method, an increase in the number of observations affects the series decomposition and the reconstruction that forms the basis of the forecast. The forecasted results become smoother as the amount of seen data increases (similar results are obtained for the increasing trend data set).

In summary, these results show that past NIV data are not a good predictor of future values. NIV time series are noisy, unstructured, changing, and normally distributed. It should not come as a surprise that NIV is a clear example of the central limit theorem because it results from a large number of actions by different market participants acting independently.

VI. MULTIDIMENSIONAL ANALYSIS OF NIV

The results from the one-dimensional analysis suggest that a multidimensional approach is necessary to effectively forecast NIV. One of the main challenges when applying multidimensional forecasting techniques is the selection of the most relevant input variables. In the case of NIV, an exploratory analysis was performed to understand the possible interactions between NIV and other variables from the balancing mechanism. This made possible a systematic and rational selection of the variables used as input to the forecasting [19].

A. Multidimensional Exploratory Analysis

To detect the qualitative, quantitative, and temporal relations between NIV and the other variables, the following techniques were applied.
• Time series analysis [23] was used to uncover the relations between the variables at different points in time. This included a cross spectrum analysis to determine the correlation between variables at different frequencies, as well as a distributed lag analysis to evaluate the delayed relationships between variables, i.e., the lagged effect on NIV of other variables.
• Multivariate exploratory techniques [27] were used to understand multidimensional relations and their statistical significance. For example, multidimensional correlation analysis was used to measure the linear relations between variables. Similarly, data mining Kohonen networks [28] determined cluster areas of common values mapped in the two-dimensional spatial information given by the output neurons.
• All variables are first preprocessed using moving medians to filter and smooth the series. Each variable is then normalized and differenced. Table I shows how the use of the transformed series allows the analysis of different interactions between NIV and the rest of the balancing mechanism variables.

The input variables have been divided into pre-gate closure and post-gate closure variables, where gate closure refers to the time when the bilateral forward markets close and the balancing mechanism begins operation until real time. Under NETA, gate closure takes place one hour before each half-hour trading period. Pre-gate closure variables include demand forecast, submitted offer volumes, submitted bid volumes, maximum declared capacity at gate closure, gate closure imbalance volume (GCIV), market imbalance volume, and activity on the electronic power exchange. Post-gate closure variables include demand, demand forecast error (DFE), post-gate closure effects, accepted bid volumes, accepted offer volumes, accepted bid cashflows, accepted offer cashflows, and imbalance prices.

Statistical analysis of the pre-gate closure variables only confirms the expected relation between GCIV and NIV. On the

other hand, the results obtained with the post-gate closure variables show that these variables can be used as input variables for forecasting NIV. A delayed time relation between NIV and the DFE was exposed. However, due to the time scale used for forecasting, this relation can be omitted.

TABLE I
RELATION BETWEEN PREPROCESSED INPUT AND OUTPUT VARIABLES

B. Multidimensional NN Forecasting

This forecasting technique is based on the relation between the past (seen) values of the balancing mechanism variables and the future (unseen) values of NIV. However, the relations linking the past and future values of these variables are neither simple nor linear. Data mining techniques, in particular NNs, have been shown to be able to uncover these complex associations while maintaining the time structure of the analyzed series.

1) Possible NN Architectures: NN techniques continue to be enhanced through improvements in computational performance and in the flexibility of the software used for their implementation [15], [29], [30]. Depending on the ways the input, hidden, and output neurons are connected, various NN architectures are produced. In this project, the following structures were used and tested:
• linear networks (LNs);
• multilayer perceptron (MLP);
• radial basis function (RBF);
• probabilistic neural networks (PNNs);
• generalized regression networks (GRNNs).
The selection of a network architecture depends on the problem characteristics. While MLP is one of the most popular architectures for modeling functions of any complexity, RBFs are extensively used for large and recurrent problems as they are quick to train. PNNs, on the other hand, have been extensively used for classification problems, with the outputs interpreted as probabilities of class membership. Finally, GRNNs are similar to PNNs, but they perform only regression tasks. Although these different network structures work on different principles, all of them can be adapted to suit a specific problem. For example, a PNN can be used for regression and forecasting problems if the output is treated as the expected value of the model. One important characteristic of NIV forecasting is the structure of the time series of both the input variables and the forecasted output. This transforms the problem of forecasting NIV into a specialized form of regression. As such, it can be tackled by any structure of NN suitable for regression purposes, provided that the data set is suitably preprocessed into the correct form. Choosing the optimal NN structure is thus an important task.

2) Development of the NNs: The main stages of the development of the NNs used to forecast NIV follow the standard data mining procedure [22], [31].
• Data selection: The selection of variables is guided by the physical quantity that they represent and by the results of the multivariable exploratory analysis. Experience demonstrated that the choice of variables has a strong influence on the quality of the results. Including too many variables or the wrong variables can lead to dimensionality problems.
• Cases selection: Three different data sets are required to develop an NN for forecasting. The training data set is used to train the network. The selection data set is used to select the best trained network. Finally, the test data set (containing only unseen data) is used to measure the performance of the networks. The number of cases needed for both the training and the selection data sets depends on the number of connections and the complexity of the function to be modeled. However, over-fitting problems can arise if excessively large sets are used.
• Data preparation: NNs usually require pre- and postprocessing of the data, first to adapt the input variables to the characteristics of the neurons’ activation functions and second to transform the output to the normal data range.
• Training: This process consists of a progressive adaptation of the NN parameters to learn the desired behavior. Several networks of different architectures are trained using the training data set. Each network produces its own prediction for the unseen data set.
• Assessment: Once the different networks have been created, several measures of performance are obtained (error measurements, training performance, sensitivity analysis, and residuals analysis). These results can also be used as feedback to modify the parameters of the networks. The best architecture is selected based on performance over the selection data set.
This development process does not produce a unique network that can be used in all cases. Various architectures produce optimal results for different forecasting conditions. A different optimal network, therefore, must be created for each forecasting scenario. The parameters of the networks also need to be adjusted to the timeframe considered.

The following two forecasting scenarios were considered.
• Case 1: One month ahead forecast performed on a daily basis. Each forecast value represents the median NIV value for a whole day.
• Case 2: One week ahead forecast where six values are forecast for each day. Working and nonworking days are treated separately.

3) Data Preparation and Selection: All variables are filtered using smoothing moving medians with windows of 48 periods for Case 1 and eight periods for Case 2. The series are then normalized. Finally, the time relation between the input and output

variables is transformed to expose the information in such a way that the present values of the input variables can predict future values of NIV.

Let $v_t$ be a vector from the known (training and selection) data set

$$v_t = [x_1(t), x_2(t), \ldots, x_n(t), y(t)] \qquad (1)$$

where
$t$ time index (e.g., $t = 1$ corresponds to day one);
$x_i(t)$ $i$th balancing mechanism variable at time $t$;
$y(t)$ value of NIV (i.e., the output) at time $t$.

In the training and selection data sets, all the input and output components of the vectors are known. The unknown (or test) data set is formed by vectors defined as

$$v_\tau = [x_1(\tau), x_2(\tau), \ldots, x_n(\tau), y(\tau)] \qquad (2)$$

where
$\tau$ time index;
$x_i(\tau)$ $i$th balancing mechanism variable at time $\tau$;
$y(\tau)$ value of NIV at time $\tau$.

In the test data set, known values of the balancing mechanism variables (input) allow us to calculate the unknown (future) values of NIV (output). The data selection process differs for each of the two cases that were considered.

Case 1: includes six different subcases. Each of them corresponds to a four-week period of data. Table II shows the definitions of the training, selection, and test data sets.

TABLE II
DEFINITION OF THE TRAINING, SELECTION, AND TEST DATA SETS FOR CASE 1

Case 2: consists of twelve different subcases, each of them corresponding to a week’s worth of data. The forecasted period consists of 12 consecutive weeks (weeks 17 to 29 in 2002) divided into working and nonworking days. The training, selection, and test data sets were selected as follows.
• Working days: The selection data set consists of the week preceding the test data. The training data set consists of the four-week period finishing two weeks prior to the test data set.
• Nonworking days: The selection data set consists of the nonworking days of the previous week. The training data set corresponds to a two-week period finishing two weeks prior to the test data set.
The variables used as predictors for this forecast are the demand forecast, DFE, accepted bid volumes, accepted offer volumes, forward trades, gate closure imbalance volume, Min(Accepted Offers, Accepted Bids), imbalance prices, and type of day (Monday, Tuesday, ...).

4) Results:

a) Case 1: Forecast Over a One-Month Period: Table III shows the NN configurations that produce the smallest error for each month as well as the accuracy that has been achieved in terms of the root mean-squared error (RMSE) and the mean absolute relative error (MARE), which are defined in (3) and (4), respectively. The numbers following the network names indicate the number of units in the input, hidden, and output layers, separated by a colon (e.g., 12:72-5-1:1 indicates 12 units in the input layer; 72, 5, and 1 units, respectively, in the first, second, and third hidden layers; and 1 unit in the output layer). For the cases considered, the optimum network architectures are either multilayer perceptron or radial basis function. Linear networks perform badly in all the cases studied. These results show that there is not a unique network structure or even type that is best for all the forecasted months.

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\mathrm{NIV}_t^{\mathrm{actual}} - \mathrm{NIV}_t^{\mathrm{forecast}}\right)^2} \qquad (3)$$

$$\mathrm{MARE} = \frac{1}{N}\sum_{t=1}^{N}\frac{\left|\mathrm{NIV}_t^{\mathrm{actual}} - \mathrm{NIV}_t^{\mathrm{forecast}}\right|}{\left|\mathrm{NIV}_t^{\mathrm{actual}}\right|} \qquad (4)$$

where $N$ is the number of forecasted values.

TABLE III
BEST NNs AND CORRESPONDING ERRORS FOR EACH SUBCASE OF CASE 1

Fig. 5 compares in the time domain the actual values of NIV with the best forecasts for each month.

b) Case 2: Forecast Over a One-Week Period: Similar tests show that for forecasts over a one-week period, the multilayer perceptron is the network architecture that provides optimum solutions for most of the analyzed cases and for both working and nonworking days [19].

Table IV shows the minimum errors for working and nonworking days separately and a weighted average for the weekly error. This table shows that there is a big difference between the forecasting accuracy for working and nonworking days. Forecasts for nonworking days are more accurate than for working days. Possible reasons for this difference include the following.

Fig. 5. Actual and forecasted values of NIV for the month-ahead forecasts using the best network.

TABLE IV c) Sensitivity Analysis: As part of the assessment process,


MEASURES OF ERROR FOR CASE 2 (WEEKLY FORECASTING)
a sensitivity analysis was performed to evaluate the relative im-
portance of the input variables on the accuracy of the forecast.
This evaluation was based on the effect that omitting a predictor
from the development of the NN has on the accuracy of the fore-
cast. Predictors were then ranked on the basis of the deteriora-
tion that their omission causes. This process was repeated for
12 separate single week forecasts for working days. Table V
shows the relative importance of each predictor based on this
sample. It also shows the range of the rankings (1 for most im-
portant, 15 for least important) and a raw cumulative score ob-
tained by summing the rankings of each variable for each of the
12 weekly forecasts. These results suggest that some variables
are better predictors than others but that the relative importance
varies from week to week. For all the cases considered, ignoring
a predictor never improves the accuracy of the forecast.
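
The ranking procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `rank_predictors`, the three predictor names, and the `toy_error` function (a made-up stand-in for developing a network on a given set of inputs and measuring its error) are all assumptions introduced for the example. Only the logic comes from the text: omit one predictor at a time, rank predictors by the resulting deterioration (1 for most important), and sum the weekly rankings into a raw cumulative score.

```python
from typing import Callable, Dict, List, Sequence


def rank_predictors(
    predictors: Sequence[str],
    weeks: Sequence[int],
    forecast_error: Callable[[Sequence[str], int], float],
) -> Dict[str, Dict[str, int]]:
    """Rank predictors by the error increase that omitting each one causes."""
    ranks: Dict[str, List[int]] = {p: [] for p in predictors}
    for week in weeks:
        baseline = forecast_error(predictors, week)
        # Deterioration when each predictor is left out in turn.
        deterioration = {
            p: forecast_error([q for q in predictors if q != p], week) - baseline
            for p in predictors
        }
        # Rank 1 goes to the predictor whose omission hurts the most.
        ordered = sorted(predictors, key=lambda p: deterioration[p], reverse=True)
        for rank, p in enumerate(ordered, start=1):
            ranks[p].append(rank)
    # Range of rankings and raw cumulative score (sum of weekly rankings).
    return {p: {"best": min(r), "worst": max(r), "score": sum(r)} for p, r in ranks.items()}


# Hypothetical stand-in for "develop a network on these inputs and return its error in MW".
GAIN = {"demand": 120.0, "price": 60.0, "margin": 20.0}  # made-up predictors and values


def toy_error(inputs: Sequence[str], week: int) -> float:
    # Assumption for the demo: "price" is more informative in odd-numbered weeks.
    extra = 80.0 if week % 2 == 1 and "price" in inputs else 0.0
    return 600.0 - sum(GAIN[p] for p in inputs) - extra


summary = rank_predictors(list(GAIN), weeks=[17, 18, 19], forecast_error=toy_error)
for name, stats in sorted(summary.items(), key=lambda item: item[1]["score"]):
    print(name, stats)
```

A lower cumulative score indicates a more important predictor. In this toy setup the top-ranked predictor changes from week to week, which mirrors the week-to-week variation in relative importance noted above.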
5) Discussion: While the NNs that have been developed are
able to predict NIV with reasonable accuracy for both weekly
and monthly horizons, no single network architecture provides
optimal results for all market conditions. A new network must,
therefore, be developed regularly if this accuracy is to be
maintained.
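
One way to organize such regular redevelopment is to slide the working-day data windows defined for Case 2 forward one week at a time. The sketch below only illustrates that bookkeeping. The function name `working_day_windows` is hypothetical, weeks are identified by plain week numbers, and the reading that the training period ends with the week before the selection week is an interpretation of "finishing two weeks prior to the test data set". The choice of twelve test weeks starting at week 17 is likewise an assumption.

```python
from typing import Dict, List


def working_day_windows(test_week: int) -> Dict[str, List[int]]:
    """Week numbers of the training, selection, and test sets for one weekly forecast."""
    if test_week < 6:
        raise ValueError("need at least five earlier weeks of data")
    return {
        # Four-week training period whose last week is test_week - 2 (assumed reading).
        "training": list(range(test_week - 5, test_week - 1)),
        # The selection set is the week preceding the test data.
        "selection": [test_week - 1],
        "test": [test_week],
    }


# Redevelop the network once per forecast week: twelve test weeks starting at week 17.
schedule = {week: working_day_windows(week) for week in range(17, 29)}
for week, windows in schedule.items():
    print(week, windows)
```

Each entry of `schedule` lists the weeks a newly developed network would use, so the training data always stay close to the period being forecast.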
The difference in accuracy between working and nonworking days (Table IV) may reflect the following factors.
• The number of cases considered: For working days, the forecast is based on 30 cases, while for nonworking days, it is based on 12 cases only.
• The difference in input data: NIV is less volatile for nonworking days than it is for working days. The standard deviation of NIV for working days is 835.47 MW, while it is only 721.11 MW for nonworking days.

Comparing the results obtained for cases 1 and 2 shows that increasing the frequency of observations does not improve the accuracy of the results. A more accurate forecast is obtained on a monthly basis (average MAE: 363 MW) than on a weekly basis (average MAE: 440 MW). This can be explained by the nature of the data: A daily aggregation of NIV is less scattered than an aggregation in six blocks of four hours per day (i.e., aggregated in so-called EFA blocks).

TABLE V
SENSITIVITY ANALYSIS FOR WEEKLY FORECAST (WORKING DAYS)

The number of cases and their selection for the required data sets can be modified as more data become available. Different forecasting scenarios may require a different number of cases to be included in the data sets. The more up-to-date information the network can use for learning, the better the forecast becomes. However, it is important to avoid overtraining since that would lead to an inflexible network and inaccurate results.

Multidimensional forecasting using NNs gives better accuracy than other multidimensional methods based on linear regression. Table VI compares the monthly forecast obtained with NNs and with a production-grade program based on linear regression methods. Table VII presents a similar comparison for working and nonworking days in the weekly scenario. NNs also outperform one-dimensional linear forecasting methods. Table VIII compares the accuracy of multidimensional NNs with one-dimensional methods for the weekly forecast. In all cases, the NNs outperform other methods, even when the time horizon is larger than the one considered in the one-dimensional forecasting.

TABLE VI
ERROR COMPARISON FOR MONTHLY FORECAST

TABLE VII
ERROR COMPARISON FOR WEEKLY FORECAST (WORKING AND NONWORKING DAYS)

TABLE VIII
ERROR COMPARISON FOR WEEKLY FORECAST (ONE-DIMENSIONAL AND MULTIDIMENSIONAL NNs)

VII. CONCLUSION
Competition increases the need for forecasting in power systems. New market variables present complex structures that are not easily modeled by traditional techniques. A recently developed data mining technique can provide the necessary tools to uncover nonlinear relations between these variables and identify the best predictors for the variables to be forecast. The application of classical techniques in combination with data mining NNs yields a more accurate and realistic performance than conventional forecasting techniques. However, to maintain a reasonable accuracy, these networks must be updated on a regular basis.

While the results presented in this paper demonstrate that the proposed approach yields forecasts that are useful and financially valuable, it is clear that the accuracy of forecasts of market variables is still much lower than the accuracy that one can achieve when trying to predict daily or weekly load profiles and prices. Differences in accuracy when forecasting different variables can be explained by a number of factors. For any variable, the main difficulties in forecasting arise from high noise, nonlinear effects, data availability, and length of the forecasting horizon. Therefore, when comparing the accuracy of forecasts of NIV and electricity prices, one should consider the following.
• NIV is more volatile than prices. If the volatility is defined as the standard deviation of the rate of change of the normalized variables, NIV’s volatility is 0.08 compared with 0.04 and 0.009 for the system buy and sell prices, respectively.
• NIV needs to be forecasted one week to one month ahead, while electricity price forecasts are usually performed for shorter horizons [5]–[7], [9], [10], [32].
• There are no clear linear relations between NIV and other market variables, while several studies have shown linear relations between prices and other market variables, such as demand and capacity shortfalls [7], [32].
• NIV is a newer variable than prices. Much less historical data are thus available since the market only started in
2001 and has since undergone several rule changes that have caused external modifications in the data.

Finally, the true measure of improvement when forecasting market imbalance volumes is not an abstract error index but rather the savings in balancing costs that this improvement makes possible. With a one-month-ahead time scale in particular, the 20% error reduction in the forecasted volume that the proposed method achieves makes possible a significant increase in the amount of energy that can be traded in the forward market and hence a substantial saving in balancing costs.

ACKNOWLEDGMENT
The authors would like to thank Dr. C. Aldridge for his assistance.

REFERENCES
[1] P. K. Dash, G. Ramakrishna, A. C. Liew, and S. Rahman, “Fuzzy neural networks for time-series forecasting of electric load,” Proc. Inst. Elect. Eng., Gener., Transm., Distrib., vol. 142, no. 5, pp. 535–544, Sep. 1995.
[2] D. W. Bunn, “Forecasting loads and prices in competitive power markets,” Proc. IEEE, vol. 88, no. 2, pp. 163–169, Feb. 2000.
[3] A. Sfetsos, “Short-term load forecasting with a hybrid clustering algorithm,” in Proc. Inst. Elect. Eng., Gener., Transm., Distrib., vol. 150, May 2003, pp. 257–262.
[4] B. Kermanshahi and H. Iwamiya, “Up to year 2020 load forecasting using neural nets,” Int. J. Elect. Power Energy Syst., vol. 24, pp. 789–797, 2002.
[5] J. Bastian, J. Zhu, V. Banunarayanan, and R. Mukerji, “Forecasting energy prices in a competitive market,” Inst. Elect. Eng. Comput. Appl. Power, pp. 40–45, 1999.
[6] J. C. Cuaresma, J. Hlouskova, S. Kossmeier, and M. Obersteiner, “Forecasting electricity spot-prices using linear univariate time-series models,” Appl. Energy, vol. 77, pp. 87–106, 2004.
[7] C. P. Rodriguez and G. J. Anders, “Energy price forecasting in the Ontario competitive power system market,” IEEE Trans. Power Syst., vol. 19, no. 1, pp. 366–374, Feb. 2004.
[8] X. Wang, N. Hatziargyriou, and L. H. Tsoulakas, “A new methodology for nodal forecasting in deregulated power systems,” IEEE Power Eng. Rev., pp. 48–51, 2002.
[9] F. J. Nogales, J. Contreras, A. J. Conejo, and R. Espinola, “Forecasting next-day electricity prices by time series models,” IEEE Trans. Power Syst., vol. 17, no. 2, pp. 342–348, May 2002.
[10] J. Contreras, R. Espinola, F. Nogales, and A. J. Conejo, “ARIMA models to predict next-day electricity prices,” IEEE Trans. Power Syst., vol. 18, no. 3, pp. 1014–1020, Aug. 2003.
[11] D. Pilipovic, “Energy risk: Valuing and managing energy derivatives,” J. Energy Lit., vol. 4, p. 111, 1998.
[12] P. Stephenson and M. Paun, “Electricity market trading,” Power Eng. J., vol. 15, no. 6, pp. 277–288, Dec. 2001.
[13] Office of Gas and Electricity Markets (OFGEM), “NGC System Operator Incentive Scheme from April 2004,” Dec. 2003.
[14] R. S. Tsay, “Analysis of financial time series,” in Wiley Series in Probability & Statistics. New York: Wiley, Dec. 21, 2001, p. 472.
[15] Z. Vojinovic, K. Vojislav, and R. Seidel, “A data mining approach to financial time series modeling and forecasting,” Int. J. Intell. Syst. Account., Fin., Manage., vol. 10, pp. 225–239, 2001.
[16] C. Pete, C. Julian, K. Randy, and K. Thomas, “CRISP-DM 1.0 Step-By-Step Data Mining Guide,” 2000.
[17] D. Pyle, Data Preparation for Data Mining. San Francisco, CA: Morgan Kaufmann, 1999.
[18] L. A. Wehenkel, Automatic Learning Techniques in Power Systems. Norwell, MA: Kluwer, 1998.
[19] M. P. Garcia, “Forecasting system imbalance volumes and analysis of unusual events in competitive electricity markets,” Ph.D. dissertation, Univ. Manchester, Manchester, U.K., 2005.
[20] C. Antunes and A. Oliveira, “Temporal data mining: An overview,” presented at the 7th ACM SIGKDD Int. Conf. Knowledge Discovery Data Mining (KDD-2001), San Francisco, CA, 2001.
[21] M. Gavrilov, D. Anguelov, P. Indyk, and R. Motwani, “Mining the stock market: Which measure is best?,” presented at the Conf. Knowledge Discovery Data, Boston, MA, 2000.
[22] (2006) Electronic Statistics Textbook. Statsoft, Tulsa, OK. [Online]. Available: http://www.statsoft.com/textbook/stathome.html
[23] D. Peña, G. C. Tiao, and R. Tsay, “Basic concepts in univariate time series,” in A Course in Time Series Analysis, Wiley series in Probability and Statistics. Probability and statistics, J. W. Sons, Ed. New York: Wiley-Interscience, 2000, p. 496.
[24] J. B. Elsner and A. A. Tsonis, “Singular spectrum analysis: A new tool in time series analysis,” in A New Tool in Time Series Analysis, J. B. Elsner, Ed. New York: Plenum, 1996.
[25] N. Golyandina, V. Nekrutkin, and A. Zhigljavsky, Analysis of Time Series Structure. SSA and Related Techniques, 2001.
[26] C. Alexander, Market Models: A Guide to Financial Data Analysis. New York: Wiley, 2001.
[27] S. K. Kachigan, Multivariate Statistical Analysis: A Conceptual Introduction, 2nd ed. New York: Radius, 1991.
[28] T. Kohonen, Self-Organizing Maps, 3rd ed., 2001.
[29] T. Kolarik and G. Rudorfer, “Time series forecasting using neural networks,” J. Time Series Neural Netw., pp. 86–94, 2004.
[30] S.-H. C. a. S. H. Kim, “Data mining for financial prediction and trading: Application to single and multiple markets,” Expert Syst. Appl., vol. 26, pp. 131–139, 2004.
[31] Y. JingTao and T. C. Lim, “Guidelines for financial forecasting with neural networks,” presented at the Int. Conf. Neural Information Processing, Shanghai, China, 2001.
[32] H. Y. Yamin, S. Shahidehpour, and Z. Li, “Adaptive short-term electricity price forecasting using artificial neural networks in the restructured power markets,” Elect. Power Energy Syst., vol. 26, pp. 571–581, 2004.

Maria P. Garcia received the electrical engineer’s degree from the Universidad Pontificia de Comillas, Madrid, Spain, in 2001 and the M.Sc. degree in power systems from the University of Manchester Institute of Science and Technology (UMIST), Manchester, U.K., in 2001. She is currently working toward the Ph.D. degree at UMIST.

Daniel S. Kirschen (M’86–SM’91) received the electrical and mechanical engineer’s degree from the Free University of Brussels, Brussels, Belgium, in 1979 and the M.S. and Ph.D. degrees in electrical engineering from the University of Wisconsin-Madison in 1980 and 1985, respectively.
From 1985 to 1994, he worked for Control Data Corporation and for Siemens. He is currently a Professor of electrical energy systems at the University of Manchester, Manchester, U.K.
