
2020 International Conference on Computational Performance Evaluation (ComPE)

North-Eastern Hill University, Shillong, Meghalaya, India. Jul 2-4, 2020

Electricity Demand Prediction using Data Driven Forecasting Scheme: ARIMA and SARIMA for Real-Time Load Data of Assam
Kakoli Goswami
Department of Electrical Engineering
Jorhat Engineering College, Dibrugarh University
Jorhat, India
kakoligoswami2009@gmail.com

Aditya Bihar Kandali
Department of Electrical Engineering
Jorhat Engineering College, Dibrugarh University
Jorhat, India
abkandali@gmail.com

Abstract— Electrical load forecasting aims to predict, satisfactorily and accurately, the demand that might increase or decrease in the future. A large number of engineering applications count on accurate and reliable prediction models for electrical load demand. Precise load forecasting helps power companies plan their capacity and operations so that they can reliably supply energy to consumers. In this study, the electrical load (L) in Assam is predicted using a data-driven forecasting scheme. The study is carried out using daily (24-hourly) L data obtained from SLDC, Kahilipara, Assam. The study focuses mainly on two types of regression model, ARIMA and SARIMA, and also provides a performance evaluation of the models. The input data has been split into two groups, training and testing data, to build the forecasting model. The correctness of the forecasting models has been assessed using different error metrics. The final results indicate that the SARIMA model, which considers the seasonality of the load data, provides better prediction with minimum error. MATLAB R2016a was used throughout the analysis.

Keywords— Electrical Load forecasting, Time Series, ARIMA, SARIMA.

I. INTRODUCTION
Load forecasting is a technique used by power companies for predicting the load required to meet the demand. It forms an integral part of the operation of the power system [1]. The aim of load forecasting is to predict the future load demand (denoted as L in this paper) with satisfactory accuracy. Load forecasting can be classified as very short-term, short-term, medium-term and long-term. Expansion of power utilities results in a complex, heavily stressed power system which is vulnerable to cascade outages [2].
Assam is a state in the north-eastern part of India. An accurate load-forecasting model is a need for a state like Assam [3]. A vast amount of literature on load forecasting is available. However, there is limited literature on recent work on load forecasting models for tropical regions like the north-eastern part of India, which has high humidity and temperature. This provides a wide scope for research in the field of load forecasting in this part of our country.
Load forecasting schemes can be categorized into classical techniques and computational intelligence (CI) techniques [4]. Multidisciplinary participation in the field of research has dramatically reduced the difference between the classical and CI techniques. Traditional statistical methods, exponential smoothing, the ARMA model, the ARIMA model, the SARIMA model and the Kalman filter are some of the classical forecasting techniques. Many methods based on different techniques have been proposed in the literature, such as time series analysis [4] [5], the ARIMA and SARIMA methods [6] [7] [8], fuzzy logic [9], ANN [10] and SVR [11]. Combination models are also found in the literature [12].
This paper comprises five sections. The methods used in our study for forecasting are introduced and described in Section II. Section III deals with the various steps required for identifying an accurate model, while Section IV summarizes the results along with their analysis. Section V presents the conclusion along with the future scope of improvement.

II. METHODS

A. Autoregressive Integrated Moving Average (ARIMA)
ARIMA is among the simplest yet most popular statistical methods applied to time series modelling. Its popularity can be credited to the work of Box and Jenkins [13]. The ARIMA model is developed by integrating two different forms of linear regression, the Autoregressive (AR) and the Moving Average (MA) models [14]. The AR (p) and MA (q) models are expressed as

$$y_t = c + a_1 y_{t-1} + \dots + a_p y_{t-p} + e_t \tag{1}$$

$$y_t = \mu + u_t + m_1 u_{t-1} + \dots + m_q u_{t-q} \tag{2}$$

Here, $a_1, \dots, a_p$ and $m_1, \dots, m_q$ are the parameters of the autoregressive and the moving average portions respectively; the constant in the expression is denoted by $c$. $p$ and $q$ represent the orders of the AR and MA portions respectively. White noise is denoted by $e_t$. In equation (2), $u_t, u_{t-1}, \dots, u_{t-q}$ represent white noise (error) terms. The expectation of $y_t$ is represented by $\mu$. Integrating the two models represented by (1) and (2) using the same training data, the ARMA (p, q) model becomes

$$y_t = c + a_1 y_{t-1} + \dots + a_p y_{t-p} + u_t + m_1 u_{t-1} + \dots + m_q u_{t-q} \tag{3}$$

Here p represents the number of autoregressive terms and q the number of moving average terms: p refers to the number of observations from the historical L data that are used in predicting the future values, and q refers to the number of lagged values of the error terms. For this model the prime requirement is to make the time series data statistically stationary in terms of

978-1-7281-6644-5/20/$31.00 ©2020 IEEE



Authorized licensed use limited to: Carleton University. Downloaded on November 02,2020 at 19:46:51 UTC from IEEE Xplore. Restrictions apply.
mean, variance and autocorrelation. If the data is non-stationary, it has to be differenced. This converts the data into a stationary time series, resulting in the ARIMA (p,d,q) model, where d refers to the degree of differencing.

TABLE I
Statistics of electricity load data (L) in MW

Data Period (10 am)       Min (MW)   Max (MW)   Mean      Std. Deviation   Skewness
1-1-2016 to 31-12-2018    720        1390       987.449   138.0984         0.526

Fig. 2 ACF and PACF of 24 hourly L data of 2016, 2017 and 2018

B. SARIMA
The electrical load (L) pattern shows obvious periodic fluctuation resulting from seasonal changes. These seasonal changes can be dealt with using the SARIMA model. This model is formed by incorporating a seasonal term in the ARIMA model. In its general form the model is written as SARIMA (p,d,q)(P,D,Q)S. In this expression p signifies the order of the autoregressive part, while P signifies the order of the seasonal autoregressive part. The order of differencing is represented by d, and D represents the order of seasonal differencing. The terms q and Q represent the orders of the MA and seasonal MA (SMA) parts respectively. S represents the seasonal period. The seasonal portion of the model is identified from the seasonal lags of the correlation function plots. The SARIMA model may be depicted as

SARIMA = ARIMA (p,d,q) (P,D,Q)S
where (p,d,q) is the non-seasonal part and (P,D,Q)S is the seasonal part.

III. TECHNIQUES ADOPTED

The four important steps required for identifying the correct forecasting model are the stationarity test, model identification, model fitting and finally performance evaluation [15]. The ARIMA and SARIMA models are considered in this paper and their performance is compared. Real-time daily electrical load data for three years, namely 2016, 2017 and 2018, were collected from the State Load Dispatch Centre (SLDC), Kahilipara, Assam. In this study, the 24-hourly electrical load (10 a.m. every day) from 01-01-2016 to 31-12-2018 was taken. The three years of L data, plotted as a time series using MATLAB R2016a, are shown in Fig. 1, and the statistics associated with the plot are summarized in Table I. Statistical analysis revealed that an ARIMA model can be used to forecast 24-hourly loads. Careful observation of Fig. 1 also shows obvious periodic fluctuation. These seasonal changes can be dealt with using the Seasonal Autoregressive Integrated Moving Average model, known as the SARIMA model. The correlation statistics in our study were determined using the ACF and PACF plots presented in Fig. 2.

Fig. 1 Daily 10 a.m. load data of 2016, 2017 and 2018 as a time series plot.

A. Stationary test

In our study the degrees of differencing d (and D) have been determined using the ACF and PACF, followed by the Augmented Dickey-Fuller (ADF) test using the “adftest” command in MATLAB R2016a. Fig. 3(a) shows the L data plot along with its ACF and PACF. Fig. 3(b) shows the L data plot with first-order differencing along with the ACF and PACF plots, and similarly Fig. 3(c) shows the L data plot with second-order differencing. Differencing the original non-stationary time series shown in Fig. 1 resulted in a near-stationary time series. The ACF plot of Fig. 2 shows significant spikes at lags 1, 7, 14, 21 and so on, indicating that a lag-7 seasonality can be used for the seasonal model. The plot of the L data with lag-7 seasonality is shown in Fig. 4. This lag-7 seasonality plot is near stationary compared with the plot of Fig. 1.

Fig. 3 ACF and PACF of L data, 1st difference and 2nd difference

Fig. 4 Plot of 24 hourly electrical loads (lag 7) for the years 2016, 2017 and 2018
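The differencing and correlogram steps of the stationarity check can be illustrated with a minimal sketch. It is not the authors' MATLAB workflow and it does not implement the ADF test; the synthetic series, the weekly pattern values and the helper names `difference` and `sample_acf` are assumptions made purely for illustration.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sample_acf(series, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical daily load: slow linear trend plus a repeating weekly pattern.
# These numbers are invented for the example, not taken from the SLDC data.
WEEKLY_PATTERN = [0.0, 40.0, 60.0, 50.0, 30.0, -70.0, -110.0]
load = [900.0 + 0.2 * t + WEEKLY_PATTERN[t % 7] for t in range(140)]

acf_raw = sample_acf(load, 21)               # correlogram of the raw series
first_diff = difference(load, lag=1)         # d = 1
second_diff = difference(first_diff, lag=1)  # d = 2
weekly_diff = difference(load, lag=7)        # seasonal difference, S = 7

print("ACF at lags 6, 7, 8:", [round(acf_raw[k], 3) for k in (6, 7, 8)])
print("Spread after lag-7 differencing:", max(weekly_diff) - min(weekly_diff))
```

On a series built this way the ACF peaks at multiples of 7, and the lag-7 difference removes the weekly pattern entirely, which mirrors the reasoning used to choose S = 7 for the seasonal model.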

B. Identification of Model
Model identification basically means determining the parameters p (P) and q (Q) for the ARIMA as well as the SARIMA model. Many techniques for determining the parameters are available in the literature. In this study the parameters have been determined using the ACF and PACF plots. The model parameters are estimated by careful observation of the correlation graphs (ACF and PACF), and the best possible model is determined by evaluating the Akaike Information Criterion (AIC), the log-likelihood function (logL) and the Bayesian Information Criterion (BIC). The order of the model has been determined according to the rule given in Table II. The model having the minimum values of AIC and BIC is selected as the best one. A total of 16 different ARIMA models were considered, varying p and q from 1 to 4, for the calculation of the minimum AIC and BIC [15][16].
The ACF and PACF of the 24-hourly load data with a seasonality of 7 are shown in Fig. 5. The ACF shows a tapering form representing an AR (1) model, while the PACF has a significant spike at lag 1 and then at lag 7. However, it is seen that the ACF tapers in multiples of s = 7, thus representing an MA (1) seasonal component. Considering different seasonal ARIMA (SARIMA) models, the various model parameters have been calculated for the best probable SARIMA model.

TABLE II
Model Order Criterion for ARMA Model

ACF                      PACF                     Model
Declines gradually       Vanishes after p lags    AR(p)
Vanishes after q lags    Declines gradually       MA(q)
Declines gradually       Declines gradually       ARMA(p,q)

Fig. 5 ACF and PACF of L data, with lag 7 difference

C. Model Fitting
Among the different models, the five most probable ARIMA models and the ten most probable SARIMA models were selected. The total sample of 1095 L data points was split into 75% training data and 25% testing data. The models were trained using the training data (822 L points), and the corresponding parameter estimation was carried out for each model (Table III and Table IV). The Ljung-Box Q-test was performed using the lbqtest command in MATLAB to check for any residual autocorrelation. The residuals were also checked for any remaining correlation using the ACF. The residual ACF of ARIMA (1, 2, 2) lies within the significance bounds, unlike that of ARIMA (3, 2, 4) (Fig. 6 and Fig. 7).

Fig. 6 ACF of Residuals of ARIMA (1, 2, 2)

Fig. 7 ACF of Residuals of ARIMA (3, 2, 4)

D. Performance evaluation
The correctness of the models was estimated using accuracy measures such as the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE) and the correlation coefficient R. The outcomes are summarized in Table III and Table IV. ARIMA (1, 2, 2) appears as the most promising model among the five selected models. Table IV summarizes the results of the ten different SARIMA models. The errors are least in the case of Model 1, i.e. SARIMA (0 1 1)(0 1 1)7, whereas they are highest for Model 5. The log-likelihood value is highest for Model 6, followed by Model 3 and Model 8.

IV. PREDICTION RESULTS AND ANALYSIS

The ARIMA (1, 2, 2) model was used for forecasting the electrical load. Fig. 8 gives the forecasted results with a 95% confidence interval for ARIMA (1, 2, 2). Fig. 9 shows the forecasted results when the ARIMA (3, 2, 4) model is used. Fig. 9 indicates that the predicted L data follow the observed L data initially, but the error tends to increase as time increases. This indicates that the model is a good one for very short-term forecasting; however, it may not be good enough for predicting medium- and long-term electrical load. In Fig. 8, by contrast, the predicted L values follow the trend of the observed L values more closely. This model predicts the future load better, which indicates its superiority. The test data of electrical load were then used to forecast with the two best SARIMA models selected among the ten, in order to find the best-fit model. Fig. 10 shows the best forecast results, obtained using the model SARIMA (0 1 1)(0 1 1)7. Clearly, among the three figures (Fig. 8, Fig. 9 and Fig. 10), SARIMA (0 1 1)(0 1 1)7 provides the better forecast, with the least error and the best fit.
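The accuracy measures named in Section III-D can be written out as a short sketch. The paper does not state its formulas, so the standard textbook definitions are assumed here, including a percentage form of MAPE; the function names and the four sample load values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (assumes no zero loads)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (spread_o * spread_p)


# Hypothetical observed and forecast loads in MW, for illustration only.
observed = [1000.0, 1100.0, 900.0, 1200.0]
predicted = [1010.0, 1080.0, 930.0, 1160.0]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("MAPE (%):", mape(observed, predicted))
print("R:", pearson_r(observed, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, which is one reason two models can rank differently on the two measures.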

TABLE III

Model               logL         AIC         BIC         RMSE   MAE   R       MAPE
1. ARIMA(0,2,1)     -4.40E+03    8.82E+03    8.83E+03    735    661   0.412   62.62
2. ARIMA(1,2,2)     -4.36E+03    8.73E+03    8.76E+03    190    158   0.735   13.73
3. ARIMA(2,2,2)     -4.36E+03    8.73E+03    8.76E+03    208    165   0.729   13.75
4. ARIMA(3,2,3)     -4.38E+03    8.78E+03    8.78E+03    207    164   0.731   14.73
5. ARIMA(3,2,4)     -4.34E+03    8.70E+03    8.75E+03    354    325   0.474   29.54

TABLE IV

Model                            logL         AIC         BIC         RMSE   MAE   R       MAPE
1. SARIMA (0 1 1)(0 1 1)7        -5.77E+03    1.16E+04    1.16E+04    200    134   0.876   10.7
2. SARIMA (1 0 0)(0 1 1)7        -5.79E+03    1.16E+04    1.16E+04    208    145   0.772   11.8
3. SARIMA (1 0 1)(0 1 1)7        -5.76E+03    1.15E+04    1.15E+04    202    136   0.854   10.9
4. SARIMA (1 1 0)(0 1 1)7        -5.77E+03    1.16E+04    1.16E+04    208    145   0.732   11.6
5. SARIMA (1 1 1)(0 1 1)7        -5.79E+04    1.16E+04    1.16E+04    230    179   0.257   15.0
6. SARIMA (1 1 1)(1 1 1)7        -5.75E+03    1.15E+04    1.15E+04    230    178   0.243   14.9
7. SARIMA (1 1 0)(1 1 1)7        -5.79E+03    1.16E+04    1.16E+04    208    145   0.729   11.7
8. SARIMA (1 0 1)(1 1 1)7        -5.76E+03    1.15E+04    1.16E+04    205    141   0.776   11.3
9. SARIMA (1 0 0)(1 1 1)7        -5.79E+03    1.16E+04    1.16E+04    211    150   0.654   12.2
10. SARIMA (0 1 1)(1 1 1)7       -5.77E+03    1.15E+04    1.16E+04    201    135   0.641   10.8
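The relation between the logL column and the AIC and BIC columns can be sketched with the standard definitions AIC = 2k - 2 logL and BIC = k ln(n) - 2 logL. The parameter count k = p + q + 1 (one extra for the noise variance) and the use of n = 822 training points are assumptions, since the paper does not state how MATLAB counted them, and the logL inputs are the rounded values from Table III, so only approximate agreement with the tabulated criteria should be expected.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike Information Criterion: 2k - 2 logL."""
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood, num_params, num_obs):
    """Bayesian Information Criterion: k ln(n) - 2 logL."""
    return num_params * math.log(num_obs) - 2 * log_likelihood


NUM_TRAINING_POINTS = 822  # training-set size reported in the paper

# (rounded logL from Table III, assumed parameter count k = p + q + 1)
candidates = {
    "ARIMA(1,2,2)": (-4.36e3, 4),
    "ARIMA(3,2,4)": (-4.34e3, 8),
}

scores = {
    name: (aic(logl, k), bic(logl, k, NUM_TRAINING_POINTS))
    for name, (logl, k) in candidates.items()
}

for name, (aic_value, bic_value) in scores.items():
    print(f"{name}: AIC = {aic_value:.0f}, BIC = {bic_value:.0f}")

# Lowest AIC wins on this criterion alone; the paper also weighs
# residual diagnostics and forecast errors before choosing a model.
best_by_aic = min(scores, key=lambda name: scores[name][0])
print("Lowest AIC:", best_by_aic)
```

Both criteria reward a higher log-likelihood and penalize extra parameters, with BIC applying the heavier penalty once n exceeds about 8 observations.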

Fig. 8 Forecast of ARIMA (1, 2, 2)

Fig. 9 Forecast of ARIMA (3, 2, 4)

Fig. 10 Forecasted results of SARIMA (0 1 1)(0 1 1)7 with a magnified view

V. CONCLUSION
In this study, the 24-hourly electrical load of three consecutive years was taken from SLDC, Assam. Two types of models, ARIMA and SARIMA, were applied separately for forecasting the L data, and their performances have been studied. The conclusion is based on the results obtained from the study. Better forecasting results were obtained when the seasonality of the L data was considered in the SARIMA model. SARIMA therefore presents a better fitting effect. However, there is scope for further improving the forecasting model by taking into consideration external factors affecting the electrical load, such as weather and temperature. The authors intend to incorporate these factors in a future study with a view to upgrading the present model.

REFERENCES
[1] Martin T. Hagan, Suzanne M. Behr, “The Time Series Approach to Short Term Load Forecasting”, IEEE Transactions on Power Systems, Volume 2, Issue 3, Aug. 1987.
[2] D. Saxena, S. N. Singh, K. S. Verma, “Application of computational intelligence in emerging power systems”, International Journal of Engineering, Science and Technology, Vol. 2, No. 3.
[3] The Sentinel, August 24, 2018.
[4] Qiang Wang, Shuyu Li, Rongrong Li, “Forecasting Energy Demand in China and India: Using Single-linear, Hybrid-linear, and Non-linear Time Series Forecast Techniques”, Energy, Elsevier, vol. 161(C), pages 821-831, 2018, doi: 10.1016/j.energy.2018.07.168.
[5] Xiaobo Zhang, Jianzhou Wang, “A novel decomposition-ensemble model for forecasting short-term load-time series with multiple seasonal patterns”, Applied Soft Computing, Volume 65, Issue C, April 2018, pages 478-494.
[6] Ahmed I. Saleh, Asmaa H. Rabie, Khaled M. Abo-Al-Ez, “A data mining based load forecasting strategy for smart electrical grids”, Advanced Engineering Informatics 30 (2016) 422–448, Elsevier.
[7] Rob J. Hyndman, George Athanasopoulos, “8.9 Seasonal ARIMA models”, Forecasting: Principles and Practice, OTexts. Retrieved 19 May 2015.
[8] D. Chikobvu, C. Sigauke, “Regression-SARIMA modeling of daily peak electricity demand in South Africa”, Journal of Energy in Southern Africa, 2012, 23(3):23-30.
[9] K. B. Song, Y. S. Baek, D. H. Hong, G. Jang, “Short-term load forecasting for the holidays using fuzzy linear regression method”, IEEE Trans. Power Syst. 20 (1) (2005) 96–101.
[10] M. Beccali, M. Cellura, V. L. Brano, A. Marvuglia, “Forecasting daily urban electric load profiles using artificial neural networks”, Energy Conver. Manage. 45 (18–19) (2004) 2879–2900, Elsevier.
[11] G. Lv, X. Wang, Y. Jin, “Short-term load forecasting in power system using least squares support vector machine”, in: Proceedings of the 2006 International Conference on Computational Intelligence, Theory and Applications, 9th Fuzzy Days, 2006, pp. 117–126.
[12] K. W. Yu, C. H. Hsu and S. M. Yang, “A Model Integrating ARIMA and ANN with Seasonal and Periodic Characteristics for Forecasting Electricity Load Dynamics in a State”, 2019 IEEE 6th International Conference on Energy Smart Systems (ESS), Kyiv, Ukraine, 2019, pp. 19-24.
[13] G. E. P. Box, G. M. Jenkins, “Time Series Analysis: Forecasting and Control”, Journal of the American Statistical Association, 1970, 68(342):199-201.
[14] Yanming Yang, Haiyan Zheng and Ruili Zhang, “Prediction and Analysis of Aircraft Failure Rate Based on SARIMA Model”, 2017 2nd IEEE International Conference on Computational Intelligence and Applications.
[15] Mohanad S. Al-Musaylh, Ravinesh C. Deo, Jan F. Adamowski, Yan Li, “Short-term electricity demand forecasting with MARS, SVR and ARIMA models using aggregated demand data in Queensland, Australia”, Advanced Engineering Informatics 35 (2018) 1-16, Elsevier.
[16] Seunghyeon Park, Sekyung Han, Yeongik Son, “Demand power forecasting with data mining method in smart grid”, 2017 IEEE Innovative Smart Grid Technologies - Asia (ISGT-Asia).
