Development of Robust Meteorological Year Weather Data
Renewable Energy
journal homepage: www.elsevier.com/locate/renene
Article info

Article history:
Received 8 June 2017
Received in revised form 20 September 2017
Accepted 12 November 2017
Available online 14 November 2017

Keywords:
Typical meteorological year (TM2)
Robust synthetic data
Fourier series
Energy simulation
Levene's test
Kolmogorov-Smirnov two-sample test

Abstract

Building energy performance simulations are limited to typical meteorological weather conditions available in simulation software. Such simulations are insufficient for analysing energy performance sensitivity to a range of probable weather conditions. This research presents a method for developing robust meteorological weather data that can be used for energy performance sensitivity analysis without the need to access historical weather data. The method decomposes dry bulb temperature (DBT) and global horizontal solar radiation (H) into deterministic and stochastic components. For the typical weather data of the City of Adelaide, the deterministic component for each of DBT and H consists of a single frequency Fourier series. The stochastic components consist of 1-lag and 2-lags autoregressive models for DBT and H respectively. The stochastic components also include randomly selected values from the residuals of the autoregressive models. Based on this method, the coldest and hottest weather conditions were selected to simulate the energy performance of a single space. The results revealed 39% more cooling and 15% less heating in the hottest year, and 14% more heating and 64% less cooling in the coldest year. The results indicate that simulations based on typical weather conditions only are insufficient for assessing buildings' energy performance.
© 2017 Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.renene.2017.11.033
0960-1481/© 2017 Elsevier Ltd. All rights reserved.
344 S. Farah et al. / Renewable Energy 118 (2018) 343e350
data in a format, such as TM2, TM3 or EPW format, that is compatible with simulation software [8]. These limitations deny most researchers the ability to analyse a building's performance sensitivity to weather conditions.

To overcome these limitations researchers can use weather data generators such as RUNEOLE [9]. Weather data generators are useful when weather data are unavailable. Using historical data, and physical and statistical models, weather data generators can generate typical and extreme weather conditions [10]. However, weather data generators are unnecessary when typical weather data are available, especially when the weather data generators may be unavailable to researchers.

When typical weather data are available, researchers can develop synthetic data whose statistical characteristics are similar to those of the typical weather data. Synthetic weather data were generated in a top-down approach using CLIMED software. Using algorithmic chains, the software generated monthly weather data which were used as input to generate daily weather data, and the daily weather data were used to produce hourly data [7]. However, this process required accessing weather data similar to the considered site to adjust the model parameters used in the software. The generation of synthetic data was also explored by using the “smooth” function in MATLAB software to identify a trend in the weather data for each month, and the residuals between the identified trend and the initial data were then randomly resampled. The synthetic data from this process would be the sum of the monthly trends and the randomly resampled data. However, the synthetic data generated using this process were unsatisfactory as excessive fluctuations were observed in the trend for the entire year [11]. Based on a previous work of Boland [12], an improvement in identifying the trend was the use of Fourier series analysis [13]. Additional improvements were also included in the work of Rastogi and Andersen [13] for developing synthetic data based on typical weather data. These improvements included fitting a seasonal autoregressive moving average (SARMA) model to the Fourier series residuals and performing 3-day blocks of random sampling within each month to maintain the intrinsic weather inertia in the data as suggested by Magnano, Boland and Hyndman [14].

While these improvements produce synthetic data statistically similar to the original data, some of the adopted procedures complicate the process of synthetic data generation. For instance, instead of fitting a SARMA model to the Fourier series residuals, an autoregressive moving average (ARMA) model should be sufficient as all the important frequencies can be detected and detrended using the Fourier series analysis. In addition, using 3-day blocks for random sampling seems unnecessary as an ARMA model of the Fourier series residuals is meant to model the intrinsic inertia of weather data.

This research presents a method for developing a robust meteorological year (RMY), without the need to access historical weather data, based only on typical data available in simulation software. The method decomposes the data into deterministic (Fourier series) and stochastic (ARMA + residuals) components. The deterministic component is maintained the same throughout the process of developing the RMY data, while the stochastic component is modified by random sampling. This method has three main differences compared with other methods. First, this method is based on average daily values for the modelling of both deterministic and stochastic components. The average values smooth the data and simplify the modelling of deterministic and stochastic components. Second, instead of resampling each month separately, the method allows mixing the errors from different months. The mixing allows creating a wider range of variations in the robust data. Third, the generation of hourly data from average daily synthetic data is based on selecting hourly data from the initial data. This selection eliminates the need for detecting outliers in the hourly synthetic data.

The method presented in this research uses the TM2 weather data for the City of Adelaide developed by Meteonorm. The focus of the method is on DBT and H as the two main weather conditions affecting building energy performance.

2. Data deterministic component – Fourier series models

2.1. Dry bulb temperature

The seasonality of the daily average dry bulb temperature (DBT) is clearly shown in Fig. 1, with the average DBT in summer being higher than that in winter. The variation of DBT can be modelled by a deterministic function: a Fourier series (FS) which has a frequency equal to one cycle per year.

Fig. 1. Average dry bulb temperature (°C).

Higher frequencies that may represent quarterly (frequency equals 4) or monthly (frequency equals 12) variations could also be significant to represent the data with a FS. However, the frequency power spectrum shown in Fig. 2 reveals that the power corresponding to frequencies higher than the fundamental frequency (frequency equals 1) is negligible compared to the power of the fundamental frequency. This result indicates that using the fundamental frequency is sufficient to capture most of the periodic variation in the data.

Consequently, the FS model ($T_{FS}$) of the average daily DBT can be represented by Equation (1)

$$T_{FS} = \bar{T} + a\cos(\omega t) + b\sin(\omega t) \qquad (1)$$

where $\bar{T}$ is the average temperature calculated as in Equation (2)

$$\bar{T} = \frac{\sum_{i=1}^{365} T_i}{365} \qquad (2)$$

where $T_i$ is the average daily DBT.

The value of $\omega$ is calculated as shown in Equation (3)

$$\omega = \frac{2\pi}{365} \qquad (3)$$

and the remaining unknown coefficients $a$ and $b$ in Equation (1) are calculated to minimise the sum of the squared errors (SSE) between the data and the FS. The SSE is calculated as shown in Equation (4)

$$SSE = \sum_{i=1}^{365} \left(T_{FS_i} - T_i\right)^2 \qquad (4)$$

The minimum value of SSE is achieved for $a$ and $b$ equal to 5.122 and 2.075 respectively, and the FS model can be written as in
Fig. 3. DBT-FS model versus DBT data.
Fig. 5. Scatter plot of DBT-FS residuals versus H-FS residuals.
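The least-squares fit behind Equations (1) to (4) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `fit_fourier`, the day index running from 1 to 365, and the noise-free demonstration series (with a placeholder mean of 17.0) are all assumptions made here. Only the coefficients 5.122 and 2.075 come from the text.

```python
import math

def fit_fourier(daily, period=365):
    """Fit y(t) = mean + a*cos(w*t) + b*sin(w*t) by least squares."""
    n = len(daily)
    mean = sum(daily) / n                  # Equation (2)
    omega = 2.0 * math.pi / period         # Equation (3)
    # Assumption: t counts days as 1..n.
    cos_t = [math.cos(omega * t) for t in range(1, n + 1)]
    sin_t = [math.sin(omega * t) for t in range(1, n + 1)]
    y = [v - mean for v in daily]
    # Normal equations of the two-parameter problem that minimises Equation (4).
    scc = sum(c * c for c in cos_t)
    sss = sum(s * s for s in sin_t)
    scs = sum(c * s for c, s in zip(cos_t, sin_t))
    syc = sum(v * c for v, c in zip(y, cos_t))
    sys_ = sum(v * s for v, s in zip(y, sin_t))
    det = scc * sss - scs * scs
    a = (syc * sss - sys_ * scs) / det
    b = (sys_ * scc - syc * scs) / det
    fitted = [mean + a * c + b * s for c, s in zip(cos_t, sin_t)]
    sse = sum((f - v) ** 2 for f, v in zip(fitted, daily))
    return mean, a, b, sse

# Hypothetical noise-free series; the mean of 17.0 is a placeholder,
# not a value reported in the paper.
w = 2.0 * math.pi / 365
demo = [17.0 + 5.122 * math.cos(w * t) + 2.075 * math.sin(w * t)
        for t in range(1, 366)]
mean, a, b, sse = fit_fourier(demo)
```

On real TM2 data the SSE would not vanish; the differences between the data and the fitted series are the FS residuals that the stochastic component models.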
of the autocorrelation and partial autocorrelation is required. This analysis allows identifying the model form, which can be an autoregressive (AR) model, a moving average (MA) model, or a combination of these two model forms: an ARMA model. This analysis also allows identifying the degree of the required model, that is, the number of lags that should be considered. The results of autocorrelation and partial autocorrelation, shown in Figs. 8 and 9, indicate that an AR(1) model is suitable: a rapidly decaying autocorrelation with a peak for lag 1 in the partial autocorrelation.

The DBT-AR(1) model can be written as shown in Equation (7)

$$T_{FSR}^{t} = a\,T_{FSR}^{t-1} + R_{T}^{t} \qquad (7)$$

where $a$, $R_{T}^{t}$, $T_{FSR}^{t}$ and $T_{FSR}^{t-1}$ are the autoregression coefficient, the AR model residuals at time t, the FS residuals at time t and the FS residuals at time t-1 respectively. The value of the regression coefficient $a$ is obtained by minimizing the SSE of the AR model residuals.

where $H_{FSR}^{t}$, $H_{FSR}^{t-1}$, $H_{FSR}^{t-2}$ and $R_{H}^{t}$ are the FS residuals at time t, the FS residuals at time t-1, the FS residuals at time t-2 and the AR model residuals at time t respectively.

4. Development of synthetic data

In a generic form, synthetic data (S) for either DBT or H are the sum of the FS and AR models plus a random variation (RV) as shown in Equation (9).

$$S = FS + AR + RV \qquad (9)$$

where the only remaining unknown is RV.

4.1. Grouping residuals

The values of RV are obtained by random sampling of the
Fig. 8. Autocorrelation function for DBT-FS residuals.
Fig. 10. Autocorrelation function for TSR-FS residuals.
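For the AR(1) model of Equation (7), minimising the SSE of the AR residuals without an intercept has a closed-form solution: the coefficient is the sum of products of consecutive FS residuals divided by the sum of squared lagged residuals. The sketch below illustrates that calculation. The function name `fit_ar1` and the noise-free test sequence with coefficient 0.6 are hypothetical choices made here, not values from the paper.

```python
def fit_ar1(x):
    """Least-squares AR(1) coefficient and residuals for x[t] = a*x[t-1] + r[t]."""
    # Pair each value with its predecessor: (x[t-1], x[t]).
    pairs = list(zip(x, x[1:]))
    num = sum(cur * prev for prev, cur in pairs)
    den = sum(prev * prev for prev, _ in pairs)
    a = num / den
    # AR model residuals, one per time step after the first.
    residuals = [cur - a * prev for prev, cur in pairs]
    return a, residuals

# Hypothetical FS residuals: a noise-free AR(1) sequence with coefficient 0.6.
fs_residuals = [1.0]
for _ in range(20):
    fs_residuals.append(0.6 * fs_residuals[-1])

a, ar_residuals = fit_ar1(fs_residuals)
```

With real FS residuals the returned `ar_residuals` would be non-zero, and they would form the pool from which the random variation RV of Equation (9) is sampled. The 2-lag model used for solar radiation would need a two-coefficient fit, which is not shown here.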
will be from the second group. While this grouping may seem suitable in reducing extreme synthetic data, variance homogeneity within each of the groups is not guaranteed. Achieving a homogeneous variance within each group would require increasing the number of groups, with each group containing monthly residuals with similar variance. However, increasing the number of groups would reduce the number of data in each group, which may lead to limited variation in the synthetic data.

To improve variance homogeneity while maintaining a large number of data in each group, a new grouping method of residuals is proposed in this research. The method is based on creating 12 groups with each of the groups being assigned to a specific month.

Table 1
Pairwise Levene's test results for dry bulb temperature residuals.
Month  Month  Hypothesis  p-Value    Month  Month  Hypothesis  p-Value
Jan    Feb    0           0.537      Apr    Aug    0           0.435
Jan    Mar    0           0.204      Apr    Sep    0           0.764
Jan    Apr    1           0.011      Apr    Oct    0           0.410
Jan    May    1           0.003      Apr    Nov    0           0.098
Jan    Jun    1           0.000      Apr    Dec    1           0.008
Jan    Jul    1           0.001      May    Jun    0           0.262
Jan    Aug    1           0.001      May    Jul    0           0.681
Jan    Sep    1           0.005      May    Aug    0           0.684
Jan    Oct    0           0.051      May    Sep    0           0.939
Jan    Nov    0           0.449      May    Oct    0           0.206
Jan    Dec    0           0.959      May    Nov    1           0.045
Feb    Mar    0           0.062      May    Dec    1           0.002
Feb    Apr    1           0.002      Jun    Jul    0           0.436
Feb    May    1           0.000      Jun    Aug    0           0.362
Feb    Jun    1           0.000      Jun    Sep    0           0.244
Feb    Jul    1           0.000      Jun    Oct    1           0.025
Feb    Aug    1           0.000      Jun    Nov    1           0.008
Feb    Sep    1           0.001      Jun    Dec    1           0.000
Feb    Oct    1           0.010      Jul    Aug    0           0.963
Feb    Nov    0           0.184      Jul    Sep    0           0.632
Feb    Dec    0           0.487      Jul    Oct    0           0.095
Mar    Apr    0           0.194      Jul    Nov    1           0.022
Mar    May    0           0.093      Jul    Dec    1           0.001
Mar    Jun    1           0.015      Aug    Sep    0           0.631
Mar    Jul    1           0.045      Aug    Oct    0           0.082
Mar    Aug    1           0.040      Aug    Nov    1           0.019
Mar    Sep    0           0.113      Aug    Dec    1           0.000
Mar    Oct    0           0.540      Sep    Oct    0           0.244
Mar    Nov    0           0.654      Sep    Nov    0           0.055
Mar    Dec    0           0.201      Sep    Dec    1           0.003
Apr    May    0           0.705      Oct    Nov    0           0.295
Apr    Jun    0           0.168      Oct    Dec    1           0.043
Apr    Jul    0           0.446      Nov    Dec    0           0.458
Fig. 12. Dry bulb temperature residuals.
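For reference, the standard form of Levene's statistic for comparing the variances of k groups is given below. It is included as background on the pairwise tests reported for monthly residuals and is not taken from the paper. Whether the group mean or the group median was used as the centre is not stated in the text, so the centre is left generic here; for a pairwise comparison of two months, k = 2.

```latex
% Y_{ij}: residual j in group (month) i;  N_i: size of group i;  N = \sum_i N_i
% c_i: centre of group i (mean or median; the choice is an assumption)
\[
Z_{ij} = \left| Y_{ij} - c_i \right|, \qquad
W = \frac{N-k}{k-1}\,
    \frac{\sum_{i=1}^{k} N_i \left(\bar{Z}_{i\cdot} - \bar{Z}_{\cdot\cdot}\right)^2}
         {\sum_{i=1}^{k}\sum_{j=1}^{N_i} \left(Z_{ij} - \bar{Z}_{i\cdot}\right)^2}
\]
% Under the null hypothesis of equal variances, W is compared with the
% F distribution with (k-1, N-k) degrees of freedom.
```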
Table 2
Pairwise Levene's test results for solar radiation residuals.
Month  Month  Hypothesis  p-Value    Month  Month  Hypothesis  p-Value
Jan    Feb    0           0.619      Apr    Aug    0           0.264
Jan    Mar    0           0.681      Apr    Sep    0           0.892
Jan    Apr    0           0.509      Apr    Oct    0           0.356
Jan    May    0           0.345      Apr    Nov    1           0.021
Jan    Jun    0           0.282      Apr    Dec    0           0.388
Jan    Jul    0           0.194      May    Jun    0           0.887
Jan    Aug    0           0.735      May    Jul    0           0.664
Jan    Sep    0           0.692      May    Aug    0           0.503
Jan    Oct    0           0.154      May    Sep    0           0.222
Jan    Nov    1           0.007      May    Oct    1           0.013
Jan    Dec    0           0.178      May    Nov    1           0.000
Feb    Mar    0           0.938      May    Dec    1           0.021
Feb    Apr    0           0.924      Jun    Jul    0           0.741
Feb    May    0           0.132      Jun    Aug    0           0.409
Feb    Jun    0           0.098      Jun    Sep    0           0.186
Feb    Jul    0           0.059      Jun    Oct    1           0.008
Feb    Aug    0           0.383      Jun    Nov    1           0.000
Feb    Sep    0           0.962      Jun    Dec    1           0.015
Feb    Oct    0           0.364      Jul    Aug    0           0.278
Feb    Nov    1           0.030      Jul    Sep    0           0.132
Feb    Dec    0           0.391      Jul    Oct    1           0.004
Mar    Apr    0           0.857      Jul    Nov    1           0.000
Mar    May    0           0.175      Jul    Dec    1           0.008
Mar    Jun    0           0.138      Aug    Sep    0           0.478
Mar    Jul    0           0.090      Aug    Oct    0           0.065
Mar    Aug    0           0.445      Aug    Nov    1           0.002
Mar    Sep    0           0.982      Aug    Dec    0           0.084
Mar    Oct    0           0.326      Sep    Oct    0           0.381
Mar    Nov    1           0.025      Sep    Nov    1           0.040
Mar    Dec    0           0.351      Sep    Dec    0           0.402
Apr    May    0           0.057      Oct    Nov    0           0.180
Apr    Jun    1           0.034      Oct    Dec    0           0.997
Apr    Jul    1           0.015      Nov    Dec    0           0.202

Table 4
Sampling groups for solar radiation.
Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Jan  Jan  Jan  Jan  Jan  Jan  Jan  Jan  Jan  Jan  Oct  Jan
Feb  Feb  Feb  Feb  Feb  Feb  Feb  Feb  Feb  Feb  Nov  Feb
Mar  Mar  Mar  Mar  Mar  Mar  Mar  Mar  Mar  Mar  Dec  Mar
Apr  Apr  Apr  Apr  Apr  May  May  Apr  Apr  Apr       Apr
May  May  May  May  May  Jun  Jun  May  May  Aug       Aug
Jun  Jun  Jun  Aug  Jun  Jul  Jul  Jun  Jun  Sep       Sep
Jul  Jul  Jul  Sep  Jul  Aug  Aug  Jul  Jul  Oct       Oct
Aug  Aug  Aug  Oct  Aug  Sep  Sep  Aug  Aug  Nov       Nov
Sep  Sep  Sep  Dec  Sep            Sep  Sep  Dec       Dec
Oct  Oct  Oct                      Oct  Oct
Dec  Dec  Dec                      Dec  Dec

variances of residuals in the corresponding months are considered similar. In contrast, the value 1 means that the null hypothesis has been rejected and the variances of residuals in the corresponding months are significantly different. The results of Levene's test are based on the comparison of values, shown in the “p-Value” column, to the level of significance. For example, the first test in Table 1 shows that the residuals in January and February can be considered similar as the p-value (0.537) is greater than 5% and the null hypothesis could not be rejected. In contrast, the third test shows that the null hypothesis has been rejected as the p-value (0.011) is less than 5%, indicating that the residuals in January and April are significantly different.

Based on the results in Tables 1 and 2, the sampling groups for each month for DBT and H are formed as shown in Tables 3 and 4 respectively. For instance, the first column in Table 3 indicates that the sampling group for January contains the residuals of the listed months, namely, January, February, March, October, November and December.

Sampling from these groups reduces excessively extreme values while maintaining a wide variation of values in the synthetic data. However, such sampling does not eliminate the necessity to verify that the synthetic data conform to reality. For instance, any negative value of solar radiation is unreasonable and should not be included in the synthetic data. Similar discretion should also be used for DBT by adopting maximum and minimum acceptable temperature values, such as the historical maximum and minimum temperature values. Such maximum and minimum values do not require access to historical weather data as they are typically available on weather websites, such as that of the Bureau of Meteorology for Australia [15]. Depending on the application of the synthetic data, the maximum and minimum values could also be slightly modified within reasonable and justified limits. In this research, the adopted maximum and minimum values for both synthetic DBT and H are equal to the maximum and minimum values in the initial TM2 data.

4.2. Synthetic data verification

Based on the method presented, one thousand synthetic years of DBT and H have been developed. The synthetic DBT and H data vary significantly between the different years, as the sample of five synthetic years shows in Figs. 14 and 15 respectively. These figures also show that the synthetic data are indistinguishable from the initial TM2 data (shown in black colour in Figs. 14 and 15).

Although the synthetic and original data are pictorially indistinguishable, a comparison between the two data sets on monthly
Fig. 14. Sample of synthetic dry bulb temperature.
Table 3
Sampling groups for dry bulb temperature.
Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Jan  Jan  Jan  Mar  Mar  Apr  Apr  Apr  Mar  Jan  Jan  Jan
Feb  Feb  Feb  Apr  Apr  May  May  May  Apr  Mar  Feb  Feb
Mar  Mar  Mar  May  May  Jun  Jun  Jun  May  Apr  Mar  Mar
Oct  Nov  Apr  Jun  Jun  Jul  Jul  Jul  Jun  May  Apr  Nov
Nov  Dec  May  Jul  Jul  Aug  Aug  Aug  Jul  Jul  Sep  Dec
Dec       Sep  Aug  Aug  Sep  Sep  Sep  Aug  Aug  Oct
          Oct  Sep  Sep       Oct  Oct  Sep  Sep  Nov
          Nov  Oct  Oct                 Oct  Oct  Dec
          Dec  Nov                      Nov  Nov
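The link between the pairwise test results and the sampling groups can be made explicit with a short sketch. The rule used here, that a month's group contains the month itself plus every month whose null hypothesis was not rejected against it, is inferred from the January example in the text. The function name `sampling_groups` and the dictionary layout are illustrative choices. The rejected pairs are the DBT rows of Table 1 with a Hypothesis value of 1.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Pairs from Table 1 (DBT residuals) whose Hypothesis column equals 1,
# keyed by the first month of each pair.
REJECTED_DBT = {
    "Jan": ["Apr", "May", "Jun", "Jul", "Aug", "Sep"],
    "Feb": ["Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct"],
    "Mar": ["Jun", "Jul", "Aug"],
    "Apr": ["Dec"],
    "May": ["Nov", "Dec"],
    "Jun": ["Oct", "Nov", "Dec"],
    "Jul": ["Nov", "Dec"],
    "Aug": ["Nov", "Dec"],
    "Sep": ["Dec"],
    "Oct": ["Dec"],
}

def sampling_groups(rejected):
    """Map each month to the months whose residual variance is not
    significantly different from its own (the month itself included)."""
    differ = {frozenset((m, n)) for m, others in rejected.items() for n in others}
    return {m: [n for n in MONTHS if frozenset((m, n)) not in differ]
            for m in MONTHS}

groups = sampling_groups(REJECTED_DBT)
```

Under this rule, `groups["Jan"]` lists January, February, March, October, November and December, matching the January column described in the text. The same function applied to the rejected pairs of Table 2 would give the solar radiation groups.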
Fig. 15. Sample of synthetic total solar radiation.
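One way to read Equation (9) for the AR(1) case is as a day-by-day recursion: the AR term is the coefficient times the previous day's synthetic residual, and RV is drawn from the sampling group of the current month. The sketch below follows that reading. How out-of-range values are handled is not specified in the text, so redrawing up to `max_tries` times and then clipping is an assumption made here, as are the function name `synthesize`, the zero starting residual, and all the toy numbers.

```python
import random

def synthesize(fs, a, pools, month_of_day, lo, hi, seed=0, max_tries=100):
    """Build one synthetic daily series S = FS + AR + RV (AR(1) case)."""
    rng = random.Random(seed)
    series = []
    prev = 0.0  # assumed starting residual
    for t, base in enumerate(fs):
        pool = pools[month_of_day[t]]  # residual pool of this month's group
        for _ in range(max_tries):
            value = base + a * prev + rng.choice(pool)
            if lo <= value <= hi:
                break
        else:
            # Assumed fallback: clip if no acceptable draw was found.
            value = min(max(value, lo), hi)
        series.append(value)
        prev = value - base  # synthetic residual carried to the next day
    return series

# Toy inputs, purely illustrative: 60 days, two months, made-up residual pools.
fs = [15.0 + 5.0 * (t % 2) for t in range(60)]
month_of_day = ["Jan"] * 31 + ["Feb"] * 29
pools = {"Jan": [-3.0, -1.0, 0.5, 2.0, 4.0], "Feb": [-2.0, 0.0, 1.0, 3.0]}
year = synthesize(fs, a=0.6, pools=pools, month_of_day=month_of_day,
                  lo=5.0, hi=30.0)
```

Changing `seed` gives a different synthetic series from the same deterministic component, which is how a set of many synthetic years could be produced. For solar radiation the lower bound would be zero or the TM2 minimum, and the recursion would need the two-lag form.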
Fig. 16. Synthetic and TM2 dry bulb temperature.
Fig. 18. Perspective view of the simulated space.
Fig. 19. Schematic front and side views of the simulated space.
References