Applied Energy: Xin Liang, Tianzhen Hong, Geoffrey Qiping Shen
Applied Energy
journal homepage: www.elsevier.com/locate/apenergy
Article info

Article history:
Received 20 January 2016
Received in revised form 17 May 2016
Accepted 29 June 2016
Available online 7 July 2016

Keywords:
Baseline model
Occupancy
Building energy use
Measurement and verification
Energy efficiency retrofit

Abstract

More than 80% of energy is consumed during operation phase of a building's life cycle, so energy efficiency retrofit for existing buildings is considered a promising way to reduce energy use in buildings. The investment strategies of retrofit depend on the ability to quantify energy savings by "measurement and verification" (M&V), which compares actual energy consumption to how much energy would have been used without retrofit (called the "baseline" of energy use). Although numerous models exist for predicting baseline of energy use, a critical limitation is that occupancy has not been included as a variable. However, occupancy rate is essential for energy consumption and was emphasized by previous studies. This study develops a new baseline model which is built upon the Lawrence Berkeley National Laboratory (LBNL) model but includes the use of building occupancy data. The study also proposes metrics to quantify the accuracy of prediction and the impacts of variables. However, the results show that including occupancy data does not significantly improve the accuracy of the baseline model, especially for HVAC load. The reasons are discussed further. In addition, sensitivity analysis is conducted to show the influence of parameters in baseline models. The results from this study can help us understand the influence of occupancy on energy use, improve energy baseline prediction by including the occupancy factor, reduce risks of M&V and facilitate investment strategies of energy efficiency retrofit.
© 2016 Elsevier Ltd. All rights reserved.
1. Introduction

The buildings sector consumes 40% of the total primary energy in the United States [1], and the consumption has continued to increase, particularly in developing countries [2]. The buildings sector is thus responsible for a quarter of the total global greenhouse gas (GHG) emissions [3], and this proportion can reach around 50% in the United States (U.S.) with adverse impact on global environment, healthcare, and economy [4]. Furthermore, in the life cycle of a building, more than 80% of energy consumption occurs during the actual operation stage, rather than the construction stage [5]. Therefore, improving the energy efficiency of existing buildings is a key issue for reducing the total energy consumption and GHG emissions.

Energy efficiency retrofit for existing buildings is considered a promising method to achieve the target of energy savings [6]. Numerous previous studies indicated energy retrofit can significantly benefit the environment, society, and economy by improving energy efficiency [6,7], reducing emissions [8,9], controlling resource usage [10], enhancing the reputation of building owners [11], improving the health and productivity of occupants [12,13], reducing operation costs [14], increasing rent and occupancy rates [15,16], and creating job opportunities [17].

⇑ Corresponding author.
E-mail addresses: liangxinpku@gmail.com (X. Liang), thong@lbl.gov (T. Hong).
http://dx.doi.org/10.1016/j.apenergy.2016.06.141
0306-2619/© 2016 Elsevier Ltd. All rights reserved.
248 X. Liang et al. / Applied Energy 179 (2016) 247–260
Owing to the significant benefits on energy conservation and other aspects of society, energy efficiency retrofit has been emphasized around the world. For example, the U.S. government passed the Energy Policy Act (EPA) of 2005 and Executive Order (EO) 13423, which require that 15% of the total number of existing buildings should be retrofitted to improve energy efficiency by 2015 compared with the 2003 baseline. Approximately 30 billion US dollars are assigned to conduct energy efficiency retrofit of existing buildings and facilities [7]. Incentivized by these policies, the market to provide energy efficiency services through energy service companies (ESCOs) has been blooming in the last decade [8].

Energy performance contracting (EPC), which is a financing package provided by ESCOs, is a commonly used market mechanism to implement energy efficiency retrofit. EPC includes energy savings guarantees and associated design, implementation and operation services [2,9]. The profit (or the payment to ESCOs) of an EPC is mainly from the amount of energy cost savings after retrofit. The energy savings can be defined as the difference between how much energy the building consumed after retrofit, and how much it would have consumed without the retrofit. The former can be obtained from utility meters, and the latter, which is referred to as the energy use "baseline", is not measurable and can only be obtained by prediction. The accuracy of the baseline prediction can significantly impact the energy saving assessment, investment return and payback period. Furthermore, it can likewise impact the investment strategies and development of the building retrofit market.

The whole process of predicting baseline and assessing energy saving is called "measurement and verification" (M&V) [10]. The mechanism of M&V approaches is first monitoring the energy use of buildings, then developing mathematical models trained by observed data, and finally predicting baseline of energy use based on the developed models. Xia and Zhang [10] presented a mathematical description of the M&V problem and cast a scientific framework for the basic M&V concepts, propositions, techniques and methodologies. Mathieu et al. [11] proposed a regression-based model to predict baseline electricity consumption of commercial buildings and industrial facilities. Coughlin et al. [12] evaluated the performance of three average-based models for baseline. Granderson et al. [13] proposed an automated M&V method for evaluating model performance. Granderson and Price [14] summarized five baseline models, including both average-based models and regression-based models, and compared the predictive accuracy of these models with several metrics. More complex mathematical models of M&V have emerged, including multivariate regression models, exponential smoothing models, neural network models, and Fourier series models [15–18].

Uncertainty of M&V models is important, since not only the value of the baseline, but also the accuracy and reliability of the prediction are critical to energy efficiency retrofit. It provides the stakeholders (e.g., ESCOs, building owners, facility managers) the information of investment risk, which is critical in decision making. For example, if the post-retrofit energy use will be 30% lower than the baseline, but the uncertainty exceeds 30%, it is then very risky to invest in this retrofit project. Walter et al. [8] emphasized the influence of uncertainty and assessed uncertainty of M&V for 17 buildings by calculating the percent differences between predicted baseline and observed data. The results showed there was considerable uncertainty in baseline prediction: 5 out of 17 buildings had more than 20% uncertainty, and in an extreme case it was more than 60%.

The occupancy rate is a key uncertainty factor of M&V. Numerous previous studies indicated that the occupancy rate had significantly positive correlation with the energy use in buildings [19–26]. Occupants in buildings influence energy use in buildings in three major ways [27]: (1) sensible and latent heat gains from occupants, (2) occupants' need of thermal comfort, visual comfort and indoor air quality, and (3) occupant behavior and interactions with building systems and controls [28,29]. In addition, in commercial buildings, the occupancy rate may increase after energy retrofit, due to lower utility bills, better indoor environment and higher social reputation [30–32]. Miller et al. [33] indicated that office buildings with green features will have a 2–4% occupancy rate premium. Wiley et al. [34] specified that the occupancy rate of office buildings with LEED certification will increase by up to 16–18%. Therefore, if the occupancy rate is changed after energy retrofit, the baseline of energy use should be adjusted.

Although a number of previous studies emphasized the importance of the occupancy factor in predicting the baseline, few studies, if any, used the occupancy factor in baseline prediction models, probably due to the highly stochastic activities and data limitation. Therefore, several research questions related to M&V remain to be answered: Does occupancy rate significantly impact the accuracy of baseline prediction? If yes, how can the impact on prediction accuracy be quantitatively evaluated? Is the influence of occupancy on baseline models stronger than, weaker than or equal to that of other impact factors (e.g., outdoor air temperature, day of week)? Is it feasible to improve prediction accuracy of energy baseline by using occupancy data? Nowadays, most commercial buildings have access control systems, which can obtain occupancy data in short time intervals. These data provide a new opportunity to analyze in depth the impact of occupancy on the accuracy of baseline prediction.

To address the aforementioned questions, this study proposes a novel method to quantitatively evaluate how the accuracy of energy baseline models is improved by including the occupancy factor. Different from previous models, the proposed model of this study considers the occupancy data as independent variables rather than external uncertainty, shown in Fig. 1. Although the influence of occupancy has been emphasized by numerous previous studies, traditional models have not included occupancy data in the functions of energy prediction. Therefore, in traditional models, occupancy is an external uncertain factor, which can negatively impact the accuracy of energy prediction. In contrast, in this study, the occupancy data is considered as an independent variable so that the influence of occupancy can be fitted by the function and evaluated by the prediction results. The results of this study showed the accuracy of energy prediction is improved. From a theoretical perspective, including occupancy data can improve the prediction accuracy, since the uncertainty of the occupancy factor can be controlled and reduced, and less uncertainty can improve the prediction accuracy.

The results of this study reveal the influence of occupancy on the accuracy of energy prediction. In addition, since the performance of models varies across hours and systems, the proposed method zooms into the hourly performance and different systems (i.e., HVAC, lighting, plug load and total load) of baseline models. Another important feature of this work is that it uses only a simple algorithm, excluding complex mathematical processing, and the input data is available in most commercial buildings. That means the proposed method is relatively easy to implement, and can be readily adopted for practical projects. The results of this study can help us understand the quantitative influence of occupancy on energy use and energy baseline models.

2. Methodology

2.1. Framework of evaluating occupancy impact on baseline prediction

The methodology to evaluate occupancy influence on baseline prediction comprises four steps, illustrated in Fig. 2.
[Fig. 1: Traditional models predict energy use y = f(x1, x2, ..., xn) from influencing factors such as temperature (x1) and time (x2); the proposed model, y = f(x1, x2, x3, ..., xn), adds occupancy (x3) as an influencing factor.]
Step 1: Problem Definition and Data Preparation. One aim of this step is to clarify problem definition, boundary, assumption and key metrics of success. The scope of this study focuses on the energy baseline prediction in office buildings. Since there are normally fewer occupants in office buildings on weekends, this study only focuses on the energy use on weekdays. The key metric, which is to assess different models and factors, is the similarity between prediction results and observed data.

The other aim of this step is to prepare data for the analysis in the next steps. It includes acquiring, harmonizing, rescaling, cleaning and formatting data. Due to the failure of sensors, stochastic noise and other interference factors, the raw data set may contain missing data, error data and unstructured data. Before data mining, the raw data should be pre-processed to provide the valid data for further analysis. In this study, three types of data are required (i.e., outdoor air temperature, energy use and occupancy count data). Outdoor air temperature can be obtained from sensors outside buildings or database of weather stations. Energy use data can be obtained from electricity meters in buildings. Occupant number can be obtained from the records of access control system. All these data are recorded with timestamps of short-time intervals, typically at 5–15 min. Using the short-time "interval data" can significantly reduce the duration of data required in baseline models [8].

Step 2: Correlation analysis. This step is to verify the correlation between occupancy rate and total energy consumption of buildings. The total energy consumption can be divided into several sub-systems (e.g., HVAC, lighting system and plug load) by using sub-meters. Then, the more occupancy-dependent sub-systems, which have higher correlation with occupancy rate, can be revealed. Scatter plots are applied to visualize correlations qualitatively, while statistical analysis is applied to calculate correlations quantitatively. Correlation coefficients and significance levels are the main criteria of the correlation test. If occupancy rate and energy use are significantly correlated, the next step will be executed.

Step 3: Evaluation and comparison of accuracy of baseline models. This step is to quantitatively evaluate the influence of occupancy on the accuracy of baseline models. First, three baseline prediction models are implemented to predict baseline of energy use based on the observed data. Two models, which do not include the occupancy factor, are adopted from previous studies. The other one, using occupancy data, is the proposed method in this study. The algorithms of the three models will be illustrated in detail in Section 2.2. Then, the prediction results are compared across the three models. The method and metrics of the evaluation will be introduced in detail in Section 2.3. The results can show whether the occupancy data improves the accuracy of baseline prediction. If the prediction accuracy is significantly improved by occupancy data, the next step will be executed.

Step 4: Prioritization of impact factors. Based on horizontal comparison across models in the last step, this step further evaluates the influence of the occupancy factor by vertical comparison across factors. Besides the number of occupants, there are various uncertain factors which can impact energy consumption of buildings (e.g., outdoor air temperature, facility degradation and climate change). It is important to understand not only the influence of the occupancy factor, but also its priority compared to other impact factors. Namely, this step is to identify which factor is more critical to the accuracy of baseline prediction. The results can provide guidance for selecting factors in baseline models. The method and metrics of the factor comparison will be introduced in detail in Section 2.4.

2.2. Baseline prediction models

Three baseline models are implemented to demonstrate the methodology. The first model is a simplistic "naive" model, which only depends on one variable, the time of week. It serves as a comparative "floor" of performance [14]. The second model was devel-
[Fig. 2: Flowchart of processes and results with yes/no decision branches; one recoverable process box reads "Compare prediction accuracy of baseline models with and without occupancy data".]
$$L^p_{n,time} = \sum_{i=1}^{120} a_i s_{n,i} \tag{3}$$

The indicator $s_{n,i}$, which is defined in Eq. (4), serves to select which coefficient is active. For a given point $t_n$, only one indicator is one, while the other 119 indicators are zero. When $s_{n,i} = 0$, the coefficient $a_i$ has no effect.

$$s_{n,i} = \begin{cases} 1 & \text{if } t_n \in S_i \\ 0 & \text{if } t_n \notin S_i \end{cases} \tag{4}$$

The temperature-dependent portion $L^p_{n,temp}$ mainly represents the different features of energy consumption among different temperatures, which is probably most related to the behaviors of the heating and cooling systems. As aforementioned, $L^p_{n,temp}$ is modeled by a piecewise-linear and continuous function. A number of temperature intervals need to be divided for this piecewise-linear function, and a temperature component $h_{n,j}$ and a coefficient $b_j$ are assigned to each interval. The temperature $T_n$ is the sum $T_n = \sum_{j=1}^{N_T} h_{n,j}$, where $N_T$ is the number of temperature intervals and $h_{n,j}$ is the portion of $T_n$ in interval $j$. For example, in the case study of [8], four intervals are defined (i.e., 20–40 °F, 40–60 °F, 60–80 °F, 80–100 °F). If the given temperature $T_n$ = 70 °F, the values of the four components are $h_{n,1}$ = 20 °F, $h_{n,2}$ = 20 °F, $h_{n,3}$ = 10 °F, $h_{n,4}$ = 0 °F. The whole temperature-dependent portion $L^p_{n,temp}$ can be calculated by summing the products of temperature components and coefficients of all intervals, shown in Eq. (5).

$$L^p_{n,temp} = \sum_{j=1}^{N_T} b_j h_{n,j} \tag{5}$$

The predictive baseline $L^p_n$ by Eq. (2) can be transformed to Eq. (6), where the coefficients $a_i$ and $b_j$ can be computed by training with observed data.

$$L^p_n = L^p_{n,time} + L^p_{n,temp} = \sum_{i=1}^{120} a_i s_{n,i} + \sum_{j=1}^{N_T} b_j h_{n,j} \tag{6}$$

intervals and $\phi_{n,k}$ is the portion of $O_n$ in interval $k$. For example, if occupant intervals are defined (i.e., 0–10, 10–20, 20–50, 50–100, 100–200) and the given number of occupants $O_n$ = 120, the values of the five components are $\phi_{n,1}$ = 10, $\phi_{n,2}$ = 10, $\phi_{n,3}$ = 30, $\phi_{n,4}$ = 50, $\phi_{n,5}$ = 20. The whole occupancy-dependent portion $L^p_{n,occ}$ can be calculated by summing the products of occupancy components and coefficients of all intervals, shown in Eq. (8).

$$L^p_{n,occ} = \sum_{k=1}^{N_O} c_k \phi_{n,k} \tag{8}$$

The predictive baseline $L^p_n$ by Eq. (7) can be transformed to Eq. (9), where the coefficients $a_i$, $b_j$ and $c_k$ can be computed by regressing with observed data. Then the baseline of energy consumption can be predicted with the obtained coefficients.

$$L^p_n = L^p_{n,time} + L^p_{n,temp} + L^p_{n,occ} = \sum_{i=1}^{120} a_i s_{n,i} + \sum_{j=1}^{N_T} b_j h_{n,j} + \sum_{k=1}^{N_O} c_k \phi_{n,k} \tag{9}$$

In the model of this case study, $N_T$ is set to 2 with the intervals (0–45 °F, 45–100 °F), and $N_O$ is set to 4 with the intervals (0–10, 10–50, 50–100, 100–200).

2.3. Computing the accuracy of baseline models

The accuracy of baseline models can be quantified by the metric CVRMSE (coefficient of variation of the root mean square error) [14]. CVRMSE is the root mean square error divided by the mean of the data, which indicates the relative size of error. For example, a value of 0.1 means the difference between prediction and observed data is 10% of observed data. The equation for CVRMSE is provided in Eq. (10), where $L^{ob}_n$ and $L^p_n$ are the observed data and baseline prediction respectively, and $N$ is the size of the data set.

$$CVRMSE = \frac{\sqrt{\dfrac{\sum_{n=1}^{N} \left(L^{ob}_n - L^p_n\right)^2}{N}}}{\dfrac{\sum_{n=1}^{N} L^{ob}_n}{N}} \tag{10}$$
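As an illustration of Eq. (10), CVRMSE can be computed in a few lines. The sketch below uses NumPy; the function name `cvrmse` and the sample load values are hypothetical and are not taken from the study.

```python
import numpy as np

def cvrmse(observed, predicted):
    """Root mean square error divided by the mean of the observed data (Eq. 10)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return rmse / np.mean(observed)

# Hypothetical hourly loads in kWh, for illustration only.
observed = [100.0, 120.0, 80.0, 100.0]
predicted = [110.0, 110.0, 90.0, 90.0]
score = cvrmse(observed, predicted)
print(score)
```

In this made-up sample every prediction is off by 10 kWh and the mean observed load is 100 kWh, so the score is about 0.1, which corresponds to the "10% of observed data" reading given in the text.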
[Figure residue: training and comparison repeated over Steps 1 to M, yielding CVRMSE_1, CVRMSE_2, ..., CVRMSE_M; model inputs shown as time only, time and temperature, and time, temperature and occupancy.]
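The three models of Section 2.2 differ only in their inputs (time; time and temperature; time, temperature and occupancy), so they can be read as nested linear regressions. The following is a minimal sketch under stated assumptions, not the authors' implementation. It splits a value into interval components as in the worked examples of Section 2.2, stacks time-of-week indicators and those components into a design matrix, and fits the coefficients by ordinary least squares. The function names, the choice of `numpy.linalg.lstsq`, and the synthetic data are illustrative assumptions; only the interval bounds come from the case study.

```python
import numpy as np

def interval_components(values, bounds):
    """Portion of each value that falls inside each interval.

    `bounds` lists the interval edges, e.g. [0, 10, 20, 50, 100, 200].
    Values beyond the outer edges are clipped to them.
    """
    bounds = np.asarray(bounds, dtype=float)
    lower, upper = bounds[:-1], bounds[1:]
    values = np.asarray(values, dtype=float)[..., None]
    return np.clip(values, lower, upper) - lower

def design_matrix(time_index, temps, n_slots, temp_bounds, occs=None, occ_bounds=None):
    """Time-of-week indicators, temperature components and, optionally,
    occupancy components, stacked column-wise."""
    blocks = [np.eye(n_slots)[time_index],            # one active indicator per row
              interval_components(temps, temp_bounds)]
    if occs is not None:
        blocks.append(interval_components(occs, occ_bounds))
    return np.hstack(blocks)

def fit(X, y):
    """Ordinary least squares; returns the coefficient vector."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic data, for illustration only (not the Building 101 measurements).
rng = np.random.default_rng(0)
n, n_slots = 600, 120                                 # 120 weekday hours
time_index = np.arange(n) % n_slots
temps = rng.uniform(0, 100, n)
occs = rng.uniform(0, 200, n)
load = 50 + 0.5 * temps + 0.2 * occs + rng.normal(0, 1, n)

temp_bounds = [0, 45, 100]                            # N_T = 2, from the case study
occ_bounds = [0, 10, 50, 100, 200]                    # N_O = 4, from the case study

X2 = design_matrix(time_index, temps, n_slots, temp_bounds)
X3 = design_matrix(time_index, temps, n_slots, temp_bounds, occs, occ_bounds)
pred2 = X2 @ fit(X2, load)
pred3 = X3 @ fit(X3, load)
```

Calling `interval_components(120, [0, 10, 20, 50, 100, 200])` reproduces the components 10, 10, 30, 50, 20 from the occupancy example. Because the synthetic load is built to depend on occupancy, the fit with occupancy columns has a smaller in-sample error here; that is a property of the made-up data and says nothing about the case-study results.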
accuracy of Model 2 is improved, it should be caused by the incremental information of temperature.

Since Model 1 and 2 are linear models, the impact of temperature factor $Impact_{temp}$ can be defined as the accuracy improvement of Model 2 compared to Model 1, shown in Eq. (12).

$$Impact_{temp}(\%) = \frac{CVRMSE_{Model2} - CVRMSE_{Model1}}{CVRMSE_{Model2}} \times 100\% \tag{12}$$

Likewise, as shown in Fig. 4, the contribution of occupancy factor $Impact_{occ}$ can be defined as the accuracy improvement of Model 3 compared to Model 2, shown in Eq. (13), since the only difference between these two models is the variable of occupancy. If there are other impact factors involved in baseline models, the contribution of one factor can be defined with the same method: comparing the accuracy by controlling all other factors and changing the target factor.

$$Impact_{occ}(\%) = \frac{CVRMSE_{Model3} - CVRMSE_{Model2}}{CVRMSE_{Model3}} \times 100\% \tag{13}$$

3. Results

3.1. Data preparation

A case study was conducted to show how to quantify the availability of occupancy impact on the accuracy of baseline prediction
by the proposed method. Building 101 in the Navy Yard, Philadelphia, Pennsylvania, U.S. was used in this case study. The building is one of the nation's most highly instrumented office buildings and is the temporary headquarters of the U.S. Department of Energy's Energy Efficient Building Hub (EEB Hub) [35]. Various sensors have been installed by EEB Hub since 2012 to acquire building data of occupants, facilities, energy consumption and environment. The profile of Building 101 is shown in Table 2.

Table 2
The profile of Building 101.

Location            Philadelphia, US
Size                6410 m2
Floor               3 floors
Constructed year    1911
Building usage      Office

Four sensors are installed at the gates of the building to record the number of occupants entering and exiting. The sensors are located at the first floor in Building 101, shown in Fig. 5. This study uses the data from the year 2014 and the time step is five minutes. The data format of raw sensor records is shown in Table 3. The set ($N_{in1,n}$, $N_{in2,n}$, $N_{in3,n}$, $N_{in4,n}$) denotes the number of entering occupants, while the set ($N_{out1,n}$, $N_{out2,n}$, $N_{out3,n}$, $N_{out4,n}$) denotes the number of exiting occupants at the n-th time step. Therefore, the number of total occupants in the building can be calculated by Eq. (14).

$$N_O = \sum_{n} \left( N_{in1,n} - N_{out1,n} + N_{in2,n} - N_{out2,n} + N_{in3,n} - N_{out3,n} + N_{in4,n} - N_{out4,n} \right) \tag{14}$$

The electricity consumption data of the whole building and sub-systems (i.e., lighting, HVAC and plug load) was recorded by sub-meters in 15-min intervals. Based on this data set, the hourly, daily and monthly energy use of each system can be calculated. The outdoor air temperature was recorded every 15 min. Therefore, all three categories of data (occupancy, temperature and energy use) can be obtained by sensors and meters in Building 101. After harmonizing, rescaling, cleaning and formatting the raw data, it is ready for the further analysis.

3.2. Correlation between occupancy and energy consumption

The correlation between occupancy and energy consumption was investigated with three methods: time series, scatter plots and correlation coefficient tests. Fig. 6 illustrates the comparison of the hourly energy consumption and the occupant number. Similar to the ASHRAE 90.1 standard, the occupancy curve during 24 h represents the dual-peak feature, but the noon-drop is not as deep as that in the ASHRAE 90.1 standard. According to the feature, the occupancy curve can be divided into six periods [36]: (1) the night period (7 pm to 6 am); (2) the going-to-work period (7 am to 9 am); (3) the morning period (10 am to 12 pm); (4) the noon-break period (12 pm to 1 pm); (5) the afternoon period (2 pm to 3 pm); and (6) the going-home period (4 pm to 6 pm). According to the distribution of the boxplot, the higher uncertainties of the occupant number occurred during going-to-work and going-home periods.

The main feature of energy consumption is similar to that of the number of occupants (lowest at night, increasing in the morning and decreasing in the afternoon), but is not quite synchronized. The energy consumption curve can be divided into four periods: (1) the valley period (10 pm to 3 am); (2) the increasing period (4 am to 9 am); (3) the peak period (10 am to 5 pm); (4) the decreasing period (6 pm to 9 pm). The energy consumption rises about three hours earlier than occupants arrive, and falls around two hours later than occupants leave. It indicates that the operation schedule of building energy systems is around five hours longer than occupied time in this building. In addition, it needs to be noted that the energy consumption does not have the dual-peak feature. Namely, the energy consumption keeps the peak value during noon-break, which indicates the lights, HVAC and other plug load equipment are not turned off when occupants leave for lunch.

Fig. 7 shows the correlation between the number of occupants and energy consumption by scatter plots. The color bar indicates the time of the day. The color is closer to red when time is closer to noon, while the color is closer to blue when time is closer to midnight. The total load and occupant number present positive correlation. Although not very significant, the trend can be discovered: the more occupants there are, the higher the total load is. The lighting and plug load systems show more significant positive correlation between energy use and occupant number. Especially in the plug load system, the slope is high, which means a given change of occupant number will cause a relatively large change of energy use. Nevertheless, the occupant number does not show significant correlation with energy use in HVAC systems.

To compare with occupancy, the temperature was likewise analyzed to show the correlation with energy use. As shown in Fig. 8, during night (blue dots), the total load is not related to temperature. During daytime (yellow and red dots), there is significantly positive correlation when temperature is higher than 40 °F; otherwise, there is no significant correlation between them. The HVAC system is similar to the total load, but the correlation is more significant. There is no significant correlation in lighting and plug load systems.
[Fig. 5: Floor plan legend: sensors, unused doors, occasional door.]
Table 3
The data format of sensor records.
Fig. 6. Hourly energy consumption and occupant number in weekdays. Boxplots show median, quartiles, extreme values, means (blue circles) and outliers (+) of the data set.
(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
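The occupant count of Eq. (14) and the correlation coefficients discussed in Section 3.2 can be sketched as follows. This is an illustrative sketch only. The array layout (one row per time step, one column per door sensor), the reading of Eq. (14) as a running sum over time steps, the function names, and the sample numbers are all assumptions. The coefficient is a plain Pearson correlation computed with `numpy.corrcoef`, which may differ from the exact test used in the study.

```python
import numpy as np

def occupants_in_building(entries, exits):
    """Running number of occupants after each time step.

    `entries` and `exits` have shape (n_steps, n_sensors). The count at
    step n is the cumulative sum, over all sensors and all steps up to n,
    of entries minus exits.
    """
    net_flow = np.asarray(entries).sum(axis=1) - np.asarray(exits).sum(axis=1)
    return np.cumsum(net_flow)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical counts from four door sensors over four time steps.
entries = np.array([[3, 1, 0, 0], [5, 2, 1, 0], [0, 0, 0, 0], [0, 1, 0, 0]])
exits = np.array([[0, 0, 0, 0], [1, 0, 0, 0], [4, 2, 0, 0], [2, 1, 1, 0]])
occupancy = occupants_in_building(entries, exits)    # 4, 11, 5, 2

# Hypothetical loads in kWh, constructed to rise linearly with occupancy.
load = 12.0 + 2.0 * occupancy
r = pearson(occupancy, load)
```

With real counters, drift from missed entries or exits accumulates in a running sum, so some periodic reset (for example to zero overnight) is a common practical safeguard; whether the study applied one is not stated in the text.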
Since Building 101 uses gas for heating rather than electricity, the total energy use does not rise in lower temperature. However, the energy use in the plug load system rises slightly in lower temperature, probably due to personal electric heaters.

Besides the visualization of correlation by scatter plots, correlation analysis is adopted to calculate the correlation coefficients, shown in Table 4. In vertical comparison, the coefficient of occupant number (0.74) is 0.30 higher than that of temperature (0.44) in total energy use. This premium becomes greater in lighting and plug load systems, at 0.60 and 0.81 respectively. The coefficients in the HVAC system are approximately equal. Therefore, overall, the occupant number has much higher correlation with energy use than outdoor air temperature.

In horizontal comparison, the occupancy is more correlated to lighting and plug load systems, while the outdoor temperature is more correlated to the HVAC system. These results are consistent with common sense and previous studies [36,37], because the lighting and plug load are controlled by occupants, but the HVAC system mainly depends on the outdoor air temperature.

3.3. Accuracy of baseline models

The results in Section 3.2 have shown that the occupant number is highly correlated to energy consumption. The further question is whether the accuracy of baseline models can be improved by including the occupancy variable. To answer this question, Model 3, which uses the time, outdoor air temperature and occupancy variables, is implemented to compare with the previous methods. Since the volume of results is huge (a whole year in 1-h intervals), it is difficult to show all the results. Therefore, one week of results is shown in Fig. 9, with comparison of observed data and results of three models.

The accuracy of each model, measured by CVRMSE, is shown in Figs. 10–13. The lower value of CVRMSE indicates the higher accuracy.

Fig. 10 illustrates the accuracy of baseline models in total load prediction. The values of CVRMSE in Model 1 are around 0.25 during working time. The peak values are around 0.45, which are from 4 am to 6 am, and the valley values are around 0.1, which are at 8 pm. After including the outdoor temperature variable, the accuracy of Model 2 is improved significantly. The values of CVRMSE are mostly below 0.15, and higher CVRMSE values beyond 0.15 occur from 2 am to 6 am. The accuracy of Model 3 is slightly improved from Model 2, and the shape of the CVRMSE curve is very similar.

Fig. 11 illustrates the accuracy of baseline models in HVAC load prediction. It indicates that Model 1 is poor at HVAC load prediction. The CVRMSE values in Model 1 are mostly beyond 0.5,
Fig. 7. The correlation between the number of occupants and energy consumption.
[Scatter-plot panels, including Lighting and Plug Load: energy consumption (kWh) versus number of occupants (0 to 200), colored by time of day.]
[Fig. 8: Scatter-plot panels, including Lighting and Plug Load: energy consumption (kWh) versus temperature (°F, 0 to 100), colored by time of day.]
which means most prediction values deviate from observed value by more than 50%. The peak value is around 0.9 at 4 am. After including the temperature variable, the accuracy of Model 2 is improved significantly. The values of CVRMSE drop to below 0.2 during daytime (6 am to 6 pm), but the values of CVRMSE are still higher than 0.5 at night (from 7 pm to 3 am). By including the occupancy variable, the accuracy of Model 3 is not significantly improved in most time, except 7 pm to
12 am. The CVRMSE curves of Models 2 and 3 have very similar shapes. The big differences in the accuracy of HVAC load prediction are probably caused by the operation schedule, which is related to neither occupancy nor outdoor temperature in this building. This will be discussed in detail in Section 4.

Fig. 12 illustrates the accuracy of baseline models in lighting load prediction. Model 1 performs well at lighting load prediction. The CVRMSE values of Model 1 are mostly below 0.1, but they rise sharply at 6 am and 9–10 pm. After including the outdoor temperature variable, the accuracy of Model 2 is improved significantly at 6 am and 9–10 pm, where the CVRMSE values drop to around 0.15. By including the occupancy variable, the accuracy of Model 3 is slightly improved in daytime (from 8 am to 6 pm), but not improved in other hours. The CVRMSE curves of Models 2 and 3 have very similar shapes.

Fig. 13 illustrates the accuracy of baseline models in plug load prediction. Model 1 performs well at plug load prediction. The CVRMSE values of Model 1 are mostly below 0.1. The two peak values of CVRMSE occur at 8 am and 6 pm. After including the outdoor temperature variable, the accuracy of Model 2 is improved, with the CVRMSE values dropping to below 0.09. By including the occupancy variable, the accuracy of Model 3 is significantly improved, especially during working time (6 am to 7 pm). All the CVRMSE values of Model 3 drop to below 0.08. Unlike the other two systems, the shape of the CVRMSE curve of Model 3 for plug load is not similar to that of Model 2.

Table 4
The correlation coefficients between occupancy/temperature and energy consumption.

                          Total electric load   HVAC    Lighting   Plug load
Number of occupants       0.74*                 0.54*   0.73*      0.86*
Outdoor air temperature   0.44*                 0.58*   0.13*      0.05*

* p-value < 0.001.

3.4. Contribution of the occupancy factor

The results in Section 3.3 confirmed the hypothesis that the occupancy data can improve baseline prediction. The next step is to quantify the contribution of the occupancy variable, and clarify whether this contribution is higher or lower than that of other factors. The result can help determine the dominant factors in baseline models. In this step, the contributions of occupancy and temperature are calculated and compared using the method introduced in Section 2.4.

Fig. 14 illustrates the contribution of occupancy data to the accuracy of baseline prediction. The results show that occupancy data improves lighting and plug load prediction most significantly, especially during working time (8 am to 6 pm). The improvement in HVAC load prediction, however, is not significant (lower than 10%). Overall, the occupancy data improves the total energy prediction by around 10% during daytime (6 am to 6 pm), with less improvement at other times.

The statistical results of the contributions of occupancy (Impact_occ) and outdoor temperature (Impact_temp) are shown in Table 5. According to the results, the outdoor temperature variable mainly contributes to HVAC and lighting load prediction, while the occupancy variable mainly contributes to lighting and plug load prediction. The mean contribution of the occupancy variable to total energy prediction is 10%, which is much lower than the mean contribution of the outdoor temperature variable (63%). Occupancy has a higher correlation with energy use but a lower contribution to energy prediction, which seems inconsistent. The reasons for this will be discussed in Section 4.

3.5. Sensitivity analysis

There are three critical parameters influencing the prediction results in baseline models. The first is the length of the training data period. The baseline models use previously observed data to train and fit models. The length of the training period will impact the
training effect and further impact the accuracy of baseline prediction. The length should be neither too short nor too long [8]. If the training period is too short, it cannot provide enough information to fit the model. If the training period is too long, it may include information that is useless or harmful to the model. Since the building performance and occupant activities change over time, data from the distant past do not help predict the building performance in the future. For example, over a period of years, the base load of the building is likely to change, so using data from years ago introduces considerable bias. The number of occupants and their energy use behaviors can likewise change over a long time, so the historical data can no longer reflect the current building performance.

The other two critical parameters are the piecewise numbers (numbers of segments) of occupancy and of outdoor temperature in the regressions. As mentioned in Section 2.2, Model 3 uses piecewise-continuous regressions on the occupancy and temperature variables. The piecewise number will impact model fitting. Fewer segments will sacrifice accuracy of the model, while too many segments can cause over-fitting and high computing cost. Therefore, how to define these segments appropriately is an important issue in the baseline model.

Sensitivity analysis is applied to evaluate the influence of these parameters on baseline models. Fig. 15 shows the accuracy of baseline models for different training periods. The CVRMSE of Model 1 first increases with the training period, reaches its peak value at five months, and then decreases. This can be explained as follows: when the training period is five months, the model uses training data from winter to predict the building performance in summer. As Model 1 does not include the outdoor temperature variable, its prediction is less accurate in that case. This supports the aforementioned hypothesis that a longer training period may be harmful to accuracy. The CVRMSE values of Models 2 and 3 fluctuate for short training periods and converge after three months. Their curves almost coincide after three months, with Model 3 slightly below Model 2. For short training periods (one to three months), Model 3 shows faster convergence and a narrower range of fluctuation. This can bring not only technical but also economic benefits, since the time for data gathering can significantly impact the costs, investment return and payback period [8].

Fig. 9. The observed load and predicted load by three models (from 11th to 15th August 2014). (Panels include HVAC load, lighting load and plug load in kW, Monday to Friday.)

Fig. 10. The accuracy of baseline models for total electric load prediction. (CVRMSE versus time of day for Models 1, 2 and 3.)

Fig. 11. The accuracy of baseline models for HVAC load prediction. (CVRMSE versus time of day for Models 1, 2 and 3.)

Fig. 12. The accuracy of baseline models for lighting load prediction. (CVRMSE versus time of day for Models 1, 2 and 3.)

Fig. 13. The accuracy of baseline models for plug load prediction. (CVRMSE versus time of day for Models 1, 2 and 3.)

Fig. 14. The contribution of occupancy data on the accuracy of baseline prediction. (Series: plug load, lighting, HVAC and total, versus time of day.)

Table 5
The statistical profile of the contributions by occupancy and temperature factors.

             Impact_occ (%)            Impact_temp (%)
             Max    Mean   Median      Max    Mean   Median
HVAC         9      5      5           72     46     64
Lighting     21     12     7           38     13     10
Plug load    27     15     9           20     10     9
Total        18     10     8           66     44     57
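
The hourly CVRMSE curves (Figs. 10–13) and the contribution percentages (Table 5) can be illustrated with a minimal sketch. It assumes the common definition of CVRMSE (RMSE divided by the mean observed load) and, as a stand-in for the contribution metric of Section 2.4, the percentage reduction in CVRMSE when a variable is added. Both definitions, the function names and the sample numbers are assumptions for illustration and may differ from the formulas used in this study.

```python
import numpy as np

def cvrmse(observed, predicted):
    """CVRMSE under the common definition: RMSE divided by the mean observed value."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return rmse / np.mean(observed)

def hourly_cvrmse(hours, observed, predicted):
    """CVRMSE computed separately for each hour of day that appears in `hours`."""
    hours = np.asarray(hours)
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return {
        int(h): cvrmse(observed[hours == h], predicted[hours == h])
        for h in np.unique(hours)
    }

def relative_improvement(cv_without, cv_with):
    """ASSUMED contribution measure: percentage reduction in CVRMSE
    after adding a variable (not necessarily the metric of Section 2.4)."""
    return 100.0 * (cv_without - cv_with) / cv_without

# Hypothetical loads (kW) at two hours of day, for illustration only.
hours = [8, 8, 9, 9]
observed = [100.0, 100.0, 200.0, 200.0]
predicted = [90.0, 110.0, 200.0, 200.0]
per_hour = hourly_cvrmse(hours, observed, predicted)
```

Computing the metric per hour rather than over the whole series is what allows the comparison between working hours and night hours made in the text.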
Fig. 15. Accuracy of baseline prediction under different training periods. (CVRMSE versus months of training for Models 1, 2 and 3.)

Fig. 16. Accuracy of baseline prediction in Model 3 under different piecewise number of occupancy and temperature data. (CVRMSE versus piecewise number in regression, for occupancy and temperature.)
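
The sensitivity to the piecewise number shown in Fig. 16 can be explored with a sketch like the following. It is not the LBNL model or Model 3 itself: it fits a continuous piecewise-linear regression on a single variable using a hinge basis with equally spaced knots, then sweeps the number of segments on synthetic data. The knot placement, the synthetic load curve, the train/test split and all function names are assumptions made for illustration.

```python
import numpy as np

def hinge_features(x, n_segments, lo, hi):
    """Continuous piecewise-linear basis: intercept, x, and one hinge per interior knot."""
    x = np.asarray(x, dtype=float)
    knots = np.linspace(lo, hi, n_segments + 1)[1:-1]  # equally spaced interior knots
    cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

def fit_predict(x_train, y_train, x_test, n_segments):
    """Least-squares fit on the training data, then prediction on the test data."""
    lo, hi = float(np.min(x_train)), float(np.max(x_train))
    design = hinge_features(x_train, n_segments, lo, hi)
    coef, *_ = np.linalg.lstsq(design, np.asarray(y_train, dtype=float), rcond=None)
    return hinge_features(x_test, n_segments, lo, hi) @ coef

def cvrmse(observed, predicted):
    """RMSE divided by the mean observed value (common definition, assumed here)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean((observed - predicted) ** 2)) / np.mean(observed)

# Synthetic example: load rises with temperature above a hypothetical 70 F change point.
rng = np.random.default_rng(0)
temp = rng.uniform(50.0, 100.0, 400)
load = 100.0 + 3.0 * np.maximum(temp - 70.0, 0.0) + rng.normal(0.0, 2.0, 400)
train, test = slice(0, 300), slice(300, 400)

# Out-of-sample CVRMSE for 1 to 7 segments, mirroring the x-axis of Fig. 16.
scores = {
    k: cvrmse(load[test], fit_predict(temp[train], load[train], temp[test], k))
    for k in range(1, 8)
}
```

Evaluating on held-out data is a deliberate choice here: in-sample error can only decrease as segments are added, so over-fitting from too many segments becomes visible only out of sample.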
The results of this study can be applied in energy efficiency retrofit projects. Before retrofit, they can offer suggestions for data collection, decision making and risk assessment. For example, if the projects are mainly for HVAC, the occupancy factor can be ignored. However, if the projects are mainly for plug load, it is necessary to collect occupancy data before retrofit since it influences the energy baseline significantly. For risk assessment, the results of this study can also indicate the uncertainty of the energy baseline model caused by occupancy. If the uncertainty is relatively high, the investment strategy may be changed. After retrofit, the results of this study can improve the energy saving assessment by including the occupancy factor. This is critical for ESCOs, since their profits mainly depend on the calculated energy savings.

5. Conclusions

various methods for energy prediction (e.g., change-point regression, ANN, SVR, etc.). The LBNL model serves only as an example energy prediction method in this study, used to quantify the influence of occupancy on energy prediction. Based on the results of this study, occupancy data can be included in more methods to investigate the influence of occupancy in further study.

Further research on occupancy in baseline prediction can focus on: (1) using larger data sets for potentially better results; (2) applying more methods and improving the algorithm of the baseline model, considering the tradeoffs among result accuracy, algorithm complexity and length of training period; (3) comparing occupancy influences among different types of buildings and developing benchmarks of M&V for energy efficiency retrofit.

Acknowledgements
[19] Lee W-S. Benchmarking the energy efficiency of government buildings with data envelopment analysis. Energy Build 2008;40(12):891–5.
[20] Chung W, Hui YV. A study of energy efficiency of private office buildings in Hong Kong. Energy Build 2009;41:696–701.
[21] Chung W, Hui Y, Lam YM. Benchmarking the energy efficiency of commercial buildings. Appl Energy 2006;83:1–14.
[22] Sabapathy A, Ragavan SK, Vijendra M, Nataraja AG. Energy efficiency benchmarks and the performance of LEED rated buildings for Information Technology facilities in Bangalore, India. Energy Build 2010;42:2206–12.
[23] Martani C, Lee D, Robinson P, Britter R, Ratti C. ENERNET: studying the dynamic relationship between building occupancy and energy consumption. Energy Build 2012;47(4):584–91.
[24] Li N, Calis G, Becerik-Gerber B. Measuring and monitoring occupancy with an RFID based system for demand-driven HVAC operations. Automat Constr 2012;24:89–99.
[25] Pisello AL, Asdrubali F. Human-based energy retrofits in residential buildings: a cost-effective alternative to traditional physical strategies. Appl Energy 2014;133:224–35.
[26] Oldewurtel F, Sturzenegger D, Morari M. Importance of occupancy information for building climate control. Appl Energy 2013;101(1):521–32.
[27] Feng X, Yan D, Hong T. Simulation of occupancy in buildings. Energy Build 2015;87:348–59.
[28] Hong T, Taylor-Lange SC, D’Oca S, Yan D, Corgnati SP. Advances in research and applications of energy-related occupant behavior in buildings. Energy Build 2016.
[29] Yan D, O’Brien W, Hong T, Feng X, Burak Gunay H, Tahmasebi F, et al. Occupant behavior modeling for building performance simulation: current state and future challenges. Energy Build 2015;107:264–78.
[30] Rey E. Office building retrofitting strategies: multicriteria approach of an architectural and technical issue. Energy Build 2004;36(April):367–72.
[31] Gucyeter B, Gunaydin HM. Optimization of an envelope retrofit strategy for an existing office building. Energy Build 2012;55(December):647–59.
[32] Miller E, Buys L. Retrofitting commercial office buildings for sustainability: tenants’ perspectives. J Prop Invest Finance 2008;26:552–61.
[33] Miller N, Spivey J, Florance A. Does green pay off? J Real Estate Portfolio Manage 2008;14:385–400.
[34] Wiley JA, Benefield JD, Johnson KH. Green design and the market for commercial office space. J Real Estate Finance Econ 2010;41(August):228–43.
[35] EEBHUB. Energy efficient buildings hub; December 17. Available: <http://www.buildsci.us/eeb-hub.html>.
[36] Zhou X, Yan D, Hong T, Ren X. Data analysis and stochastic modeling of lighting energy use in large office buildings in China. Energy Build 2015;86(1):275–87.
[37] Sun K, Yan D, Hong T, Guo S. Stochastic modeling of overtime occupancy and its application in building energy simulation and calibration. Build Environ 2014;79(9):1–12.