Applied Energy: Xin Liang, Tianzhen Hong, Geoffrey Qiping Shen


Applied Energy 179 (2016) 247–260

Contents lists available at ScienceDirect

Applied Energy
journal homepage: www.elsevier.com/locate/apenergy

Improving the accuracy of energy baseline models for commercial buildings with occupancy data
Xin Liang a,b, Tianzhen Hong b,⇑, Geoffrey Qiping Shen c
a School of International and Public Affairs, Shanghai Jiao Tong University, Shanghai, China
b Building Technology and Urban Systems Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
c Department of Building and Real Estate, Hong Kong Polytechnic University, Hong Kong, China

Highlights

• We evaluated several baseline models predicting energy use in buildings.
• Including occupancy data improved accuracy of baseline model prediction.
• Occupancy is highly correlated with energy use in buildings.
• This simple approach can be used in decision making for energy retrofit projects.

Article info

Article history:
Received 20 January 2016
Received in revised form 17 May 2016
Accepted 29 June 2016
Available online 7 July 2016

Keywords:
Baseline model
Occupancy
Building energy use
Measurement and verification
Energy efficiency retrofit

Abstract

More than 80% of energy is consumed during operation phase of a building's life cycle, so energy efficiency retrofit for existing buildings is considered a promising way to reduce energy use in buildings. The investment strategies of retrofit depend on the ability to quantify energy savings by "measurement and verification" (M&V), which compares actual energy consumption to how much energy would have been used without retrofit (called the "baseline" of energy use). Although numerous models exist for predicting baseline of energy use, a critical limitation is that occupancy has not been included as a variable. However, occupancy rate is essential for energy consumption and was emphasized by previous studies. This study develops a new baseline model which is built upon the Lawrence Berkeley National Laboratory (LBNL) model but includes the use of building occupancy data. The study also proposes metrics to quantify the accuracy of prediction and the impacts of variables. However, the results show that including occupancy data does not significantly improve the accuracy of the baseline model, especially for HVAC load. The reasons are discussed further. In addition, sensitivity analysis is conducted to show the influence of parameters in baseline models. The results from this study can help us understand the influence of occupancy on energy use, improve energy baseline prediction by including the occupancy factor, reduce risks of M&V and facilitate investment strategies of energy efficiency retrofit.
© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

The buildings sector consumes 40% of the total primary energy in the United States [1], and the consumption has continued to increase, particularly in developing countries [2]. The buildings sector is thus responsible for a quarter of the total global greenhouse gas (GHG) emissions [3], and this proportion can reach around 50% in the United States (U.S.) with adverse impact on global environment, healthcare, and economy [4]. Furthermore, in the life cycle of a building, more than 80% of energy consumption occurs during the actual operation stage, rather than the construction stage [5]. Therefore, improving the energy efficiency of existing buildings is a key issue for reducing the total energy consumption and GHG emissions.

Energy efficiency retrofit for existing buildings is considered a promising method to achieve the target of energy savings [6]. Numerous previous studies indicated energy retrofit can significantly benefit the environment, society, and economy by improving energy efficiency [6,7], reducing emissions [8,9], controlling resource usage [10], enhancing the reputation of building owners [11], improving the health and productivity of occupants [12,13], reducing operation costs [14], increasing rent and occupancy rates [15,16], and creating job opportunities [17].

⇑ Corresponding author.
E-mail addresses: liangxinpku@gmail.com (X. Liang), thong@lbl.gov (T. Hong).

http://dx.doi.org/10.1016/j.apenergy.2016.06.141
0306-2619/© 2016 Elsevier Ltd. All rights reserved.
248 X. Liang et al. / Applied Energy 179 (2016) 247–260

Owing to the significant benefits on energy conservation and other aspects of society, energy efficiency retrofit has been emphasized around the world. For example, the U.S. government passed the Energy Policy Act (EPA) of 2005 and Executive Order (EO) 13423, which require that 15% of the total number of existing buildings should be retrofitted to improve energy efficiency by 2015 compared with the 2003 baseline. Approximately 30 billion US dollars are assigned to conduct energy efficiency retrofit of existing buildings and facilities [7]. Incentivized by the policies, the market to provide energy efficiency services through energy service companies (ESCOs) has been blooming in the last decade [8].

Energy performance contracting (EPC), which is a financing package provided by ESCOs, is a commonly used market mechanism to implement energy efficiency retrofit. EPC includes energy savings guarantees and associated design, implementation and operation services [2,9]. The profit (or the payment to ESCOs) of an EPC is mainly from the amount of energy cost savings after retrofit. The energy savings can be defined as the difference between how much energy the building consumed after retrofit, and how much it would have consumed without the retrofit. The former can be obtained from utility meters, and the latter, which is referred to as the energy use "baseline", is not measurable but can only be obtained by prediction. The accuracy of the baseline prediction can significantly impact the energy saving assessment, investment return and payback period. Furthermore, it can likewise impact the investment strategies and development of the building retrofit market.

The whole process of predicting baseline and assessing energy saving is called "measurement and verification" (M&V) [10]. The mechanism of M&V approaches is first monitoring the energy use of buildings, then developing mathematical models trained by observed data, and finally predicting baseline of energy use based on the developed models. Xia and Zhang [10] present a mathematical description of the M&V problem and cast a scientific framework for the basic M&V concepts, propositions, techniques and methodologies. Mathieu et al. [11] proposed a regression-based model to predict baseline electricity consumption of commercial buildings and industrial facilities. Coughlin et al. [12] evaluated the performance of three average-based models for baseline. Granderson et al. [13] proposed an automated M&V method for evaluating model performance. Granderson and Price [14] summarized five baseline models, including both average-based models and regression-based models, and compared the predictive accuracy of these models with several metrics. More complex mathematical models of M&V have emerged, including multivariate regression models, exponential smoothing models, neural network models, and Fourier series models [15–18].

Uncertainty of M&V models is important, since not only the value of the baseline, but also the accuracy and reliability of the prediction are critical to energy efficiency retrofit. It provides the stakeholders (e.g., ESCOs, building owners, facility managers) the information of investment risk, which is critical in decision making. For example, if the post-retrofit energy use will be 30% lower than the baseline, but the uncertainty exceeds 30%, it is then very risky to invest in this retrofit project. Walter et al. [8] emphasized the influence of uncertainty and assessed uncertainty of M&V for 17 buildings by calculating the percent differences between predicted baseline and observed data. The results showed there was considerable uncertainty in baseline prediction: 5 out of 17 buildings had more than 20% uncertainty, and in an extreme case it was more than 60%.

The occupancy rate is a key uncertainty factor of M&V. Numerous previous studies indicated that the occupancy rate had significantly positive correlation with the energy use in buildings [19–26]. Occupants in buildings influence energy use in buildings in three major ways [27]: (1) sensible and latent heat gains from occupants, (2) occupants' need of thermal comfort, visual comfort and indoor air quality, and (3) occupant behavior and interactions with building systems and controls [28,29]. In addition, in commercial buildings, the occupancy rate may increase after energy retrofit, due to lower utility bills, better indoor environment and higher social reputation [30–32]. Miller et al. [33] indicated that office buildings with green features will have a 2–4% occupancy rate premium. Wiley et al. [34] specified that the occupancy rate of office buildings with LEED certification will increase by up to 16–18%. Therefore, if the occupancy rate is changed after energy retrofit, the baseline of energy use should be adjusted.

Although a number of previous studies emphasized the importance of occupancy factor in predicting the baseline, few studies, if any, used occupancy factor in baseline prediction models, probably due to the highly stochastic activities and data limitation. Therefore, several research questions related to M&V remain to be answered: Does occupancy rate significantly impact the accuracy of baseline prediction? If yes, how to quantitatively evaluate the impact on prediction accuracy? How is the influence of occupancy on baseline models compared to that of other impact factors (e.g., outdoor air temperature, day of week), stronger, weaker or equal? Is it feasible to improve prediction accuracy of energy baseline by using occupancy data? Nowadays, most commercial buildings have access control systems, which can obtain occupancy data in short time intervals. These data provide a new opportunity to deeply analyze the impact of occupancy on the accuracy of baseline prediction.

To address the aforementioned questions, this study proposes a novel method to quantitatively evaluate how accuracy of energy baseline models is improved by including the occupancy factor. Different from previous models, the proposed model of this study considers the occupancy data as independent variables rather than external uncertainty, shown in Fig. 1. Although influence of occupancy has been emphasized by numerous previous studies, traditional models have not included occupancy data in the functions of energy prediction. Therefore, in traditional models, occupancy is an external uncertain factor, which can negatively impact the accuracy of energy prediction. Contrarily, in this study, the occupancy data is considered as an independent variable so that the influence of occupancy can be fitted by the function and evaluated by the prediction results. The results of this study showed the accuracy of energy prediction is improved. From a theoretical perspective, including occupancy data can improve the prediction accuracy, since the uncertainty of occupancy factor can be controlled and reduced, and less uncertainty can improve the prediction accuracy.

The results of this study reveal the influence of occupancy on the accuracy of energy prediction. In addition, since the performance of models varies across hours and systems, the proposed method zooms into the hourly performance and different systems (i.e., HVAC, lighting, plug load and total load) of baseline models. Another important feature of this work is that it only uses simple algorithms, excluding complex mathematical processing, and the input data is available in most commercial buildings. That means the proposed method is relatively easy to implement, and can be readily adopted for practical projects. The results of this study can help us understand the quantitative influence of occupancy on energy use and energy baseline models.

2. Methodology

2.1. Framework of evaluating occupancy impact on baseline prediction

The methodology to evaluate occupancy influence on baseline prediction comprises four steps, illustrated in Fig. 2.

[Figure 1: Two block diagrams. Traditional models: influencing factors x1 (temperature), x2 (time), ..., xn feed y = f(x1, x2, ..., xn) to give the energy prediction y, while occupancy acts as an external uncertain factor. Proposed model: occupancy enters as influencing factor x3, so y = f(x1, x2, x3, ..., xn) gives the energy prediction y.]

Fig. 1. Comparison of traditional models and the proposed model.

Step 1: Problem Definition and Data Preparation. One aim of this step is to clarify problem definition, boundary, assumption and key metrics of success. The scope of this study focuses on the energy baseline prediction in office buildings. Since there are normally fewer occupants in office buildings on weekends, this study only focuses on the energy use on weekdays. The key metric, which is to assess different models and factors, is the similarity between prediction results and observed data.

The other aim of this step is to prepare data for the analysis in the next steps. It includes acquiring, harmonizing, rescaling, cleaning and formatting data. Due to the failure of sensors, stochastic noise and other interference factors, the raw data set may contain missing data, error data and unstructured data. Before data mining, the raw data should be pre-processed to provide the valid data for further analysis. In this study, three types of data are required (i.e., outdoor air temperature, energy use and occupancy count data). Outdoor air temperature can be obtained from sensors outside buildings or database of weather stations. Energy use data can be obtained from electricity meters in buildings. Occupant number can be obtained from the records of access control system. All these data are recorded with timestamps of short-time intervals, typically at 5–15 min. Using the short-time "interval data" can significantly reduce the duration of data required in baseline models [8].

Step 2: Correlation analysis. This step is to verify the correlation between occupancy rate and total energy consumption of buildings. The total energy consumption can be divided into several sub-systems (e.g., HVAC, lighting system and plug load) by using sub-meters. Then, the more occupancy-dependent sub-systems, which have higher correlation with occupancy rate, can be revealed. Scatter plots are applied to visualize correlations qualitatively, while statistical analysis is applied to calculate correlations quantitatively. Correlation coefficients and significance levels are the main criteria of the correlation test. If occupancy rate and energy use are significantly correlated, the next step will be executed.

Step 3: Evaluation and comparison of accuracy of baseline models. This step is to quantitatively evaluate the influence of occupancy on the accuracy of baseline models. First, three baseline prediction models are implemented to predict baseline of energy use based on the observed data. Two models, which do not include occupancy factor, are adopted from previous studies. The other one, using occupancy data, is the proposed method in this study. The algorithms of the three models will be illustrated in detail in Section 2.2. Then, the prediction results are compared across the three models. The method and metrics of the evaluation will be introduced in detail in Section 2.3. The results can show whether the occupancy data improves the accuracy of baseline prediction. If the prediction accuracy is significantly improved by occupancy data, the next step will be executed.

Step 4: Prioritization of impact factors. Based on horizontal comparison across models in the last step, this step further evaluates influence of occupancy factor by vertical comparison across factors. Besides the number of occupants, there are various uncertain factors which can impact energy consumption of buildings (e.g., outdoor air temperature, facility degradation and climate change). It is important to understand not only the influence of occupancy factor, but also its priority compared to other impact factors. Namely, this step is to identify which factor is more critical to the accuracy of baseline prediction. The results can provide guidance for selecting factors in baseline models. The method and metrics of the factor comparison will be introduced in detail in Section 2.4.

[Figure 2: Flowchart listing the processes and results of each step. (1) Problem definition and data preparation: problem definition, data collection, data cleaning and preparation; results: problem boundary, metrics, prepared data. (2) Correlation analysis: evaluate correlation of occupant number and energy use using scatter plots, statistical analysis, etc.; result: correlation between occupant number and energy use of each sub-system; if not correlated, end. (3) Evaluation of occupancy influence on baseline prediction: compare prediction accuracy of baseline models with and without occupancy data; result: whether the occupancy data improves prediction accuracy and, if yes, the quantitative impact; if accuracy is not improved, end. (4) Prioritization of impact factors: compare occupancy factor with other impact factors; result: the contribution of occupancy factor compared to other factors. Finally, end the process and apply results in practical projects.]

Fig. 2. Framework of evaluating occupancy impact on baseline prediction.

2.2. Baseline prediction models

Three baseline models are implemented to demonstrate the methodology. The first model is a simplistic "naive" model, which only depends on one variable, the time of week. It serves as a comparative "floor" of performance [14]. The second model was developed by researchers at LBNL (Lawrence Berkeley National Laboratory), which includes two variables, outdoor air temperature and the time of week [8]. The third one is based on the LBNL model but includes the occupancy variable. The variables included in each model are illustrated in Table 1.

Table 1
The variables included in each baseline model.

Model                    | Time of week | Outdoor air temperature | Occupancy number
Model 1 (the MW model)   | ✓            |                         |
Model 2 (the LBNL model) | ✓            | ✓                       |
Model 3 (the new model)  | ✓            | ✓                       | ✓

2.2.1. Model 1: the MW (mean-week) model

This model only depends on one variable, the time of week. Consider $N$ observed data points, where data point $n$ is from time $t_n$ including the observed load data $L$, $n = 1, \ldots, N$. The method is presented by Eq. (1), where $L^p_n$ denotes the prediction of baseline at point $n$, and time denotes the time of the week (e.g., 10 am on Monday). For example, the prediction of 10 am on a Monday is the average of historical data for 10 am on all Mondays.

$$L^p_n = L^p_{n,\mathrm{time}} = \mathrm{Mean}_{\mathrm{time}}(L) \qquad (1)$$

2.2.2. Model 2: the LBNL model

This is a regression model including the variables of outdoor air temperature and the time of week. The temperature is considered as an important factor of energy use in buildings. The correlation between temperature and energy use is non-linear. In occupied mode, the temperature and energy consumption are normally positively correlated at higher temperature (due to cooling), negatively correlated at lower temperature (due to heating), and relatively uncorrelated at moderate temperature (due to no cooling or heating). Therefore, piecewise-continuous regressions of the temperature variable are used in the LBNL model. There are two parts of energy use in this model: one is the time-dependent portion $L^p_{n,\mathrm{time}}$ and the other is the temperature-dependent portion $L^p_{n,\mathrm{temp}}$. $L^p_n$, the predictive baseline of total energy use at point $n$, is the sum of these two portions, shown in Eq. (2).

$$L^p_n = L^p_{n,\mathrm{time}} + L^p_{n,\mathrm{temp}} \qquad (2)$$

The time-dependent portion $L^p_{n,\mathrm{time}}$ mainly represents the different features of energy consumption among different times. For example, the load is normally lower at night than at working time. $L^p_{n,\mathrm{time}}$ is modeled by dividing a week into 120 one-hour time slots (24 h multiplied by 5 weekdays). An indicator $s_{n,i}$ and a coefficient $\alpha_i$ are assigned to each time slot $S_i$, for $i = 1, \ldots, 120$. The whole time-dependent portion $L^p_{n,\mathrm{time}}$ can be calculated by summing the products of indicators and coefficients of all time slots, shown in Eq. (3).
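The time-of-week bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`time_slot`, `fit_mean_week`, `predict_mean_week`) and the sample loads are hypothetical, and slots are numbered 0 to 119 here rather than 1 to 120. It maps a timestamp to one of the 120 weekday slots and applies the mean-week rule of Eq. (1).

```python
from collections import defaultdict
from datetime import datetime


def time_slot(ts):
    """Return the weekday slot index 0..119 (Monday 00:00 is 0), or None on weekends."""
    if ts.weekday() >= 5:  # Saturday and Sunday are excluded, as in the study
        return None
    return ts.weekday() * 24 + ts.hour


def fit_mean_week(timestamps, loads):
    """Average the observed load within each time-of-week slot (Eq. (1))."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, load in zip(timestamps, loads):
        slot = time_slot(ts)
        if slot is None:
            continue
        sums[slot] += load
        counts[slot] += 1
    return {slot: sums[slot] / counts[slot] for slot in sums}


def predict_mean_week(slot_means, ts):
    """Predict the baseline at ts as the historical mean of its slot."""
    return slot_means[time_slot(ts)]


# Hypothetical observations: three Mondays at 10 am and one Saturday reading.
history_times = [
    datetime(2014, 1, 6, 10),   # Monday
    datetime(2014, 1, 13, 10),  # Monday
    datetime(2014, 1, 20, 10),  # Monday
    datetime(2014, 1, 11, 10),  # Saturday, ignored by the model
]
history_loads = [100.0, 110.0, 120.0, 40.0]

slot_means = fit_mean_week(history_times, history_loads)
prediction = predict_mean_week(slot_means, datetime(2014, 1, 27, 10))
print(prediction)
```

The same slot index can serve as the position of the single non-zero indicator for a given timestamp when the time-dependent portion of the regression models is assembled.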
$$L^p_{n,\mathrm{time}} = \sum_{i=1}^{120} s_{n,i}\,\alpha_i \qquad (3)$$

The indicator $s_{n,i}$, which is defined in Eq. (4), serves to select which coefficient is active. For a given point $t_n$, only one indicator is one, while the other 119 indicators are zero. When $s_{n,i} = 0$, the coefficient $\alpha_i$ has no effect.

$$s_{n,i} = \begin{cases} 1 & \text{if } t_n \in S_i \\ 0 & \text{if } t_n \notin S_i \end{cases} \qquad (4)$$

The temperature-dependent portion $L^p_{n,\mathrm{temp}}$ mainly represents the different features of energy consumption among different temperatures, which is probably most related to the behaviors of the heating and cooling systems. As aforementioned, $L^p_{n,\mathrm{temp}}$ is modeled by a piecewise-linear and continuous function. The temperature range is divided into a number of intervals for this piecewise-linear function, and a temperature component $\theta_{n,j}$ and a coefficient $\beta_j$ are assigned to each interval. The temperature $T_n$ is the sum $T_n = \sum_{j=1}^{N_T} \theta_{n,j}$, where $N_T$ is the number of temperature intervals and $\theta_{n,j}$ is the portion of $T_n$ in interval $j$. For example, in the case study of [8], four intervals are defined (i.e., 20–40 °F, 40–60 °F, 60–80 °F, 80–100 °F). If the given temperature $T_n$ = 70 °F, the values of the four components are $\theta_{n,1}$ = 20 °F, $\theta_{n,2}$ = 20 °F, $\theta_{n,3}$ = 10 °F, $\theta_{n,4}$ = 0 °F (these sum to 50 °F, the amount by which $T_n$ exceeds the 20 °F lower bound of the first interval). The whole temperature-dependent portion $L^p_{n,\mathrm{temp}}$ can be calculated by summing the products of temperature components and coefficients of all intervals, shown in Eq. (5).

$$L^p_{n,\mathrm{temp}} = \sum_{j=1}^{N_T} \beta_j\,\theta_{n,j} \qquad (5)$$

The predictive baseline $L^p_n$ by Eq. (2) can be transformed to Eq. (6), where the coefficients $\alpha_i$ and $\beta_j$ can be computed by training with observed data.

$$L^p_n = L^p_{n,\mathrm{time}} + L^p_{n,\mathrm{temp}} = \sum_{i=1}^{120} \alpha_i\,s_{n,i} + \sum_{j=1}^{N_T} \beta_j\,\theta_{n,j} \qquad (6)$$

2.2.3. Model 3: the new model

This new model is developed from the LBNL model by including the occupancy variable. Besides the outdoor air temperature and the time variables, the occupancy variable is added in this model. The predictive baseline $L^p_n$ comprises three portions (i.e., the time-dependent portion $L^p_{n,\mathrm{time}}$, the temperature-dependent portion $L^p_{n,\mathrm{temp}}$ and the occupancy-dependent portion $L^p_{n,\mathrm{occ}}$). It is described in Eq. (7), where the methods for computing $L^p_{n,\mathrm{time}}$ and $L^p_{n,\mathrm{temp}}$ are the same as in the LBNL model.

$$L^p_n = L^p_{n,\mathrm{time}} + L^p_{n,\mathrm{temp}} + L^p_{n,\mathrm{occ}} \qquad (7)$$

The occupancy-dependent portion $L^p_{n,\mathrm{occ}}$ mainly represents the different features of energy consumption among different occupant numbers, which is probably most related to the occupant behaviors (e.g., turning on lights when arriving). Similar to $L^p_{n,\mathrm{temp}}$, $L^p_{n,\mathrm{occ}}$ can be modeled by a piecewise-linear and continuous function, since the dependence of load on occupant number is not a linear function either. The occupant number and energy consumption are normally positively correlated when buildings are moderately occupied, but are relatively uncorrelated when buildings are heavily occupied, since no more appliances can be turned on.

The occupancy range is divided into a number of intervals for this piecewise-linear function, and an occupancy component $\phi_{n,k}$ and a coefficient $\gamma_k$ are assigned to each interval. The occupant number $O_n$ is the sum $O_n = \sum_{k=1}^{N_O} \phi_{n,k}$, where $N_O$ is the number of occupancy intervals and $\phi_{n,k}$ is the portion of $O_n$ in interval $k$. For example, if five occupant intervals are defined (i.e., 0–10, 10–20, 20–50, 50–100, 100–200) and the given number of occupants $O_n$ = 120, the values of the five components are $\phi_{n,1}$ = 10, $\phi_{n,2}$ = 10, $\phi_{n,3}$ = 30, $\phi_{n,4}$ = 50, $\phi_{n,5}$ = 20. The whole occupancy-dependent portion $L^p_{n,\mathrm{occ}}$ can be calculated by summing the products of occupancy components and coefficients of all intervals, shown in Eq. (8).

$$L^p_{n,\mathrm{occ}} = \sum_{k=1}^{N_O} \gamma_k\,\phi_{n,k} \qquad (8)$$

The predictive baseline $L^p_n$ by Eq. (7) can be transformed to Eq. (9), where the coefficients $\alpha_i$, $\beta_j$ and $\gamma_k$ can be computed by regressing with observed data. Then the baseline of energy consumption can be predicted with the obtained coefficients.

$$L^p_n = L^p_{n,\mathrm{time}} + L^p_{n,\mathrm{temp}} + L^p_{n,\mathrm{occ}} = \sum_{i=1}^{120} \alpha_i\,s_{n,i} + \sum_{j=1}^{N_T} \beta_j\,\theta_{n,j} + \sum_{k=1}^{N_O} \gamma_k\,\phi_{n,k} \qquad (9)$$

In the model of this case study, $N_T$ is set to 2 with the intervals (0–45 °F, 45–100 °F), and $N_O$ is set to 4 with the intervals (0–10, 10–50, 50–100, 100–200).

2.3. Computing the accuracy of baseline models

The accuracy of baseline models can be quantified by the metric CVRMSE (coefficient of variation of the root mean square error) [14]. CVRMSE is the root mean square error divided by the mean of the data, which indicates the relative size of error. For example, a value of 0.1 means the difference between prediction and observed data is 10% of observed data. The equation for CVRMSE is provided in Eq. (10), where $L^{ob}_n$ and $L^p_n$ are the observed data and baseline prediction respectively, and $N$ is the size of the data set.

$$\mathrm{CVRMSE} = \frac{\sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(L^{ob}_n - L^p_n\right)^2}}{\frac{1}{N}\sum_{n=1}^{N} L^{ob}_n} \qquad (10)$$

Cross-validation is applied to facilitate the quantification of the baseline accuracy. The observed data is partitioned into several subsets. Some of them are used for model fitting and training, and the others are used for validating. Then, the process is iterated by changing training set and validation set. In this study, the time interval to partition observed data is one month, since it is normally the utility bill period. First, the model is fitted by the data in one or several intervals (the training length can vary, and the sensitivity analysis of training length will be discussed in Section 3.5), and is validated by the data in the next interval. Then, the training set and validation set are shifted, and the process of training and validating is repeated until the end of data set. The schematic of the cross-validation processes is shown in Fig. 3.

[Figure 3: Schematic of rolling steps 1 to M. In each step, the model is fitted on a training set, the predictive baseline is compared with the validation set, and a CVRMSE value (CVRMSE_1, CVRMSE_2, ..., CVRMSE_M) is obtained.]

Fig. 3. Schematic of the cross-validation processes.

In each step, a CVRMSE value can be obtained, and the final indicator of prediction accuracy is the average of the CVRMSE values, $\overline{\mathrm{CVRMSE}}$. The equation for $\overline{\mathrm{CVRMSE}}$ is shown in Eq. (11), where $\mathrm{CVRMSE}_m$ is the CVRMSE in the $m$-th step, and $M$ is the number of steps.

$$\overline{\mathrm{CVRMSE}} = \frac{\sum_{m=1}^{M} \mathrm{CVRMSE}_m}{M} \qquad (11)$$

2.4. Calculating the influence of variables

Besides comparing the accuracy of models, understanding the impact of each influencing factor is critical in baseline prediction. As shown in Fig. 4, Model 1 includes the variable of time while Model 2 includes the variables of time and temperature. If the accuracy of Model 2 is improved, it should be caused by the incremental information of temperature.

[Figure 4: Model 1 (simple model; input: time), Model 2 (state of the art, the LBNL model; inputs: time and temperature) and Model 3 (the proposed model with occupancy data; inputs: time, temperature and occupancy) each output an accuracy of prediction. The accuracy improvement from Model 1 to Model 2 is attributed to temperature, and the improvement from Model 2 to Model 3 is attributed to occupancy.]

Fig. 4. The process of calculating the influence of variables.

Since Models 1 and 2 are linear models, the impact of the temperature factor $\mathrm{Impact}_{temp}$ can be defined as the accuracy improvement of Model 2 compared to Model 1, shown in Eq. (12).

$$\mathrm{Impact}_{temp}\,(\%) = \frac{\mathrm{CVRMSE}_{Model2} - \mathrm{CVRMSE}_{Model1}}{\mathrm{CVRMSE}_{Model2}} \times 100\% \qquad (12)$$

Likewise, as shown in Fig. 4, the contribution of the occupancy factor $\mathrm{Impact}_{occ}$ can be defined as the accuracy improvement of Model 3 compared to Model 2, shown in Eq. (13), since the only difference between these two models is the variable of occupancy. If there are other impact factors involved in baseline models, the contribution of one factor can be defined with the same method: comparing the accuracy by controlling all other factors and changing the target factor.

$$\mathrm{Impact}_{occ}\,(\%) = \frac{\mathrm{CVRMSE}_{Model3} - \mathrm{CVRMSE}_{Model2}}{\mathrm{CVRMSE}_{Model3}} \times 100\% \qquad (13)$$

3. Results

3.1. Data preparation

A case study was conducted to show how to quantify the availability of occupancy impact on the accuracy of baseline prediction
X. Liang et al. / Applied Energy 179 (2016) 247–260 253

Table 2 represents the dual-peak feature, but the noon-drop is not as deep
The profile of Building 101. as that in the ASHRAE 90.1 standard. According to the feature, the
Location Philadelphia, US occupancy curve can be divided into six periods [36]: (1) the night
Size 6410 m2 period (7 pm to 6 am); (2) the going-to-work period (7 am to
Floor 3 floors 9 am); (3) the morning period (10 am to 12 pm); (4) the noon-
Constructed year 1911
Building usage Office
break period (12 pm to 1 pm); (5) the afternoon period (2 pm to
3 pm); and (6) the going-home period (4 pm to 6 pm). According
to the distribution of the boxplot, the higher uncertainties of the
occupant number occurred during going-to-work and going-
by the proposed method. Building 101 in the Navy Yard, Philadel-
home periods.
phia, Pennsylvania U.S. was used in this case study. The building is
The main feature of energy consumption is similar to that of the
one of the nation’s most highly instrumented office buildings and
number of occupants (lowest at night, increasing in the morning
is the temporary headquarters of the U.S. Department of Energy’s
Energy Efficient Building Hub (EEB Hub) [35]. Various sensors have been installed by EEB Hub since 2012 to acquire building data on occupants, facilities, energy consumption and environment. The profile of Building 101 is shown in Table 2.

Four sensors are installed at the gates of the building to record the number of occupants entering and exiting. The sensors are located on the first floor of Building 101, as shown in Fig. 5. This study uses data from the year 2014 with a time step of five minutes. The data format of the raw sensor records is shown in Table 3. The set ($N_{in1,n}$, $N_{in2,n}$, $N_{in3,n}$, $N_{in4,n}$) denotes the number of entering occupants, while the set ($N_{out1,n}$, $N_{out2,n}$, $N_{out3,n}$, $N_{out4,n}$) denotes the number of exiting occupants at the n-th time step. Therefore, the total number of occupants in the building can be calculated by Eq. (14).

$$N_O = \sum_{n} \left( N_{in1,n} - N_{out1,n} + N_{in2,n} - N_{out2,n} + N_{in3,n} - N_{out3,n} + N_{in4,n} - N_{out4,n} \right) \qquad (14)$$

The electricity consumption data of the whole building and its sub-systems (i.e., lighting, HVAC and plug load) were recorded by sub-meters at 15-min intervals. Based on this data set, the hourly, daily and monthly energy use of each system can be calculated. The outdoor air temperature was recorded every 15 min. Therefore, all three categories of data (occupancy, temperature and energy use) can be obtained from sensors and meters in Building 101. After harmonizing, rescaling, cleaning and formatting, the raw data is ready for further analysis.

3.2. Correlation between occupancy and energy consumption

The correlation between occupancy and energy consumption was investigated with three methods: time series, scatter plots and correlation coefficient tests. Fig. 6 illustrates the comparison of the hourly energy consumption and the occupant number. Similar to the ASHRAE 90.1 standard, the occupancy curve during 24 h
and decreasing in the afternoon), but is not quite synchronized. The energy consumption curve can be divided into four periods: (1) the valley period (10 pm to 3 am); (2) the increasing period (4 am to 9 am); (3) the peak period (10 am to 5 pm); (4) the decreasing period (6 pm to 9 pm). Energy consumption rises about three hours before occupants arrive and falls around two hours after they leave. This indicates that the operation schedule of the building energy systems is around five hours longer than the occupied time in this building. In addition, the energy consumption does not show the dual-peak feature: it stays at the peak value during the noon break, which indicates that the lights, HVAC and other plug load equipment are not turned off when occupants leave for lunch.

Fig. 7 shows the correlation between the number of occupants and energy consumption by scatter plots. The color bar indicates the time of day: the color is closer to red when the time is closer to noon, and closer to blue when the time is closer to midnight. The total load and the occupant number are positively correlated. Although the correlation is not very strong, the trend is visible: the more occupants there are, the higher the total load is. The lighting and plug load systems show a stronger positive correlation between energy use and occupant number. Especially in the plug load system, the slope is high, which means a given change in occupant number causes a relatively large change in energy use. Nevertheless, the occupant number does not show significant correlation with energy use in the HVAC system.

For comparison with occupancy, the temperature was likewise analyzed to show its correlation with energy use. As shown in Fig. 8, during the night (blue dots), the total load is not related to temperature. During daytime (yellow and red dots), there is a significant positive correlation when the temperature is higher than 40 °F; otherwise, there is no significant correlation. The HVAC system is similar to the total load, but the correlation is more significant. There is no significant correlation in the lighting and plug load systems.

Fig. 5. Locations of occupancy sensors in Building 101. [Floor plan; legend: sensors, unused doors, occasional door.]



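Table 3
The data format of sensor records.

| Time step | Sensor 1 In | Sensor 1 Out | Sensor 2 In | Sensor 2 Out | Sensor 3 In | Sensor 3 Out | Sensor 4 In | Sensor 4 Out |
|---|---|---|---|---|---|---|---|---|
| 1/1/2014 0:00 | $N_{in1,1}$ | $N_{out1,1}$ | $N_{in2,1}$ | $N_{out2,1}$ | $N_{in3,1}$ | $N_{out3,1}$ | $N_{in4,1}$ | $N_{out4,1}$ |
| 1/1/2014 0:05 | ... | ... | ... | ... | ... | ... | ... | ... |
| 1/1/2014 0:10 | ... | ... | ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 12/31/2014 23:50 | ... | ... | ... | ... | ... | ... | ... | ... |
| 12/31/2014 23:55 | $N_{in1,n}$ | $N_{out1,n}$ | $N_{in2,n}$ | $N_{out2,n}$ | $N_{in3,n}$ | $N_{out3,n}$ | $N_{in4,n}$ | $N_{out4,n}$ |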
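Given records laid out as in Table 3, the running occupant count of Eq. (14) is a cumulative sum of net flows over all four sensors. The sketch below is illustrative only: the function name, the nested-list input layout and the clamping at zero are assumptions made for this example, not part of the study's data pipeline.

```python
def occupancy_series(records):
    """Running occupant count from per-step (in, out) counts of each sensor.

    `records` is a list of time steps; each step is a list of (n_in, n_out)
    pairs, one pair per door sensor, mirroring the columns of Table 3.
    """
    occupants = 0
    series = []
    for step in records:
        # Net flow at this time step, summed over all sensors (Eq. (14)).
        net_flow = sum(n_in - n_out for n_in, n_out in step)
        occupants += net_flow
        # Assumption for this sketch: clamp at zero, since missed counts
        # could otherwise drive the running total negative.
        occupants = max(occupants, 0)
        series.append(occupants)
    return series


# Three hypothetical five-minute steps with four sensors each.
example = [
    [(3, 0), (2, 0), (0, 0), (1, 0)],
    [(1, 2), (0, 0), (0, 1), (0, 0)],
    [(0, 1), (0, 2), (0, 0), (0, 1)],
]
counts = occupancy_series(example)
```

With five-minute records, the hourly occupant number used in the later analysis could be obtained by averaging or sampling this series per hour; which aggregation the study applied is not stated here.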

Fig. 6. Hourly energy consumption and occupant number in weekdays. Boxplots show median, quartiles, extreme values, means (blue circles) and outliers (+) of the data set.
(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Since Building 101 uses gas for heating rather than electricity, the total energy use does not rise at lower temperatures. However, the energy use of the plug load system rises slightly at lower temperatures, probably due to personal electric heaters.

Besides the visualization of correlation by scatter plots, correlation analysis is adopted to calculate the correlation coefficients, shown in Table 4. In vertical comparison, the coefficient of occupant number (0.74) is 0.30 (30 percentage points) higher than that of temperature (0.44) for total energy use. This gap becomes greater in the lighting and plug load systems, at 0.60 and 0.81 respectively. The coefficients in the HVAC system are approximately equal. Therefore, overall, the occupant number has a much higher correlation with energy use than the outdoor air temperature.

In horizontal comparison, occupancy is more correlated with the lighting and plug load systems, while the outdoor temperature is more correlated with the HVAC system. These results are consistent with common sense and previous studies [36,37], because the lighting and plug load are controlled by occupants, but the HVAC system mainly depends on the outdoor air temperature.

3.3. Accuracy of baseline models

The results in Section 3.2 show that the occupant number is highly correlated with energy consumption. The further question is whether the accuracy of baseline models can be improved by including the occupancy variable. To answer this question, Model 3, which uses the time, outdoor air temperature and occupancy variables, is implemented and compared with the previous methods. Since the volume of results is huge (a whole year at 1-h intervals), it is difficult to show all of them. Therefore, one week of results is shown in Fig. 9, comparing the observed data with the results of the three models.

The accuracy of each model, measured by CVRMSE, is shown in Figs. 10–13. A lower CVRMSE value indicates higher accuracy.

• Fig. 10 illustrates the accuracy of baseline models in total load prediction. The values of CVRMSE in Model 1 are around 0.25 during working time. The peak values, around 0.45, occur from 4 am to 6 am, and the valley values, around 0.1, occur at 8 pm. After including the outdoor temperature variable, the accuracy of Model 2 is improved significantly. The values of CVRMSE are mostly below 0.15, and values beyond 0.15 occur from 2 am to 6 am. The accuracy of Model 3 is slightly improved over Model 2, and the shape of the CVRMSE curve is very similar.
• Fig. 11 illustrates the accuracy of baseline models in HVAC load prediction. It indicates that Model 1 is poor at HVAC load prediction. The CVRMSE values in Model 1 are mostly beyond 0.5,

Fig. 7. The correlation between the number of occupants and energy consumption. [Scatter plots; panels: Total Load, HVAC, Lighting, Plug Load; x-axis: Number of Occupants; y-axis: Energy Consumption (kWh); color bar: time of day.]

Fig. 8. The correlation between the temperature and energy consumption. [Scatter plots; panels: Total Load, HVAC, Lighting, Plug Load; x-axis: Temperature (°F); y-axis: Energy Consumption (kWh); color bar: time of day.]

which means most predicted values deviate from the observed values by more than 50%. The peak value is around 0.9 at 4 am. After including the temperature variable, the accuracy of Model 2 is improved significantly. The values of CVRMSE drop to below 0.2 during daytime (6 am to 6 pm), but are still higher than 0.5 at night (from 7 pm to 3 am). By including the occupancy variable, the accuracy of Model 3 is not significantly improved at most times, except 7 pm to

12 am. The shape of the CVRMSE curve is very similar. The big differences in accuracy of HVAC load prediction are probably caused by the operation schedule, which is related to neither occupancy nor outdoor temperature in this building. This will be discussed in detail in Section 4.
• Fig. 12 illustrates the accuracy of baseline models in lighting load prediction. Model 1 performs well at lighting load prediction. The CVRMSE values in Model 1 are mostly below 0.1, but rise sharply at 6 am and 9–10 pm. After including the outdoor temperature variable, the accuracy of Model 2 is improved significantly at 6 am and 9–10 pm, where the CVRMSE values drop to around 0.15. By including the occupancy variable, the accuracy of Model 3 is slightly improved in daytime (from 8 am to 6 pm), but not improved in other hours. The shape of the CVRMSE curve is very similar.
• Fig. 13 illustrates the accuracy of baseline models in plug load prediction. Model 1 performs well at plug load prediction. The CVRMSE values in Model 1 are mostly below 0.1. The two peak values of CVRMSE occur at 8 am and 6 pm. After including the outdoor temperature variable, the accuracy of Model 2 is improved, with the CVRMSE values dropping to below 0.09. By including the occupancy variable, the accuracy of Model 3 is significantly improved, especially during working time (6 am to 7 pm). All the CVRMSE values of Model 3 drop to below 0.08. Different from the other two systems, the shape of the CVRMSE curve of Model 3 for plug load is not similar to that of Model 2.

Table 4
The correlation coefficients between occupancy/temperature and energy consumption.

| | Total electric load | HVAC | Lighting | Plug load |
|---|---|---|---|---|
| Number of occupants | 0.74* | 0.54* | 0.73* | 0.86* |
| Outdoor air temperature | 0.44* | 0.58* | 0.13* | 0.05* |

* p-value < 0.001.

3.4. Contribution of the occupancy factor

The results in Section 3.3 confirmed the hypothesis that the occupancy data can improve baseline prediction. The next step is to quantify the contribution of the occupancy variable, and clarify whether this contribution is higher or lower than that of other factors. The result can help determine the dominant factors in baseline models. In this step, the contributions of occupancy and temperature are calculated and compared using the method introduced in Section 2.4.

Fig. 14 illustrates the contribution of occupancy data to the accuracy of baseline prediction. The results show that occupancy data improves lighting and plug load prediction most significantly, especially during working time (8 am to 6 pm). The improvement in HVAC load prediction is not significant, at lower than 10%. Overall, the occupancy data improves the total energy prediction by around 10% during daytime (6 am to 6 pm), with less improvement at other times.

The statistical results of the contributions of occupancy, Impact_occ, and outdoor temperature, Impact_temp, are shown in Table 5. According to the results, the outdoor temperature variable mainly contributes to HVAC and lighting load prediction, while the occupancy variable mainly contributes to lighting and plug load prediction. The mean contribution of the occupancy variable to total energy prediction is 10%, which is much lower than the mean contribution of the outdoor temperature variable (63%). Occupancy has a higher correlation with energy use but a lower contribution to energy prediction, which seems inconsistent. The reasons for this will be discussed in Section 4.

3.5. Sensitivity analysis

There are three critical parameters influencing the prediction results in baseline models. The first is the length of the training data period. The baseline models use previously observed data to train and fit models. The length of the training period will impact the

Fig. 9. The observed load and predicted load by three models (from 11th to 15th August 2014). [Time-series panels: Total Load (kW), HVAC Load (kW), Lighting Load (kW), Plug Load (kW); x-axis: Monday to Friday; series: Observed Data, Model 1, Model 2, Model 3.]

training effect and further impact the accuracy of baseline prediction. The length should be neither too short nor too long [8]. If the training is too short, it cannot provide enough information to fit the model. If the training is too long, it may include useless or harmful information. Since the building performance and occupant activities change over time, data from the distant past does not help predict the building performance in the future. For example, over a period of years, the base load of the building has likely changed, so using data from years ago introduces a considerable bias. The number of occupants and their energy use behaviors can likewise change over a long time, so the historical data can no longer reflect the current building performance.

The other two critical parameters are the piecewise numbers of the occupancy and the outdoor temperature in the regressions. As mentioned in Section 2.2, Model 3 uses piecewise-continuous regressions of the occupancy and temperature variables. The piecewise number will impact model fitting. Fewer segments will sacrifice accuracy of the model, while too many segments can cause over-fitting and high computing cost. Therefore, how to define these segments appropriately is an important issue in the baseline model.

Sensitivity analysis is applied to evaluate the influence of these parameters on baseline models. Fig. 15 shows the accuracy of baseline models for different training periods. The CVRMSE of Model 1 first increases with the training period and reaches the peak value at five months, then decreases. This can be explained as follows: when the training period is five months, the model uses training data from winter to predict the building performance in summer. As Model 1 does not include the outdoor temperature variable, the prediction has lower accuracy. This verifies the aforementioned hypothesis that a longer training period may be harmful to accuracy. The CVRMSE of Models 2 and 3 fluctuates in short training periods and converges after three months. Their curves are almost coincident after three months, with Model 3 slightly below Model 2. In short training periods (one to three months), Model 3 shows faster convergence and a narrower range of fluctuation. This can bring not only technical but also economic benefits, since the time for data gathering can significantly impact the costs, investment return and payback period [8].

Fig. 10. The accuracy of baseline models for total electric load prediction. [CVRMSE versus time of day (hour); series: Model 1, Model 2, Model 3.]

Fig. 11. The accuracy of baseline models for HVAC load prediction. [CVRMSE versus time of day (hour); series: Model 1, Model 2, Model 3.]

Fig. 12. The accuracy of baseline models for lighting load prediction. [CVRMSE versus time of day (hour); series: Model 1, Model 2, Model 3.]

Fig. 13. The accuracy of baseline models for plug load prediction. [CVRMSE versus time of day (hour); series: Model 1, Model 2, Model 3.]

Fig. 14. The contribution of occupancy data on the accuracy of baseline prediction. [Accuracy Improvement (%) versus time of day (hour); series: Plug Load, Lighting, HVAC, Total.]

Table 5
The statistical profile of the contributions by occupancy and temperature factors.

| | Impact_occ Max (%) | Impact_occ Mean (%) | Impact_occ Median (%) | Impact_temp Max (%) | Impact_temp Mean (%) | Impact_temp Median (%) |
|---|---|---|---|---|---|---|
| HVAC | 9 | 5 | 5 | 72 | 46 | 64 |
| Lighting | 21 | 12 | 7 | 38 | 13 | 10 |
| Plug load | 27 | 15 | 9 | 20 | 10 | 9 |
| Total | 18 | 10 | 8 | 66 | 44 | 57 |
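The accuracy curves in Figs. 10–13 report CVRMSE per hour of day. A minimal sketch of that calculation follows, assuming the common definition of CVRMSE (root-mean-square error divided by the mean observed value); the study's own definition in Section 2 may differ in detail. The helper `relative_improvement` is a hypothetical illustration of comparing two models and is not claimed to be the Impact metric of Section 2.4.

```python
import math
from collections import defaultdict


def cvrmse(observed, predicted):
    """Coefficient of variation of the RMSE: RMSE / mean(observed)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / len(observed)
    if mean_obs == 0:
        raise ValueError("CVRMSE is undefined when the mean observation is 0")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse) / mean_obs


def cvrmse_by_hour(hours, observed, predicted):
    """CVRMSE computed separately for each hour-of-day label."""
    groups = defaultdict(lambda: ([], []))
    for hour, obs, pred in zip(hours, observed, predicted):
        groups[hour][0].append(obs)
        groups[hour][1].append(pred)
    return {hour: cvrmse(obs, pred) for hour, (obs, pred) in sorted(groups.items())}


def relative_improvement(cvrmse_without, cvrmse_with):
    """Hypothetical helper: fractional CVRMSE reduction from adding a variable."""
    return (cvrmse_without - cvrmse_with) / cvrmse_without


# Hypothetical loads (kWh) at hours 8 and 9 on two days.
hours = [8, 8, 9, 9]
observed = [100.0, 100.0, 50.0, 50.0]
predicted = [90.0, 110.0, 50.0, 50.0]
per_hour = cvrmse_by_hour(hours, observed, predicted)
```

Repeating the per-hour calculation for models trained on one to seven months of data would produce curves of the kind summarized in Fig. 15.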

Fig. 15. Accuracy of baseline prediction under different training periods. [CVRMSE versus months of training (1 to 7); series: Model 1, Model 2, Model 3.]

Fig. 16. Accuracy of baseline prediction in Model 3 under different piecewise number of occupancy and temperature data. [CVRMSE versus piecewise number in regression (1 to 7); series: Occupancy, Temperature.]

Fig. 16 shows the accuracy of Model 3 under different piecewise numbers of occupancy and temperature data. For the temperature curve, the CVRMSE of Model 3 drops sharply from one segment to two segments, then decreases slowly with more than two segments, where the changes are lower than 2%. This means the piecewise number of temperature should be more than 2. For the occupancy curve, the CVRMSE of Model 3 stays stable over different piecewise numbers. Therefore, different segment definitions of the occupancy variable will not significantly impact the accuracy of the model.

4. Discussion

According to the results in Section 3, a critical question needs to be answered: why is occupancy highly correlated with energy consumption but contributes less in baseline prediction (coefficient is 0.74 but contribution is 10%)? This seems inconsistent and nonsensical, especially compared with the temperature variable (coefficient is 0.44 but contribution is 63%). There are two main reasons behind it. One is that the occupant number is highly correlated with time, so the time variable already provides most of the occupancy information. In the three baseline models, time of week is an important variable, which includes 120 one-hour time slots (24 h multiplied by five weekdays). Although the occupant number changes over time stochastically and with high uncertainty, the occupant number in each time slot has relatively low uncertainty. As shown in Fig. 6, during 12 h (7 pm to 7 am) of a day, the uncertainty of occupancy is close to zero. During the other 12 h, the uncertainty of most occupancy data is under 20%. This means the time variable is highly correlated with occupancy and can provide most occupancy information. Therefore, the occupancy variable cannot provide much incremental information to the model.

The other reason is that the operation schedule mainly depends on time rather than occupancy. As shown in Fig. 6, the operation time is much longer than the occupied period. For example, the energy consumption rises significantly from 4 am, and the load has reached nearly 80% of the mean peak load at 6 am, while there are few occupants in the building. Fig. 6 can likewise verify this issue. Except for plug load, the energy consumption can nearly reach the peak value in the early morning and late afternoon. This means building systems are controlled by the operation schedule rather than by occupants. Therefore, the time variable reflects energy consumption better than the occupancy variable.

After clarifying the reasons behind the last question, there is a further question: whether the occupancy variable can be removed from the baseline model due to its small contribution. On the contrary, although the contribution of the occupancy variable to accuracy is not as significant as that of temperature in this case, it can be an important indicator in M&V for energy efficiency retrofit. First, it can indicate the occupancy-related risk. According to the aforementioned first reason, the contribution of the occupancy variable is lower when occupancy is more correlated with time. In this case study, the low contribution of occupancy indicates that the occupancy-related risk is low in this building, mainly because it is an office building and the occupancy is regular during the year. Conversely, a high contribution means the occupancy is highly uncertain and stochastic. If the occupancy-related uncertainty is very high (e.g., hotels), the occupancy-related risk needs to be carefully considered in retrofit decision making. Furthermore, it can clarify whether the changes in energy use are from retrofit or operation. Some buildings cannot achieve the energy saving target after retrofit, and common disputes focus on whether this is caused by ineffective retrofit or inappropriate operation. If the occupancy does not significantly change but the contribution of occupancy is abnormally low, it indicates the operation schedule is inconsistent with the occupancy schedule and is more responsible for the excessive consumption.

There are three advantages of the proposed baseline model. First, this study defines the metrics to quantify the influence of occupancy. Numerous previous studies emphasized the impact of occupancy on M&V; however, the quantified influence of occupancy is under-developed. Without this, it is difficult to improve baseline models as well as facilitate real projects of energy efficiency retrofit. Based on the proposed metrics, the contribution of different variables in the baseline models can be analyzed and compared. The proposed metrics can then be used to evaluate other factors in the baseline models.

Second, the proposed method zooms into the hourly performance and different systems of baseline models. Previous studies only provided the overall whole-building results of baseline prediction, but the performance of the model varies across hours and systems. For example, the load prediction for the HVAC system is very accurate at daytime (CVRMSE is less than 0.2), but the error rises dramatically at night (CVRMSE is more than 0.6), as shown in Fig. 11. To improve baseline models, future research can pay more attention to these issues. Therefore, this method provides a "magnifying lens", which can help diagnosis and troubleshooting.

Third, the proposed method requires simple input data and a simple algorithm. Three types of data are needed in the model, namely the occupancy data (available in most commercial buildings for security reasons), energy consumption (most commercial buildings have electricity meters capable of providing short-interval data [8]) and the outdoor air temperature (available from a local temperature sensor or weather stations). Data limitation is a main barrier in data mining, so the simple data requirement is a considerable benefit for modeling. In addition, this method only uses a simple regression algorithm, which is easy to implement and fast in data processing.
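To illustrate why the regression is simple to implement, the sketch below shows one common way to build piecewise-continuous terms and evaluate a fitted model with a time-of-week effect plus piecewise temperature and occupancy terms. It is an assumption-laden illustration: the exact form of Model 3 is given in Section 2.2 and is not reproduced here, the breakpoints and coefficients are hypothetical, and the least-squares fitting step is omitted.

```python
def piecewise_components(value, breakpoints):
    """Split `value` into per-segment components that sum back to `value`.

    With sorted breakpoints b1 < ... < bk there are k + 1 segments. A linear
    term on each component gives a continuous piecewise-linear response.
    """
    if not breakpoints:
        return [value]
    components = [min(value, breakpoints[0])]
    for low, high in zip(breakpoints, breakpoints[1:]):
        # Portion of `value` that falls inside the segment [low, high].
        components.append(min(max(value - low, 0.0), high - low))
    components.append(max(value - breakpoints[-1], 0.0))
    return components


def predict_load(time_of_week, temperature, occupants, model):
    """Evaluate a fitted model; `model` holds hypothetical coefficients."""
    temp_parts = piecewise_components(temperature, model["temp_breaks"])
    occ_parts = piecewise_components(occupants, model["occ_breaks"])
    load = model["time_of_week_effect"][time_of_week]
    load += sum(b * c for b, c in zip(model["temp_slopes"], temp_parts))
    load += sum(b * c for b, c in zip(model["occ_slopes"], occ_parts))
    return load


# Hypothetical coefficients for a single time-of-week slot (index 34).
example_model = {
    "time_of_week_effect": {34: 100.0},
    "temp_breaks": [40, 60, 80],        # degrees F, three breakpoints
    "temp_slopes": [0.0, 0.5, 1.0, 2.0],
    "occ_breaks": [50],                 # occupants, one breakpoint
    "occ_slopes": [0.2, 0.1],
}
estimate = predict_load(34, 65, 120, example_model)
```

Because every term is linear in its coefficient, fitting reduces to ordinary least squares on these components, which is consistent with the low computing cost noted above; changing the number of breakpoints changes the piecewise number examined in Fig. 16.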

The results of this study can be applied in energy efficiency retrofit projects. Before retrofit, they can offer suggestions for data collection, decision making and risk assessment. For example, if the projects are mainly for HVAC, the occupancy factor can be ignored. However, if the projects are mainly for plug load, it is necessary to collect occupancy data before retrofit since it influences the energy baseline significantly. For the risk assessment, the results of this study can also indicate the uncertainty of the energy baseline model caused by occupancy. If the uncertainty is relatively high, the investment strategy may be changed. After retrofit, the results of this study can improve the energy saving assessment by including the occupancy factor. This is critical for ESCOs, since the profits of ESCOs mainly depend on the calculated energy savings.

5. Conclusions

Baseline prediction is a key issue in M&V and energy efficiency retrofit of buildings. Occupancy, as a critical impact factor of energy consumption, has been emphasized in previous studies. However, few previous studies used the occupancy variable in baseline models or quantified the influence of the occupancy variable on baseline prediction.

This study develops a new baseline model by including the occupancy data into the existing LBNL baseline model, and proposes metrics to quantify the accuracy of prediction and the impacts of variables. First, the correlation between occupancy and energy consumption is visualized and analyzed by time series plot, scatter plot and statistical method. Then, the accuracies of the three baseline models are compared with the CVRMSE metric. Thirdly, based on the accuracy of the models, the contributions of variables are quantified and compared. Finally, the sensitivity analysis is conducted to evaluate the influence of parameters in the models. The main findings are highlighted as follows:

(1) The correlation between occupancy and total building energy consumption is very high. Occupancy is most correlated with plug load and lighting, with correlation coefficients of 0.86 and 0.73 respectively. Outdoor air temperature has a much lower correlation with energy consumption than occupancy.
(2) The contribution of the occupancy variable is relatively low (lower than the contribution of temperature). This is mainly because the time variable can provide most of the occupancy information and the operation schedule is inconsistent with the occupied time.
(3) The model including the occupancy variable shows faster convergence and a narrower range of fluctuation in short training periods. As training periods get longer, the results of the models with and without the occupancy variable get closer.
(4) The piecewise number of occupants in regression does not impact results significantly, but the piecewise number of the outdoor air temperature should be more than 2.

There are several limitations of this study. First is the reliability of the source data. Due to sensor failure and other reasons, there is some missing data. There is also a small door used occasionally, shown in Fig. 5, which causes the entering number and exiting number to sometimes not be equal. Although the deviation is lower than 5%, it still impacts the accuracy of results. In addition, due to data availability, the case study only uses data from a single building and the time span is one year, so the results should be used with caution. Building 101 is a typical office building, where the occupancy is regular over time. It cannot represent other building types with highly random occupancy (e.g., hotels). Third, there are various methods for energy prediction (e.g., change-point regression, ANN, SVR, etc.). The LBNL model is only an example method used as the energy prediction function in this study to calculate the occupancy influence on energy prediction quantitatively. Based on the results of this study, occupancy data can be included in more methods to investigate the occupancy influence in further study.

Further research on occupancy in baseline prediction can focus on: (1) using larger data sets for potentially better results; (2) applying more methods and improving the algorithm of the baseline model, which needs to consider the tradeoffs among result accuracy, algorithm complexity and length of training period; (3) comparing occupancy influences among different types of buildings and developing benchmarks of M&V for energy efficiency retrofit.

Acknowledgements

This research is funded by the National Natural Science Foundation of China (No. 71271184) and the Hong Kong Polytechnic University. It is also supported by the Assistant Secretary for Energy Efficiency and Renewable Energy of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 through the U.S.-China joint program of Clean Energy Research Center on Building Energy Efficiency. The authors thank Clinton Andrews of Rutgers University for providing the occupancy data of Building 101. This work is also part of the research activities of IEA EBC Annex 66, definition and simulation of occupant behavior in buildings.

References

[1] EIA. Annual energy review, DOE/EIA – 0384, 2010; (2010, 09.03). Available: <http://www.eia.doe.gov/aer/pdf/aer.pdf> [retrieved on 09.03.10].
[2] Xu PP, Chan EHW, Qian QK. Success factors of energy performance contracting (EPC) for sustainable building energy efficiency retrofit (BEER) of hotel buildings in China. Energy Policy 2011;39(November):7389–98.
[3] Hong J, Shen GQ, Feng Y, Lau WS-T, Mao C. Greenhouse gas emissions during the construction phase of a building: a case study in China. J Clean Prod 2015;103:249–59.
[4] Menassa CC, Baer B. A framework to assess the role of stakeholders in sustainable building retrofit decisions. Sustain Cities Soc 2014;10:207–21.
[5] UNEP. Buildings can play key role in combating climate change, SBCI-sustainable construction and building initiative, Oslo; 2007. Available from: <http://www.unep.org/Documents.Multilingual/Default.Print.asp> [retrieved on 09.15.09].
[6] Hong T, Piette MA, Chen Y, Lee SH, Taylor-Lange SC, Zhang R, et al. Commercial building energy saver: an energy retrofit analysis toolkit. Appl Energy 2015;159(December):298–309.
[7] EPA. Public law 109-58: 109th congress: an act to ensure jobs for our future with secure, affordable, and reliable energy; 2005.
[8] Walter T, Price PN, Sohn MD. Uncertainty estimation improves energy measurement and verification procedures. Appl Energy 2014;130:230–6.
[9] Xu P, Chan EHW. ANP model for sustainable Building Energy Efficiency Retrofit (BEER) using Energy Performance Contracting (EPC) for hotel buildings in China. Habitat Int 2013;37(1):104–12.
[10] Xia X, Zhang J. Mathematical description for the measurement and verification of energy efficiency improvement. Appl Energy 2013;111(11):247–56.
[11] Mathieu JL, Callaway DS, Kiliccote S. Variability in automated responses of commercial buildings and industrial facilities to dynamic electricity prices. Energy Build 2011;43(12):3322–30.
[12] Coughlin K, Piette MA, Goldman C, Kiliccote S. Statistical analysis of baseline load models for non-residential buildings. Energy Build 2009;41:374–81.
[13] Granderson J, Price PN, Jump D, Addy N, Sohn MD. Automated measurement and verification: performance of public domain whole-building electric baseline models. Appl Energy 2015;144:106–13.
[14] Granderson J, Price PN. Development and application of a statistical methodology to evaluate the predictive accuracy of building energy baseline models. Energy 2014;66:981–90.
[15] Claridge DE. A perspective on methods for analysis of measured energy data from commercial buildings. J Sol Energy Eng 1998;120:150–5.
[16] Taylor JW, De Menezes LM, McSharry PE. A comparison of univariate methods for forecasting electricity demand up to a day ahead. Int J Forecast 2006;22:1–16.
[17] Katipamula S, Reddy TA, Claridge DE. Multivariate regression modeling. J Sol Energy Eng 1998;120:177–84.
[18] Kissock JK, Reddy TA, Claridge DE. Ambient-temperature regression analysis for estimating retrofit savings in commercial buildings. J Sol Energy Eng 1998;120:168–76.

[19] Lee W-S. Benchmarking the energy efficiency of government buildings with data envelopment analysis. Energy Build 2008;40(12):891–5.
[20] Chung W, Hui YV. A study of energy efficiency of private office buildings in Hong Kong. Energy Build 2009;41:696–701.
[21] Chung W, Hui Y, Lam YM. Benchmarking the energy efficiency of commercial buildings. Appl Energy 2006;83:1–14.
[22] Sabapathy A, Ragavan SK, Vijendra M, Nataraja AG. Energy efficiency benchmarks and the performance of LEED rated buildings for Information Technology facilities in Bangalore, India. Energy Build 2010;42:2206–12.
[23] Martani C, Lee D, Robinson P, Britter R, Ratti C. ENERNET: studying the dynamic relationship between building occupancy and energy consumption. Energy Build 2012;47(4):584–91.
[24] Li N, Calis G, Becerik-Gerber B. Measuring and monitoring occupancy with an RFID based system for demand-driven HVAC operations. Automat Constr 2012;24:89–99.
[25] Pisello AL, Asdrubali F. Human-based energy retrofits in residential buildings: a cost-effective alternative to traditional physical strategies. Appl Energy 2014;133:224–35.
[26] Oldewurtel F, Sturzenegger D, Morari M. Importance of occupancy information for building climate control. Appl Energy 2013;101(1):521–32.
[27] Feng X, Yan D, Hong T. Simulation of occupancy in buildings. Energy Build 2015;87:348–59.
[28] Hong T, Taylor-Lange SC, D’Oca S, Yan D, Corgnati SP. Advances in research and applications of energy-related occupant behavior in buildings. Energy Build 2016.
[29] Yan D, O’Brien W, Hong T, Feng X, Burak Gunay H, Tahmasebi F, et al. Occupant behavior modeling for building performance simulation: current state and future challenges. Energy Build 2015;107:264–78.
[30] Rey E. Office building retrofitting strategies: multicriteria approach of an architectural and technical issue. Energy Build 2004;36(April):367–72.
[31] Gucyeter B, Gunaydin HM. Optimization of an envelope retrofit strategy for an existing office building. Energy Build 2012;55(December):647–59.
[32] Miller E, Buys L. Retrofitting commercial office buildings for sustainability: tenants’ perspectives. J Prop Invest Finance 2008;26:552–61.
[33] Miller N, Spivey J, Florance A. Does green pay off? J Real Estate Portfolio Manage 2008;14:385–400.
[34] Wiley JA, Benefield JD, Johnson KH. Green design and the market for commercial office space. J Real Estate Finance Econ 2010;41(August):228–43.
[35] EEBHUB. Energy efficient buildings hub; December 17. Available: <http://www.buildsci.us/eeb-hub.html>.
[36] Zhou X, Yan D, Hong T, Ren X. Data analysis and stochastic modeling of lighting energy use in large office buildings in China. Energy Build 2015;86(1):275–87.
[37] Sun K, Yan D, Hong T, Guo S. Stochastic modeling of overtime occupancy and its application in building energy simulation and calibration. Build Environ 2014;79(9):1–12.
