Olofsson Etal 2013 RSE
Making better use of accuracy data in land change studies: Estimating accuracy and
area and quantifying uncertainty using stratified estimation
Pontus Olofsson a,⁎, Giles M. Foody b, Stephen V. Stehman c, Curtis E. Woodcock a
a Department of Earth and Environment, Boston University, 675 Commonwealth Avenue, Boston, MA 02215, USA
b School of Geography, University of Nottingham, University Park, Nottingham NG7 2RD, UK
c Department of Forest and Natural Resources Management, State University of New York, 1 Forestry Drive, Syracuse, NY 13210, USA
Article info

Article history:
Received 15 February 2012
Received in revised form 23 October 2012
Accepted 26 October 2012
Available online 29 November 2012

Keywords:
Land use change
Land cover change
Carbon modeling
Uncertainty
Stratified estimation
Accuracy assessment

Abstract

The area of land use or land cover change obtained directly from a map may differ greatly from the true area of change because of map classification error. An error-adjusted estimator of area can be easily produced once an accuracy assessment has been performed and an error matrix constructed. The estimator presented is a stratified estimator which is applicable to data acquired using popular sampling designs such as stratified random, simple random and systematic (the stratified estimator is often labeled a poststratified estimator for the latter two designs). A confidence interval for the area of land change should also be provided to quantify the uncertainty of the change area estimate. The uncertainty of the change area estimate, as expressed via the confidence interval, can then subsequently be incorporated into an uncertainty analysis for applications using land change area as an input (e.g., a carbon flux model). Accuracy assessments published for land change studies should report the information required to produce the stratified estimator of change area and to construct confidence intervals. However, an evaluation of land change articles published between 2005 and 2010 in two remote sensing journals revealed that accuracy assessments often fail to include this key information. We recommend that land change maps should be accompanied by an accuracy assessment that includes a clear description of the sampling design (including sample size and, if relevant, details of stratification), an error matrix, the area or proportion of area of each category according to the map, and descriptive accuracy measures such as user's, producer's and overall accuracy. Furthermore, mapped areas should be adjusted to eliminate bias attributable to map classification error and these error-adjusted area estimates should be accompanied by confidence intervals to quantify the sampling variability of the estimated area. Using data from the published literature, we illustrate how to produce error-adjusted point estimates and confidence intervals of land change areas. A simple analysis of uncertainty based on the confidence bounds for land change area is applied to a carbon flux model to illustrate numerically that variability in the land change area estimate can have a dramatic effect on model outputs.

© 2012 Elsevier Inc. All rights reserved.
⁎ Corresponding author. Tel.: +1 6173539374.
E-mail address: olofsson@bu.edu (P. Olofsson).
URL: http://www.bu.edu/geography (P. Olofsson).
0034-4257/$ – see front matter © 2012 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.rse.2012.10.031
P. Olofsson et al. / Remote Sensing of Environment 129 (2013) 122–131

1. Introduction

Land use or land cover change (referred to as “land change” for the remainder of the article) impacts on a very diverse array of environmental properties and processes. The effects of a land change may be felt across a broad spectrum of environmental systems including the atmospheric, hydrologic, geomorphologic and ecologic. Deforestation may, for example, act as a source of carbon to the atmosphere, lead to enhanced soil erosion, reduce the extent of habitat and so lead to species declines, and contribute to displacement of human populations. Land change is, therefore, a critical variable in relation to two environmental issues of great societal concern: climate change and biodiversity loss. Land change can be a cause and a consequence of climate change and is a variable of greater impact than climate change (Skole, 1994). Land change is, for example, the single most important variable affecting ecological systems (Chapin et al., 2000; Vitousek, 1994) and the greatest threat to biodiversity (Sala et al., 2000). The importance of land change is evident in the growth of interest in land change science (Turner et al., 2007) and so there is consequently considerable interest in land cover and a need for accurate information on land cover and its dynamics. Indeed the central role of land surface change to a vast array of contemporary concerns is reflected in its role as an underpinning feature of the current grand challenges for the geographical sciences articulated recently by the US National Academy of Sciences (CSDGSND, 2010). Remote sensing has the potential to provide accurate information on land cover but numerous problems may be encountered and the adequacy of this information has been questioned (Townshend et al., 1992; Wilkinson, 1996, 2005).

In many applications the main focus is the area of a land cover class or its gross change over time (gross change refers to the total area of
gain or total area of loss of a land cover or land use class, for example, total area of forest cover gain, whereas net change refers to the difference between the gains and losses of a class, for example forest gain subtracted from forest loss). The importance of area has been evident throughout the entire history of satellite remote sensing. For example, the area of land planted to wheat was a central component of the Large Area Crop Inventory Experiment (LACIE) in the 1970s (Erickson, 1984). Numerous studies have investigated changes in the extent of land cover classes such as forests (DeFries et al., 2002; Olofsson et al., 2011), deserts (Eklundh & Olsson, 2003), lakes (Smith et al., 2005), savannas (Brannstrom et al., 2008), crop lands (Fuller et al., 2012; Liu et al., 2005), shrublands (McManus et al., 2012) and urban areas (He et al., 2010; Lizarazo, 2010; Schneider & Woodcock, 2008).

In some instances there is also considerable interest associated with the agencies and consequences of changes, such as in studies of burned areas (e.g. Vivchar, 2011) or ice sheets (McCabe et al., 2011). Interest in land cover and land changes remains and may be expanded further as a result of major policy related issues. This activity may be linked to a variety of areas of policy, from that concerned with counter-narcotics (e.g. Taylor et al., 2010) to those underpinning major issues in contemporary environmental science. For example, member states of the European Union are obligated to maintain the extent of key habitats under the Habitats Directive (EC, 2011). The goal of the United Nations Collaborative Programme on Reducing Emissions from Deforestation and Forest Degradation in Developing Countries (UN-REDD), established in 2005, is to reduce greenhouse gas emissions from deforestation while maintaining sustainability (UN-REDD, 2008). This calls for area estimates of deforestation, accompanied by a statement of the uncertainty of the estimates (e.g. confidence intervals), as clearly stated in the UN-IPCC Good Practice Guidance for Land Use, Land Use Change and Forestry (IPCC, 2003, p. G-2):

“Estimates should be accurate in the sense that they are systematically neither over nor under true emissions or removals, so far as can be judged, and that uncertainties are reduced so far as is practicable”

Accurately quantifying the extent of a land cover class or its amount of change over time is, however, a non-trivial task. Recent estimates of the extent of global urban land cover, for example, differ by an order of magnitude (Potere & Schneider, 2007). Similarly, estimates of land changes, such as forest change which is central to carbon accounting (Kuemmerle et al., 2011; Olofsson et al., 2011), are often subject to substantial error (Skole & Tucker, 1993). Addressing key environmental science concerns at regional to global scales requires accurate information on the extent of land cover classes and their change over time, information that is often based on remote sensing.

The area of land change may be obtained directly from change maps determined from remote sensing data. Such maps are typically produced by an image classification analysis. Singh (1989) reviews many of the methods for mapping land change from remote sensing. Popular approaches are typically based either on the comparison of the radiometric properties of images acquired at different times such as change vector analysis (Lambin & Strahler, 1994) or on the differences in land cover maps obtained from the images such as from a post-classification comparison analysis. The latter methods are particularly popular but in many studies of change via a post-classification comparison analysis, the accuracy of change is not evaluated (e.g. Kuemmerle et al., 2009; Xian & Crane, 2005) as researchers instead assess the accuracy of the individual classifications (i.e., the classification for each date). As will be illustrated below, even when the two classifications are highly accurate it is possible that the accuracy of the change map will be low and that the area computed from a change map could be badly biased.

Pixel counting involves determining the number of pixels allocated to a map class and multiplying this number by the area of a pixel to obtain the mapped area of the class. This approach is a simple way to compute area but in the presence of asymmetric classification errors pixel counting is biased for the true proportion of area (Czaplewski, 1992; Gallego, 2004; Stehman, 2005). The bias resulting from applying pixel counting to obtain the area of a class is labeled as “measurement bias” rather than as “estimator bias” because a pixel count represents a complete census of the region and therefore is not a sample-based estimator (Särndal et al., 1992; Stehman, 2005). A variety of approaches exist to estimate area using information obtained from an accuracy assessment sample (Card, 1982; Czaplewski, 1992; McRoberts, 2011; Stehman, 2009). Card (1982) introduced the stratified estimator for area estimation that is the primary focus of this article. Gallego's (2004) review of area estimation provides an excellent summary of many of the area estimation options, including a critique of pixel counting and an overview of estimators combining ground and remote sensing information, as well as a review of methods for small area estimation applicable when interest lies in small geographic regions that receive few sample units. Stehman (2009) noted that many of the area estimators previously proposed in the literature could be unified under the framework of “model-assisted” estimation in which “a model motivates the form of the estimator, but inference depends on the sampling design” (Lohr, 1999, p. 147) and so design-based inference (Särndal et al., 1992; Stehman, 2000) is still invoked. McRoberts (2011) re-invigorated recognition of the importance of estimating standard errors and confidence intervals to provide a complete inference when estimating area using information from remote sensing (see also McRoberts, 2010). Although in this article we limit attention to design-based inference, the existence of model-based inference should be acknowledged.

Despite the availability of methods to refine class extent estimates for classification errors, the pixel counting approach continues to be widely used. Consequently, the full potential of the accuracy assessment data is not being utilized. In fact, it is not being used at all for estimating area in a pixel counting approach. An accuracy assessment does more than indicate the accuracy of the map: it provides sample data that can be used to avoid the measurement bias of pixel counting and to decrease the standard error of the estimated area. As such, the accuracy assessment results should not be the final step of the quality evaluation but an integral part of the overall analysis of accuracy and area.

The problems encountered in estimating the area and accuracy of change are well known in remote sensing. Similarly, methods to explicitly address the concerns have been discussed openly in the literature for decades, but the community has yet to adopt these methods as part of routine practice. Here, we re-state some basic principles to help ensure that the vast potential of remote sensing as a source of information on change may be realized more fully and thus contribute constructively to major international research and policy priorities (e.g. deforestation and UN-REDD).

1.1. Objectives

The objective of this article is to present a simple strategy for using the information obtained for map accuracy assessment to estimate area of a land cover class or of land change, and to construct confidence intervals that reflect the uncertainty of the area estimates obtained. The rationale for presenting this strategy is that even though the methods we advocate date back to at least Card (1982), the remote sensing community has yet to consistently adopt these good practices. Our evaluation of common practice of land change accuracy assessment is based on peer-reviewed land change papers published during 2005–2010 in Remote Sensing of Environment and the International Journal of Remote Sensing. Our review of these articles revealed that many published assessments did not provide the full information to address land change accuracy and area estimation objectives (Section 2). To remedy the common shortcomings of land
change assessments, we present an analysis that makes full use of the map and accuracy assessment data by: 1) estimating accuracy (e.g., user's, producer's, and overall accuracies), 2) estimating area of land change using the accuracy assessment sample data to adjust area for map classification error, and 3) estimating standard errors or confidence intervals for the error-adjusted area estimates. Several numerical examples are provided to show how to produce these estimates using simple estimators that can be applied to popular simple random, systematic, and stratified random sampling designs (Section 3). Lastly, we illustrate the importance of accounting for uncertainty of land change area estimates in applications that use these area estimates as inputs. Specifically, we show how the outputs of a carbon flux model change with respect to the variability of the land change area estimates input to the model (Section 4).

We focus on applications in which a map of land change is produced. This map is subjected to an accuracy assessment based on a spatially explicit (e.g., per pixel) comparison between the map class and the reference class for a sample of spatial units (cf. Stehman & Czaplewski, 1998). The reference class is defined as the best available determination of the ground condition at a specified location. The reference class label is assumed to be correct, but it is well-known that reference classification error often occurs and can impact greatly on evaluations of land cover and land cover change by remote sensing (Foody, 2010). For simplicity, we limit attention to an accuracy assessment unit that is a pixel or other equal area spatial unit. We assume a hard classification scheme for both the map and reference classification, where a hard classification is defined as one in which each pixel is assigned fully to only one class. The literature also provides extensive guidance on other important issues connected to the data such as the sample size and class definitions that are beyond the scope of this current article (Stehman, 2012; Strahler et al., 2006).

1.2. Accuracy and uncertainty

Accuracy is defined as the degree to which the map produced agrees with the reference classification. The most commonly used measures of accuracy are the following (see Liu et al., 2007 for a comprehensive review of other accuracy measures):

(i) Overall accuracy is simply the proportion of the area mapped correctly. It provides the user of the map with the probability that a randomly selected location on the map is correctly classified.
(ii) User's accuracy is the proportion of the area mapped as a particular category that is actually that category “on the ground” where the reference classification is the best assessment of ground condition. If a “user” employs the final change map for locating a particular area of land change, the user's accuracy gives the conditional probability of that map location actually having changed. User's accuracy is the complement of the probability of commission error.
(iii) Producer's accuracy is the proportion of the area that is a particular category on the ground that is also mapped as that category. The producer's accuracy provides the “producer” of the final land change map with the conditional probability of a particular location of actual land change appearing as land change on the map. Producer's accuracy is the complement of the probability of omission error.
(iv) The kappa coefficient of agreement is often used as an overall measure of accuracy. Kappa purportedly incorporates an adjustment for “random allocation agreement”, but the validity of such an adjustment is arguable and numerous articles have questioned the use of kappa (Foody, 2002; Pontius & Millones, 2011; Stehman, 1997). Adjusting for random allocation agreement has no relevance to area estimation so the following analysis and discussion will not include kappa. We recommend that kappa should not be used in the assessment of the accuracy of land change maps.

Because accuracy measures are typically estimated from a sample, these estimates are subject to uncertainty. The uncertainty of an estimate can be represented by computing its standard error or by reporting a confidence interval. A confidence interval provides a range of values for a parameter taking into account the uncertainty of the sample-based estimate.

The manner in which uncertainty is addressed depends on the inference framework employed. In this article, we will use design-based inference (Särndal et al., 1992) in which the uncertainty associated with the estimator is defined as the variability of the estimates over the set of all possible samples that could have been obtained for the chosen sampling design and population sampled. The standard error computed for the particular sample selected is an estimate of the variability over the set of all possible samples. Other sources of uncertainty will often be present. For example, error in the reference class label, geo-location error, and mis-matched classification legends may generate additional uncertainty. But we address only the variability attributable to sampling in our uncertainty analysis.

2. Literature review

As stated in the objectives (Section 1.1), an accuracy assessment of a land change map should include the following information: 1) estimates of accuracy of change; 2) estimates of land change area that adjust the map “pixel count” area for classification error; and 3) confidence intervals associated with the accuracy and land change area parameter estimates. Methods for achieving these three outcomes have been extensively published. For example, formulas for estimating accuracy and the standard errors of these accuracy estimators may be found in Card (1982) and Stehman and Foody (2009). Area estimation has recently received considerable attention, with various approaches reviewed by Gallego (2004), McRoberts (2010), McRoberts (2011), and Stehman (2009). The stratified estimator of area we highlight (Section 3) has been in use at least since Card (1982). The stratified estimator is applicable to sampling designs commonly used in accuracy assessment, namely simple random, systematic, and stratified random sampling. When applied to simple random or systematic sampling (i.e., designs without stratification), the stratified estimator has been historically referred to as a “poststratified” estimator to distinguish using strata in the sampling design (i.e., stratified sampling) from using strata in the estimator (i.e., poststratified estimation).

As a guide to popular practice, we surveyed articles published between 2005 and 2010 in two major journals, Remote Sensing of Environment and the International Journal of Remote Sensing, that have a track record of publishing land change articles (see Table S1, Supplemental material). The articles were categorized into four different classes in relation to the way accuracy and area estimation of land change were reported: 1) no accuracy measures presented, 2) accuracy measures presented but not for the accuracy of change (i.e., the accuracies of the maps produced at each date are provided but a direct assessment of change accuracy is not conducted), 3) accuracy measures for change presented, and 4) accuracy measures presented and accuracy information used for calculation of adjusted change area and/or confidence intervals. It should be evident that the four classes are ordered in terms of the rigor of the accuracy assessment, with the last class comprising articles that adhere most closely to good practice.

Our evaluation of the articles published in the two journals revealed that most land change studies do not take full advantage of the information available from accuracy assessment. Of the 57 publications identified as conducting a land change study (mainly deforestation), 8 did not publish any accuracy measures at all, 24 did not report an accuracy assessment of change (although accuracy of single date maps may have been provided), 16 reported an accuracy assessment of change, and 9 studies provided the accuracy measures and the information needed to compute the error-adjusted area estimate of change. Only three articles included this last step of presenting an
estimate of land change area that was adjusted according to the accuracy assessment data. So while the problems of estimating area from imperfect classifications and methods to address them have been discussed in the remote sensing literature for at least 30 years (e.g. Card, 1982), it appears that methods used and conclusions drawn in many studies fail to make full use of the available data.

A post-classification analysis to obtain change (described in Section 3) was used in 31 of the reviewed papers. Of these 31 articles, 5 did not report any accuracy assessment, 23 had accuracy assessments performed on the individual land cover classifications used for the overlay (i.e. single-date accuracy) but not for the final change map, and only 2 assessed the accuracy of the final change map. As demonstrated in Section 3, accuracy measures produced for the individual single date land cover maps may not be indicative of the accuracy of the change map. The overall accuracy of a change map constructed by overlaying two land cover classifications is the product of the overall accuracies of the pre-comparison maps if the classification errors are independent (Fuller, 2003; van Oort, 2007). However, the assumption of independent errors would generally be untenable because locations that were difficult to classify correctly at one date would likely be difficult to classify correctly at another date (Congalton, 1988). Thus reporting overall accuracy for both dates is unlikely to be informative of the overall accuracy of the post-classified change map.

The common practice of reporting error matrices demonstrates

In Section 3.1, the key information and computations needed to produce a complete and rigorous report of accuracy of a land change map and estimation of area are documented. Two examples are provided. The first example is presented to illustrate in detail the calculations needed to make use of the accuracy data for estimating area and associated confidence intervals (Section 3.2). The second example shows the analysis of accuracy and area when change is determined by overlaying land cover classifications for the two dates bracketing the change period, in this case classifications of forest and non-forest (Section 3.3), but the approach is applicable to any land change such as those highlighted in Section 1.

3.1. Estimating accuracy and area of change

Suppose the objective is to assess the accuracy of a map with q categories and to estimate the area of a particular map category. A sample of assessment units (e.g., pixels) is selected by simple random, stratified random (with the map classes as strata), or systematic sampling. A sample error matrix is constructed where the map categories (i = 1, 2, …, q) are represented by rows and the reference categories (j = 1, 2, …, q) by columns (Table 1). Note that in some presentations of an error matrix (e.g. Card, 1982), the rows and columns are reversed. The basic principles of the methods outlined below still apply to such situations but accommodation for the switch in row and column contents is required.

Table 1 illustrates the common practice of reporting the error matrix in terms of sample counts. A more informative presentation of the error matrix is in terms of the unbiased estimator of the proportion of area in cell i, j of the error matrix:

$$\hat{p}_{ij} = W_i \frac{n_{ij}}{n_{i\cdot}} \tag{1}$$

where the total area of the map is $A_{tot}$, the mapped area of category i is $A_{m,i}$ (subscript m denotes “mapped”), and the proportion of the area mapped as category i is $W_i = A_{m,i} \div A_{tot}$. The error matrix in terms of estimated area proportions is shown in Table 2. An advantage of the presentation given in Table 2 is that accuracy and area estimates can be computed directly from the error matrix.

Because of classification error, the mapped area proportions given by $A_{m,i} \div A_{tot}$ are usually biased when the objective is to estimate the true proportion of area of category i as determined from the reference classification. Instead of obtaining the area directly from the map classification, an area estimator can be based on the reference classification of each sample unit. The area proportions for each reference-defined category j are estimated from the column totals ($\hat{p}_{\cdot j}$) in Table 2. An unbiased estimator of the total area (based on the reference classification) of category j is then:

$$\hat{A}_j = A_{tot}\,\hat{p}_{\cdot j} \tag{2}$$

The estimated standard error of the estimated area proportion $\hat{p}_{\cdot j}$ is:

$$S(\hat{p}_{\cdot j}) = \sqrt{\sum_{i=1}^{q} W_i^2\, \frac{\frac{n_{ij}}{n_{i\cdot}}\left(1 - \frac{n_{ij}}{n_{i\cdot}}\right)}{n_{i\cdot} - 1}} \tag{3}$$

The standard error of the error-adjusted estimated area is

$$S(\hat{A}_j) = A_{tot}\,S(\hat{p}_{\cdot j}) \tag{4}$$

An approximate 95% confidence interval for $A_j$ is

$$\hat{A}_j \pm 2\,S(\hat{A}_j) \tag{5}$$

where the margin of error is defined as the z-score (z is a percentile from the standard normal distribution) multiplied by the standard error (i.e., the ± part of the confidence interval), and the value of the z-score depends on the confidence level (for 95% confidence, z = 1.96, which is approximated here to 2 for simplicity of presentation).

Table 1
Error matrix of sample counts, n_ij. Map categories are the rows while the reference categories are the columns.

Class   1      2      ⋯   q      Total
1       n11    n12    ⋯   n1q    n1⋅
2       n21    n22    ⋯   n2q    n2⋅
⋮       ⋮      ⋮      ⋱   ⋮      ⋮
q       nq1    nq2    ⋯   nqq    nq⋅
Total   n⋅1    n⋅2    ⋯   n⋅q    n
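Equations (1)–(5) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `stratified_area_estimates`, its argument names, and the two-class counts and areas in the usage lines are hypothetical. It assumes that every map class has at least two sample units and that the weights W_i are computed from the supplied mapped areas. The default z = 1.96 can be replaced by 2 to match the approximation used in Eq. (5).

```python
import math


def stratified_area_estimates(counts, map_areas, z=1.96):
    """Error-adjusted area estimates from an error matrix of sample counts.

    counts[i][j]: sample units mapped as class i whose reference class is j
    map_areas[i]: mapped area A_m,i of class i (any consistent area unit)
    Returns one dict per reference class j.
    """
    a_tot = sum(map_areas)
    weights = [a / a_tot for a in map_areas]      # W_i = A_m,i / A_tot
    row_totals = [sum(row) for row in counts]     # n_i.
    q = len(counts)

    results = []
    for j in range(q):
        p_col = 0.0      # column total of the Eq. (1) proportions
        variance = 0.0   # square of Eq. (3)
        for i in range(q):
            frac = counts[i][j] / row_totals[i]   # n_ij / n_i.
            p_col += weights[i] * frac
            variance += weights[i] ** 2 * frac * (1 - frac) / (row_totals[i] - 1)
        se_p = math.sqrt(variance)
        results.append({
            "proportion": p_col,
            "se_proportion": se_p,                # Eq. (3)
            "area": a_tot * p_col,                # Eq. (2)
            "se_area": a_tot * se_p,              # Eq. (4)
            "margin_of_error": z * a_tot * se_p,  # the +/- part of Eq. (5)
        })
    return results


# Hypothetical two-class error matrix and mapped areas, for illustration only.
example = stratified_area_estimates(
    counts=[[45, 5], [10, 140]],
    map_areas=[1000.0, 9000.0],
    z=2,
)
for j, est in enumerate(example, start=1):
    print(j, round(est["area"], 1), round(est["margin_of_error"], 1))
```

Because the estimated proportions sum to one across reference classes, the estimated areas always add up to the total map area, which gives a quick sanity check on any implementation.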
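The estimators of user's accuracy ($\hat{U}_i$), producer's accuracy ($\hat{P}_j$) and overall accuracy ($\hat{O}$) are:

$$\hat{U}_i = \frac{\hat{p}_{ii}}{\hat{p}_{i\cdot}} \tag{6}$$

$$\hat{P}_j = \frac{\hat{p}_{jj}}{\hat{p}_{\cdot j}} \tag{7}$$

$$\hat{O} = \sum_{j=1}^{q} \hat{p}_{jj} \tag{8}$$

In this example, we illustrate the computations for the stratified estimator of the proportion of land change area and the estimated 95% confidence interval for change area using the equations presented in Section 3.1. A Landsat image pair acquired for 1990 and 2000 covering Eastern Massachusetts, Rhode Island and parts of Connecticut (path

Table 3

Class   1     2     3     Total   Map area [ha]   Wi
1       97    0     3     100     22,353          0.013
2       3     279   18    300     1,122,543       0.640
3       2     1     97    100     610,228         0.348
Total   102   280   118   500     1,755,123       1

For class 1, the standard error of the estimated area proportion (Eq. (3)) is

$$S(\hat{p}_{\cdot 1}) = \sqrt{\sum_{i=1}^{q} W_i^2\, \frac{\frac{n_{i1}}{n_{i\cdot}}\left(1 - \frac{n_{i1}}{n_{i\cdot}}\right)}{n_{i\cdot} - 1}} = \sqrt{0.013^2\,\frac{\frac{97}{100}\left(1 - \frac{97}{100}\right)}{99} + 0.640^2\,\frac{\frac{3}{300}\left(1 - \frac{3}{300}\right)}{299} + 0.35^2\,\frac{\frac{2}{100}\left(1 - \frac{2}{100}\right)}{99}} = 0.00613 \tag{11}$$

The standard error of the adjusted area estimate is

$$S(\hat{A}_1) = A_{tot} \times S(\hat{p}_{\cdot 1}) = 1{,}755{,}123 \times 0.00613 = 10{,}751\ \text{ha} \tag{12}$$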
which gives a final land change area estimate with a margin of error (at approximate 95% confidence interval) of

$$\hat{A}_1 \pm 2\,S(\hat{A}_1) = 45{,}651 \pm 21{,}502\ \text{ha} \tag{13}$$

Table 4
Estimated error matrix based on Table 3 with cell entries expressed as the estimated proportion of area. Accuracy measures are presented with a 95% confidence interval. Map categories are the rows while the reference categories are the columns.

Class   1       2       3        Total   User's        Producer's    Overall
1       0.012   0       0.0004   0.013   0.97 ± 0.03   0.48 ± 0.23   0.94 ± 0.04
2       0.006   0.595   0.038    0.639   0.93 ± 0.03   0.99 ± 0.01
3       0.007   0.003   0.337    0.347   0.97 ± 0.03   0.88 ± 0.04
Total   0.026   0.598   0.376    1

with a mapped area of deforestation of 22,353 ha. The accuracy assessment based on the stratified estimators revealed that the change map had an overall accuracy of 94% and a user's accuracy of deforestation of 97%. But the low producer's accuracy of 48% for deforestation served as a warning that omission error associated with the deforestation class was problematic. Moreover, a naïve assessment of the producer's accuracy, based upon sample counts, would suggest that the classification was highly accurate (appearing as 95% but really 48%) and not indicate a potential concern. Using a simple set of equations and the information in the error matrix, we obtained an error-adjusted estimate of the area of deforestation (±95% confidence interval) of 45,651 ± 21,502 ha. The area of deforestation obtained with the stratified estimator confirms the need to adjust the map area of deforestation obtained from pixel counting to account for the large omission error of deforestation. In this example, the map area of deforestation is so severely underestimated that it is outside the 95% confidence interval obtained using the error-adjusted area estimator. Using the mapped area of 22,353 ha could have a substantial effect on applications that use this area as input (e.g., quantifying the amount of carbon released to the atmosphere arising from deforestation) compared to the results based on using the estimated adjusted area of 45,651 ± 21,502 ha (see Section 4).

Table 5

Class   1     2     Total   Wi     User's        Producer's    Overall
2005
1       166   10    176     0.30   0.94 ± 0.04   0.96 ± 0.04   0.97 ± 0.02
2       3     180   183     0.70   0.98 ± 0.02   0.98 ± 0.02
Total   169   190   359     1
2010
1       160   13    173     0.33   0.92 ± 0.04   0.99 ± 0.02   0.97 ± 0.02
2       1     189   190     0.67   0.99 ± 0.02   1.0 ± 0.02
Total   161   202   363     1
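The first example (Table 3, Eqs. (6)–(8) and (11)–(13)) can be retraced with the short Python sketch below. It is an illustration under stated assumptions, not the authors' code: all variable names are hypothetical, the weights W_i are computed from the unrounded map areas in Table 3, and z is taken as 2 as in Eq. (13). Results may therefore differ somewhat from printed figures that were derived from rounded intermediate values.

```python
import math

# Table 3: sample counts (rows = map class, columns = reference class).
counts = [
    [97, 0, 3],
    [3, 279, 18],
    [2, 1, 97],
]
map_area_ha = [22_353, 1_122_543, 610_228]
a_tot = 1_755_123  # total map area as used in Eq. (12)

q = len(counts)
n_row = [sum(row) for row in counts]                              # n_i.
n_col = [sum(counts[i][j] for i in range(q)) for j in range(q)]   # n_.j
w = [a / a_tot for a in map_area_ha]                              # W_i

# Eq. (1): estimated area proportion in each cell of the error matrix.
p = [[w[i] * counts[i][j] / n_row[i] for j in range(q)] for i in range(q)]
p_row = [sum(p[i]) for i in range(q)]
p_col = [sum(p[i][j] for i in range(q)) for j in range(q)]

users = [p[i][i] / p_row[i] for i in range(q)]       # Eq. (6)
producers = [p[j][j] / p_col[j] for j in range(q)]   # Eq. (7)
overall = sum(p[j][j] for j in range(q))             # Eq. (8)

# Naive producer's accuracy taken straight from the sample counts.
naive_producers = [counts[j][j] / n_col[j] for j in range(q)]

# Eq. (3) applied to class 1 (deforestation), as in Eq. (11).
se_p1 = math.sqrt(sum(
    w[i] ** 2
    * (counts[i][0] / n_row[i]) * (1 - counts[i][0] / n_row[i])
    / (n_row[i] - 1)
    for i in range(q)
))
area_1 = a_tot * p_col[0]   # Eq. (2)
se_area_1 = a_tot * se_p1   # Eq. (12)
margin_1 = 2 * se_area_1    # +/- part of Eq. (13)

print(f"user's accuracy, class 1:     {users[0]:.2f}")
print(f"producer's accuracy, class 1: {producers[0]:.2f}")
print(f"naive producer's, class 1:    {naive_producers[0]:.2f}")
print(f"overall accuracy:             {overall:.2f}")
print(f"adjusted area, class 1 [ha]:  {area_1:,.0f} +/- {margin_1:,.0f}")
```

The contrast between `producers[0]` and `naive_producers[0]` is the point made in the text: the raw sample counts hide the omission error because the small deforestation stratum is heavily oversampled relative to its share of the map.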
NE 10y s3
NE 5y s3
Error matrix for the 2005–2010 change map of Romania. Class 1 is deforestation, class 2
6 map = adjusted area
is stable forest and class 3 is stable non-forest. Map categories are the rows while the
Egypt 6y
NE 10y s4
reference categories are the columns. Accuracy measures are presented with a 95%
NE 10y s5
confidence interval and overall accuracy was 0.96 ± 0.01.
5
Class 1 2 3 Total Wi User's Producer'
Normalized area
NE 10y s2
1 127 66 54 247 0.007 0.51 ± 0.06 0.67 ± 0.32
Egypt 13 y
4
NE 10y s1
2 2 322 17 341 0.295 0.94 ± 0.02 0.93 ± 0.03
3 0 15 540 555 0.698 0.97 ± 0.01 0.98 ± 0.01
NE 5y s2
Total 129 403 611 1143 1 3
Romania 10y
NE 5y s1
Georgia
Romania 5y
NE 5y s5
2
NE 5y s4
resulting error-adjusted estimated change area (119,420 ha) is smaller
than the mapped area (154,159 ha).
Even though the two forest/non-forest maps were highly accurate with user's accuracies of about 95% (Table 5), the user's accuracy of the deforestation class in the change map was only 51% (Olofsson et al., 2011), indicating that the forest change obtained by post-classification was inaccurate.
The results of a post-classification comparison are often summarized by a “from–to” change matrix that provides the area of the different possible land cover transitions from one land cover category to another or transitions to one category from another. The example presented in this subsection involved a comparison of two land cover maps (i.e., two dates) of only two classes (forest and non-forest) so there are four possible “cells” in the change matrix (two no change “transitions” and two changes, forest to non-forest, and non-forest to forest). If instead land cover maps containing several classes and/or representing more than two dates are compared, a more elaborate “from–to” transition matrix would be required. For change in one time interval (two time points), the number of elements in the from–to change matrix is the square of the number of land cover classes. For example, overlaying two annual IGBP MODIS Land Cover Products (Friedl et al., 2010), with 17 classes mapped, would have produced a change matrix with a staggering 289 elements. To avoid the problems inherent in mapping change by overlaying two maps, the producers of the MODIS Land Cover Product strongly recommend against post-classification comparisons of their products for inferring change (Friedl et al., 2010, p. 177).
It should be noted, however, that situations might arise where post-classification approaches are necessary such as when a more current map is compared to a historical baseline map or when classification training data for the change classes are insufficient for a direct classification of change. Further, post-classification approaches for change detection do have the potential of producing accurate results (Singh, 1989).

3.4. Additional context

To provide additional exploration of the accuracy and area estimation issues, we identified from the literature several articles for which the data were available to conduct the complete analysis described in Section 1. A total of 15 accuracy assessments were extracted from these articles. In all cases, the accuracy assessment sampling design was stratified random with the strata defined by the map classes.
The first analysis of these examples examines the relationship between the error-adjusted estimate of area and the mapped area of land change (Fig. 1). To facilitate the comparison of different studies, the error-adjusted area estimates and associated 95% confidence intervals are reported relative to the mapped area of change (i.e., the error-adjusted area estimates are divided by the mapped area of change). Therefore, a scaled value of 1.0 in Fig. 1 would indicate that the mapped area and error-adjusted estimated area are equal whereas a value of 3 would indicate that the adjusted area estimate is three times the mapped extent. Similarly, if the confidence intervals (scaled by map area) include the value of 1.0, the mapped area of change falls within the confidence interval estimated from the error-adjusted area of change suggesting no significant difference.

Fig. 1. Error-adjusted estimated areas of change and associated 95% confidence intervals for a set of land change studies. The “NE” studies refer to the change maps of New England in Jeon et al. (in press); “10y” and “5y” refer to the study periods of 1990–2000 and 2000–2005 respectively; and “s1–5” refers to the five scenes that made up the study area.

Of the 15 assessments examined, 6 had a mapped area of change outside the 95% confidence interval based on the adjusted area estimate, indicating that the difference between the mapped area and the error-adjusted area was substantial. The difference between the mapped area and the estimated adjusted area varies greatly among the different studies, and the discrepancy will largely depend on the relative magnitudes of the area of omission error and the area of commission error.
The effect of incorrectly ignoring the stratified design when estimating producer's accuracy can be dramatic. As the variation in estimation weights among strata increases, the discrepancy between the stratified estimator of producer's accuracy and the incorrect (unweighted) estimator is likely to increase. For the accuracy assessments of the 15 change maps in Fig. 1, the average difference between the biased and unbiased producer's accuracies for the change class was 0.38 with a maximum difference of 0.64. Estimates of overall accuracy are less affected by using the biased estimator based on just the Table 1 sample counts; for the studies in Fig. 1, the difference between the biased and unbiased overall accuracies was less than 0.1 (Table S2, Supplemental material). Even if the unweighted sample count estimate is sometimes relatively close to the estimate obtained by the stratified estimator, good practice dictates that the stratified estimators (Eqs. 6–8) should be used as the sample had been acquired under a stratified design.

4. Propagating the uncertainty of land change area estimates: a sensitivity analysis of modeled terrestrial carbon flux

The effect of the uncertainty of land change area estimates on applications using these area estimates is illustrated using the results from three terrestrial carbon flux studies that use estimates of land change obtained by remote sensing (Olofsson et al., 2010, 2011). This effect is demonstrated via a simple sensitivity analysis in which the land change area input to a carbon flux model is varied to examine the effect on the output of the model. The land change area values used as input to the model were the error-adjusted estimated area and the upper and lower 95% confidence bounds of the change area from the error-adjusted approach (and the map change area in the case of Romania). The confidence bounds for area of change reflect uncertainty attributable to sampling variability, and this lone source of variation is incorporated in the carbon flux uncertainty analysis. Our intent is to provide a simple, easily conducted analysis to illustrate the potential impact of
variation in the estimated land change area on studies that use such information. More sophisticated error propagation techniques would be required to incorporate multiple sources of uncertainty.
In the first example (Fig. 2a), the area of deforestation for a portion of New England (USA) was estimated by direct classification for two time periods, 1990–2000 and 2000–2005 (Jeon et al., in press). The area estimates were converted to annual rates of change, and these rates were then input to a carbon book-keeping model (e.g., Houghton et al., 1983) to model the annual terrestrial carbon flux. The same methodology was applied in the second example based on Olofsson et al. (2010) in which the impact of land use change on the terrestrial carbon flux of the country of Georgia was investigated. For the Georgia study (Fig. 2b) the area of deforestation was estimated for a single time period, 1990–2000. For the third example (Fig. 2c), Olofsson et al. (2011) quantified the carbon implications of forest restitution in Romania following the collapse of the Soviet Union. Rates of forest harvest in Romania were estimated for 1990–2000 and 2005–2010. The estimated annual rates of forest change input to the terrestrial carbon model to generate Fig. 2 are listed in Table 7.
The modeled annual terrestrial carbon flux varies greatly because the uncertainty of the estimated area of deforestation used as input to the model is substantial. In Fig. 2a, using the lower confidence bound for the area of deforestation would result in New England being considered a carbon sink throughout the century whereas using the upper bound of deforestation results in a century-long carbon source. The confidence interval for the area of deforestation in the Georgia study is even broader and the resulting carbon fluxes (Fig. 2b) predicted by the model consequently vary greatly. Using the upper bound to predict carbon flux results in a sink of −0.09 Tg C in 2007 while the lower bound generates a sink of −0.46 Tg C. As reference, the anthropogenic emissions in Georgia were 1.6 Tg C in 2007 (UN, 2007) which translates into terrestrial carbon offsets of 6% and 28%, respectively (“carbon offset” in this context refers to the percentage of the anthropogenic carbon emissions that is offset by the terrestrial carbon sink). Using the upper confidence bound results in a substantial carbon source that starts in 2020 and reaches a maximum of almost 0.5 Tg/y in 2060, which is about a third relative to the anthropogenic emissions in 2007 and would result in a 33% increase in the total carbon emissions (anthropogenic and terrestrial) of Georgia. This result sharply conflicts with the result obtained using the lower confidence bound of deforestation area which leads to predicting a carbon sink that would persist until 2060 before leveling off at close to zero.
The modeled annual carbon flux of Romania (Fig. 2c) resulting from using different area estimates as model input varies less than the modeled fluxes observed for the New England and Georgia studies. The land change mapped in the Romania study was assumed to be forest harvest where the logged forest was assumed to regrow whereas in New England and Georgia, the mapped change was assumed to be permanent deforestation. As forests regrow following forest harvest, the impact on the carbon budget is less uniform though the margin of error of the 2005–2010 logging area is large (48% of the area). Carbon sequestered in the regrowing forest counterbalances the carbon released from the harvested wood. This is not the case in New England and Georgia.
These examples (Fig. 2) demonstrate that the uncertainty in land change area estimates has a substantial impact on the predicted terrestrial carbon budget. Using just the mapped area of land change in further estimation of important ecosystem variables such as terrestrial carbon flux may result in estimates that are very different from reality. The examples shown have illustrated situations in which both the magnitude and direction of the carbon flux may vary depending on which estimated values are used. Quantifying the uncertainty of land change area estimates is thus essential. The results of the carbon flux examples highlight the importance of an accuracy assessment to inform uncertainty analyses for applications that use land change area estimates as key inputs.

5. Conclusions

The mapped area of deforestation, forest harvest or any other land change is likely to be different from the actual area because of map classification error even if the change map has a high overall accuracy.
[Fig. 2 plots not reproduced. Three panels (a), (b) and (c); x-axis: "Year" (2000–2100); y-axis: "Carbon flux [Tg C/y]"; legend: Adjusted Area, Mapped Area, 95% CI.]

Fig. 2. The annual net flux of terrestrial carbon for (a) New England, (b) Georgia and (c) Romania associated with changes in forest harvest, deforestation, and forest expansion. A carbon book-keeping model was run using the error-adjusted estimated rates of deforestation ((c) also includes the mapped rate) including the lower and upper confidence bounds. All rates were estimated until 2005 (a), 2000 (b) and 2010 (c), and kept constant in the model until 2100.
Acknowledgment

This research was funded by USGS Award Support for SilvaCarbon and NASA award NNX11AJ79G to Boston University, and USGS Cooperative Agreement G12AC20221 to State University of New York. We thank the anonymous reviewers for the comments that helped improve the manuscript.

Table 7
Annual estimated rates of deforestation for New England (USA) and Georgia and forest harvest for Romania used to generate the terrestrial carbon flux output shown in Fig. 2. All area estimates in hectares.

Study           Mapped   Adjusted   Margin of error (95% CI)
NE 90-00        4235     10,219     ±5302
NE 00-05        6268     5427       ±1577
Georgia 90-00   1645     1669       ±1420
Romania 90-00   14,067   15,122     ±5397
Romania 05-10   30,832   23,884     ±11,510
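The Table 7 values are enough to sketch how the inputs to the sensitivity analysis could be formed: each study contributes its adjusted estimate plus the lower and upper 95% bounds (adjusted ∓ margin of error). The illustrative Python below also applies the Fig. 1 style of scaling by mapped area to these annual rates. It is not a reproduction of Fig. 1 or of the carbon model, and all function and variable names are assumptions made for this example.

```python
# Table 7 rows: (study, mapped, adjusted, margin of error at 95%), all in hectares.
TABLE_7 = [
    ("NE 90-00", 4235, 10219, 5302),
    ("NE 00-05", 6268, 5427, 1577),
    ("Georgia 90-00", 1645, 1669, 1420),
    ("Romania 90-00", 14067, 15122, 5397),
    ("Romania 05-10", 30832, 23884, 11510),
]


def scenarios(adjusted, margin):
    """Lower bound, estimate and upper bound used as alternative model inputs."""
    return adjusted - margin, adjusted, adjusted + margin


def normalized(mapped, adjusted, margin):
    """Scale the estimate and its bounds by the mapped area (1.0 = map equals estimate)."""
    return tuple(value / mapped for value in scenarios(adjusted, margin))


def mapped_within_ci(mapped, adjusted, margin):
    """True when the mapped area lies inside the 95% confidence interval."""
    lower, _, upper = scenarios(adjusted, margin)
    return lower <= mapped <= upper


results = {}
for study, mapped, adjusted, margin in TABLE_7:
    results[study] = {
        "inputs": scenarios(adjusted, margin),
        "scaled": normalized(mapped, adjusted, margin),
        "mapped_in_ci": mapped_within_ci(mapped, adjusted, margin),
        "relative_margin": margin / adjusted,
    }
    print(study, results[study])
```

In this sketch only the New England 1990–2000 mapped rate falls outside its confidence interval, and the relative margin for the Romania 2005–2010 rate is about 0.48, matching the 48% quoted in Section 4.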
Appendix A. Supplementary data
Liu, J., Tian, H., Liu, M., Zhuang, D., Melillo, J. M., & Zhang, Z. (2005). China's changing landscape during the 1990s: Large-scale land transformations estimated with satellite data. Geophysical Research Letters, 32, L02405.
Lizarazo, I. (2010). Fuzzy image regions for estimation of impervious surface areas. Remote Sensing Letters, 1(1), 19–27.
Lohr, S. (1999). Sampling: Design and analysis. Duxbury Press.
McCabe, M. F., Chylek, P., & Dubey, M. K. (2011). Detecting ice-sheet melt area over western Greenland using MODIS and AMSR-E data for the summer periods of 2002–2006. Remote Sensing Letters, 2(2), 117–126.
McManus, K. M., Morton, D. C., Masek, J. G., Wang, D., Sexton, J. O., Nagol, J. R., et al. (2012). Satellite-based evidence for shrub and graminoid tundra expansion in northern Quebec from 1986 to 2010. Global Change Biology, 18(7), 2313–2323.
McRoberts, R. E. (2010). Probability- and model-based approaches to inference for proportion forest using satellite imagery as ancillary data. Remote Sensing of Environment, 114, 1017–1025.
McRoberts, R. E. (2011). Satellite image-based maps: Scientific inference or pretty pictures? Remote Sensing of Environment, 115, 715–724.
Olofsson, P., Kuemmerle, T., Griffiths, P., Knorn, J., Baccini, A., Gancz, V., et al. (2011). Carbon implications of forest restitution in post-socialist Romania. Environmental Research Letters, 6, 045202.
Olofsson, P., Torchinava, P., Woodcock, C. E., Baccini, A., Houghton, R. A., Ozdogan, M., et al. (2010). Implications of land use change on the national terrestrial carbon budget of Georgia. Carbon Balance and Management, 5, 4.
Pontius, R. G., & Millones, M. (2011). Death to kappa: Birth of quantity disagreement and allocation disagreement for accuracy assessment. International Journal of Remote Sensing, 32, 4407–4429.
Potere, D., & Schneider, A. (2007). A critical look at representations of urban areas in global maps. GeoJournal, 69, 55–80.
Sala, O. E., Chapin, F. S., Armesto, J. J., Berlow, E., Bloomfield, J., Dirzo, R., et al. (2000). Global biodiversity scenarios for the year 2100. Science, 287, 1770–1774.
Särndal, C., Swensson, B., & Wretman, J. (1992). Model assisted survey sampling. Springer.
Schneider, A., & Woodcock, C. E. (2008). Compact, dispersed, fragmented, extensive? A comparison of urban growth in twenty-five global cities using remotely sensed data, pattern metrics and census information. Urban Studies, 45, 659–692.
Singh, A. (1989). Digital change detection techniques using remotely-sensed data. International Journal of Remote Sensing, 10, 989–1003.
Skole, D. L. (1994). Changes in land use and land cover: A global perspective. Data on global land-cover change: Acquisition, assessment and analysis (pp. 437–471). Cambridge: Cambridge University Press (Ch.).
Skole, D., & Tucker, C. (1993). Tropical deforestation and habitat fragmentation in the Amazon: Satellite data from 1978 to 1988. Science, 260(5116), 1905–1910.
Smith, L. C., Sheng, Y., MacDonald, G. M., & Hinzman, L. D. (2005). Disappearing Arctic lakes. Science, 308, 1429.
Stehman, S. V. (1997). Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment, 62, 77–89.
Stehman, S. V. (2000). Practical implications of design-based sampling inference for thematic map accuracy assessment. Remote Sensing of Environment, 72(1), 35–45.
Stehman, S. (2005). Comparing estimators of gross change derived from complete coverage mapping versus statistical sampling of remotely sensed data. Remote Sensing of Environment, 96, 466–474.
Stehman, S. V. (2009). Model-assisted estimation as a unifying framework for estimating the area of land cover and land-cover change from remote sensing. Remote Sensing of Environment, 113(11), 2455–2462.
Stehman, S. V. (2012). Impact of sample size allocation when using stratified random sampling to estimate accuracy and area of land-cover change. Remote Sensing Letters, 3(2), 111–120.
Stehman, S. V., & Czaplewski, R. L. (1998). Design and analysis of thematic map accuracy assessment: Fundamental principles. Remote Sensing of Environment, 64, 331–344.
Stehman, S., & Foody, G. M. (2009). The SAGE handbook of remote sensing. Accuracy assessment (pp. 129–145). New York: Sage Publications (Ch.).
Strahler, A. H., Boschetti, L., Foody, G. M., Friedl, M. A., Hansen, M. C., Herold, M., et al. (2006). Global land cover validation: Recommendations for evaluation and accuracy assessment of global land cover maps. EUR 22156 EN – DG. Luxembourg: Office for Official Publications of the European Communities (48 pp.).
Taylor, J., Waine, T., Juniper, G., Simms, D., & Brewer, T. (2010). Survey and monitoring of opium poppy and wheat in Afghanistan: 2003–2009. Remote Sensing Letters, 1(3), 179–185.
Townshend, J. R. G., Justice, C. O., Gurney, C., & McManus, J. (1992). The impact of misregistration on change detection. IEEE Transactions on Geoscience and Remote Sensing, 30, 1054–1060.
Turner, B. L., Lambin, E. F., & Reenberg, A. (2007). The emergence of land change science for global environmental change and sustainability. Proceedings of the National Academy of Sciences, 104(52), 20666–20671.
UN (2007). United Nations CO2 emissions estimates. Data from the UNSD millennium development goals indicators database (http://unstats.un.org/unsd/environment/air_co2_emissions.htm)
UN-REDD (2008). UN collaborative programme on reducing emissions from deforestation and forest degradation in developing countries (UN-REDD). FAO, UNDP, UNEP framework document.
van Oort, P. A. J. (2007). Interpreting the change detection error matrix. Remote Sensing of Environment, 108, 1–8.
Vitousek, P. M. (1994). Beyond global warming: Ecology and global change. Ecology, 75, 1861–1876.
Vivchar, A. (2011). Wildfires in Russia in 2000–2008: Estimates of burnt areas using the satellite MODIS MCD45 data. Remote Sensing Letters, 2(1), 81–90.
Wilkinson, G. G. (1996). Soft computing in remote sensing data analysis. Classification algorithms - Where next? (pp. 93–99). Singapore: World Scientific (Ch.).
Wilkinson, G. (2005). Results and implications of a study of fifteen years of satellite image classification experiments. IEEE Transactions on Geoscience and Remote Sensing, 43(3), 433–440.
Wolter, K. M. (2007). Introduction to variance estimation. New York: Springer.
Xian, G., & Crane, M. (2005). Assessments of urban growth in the Tampa Bay watershed using remote sensing data. Remote Sensing of Environment, 97(2), 203–215.