1 s2.0 0034425788900211 Main
PAUL J. CURRAN
Department of Geography, University of Sheffield, Western Bank, Sheffield, S10 2TN, United Kingdom
The Earth's surface and remotely sensed imagery contain spatial information that, if quantified, could be used to
optimize many sampling procedures in remote sensing. Until recently a suitable and simple technique for the spatial
characterisation of surfaces was not readily available. Now, thanks to the development of regionalized variable theory,
there is a near-ideal tool, the semivariogram. The semivariogram is a function that relates semivariance to sampling lag.
This function can be estimated using remotely sensed data or ground data and represented as a plot that gives a
picture of the spatial dependence of each point on its neighbor. This paper provides an introduction to the
semivariogram and indicates how it could be employed in remote sensing research.
several recent papers in the environmental sciences (e.g., Kitandis, 1983; Russo, 1984; Oliver and Webster, 1986a, b; 1987).

The key to the theory of regionalized variables is the semivariogram (Olea, 1977). This function relates semivariance to spatial separation and provides a concise and unbiased description of the scale and pattern of spatial variability. One of the main reasons for deriving a semivariogram is to use it in the process of estimation (Journel and Huijbregts, 1978). For example, following the collection of ground data for a particular property the semivariogram could be used to estimate the average value of the property within a region, or the semivariogram could be used to interpolate the value of the property at a place that has not been visited. The method of estimation embodied in regionalized variable theory is known, at least in the Earth sciences, as kriging (Krige, 1966). It is essentially a means of weighted local averaging in which the weights are chosen so as to give unbiased estimates. This is done while minimizing estimation variance and is in that sense optimal (Webster, 1985). The use of the semivariogram to krige an area is the solution to several remote sensing problems. For example, it could be used to derive optimal estimates of ground data for the huge areas covered by the ground resolution elements of NOAA AVHRR imagery. Kriging is a very large field of study in its own right and although some of the theory is introduced here, it is well beyond the scope of this paper.

The purpose of this introduction is to present the semivariogram to a remote sensing readership and show in a tutorial and, I hope, seminal way how it can be applied to remote sensing research.

Calculating the Semivariogram

Samples of both remotely sensed data (satellite or airborne MSS, digitized aerial photography, etc.) and ground data (biomass, soil moisture, etc.) can be usefully employed in the construction of semivariograms for remote sensing research. As the method of calculation is the same for both data sources, only the former will be illustrated here.

Imagine a transect running across a remotely sensed image where the digital number (DN) z of pixels x have been extracted at regular intervals z(x), where x = 1, 2, ..., n. The relation between a pair of pixels, h intervals apart (the lag distance), can be given by the average variance of the differences between all such pairs (Fig. 1). As the per-pixel variance is half this value (Yates, 1948), the semivariance $s^2$ for pixels at distance h apart is given by

$$ s^2 = \tfrac{1}{2}\,[z(x) - z(x+h)]^2. \tag{1} $$
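As an illustration of this calculation, the sketch below averages the per-pair semivariance of Eq. (1) over all pairs of pixels separated by each lag along a transect. It is a minimal example in Python and is not the program used for this paper; the function name `transect_semivariogram`, its arguments, and the DN values are hypothetical.

```python
def transect_semivariogram(z, max_lag):
    """Average semivariance for lags 1..max_lag along a regularly sampled transect.

    z is a sequence of DN values extracted at regular intervals.
    Returns a list of (lag, number_of_pairs, average_semivariance) tuples.
    """
    n = len(z)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must lie between 1 and len(z) - 1")
    result = []
    for h in range(1, max_lag + 1):
        # All pairs of pixels that are h intervals apart.
        pairs = [(z[i], z[i + h]) for i in range(n - h)]
        m = len(pairs)
        # Eq. (1) for each pair, averaged over the m pairs at this lag.
        semivariance = sum((a - b) ** 2 for a, b in pairs) / (2 * m)
        result.append((h, m, semivariance))
    return result


# Hypothetical DN values along a short transect of eight pixels.
dn = [10, 12, 11, 15, 14, 18, 17, 21]
semivariogram = transect_semivariogram(dn, max_lag=3)
for lag, pairs, value in semivariogram:
    print(lag, pairs, round(value, 3))
```

For this short transect there are 7, 6, and 5 pairs at lags of 1, 2, and 3 pixels, respectively, so each longer lag is estimated from fewer pairs.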
FIGURE 1. Lags along a transect of pixels. Lags of 1, 2, and 3 pixels are illustrated.
SEMIVARIOGRAMS IN REMOTE SENSING 495
Within the transect there will be m pairs of observations separated by the same lag. Their average is given by

$$ \bar{s}^2 = \frac{1}{2m} \sum_{i=1}^{m} [z(x_i) - z(x_i + h)]^2. \tag{2} $$

$\bar{s}^2$ is an unbiased estimate of the average semivariance γ(h) in the population (Webster, 1985) and is a useful measure of dissimilarity between spatially separate pixels (Woodcock and Strahler, 1983; Jupp et al., 1988). The larger $\bar{s}^2$ and therefore γ(h), the less similar are the pixels. The semivariogram is the function that relates semivariance to lag and its estimated form can be similar to that shown in Fig. 2(a). To describe such a relationship, the six terms of support, lag (h), sill (s), range (a), nugget variance (C0), and spatially dependent structural variance (C), are used; these are illustrated in Fig. 2 and defined in Table 1.

The semivariogram in Fig. 2 is derived from airborne multispectral scanner imagery of South Yorkshire, UK. The imagery, which had a nominal spatial resolution of 2 m, has been reported in full elsewhere (Curran and Williamson, 1988). For purposes of illustration a transect approximately 100 pixels (= 200 m) long was orientated randomly along an image scan line, as shown in Fig. 2(b).

Interpreting the Semivariogram

The three components of interpretation are the choice of a usable transect length, description, and modeling.

Choice of usable transect length

When constructing a semivariogram, the length of the lag determines the number of those lags in any transect: the longer the lag, the fewer there are of them (Fig. 1). As a result the confidence that can be ascribed to γ(h) decreases with increasing h. Webster (1985) considers that it is inadvisable to interpret lags longer than a fifth to a third of the transect length, and for reasons of caution only lags up to a fifth of the transect length are plotted in this paper (Figs. 2-4).

Describing the semivariogram

Most of the semivariograms in the geostatistical literature are bounded by a sill, as in Fig. 2. In urban/agricultural landscapes semivariograms are often of man-modified surfaces with a repetitive spatial pattern, and as a result this "classic" semivariogram (Fig. 3) is relatively unusual. The two very common forms are the "periodic" semivariogram (Fig. 3) recorded across a repetitive pattern and the "aspatial" semivariogram (Fig. 3) recorded either along such a repetitive pattern, randomly on a homogeneous surface, or when using a support that is larger than the range. For further details refer to Olea (1977) and Webster (1982). There are many permutations of these three basic forms, by far the most common being the "periodic-classic," the "unbounded" and the "multifrequency." The periodic-classic semivariogram (Fig. 3) often represents a repetitive pattern on a spatially variable surface. The unbounded semivariogram (Fig. 3) represents a surface with a definite trend on which the range is not reached within the usable transect length. The multifrequency semivariogram (Fig. 3) represents an area in which there are two or more repetitive patterns.

The examples used to illustrate these six forms of semivariogram are derived
FIGURE 2. The semivariogram (a) was derived from airborne multispectral scanner data of the bare field (b). A curve to guide the eye and six descriptors have been added to the semivariogram (Table 1). [Panel (a) plots γ(h) against lag (m) and is annotated with the support, sill (s), range (a), nugget variance (C0), and spatially dependent structural variance (C). Location: Bawtry, South Yorkshire; Date: 30.9.85; Waveband: 0.76-0.90 µm; Transect: random.]
TABLE 1 The Terms and Symbols Used in the Description of the Semivariogram

TERM                  SYMBOL  DEFINITION
Support                       Area and shape of surface represented by each sample point.
Lag                   h       Distance (and direction in two or more dimensions) between
                              sampling pairs.
Sill                  s       Maximum level of γ(h).
Range                 a       Point on h axis where γ(h) reaches maximum. In sample data,
                              where γ(h) reaches approximately 95% of the sill. Places
                              closer than the range are related, places further apart are not.
Nugget variance       C0      Point where extrapolated relationship γ(h)/h intercepts the
                              γ(h) axis. Represents spatially independent variance.
Spatially dependent   C       Sill minus nugget variance.
structural variance
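The definitions in Table 1 can be applied directly to a sampled semivariogram. The Python sketch below is one simple way of doing so and should be read as an illustration rather than a recommended procedure. It assumes that the semivariogram is bounded, takes the range as the first lag at which γ(h) reaches 95% of the sill, and, as a crude assumption, extrapolates a straight line through the first two points to estimate the nugget variance. The function name `describe_semivariogram` and the semivariance values are hypothetical.

```python
def describe_semivariogram(lags, gamma):
    """Estimate the Table 1 descriptors from a sampled, bounded semivariogram.

    lags and gamma are equal-length sequences ordered by increasing lag.
    """
    if len(lags) != len(gamma) or len(lags) < 2:
        raise ValueError("need at least two (lag, semivariance) points")
    # Sill: maximum level of the semivariance.
    sill = max(gamma)
    # Range: first lag at which the semivariance reaches about 95% of the sill.
    range_lag = next(h for h, g in zip(lags, gamma) if g >= 0.95 * sill)
    # Nugget: straight-line extrapolation of the first two points to h = 0
    # (an assumption made for this sketch), never allowed to fall below zero.
    slope = (gamma[1] - gamma[0]) / (lags[1] - lags[0])
    nugget = max(0.0, gamma[0] - slope * lags[0])
    return {
        "sill": sill,
        "range": range_lag,
        "nugget": nugget,
        "structural": sill - nugget,  # sill minus nugget variance
    }


# Hypothetical semivariances at lags of 2 to 14 m.
lags_m = [2, 4, 6, 8, 10, 12, 14]
gamma_h = [0.8, 1.2, 1.6, 1.95, 2.0, 2.0, 1.98]
descriptors = describe_semivariogram(lags_m, gamma_h)
print(descriptors)
```

Fitting a model to the shortest lags would normally give a more defensible nugget variance than a two-point extrapolation; the straight line is used here only to keep the example short.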
FIGURE 3. Six semivariograms calculated using remotely sensed data. These illustrate the forms typical of man-modified surfaces. The classic, periodic, and aspatial are the basic forms and the periodic-classic, unbounded and multifrequency are permutations of the basic forms. [Each panel plots semivariance against lag (m) for lags of 0 to 40 m.]
(a) Classic. Surface: semi-natural grassland; Date: 17.9.82; Waveband: 0.605-0.625 µm; Transect: random.
(b) Periodic. Date: 30.9.85.
(c) Aspatial. Location: Bawtry, South Yorkshire; Date: 30.9.85; Waveband: 1.55-1.75 µm.
(d) Periodic-classic. Location: Bawtry, South Yorkshire; Date: 30.9.85; Waveband: 0.63-0.69 µm; Transect: random.
(e) Unbounded. Location: Holderness, South Humberside; Date: 16.6.84; Waveband: 0.52-0.60 µm; Transect: orthogonal to coastline.
(f) Multifrequency. Location: Holderness, South Humberside; Date: 16.6.84.
FIGURE 4. Semivariograms of a sloping agricultural field with soil banding. Three wavebands of remotely sensed data have been sampled across the soil bands and slope and down the soil bands and slope. [Panels plot semivariance against lag (m) for lags of 0 to 40 m. Wavebands: 0.63-0.69 µm, 0.76-0.90 µm, and 2.08-2.35 µm; Surface: agricultural grassland, sloping with soil banding; Date: 30.9.85.]
The orientation of the sampling transects is important on anisotropic surfaces that have a repetitive pattern. For instance, a transect of biomass or near infrared reflectance recorded across rows of potatoes will probably yield a periodic semivariogram whereas a similar transect recorded down these rows will almost certainly not.

The effect of wavelength and transect orientation on semivariograms constructed using remotely sensed data is illustrated using airborne multispectral scanner imagery of South Yorkshire, U.K. The imagery had a nominal spatial resolution of 2 m and has been reported in full elsewhere (Curran and Williamson, 1986). The agricultural grassland field for which semivariograms are illustrated (Fig. 4) is sloping and has banding of soil texture. The slope results in increased soil moisture and possibly vegetation vigor at one end of the field, and the banding of soil texture results in soil moisture and possibly vegetation vigor differences with a repetitive spacing of approximately 22 m. The two 100-pixel-long transects were orientated to run across and as far as possible down both the slope and the soil bands in red (0.63-0.69 µm), near infrared (0.76-0.90 µm), and middle infrared (2.08-2.35 µm) wavebands.

The red image had a clear trend down the slope, but there was no evidence of either a trend across the slope or banding. The resultant semivariograms had unbounded variance both down and for some unknown reason across the slope, and no periodic pattern was evident (Fig. 4). On the near infrared and middle infrared images, there was no evidence of a trend either down or across the slope, but the banding was very clear. The downslope variability had a range of 15 m in the near infrared waveband and 35 m in the middle infrared waveband. For both wavebands the across-slope transect resulted in a periodic semivariogram (Fig. 4). It would be possible to plot these semivariograms in two dimensions (Webster, 1985). For example, a two-dimensional semivariogram for data from the middle infrared waveband (Fig. 4) would comprise parallel ridges running orthogonal to a slope.

Modeling the semivariogram

If the semivariogram is to be used in further calculation, for example, in the derivation of sample size for the collection of ground data, then the function of γ(h) for the shortest lags must be modeled from the sample data (Webster, 1985). Many models are available, but only a few families are suitable for continuous variables (McBratney and Webster, 1986). The most acceptable are the spherical, exponential, linear, or combinations of two or three of these. The linear and spherical models seem to be most appropriate (Fig. 3) and are discussed in detail by Armstrong (1984), Webster (1985), and McBratney and Webster (1986).

Applying the Semivariogram

The semivariogram can be calculated for any scale of spatial variation (Oliver and Webster, 1986a). As much of the Earth's surface is divided into parcels, it is useful, in the context of this paper, to identify the intraparcel scale of study (e.g., crop disease survey), the interparcel scale of study (e.g., land cover classification), and parcel-independent scale of study (e.g., continent wide vegetation surveys). Recently workers, building on
the work of Webster (Webster, 1973, 1978; Webster and Cunalo, 1975), have used semivariograms to describe remotely sensed surfaces at the interparcel scale (Lacaze et al., 1985; Milton, 1986, personal communication) and ground data at the parcel-independent scale of study (Dancy et al., 1986). The former are useful for the separation of spectrally similar surfaces, and the latter provide a basis on which ground data, measured at points, can be extrapolated over the ground resolution elements of satellite sensors.

At all scales of study, the semivariogram can be used to aid the choice of sample units and sample numbers for remotely sensed and ground data. In this introduction the semivariogram, at the intra- and interparcel scale of study, will be used to illustrate the selection of a sampling unit for remotely sensed imagery (spatial resolution) and sample numbers for ground data collection.

Use of a semivariogram derived using remotely sensed data for the selection of a spatial resolution

Remotely sensed data are available at a wide range of spatial resolutions and yet for reasons of caution there is a tendency to choose the finest (Townshend, 1981). This is often undesirable for the operational applications of remote sensing, as data with a high spatial resolution are expensive to both purchase and process. For operational applications, spatial resolution should not be finer than is necessary for the task at hand (Curran and Williamson, 1988). To choose such a spatial resolution (1, 5, 20 m, etc.), information is required on the spatial characteristics of the surface under investigation (Woodcock, 1985; Woodcock and Strahler, 1984; 1985; 1987) and such information is sadly lacking:

"Surprisingly, earth scientists have as yet largely failed to provide comprehensive quantitative data in a form which is comparable with deriving objective estimates of resolution requirements" (Townshend, 1981, p. 50).

For example, to choose a spatial resolution suitable for deriving per-field values of radiance, quantitative data are required on the size and internal spatial variability of the fields (Curran and Williamson, 1988). The size of the fields, which can be obtained from maps or remotely sensed imagery, determines the number of ground resolution elements that will fit into them (Curran and Williamson, 1988). The internal spatial variability of the fields determines how small a ground resolution element can be before it detects unnecessary intraparcel variability.

It is for this latter task that the semivariogram is ideal as its range (Fig. 1) defines the distance above which ground resolution elements are not related. A ground resolution element larger than the range will probably average the spatial variability within a field and will, therefore, be a suitable ambassador for that field. By contrast, a ground resolution element smaller than the range will probably be so different to its neighbors that it would be a most unsuitable ambassador for that field (Woodcock and Strahler, 1987). If semivariograms for all wavebands and directions are available, then defining a minimum range and therefore minimum spatial resolution is straightforward, providing the semivariogram is neither aspatial nor unbounded (Fig. 3).
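The selection of a minimum range from several semivariograms can be sketched in a few lines of Python. The example is illustrative only: the function name `minimum_range`, the waveband and direction labels, and the range values are hypothetical, and a range of `None` is used here as a convention to mark a semivariogram that is aspatial or unbounded and therefore has no usable range.

```python
def minimum_range(ranges):
    """Return the (key, range) pair with the smallest usable range.

    ranges maps a (waveband, direction) label to a range in metres, or to
    None when that semivariogram is aspatial or unbounded.
    """
    usable = {key: value for key, value in ranges.items() if value is not None}
    if not usable:
        raise ValueError("no bounded semivariogram is available")
    key = min(usable, key=usable.get)
    return key, usable[key]


# Hypothetical ranges (m) for two wavebands sampled in two directions.
ranges_m = {
    ("band A", "along"): 24.0,
    ("band A", "across"): 18.0,
    ("band B", "along"): None,   # unbounded, so no range can be read off
    ("band B", "across"): 30.0,
}
label, smallest = minimum_range(ranges_m)
print(label, smallest)
```

Excluding the unbounded case rather than treating it as a very large range is a design choice of this sketch; it keeps a semivariogram without a defined range from influencing the result.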
In practice a user can usually have access to semivariograms in several directions and wavebands. In such a case, i.e., where a ground resolution element is required to represent a field, then the minimum range should be used to define the minimum spatial resolution, and for the agricultural grassland in Fig. 4 this would be around 11 m. When only one semivariogram is available, then the link between the range and the minimum spatial resolution, while useful, is tentative (e.g., Soares et al., 1987). For example, the minimum spatial resolution would probably be around 10, 12, 12, 20 and 26 m, respectively, for the sea, semi-natural grassland, coniferous woodland, bare soil, and deciduous woodland of Figs. 2 and 3. By contrast, if the aim were to classify these fields using a measure of texture, then these would be maximum values of spatial resolution.

Use of a semivariogram derived using ground data for the design of a ground sampling scheme

Accurate ground data are required to calibrate remotely sensed imagery or to check the accuracy of estimates made using remotely sensed imagery. Due to the difficulties associated with sampling large areas in short time periods, the error in ground data is usually high (Curran and Hay, 1986). This is a pressing and serious problem in remote sensing research as it applies across a wide range of variables, for example, green leaf area index (Curran and Williamson, 1987), density of housing (Curran and Hobson, 1987) and the concentration of materials suspended in water (Curran, 1987). Ideally these should be measured using many samples per parcel, the number depending on how close the sample mean is to approach the population mean and on the spatial variability of the parcel. The first factor, the level of acceptable error, is a matter of judgement, and the second is inherent in the region and can be determined by means of a pilot study (Marsh and Lyon, 1980; Curran and Williamson, 1985; 1986).

The approach taken by most investigators to such a pilot study is to measure the relevant ground data at a number of random points, calculate the standard deviation of the sample $\sigma_s$, look up Student's t value for a two-tailed test with n - 1 degrees of freedom at the 95% confidence level, select an acceptable error e of the mean, in the same units as the standard deviation, and use these three values to derive n:

$$ n = (\sigma_s t / e)^2. \tag{3} $$

This procedure assumes that the variable of interest is spatially independent, i.e., variance is all nugget. With such an assumption, values of n for quite modest levels of error can be very large. For example, Curran and Williamson (1986) report that, to record the green leaf area index of a particular grassland field with a 5% error of the mean (at the 95% confidence level), 293 random points would be needed. Where there is spatial dependence, however, the number of sample points required to achieve a given level of confidence can be much less (Webster and Burgess, 1984). Maximum advantage of this spatial dependence can be obtained by using a systematic sampling grid as it ensures that neighboring sample points are as far from one another
as is practical for a fixed sample size and area. As such, it minimizes the duplication of "information" that occurs in random sampling, where some sample points are inevitably close. However, the sample grid must be placed in such a way that it does not coincide with any repetitive pattern on the ground (Dixon and Leach, 1978). The procedure is to choose a grid spacing and, using this spacing as the lag, read off the relevant semivariance γ(h) from the ground data semivariogram (usually in both directions). These values can be used to calculate the variance per unit area. This variance is called the kriging variance (Krige, 1966) $\sigma_K^2$, and can be derived in relation to the mean value (e.g., biomass) of the area z(A) using

$$ \sigma_K^2 = 2 \sum_{i=1}^{n} \lambda_i\, \bar{\gamma}(x_i, A) - \sum_{i=1}^{n} \sum_{j=1}^{n} \lambda_i \lambda_j\, \gamma(x_i, x_j) - \sigma_A^2, \tag{4} $$

where $\lambda_i$ and $\lambda_j$ are weights chosen to minimize $\sigma_K^2$, which is the true estimation variance, $\sigma_A^2$ is the variance of the area, and the expression $\bar{\gamma}(x_i, A)$ is the mean semivariance between the sampling point $x_i$ and the area (Burgess et al., 1981; McBratney et al., 1981; McBratney and Webster, 1983b).

The minimum kriging variances, or their square roots, the standard errors, can be plotted against a number of grid spacings. (For some excellent examples, see Webster and Nortcliff, 1984.) From this graph a sample size (derived from the number of grid squares that would fit into the area) can be chosen for an acceptable level of error (Atkinson, 1987) (for full discussion see Webster and Burgess, 1984). Such a strategy enables the link between sampling error and sample size to be developed in spatial context; as such, it is ideal for use on ground or remotely sensed data.

The increase in precision (for a given sampling effort) of systematic as opposed to random sampling is dependent upon the proportion of nugget to spatially dependent structural variance (Fig. 2). Systematic, as opposed to random, sampling will offer little increase in precision if nearly all the variance is nugget [Fig. 3(c)] and large increases in precision if nearly all the variance is spatially dependent [Fig. 3(a)]. Typically, for surfaces with a semivariogram similar to Fig. 2, the increases in precision offered by systematic over random sampling have been in the range of two to ninefold (McBratney and Webster, 1983b; Webster and Burgess, 1984; Webster and Nortcliff, 1984). If these are translated into a decrease in sampling effort, then the necessity for the impossibly large sample sizes that were so lamented by Curran and Williamson (1985; 1986) will be a thing of the past.

The author wishes to acknowledge the financial support of the Natural Environment Research Council under Research Grant GR3/5096 and their sponsorship of airborne Multispectral Scanner campaigns from 1982 to 1986 inclusive. Thanks are also due to Peter Bragg and Colin Bell for computer programming; Mark Danson for undertaking the computing; Dr. Dick Webster, Peter Atkinson, Mark Danson, and Dr. Curtis Woodcock for comments on the manuscript and Dr. Dick Webster, who encouraged me to make friends with the semivariogram.
Jupp, D. L. B., Strahler, A. H., and Woodcock, C. E. (1988), Autocorrelation and regularization in digital images. I. Basic theory, IEEE Geosciences and Remote Sensing (forthcoming).

Kitandis, D. K. (1983), Statistical estimation of polynomial generalized covariance functions and hydrological applications, Water Resources Res. 19:909-921.

Krige, D. G. (1966), Two dimensional weighted moving average trend surfaces for ore observation, J. S. African Inst. Mining and Metall. 66:13-38.

Lacaze, B., Lahraoui, L., Debussache, G., and Khelta, A. (1985), Analysis de measures radiometriques de simulation SPOT en milieux Mediterraneens aride et sub-humide, Proceedings of the 3rd International Colloquium on Spectral Signatures of Objects in Remote Sensing, Les Arcs, France, 16-20 December, ESA SP-247 Noordwijk, pp. 425-428.

Lam, N. S. N. (1983), Spatial interpolation methods: a review, Am. Cartographer 10:129-149.

Marsh, S. E., and Lyon, R. J. P. (1980), Quantitative relationships of near surface spectra to Landsat radiometric data, Remote Sens. Environ. 10:241-261.

Matheron, G. (1965), Les Variables Régionalisées et leur Estimation, Masson, Paris.

Matheron, G. (1971), The Theory of Regionalized Variables and Its Applications, Cahiers Centre de Morphologie Mathematique No. 5, Fontainebleau.

McBratney, A. B., and Webster, R. (1983a), Optimal interpolation and isarithmic mapping of soil properties. V. Co-regionalization and multiple sampling strategy, J. Soil Sci. 34:137-162.

McBratney, A. B., and Webster, R. (1983b), How many observations are needed for regional estimation of soil properties, Soil Sci. 135:177-183.

McBratney, A. B., and Webster, R. (1986), Choosing functions for semi-variograms of soil properties and fitting them to sampling estimates, J. Soil Sci. 37:617-639.

McBratney, A. B., Webster, R., and Burgess, T. M. (1981), The design of optimal sampling schemes for local estimation and mapping of regionalized variables. I. Theory and method, Comput. Geosci. 71:331-334.

McCullagh, M. J. (1975), Estimation by kriging of the reliability of the proposed Trent telemetry network, Comput. Appl. 21:357-374.

Mousset-Jones, P. F., Ed. (1980), Geostatistics, McGraw-Hill, New York.

Olea, R. A. (1977), Measuring Spatial Dependence with Semi-Variograms, Kansas Geological Survey, Campus West, Lawrence, KS.

Oliver, M. A., and Webster, R. (1986a), Combining nested and linear sampling for determining the scale and form of spatial variation of regionalized variables, Geogr. Anal. 18:227-242.

Oliver, M. A., and Webster, R. (1986b), Semi-variograms for modelling the spatial pattern of landform and soil properties, Earth Surface Processes Landforms 11:491-504.

Oliver, M. A., and Webster, R. (1987), The elucidation of soil pattern in the Wyre Forest of the West Midlands, England. II. Spatial distribution, J. Soil Sci. 38:293-307.

Richards, J. A. (1986), Remote Sensing Digital Image Analysis. An Introduction, Springer-Verlag, Berlin.

Ripley, B. D. (1981), Spatial Statistics, Wiley, New York.

Russo, D. (1984), Statistical analysis of crop yield-soil water relationships on heterogeneous soils under trickle irrigation, Soil Sci. Soc. Am. J. 48:1402-1410.

Soares, J. V., Bernard, R., and Vidal-Madjar, D. (1987), Spatial and temporal behaviour