Abrahamsen (2005)
Abrahamsen (2005)
Abrahamsen (2005)
PETTER ABRAHAMSEN
Norwegian Computing Center, PO Box 114 Blindern, NO-0314 Oslo, Norway
1 Introduction
601
The stochastic model for the thickness of layer i may include a deterministic
trend, mi (x), and a zero mean Gaussian random field, i (x), for the residual error
(Abrahamsen, 1993):
It is not obvious which alternative to choose. Since the seismic reflectors are
assumed accurately determined, Figure 1 suggests that subtracting from Base
Reservoir could be a better choice. Similar reasoning suggest that obtaining Top
Layer 2 by adding Layer 3 to Top Reservoir is a good choice. However, these choices
leaves a ‘gap’ between the two subsurfaces so the trend, mL2 (x), for the thickness
of Layer 2 is never considered. Moreover, this choice has a serious implication on
COMBINING METHODS FOR SUBSURFACE PREDICTION 603
the uncertainty of the thickness of Layer 2: Assuming the residual errors, i (x), to
be independent implies that the variance of the thickness of Layer 2 is
Var Z TL2 (x) − Z TL1 (x) = Var ∆ZL3 (x) + Var ∆ZR (x) + Var ∆ZL1 (x) .
This variance is usually significantly larger than Var ∆ZL2 (x) and the possible
strong correlation between the depth to Top Layer 1 and Top Layer 2 is lost.
The discussion has motivated the need for an approach where several methods
can be used simultaneously, so the unpleasant need for choosing one particular
method becomes obsolete.
Figure 2 shows four graphs, each corresponding to a method for constructing
all subsurfaces in Figure 1. The depth to a particular subsurface is found by
following the arrows to the subsurface; an arrow pointing downwards means that
the corresponding thickness is added whereas an arrow pointing upwards means
that the corresponding thickness is subtracted. Although some graphs give the
same result for a particular subsurface, the dependencies between the subsurfaces
are different in all four graphs. This has an implication on the predictors for each
subsurface (Abrahamsen, 1993). Thus, each graph in Figure 2 corresponds to a
method for prediction of the set of subsurfaces so there are actually four different
predictors for each of the four subsurfaces.
Figure 3 illustrates the methods for constructing subsurfaces slightly differently.
Whereas, Figure 2 gives a method for all subsurfaces in each graph, Figure 3 shows
the different methods for each subsurface. Each graph in Figure 3 are labelled
using the labels in Figure 2. Note that although Figure 2 contains four graphs
(or methods), there is only one or two possible ways of adding layers to obtain a
particular subsurface.
604 P. ABRAHAMSEN
A stochastic model with a linear trend for the thickness of layer i reads
4 Choice of Predictor
The best linear unbiased predictor for a random field with an unknown linear
trend is the universal kriging predictor. All subsurfaces in a multilayer model are
statistically dependent (covariates) since they all depend on the thickness of at
least one common layer. So the kriging predictor for any subsurface should be
conditioned on available depth observations from all correlated subsurfaces using
universal cokriging (Abrahamsen, 1993).
The kriging predictors depend on the geometry of the observations, the choice
of linear trends for the layer thicknesses, and the statistical properties of the resid-
uals. So alternative methods, such as those illustrated in Figure 2, give different
predictions for the same set of observations.
COMBINING METHODS FOR SUBSURFACE PREDICTION 605
Universal kriging was used by Hwang and McCorkindale (1994) and cokriging
by Jeffery, Stewart and Alexander (1996) for predicting velocity fields for depth
conversion. Xu et al. (1992) finds that universal kriging and collocated cokriging
give similar results for depth conversion. Using universal kriging however, gives
the possibility of using non-linear relationships between depth and travel time.
This approach is an adaption of a method used in time series analysis and fore-
casting and is reviewed by Bunn (1989) and Granger (1989). The idea is to make
a linear combination of alternative predictors.
For a subsurface l in Figure 1 there are four possible predictors corresponding
∗l ∗l ∗l
to the four different graphs or methods in Figure 2: Z(a) (x), Z(b) (x), Z(c) (x), and
∗l
Z(d) (x). A linear combination of these is a possible combined predictor:
Z ∗l (x) = w∗l ∗l ∗l ∗l ∗l ∗l ∗l ∗l
(a) (x)Z(a) (x) + w (b) (x)Z(b) (x) + w (c) (x)Z(c) (x) + w (d) (x)Z(d) (x) (2)
∗l ∗l
=w (x) Z (x).
∗l
Assume
that each predictor is unbiased and that the covariance matrix, Cab (x) =
Cov Za∗l (x)−Zal (x), Zb∗l (x)−Zbl (x) , of the predictors is known. Then, an unbiased
predictor with the minimum prediction variance is obtained using weights
−1
−1
w∗l (x) = C ∗l (x) e/ e C ∗l (x) e , (3)
where e is a vector of unit entries. This result is analogous to the weights obtained
in ordinary kriging.
To form C ∗ (x) requires the kriging prediction variances and even the prediction
covariances between all predictors at any location x. Thus, the drawback of this
method is the necessity to evaluate several predictors, prediction variances, and
prediction covariances for every subsurface.
This new approach propose that the alternative stochastic models for the depth
to a particular subsurface should be combined according to the magnitude of the
residual error for each model. A linear combination of the alternative stochastic
models is considered.
There are two different methods and stochastic models for the depth to Top
Layer 1 in Figure 1 according to Figure 3. A linear combination reads
Z TL1 (x) = w(a,b)
TL1 TL1
(x) Z(a,b) TL1
(x) + w(c,d) TL1
(x) Z(c,d) (x). (4)
TL1 TL1
The weights w(a,b) (x) and w(c,d) (x) are chosen to minimise the residual error
TL1
variance of Z (x) subject to the condition that the weights add to one:
−1
−1
wl (x) = Cl (x) e/ e Cl (x) e , (5)
606 P. ABRAHAMSEN
l
where the elements of the covariance matrix, Cab (x) = Cov Zal (x), Zbl (x) , are
calculated using
l l
m
Cov Za (x), Zb (y) = Cov εa (x), εb (y)
m
= ab
si Cov i (x), i (y) , (6)
common
i∈ intervals
where sabi = −1 when interval i is added in one model and subtracted l in the
other. Otherwise, sab
i = 1. The combined residual error variance is Var Z (x) =
−1
[e Cl (x) e]−1 which is less than or equal to Var Zal (x) for any method a.
Similar combinations must be constructed for all the subsurfaces. It is then
straightforward — but requires some bookkeeping — to calculate covariances
between depth observations from different subsurfaces. This leaves one stochastic
model and a single associated predictor for the depth to any of the subsurfaces.
Combining predictors is based on the principle of minimising the prediction
error. Combining stochastic models however, is a heuristic approach which must
be justified by comparing the results to the results obtained when combining
predictors. An example will illustrate that the two approaches give almost identical
results. The advantage of the latter approach is speed because only one predictor
for each subsurface is needed.
(2) has the form of a multiple linear regression model for Z ∗l (x) with Za∗l (x);
a = (a), (b), (c), (d) as regressors and the weights as unknown parameters. A
constant term accounting for possible bias can be added (Granger, 1989). This
approach, called ‘stacked regression’ by Wolpert (1992) and Breiman (1992), either
requires historical data or a large data set allowing cross validation. The review
by Clemen (1989) compares different methods for choosing the weights in (2).
6 A Synthetic Example
Figure 4. Travel times to Top Reservoir and Base Reservoir (left). Depth trends
obtained when choosing all β’s equal to one (right).
Table 1. Specified residual errors σal , calculated weights wal , and combined
residual errors σ l . The methods correspond to labels in Figure 3. Note how the
weights favour the assumed most accurate models.
velocity increase with increasing depth causing the subsurfaces to be more curved
than the travel times. A positive value for βR2 leads to a reduced interval velocity
for higher x values causing Base Reservoir to tilt upwards towards the right.
The standard errors of the residual errors for the layer thicknesses are chosen
as σTR (x) = σR (x) = 0.1, σTL1 (x) = σTL2 (x) = σTL3 (x) = 0.2, and they are
assumed to be independent for simplicity in the example.
When combining models, the possible methods for constructing each of the
subsurfaces are illustrated in Figure 3. The covariance matrix Cl has dimension
one for Top Reservoir and dimension two for the three other subsurfaces. Note
that Cl is independent of x because the standard errors of the residual errors
are assumed constant. The resulting weights from (5) and residual errors of the
combined models are given in Table 6.1.
Choosing all β-parameters equal to one and combining the trends according to
(4) using the weights in Table 6.1, gives the depth trends illustrated in Figure 4.
This set of trends are considered the ‘truth’ in the following.
608 P. ABRAHAMSEN
Universal kriging predictors split into two parts; the estimated trend and the local
fitting to observations (Cressie, 1993). The estimated trend depends heavily on
the trend model while the local fitting is mainly dependent on the shape of the
correlation function (variogram). The estimated trend will be used instead of the
full kriging predictor because using trends will exaggerate potential problems and
differences between approaches. Moreover, areas away from wells are the most
difficult to predict accurately and therefore the areas of the greatest concern. The
conclusions reached will carry over to the less sensitive kriging predictors in areas
outside well control. It is reasonable to assume that results are valid in the vicinity
of well observations. Three different approaches are tested:
1. Combining predictors (estimated trends) according to (2).
2. Estimating trends using stochastic models combined similar to (4).
3. Like 2. but using a Bayesian prior on the β’s.
Ten sets of depth observations have been drawn from a multinormal distribution
with the expectations given by the depth trends in Figure 4, and covariances
obtained from the weights and (6). The location of these observations are obtained
by dividing the x-axis into three segments and “drilling” one vertical well in each
segment at a random location.
Trends have been estimated for each set of observations using the three ap-
proaches. The resulting ten sets of trends are seen in Figure 5.
The first approach combines four trend estimates (see Figure 2) using weights
obtained from (3). Note that these weights depend on x.
The second approach, combining models, gives some trends that are far off the
‘true trends’ in Figure 4. This is caused by severe collinearity making it almost
impossible to estimate some of the β-parameters.
The third approach, imposing
a prior distribution on the β-parameters with
expectations 0.5 and Cov β = diag(2), dramatically improves the estimates of
the β-parameters. The corresponding ten trends in Figure 5 show a behaviour
very similar to the one obtained by combining predictors. The prior distribution
effectively restricts the parameter space so that extreme β estimates are prohibited.
Choosing a prior with large standard error (200%) and an expectation far away
from the true value (0.5 compared to 1) still gives good results. So the approach
is appearently robust to poor and vague prior specifications.
Figure 5. Ten sets of trends obtained by: 1. combining estimated trends (top
left), 2. combining stochastic models (bottom left), and 3. combining stochastic
models and using a Bayesian prior on the β’s (top right).
Figure 6. Average empirical bias (left) and error (right) from 100 simulations for
Top Layer 2. (· · · ) combined predictors, (- - -) combined model, and (—) combined
model with priors on β parameters.
7 Closing remarks
Two solutions to the problem of combining different methods for obtaining depthes
to subsurfaces have been discussed. The first approach combines alternative pre-
610 P. ABRAHAMSEN
dictors and the second approach merges alternative stochastic models. The latter
approach is approximately 10 times faster but suffers from collinearities that
are handled by imposing a prior distribution. The example showed that even a
misspecified prior gave good results so the approach appear to be robust.
When combining predictors, a rigorous minimisation criteria for the predic-
tion error is employed. The approach combining models however, uses a heuristic
minimisation criteria for the residual variance. The usefulness of this approach
is therefore justified by its performance. The two methods gave almost identical
results for the synthetic example so in this situation it is possible to conclude that
the model combination approach performs equally well.
The method has been implemented in commercial software and has been suc-
cesfully used in many field studies.
Acknowledgements
The work was supported by a research fellowship from The Research Council of
Norway. The author is grateful to Arne Skorstad, Erik Bølviken, Magne Aldrin,
and Gudmund Høst for many valuable comments.
References
Abrahamsen, P., 1993, Bayesian kriging for seismic depth conversion of a multi-layer reservoir,
in A. Soares, ed., Geostatistics Tróia ’92, proc. ‘4th Inter. Geostat. Congr.’, Tróia Portugal,
1992, Kluwer Academic Publ., Dordrecht, p. 385–398.
Acheson, C. H., 1963, Time-depth and velocity-depth relations in western Canada, Geophysics
v. 28, no. 5, p. 894–909.
Al-Chalabi, M., 1974, An analysis of stacking, RMS, average, and interval velocities over a
horizontally layered ground, Geophysical Prospecting v. 22, no. 3, p. 458–475.
Al-Chalabi, M., 1979, Velocity determination from seismic reflection data, in A. A. Fitch, ed.,
Developments in Geophysical Exploration Methods—1, Applied Science Publ. Ltd., London,
chapter 1, p. 1–68.
Breiman, L., 1992, Stacked regression, Technical Report 367, Dept. Statistics, Univ. California
Berkeley, 15 p.
Bunn, D., 1989, Forecasting with more than one model, J. Forecasting v. 8, no. 3, p. 161–166.
Clemen, R. T., 1989, Combining forecasts: A review and annotated bibliography, Internat. J.
Forecasting v. 5, p. 559–583.
Cressie, N., 1993, Statistics for Spatial Data, revised edn, John Wiley & Sons, New York, 900 p.
Faust, I. Y., 1951, Seismic velocity as a function of depth and geological time, Geophysics v. 16,
no. 2, p. 192–206.
Granger, C. W. J., 1989, Invited review: Combining forecasts—twenty years later, J. Forecasting
v. 8, no. 3, p. 167–173.
Hwang, L., and McCorkindale, D., 1994, Troll field depth conversion using geostatistically derived
average velocities, The Leading Edge v. 13, no. 4, p. 561–569.
Jeffery, R. W.,, Stewart, I. C. F., and Alexander, D. W., 1996, Geostatistical estimation of depth
conversion velocity using well control and gravity data, First Break v. 14, no. 8, p. 313–320.
Walden, A. T., and White, R. E., 1984, On errors of fit and accuracy in matching synthetic-
seismograms and seismic traces, Geophysical Prospecting v. 32, no. 5, p. 871–891.
White, R. E., 1984, Signal and noise estimation from seismic reflection data using spectral
coherence methods, Proc. IEEE v. 72, no. 10, p. 1340–1356.
Wolpert, D., 1992, Stacked generalization, Neural Networks v. 5, p. 241–259.
Xu, W.,, Tran, T. T.,, Srivastava, R. M., and Journel, A. G., 1992, Integrating seismic data
in reservoir modeling: The collocated cokriging alternative, in 67th Ann. Tech. Conf. and
Exhibition, Soc. of Petroleum Engineers, Washington DC, p. 833–842.