Ross Hemsley
September 25, 2009
Abstract
As neutrons pass through a magnetic field, their spin precesses.
We seek to simulate this precession within the neutron ray-tracing package
McStas. Presented here are the methods and implementation details
used to obtain magnetic field values at arbitrary points when only a finite
number of points are known.
Contents

1 Introduction
  1.1 About McStas
  1.2 Requirements

2 Method
  2.1 Interpolation in three dimensions
  2.2 Nearest Neighbour Interpolation in R^d
  2.3 Interpolation by point weighting
  2.4 Kriging
  2.5 Natural Neighbour Interpolation
  2.6 Implementation Details

3 Results
  3.1 Test data
  3.2 Interpolating a scalar field
  3.3 Interpolating a vector field

4 Discussion
  4.1 Nearest Neighbour Interpolation
  4.2 Kriging Interpolation
  4.3 Natural Neighbour Interpolation
  4.4 Considerations for McStas

5 Conclusions
  5.1 Performance
  5.2 Robustness
  5.3 Spotting errors
  5.4 Including into McStas
  5.5 General use
  5.6 Future Work
    5.6.1 Considering variations in time
    5.6.2 Modifying the interpolation algorithms

Appendices

A Nearest Neighbour Interpolation

B Kriging Interpolation
  B.1 Variograms
  B.2 The Empirical Variogram
  B.3 Ordinary Kriging

C Natural Neighbour Interpolation
Chapter 1
Introduction
1.1 About McStas
[Taken from current McStas Documentation] The software package McStas is a tool for carrying
out Monte Carlo ray-tracing simulations of neutron scattering instruments with high complexity
and precision. The simulations can compute all aspects of the performance of instruments and can
thus be used to optimize the use of existing equipment, design new instrumentation, and carry out
virtual experiments for e.g. training, experimental planning or data analysis. McStas is based on
a unique design where an automatic compilation process translates high-level textual instrument
descriptions into efficient ANSI-C code. This design makes it simple to set up typical simulations
and also gives essentially unlimited freedom to handle more unusual cases.
1.2 Requirements
McStas currently only allows magnetic fields to be described by explicit equations; this allows
arbitrary accuracy to be obtained by simply calling the known field function for each case. When
a field becomes more complex, or is only available as a set of magnetic field values, there is
currently no method to calculate the field at an arbitrary point (clearly, an exact description of
this kind would require an infinitely fine mesh). The purpose of this project was to allow McStas to take descriptions of
magnetic fields given by finite point-sets, and accurately trace neutrons across them by using some
form of interpolation. Further, we seek to allow these point-sets to be non-uniform, anisotropic and
sparse. McStas already has features for performing simple numeric precession, provided
that we can obtain values from the magnetic field function at arbitrary points.
Chapter 2
Method
To perform precession across a magnetic field, the most important requirement is finding the field value
at any given point. If we can do this, we can simply use the standard McStas polarisation
libraries to perform the precession. To find these values we shall need to implement interpolation in three
dimensions. More than one method exists to do this, and so we implement multiple
algorithms so that we can evaluate which solution is best for our purpose. The following
sections describe three well-known algorithms which we consider applicable to our problem.
2.1 Interpolation in three dimensions
Interpolation is simply the process of using known data points to estimate the values of unknown
points lying within the convex hull of the known points. We refer to the estimation of values
outside the convex hull of the known points as extrapolation. Interpolation in one dimension is
easily understood. For simplicity, consider a line with two points on it. Suppose we
wish to calculate some value at a point that lies between them; we can then see
three options: we may simply take the data value of the nearest point (this is known as Nearest
Neighbour Interpolation), we may take the average of the two points, or we may try to use our
knowledge of the data to estimate what happens in between these two points.
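As a concrete illustration of the one-dimensional case, the following C sketch interpolates linearly between two known points; the function name is ours and this is illustrative only, not part of McStas:

    #include <stdio.h>

    /* Linearly interpolate the value at x, given two known points
     * (x0, v0) and (x1, v1) with x0 < x1. */
    static double lerp1d(double x0, double v0, double x1, double v1, double x)
    {
        double t = (x - x0) / (x1 - x0);   /* fraction of the way from x0 to x1 */
        return (1.0 - t) * v0 + t * v1;
    }

    int main(void)
    {
        /* Estimate the value halfway between (0, 1) and (2, 3). */
        printf("%f\n", lerp1d(0.0, 1.0, 2.0, 3.0, 1.0));  /* prints 2.000000 */
        return 0;
    }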
These ideas may be extended into two and more dimensions, but we require some new notation.
We describe the location of each of our $n$ known points in $\mathbb{R}^d$ by a vector $x_i = [x_1, x_2, \ldots, x_d]$ for
$i = 1, 2, \ldots, n$. We then describe the data value at each of these
locations by $v_1, v_2, \ldots, v_n$. We will label the point we wish to interpolate $x^*$, and $v^*$
will be the estimated value which we obtain in each case.
2.2 Nearest Neighbour Interpolation in $\mathbb{R}^d$
Nearest Neighbour Interpolation generalises trivially to $\mathbb{R}^d$. We simply seek to minimise, over all
known sample points, the Euclidean norm of the vector between each sample point and the point we wish
to interpolate. Once we have found the point which minimises this value, we take the data
value of that point:
$$v^* = v_k, \qquad k = \operatorname*{arg\,min}_{i} \|x^* - x_i\|$$
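A minimal brute-force realisation of this rule might look as follows; this linear scan is exactly what the kd-tree of Appendix A replaces, and the naming is ours:

    #include <stddef.h>

    #define D 3   /* dimension of the space */

    /* Return the index k minimising the Euclidean norm ||x* - x_i||
     * over the n known points (n must be at least 1).  Comparing
     * squared distances avoids a needless sqrt. */
    static size_t nearest_neighbour(const double pts[][D], size_t n,
                                    const double query[D])
    {
        size_t best = 0;
        double best_d2 = -1.0;
        for (size_t i = 0; i < n; i++) {
            double d2 = 0.0;
            for (size_t j = 0; j < D; j++) {
                double diff = query[j] - pts[i][j];
                d2 += diff * diff;
            }
            if (best_d2 < 0.0 || d2 < best_d2) {
                best_d2 = d2;
                best = i;
            }
        }
        return best;   /* the interpolated value is then v[best] */
    }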
2.3 Interpolation by point weighting
To generalise the other methods to more dimensions, things are no longer quite so clear. To clarify
our situation, we imagine that there exists some sequence of weights $\lambda_1, \lambda_2, \ldots, \lambda_n$ corresponding
to each of our known points $x_1, x_2, \ldots, x_n$ such that:
$$v^* = \frac{\sum_{i=1}^{n} \lambda_i v_i}{\sum_{i=1}^{n} \lambda_i}$$
The next step is to calculate the required weights given a point to interpolate, $x^*$. Here we use
two different weighting methods: one which uses the data values to estimate the behaviour of the
function in the intervals between known points, and one which effectively takes an average over the
set of neighbours.
2.4 Kriging
Kriging is a type of interpolation known as a BLUE (Best Linear Unbiased Estimator). It requires
as input a fitted variogram model describing the variance of the data. Once this has been obtained,
we obtain a series of weights by solving a system of linear equations. Implementing Kriging presents
no major complexities. The most difficult aspect is fitting a valid variogram
model; to do this we first need to create an empirical variogram. We proceed by binning similar
lag vectors and then by taking the average values of each bin. In this way we hope to obtain a
variogram which does not over-fit the data.
Once we have a valid empirical variogram, we try to fit one (or a linear combination of many)
of the possible variogram models to the data. Here we attempt to use non-linear curve fitting to
automate the process; but given the possibility of large errors when the fit is poor, we
suggest that good fitting is not possible without human intervention.
We note that the weight matrix is the same for every point to be interpolated, so we can factorise
this matrix as a pre-computation step to speed up solving the equations for each new point. More
details on how Kriging is implemented can be found in Appendix B.
2.5 Natural Neighbour Interpolation
Natural Neighbour interpolation is very easy to describe in principle; it simply involves taking a
weighted average over the natural neighbours of a point. It has many desirable properties as an
interpolation function. It is exact (the resulting function honors the value at each data point), it is
continuous everywhere except at the data points and it performs well even when the data is highly
anisotropic. For more details see Ledoux[6, 5]. To perform Natural Neighbour Interpolation, our
basic idea is to re-weight the point set using the change in volumes of the Voronoi Cells as the point
is added, and then removed from, the Voronoi Diagram. In our original notation, letting $N$ be the
set of natural neighbours of $x^*$, and writing $\mathcal{V}(x)$ for the Voronoi Cell of a point in the original
diagram and $\mathcal{V}^+(x)$ for its cell after $x^*$ has been inserted, the weights are

$$\lambda_i = \begin{cases} \dfrac{\operatorname{vol}(\mathcal{V}(x_i)) - \operatorname{vol}(\mathcal{V}^+(x_i))}{\operatorname{vol}(\mathcal{V}^+(x^*))} & \text{if } i \in N \\[1ex] 0 & \text{otherwise.} \end{cases}$$

Note that we always have $\sum_i \lambda_i = 1$, because the sum of the volumes of the Voronoi Cells
must remain the same. We can now calculate the estimated value of our point by re-weighting our
points:

$$v^* = \sum_{i=1}^{k} \lambda_i v_i$$
Though the method is commonly advocated as avoiding the problems of most interpolation
algorithms, it is notoriously difficult to implement in three and higher dimensions[6]. For a brief
description of the algorithms required to implement Natural Neighbour Interpolation, see Appendix C.
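Assuming helper routines that return the Voronoi cell volumes before and after the insertion of $x^*$ (see Appendix C), the final weighting step reduces to a short loop; the sketch below is illustrative and its signature is our own:

    /* Sibson weighting: lambda_i is the fraction of the new cell's volume
     * taken from neighbour i.  vol_before[i] and vol_after[i] are the
     * volumes of neighbour i's Voronoi cell without and with x* inserted,
     * cell_vol is vol(V+(x*)), and v[i] are the k neighbours' data values. */
    static double natural_neighbour_estimate(const double *vol_before,
                                             const double *vol_after,
                                             const double *v,
                                             int k, double cell_vol)
    {
        double estimate = 0.0;
        for (int i = 0; i < k; i++) {
            double lambda = (vol_before[i] - vol_after[i]) / cell_vol;
            estimate += lambda * v[i];
        }
        return estimate;   /* the weights sum to one by construction */
    }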
2.6 Implementation Details
We implement the given methods in the C language with as few external libraries as possible,
following the details given in the appendices to this document. We want to be able
to add this code to McStas, which aims to compile on any platform; this means that we
must write our code to be stand-alone, avoiding external dependencies. Once we have implemented
the different algorithms, we must extend them to work on a vector field over three
dimensions, instead of just a scalar field. This extension is trivial for Nearest Neighbour and
Natural Neighbour (we can use the same weights for every component). For Kriging, however, we must
compute a new variogram for each component of the data.
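As an illustration of the scalar-to-vector extension, once a set of weights has been computed for a query point, a three-component field value is just three weighted sums; the layout below is an assumption of ours, not McStas's actual data structures:

    /* Interpolate a 3-component magnetic field value B = (Bx, By, Bz)
     * by reusing one set of weights for every component.  B[i][c] is
     * component c of the field at known point i. */
    static void interpolate_vector(const double *lambda, const double B[][3],
                                   int n, double out[3])
    {
        for (int c = 0; c < 3; c++) {
            out[c] = 0.0;
            for (int i = 0; i < n; i++)
                out[c] += lambda[i] * B[i][c];
        }
    }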
Chapter 3
Results
3.1 Test data
To test our interpolation methods, we will use a smooth field of scalar values. We will start with
512,000 points uniformly arranged in space, and then take random samples from these points in
order to test our functions. In this way we can calculate the error at each point, by taking
the squared difference between the interpolated value and the actual known value of the underlying
data. We will use random samples of 10, $10^2$, $10^3$ and $10^4$ points respectively. See
Figure 3.1 for the plots of the input points. The function giving the scalar value at each point
$(x, y, z)$ in $\mathbb{R}^3$ is as follows:
$$f(x, y, z) := \cos\!\left(\frac{x}{35}\right) \cos\!\left(\frac{y}{40}\right) \sin\!\left(\frac{z}{40}\right)$$
We chose this function because it is easy to see from symmetry how the interpolation functions
are performing. It is also interesting enough to give us results similar to what we might expect
from the interpolation over a magnetic field.
Figure 3.1: The random samples of points we use for testing. From left to right, we have 10, $10^2$, $10^3$
and $10^4$ points respectively, each sampled from the given distribution function.
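In C the test function is a one-liner; the sketch below shows how the reference field could be evaluated (we note that an 80 × 80 × 80 grid would give exactly the 512,000 reference points, though the grid extents are our assumption):

    #include <math.h>

    /* The smooth scalar test field f(x, y, z) = cos(x/35) cos(y/40) sin(z/40). */
    static double test_field(double x, double y, double z)
    {
        return cos(x / 35.0) * cos(y / 40.0) * sin(z / 40.0);
    }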
3.2 Interpolating a scalar field
The following figures give an idea of the performance differences between the interpolation schemes.
We can see in Figure 3.2 the characteristic patch-like pattern that Nearest-Neighbour interpolation
gives, with the boundary of each patch actually corresponding to the Voronoi Cell around each
point. In Figure 3.3 we can see how the algorithm has done a much better job of estimating values
between points. It has also given very good results where it has been necessary to extrapolate; if
we look at Figure 3.4, we can see that the top left of the image has just taken the values at the
boundary of the convex hull, whereas the Kriging algorithm was capable of extrapolating past the
boundary to give more accurate results.
Figures 3.2, 3.3, 3.4 all demonstrate the functionality of our algorithms over scalar fields. In each
case we have the reference rendering (far left), which shows how the data looks if taken directly
from the function; we then have the interpolated version. In each case this has been created by
interpolating 100 random values from the actual data. The third image gives the Sum of Squared
Error, which is just the squared difference between the reference rendering and the interpolated
rendering. In this way we can see how accurate our interpolation functions have been in an absolute
sense. It is calculated by subtracting the estimated point value from the actual point value
and squaring the difference. We then take the sum over all of these values and divide by the number of
points to normalise it.
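The normalised error measure itself reduces to a few lines of C; the array names here are illustrative:

    /* Normalised Sum of Squared Error between the interpolated values
     * and the true values of the underlying function at m test points. */
    static double normalised_sse(const double *estimated, const double *actual,
                                 int m)
    {
        double sse = 0.0;
        for (int i = 0; i < m; i++) {
            double diff = estimated[i] - actual[i];
            sse += diff * diff;
        }
        return sse / m;   /* divide by the number of points to normalise */
    }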
Algorithm           Number of Points    Time Taken    SSE
Nearest Neighbour   10                  3.2s          4.9 × 10⁻²
Nearest Neighbour   100                 3.3s          1.3 × 10⁻²
Nearest Neighbour   1,000               3.7s          2.3 × 10⁻³
Nearest Neighbour   10,000              3.9s          4.9 × 10⁻⁴
Kriging             10                  18s           7.2 × 10⁻⁴
Kriging             100                 19m36s        2.9 × 10⁻⁵
Kriging             1,000               -             -
Kriging             10,000              -             -
Natural Neighbour   10                  1m5s          1.6 × 10⁻²
Natural Neighbour   100                 3m53s         3.9 × 10⁻³
Natural Neighbour   1,000               3m54s         3.7 × 10⁻⁴
Natural Neighbour   10,000              4m8s          7.4 × 10⁻⁶

Figure 3.5: Comparison of algorithms when interpolating 512,000 points over a scalar field. We
consider the time taken and normalised Sum of Squared Error as a function of the number of points
in each case.
Figures 3.6 and 3.7 show how our algorithms perform with an increasing number of points, in terms of
time and error respectively. The trend is clear: the more time an algorithm takes, the lower the
error we find, as we would expect.
3.3 Interpolating a vector field
To demonstrate the functionality of each algorithm on a vector field, we consider some input
data describing a magnetic nutator. This point set is particularly challenging for an interpolation
algorithm because the data is configured in sparse clusters of multiple values. The locations of the
point values can be seen in Figure 3.8. Because it is particularly difficult to get a good sense of
error from visualising a vector field, we make no attempt to give comparison renderings for each
algorithm, and simply assume that the results will be similar to those in the scalar interpolation.

Figure 3.6: Comparing running times of the different methods. This is a log-log plot comparing
the time for each algorithm to run against the number of input points.

Figure 3.7: Comparing error for different methods. Here we plot the (not normalised) Sum of
Squared Errors against the number of points for each algorithm.

Figure 3.8: The point layout for our vector field test data.
We can see from Figure 3.9 that the result is continuous and smooth, even though the input
points were very sparse. We would expect that this field would have the same patch-work pattern
if we were to use Nearest Neighbour, in that we would get local cells of vectors all with the same
value, and strong discontinuities where the boundaries of the Voronoi Cells lie. This would be
particularly disastrous, because we would end up with great detail in the small regions where there
are lots of points, and then no detail across the rest of the field.
We expect the Kriging result would look almost identical to the Natural Neighbour result, as
the error margins are too small to see from the images alone. We note that this data-set would
be amenable to re-sampling, because we appear to get a good smooth function when using our
interpolation functions, yet we would not be able to get any kind of meaningful results using a
naïve Nearest Neighbour algorithm.
Figure 3.9: Natural Neighbour Interpolation across a vector field. We note the smooth gradients
which are indicative of a good-quality interpolation.
Chapter 4
Discussion
4.1 Nearest Neighbour Interpolation
We can clearly see from our results that Nearest Neighbour interpolation is the fastest by a large
margin. We also note the very slow increase in time taken as the pointset grows large. This seems
to confirm our expected O(log n) bound on performance. We see that the method gives inaccurate
results when the pointset is sparse, but that its accuracy approaches that of the other methods
when the pointset is dense.
4.2 Kriging Interpolation
Kriging Interpolation gives the lowest error margins out of all our methods. We note that even for
sparse data sets the results appear to be very good. This is because we are able to estimate the
behaviour of the function, even when we are extrapolating values outside of the known point values.
Given that all examples here have variograms which are fitted automatically, we expect that
Kriging could perform much better if a human were to optimise the variogram model. It would
be of interest to update this method to take account of anisotropy. This would also lead to better
accuracy; however, it would be particularly difficult to automatically fit a directional variogram
because of the number of degrees of freedom available.
Although kriging provides us with the most accurate interpolation, we note that it is generally
slow, and performs badly when more points are added. This is because each time we wish to
interpolate a point, we solve a system involving a matrix whose size is quadratic in the size of the
entire pointset. It should be noted that we could modify our Kriging algorithm to only consider,
say, 100 local points, which we could find using an extra data structure such as an Octree.
When considering the results of Kriging a vector field, we note that we have to create an
individual variogram for every component of the data. This effectively means it will be three
times slower when we extend it to vector field interpolation. In contrast, the other two methods
presented here do not consider the underlying trends in the data. This means that we can use the
same weights for every component of the data, giving only a very small increase in computational
overhead in those cases.
4.3 Natural Neighbour Interpolation
Natural Neighbour Interpolation gives us a good balance between speed and accuracy. It performs
well regardless of the configuration of the data and handles asymptotically large data sets very
well. We note that results are poor when values lie outside of the convex hull of the original
points, because this kind of interpolation does not take into account the behaviour of the data, but
only takes an average of the points it has available. We note that this method achieves the lowest
error margin of all the methods when used on large, dense pointsets. The precomputation step of
creating the Delaunay Triangulation is notably efficient, and can handle even very large
pointsets without too much difficulty. We note that this method will tend to give higher errors at
the edges of the data, because it works by averaging points around it. Best performance will be
given when a point lies within a roughly spherical set of evenly spaced natural neighbours.
4.4 Considerations for McStas
Since all of these methods have been implemented with McStas in mind, we now consider how well
suited to the problem our solutions are. We note that our main considerations for interpolation in
McStas are accuracy, speed and transparency. That is to say that we want our interpolation to give
highly accurate results, in an acceptable amount of time without the user having to be aware of the
underlying method being employed. In this case it seems clear that Kriging is not appropriate as
a method to be used in McStas because it potentially requires a detailed knowledge of directional
semi-variograms when the input data is anisotropic. Without modification, it performs poorly for
arbitrarily large pointsets. It could be that this method would be vastly improved by only Kriging
over a selected number of neighbours, but we would then be required to create a large number of
semivariograms. This could either be done as a pre-computation step, or automatically as they are
required. Either way, this would make checking each one by hand impossible, which would not be
ideal since variogram fitting is a subtle art not well suited to automation.
Nearest Neighbour Interpolation appears to be a bad choice in general, because of its poor
behaviour on anything but very specific input. We do note that this method performs extremely
well when the input is formatted as a dense grid of values. When this is the case, the error in this
method starts to approach that of the others; however, the time taken to finish remains much lower.
Natural Neighbour Interpolation has the most advantages of all the methods considered: it is
reasonably fast and gives good results for almost all input data. Due to its complicated implementation, though, we must be aware of the possibility of bugs and degeneracies arising in unusual
cases. It is likely that this method could be optimised to be much faster than the implementation
presented here; the main slowdown appears to be in creating Voronoi Cells, which is quite a complex
geometrical operation, and must be completed several times for each point to be interpolated.
After running a session of McStas with a counter to see how many calls to the interpolation
function could be expected, we see that McStas puts a huge demand on this function, calling it in
the region of $4 \times 10^8$ times. Given this knowledge, it is clear that Kriging is infeasibly slow, and
that Natural Neighbour Interpolation is close to being impractical. It seems that given a general
field of magnetic field values, our only option is to do interpolation using Nearest Neighbour; which
we have established gives very poor results in general.
Likely the best solution is to use our Natural Neighbour algorithm to resample the data set.
We can easily sample $10^6$ points using Natural Neighbour in a few minutes, and then use
Nearest Neighbour Interpolation on the output. We can then expect low errors (arbitrarily low,
depending on how many points we wish to resample), but with a much faster running time. Our
only new consideration is space, which is generally not a problem on modern computers. We could
improve upon this even further by using a uniform grid interpolation method; in this way we could
estimate the value between our resampled grid points using a well-known interpolation method
such as tri-cubic or tri-linear interpolation.
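To indicate how cheap the final look-up could become after re-sampling, here is a sketch of tri-linear interpolation over a uniform grid; the flat array layout and naming are our assumptions:

    #include <stddef.h>

    /* Tri-linear interpolation of a scalar stored on a uniform grid with
     * spacing h and origin (0,0,0).  grid is a flat nx*ny*nz array indexed
     * as [i + nx*(j + ny*k)]; the caller must ensure the query lies strictly
     * inside the grid so that i+1, j+1 and k+1 are valid indices. */
    static double trilinear(const double *grid, int nx, int ny,
                            double h, double x, double y, double z)
    {
        int i = (int)(x / h), j = (int)(y / h), k = (int)(z / h);
        double tx = x / h - i, ty = y / h - j, tz = z / h - k;
    #define G(a, b, c) grid[(a) + (size_t)nx * ((b) + (size_t)ny * (c))]
        /* Interpolate along x on the four edges of the surrounding cell... */
        double c00 = G(i, j,     k)     * (1 - tx) + G(i + 1, j,     k)     * tx;
        double c10 = G(i, j + 1, k)     * (1 - tx) + G(i + 1, j + 1, k)     * tx;
        double c01 = G(i, j,     k + 1) * (1 - tx) + G(i + 1, j,     k + 1) * tx;
        double c11 = G(i, j + 1, k + 1) * (1 - tx) + G(i + 1, j + 1, k + 1) * tx;
    #undef G
        /* ...then along y, and finally along z. */
        double c0 = c00 * (1 - ty) + c10 * ty;
        double c1 = c01 * (1 - ty) + c11 * ty;
        return c0 * (1 - tz) + c1 * tz;
    }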
Chapter 5
Conclusions
5.1 Performance
All of our algorithms perform in line with their theoretical descriptions. The only consideration that remains is running time, though we note that the overheads are generally
constant factors and it is unlikely that any optimisations will speed things up asymptotically. Considering that McStas takes values from the magnetic field on the order of hundreds of millions of
times, we are bound to find a bottleneck here for any function which does a reasonable amount of computation.
Suggestions might be to attempt to reduce the number of calls that McStas makes to this function,
but this seems unrealistic. The best suggestion would be to restrict the input to a uniform grid, and
then use the functions presented here to take any other kind of input and re-sample it. This could
easily be implemented as a pre-computation step by detecting non-uniform grids and performing
re-sampling automatically.
5.2 Robustness
Natural Neighbour and Nearest Neighbour have both been written with generality in mind. In
theory, there should be no data set which can lead to failures (i.e. degeneracies leading to incorrect
meshing for Natural Neighbour, or point sets which lead to a poor kd-tree in Nearest Neighbour).
However, the complexity of some of our methods makes it difficult to be certain that errors cannot
occur. We therefore suggest that caution is taken whilst using these methods.
5.3 Spotting errors
Because of the difficulties in testing an algorithm which can take such a large set of input, we must
be aware that bugs may slip through. Though there are no known point sets which give errors at
at the moment, we can make no guarantees in general. The algorithm most likely to fail in general
use is Kriging, since it has not been polished to the extent of the others; this is because
we have demonstrated that it does not perform as well as the other methods in general. After
this, we consider Natural Neighbour Interpolation. Because this method uses only local parts of
the triangulation to interpolate each point, it is likely that errors in output are localised. When
these errors do occur, we expect to see large errors in very local areas. This could easily cause
meaningless output if a neutron is traced across such a spike. Our kd-tree implementation of
Nearest Neighbour Interpolation may sometimes fail to find the actual nearest neighbour, but the
most likely errors that occur will result in finding a point which is at least approximately the nearest
neighbour.
5.4 Including into McStas
We have not included a formal component for McStas to do interpolation in a general magnetic field.
This is because there still seems to be some work required on the core functionality of McStas before this
can be done elegantly. The current precession utilities do not require the storing of a state, and
store the magnetic field data in global variables. This clearly needs some thought in re-factoring,
and it is unlikely that any benefit would be gained from creating the extra global variables required
to store the state for our interpolation functions. Instead we have gone to great efforts to conceal
the complexities of interpolation so that when the functions are used, they may simply be used
as library functions. This means that once a good platform exists for adding interpolation to the
polarisation utilities, there should be no significant challenge when interpolation is required.
5.5 General use
Whilst working on this project, it seemed clear that very little public-domain code is available for
performing three dimensional interpolation. There is also a general lack of information relating
to the algorithms used here. Hopefully the implementations presented here will be of use to the
general scientific community, as they are certainly general enough to work well in any kind of vector
re-sampling or interpolation problem.
5.6 Future Work

5.6.1 Considering variations in time
In our implementation we have assumed that the magnetic field remains constant. To add a
time component to our interpolator would require interpolation over four dimensions, which would
certainly be too slow for the intended purpose. Instead we suggest that a transform matrix be used
to modify the points based on time. In this way we can easily implement oscillations and rotations
of the magnetic field, without the difficulties of higher dimensional interpolation.
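For instance, a field rotating about the z-axis could be handled by mapping the query point back into the frame of the static field description before interpolating; the sketch below is written under that assumption:

    #include <math.h>

    /* Map a query point at time t back into the frame of the static field
     * description, assuming the field rotates about the z-axis at omega rad/s. */
    static void rotate_query(double t, double omega, const double in[3],
                             double out[3])
    {
        double a = -omega * t;          /* undo the field's rotation */
        out[0] = cos(a) * in[0] - sin(a) * in[1];
        out[1] = sin(a) * in[0] + cos(a) * in[1];
        out[2] = in[2];
    }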
5.6.2 Modifying the interpolation algorithms
To improve the performance of the algorithms presented here, the following modifications would
be recommended.
Modify the Delaunay Triangulation Algorithm: This implementation originally used
the Lawson Simplex Flipping algorithm (see Appendix C) to generate the triangulation;
however due to lack of good reference material at the time, some details were missing and
the triangulation being produced consistently failed to be Delaunay. Given the paper written
by Ledoux[5] this algorithm should be much easier to implement. If errors turn up during
triangulation because of round-off errors, then implementing this algorithm would be a good
place to start.
Speeding up Natural Neighbour Interpolation: Our algorithm regularly uses brute
force methods to discover information about neighbouring simplices. It is likely that these
can be optimised away with some cleverness and geometry, though doing so is likely to be time
consuming. Profiling the algorithm shows that the slowest parts are finding natural neighbours and creating a Voronoi Cell. The algorithms used here are simply first attempts,
and we make no claims that they are optimal (though they are certainly not unnecessarily
inefficient).
Extensions: Worthwhile extensions to the current code-base would include an interpolator for uniform grids. Many fast and efficient methods exist for this, and they appear to be
simple to implement. An efficient algorithm for interpolating uniform points would
make an ideal counterpart to the Natural Neighbour Interpolator. Suggested methods to
consider would be tri-linear and tri-cubic interpolation.
Appendices
Appendix A
Nearest Neighbour Interpolation
A.1
A.2 The kd-tree
To speed up the Nearest Neighbour algorithm, we introduce an auxiliary data structure known as
a kd-tree. This is a special binary tree in which every node describes a splitting hyper-plane, such
that the right child of each node contains all points in front of the plane, and the left child contains all
points behind the plane. At each level of the tree, we choose a splitting hyper-plane which is parallel
to an axis, cycling through the axes at each level; that is, at level $k$ we use a splitting plane
which is parallel to the axis in the direction $k \bmod n$. To choose the location of the plane in any
given dimension, we use the median value of the points along that axis. Doing this guarantees that
we will get a balanced tree at the end. A kd-tree is therefore essentially a generalised
binary search tree, made to work in $d$-dimensional space by splitting on a different dimension
at each level of the recursion. Figure A.1 shows an example of a two dimensional kd-tree.
Figure A.1: A kd-tree in two dimensions. Taken from the Wikipedia entry on kd-trees.
Build-Tree(A, d)
    if A.length = 0
        return NIL
    axis = d mod n
    Sort(A, axis)
    m = ⌈A.length / 2⌉
    node.point = A[m]
    node.leftChild = Build-Tree(A[1 .. m - 1], d + 1)
    node.rightChild = Build-Tree(A[m + 1 .. A.length], d + 1)
    return node
Here we use Sort(A, axis) to sort the array A by the chosen dimension. In fact we can improve on this
by using an expected linear time median finding algorithm, which is how the implementation used
in this paper works; for more details on expected linear-time selection see Introduction to Algorithms[3].
Once we have pre-computed the kd-tree for our data, we can perform expected O(log n) time nearest
neighbour look-up. To do this we use the procedures Approx-Neighbour and Border-Check
as follows.
Nearest-Neighbour(tree, x*)
    approx = Approx-Neighbour(tree, x*)
    return Border-Check(tree, approx, x*)
To find an approximate neighbour to the query point, we can simply traverse the tree as though
doing an insert. Once we get to the bottom of the tree, we return the last node found.
Approx-Neighbour(tree, x*)
    current = tree
    loop
        axis = current.depth mod n
        if current.point[axis] > x*[axis]
            if current.leftChild = NIL
                return current
            current = current.leftChild
        else
            if current.rightChild = NIL
                return current
            current = current.rightChild
Because a closer point than the one found using Approx-Neighbour might exist on the other
side of a splitting hyper-plane, we must traverse the tree again, checking to see if the splitting
hyper-plane at each node intersects the hyper-sphere around x* whose radius is given by our current
nearest estimate. At most nodes we should be able to take just one of the branches, thus greatly
reducing the number of points we have to consider. To speed up an actual implementation of this
function, we note that we can pass the minimum values by reference, which will remove some
of the extra comparisons.
Border-Check(node, approx, x*)
    if node = NIL
        return approx
    axis = node.depth mod n
    d′ = Dist(approx.point, x*)
    // Check to see if this node is a better approximation.
    if Dist(node.point, x*) < d′
        approx = node
    // Check whether or not the hyper-sphere around our point
    // intersects with the splitting plane.
    if |node.point[axis] - x*[axis]| < d′
        n1 = Border-Check(node.leftChild, approx, x*)
        n2 = Border-Check(node.rightChild, approx, x*)
        if Dist(n1.point, x*) < Dist(n2.point, x*)
            return n1
        else
            return n2
    else if node.point[axis] > x*[axis]
        return Border-Check(node.leftChild, approx, x*)
    else
        return Border-Check(node.rightChild, approx, x*)
Appendix B
Kriging Interpolation
Kriging was developed by the French mathematician Georges Matheron, based on the master's thesis
of Danie Gerhardus Krige. It is used regularly in geoscience for estimating mineral densities and
for estimating the elevation of landscape between known points.
B.1 Variograms
The basis of Kriging is to calculate an estimated variogram model for the data which will give good
results. The variogram is defined by the following:

$$2\gamma(x_i, x_j) := E\left[\,|v_i - v_j|^2\,\right]$$

That is, the variogram gives the expected squared difference between the data values at two points. For
simplicity here, we assume that the data is isotropic, and so we consider only the distance
between our two points, $h := \|x_i - x_j\|$.
B.2 The Empirical Variogram
Realistically, we can never know the actual underlying variogram for our data (assuming it exists),
so we model the data using an empirical variogram. (The following is actually a semi-variogram,
which differs from the variogram by a factor of two.)

$$\hat{\gamma}(h) := \frac{1}{2\,|N(h)|} \sum_{(i,j) \in N(h)} |v_i - v_j|^2$$

where $N(h) := \{(i, j) : \|x_i - x_j\| = h\}$. We can then use this empirical variogram to try to fit
a known standard variogram model to our data. This allows us to calculate the variogram for lag
values (h values) which we cannot calculate from the data. To fit these models we use the following
three standard variogram properties.
Sill, $s$: the limiting value of the variogram as the lag distance tends to infinity.
Range, $r$: the size of the lag at which the variogram becomes constant.
Nugget, $n$: the discontinuity of the variogram at the origin, i.e. its value as the lag tends to zero.
We can estimate these properties using points on our empirical semi-variogram, in order to fit
a valid variogram model. Some examples of valid models are as follows:
Exponential
$$\gamma(h) = n + s\left(1 - \exp\left(-\frac{3h}{r}\right)\right)$$

Spherical
$$\gamma(h) = n + s\left(\frac{3h}{2r} - \frac{h^3}{2r^3}\right), \qquad h \le r$$

Gaussian
$$\gamma(h) = n + s\left(1 - \exp\left(-\frac{h^2}{r^2}\right)\right)$$
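In code, each model is a one-liner; the sketch below is illustrative (the clamping of the spherical model beyond the range is our addition):

    #include <math.h>

    /* n = nugget, s = sill, r = range, h = lag distance. */
    static double vgram_exponential(double h, double n, double s, double r)
    {
        return n + s * (1.0 - exp(-3.0 * h / r));
    }

    static double vgram_spherical(double h, double n, double s, double r)
    {
        if (h >= r)
            return n + s;                /* constant past the range */
        return n + s * (1.5 * h / r - 0.5 * pow(h / r, 3.0));
    }

    static double vgram_gaussian(double h, double n, double s, double r)
    {
        return n + s * (1.0 - exp(-(h * h) / (r * r)));
    }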
B.3 Ordinary Kriging
In Ordinary Kriging, the weights $\lambda_1, \ldots, \lambda_n$ are found by solving the linear system

$$\begin{pmatrix} \gamma(x_1, x_1) & \cdots & \gamma(x_1, x_n) & 1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(x_n, x_1) & \cdots & \gamma(x_n, x_n) & 1 \\ 1 & \cdots & 1 & 0 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \\ \mu \end{pmatrix} = \begin{pmatrix} \gamma(x_1, x^*) \\ \vdots \\ \gamma(x_n, x^*) \\ 1 \end{pmatrix}$$
where we note that $\mu$ is the Lagrange multiplier. Since our weights sum to unity, we simply
apply the following point weighting:
$$v^* = \sum_{i=1}^{n} \lambda_i v_i$$
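A sketch of assembling this system in C follows; the $(n+1) \times (n+1)$ matrix is stored row-major and would be handed to any linear solver, which, like the fitted variogram function, is assumed rather than shown:

    #include <math.h>

    /* Euclidean distance between two points in R^3. */
    static double dist(const double a[3], const double b[3])
    {
        double dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
        return sqrt(dx * dx + dy * dy + dz * dz);
    }

    /* Build the ordinary Kriging system A w = b for a query point q.
     * x holds the n sample locations; vgram is the fitted variogram model.
     * A is an (n+1) x (n+1) row-major matrix; after solving, w holds the
     * weights lambda_1..lambda_n followed by the Lagrange multiplier mu. */
    static void build_kriging_system(const double (*x)[3], int n,
                                     const double q[3],
                                     double (*vgram)(double h),
                                     double *A, double *b)
    {
        int m = n + 1;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++)
                A[i * m + j] = vgram(dist(x[i], x[j]));
            A[i * m + n] = 1.0;     /* unbiasedness: weights sum to one */
            A[n * m + i] = 1.0;
            b[i] = vgram(dist(x[i], q));
        }
        A[n * m + n] = 0.0;
        b[n] = 1.0;
    }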
We can then calculate an estimate for the error by using the following. We note that this
is entirely dependent on our variogram, so an inaccurate variogram will give meaningless error
calculations.

$$\operatorname{var}\left(\hat{Z}(x^*) - Z(x^*)\right) = \mu + \sum_{i=1}^{n} \lambda_i\, \gamma(x_i, x^*)$$
Although Kriging is statistically optimal[2], it has a few problems in practice. We find that
if the data is anisotropic, we need to adapt our variogram fitting to take account of this. This is
usually done by using a directional semi-variogram[8]. In this method we consider the lag relative
to specific directions from each point. The method demonstrated here does not take account of
directional anisotropy, and so we expect highly anisotropic data to give inaccurate results.
Appendix C
Natural Neighbour Interpolation
C.1 Definitions
Convex hull: For a set of points X, the convex hull of X is the minimal convex set containing
X. In $\mathbb{R}^2$ we can imagine that this is formed by putting a tight elastic band around a set
of points; the convex hull is then the interior of the shape formed by the elastic band as it fits
tightly around the points.

Simplex: A simplex is the convex hull of a set of $d + 1$ affinely independent points in $\mathbb{R}^d$. We
shall usually use $\sigma$ here to denote an arbitrary 4-simplex in $\mathbb{R}^3$, which is no different from an
irregular tetrahedron. We note that (in this vertex-counting convention) a 3-simplex is just a triangle, and a $k$-simplex is
often thought of as the generalisation of a triangle to $\mathbb{R}^{k-1}$.
Delaunay Triangulation: We shall often just refer to this as a Delaunay Mesh in $\mathbb{R}^d$. The
Delaunay Triangulation DT(X) of a set of points $X \subset \mathbb{R}^d$ is a triangulation of $(d+1)$-simplices such that for every $\sigma$ in DT(X), every point $x \in X$ is either on or outside the
circum-hypersphere of $\sigma$.
Voronoi Cell: A Voronoi Cell about a point $x \in X \subset \mathbb{R}^d$ is the cell enclosing all the points
which are closer (relative to the standard Euclidean norm) to the point $x$ than to any other
point of X: $\mathcal{V}(x) = \{u \in \mathbb{R}^d : \|u - x\| < \|u - v\| \ \forall v \in X \setminus \{x\}\}$. This means that in Nearest Neighbour
Interpolation, the Voronoi Cell about $x$ is exactly the set of points which would be classified
with the same data value as that at $x$. The Voronoi Cell is strictly convex[5]. We will denote
the hyper-volume of a Voronoi Cell $\mathcal{V}$ by $\operatorname{vol}(\mathcal{V})$.
Voronoi Diagram: The Voronoi Diagram VD(X) of a set of points $X \subset \mathbb{R}^d$ is the set
of Voronoi Cells over all those points. We note that the Voronoi Cells are strictly non-intersecting
(this follows directly from the definition of the Voronoi Cell). The Voronoi
Diagram is the geometric dual of the Delaunay Mesh, and given one we can always extract
the other[5].
Natural Neighbours: The Natural Neighbours of the point $x \in X \subset \mathbb{R}^d$ are the points in
X whose Voronoi Cells share a face with the Voronoi Cell of $x$. This is in fact equivalent to
the set of points which lie on a $(d+1)$-simplex of DT(X) that contains $x$.
Figure C.1: A Delaunay Triangulation (black) and Voronoi Diagram (red) of a set of points in two
dimensions. Taken from the Wikipedia entry on Delaunay Triangulations.
C.2 Creating the Delaunay Triangulation
Since it turns out that constructing a Voronoi Diagram in $\mathbb{R}^3$ is more easily implemented by first
creating a Delaunay Triangulation of the data and then extracting the Voronoi Cells[5] (this
is possible since the Delaunay Triangulation is the geometric dual of the Voronoi Diagram), we will
first approach the problem of creating a Delaunay Triangulation in $\mathbb{R}^3$.
There are numerous different methods for creating the Delaunay Triangulation, and we shall
consider the various benefits and drawbacks of each.
C.2.1 The Bowyer-Watson Algorithm
This is an incremental algorithm for adding points to a Delaunay Triangulation and is valid in
any number of dimensions. For simplicity we shall restrict ourselves to the three dimensional case,
though the d-dimensional case follows trivially.
To insert a new point $x^*$ into the triangulation, we first identify all those simplices in the
current triangulation which would no longer be Delaunay after the addition of this point. We then
remove all of these simplices and insert new edges connecting the point $x^*$ to each vertex left in
the hole created by removing the conflicting simplices. We then have a new globally Delaunay
Triangulation containing $x^*$. For a detailed proof of correctness see Ledoux[5].
C.2.2 The Simplex Flipping Algorithm
The simplex flipping algorithm is more complex than the edge flipping algorithm, but can solve
problems of robustness that appear when using the latter. This process revolves
around flipping 4-simplices instead of edges, which means that we always have a valid triangulation
(whether or not it is strictly Delaunay). The basic simplex-flipping primitives we use are as follows:
flip3-2, flip2-3, flip1-4 and flip4-1. Each primitive is characterised by the number of simplices before
and after the flip takes place. For example, a flip1-4 is the flip which takes a single simplex and adds
a new point to create four simplices; the flip4-1 is the exact reverse of this operation. flip3-2 and
flip2-3 are each responsible for flipping adjacent simplices (without the addition of a new point).
To create our Delaunay Triangulation with these primitives, we first find the simplex which
contains the point we want to add. We then perform a flip1-4 to add the point to the triangulation.
The next step is to check that our triangulation is still Delaunay; to do this, we go through each
simplex which is incident to $x^*$ and check whether it is Delaunay relative to its neighbour. If this
test fails, we perform an appropriate flip, and push all new simplices onto a stack of simplices to be
checked. Once the stack is empty, we have a globally Delaunay Triangulation[5].
Insert-Point(T, x*)
    σ = Find-Containing-Simplex(T, x*)
    insert x* into σ using a flip1-4
    push the four new simplices onto the stack
    while the stack is not empty
        σ = {p, a, b, c} = pop from the stack
        σ_a = {a, b, c, d} = the adjacent simplex sharing {a, b, c}
        if d is inside the circumsphere of σ
            Flip(σ, σ_a)
Essentially the Flip operation considers the configuration of the adjacent simplex and depending
upon how many faces are oriented towards the new point, performs a different flip. The updated
simplices are pushed onto the stack for checking. For more details on this algorithm, and how the
Flip procedure can be implemented, see Ledoux[5].
C.2.3 Our Implementation
The implementation used in this paper is based on a modified version of the Bowyer-Watson
algorithm. To start with, we do all operations at the simplex level: when adding a new point,
we take each old face on the interior of the polytope left behind when removing the conflicting simplices,
and use it as the base of a new tetrahedron which connects to the new point. The
next modification is to use the adaptive geometric predicates provided by Shewchuk[7]. The
predicates of interest to us are Orient-3D and In-Sphere. These allow us to tell whether
a point is in front of, on, or behind a plane, and whether a point is inside, on, or outside
a sphere. These methods use adaptive accuracy to ensure that results are correct without using
exact arithmetic unless it is absolutely necessary. In this way, we should be able to avoid the issues
associated with round-off errors which plague the simple Bowyer-Watson algorithm.
The following algorithm is used to insert a point in the Delaunay Triangulation implementation
used in this paper.
Insert-Point(T, x*)
    σ₀ = Walk(T, x*)
    push σ₀ onto the stack
    while the stack is not empty
        σ = pop from the stack
        if not Is-Delaunay(σ)
            add σ to the list of conflicts
            push the neighbours of σ onto the stack
    // We now have a list of conflicting simplices.
    for each σ in the list of conflicts
        for each face φ of σ
            if φ is not shared with any other conflicting simplex
                create the new simplex σ′ = {φ, x*}
                add σ′ to T
Here, Is-Delaunay performs an In-Sphere test between the given simplex and the vertices of each
of its neighbours.
C.3 Finding the Containing Simplex
When considering a dense point set for Natural Neighbour Interpolation, the only procedure we
perform which depends on the number of points is Find-Containing-Simplex. For large
point sets, then, we expect this to be the dominating factor in the running time of our
algorithm. To make our implementation perform well asymptotically, we can modify this procedure
to give better results. The most well-known method to speed up simplex look-up is the Walk
algorithm. This algorithm proceeds by taking a simplex at random from the current triangulation
and checking the orientation of each face against the query point. If the point is in front of a
face, we proceed to the neighbouring simplex which shares that face. If the point is behind every
face of the current simplex, then we are done. It might be imagined that this procedure could run
forever; however, a convenient feature of Delaunay Triangulations is that they are acyclic for
any fixed viewpoint[4], meaning that the Walk algorithm will always terminate when used on a
valid Delaunay Triangulation.
Walk(T, x*)
    σ = a random simplex in T
    loop
        crossed = false
        for i = 1 to 4
            φ = face i of σ
            if Orient-3D(φ, x*) < 0
                σ = the neighbour of σ sharing φ
                crossed = true
                break
        if not crossed
            return σ
We note that this algorithm will also work when a point lies on a simplex boundary; in this case we will
just return a simplex which is incident to that point. It is, though, difficult to calculate the running
time of this procedure in general.
C.4 Point Removal
When performing interpolation using some given Delaunay Triangulation, we find that we need
access to the triangulation with and without a given point. Instead of regenerating the mesh every
single time, we seek to implement fast point-removal. Since there is no well-known point removal
algorithm for Delaunay Triangulations in $\mathbb{R}^3$, we find a more specialised solution.
Since, in the context of Natural Neighbour Interpolation, we only ever need to remove the
last point which we added, it is possible to just store a list of the most recent updates (in the
context of an incremental point-adding algorithm) which we can then undo in sequence to take the
triangulation back to its original state. The implementation presented with this paper maintains a
stack of the most recently removed simplices and the most recently updated simplices, along with
a list of neighbour updates. This means that we can very quickly put back all the simplices which
have been removed. See Ledoux[5] for more information on this operation in the more specialised
example of the Lawson flipping algorithm.
C.5 Degeneracies
C.6 Extracting the Voronoi Cell
Although the Voronoi Diagram is the geometric dual of the Delaunay Triangulation, we find that
extracting the Voronoi Cell for a given point requires a non-trivial implementation in three dimensions. The geometric relationship between the VD and the DT gives the following relations (from
Ledoux[5]).
Figure C.4: The natural neighbours of a point, with neighbouring simplices and circum-spheres.
Vertex: A Voronoi vertex is simple to extract, as every Voronoi vertex lies at the center of
the circumsphere of a simplex in the Delaunay Triangulation.

Face: Every Delaunay edge (that is, every edge which connects the center point to a natural
neighbour) is the dual of a single Voronoi Face. A single Voronoi Face is an irregular polygon
defined by connecting the circum-centers of the simplices which are incident to the given
Delaunay edge. The vertices of this face are guaranteed to be co-planar[5].

Polyhedron: A Voronoi Cell is a convex polyhedron defined by connecting all the Voronoi
Faces of the natural neighbours about the given point.
To write an algorithm that extracts the Voronoi Cell for a given $x^*$, we must first find all the
natural neighbours of $x^*$. We can do this by using Walk to find a simplex which contains the
given point, and then by using a stack to step around the neighbours, halting when we are sure we
have found all simplices incident to $x^*$.
We now extract all the Voronoi vertices for the Voronoi Cell about $x^*$ by calculating the circum-centers
of each simplex incident to $x^*$. For $\{a, b, c, d\}$ the vertices in $\mathbb{R}^3$ of an arbitrary simplex,
we can calculate the circum-center by using the following:

$$c = a + \frac{|d-a|^2\,\left[(b-a) \times (c-a)\right] + |c-a|^2\,\left[(d-a) \times (b-a)\right] + |b-a|^2\,\left[(c-a) \times (d-a)\right]}{2\,(b-a) \cdot \left[(c-a) \times (d-a)\right]}$$
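Translated directly into C with a few small vector helpers, the formula becomes the sketch below; near-degenerate, flat simplices would still need the adaptive predicates discussed in C.2.3:

    /* Small vector helpers in R^3. */
    static void sub(const double a[3], const double b[3], double r[3])
    { r[0] = a[0] - b[0]; r[1] = a[1] - b[1]; r[2] = a[2] - b[2]; }

    static void cross(const double a[3], const double b[3], double r[3])
    {
        r[0] = a[1] * b[2] - a[2] * b[1];
        r[1] = a[2] * b[0] - a[0] * b[2];
        r[2] = a[0] * b[1] - a[1] * b[0];
    }

    static double dot(const double a[3], const double b[3])
    { return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]; }

    /* Circum-center of the tetrahedron {a, b, c, d}, from the formula above. */
    static void circumcenter(const double a[3], const double b[3],
                             const double c[3], const double d[3], double out[3])
    {
        double ba[3], ca[3], da[3], bc[3], db[3], cd[3];
        sub(b, a, ba); sub(c, a, ca); sub(d, a, da);
        cross(ba, ca, bc);     /* (b-a) x (c-a) */
        cross(da, ba, db);     /* (d-a) x (b-a) */
        cross(ca, da, cd);     /* (c-a) x (d-a) */
        double denom = 2.0 * dot(ba, cd);
        for (int i = 0; i < 3; i++)
            out[i] = a[i] + (dot(da, da) * bc[i] + dot(ca, ca) * db[i]
                           + dot(ba, ba) * cd[i]) / denom;
    }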
Once we have the set of Delaunay edges incident to our point, and a Voronoi vertex for
each simplex, we can consider generating the Voronoi Faces. To do this, we take each Delaunay
edge in turn and rotate around it, connecting the circum-centers of each simplex which we pass
through.
Once we have a complete set of Voronoi faces about a given point, we have successfully created
a Voronoi Cell.
C.7 Calculating Voronoi Cell Volumes
In order to do Natural Neighbour Interpolation, we must have a way to calculate the volume of any
given Voronoi Cell. Since there is no general closed formula for the volume of an irregular polyhedron,
we must first decompose the cell into a series of simplices, whose volumes we can calculate
using the following (for some simplex $\{a, b, c, d\}$):

$$V = \frac{\left|(a - d) \cdot \left[(b - d) \times (c - d)\right]\right|}{6}$$
To extract a set of simplices from a given Voronoi Cell, we use the fact that every Voronoi Cell
is convex[5]. We take each face in turn and create a two-dimensional triangulation by finding the
center point of the face (we simply use the arithmetic mean of all its vertices) and then drawing
edges out from the center point to each vertex of the face. We then use each triangle formed
by adding these edges as the base of a tetrahedron which is connected to the center point of the
Voronoi Cell, $x^*$. Since the Voronoi Cell is convex, we can do this for every face and be sure that
none of the tetrahedra intersect. Summing the volumes of each of the created tetrahedra
gives us the volume of the Voronoi Cell.
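The simplex volume formula translates directly into C; a self-contained sketch:

    #include <math.h>

    /* Volume of the tetrahedron {a, b, c, d}: |(a-d) . ((b-d) x (c-d))| / 6. */
    static double simplex_volume(const double a[3], const double b[3],
                                 const double c[3], const double d[3])
    {
        double u[3], v[3], w[3];
        for (int i = 0; i < 3; i++) {
            u[i] = a[i] - d[i];
            v[i] = b[i] - d[i];
            w[i] = c[i] - d[i];
        }
        double cx = v[1] * w[2] - v[2] * w[1];   /* (b-d) x (c-d) */
        double cy = v[2] * w[0] - v[0] * w[2];
        double cz = v[0] * w[1] - v[1] * w[0];
        return fabs(u[0] * cx + u[1] * cy + u[2] * cz) / 6.0;
    }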
C.8 Performing Interpolation
Given implementations of all the above routines, we can perform Natural Neighbour Interpolation
in $\mathbb{R}^3$. To proceed, we first add the point $x^*$ to the known Delaunay Triangulation of our data
$X \subset \mathbb{R}^3$. We now use Natural-Neighbours to find all the natural neighbours of $x^*$. For each
of those neighbours, we calculate the volume of the Voronoi Cell by first creating the Voronoi Cell for
each, and then decomposing it into tetrahedra as described above. We also store the volume of the
Voronoi Cell about $x^*$ as $\operatorname{vol}(\mathcal{V}(x^*))$. We next use our efficient point removal algorithm to remove the
last point which was inserted, whilst maintaining references to the points which were the natural
neighbours of $x^*$ when it was inserted. We now calculate the change in volumes of these natural
neighbours by recomputing the updated Voronoi Cells and their associated volumes. To estimate
the value $v^*$ at $x^*$, we weight each of the values of the natural neighbours about $x^*$ by dividing the
change in volume of each neighbouring cell, before and after the point to be interpolated is added,
by the volume of the cell about $x^*$, $\operatorname{vol}(\mathcal{V}(x^*))$.
Bibliography
[1] Isabel Beichl and Francis Sullivan. Coping with degeneracies in Delaunay triangulation. In Modeling, Mesh Generation, and Adaptive Numerical Methods for Partial Differential Equations.
1995.
[2] Geoff Bohling. Kriging. October 2005.
[3] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction
to Algorithms. The MIT Press, 2nd edition, September 2001.
[4] Herbert Edelsbrunner. An acyclicity theorem for cell complexes in d dimensions. Combinatorica,
10(3):251–260, 1990.
[5] Hugo Ledoux. Modelling three-dimensional fields in geoscience with the Voronoi diagram and
its dual. November 2006.
[6] Hugo Ledoux and Christopher Gold. An efficient natural neighbour interpolation algorithm for
geoscientific modelling. Springer, November 2006.
[7] Jonathan Richard Shewchuk. Robust adaptive floating-point geometric predicates. In Proc.
12th Annu. ACM Sympos. Comput. Geom., pages 141–150, 1996.
[8] William L. Wingle and Eileen P. Poeter. Directional semivariograms: Kriging anisotropy without anisotropy factors. May 1998.