Edge Based Segmentation & Active Contours


Computer Vision, Department of Computing, Imperial College
GZ Yang and DF Gillies, http://www.doc.ic.ac.uk/~gzy

Edge-based segmentations rely on edges found in an image by edge detecting operators; these
edges mark image locations of discontinuities in grey level, colour, texture, etc. A variety of edge
detecting operators was described in the previous lecture, but the image resulting from edge detection
cannot be used directly as a segmentation result. Other processing steps must follow to combine
edges into contours that correspond better with borders in the image.
Edge image thresholding
In an edge image, small edge values correspond to non-significant grey level changes
resulting from quantization noise, small lighting irregularities, etc. Simple thresholding of an
edge image can be applied to remove these small values based on their magnitudes. If a
large threshold is applied, only more significant edges would appear in the resulting image.
If the original data has good contrast and is not noisy, this method can sometimes give good
results.
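As a minimal sketch (pure Python, with a made-up 4x4 gradient-magnitude array), edge image thresholding is a single comparison per pixel:

```python
def threshold_edges(magnitudes, t):
    """Keep only pixels whose edge magnitude exceeds the threshold t."""
    return [[1 if m > t else 0 for m in row] for row in magnitudes]

# Illustrative gradient magnitudes; strong responses lie along a diagonal edge.
magnitudes = [
    [ 2,  3, 40,  2],
    [ 1, 35, 42,  3],
    [30, 38,  4,  1],
    [36,  5,  2,  2],
]

edges = threshold_edges(magnitudes, 20)   # a large threshold keeps only significant edges
```

With threshold 20 the small responses due to noise vanish and only the six strong edge pixels survive; lowering the threshold admits progressively weaker edges.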
Graph Searching
The simplest, and also the least effective, method of grouping edges is to use heuristic search.
That is to say, the algorithm starts from a single boundary pixel, and then attempts to join
neighbouring pixels on the basis of their edge strength and direction. At the end of the process
the boundary contours are thinned by removing pixels at points where the contour is more than
one pixel thick. This strategy is greatly enhanced if use is made of the edge magnitude and direction.
For a given image, let the gradient magnitude be specified by g(x,y) and the gradient direction
by the angle n(x,y). We can immediately define heuristics based on gradient to indicate when two
adjacent pixels can be joined. A threshold, θ, can be defined and a restriction introduced that
two adjacent pixels can only be joined if:

    |n(xi,yi) − n(xj,yj)| mod 2π < θ

A second heuristic is to take into account the absolute value of the gradient. When two or
more pixels are joined, a curve is created which has a tangent defined at each pixel. In this
case, we only consider joining arcs to pixels if the implied curve gradient is orthogonal to the
edge point direction. This method does work well when the boundaries being recovered are
straight lines from, for example, an old engineering drawing being digitized. However, it is a
poor heuristic for shaded images.
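The direction heuristic can be sketched as follows; the threshold value θ = π/8 is an arbitrary choice for illustration:

```python
import math

def directions_compatible(n_i, n_j, theta=math.pi / 8):
    """Join two adjacent pixels only if their gradient directions differ
    by less than theta, taking the difference modulo 2*pi."""
    d = abs(n_i - n_j) % (2 * math.pi)
    d = min(d, 2 * math.pi - d)   # wrap-around: 0.1 and 2*pi - 0.1 are close
    return d < theta

# Nearly parallel directions may be joined; orthogonal ones may not.
directions_compatible(0.10, 0.15)        # True
directions_compatible(0.0, math.pi / 2)  # False
```

The wrap-around step matters: two directions just either side of 0 differ by nearly 2π numerically but are almost identical as angles.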
Heuristic restrictions allow us to reduce the problem to the search of a directed graph, in which
the pixels are the nodes, and the arcs are those adjacencies allowable under the above
restrictions. Now a boundary determination can be made by weighting the nodes according
to their gradient magnitude. For any two pixels for which there is more than one path, a graph
search can be initiated to find the most likely path. A third heuristic can be used, which is that,
given two competing paths, the shorter is the more likely. A weighting factor is defined based
on the gradient. If the maximum gradient available is gmax, a suitable weighting function is:

    w(x,y) = 1 − g(x,y)/gmax

Note that the gradient magnitude is always in the range 0 to gmax, so the weight is a cost
between 0 and 1: the smaller the weight, the more likely the pixel is to be a boundary point.
The problem now becomes a simple cost minimisation which can be solved by any of the
standard search techniques (e.g. depth-first search with backtracking, keeping as many
expanded paths as memory allows, branch and bound, etc.). Other heuristic measures, such
as the difference in gradient or the direction of the target node, can also be incorporated.
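A sketch of the resulting graph search, using Dijkstra-style uniform-cost search (one of the standard techniques just mentioned) and taking the per-pixel cost to be 1 − g/gmax; the 3x3 test image is invented for illustration:

```python
import heapq

def boundary_path(g, start, goal):
    """Minimum-cost 8-connected path between two pixels, where the cost of
    visiting pixel (x, y) is 1 - g[y][x]/gmax (strong edges are cheap)."""
    gmax = max(max(row) for row in g)
    h, w = len(g), len(g[0])
    cost = lambda x, y: 1.0 - g[y][x] / gmax
    frontier = [(cost(*start), start, [start])]
    seen = set()
    while frontier:
        c, (x, y), path = heapq.heappop(frontier)
        if (x, y) == goal:
            return path                      # cheapest path found first
        if (x, y) in seen:
            continue
        seen.add((x, y))
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (dx or dy) and 0 <= nx < w and 0 <= ny < h and (nx, ny) not in seen:
                    heapq.heappush(frontier, (c + cost(nx, ny), (nx, ny), path + [(nx, ny)]))
    return None

# High gradient magnitudes form an L-shaped boundary; the search follows it.
g = [[10, 1, 1],
     [10, 1, 1],
     [10, 10, 10]]
path = boundary_path(g, (0, 0), (2, 2))
```

Every pixel on the returned path has maximal gradient, i.e. zero cost; any path cutting across the low-gradient interior would cost more and is never popped first.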

Dynamic Programming Solution


Boundary following can be considered an optimisation problem which can be solved by a
variety of techniques. One early method suggested by Ballard is to use dynamic programming.
To use optimisation techniques we need to associate a function to optimise with a boundary.
A simple solution is to sum the differences in gradient between adjacent pixels and subtract
the edge strengths around the boundary hence we are seeking to minimise:
    C(p1, p2, ..., pn) = Σ(i=1..n) [ |n(pi−1) − n(pi)| − α g(pi) ]

where the pi are the boundary points (xi,yi), with the constraint that pi is adjacent to pi+1. For a
closed boundary, we have that p1 = pn, and for a boundary segment between p1 and pn we
define n(p0) = n(p1). The variable α is a constant defining the relative importance of the two criteria.
The problem is now to find the boundary which minimises this function. In practice, we shall
use the method to join up boundary points rather than to search for a complete boundary.
Applied without any a priori knowledge, the method comes down to exhaustive search.
Starting from some pixel p1=ps, we compute a value of C(ps,p2) for p2 equal to each of its
possible eight neighbours. The set of all possible paths of length 3 pixels beginning from the
start, ps, fall in the square of 25 pixels around ps. For each of these we compute a value for
C(ps,p2,p3). Then we proceed to the paths of length 4 pixels, which can go from ps to any pixel
in the square of 49 pixels around ps. In total, after N steps (a path of N+1 pixels), the candidate
pixels lie in a square of (2N+1)² = 1 + 4N(N+1) pixels around the start pixel. For each of these
pixels, to compute the optimum path of length n, we need only consider the paths of length n−1
ending at its eight neighbours, or three neighbours for a pixel on the edge of the square. Note
also we must continue until the
square includes the end point, and possibly goes some way beyond it. Clearly, this process
will soon become too expensive in both memory requirements (since we must store all of the
optimal paths found) and computing time. Note that the dynamic programming method means
only that we store one optimal path to each end pixel, rather than all the non-looping paths.
It is possible to limit the search space by imposing constraints, for example that the direction
does not change by more than 45° at each pixel. If this is the case, then there are only three
possible successors to each candidate pixel when boundary following. Another possibility is
to prune the search tree. Thus when evaluating paths of a certain length, we eliminate those
whose maximum is too small, or those whose distance from the end point is too large.
Limitations of this kind, like heuristics, will mean that we cannot guarantee that the path found
is an optimal one.
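A compact sketch of a dynamic programming boundary search under the 45° constraint just described. To stay short, this variant simplifies the cost to the edge-strength term only (it omits the direction-difference term) and restricts the boundary to one pixel per image column:

```python
def dp_boundary(g):
    """Minimum-cost left-to-right boundary, one pixel per column, with the row
    changing by at most one between columns (the 45-degree constraint).
    Cost of a path is the sum of -g over its pixels, so strong edges are
    preferred. Only one optimal path per end pixel is ever stored."""
    rows, cols = len(g), len(g[0])
    cost = [[-g[r][0] for r in range(rows)]]        # column 0: path = single pixel
    back = []                                       # back[c-1][r]: best predecessor row
    for c in range(1, cols):
        col_cost, col_back = [], []
        for r in range(rows):
            preds = [pr for pr in (r - 1, r, r + 1) if 0 <= pr < rows]
            best = min(preds, key=lambda pr: cost[c - 1][pr])
            col_cost.append(cost[c - 1][best] - g[r][c])
            col_back.append(best)
        cost.append(col_cost)
        back.append(col_back)
    r = min(range(rows), key=lambda r: cost[-1][r])  # cheapest end pixel
    path = [r]
    for c in range(cols - 1, 0, -1):                 # trace the stored optimal path
        r = back[c - 1][r]
        path.append(r)
    return path[::-1]

# The boundary follows the diagonal ridge of strong gradients.
g = [[9, 1, 1],
     [1, 9, 1],
     [1, 1, 9]]
dp_boundary(g)   # -> [0, 1, 2]
```

The table holds one optimal cost and one back-pointer per pixel, which is exactly the memory saving the text describes: competing paths to the same pixel are discarded as soon as a cheaper one is known.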

Boundary refinement
The computational demands of the method make it only suitable for filling up the gaps in a
partially completed boundary. Another idea is to refine a given boundary segment. We can
again use the function which we choose in the dynamic programming method, but instead of
searching the complete space we use only a few control points and minimise the function for
these. The strategy is to consider each control point in turn and move it to the pixel in its local
neighbourhood which gives us the minimum. In order to do this some strategy must be
adopted for moving the control points in the event of there being no or little other information
in the image. For a closed boundary we could make the initial estimate surround the object of
interest, and add in another term to the objective function to penalise the total length. Thus,
the contour will close like an elastic band around the object until the high gradient term stops
the process. If a gap exists then a control point could drift into the object until stopped by the
continuity condition. A difficulty with this type of strategy is that the control points can drift
apart, and so another term, which preserves continuity, is to penalise badly distributed points.
A possible term to do this is the sum over all control points of |dave − |pi − pi+1||, where dave is the average spacing between adjacent control points. Various authors
have suggested variants of this which work more or less successfully in different applications.
Once a good approximation has been found for the control points a divide and conquer
algorithm could be invoked. Starting with the straight line joining the two end points of a
segment of the boundary, we pick its midpoint, and search its neighbourhood until we find the
optimum pixel. This point is then taken as a boundary point, and the two resulting segments
are refined, until sufficient accuracy is obtained, or no suitable edge points can be found.
Alternatively, we could then use dynamic programming type search to fill the gaps.
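A sketch of the divide and conquer refinement: at each level, the pixel of maximum gradient magnitude near the current segment midpoint is adopted as a new boundary point, and the two halves are refined recursively. The grid, the recursion depth and the search radius are illustrative choices:

```python
def refine_segment(g, p0, p1, depth=2, radius=1):
    """Refine the segment p0-p1 (points are (x, y) tuples) by picking the
    strongest-gradient pixel within `radius` of the midpoint, then recursing
    on the two halves until `depth` is exhausted."""
    if depth == 0:
        return [p0, p1]
    mx, my = (p0[0] + p1[0]) // 2, (p0[1] + p1[1]) // 2
    _, best = max((g[y][x], (x, y))
                  for x in range(max(0, mx - radius), min(len(g[0]), mx + radius + 1))
                  for y in range(max(0, my - radius), min(len(g), my + radius + 1)))
    left = refine_segment(g, p0, best, depth - 1, radius)
    right = refine_segment(g, best, p1, depth - 1, radius)
    return left[:-1] + right          # drop the duplicated midpoint

# The true boundary dips below the straight line joining the end points.
g = [[0] * 5 for _ in range(5)]
g[3][2] = 9                           # strong edge response at (x=2, y=3)
refine_segment(g, (0, 2), (4, 2), depth=1)   # -> [(0, 2), (2, 3), (4, 2)]
```

Recursion stops either at a fixed depth, as here, or (as in the text) when sufficient accuracy is reached or no suitable edge pixel lies within the search radius.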

[Figure 3.1: Divide and conquer. Control points Vi and Vi+1 lie near the real boundary; the best pixel local to the mid point of each segment is added to the refined boundary.]

Active Contours (Snakes)


The Snake is a specific example of a general deformable model, first proposed by Kass,
Witkin and Terzopoulos, which can be matched to an image contour by energy minimisation.
It is an energy minimising spline; from a given starting point it deforms itself to conform with
the nearest salient contour. Snakes do not "find" contours; the initial location must be provided
either by other processing or by higher level knowledge. For example, a snake may track lips
in an image sequence as they change shape provided the starting position is known.
The snake is an energy functional, a weighted combination of internal and external forces. To
obtain the best fit between the snake and the object, we minimise the energy. Specifically, a
snake is defined as
    Esnake = ∫₀¹ [ Einternal(v(s)) + Eimage(v(s)) + Econstraint(v(s)) ] ds

where Einternal is the internal spline energy caused by stretching and bending, Eimage is a
measure of the attraction of image features such as contours, and Econstraint is a measure of
external constraints, e.g. higher level understanding of the general shape or user-applied
energy. In the above equation v(s) = (x(s), y(s), z(s)) or (x(s), y(s)) is the parametric
representation of the contour in 3D or 2D space.

Figure 3.2: External and image forces which determine the snake

The internal energy provides a smoothness constraint. This can be further defined as

    Einternal = α(s) |dv/ds|² + β(s) |d²v/ds²|²

where α(s) is a measure of the elasticity of the snake and β(s) is a measure of its stiffness.
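As an illustration of how these energy terms act, here is one pass of a greedy snake update (a Williams and Shah style simplification rather than the variational solution of Kass et al.): each control point moves to the 8-neighbour that minimises a discrete elasticity + stiffness − image-attraction energy. The weights and the test image are invented:

```python
def greedy_snake_step(pts, g, alpha=1.0, beta=1.0, gamma=1.0):
    """One greedy update of a closed snake. pts is a list of (x, y) control
    points; g[y][x] is the image gradient magnitude. Each point moves to the
    neighbour minimising alpha*elasticity + beta*stiffness - gamma*g."""
    new = list(pts)
    for i in range(len(pts)):
        prev, nxt = new[i - 1], pts[(i + 1) % len(pts)]
        x0, y0 = pts[i]
        best, best_e = pts[i], float("inf")
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                x, y = x0 + dx, y0 + dy
                if not (0 <= y < len(g) and 0 <= x < len(g[0])):
                    continue
                elastic = (x - prev[0]) ** 2 + (y - prev[1]) ** 2        # stretching
                stiff = (prev[0] - 2 * x + nxt[0]) ** 2 \
                      + (prev[1] - 2 * y + nxt[1]) ** 2                  # bending
                e = alpha * elastic + beta * stiff - gamma * g[y][x]     # image term attracts
                if e < best_e:
                    best_e, best = e, (x, y)
        new[i] = best
    return new

# A bright edge pixel at (0, 2) pulls the nearest control point towards it.
g = [[0] * 5 for _ in range(5)]
g[2][0] = 50
pts = [(1, 1), (3, 1), (3, 3), (1, 3)]
greedy_snake_step(pts, g)[0]    # -> (0, 2)
```

Iterating this step drives the contour towards nearby strong edges while the internal terms keep it smooth and evenly spaced; the Econstraint term would be added as a further summand in e.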

Summary
The success of the tracing algorithm will be dependent, to some extent, on the type of edges
that describe the object. If they are clearly defined, such as digitizations of black and white
objects with thick lines, then simple tracing will produce the correct results. However, if there
are degrees of shading in the object, together with some random noise components, then it
is not clear where to take the edge point thresholds, and the resulting boundaries from simple
tracing will show gaps, which can be filled by dynamic programming or other optimisation
methods.
Edge tracing methods do depend on there being simple boundaries with a proportion of edge
points that can be determined easily. In some cases the initial edge points could be identified
by an operator, and the boundaries found using dynamic programming or divide and conquer.
It is however rather harder to find these seed edge points automatically. Thresholding tends
to eliminate vital information, and the ordering of points such that they can be joined by
boundary segments is a non-trivial problem. For pictures which contain many objects, with lots
of detail, there is the further problem of deciding which boundary segment goes with which
object. Consequently, these types of method often fail.