1 Convex Hulls: 1.1 Definitions
1 Convex Hulls: 1.1 Definitions
1 Convex Hulls: 1.1 Definitions
First buffalo gal go around the outside, ’round the outside, ’round the outside.
Two buffalo gals go around the outside, ’round the outside, ’round the outside.
Three buffalo gals go around the outside.
Four buffalo gals go around the outside, ’round the outside, ’round the outside,
Four buffalo gals go around the outside, and dosey-do your partners.
— Malcolm McLaren, “Buffalo Gals”, Duck Rock (1983)
1 Convex Hulls
1.1 Definitions
Suppose we are given a set P of n points in the plane, and we want to compute something called the
convex hull of P. Intuitively, the convex hull is what you get by driving a nail into the plane at each point
and then wrapping a piece of string around the nails. More formally, the convex hull is the smallest
convex polygon containing the points:
• polygon: A region of the plane bounded by a cycle of line segments, called edges, joined end-to-end
in a cycle. Points where two successive edges meet are called vertices.
• convex: For any two points p and q inside the polygon, the entire line segment pq lies inside the
polygon.
• smallest: Any convex proper subset of the convex hull excludes at least one point in P. This
implies that every vertex of the convex hull is a point in P.
We can also define the convex hull as the largest convex polygon whose vertices are all points in P, or
the unique convex polygon that contains P and whose vertices are all points in P. Notice that P might
have interior points that are not vertices of the convex hull.
Just to make things concrete, we will represent the points in P by their Cartesian coordinates, in two
arrays X [1 .. n] and Y [1 .. n]. We will represent the convex hull as a circular linked list of vertices in
counterclockwise order. If the ith point is a vertex of the convex hull, next[i] is index of the next vertex
counterclockwise and pred[i] is the index of the next vertex clockwise; otherwise, next[i] = pred[i] = 0.
It doesn’t matter which vertex we choose as the ‘head’ of the list, and our decision to list vertices
counterclockwise instead of clockwise is arbitrary.
To simplify the presentation of the convex hull algorithms, I will assume that the points are in general
position, meaning (in this context) that no three points lie on a common line. This is just like assuming
that no two elements are equal when we talk about sorting algorithms. If we wanted to really implement
these algorithms, we would have to handle collinear triples correctly, or at least consistently. Laster in
the semester, we’ll see a systematic method to do this.
1
Computational Geometry Lecture 1: Convex Hulls
Suppose our three points are (a, b), (c, d), and (e, f ), given in that order, and for the moment, let’s
also suppose that the first point is furthest to the left, so a < c and a < f . Then the three points are in
←−−−−−→ ←−−−−−→
counterclockwise order if and only if the line (a, b)(c, d) is less than the slope of the line (a, b)(e, f ):
d−b f −b
counterclockwise ⇐⇒ <
c−a e−a
Since both denominators are positive, we can rewrite this inequality as follows:
This final inequality is correct even if (a, b) is not the leftmost point. If the inequality is reversed, then
the points are in clockwise order. If the three points are collinear (remember, we’re assuming that never
happens), then the two expressions are equal.
Another way of thinking about this counterclockwise test is that we’re computing the cross-product
of the two vectors (c, d) − (a, b) and (e, f ) − (a, b), which is defined as a 2 × 2 determinant:
c − a d − b
counterclockwise ⇐⇒ >0
e − a f − b
We can also think about this orientation test in terms of three-dimensional vectors. Recall that the
cross product u × v of two vectors in IR3 is defined according to the right-hand rule; the vectors u, v,
and u × v are oriented counterclockwise around the origin. Thus, three vectors u, v, and w are oriented
counterclockwise if and only if triple product (u × v) · w is positive. If we put our original points are in
the plane z = 1, their orientation is defined by the sign of the triple product ((a, b, 1) × (c, d, 1)) · (e, f , 1),
which can in turn be written as a 3 × 3 determinant:
1 a b
counterclockwise ⇐⇒ 1 c d > 0
1 e f
2
Computational Geometry Lecture 1: Convex Hulls
The algorithm then performs a series of pivoting steps to find each successive convex hull vertex,
starting with ` and continuing until it reaches ` again. The vertex immediately following a point p is
the point that appears furthest to the right to someone standing at p and looking at the other points.
In other words, if q is the vertex following p, and r is any other input point, then the triple p, q, r is in
counterclockwise order. We can find each successive vertex in linear time by performing a series of O(n)
counterclockwise tests.
Since the algorithm spends O(n) time for each convex hull vertex, the worst-case running time is
O(n2 ). However, this naïve analysis hides the fact that if the convex hull has very few vertices, Jarvis’s
march is extremely fast. A better way to write the running time is O(nh), where h is the number of
convex hull vertices. In the worst case, h = n, and we get our old O(n2 ) time bound, but in the best case
h = 3, and the algorithm only needs O(n) time. Computational geometers call this an output-sensitive
algorithm; the smaller the output, the faster the algorithm.
3
Computational Geometry Lecture 1: Convex Hulls
We now expand this dumbbell into the correct convex hull as follows. As long as there is a clockwise
turn at either endpoint of either bridge, we remove that point from the circular sequence of vertices and
connect its two neighbors. As soon as the turns at both endpoints of both bridges are counterclockwise,
we can stop. At that point, the bridges lie on the upper and lower common tangent lines of the two
subhulls. These are the two lines that touch both subhulls, such that both subhulls lie below the upper
common tangent line and above the lower common tangent line.
Merging the two subhulls takes O(n) time in the worst case. Thus, the running time is given by
the recurrence T (n) = O(n) + T (k) + T (n − k), just like quicksort, where k the number of points in R.
Just like quicksort, if we use a naïve deterministic algorithm to choose the pivot point p, the worst-case
running time of this algorithm is O(n2 ). If we choose the pivot point randomly, the expected running
time is O(n log n).
There are inputs where this algorithm is clearly wasteful (at least, clearly to us). If we’re really
unlucky, we’ll spend a long time putting together the subhulls, only to throw almost everything away
during the merge step. Thus, this divide-and-conquer algorithm is not output sensitive.
4
Computational Geometry Lecture 1: Convex Hulls
To change this polygon into the convex hull, we apply the following ‘three-penny algorithm’. We
have three pennies, which will sit on three consecutive vertices p, q, r of the polygon; initially, these are
` and the two vertices after `. We now apply the following two rules over and over until a penny is
moved forward onto `:
• If p, q, r are in counterclockwise order, move the back penny forward to the successor of r.
• If p, q, r are in clockwise order, remove q from the polygon, add the edge pr, and move the middle
penny backward to the predecessor of p.
Whenever a penny moves forward, it moves onto a vertex that hasn’t seen a penny before (except
the last time), so the first rule is applied n − 2 times. Whenever a penny moves backwards, a vertex is
removed from the polygon, so the second rule is applied exactly n − h times, where h is as usual the
number of convex hull vertices. Since each counterclockwise test takes constant time, the scanning
phase takes O(n) time altogether.
5
Computational Geometry Lecture 1: Convex Hulls
A single iteration of the main loop takes Θ(n) time in the worst case, because we may delete up
to i − 3 points. However, each point is deleted from the evolving hull at most once, so in fact the
entire sweep requires only O(n) time after the points are sorted. Thus, the overall running time of this
algorithm, like Graham’s scan, is O(n log n).
6
Computational Geometry Lecture 1: Convex Hulls
Inserting a point into a convex hull and updating conflict pointers for uninserted points.
The running time of each phase is dominated by (1) the number of concave corners removed and
(2) the number of changes to conflict pointers; everything else clearly takes O(n) time. Each point can
be removed from the polygon at most once, so the number of corner removals is clearly O(n). Thus, to
complete the analysis of our algorithm, we only need to determine the total number of conflict pointer
changes.
T (n) = O(n) + O(number of conflict pointer changes)
Unfortunately, in the worst case, a constant fraction of the pointers can change in a constant fraction
of the insertions, leading to Ω(n2 ) changes overall. Suppose the input point set consists of two columns,
each containing n/4 points, along with a cloud of n/2 points between and above both columns. If the
points are inserted in order from bottom to top, every point in the cloud changes its conflict pointer at
each of the first n/2 insertions.
7
Computational Geometry Lecture 1: Convex Hulls
However, if we insert the points in random order, the expected total number of conflict changes is only
O(n log n). To establish this claim, all we need to do is bound the probability that a particular point’s
conflict edge changes at a particular iteration of the insertion algorithm. For any pair of indices i and j,
let ei (p j ) denote the conflict edge for point p j ∈ P after the first i random points have been inserted. If
point p j has already been inserted, or is already inside the evolving hull, then ei (p j ) = NULL. We now
easily observe that
n X
X n
E[number of conflict pointer changes] ≤ Pr[ei (p j ) = ei−1 (p j )].
i=1 j=1
But how do we compute the probability that a given conflict pointer changes at a given iteration?
The easiest method is called backwards analysis—imagine the algorithm running backward in time. In
this backward view, the algorithm deletes the input points one at a time in random order, maintaining
the convex hull of the points that have not yet been deleted, along with conflict pointers for the points
that have been deleted. A given deletion changes the conflict pointer e(q) only under the following
circumstances:
• If q was a vertex of the polygon before the deletion, then q was the deleted point.
• If q was inside the polygon before the deletion, then it was outside afterward.
• If q was outside the polygon before the deletion, then one of the endpoints of e(q) was the deleted
point.
The first and second cases happen at most once per point, and thus at most n times altogether. To
bound the probability of the third case, we observe that our backwards algorithm (mhtirogla?) deletes a
random point at each iteration. Specifically, in the ith iteration (counting backward), the probability that
any particular point is deleted is exactly 1/i. Thus, the probability that an exterior point’s conflict
pointer changes is precisely 2/i. We conclude:
n X
X n
E[number of conflict pointer changes] ≤ Pr[ei (p j ) = ei−1 (p j )]
i=1 j=1
n X
n
X 2
≤ n+
i=1 j=1
i
8
Computational Geometry Lecture 1: Convex Hulls
Once we have the n/h subhulls, we follow the general outline of Jarvis’s march, ‘wrapping a string
around’ the n/h subhulls. Starting with p = `, where ` is the leftmost input point, we successively find
the convex hull vertex the follows p and counterclockwise order until we return back to ` again.
The vertex that follows p is the point that appears to be furthest to the right to someone standing
at p. This means that the successor of p must lie on a right tangent line between p and one of the
subhulls—a line from p through a vertex of the subhull, such that the subhull lies completely on the
right side of the line from p’s point of view. We can find the right tangent line between p and any subhull
in O(log h) time using a variant of binary search. (Details are left as an exercise.) Since there are n/h
subhulls, finding the successor of p takes O((n/h) log h) time altogether.
Since there are h convex hull edges, and we find each edge in O((n/h) log h) time, the overall running
time of the algorithm is O(n log h).
Unfortunately, this algorithm only takes O(n log h) time if a little birdie has told us the value of h in
advance. So how do we implement the ‘little birdie’? Chan’s trick is to guess the correct value of h; let’s
denote the guess by h∗ . Then we shatter the points into n/h∗ subsets of size h∗ , compute their subhulls,
and then find the first h∗ edges of the global hull. If h < h∗ , this algorithm computes the complete convex
hull in O(n log h∗ ) time. Otherwise, the hull doesn’t wrap all the way back around to `, so we know our
guess h∗ is too small.
Chan’s algorithm starts with the optimistic guess h∗ = 3. If we finish an iteration of the algorithm
i
and find that h∗ is too small, we square h∗ and try again. Thus, in the ith iteration, we have h∗ = 32 . In
the final iteration, h∗ < h2 , so the last iteration takes O(n log h∗ ) = O(n log h2 ) = O(n log h) time. The
total running time of Chan’s algorithm is given by the sum
k
X k
X
i
O(n log 32 ) = O(n) · 2i
i=1 i=1
for some integer k. Since any geometric series adds up to a constant times its largest term, the total
running time is a constant times the time taken by the last iteration, which is O(n log h). So Chan’s
algorithm runs in O(n log h) time overall, even without the little birdie.
9
Computational Geometry Lecture 1: Convex Hulls
balanced partition of the points; at the same time, whenever possible, the algorithm prunes away a large
subset of the interior points before recursing.
The basic QuickHull algorithm starts by identifying the leftmost and rightmost points, which we
will call a and z. We then find the convex hulls of the points on either side of the line ←→, using the
az
following recursive subroutine.
QUICKHULL(P, a, z):
−
→
〈〈Precondition: Every point in P lies to the left of az 〉〉
if P = ∅
next[z] ← a; pred[a] ← z
else
−
→
p ← point in P furthest to the left of az
−→
L ← all points in P to the left of ap
−→
R ← all points in P to the left of pz
QUICKHULL(L, a, p)
QUICKHULL(R, p, z)
Except for the recursive calls, QUICKHULL clearly runs in O(n) time. The point r is guaranteed to be a
convex hull vertex. Thus, each call to QUICKHULL discovers either a vertex or an edge of the convex hull,
so the overall running time is O(nh), just like Jarvis’s march. More importantly, any points inside the
triangle 4apz cannot be convex hull vertices; the QUICKHULL algorithm simply discards these points. As
a result of this filtering, this algorithm is blindingly fast in practice.
Unfortunately, the O(nh) running time analysis is tight in the worst case; because the splitting is
performed deterministically, an adversarial input can force unbalanced splits, just like quicksort. To
avoid these problems, we make a few minor changes to the algorithm, using the so-called prune and
search paradigm.2
←
→
To simplify the presentation slightly, I will assume that the line az is horizontal, and that a lies to
the left of z. This assumption can be enforced by a change of coordinates, but in fact, if everything is
implemented using the correct algebraic primitives, no such transformation is necessary.
We start by arbitrarily pairing up the points in P. Let q, r be the pair whose connecting line has
median slope among all n/2 pairs; we can find this pair in O(n) time, even without sorting slopes. We
now choose the pivot point p to be the point in P furthest above the line ← →, rather than the point
qr
furthest above the baseline az .←→
a z
Segment qr has median slope among all n/2 pairs, and p is the point furthest above ←
→.
qr
Next we prune the set P as follows: For each pair s, t of points in P, we discard any point in the
subset {a, z, p, s, t} that is inside the convex hull of those five points. This pruning step automatically
discards any points inside the triangle 4apz, just like the earlier algorithm, but other points may be
removed as well. The remainder of the algorithm is unchanged.
2
However, unlike quicksort, we can’t avoid unbalanced splits just by randomly permuting the points. The pivot point is
chosen according to its position in the plane, not its position in the input array. The prune-and-search algorithm could be
simplified by a solution to the following open problem: Given a set of n points in the plane, find a random vertex of the
convex hull in O(n) expected time. Specifically, each of the h hull vertices should be returned with probability (roughly) 1/h.
10
Computational Geometry Lecture 1: Convex Hulls
p p p
a z a z a z
Pruning pairs of points. Left: Both points survive. Middle: One point survives. Right: Both points discarded.
PRUNEDQUICKHULL(P, a, z):
−
→
〈〈Precondition: az points directly to the right 〉〉
−→
〈〈Precondition: Every point in P is above az 〉〉
if |P| < 5
use brute force
else
Arbitrarily pair up the points in P
q, r ← pair with median slope among all pairs
p ← point in P furthest above ← →
qr
for each pair s, t
Discard any interior point from {a, z, p, s, t}
L ← all points in P above ← →
ap
←
R ← all points in P above pz→
PRUNEDQUICKHULL(L, a, p)
PRUNEDQUICKHULL(R, p, z)
Now I claim that in the first recursive call, the subset L contains at most 3n/4 points. Consider a
pair s, t from our arbitrary pairing of P. If both of these points are in L, they must satisfy two conditions:
−→
(1) they both lie to the left of ap, and (2) the slope of st is larger than the slope of qr. By definition,
less than half the pairs have slope larger than the median slope. Thus, in at least n/4 pairs, at least one
point is not in L. Symmetrically, the subset R also contains at most 3n/4 points.
This simple observation about the sizes of L and R implies that the running time of the modified
algorithm is only O(n log h). The running time is described by the two-variable recurrence
where n1 + n2 ≤ n and max{n1 , n2 } ≤ 3n/4. Let’s assume that the O(n) term is at most αn for some
constant α. The following inductive proof implies that T (n, h) ≤ β n ln h for some constant β.
d 3 1 3h
3 ln x + ln(h − x) = − = 0 =⇒ x =
dx x h− x 4
11
Computational Geometry Lecture 1: Convex Hulls
Thus,
n 3h h
T (n, h) ≤ αn + β + ln
3 ln
4 4
4
3 3 1 1
≤ αn + β n ln h + β n ln + ln
4 4 4 4
≤ β n ln h + (α − 0.562336β)n.
12
Computational Geometry Lecture 1: Convex Hulls
Exercises
*Starred problems are more challenging.
1. Many convex hull algorithms closely resemble well-known algorithms for sorting: for example,
Jarvis’s march resembles selection sort, and QUICKHULL resembles quicksort. Describe a convex
hull algorithm that closely resembles mergesort: Arbitrarily partition the set of points into two
subsets of equal size, recursively compute the convex hull of each subset, and then merge the two
sub-hulls into a single convex hull. How do we merge two subhulls in O(n) time?
2. Let P be a convex polygon with n vertices, and let p be a point outside P. Recall that the right
tangent line between p and P is a line ← →, where q is a vertex of P, and for every other vertex r
pq
of P, the points p, q, r are oriented clockwise.
(a) Describe an algorithm to find the right tangent line between p and P in O(log n) time. You
will need an auxiliary data structure in addition to the standard doubly-linked list of vertices
of P. This data structure should have size O(n) and should be constructible in O(n) time.
(b) What does your algorithm do if p happens to lie inside P? Modify your algorithm to handle
this case correctly, without increasing its running time.
(c) Now suppose we add a new point to P and update its convex hull. Describe how to update
your auxiliary data structures from parts (a) and (b) in O(log n) time.
(d) Describe an incremental convex hull algorithm, based on parts (a)–(c), that runs in O(n log n)
time.
3. The convex layers of a point set X are defined by repeatedly computing the convex hull of X and
removing its vertices from X , until X is empty.
(a) Describe an algorithm to compute the convex layers of a given set of n points in the plane in
O(n2 ) time.
?
(b) Describe an algorithm to compute the convex layers of a given set of n points in the plane in
O(n log n) time.
13
Computational Geometry Lecture 1: Convex Hulls
4. Let X be a set of points in the plane. A point p in X is Pareto-optimal if no other point in X is both
above and to the right of p. The Pareto-optimal points can be connected by horizontal and vertical
lines into the staircase of X , with a Pareto-optimal point at the top right corner of every step. See
the figure above.
(a) QUICKSTEP: Describe a divide-and-conquer algorithm to compute the staircase of a given set
of n points in the plane in O(n log n) time.
(b) SCANSTEP: Describe an algorithm to compute the staircase of a given set of n points in the
plane, sorted in left to right order, in O(n) time.
(c) NEXTSTEP: Describe an algorithm to compute the staircase of a given set of n points in the
plane in O(nh) time, where h is the number of Pareto-optimal points.
(d) SHATTERSTEP: Describe an algorithm to compute the staircase of a given set of n points in the
plane in O(n log h) time, where h is the number of Pareto-optimal points.
In all these problems, you may assume that no two points have the same x- or y-coordinates.
5. The staircase layers of a point set are defined by repeatedly computing the staircase and removing
the Pareto-optimal points from the set, until the set becomes empty.
(a) Describe and analyze an algorithm to compute the staircase layers of a given set of n points
in O(n2 ) time.
?
(b) Describe and analyze an algorithm to compute the staircase layers of a given set of n points
in O(n log n) time.
(c) An increasing subsequence of a point set X is a sequence of points in X such that each point is
above and to the right of its predecessor in the sequence. Describe and analyze an algorithm
to compute the longest increasing subsequence of a given set of n points in the plane in
O(n log n) time. [Hint: There is a one-line solution that uses part (b). But why is is correct?]
6. Consider the two-variable recurrence that describes the running time of QUICKHULL:
(a) Suppose we can guarantee, at every level of recursion, that αn ≤ m ≤ (1 − α)n for some
constant 0 < α < 1. Prove that T (n, h) = O(n log h).
(b) Suppose we can guarantee, at every level of recursion, that αh ≤ g ≤ (1 − α)h for some
constant 0 < α < 1. Prove that T (n, h) = O(n log h).
(c) Suppose m is chosen uniformly at random from the range 1 ≤ m ≤ n − 1, independently at
each level of recursion. Prove that E[T (n, h)] = O(n log h).
(d) Suppose g is chosen uniformly at random from the range 1 ≤ g ≤ h − 1, independently at
each level of recursion. Prove that E[T (n, h)] = O(n log h).
(e) Now suppose we modify the recurrence slightly:
Prove that T (n, h) = O(n log h), regardless of the values of m and g.
Copyright
c 2008 Jeff Erickson. Released under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License (http://creativecommons.org/licenses/by-nc-sa/3.0/)
14