EECS 3101: Prof. Andy Mirzaian
• [CLRS] chapter 16
• Lecture Notes 7, 8
TOPICS
Combinatorial Optimization Problems
The Greedy Method
Problems:
Coin Change Making
Event Scheduling
Interval Point Cover
More Graph Optimization problems considered later
COMBINATORIAL OPTIMIZATION
Find the best solution out of finitely many possibilities.
Mr. Roboto:
Find the best path from A to B avoiding obstacles.
There may be only finitely many critical paths to try to find the best.
But you need a simple and fast (incremental) greedy strategy.
Combinatorial Optimization Problem (COP)
INPUT: Instance I to the COP.
Feasible Set: FEAS(I) = the set of all feasible (or valid) solutions for instance I,
usually expressed by a set of constraints.
Objective Cost Function:
Instance I includes a description of the objective cost function CostI, which maps each
solution S (feasible or not) to a real number or ±∞.
Goal: Optimize (i.e., minimize or maximize) the objective cost function.
Optimum Set:
OPT(I) = { Sol ∈ FEAS(I) | CostI(Sol) ≤ CostI(Sol′), ∀ Sol′ ∈ FEAS(I) }
= the set of all minimum-cost feasible solutions for instance I (stated here for minimization).
Combinatorial means the problem structure implies that only a discrete & finite
number of solutions need to be examined to find the optimum.
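To make these definitions concrete, here is a tiny illustrative sketch (not from the slides; the instance and names are hypothetical): a feasible solution is any subset of the given items whose total weight reaches a target, and the cost to minimize is the number of items used. Exhaustive search examines the finitely many candidate subsets directly.

```python
from itertools import combinations

def brute_force_opt(items, target):
    """Enumerate FEAS(I) by brute force and return one member of OPT(I):
    a feasible subset (total >= target) using the fewest items."""
    best = None
    for r in range(len(items) + 1):                  # finitely many subsets
        for subset in combinations(items, r):
            if sum(subset) >= target:                # feasibility constraint
                if best is None or len(subset) < len(best):
                    best = subset                    # lower cost (fewer items)
    return best                                      # None means FEAS(I) is empty

print(brute_force_opt([5, 3, 8, 2], 10))   # (5, 8)
```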
Greedy attitude:
Don’t worry about the long term consequences of your current action.
Just grab what looks best right now.
Most COPs restrict a feasible solution to be a finite set or sequence from
among the objects described in the input instance.
E.g., a subset of edges of the graph that forms a valid path from A to B.
S = the set of all input objects, partitioned during the greedy process into:
C = committed objects, U = undecided objects, R = rejected objects.
(1) Minimize d(u,v): take the shortest step while making progress.
(2) Minimize d(v,B): get as close to B as possible.
(3) Minimize d(u,v) + d(v,B): stay nearly as straight as possible.
Which of these greedy choices is guaranteed to produce the shortest A-B path?
[Figure: at the current position u, the paths suggested by greedy choices (1), (2), (3) toward B, compared with the shortest A-B path.]
The Greedy Loop Invariant
The following are equivalent statements of the generic greedy Loop Invariant:
The decisions we have made so far are safe, i.e.,
they are consistent with some optimum solution (if there is any feasible solution).
Either there is no feasible solution, or
there is at least one optimum solution Sol ∈ OPT(I) that
includes every object we have committed to (set C), and
excludes every object that we have rejected (set R = S – U – C).
[Figure: the input set S partitioned into C, U, R, with Sol sandwiched between C and C ∪ U.]
LI:  FEAS(I) = ∅  or  [ ∃ Sol ∈ OPT(I) :  C ⊆ Sol ⊆ C ∪ U ].
Pre-Cond: S is the set of objects in input instance I
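As a compact illustration of this generic template (a minimal Python sketch; the function and predicate names are mine, not from the slides), the greedy loop maintains the three sets C, R, U and the loop invariant above:

```python
def generic_greedy(S, looks_best, conflict_free):
    """Generic greedy skeleton.
    S: all input objects; looks_best: picks the most promising undecided object;
    conflict_free: tests whether committing keeps the partial solution feasible."""
    C, R, U = set(), set(), set(S)      # committed, rejected, undecided
    # LI: FEAS(I) is empty, or some Sol in OPT(I) satisfies C <= Sol <= C u U
    while U:                            # exit-cond: U is empty
        x = looks_best(U)               # greedy choice
        U.remove(x)
        if conflict_free(C, x):
            C.add(x)                    # commit: the LI must be re-established
        else:
            R.add(x)                    # reject: the LI must be re-established
    # LI & exit-cond => C is an optimum solution (when FEAS(I) is non-empty)
    return C
```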
LI & ¬exit-cond & LoopCode ⟹ LI   (the loop invariant is maintained by every iteration; established by the case analysis below)
Case 1b: x committed but x ∉ Sol (the case where x is rejected but x ∈ Sol is handled symmetrically).
Trade off: show that a new solution obtained by exchanging x with some suitable object y, e.g.
Solnew = Sol ∪ {x} – {y} for some y ∈ Sol – C   (respectively, Solnew = Sol ∪ {y} – {x} for some y ∈ U – Sol),
satisfies:
(1) Solnew ∈ FEAS, and
(2) Cost(Solnew) is no worse than Cost(Sol).
Sometimes more than one object on either side is traded off.
Hence Solnew ∈ OPT, and Solnew is consistent with the updated sets C and R.
[Figure: the trade-off between Sol ∈ OPT and Solnew ∈ OPT across the sets C, U, R.]
COIN CHANGE MAKING
Use the minimum number of coins to make change for a given amount.
Greedy: pick the largest coin that fits first, then iterate.
The Coin Change Making Problem
PROBLEM: Within a coin denomination system, make change for a given amount S
using the fewest coins possible.
Objective function:  minimize  x1 + x2 + ··· + xn
subject to the feasibility constraints:
(1)  a1x1 + a2x2 + ··· + anxn = S
(2)  xt ∈ ℕ,  for t = 1..n
The Greedy Choice & Objective
Greedy Choice: choose the largest coin that fits
Conflict Free:  U ≥ 0,  where  U = S – (a1x1 + a2x2 + ··· + anxn)
Problem Objective Cost:  x1 + x2 + ··· + xn
Greedy Objective Cost:  x1 + x2 + ··· + xn + λU
(λ is an unspecified, prohibitively large positive number)
The Greedy Loop Invariant
Generic Greedy Loop Invariant (∃ feasible solution):
∃ Sol ∈ OPT :  C ⊆ Sol ⊆ C ∪ U.
Loop Invariant:
⟨x1, x2, …, xn⟩ ∈ FEAS(S – U),          (need U more to reach target S)
∃ Sol = ⟨y1, y2, …, yn⟩ ∈ OPT(S):
    yk = xk ,  for k < t,               (Sol excludes what Greedy has rejected)
    yt ≥ xt ,                           (Sol includes what Greedy has committed to)
    xk = 0 ,   for k > t.               (not yet considered)
Algorithm: Greedy Coin Change
Pre-Cond: input is a = ⟨a1, a2, …, an⟩, a1 > a2 > ··· > an = 1, and S (all positive integers)

    t ← 1;  x1, x2, …, xn ← 0;  U ← S                        § PreLoopCode
    § LI: ⟨x1, x2, …, xn⟩ ∈ FEAS(S – U),
    §     ∃ Sol = ⟨y1, y2, …, yn⟩ ∈ OPT(S): yk = xk for k < t,  yt ≥ xt ,  xk = 0 for k > t.
    § MP (measure of progress) = U + (n – t)
    while U ≠ 0 do                                           § exit-cond: U = 0
        if U < at
            then  t ← t + 1                                  § reject: no more at
            else  xt ← xt + 1                                § commit to another at
                  U ← U – at
    end-while
    return ⟨x1, x2, …, xn⟩

Proof obligations:  LI & ¬exit-cond & LoopCode ⟹ LI,  and  LI & exit-cond (U = 0) ⟹ ⟨x1, x2, …, xn⟩ ∈ OPT(S).
Post-Cond: output ⟨x1, x2, …, xn⟩ ∈ OPT(S)
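A direct Python transcription of this loop, as a sketch (the function name and list representation are my own):

```python
def greedy_coin_change(a, S):
    """a: coin values with a[0] > a[1] > ... > a[-1] = 1; S: the amount.
    Returns x, where x[t] = number of coins of value a[t] taken by Greedy."""
    x = [0] * len(a)
    t, U = 0, S                  # U = amount still to be made up
    while U != 0:
        if U < a[t]:
            t += 1               # reject: no more coins of value a[t]
        else:
            x[t] += 1            # commit to another coin of value a[t]
            U -= a[t]
    return x

print(greedy_coin_change([25, 10, 5, 1], 67))   # [2, 1, 1, 2] -> 6 coins
```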
Efficient implementation
Rather than committing to one coin per iteration, Greedy can commit to all copies of the current
coin at once, processing the coins from largest to smallest.
(The formulation is as before: minimize x1 + x2 + ··· + xn subject to
a1x1 + a2x2 + ··· + anxn = S and xt ∈ ℕ for t = 1..n.)
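A sketch of the faster implementation (assuming, as seems intended, that the speed-up is to take all copies of each coin at once with integer division; this produces the same greedy answer in O(n) arithmetic operations):

```python
def greedy_coin_change_fast(a, S):
    """Same greedy solution as the one-coin-per-iteration loop,
    but each denomination is handled by a single divmod step."""
    x = []
    U = S
    for coin in a:               # a is in decreasing order, a[-1] = 1
        q, U = divmod(U, coin)   # take as many coins of this value as fit
        x.append(q)
    return x

print(greedy_coin_change_fast([25, 10, 5, 1], 67))   # [2, 1, 1, 2]
```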
Is G(S) = Opt(S) ?
A coin denomination system is called Regular if in that system G(S) = Opt(S) for all S.
Questions:
(1) In the Canadian Coin System, is G(S) = Opt(S) for all S?
Answers:
(1) YES. It’s Regular. We will see why shortly.
How to Test Regularity
• Question: Is a given Coin Denomination System Regular,
i.e., in that system is G(S) = Opt(S) for all S?
• Greedy( a1, a2, …, an, S )  =  ( x = ⟨x1, x2, …, xn⟩ ;  G(S) = Σt xt )
  Optimum( a1, a2, …, an, S )  =  ( y = ⟨y1, y2, …, yn⟩ ;  Opt(S) = Σt yt )
Regularity & Critical Values
• Consider, for each t ≥ 2, the smallest multiple of coin value at that is not smaller than the
  next larger coin value at–1:   mt = ⌈at–1 / at⌉ ,  with critical value  St = at · mt .
• Since St can always be paid with mt coins of value at, a necessary condition for regularity is
  G(St) ≤ mt for every t.
• This also turns out to be sufficient (under some mild pre-condition that is
  satisfied by virtually any reasonable coin denomination system):

A regular system (e.g., the Canadian-style denominations 200, 100, 25, 10, 5, 1):
  t                   2     3     4     5     6
  mt = ⌈at–1/at⌉      2     4     3     2     5
  St = at · mt        200   100   30    10    5
  Greedy G(St)        1     1     2     1     1      (G(St) ≤ mt for every t)

An irregular system:
  t                   2     3     4     5     6
  mt = ⌈at–1/at⌉      2     4     3     3     5
  St = at · mt        200   100   33    15    5
  Greedy G(St)        1     1     5     5     1      (e.g., G(33) = 5 > m4 = 3)
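The table suggests a simple mechanical test. The sketch below (in Python) checks the necessary condition G(St) ≤ mt on every critical value; per the slide this is also sufficient under a mild pre-condition, which is not verified here. The denominations of the two example systems are reconstructed from the tables above.

```python
from math import ceil

def greedy_count(a, S):
    """Number of coins Greedy uses for amount S (a decreasing, a[-1] = 1)."""
    count = 0
    for coin in a:
        q, S = divmod(S, coin)
        count += q
    return count

def passes_critical_value_test(a):
    """Check G(S_t) <= m_t for every critical value S_t = a_t * ceil(a_{t-1}/a_t)."""
    for t in range(1, len(a)):
        m = ceil(a[t - 1] / a[t])
        if greedy_count(a, a[t] * m) > m:
            return False             # greedy is beaten on a critical value
    return True

print(passes_critical_value_test([200, 100, 25, 10, 5, 1]))   # True  (first table)
print(passes_critical_value_test([200, 100, 25, 11, 5, 1]))   # False (G(33) = 5 > m4 = 3)
```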
The Optimum Sub-Structure Property
We just noticed an important property that will be used many times later:
The optimum sub-structure property: any sub-structure of an optimum structure is itself an
optimum structure (for the corresponding sub-instance).
Problems with this property are usually amenable to more efficient algorithmic solutions
than brute-force or exhaustive search methods.
This property is usually shown by a “cut-&-paste” argument (see below).
EVENT SCHEDULING
Goal: select the maximum number of conflict-free time intervals.
The Problem Statement
INPUT: A set S = { I1, I2, ···, In } of n event time-intervals Ik = ⟨sk , fk⟩, k = 1..n,
where sk = start time of Ik , fk = finishing time of Ik ( sk < fk ).
Example:
S = the intervals shown below; C = the blue intervals. C is not the unique optimum,
and |C| = 4. Can you spot another optimum solution?
[Figure: a set of intervals drawn on the time line, with an optimum conflict-free subset C of 4 intervals highlighted in blue.]
Some Greedy Heuristics
Greedy iteration step:
From among undecided intervals, select the interval I that looks BEST.
Commit to I if it’s conflict-free
(i.e., doesn’t overlap with the committed ones so far).
Reject I otherwise.
Earliest Finishing Time First
MaxF(X) = max{ fk | Ik ∈ X },   MinF(X) = min{ fk | Ik ∈ X }
S = the set of n given intervals        Last = MaxF(C)
C = committed intervals                 C ⊆ { Ik ∈ S | fk ≤ Last }
R = rejected intervals                  C ∪ R ⊇ { Ik ∈ S | fk < Last }
U = undecided intervals                 U ⊆ { Ik ∈ S | fk ≥ Last }
LOOP INVARIANT:
∃ Sol ∈ Opt(S) :  C ⊆ Sol ⊆ C ∪ U,
MaxF(C) = Last ≤ MinF(U).
[Figure: the intervals of the running example on the time line 1..9.]
So, C ⊆ Sol ⊆ C ∪ (U – {Ik}). Thus, Sol still satisfies the 1st line of the LI.
1. sort the intervals of S so that f1 ≤ f2 ≤ ··· ≤ fn     § earliest finishing time first
2. C ← ∅ ;  Last ← –∞
3. for k ← 1 .. n do
4.     if Last ≤ sk then  C ← C ∪ {Ik}
5.                        Last ← fk
6. end-for
7. return (C)
end
[Figure: a trace of the algorithm on the example time line 1..9, showing how Last advances as intervals are committed.]
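A Python sketch of the algorithm (intervals represented as (s, f) pairs; the sorting step of line 1 is written out explicitly):

```python
def max_conflict_free(intervals):
    """Earliest-finishing-time-first greedy event scheduling.
    intervals: list of (s, f) pairs with s < f. Returns a largest conflict-free subset."""
    C = []
    last = float("-inf")                                     # Last = MaxF(C)
    for s, f in sorted(intervals, key=lambda iv: iv[1]):     # f1 <= f2 <= ... <= fn
        if last <= s:                                        # conflict-free with C
            C.append((s, f))
            last = f                                         # Last advances
    return C

events = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (6, 10), (8, 11)]
print(max_conflict_free(events))   # [(1, 4), (5, 7), (8, 11)]
```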
INTERVAL POINT COVER
We have a volunteer group to canvass homes & businesses along Yonge Street.
The Problem Statement
INPUT: A set P = { p1, p2, … , pn } of n points, and
a set I = { I1 = ⟨s1, f1⟩, I2 = ⟨s2, f2⟩, … , Im = ⟨sm, fm⟩ } of m intervals,
all on the real line.
OUTPUT: a minimum number of intervals from I that collectively cover all points of P,
or a report that some point of P is not covered by I.
[Figures: Example 1 shows an instance where I covers P; Example 2 shows an instance with a point not covered by I.]
Some Greedy Heuristics
Greedy choice: pick the incrementally BEST undecided interval first.
Cover leftmost uncovered point first
C ⊆ I (committed intervals),  U ⊆ P (uncovered points)
GREEDY CHOICE
(1) Pick the leftmost point p ∈ U (not covered by any interval in C).
    If no interval in I covers p, then report that p is exposed.
(2) Else, add to C an interval from I – C that covers p and extends as far right as possible.
LOOP INVARIANT:
(exposed ⟹ p is not covered by I)  &
(I does not cover P,  or  ∃ Sol ∈ Opt(I,P) :  C ⊆ Sol)  &
MaxF(C) = Last < Min(U).
Pre-Cond & PreLoopCode ⟹ LI :  lines 1, 2, 3 of the LI become true initially.
LI & exit-cond ⟹ PostLoopCond.
Maintenance of the LI when the greedy choice commits to interval Ik (the active interval covering p that extends farthest right):
[Figure: the leftmost uncovered point p, covered by Ik (finishing at fk) and by It (finishing at ft), with Last to their left.]
Suppose It ∈ Sol – C covers p (so It is one of the candidates considered by the greedy choice). We have t = k or t ≠ k.
ft ≤ fk (by greedy choice (2)).
New solution: Solnew ← (Sol – {It}) ∪ {Ik}.
Solnew covers every point of P covered by Sol, and |Solnew| = |Sol|.
Therefore, Sol ∈ OPT(I,P) ⟹ Solnew ∈ OPT(I,P).
C ∪ {Ik} ⊆ Solnew. Therefore, Solnew still maintains the line of the LI asserting ∃ Sol ∈ Opt(I,P): C ⊆ Sol.
The remaining lines of the LI are also maintained.
Efficient Implementation
To carry out the greedy choices fast:
Classify each event e:
  e ∈ P :  a point event (activation time = p)
  e = (sk , Ik) :  an interval activation event (activation time = sk)
ActInts = Max Priority Queue of activated (active or dead) intervals Ik  [priority = fk].
Events  = Min Priority Queue of unprocessed events  [priority = activation time].
Iterate:  e ← DeleteMin(Events);  process event e.
Minor trick: to avoid DeleteMax on an empty ActInts, insert as the first activated interval a
dummy interval I0 = ⟨–∞, –∞⟩. I0 will remain in ActInts till the end.
Algorithm: Efficient implementation
Algorithm OptIntervalPointCover ( P = { p1, p2, … , pn },  I = { I1, I2, … , Im } )
 1.  C ← ∅ ;  Last ← –∞ ;  I0 ← ⟨–∞, –∞⟩ ;  I ← I ∪ {I0}
 2.  MakeEmptyMaxHeap(ActInts)
 3.  Events ← ConstructMinHeap(P ∪ I)                 § O(n+m) time
 4.  while Events ≠ ∅ do                              § O(n+m) iterations
 5.      e ← DeleteMin(Events)                        § next event to process, O(log(n+m)) time
 6.      if e = (sk , Ik) then Insert(Ik , ActInts)   § activate interval
 7.      else                                         § event e is a point in P
 8.          if e > Last then do                      § greedy choice 1: e = leftmost uncovered point
 9.              Ik ← DeleteMax(ActInts)              § greedy choice 2, O(log m) time
10.              if fk < e then return ( point e ∈ P is not covered by I )
11.              else do
12.                  C ← C ∪ {Ik}
13.                  Last ← fk
14.              end-else
15.          end-if
16.  end-while                                        § total: O( (n+m) log(n+m) ) time
17.  return ( C )
end
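A Python sketch of the same plan using the standard heapq module (which provides min-heaps only, so interval finishing times are negated to simulate the max-heap ActInts; the event encoding is my own choice):

```python
import heapq

def interval_point_cover(points, intervals):
    """Greedy interval point cover.
    points: list of reals; intervals: list of (s, f) pairs with s <= f.
    Returns a minimum-size list of intervals covering all points;
    raises ValueError if some point is not covered by any interval."""
    NEG_INF = float("-inf")
    events = [(s, 0, (s, f)) for (s, f) in intervals]    # interval activation events
    events += [(p, 1, None) for p in points]             # point events (ties: intervals first)
    heapq.heapify(events)                                 # min-heap on activation time
    act = [(float("inf"), (NEG_INF, NEG_INF))]            # dummy I0; key = -f, so it is popped last
    C, last = [], NEG_INF
    while events:
        time, kind, interval = heapq.heappop(events)
        if kind == 0:                                     # activate interval
            heapq.heappush(act, (-interval[1], interval))
        elif time > last:                                 # greedy choice 1: leftmost uncovered point
            neg_f, best = heapq.heappop(act)              # greedy choice 2: farthest-reaching active interval
            if -neg_f < time:                             # even the best one ends before the point
                raise ValueError(f"point {time} is not covered by any interval")
            C.append(best)
            last = -neg_f
    return C

print(interval_point_cover([1, 4, 6], [(0, 2), (3, 7), (5, 9)]))   # [(0, 2), (3, 7)]
```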
Bibliography
If you want to dig deeper, roots of greedy algorithms are in the theory of matroids:
Hassler Whitney, “On the abstract properties of linear dependence,” American Journal of
Mathematics, 57:509-533, 1935.
Jack Edmonds, “Matroids and the greedy algorithm,” Mathematical Programming, 1:126-136,
1971.
Eugene L. Lawler, “Combinatorial Optimization: Networks and Matroids,” Holt, Rinehart, and
Winston, 1976.
The two cited references on the coin change making problem are:
M.J. Magazine, G.L. Nemhauser, L.E. Trotter, Jr., “When the greedy solution solves a class of
knapsack problems,” Operations Research, 23(2):207-217, 1975.
Exercises
1. The shortest obstacle avoiding path: As we discussed, the scene consists of a pair of
points A and B among n pairwise disjoint rectangular obstacles with horizontal and
vertical sides. We listed 3 greedy heuristics. We saw that all three fail to find the
shortest obstacle avoiding A-B path on some instances.
An interesting question arises: How badly can these heuristics fail?
(a) Explore to find the worst scene for each of these heuristics. By worst, we mean the
ratio of the length of the path found by the heuristic to the length of
the shortest path, expressed as a function of n, is as high as possible. How bad
could it be? Is it unbounded? Super-linear? Linear? Sub-linear? …
(b) Does the answer improve if the obstacles are congruent squares?
2. Coin Change making: For each of the following coin denomination systems
either argue that the greedy algorithm always yields an optimum solution for
any given amount, or give a counter-example:
(a) Coins c^0, c^1, c^2, …, c^(n-1), where c is an integer > 1.
(b) Coins 1, 7, 13, 19, 61.
(c) Coins 1, 7, 14, 20, 61.
3. [CLRS, Exercise 16.2-5, page 428] Smallest unit length interval covering set:
Describe an efficient algorithm that, given a set P = { p1, p2, … , pn} of points on the
real line, determines the smallest set of unit-length closed intervals that covers all of
the given points. Argue that your algorithm is correct.
4. Interval Point Cover: What is the time complexity of the Interval Point Cover
problem? We developed an O((n+m) log (n+m)) time algorithm for this problem.
Derive a lower bound on this problem by a reduction from the Min-Gap Problem.
5. [CLRS, Exercise 16.1-4, page 379] Minimum number of lecture halls:
Suppose we have a set of events to schedule among a large number of lecture halls.
We wish to schedule all the events using as few lecture halls as possible.
Design and analyze an efficient greedy algorithm to determine which event should use
which lecture hall. Prove the optimality of the solution produced by your algorithm.
6. Smallest Hitting Set: Design and analyze an efficient greedy algorithm for the
following problem:
Input: A set P = { p1, p2, … , pn} of points, and a set I = { I1, I2, … , Im} of
intervals, all on the real line. These intervals and points are given in no
particular order. Each interval is given by its starting and finishing times.
Output: A smallest subset of the points P that hits (i.e., contains a point of) every interval in I,
or a report that some interval contains no point of P.
7. Given a set of black and white intervals, select a smallest number of white intervals
that collectively overlap every black interval. State your greedy choice and prove its
correctness.
8. One Machine Scheduling with Deadlines:
You are given a set {J1, J2, … , Jn} of n jobs to be processed on a single sequential
machine. Associated with each job Jk is a processing time tk and a deadline dk by which
it must be completed. A feasible schedule is a permutation of the jobs such that if the
jobs are processed in that order, then each job finishes by its deadline.
Design & analyze a simple greedy strategy that finds a feasible schedule if there is any.
9. Two Machine Scheduling: We have n jobs, each of which must be processed first on machine A
and then on machine B; the processing time of each job on each machine is given.
We wish to find the minimum total duration required to process all n jobs by both
machines, as well as the corresponding optimum schedule.
A schedule is the sequencing of the n jobs through the two machines.
Both machines will process the jobs based on the scheduled order. Each job J, in the
scheduled order, is first processed on machine A as soon as A completes its previously
scheduled job. Upon completion by A, job J is processed by B as soon as B becomes
available.
10. The widely popular Spanish search engine El Goog needs to do a serious amount of
computation every time it recompiles its index. Fortunately, the company has at its
disposal a single large supercomputer, together with an essentially unlimited supply of
high-end PCs.
They have broken the overall computation into n distinct jobs, labeled J1, J2, …, Jn,
which can be performed completely independently of one another. Each job consists of
two stages: first it needs to be preprocessed on the supercomputer, and then it needs to
be finished on one of the PCs. Let's say that job Jk needs pk seconds of time on the
supercomputer, followed by fk seconds of time on a PC.
Since there are at least n PCs available on the premises, the finishing of the jobs can be
performed fully in parallel - all the jobs can be processed at the same time. However, the
supercomputer can only work on a single job at a time. So the system managers need to
work out an order in which to feed the jobs to the supercomputer. As soon as the first
job in order is done on the supercomputer, it can be handed off to a PC for finishing; at
that point in time a second job can be fed to the supercomputer; when the second job is
done on the supercomputer, it can proceed to a PC regardless of whether or not the first
job is done (since PCs work independently in parallel), and so on.
Let's say that a schedule is an ordering of the jobs for the supercomputer, and the
completion time of the schedule is the earliest time at which all jobs will have finished
processing on the PCs. This is an important quantity to minimize, since it determines
how rapidly El Goog can generate a new index.
Design and analyze an efficient greedy algorithm to find a minimum completion time
schedule.
11. The Factory Fueling Problem: A remotely-located factory has a fuel reservoir which,
when full, has enough fuel to keep the factory's machines running for M days. Due to
the factory's remote location, the fuel supply company does not deliver fuel on-demand.
Instead, its trucks make visits to the factory's area according to a preset annual schedule
and if requested, would fill the reservoir. The schedule is given as an array S[1..n]
where S[i] is the date of the i-th visit in the year. Each time the reservoir is filled, a
fixed delivery charge is applied regardless of how much fuel is supplied. The factory
management would like to minimize these delivery charges and, at the same time,
minimize the idle, empty reservoir periods. Given M and S, describe an efficient
greedy algorithm to obtain an optimal annual filling schedule; i.e., determine for each
of the n visits whether the reservoir should be filled. Prove that your algorithm satisfies
the greedy-choice property.
12. The Loading Problem: Consider a train that travels from station 1 to station n with
intermediate stops at stations 2, 3, ..., (n - 1). We have p[i,j] packages that need to be
delivered from point i to point j where 1 ≤ i < j ≤ n. Packages have the same size. The
number of packages in the train at any point must not exceed its capacity C. We want to
deliver as many packages as possible.
a) Give a strategy for dropping and picking up packages at each point.
b) Prove your strategy maximizes the number of delivered packages.
[Hint: Use the greedy idea twice in the solution of this problem. At point 1 load
packages in increasing order of their destination till capacity is reached. When the train
arrives at point 2, (conceptually) unload and reload according to the same principle as
above. If some packages that were picked up at point 1 get left at point 2, do not load
them at point 1 in the first place!]
13. Compatible Capacitated Assignment:
Input: Two sorted arrays A[1..n] and B[1..n] of reals; a positive integer capacity C, and a
positive real compatibility threshold a.
Definitions:
(i) A pair (i,j) is called compatible, if | A[i] - B[j] | ≤ a.
(ii) An assignment is a pairing among elements of arrays A and B so that each element of
each array appears in at least one pairing with an element of the other array.
(iii) An assignment is compatible if each pairing in that assignment is compatible.
(iv) A valid assignment is a compatible one in which no element is paired with more than C
elements from the other array.
Output: A valid assignment, or nil if no such assignment exists.
Design, analyze, and carefully prove the correctness of an efficient greedy algorithm for this
problem. [Hint: Show the following claims are true:
(i) If (i,j1) and (i,j2) are compatible, then so is every pair (i,j) such that j is between j1 and j2.
Similarly, if (i1,j) and (i2,j) are compatible, then so is every pair (i,j) such that i is
between i1 and i2.
(ii) If there is a valid assignment, then there is a valid assignment with no crossing, where a
crossing consists of two assigned pairs (i1,j1) and (i2,j2) such that i1 < i2 but j1 > j2, or
i1 > i2 but j1 < j2.
(iii) An uncrossed valid assignment has the property that if (i,j1) and (i,j2) are pairs in that
assignment, then so are all pairs of the form (i,j) with j between j1 and j2. Similarly, if
(i1,j) and (i2,j) are pairs in the assignment, then so are all pairs (i,j) such that i is
between i1 and i2.
(iv) We can assign pairs by scanning arrays A and B in a greedy merge fashion.]
14. Egyptian Fractions:
Input: A rational fraction x/y, where x and y are integers and 0<x<y,
Output: Express x/y as the sum 1/m1 + 1/m2 + ··· + 1/mk, where mi are positive integers
and minimize k. That is, the sum of a minimum number of unit fractions.
This is always possible since we can express x/y as the sum of x copies of 1/y. So,
minimum k ≤ x. The answer is not always unique, e.g., 2/3 can be expressed as
2/3 = 1/2 + 1/6 = 1/3 + 1/3 .
A wrong greedy strategy: first pick the largest unit fraction that fits.
4/17 = 1/5 + 1/29 + 1/1233 + 1/3039345 (Greedy)
= 1/6 + 1/15 + 1/510 (Optimum)
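A quick sketch (using Python's exact Fraction arithmetic; the function name is mine) that reproduces the greedy expansion quoted above:

```python
from fractions import Fraction

def greedy_egyptian(x, y):
    """Repeatedly peel off the largest unit fraction 1/d that fits in the remainder."""
    r, denominators = Fraction(x, y), []
    while r > 0:
        d = -(-r.denominator // r.numerator)   # d = ceil(denominator / numerator)
        denominators.append(d)
        r -= Fraction(1, d)
    return denominators

print(greedy_egyptian(4, 17))   # [5, 29, 1233, 3039345]: 4 terms, vs. the 3-term optimum above
```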
15. Laser Beams and Balloons:
Suppose you have a high-powered laser beam gun in your hand, standing amidst a number of
large balloons in a vast open plateau. You want to use your laser gun to shoot all the balloons
without moving from your current location. You want to fire as few shots as possible. The (2
dimensional) Minimum Beams Problem (MBP) can be stated more formally as follows.
Given a set 𝐶 of 𝑛 circles in the plane, each specified by its radius and the (𝑥, 𝑦) coordinates
of its center, compute a minimum number of laser beam directions from the origin that
collectively intersect every circle in 𝐶. [Each such non-vertical beam may geometrically be
viewed as a half-line with the equation 𝑦 = 𝑠𝑥 (with slope 𝑠) that is restricted to one of the 4
quadrants by an additional inequality 𝑥 ≥ 0 or 𝑥 ≤ 0.] Your goal is to find an efficient
algorithm for this problem.
a) For the sake of this problem we will assume that, in addition to the RAM arithmetic operations,
the square-root operation on reals also takes 𝑂(1) time. Give a list of geometric primitives that
would be needed by your algorithm. These operations, when implemented, should run in
𝑂(1) time. The details of their implementation are left as optional (not required). However, you
must give a precise definition of these primitives. For instance, the boolean function
𝐼𝑛𝑡𝑒𝑟𝑠𝑒𝑐𝑡𝑠(𝑏, 𝑐) returns true if and only if beam 𝑏 intersects circle 𝑐. You should convince
yourself that this function can be implemented to run in 𝑂(1) time.
b) First consider the special case of MBP where there is a known beam direction that does not hit
any circle. Can you transform this special case to a more familiar problem and efficiently obtain
the optimum solution that way?
c) Using part (b), describe an efficient algorithm that approximately solves the general MBP by
producing a solution that is either optimum or uses one more beam than the optimum. Prove that
fact.
d) Design and analyze an 𝑂(𝑛²) time algorithm that exactly solves the general MBP. You must
prove that your algorithm does indeed output an optimum solution. [Hint: each circle can in turn
be considered as the “first” candidate target. How do you proceed from there to handle the rest?]