IE 400
Lecture 1
Mathematical Models
Operations research, shortened as OR, is a discipline that deals with the development and application of
modeling and analysis to aid in determining how best to design and operate a system usually under the
conditions requiring allocation of scarce resources. Mathematical modelling is the process of describing a
real world problem in mathematical terms, usually in the form of equations, and then using these equations
both to help understand the original problem, and also to discover new features about the problem.
Example 1 (The Milk Production Problem). Two machines are needed to process raw milk into low-fat and regular milk: Machine 1 (Pasteurization) and Machine 2 (Homogenization). In a day, machine 1 can be used for 12 hours and machine 2 can be used for 6 hours. The sales price of low-fat milk is 2.5 TL/lt and that of regular milk is 1.5 TL/lt. Find how much low-fat and regular milk should be produced to maximize the total revenue.
2) What are the technical factors which are not under control of the decision-maker?
• Available machine hours
• Total production time on each machine cannot exceed the available machine hours.
• Total revenue
x1 , x2 ≥ 0 Production amounts ≥ 0
• Any values of decision variables that satisfy all of the constraints of a mathematical model simultaneously are called a feasible solution.
• The set of all feasible solutions is called the feasible set or the feasible region.
• The objective is to find a feasible solution that yields the best objective function value.
• The objective function value evaluated at an optimal solution is called the optimal value.
The Linear Programming (LP) Problem
An LP problem is an optimization problem in which (∗) the objective function is given by a linear combination
of the decision variables and (∗∗) the constraints are linear inequalities or linear equations.
with sign restrictions such as x1 , x2 , . . . , xm ≥ 0 and xm+s+1 , xm+s+2 , . . . , xn ≤ 0.
• The contribution of each decision variable to the objective function or to each constraint is proportional to the value of that variable (proportionality).
• The contributions from decision variables are independent from one another and the total contribution is the sum of the individual contributions (additivity).
• All input parameters (coefficients of the objective function and of the constraints) are known with
certainty.
Example 2 (Blending Problem I). An oil refinery distills crude oil from Saudi Arabia and Venezuela into
gasoline, jet fuel and lubricants. The crude differ in chemical composition and yields. A ton of S.A. crude
oil yields 0.3 tons of gasoline, 0.4 tons of jet fuel and 0.2 tons of lubricants. A ton of Ven. crude oil yields
0.4 tons of gasoline, 0.2 tons of jet fuel and 0.3 tons of lubricants. 0.1 tons of crude oil is lost in refining for
both crude oils. At most 90 tons can be purchased daily from S.A. at a cost of $600/ton. At most 60 tons
can be purchased daily from Ven. at a cost of $550/ton. The daily demand is 20 tons of gasoline, 15 tons of
jet fuel, and 5 tons of lubricants. How should the demand be met at a minimum cost?
Sol:
1) Decision variables:
2) Objective function:
3) Constraints:
min 600xS + 550xV
s.t. xS ≤ 90,
     xV ≤ 60,
     0.3xS + 0.4xV ≥ 20,
     0.4xS + 0.2xV ≥ 15,
     0.2xS + 0.3xV ≥ 5,
     xS , xV ≥ 0,
where xS and xV are the tons of S.A. and Ven. crude oil purchased daily.
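A quick numerical check of this model is to hand it to an off-the-shelf LP solver; the sketch below uses scipy.optimize.linprog (an assumed tool, not part of the lecture), with the ≥ constraints negated to fit the ≤ form the solver expects.

```python
# Minimal sketch: solving the refinery LP of Example 2 with SciPy (assumed available).
from scipy.optimize import linprog

c = [600, 550]                       # cost per ton of S.A. and Ven. crude
A_ub = [[-0.3, -0.4],                # gasoline demand:  0.3 xS + 0.4 xV >= 20
        [-0.4, -0.2],                # jet fuel demand:  0.4 xS + 0.2 xV >= 15
        [-0.2, -0.3]]                # lubricant demand: 0.2 xS + 0.3 xV >= 5
b_ub = [-20, -15, -5]
bounds = [(0, 90), (0, 60)]          # purchase limits and non-negativity

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print(res.x, res.fun)                # optimal purchase plan and its cost
```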
Example 3 (Transportation Problem). There are two power plants generating electricity, and three factories
that need electricity in their production. Capacities of power plants are 1250 and 2000 megawatts, demands
of factories are 1500, 750 and 750 megawatts. Electricity transmission costs per megawatt are given in the
table below
F1 F2 F3
PP1 50 100 60
PP2 30 20 35
How to satisfy the demand of each factory from the two power plants with minimum total cost?
Sol:
1) Decision variables:
2) Objective function:
3) Constraints:
x11 + x12 + x13 ≤ 1250, At most 1250 megawatts can be transferred from PP1
x21 + x22 + x23 ≤ 2000, At most 2000 megawatts can be transferred from PP2
Example 4 (General Transportation Problem). Assume that there are m warehouses and n stores. Let ai
denote the amount of stock at warehouse i for i = 1, . . . , m, dj denote the demand of store j for j = 1, . . . , n
and cij denote the cost of transporting a unit of product from warehouse i to store j, for i = 1, . . . , m and
j = 1, . . . , n.
How should the demands of stores be met at a minimum total transportation cost while respecting the warehouse capacities?
Sol:
1) Decision variables:
2) Objective function:
min Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij
3) Constraints:
Σ_{i=1}^{m} xij ≥ dj , ∀j = 1, . . . , n, Demand of each store should be satisfied
Σ_{j=1}^{n} xij ≤ ai , ∀i = 1, . . . , m, Capacity of each warehouse should not be exceeded
min Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij
s.t. Σ_{i=1}^{m} xij ≥ dj , ∀j = 1, . . . , n,
     Σ_{j=1}^{n} xij ≤ ai , ∀i = 1, . . . , m,
xij ≥ 0, ∀i = 1, . . . , m, ∀j = 1, . . . , n.
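As a concrete check, the general model with the data of Example 3 (two plants, three factories) can be passed to scipy.optimize.linprog, an assumed tool, after flattening the xij variables into a single vector.

```python
# Sketch: the transportation LP of Example 3 solved with SciPy.
import numpy as np
from scipy.optimize import linprog

a = [1250, 2000]                 # plant capacities
d = [1500, 750, 750]             # factory demands
c = np.array([[50, 100, 60],
              [30, 20, 35]])     # unit transmission costs c_ij
m, n = c.shape

# Variables x_ij flattened row by row: index = i*n + j.
A_ub, b_ub = [], []
for j in range(n):               # demand rows:  -sum_i x_ij <= -d_j
    row = np.zeros(m * n)
    row[[i * n + j for i in range(m)]] = -1
    A_ub.append(row); b_ub.append(-d[j])
for i in range(m):               # capacity rows: sum_j x_ij <= a_i
    row = np.zeros(m * n)
    row[i * n:(i + 1) * n] = 1
    A_ub.append(row); b_ub.append(a[i])

res = linprog(c.flatten(), A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
print(res.x.reshape(m, n), res.fun)
```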
Example 5 (Blending Problem II). OJ Inc. produces orange juice from 3 different grades of oranges.
Oranges are graded 1 (poor), 2 (medium), and 3 (good) depending on their quality. Each grade of oranges yields a different amount of juice per kg:
Grade   Juice yield (lt/kg)
1       0.4
2       0.5
3       0.6
The company currently has 100 kgs of Grade 1, 150 kgs of Grade 2 and 200 kgs of Grade 3 oranges. This
company produces 3 types of orange juices from these oranges: Superior, Premium and Regular. There are minimum average grade, profit and minimum daily production requirements for each type:
Type       Minimum average grade   Profit per liter   Min. daily production
Superior   2.4                     1.5 TL
Premium    2.2                     1 TL               50 lt
Regular    2                       0.75 TL
Assume the company may sell more than demand at the same prices. How can the company maximize its total profit?
Sol:
1) Decision variables:
2) Parameters:
• a1 = 2.4, a2 = 2.2, a3 = 2,
• p1 = 1.5, p2 = 1, p3 = 0.75,
3) Objective function:
max Σ_{i=1}^{3} Σ_{j=1}^{3} yi pj xij
4) Constraints:
x1j + 2x2j + 3x3j ≥ aj (x1j + x2j + x3j ), ∀j = 1, 2, 3 Minimum average grade should be satisfied
Σ_{i=1}^{3} yi xij ≥ dj , ∀j = 1, 2, 3 Minimum daily production should be satisfied
Σ_{j=1}^{3} xij ≤ si , ∀i = 1, 2, 3 Initial stock should not be exceeded
max Σ_{i=1}^{3} Σ_{j=1}^{3} yi pj xij
s.t. x1j + 2x2j + 3x3j ≥ aj (x1j + x2j + x3j ), ∀j = 1, 2, 3,
     Σ_{i=1}^{3} yi xij ≥ dj , ∀j = 1, 2, 3,
     Σ_{j=1}^{3} xij ≤ si , ∀i = 1, 2, 3,
     xij ≥ 0, ∀i, j = 1, 2, 3.
Example 6 (Short-term Investment Planning). You have 12,500 TL at the beginning of year 2021 and
would like to invest your money. Your goal is to maximize your total amount of money at the beginning of year 2024. The available investment options are:
Option   Duration (years)   Total interest rate   Available (at the beginning of)
1        2                  26%                   2021 and 2022
2        1                  12%                   every year
3        3                  38%                   2021
4        2                  27%                   2022
Sol: Normally, one would also have to allow money to be kept uninvested from one year to the next. However, since Option 2 is available each year, we can always invest in this option rather than keeping some money idle for later, better options. Hence, for this problem, we may assume that all available money is invested at the beginning of each year.
1) Decision variables:
2) Objective function:
max Σ_{i=1}^{4} xi4
Σ_{i=1}^{4} xi1 = 12500,
Σ_{i=1}^{4} xi2 = 1.12x21 ,
Σ_{i=1}^{4} xi3 = 1.12x22 + 1.26x11 ,
Σ_{i=1}^{4} xi4 = 1.12x23 + 1.26x12 + 1.38x31 + 1.27x42 .
Then we have the following model:
max Σ_{i=1}^{4} xi4
s.t. Σ_{i=1}^{4} xi1 = 12500,
     Σ_{i=1}^{4} xi2 = 1.12x21 ,
     Σ_{i=1}^{4} xi3 = 1.12x22 + 1.26x11 ,
     Σ_{i=1}^{4} xi4 = 1.12x23 + 1.26x12 + 1.38x31 + 1.27x42 ,
     xij ≥ 0, ∀i = 1, . . . , 4, ∀j = 1, . . . , 4.
Example 7 (Product Mix Problem). Milk chocolate is produced with three ingredients. 1 kg of milk
chocolate contains 0.5 kg of milk, 0.4 kg of cocoa and 0.1 kg of sweetener. Each of these three ingredients
must be processed before they can be used in the production of milk chocolate. The factory has two
Ingredient   Department 1 (hours/kg)   Department 2 (hours/kg)
Both departments have 150 hours of available time per week for processing. Formulate an LP problem to maximize the amount of milk chocolate produced per week.
Sol:
1) Decision variables:
2) Parameters:
3) Objective function:
max y
4) Constraints:
Σ_{j=1}^{3} bij xij ≤ 150, ∀i = 1, 2, Available time should not be exceeded
aj y ≤ Σ_{i=1}^{2} xij , ∀j = 1, . . . , 3, Bottleneck principle on the total amount of milk chocolate.
max y
s.t. Σ_{j=1}^{3} bij xij ≤ 150, ∀i = 1, 2,
     aj y ≤ Σ_{i=1}^{2} xij , ∀j = 1, . . . , 3,
xij ≥ 0, ∀i = 1, 2, ∀j = 1, 2, 3.
Example 8 (Error Minimization Problem). You estimate that there is a linear relationship between the
stock price of a beverage company and the number of beverages sold (in thousands) in a day:
Day i   Beverages sold ni (thousands)   Stock price si
1       2                               10
2       3                               12
3       2                               8
4       3                               11
5       4                               14
You predict the relationship as si = a × ni + b, where si is the stock price and ni is the number of
beverages sold (in thousands) on day i = 1, . . . , 5. For a given choice of a and b, your estimation error on
day i is |si − a × ni − b| for i = 1, . . . , 5. Choose the unknown coefficients a and b such that the largest daily estimation error is as small as possible.
Sol:
1) Decision variables: a, b
2) Parameters: si , ni for i = 1, . . . , 5.
3) Objective function: minimize the largest daily error, i.e. min max_{i=1,...,5} |si − a × ni − b|.
4) Constraints: NONE!
This min-max objective is not linear. We obtain an equivalent LP by introducing an auxiliary variable t that bounds every daily error from above:
1) Decision variables: a, b, t
2) Parameters: si , ni for i = 1, . . . , 5.
3) Objective function: min t
4) Constraints: |si − a × ni − b| ≤ t for i = 1, . . . , 5. This gives the LP
min t
s.t. si − a × ni − b + t ≥ 0, ∀i = 1, . . . , 5,
si − a × ni − b − t ≤ 0, ∀i = 1, . . . , 5,
a, b, t free.
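This LP can be solved directly; the sketch below uses scipy.optimize.linprog (an assumed tool), reading the table of Example 8 with the middle column as ni and the last column as si.

```python
# Sketch: the min-max estimation LP with free variables (a, b, t).
import numpy as np
from scipy.optimize import linprog

n = np.array([2, 3, 2, 3, 4])
s = np.array([10, 12, 8, 11, 14])

c = [0, 0, 1]                                   # minimize t
A_ub = np.vstack([np.column_stack([ n,  np.ones(5), -np.ones(5)]),   #  a*n + b - t <= s
                  np.column_stack([-n, -np.ones(5), -np.ones(5)])])  # -a*n - b - t <= -s
b_ub = np.concatenate([s, -s])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 3, method="highs")
a, b, t = res.x
print(a, b, t)                                   # slope, intercept, largest daily error
```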
Example 9 (Production Planning Problem). XYZ Company produces three products A, B, and C and
Weekly Demand
Product 1 2 3 4
A 25 30 15 22
B 15 16 25 18
C 10 17 12 12
Each unit of Product A requires 3, Product B requires 2 and Product C requires 4 hours of labor. During
a week, 120 hours of regular-time labor is available at a cost of $10/hour. In addition, 25 hours of overtime
labor is available weekly at a cost of $15/hour. No shortages are allowed and XYZ Company has 18 A, 13 B,
and 15 C initial stocks for these products. Provide an LP formulation to determine a production schedule
that minimizes the total labor costs for the next four weeks.
Sol:
1) Decision variables:
xij := number of units of product i produced in week j,
yij := amount of inventory of product i at the end of week j after meeting demand,
zj := hours of overtime labor used in week j,
where i ∈ {A, B, C} and j = 1, 2, 3, 4.
2) Objective function:
min 10 Σ_{j=1}^{4} Σ_{i∈{A,B,C}} Li xij + 5 Σ_{j=1}^{4} zj
3) Parameters: Dij is the demand of product i in week j. Li is the hours of labor requirement for product i.
4) Constraints:
Labor Constraints
Σ_{i∈{A,B,C}} Li xij ≤ 120 + zj , ∀j = 1, 2, 3, 4,
Then we have the following model
min 10 Σ_{j=1}^{4} Σ_{i∈{A,B,C}} Li xij + 5 Σ_{j=1}^{4} zj
s.t. Σ_{i∈{A,B,C}} Li xij ≤ 120 + zj , ∀j = 1, 2, 3, 4,
     zj ≤ 25, ∀j = 1, 2, 3, 4,
     yij = yi,j−1 + xij − Dij , ∀i ∈ {A, B, C}, ∀j = 1, 2, 3, 4,
     xij , yij , zj ≥ 0, ∀i ∈ {A, B, C}, ∀j = 1, 2, 3, 4,
where yi0 is the given initial stock of product i.
Homework 1. Rylon Corporation manufactures Brute and Chanelle perfumes from a raw material costing $3/pound.
• Processing 1 pound of the raw material takes 1 hour and yields 3 ounces of Regular Brute and 4 ounces of Regular Chanelle. Each regular perfume can be sold directly or can be further processed into its luxury version.
• One ounce of Regular Brute can be sold for $7, but 3 hours of processing and $4 of additional cost will convert it into one ounce of Luxury Brute.
• One ounce of Regular Chanelle can be sold for $6, but 2 hours of processing and $4 of additional cost will convert it into one ounce of Luxury Chanelle.
• Rylon has available 4000 pounds of raw material, 6000 hours of processing time, and will sell everything it produces.
Formulate an LP problem to maximize Rylon's profit.
Homework 2. Bel Aire Pillow Company needs to manufacture 12,000 pillows in their factory next week.
They have two kinds of employees: line workers (who do the actual work) and supervisors (who oversee the
line workers). Bel Aire currently has 120 line workers and 20 supervisors. Each line worker can make 300 pillows per week.
A recent study done on Bel Aire’s operation concluded that, for proper supervision, there must be at
least 1 supervisor for every 5 line-workers. The study also suggested that Bel Aire might be able to cut its costs for the upcoming week by:
• firing some current line workers (the fired worker is given $240 severance pay)
• promoting some current line workers to supervisors (the training cost is $200/worker)
• demoting some current supervisors to line workers (no cost associated with this).
Pay for a line worker is $400/week. Pay for a supervisor is $700/week. All hiring and firing, and training
occur before the upcoming work week, and the costs of those activities will be charged against the upcoming
week. Bel Aire wishes to perform the appropriate hiring, firing, promotions and demotions that will minimize its total cost for the upcoming week while meeting the production and supervision requirements. Formulate an LP problem for this purpose.
Lecture 2
max cx
s.t. Ax ≤ b
x≥0
A line on which all points have the same objective function value is called the isoprofit line for maximization problems (the isocost line for minimization problems). As we move the isoprofit lines in the direction of the vector of objective coefficients, the objective function value increases.
For a maximization problem, the vector c, given by the coefficients of the decision variables in the
objective function, is called the improving direction. On the other hand, −c is the improving direction for
minimization problems.
To solve an LP problem in two variables, we should try to push the isoprofit (isocost) lines in the improving direction as far as possible while still intersecting the feasible region.
Example 10.
max x1 + 3x2
s.t. x1 + x2 ≤ 6
−x1 + 2x2 ≤ 8
x1 , x2 ≥ 0
If we consider this problem in general form we need to define A = [ 1 1 ; −1 2 ], b = [ 6 ; 8 ], c = [ 1 3 ] and x = [ x1 ; x2 ]. Then we write the model as
max cx
s.t. Ax ≤ b
x≥0
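In this matrix form the model can be handed straight to a solver; the sketch below (SciPy assumed, and the objective negated because linprog minimizes) reproduces the example numerically.

```python
# Sketch: Example 10 in matrix form, solved with SciPy.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1], [-1, 2]])
b = np.array([6.0, 8])
c = np.array([1.0, 3])

res = linprog(-c, A_ub=A, b_ub=b, bounds=(0, None), method="highs")  # max cx = -min(-cx)
print(res.x, -res.fun)       # optimal point and objective value
```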
1 Desmos Code: zxvn4axza3
Let us modify the objective function without changing the constraints:
max x1 − 2x2
s.t. x1 + x2 ≤ 6
−x1 + 2x2 ≤ 8
x1 , x2 ≥ 0
Let us modify the objective function once again without changing the constraints:
max −x1 + x2
s.t. x1 + x2 ≤ 6
−x1 + 2x2 ≤ 8
x1 , x2 ≥ 0
Let us modify the objective function once again without changing the constraints:
max −x1 − x2
s.t. x1 + x2 ≤ 6
−x1 + 2x2 ≤ 8
x1 , x2 ≥ 0
Extreme Points
A set of points S is a convex set if for any two points x ∈ S and y ∈ S, their convex combination λx+(1−λ)y
is also in S for all λ ∈ [0, 1]. Alternatively, a set is a convex set if the line segment joining any pair of points
in S is contained in S.
A point x is a strict convex combination of x1 and x2 if x = λx1 + (1 − λ)x2 for some λ ∈ (0, 1). A
point P of a convex set S is an extreme point (corner point) if it cannot be represented as a strict convex combination of two distinct points of S.
In other words, for any convex set S, a point P in S is an extreme point (corner point) if each line
segment that lies completely in S and contains the point P , has P as an endpoint of the line segment.
Example 11. 2
A company wishes to increase the demand for its product through advertising.
• Each minute of radio ad increases the daily demand by 2 units and each minute of TV ad by 7 units.
• The company wishes to place at least 9 minutes of daily ads in total, and it wishes to increase daily demand by at least 28 units.
• Each minute of radio ad costs $1 and each minute of TV ad costs $2.
How can the company meet its advertising requirements at minimum total cost?
Sol: Let x1 be # of minute of radio ads and x2 be # of minutes of TV ads purchased. Then we have the
following model
min x1 + 2x2
s.t. x1 + x2 ≥ 9,
     2x1 + 7x2 ≥ 28,
     x1 , x2 ≥ 0
(7, 2) is the optimal solution with objective value 11. Now suppose that it cost $4 to place a minute of
2 Desmos Code: mr1yyjf52o
radio ad and $14 to place a minute of TV ad. Then we have the following model
min 4x1 + 14x2
s.t. x1 + x2 ≥ 9,
     2x1 + 7x2 ≥ 28,
     x1 , x2 ≥ 0
In this case, every feasible solution on the line segment [(7, 2), (14, 0)] is an optimal solution with the objective value 56.
If every feasible solution on a line segment is optimal, then we say the LP has multiple or alternate optimal
solutions.
Example 12. 3
Consider the following LP problem
max x1 + 3x2
x1 , x2 ≥ 0
3 Desmos Code: tbyr6sd1fb
If the objective function for a maximization problem can always be increased by moving in the improving
direction c while still staying in the feasible region, we say that the LP is unbounded. For unbounded
LP maximization problems, there is no optimal solution and the optimal value is defined to be +∞. For
unbounded LP minimization problems, there is no optimal solution and the optimal value is defined to be
−∞.
Example 13. 4
Consider the following LP problem
max x1 + 3x2
s.t. x1 + x2 ≤ 6
−x1 + 2x2 ≤ 8
x2 ≥ 6
x1 , x2 ≥ 0
4 Desmos Code: seenzfvfis
Fact 2. Every LP problem falls into one of the 4 cases: it has a unique optimal solution; it has multiple (alternative) optimal solutions; it is unbounded; or it is infeasible.
Theorem 1. If the feasible set {x ∈ Rn : Ax ≤ b, x ≥ 0} is not empty, that is if there is a feasible solution, then the feasible region has at least one extreme point.
Theorem 2. If an LP has an optimal solution, then there is an extreme point of the feasible region which is optimal.
The graphical method is a method for solving LP’s with two decision variables as follows:
• Step 1: Draw the feasible region.
• Step 2: Draw an isoprofit line (for maximization) or an isocost line (for minimization).
• Step 3: Move the isoprofit/isocost line parallel to itself in the improving direction.
• Step 4: (If possible) pick the last point in the feasible region that contacts an isoprofit/isocost line.
Homework 3. Consider the following LP problem
min 2x1 + x2
s.t. x1 + 2x2 ≤ 10
x1 + x2 ≤ 6
x1 − x2 ≤ 2
x1 − 2x2 ≤ 1
x1 , x2 ≥ 0.
Homework 4. A firm plans to purchase at least 200 quintals of scrap containing high-quality metal X and
low-quality metal Y. It decides that the scrap to be purchased must contain at least 100 quintals of metal X
and not more than 35 quintals of metal Y. The firm can purchase the scrap from two suppliers (A and B) in
unlimited quantities. The percentage of X and Y metals in terms of weight in the scrap supplied by A and
B is given below.
Metal   Supplier A   Supplier B
X       25%          75%
Y       10%          20%
The price of A’s scrap is $ 200 per quintal and that of B is $ 400 per quintal. The firm wants to determine
the quantities that it should buy from the two suppliers so that the total cost is minimized. Formulate an LP model for this problem.
Lecture 3
The Simplex algorithm (or Simplex Method) is a widely-used algorithm to solve linear programming (LP) optimization problems. It starts with an extreme point and moves to a neighboring extreme point in an improving direction until no further improvement is possible.
An LP is in Standard Form if all constraints are equalities with constant non-negative right-hand sides and all variables are non-negative.
In general LP problems can have both equality and inequality constraints. They can have non-negative
and free variables. In order to use the Simplex Method, LP problems should be converted to Standard Form.
• Define a "slack" variable for each "≤" constraint to convert the inequality constraint to an equality constraint.
• Define an "excess (surplus)" variable for each "≥" constraint to convert the inequality constraint to an equality constraint.
• Replace each non-positive variable x ≤ 0 by x̂ = −x ≥ 0, and each free variable x by the difference x = x+ − x− of two non-negative variables.
Consider the LP problem
max x1 + 3x2
s.t. 2x1 + 3x2 + x3 ≤ 5,
     −4x1 − x2 − 2x3 = 11,
     3x1 + 4x2 + 2x3 ≥ 8,
     x1 ≥ 0, x2 ≤ 0, x3 free.
Using the above techniques we obtain the equivalent optimization model in Standard Form
max x1 − 3x̂2
s.t. 2x1 − 3x̂2 + x3+ − x3− + s1 = 5
     −4x1 + x̂2 − 2x3+ + 2x3− = 11
     3x1 − 4x̂2 + 2x3+ − 2x3− − e3 = 8
     x1 , x̂2 , x3+ , x3− , s1 , e3 ≥ 0
where x̂2 = −x2 and x3 = x3+ − x3− .
Suppose that we converted an LP with m̃ constraints into standard form. After the conversion, we have n variables x1 , x2 , . . . , xn . Then we can write our feasible region as {x ∈ Rn : Ãx = b, x ≥ 0}. Some of these constraints may be redundant, i.e., they can be obtained as linear combinations of the other constraints in the model. For Standard Form LP's we may assume that all redundant constraints are removed and we end up with m constraints, such that rank(A) = m. Then we can write our feasible region as {x ∈ Rn : Ax = b, x ≥ 0} with rank(A) = rank(A|b).
x1 + 2x2 + x3 = 4 (1)
x2 − x3 = 1 (2)
You may multiply Constraint (1) and Constraint (2) by 2 and add them up to obtain Constraint (3). Hence
Constraint (3) may be removed from the system and we will end up with rank(A) = 2.
A basic solution is obtained by setting n − m of the variables equal to 0 and solving for the remaining m variables whose columns are linearly independent. The n − m variables whose
values are set to 0 are called non-basic variables. The remaining m variables are called basic variables. A
basic solution which has all non-negative entries is called a basic feasible solution.
• Rearrange the columns of A such that these chosen columns are the first m columns in A.
• Matrix A can be written in the form A = Bm×m Nm×(n−m) where Bm×m is invertible.
• Now let x = [ xB ; xN ] be the corresponding partition of x. Then
Ax = b ⇒ [ B N ] [ xB ; xN ] = b ⇒ BxB + N xN = b.
Setting xN = 0 gives xB = B −1 b; the resulting point is the basic solution associated with the basis B.
Example 16. Consider the system of inequalities X = {(x1 , x2 ) ∈ R2+ : x1 + x2 ≤ 6, x2 ≤ 3}. In standard form we obtain
A = [ 1 1 1 0 ; 0 1 0 1 ] and b = [ 6 ; 3 ], so that X = {x ∈ R4 : Ax = b, x ≥ 0}.
i) B = [ 1 1 ; 0 1 ] (columns of x1 , x2 ). Then we fix (x3 , x4 ) = (0, 0) and get (x1 , x2 ) = B −1 b = [ 1 −1 ; 0 1 ][ 6 ; 3 ] = (3, 3). This basis yields a basic feasible solution.
ii) B = [ 1 1 ; 0 0 ] (columns of x1 , x3 ). This matrix is not invertible so it does not give a basic solution.
iii) B = [ 1 0 ; 0 1 ] (columns of x1 , x4 ). Then we fix (x2 , x3 ) = (0, 0) and get (x1 , x4 ) = B −1 b = [ 1 0 ; 0 1 ][ 6 ; 3 ] = (6, 3). This basis yields a basic feasible solution.
iv) B = [ 1 1 ; 1 0 ] (columns of x2 , x3 ). Then we fix (x1 , x4 ) = (0, 0) and get (x2 , x3 ) = B −1 b = [ 0 1 ; 1 −1 ][ 6 ; 3 ] = (3, 3). This basis yields a basic feasible solution.
v) B = [ 1 0 ; 1 1 ] (columns of x2 , x4 ). Then we fix (x1 , x3 ) = (0, 0) and get (x2 , x4 ) = B −1 b = [ 1 0 ; −1 1 ][ 6 ; 3 ] = (6, −3). This basis yields a basic solution, but not a feasible one.
vi) B = [ 1 0 ; 0 1 ] (columns of x3 , x4 ). Then we fix (x1 , x2 ) = (0, 0) and get (x3 , x4 ) = B −1 b = [ 1 0 ; 0 1 ][ 6 ; 3 ] = (6, 3). This basis yields a basic feasible solution.
All basic feasible solutions to this system are (3, 3, 0, 0), (6, 0, 0, 3), (0, 3, 3, 0) and (0, 0, 6, 3).
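The enumeration above can be reproduced with a few lines of NumPy (an assumed tool, not part of the lecture): for every choice of two columns of A, invert B if possible and check whether B −1 b is non-negative.

```python
# Sketch: enumerating basic (feasible) solutions of Ax = b, x >= 0.
from itertools import combinations
import numpy as np

A = np.array([[1.0, 1, 1, 0],
              [0.0, 1, 0, 1]])
b = np.array([6.0, 3.0])
m, n = A.shape

for cols in combinations(range(n), m):        # every candidate basis
    B = A[:, cols]
    if abs(np.linalg.det(B)) < 1e-9:          # singular B: no basic solution
        continue
    xB = np.linalg.solve(B, b)                # basic variables = B^-1 b
    x = np.zeros(n)
    x[list(cols)] = xB
    status = "feasible" if (xB >= -1e-9).all() else "infeasible"
    print(cols, x, status)
```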
Theorem 3. A point in the feasible region of an LP is an extreme point if and only if it is a basic feasible solution.
Theorem 4. If feasible set of an LP is non-empty, then there is at least one basic feasible solution. If an
LP has an optimal solution, then there is a basic feasible solution which is optimal.
For any LP with m constraints, two basic feasible solutions are said to be adjacent if their sets of basic variables have m − 1 variables in common.
General Description of Simplex Algorithm
1) Convert the LP into Standard Form.
2) Obtain a basic feasible solution to the LP. This basic feasible solution is called the initial basic feasible solution. In general, the most recent basic feasible solution is called the current basic feasible solution. Therefore, at the beginning the initial basic feasible solution is the current basic feasible solution.
3) Determine whether the current basic feasible solution is optimal. If it is, STOP.
4) If the current basic feasible solution is not optimal, then find an adjacent basic feasible solution with a
better objective function value (one non-basic variable becomes basic and one basic variable becomes
non-basic).
5) Go to Step 3.
A system of linear equations in which each equation has a variable with a coefficient 1 in that equation (and a coefficient 0 in all other equations) is said to be in canonical form.
Suppose we are given an LP in the standard form
max cT x
s.t. Ax = b,
x ≥ 0.
Now introduce an extra variable z representing the objective value and rewrite the initial LP as
max z
s.t. z − cT x = 0,
Ax = b,
x ≥ 0.
We do this in order to understand the contribution of each variable to the objective value: in a row of the form z + α1 x1 + · · · + αn xn = β, if αi > 0 then a unit increase in xi makes z decrease, and if αi < 0 then z will increase.
z    x      RHS
1    −cT    0
0    A      b
Then we pick an initial basic feasible solution. Let B be the sub-matrix of A corresponding to the basic
variables and N be the one for the non-basic variables. Then we may rewrite our table as
      z    xB      xN      RHS
z     1    −cTB    −cTN    0
xB    0    B       N       b
Then we transform this tableau into canonical form in two steps. We first multiply rows corresponding to
xB with B −1 :
      z    xB      xN        RHS
z     1    −cTB    −cTN      0
xB    0    I       B −1 N    B −1 b
Then clear the entries above the matrix I by multiplying the rows corresponding to xB with cTB and adding them to the first row:
      z    xB    xN                    RHS
z     1    0     cTB B −1 N − cTN     cTB B −1 b
xB    0    I     B −1 N               B −1 b
After converting the tableau to the canonical form we check for optimality. Is the current basic feasible solution optimal?
YES, if all entries of cTB B −1 N − cTN are non-negative.
NO, if there is an entry i∗ such that (cTB B −1 N − cTN )[i∗ ] < 0.
The vector cTB B −1 N − cTN used for the optimality check is called the vector of reduced cost coefficients.
If the solution is optimal we STOP. Otherwise we pick the most negative entry of the reduced cost
coefficient, denote it i∗ , and swap it with a basic variable to move to an adjacent solution. This variable xi∗ is called the entering variable.
Now we have to choose a basic variable to swap with the entering variable; such a variable is called the leaving variable. Increasing a non-basic variable may cause a basic variable to take negative values. We observe this change using an argument similar to the one used for the reduced cost coefficients: in a row xj + · · · + αi∗ xi∗ + · · · = βj , increasing xi∗ forces the basic variable xj to decrease, and xj reaches zero when xi∗ reaches βj /αi∗ . So we use the
Ratio Test: For every row j in which the entering variable i∗ has a positive coefficient, we compute the
ratio
ρj = (RHS of row j) / (coefficient of the entering variable in row j).
The leaving variable is the basic variable of the row j ∗ attaining the smallest ratio ρj .
The position (j ∗ , i∗ ) in the tableau is called the pivot term. Use elementary row operations to make the pivot term 1 and all remaining entries in column i∗ equal to 0, so that the tableau is again in canonical form. Repeat these steps
until optimality.
max x1 + 3x2
s.t. x1 + x2 ≤ 6,
−x1 + 2x2 ≤ 8,
x1 , x2 ≥ 0
We introduce slack variables and the variable representing the objective value and obtain the following
equivalent
max z
s.t. z − x1 − 3x2 = 0,
x1 + x2 + s1 = 6,
−x1 + 2x2 + s2 = 8,
x1 , x2 , s1 , s2 ≥ 0
Let's pick (x1 , x2 , s1 , s2 ) = (0, 0, 6, 8) as a basic feasible solution. So basic variables will be {s1 , s2 } and non-basic variables are {x1 , x2 }. We construct the simplex tableau.
z x1 x2 s1 s2 RHS
z 1 −1 −3 0 0 0
s1 0 1 1 1 0 6
s2 0 −1 2 0 1 8
The tableau is already in canonical form. Most negative coefficient of the cost vector is x2 , so x2 is the
entering variable. We apply ratio test and pick s2 to be the leaving variable and make the updates on the
tableau:
z x1 x2 s1 s2 RHS z x1 x2 s1 s2 RHS
z 1 −1 −3 0 0 0 z 1 −5/2 0 0 3/2 12
⇒
s1 0 1 1 1 0 6 s1 0 3/2 0 1 −1/2 2
s2 0 −1 2 0 1 8 x2 0 −1/2 1 0 1/2 4
Now the tableau is in canonical form, so we choose the most negative coefficient, in this case that of x1 , to be the entering variable. The ratio test only considers positive coefficients, so the only candidate for a leaving variable
is s1 , and we make the updates on the tableau:
z x1 x2 s1 s2 RHS z x1 x2 s1 s2 RHS
z 1 −5/2 0 0 3/2 12 z 1 0 0 5/3 2/3 46/3
⇒
s1 0 3/2 0 1 −1/2 2 x1 0 1 0 2/3 −1/3 4/3
x2 0 −1/2 1 0 1/2 4 x2 0 0 1 1/3 1/3 14/3
Remark 3. It is easy to track values of the basic variables and the objective value using the simplex tableau.
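The loop described above is short enough to prototype directly. The following is a minimal dense-tableau simplex sketch for max cᵀx s.t. Ax ≤ b, x ≥ 0 with b ≥ 0 (so the slacks give the initial basis), assuming NumPy is available; it uses the most-negative-coefficient entering rule and the ratio test, and it is meant as an illustration, not a robust solver. Running it on the worked example above reproduces the tableau values.

```python
# Minimal tableau simplex sketch for: max c^T x  s.t.  Ax <= b, x >= 0, b >= 0.
import numpy as np

def simplex(c, A, b):
    m, n = A.shape
    T = np.zeros((m + 1, n + m + 2))        # tableau [z | x | slacks | RHS]
    T[0, 0] = 1
    T[0, 1:n + 1] = -c                      # row-z: z - c^T x = 0
    T[1:, 1:n + 1] = A
    T[1:, n + 1:n + m + 1] = np.eye(m)
    T[1:, -1] = b
    basis = list(range(n, n + m))           # slacks form the initial basis
    while True:
        rcc = T[0, 1:-1]                    # reduced cost coefficients
        if (rcc >= -1e-9).all():
            break                           # optimal
        enter = int(np.argmin(rcc))         # most negative coefficient enters
        col = T[1:, 1 + enter]
        if (col <= 1e-9).all():
            return None, np.inf             # unbounded: no leaving variable
        ratios = np.where(col > 1e-9, T[1:, -1] / col, np.inf)
        leave = int(np.argmin(ratios))      # ratio test picks the pivot row
        basis[leave] = enter
        T[leave + 1] /= T[leave + 1, 1 + enter]          # pivot term -> 1
        for r in range(m + 1):                           # clear the pivot column
            if r != leave + 1:
                T[r] -= T[r, 1 + enter] * T[leave + 1]
    x = np.zeros(n + m)
    x[basis] = T[1:, -1]
    return x[:n], T[0, -1]

# Data of the example above: max x1 + 3x2, x1 + x2 <= 6, -x1 + 2x2 <= 8.
print(simplex(np.array([1.0, 3]), np.array([[1.0, 1], [-1, 2]]), np.array([6.0, 8])))
```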
s.t. x1 + x2 ≤ 12,
2x1 + x2 ≤ 16,
x1 , x2 ≥ 0
Lecture 4
Problematic Cases
There are two ways to solve a minimization problem using the Simplex Method:
A. Instead of solving min cT x s.t. Ax = b, x ≥ 0, solve max −cT x s.t. Ax = b, x ≥ 0 and multiply the optimal objective value by −1.
B. During the choice of the entering variable pick the most positive entry instead of the most negative
one. If all reduced cost coefficients are non-positive for a basic feasible solution, then it is optimal.
In Simplex Method, alternative solutions are detected when there are 0 valued coefficients for non-
basic variables in row-z of the optimal tableau. If there is no non-basic variable with a zero coefficient
in row-z of the optimal tableau, the LP has a unique optimal solution. Even if there is a non-basic
variable with a zero coefficient in row-z of the optimal tableau, it is possible that the LP may not have alternative optimal solutions.
If there are two basic feasible solution which are optimal, their convex combinations will also be optimal
solutions.
max 2x1 + 4x2
s.t. x1 + 2x2 ≤ 5
x1 + x2 ≤ 4
x1 , x2 ≥ 0
We convert this problem to standard form and get ready for the simplex:
max z
s.t. z − 2x1 − 4x2 = 0,
     x1 + 2x2 + s1 = 5,
     x1 + x2 + s2 = 4,
     x1 , x2 , s1 , s2 ≥ 0
We start with the basic feasible solution (x1 , x2 , s1 , s2 ) = (0, 0, 5, 4), with basic variables {s1 , s2 } and non-basic variables {x1 , x2 }. So we construct the tableau:
z x1 x2 s1 s2 RHS
z 1 −2 −4 0 0 0
s1 0 1 2 1 0 5
s2 0 1 1 0 1 4
This is already in canonical form, so we pick x2 as the entering variable and s1 as the leaving variable.
z x1 x2 s1 s2 RHS z x1 x2 s1 s2 RHS
z 1 −2 −4 0 0 0 z 1 0 0 2 0 10
⇒
s1 0 1 2 1 0 5 x2 0 1/2 1 1/2 0 5/2
s2 0 1 1 0 1 4 s2 0 1/2 0 −1/2 1 3/2
The current basic feasible solution (x1 , x2 , s1 , s2 ) = (0, 5/2, 0, 3/2) is optimal. But an increase in x1 does not change the optimal value, so let's choose it as the entering variable; s2 will be the leaving variable. So we update the tableau:
z x1 x2 s1 s2 RHS z x1 x2 s1 s2 RHS
z 1 0 0 2 0 10 z 1 0 0 2 0 10
⇒
x2 0 1/2 1 1/2 0 5/2 x2 0 0 1 1 −1 1
s2 0 1/2 0 −1/2 1 3/2 x1 0 1 0 −1 2 3
Again, the current basic feasible solution is optimal. So for any λ ∈ [0, 1], λ (3, 1, 0, 0) + (1 − λ) (0, 5/2, 0, 3/2) is optimal.
An unbounded LP occurs in a max (min) problem if there is a nonbasic variable with a negative
(positive) coefficient in row-z and there is no constraint that limits how large we can make this nonbasic
variable. Specifically, an unbounded LP for a max (min) problem occurs when a variable with a negative
(positive) coefficient in row-z has a non positive coefficient in each constraint. There is an entering
variable but no leaving variable, since ratio test does not give a finite bound.
max x1 + 2x2
s.t. x1 − x2 ≤ 10
x1 ≤ 40
x1 , x2 ≥ 0
We convert this problem to standard form and get ready for simplex:
max z
s.t. z − x1 − 2x2 = 0
x1 − x2 + s1 = 10,
x1 + s2 = 40,
x1 , x2 , s1 , s2 ≥ 0
We start with the basic feasible solution (x1 , x2 , s1 , s2 ) = (0, 0, 10, 40), with basic variables {s1 , s2 } and non-basic variables {x1 , x2 }. We construct the tableau:
z x1 x2 s1 s2 RHS
z 1 −1 −2 0 0 0
s1 0 1 −1 1 0 10
s2 0 1 0 0 1 40
This is already in canonical form, so we pick x2 as the entering variable; however, it is not possible to pick a leaving variable since the ratio test gives no finite ratio. So consider x1 = 0 and x2 = α. Then s1 = 10 + α and s2 = 40 are feasible for every α ≥ 0, while z = 2α grows without bound, so the LP is unbounded.
An LP is a degenerate LP if in a basic feasible solution, one of the basic variables takes on a zero value.
This basic feasible solution is called degenerate basic feasible solution. Degeneracy could cost simplex
method extra iterations because when degeneracy occurs, the objective function value may not increase.
It is even possible for the Simplex Method to cycle, i.e., to return to a basis visited K iterations earlier, for some K ≥ 1. If a cycle occurs, then the algorithm will loop forever among a set of basic feasible solutions and never terminate.
In the simplex algorithm, degeneracy is detected when there is a tie for the minimum ratio test. In the next iteration, at least one basic variable takes the value zero.
Example 21. In this example we see a degenerate bfs. Consider the following LP problem
max 5x1 + 2x2
s.t. x1 + x2 ≤ 6,
x1 − x2 ≤ 0,
x1 , x2 ≥ 0.
max z
s.t. z − 5x1 − 2x2 = 0,
     x1 + x2 + s1 = 6,
     x1 − x2 + s2 = 0,
     x1 , x2 , s1 , s2 ≥ 0
We start with the degenerate basic feasible solution (x1 , x2 , s1 , s2 ) = (0, 0, 6, 0), with basic variables {s1 , s2 } and non-basic variables {x1 , x2 }. We construct the tableau:
z x1 x2 s1 s2 RHS
z 1 −5 −2 0 0 0
s1 0 1 1 1 0 6
s2 0 1 −1 0 1 0
This is already in canonical form, so we pick x1 as the entering variable and s2 as the leaving variable:
z x1 x2 s1 s2 RHS z x1 x2 s1 s2 RHS
z 1 −5 −2 0 0 0 z 1 0 −7 0 5 0
⇒
s1 0 1 1 1 0 6 s1 0 0 2 1 −1 6
s2 0 1 −1 0 1 0 x1 0 1 −1 0 1 0
The tableau is already in canonical form. So we pick x2 as the entering variable and s1 as the leaving variable:
z x1 x2 s1 s2 RHS z x1 x2 s1 s2 RHS
z 1 0 −7 0 5 0 z 1 0 0 7/2 3/2 21
⇒
s1 0 0 2 1 −1 6 x2 0 0 1 1/2 −1/2 3
x1 0 1 −1 0 1 0 x1 0 1 0 1/2 1/2 3
The current basic feasible solution (x1 , x2 , s1 , s2 ) = (3, 3, 0, 0) is optimal with objective value 21.
Bland’s Rule: Assume that slack and excess variables are numbered xn+1 , xn+2 , . . . . Choose an
entering variable (in a max problem) the variable with a negative coefficient in row-z that has the
smallest index. If there is a tie in the ratio test, then break the tie by choosing the winner of the ratio
test so that the variable leaving the basis has the smallest index.
Fact 5. Using Bland’s rule, the Simplex Method terminates in finite time with optimal solution, i.e.
no cycles.
Example 22. Consider the following LP problem
max 9x1 + 3x2
s.t. 4x1 + x2 ≤ 8,
2x1 + x2 ≤ 4,
x1 , x2 ≥ 0.
We convert this problem to standard form and get ready for simplex:
max z
s.t. z − 9x1 − 3x2 = 0,
     4x1 + x2 + x3 = 8,
     2x1 + x2 + x4 = 4,
     x1 , x2 , x3 , x4 ≥ 0
We start with the basic feasible solution (x1 , x2 , x3 , x4 ) = (0, 0, 8, 4), with basic variables {x3 , x4 } and non-basic variables {x1 , x2 }. We construct the tableau:
z x1 x2 x3 x4 RHS
z 1 −9 −3 0 0 0
x3 0 4 1 1 0 8
x4 0 2 1 0 1 4
This is already in canonical form, so we pick the smallest index with a negative reduced cost coefficient, x1 , as the entering variable, but there is a tie in the ratio test; hence the next basic feasible solution will be degenerate. Following Bland's rule, we break the tie by picking x3 as the leaving variable and update the tableau:
z x1 x2 x3 x4 RHS z x1 x2 x3 x4 RHS
z 1 −9 −3 0 0 0 z 1 0 −3/4 9/4 0 18
⇒
x3 0 4 1 1 0 8 x1 0 1 1/4 1/4 0 2
x4 0 2 1 0 1 4 x4 0 0 1/2 −1/2 1 0
Now x2 is an entering variable and x4 is a leaving variable. So we update the tableau:
z x1 x2 x3 x4 RHS z x1 x2 x3 x4 RHS
z 1 0 −3/4 9/4 0 18 z 1 0 0 3/2 3/2 18
⇒
x1 0 1 1/4 1/4 0 2 x1 0 1 0 1/2 −1/2 2
x4 0 0 1/2 −1/2 1 0 x2 0 0 1 −1 2 0
The current basic feasible solution (x1 , x2 , x3 , x4 ) = (2, 0, 0, 0) is optimal with the optimal value 18.
Remark 4. While Bland’s pivot rule is theoretically important, from a practical perspective it is quite
inefficient and takes a long time to converge. In practice, other pivot rules are used, and cycling almost
never happens.
x1 − 3x2 − x3 + 2x4 ≤ 0
x1 ≤ 1
x1 , x2 , x3 , x4 ≥ 0.
Solve this LP problem using the Simplex Method. If there are ties during the steps of the algorithm, break them using Bland's rule.
Big-M Method: In order to obtain a trivial basic feasible solution we introduce non-negative artificial variables for the "=" and "≥" constraints, which have no slack variable that could serve as an initial basic variable. Since these variables are artificial,
we include them in the objective function with a coefficient M , a very large penalty. Solving the
original problem in this new formulation will help us start with a trivial basic feasible solution
and find an identical optimal solution under appropriate choice of M . During the iterations of
Simplex Method, after an artificial variable becomes non-basic you can remove it from the problem
by deleting the corresponding column, reducing the problem size, because it cannot become basic again in later iterations.
min 2x1 + 3x2
s.t. 2x1 + x2 ≤ 16,
     x1 + 3x2 ≥ 20,
x1 + x2 = 10
x1 , x2 ≥ 0.
We convert this problem to standard form and get ready for simplex:
min z
s.t. z − 2x1 − 3x2 = 0,
     2x1 + x2 + s1 = 16,
     x1 + 3x2 − e2 = 20,
     x1 + x2 = 10,
     x1 , x2 , s1 , e2 ≥ 0.
We add artificial variables a2 and a3 to the second and third constraints and penalize them in the objective:
min z
s.t. z − 2x1 − 3x2 − M a2 − M a3 = 0,
     2x1 + x2 + s1 = 16,
     x1 + 3x2 − e2 + a2 = 20,
     x1 + x2 + a3 = 10,
     x1 , x2 , s1 , e2 , a2 , a3 ≥ 0.
We start with the basic feasible solution (x1 , x2 , s1 , e2 , a2 , a3 ) = (0, 0, 16, 0, 20, 10), with basic variables {s1 , a2 , a3 } and non-basic variables {x1 , x2 , e2 }. We construct the tableau:
z x1 x2 s1 e2 a2 a3 RHS
z 1 −2 −3 0 0 −M −M 0
s1 0 2 1 1 0 0 0 16
a2 0 1 3 0 −1 1 0 20
a3 0 1 1 0 0 0 1 10
To bring the tableau into canonical form, we add M times the a2 row and M times the a3 row to row-z:
z x1 x2 s1 e2 a2 a3 RHS
z 1 2M − 2 4M − 3 0 −M 0 0 30M
s1 0 2 1 1 0 0 0 16
a2 0 1 3 0 −1 1 0 20
a3 0 1 1 0 0 0 1 10
We pick x2 as the entering variable and a2 as the leaving variable so we update the tableau:
z x1 x2 s1 e2 a2 a3 RHS
z 1 2M − 2 4M − 3 0 −M 0 0 30M
s1 0 2 1 1 0 0 0 16
x2 0 1 3 0 −1 1 0 20
a3 0 1 1 0 0 0 1 10
z x1 x2 s1 e2 a2 a3 RHS
z 1 (2M − 3)/3 0 0 (M − 3)/3 (3 − 4M )/3 0 (60 + 10M )/3
s1 0 5/3 0 1 1/3 −1/3 0 28/3
x2 0 1/3 1 0 −1/3 1/3 0 20/3
a3 0 2/3 0 0 1/3 −1/3 1 10/3
z x1 x2 s1 e2 a3 RHS
z 1 (2M − 3)/3 0 0 (M − 3)/3 0 (60 + 10M )/3
s1 0 5/3 0 1 1/3 0 28/3
x2 0 1/3 1 0 −1/3 0 20/3
a3 0 2/3 0 0 1/3 1 10/3
Tableau is in canonical form so we pick x1 as the entering variable and a3 as the leaving variable.
Then we update the tableau:
z x1 x2 s1 e2 a3 RHS
z 1 (2M − 3)/3 0 0 (M − 3)/3 0 (60 + 10M )/3
s1 0 5/3 0 1 1/3 0 28/3
x2 0 1/3 1 0 −1/3 0 20/3
x1 0 2/3 0 0 1/3 1 10/3
z x1 x2 s1 e2 a3 RHS                      z x1 x2 s1 e2 RHS
z 1 0 0 0 −1/2 (3 − 2M )/2 25             z 1 0 0 0 −1/2 25
s1 0 0 0 1 −1/2 −5/2 1               ⇒    s1 0 0 0 1 −1/2 1
x2 0 0 1 0 −1/2 −1/2 5                    x2 0 0 1 0 −1/2 5
x1 0 1 0 0 1/2 3/2 5                      x1 0 1 0 0 1/2 5
The current basic feasible solution (x1 , x2 , s1 , e2 ) = (5, 5, 1, 0) is the optimal solution, with optimal value 25.
Homework 7. Consider the following LP problem
max x1 + x2
s.t. x1 − x2 ≥ 1,
−x1 + x2 ≥ 1,
x1 , x2 ≥ 0.
– If all artificial variables in the optimal solution equal zero, the solution is optimal.
– If any artificial variables are positive in the optimal solution, the problem is infeasible.
– When the LP (with the artificial variables) is solved, the final tableau may indicate that the LP is unbounded. If the final tableau indicates the LP is unbounded and all artificial variables are zero, then the original LP is unbounded.
– If the final tableau indicates that the LP is unbounded and at least one artificial variable is positive, then the original LP is infeasible.
Remark 5. For optimization solvers, it is difficult to determine how large M should be. In gen-
eral, M is chosen to be at least 100 times larger than the largest coefficient in the original objective
function. Introduction of an arbitrarily large M can cause numerical errors and computational
difficulties. Therefore most solvers solve LP’s by using the two-phase simplex method.
Two-Phase Simplex: In this method, artificial variables are added to the same constraints, then a bfs to the original
LP is found by solving Phase I LP. In Phase I LP, the objective function is to minimize the sum
of all artificial variables. As before, an artificial variable can be removed from the system as
soon as it becomes non-basic. At completion, reintroduce the original LP's objective function and continue with the Simplex Method from the basic feasible solution found in Phase I (this is Phase II).
min 2x1 + 3x2
s.t. 2x1 + x2 ≤ 16,
     x1 + 3x2 ≥ 20,
x1 + x2 = 10
x1 , x2 ≥ 0.
We convert this problem to the appropriate format as we did before and replace the objective with the sum of the artificial variables (the Phase I objective):
min z
s.t. z − a2 − a3 = 0,
2x1 + x2 + s1 = 16,
x1 + 3x2 − e2 + a2 = 20,
x1 + x2 + a3 = 10
x1 , x2 , s1 , e2 , a2 , a3 ≥ 0.
We start solving Phase I with the basic feasible solution (x1 , x2 , s1 , e2 , a2 , a3 ) = (0, 0, 16, 0, 20, 10), with basic variables {s1 , a2 , a3 } and non-basic variables {x1 , x2 , e2 }. We construct the tableau:
z x1 x2 s1 e2 a2 a3 RHS
z 1 0 0 0 0 −1 −1 0
s1 0 2 1 1 0 0 0 16
a2 0 1 3 0 −1 1 0 20
a3 0 1 1 0 0 0 1 10
To obtain canonical form, we add the a2 and a3 rows to row-z:
z x1 x2 s1 e2 a2 a3 RHS
z 1 2 4 0 −1 0 0 30
s1 0 2 1 1 0 0 0 16
a2 0 1 3 0 −1 1 0 20
a3 0 1 1 0 0 0 1 10
So we pick x2 as the entering variable and a2 as the leaving variable. We update the tableau
z x1 x2 s1 e2 a2 a3 RHS
z 1 2/3 0 0 1/3 −4/3 0 10/3
s1 0 5/3 0 1 1/3 −1/3 0 28/3
x2 0 1/3 1 0 −1/3 1/3 0 20/3
a3 0 2/3 0 0 1/3 −1/3 1 10/3
z x1 x2 s1 e2 a3 RHS
z 1 2/3 0 0 1/3 0 10/3
s1 0 5/3 0 1 1/3 0 28/3
x2 0 1/3 1 0 −1/3 0 20/3
a3 0 2/3 0 0 1/3 1 10/3
z x1 x2 s1 e2 a3 RHS
z 1 2/3 0 0 1/3 0 10/3
s1 0 5/3 0 1 1/3 0 28/3
x2 0 1/3 1 0 −1/3 0 20/3
x1 0 2/3 0 0 1/3 1 10/3
z x1 x2 s1 e2 a3 RHS
z 1 0 0 0 0 −1 0
s1 0 0 0 1 −1/2 −5/2 1
x2 0 0 1 0 −1/2 −1/2 5
x1 0 1 0 0 1/2 3/2 5
z x1 x2 s1 e2 RHS
z 1 0 0 0 0 0
s1 0 0 0 1 −1/2 1
x2 0 0 1 0 −1/2 5
x1 0 1 0 0 1/2 5
We obtained a basic feasible solution for the original problem. Now we change the objective back to the original one (z − 2x1 − 3x2 = 0) and restore canonical form by adding 2 times the x1 row and 3 times the x2 row to row-z:
z x1 x2 s1 e2 RHS
z 1 0 0 0 −1/2 25
s1 0 0 0 1 −1/2 1
x2 0 0 1 0 −1/2 5
x1 0 1 0 0 1/2 5
This tableau is in the canonical form and basic feasible solution provided in the tableau is optimal,
so we stop.
– The optimal value of the Phase I is greater than zero. In this case the original LP is infeasible.
– The optimal value of the Phase I is equal to zero and no artificial variables are in the optimal
Phase I basis. Then a basic feasible solution to the original problem is found. Continue to Phase II starting from this basic feasible solution.
– The optimal value of the Phase I is zero and at least one artificial variable is in the optimal Phase I basis. Then either we perform an additional pivot and get rid of the artificial variable by passing to an alternate optimal solution, or there was a redundant constraint and we can remove the corresponding row before continuing with Phase II.
Lecture 5
Integer Programming
A linear program plus the additional constraints that some or all of the variables must be integer valued
is called an integer program (IP). When compared to LP's, IP's are more realistic and flexible but more difficult to solve.
max 3x + 4y
s.t. 5x + 8y ≤ 24,
x, y ≥ 0 and integer.
If we ignore integrality we get x = 24/5, y = 0 and z = 72/5. Variable x is fractional. If we round this solution up to x = 5 it becomes infeasible, and rounding down to x = 4, y = 0 gives z = 12, whereas the optimal integer solution is x = 3, y = 1 with z = 13. Rounding an LP optimum therefore need not give an optimal (or even feasible) IP solution.
Example 26 (Bus Scheduling). The minimum number of buses required in each 4-hour period of the day is given below:
Periods           1   2   3    4   5    6
Buses required    4   8   10   7   12   4
• Each bus starts to operate at the beginning of a period, operates for 8 consecutive hours, and then stays off duty for the rest of the day.
• For example, a bus operating from 4 AM to 12 PM must be off between 12 PM and 4 AM.
Sol: Let xi be the number of buses that start operating at the beginning of period i, i = 1, . . . , 6. Then:
min x1 + · · · + x6
s.t. x1 + x6 ≥ 4,
x2 + x1 ≥ 8,
x3 + x2 ≥ 10,
x4 + x3 ≥ 7,
x5 + x4 ≥ 12,
x6 + x5 ≥ 4,
xi ≥ 0 and integer.
Example 27 (Work Scheduling). Burger Queen would like to determine the min # of kitchen employees
while meeting daily requirements. Each employee works six days a week and takes the seventh day off.
Day                  1   2   3   4   5   6   7
Min. # of workers    8   6   7   6   9   11  9
Sol: Let xi be the number of workers who are not working on day i, i = 1, . . . , 7. The number of workers working on day j can be computed as Σ_{i≠j} xi and this quantity should be at least the daily requirement, say dj . The number of all kitchen employees is Σ_{i=1}^{7} xi . Hence we have the following formulation:
min Σ_{i=1}^{7} xi
s.t. Σ_{i≠j} xi ≥ dj , ∀j = 1, . . . , 7,
     xi ≥ 0 and integer.
Example 28 (The Assignment Problem). There are n people and n jobs to carry out. Each person is as-
signed to carry out exactly one job. If person i is assigned to job j, then the cost is cij . Find an assignment that minimizes the total cost.
Sol: Let xij = 1 if person i is assigned to job j, and xij = 0 otherwise. Such variables are called binary variables and offer a convenient way of modeling yes/no decisions. Then we have the following formulation:
min Σ_{i=1}^{n} Σ_{j=1}^{n} cij xij
s.t. Σ_{i=1}^{n} xij = 1, ∀j = 1, . . . , n,
     Σ_{j=1}^{n} xij = 1, ∀i = 1, . . . , n,
     xij ∈ {0, 1}, ∀i, j = 1, . . . , n.
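Small assignment instances can be solved directly; SciPy's linear_sum_assignment (an assumed tool, not mentioned in the lecture) implements exactly this model. The cost matrix below is hypothetical.

```python
# Sketch: a 3x3 assignment problem with made-up costs c_ij.
import numpy as np
from scipy.optimize import linear_sum_assignment

c = np.array([[4, 2, 8],
              [4, 3, 7],
              [3, 1, 6]])
rows, cols = linear_sum_assignment(c)        # person i -> job cols[i]
print(list(zip(rows, cols)), c[rows, cols].sum())
```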
Example 29 (The 0-1 Knapsack Problem). You are going on a trip. Your suitcase has a capacity of b kg’s.
You have n different items that you can pack into your suitcase 1, . . . , n. Item i has a weight of ai and gives
you a utility of ci . How should you pack your suitcase to maximize total utility?
Sol: Let xi = 1 if we pick item i, and xi = 0 otherwise. Then we have the following formulation
max Σ_{i=1}^{n} ci xi
s.t. Σ_{i=1}^{n} ai xi ≤ b,
     xi ∈ {0, 1}.
Month
1 2 3 4
A pair of shoes requires 4 hours of labor and $5 worth of raw materials. There are 50 pairs of shoes in
stock. Monthly inventory cost is charged for shoes in stock at the end of each month at $30/pair. Currently 3
workers are available each has $1500/month salary and each works up to 160 hours/month. Hiring a worker
costs $1600 and firing a worker costs $2000. How to meet the demand with minimum total cost?
Sol: Let wi be the # of workers available, fi be the # of workers fired and hi be the # of workers hired at the beginning of month i. Let yi be the pairs of shoes in stock after meeting the demand at the end of month i and xi be the pairs of shoes produced in month i. Let di be the monthly demand, y0 = 50 and w0 = 3. Then we have the following formulation:
min 1500 Σ_{i=1}^{4} wi + 1600 Σ_{i=1}^{4} hi + 2000 Σ_{i=1}^{4} fi + 30 Σ_{i=1}^{4} yi + 5 Σ_{i=1}^{4} xi
s.t. yi = yi−1 + xi − di , ∀i = 1, . . . , 4,
wi = wi−1 − fi + hi , ∀i = 1, . . . , 4,
4xi ≤ 160wi , ∀i = 1, . . . , 4,
xi , yi , wi , hi , fi ≥ 0 and integer.
Example 31 (The Cutting Stock Problem). A paper company manufactures and sells rolls of paper of fixed
width in 5 standard lengths: 5, 8, 12, 15 and 17 m. Demand for each size is given below. It produces 25 m
length rolls and cuts orders from its stock of this size. What is the min # of rolls to be used to meet the
total demand?
Length (meters) 5 8 12 15 17
Demand 40 35 30 25 20
Then we define xj to be the # rolls cut according to pattern j. Then we have the following formulation:
min Σ_{j=1}^{11} xj
2x5 + x6 + x7 ≥ 30
x3 + x4 ≥ 25
x1 + x2 ≥ 20
xi ≥ 0 and integer.
Example 32 (Project Selection). A manager should choose from among 10 different projects:
cj := cost of project j,
vi. Either both projects 1 and 5 are chosen or neither are chosen
vii. At least two and at most four projects should be chosen from the set {1, 2, 7, 8, 9, 10}
viii. Neither project 5 nor 6 can be chosen unless either 3 or 4 is chosen. If x3 + x4 = 0, then x5 + x6 = 0.
ix. At least two of the investments 1, 2, 3 are chosen, or at most three of the investments 4, 5, 6, 7, 8 are
chosen.
x. If both 1 and 2 are chosen, then at least one of 9 and 10 should also be chosen.
Let xj = 1 if project j is selected, and xj = 0 otherwise. Then the objective is to maximize Σ_{j=1}^{10} pj xj and the budget constraint is Σ_{j=1}^{10} cj xj ≤ q. Additional requirements can be modeled as
i. x3 + x4 ≤ 1
ii. Σ_{j=1}^{10} xj = 3
iii. x2 ≤ x1
iv. x1 ≤ 1 − x3
v. x1 + x2 = 1
vi. x1 = x5
vii. 2 ≤ x1 + x2 + x7 + x8 + x9 + x10 ≤ 4
viii. x5 + x6 ≤ 2(x3 + x4 )
x. x1 + x2 − 1 ≤ x9 + x10
Either-Or Constraints
Let us have two constraints f (x1 , . . . , xn ) ≤ 0 and g(x1 , . . . , xn ) ≤ 0. We want to make sure that when the
model is solved at least one of the constraints is satisfied. We define a binary variable y and a large positive
number M . We add the following two constraints to the model f (x1 , . . . , xn ) ≤ M y and g(x1 , . . . , xn ) ≤
M (1 − y).
• M should be chosen large enough so that f (x1 , . . . , xn ) ≤ M and g(x1 , . . . , xn ) ≤ M hold for all feasible values of x1 , . . . , xn , but small enough to avoid numerical difficulties. A small worked instance is sketched below.
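As a small illustration (the numbers here are chosen for the example, they are not from the lecture), suppose that with 0 ≤ x1 , x2 ≤ 20 at least one of x1 + x2 ≤ 6 and 2x1 + x2 ≤ 10 must hold; then M = 50 is large enough and the either-or construction reads:

```latex
% Either-or linearization sketch (illustrative data):
% at least one of  x_1 + x_2 \le 6  and  2x_1 + x_2 \le 10  must hold.
\begin{align*}
  x_1 + x_2 - 6   &\le M\,y,        \\
  2x_1 + x_2 - 10 &\le M\,(1 - y),  \\
  y &\in \{0,1\}, \qquad M = 50.
\end{align*}
```

If y = 0 the first constraint is enforced and the second is relaxed; if y = 1 the roles are reversed.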
We can linearize logical implications for use in integer optimization problems by defining extra binary variables, as in the following cases.
• If δ = 1, then Σ_j aj xj ≤ b. Define M to be an upper bound on Σ_j aj xj − b. Then add the constraint
  Σ_j aj xj − b ≤ M (1 − δ)
to the model.
• If Σ_j aj xj ≤ b, then δ = 1. Define m to be a lower bound on Σ_j aj xj − b and define a sufficiently small ε > 0 (if all data are integers you may take ε = 1). Then add the constraint
  Σ_j aj xj − b ≥ ε + (m − ε) δ
to the model.
• If δ = 1, then Σ_j aj xj ≥ b. Define m to be a lower bound on Σ_j aj xj − b. Then add the constraint
  Σ_j aj xj − b ≥ m (1 − δ)
to the model.
• If Σ_j aj xj ≥ b, then δ = 1. Define M to be an upper bound on Σ_j aj xj − b and define a sufficiently small ε > 0 (if all data are integers you may take ε = 1). Then add the constraint
  Σ_j aj xj − b ≤ −ε + (M + ε) δ
to the model.
Homework 8. Linearize the following three examples:
A. If Σ_j aj xj ≤ b, then Σ_j cj xj ≤ d.
B. If δ = 1, then Σ_j aj xj < b.
C. If Σ_j aj xj < b, then Σ_j cj xj > d.
Example 33 (Facility Location Problem). We are given:
• A set of candidate warehouse locations 1, . . . , m; opening warehouse i incurs a fixed cost fi and provides capacity ai .
• A set of clients: 1, . . . , n, where client j has demand dj .
• A unit transportation cost cij from warehouse i to client j.
Which warehouses should be opened, and how should the clients be served, at minimum total cost?
Sol: Let xij be the amount of product transferred from warehouse i to client j, and let yi be a binary variable indicating whether warehouse i is active or not. Our goal is to minimize Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij + Σ_{i=1}^{m} fi yi . Capacity should not be exceeded: Σ_{j=1}^{n} xij ≤ ai yi . Demand should be satisfied: Σ_{i=1}^{m} xij ≥ dj . Then we have the following model
min Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij + Σ_{i=1}^{m} fi yi
s.t. Σ_{j=1}^{n} xij ≤ ai yi , ∀i = 1, . . . , m,
     Σ_{i=1}^{m} xij ≥ dj , ∀j = 1, . . . , n,
     xij ≥ 0, ∀i, j,
     yi ∈ {0, 1}, ∀i = 1, . . . , m.
Example 34 (Set Covering Problem). You wish to decide where to locate hospitals in a city. Suppose that there are 4 candidate hospital locations and 5 towns, and the travel times (in minutes) between them are as follows:
Location   Town 1   Town 2   Town 3   Town 4   Town 5
1          10       15       20       10       16
2          15       10       8        14       12
3          8        6        12       15       10
4          12       17       7        9        10
Travel time between each town and the nearest hospital should not be more than 10 minutes. What is the minimum number of hospitals needed, and where should they be located?
Sol: Let xi indicate whether there is a hospital at location i. Our goal is to minimize x1 + x2 + x3 + x4 .
According to the table, for the first town we have x1 + x3 ≥ 1 and rest can be found similarly. Then we have
min x1 + x2 + x3 + x4
s.t. x1 + x3 ≥ 1,
x2 + x3 ≥ 1,
x2 + x4 ≥ 1,
x1 + x4 ≥ 1,
x3 + x4 ≥ 1,
xi ∈ {0, 1}
Example 35 (Location Problems). Ankara municipality is providing service to 6 regions. The travel times
(in minutes) between these regions, say tij values are:
R1 R2 R3 R4 R5 R6
R1 0 10 20 30 30 20
R2 10 0 25 35 20 10
R3 20 25 0 15 30 20
R4 30 35 15 0 15 25
R5 30 20 30 15 0 14
R6 20 10 20 25 14 0
The municipality wants to build fire stations in some of these regions. Find the minimum number of fire
stations required if the municipality wants to ensure that there is a fire station within a 15-minute drive time of every region.
min x1 + · · · + x6
s.t. x1 + x2 ≥ 1,
x1 + x2 + x6 ≥ 1,
x3 + x4 ≥ 1,
x3 + x4 + x5 ≥ 1,
x4 + x5 + x6 ≥ 1,
x2 + x5 + x6 ≥ 1,
xi ∈ {0, 1}
Let aij = 1 if cij ≤ 15, and aij = 0 otherwise. Then the constraints can be expressed as Σ_{i=1}^{6} aij xi ≥ 1 for each j = 1, . . . , 6.
The municipality wants to build 2 fire stations so that the time to reach the farthest region is kept at
minimum.
Let yj = 1 if a fire station is built at region j, and yj = 0 otherwise. Let xij = 1 if region i is served by the fire station at region j, and xij = 0 otherwise. Let T be the time to reach the farthest region. We should have 2 fire stations: y1 + · · · + y6 = 2. If there is no fire station at
region j, there should be no assignments to it: Σ_{i=1}^{6} xij ≤ 6yj . Each region should be assigned to a single fire station: Σ_{j=1}^{6} xij = 1. The farthest distance should be at least any assigned distance: T ≥ cij xij . Our goal is to minimize T, so we have the following model:
min T
s.t. Σ_{j=1}^{6} yj = 2,
     Σ_{i=1}^{6} xij ≤ 6yj , ∀j = 1, . . . , 6,
     Σ_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
     T ≥ cij xij , ∀i, j = 1, . . . , 6,
     xij , yj ∈ {0, 1},
     T ≥ 0.
Given the constraints Σ_{j=1}^{6} xij = 1, the inequalities that bound T can be rewritten as T ≥ Σ_{j=1}^{6} cij xij for each i = 1, . . . , 6.
The municipality now wants to build 2 post offices. There is going to be a back and forth trip each day
from each post office to each one of the regions assigned to this post office. Formulate a model which will minimize the total daily travel time.
min 2 Σ_{i=1}^{6} Σ_{j=1}^{6} cij xij
s.t. Σ_{j=1}^{6} yj = 2,
     Σ_{i=1}^{6} xij ≤ 6yj , ∀j = 1, . . . , 6,
     Σ_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
     xij , yj ∈ {0, 1}.
What if, in addition to the previous model, you have the restriction that a post office cannot provide service to more than 4 regions?
We include the following constraint: Σ_{i=1}^{6} xij ≤ 4 for all j. Then we have the following model:
min 2 Σ_{i=1}^{6} Σ_{j=1}^{6} cij xij
s.t. Σ_{j=1}^{6} yj = 2,
     Σ_{i=1}^{6} xij ≤ 6yj , ∀j = 1, . . . , 6,
     Σ_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
     Σ_{i=1}^{6} xij ≤ 4, ∀j = 1, . . . , 6,
     xij , yj ∈ {0, 1}.
Assume that in addition to the back-and-forth trips between the post offices and the assigned regions, there will be a trip between the two post offices. The municipality wants to locate the offices in such a way that the total travel time is minimized.
If we force diagonal entries to be 0 a post office cannot be assigned to itself so the objective function will
also count the back and forth trip between the post offices.
min 2 Σ_{i=1}^{6} Σ_{j=1}^{6} cij xij
s.t. Σ_{j=1}^{6} yj = 2,
     Σ_{i=1}^{6} xij ≤ 6yj , ∀j = 1, . . . , 6,
     Σ_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
     Σ_{i=1}^{6} xij ≤ 4, ∀j = 1, . . . , 6,
     Σ_{i=1}^{6} xii = 0,
     xij , yj ∈ {0, 1}.
Alternatively, one can introduce an additional binary variable zij such that zij = 1 if post offices are located at both i and j (i ≠ j), and zij = 0 otherwise.
The resulting model is:
min 2 Σ_{i=1}^{6} Σ_{j=1}^{6} cij xij + 2 Σ_{i=1}^{6} Σ_{j=1, j≠i}^{6} cij zij
s.t. Σ_{j=1}^{6} yj = 2,
     Σ_{i=1}^{6} xij ≤ 6yj , ∀j = 1, . . . , 6,
     Σ_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
     Σ_{i=1}^{6} xij ≤ 4, ∀j = 1, . . . , 6,
     yi + yj − 1 ≤ zij , ∀i, j = 1, . . . , 6 : i ≠ j,
     xij , yj , zij ∈ {0, 1}.
The constraints linking yi , yj and zij linearize the logical expression yi = 1 ∧ yj = 1 =⇒ zij = 1
Example 36 (Traveling Salesman Problem). A salesperson lives in city 1. She has to visit each of the cities 2, . . . , n exactly once and return home. Let cij be the travel time from city i to city j, i = 1, . . . , n; j = 1, . . . , n. What is the order in which she should make her tour to finish as quickly as possible?
Let xij = 1 if city j is visited directly after city i, and xij = 0 otherwise. Each city has to be visited exactly once: Σ_{i=1}^{n} xij = 1 and Σ_{j=1}^{n} xij = 1. Then we have the following formulation:
min Σ_{i=1}^{n} Σ_{j=1}^{n} cij xij
s.t. Σ_{i=1}^{n} xij = 1, ∀j = 1, . . . , n,
     Σ_{j=1}^{n} xij = 1, ∀i = 1, . . . , n,
     xii = 0, ∀i = 1, . . . , n,
     xij ∈ {0, 1}, ∀i, j = 1, . . . , n.
Unfortunately, there may be subtours. After identifying a subtour, say S, add the following constraint to the model: Σ_{i∈S} Σ_{j∉S} xij ≥ 1. However, such constraints are exponential in number, since the number of possible subtours is of the order of 2^n − 1, a very large number (and the number of possible tours, (n − 1)!, is even larger). A more compact formulation instead uses additional integer variables ui modelling the position of city i in the tour. The subtour elimination constraints are then:
ui − uj + n xij ≤ n − 1, ∀i ≠ j : i, j = 2, . . . , n.
By setting xij = 0, the constraint is trivially satisfied. When xij = 1, it enforces ui to precede uj . All
together, the constraints correctly define a tour where each ui takes a different value. Although the second
model exhibits a much smaller number of constraints for growing n, it is often not used in practice due to the weakness of its LP relaxation.
Homework 9. A company produces animal feed mixture. The mix contains 2 active ingredients and a filler
material. 1 kg of feed mix must contain a minimum quantity of each of the following 4 nutrients:
Nutrients A B C D
A B C D Cost/kg
a) What should be the amount of active ingredients and filler material in 1 kg of feed mix in order to minimize the total cost?
• If we use any of ingredient 2, we incur a fixed cost of $15 (in addition to the $60 unit cost).
Lecture 6
max x2
x1 + x2 ≤ 7/2
x1 , x2 ≥ 0, integer
By removing the integrality conditions, we get an LP problem: this is the LP-relaxation and is often the starting point for solving the IP.
If we solve the LP of the example, the optimal solution (3/2, 2) is not integer. Trivially rounding the fractional variable up or down may give a solution that is infeasible or far from the integer optimum.
i. X ⊆ Y
It further follows that two properties hold:
A. If Y = ∅ then X = ∅
(b) if y ∗ ∉ X, by optimality we know that r(y ∗ ) ≥ r(x) for any x ∈ X. But from ii. r(x) ≥ c(x) for any x ∈ X, hence r(y ∗ ) ≥ c(x) for every x ∈ X.
• From B., the optimal value of the LP-relaxation is at least as good as the optimal value of the IP.
• Moreover, if the optimal solution of the LP-relaxation is integer valued, then that solution is also an optimal solution of the IP.
– If y ∗ is the optimum of the LP-relaxation and it is integer, then it is feasible for the IP (y ∗ ∈ X). But from ii. we know that there exists no integer x with a better objective function value. Thus y ∗ is also optimal for the IP.
Example 37 (0-1 Knapsack). Consider the knapsack problem
max Σ_{j=1}^{6} pj xj
s.t. Σ_{j=1}^{6} aj xj ≤ b
     xj ∈ {0, 1} ∀j = 1, . . . , 6
with utilities p = (16, 22, 12, 8, 11, 19), sizes a = (5, 7, 4, 3, 4, 6) and capacity b = 14.
Each variable has a utility pj , a size aj and the knapsack has a capacity b.
Arrange the variables in non-increasing order of their utility per unit size (pj /aj ) and fill the knapsack with items in that order. For the data above,
x1 ≥ x6 ≥ x2 ≥ x3 ≥ x5 ≥ x4 , since 16/5 ≥ 19/6 ≥ 22/7 ≥ 12/4 ≥ 11/4 ≥ 8/3.
Starting from the first item, set the corresponding variable xj = 1 and reduce the capacity b by aj . When a certain item h does not fit in the current capacity, set its variable to the fractional value xh = (b − Σ_{j<h} aj ) / ah and set all later items to 0. This simple procedure is commonly known as the Dantzig procedure and outputs the LP optimum for the 0-1 knapsack problem.
A. pick x1 , the starting capacity is 14. We check that the weight 5 < 14 so x1 = 1 and the residual
capacity becomes 9.
B. pick x6 , residual capacity 9. We check that the weight 6 < 9 so x6 = 1 and the residual capacity
becomes 3.
C. pick x2 , residual capacity 3. Its weight 7 does not fit, so x2 = 3/7 and the remaining variables are set to 0.
Hint: in case of a minimization problem with a ≥ constraint, variables are arranged in non-decreasing order of (pj /aj ).
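A short sketch of the Dantzig procedure in Python (my own illustrative code, using the data reconstructed above):

```python
# Sketch: Dantzig's procedure for the LP relaxation of a 0-1 knapsack.
def dantzig(p, a, b):
    order = sorted(range(len(p)), key=lambda j: p[j] / a[j], reverse=True)
    x, cap, value = [0.0] * len(p), b, 0.0
    for j in order:
        if a[j] <= cap:                  # item fits: take it completely
            x[j], cap, value = 1.0, cap - a[j], value + p[j]
        else:                            # first item that does not fit: fractional
            x[j] = cap / a[j]
            value += p[j] * x[j]
            break
    return x, value

p = [16, 22, 12, 8, 11, 19]
a = [5, 7, 4, 3, 4, 6]
print(dantzig(p, a, 14))    # x1 = x6 = 1, x2 = 3/7, LP bound ~ 44.43
```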
Systematically consider all possible values of the decision variables. If there are n binary variables, then there are 2^n possible combinations to evaluate.
Usual Idea: iteratively break the problem in two. At the first iteration, we consider separately the case
that x1 = 0 and x1 = 1.
Enumeration tree:
The bottom nodes are called leaves of the tree.
Given the exponential growth of 2^n , evaluating all the possible solutions through complete enumeration quickly becomes impractical.
Branch-and-Bound (B&B) is an algorithm that performs an implicit enumeration to find the IP optimal solution.
The first ingredient used in the B&B is a feasible integer solution, i.e. the incumbent: xI of value z I .
The key idea of the implicit enumeration: select nodes of the “enumeration tree” one at a time, but do not branch from a node if none of its descendants can be a better solution than that of the incumbent.
For example, for maximization if z LP (2) ≤ z I , that is the upper bound UB(2) of subproblem at node
2 has an objective function value smaller than the incumbent value, subtree 2 can be eliminated.
Example 39 (Subproblem 5). Consider the knapsack subproblem at node 5:
x1 = 0, x2 = 1
xj ∈ {0, 1} ∀j = 3, . . . , 6
The optimal LP-relaxation is x∗ = [0, 1, 1/4, 0, 0, 1] with z LP (5) = 44. Being z LP (5) > z I , we need to branch further.
If the objective function has integer coefficients, we can impose the upper bound to be UB(j) = ⌊z LP (j)⌋.
B&B creates the enumeration tree one node at a time and one branch at a time.
The LP relaxation at the current node j is solved and the algorithm either fathoms the node (cases 1, 2, 3) or branches (case 4):
1 infeasibility: z LP (j) = −∞. The LP subproblem is infeasible, node j is fathomed and no branching occurs.
2 bound: ⌊z LP (j)⌋ ≤ z I . Node j is fathomed, since no solution in the subtree of j can improve the incumbent. Suppose z LP (5) = 38.7. Then, ⌊z LP (5)⌋ = 38 ≤ z I .
3 integrality: z I ≤ z LP (j) and the optimal solution for LP(j) is integer. Then, the incumbent is updated to the integer solution of LP(j). As no descendant node can have a better IP solution, fathom node j.
4 branching: z I ≤ z LP (j) and the optimal solution for LP(j) is NOT integer. Fathoming cannot be performed. Add two children of node j and continue the B&B. Suppose z LP (5) = 41 but the optimal solution of LP(5) is not integer; then node 5 must be branched.
The set of nodes that have no children and have not yet been processed are called active nodes. At the beginning, only the root node is active. The B&B terminates when there are no active nodes.
The selection of the current node from the list of active nodes is made according to different strategies:
• depth-first: follows the LIFO (last-in first-out) order and moves down one side of the B&B tree. It quickly finds a candidate solution (i.e., a feasible integer solution) and requires backtracking to explore the rest of the tree.
• breadth-first: follows the FIFO (first-in first-out) order and visits the B&B tree by processing all the nodes of one level before moving to the next level.
• best-first: when two children are created, both are processed by solving the respective subproblems. Then, according to a priority queue that stores all the active nodes ordered by z LP (j) values, it performs jumptracking to the node that currently exhibits the best z LP (j) value.
Example 40 (B&B). Solve by B&B the 0-1 knapsack problem of Example 37.
The branching strategy is to select the variable with the most fractional value.
Among the active nodes, choose the one with depth-first strategy.
Due to the depth-first strategy, subproblem nodes are solved in such order: 0-1-3-5-6-4-2-7-9-10-8-11-13-
14-12.
The output of the B&B algorithm is the optimal solution x∗ [1, 0, 0, 1, 0, 1] of value z ∗ = 43.
Hint: the LP-relaxation at each node of the branch-and-bound tree can be solved by firstly imposing
the branching choices and then using the Dantzig procedure on the remaining items. As an example, at node 2, x2 = 1 is enforced and the initial capacity 14 is reduced to 7. Using the Dantzig procedure, x1 = 1 and the remaining capacity 2 is filled by x6 = 1/3, giving z LP (2) = 22 + 16 + 19/3 ≈ 44.3.
The B&B algorithm framework is then summarized.
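A compact sketch of that framework, specialized to the 0-1 knapsack of Example 37 and written purely for illustration (it uses the Dantzig procedure as the bounding step and depth-first node selection; the function and variable names are my own):

```python
# Sketch: depth-first Branch-and-Bound for the 0-1 knapsack of Example 37,
# bounding every node with the Dantzig LP relaxation. Illustrative, not optimized.
import math

p = [16, 22, 12, 8, 11, 19]          # utilities
a = [5, 7, 4, 3, 4, 6]               # sizes
b = 14                               # capacity
order = sorted(range(len(p)), key=lambda j: p[j] / a[j], reverse=True)

def lp_bound(fixed):
    """Dantzig bound with some variables fixed to 0/1: (bound, fractional item, x)."""
    x = dict(fixed)
    cap = b - sum(a[j] for j, v in fixed.items() if v == 1)
    if cap < 0:
        return -math.inf, None, x                     # node infeasible
    value = sum(p[j] for j, v in fixed.items() if v == 1)
    for j in order:
        if j in x:
            continue
        if a[j] <= cap:
            x[j], value, cap = 1, value + p[j], cap - a[j]
        elif cap > 0:
            x[j] = cap / a[j]                         # first item that does not fit
            return value + p[j] * x[j], j, x
        else:
            x[j] = 0
    return value, None, x                             # LP optimum is integral

best_val, best_x = -math.inf, {}
stack = [{}]                                          # root node: nothing fixed
while stack:                                          # depth-first (LIFO) selection
    fixed = stack.pop()
    bound, frac, x = lp_bound(fixed)
    if bound == -math.inf or math.floor(bound) <= best_val:
        continue                                      # fathom: infeasibility or bound
    if frac is None:                                  # fathom by integrality: new incumbent
        best_val, best_x = bound, x
        continue
    for v in (0, 1):                                  # branch on the fractional item
        stack.append({**fixed, frac: v})

print([best_x.get(j, 0) for j in range(len(p))], best_val)   # [1, 0, 0, 1, 0, 1], 43
```

Running it reproduces the optimal solution x∗ = [1, 0, 0, 1, 0, 1] of value 43 reported in Example 40.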
max 8x1 + 5x2
s.t. x1 + x2 ≤ 6
9x1 + 5x2 ≤ 45
x1 , x2 ≥ 0, integer
Solving the LP-relaxation of the original problem (node 0) gives x1 = 3.75, x2 = 2.25, z LP (0) = 41.25.
This is Case 4 of B&B, so two children are created so that xLP (0) is not feasible for either subproblem. This is done by branching on the fractional variable x1 .
The two children are created by setting x1 ≤ 3 on the left branch (node 1) and x1 ≥ 4 on the right one
(node 2).
Solving subproblem at node 1 gives x1 = 3, x2 = 3, z LP (1) = 39. A new incumbent z I = 39 is found and the
node is fathomed.
Solving subproblem at node 2 gives x1 = 4, x2 = 1.8, z LP (2) = 41. Two children are created. Branching is performed on x2 (x2 ≤ 1 on node 3, x2 ≥ 2 on node 4).
Solving subproblem at node 3 gives x1 = 4.44, x2 = 1, z LP (3) = 40.55. Two children are created. Branching is performed on x1 (x1 ≤ 4 on node 5, x1 ≥ 5 on node 6).
Solving subproblem at node 6 gives x1 = 5, x2 = 0, z LP (6) = 40. A new incumbent z I = 40 is found and the node is fathomed. The remaining nodes are fathomed by infeasibility or by bound, so the optimal solution is x1 = 5, x2 = 0 with value 40.
In general:
• The more accurate the bounds (close to the IP subproblem value), the quicker the fathoming. Having good techniques to obtain accurate bounds, and branching early on important decisions (fixing crucial variables first), makes the B&B more effective.
Lecture 7
Networks model
Network models are optimization models that exhibit a special structure. In some cases, this special structure dramatically reduces the computational complexity (running time). They were among the first widespread applications of LP to problems of industrial size.
• Highway, street, rail, airline, shipping networks: transport people, vehicles, goods
A network (or graph) G = (N, A) is made by a set of nodes N = {1, 2, 3, 4} and a set of arcs
A = {b = (1, 2), a = (1, 4), e = (2, 3), d = (3, 4), c = (2, 4)}. A network in which the orientation of the arcs is relevant is called a directed network.
A path (or a directed path if orientations matter) is a sequence of nodes (or arcs) without repetitions.
Figure 1: On the left, path 5-2-3-4. On the right, directed path 1-2-5-3-4.
A cycle (also called circuit or loop) is a path with two or more nodes where the last node is also the first node.
Figure 2: On the left, cycle 1-2-3-1. On the right, directed cycle 1-2-3-4-1.
The shortest path problem requires the computation of the (directed) path of minimum total length, where each arc (i, j) has a length (cost) cij .
Define integer variables xij ≥ 0 for each arc indicating how many times such arc is used in the path.
In the single-pair shortest path problem the path starts from node s and arrives in node t. The formulation
is:
min Σ_{(i,j)∈A} cij xij
s.t. Σ_{j:(s,j)∈A} xsj = 1
     Σ_{j:(i,j)∈A} xij − Σ_{j:(j,i)∈A} xji = 0,   ∀i ∈ N \ {s, t}
     −Σ_{j:(j,t)∈A} xjt = −1
     xij ≥ 0 integer,   ∀(i, j) ∈ A
Flow balance equations are expressed compactly as Mx = b, where M is the network matrix made of {−1, 0, 1} entries and b is the array of external flow entering (> 0) or leaving (< 0) the network at each node. In the shortest path problem b^T = [1, 0, . . . , 0, −1] holds, where the non-zero elements are associated with s and t, respectively.
Example 42 (single pair shortest path). Let us consider the following network:
The single-pair shortest path problem model reads as:
min 2x12 + 4x13 + x23 + 4x24 + 2x25 + 3x35 + 2x46 + 3x54 + 2x56
s.t. x12 + x13 = 1
     x23 + x24 + x25 − x12 = 0
     x35 − x13 − x23 = 0
     x46 − x24 − x54 = 0
     x54 + x56 − x25 − x35 = 0
     −x46 − x56 = −1
     xij ≥ 0 integer   ∀(i, j) ∈ A
The optimal shortest path in single-pair case, highlighted in red, is 1-2-5-6 of cost 6.
For single source shortest path problem bT = [n − 1, −1, . . . , −1, −1]. Node s is the source of n − 1 units
and all other nodes are sinks with demands -1. A solution to this problem is a shortest path tree rooted
at s.
Example 43 (single source shortest path). The optimal solution described in the figure is x12 = 5, x23 =
The network matrix M has a row per node and a column per arc of the network. The column associated with arc (i, j) is exactly made of a 1 (at row i) and a −1 (at row j). Due to their structure, network matrices fall into the family of totally unimodular matrices.
A matrix M is called totally unimodular if every square submatrix has determinant equal to 0, 1, or −1.
If M is totally unimodular and b is integer valued, then the polyhedron P = {x|M x ≤ b} is integral.
Proof. Recall that the simplex method proceeds from one basic feasible solution to the next, until it finds the optimal one. Let B be the submatrix of M corresponding to a basic feasible solution. The basic variables solve B x_B = b, so by Cramer's rule x_B,i = det(B_i)/det(B), where B_i is the matrix B in which the i-th column is replaced by b. We know that B is invertible, thus det(B) ≠ 0, and total unimodularity implies det(B) = ±1. If all the entries of b are integers, then det(B_i) is an integer and hence x_B is integer.
Maximum Flows
Define variables xij ≥ 0 for each arc (i, j) ∈ A codifying the amount of flow sent through (i, j). The maximum
flow problem asks to compute the maximum net flow v that can cross the network starting from a source
node s and arriving to a sink node t. Finally, define uij as the capacity of (i, j).
max v
s.t. Σ_{j:(s,j)∈A} xsj = v
     Σ_{j:(i,j)∈A} xij − Σ_{j:(j,i)∈A} xji = 0   ∀i ∈ N \ {s, t}
     −Σ_{j:(j,t)∈A} xjt = −v
     0 ≤ xij ≤ uij   ∀(i, j) ∈ A
that is, {max v | Ax = bv, 0 ≤ x ≤ u}, where A is the network matrix and b^T = [1, 0, . . . , 0, −1].
The model for the maximum flow is:
max v
s.t. xs1 + xs2 = v
     x12 + x1t − xs1 = 0
     x2t − xs2 − x12 = 0
     −x1t − x2t = −v
     xs1 ≤ 10, xs2 ≤ 6, x12 ≤ 1, x1t ≤ 8, x2t ≤ 10
     xij ≥ 0 ∀(i, j) ∈ A
Two more network problems are the transportation problem and the assignment problem.
The former has nodes in a first set as sources (b > 0) and nodes in a second set as demands (b < 0). The
feasibility of the transportation problem is assessed by verifying that there exists a flow of total value equal to the total demand in the associated max-flow network (Figure 3).
Figure 3: On the left, an instance of the transportation problem. On the right, the transformation to a max-flow problem used to solve the feasibility version of the transportation problem.
The latter requires finding a 1-1 correspondence between nodes of a first set and nodes of a second set, both of the same cardinality |N|. It corresponds to a transportation problem with unitary sources/demands and nodes equally distributed in the two sets. The assignment feasibility can be checked by looking for a maximum flow of value |N| in the transformed network (Figure 4).
Figure 4: On the left, an instance of the assignment problem. On the right, the transformation to a max-flow problem used to solve the feasibility version of the assignment.
The maximum flow problem can be solved by means of the Ford-Fulkerson algorithm. Firstly, one needs to
define the residual network: given a network G = (N, A) and a feasible flow x, the residual network G(x)
has, for each (i, j) ∈ A of G with capacity uij , a pair of arcs: (i, j) of residual capacity rij = uij − xij and
a reversed arc (j, i) of capacity xij. Intuitively, rij is the amount of flow that one can add without violating the arc capacity uij, while the reversed arc contains the amount of flow that can possibly be removed from arc (i, j).
Figure 5: On the top, an original network G with a feasible flow x (in red). On the bottom, the residual
network G(x) built accordingly to the definitions.
The Ford-Fulkerson algorithm implements the idea of augmenting path, which is an s − t directed
path on the residual network. The residual capacity of the augmenting path P is given by δ(P ) = min{rij :
(i, j) ∈ P }. To augment along P is to send δ(P ) units of flow along each arc of the path. Then, x and the residual network G(x) are updated accordingly.
Figure 6: On the left, an augmenting path on the residual network G(x) (in red). On the right, the updated
residual network G(x).
It can be proved that the flow is optimal when there is no augmenting path on the residual networks,
that is t cannot be reached by any path starting from s. To prove it, we exploit the concept of (s − t)-cuts.
An (s − t)-cut in a network G = (N, A) is a partition of N into two disjoint subsets S and T such that s ∈ S and t ∈ T.
The capacity of an (s − t)-cut [S, T] is C[S, T] = Σ_{i∈S} Σ_{j∈T} uij (only the capacities of arcs going from nodes in S to nodes in T are counted).
For any feasible flow x and any (s − t)-cut [S, T] on a network, v(x) ≤ C[S, T]: the amount of flow crossing the network cannot exceed the capacity of any cut.
Due to this result, we know that if v(x) = C[S, T] then x is a maximum flow and [S, T] is a minimum cut.
Theorem 5 (Max-flow Min-cut Theorem). A flow is maximum if and only if there exists no augmenting path in the residual network G(x).
Proof. If there exists an augmenting path in G(x), clearly x is not the maximum.
Suppose there is no augmenting path in G(x). Let S be the set of nodes reachable from s in G(x) and let T = N \ S; since t is not reachable, [S, T] is an (s − t)-cut. For every arc (i, j) with i ∈ S and j ∈ T we must have xij = uij (otherwise j would be reachable through a residual arc), and
i ∈ T and j ∈ S =⇒ xij = 0.
The flow crossing this (s, t)-cut [S, T ] is equal to the capacity C[S, T ], hence it is maximum.
To detect the non-existence of augmenting paths, one can rely on a breadth-first search (FIFO logic) on the residual network.
Algorithm 2 labels the nodes that at each iteration can be reached from the current node u and keeps track of the predecessor of each labeled node.
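A compact implementation of the augmenting-path idea with breadth-first search (the Edmonds-Karp variant of Ford-Fulkerson) may look as follows; the capacities in the test instance at the bottom are those of the max-flow model shown above (nodes s, 1, 2, t).

from collections import deque

def max_flow(capacity, s, t):
    """capacity: dict {(i, j): u_ij}. Returns the maximum flow value."""
    nodes = {i for arc in capacity for i in arc}
    residual = dict(capacity)
    for (i, j) in capacity:
        residual.setdefault((j, i), 0)            # reversed arcs start empty
    adj = {v: [] for v in nodes}
    for (i, j) in residual:
        adj[i].append(j)
    flow = 0
    while True:
        pred = {s: None}                          # BFS for an augmenting path
        queue = deque([s])
        while queue and t not in pred:
            u = queue.popleft()
            for v in adj[u]:
                if v not in pred and residual[(u, v)] > 0:
                    pred[v] = u
                    queue.append(v)
        if t not in pred:                         # no augmenting path: the flow is maximum
            return flow
        path, v = [], t
        while pred[v] is not None:                # rebuild the path and its residual capacity
            path.append((pred[v], v))
            v = pred[v]
        delta = min(residual[a] for a in path)
        for (i, j) in path:                       # augment along the path
            residual[(i, j)] -= delta
            residual[(j, i)] += delta
        flow += delta

caps = {("s", 1): 10, ("s", 2): 6, (1, 2): 1, (1, "t"): 8, (2, "t"): 10}
print(max_flow(caps, "s", "t"))                   # 15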
Example 45 (Ford-Fulkerson algorithm and terminating condition). At the initialization with x = 0, the residual network coincides with the original network.
By selecting the augmenting path s − 1 − 2 − t it results ∆ = min{M, 1, M } = 1. We set xs1 = x12 = x2t = 1, rs1 = r2t = M − 1, r12 = 0, r1s = r21 = rt2 = 1 and update the residual network accordingly.
The new selected augmenting path is s − 2 − 1 − t where ∆ = min{M, 1, M } = 1. We set xs2 = x1t = 1, x12 = 0, rs2 = r1t = M − 1, r21 = 0, r2s = r12 = rt1 = 1 and update the residual network accordingly.
One can iterate the process up to 2M times to arrive at the optimal flow of value 2M, which equals the capacity of the minimum cut.
The same solution could have been computed by not playing on arc (1, 2), but by directly sending M units of flow along s − 1 − t and s − 2 − t, using only two augmenting paths.
Lecture 8
Project scheduling
Network Optimization can be used as an aid in the scheduling of large complex projects that consist of many
activities. A project is divided into a collection of work activities (for planning and control purposes) that must be carried out, typically subject to precedence relationships. Planning the project around these activities involves:
• Identifying precedence relationships: shows the relationship of each activity to others and to the whole
project
• Determining activity times & costs: encourages the setting of realistic time and cost estimates for each
activity
• Estimating material and worker requirements: helps make better use of people, money, and material
resources
Three main techniques are used in project management: the Gantt chart, the Critical Path Method (CPM), and the Program Evaluation and Review Technique (PERT).
A Gantt chart provides a graphical representation of a schedule of activities, useful for planning, coordi-
nating, and tracking specific tasks in a project, giving a clear illustration of the project’s progress. However,
one aspect not taken into account in this type of diagramming is the interdependence of activities.
Both CPM and PERT are network techniques developed in the late 1950s that consider precedence
relationships and interdependencies. Each uses a different treatment of activity times: in CPM activity durations are known with certainty; in PERT activity durations are not known with certainty.
1. Define the project and prepare the work breakdown structure (list of all activities in the project)
2. Develop relationships among the activities - decide which activities must precede and which must follow others
3. Draw the network connecting all the activities
4. Assign time and/or cost estimates to each activity
5. Compute the longest time path through the network - this is called the critical path
6. Use the network to help plan, schedule, monitor, and control the project
The information obtained by means of CPM and PERT can provide answers to the following questions:
IV. Is the project on schedule w.r.t. the due date, behind schedule (delay), or ahead of schedule (early)?
V. Is the money spent equal to, less than, or greater than the budget?
VI. Are there enough resources available to finish the project on time?
VII. If the project must be finished in a shorter time, what is the way to accomplish this with the least cost
(crashing)?
The structure that embeds all the precedence relationships between activities is called project network.
Alternatively, activities can be associated with nodes (activity-on-nodes network, AON) or with arcs (activity-on-arcs network, AOA). The two conventions are equivalent. In the former, arcs express the precedence relationships among activities. In the latter, nodes are referred to as events, such as the completion of an activity or of a set of activities.
Note that in AOA, dummy activities of zero time may be required, as any activity must be represented by exactly one arc in the network and two nodes can be connected by at most one arc.
Figure 7: Resulting AON.
Nodes start and finish represent the beginning and ending of the project, respectively. Arc lengths (in AOA) or node weights (in AON) represent activity durations.
The critical path is the longest path through the network and represents the shortest time in which the project can be completed. Critical path activities have no slack time, thus any delay in critical path activities delays the entire project.
Computing longest paths: given a network G = (N, A) with non-negative lengths associated on arcs,
special source node s, find the longest path from s to every other node in N .
If the network has no directed cycles of positive lengths, the principle of optimality holds: the longest
directed s − j path is made by the longest directed s − i path plus the arc (i, j). In formulas, this reads
v[s] = 0
v[j] = max_{i:(i,j)∈A} { v[i] + cij }   ∀j ∈ N \ {s}
We refer to v[j] as the length of the longest s − j path. The first equality is the boundary condition, that is
the length of the longest path v[s] is initially 0. Then, at stage (node) j all its predecessors are evaluated
in terms of path length v[i] + cij , where cij is the length of arc (i, j). Among all these measures, the largest
one is selected as longest path. Finally, the predecessor of j in the longest path is stored in d[j].
Note that project networks are acyclic, hence the functional equations can be used to get the longest path.
The critical path analysis aims at computing the following 4 measures for each activity:
1. Earliest start (ES) = earliest time at which an activity can start, assuming all predecessors have been completed
2. Earliest finish (EF) = earliest time by which an activity can be completed
3. Latest start (LS) = latest time at which an activity can start so as to not delay the completion time of the entire project
4. Latest finish (LF) = latest time by which an activity has to be finished so as to not delay the completion time of the entire project
Forward pass
The forward pass begins from the starting node and works forward. Its scope is to identify the earliest start (ES) and earliest finish (EF) times of all activities.
Earliest start time rule: If an activity has only one immediate predecessor, its ES equals the EF of the
predecessor. If an activity has multiple immediate predecessors, its ES is the maximum of all the EF values
of its predecessors:
ES[j] = max_{i:(i,j)∈A} EF[i]
Earliest finish time rule: the earliest finish time (EF) of an activity is the sum of its earliest start time and its activity time: EF[j] = ES[j] + activity time of j.
The early schedule ES[j] indicates the length of the longest path from start to node j in the AON project network.
The project completion time corresponds to the minimum time needed to complete the project, that is, the length of the critical path, EF[finish].
Example 48 (Early schedule on AON at Figure 7). The lexicographic order is start,A,B,C,D,E,F,G,H,finish.
For both A and B, earliest starts are ES[A] = ES[B] = 0. Earliest finishes are instead EF[A] = 2 and EF[B] = 3.
Pick C and get ES[C] = EF [A] = 2. Then, EF [C] = ES[C] + 2 = 4. Set d[C] = A.
Algorithm 3 Forward Pass Algorithm
1: Order the nodes lexicographically
2: ES[start] = EF [start] = 0
3: while the early start time of the finish node has not been fixed do
4: Let j be the smallest index of an unprocessed node
5: ES[j] = max_{i:(i,j)∈A} {EF[i]}
6: EF[j] = ES[j] + activity time of j
7: d[j] = argmax_{i:(i,j)∈A} {EF[i]}
8: end while
Pick D and get ES[D] = max{EF [A], EF [B]} = max{2, 3} = 3. Then, EF [D] = ES[D] + 4 = 7. Set
d[D] = B.
Pick E and get ES[E] = EF[C] = 4. Then, EF[E] = ES[E] + 4 = 8. Set d[E] = C. Similarly for F: ES[F] = EF[C] = 4, EF[F] = 7, d[F] = C.
Pick G and get ES[G] = max{EF [D], EF [E]} = max{7, 8} = 8. Then, EF [G] = ES[G] + 5 = 13. Set
d[G] = E.
Pick H and get ES[H] = max{EF [F ], EF [G]} = max{7, 13} = 13. Then, EF [H] = ES[H] + 2 = 15.
Set d[H] = G.
Pick finish and get ES[f inish] = EF [f inish] = EF [H] = 15, set d[f inish] = H.
Backward pass
The backward pass begins from the finish node and works backward. Its scope is to identify the latest start (LS) and latest finish (LF) times of all activities.
Assume there is a due date for the project, which can be reasonably set as EF [f inish]. The late
schedule LS[j] is computed as due date - length of longest path from node j to node finish. Note that a
predecessor i of j in this path is closer to node finish than j (we are working backward).
Latest finish time rule: If an activity is an immediate predecessor for just a single activity, its LF equals
the LS of the activity that immediately follows it. If an activity is an immediate predecessor to more than
one activity, its LF is the minimum of all LS values of all activities that immediately follow it:
LF[j] = min_{i:(j,i)∈A} LS[i]
Latest start time rule: the latest start time (LS) of an activity is the difference between its latest finish time and its activity time: LS[j] = LF[j] − activity time of j.
Algorithm 4 Backward Pass Algorithm
1: Order the nodes lexicographically
2: LF [f inish] = LS[f inish] = due date
3: while the late start time of the start node has not been fixed do
4: Let j be the highest index of an unprocessed node
5: LF[j] = min_{i:(j,i)∈A} {LS[i]}
6: LS[j] = LF[j] − activity time of j
7: d′[j] = argmin_{i:(j,i)∈A} {LS[i]}
8: end while
Pick H and get LF[H] = LS[finish] = 15. Then, LS[H] = LF[H] − 2 = 13. Set d′[H] = finish.
Pick G and get LF[G] = LS[H] = 13. Then, LS[G] = LF[G] − 5 = 8. Set d′[G] = H. Similarly for F: LF[F] = LS[H] = 13, LS[F] = 10, d′[F] = H.
Pick E and get LF[E] = LS[G] = 8. Then, LS[E] = LF[E] − 4 = 4. Set d′[E] = G. Similarly for D: LF[D] = LS[G] = 8, LS[D] = 4, d′[D] = G.
Pick C and get LF[C] = min{LS[E], LS[F]} = min{4, 10} = 4. Then, LS[C] = LF[C] − 2 = 2. Set d′[C] = E.
Pick B and get LF[B] = LS[D] = 4. Then, LS[B] = LF[B] − 3 = 1. Set d′[B] = D.
Pick A and get LF[A] = min{LS[C], LS[D]} = min{2, 4} = 2. Then, LS[A] = LF[A] − 2 = 0. Set d′[A] = C.
Pick start and get LF[start] = LS[start] = min{LS[A], LS[B]} = min{0, 1} = 0. Set d′[start] = A.
After computing the ES, EF, LS, and LF times for all activities, compute the slack or free time for each
activity.
Slack is the length of time an activity can be delayed without delaying the entire project: Slack = LS − ES = LF − EF.
Critical activities are the activities with 0 slack value. A path from the start node to the finish node consisting entirely of critical activities is a critical path.
Figure 8: CPM results.
In the above example, A-C-E-G-H is the critical path. The project completion time is EF [H] = LF [H] =
15.
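The forward and backward passes are easy to reproduce in code. The sketch below uses durations and precedence relationships as they can be read off Example 48; treat them as assumptions to be checked against Figure 7.

duration = {"A": 2, "B": 3, "C": 2, "D": 4, "E": 4, "F": 3, "G": 5, "H": 2}
preds = {"A": [], "B": [], "C": ["A"], "D": ["A", "B"], "E": ["C"],
         "F": ["C"], "G": ["D", "E"], "H": ["F", "G"]}
order = ["A", "B", "C", "D", "E", "F", "G", "H"]        # a topological order

ES, EF = {}, {}
for j in order:                                          # forward pass
    ES[j] = max((EF[i] for i in preds[j]), default=0)
    EF[j] = ES[j] + duration[j]
due_date = max(EF.values())                              # project completion time (15)

succs = {j: [k for k in order if j in preds[k]] for j in order}
LF, LS = {}, {}
for j in reversed(order):                                # backward pass
    LF[j] = min((LS[k] for k in succs[j]), default=due_date)
    LS[j] = LF[j] - duration[j]

for j in order:                                          # slack = LS - ES; 0 on the critical path
    print(j, ES[j], EF[j], LS[j], LF[j], LS[j] - ES[j])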
Figure 10: Gantt charts of late times on AON at Figure 7
Example 50.
For both A and B, earliest starts are ES[A] = ES[B] = 0. Earliest finishes are instead EF[A] = 9 and EF[B] = 6.
Pick C and get ES[C] = max{EF [A], EF [B]} = max{9, 6} = 9. Then, EF [C] = ES[C] + 8 = 17. Set
d[C] = A.
Pick D and get ES[D] = max{EF [A], EF [B]} = max{9, 6} = 9. Then, EF [D] = ES[D] + 7 = 16. Set
d[D] = A.
Pick E and get ES[E] = EF [D] = 16. Then, EF [E] = ES[E] + 10 = 26. Set d[E] = D.
Pick F and get ES[F] = max{EF[C], EF[E]} = max{17, 26} = 26. Then, EF[F] = ES[F] + 12 = 38. Set
d[F ] = E.
Backward pass: The lexicographic order is start,A,B,C,D,E,F,end. Set LF[end] = LS[end] = 38 = EF[end] as due date.
Pick F and get LF[F] = LS[end] = 38. Then, LS[F] = LF[F] − 12 = 26. Set d′[F] = end.
Pick E and get LF[E] = LS[F] = 26. Then, LS[E] = LF[E] − 10 = 16. Set d′[E] = F.
Pick D and get LF[D] = LS[E] = 16. Then, LS[D] = LF[D] − 7 = 9. Set d′[D] = E.
Pick C and get LF[C] = LS[F] = 26. Then, LS[C] = LF[C] − 8 = 18. Set d′[C] = F.
Pick B and get LF[B] = min{LS[C], LS[D]} = min{18, 9} = 9. Then, LS[B] = LF[B] − 6 = 3. Set d′[B] = D.
Pick A and get LF[A] = min{LS[C], LS[D]} = min{18, 9} = 9. Then, LS[A] = LF[A] − 9 = 0. Set d′[A] = D.
Pick start and get LF[start] = LS[start] = min{LS[A], LS[B]} = min{0, 3} = 0. Set d′[start] = A.
Example 51 (Reliable Construction Co. project).
Figure 11: The complete project network showing ES and LS (in parenthesis above the node), EF and LF
(in parenthesis below the node) for each activity. The critical path is A-B-C-E-F-J-L-N (bold arrows).
CPM assumes we know a fixed time estimate for each activity and there is no variability in activity times.
On the other hand, PERT is an attempt to correct the shortcoming of CPM by modeling the duration of
each activity as a random variable. For each activity, PERT requires that the project manager estimate
a = optimistic estimate of the activity's duration under the most favorable conditions
b = pessimistic estimate of the activity's duration under the least favorable conditions
m = most likely estimate of the activity's duration
Let Tj be the random variable expressing the duration of activity j (in case of AOA construction, Tij are defined in its place and all formulas are modified accordingly). PERT requires as a first assumption that Tj follows a beta distribution.
The specific definition of a beta distribution need not concern us, but it is important to realize that it
can approximate a wide range of random variables, including many positively skewed, negatively skewed,
and symmetric random variables. If Tj follows a beta distribution, then the mean and variance of Tj may
be approximated by
E(Tj) = (a + 4m + b) / 6
var(Tj) = (b − a)² / 36
Figure 12: Variability in activity times for Example 46. Note that CPM made use of expected times as arc
weights, so the critical path identified is already an appropriate input for the PERT.
PERT requires as a second assumption that the durations of all activities are independent. Then, for any path p in the project network, the mean and variance of the time required to complete the activities on the path are given by
Σ_{j∈p} E(Tj) = expected duration of the activities on path p
Σ_{j∈p} var(Tj) = variance of the duration of the activities on path p
Focusing on the critical path found through CPM (which needs to use the expected times E(Tj) as durations in its computations), now indicated with the random variable CP, PERT makes use of a third assumption: the total project completion time follows a normal probability distribution. This is reasonable whenever the critical path contains enough activities to invoke the Central Limit Theorem. It follows that
E(CP) = Σ_{j∈CP} E(Tj)    and    var(CP) = Σ_{j∈CP} var(Tj)
Given the inputs in Figure 12 and the CPM results in Figure 8, one finds that
E(CP) = E(TA) + E(TC) + E(TE) + E(TG) + E(TH) = 15
var(CP) = var(TA) + var(TC) + var(TE) + var(TG) + var(TH) = 0.11 + 0.11 + 1.00 + 1.78 + 0.11 = 3.11
PERT allows us to reply to questions such as: What is the probability that the project will be completed
within 16 weeks? By relying on the CPM results, we are implicitly assuming that the path A-C-E-G-H remains the critical path regardless of the actual realization of the activity durations.
Given the normal distribution assumption, in general we have
P(CP ≤ X) = P( (CP − E(CP)) / dev(CP) ≤ (X − E(CP)) / dev(CP) ) = P( Z ≤ (X − E(CP)) / dev(CP) )
where Z is a standardized normal random variable with mean 0 and variance 1. The cumulative distribution of Z is tabulated, hence
P(CP ≤ 16) = P( (CP − 15) / 1.76 ≤ (16 − 15) / 1.76 ) = P( Z ≤ 0.57 ) = 0.7156
that is, in 71.56% of the cases we will finish the project in 16 weeks.
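The same probability can be computed directly from the standard normal CDF, for instance:

from math import erf, sqrt

def prob_completion_by(x, mean_cp, var_cp):
    """P(CP <= x) for CP ~ Normal(mean_cp, var_cp), via the standard normal CDF."""
    z = (x - mean_cp) / sqrt(var_cp)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

print(prob_completion_by(16, mean_cp=15, var_cp=3.11))   # about 0.715 (0.7156 with z rounded to 0.57)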
Let us define G = (N, A) as an AON network. Let moreover dj be the duration of activity j ∈ N. We introduce the decision
variable xj to model the starting time of activity j ∈ N . The project completion time can be computed by
min xend
s.t. xj ≥ xi + di ∀(i, j) ∈ A
xstart = 0
xj ≥ 0 ∀j ∈ N
The inequality ensures that activity j cannot start before every predecessor i such that (i, j) ∈ A has finished, i.e., xj ≥ xi + di.
It may occur that the project completion time exceeds the due date D if this was set too tight. In such cases, one needs to reduce the project duration below its normal value by accomplishing some activities in a shorter time at a larger cost. This is generally called crashing a project. Taken together, the shortened activity durations will make it possible to finish the project by the due date at the minimum total cost of crashing. Linear programming can often be used to determine the allocation of resources that minimizes the cost of meeting the due date.
Let cj be the cost for reducing the duration of activity j by one unit and uj the maximum acceptable
duration reduction of j. Along with the above-mentioned xj variables, we further introduce yj to count the number of units of time reduction selected for activity j.
min Σ_{j∈N} cj yj
s.t. yj ≤ uj ∀j ∈ N
xj ≥ xi + di − yi ∀(i, j) ∈ A
xstart = 0
xend ≤ D
xj ≥ 0 ∀j ∈ N
yj ≥ 0 integer ∀j ∈ N
The objective function minimizes the total crashing cost. The first constraint bounds the maximum reduc-
tion of duration for each activity. The second inequality now takes into account the duration reduction by
considering yj . Finally, due date fulfillment is enforced by imposing that xend cannot exceed D.
In case crashing can be performed with a limited budget availability K, the model is modified as follows:
min xend
s.t. Σ_{j∈N} cj yj ≤ K
yj ≤ uj ∀j ∈ N
xj ≥ xi + di − yi ∀(i, j) ∈ A
xstart = 0
xj ≥ 0 ∀j ∈ N
yj ≥ 0 integer ∀j ∈ N
The model computes the least possible project duration with a budget of K units.
Note that by modifying the activity duration, the critical path can be altered.
• Critical path and slack time analyses help pinpoint activities that need to be closely watched
• Project documentation and graphics point out who is responsible for various activities
• Project activities have to be clearly defined, independent, and stable in their relationships
• There is an inherent danger of too much emphasis being placed on the longest, or critical, path
• For PERT, the assumption that the activity durations are independent is difficult to justify. The same holds for the assumption that the critical path found by CPM will always remain the critical path.
Lecture 9
Dynamic Programming
Dynamic programming is a technique that can be used to solve many optimization problems. The main idea
is to transform a complex optimization problem into a sequence of smaller and more tractable ones (divide
and conquer). In particular, the problem is divided into a sequence of stages and states are associated to
each stage. Dynamic programming is based on recursion, principle of optimality and alternatively works in
forward or backward directions. Due to its generality, it is widely applied to several optimization problems.
I. The problem can be divided into stages (sub-parts of the problem) with a decision required at each
stage
II. Each stage has a number of states associated with it. By state, we mean the current condition of the
system. This is the information that is needed at any stage to build a decision and possibly make an
optimal decision.
III. The decision chosen at any stage describes how the state at the current stage is transformed into the state at the next stage.
IV. Given the current state, the optimal decision for each of the remaining stages must not depend on
previously reached states or previously chosen decisions. This idea is known as the principle of
optimality.
V. If the states of the problem have been classified into one of T stages, there must be a recursion that relates the cost or reward earned during stages t, t + 1, . . . , T to the cost or reward earned from stages t + 1, t + 2, . . . , T.
Example 52 (Resource allocation problem). A company has $6000 to invest and 3 investments are available.
If dj thousands of dollars are invested in investment j, then a net present value (in thousands) of rj (dj ) is
obtained:
r1(d1) = 7 d1 + 2 if d1 > 0, and r1(d1) = 0 if d1 = 0;
r2(d2) = 3 d2 + 7 if d2 > 0, and r2(d2) = 0 if d2 = 0;
r3(d3) = 4 d3 + 5 if d3 > 0, and r3(d3) = 0 if d3 = 0.
The amount placed in each investment must be an integer multiple of $1000. How should the company allocate the money to maximize the total net present value? A natural model is
max r1(x1) + r2(x2) + r3(x3)
s.t. x1 + x2 + x3 = 6
x1 , x2 , x3 ≥ 0 integer
where xj is the amount of thousands of dollars invested in j. However, rj (dj ) are not linear functions!
The stages should be chosen so that when one stage remains the problem is easy to solve. Thus, each investment represents a stage and, in general, the decision for investment 3 depends on the decisions previously taken for investments 1 and 2. At each stage, we can invest an amount of d thousand dollars; this is our decision. Due to the problem specifications, d ∈ {0, 1, 2, 3, 4, 5, 6}, and these values are also the possible states at any stage.
We define fk(d) as the maximum net present value that can be obtained from investments k, k + 1, . . . , 3 when d thousand dollars are available for them.
A naive dynamic programming scheme would evaluate fk(d) for each state d at each stage k, but simplifications are possible.
If we consider f3 (d), this is the max net present value obtained by investing d in 3. Then at stage 3,
we need to evaluate f3(0), f3(1), f3(2), f3(3), f3(4), f3(5), f3(6). This can be trivially done by f3(d) = r3(d), because 3 is the starting stage of the recursion (n = 3 in our problem). Moreover, x3(d) = d as all the available
d dollars are invested in 3. We then obtain:
f3 (0) = 0 x3 (0) = 0
f3 (1) = 9 x3 (1) = 1
f3 (2) = 13 x3 (2) = 2
f3 (3) = 17 x3 (3) = 3
f3 (4) = 21 x3 (4) = 4
f3 (5) = 25 x3 (5) = 5
f3 (6) = 29 x3 (6) = 6
In general terms, fn(d) in the backward computation gives the base case (boundary conditions) of the dynamic programming and indicates the beginning of the recursion. In case of forward recursion, f1(d) clearly plays this role.
We move to investment 2 by following the backward procedure and compute f2(d) for each d value by
fk(d) = max_{xk = 0, . . . , d} { rk(xk) + fk+1(d − xk) }
The equation incarnates the principle of optimality: the optimal solution at stage k uses the optimal decisions of later stages. If d thousand dollars are available and we decide to invest xk of them in k, the net present value will be rk(xk) + fk+1(d − xk): rk(xk) is given by investment k, whereas fk+1(d − xk) is the net present value optimally obtained from the later stages with the remaining d − xk.
f2(1) = max_{x2∈{0,1}} {r2(x2) + f3(1 − x2)} = max{r2(0) + f3(1), r2(1) + f3(0)} = max{9, 10} = 10,  x2(1) = 1
f2(2) = max_{x2∈{0,1,2}} {r2(x2) + f3(2 − x2)} = max{r2(0) + f3(2), r2(1) + f3(1), r2(2) + f3(0)} = max{13, 19, 13} = 19,  x2(2) = 1
f2 (3) = 23 x2 (3) = 1
f2 (4) = 27 x2 (4) = 1
f2 (5) = 31 x2 (5) = 1
f2 (6) = 35 x2 (6) = 1
As an example, consider f2(2). The recursion formula evaluates the best that can be achieved in case we
have only $2000 dollars to invest in 2 and 3 (which is a later stage). Or we invest $0 in 2 and we look at
what we can get with $2000 at stage 3 (r2 (0) + f3 (2)), or we invest $1000 in 2 and leave $1000 for stage 3
(r2 (1) + f3 (1)), or finally we invest the full amount of $2000 in 2 and do not invest in stage 3 (r2 (2) + f3 (0)).
The maximum net present value that we can get among these three alternatives is chosen as f2(2).
Note that x2(d) = j if the best f2(d) value is achieved in the recursion for x2 = j. It is used to keep track of the optimal decisions when rebuilding the solution.
Finally, it is the turn of f1 (d). We have that d = 6 is the total amount of dollars available and f1 (6)
gives the max net present value by investing $6000 among 1, 2 and 3. So, we can reduce the computation as
there is no need to evaluate f1 (0), . . . , f1 (5) (we want to spend all the available budget) and f1 (6) identifies
the optimal value. In any dynamic programming procedure there exists at least one state that codifies the complete original problem.
f1(6) = max_{0≤x1≤6, int} {r1(x1) + f2(6 − x1)} = max{r1(0) + f2(6), r1(1) + f2(5), r1(2) + f2(4), r1(3) + f2(3), r1(4) + f2(2), r1(5) + f2(1), r1(6) + f2(0)} = max{35, 40, 43, 46, 49, 47, 44} = 49,  x1(6) = 4
Since x1 (6) = 4, we invest $4000 in investment 1. This leaves 6000-4000=$2000 for investments 2 and 3.
Hence, we invest x2 (2) = $1000 in investment 2. Then $1000 is left for investment 3, so we choose to invest
x3(1) = $1000 in investment 3. Therefore, the maximum net present value is f1(6) = $49000, obtained by investing $4000, $1000 and $1000 in investments 1, 2 and 3, respectively.
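The backward recursion of this example can be written in a few lines of Python; the reward functions are those given above.

def r(k, d):                                   # net present values (in $1000s) from the example
    if d == 0:
        return 0
    return {1: 7 * d + 2, 2: 3 * d + 7, 3: 4 * d + 5}[k]

budget = 6
f = {(4, d): 0 for d in range(budget + 1)}     # after the last investment nothing is earned
best = {}
for k in (3, 2, 1):                            # stages, backward
    for d in range(budget + 1):                # states: money still available
        candidates = {x: r(k, x) + f[(k + 1, d - x)] for x in range(d + 1)}
        best[(k, d)] = max(candidates, key=candidates.get)
        f[(k, d)] = candidates[best[(k, d)]]

d = budget
for k in (1, 2, 3):                            # rebuild the optimal allocation: 4, 1, 1
    print("invest", best[(k, d)], "in investment", k)
    d -= best[(k, d)]
print("optimal NPV:", f[(1, budget)])          # 49 (i.e. $49000)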
Note that finding the solution is equivalent to finding the longest route from node (1, 6) to the auxiliary end node of the network in Figure 13.
Figure 13
Each node corresponds to a state, and each column of nodes corresponds to all the possible states
associated with a given stage. For example, if we are at node (2, 3), then we have $3000 dollars available for
investments 2 and 3. Each arc in the network represents the way in which a decision (how much to invest
on the current investment) transforms the current state into next investment’s state. In general, this is an
arc connecting (t, d) and (t + 1, d − x) mimicking the recursive formula for an investment of x at stage t.
For example, the arc joining nodes (2, 3) and (3, 1) corresponds to invest $2000 dollars in 2, leaving $1000
available for investment 3. Hence, the length of each arc is rt (x) corresponding to the net present value
obtained by investing x thousand dollars in investment t (in the above example, the length is r2 (2) = 13).
Note that not all pairs of nodes in adjacent stages are joined by arcs. This happens when it is not possible
to go from a certain state (e.g. (2,3)) to another state (e.g. (3,4), which would require increasing the dollars available).
Consider now the 0-1 knapsack problem
max Σ_{j=1}^{n} cj xj
s.t. Σ_{j=1}^{n} aj xj ≤ b
xj ∈ {0, 1} ∀j = 1, . . . , n
A dynamic programming scheme aims at computing fk(d) as the maximum value of a subset of the items 1, . . . , k that fits into a knapsack of capacity d. At each stage k we
decide if item k is inserted in the knapsack or not. This time, we use forward recursion. Given that our
original instance has n items and a knapsack capacity b, the optimal value is fn (b). Boundary conditions
are given by
f1(d) = c1 if a1 ≤ d, and f1(d) = 0 otherwise,
which specifies the solution value for the first item j = 1 and for each knapsack capacity d = 1, . . . , b.
We indicate with fk(d) the maximum profit attainable at stage k with a capacity equal to d and items 1, . . . , k available.
By exploiting the principle of optimality, the dynamic programming recursion for k = 2, . . . , n and
d = 0, . . . , b is
fk(d) = fk−1(d)                                   if ak > d
fk(d) = max{fk−1(d), ck + fk−1(d − ak)}           if ak ≤ d
The former case applies when item k does not fit in the remaining capacity d of the knapsack. The latter applies when k fits: including k provides a profit of ck to the solution, added to the optimal solution value fk−1(d − ak) of the knapsack subproblem with capacity d − ak and item subset 1, . . . , k − 1. This is chosen only if the total solution value is better than the optimal value fk−1(d) of the subproblem with capacity d and item subset 1, . . . , k − 1. The above cases can be compactly rewritten as
fk(d) = max{fk−1(d), ck + fk−1(d − ak)}, with the convention that fk−1(d − ak) = −∞ whenever d − ak < 0.
To trace the optimal solution, xk (d) = 0 or xk (d) = 1 are alternatively set to indicate the exclusion/selection
of item k from the optimal solution of subproblem with capacity d and items 1, . . . , k.
Consider the instance with profits (16, 19, 23, 28), weights (2, 3, 4, 5) and capacity b = 7:
max 16x1 + 19x2 + 23x3 + 28x4
s.t. 2x1 + 3x2 + 4x3 + 5x4 ≤ 7
xj ∈ {0, 1} ∀j = 1, . . . , 4
Then, according to the dynamic programming recursion:
f1(d) = 0 for d = 0, 1 and f1(d) = 16 for d = 2, . . . , 7 (item 1 is packed whenever it fits).
f2(d) = f1(d) = 0 if d = 0, 1,  x2(0) = x2(1) = 0
f2(2) = f1(2) = 16,  x2(2) = 0
f2(3) = max{f1(3), 19 + f1(0)} = max{16, 19} = 19,  x2(3) = 1
f2(4) = max{f1(4), 19 + f1(1)} = max{16, 19} = 19,  x2(4) = 1
f2(5) = max{f1(5), 19 + f1(2)} = max{16, 19 + 16} = 35,  x2(5) = 1
f2(6) = max{f1(6), 19 + f1(3)} = max{16, 19 + 16} = 35,  x2(6) = 1
f2(7) = max{f1(7), 19 + f1(4)} = max{16, 19 + 16} = 35,  x2(7) = 1
For capacity d = 0, 1, neither of items 1 and 2 fit. When d = 2, item 2 does not fit, but from f1 (2) we know
that item 1 is packed in the knapsack. For d = 3, 4 both items do not fit together, so the most valuable one
is selected, that is item 2. For d = 5, 6, 7 both items fit in the knapsack as the capacity left by item 2 is
respectively 2, 3 and 4 which is enough to host also item 1 as testified by f1 (2), f1 (3), f1 (4) > 0.
Finally, the instance of 0-1 knapsack has a capacity of b = 7, so at the final item 4 it is sufficient to compute
f4(7) = max{f3(7), 28 + f3(2)} = max{42, 28 + 16} = 44,  x4(7) = 1
To rebuild the optimal solution, x4 (7) = 1 thus leaving a capacity of 2 to be evaluated at stage 3. Here,
x3 (2) = 0 and f3 (2) = f2 (2). At stage 2, x2 (2) = 0 and f2 (2) = f1 (2). Finally, x1 (2) = 1. The optimal
solution of value 44 is x1 = x4 = 1, x2 = x3 = 0.
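A minimal implementation of the forward recursion follows; the profits (16, 19, 23, 28) and weights (2, 3, 4, 5) are those used in the computations above and should be treated as assumptions about the instance.

c = [16, 19, 23, 28]
a = [2, 3, 4, 5]
b = 7
n = len(c)

f = [[0] * (b + 1) for _ in range(n + 1)]      # f[k][d], with f[0][d] = 0
for k in range(1, n + 1):
    for d in range(b + 1):
        f[k][d] = f[k - 1][d]                  # item k excluded
        if a[k - 1] <= d:                      # item k fits: try including it
            f[k][d] = max(f[k][d], c[k - 1] + f[k - 1][d - a[k - 1]])

x, d = [0] * n, b
for k in range(n, 0, -1):                      # trace back the selected items
    if f[k][d] != f[k - 1][d]:
        x[k - 1], d = 1, d - a[k - 1]
print(f[n][b], x)                              # 44, [1, 0, 0, 1]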
Finding the solution of the 0-1 knapsack is equivalent to finding the longest route from node s = (7, 1) to the last column of the network in Figure 14.
Figure 14
The network is a grid with a row per capacity unit d = 0, . . . , b and n + 1 columns. At node (5,3) we are at the
subproblem with capacity 5, items 3 and 4 left to be possibly packed. There is an arc connecting (d, k) and
(d − ak , k + 1) mimicking the recursive formula for selecting the item k and reducing the available capacity by
ak . From (5,3), if item 3 is selected, the arc brings to (1,4) as a3 = 4. Moreover, arcs (d, k) − (d, k + 1) are set
to represent the exclusion of item k from the solution. The length of an arc between (d, k) and (d − ak , k + 1)
corresponds to the value ck gained by selecting item k (23 in the above example on item 3), whereas it is 0
for arcs (d, k) − (d, k + 1). Arcs are missing whenever the transition among states is not feasible according to the recursion.
Consider now the general integer knapsack problem, in which multiple copies of each item may be selected:
max Σ_{j=1}^{n} cj xj
s.t. Σ_{j=1}^{n} aj xj ≤ b
xj ≥ 0 integer ∀j = 1, . . . , n
One way to proceed follows the previous framework but, when computing fk(d) (item k, capacity d), one considers a profit equal to ck · ⌊d/ak⌋, where the rounded-down term is the maximum number of copies of k that fit in the current knapsack capacity.
Let us now change perspective. While previously stages were associated with items and states were derived from the knapsack capacity, we now proceed the other way around. The approach we now discuss builds up the optimal knapsack by first determining how to fill a small knapsack optimally and then, using this information, how to fill a larger knapsack optimally. The recursion is the following:
g(w) = max_{j: aj ≤ w} { cj + g(w − aj) }
The idea is that to fill a knapsack of capacity w optimally, we try to fill it with each of the items j that fit within it. Having selected a j, the best value that can be obtained is cj plus the best that can be done on a knapsack with capacity w − aj. We indicate with x(w) = j if item j is chosen in the best solution at g(w) and x(w) = 0 if none of the items is selected. The optimal value of the problem is given by g(b). Finally, boundary conditions are g(w) = 0 for every capacity w smaller than the smallest item weight.
Consider the instance with profits (11, 7, 12), weights (4, 3, 5) and capacity b = 10:
max 11x1 + 7x2 + 12x3
s.t. 4x1 + 3x2 + 5x3 ≤ 10
xj ≥ 0 integer ∀j = 1, . . . , 3
Firstly, we see that min_{j=1,...,n} {aj} = 3. Hence, boundary conditions are g(0) = g(1) = g(2) = 0, as none of the items fit within knapsacks of capacities 0, 1, 2.
For w = 3, only item 2 fits, so g(3) = 7 + g(0) = 7 and x(3) = 2. Then, for w = 4:
g(4) = max{ 11 + g(0) = 11 if j = 1;  7 + g(1) = 7 if j = 2 }
thus, the optimal solution for the general knapsack problem with capacity 4 is g(4) = 11 and x(4) = 1.
thus, g(5) = 12 and x(5) = 3.
For w = 6, we obtain
g(6) = max{ 11 + g(2) = 11 if j = 1;  7 + g(3) = 7 + 7 = 14 if j = 2;  12 + g(1) = 12 if j = 3 }
thus, g(6) = 14 and x(6) = 2. In the subsequent cases (* is used to identify optimal states):
g(7) = max{ 11 + g(3) = 11 + 7 = 18* if j = 1;  7 + g(4) = 7 + 11 = 18* if j = 2;  12 + g(2) = 12 if j = 3 }
g(8) = max{ 11 + g(4) = 22* if j = 1;  7 + g(5) = 19 if j = 2;  12 + g(3) = 19 if j = 3 }
g(9) = max{ 11 + g(5) = 23* if j = 1;  7 + g(6) = 21 if j = 2;  12 + g(4) = 23* if j = 3 }
g(10) = max{ 11 + g(6) = 25* if j = 1;  7 + g(7) = 25* if j = 2;  12 + g(5) = 24 if j = 3 }
The optimal solution for the original instance of the general knapsack has value g(10) = 25, where we may arbitrarily choose x(10) = 1 or x(10) = 2. Assume we pick x(10) = 1; then x(10 − 4) = x(6) = 2 at g(6). Furthermore, x(6 − 3) = x(3) = 2 at g(3), which leads us to x(3 − 3) = x(0) = 0. The optimal solution is thus x1 = 1, x2 = 2, x3 = 0.
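The capacity-based recursion g(w) can be coded directly; the data (profits 11, 7, 12, weights 4, 3, 5, capacity 10) are those of the worked example and should be treated as assumptions about the instance.

c = [11, 7, 12]
a = [4, 3, 5]
b = 10

g = [0] * (b + 1)                              # g(w) = 0 when no item fits
choice = [0] * (b + 1)                         # item picked at capacity w (0 = none)
for w in range(1, b + 1):
    for j in range(len(c)):
        if a[j] <= w and c[j] + g[w - a[j]] > g[w]:
            g[w] = c[j] + g[w - a[j]]
            choice[w] = j + 1

x, w = [0] * len(c), b
while choice[w]:                               # rebuild the number of copies of each item
    j = choice[w]
    x[j - 1] += 1
    w -= a[j - 1]
print(g[b], x)                                 # 25, [1, 2, 0]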
Shortest path algorithms exploit relationships among optimal paths (and path lengths) for different pairs of
nodes. We consider the single source shortest path problem. Recall from the previous lecture the functional
equations, now specified for the minimization case:
v[s] = 0
v[j] = min_{i:(i,j)∈A} { v[i] + cij }   ∀j ∈ N \ {s}
We refer to v[j] as the length of the shortest s − j path. The first equality is the boundary condition, that is, v[s] = 0 initially. Then, at stage (node) j all its predecessors are evaluated
in terms of path length v[i] + cij , where cij is the length of arc (i, j). If no such arc exists, v[j] = ∞ to flag
the unfeasibility. Among all these measures, the shortest one is selected. Finally, the predecessor of j in the shortest path is stored in d[j].
Functional equations of a dynamic program encode the recursive connections among optimal solution
values that are implied by a principle of optimality: a shortest path can be constructed by extending
known shortest subpaths. In a graph with no negative directed cycles, optimal paths must have optimal
subpaths.
Example 55 (Single source shortest path). Consider the network G = (N, A) depicted in the figure below
and compute the single source shortest path optimal solution from node s = 1.
v[1] = 0
v[5] = ∞
The ambiguities that emerge in some stages due to circular dependencies (e.g. v[3] calling v[4] and v[4] depending on v[3]) make it clear that it is not possible to solve the functional equations by one-pass evaluation when there are directed cycles in the network. For this reason, a repeated evaluation scheme must be
devised.
Bellman-Ford Algorithm
The algorithm makes use of multiple evaluations (up to |N|) of each v[j], using the results from the preceding iteration.
Step 1 initializes the shortest path lengths v^0[j] to 0 for node s and +∞ for the other nodes. The iteration counter is set to 1 at step 2. Up to |N| times the algorithm recomputes the value v^t[j] of each node and picks the minimum value (step 6) among the incoming neighbors of j (i.e., nodes i for which (i, j) exists). If no such neighbor exists, v^t[j] simply takes the previous value v^{t−1}[j] (step 5). Whenever v^t[j] changes from the previous iteration value v^{t−1}[j], d[j] is updated as well (step 7). The last iteration t = |N| is used to detect negative cost directed cycles (i.e. dicycles): if there exists a node j whose shortest path distance is updated (v^{|N|}[j] < v^{|N|−1}[j]), then a negative dicycle necessarily exists. One can implement a routine findNegativeDicycle() to identify it; the optimal value diverges to −∞. Thus, the algorithm works on networks with negative costs, but in the presence of negative dicycles it can only detect them, since the value diverges. If no modification occurs at iteration |N|, v^{|N|}[j] contains the length of the shortest directed s − j path, which can be rebuilt by exploiting d[j].
If we have a whole iteration without changing any node's value, no more changes can occur; this further termination criterion can be added to the pseudocode. Remark that in a graph with n nodes a path can contain at most n − 1 arcs, hence in at most n − 1 iterations the shortest path solution is computed. For this reason, the n-th iteration is used as an additional one to detect negative dicycles (path lengths can still be decreasing at iteration n only if a negative dicycle exists).
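A synchronous implementation of this repeated evaluation scheme, with the early termination and the negative-dicycle check, may look as follows; the arc costs of the small test instance are read off Example 56 and are assumptions to the extent the figure is not reproduced here.

def bellman_ford(nodes, arcs, s):
    """arcs: list of (i, j, cost). Returns (v, d) or raises on a negative dicycle."""
    v = {j: float("inf") for j in nodes}
    d = {j: None for j in nodes}                 # predecessor labels
    v[s] = 0
    for t in range(1, len(nodes) + 1):           # up to |N| iterations
        new_v = dict(v)                          # v^t computed from v^{t-1}
        for (i, j, cost) in arcs:
            if v[i] + cost < new_v[j]:
                new_v[j], d[j] = v[i] + cost, i
        if new_v == v:                           # no change: shortest paths found
            return v, d
        if t == len(nodes):                      # still improving at iteration |N|
            raise ValueError("negative directed cycle reachable from the source")
        v = new_v

nodes = [1, 2, 3, 4]
arcs = [(1, 2, 5), (1, 3, 8), (2, 3, -10), (4, 3, 3)]
print(bellman_ford(nodes, arcs, 1))              # v[2] = 5, v[3] = -5, v[4] = inf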
Example 56 (Bellman-Ford algorithm). Consider the network in figure below and apply the Bellman-Ford
algorithm.
Firstly, initialize v^0[1] = 0, v^0[2] = v^0[3] = v^0[4] = +∞ and d[1] = 0 (no predecessor).
Increase t = 1 and get:
v^1[1] = 0
v^1[2] = v^0[1] + 5 = 5   update d[2] = 1
v^1[3] = min{v^0[1] + 8, v^0[2] − 10, v^0[4] + 3} = min{8, +∞, +∞} = 8   update d[3] = 1
v^1[4] = +∞
Increase t = 2 and get:
v^2[1] = 0,   v^2[2] = 5   no update
v^2[3] = min{v^1[1] + 8, v^1[2] − 10, v^1[4] + 3} = min{8, −5, +∞} = −5   update d[3] = 2
v^2[4] = +∞
Increase t = 3 and get:
v^3[1] = 0,   v^3[2] = 5
v^3[3] = min{v^2[1] + 8, v^2[2] − 10, v^2[4] + 3} = min{8, −5, +∞} = −5   no update
v^3[4] = +∞
No value has been updated, so we can terminate. Path 1-2 is optimal to reach node 2 at cost v^3[2] = 5, as d[2] = 1.
Path 1-2-3 is optimal to reach node 3 at cost v^3[3] = −5 and is built starting from d[3] = 2. There exists no path from 1 to 4, as v^3[4] = +∞.
Example 57 (Bellman-Ford algorithm: negative dicycle). The network in figure below exhibits a negative
dicycle, which is a cycle whose arcs total cost sum to a negative quantity.
We run the Bellman-Ford algorithm by initializing v^0[1] = 0 (1 is the source node) and v^0[2] = v^0[3] = v^0[4] = +∞. Increase t = 1 and get:
v^1[1] = 0
v^1[4] = v^0[3] + 12 = +∞
Increase t = 2 and calculate:
v 2 [1] = 0
v 3 [1] = 0
v 4 [1] = 0
Given that we updated the shortest path lengths for nodes 2 and 3 in the last iteration, there exists a negative dicycle and the algorithm stops.
Example 58 (Bellman-Ford algorithm: example). Find the single source shortest path optimal solution from node 1 for the network in the figure.
The optimal paths are highlighted by bold red arrows in the next figure.
Path 1-2 of cost 2.
The minimax path problem asks to find an s − t path that minimizes the maximum arc cost used along the path. The opposite problem is the maximin path problem. In this discussion, we assume to work with a directed
and acyclic network G = (N, A). In such network, nodes can be partitioned into subsets and each one will
correspond to a stage of the dynamic programming algorithm (see the figure in example 58 to get the idea).
A dynamic programming scheme aims at computing fk (i) which indicates the maximum arc cost of a path
from node i at stage k with the minimum maximum arc cost among all the possible i − t paths (therefore,
following an optimal policy in stages k, k + 1, . . . , t). We use backward recursion. Given that our original
instance asks for an s − t path, the optimal value is f1(s). Boundary conditions are given for the nodes directly connected with t; that is, if t is at stage m, boundary conditions are set for the nodes at stage m − 1:
f_{m−1}(i) = c_{it}
By exploiting the principle of optimality, the dynamic programming recursion for k = 0, . . . , m − 2 and
nodes i in stage k is
f_k(i) = min_{j:(i,j)∈A} { max{ cij, f_{k+1}(j) } }
where j may be any node such that there is an arc connecting i and j. For a fixed j, the recursion takes as f_k(i) the highest arc cost between the arc (i, j) and the highest cost already identified among the arcs of the path j − t (value f_{k+1}(j)). As usual, we use d(i) to record the node preceding i in the backward recursion (i.e., the node following i on the optimal path), so as to rebuild the path at the end.
We want to find a 1 − 10 path in which the maximum altitude (given by the arc weights) is minimized.
Boundary conditions are f4(7) = 13, f4(8) = 8, f4(9) = 9 and d(7) = d(8) = d(9) = 10.
f3(6) = min{ max{c67, f4(7)} = max{8, 13} = 13 if j = 7;  max{c68, f4(8)} = max{6, 8} = 8 if j = 8;  max{c69, f4(9)} = max{7, 9} = 9 if j = 9 } = 8, with d(6) = 8
Proceeding similarly at stage 2, one obtains f2(2) = 9 and d(2) = 5;
f2(3) = 8 and d(3) = 5;
f2(4) = min{ max{c45, f3(5)} = max{11, 8} = 11 if j = 5;  max{c46, f3(6)} = max{7, 8} = 8 if j = 6 } = 8, with d(4) = 6
Finally, at stage 1
f1(1) = min{ max{c12, f2(2)} = max{10, 9} = 10 if j = 2;  max{c13, f2(3)} = max{7, 8} = 8 if j = 3;  max{c14, f2(4)} = max{6, 8} = 8 if j = 4 }
thus the optimal value is f1(1) = 8, with d(1) alternatively set to 3 or 4. Assume that d(1) = 3. Then, the optimal path is rebuilt by following the d(·) labels from node 1 (1, then d(1) = 3, then d(3) = 5, and so on) until node 10 is reached.
The above method does not work on general (non-acyclic) networks. In this case, the following codification is used:
f_k^i = highest arc cost of the best i − t path among all the paths that use at most k arcs
f_{k+1}^i = min_{j:(i,j)∈A} { max[ cij, f_k^j ] }
The new recursion is based on repeated evaluations, similarly to the Bellman-Ford algorithm for the shortest path problem.
A student is currently taking three courses. It is important that he not fail all of them. If the probability
of failing French is p1 , the probability of failing English is p2 , and the probability of failing Statistics is p3 ,
then the probability of failing all of them is p1 × p2 × p3. He has left himself four hours to study. How should he minimize his probability of failing all his courses?
The following gives the probability of failing each course given he studies for a certain number of hours
on that subject.
We let stage 1 correspond to studying French, stage 2 for English, and stage 3 for Statistics. The state
will correspond to the number of hours studying for that stage and all following stages. Let ft (x) be the
probability of failing t and all following courses, assuming x hours are available. Denote the entries in the
above table as pt (k) , the probability of failing course t given k hours are spent on it.
f3(x) = p3(x)   ∀x ≤ 4
f_t(x) = min_{k=0,...,x} { p_t(k) · f_{t+1}(x − k) }   for t = 1, 2
and dt(x) = k is used to record the k value at which the minimum is attained.
Starting the dynamic programming, boundary conditions are f3 (0) = 0.9, f3 (1) = 0.7, f3 (2) = 0.6, f3 (3) =
f2(1) = min{ p2(0) · f3(1) = 0.75 · 0.7 = 0.52 if k = 0;  p2(1) · f3(0) = 0.7 · 0.9 = 0.63 if k = 1 }
hence f2 (1) = 0.52 and d2 (1) = 0. By iterating, one obtains the following:
Table 2: Stage 2 results.
Finally, in stage 1 it is sufficient to compute the value for x = 4 (the full study time available):
f1(4) = min{ p1(0) · f2(4) = 0.8 · 0.37 = 0.30 if k = 0;
             p1(1) · f2(3) = 0.7 · 0.41 = 0.287 if k = 1;
             p1(2) · f2(2) = 0.65 · 0.45 = 0.293 if k = 2;
             p1(3) · f2(1) = 0.62 · 0.52 = 0.32 if k = 3;
             p1(4) · f2(0) = 0.6 · 0.68 = 0.41 if k = 4 }
The overall optimal strategy is given by f1 (4) = 0.287 with d1 (4) = 1. That is, spend one hour on French, no
time on English (d2 (3) = 0) and three on Statistics (d3 (3) = 3). The probability of failing all three courses
is about 29%.
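The same backward recursion can be coded for an arbitrary probability table. Note that the table below is only illustrative (the full table of the example is not reproduced in these notes), so the printed plan need not coincide with the one above.

hours = 4
p = {1: [0.80, 0.70, 0.65, 0.62, 0.60],        # French: p_1(k) for k = 0..4
     2: [0.75, 0.70, 0.67, 0.65, 0.62],        # English (hypothetical values)
     3: [0.90, 0.70, 0.60, 0.55, 0.50]}        # Statistics (hypothetical values)

f = {(4, x): 1.0 for x in range(hours + 1)}    # after the last course nothing is multiplied
plan = {}
for t in (3, 2, 1):                            # stages, backward
    for x in range(hours + 1):                 # hours still available
        options = {k: p[t][k] * f[(t + 1, x - k)] for k in range(x + 1)}
        plan[(t, x)] = min(options, key=options.get)
        f[(t, x)] = options[plan[(t, x)]]

x = hours
for t in (1, 2, 3):                            # rebuild the study plan
    print("course", t, "->", plan[(t, x)], "hours")
    x -= plan[(t, x)]
print("P(fail all three):", round(f[(1, hours)], 3))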