
IE400 - Notebook

Deniz Akkaya July 29, 2024

Lecture 1

Mathematical Models

Operations research, shortened as OR, is a discipline that deals with the development and application of

analytical methods to improve decision-making. OR is a decision-making tool(box) based on mathematical

modeling and analysis to aid in determining how best to design and operate a system usually under the

conditions requiring allocation of scarce resources. Mathematical modeling is the process of describing a real-world problem in mathematical terms, usually in the form of equations, and then using these equations

both to help understand the original problem, and also to discover new features about the problem.

Example 1 (The Milk Production Problem). There are two machines needed to process milk to produce

low-fat and regular milk. The processing times are:

          Machine 1 (Pasteurization)   Machine 2 (Homogenization)

Low-fat   3 min/lt                     1 min/lt

Regular   2 min/lt                     2 min/lt

In a day, machine 1 can be used for 12 hours and machine 2 can be used for 6 hours. The sales price of

low-fat milk is 2.5 TL/lt and that of regular milk is 1.5 TL/lt. Find how much low-fat and regular milk

should be produced in a day to maximize total revenue.

We follow this procedure:

1) What are the unknowns that should be determined by decision maker?

x1 := amount in liters of low-fat milk produced in a day,

x2 := amount in liters of regular milk produced in a day.

These are the decision variables.

2) What are the technical factors which are not under control of the decision-maker?

• Available machine hours

• Required machine hours/product

• Revenue per unit of each product

These are the parameters.

3) What are the physical limitations?

• Total production time on each machine cannot exceed the available machine hours.

• Production amounts should be nonnegative

These are the constraints.

4) What is the measure used to compare different decisions?

• Total revenue

This is the objective function.

Then we have the following model:

max 2.5x1 + 1.5x2 Total revenue

subject to 3x1 + 2x2 ≤ 720 Total production time on Machine 1 ≤ 12 hours

x1 + 2x2 ≤ 360 Total production time on Machine 2 ≤ 6 hours

x1 , x2 ≥ 0 Production amounts ≥ 0

This is an example of a linear programming (LP) model.
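The model can be checked numerically: in two variables, an optimum of a bounded LP is attained at a corner point of the feasible region (a fact established later in these notes), so it suffices to enumerate intersections of constraint boundaries. The following sketch is an illustration added here, not part of the original lecture:

```python
from fractions import Fraction
from itertools import combinations

# Constraints written as a1*x1 + a2*x2 <= b, including nonnegativity.
rows = [(3, 2, 720),    # machine 1: 3 x1 + 2 x2 <= 720 minutes
        (1, 2, 360),    # machine 2: x1 + 2 x2 <= 360 minutes
        (-1, 0, 0),     # x1 >= 0
        (0, -1, 0)]     # x2 >= 0

corners = []
for (a1, a2, b1), (c1, c2, b2) in combinations(rows, 2):
    det = a1 * c2 - a2 * c1
    if det == 0:
        continue                       # parallel boundary lines
    x1 = Fraction(b1 * c2 - a2 * b2, det)
    x2 = Fraction(a1 * b2 - b1 * c1, det)
    if all(r1 * x1 + r2 * x2 <= rb for r1, r2, rb in rows):
        corners.append((x1, x2))       # intersection is inside the region

best = max(corners, key=lambda p: Fraction(5, 2) * p[0] + Fraction(3, 2) * p[1])
revenue = Fraction(5, 2) * best[0] + Fraction(3, 2) * best[1]
print(best, revenue)   # optimal corner (240, 0) with revenue 600 TL
```

Note that only the Machine 1 constraint binds at the optimum (240, 0); the intersection point (180, 90) of the two machine constraints earns only 585 TL.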

• Any set of values of the decision variables that satisfies all of the constraints of a mathematical model simultaneously is called a feasible solution.

• The set of all feasible solutions is called the feasible set or the feasible region.

• The objective is to find a feasible solution that yields the best objective function value.

• If such a feasible solution exists, it is called an optimal solution.

• The objective function value evaluated at an optimal solution is called the optimal value.

The Linear Programming (LP) Problem

An LP problem is an optimization problem in which (∗) the objective function is given by a linear combination

of the decision variables and (∗∗) the constraints are linear inequalities or linear equations.

(∗) min (or max) c1 x1 + c2 x2 + · · · + cn xn

where c1 , . . . , cn ∈ R are given parameters and x1 , . . . , xn are the decision variables.


 
 ≤

 


  
(∗∗) a1 x1 + a2 x2 + · · · + an xn = b

 


 
≥
 

where a1 , . . . , an , b ∈ R are given parameters and x1 , . . . , xn are the decision variables.

A general form of an LP problem can be written as

max (min) c1 x1 + c2 x2 + · · · + cn xn ,   x1 , . . . , xn are the decision variables

s.t. ai1 x1 + ai2 x2 + · · · + ain xn ≤ bi , ∀i = 1, 2, . . . , L, L ” ≤ ” constraints

ai1 x1 + ai2 x2 + · · · + ain xn ≥ bi , ∀i = L + 1, L + 2, . . . , L + P, P ” ≥ ” constraints

ai1 x1 + ai2 x2 + · · · + ain xn = bi , ∀i = L + P + 1, . . . , L + P + Q, Q ” = ” constraints

The sign restrictions are

x1 , x2 , . . . , xm ≥ 0,

xm+1 , xm+2 , . . . , xm+s free,

xm+s+1 , xm+s+2 , . . . , xn ≤ 0.

Here cj , aij , bi ∈ R are the given parameters.

Remark 1. We observe the following:

• The contribution of each decision variable to the objective function or to each constraint is proportional

to the value of that variable.

• The contribution from decision variables are independent from one another and the total contribution

is the sum of the individual contributions.

• Each decision variable is allowed to take fractional values.

• All input parameters (coefficients of the objective function and of the constraints) are known with

certainty.

Example 2 (Blending Problem I). An oil refinery distills crude oil from Saudi Arabia and Venezuela into

gasoline, jet fuel and lubricants. The crude differ in chemical composition and yields. A ton of S.A. crude

oil yields 0.3 tons of gasoline, 0.4 tons of jet fuel and 0.2 tons of lubricants. A ton of Ven. crude oil yields

0.4 tons of gasoline, 0.2 tons of jet fuel and 0.3 tons of lubricants. In both cases, the remaining 0.1 tons per ton of crude oil is lost in refining. At most 90 tons can be purchased daily from S.A. at a cost of $600/ton. At most 60 tons

can be purchased daily from Ven. at a cost of $550/ton. The daily demand is 20 tons of gasoline, 15 tons of

jet fuel, and 5 tons of lubricants. How should the demand be met at a minimum cost?

Sol:

1) Decision variables:

xS := amount in tons of S.A. crude oil bought daily,

xV := amount in tons of Ven. crude oil bought daily,

2) Objective function:

min 600xS + 550xV

3) Constraints:

xS ≤ 90, At most 90 tons can be purchased daily from S.A.

xV ≤ 60, At most 60 tons can be purchased daily from Ven.

0.3xS + 0.4xV ≥ 20,   At least 20 tons of gasoline must be produced

0.4xS + 0.2xV ≥ 15,   At least 15 tons of jet fuel must be produced

0.2xS + 0.3xV ≥ 5,    At least 5 tons of lubricants must be produced

Then we have the following model:

min 600xS + 550xV

s.t. xS ≤ 90,

xV ≤ 60,

0.3xS + 0.4xV ≥ 20,

0.4xS + 0.2xV ≥ 15,

0.2xS + 0.3xV ≥ 5,

xS , xV ≥ 0.
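One way to sanity-check this model is to guess which constraints bind at the optimum and solve them as equalities. Here the gasoline and jet fuel constraints are the natural candidates, since they are what force purchases up. A small exact-arithmetic sketch under that assumed binding set (not a general solution method):

```python
from fractions import Fraction as F

# Assume the gasoline and jet fuel constraints hold with equality:
#   0.3 xS + 0.4 xV = 20,   0.4 xS + 0.2 xV = 15
a, b, c, d = F(3, 10), F(4, 10), F(4, 10), F(2, 10)
det = a * d - b * c                 # determinant of the 2x2 system
xS = (20 * d - b * 15) / det
xV = (a * 15 - 20 * c) / det
cost = 600 * xS + 550 * xV

# Verify the remaining constraints hold at this point.
assert xS <= 90 and xV <= 60
assert F(2, 10) * xS + F(3, 10) * xV >= 5
print(xS, xV, cost)   # 20 35 31250
```

Checking the other candidate corners by hand confirms that buying 20 tons from S.A. and 35 tons from Ven. at a daily cost of $31,250 is indeed the minimum.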

Example 3 (Transportation Problem). There are two power plants generating electricity, and three factories

that need electricity in their production. Capacities of power plants are 1250 and 2000 megawatts, demands

of factories are 1500, 750 and 750 megawatts. Electricity transmission costs per megawatt are given in the

table below

F1 F2 F3

PP1 50 100 60

PP2 30 20 35

How to satisfy the demand of each factory from the two power plants with minimum total cost?

Sol:

1) Decision variables:

xij := amount in megawatts of electricity transferred from power plant i to factory j.

2) Objective function:

min 50x11 + 100x12 + 60x13 + 30x21 + 20x22 + 35x23

3) Constraints:

x11 + x12 + x13 ≤ 1250, At most 1250 megawatts can be transferred from PP1

x21 + x22 + x23 ≤ 2000, At most 2000 megawatts can be transferred from PP2

x11 + x21 ≥ 1500, At least 1500 megawatts needed to be transferred to F1

x12 + x22 ≥ 750, At least 750 megawatts needed to be transferred to F2

x13 + x23 ≥ 750, At least 750 megawatts needed to be transferred to F3

Then we have the following model:

min 50x11 + 100x12 + 60x13 + 30x21 + 20x22 + 35x23

s.t. x11 + x12 + x13 ≤ 1250

x21 + x22 + x23 ≤ 2000,

x11 + x21 ≥ 1500,

x12 + x22 ≥ 750,

x13 + x23 ≥ 750,

x11 , x12 , x13 , x21 , x22 , x23 ≥ 0.
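Because PP2 is cheaper for every factory, intuition says PP2 should run at its full 2000 MW, and PP1 should cover the unavoidable remaining 1000 MW where its cost premium over PP2 is smallest ($20/MW at F1). A quick feasibility-and-cost check of that plan (an illustrative check, not a solver run):

```python
# Candidate plan: PP1 supplies only F1; PP2 covers the rest of the demand.
x = {(1, 1): 1000, (1, 2): 0, (1, 3): 0,
     (2, 1): 500, (2, 2): 750, (2, 3): 750}
cost = {(1, 1): 50, (1, 2): 100, (1, 3): 60,
        (2, 1): 30, (2, 2): 20, (2, 3): 35}

assert x[1, 1] + x[1, 2] + x[1, 3] <= 1250      # PP1 capacity
assert x[2, 1] + x[2, 2] + x[2, 3] <= 2000      # PP2 capacity
assert x[1, 1] + x[2, 1] >= 1500                # F1 demand
assert x[1, 2] + x[2, 2] >= 750                 # F2 demand
assert x[1, 3] + x[2, 3] >= 750                 # F3 demand

total = sum(x[k] * cost[k] for k in x)
print(total)   # 106250
```

The marginal-cost argument above (PP1's premium is smallest at F1) is also why this plan is optimal for the LP.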

Example 4 (General Transportation Problem). Assume that there are m warehouses and n stores. Let ai

denote the amount of stock at warehouse i for i = 1, . . . , m, dj denote the demand of store j for j = 1, . . . , n

and cij denote the cost of transporting a unit of product from warehouse i to store j, for i = 1, . . . , m and

j = 1, . . . , n.

How should the demands of stores be met at a minimum total transportation cost while respecting the

stock limitations of the warehouses?

Sol:

1) Decision variables:

xij := amount of product transferred from warehouse i to store j.

2) Objective function:

min Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij

3) Constraints:

Σ_{i=1}^{m} xij ≥ dj , ∀j = 1, . . . , n,   Demand of each store should be satisfied

Σ_{j=1}^{n} xij ≤ ai , ∀i = 1, . . . , m,   Capacity of each warehouse should not be exceeded

Then we have the following model:

min Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij

s.t. Σ_{i=1}^{m} xij ≥ dj , ∀j = 1, . . . , n,

Σ_{j=1}^{n} xij ≤ ai , ∀i = 1, . . . , m,

xij ≥ 0, ∀i = 1, . . . , m, ∀j = 1, . . . , n.

Example 5 (Blending Problem II). OJ Inc. produces orange juice from 3 different grades of oranges.

Oranges are graded 1 (poor), 2 (medium), and 3 (good) depending on their quality. Each grade of oranges

yield different amount of juice:

Grade Yield (Juice lt/kg)

1 0.4

2 0.5

3 0.6

The company currently has 100 kgs of Grade 1, 150 kgs of Grade 2 and 200 kgs of Grade 3 oranges. This

company produces 3 types of orange juices from these oranges: Superior, Premium and Regular. There are

some production requirements:

Type Minimum average grade Profit per liter Min. daily production

Superior 2.4 1.5 TL 45 lt

Premium 2.2 1 TL 50 lt

Regular 2.0 0.75 TL 100 lt

Assume the company may sell more than the demanded amounts at the same prices. How can the company maximize profit while satisfying the production requirements?

Sol:

1) Decision variables:

xij := amount in kilograms of oranges of Grade i to make orange juice of type j,

where j = 1 is Superior, j = 2 is Premium and j = 3 is Regular.

2) Parameters:

• y1 = 0.4, y2 = 0.5, y3 = 0.6,

• s1 = 100, s2 = 150, s3 = 200,

• a1 = 2.4, a2 = 2.2, a3 = 2,

• p1 = 1.5, p2 = 1, p3 = 0.75,

• d1 = 45, d2 = 50, d3 = 100.

3) Objective function:

max Σ_{i=1}^{3} Σ_{j=1}^{3} yi pj xij

4) Constraints:

x1j + 2x2j + 3x3j ≥ aj (x1j + x2j + x3j ), ∀j = 1, 2, 3 Minimum average grade should be satisfied
Σ_{i=1}^{3} yi xij ≥ dj , ∀j = 1, 2, 3   Minimum daily production should be satisfied

Σ_{j=1}^{3} xij ≤ si , ∀i = 1, 2, 3   Initial stock should not be exceeded

Then we have the following model:

max Σ_{i=1}^{3} Σ_{j=1}^{3} yi pj xij

s.t. x1j + 2x2j + 3x3j ≥ aj (x1j + x2j + x3j ), ∀j = 1, 2, 3

Σ_{i=1}^{3} yi xij ≥ dj , ∀j = 1, 2, 3

Σ_{j=1}^{3} xij ≤ si , ∀i = 1, 2, 3

xij ≥ 0, ∀i, j = 1, 2, 3.
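The minimum-average-grade requirement is a ratio, hence nonlinear as stated, but multiplying through by the (positive) total weight makes it linear. A quick numeric check of the equivalence; the amounts and helper names below are purely illustrative:

```python
# Equivalence of the ratio form and the cleared linear form of the
# average-grade constraint, checked on made-up amounts.
def avg_ok(x1, x2, x3, a):
    """Ratio form: average grade of the blend is at least a."""
    return (1 * x1 + 2 * x2 + 3 * x3) / (x1 + x2 + x3) >= a

def linear_ok(x1, x2, x3, a):
    """Linear form used in the LP model."""
    return 1 * x1 + 2 * x2 + 3 * x3 >= a * (x1 + x2 + x3)

for amounts in [(30.0, 50.0, 20.0), (10.0, 20.0, 70.0)]:
    for a in (2.4, 2.2, 2.0):
        assert avg_ok(*amounts, a) == linear_ok(*amounts, a)
```

The two forms agree whenever the total amount is positive, which is the only case where the blend exists.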

Example 6 (Short-term Investment Planning). You have 12,500 TL at the beginning of year 2021 and

would like to invest your money. Your goal is to maximize your total amount of money at the beginning of

year 2024. You have the following options:

Option   Duration (years)   Total interest rate   Available (at the beginning of)

1        2                  26%                   2021, 2022

2        1                  12%                   2021, 2022, 2023

3        3                  38%                   2021

4        2                  27%                   2022

How do we invest in order to maximize our profit?

Sol: Normally,

"Cash available at time t" = "Cash invested at time t"

+ "Uninvested cash at time t that is carried over to time t + 1"

However, since Option 2 is available at each year, we can always invest in this option rather than keeping

some money for later better options. Hence, for this problem, the following holds:

"Cash available at time t" = "Cash invested at time t"

1) Decision variables:

xij := amount in TL invested in Option i at year 202j,

2) Objective function:

max Σ_{i=1}^{4} xi4

3) Constraints: Total money at the beginning of a year under possible investments

Σ_{i=1}^{4} xi1 = 12500,

Σ_{i=1}^{4} xi2 = 1.12x21 ,

Σ_{i=1}^{4} xi3 = 1.12x22 + 1.26x11 ,

Σ_{i=1}^{4} xi4 = 1.12x23 + 1.26x12 + 1.38x31 + 1.27x42 .

Then we have the following model:

max Σ_{i=1}^{4} xi4

s.t. Σ_{i=1}^{4} xi1 = 12500,

Σ_{i=1}^{4} xi2 = 1.12x21 ,

Σ_{i=1}^{4} xi3 = 1.12x22 + 1.26x11 ,

Σ_{i=1}^{4} xi4 = 1.12x23 + 1.26x12 + 1.38x31 + 1.27x42 ,

xij ≥ 0, ∀i = 1, . . . , 4, ∀j = 1, . . . , 4.
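Since all the cash can flow along a single best chain of investments, the LP's optimal value can be cross-checked with a small dynamic program over the option multipliers. This is an informal check added here, not the simplex method:

```python
# DP over "best total multiplier from 2021 to each later year"; with one
# cash pool, the LP puts everything on the best chain of investments.
# options: (start year, duration in years, growth multiplier)
options = [(2021, 2, 1.26), (2022, 2, 1.26),                   # Option 1
           (2021, 1, 1.12), (2022, 1, 1.12), (2023, 1, 1.12),  # Option 2
           (2021, 3, 1.38),                                    # Option 3
           (2022, 2, 1.27)]                                    # Option 4
best = {2021: 1.0, 2022: 0.0, 2023: 0.0, 2024: 0.0}
for year in (2021, 2022, 2023):
    for start, dur, mult in options:
        if start == year:
            end = year + dur
            best[end] = max(best[end], best[year] * mult)
print(round(12500 * best[2024], 2))   # 17780.0, via Option 2 then Option 4
```

The winning chain is Option 2 in 2021 followed by Option 4 in 2022 (1.12 × 1.27 = 1.4224), slightly beating Option 3 alone (1.38) and Option 1 followed by Option 2 (1.4112).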

Example 7 (Product Mix Problem). Milk chocolate is produced with three ingredients. 1 kg of milk

chocolate contains 0.5 kg of milk, 0.4 kg of cocoa and 0.1 kg of sweetener. Each of these three ingredients

must be processed before they can be used in the production of milk chocolate. The factory has two

departments that can process these ingredients:

Ingredient   Department 1 (hours/kg)   Department 2 (hours/kg)

Milk         0.4                       0.6

Cocoa        0.3                       0.2

Sweetener    0.5                       0.6

Both departments have 150 hours of available time per week for processing. Formulate an LP problem

that maximizes the total amount of milk chocolate produced in a week.

Sol:

1) Decision variables:

xij := amount in kilograms that Department i processes Ingredient j,

y := total amount of milk chocolate produced

where j = 1 is Milk, j = 2 is Cocoa and j = 3 is Sweetener.

2) Parameters:

• a1 = 0.5, a2 = 0.4, a3 = 0.1,

• b11 = 0.4, b12 = 0.3, b13 = 0.5,

• b21 = 0.6, b22 = 0.2, b23 = 0.6.

3) Objective function:

max y

4) Constraints:

Σ_{j=1}^{3} bij xij ≤ 150, ∀i = 1, 2   Available time should not be exceeded

aj y ≤ Σ_{i=1}^{2} xij , ∀j = 1, 2, 3   Bottleneck principle on the total amount of milk chocolate

Then we have the following model

max y
s.t. Σ_{j=1}^{3} bij xij ≤ 150, ∀i = 1, 2

aj y ≤ Σ_{i=1}^{2} xij , ∀j = 1, 2, 3,

xij ≥ 0, ∀i = 1, 2, ∀j = 1, 2, 3.
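The bottleneck constraints say that once the processed amounts are fixed, the output y is capped by the scarcest ingredient relative to the recipe. A tiny illustration with hypothetical processed amounts (not a solution to the model):

```python
from fractions import Fraction as F

# a_j * y <= sum_i x_ij for every ingredient j means
# y <= min_j (processed_j / a_j).
a = [F(1, 2), F(2, 5), F(1, 10)]   # kg of milk, cocoa, sweetener per kg chocolate
processed = [100, 90, 15]          # hypothetical total kg processed of each
y = min(p / req for p, req in zip(processed, a))
print(y)   # 150 -- sweetener is the bottleneck here (15 kg / 0.1)
```

Milk would allow 200 kg and cocoa 225 kg, but the 15 kg of processed sweetener caps production at 150 kg of chocolate.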

Example 8 (Error Minimization Problem). You estimate that there is a linear relationship between the

stock price of a beverage company and the number of beverages sold (in thousands) in a day:

Day Stock Price # of Beverages Sold

1 2 10

2 3 12

3 2 8

4 3 11

5 4 14

You predict the relationship as si = a × ni + b, where si is the stock price and ni is the number of

beverages sold (in thousands) on day i = 1, . . . , 5. For a given choice of a and b, your estimation error on

day i is |si − a × ni − b| for i = 1, . . . , 5. Choose the unknown coefficients a and b such that the largest

estimation error in any one of the 5 days is as small as possible.

Sol:

1) Decision variables: a, b

2) Parameters: si , ni for i = 1, . . . , 5.

3) Objective function:

minimize the largest estimation error:  min_{a,b} max_{i=1,...,5} |si − a × ni − b|

4) Constraints: NONE!

A NONLINEAR MODEL! Try again...

1) Decision variables: a, b and t is defined to be the largest estimation error.

2) Parameters: si , ni for i = 1, . . . , 5.

3) Objective function:

min t

4) Constraints: −t ≤ si − a × ni − b ≤ t for all i = 1, . . . , 5.

Then we have the following model

min t

s.t. si − a × ni − b + t ≥ 0, ∀i = 1, . . . , 5,

si − a × ni − b − t ≤ 0, ∀i = 1, . . . , 5,

a, b, t free.
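The reformulation works because, for any fixed (a, b), the smallest t satisfying all the constraints −t ≤ si − a × ni − b ≤ t is exactly the largest absolute error. A numeric illustration with arbitrary trial coefficients (not the optimal line):

```python
# For fixed (a, b), the minimal feasible t equals the largest estimation
# error, so "min t" reproduces the min-max objective.
data = [(10, 2), (12, 3), (8, 2), (11, 3), (14, 4)]   # (n_i, s_i) pairs

def largest_error(a, b):
    return max(abs(s - a * n - b) for n, s in data)

a, b = 0.33, -1.0            # arbitrary trial values, not the optimum
t = largest_error(a, b)
assert all(-t <= s - a * n - b <= t for n, s in data)   # t is feasible
assert any(abs(s - a * n - b) == t for n, s in data)    # no smaller t works
print(t)
```

The simplex method applied to the LP form then searches over (a, b, t) jointly, which the nonlinear min-max statement does not allow directly.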

Example 9 (Production Planning Problem). XYZ Company produces three products A, B, and C and

must meet the demand for the next four weeks.

Weekly Demand

Product 1 2 3 4

A 25 30 15 22

B 15 16 25 18

C 10 17 12 12

Each unit of Product A requires 3, Product B requires 2 and Product C requires 4 hours of labor. During

a week, 120 hours of regular-time labor is available at a cost of $10/hour. In addition, 25 hours of overtime

labor is available weekly at a cost of $15/hour. No shortages are allowed and XYZ Company has 18 A, 13 B,

and 15 C initial stocks for these products. Provide an LP formulation to determine a production schedule

that minimizes the total labor costs for the next four weeks.

Sol:

1) Decision variables:

xij := amount of product i produced during week j

where i ∈ {A, B, C} and j = 1, 2, 3, 4.

yij := amount of inventory of product i at the end of week j after meeting demand.

where i ∈ {A, B, C} and j = 0, 1, 2, 3, 4.

zj := amount of overtime hours used in week j

where j = 1, 2, 3, 4.

2) Objective function:

min 10 Σ_{j=1}^{4} Σ_{i∈{A,B,C}} Li xij + 5 Σ_{j=1}^{4} zj

3) Parameters: Dij is the demand of product i in week j. Li is the hours of labor requirement for product

i. Si is the initial stock of the product i.

4) Constraints:

yi(j−1) + xij − Dij = yij , ∀i ∈ {A, B, C}, ∀j = 1, 2, 3, 4,   Production constraints

yij ≥ 0, ∀i ∈ {A, B, C}, ∀j = 1, 2, 3, 4,   Demand constraints

Σ_{i∈{A,B,C}} Li xij ≤ 120 + zj , ∀j = 1, 2, 3, 4,   Labor constraints

zj ≤ 25, ∀j = 1, 2, 3, 4,   Overtime constraints

Then we have the following model

min 10 Σ_{j=1}^{4} Σ_{i∈{A,B,C}} Li xij + 5 Σ_{j=1}^{4} zj

s.t. yi(j−1) + xij − Dij = yij , ∀i ∈ {A, B, C}, ∀j = 1, 2, 3, 4,

yi0 = Si , ∀i ∈ {A, B, C},

Σ_{i∈{A,B,C}} Li xij ≤ 120 + zj , ∀j = 1, 2, 3, 4,

zj ≤ 25, ∀j = 1, 2, 3, 4,

xi,j , zj ≥ 0, ∀i ∈ {A, B, C}, ∀j = 1, 2, 3, 4,

yij ≥ 0, ∀i ∈ {A, B, C}, ∀j = 0, 1, 2, 3, 4.
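The inventory-balance constraints can be understood by simulating them forward: each week's ending inventory is last week's inventory plus production minus demand, and the no-shortage condition is that it never goes negative. A small simulation for product A under one made-up plan:

```python
# Simulating y_ij = y_{i,j-1} + x_ij - D_ij for product A only.
D = [25, 30, 15, 22]     # weekly demand for product A (from the table)
S = 18                   # initial stock of product A
x = [10, 28, 15, 22]     # a hypothetical trial plan, not claimed optimal
y = S
for week in range(4):
    y = y + x[week] - D[week]
    assert y >= 0, "shortage in week %d" % (week + 1)
print(y)   # 1 unit of A left in inventory after week 4
```

The LP chooses x and z to satisfy exactly these recursions for all three products at minimum labor cost.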

Homework 1. Rylon Corporation manufactures Brute and Channelle perfumes from a raw material costing

$3/pound.

• Processing 1 pound of the material takes 1 hour and yields 3 ounces of Regular Brute and 4 ounces

of Regular Chanelle. This perfume can be sold directly or can be further processed to make luxury

versions of the perfumes.

• One ounce of Regular Brute can be sold for $7, but 3 hours of processing and $4 of cost will convert

it into one ounce of Luxury Brute, which sells for $18.

• One ounce of Regular Chanelle can be sold for $6, but 2 hours of processing and $4 of cost will convert

it into one ounce of Luxury Chanelle, which sells for $14.

• Rylon has available 4000 pounds of raw material, 6000 hours of processing time, and will sell everything

it produces.

Determine how Rylon can maximize its profit.

Homework 2. Bel Aire Pillow Company needs to manufacture 12,000 pillows in their factory next week.

They have two kinds of employees: line workers (who do the actual work) and supervisors (who oversee the

line workers). Bel Aire currently has 120 line workers and 20 supervisors. Each line worker can make 300

pillows per week. Supervisors do not engage in actual pillow-making.

A recent study done on Bel Aire’s operation concluded that, for proper supervision, there must be at

least 1 supervisor for every 5 line-workers. The study also suggested that Bel Aire might be able to cut its

costs by doing some or all of the following:

• hiring new employees as line workers (the training cost is $100/worker)

• firing some current line workers (the fired worker is given $240 severance pay)

• promoting some current line workers to supervisors (the training cost is $200/worker)

• demoting some current supervisors to line workers (no cost associated with this).

Pay for a line worker is $400/week. Pay for a supervisor is $700/week. All hiring and firing, and training

occur before the upcoming work week, and the costs of those activities will be charged against the upcoming

week. Bel Aire wishes to perform the appropriate hiring, firing, promotions and demotions that will minimize

its operations cost for next week.

Lecture 2

Graphical Solution of 2-Variable LP Problems

Consider a general form LP:

max cx

s.t. Ax ≤ b

x≥0

A line on which all points have the same objective function value is called an isoprofit line for maximization problems (an isocost line for minimization problems). As we move the isoprofit lines in the direction

of c, the total profit increases.

For a maximization problem, the vector c, given by the coefficients of the decision variables in the

objective function, is called the improving direction. On the other hand, −c is the improving direction for

minimization problems.

To solve an LP problem in two variables, we should try to push isoprofit (isocost) lines in the improving

direction as much as possible while still staying in the feasible region.

Example 10.

max x1 + 3x2

s.t. x1 + x2 ≤ 6

−x1 + 2x2 ≤ 8

x1 , x2 ≥ 0

   
1 1 6
 
If we consider this problem in general form we need to define A =  , b =  , c = 1 3 and
−1 2 8
 
x1 
x =  . Then we write the model as
x2

max cx

s.t. Ax ≤ b

x≥0
Desmos code: zxvn4axza3
Let us modify the objective function without changing the constraints:

max x1 − 2x2

s.t. x1 + x2 ≤ 6

−x1 + 2x2 ≤ 8

x1 , x2 ≥ 0

Let us modify the objective function once again without changing the constraints:

max −x1 + x2

s.t. x1 + x2 ≤ 6

−x1 + 2x2 ≤ 8

x1 , x2 ≥ 0

Let us modify the objective function once again without changing the constraints:

max −x1 − x2

s.t. x1 + x2 ≤ 6

−x1 + 2x2 ≤ 8

x1 , x2 ≥ 0

Have you noticed anything about the optimal points?
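The pattern behind the question can be checked directly: for each of the four objectives above, the maximum over the shared feasible region is attained at a corner point. A brute-force check added for illustration:

```python
from fractions import Fraction
from itertools import combinations

rows = [(1, 1, 6),      # x1 + x2 <= 6
        (-1, 2, 8),     # -x1 + 2 x2 <= 8
        (-1, 0, 0),     # x1 >= 0
        (0, -1, 0)]     # x2 >= 0

corners = []
for (a1, a2, b1), (c1, c2, b2) in combinations(rows, 2):
    det = a1 * c2 - a2 * c1
    if det == 0:
        continue
    p = (Fraction(b1 * c2 - a2 * b2, det), Fraction(a1 * b2 - b1 * c1, det))
    if all(r1 * p[0] + r2 * p[1] <= rb for r1, r2, rb in rows):
        corners.append(p)

# The four objective vectors considered above, over the same region.
opt = {}
for c in [(1, 3), (1, -2), (-1, 1), (-1, -1)]:
    opt[c] = max(corners, key=lambda p: c[0] * p[0] + c[1] * p[1])
print(opt)   # each maximizer is a corner (extreme point) of the region
```

Changing the objective moves the optimum from one corner to another, but it always lands on a corner; that is the observation the next section makes precise.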

Extreme Points

A set of points S is a convex set if for any two points x ∈ S and y ∈ S, their convex combination λx+(1−λ)y

is also in S for all λ ∈ [0, 1]. Alternatively, a set is a convex set if the line segment joining any pair of points

in S is contained in S.

Fact 1. Feasible set of an LP X = {x ∈ Rn : Ax ≤ b, x ≥ 0} is a convex set.

A point x is a strict convex combination of x1 and x2 if x = λx1 + (1 − λ)x2 for some λ ∈ (0, 1). A

point P of a convex set S is an extreme point (corner point) if it cannot be represented as a strict convex

combination of two distinct points of S.

In other words, for any convex set S, a point P in S is an extreme point (corner point) if each line

segment that lies completely in S and contains the point P , has P as an endpoint of the line segment.

Example 11. A company wishes to increase the demand for its product through advertising.

• Each minute of radio ad costs $1 and each minute of TV ad costs $2.

• Each minute of radio ad increases the daily demand by 2 units and each minute of TV ad by 7 units.

• The company would wish to place at least 9 minutes of daily ad in total. It wishes to increase daily

demand by at least 28 units.

How can the company meet its advertising requirements at minimum total cost?

Sol: Let x1 be the # of minutes of radio ads and x2 the # of minutes of TV ads purchased. Then we have the

following model

min x1 + 2x2

s.t. 2x1 + 7x2 ≥ 28

x1 + x2 ≥ 9

x1 , x2 ≥ 0

(7, 2) is the optimal solution with objective value 11. (Desmos code: mr1yyjf52o) Now suppose that it cost $4 to place a minute of
radio ad and $14 to place a minute of TV ad. Then we have the following model

min 4x1 + 14x2

s.t. 2x1 + 7x2 ≥ 28

x1 + x2 ≥ 9

x1 , x2 ≥ 0

In this case, every feasible solution on the line segment [(7, 2), (14, 0)] is an optimal solution with the

same optimal value 56.
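The claim can be verified pointwise: every point of the segment between (7, 2) and (14, 0) is feasible and costs 56 under the modified prices. A quick check of the two endpoints and one interior point:

```python
# Feasibility and cost of two corners and a midpoint of the segment.
for x1, x2 in [(7, 2), (14, 0), (10.5, 1)]:
    assert 2 * x1 + 7 * x2 >= 28          # demand-increase requirement
    assert x1 + x2 >= 9                   # total ad-minutes requirement
    assert 4 * x1 + 14 * x2 == 56         # identical cost under new prices
print("all three points cost 56")
```

This happens because the cost vector (4, 14) is a positive multiple of the binding constraint 2x1 + 7x2 ≥ 28, so the isocost line lies flat along that constraint's boundary.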

If every feasible solution on a line segment is optimal, then we say the LP has multiple or alternate optimal solutions.

Example 12. Consider the following LP problem

max x1 + 3x2

s.t. −x1 + 2x2 ≤ 8

x1 , x2 ≥ 0
Desmos code: tbyr6sd1fb
If the objective function for a maximization problem can always be increased by moving in the improving

direction c while still staying in the feasible region, we say that the LP is unbounded. For unbounded

LP maximization problems, there is no optimal solution and the optimal value is defined to be +∞. For

unbounded LP minimization problems, there is no optimal solution and the optimal value is defined to be

−∞.

Example 13. Consider the following LP problem

max x1 + 3x2

s.t. x1 + x2 ≤ 6

−x1 + 2x2 ≤ 8

x2 ≥ 6

x1 , x2 ≥ 0
Desmos code: seenzfvfis
Fact 2. Every LP problem falls into one of the following four cases:

• Case 1: The LP problem has a unique optimal solution.

• Case 2: The LP problem has alternative or multiple optimal solutions.

• Case 3: The LP problem is unbounded, there is no optimal solution.

• Case 4: The LP problem is infeasible, there is no feasible solution.

Theorem 1. If the feasible set {x ∈ Rn : Ax ≤ b, x ≥ 0} is not empty, that is if there is a feasible solution,

then there is an extreme point.

Theorem 2. If an LP has an optimal solution, then there is an extreme point of the feasible region which

is an optimal solution to the LP.

Remark 2. Not every optimal solution needs to be an extreme point.

The graphical method is a method for solving LP’s with two decision variables as follows:

• Step 1: Graph the feasible region.

• Step 2: Draw an isoprofit line (for maximization) or an isocost line (for minimization).

• Step 3: Move parallel to the isoprofit/isocost line in the improving direction.

• Step 4: (If possible) pick the last point in the feasible region that contacts an isoprofit/isocost line.

Homework 3. Consider the following LP problem

min 2x1 + x2

s.t. x1 + 2x2 ≤ 10

x1 + x2 ≤ 6

x1 − x2 ≤ 2

x1 − 2x2 ≤ 1

x1 , x2 ≥ 0.

Solve this problem using the graphical method.

Homework 4. A firm plans to purchase at least 200 quintals of scrap containing high-quality metal X and

low-quality metal Y. It decides that the scrap to be purchased must contain at least 100 quintals of metal X

and not more than 35 quintals of metal Y. The firm can purchase the scrap from two suppliers (A and B) in

unlimited quantities. The percentage of X and Y metals in terms of weight in the scrap supplied by A and

B is given below.

Metals Supplier A Supplier B

X 25% 75%

Y 10% 20%

The price of A’s scrap is $200 per quintal and that of B is $400 per quintal. The firm wants to determine

the quantities that it should buy from the two suppliers so that the total cost is minimized. Formulate a

mathematical model and use graphical method to solve it.

Lecture 3

The Simplex Algorithm

The Simplex algorithm (or Simplex method) is a widely used algorithm for solving linear programming (LP) problems. It starts with an extreme point, moves to a neighboring extreme point in the improving direction, and stops when no neighbor is better.

An LP is in Standard Form if all constraints are equalities with constant non-negative right-hand sides

(RHS) and all variables are non-negative.

In general LP problems can have both equality and inequality constraints. They can have non-negative

and free variables. In order to use the Simplex Method, LP problems should be converted to Standard Form.

Fact 3. Any LP can be brought into standard form.

• If a variable x is taking non-positive values, substitute it with −x̂ with x̂ ≥ 0.

• If a variable x is free, substitute it with x+ − x− with x+ , x− ≥ 0.

• Define a "slack" variable for each of the "≤" constraints to convert the inequality constraint to an equality and a non-negativity constraint.

• Define an "excess (surplus)" variable for each of the "≥" constraints to convert the inequality constraint

into an equality and a non-negativity constraint.

• Multiply an equality constraint by −1 if it has a negative RHS to obtain a non-negative RHS.

Example 14. Consider the following LP problem

max x1 + 3x2

s.t. 2x1 + 3x2 + x3 ≤ 5

4x1 + x2 + 2x3 = −11

3x1 + 4x2 + 2x3 ≥ 8

x1 ≥ 0, x2 ≤ 0, x3 free.

Using the above techniques we obtain the equivalent optimization model in Standard Form

max x1 − 3x̂2

s.t. 2x1 − 3x̂2 + x3+ − x3− + s1 = 5

−4x1 + x̂2 − 2x3+ + 2x3− = 11

3x1 − 4x̂2 + 2x3+ − 2x3− − e3 = 8

x1 , x̂2 , x3+ , x3− , s1 , e3 ≥ 0
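The substitutions can be spot-checked numerically: applying x2 = −x̂2 and x3 = x3+ − x3− to arbitrary values must leave each constraint's left-hand side unchanged, up to the sign flip applied to the second constraint. A tiny illustration with made-up values:

```python
# Spot-check: the substitutions preserve the second constraint's LHS.
x1, x2, x3 = 1.0, -2.0, 0.5              # arbitrary: x2 <= 0, x3 free
x2h = -x2                                # x2 = -x2h with x2h >= 0
x3p, x3m = max(x3, 0.0), max(-x3, 0.0)   # x3 = x3p - x3m, both >= 0

lhs_orig = 4 * x1 + x2 + 2 * x3                    # original second row
lhs_std = -(-4 * x1 + x2h - 2 * x3p + 2 * x3m)     # standard row, sign undone
assert lhs_orig == lhs_std
print(lhs_orig, lhs_std)   # 3.0 3.0
```

The same check works for the other two constraints, with the slack s1 and excess e3 absorbing the inequality gaps.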

Suppose that we converted an LP with m̃ constraints into a standard form. After the conversion, we

have n variables as x1 , x2 , . . . , xn . Then we can write our feasible region as {x ∈ Rn : Ãx = b, x ≥ 0}.

An equality constraint is redundant if it can be expressed as a linear combination of the other equality constraints in the model. For Standard Form LPs we may assume that all redundant constraints are removed and we end up with m constraints, such that rank(A) = m. Then we can write our feasible region

as {x ∈ Rn : Ax = b, x ≥ 0} where A ∈ Rm×n . Under these assumptions Ax = b is solvable since

rank(A) = rank(A|b).

Example 15. Suppose we have the following constraints:

x1 + 2x2 + x3 = 4 (1)

x2 − x3 = 1 (2)

2x1 + 6x2 = 10 (3)

You may multiply Constraint (1) and Constraint (2) by 2 and add them up to obtain Constraint (3). Hence

Constraint (3) may be removed from the system and we will end up with rank(A) = 2.

A solution to Ax = b is called a basic solution if it is obtained by setting n − m variables equal to 0 and solving for the remaining m variables, whose columns are linearly independent. The n − m variables whose

values are set to 0 are called non-basic variables. The remaining m variables are called basic variables. A

basic solution which has all non-negative entries is called a basic feasible solution.

We extract basic solutions from a matrix in standard form as follows:

• Pick m linearly independent columns from A.

• Rearrange the columns of A such that these chosen columns are the first m columns in A.
 
• Matrix A can then be written in the form A = [B N ], where B ∈ Rm×m is invertible and N ∈ Rm×(n−m) .

• Now let x = [xB ; xN ] be the corresponding partition of x. Then

Ax = b ⇒ [B N ][xB ; xN ] = b ⇒ BxB + N xN = b.

• If we fix xN = 0, then xB = B −1 b is the unique solution.

Hence for each such B there is a unique solution x = [B −1 b ; 0]. If B −1 b ≥ 0, then this solution is a basic feasible solution. Such a B is called a basis.

Fact 4. The number of basic feasible solutions is at most (n choose m) = n! / (m! (n − m)!).

Example 16. Consider the system of inequalities X = {(x1 , x2 ) ∈ R2+ : x1 + x2 ≤ 6, x2 ≤ 3}. In standard form we can rewrite it as X = {(x1 , x2 , x3 , x4 ) ∈ R4+ : x1 + x2 + x3 = 6, x2 + x4 = 3}. If we define A = [1 1 1 0; 0 1 0 1] and b = [6; 3], then X = {x ∈ R4 : Ax = b, x ≥ 0}.

We enumerate possible bases:

            
i) B = [x1 x2 ] = [1 1; 0 1]. Fix (x3 , x4 ) = (0, 0); then (x1 , x2 ) = B −1 b = [1 −1; 0 1][6; 3] = (3, 3). This basis yields the basic feasible solution (3, 3, 0, 0).

ii) B = [x1 x3 ] = [1 1; 0 0]. This matrix is not invertible, so it does not give a basic solution.

iii) B = [x1 x4 ] = [1 0; 0 1]. Fix (x2 , x3 ) = (0, 0); then (x1 , x4 ) = B −1 b = (6, 3). This basis yields the basic feasible solution (6, 0, 0, 3).

iv) B = [x2 x3 ] = [1 1; 1 0]. Fix (x1 , x4 ) = (0, 0); then (x2 , x3 ) = B −1 b = [0 1; 1 −1][6; 3] = (3, 3). This basis yields the basic feasible solution (0, 3, 3, 0).

v) B = [x2 x4 ] = [1 0; 1 1]. Fix (x1 , x3 ) = (0, 0); then (x2 , x4 ) = B −1 b = [1 0; −1 1][6; 3] = (6, −3). This basis yields a basic solution that is not feasible.

vi) B = [x3 x4 ] = [1 0; 0 1]. Fix (x1 , x2 ) = (0, 0); then (x3 , x4 ) = B −1 b = (6, 3). This basis yields the basic feasible solution (0, 0, 6, 3).

All basic feasible solutions to this system are (3, 3, 0, 0), (6, 0, 0, 3), (0, 3, 3, 0) and (0, 0, 6, 3).
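This enumeration of bases can be mechanized. The sketch below (added for illustration; `solve2` is a hypothetical helper for 2×2 systems) tries every choice of m columns and keeps the solutions that are basic and feasible:

```python
from fractions import Fraction
from itertools import combinations

A = [[1, 1, 1, 0],
     [0, 1, 0, 1]]
b = [6, 3]
n, m = 4, 2

def solve2(B, rhs):
    """Solve the 2x2 system B y = rhs exactly; None if B is singular."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if det == 0:
        return None
    y1 = Fraction(rhs[0] * B[1][1] - B[0][1] * rhs[1], det)
    y2 = Fraction(B[0][0] * rhs[1] - rhs[0] * B[1][0], det)
    return [y1, y2]

bfs = []
for cols in combinations(range(n), m):      # candidate basic columns
    B = [[A[i][j] for j in cols] for i in range(m)]
    xB = solve2(B, b)
    if xB is None:
        continue                            # singular: not a basis
    x = [Fraction(0)] * n                   # non-basic variables set to 0
    for j, v in zip(cols, xB):
        x[j] = v
    if all(v >= 0 for v in x):              # basic AND feasible
        bfs.append(tuple(x))
print(bfs)   # the same four basic feasible solutions as enumerated above
```

Note that the candidate {x2 , x4 } basis is discarded here for feasibility (it gives x4 = −3), and {x1 , x3 } for singularity, exactly as in cases v) and ii).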

Theorem 3. A point in the feasible region of an LP is an extreme point if and only if it is a basic feasible

solution to the LP.

Theorem 4. If feasible set of an LP is non-empty, then there is at least one basic feasible solution. If an

LP has an optimal solution, then there is a basic feasible solution which is optimal.

For any LP with m constraints, two basic feasible solutions are said to be adjacent if their sets of basic variables have m − 1 basic variables in common.

Example 17. Recall the previous example

General Description of Simplex Algorithm

1) Convert the LP problem into a standard form LP.

2) Obtain a basic feasible solution to the LP. This basic feasible solution is called the initial basic feasible solution. In general, the most recent basic feasible solution is called the current basic feasible solution. Therefore, at the beginning the initial basic feasible solution is the current basic feasible solution.

3) Determine if the current basic feasible solution is an optimal solution or not.

4) If the current basic feasible solution is not optimal, then find an adjacent basic feasible solution with a

better objective function value (one non-basic variable becomes basic and one basic variable becomes

non-basic).

5) Go to Step 3.

Simplex Tableau and Pivot Operations

A system of linear equations in which each equation has a variable with a coefficient 1 in that equation (and

a 0 coefficient in all other equations) is said to be in canonical form.

Suppose we are given an LP in the standard form

max cT x

s.t. Ax = b,

x ≥ 0.

Now introduce an extra variable z representing the objective value and rewrite the initial LP as

max z

s.t. z − cT x = 0,

Ax = b,

x ≥ 0.

We do this in order to understand the contribution of a unit of improvement on the objective value: z +

α1 x1 + · · · + αn xn = β. If αi > 0 then a unit of improvement in i will make z decrease, if αi < 0 then z will

increase.

Now we can put these in a tableau.

 
z     x      RHS
1    −cT      0
0     A       b

Then we pick an initial basic feasible solution. Let B be the sub-matrix of A corresponding to the basic

variables and N be the one for the non-basic variables. Then we may rewrite our table as

 
      z    xB      xN      RHS
z     1   −cTB    −cTN      0
xB    0    B       N        b

Then we transform this tableau into canonical form in two steps. We first multiply the rows corresponding to xB by B^{-1}:

        z    xB       xN       | RHS
  z     1   −c_B^T   −c_N^T    |  0
  xB    0    I        B^{-1}N  | B^{-1}b

Then clear the entries above the matrix I by multiplying the rows corresponding to xB by c_B^T and adding them to the first row:

        z    xB        xN                 | RHS
  z     1    0    c_B^T B^{-1}N − c_N^T   | c_B^T B^{-1}b
  xB    0    I         B^{-1}N            | B^{-1}b

After converting the tableau to the canonical form we check for optimality. Is the current basic feasible solution optimal?

YES, if all entries of c_B^T B^{-1}N − c_N^T are non-negative.
NO, if there is an entry i∗ such that (c_B^T B^{-1}N − c_N^T)[i∗] < 0.

The entries of the vector c_B^T B^{-1}N − c_N^T used for this optimality check are called the reduced cost coefficients.

If the solution is optimal we STOP. Otherwise we pick the most negative entry of the reduced cost coefficients, denote its index by i∗, and swap xi∗ with a basic variable to move to an adjacent solution. This variable xi∗ is called the entering variable.

Now we have to choose a basic variable to swap with the entering variable; such a variable is called the leaving variable. Increasing a non-basic variable may cause a basic variable to take negative values. We observe this change using an argument similar to the one used for the reduced cost coefficients: in a row xj + · · · + αi∗ xi∗ + · · · = βj with αi∗ > 0, increasing xi∗ by βj/αi∗ units drives the basic variable xj down to zero. So we use the ratio test to determine the leaving variable.

Ratio Test: For every row j in which the entering variable i∗ has a positive coefficient, we compute the ratio

ρj = (RHS of row j) / (coefficient of entering variable in row j)

and set j∗ = arg min_j ρj.

The position (j∗, i∗) in the tableau is called the pivot term. Use elementary row operations to make the pivot term 1 and all remaining entries in column i∗ equal to 0, so that the tableau is again in canonical form. Repeat these steps until optimality.
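The loop just described can be sketched in code. Below is a minimal Python sketch of the tableau simplex for problems already in the form max c^T x s.t. Ax ≤ b, x ≥ 0 with b ≥ 0, so that the slack variables provide the initial basic feasible solution. The function name simplex_max and the use of exact Fraction arithmetic are illustrative choices, not part of the lecture, and no anti-cycling safeguard is included.

```python
from fractions import Fraction

def simplex_max(c, A, b):
    """Tableau simplex for: max c^T x  s.t.  Ax <= b, x >= 0, with b >= 0.
    Slack variables form the initial basis. A teaching sketch: it assumes
    the problem is bounded and uses no anti-cycling rule."""
    m, n = len(A), len(c)
    F = Fraction
    # Row 0 is the z-row (z - c^T x = 0); rows 1..m are the constraints.
    T = [[F(-ci) for ci in c] + [F(0)] * m + [F(0)]]
    for i in range(m):
        row = [F(a) for a in A[i]] + [F(0)] * m + [F(b[i])]
        row[n + i] = F(1)                      # slack variable s_i
        T.append(row)
    basis = list(range(n, n + m))              # slacks are basic initially
    while True:
        # Entering variable: most negative reduced cost coefficient.
        j = min(range(n + m), key=lambda k: T[0][k])
        if T[0][j] >= 0:
            break                              # all rcc's >= 0: optimal
        # Ratio test over rows with a positive coefficient in column j.
        rows = [i for i in range(1, m + 1) if T[i][j] > 0]
        r = min(rows, key=lambda i: T[i][-1] / T[i][j])
        # Pivot: make column j canonical (1 in row r, 0 elsewhere).
        T[r] = [v / T[r][j] for v in T[r]]
        for i in range(m + 1):
            if i != r and T[i][j] != 0:
                f = T[i][j]
                T[i] = [v - f * w for v, w in zip(T[i], T[r])]
        basis[r - 1] = j
    x = [F(0)] * (n + m)
    for i, bv in enumerate(basis):
        x[bv] = T[i + 1][-1]
    return T[0][-1], x[:n]
```

On Example 18 below, simplex_max([1, 3], [[1, 1], [-1, 2]], [6, 8]) returns the optimal value 46/3 with x = (4/3, 14/3), matching the tableau computation.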

Example 18. Consider the following LP problem

max x1 + 3x2

s.t. x1 + x2 ≤ 6,

−x1 + 2x2 ≤ 8,

x1 , x2 ≥ 0

We introduce slack variables and the variable representing the objective value and obtain the following

equivalent

max z

s.t. z − x1 − 3x2 = 0,

x1 + x2 + s1 = 6,

−x1 + 2x2 + s2 = 8,

x1 , x2 , s1 , s2 ≥ 0

 
Let's pick (x1, x2, s1, s2) = (0, 0, 6, 8) as a basic feasible solution. So the basic variables will be {s1, s2} and the non-basic variables are {x1, x2}. We construct the simplex tableau.

 
        z    x1    x2    s1    s2 | RHS
  z     1    −1    −3     0     0 |   0
  s1    0     1     1     1     0 |   6
  s2    0    −1     2     0     1 |   8

The tableau is already in canonical form. The most negative reduced cost coefficient belongs to x2, so x2 is the entering variable. We apply the ratio test and pick s2 to be the leaving variable and make the updates on the tableau:
   
        z    x1    x2    s1    s2 | RHS
  z     1    −1    −3     0     0 |   0
  s1    0     1     1     1     0 |   6
  x2    0    −1     2     0     1 |   8

  ⇒

        z    x1    x2    s1    s2 | RHS
  z     1   −5/2    0     0    3/2 |  12
  s1    0    3/2    0     1   −1/2 |   2
  x2    0   −1/2    1     0    1/2 |   4

Now the tableau is in canonical form, so we choose the variable with the most negative coefficient, in this case x1, to be the entering variable. The ratio test only considers positive coefficients, so the only candidate for a leaving variable is s1, and we make the updates on the tableau:

   
        z    x1    x2    s1    s2  | RHS
  z     1   −5/2    0     0    3/2  |  12
  x1    0    3/2    0     1   −1/2  |   2
  x2    0   −1/2    1     0    1/2  |   4

  ⇒

        z    x1    x2    s1    s2  | RHS
  z     1     0     0    5/3   2/3  | 46/3
  x1    0     1     0    2/3  −1/3  |  4/3
  x2    0     0     1    1/3   1/3  | 14/3

Since all entries in the reduced cost coefficient vector are non-negative, the current basic feasible solution is optimal. Hence the optimal solution is (x1, x2, s1, s2) = (4/3, 14/3, 0, 0) with optimal value 46/3.

Remark 3. It is easy to track values of the basic variables and the objective value using the simplex tableau.

Homework 5. Consider the following LP problem

max 3x1 + 4x2

s.t. x1 + x2 ≤ 12,

2x1 + x2 ≤ 16,

x1 , x2 ≥ 0

Solve this problem using the Simplex Method.

Lecture 4

Problematic Cases

• How to solve minimization problems?

There are two ways to solve a minimization problem using the Simplex Method:

A. Instead of solving min cT x s.t. Ax = b, x ≥ 0 solve max −cT x s.t. Ax = b, x ≥ 0 and multiply the

optimal value by -1.

B. During the choice of the entering variable pick the most positive entry instead of the most negative

one. If all reduced cost coefficients are non-positive for a basic feasible solution, then it is optimal.

• What if an LP problem has alternative optimal solutions?

In the Simplex Method, alternative optimal solutions are detected when a non-basic variable has a zero coefficient in row-z of the optimal tableau. If there is no non-basic variable with a zero coefficient in row-z of the optimal tableau, the LP has a unique optimal solution. Even if there is a non-basic variable with a zero coefficient in row-z of the optimal tableau, it is still possible that the LP has no alternative optimal solutions.

If there are two basic feasible solutions which are optimal, then all of their convex combinations are also optimal solutions.

Example 19. Consider the following LP problem

max 2x1 + 4x2

s.t. x1 + 2x2 ≤ 5

x1 + x2 ≤ 4

x1 , x2 ≥ 0

We convert this problem to standard form and get ready for the simplex:

max z

s.t. z − 2x1 − 4x2 = 0

x1 + 2x2 + s1 = 5

x1 + x2 + s2 = 4

x1 , x2 , s1 , s2 ≥ 0

 
We start with a basic feasible solution (x1, x2, s1, s2) = (0, 0, 5, 4) with basic variables {s1, s2} and non-basic variables {x1, x2}. So we construct the tableau:
 
        z    x1    x2    s1    s2 | RHS
  z     1    −2    −4     0     0 |   0
  s1    0     1     2     1     0 |   5
  s2    0     1     1     0     1 |   4

This is already in canonical form, so we pick x2 as the entering variable and s1 as the leaving variable.

So we update the tableau

   
        z    x1    x2    s1    s2 | RHS
  z     1    −2    −4     0     0 |   0
  x2    0     1     2     1     0 |   5
  s2    0     1     1     0     1 |   4

  ⇒

        z    x1    x2    s1    s2 | RHS
  z     1     0     0     2     0 |  10
  x2    0    1/2    1    1/2    0 |  5/2
  s2    0    1/2    0   −1/2    1 |  3/2

 
The current basic feasible solution (x1, x2, s1, s2) = (0, 5/2, 0, 3/2) is optimal. But x1 has a zero coefficient in row-z, so increasing x1 does not change the objective value; let's choose it as the entering variable, which makes s2 the leaving variable. So we update the tableau:

   
        z    x1    x2    s1    s2 | RHS
  z     1     0     0     2     0 |  10
  x2    0    1/2    1    1/2    0 |  5/2
  x1    0    1/2    0   −1/2    1 |  3/2

  ⇒

        z    x1    x2    s1    s2 | RHS
  z     1     0     0     2     0 |  10
  x2    0     0     1     1    −1 |   1
  x1    0     1     0    −1     2 |   3

     
Again, the current basic feasible solution (3, 1, 0, 0) is optimal. So for any λ ∈ [0, 1], the point λ(3, 1, 0, 0) + (1 − λ)(0, 5/2, 0, 3/2) is also optimal.

• What if an LP problem is unbounded?

An unbounded LP occurs in a max (min) problem if there is a nonbasic variable with a negative

(positive) coefficient in row-z and there is no constraint that limits how large we can make this nonbasic

variable. Specifically, an unbounded LP for a max (min) problem occurs when a variable with a negative

(positive) coefficient in row-z has a non-positive coefficient in each constraint. There is an entering variable but no leaving variable, since the ratio test does not give a finite bound.

Example 20. Consider the following LP problem

max x1 + 2x2

s.t. x1 − x2 ≤ 10

x1 ≤ 40

x1 , x2 ≥ 0

We convert this problem to standard form and get ready for simplex:

max z

s.t. z − x1 − 2x2 = 0

x1 − x2 + s1 = 10,

x1 + s2 = 40,

x1 , x2 , s1 , s2 ≥ 0

 
We start with the basic feasible solution (x1, x2, s1, s2) = (0, 0, 10, 40) with basic variables {s1, s2} and non-basic variables {x1, x2}. We construct the tableau:

 
        z    x1    x2    s1    s2 | RHS
  z     1    −1    −2     0     0 |   0
  s1    0     1    −1     1     0 |  10
  s2    0     1     0     0     1 |  40

This is already in canonical form, so we pick x2 as the entering variable; however, it is not possible to pick a leaving variable, since no row has a positive coefficient in the x2 column and the ratio test is inconclusive. Indeed, consider x1 = 0 and x2 = α. Then s1 = 10 + α and s2 = 40 is feasible for any α ≥ 0, and the objective value 2α grows without bound.

• What if the algorithm gets stuck in a loop?

An LP is a degenerate LP if in a basic feasible solution, one of the basic variables takes on a zero value.

This basic feasible solution is called degenerate basic feasible solution. Degeneracy could cost simplex

method extra iterations because when degeneracy occurs, objective function value will not increase.

A cycle in the simplex method is a sequence of K + 1 iterations with corresponding bases B0, . . . , BK, B0, where K ≥ 1. If a cycle occurs, then the algorithm will loop forever among a set of basic feasible solutions and never reach an optimal solution.

In the simplex algorithm, degeneracy is detected when there is a tie for the minimum ratio test. In the

next iteration, the solution is degenerate.

Example 21. In this example we see a degenerate bfs. Consider the following LP problem

max 5x1 + 2x2

s.t. x1 + x2 ≤ 6,

x1 − x2 ≤ 0,

x1 , x2 ≥ 0.

We convert this problem to standard form:

max z

s.t. z − 5x1 − 2x2 = 0

x1 + x2 + s1 = 6,

x1 − x2 + s2 = 0,

x1 , x2 , s1 , s2 ≥ 0

 
We start with the degenerate basic feasible solution (x1, x2, s1, s2) = (0, 0, 6, 0) with basic variables {s1, s2} and non-basic variables {x1, x2}. We construct the tableau:

 
        z    x1    x2    s1    s2 | RHS
  z     1    −5    −2     0     0 |   0
  s1    0     1     1     1     0 |   6
  s2    0     1    −1     0     1 |   0

This is already in canonical form, so we pick x1 as the entering variable and pick s2 as the leaving

variable. We update the tableau:

   
        z    x1    x2    s1    s2 | RHS
  z     1    −5    −2     0     0 |   0
  s1    0     1     1     1     0 |   6
  x1    0     1    −1     0     1 |   0

  ⇒

        z    x1    x2    s1    s2 | RHS
  z     1     0    −7     0     5 |   0
  s1    0     0     2     1    −1 |   6
  x1    0     1    −1     0     1 |   0

The tableau is again in canonical form. So we pick x2 as the entering variable and s1 as the leaving

variable. We update the tableau

   
        z    x1    x2    s1    s2 | RHS
  z     1     0    −7     0     5 |   0
  x2    0     0     2     1    −1 |   6
  x1    0     1    −1     0     1 |   0

  ⇒

        z    x1    x2    s1    s2 | RHS
  z     1     0     0    7/2   3/2 |  21
  x2    0     0     1    1/2  −1/2 |   3
  x1    0     1     0    1/2   1/2 |   3

Bland’s Rule: Assume that slack and excess variables are numbered xn+1, xn+2, . . . . Choose as the entering variable (in a max problem) the variable with a negative coefficient in row-z that has the smallest index. If there is a tie in the ratio test, break the tie so that the variable leaving the basis has the smallest index.

Fact 5. Using Bland’s rule, the Simplex Method terminates in finite time with optimal solution, i.e.

no cycles.
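The two choices Bland’s rule makes can be sketched as small helpers, assuming the tableau keeps a list `basis` of the current basic variable indices (the helper names are mine, not from the lecture):

```python
from fractions import Fraction  # exact arithmetic avoids spurious ties

def bland_entering(zrow, num_vars):
    """Smallest-index variable with a negative row-z coefficient
    (max problem); None means the current bfs is optimal."""
    for j in range(num_vars):
        if zrow[j] < 0:
            return j
    return None

def bland_leaving(col, rhs, basis):
    """Ratio test; a tie is broken in favor of the basic variable
    with the smallest index. Returns the pivot row, or None if the
    LP is unbounded in this direction."""
    candidates = [(Fraction(rhs[i]) / col[i], basis[i], i)
                  for i in range(len(col)) if col[i] > 0]
    if not candidates:
        return None
    return min(candidates)[2]
```

For instance, with pivot column (4, 2), right-hand sides (8, 4) and basic variables x5 and x2, both ratios equal 2, and bland_leaving picks the row of x2, the basic variable with the smaller index.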

Example 22. Consider the following LP problem

max 9x1 + 3x2

s.t. 4x1 + x2 ≤ 8,

2x1 + x2 ≤ 4,

x1 , x2 ≥ 0.

We convert this problem to standard form and get ready for simplex:

max z

s.t. z − 9x1 − 3x2 = 0

4x1 + x2 + x3 = 8,

2x1 + x2 + x4 = 4,

x1 , x2 , x3 , x4 ≥ 0

 
We start with the basic feasible solution (x1, x2, x3, x4) = (0, 0, 8, 4) with basic variables {x3, x4} and non-basic variables {x1, x2}. We construct the tableau:

 
        z    x1    x2    x3    x4 | RHS
  z     1    −9    −3     0     0 |   0
  x3    0     4     1     1     0 |   8
  x4    0     2     1     0     1 |   4

This is already in canonical form, so we pick the variable with the smallest index among those with a negative reduced cost coefficient, x1, as the entering variable. There is a tie in the ratio test (8/4 = 4/2 = 2), so the next basic feasible solution will be degenerate. We break the tie by picking x3, the basic variable with the smallest index, as the leaving variable and update the tableau:

   
        z    x1    x2    x3    x4 | RHS
  z     1    −9    −3     0     0 |   0
  x1    0     4     1     1     0 |   8
  x4    0     2     1     0     1 |   4

  ⇒

        z    x1    x2    x3    x4 | RHS
  z     1     0   −3/4   9/4    0 |  18
  x1    0     1    1/4   1/4    0 |   2
  x4    0     0    1/2  −1/2    1 |   0

Now x2 is the entering variable and x4 is the leaving variable. So we update the tableau:

   
        z    x1    x2    x3    x4 | RHS
  z     1     0   −3/4   9/4    0 |  18
  x1    0     1    1/4   1/4    0 |   2
  x2    0     0    1/2  −1/2    1 |   0

  ⇒

        z    x1    x2    x3    x4 | RHS
  z     1     0     0    3/2   3/2 |  18
  x1    0     1     0    1/2  −1/2 |   2
  x2    0     0     1    −1     2 |   0

 
The current basic feasible solution (x1, x2, x3, x4) = (2, 0, 0, 0) is optimal with the optimal value 18.

Remark 4. While Bland’s pivot rule is theoretically important, from a practical perspective it is quite

inefficient and takes a long time to converge. In practice, other pivot rules are used, and cycling almost

never happens.

Homework 6. Consider the following LP problem

max 10x1 − 57x2 − 9x3 − 24x4

s.t. x1 − 11x2 − 5x3 + 9x4 ≤ 0

x1 − 3x2 − x3 + 2x4 ≤ 0

x1 ≤ 1

x1 , x2 , x3 , x4 ≥ 0.

Solve this LP problem using the Simplex Method. If there are ties during the steps of the algorithm

break them arbitrarily to observe cycles.

• How to find an initial basic feasible solution?

There are two alternatives to find an initial basic feasible solution:

Big-M Method: In order to obtain a trivial basic feasible solution we introduce a non-negative artificial variable for each ”≥” and ”=” constraint. We should force these artificial variables to be equal to 0, so we include them in the objective function with a coefficient M, a very large penalty. Solving the original problem in this new formulation lets us start with a trivial basic feasible solution and, under an appropriate choice of M, find an identical optimal solution. During the iterations of the Simplex Method, after an artificial variable becomes non-basic you can remove it from the problem by deleting the corresponding column, reducing the problem size, because it cannot become basic again in future iterations.

Example 23. Consider the following LP problem

min 2x1 + 3x2

s.t. 2x1 + x2 ≤ 16,

x1 + 3x2 ≥ 20,

x1 + x2 = 10

x1 , x2 ≥ 0.

We convert this problem to standard form and get ready for simplex:

min z

s.t. z − 2x1 − 3x2 = 0,

2x1 + x2 + s1 = 16,

x1 + 3x2 − e2 = 20,

x1 + x2 = 10

x1 , x2 , s1 , e2 ≥ 0.

Now we introduce the artificial variables with a large M,

min z

s.t. z − 2x1 − 3x2 − M a2 − M a3 = 0,

2x1 + x2 + s1 = 16,

x1 + 3x2 − e2 + a2 = 20,

x1 + x2 + a3 = 10

x1 , x2 , s1 , e2 , a2 , a3 ≥ 0.


We start with the basic feasible solution (x1, x2, s1, e2, a2, a3) = (0, 0, 16, 0, 20, 10) with basic variables {s1, a2, a3} and non-basic variables {x1, x2, e2}. We construct the tableau:

 
        z    x1      x2      s1    e2    a2    a3 | RHS
  z     1    −2      −3       0     0    −M    −M |   0
  s1    0     2       1       1     0     0     0 |  16
  a2    0     1       3       0    −1     1     0 |  20
  a3    0     1       1       0     0     0     1 |  10

To bring the tableau into canonical form, we add M times the a2 row and M times the a3 row to row-z:

        z    x1      x2      s1    e2    a2    a3 | RHS
  z     1   2M−2    4M−3      0    −M     0     0 | 30M
  s1    0     2       1       1     0     0     0 |  16
  a2    0     1       3       0    −1     1     0 |  20
  a3    0     1       1       0     0     0     1 |  10

Since this is a minimization problem, we pick the variable with the most positive row-z coefficient, x2, as the entering variable; the ratio test picks a2 as the leaving variable, so we update the tableau:

 
        z    x1      x2      s1    e2    a2    a3 | RHS
  z     1   2M−2    4M−3      0    −M     0     0 | 30M
  s1    0     2       1       1     0     0     0 |  16
  x2    0     1       3       0    −1     1     0 |  20
  a3    0     1       1       0     0     0     1 |  10

After the pivot:

        z      x1       x2   s1     e2        a2     a3 | RHS
  z     1   (2M−3)/3     0    0   (M−3)/3  (3−4M)/3   0 | (60+10M)/3
  s1    0     5/3        0    1     1/3      −1/3     0 |  28/3
  x2    0     1/3        1    0    −1/3       1/3     0 |  20/3
  a3    0     2/3        0    0     1/3      −1/3     1 |  10/3

Since a2 is now non-basic, we delete its column:

        z      x1       x2   s1     e2     a3 | RHS
  z     1   (2M−3)/3     0    0   (M−3)/3   0 | (60+10M)/3
  s1    0     5/3        0    1     1/3     0 |  28/3
  x2    0     1/3        1    0    −1/3     0 |  20/3
  a3    0     2/3        0    0     1/3     1 |  10/3

The tableau is in canonical form, so we pick x1 as the entering variable and a3 as the leaving variable.

Then we update the tableau:

 
        z      x1       x2   s1     e2     a3 | RHS
  z     1   (2M−3)/3     0    0   (M−3)/3   0 | (60+10M)/3
  s1    0     5/3        0    1     1/3     0 |  28/3
  x2    0     1/3        1    0    −1/3     0 |  20/3
  x1    0     2/3        0    0     1/3     1 |  10/3

After the pivot, a3 becomes non-basic and we delete its column:

        z    x1    x2    s1    e2      a3     | RHS
  z     1     0     0     0   −1/2  (3−2M)/2  |  25
  s1    0     0     0     1   −1/2    −5/2    |   1
  x2    0     0     1     0   −1/2    −1/2    |   5
  x1    0     1     0     0    1/2     3/2    |   5

  ⇒

        z    x1    x2    s1    e2  | RHS
  z     1     0     0     0   −1/2 |  25
  s1    0     0     0     1   −1/2 |   1
  x2    0     0     1     0   −1/2 |   5
  x1    0     1     0     0    1/2 |   5
 
The current basic feasible solution (x1, x2, s1, e2) = (5, 5, 1, 0) is the optimal solution with optimal value 25.
Homework 7. Consider the following LP problem

max x1 + x2

s.t. x1 − x2 ≥ 1,

−x1 + x2 ≥ 1,

x1 , x2 ≥ 0.

Solve this problem using the Big-M Method.

Fact 6. Big-M method can provide information about the problem:

– If all artificial variables in the optimal solution equal zero, the solution is optimal for the original LP.

– If any artificial variables are positive in the optimal solution, the problem is infeasible.

– When the LP (with the artificial variables) is solved, the final tableau may indicate that the

LP is unbounded. If the final tableau indicates the LP is unbounded and all artificial variables

in this tableau equal zero, then the original LP is unbounded.

– If the final tableau indicates that the LP is unbounded and at least one artificial variable is

positive, then the original LP is infeasible.

Remark 5. For optimization solvers, it is difficult to determine how large M should be. In general, M is chosen to be at least 100 times larger than the largest coefficient in the original objective

function. Introduction of an arbitrarily large M can cause numerical errors and computational

difficulties. Therefore most solvers solve LP’s by using the two-phase simplex method.

Two-Phase Simplex: In this method, artificial variables are added to the same constraints, then a bfs to the original

LP is found by solving Phase I LP. In Phase I LP, the objective function is to minimize the sum

of all artificial variables. As before, an artificial variable can be removed from the system as

soon as it becomes non-basic. At completion, reintroduce the original LP’s objective function and

determine the optimal solution to the original LP.

Example 24. Consider the following LP problem

min 2x1 + 3x2

s.t. 2x1 + x2 ≤ 16,

x1 + 3x2 ≥ 20,

x1 + x2 = 10

x1 , x2 ≥ 0.

We convert this problem to the appropriate format as before and replace the objective with the sum of the artificial variables:

min z

s.t. z − a2 − a3 = 0,

2x1 + x2 + s1 = 16,

x1 + 3x2 − e2 + a2 = 20,

x1 + x2 + a3 = 10

x1 , x2 , s1 , e2 , a2 , a3 ≥ 0.

 
We start solving Phase I with the basic feasible solution (x1, x2, s1, e2, a2, a3) = (0, 0, 16, 0, 20, 10) with basic variables {s1, a2, a3} and non-basic variables {x1, x2, e2}. We construct the tableau:

 
        z    x1    x2    s1    e2    a2    a3 | RHS
  z     1     0     0     0     0    −1    −1 |   0
  s1    0     2     1     1     0     0     0 |  16
  a2    0     1     3     0    −1     1     0 |  20
  a3    0     1     1     0     0     0     1 |  10

Adding the a2 and a3 rows to row-z to obtain canonical form:

        z    x1    x2    s1    e2    a2    a3 | RHS
  z     1     2     4     0    −1     0     0 |  30
  s1    0     2     1     1     0     0     0 |  16
  a2    0     1     3     0    −1     1     0 |  20
  a3    0     1     1     0     0     0     1 |  10

So we pick x2 as the entering variable and a2 as the leaving variable. We update the tableau

 
        z    x1    x2    s1    e2    a2    a3 | RHS
  z     1    2/3    0     0    1/3  −4/3    0 | 10/3
  s1    0    5/3    0     1    1/3  −1/3    0 | 28/3
  x2    0    1/3    1     0   −1/3   1/3    0 | 20/3
  a3    0    2/3    0     0    1/3  −1/3    1 | 10/3

Since a2 is now non-basic, we delete its column:

        z    x1    x2    s1    e2    a3 | RHS
  z     1    2/3    0     0    1/3    0 | 10/3
  s1    0    5/3    0     1    1/3    0 | 28/3
  x2    0    1/3    1     0   −1/3    0 | 20/3
  a3    0    2/3    0     0    1/3    1 | 10/3

Then we pick x1 as the entering variable and a3 as the leaving variable:

 
        z    x1    x2    s1    e2    a3 | RHS
  z     1    2/3    0     0    1/3    0 | 10/3
  s1    0    5/3    0     1    1/3    0 | 28/3
  x2    0    1/3    1     0   −1/3    0 | 20/3
  x1    0    2/3    0     0    1/3    1 | 10/3

After the pivot:

        z    x1    x2    s1    e2    a3  | RHS
  z     1     0     0     0     0    −1  |   0
  s1    0     0     0     1   −1/2  −5/2 |   1
  x2    0     0     1     0   −1/2  −1/2 |   5
  x1    0     1     0     0    1/2   3/2 |   5

Since a3 is now non-basic, we delete its column:

        z    x1    x2    s1    e2  | RHS
  z     1     0     0     0     0  |   0
  s1    0     0     0     1   −1/2 |   1
  x2    0     0     1     0   −1/2 |   5
  x1    0     1     0     0    1/2 |   5

We obtained a basic feasible solution for the original problem. Now we change the objective and pass to Phase II.


 
        z    x1    x2    s1    e2  | RHS
  z     1    −2    −3     0     0  |   0
  s1    0     0     0     1   −1/2 |   1
  x2    0     0     1     0   −1/2 |   5
  x1    0     1     0     0    1/2 |   5

We restore canonical form by adding 2 times the x1 row and 3 times the x2 row to row-z:

        z    x1    x2    s1    e2  | RHS
  z     1     0     0     0   −1/2 |  25
  s1    0     0     0     1   −1/2 |   1
  x2    0     0     1     0   −1/2 |   5
  x1    0     1     0     0    1/2 |   5

This tableau is in canonical form and, since all row-z coefficients are non-positive for this minimization problem, the basic feasible solution it provides is optimal, so we stop.

Fact 7. We may observe several cases using Two-Phase Method:

– The optimal value of the Phase I is greater than zero. In this case the original LP is infeasible.

– The optimal value of the Phase I is equal to zero and no artificial variables are in the optimal

Phase I basis. Then a basic feasible solution to the original problem is found. Continue to

Phase II by bringing back the original objective function.

– The optimal value of the Phase I is zero and at least one artificial variable is in the optimal

Phase I basis. Then either we perform an additional pivot and get rid of the artificial variable

by passing to an alternate optimal solution, or there was a redundant constraint and we can

delete the constraint with the artificial variable.

Lecture 5

Integer Programming

A linear program plus the additional constraints that some or all of the variables must be integer valued

is called an integer program (IP). When compared to LP’s, IP’s are more realistic and flexible but more

difficult to model and solve.

Example 25. Consider the following problem:

max 3x + 4y

s.t. 5x + 8y ≤ 24,

x, y ≥ 0 and integer.

If we ignore integrality we get x = 24/5, y = 0 and z = 72/5. Variable x is fractional. If we round this

solution up we get x = 5, y = 0 which is infeasible. If we truncate we get x = 4, y = 0 with the objective

z = 12. However, the optimal solution is x = 3, y = 1 and z = 13.

Example 26 (Bus Scheduling). Consider the following table:

Period                      1         2         3            4          5         6
Hours                       0-4 AM    4-8 AM    8 AM-12 PM   12-4 PM    4-8 PM    8 PM-12 AM
Min. # of buses required    4         8         10           7          12        4

• Each bus starts to operate at the beginning of a period and operates for 8 consecutive hours and then

receives 16 hours off.

• For example, a bus operating from 4 AM to 12 PM must be off between 12 PM and 4 AM.

• Find the minimum # of buses to provide the required service.

How do you define your decision variables here?

xi := # of buses operating in period i, i = 1, 2, . . . , 6. With this choice the 8-hour shift structure cannot be expressed mathematically.

xi := # of buses starting to operate in period i, i = 1, 2, . . . , 6. This is a good choice!

We have the following model

min x1 + · · · + x6

s.t. x1 + x6 ≥ 4,

x2 + x1 ≥ 8,

x3 + x2 ≥ 10,

x4 + x3 ≥ 7,

x5 + x4 ≥ 12,

x6 + x5 ≥ 4,

xi ≥ 0 and integer.
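Because the model is small, its optimal value can be checked by exhaustive search. In the sketch below (names mine), constraint i is x_i + x_{i−1} ≥ r_i with indices wrapping around (x_0 = x_6); we enumerate x1, . . . , x5 and set x6 to the smallest value satisfying the two constraints it appears in.

```python
from itertools import product

def min_buses(r):
    """Minimize x1 + ... + x6 subject to x_i + x_{i-1} >= r_i with
    wrap-around indices (x_0 = x_6). Brute force over x1..x5; x6
    appears only in the constraints for periods 1 and 6, so its
    smallest feasible value follows directly."""
    ub = max(r)          # no x_i ever needs to exceed the peak requirement
    best = None
    for x in product(range(ub + 1), repeat=5):
        # constraints for periods 2..5 (0-indexed r[1]..r[4])
        if all(x[i] + x[i - 1] >= r[i] for i in range(1, 5)):
            x6 = max(0, r[5] - x[4], r[0] - x[0])
            total = sum(x) + x6
            if best is None or total < best:
                best = total
    return best
```

For the requirements (4, 8, 10, 7, 12, 4) above this gives 26 buses, attained for instance by (x1, . . . , x6) = (4, 4, 6, 1, 11, 0).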

Example 27 (Work Scheduling). Burger Queen would like to determine the min # of kitchen employees

while meeting daily requirements. Each employee works six days a week and takes the seventh day off.

Days                 Mon   Tue   Wed   Thu   Fri   Sat   Sun
Min. # of workers    8     6     7     6     9     11    9

Formulate this as an integer optimization problem.

Sol: Let xi be the number of workers whose day off is day i, i = 1, . . . , 7. The number of workers on day j can be computed as Σ_{i≠j} xi, and this quantity should be at least the daily requirement, say dj. The total number of kitchen employees is Σ_{i=1}^{7} xi. Hence we have the following formulation:

min  Σ_{i=1}^{7} xi
s.t. Σ_{i≠j} xi ≥ dj,  ∀j = 1, . . . , 7,
     xi ≥ 0 and integer.
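For this particular structure the minimum workforce can also be computed directly: with N workers in total, day j is covered iff x_j ≤ N − d_j, and non-negative integers x_j summing to N exist under these caps iff every cap is non-negative and the caps sum to at least N. A sketch of this reduction follows (function name mine; the closed-form feasibility test is my own derivation from the model, not a statement made in the lecture):

```python
import math

def min_staff(d):
    """Smallest N such that some x >= 0 with sum(x) = N satisfies
    N - x_j >= d_j for every day j. Feasibility of a given N reduces
    to: N >= d_j for all j, and sum_j (N - d_j) >= N."""
    N = math.ceil(sum(d) / 6)   # LP bound: each worker covers 6 days
    while not (all(N >= dj for dj in d)
               and sum(N - dj for dj in d) >= N):
        N += 1
    return N
```

For the demands in the table this returns 11; for instance x = (3, 5, 3, 0, 0, 0, 0) is a feasible day-off schedule for 11 workers.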

Example 28 (The Assignment Problem). There are n people and n jobs to carry out. Each person is assigned to carry out exactly one job. If person i is assigned to job j, then the cost is cij. Find an assignment

of n people to n jobs with minimum total cost.

Sol: Let

xij = 1, if person i is assigned to job j,
      0, otherwise.

Such variables are called binary variables and offer great flexibility in modeling. They are ideal for YES/NO decisions. Then we have the following formulation:

min  Σ_{i=1}^{n} Σ_{j=1}^{n} cij xij
s.t. Σ_{i=1}^{n} xij = 1,  ∀j = 1, . . . , n,
     Σ_{j=1}^{n} xij = 1,  ∀i = 1, . . . , n,
     xij ∈ {0, 1}.
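For small n the formulation can be sanity-checked against brute force over all n! permutations (helper name mine):

```python
from itertools import permutations

def assign(cost):
    """Minimum-cost assignment: permutation p sends person i to job p[i].
    O(n!) enumeration, only a check for small classroom instances."""
    n = len(cost)
    return min((sum(cost[i][p[i]] for i in range(n)), p)
               for p in permutations(range(n)))
```

For the cost matrix [[4, 2], [1, 3]] this returns total cost 3, assigning person 1 to job 2 and person 2 to job 1.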

Example 29 (The 0-1 Knapsack Problem). You are going on a trip. Your suitcase has a capacity of b kg’s. You have n different items 1, . . . , n that you can pack into your suitcase. Item i has a weight of ai and gives you a utility of ci. How should you pack your suitcase to maximize total utility?

Sol: Let

xi = 1, if we pick item i,
     0, otherwise.

Then we have the following formulation:

max  Σ_{i=1}^{n} ci xi
s.t. Σ_{i=1}^{n} ai xi ≤ b,
     xi ∈ {0, 1}.
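A brute-force check of the formulation over all 2^n packing vectors, fine for small classroom instances (names mine):

```python
from itertools import product

def knapsack(weights, utils, cap):
    """Best utility over all binary vectors x with total weight <= cap."""
    best_val, best_x = -1, None
    for x in product((0, 1), repeat=len(weights)):
        if sum(w * xi for w, xi in zip(weights, x)) <= cap:
            val = sum(c * xi for c, xi in zip(utils, x))
            if val > best_val:
                best_val, best_x = val, x
    return best_val, best_x
```

With weights (3, 4, 5), utilities (4, 5, 6) and capacity b = 7, the best pack is items 1 and 2 with total utility 9.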

Example 30 (Workforce Scheduling). Monthly demand for a shoe company is:

Month     1     2     3     4
Demand    300   500   100   100

A pair of shoes requires 4 hours of labor and $5 worth of raw materials. There are 50 pairs of shoes in

stock. Monthly inventory cost is charged for shoes in stock at the end of each month at $30/pair. Currently 3

workers are available each has $1500/month salary and each works up to 160 hours/month. Hiring a worker

costs $1600 and firing a worker costs $2000. How to meet the demand with minimum total cost?

Sol: Let wi be the # of workers available, fi the # of workers fired, and hi the # of workers hired at the beginning of month i. Let yi be the pairs of shoes in stock after meeting the demand at the end of month i, and xi the pairs of shoes produced in month i. Let di be the monthly demand, y0 = 50 and w0 = 3. Then we have the following formulation

min  1500 Σ_{i=1}^{4} wi + 1600 Σ_{i=1}^{4} hi + 2000 Σ_{i=1}^{4} fi + 30 Σ_{i=1}^{4} yi + 5 Σ_{i=1}^{4} xi
s.t. yi = yi−1 + xi − di,  ∀i = 1, . . . , 4,
     wi = wi−1 − fi + hi,  ∀i = 1, . . . , 4,
     4xi ≤ 160wi,  ∀i = 1, . . . , 4,
     xi, yi, wi, hi, fi ≥ 0 and integer.

Example 31 (The Cutting Stock Problem). A paper company manufactures and sells rolls of paper of fixed

width in 5 standard lengths: 5, 8, 12, 15 and 17 m. Demand for each size is given below. It produces 25 m

length rolls and cuts orders from its stock of this size. What is the min # of rolls to be used to meet the

total demand?

Length (meters)   5    8    12   15   17
Demand            40   35   30   25   20

Sol: We start by identifying the cutting patterns.

Then we define xj to be the # of rolls cut according to pattern j. Then we have the following formulation:

min  Σ_{j=1}^{11} xj
s.t. x2 + 2x4 + x6 + 2x7 + x9 + 3x10 + 5x11 ≥ 40,
     x1 + x3 + x6 + 3x8 + 2x9 + x10 ≥ 35,
     2x5 + x6 + x7 ≥ 30,
     x3 + x4 ≥ 25,
     x1 + x2 ≥ 20,
     xj ≥ 0 and integer.

Example 32 (Project Selection). A manager should choose from among 10 different projects:

pj := profit if we invest in project j,

cj := cost of project j,

q := total budget available.

There are also some additional requirements:

i. Projects 3 and 4 cannot be chosen together

ii. Exactly 3 projects are to be selected

iii. If project 2 is selected, then so is project 1

iv. If project 1 is selected, then project 3 should not be selected

v. Either project 1 or project 2 is chosen, but not both

vi. Either both projects 1 and 5 are chosen or neither are chosen

vii. At least two and at most four projects should be chosen from the set {1, 2, 7, 8, 9, 10}

viii. Neither project 5 nor 6 can be chosen unless either 3 or 4 is chosen. If x3 + x4 = 0, then x5 + x6 = 0.

ix. At least two of the investments 1, 2, 3 are chosen, or at most three of the investments 4, 5, 6, 7, 8 are

chosen.

x. If both 1 and 2 are chosen, then at least one of 9 and 10 should also be chosen.

Under these requirements, the objective is to maximize the profit.

Let

xj = 1, if project j is selected,
     0, otherwise.

Then the objective is to maximize Σ_{j=1}^{10} pj xj and the budget constraint is Σ_{j=1}^{10} cj xj ≤ q. Additional requirements can be modeled as

i. x3 + x4 ≤ 1

ii. Σ_{j=1}^{10} xj = 3

iii. x2 ≤ x1

iv. x1 ≤ 1 − x3

v. x1 + x2 = 1

vi. x1 = x5

vii. 2 ≤ x1 + x2 + x7 + x8 + x9 + x10 ≤ 4

viii. x5 + x6 ≤ 2(x3 + x4)

ix. x1 + x2 + x3 ≥ 2y and x4 + x5 + x6 + x7 + x8 ≤ 5 − 2(1 − y), where y ∈ {0, 1} is an auxiliary binary variable

x. x1 + x2 − 1 ≤ x9 + x10
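Linearizations such as (iii) and (x) can be verified by enumerating all binary assignments and comparing each linear constraint with the logical statement it encodes; the check below is illustrative, not from the lecture.

```python
from itertools import product

# (iii) "if project 2 is selected then so is project 1"  <=>  x2 <= x1
for x1, x2 in product((0, 1), repeat=2):
    assert (x2 <= x1) == ((x2 == 0) or (x1 == 1))

# (x) "if both 1 and 2 are chosen, at least one of 9, 10 is chosen"
#     <=>  x1 + x2 - 1 <= x9 + x10
for x1, x2, x9, x10 in product((0, 1), repeat=4):
    implied = (not (x1 and x2)) or (x9 or x10)
    assert (x1 + x2 - 1 <= x9 + x10) == implied
```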

Either-Or Constraints

Suppose we have two constraints f(x1, . . . , xn) ≤ 0 and g(x1, . . . , xn) ≤ 0. We want to make sure that when the model is solved at least one of the constraints is satisfied. We define a binary variable y and a large positive number M, and add the following two constraints to the model: f(x1, . . . , xn) ≤ M y and g(x1, . . . , xn) ≤ M (1 − y).

• M should be chosen large enough to be satisfied by all feasible values of x1, . . . , xn, but small enough for solvers to deal with.

• y in this case is called the switch or the indicator.
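The construction can be checked numerically: for fixed values of f and g at a candidate point and a valid M, some y ∈ {0, 1} satisfies both big-M rows exactly when at least one original constraint holds (function name mine):

```python
def either_or_feasible(f_val, g_val, M):
    """True iff some y in {0,1} satisfies f_val <= M*y and
    g_val <= M*(1 - y); with M large enough this is equivalent to
    f_val <= 0 or g_val <= 0."""
    return any(f_val <= M * y and g_val <= M * (1 - y) for y in (0, 1))
```

For example, with M = 100 the pair (f, g) = (−2, 5) is feasible via y = 0 and (3, 0) via y = 1, while (1, 1), which violates both original constraints, is infeasible for both choices of y.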

Rules for Logical Implications

We can linearize logical implications for use in integer optimization problems by adding several extra variables and constraints to the problem.

• If δ = 1, then Σ_j aj xj ≤ b.

  Define M to be an upper bound on Σ_j aj xj − b. Add the constraint Σ_j aj xj + Mδ ≤ M + b to the model.

• If Σ_j aj xj ≤ b, then δ = 1.

  Define m to be a lower bound on Σ_j aj xj − b. Define a sufficiently small ε. If all data are integers you may pick ε = 1. Add the constraint Σ_j aj xj − (m − ε)δ ≥ b + ε to the model.

• If δ = 1, then Σ_j aj xj ≥ b.

  Define m to be a lower bound on Σ_j aj xj − b. Add the constraint Σ_j aj xj + mδ ≥ m + b to the model.

• If Σ_j aj xj ≥ b, then δ = 1.

  Define M to be an upper bound on Σ_j aj xj − b. Define a sufficiently small ε. If all data are integers you may pick ε = 1. Add the constraint Σ_j aj xj − (M + ε)δ ≤ b − ε to the model.
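The first rule can be tested numerically. With b = 5 and M = 10 taken as an assumed upper bound on Σ_j aj xj − b, the row Σ_j aj xj + Mδ ≤ M + b reduces to Σ_j aj xj ≤ b when δ = 1 and is vacuous when δ = 0 (function name mine):

```python
def big_m_row(ax, b, M, delta):
    """The linearized row for 'if delta = 1 then ax <= b',
    where M is an upper bound on ax - b."""
    return ax + M * delta <= M + b

# delta = 1 forces ax <= b; delta = 0 leaves ax free up to M + b
b, M = 5, 10
assert big_m_row(4, b, M, 1) and not big_m_row(6, b, M, 1)
assert big_m_row(14, b, M, 0)
```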

Homework 8. p =⇒ q is equivalent to ∼q =⇒ ∼p. Use the rules of logical implications on the following three examples:

A. If Σ_j aj xj ≤ b, then Σ_j cj xj ≤ d.

B. If δ = 1, then Σ_j aj xj < b.

C. If Σ_j aj xj < b, then Σ_j cj xj > d.

Example 33 (Modeling Fixed Costs). Suppose we have the following information:

• A set of potential warehouses: 1, . . . , m.

• A set of clients: 1, . . . , n

• If warehouse i is used, then there is a fixed cost of fi , i = 1, . . . , m.

• cij is the unit cost of transportation from warehouse i to client j, i = 1 . . . , m, j = 1, . . . , n.

• ai is the capacity of warehouse i in terms of # units, i = 1, . . . , m.

• dj is the # of units demanded by client j, j = 1, . . . , n.

• More than one warehouse can supply the client.

Sol: Let xij be the amount of product transferred from warehouse i to client j, and let yi be a binary variable indicating whether warehouse i is active or not. Our goal is to minimize Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij + Σ_{i=1}^{m} fi yi. Capacity should not be exceeded: Σ_{j=1}^{n} xij ≤ ai yi. Demand should be satisfied: Σ_{i=1}^{m} xij ≥ dj. Then we have the following model

min  Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij + Σ_{i=1}^{m} fi yi
s.t. Σ_{j=1}^{n} xij ≤ ai yi,  ∀i = 1, . . . , m,
     Σ_{i=1}^{m} xij ≥ dj,  ∀j = 1, . . . , n,
     xij ≥ 0,
     yi ∈ {0, 1}.

Example 34 (Set Covering Problem). You wish to decide where to locate hospitals in a city. Suppose that

there are 5 different towns in the city.

Potential Location / Town    1    2    3    4    5
1                           10   15   20   10   16
2                           15   10    8   14   12
3                            8    6   12   15   10
4                           12   17    7    9   10

Travel time between each town and the nearest hospital should not be more than 10 minutes. What is

the minimum number of hospitals?

Sol: Let xi indicate whether there is a hospital at location i. Our goal is to minimize x1 + x2 + x3 + x4. According to the table, for the first town we have x1 + x3 ≥ 1, and the rest can be found similarly. Then we have the following model,

the following model,

min x1 + x2 + x3 + x4

s.t. x1 + x3 ≥ 1,

x2 + x3 ≥ 1,

x2 + x4 ≥ 1,

x1 + x4 ≥ 1,

x3 + x4 ≥ 1,

xi ∈ {0, 1}
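The model can be checked by enumeration. Reading the 10-minute reachability off the table (location 1 covers towns 1 and 4; location 2 covers 2 and 3; location 3 covers 1, 2 and 5; location 4 covers 3, 4 and 5), an exhaustive search over subsets of locations confirms the minimum (names mine):

```python
from itertools import product

def min_cover(cover, n_towns):
    """Smallest 0/1 vector x over locations such that every town lies
    in some cover[j] with x[j] = 1. Exhaustive over all 2^(#locations)
    subsets, so only suitable for small instances."""
    best = None
    for x in product((0, 1), repeat=len(cover)):
        covered = set()
        for j, xj in enumerate(x):
            if xj:
                covered |= cover[j]
        if len(covered) == n_towns and (best is None or sum(x) < sum(best)):
            best = x
    return best

hospitals = [{1, 4}, {2, 3}, {1, 2, 5}, {3, 4, 5}]
```

Here min_cover(hospitals, 5) selects locations 3 and 4, so two hospitals suffice.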

Example 35 (Location Problems). Ankara municipality is providing service to 6 regions. The travel times

(in minutes) between these regions, say tij values are:

      R1   R2   R3   R4   R5   R6
R1     0   10   20   30   30   20
R2    10    0   25   35   20   10
R3    20   25    0   15   30   20
R4    30   35   15    0   15   25
R5    30   20   30   15    0   14
R6    20   10   20   25   14    0

The municipality wants to build fire stations in some of these regions. Find the minimum number of fire

stations required if the municipality wants to ensure that there is a fire station within 15-minute drive time

from each region.

Let

xi = 1, if a fire station is built at region i,
     0, otherwise.

Our objective is to minimize x1 + · · · + x6. The first region is covered only if x1 = 1 or x2 = 1, since only regions 1 and 2 are within a 15-minute drive, which gives the constraint x1 + x2 ≥ 1. Then we have the following model:

min x1 + · · · + x6

s.t. x1 + x2 ≥ 1,

x1 + x2 + x6 ≥ 1,

x3 + x4 ≥ 1,

x3 + x4 + x5 ≥ 1,

x4 + x5 + x6 ≥ 1,

x2 + x5 + x6 ≥ 1,

xi ∈ {0, 1}

In general, let

aij = 1, if cij ≤ 15,
      0, otherwise.

Then the constraints can be expressed as Σ_{i=1}^{6} aij xi ≥ 1 for each j = 1, . . . , 6.


The municipality wants to build 2 fire stations so that the time to reach the farthest region is kept at a minimum. Let

yj = 1, if a fire station is built at region j,
     0, otherwise,

xij = 1, if region i is served by the fire station at region j,
      0, otherwise,

and let T be the farthest distance. We should have 2 fire stations: y1 + · · · + y6 = 2. If there is no fire station at region j, there should be no assignments: Σ_{i=1}^{6} xij ≤ 6yj. Each region should be assigned to a single fire station: Σ_{j=1}^{6} xij = 1. The farthest distance should be at least any assigned distance: T ≥ cij xij. Our goal is to minimize T. Then we have the following model:

min T
s.t. ∑_{j=1}^{6} yj = 2,
∑_{i=1}^{6} xij ≤ 6yj, ∀j = 1, . . . , 6,
∑_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
T ≥ cij xij, ∀i, j = 1, . . . , 6,
xij, yj ∈ {0, 1},
T ≥ 0.

Given the constraints ∑_{j=1}^{6} xij = 1, the inequalities that bound T can be rewritten as T ≥ ∑_{j=1}^{6} cij xij for each i = 1, . . . , 6.
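With only C(6,2) = 15 candidate station pairs, the optimal T can be confirmed by enumeration: for a fixed pair, each region travels to its nearest chosen station, and T is the largest such time. A small sketch (our own check, reusing the travel-time matrix from the table):

```python
from itertools import combinations

# Travel-time matrix between the 6 regions, copied from the table.
t = [
    [0, 10, 20, 30, 30, 20],
    [10, 0, 25, 35, 20, 10],
    [20, 25, 0, 15, 30, 20],
    [30, 35, 15, 0, 15, 25],
    [30, 20, 30, 15, 0, 14],
    [20, 10, 20, 25, 14, 0],
]
n = 6

# For each pair of station sites, T = max over regions of the travel
# time to the nearest of the two stations; keep the smallest T.
best_T, best_pair = min(
    (max(min(t[i][j] for j in pair) for i in range(n)), pair)
    for pair in combinations(range(n), 2)
)
print(best_T, [j + 1 for j in best_pair])  # 15 [2, 4]
```

Stations at regions 2 and 4 give T = 15, and no pair does better (region 3, for example, is only within 15 minutes of regions 3 and 4 themselves).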

The municipality now wants to build 2 post offices. There is going to be a back and forth trip each day

from each post office to each one of the regions assigned to this post office. Formulate a model which will

minimize thetotal trip time. 


1, if there is a post office at region j 1, if region i is assigned to the post office at region j,

 

Let yj = and xij =
0, otherwise. 0, otherwise.

 

P6
. There are exactly 2 post offices: j=1 yj = 2. If there is no post office at city j, there should be no assign-
P6 P6
ments: i=1 xij ≤ 6yj . Each city should be assigned to a single post office j=1 xij = 1. Then we have to
P6 P6
minimize 2 i=1 j=1 cij xij if cij is the trip time provided in the table. Then we have the following model:

min 2 ∑_{i=1}^{6} ∑_{j=1}^{6} cij xij
s.t. ∑_{j=1}^{6} yj = 2,
∑_{i=1}^{6} xij ≤ 6yj, ∀j = 1, . . . , 6,
∑_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
xij, yj ∈ {0, 1}.
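This model can also be checked by enumerating the 15 office pairs: with no capacity restriction, assigning each region to its nearest chosen office is clearly optimal, and each trip is counted twice (back and forth). A sketch, with the travel-time matrix copied from the table:

```python
from itertools import combinations

# Travel-time matrix between the 6 regions, copied from the table.
t = [
    [0, 10, 20, 30, 30, 20],
    [10, 0, 25, 35, 20, 10],
    [20, 25, 0, 15, 30, 20],
    [30, 35, 15, 0, 15, 25],
    [30, 20, 30, 15, 0, 14],
    [20, 10, 20, 25, 14, 0],
]
n = 6

# Total trip time for a pair of offices = 2 * sum over regions of the
# travel time to the nearest chosen office (back-and-forth trips).
best = min(
    (2 * sum(min(t[i][j] for j in pair) for i in range(n)), pair)
    for pair in combinations(range(n), 2)
)
print(best)  # (100, (1, 3)): offices at regions 2 and 4, total time 100
```

The optimum places the offices at regions 2 and 4; each then serves three regions, so the later capacity-4 restriction would not change this answer.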

What if additional to previous model you have the restriction that a post office cannot provide service to

more than 4 regions including itself?

We include the following constraint: ∑_{i=1}^{6} xij ≤ 4 for all j. Then we have the following model:

min 2 ∑_{i=1}^{6} ∑_{j=1}^{6} cij xij
s.t. ∑_{j=1}^{6} yj = 2,
∑_{i=1}^{6} xij ≤ 6yj, ∀j = 1, . . . , 6,
∑_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
∑_{i=1}^{6} xij ≤ 4, ∀j = 1, . . . , 6,
xij, yj ∈ {0, 1}.

Assume that in addition to the back-and-forth trips between the post offices and the assigned regions,

there will be a trip between the two post offices. The municipality wants to locate the offices in such a way

that the total trip time is minimized.

If we force the diagonal entries to be 0, a post office cannot be assigned to itself; each post office region is then assigned to the other post office, so the objective function also counts the back-and-forth trip between the two post offices.

min 2 ∑_{i=1}^{6} ∑_{j=1}^{6} cij xij
s.t. ∑_{j=1}^{6} yj = 2,
∑_{i=1}^{6} xij ≤ 6yj, ∀j = 1, . . . , 6,
∑_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
∑_{i=1}^{6} xij ≤ 4, ∀j = 1, . . . , 6,
∑_{i=1}^{6} xii = 0,
xij, yj ∈ {0, 1}.

Alternatively, one can introduce an additional binary variable zij such that zij = 1 if post offices are located at both i and j (i ≠ j), and zij = 0 otherwise.

The resulting model is:

min 2 ∑_{i=1}^{6} ∑_{j=1}^{6} cij xij + 2 ∑_{i=1}^{6} ∑_{j=1, j≠i}^{6} cij zij
s.t. ∑_{j=1}^{6} yj = 2,
∑_{i=1}^{6} xij ≤ 6yj, ∀j = 1, . . . , 6,
∑_{j=1}^{6} xij = 1, ∀i = 1, . . . , 6,
∑_{i=1}^{6} xij ≤ 4, ∀j = 1, . . . , 6,
yi + yj − 1 ≤ zij, ∀i, j = 1, . . . , 6 : i ≠ j,
xij, yj, zij ∈ {0, 1}.

The constraints linking yi, yj and zij linearize the logical implication (yi = 1 ∧ yj = 1) =⇒ zij = 1.

Example 36 (Traveling Salesman Problem). A salesperson lives in city 1. She has to visit each of the cities 2, . . . , n exactly once and return home. Let cij be the travel time from city i to city j, i = 1, . . . , n; j = 1, . . . , n. What is the order in which she should make her tour to finish as quickly as possible?

Let xij = 1 if city j is visited directly after city i, and xij = 0 otherwise. Each city has to be entered and left exactly once: ∑_{i=1}^{n} xij = 1 and ∑_{j=1}^{n} xij = 1. Then we have the following formulation:

min ∑_{i=1}^{n} ∑_{j=1}^{n} cij xij
s.t. ∑_{i=1}^{n} xij = 1, ∀j = 1, . . . , n,
∑_{j=1}^{n} xij = 1, ∀i = 1, . . . , n,
xii = 0, ∀i = 1, . . . , n,
xij ∈ {0, 1}, ∀i, j = 1, . . . , n.

Unfortunately, there may be subtours. After identifying a subtour, say S, add the following constraint to the model: ∑_{i∈S} ∑_{j∉S} xij ≥ 1. However, these constraints are exponential in number, as there are 2^n − 1 possible subtours, a very large number. The number of valid tours is even larger, (n − 1)!, that is, the number of possible permutations of the cities to visit.

An alternative formulation with a polynomial number of constraints can be defined by introducing n − 1

additional integer variables ui modelling the position of city i in the tour. The subtour elimination constraints

are then:

ui − uj + nxij ≤ n − 1, ∀i ≠ j : i, j = 2, . . . , n

By setting xij = 0, the constraint is trivially satisfied. When xij = 1, it enforces ui to precede uj . All

together, the constraints correctly define a tour where each ui takes a different value. Although the second

model exhibits a much smaller number of constraints for growing n, it is often not used in practice due to

its weaker LP relaxation.
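For small n, any TSP formulation can be checked against brute-force enumeration of the (n − 1)! tours from the home city. The sketch below uses a hypothetical 4-city symmetric cost matrix (an assumption for illustration, not an instance from the lecture):

```python
from itertools import permutations

# Hypothetical symmetric cost matrix for 4 cities; city 0 is home.
c = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]

def brute_force_tsp(c):
    n = len(c)
    best_cost, best_tour = float("inf"), None
    # Fix city 0 as the start; enumerate the (n-1)! orders of the rest.
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        cost = sum(c[tour[k]][tour[k + 1]] for k in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

print(brute_force_tsp(c))  # (7, (0, 1, 2, 3, 0))
```

Enumerating tours avoids the subtour issue entirely, but the factorial growth makes it hopeless beyond a dozen or so cities, which is exactly why the IP formulations above are needed.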

Homework 9. A company produces animal feed mixture. The mix contains 2 active ingredients and a filler

material. 1 kg of feed mix must contain a minimum quantity of each of the following 4 nutrients:

Nutrient: A B C D
Required amount (in grams) in 1 kg of mix: 90 50 20 2

The ingredients have the following nutrient values and costs:

A B C D Cost/kg

Ingredient1 (gram/kg) 100 80 40 10 $40

Ingredient2 (gram/kg) 200 150 20 N/A $60

a) What should be the amount of active ingredients and filler material in 1 kg of feed mix in order to

satisfy the nutrient requirements at minimum total cost?

b) Suppose now that we have the following additional limitations

• If we use any of ingredient 2, we incur a fixed cost of $15 (in addition to the $60 unit cost).

• It is enough to only satisfy 3 of the nutrition requirements rather than 4.

Lecture 6

Solving Integer Programming Models with Branch-and-Bound

Example 37 (LP-relaxation). Consider the following IP:

max x2

s.t. −x1 + x2 ≤ 1/2

x1 + x2 ≤ 7/2

x1 , x2 ≥ 0, integer

By removing the integrality conditions, we get an LP problem: this is the LP-relaxation and is often a

good starting point for solving IP problems.

If we solve the LP of the example, the optimal solution (3/2, 2) is not integer. By trivially rounding up

or down the value of x1 , the solution becomes infeasible.

The LP-relaxation is a particular relaxation. In general terms, if we consider P = max{c(x) : x ∈ X ⊆ R^n} and its relaxation R = max{r(x) : x ∈ Y ⊆ R^n}, a relaxation fulfills the following conditions:

i. X ⊆ Y

ii. r(x) ≥ c(x) ∀x ∈ X (for maximization problems)

It further follows that two properties hold:

A. If Y = ∅ then X = ∅

B. if y ∗ is optimal for R and x∗ is optimal for P , then r(y ∗ ) ≥ c(x∗ ):

(a) if y ∗ ∈ X, then from ii. the statement holds

(b) if y∗ ∉ X, by optimality we know that r(y∗) ≥ r(x) for any x ∈ X. But from ii. r(x) ≥ c(x) for each x ∈ X, hence r(y∗) ≥ c(x∗).
each x ∈ X, hence r(y ∗ ) ≥ c(x∗ ).

Returning now to the LP-relaxation.

• From B., the optimal value of the LP-relaxation is at least as good as the optimal value of the IP.

• Moreover, if the optimal solution of the LP-relaxation is integer valued, then that solution is also

optimal for the IP:

– If y∗ is the optimum of the LP-relaxation and it is integer, then it is feasible for the IP (y∗ ∈ X).

But from ii. we know that there exists no integer x with a better objective function value. Thus

y ∗ is optimal for the IP.

Example 38 (Knapsack Problem). Consider a knapsack problem:

max 16x1 + 22x2 + 12x3 + 8x4 + 11x5 + 19x6

s.t. 5x1 + 7x2 + 4x3 + 3x4 + 4x5 + 6x6 ≤ 14

xj ∈ {0, 1} ∀j = 1, . . . , 6

The LP-relaxation is obtained by setting 0 ≤ xj ≤ 1 ∀j = 1, . . . , 6.

Each variable has a utility pj, a size aj, and the knapsack has a capacity b. Arrange the variables in non-increasing order of the utility-per-unit-size values pj/aj and fill the knapsack with

respect to this order:

x1 ≥ x6 ≥ x2 ≥ x3 ≥ x5 ≥ x4, since 16/5 ≥ 19/6 ≥ 22/7 ≥ 12/4 ≥ 11/4 ≥ 8/3.

Starting from the first item, set the corresponding variable xj = 1 and reduce the capacity b by aj. When a certain item h does not fit in the current capacity, set its variable to the fractional value xh = (b − ∑_{j<h} aj)/ah.

This simple procedure is commonly known as Dantzig procedure and outputs the LP optimum for the 0-1

knapsack problem.

In our case, we work like this:

A. pick x1 , the starting capacity is 14. We check that the weight 5 < 14 so x1 = 1 and the residual

capacity becomes 9.

B. pick x6 , residual capacity 9. We check that the weight 6 < 9 so x6 = 1 and the residual capacity

becomes 3.

C. pick x2, residual capacity 3. We find that the weight 7 > 3, so x2 = 3/7 and the residual capacity becomes 0.

The optimal solution of the LP-relaxation is x1 = x6 = 1, x2 = 3/7, of value 311/7 ≈ 44.43.
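The Dantzig procedure is a few lines of Python; exact fractions avoid any rounding issues in the bound (a minimal sketch of the procedure described above):

```python
from fractions import Fraction

def dantzig_lp(p, a, b):
    """LP relaxation of the 0-1 knapsack via the Dantzig procedure:
    sort items by utility/size ratio, fill greedily, and take a
    fractional amount of the first item that does not fit."""
    order = sorted(range(len(p)), key=lambda j: Fraction(p[j], a[j]),
                   reverse=True)
    x = [Fraction(0)] * len(p)
    cap = Fraction(b)
    for j in order:
        if a[j] <= cap:
            x[j] = Fraction(1)
            cap -= a[j]
        else:
            x[j] = cap / a[j]  # fractional item, then stop
            break
    return x, sum(p[j] * x[j] for j in range(len(p)))

# Instance of Example 38.
p = [16, 22, 12, 8, 11, 19]
a = [5, 7, 4, 3, 4, 6]
x, value = dantzig_lp(p, a, 14)
print(x, value)  # x1 = x6 = 1, x2 = 3/7, value 311/7
```

Running it reproduces the hand computation: x1 = x6 = 1, x2 = 3/7, with objective value 311/7 ≈ 44.43.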

Hint: in case of a minimization problem with a ≥ constraint, variables are arranged in non-decreasing order of the values pj/aj.

Systematically consider all possible values of the decision variables. If there are n binary variables, then there are 2^n different combinations.

Usual idea: iteratively break the problem in two. At the first iteration, we consider separately the cases x1 = 0 and x1 = 1.

Enumeration tree:

The bottom nodes are called leaves of the tree.

Given the exponential growth of 2^n, evaluating all the possible solutions through complete enumeration rapidly becomes not viable.
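For the n = 6 knapsack of Example 38, complete enumeration means only 2^6 = 64 assignments, so it is still feasible as a sketch (and gives a useful baseline for the B&B that follows):

```python
from itertools import product

# Instance of Example 38.
p = [16, 22, 12, 8, 11, 19]
a = [5, 7, 4, 3, 4, 6]
b = 14

# Complete enumeration: all 2^6 = 64 assignments of the binaries,
# keeping the best feasible one.
best = max(
    (sum(pj * xj for pj, xj in zip(p, x)), x)
    for x in product((0, 1), repeat=6)
    if sum(aj * xj for aj, xj in zip(a, x)) <= b
)
print(best)  # (43, (1, 0, 0, 1, 0, 1))
```

With 64 points this is instant, but at n = 60 the same loop would need 2^60 ≈ 10^18 evaluations, which is precisely why implicit enumeration is needed.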

Solving IPs – Branch-and-Bound

The Branch-and-Bound (B&B) algorithm performs an implicit enumeration to find the IP optimal solution in reasonable time.

The first ingredient used in the B&B is a feasible integer solution, i.e., the incumbent x^I of value z^I. Assume we have an initial incumbent x1 = x2 = 1, x3 = x4 = x5 = x6 = 0 with z^I = 38.

The key idea of the implicit enumeration: select nodes of the “enumeration tree” one at a time, but do not branch from a node if none of its descendants can be a better solution than the incumbent.

For example, for maximization, if z^LP(2) ≤ z^I, that is, the upper bound UB(2) of the subproblem at node 2 has an objective function value no larger than the incumbent value, subtree 2 can be eliminated.

64
Example 39 (Subproblem 5). Consider the knapsack subproblem at node 5:

max 16x1 + 22x2 + 12x3 + 8x4 + 11x5 + 19x6

s.t. 5x1 + 7x2 + 4x3 + 3x4 + 4x5 + 6x6 ≤ 14

x1 = 0, x2 = 1

xj ∈ {0, 1} ∀j = 3, . . . , 6

The optimal LP-relaxation is x∗ = [0, 1, 1/4, 0, 0, 1] with z^LP(5) = 44. Being z^LP(5) > z^I, we need to branch again from node 5. However, if z^I = 45 or z^I = 44 we could have fathomed the node.

If the objective function has integer coefficients, we can tighten the upper bound to UB(j) = ⌊z^LP(j)⌋.

B&B creates the enumeration tree one node at a time and one branch at a time.

The LP relaxation at the current node j is solved and the algorithm either fathoms the node (cases 1, 2, 3) or branches (case 4) by creating two children.

1 infeasibility: z^LP(j) = −∞. The LP subproblem is infeasible, node j is fathomed and no branching occurs.

2 bounding: −∞ < ⌊z^LP(j)⌋ ≤ z^I. Then z^IP(j) ≤ ⌊z^LP(j)⌋ ≤ z^I. Fathom node j, as no subproblem in the subtree of j can improve the incumbent. Suppose z^LP(5) = 38.7. Then, ⌊z^LP(5)⌋ = 38 ≤ z^I.

3 integrality: z^I ≤ z^LP(j) and the optimal solution of LP(j) is integer. Then, the incumbent is updated to the integer solution of LP(j). As no descendant node can have a better IP solution, fathom node j. Suppose the optimum of LP(5) equals x2 = x4 = x5 = 1, x1 = x3 = x6 = 0 with z^LP(5) = 41. Then, z^I is set to 41.

4 branching: z^I ≤ z^LP(j) and the optimal solution of LP(j) is NOT integer. Fathoming cannot be performed. Add two children from node j and continue the B&B. Suppose z^LP(5) = 41 but the solution is not integer. Set UB(5) = 41.

The set of nodes that have no children and have not yet been processed are called active nodes. At the beginning, only node 1 is active. The B&B terminates when there are no active nodes.

The selection of the current node from the list of active nodes is made according to different strategies:

• depth-first: follows the LIFO (last-in first-out) order and moves down one side of the B&B tree. It

quickly finds a candidate solution (i.e., a feasible integer solution) and requires backtracking to

visit the other branches of the tree.

• breadth-first: follows the FIFO (first-in first-out) and visit the B&B by processing all the nodes at

a certain depth before moving to explore all of their children.

• best-first: when two children are created, both are processed by solving the respective subproblems.

Then, according to a priority queue that stores all the active nodes ordered by z^LP(j) values, it performs jumptracking to the node that currently exhibits the best z^LP(j) value.

Example 40 (B&B). Solve by B&B the 0-1 knapsack problem of Example 38.

The branching strategy is to select the variable with the most fractional value.

Among the active nodes, choose the one with depth-first strategy.

Let the initial incumbent be z^I = 38, with x1 = x2 = 1, x3 = x4 = x5 = x6 = 0.

Due to the depth-first strategy, subproblem nodes are solved in the following order: 0-1-3-5-6-4-2-7-9-10-8-11-13-14-12.

The incumbent z^I is improved at node 5 and set to 43.

The output of the B&B algorithm is the optimal solution x∗ = [1, 0, 0, 1, 0, 1] of value z∗ = 43.

Hint: the LP-relaxation at each node of the branch-and-bound tree can be solved by firstly imposing

the branching choices and then using the Dantzig procedure on the remaining items. As an example, at node 2, x2 = 1 is enforced and the initial capacity 14 is reduced to 7. Using the Dantzig procedure, x1 = 1 and the residual capacity is 2. Finally, x6 = 1/3.
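The overall scheme can be sketched compactly in Python. This is a simplified B&B under two stated assumptions: it branches on the variables in plain index order (not the lecture's most-fractional rule) with depth-first recursion, and it uses the Dantzig LP bound for fathoming; it still finds the same optimum:

```python
from fractions import Fraction

# Instance of Example 38 and the initial incumbent z^I = 38.
p = [16, 22, 12, 8, 11, 19]
a = [5, 7, 4, 3, 4, 6]
b = 14
n = len(p)
order = sorted(range(n), key=lambda j: Fraction(p[j], a[j]), reverse=True)

def lp_bound(fixed):
    """Dantzig bound with the branching decisions in `fixed` ({j: 0/1});
    returns None when the fixed items already exceed the capacity."""
    cap, val = b, Fraction(0)
    for j, v in fixed.items():
        if v == 1:
            cap -= a[j]
            val += p[j]
    if cap < 0:
        return None  # fathom by infeasibility
    for j in order:
        if j in fixed:
            continue
        if a[j] <= cap:
            val += p[j]
            cap -= a[j]
        else:
            val += Fraction(p[j] * cap, a[j])  # fractional item
            break
    return val

incumbent = 38

def bnb(fixed, depth):
    global incumbent
    bound = lp_bound(fixed)
    if bound is None or bound <= incumbent:
        return  # fathom by infeasibility or bounding
    if depth == n:
        incumbent = int(bound)  # all variables fixed: integer solution
        return
    for v in (1, 0):  # depth-first, x_depth = 1 branch first
        bnb({**fixed, depth: v}, depth + 1)

bnb({}, 0)
print(incumbent)  # 43
```

The bound prunes most of the 64 leaves, and the final incumbent matches the optimum z∗ = 43 of Example 40.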

The B&B algorithm framework is then summarized.

Branch-and-Bound for Pure Integer Programs

Example 41. Consider the following IP:

max 8x1 + 5x2

s.t. x1 + x2 ≤ 6

9x1 + 5x2 ≤ 45

x1 , x2 ≥ 0, integer

Let us assume we have no initial incumbent (z I = −∞).

Solving the LP-relaxation of the original problem (node 0) gives x1 = 3.75, x2 = 2.25, z^LP(0) = 41.25.

This is Case 4 of B&B, so two children are created so that x^LP(0) is not feasible for either subproblem. This step does not cancel any feasible integer solution.

The two children are created by setting x1 ≤ 3 on the left branch (node 1) and x1 ≥ 4 on the right one

(node 2).

Solving the subproblem at node 1 gives x1 = 3, x2 = 3, z^LP(1) = 39. A new incumbent z^I = 39 is found and the node is fathomed.

Solving the subproblem at node 2 gives x1 = 4, x2 = 1.8, z^LP(2) = 41. Two children are created. Branching is done by setting x2 ≤ 1 (node 3) and x2 ≥ 2 (node 4).

Solving the subproblem at node 3 gives x1 = 4.44, x2 = 1, z^LP(3) = 40.55. Two children are created. Branching is done by setting x1 ≤ 4 (node 5) and x1 ≥ 5 (node 6).

Solving the subproblem at node 5 gives z^LP(5) = 37. Fathom by bounding.

Solving the subproblem at node 6 gives x1 = 5, x2 = 0, z^LP(6) = 40. A new incumbent z^I = 40 is found and the node is fathomed.

Solving the subproblem at node 4 gives an infeasible LP, since x1 ≥ 4 and x2 ≥ 2 imply 9x1 + 5x2 ≥ 46 > 45. Fathom by infeasibility.
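Because the feasible region here is tiny, the B&B answer can be confirmed by brute force over the integer points (our own sanity check):

```python
# IP of Example 41: max 8x1 + 5x2
# s.t. x1 + x2 <= 6, 9x1 + 5x2 <= 45, x1, x2 >= 0 integer.
best = max(
    (8 * x1 + 5 * x2, x1, x2)
    for x1 in range(7)
    for x2 in range(7)
    if x1 + x2 <= 6 and 9 * x1 + 5 * x2 <= 45
)
print(best)  # (40, 5, 0)
```

The brute-force optimum (z = 40 at x1 = 5, x2 = 0) matches the incumbent found at node 6 of the B&B tree.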

In general:

• The more accurate the bounds (i.e., the closer to the IP subproblem value), the quicker the fathoming. Good techniques to obtain accurate bounds, and making the important decisions early (fixing crucial variables in the problem), enhance the fathoming.

• The better the incumbent, the quicker the fathoming.

Lecture 7

Network models

Network models are optimization models that exhibit a special structure. In some cases, this special structure dramatically reduces computational complexity (running time). They were the first widespread application of LP to problems of industrial logistics, and they address a huge number of diverse applications:

• Phone network: route calls, messages, data

• Power network: transmit power from power plants to consumers

• Highway, street, rail, airline, shipping networks: transport people, vehicles, goods

• Computer networks: transmit info, data, messages

• Pipeline networks: transport crude oil, gasoline

• Satellite networks: worldwide communication system

• Abstract networks: using networks to model relations

A network (or graph) G = (N, A) is made by a set of nodes N = {1, 2, 3, 4} and a set of arcs

A = {b = (1, 2), a = (1, 4), e = (2, 3), d = (3, 4), c = (2, 4)}. A network in which arc orientation is relevant

is called directed, otherwise undirected.

A path (or a directed path if orientations matter) is a sequence of nodes (or arcs) without repetitions.

Figure 1: On the left, path 5-2-3-4. On the right, directed path 1-2-5-3-4.

A cycle (also called circuit or loop) is a path with two or more nodes where the last node is also the

first. By considering orientations, it becomes a directed cycle.

Figure 2: On the left, cycle 1-2-3-1. On the right, directed cycle 1-2-3-4-1.

LP-Formulation for the Shortest Path Problem

The shortest path problem requires the computation of the (directed) path of minimum length considering

a cost cij for each arc (i, j) ∈ A.

Define integer variables xij ≥ 0 for each arc indicating how many times such arc is used in the path.

In the single-pair shortest path problem the path starts from node s and arrives in node t. The formulation

is:

min ∑_{(i,j)∈A} cij xij
s.t. ∑_{j:(s,j)∈A} xsj = 1,
∑_{j:(i,j)∈A} xij − ∑_{j:(j,i)∈A} xji = 0, ∀i ∈ N \ {s, t},
−∑_{j:(j,t)∈A} xjt = −1,
xij ≥ 0, integer, ∀(i, j) ∈ A.

Flow balance equations are expressed compactly as M x = b, where M is the network matrix made by

{−1, 0, 1} entries and b is the array of external flow entering (>0) or leaving (<0) the network at each

node. In the shortest path problem bT = [1, 0, . . . , 0, −1] holds, where the non-zero elements are associated to

s and t, respectively.

Example 42 (single pair shortest path). Let us consider the following network:

The single-pair shortest path problem model reads as:

min 2x12 + 4x13 + x23 + 4x24 + 2x25 + 3x35 + 2x46 + 3x54 + 2x56

s.t. x12 + x13 = 1

x23 + x24 + x25 − x12 = 0

x35 − x13 − x23 = 0

x46 − x24 − x54 = 0

x54 + x56 − x25 − x35 = 0

−x46 − x56 = −1

xij ≥ 0 integer ∀(i, j) ∈ A

The optimal shortest path in single-pair case, highlighted in red, is 1-2-5-6 of cost 6.

For the single-source shortest path problem, bT = [n − 1, −1, . . . , −1, −1]: node s is the source of n − 1 units

and all other nodes are sinks with demands -1. A solution to this problem is a shortest path tree rooted

at s.

Example 43 (single source shortest path). The optimal solution described in the figure is x12 = 5, x23 =

1, x24 = 1, x25 = 2, x56 = 1 of total cost 21.
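Both examples can be verified with a few lines of Bellman-Ford on the example network (Dijkstra would also work, since all costs are non-negative; the single-source distances sum to the tree cost 21, because each sink demands one unit):

```python
# Arcs (i, j, cost) of the example network.
arcs = [(1, 2, 2), (1, 3, 4), (2, 3, 1), (2, 4, 4), (2, 5, 2),
        (3, 5, 3), (4, 6, 2), (5, 4, 3), (5, 6, 2)]
nodes = range(1, 7)

# Bellman-Ford from source 1: relax every arc |N| - 1 times.
dist = {i: float("inf") for i in nodes}
dist[1] = 0
for _ in range(len(nodes) - 1):
    for i, j, c in arcs:
        if dist[i] + c < dist[j]:
            dist[j] = dist[i] + c

# Single-pair: shortest 1-6 path has cost 6 (the red path 1-2-5-6).
# Single-source: the distances 0+2+3+6+4+6 sum to the tree cost 21.
print(dist[6], sum(dist.values()))  # 6 21
```

The equality between the sum of distances and the single-source flow cost holds because every node other than s receives exactly one unit along its shortest path.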

The network matrix M has a row per node and a column per arc of the network. Each column xij is

exactly made by a 1 (at row i) and a -1 (at row j). Due to their structure, they fall into the family of the

Totally Unimodular Matrices.

A matrix M is called totally unimodular if every square submatrix has determinant equals 0, 1, or -1.

(In particular, each entry of M is 0, 1, or -1.)

If M is totally unimodular and b is integer valued, then the polyhedron P = {x|M x ≤ b} is integral.

Hence, solving its LP-relaxation is enough to obtain the optimal solution.

Proof. Recall that the simplex method proceeds from one basic feasible solution to the next, until it finds

the optimal one. Let B be the submatrix of M corresponding to a basic feasible solution. Then the solution of

the system BxB = b is given by:


det(Bi )
xiB = ∀i
det(B)

by Cramer’s rule, where Bi is the matrix B in which the i-th column is removed and replaced by b. We know that B is invertible, thus det(B) ≠ 0; since M is totally unimodular, det(B) = ±1. Hence, if all the entries of b are integers, det(Bi) is an integer and xB is integer.
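Total unimodularity can be verified directly on a small instance by enumerating every square submatrix. A sketch for the network matrix of a 3-node directed network with arcs (1, 2), (2, 3), (1, 3), where each column has exactly one +1 and one −1:

```python
from itertools import combinations

def det(m):
    # Cofactor expansion along the first row; fine for tiny matrices.
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** k * m[0][k] * det([row[:k] + row[k + 1:] for row in m[1:]])
        for k in range(len(m))
    )

# Network matrix: rows = nodes 1, 2, 3; columns = arcs (1,2), (2,3), (1,3).
M = [
    [1, 0, 1],
    [-1, 1, 0],
    [0, -1, -1],
]

# Determinants of every square submatrix (sizes 1, 2, 3).
dets = {
    det([[M[r][c] for c in cols] for r in rows])
    for k in (1, 2, 3)
    for rows in combinations(range(3), k)
    for cols in combinations(range(3), k)
}
print(dets <= {-1, 0, 1})  # True: the network matrix is TU
```

Enumeration is exponential in the matrix size, so this is only a didactic check; the point of the theory is precisely that network matrices are TU by construction.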

Maximum Flows

Define variables xij ≥ 0 for each arc (i, j) ∈ A codifying the amount of flow sent through (i, j). The maximum

flow problem asks to compute the maximum net flow v that can cross the network starting from a source

node s and arriving to a sink node t. Finally, define uij as the capacity of (i, j).

The formulation is:

max v
s.t. ∑_{j:(s,j)∈A} xsj = v,
∑_{j:(i,j)∈A} xij − ∑_{j:(j,i)∈A} xji = 0, ∀i ∈ N \ {s, t},
−∑_{j:(j,t)∈A} xjt = −v,
xij ≤ uij, ∀(i, j) ∈ A,
xij ≥ 0, ∀(i, j) ∈ A,

that is, {max v | Ax = bv, 0 ≤ x ≤ u}, where A is the network matrix and bT = [1, 0, . . . , 0, −1].

Example 44. Take the following network:

The model for the maximum flow is:

max v

s.t. xs1 + xs2 = v

−xs1 + x12 + x1t = 0

−xs2 − x12 + x2t = 0

−x1t − x2t = −v

xs1 ≤ 10

xs2 ≤ 6

x12 ≤ 1

x1t ≤ 8

x2t ≤ 10

xij ≥ 0 ∀(i, j) ∈ A

Two more network problems are the transportation problem and the assignment problem.

The former has nodes in a first set as sources (b > 0) and nodes in a second set as demands (b < 0). The

feasibility of the transportation problem is assessed by verifying that there exists a flow of total value equal to the total demand in the transformed network.

Figure 3: On the left, an instance of the transportation problem. On the right, the transformation to a max-flow problem to solve the feasibility version of the transportation problem.

The latter requires finding a 1-1 correspondence between nodes of a first set and nodes of a second set, both of the same cardinality |N|. It corresponds to a transportation problem with unitary sources/demands

and nodes equally distributed in the two sets. The assignment feasibility can be proved by looking for a

maximum flow of |N | units on the transformed network.

Figure 4: On the left, an instance of the assignment problem. On the right, the transformation to a max-flow problem to solve the feasibility version of the assignment problem.

The Ford-Fulkerson Maximum Flow Algorithm

The maximum flow problem can be solved by means of the Ford-Fulkerson algorithm. Firstly, one needs to

define the residual network: given a network G = (N, A) and a feasible flow x, the residual network G(x)

has, for each (i, j) ∈ A of G with capacity uij , a pair of arcs: (i, j) of residual capacity rij = uij − xij and

a reversed arc (j, i) of capacity xij . Intuitively, rij is the amount of flows that one can add without violating

the arc capacity uij , while the reversed arc contains the amount of flows that can be possibly removed from

(i, j) by setting a flow in the opposite direction (i.e., from j to i).

Figure 5: On the top, an original network G with a feasible flow x (in red). On the bottom, the residual
network G(x) built accordingly to the definitions.

The Ford-Fulkerson algorithm implements the idea of augmenting path, which is an s − t directed

path on the residual network. The residual capacity of the augmenting path P is given by δ(P ) = min{rij :

(i, j) ∈ P }. To augment along P is to send δ(P ) units of flow along each arc of the path. Then, x and the

residual capacities are modified appropriately:

rij = rij − δ(P ) rji = rji + δ(P ) ∀(i, j) ∈ P.

Figure 6: On the left, an augmenting path on the residual network G(x) (in red). On the right, the updated
residual network G(x).

Algorithm 1 Max Flow Algorithm


1: x ← 0
2: Create the residual network G(x)
3: while there is some directed path from s to t in G(x) do
4: Let P be a path from s to t in G(x)
5: ∆ ← δ(P )
6: Send ∆ units of flow along P
7: Update the r’s
8: end while

It can be proved that the flow is optimal when there is no augmenting path on the residual network,

that is t cannot be reached by any path starting from s. To prove it, we exploit the concept of (s − t)-cuts.

An (s − t)-cut in a network G = (N, A) is a partition of N into two disjoint subsets S and T such that

s ∈ S and t ∈ T , e.g., S = {s, 1} and T = {2, t}.

The capacity of an (s − t)-cut [S, T] is C[S, T] = ∑_{i∈S} ∑_{j∈T} uij (only the arc capacities going from nodes in S to nodes in T are counted).

For any feasible flow x and any (s − t)-cut [S, T] on a network, v(x) ≤ C[S, T]: the amount of

flow sent is bounded by the capacity of any (s − t)-cut.

Due to theoretical results, we know that if v(x) = C[S, T ] then x is a maximum flow and [S, T ] is a

minimum capacity (s − t)-cut.

Theorem 5 (Max-flow Min-cut Theorem). A flow is maximum if and only if there exists no augmenting

path in the residual network G(x).

Proof. If there exists an augmenting path in G(x), clearly x is not the maximum.

Suppose there is no augmenting path in G(x). Let S be the set of nodes reachable from s in G(x). Let

T = N \ S. Then there is no arc in G(x) from S to T . Thus, in the original network:

i ∈ S and j ∈ T =⇒ xij = uij

i ∈ T and j ∈ S =⇒ xij = 0.

The flow crossing this (s, t)-cut [S, T ] is equal to the capacity C[S, T ], hence it is maximum.

To detect the non-existence of augmenting paths, one can rely on the breadth-first search (FIFO logic),

which brings to the Edmonds–Karp algorithm’s implementation of the Ford-Fulkerson method.

Algorithm 2 Breadth-first search


1: Input: Residual network G(x), root s, sink t
2: Q ← ∅ (empty queue)
3: label s
4: Q ← Q ∪ {s}
5: while Q 6= ∅ do
6: select first node u ∈ Q
7: if u is the sink t then
8: return u
9: end if
10: for each v: (u, v) ∈ A of G(x) do
11: if v is not labeled then
12: label v
13: v.father← u
14: Q ← Q ∪ {v}
15: end if
16: end for
17: end while

Algorithm 2 labels the nodes that at each iteration can be reached from the current node u and keeps track of

the discovered neighbor v by updating the father attribute of nodes.
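Combining Algorithm 1 with the BFS of Algorithm 2 gives the Edmonds-Karp method. A compact, self-contained sketch (the dict-based residual representation is our own choice), applied to the network of Example 44 with s = 0, t = 3:

```python
from collections import deque

def max_flow(n, capacity, s, t):
    """Edmonds-Karp: Ford-Fulkerson with BFS-chosen augmenting paths.
    `capacity` maps arcs (i, j) to u_ij; residual capacities live in r."""
    r = {}
    adj = {i: [] for i in range(n)}
    for (i, j), u in capacity.items():
        r[(i, j)] = u
        r.setdefault((j, i), 0)  # reversed arc, residual capacity 0
        adj[i].append(j)
        adj[j].append(i)
    flow = 0
    while True:
        # BFS (FIFO queue) on the residual network, as in Algorithm 2.
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and r[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow  # no augmenting path: the flow is maximum
        # Recover the path, compute delta(P), and augment along it.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(r[e] for e in path)
        for i, j in path:
            r[(i, j)] -= delta
            r[(j, i)] += delta
        flow += delta

# Network of Example 44: s = 0, intermediate nodes 1 and 2, t = 3.
cap = {(0, 1): 10, (0, 2): 6, (1, 2): 1, (1, 3): 8, (2, 3): 10}
print(max_flow(4, cap, 0, 3))  # 15
```

The value 15 equals the capacity of the cut S = {s, 1}, T = {2, t} (1 + 8 + 6), illustrating the max-flow min-cut equality proved below.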

Example 45 (Ford-Fulkerson algorithm and terminating condition). At the initialization with x = 0, the

residual network G(x) = G as reversed arcs have null capacities.

By selecting the augmenting path s − 1 − 2 − t, we obtain ∆ = min{M, 1, M} = 1. We set xs1 = x12 =

x2t = 1, rs1 = r2t = M − 1, r12 = 0, r1s = r21 = rt2 = 1 and update the residual network accordingly.

The new selected augmenting path is s − 2 − 1 − t where ∆ = min{M, 1, M } = 1. We set xs2 = x1t = 1,

x12 = 0, rs2 = r1t = M − 1, r21 = 0, r2s = r12 = rt1 = 1 and update the residual network accordingly.

One can iterate the process up to 2M times to arrive at the optimum flow of value 2M that equals the

capacity of the minimum (s, t)-cut.

The same solution could have been computed by not playing on arc (1, 2), but by directly sending flows

through paths s − 1 − t and s − 2 − t.

Lecture 8

Project scheduling

Network Optimization can be used as an aid in the scheduling of large complex projects that consist of many

activities. A project is divided into a collection of work activities (for planning and control purposes) that

must all be completed to accomplish the project.

• Identifying precedence relationships: shows the relationship of each activity to others and to the whole

project

• Sequencing activities: clarifies the correct sequence of activities in the project

• Determining activity times & costs: encourages the setting of realistic time and cost estimates for each

activity

• Estimating material and worker requirements: helps make better use of people, money, and material

resources

• Determining critical activities: identifies critical bottlenecks in the project

Three main techniques are used in project management: Gantt chart, Critical path method (CPM)

and Program evaluation and review technique (PERT).

A Gantt chart provides a graphical representation of a schedule of activities, useful for planning, coordi-

nating, and tracking specific tasks in a project, giving a clear illustration of the project’s progress. However,

one aspect not taken into account in this type of diagramming is the interdependence of activities.

Example 46 (Service for a delta jet: Gantt chart).

Both CPM and PERT are network techniques developed in the late 1950s that consider precedence

relationships and interdependencies. Each uses a different estimate of activity times: in CPM, activity durations are known with certainty; in PERT, activity durations are not known with certainty.

CPM and PERT work by following prescribed steps:

1. Define the project and prepare the work breakdown structure (list of all activities in the project)

2. Develop relationships among the activities - decide which activities must precede and which must follow

others

3. Draw the network connecting all of the activities

4. Assign time and/or cost estimates to each activity

5. Compute the longest time path through the network - this is called the critical path

6. Use the network to help plan, schedule, monitor, and control the project

The information obtained by means of CPM and PERT can provide answers to the following questions:

I. When will the entire project be completed?

II. What are the critical activities or tasks in the project?

III. Which are the noncritical activities?

IV. Is the project on schedule w.r.t. the due date, behind schedule (delay), or ahead of schedule (early)?

V. Is the money spent equal to, less than, or greater than the budget?

VI. Are there enough resources available to finish the project on time?

VII. If the project must be finished in a shorter time, what is the way to accomplish this with the least cost

(crashing)?

Precedence relationship: activity j is called a predecessor of activity k if activity j must be completed

before activity k can begin.

The structure that embeds all the precedence relationships between activities is called project network.

Alternatively, activities can be associated to nodes (activity on nodes network, AON) or to arcs (activity

on arcs network, AOA). The two conventions are equivalent. In the former, arcs express the precedence

relationships among activities. In the latter, nodes are referred as events, such as the completion of an

activity on the incoming arc.

AON and AOA comparison:

Note that in AOA, dummy activities of zero time may be required as any activity must be represented

by exactly one arc in the network, and two nodes can be connected by at most one arc.

Example 47 (Milwaukee Paper Manufacturing’s Activities and Predecessors - AON).

Figure 7: Resulting AON.

Nodes start and finish represent the beginning and ending of the project, respectively. Arc lengths (or

node numerical labels) indicate the corresponding activity durations.

Critical path method

The critical path is the longest path through the network and represents the shortest time in which the

project can be completed. Critical path activities have no slack time, thus any delay in critical path activities

delays the project.

Computing longest paths: given a network G = (N, A) with non-negative lengths associated on arcs,

a special source node s, find the longest path from s to every other node in N.

If the network has no directed cycles of positive lengths, the principle of optimality holds: the longest

directed s − j path is made by the longest directed s − i path plus the directed i − j path. In formulas, this

translates into functional equations exploited in dynamic programming approaches:

v[s] = 0
v[j] = max_{i:(i,j)∈A} {v[i] + cij}, ∀j ∈ N \ {s}
d[j] = preceding node on the longest path from s to j

We refer to v[j] as the length of the longest s − j path. The first equality is the boundary condition, that is

the length of the longest path v[s] is initially 0. Then, at stage (node) j all its predecessors are evaluated

in terms of path length v[i] + cij , where cij is the length of arc (i, j). Among all these measures, the largest

one is selected as longest path. Finally, the predecessor of j in the longest path is stored in d[j].

Note that project networks are acyclic, hence the functional equations can be used to get the longest path.

The critical path analysis aims at computing the following 4 measures for each activity:

1. Earliest start (ES) = earliest time at which an activity can start, assuming all predecessors have

been completed

2. Earliest finish (EF) = earliest time at which an activity can be finished

3. Latest start (LS) = latest time at which an activity can start so as to not delay the completion time

of the entire project

4. Latest finish (LF) = latest time by which an activity has to be finished so as to not delay the

completion time of the entire project

Forward pass

The forward pass begins from the starting node and work forwardly. Its scope is to identify the earliest start

and earliest finish of all the activities.

Earliest start time rule: If an activity has only one immediate predecessor, its ES equals the EF of the

predecessor. If an activity has multiple immediate predecessors, its ES is the maximum of all the EF values

of its predecessors:

ES[j] = max_{i:(i,j)∈A} {EF [i]}

Earliest finish time rule: The earliest finish time (EF) of an activity is the sum of its earliest start time

(ES) and its activity time

EF [j] = ES[j] + activity j duration

Activity duration for start and finish nodes are 0.

The early schedule ES[j] indicates the length of the longest path from start to node j in the AON

project network.

The project completion time corresponds to the minimum time to complete a project, that is the

length of the longest path from start to finish on AON.

Example 48 (Early schedule on AON at Figure 7). The lexicographic order is start,A,B,C,D,E,F,G,H,finish.

Set ES[start] = EF [start] = 0.

For both A and B, earliest starts are ES[A] = ES[B] = 0. Earliest finish are instead EF [A] = 2 and

EF [B] = 3. Finally, d[A] = d[B] = start.

Pick C and get ES[C] = EF [A] = 2. Then, EF [C] = ES[C] + 2 = 4. Set d[C] = A.

Algorithm 3 Forward Pass Algorithm
1: Order the nodes lexicographically
2: ES[start] = EF [start] = 0
3: while the early start time of the finish node has not been fixed do
4: Let j be the smallest index of an unprocessed node
5: ES[j] = max_{i:(i,j)∈A} {EF [i]}
6: EF [j] = ES[j]+activity time of j
7: d[j] = argmax_{i:(i,j)∈A} {EF [i]}
8: end while

Pick D and get ES[D] = max{EF [A], EF [B]} = max{2, 3} = 3. Then, EF [D] = ES[D] + 4 = 7. Set

d[D] = B.

Pick E and get ES[E] = EF [C] = 4. Then, EF [E] = ES[E] + 4 = 8. Set d[E] = C. Similarly for F,

ES[F ] = EF [C] = 4, EF [F ] = ES[F ] + 3 = 7, d[F ] = C.

Pick G and get ES[G] = max{EF [D], EF [E]} = max{7, 8} = 8. Then, EF [G] = ES[G] + 5 = 13. Set

d[G] = E.

Pick H and get ES[H] = max{EF [F ], EF [G]} = max{7, 13} = 13. Then, EF [H] = ES[H] + 2 = 15.

Set d[H] = G.

Pick finish and get ES[f inish] = EF [f inish] = EF [H] = 15, set d[f inish] = H.
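The earliest start and earliest finish rules can be sketched in a few lines. The durations and immediate predecessors below are the ones used in the worked computations of Example 48 on the network of Figure 7:

```python
# Forward pass on the AON network of Example 48 (Figure 7).
dur = {'A': 2, 'B': 3, 'C': 2, 'D': 4, 'E': 4, 'F': 3, 'G': 5, 'H': 2}
pred = {'A': [], 'B': [], 'C': ['A'], 'D': ['A', 'B'],
        'E': ['C'], 'F': ['C'], 'G': ['D', 'E'], 'H': ['F', 'G']}

ES, EF = {}, {}
for j in ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']:    # lexicographic order
    ES[j] = max((EF[i] for i in pred[j]), default=0)  # earliest start rule
    EF[j] = ES[j] + dur[j]                            # earliest finish rule

print(EF['H'])   # 15, the project completion time
```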

Backward pass

The backward pass begins from the finish node and works backward. Its scope is to identify the latest start

and latest finish of all the activities.

Assume there is a due date for the project, which can be reasonably set as EF [f inish]. The late

schedule LS[j] is computed as due date - length of longest path from node j to node finish. Note that a

predecessor i of j in this path is closer to node finish than j (we are working backward).

Latest finish time rule: If an activity is an immediate predecessor for just a single activity, its LF equals

the LS of the activity that immediately follows it. If an activity is an immediate predecessor to more than

one activity, its LF is the minimum of all LS values of all activities that immediately follow it :

LF [j] = min_{i:(j,i)∈A} {LS[i]}

Latest start time rule: The latest start time (LS) of an activity is the difference of its latest finish time

(LF) and its activity time

LS[j] = LF [j] − activity j duration

Algorithm 4 Backward Pass Algorithm
1: Order the nodes lexicographically
2: LF [f inish] = LS[f inish] = due date
3: while the late start time of the start node has not been fixed do
4: Let j be the highest index of an unprocessed node
5: LF [j] = min_{i:(j,i)∈A} {LS[i]}
6: LS[j] = LF [j]−activity time of j
7: d′[j] = argmin_{i:(j,i)∈A} {LS[i]}
8: end while

Example 49 (Late schedule on AON at Figure 7, assuming due date 15).

The lexicographic order is start,A,B,C,D,E,F,G,H,finish. Set LF [f inish] = LS[f inish] = 15.

Pick H and get LF [H] = LS[finish] = 15. Then, LS[H] = LF [H] − 2 = 13. Set d′[H] = finish.

Pick G and get LF [G] = LS[H] = 13. Then, LS[G] = LF [G] − 5 = 8. Set d′[G] = H. Similarly for F,

LF [F ] = LS[H] = 13, LS[F ] = LF [F ] − 3 = 10, d′[F ] = H.

Pick E and get LF [E] = LS[G] = 8. Then, LS[E] = LF [E] − 4 = 4. Set d′[E] = G. Similarly for D,

LF [D] = LS[G] = 8, LS[D] = LF [D] − 4 = 4, d′[D] = G.

Pick C and get LF [C] = min{LS[E], LS[F ]} = min{4, 10} = 4. Then, LS[C] = LF [C] − 2 = 2. Set d′[C] = E.

Pick B and get LF [B] = LS[D] = 4. Then, LS[B] = LF [B] − 3 = 1. Set d′[B] = D.

Pick A and get LF [A] = min{LS[C], LS[D]} = min{2, 4} = 2. Then, LS[A] = LF [A] − 2 = 0. Set d′[A] = C.

Pick start and get LF [start] = LS[start] = min{LS[A], LS[B]} = min{0, 1} = 0. Set d′[start] = A.
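A matching sketch of the backward pass on the same network, with due date 15 as in Example 49; the successor lists are read off the precedence relations of the forward pass:

```python
# Backward pass of Example 49 on the network of Figure 7, due date 15.
dur = {'A': 2, 'B': 3, 'C': 2, 'D': 4, 'E': 4, 'F': 3, 'G': 5, 'H': 2}
succ = {'A': ['C', 'D'], 'B': ['D'], 'C': ['E', 'F'], 'D': ['G'],
        'E': ['G'], 'F': ['H'], 'G': ['H'], 'H': []}

LF, LS = {}, {}
for j in ['H', 'G', 'F', 'E', 'D', 'C', 'B', 'A']:     # reverse order
    LF[j] = min((LS[i] for i in succ[j]), default=15)  # latest finish rule
    LS[j] = LF[j] - dur[j]                             # latest start rule

print(LS['A'], LS['B'])   # 0 1
```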

Computing slack time

After computing the ES, EF, LS, and LF times for all activities, compute the slack or free time for each

activity.

Slack is the length of time an activity can be delayed without delaying the entire project:

Slack[j] = LS[j] − ES[j] = LF [j] − EF [j]

Critical activities are the activities with 0 slack value. A path from node start to the finish node that

consists entirely of critical activities is called a critical path.
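A minimal sketch of the slack computation, with the ES values of Example 48 and the LS values of Example 49 hard-coded:

```python
# Slack of each activity in the running example: Slack[j] = LS[j] - ES[j].
# ES values come from Example 48, LS values from Example 49.
ES = {'A': 0, 'B': 0, 'C': 2, 'D': 3, 'E': 4, 'F': 4, 'G': 8, 'H': 13}
LS = {'A': 0, 'B': 1, 'C': 2, 'D': 4, 'E': 4, 'F': 10, 'G': 8, 'H': 13}

slack = {j: LS[j] - ES[j] for j in ES}
critical = [j for j in ES if slack[j] == 0]   # activities with zero slack
print(critical)   # ['A', 'C', 'E', 'G', 'H'], the critical path
```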

Figure 8: CPM results.

In the above example, A-C-E-G-H is the critical path. The project completion time is EF [H] = LF [H] =

15.

Figure 9: Gantt charts of early times on AON at Figure 7

Figure 10: Gantt charts of late times on AON at Figure 7

Example 50.

Forward pass: The lexicographic order is start,A,B,C,D,E,F,end. Set ES[start] = EF [start] = 0.

For both A and B, earliest starts are ES[A] = ES[B] = 0. Earliest finish are instead EF [A] = 9 and

EF [B] = 6. Finally, d[A] = d[B] = start.

Pick C and get ES[C] = max{EF [A], EF [B]} = max{9, 6} = 9. Then, EF [C] = ES[C] + 8 = 17. Set

d[C] = A.

Pick D and get ES[D] = max{EF [A], EF [B]} = max{9, 6} = 9. Then, EF [D] = ES[D] + 7 = 16. Set

d[D] = A.

Pick E and get ES[E] = EF [D] = 16. Then, EF [E] = ES[E] + 10 = 26. Set d[E] = D.

Pick F and get ES[F ] = max{EF [C], EF [E]} = max{17, 26} = 26. Then, EF [F ] = ES[F ] + 12 = 38. Set

d[F ] = E.

Pick end and get ES[end] = EF [end] = EF [F ] = 38, set d[end] = F .

Backward pass: The lexicographic order is start,A,B,C,D,E,F,end. Set LF [end] = LS[end] = 38 = EF [end]

as due date.

Pick F and get LF [F ] = LS[end] = 38. Then, LS[F ] = LF [F ] − 12 = 26. Set d′[F ] = end.

Pick E and get LF [E] = LS[F ] = 26. Then, LS[E] = LF [E] − 10 = 16. Set d′[E] = F .

Pick D and get LF [D] = LS[E] = 16. Then, LS[D] = LF [D] − 7 = 9. Set d′[D] = E.

Pick C and get LF [C] = LS[F ] = 26. Then, LS[C] = LF [C] − 8 = 18. Set d′[C] = F .

Pick B and get LF [B] = min{LS[C], LS[D]} = min{18, 9} = 9. Then, LS[B] = LF [B] − 6 = 3. Set d′[B] = D.

Pick A and get LF [A] = min{LS[C], LS[D]} = min{18, 9} = 9. Then, LS[A] = LF [A] − 9 = 0. Set d′[A] = D.

Pick start and get LF [start] = LS[start] = min{LS[A], LS[B]} = min{0, 3} = 0. Set d′[start] = A.

Computing slack times:

Slack[A] = LS[A] − ES[A] = LF [A] − EF [A] = 0

Slack[B] = LS[B] − ES[B] = LF [B] − EF [B] = 3

Slack[C] = LS[C] − ES[C] = LF [C] − EF [C] = 9

Slack[D] = LS[D] − ES[D] = LF [D] − EF [D] = 0

Slack[E] = LS[E] − ES[E] = LF [E] − EF [E] = 0

Slack[F ] = LS[F ] − ES[F ] = LF [F ] − EF [F ] = 0

Example 51 (Reliable Construction Co. project).

Figure 11: The complete project network showing ES and LS (in parenthesis above the node), EF and LF
(in parenthesis below the node) for each activity. The critical path is A-B-C-E-F-J-L-N (bold arrows).

PERT (Program Evaluation Review Technique)

CPM assumes we know a fixed time estimate for each activity and there is no variability in activity times.

On the other hand, PERT is an attempt to correct the shortcoming of CPM by modeling the duration of

each activity as a random variable. For each activity, PERT requires that the project manager estimate

the following three quantities:

a = optimistic estimate of the activity’s duration under the most favorable conditions

b = pessimistic estimate of the activity’s duration under the least favorable conditions

m = most likely (realistic) value for the activity’s duration

Let Tj be the random variable expressing the duration of activity j (in case of AOA construction, Tij

are defined in its place and all formulas are modified accordingly). PERT requires as a first assumption that Tj

follows a beta distribution.

The specific definition of a beta distribution need not concern us, but it is important to realize that it

can approximate a wide range of random variables, including many positively skewed, negatively skewed,

and symmetric random variables. If Tj follows a beta distribution, then the mean and variance of Tj may

be approximated by
E(Tj ) = (a + 4m + b)/6

var(Tj ) = (b − a)²/36
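The two approximations can be wrapped in a small helper; the sample values passed in below are arbitrary:

```python
# PERT three-point estimates: approximate mean and variance of a
# beta-distributed duration from optimistic (a), most likely (m),
# pessimistic (b) estimates.
def pert_estimates(a, m, b):
    mean = (a + 4 * m + b) / 6
    var = (b - a) ** 2 / 36
    return mean, var

mean, var = pert_estimates(1, 2, 3)
print(mean, round(var, 4))   # 2.0 0.1111
```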

Figure 12: Variability in activity times for Example 46. Note that CPM made use of expected times as arc
weights, so the critical path identified is already an appropriate input for the PERT.

PERT requires the second assumption that the duration of all activities are independent. Then for any

path in the project network, the mean and variance of the time required to complete the activities on the

path are given by
Σ_{j∈p} E(Tj ) = expected duration of activities on path p

Σ_{j∈p} var(Tj ) = variance of duration of activities on path p

Focusing on the critical path found through CPM (which needs to use the expected times E(Tj ) as durations

in its computations), now indicated with the random variable CP , PERT makes use of a third assumption: the

total project completion time follows a normal probability distribution. This is reasonable whenever

the critical path found contains enough activities to invoke the Central Limit Theorem. It follows that:

E(CP ) = Σ_{j∈CP} E(Tj )    var(CP ) = Σ_{j∈CP} var(Tj )

Given the inputs in Figure 12 and the CPM results in Figure 8, one finds that

E(CP ) = E(TA ) + E(TC ) + E(TE ) + E(TG ) + E(TH ) = 2 + 2 + 4 + 5 + 2 = 15

var(CP ) = var(TA ) + var(TC ) + var(TE ) + var(TG ) + var(TH ) = 0.11 + 0.11 + 1.00 + 1.78 + 0.11 = 3.11

The standard deviation is given by dev(CP ) = √var(CP ) = 1.76 weeks.

PERT allows us to answer questions such as: What is the probability that the project will be completed

within 16 weeks? By relying on the CPM results, we are implicitly considering that the path A-C-E-G-H

remains critical no matter the specific realization of the activity times.

Given the normal distribution assumption, in general we have

   
P (CP ≤ X) = P ((CP − E(CP ))/dev(CP ) ≤ (X − E(CP ))/dev(CP )) = P (Z ≤ (X − E(CP ))/dev(CP ))

where Z is a standardized normal random variable with mean 0 and variance 1. The cumulative distribution

function for a normal random variable is typically tabulated.

In our case, we set

   
P (CP ≤ 16) = P ((CP − 15)/1.76 ≤ (16 − 15)/1.76) = P (Z ≤ 0.57) = 0.7156

that is, in 71.56% of the cases we will finish the project in 16 weeks.
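The same computation can be scripted using the error function to express the standard normal CDF. The defaults 15 and 3.11 are the E(CP) and var(CP) computed above; math.erf gives ≈0.7147 rather than 0.7156 because z is not rounded to 0.57 as in a table lookup:

```python
# Probability that the project finishes within x weeks, assuming a
# normally distributed completion time CP.
import math

def completion_prob(x, mean=15.0, var=3.11):
    z = (x - mean) / math.sqrt(var)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))   # standard normal CDF

print(round(completion_prob(16), 4))   # 0.7147
```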

LP to find the project completion time

Let us define G = (N, A) an AON network. Let moreover dj be the duration of activity j ∈ N . We introduce the decision

variable xj to model the starting time of activity j ∈ N . The project completion time can be computed by

means of the formulation:

min xend

s.t. xj ≥ xi + di ∀(i, j) ∈ A

xstart = 0

xj ≥ 0 ∀j ∈ N

The inequality ensures that activity j starts no earlier than the completion time xi + di of each predecessor

i such that (i, j) ∈ A.

LP to find the minimum cost crashing

It may occur that the project completion time exceeds the due date D if this was set too tight. In these

cases, one needs to reduce the project duration below its normal value by accomplishing some activities in

a shorter time at a larger cost. This is generally called crashing a project. Taken together, the shortened

activity durations make it possible to finish the project by the due date at the minimum total cost of crashing.

Linear programming can often be used to determine the allocation of resources that minimizes the cost of

meeting the project deadline.

Let cj be the cost for reducing the duration of activity j by one unit and uj the maximum acceptable

duration reduction of j. Along with the above-mentioned xj variables, we further introduce yj to count the

number of units of time reduction selected for activity j.

min Σ_{j∈N} cj yj

s.t. yj ≤ uj ∀j ∈ N

xj ≥ xi + di − yi ∀(i, j) ∈ A

xstart = 0

xend ≤ D

xj ≥ 0 ∀j ∈ N

yj ≥ 0 integer ∀j ∈ N

The objective function minimizes the total crashing cost. The first constraint bounds the maximum reduction

of duration for each activity. The second inequality now takes into account the duration reduction by

considering yj . Finally, due date fulfillment is enforced by imposing that xend cannot exceed D.

In case crashing can be performed with a limited budget availability K, the model is modified as follows:

min xend

s.t. Σ_{j∈N} cj yj ≤ K

yj ≤ uj ∀j ∈ N

xj ≥ xi + di − yi ∀(i, j) ∈ A

xstart = 0

xj ≥ 0 ∀j ∈ N

yj ≥ 0 integer ∀j ∈ N

The model computes the least possible project duration with a budget of K units.

Note that by modifying the activity durations, the critical path can be altered.
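Lacking an LP solver at hand, the crashing model can be checked by brute force on a tiny project. The two-activity serial network below and all its data are invented for illustration; for a serial project the completion time is simply the sum of the (crashed) durations:

```python
# Brute-force check of the minimum-cost crashing model on a
# hypothetical serial project A -> B.
from itertools import product

dur = {'A': 5, 'B': 4}     # normal durations
cost = {'A': 2, 'B': 3}    # cost c_j per unit of reduction
umax = {'A': 2, 'B': 2}    # maximum reduction u_j per activity
D = 6                      # due date

best = None
for yA, yB in product(range(umax['A'] + 1), range(umax['B'] + 1)):
    finish = (dur['A'] - yA) + (dur['B'] - yB)   # serial project length
    if finish <= D:                              # due-date constraint
        c = cost['A'] * yA + cost['B'] * yB      # total crashing cost
        if best is None or c < best[0]:
            best = (c, yA, yB)

print(best)   # (7, 2, 1): crash A by 2 units and B by 1 unit at cost 7
```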

Advantages and limits of CPM/PERT

The main advantages of CPM/PERT techniques are:

• Especially useful when scheduling and controlling large projects

• Straightforward concept and not mathematically complex

• Network depiction help highlight relationships among project activities

• Critical path and slack time analyses help pinpoint activities that need to be closely watched

• Project documentation and graphics point out who is responsible for various activities

• Applicable to a wide variety of projects

• Useful in monitoring not only schedules but costs as well

For what concerns limitations:

• Project activities have to be clearly defined, independent, and stable in their relationships

• Precedence relationships must be specified and networked together

• Time estimates tend to be subjective and are subject to fudging by managers

• There is an inherent danger of too much emphasis being placed on the longest, or critical, path

• For PERT, the assumption that the activity durations are independent is difficult to justify. The same

happens for the assumption that the critical path found by CPM will always be the critical path.

Finally, activity durations may not follow a beta distribution.

Lecture 9

Dynamic Programming

Dynamic programming is a technique that can be used to solve many optimization problems. The main idea

is to transform a complex optimization problem into a sequence of smaller and more tractable ones (divide

and conquer). In particular, the problem is divided into a sequence of stages, and states are associated with

each stage. Dynamic programming is based on recursion and the principle of optimality, and works in either the

forward or the backward direction. Due to its generality, it is widely applied to several optimization problems.

The elements that characterize dynamic programming applications are

I. The problem can be divided into stages (sub-parts of the problem) with a decision required at each

stage

II. Each stage has a number of states associated with it. By state, we mean the current condition of the

system. This is the information that is needed at any stage to build a decision and possibly make an

optimal decision.

III. The decision chosen at any stage describes how the state at the current stage is transformed into the

state at the next stage.

IV. Given the current state, the optimal decision for each of the remaining stages must not depend on

previously reached states or previously chosen decisions. This idea is known as the principle of

optimality.

V. If the states for the problem have been classified into one of T stages, there must be a recursion

that relates the cost or reward earned during stages t = 1, . . . , T to the cost or reward earned from

stages t + 1, . . . , T . In essence, the recursion formalizes the working-backward procedure. A specular

definition holds for the working-forward procedure.

Example 52 (Resource allocation problem). A company has $6000 to invest and 3 investments are available.

If dj thousands of dollars are invested in investment j, then a net present value (in thousands) of rj (dj ) is

obtained:

r1 (d1 ) = 7d1 + 2 if d1 > 0, and r1 (d1 ) = 0 if d1 = 0,

r2 (d2 ) = 3d2 + 7 if d2 > 0, and r2 (d2 ) = 0 if d2 = 0,

r3 (d3 ) = 4d3 + 5 if d3 > 0, and r3 (d3 ) = 0 if d3 = 0.

The amount placed in each investment must be an integer multiple of $1000. How should the company

allocate $6000 to maximize the net present value?

max r1 (x1 ) + r2 (x2 ) + r3 (x3 )

s.t. x1 + x2 + x3 = 6

x1 , x2 , x3 ≥ 0 integer

where xj is the amount of thousands of dollars invested in j. However, rj (dj ) are not linear functions!

To build our dynamic programming approach, firstly we recognize stages:

The stage should be chosen so that when one stage remains the problem is easy to solve. Thus, each

investment represents a stage and, in general, investment 3 depends on the investment decisions for 2 and 1 that

we previously set. At each stage, we can invest an amount d of dollars. This is our decision. Due to the

problem specifications, d ∈ {0, 1, 2, 3, 4, 5, 6} and these are the possible states at any stage.

We define

fk (d) = max net present value by investing d dollars in investments (stages) k, k + 1, k + 2, . . . , n.

xk (d) = amount to invest in k if d dollars are available for alternatives k, k + 1, . . . , n

A naive dynamic programming scheme should evaluate fk (d) for each state d at each stage k. But simplifications

arise as we will see later.

If we consider f3 (d), this is the max net present value obtained by investing d in 3. Then at stage 3,

we need to evaluate f3 (0), f3 (1), f3 (2), f3 (3), f3 (4), f3 (5), f3 (6). This can be trivially done by f3 (d) = r3 (d)

because 3 is the starting stage of the recursion (n = 3 in our problem). Moreover, x3 (d) = d as all the available

d dollars are invested in 3. We then obtain:

f3 (0) = 0 x3 (0) = 0

f3 (1) = 9 x3 (1) = 1

f3 (2) = 13 x3 (2) = 2

f3 (3) = 17 x3 (3) = 3

f3 (4) = 21 x3 (4) = 4

f3 (5) = 25 x3 (5) = 5

f3 (6) = 29 x3 (6) = 6

In general terms, fn (d) for backward computation gives the base case or boundary conditions of the

dynamic programming and indicates the beginning of the recursion. In case of forward recursion, clearly

f1 (d) are the boundary conditions.

We move to investment 2 by following the backward procedure and compute f2 (d) for each d value by

the dynamic programming recursion:

fk (d) = max_{0≤xk ≤d} {rk (xk ) + fk+1 (d − xk )}

The equation embodies the principle of optimality: the optimal solution at stage k uses the optimal decisions of

later stages. If an amount d of dollars is available and we decide to invest xk of it in k, the net present value

will be rk (xk ) + fk+1 (d − xk ): rk (xk ) are given by investment k, whereas fk+1 (d − xk ) is the net present

value from investing d − xk on k + 1, . . . , n.

We have that n − 1 = 2, thus

f2 (0) = max_{x2 =0} {r2 (x2 ) + f3 (0 − x2 )} = 0, x2 (0) = 0

f2 (1) = max_{x2 ∈{0,1}} {r2 (x2 ) + f3 (1 − x2 )} = max{r2 (0) + f3 (1), r2 (1) + f3 (0)} = max{9, 10} = 10, x2 (1) = 1

f2 (2) = max_{x2 ∈{0,1,2}} {r2 (x2 ) + f3 (2 − x2 )} = max{r2 (0) + f3 (2), r2 (1) + f3 (1), r2 (2) + f3 (0)} = max{13, 19, 13} = 19, x2 (2) = 1

f2 (3) = 23, x2 (3) = 1

f2 (4) = 27, x2 (4) = 1

f2 (5) = 31, x2 (5) = 1

f2 (6) = 35, x2 (6) = 1

As example, consider f2 (2). The recursion formula evaluates the best that can be achieved in case we

have only $2000 to invest in 2 and 3 (which is a later stage). Either we invest $0 in 2 and we look at

what we can get with $2000 at stage 3 (r2 (0) + f3 (2)), or we invest $1000 in 2 and leave $1000 for stage 3

(r2 (1) + f3 (1)), or finally we invest the full amount of $2000 in 2 and do not invest in stage 3 (r2 (2) + f3 (0)).

The maximum net present value that we can get among these three alternatives is chosen as f2 (2).

Note that x2 (d) = j if the best f2 (d) value is achieved in the recursion for x2 = j. It is used to keep

track of the best solution.

Finally, it is the turn of f1 (d). We have that d = 6 is the total amount of dollars available and f1 (6)

gives the max net present value by investing $6000 among 1, 2 and 3. So, we can reduce the computation as

there is no need to evaluate f1 (0), . . . , f1 (5) (we want to spend all the available budget) and f1 (6) identifies

the optimal value. In any dynamic programming procedure there exists at least one state that codifies the

optimal value for the problem.

f1 (6) = max_{0≤x1 ≤6, int} {r1 (x1 ) + f2 (6 − x1 )} = max{r1 (0) + f2 (6), r1 (1) + f2 (5), r1 (2) + f2 (4), r1 (3) + f2 (3),

r1 (4) + f2 (2), r1 (5) + f2 (1), r1 (6) + f2 (0)} = max{35, 40, 43, 46, 49, 47, 44} = 49, x1 (6) = 4

Since x1 (6) = 4, we invest $4000 in investment 1. This leaves 6000-4000=$2000 for investments 2 and 3.

Hence, we invest x2 (2) = $1000 in investment 2. Then $1000 is left for investment 3, so we choose to invest

x3 (1) = $1000 in investment 3. Therefore, the maximum net present value is f1 (6) = $49000 by investing

$4000 in investment 1, $1000 in investment 2, and $1000 in investment 3.

Note that finding the solution is equivalent to finding the longest route from node (1, 6) to the auxiliary

node (4, 0) in Figure 13.

Figure 13

Each node corresponds to a state, and each column of nodes corresponds to all the possible states

associated with a given stage. For example, if we are at node (2, 3), then we have $3000 available for

investments 2 and 3. Each arc in the network represents the way in which a decision (how much to invest

on the current investment) transforms the current state into next investment’s state. In general, this is an

arc connecting (t, d) and (t + 1, d − x) mimicking the recursive formula for an investment of x at stage t.

For example, the arc joining nodes (2, 3) and (3, 1) corresponds to investing $2000 in 2, leaving $1000

available for investment 3. Hence, the length of each arc is rt (x) corresponding to the net present value

obtained by investing x thousand dollars in investment t (in the above example, the length is r2 (2) = 13).

Note that not all pairs of nodes in adjacent stages are joined by arcs. This happens when it is not possible

to go from a certain state (e.g. (2,3)) to another state (e.g. (3,4), which would require increasing the dollars

available after investing in 2).
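The whole backward recursion of Example 52 can be sketched compactly; amounts are in thousands of dollars:

```python
# Backward DP for Example 52: f_k(d) = max over x_k of r_k(x_k) + f_{k+1}(d - x_k).
r = [lambda d: 7 * d + 2 if d > 0 else 0,    # r1
     lambda d: 3 * d + 7 if d > 0 else 0,    # r2
     lambda d: 4 * d + 5 if d > 0 else 0]    # r3

f = [[0] * 7 for _ in range(3)]   # f[k][d], d = 0..6 thousand dollars
x = [[0] * 7 for _ in range(3)]   # x[k][d] = best amount to invest at stage k

for d in range(7):                # boundary: f3(d) = r3(d), invest everything
    f[2][d], x[2][d] = r[2](d), d
for k in (1, 0):                  # stages 2 and 1, working backward
    for d in range(7):
        f[k][d], x[k][d] = max((r[k](xk) + f[k + 1][d - xk], xk)
                               for xk in range(d + 1))

print(f[0][6])    # 49, the maximum net present value
d, plan = 6, []
for k in range(3):                # rebuild the optimal investment plan
    plan.append(x[k][d])
    d -= x[k][d]
print(plan)       # [4, 1, 1]
```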

Binary Knapsack Problem

Let us consider the 0-1 binary knapsack formulation:

max Σ_{j=1..n} cj xj

s.t. Σ_{j=1..n} aj xj ≤ b

xj ∈ {0, 1} ∀j = 1, . . . , n

A dynamic programming scheme aims at computing fk (d) as the maximum value of the subset of items

1, . . . , k ≤ n (stages) inserted within a remaining capacity 0 ≤ d ≤ b (states). At each stage k, we need to

decide if item k is inserted in the knapsack or not. This time, we use forward recursion. Given that our

original instance has n items and a knapsack capacity b, the optimal value is fn (b). Boundary conditions

are given by

f1 (d) = c1 if a1 ≤ d, and f1 (d) = 0 otherwise,

which specifies the solution value for the first item j = 1 and for each knapsack capacity d = 1, . . . , b.

We indicate with fk (d) the maximum profit attainable at stage k with a capacity equal to d and

following an optimal policy in stages 1, . . . , k − 1, k.

By exploiting the principle of optimality, the dynamic programming recursion for k = 2, . . . , n and

d = 0, . . . , b is

fk (d) = fk−1 (d) if ak > d,

fk (d) = max{fk−1 (d), ck + fk−1 (d − ak )} if ak ≤ d.

The former case is verified when item k does not fit in the remaining capacity d of the knapsack. The latter is

checked when k fits: including k provides a profit of ck to the solution that is added to the optimal solution

value fk−1 (d − ak ) of the knapsack subproblem with capacity d − ak and item subset 1, . . . , k − 1. This

is chosen only if the total solution value is better than the optimal value fk−1 (d) of the subproblem with

capacity d and item subset 1, . . . , k − 1. The above cases can be compactly rewritten as

fk (d) = max_{xk : ak xk ≤ d} {ck xk + fk−1 (d − ak xk )}

To trace the optimal solution, xk (d) = 0 or xk (d) = 1 are alternatively set to indicate the exclusion/selection

of item k from the optimal solution of subproblem with capacity d and items 1, . . . , k.

Example 53 (0-1 knapsack problem).

max 16x1 + 19x2 + 23x3 + 28x4

s.t. 2x1 + 3x2 + 4x3 + 5x4 ≤ 7

xj ∈ {0, 1} ∀j = 1, . . . , 4

We firstly set boundary conditions:

f1 (d) = 0 if d = 0, 1, with x1 (0) = x1 (1) = 0,

f1 (d) = 16 if d ≥ 2, with x1 (2) = · · · = x1 (7) = 1.

Then, according to the dynamic programming recursion:

f2 (d) = f1 (d) = 0 if d = 0, 1, x2 (0) = x2 (1) = 0

f2 (2) = f1 (2) = 16, x2 (2) = 0

f2 (3) = max{f1 (3), 19 + f1 (0)} = max{16, 19} = 19, x2 (3) = 1

f2 (4) = max{f1 (4), 19 + f1 (1)} = max{16, 19} = 19, x2 (4) = 1

f2 (5) = max{f1 (5), 19 + f1 (2)} = max{16, 19 + 16} = 35, x2 (5) = 1

f2 (6) = max{f1 (6), 19 + f1 (3)} = max{16, 19 + 16} = 35, x2 (6) = 1

f2 (7) = max{f1 (7), 19 + f1 (4)} = max{16, 19 + 16} = 35, x2 (7) = 1

For capacity d = 0, 1, neither of items 1 and 2 fits. When d = 2, item 2 does not fit, but from f1 (2) we know

that item 1 is packed in the knapsack. For d = 3, 4 the two items do not fit together, so the most valuable one

is selected, that is item 2. For d = 5, 6, 7 both items fit in the knapsack as the capacity left by item 2 is

respectively 2, 3 and 4, which is enough to host also item 1 as testified by f1 (2), f1 (3), f1 (4) > 0.

Moving to item 3, the recursion provides the following results:

f3 (d) = f2 (d) if d = 0, 1, 2, 3, x3 (0) = · · · = x3 (3) = 0

f3 (4) = max{f2 (4), 23 + f2 (0)} = max{19, 23} = 23, x3 (4) = 1

f3 (5) = max{f2 (5), 23 + f2 (1)} = max{35, 23} = 35, x3 (5) = 0

f3 (6) = max{f2 (6), 23 + f2 (2)} = max{35, 23 + 16} = 39, x3 (6) = 1

f3 (7) = max{f2 (7), 23 + f2 (3)} = max{35, 23 + 19} = 42, x3 (7) = 1

Finally, the instance of 0-1 knapsack has a capacity of b = 7 so at the final item 4 it is sufficient to

compute

f4 (7) = max{f3 (7), 28 + f3 (2)} = max{42, 28 + 16} = 44 x4 (7) = 1

To rebuild the optimal solution, x4 (7) = 1 thus leaving a capacity of 2 to be evaluated at stage 3. Here,

x3 (2) = 0 and f3 (2) = f2 (2). At stage 2, x2 (2) = 0 and f2 (2) = f1 (2). Finally, x1 (2) = 1. The optimal

solution of value 44 is x1 = x4 = 1, x2 = x3 = 0.

Finding the solution of the 0-1 knapsack is equivalent to finding the longest route from node s =(7, 1)

to the auxiliary node t in Figure 14.

Figure 14

The network is a grid with a row per capacity unit d = 0, . . . , b and n + 1 columns. At node (5,3) we are at

subproblem with capacity 5, items 3 and 4 left to be possibly packed. There is an arc connecting (d, k) and

(d − ak , k + 1) mimicking the recursive formula for selecting the item k and reducing the available capacity by

ak . From (5,3), if item 3 is selected, the arc brings to (1,4) as a3 = 4. Moreover, arcs (d, k) − (d, k + 1) are set

to represent the exclusion of item k from the solution. The length of an arc between (d, k) and (d − ak , k + 1)

corresponds to the value ck gained by selecting item k (23 in the above example on item 3), whereas it is 0

for arcs (d, k) − (d, k + 1). Arcs are missing whenever the transition between states is not feasible according

to the knapsack capacity or item occupations.
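The forward recursion for the 0-1 knapsack of Example 53 can be sketched as:

```python
# Forward DP for the 0-1 knapsack of Example 53:
# f_k(d) = max(f_{k-1}(d), c_k + f_{k-1}(d - a_k)) when item k fits.
c = [16, 19, 23, 28]    # profits
a = [2, 3, 4, 5]        # weights
b = 7                   # capacity

f = [[0] * (b + 1) for _ in range(len(c))]
for d in range(b + 1):                      # boundary conditions on item 1
    f[0][d] = c[0] if a[0] <= d else 0
for k in range(1, len(c)):
    for d in range(b + 1):
        f[k][d] = f[k - 1][d]               # exclude item k
        if a[k] <= d:                       # include item k if it fits
            f[k][d] = max(f[k][d], c[k] + f[k - 1][d - a[k]])

print(f[3][7])   # 44, the optimal value
```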

General Knapsack Problem

Let us consider the integer knapsack formulation:

max Σ_{j=1..n} cj xj

s.t. Σ_{j=1..n} aj xj ≤ b

xj ≥ 0 integer ∀j = 1, . . . , n

One way to work follows the previous framework but, when computing fk (d) (item k, capacity d), one

gets a profit equal to ck · ⌊d/ak⌋, where the rounded-down term is the maximum number of copies of k that fit

in the current knapsack capacity.

Let us now change perspective. While in Example 53 stages were associated with items and states were

derived according to the knapsack capacity, we now proceed the other way around. The approach we now

discuss builds up the optimal knapsack by first determining how to fill a small knapsack optimally and then,

using this information, how to fill a larger knapsack optimally. The recursion is the following

g(w) = max_{j: aj ≤ w} {cj + g(w − aj )}   ∀w : min_{j=1,...,n} {aj } ≤ w ≤ b

The idea is that to fill a knapsack of capacity w optimally, we try to fill it with all the items j that fit

within. Having selected an item j, the best value that can be obtained is cj plus the best that can be done on a knapsack

with capacity w − aj . We indicate with x(w) = j if the item j is chosen in the best solution at g(w) and

x(w) = 0 if none of the items is selected. The optimal value of the problem is given by g(b). Finally,

boundary conditions are

g(w) = 0, x(w) = 0, for 0 ≤ w < min_{j=1,...,n} {aj }

Example 54 (General knapsack problem).

max 11x1 + 7x2 + 12x3

s.t. 4x1 + 3x2 + 5x3 ≤ 10

xj ≥ 0 integer ∀j = 1, . . . , 3

Firstly, we see that min_{j=1,...,n} {aj } = 3. Hence, boundary conditions are g(0) = g(1) = g(2) = 0 as none of the

items fit within a knapsack of capacity 0, 1, or 2.

For w = 3, only item 2 can be inserted, so g(3) = 7 and x(3) = 2.

Then, for w = 4

g(4) = max{11 + g(0) = 11 if j = 1, 7 + g(1) = 7 if j = 2}

thus, the optimal solution for the general knapsack problem with capacity 4 is g(4) = 11 and x(4) = 1.

Continuing with w = 5, we get

g(5) = max{11 + g(1) = 11 if j = 1, 7 + g(2) = 7 if j = 2, 12 + g(0) = 12 if j = 3}

thus, g(5) = 12 and x(5) = 3.

For w = 6, we obtain

g(6) = max{11 + g(2) = 11 if j = 1, 7 + g(3) = 7 + 7 = 14 if j = 2, 12 + g(1) = 12 if j = 3}

thus, g(6) = 14 and x(6) = 2. In the subsequent cases (* is used to identify optimal states):


g(7) = max{11 + g(3) = 11 + 7 = 18∗ if j = 1, 7 + g(4) = 7 + 11 = 18∗ if j = 2, 12 + g(2) = 12 if j = 3}

g(8) = max{11 + g(4) = 22∗ if j = 1, 7 + g(5) = 19 if j = 2, 12 + g(3) = 19 if j = 3}

g(9) = max{11 + g(5) = 23∗ if j = 1, 7 + g(6) = 21 if j = 2, 12 + g(4) = 23∗ if j = 3}

g(10) = max{11 + g(6) = 25∗ if j = 1, 7 + g(7) = 25∗ if j = 2, 12 + g(5) = 24 if j = 3}

The optimal solution for the original instance of the general knapsack has value g(10) = 25, where we may

arbitrarily choose x(10) = 1 or x(10) = 2. Assume we pick x(10) = 1, then x(10 − 4) = x(6) = 2 at g(6).

Furthermore, x(6 − 3) = x(3) = 2 at g(3), which leads us to x(3 − 3) = x(0) = 0. The optimal solution is

then made by x1 = 1, x2 = 2, x3 = 0 of value 25.
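A sketch of the capacity-indexed recursion on Example 54, including the solution rebuilding step:

```python
# DP over capacities for the general (integer) knapsack of Example 54:
# g(w) = max over j with a_j <= w of c_j + g(w - a_j).
c = [11, 7, 12]   # profits
a = [4, 3, 5]     # weights
b = 10            # capacity

g = [0] * (b + 1)        # boundary: g(w) = 0 below the minimum weight
x = [0] * (b + 1)        # x[w] = 1-based item chosen at g(w), 0 if none
for w in range(1, b + 1):
    for j in range(3):
        if a[j] <= w and c[j] + g[w - a[j]] > g[w]:
            g[w], x[w] = c[j] + g[w - a[j]], j + 1

print(g[10])   # 25, the optimal value

w, count = b, [0, 0, 0]  # rebuild the optimal solution
while x[w]:
    j = x[w] - 1
    count[j] += 1
    w -= a[j]
print(count)   # [1, 2, 0]: x1 = 1, x2 = 2, x3 = 0
```

Ties (as in g(7) and g(10)) are broken here in favor of the smallest item index, which matches the choice x(10) = 1 made in the text.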

Shortest Path Problem

Shortest path algorithms exploit relationships among optimal paths (and path lengths) for different pairs of

nodes. We consider the single source shortest path problem. Recall from the previous lecture the functional

equations, now specified for the minimization case:

v[s] = 0

v[j] = min_{i:(i,j)∈A} {v[i] + cij }   ∀j ∈ N \ {s}

d[j] = preceding node on the shortest path from s to j

We refer to v[j] as the length of the shortest s − j path. The first equality is the boundary condition, that

is the length of the shortest path v[s] is initially 0. Then, at stage (node) j all its predecessors are evaluated

in terms of path length v[i] + cij , where cij is the length of arc (i, j). If no such arc exists, v[j] = ∞ to flag

the infeasibility. Among all these measures, the shortest one is selected. Finally, the predecessor of j in the

shortest path is stored in d[j].

Functional equations of a dynamic program encode the recursive connections among optimal solution

values that are implied by a principle of optimality: a shortest path can be constructed by extending

known shortest subpaths. In a graph with no negative directed cycles, optimal paths must have optimal

subpaths.

Example 55 (Single source shortest path). Consider the network G = (N, A) depicted in the figure below

and compute the single source shortest path optimal solution from node s = 1.

By applying functional equations

v[1] = 0

v[2] = min{v[1] + 3, v[3] + 8} = 3 d[2] = 1

v[3] = min{v[1] + 15, v[2] + 8, v[4] + 5, v[5] + 11} =? d[3] =?

v[4] = min{v[2] + 6, v[3] + 5, v[5] + 10} =? d[4] =?

v[5] = ∞

The ambiguities that emerge in some stages due to circular dependencies (e.g. v[3] calling v[4] and v[4]

depending on v[3]) make clear that it is not possible to solve the functional equations by one-pass evaluations

108
when there are directed cycles in the network. For this reason, a repeated evaluation scheme must be

devised.

Bellman-Ford Algorithm

The algorithm makes multiple evaluations (up to |N|) of each v[j], using the results from the preceding

iteration, and stops when the results no longer change.

v^t[j] : value of v[j] obtained at the t-th iteration

d[j] : node preceding j in the best known path from s to j

Algorithm 5 Bellman-Ford Algorithm

1: Initialize v^0[j] ← 0 if j = s, +∞ otherwise;  d[s] ← 0
2: t ← 1
3: while t ≤ |N| do
4:   for j ∈ N do
5:     if there is no incoming arc (i, j) ∈ A then v^t[j] ← v^{t−1}[j]
6:     else v^t[j] ← min_{i:(i,j)∈A} {v^{t−1}[i] + c_ij}        ▷ update distances
7:     end if
8:     if v^t[j] < v^{t−1}[j] then
9:       d[j] ← argmin_{i:(i,j)∈A} {v^{t−1}[i] + c_ij}          ▷ update closest preceding node
10:      if t = |N| then                                        ▷ detecting negative dicycle
11:        findNegativeDicycle()
12:        return                                               ▷ −∞ optimal value
13:      end if
14:    end if
15:  end for
16:  t ← t + 1
17: end while

Step 1 initializes the shortest path lengths v^0[j] to 0 for the source s and +∞ for the other nodes. The

iteration counter is set to 1 at step 2. Up to |N| times, the algorithm revises the values v^t[j] of each node

and picks the minimum (step 6) among the incoming neighbors of j (i.e., the nodes i for which arc (i, j)

exists). If no such neighbor exists, v^t[j] simply keeps the previous value v^{t−1}[j] (step 5). Whenever v^t[j]

improves on the previous iteration value v^{t−1}[j], d[j] is updated as well (step 9). The last iteration t = |N|

is used to detect a negative cost directed cycle (i.e., a dicycle): if there exists a node j whose shortest path

distance is still updated (v^{|N|}[j] < v^{|N|−1}[j]), then a negative dicycle necessarily exists. One can implement

a routine findNegativeDicycle() to identify it; the optimal value diverges to −∞. Thus, the algorithm works on

networks with negative arc costs, but in the presence of negative dicycles, although able to detect them, it

diverges. If no modification occurs at iteration |N|, v^{|N|}[j] contains the length of the shortest directed s − j

path, which can be rebuilt by exploiting d[j].

If a whole iteration passes without changing any node's value, no further changes can occur, so the algorithm

may stop early; this additional termination test can be added to the pseudocode. Remark that in a graph with

n nodes, a path can contain at most n − 1 arcs. Hence, the shortest path solution must be computed within

n − 1 iterations. For this reason, the n-th iteration serves as an extra pass to detect negative dicycles (shortest

path lengths still decreasing when they should have stabilized).
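The repeated evaluation scheme can be sketched in code. Below is a minimal Python sketch (the function name and the arc-dictionary representation are my own, not from the notes): each pass recomputes every label from the previous pass's values, stops early once nothing changes, and reports a negative dicycle if the |N|-th pass still improves some label.

```python
import math

def bellman_ford(n, arcs, s):
    """Single-source shortest paths by repeated evaluation.

    n: number of nodes, labelled 0..n-1
    arcs: dict mapping (i, j) -> arc cost c_ij
    s: source node
    Returns (v, d): shortest path lengths and preceding nodes.
    Raises ValueError if a negative dicycle is detected.
    """
    v = [math.inf] * n
    v[s] = 0
    d = [None] * n
    for t in range(1, n + 1):              # up to |N| iterations
        new_v = v[:]
        changed = False
        for (i, j), c in arcs.items():     # compute v^t from v^{t-1}
            if v[i] + c < new_v[j]:
                new_v[j] = v[i] + c
                d[j] = i                   # update closest preceding node
                changed = True
        v = new_v
        if not changed:                    # a full pass with no update: done
            break
        if t == n:                         # |N|-th pass still improving
            raise ValueError("negative dicycle: optimal value is -inf")
    return v, d
```

On the network of Example 56 (with nodes relabelled 0-3), this returns v = [0, 5, -5, inf] with preceding nodes 0 for node 1 and 1 for node 2, matching the hand computation below.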

Example 56 (Bellman-Ford algorithm). Consider the network in figure below and apply the Bellman-Ford

algorithm.

Firstly, initialize v^0[1] = 0, v^0[2] = v^0[3] = v^0[4] = +∞ and d[1] = 0 (no predecessor).

Set t = 1 and compute:

v^1[1] = 0

v^1[2] = min{v^0[1] + 5, v^0[4] + 2} = min{5, +∞} = 5    update d[2] = 1

v^1[3] = min{v^0[1] + 8, v^0[2] − 10, v^0[4] + 3} = min{8, +∞, +∞} = 8    update d[3] = 1

v^1[4] = +∞ (no incoming arcs)

Increase t = 2 and get:

v^2[1] = 0

v^2[2] = min{v^1[1] + 5, v^1[4] + 2} = min{5, +∞} = 5    no update

v^2[3] = min{v^1[1] + 8, v^1[2] − 10, v^1[4] + 3} = min{8, −5, +∞} = −5    update d[3] = 2

v^2[4] = +∞

Increase t = 3 and get:

v^3[1] = 0

v^3[2] = min{v^2[1] + 5, v^2[4] + 2} = min{5, +∞} = 5    no update

v^3[3] = min{v^2[1] + 8, v^2[2] − 10, v^2[4] + 3} = min{8, −5, +∞} = −5    no update

v^3[4] = +∞

No value has been updated, so we can terminate. Path 1-2 is optimal to reach node 2 at cost v^3[2] = 5, as d[2] = 1.

Path 1-2-3 is optimal to reach node 3 at cost v^3[3] = −5 and is rebuilt starting from d[3] = 2. There exists no path

to reach node 4, as v^3[4] = +∞.

Example 57 (Bellman-Ford algorithm: negative dicycle). The network in figure below exhibits a negative

dicycle, which is a cycle whose arcs total cost sum to a negative quantity.

We run the Bellman-Ford algorithm by initializing v^0[1] = 0 (1 is the source node), v^0[2] = v^0[3] = v^0[4] = +∞

and d[1] = 0 (no predecessor).

Set t = 1 and compute:

v^1[1] = 0

v^1[2] = min{v^0[1] + 2, v^0[4] − 25} = min{2, +∞} = 2    update d[2] = 1

v^1[3] = min{v^0[1] + 10, v^0[2] + 3} = min{10, +∞} = 10    update d[3] = 1

v^1[4] = v^0[3] + 12 = +∞

Increase t = 2 and calculate:

v^2[1] = 0

v^2[2] = min{v^1[1] + 2, v^1[4] − 25} = min{2, +∞} = 2    no update

v^2[3] = min{v^1[1] + 10, v^1[2] + 3} = min{10, 5} = 5    update d[3] = 2

v^2[4] = v^1[3] + 12 = 22    update d[4] = 3

Increase t = 3 and get:

v^3[1] = 0

v^3[2] = min{v^2[1] + 2, v^2[4] − 25} = min{2, −3} = −3    update d[2] = 4

v^3[3] = min{v^2[1] + 10, v^2[2] + 3} = min{10, 5} = 5    no update

v^3[4] = v^2[3] + 12 = 17    update d[4] = 3

Run the final iteration t = 4 = |N| and obtain:

v^4[1] = 0

v^4[2] = min{v^3[1] + 2, v^3[4] − 25} = min{2, −8} = −8    update d[2] = 4

v^4[3] = min{v^3[1] + 10, v^3[2] + 3} = min{10, 0} = 0    update d[3] = 2

v^4[4] = v^3[3] + 12 = 17    no update

Given that we updated shortest path lengths for nodes 2 and 3 in the last iteration, there exists a negative

cost directed cycle.

Example 58 (Bellman-Ford algorithm: example). Find the single source shortest path optimal solution

with s = 1 in the following network.

The optimal paths are highlighted by bold red arrows in the next figure.

Path 1-2 of cost 2.

Path 1-2-3 of cost 3.

Path 1-2-4 of cost 6.

Path 1-2-5 of cost 4.

Path 1-2-5-6 of cost 6.

Note that the solution composes a shortest path tree.

Minimax path problem

The minimax path problem asks for an s − t path that minimizes the maximum arc cost used along the path.

Its mirror image is the maximin path problem. In this discussion, we assume a directed and acyclic network

G = (N, A). In such a network, the nodes can be partitioned into subsets, each corresponding to a stage of

the dynamic programming algorithm (see the figure in Example 58 to get the idea). If t belongs to stage m,

the nodes in stage k are m − k arcs away from t.

A dynamic programming scheme computes fk(i): the smallest possible maximum arc cost over all i − t paths

starting from node i at stage k (that is, the cost of following an optimal policy in stages k, k + 1, . . . , m). We

use backward recursion. Given that our original instance asks for an s − t path, the optimal value is f1(s).

Boundary conditions are given for the nodes directly connected to t; that is, if t is at stage m, boundary

conditions are set for the nodes at stage m − 1:

fm−1(i) = cit,    d(i) = t,

which gives the trivial optimal solution for the nodes adjacent to t.

By exploiting the principle of optimality, the dynamic programming recursion for k = m − 2, . . . , 1 and

nodes i in stage k is

fk(i) = min_{j:(i,j)∈A} {max[cij, fk+1(j)]}

where j ranges over the nodes for which arc (i, j) exists. For a fixed j, the recursion takes as fk(i) the larger

of the cost of arc (i, j) and the highest cost already identified among the arcs of the j − t path (the value

fk+1(j)). As usual, we use d(i) to record the node following i on the path (its predecessor in the backward

recursion), so that the optimal path can be rebuilt.
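The backward recursion can be implemented with memoization; on an acyclic network the stage index is implicit in the recursion, so stages need not be tracked explicitly. The following Python sketch (function and variable names are my own) computes fk(i) and the successor array d(i), then rebuilds one optimal path:

```python
import math

def minimax_path(arcs, s, t):
    """Minimax s-t path on a directed acyclic network.

    arcs: dict mapping (i, j) -> arc cost c_ij
    Returns (value, path): the minimum, over all s-t paths, of the
    maximum arc cost along the path, plus one optimal path.
    """
    succ = {}
    for (i, j), c in arcs.items():
        succ.setdefault(i, []).append((j, c))

    best, nxt = {}, {}                 # f(i) and d(i)

    def f(i):
        if i == t:
            return -math.inf           # empty tail: no arc cost yet
        if i in best:
            return best[i]
        val, arg = math.inf, None
        for j, c in succ.get(i, []):
            cand = max(c, f(j))        # highest arc on i -> j -> ... -> t
            if cand < val:
                val, arg = cand, j
        best[i], nxt[i] = val, arg
        return val

    value = f(s)
    path, node = [s], s                # follow d(i) forward to rebuild the path
    while node != t and nxt.get(node) is not None:
        node = nxt[node]
        path.append(node)
    return value, path
```

With arc costs read off Example 59 below (an assumption on my part, since the figure itself is not reproduced here), the call returns the value 8 and the path 1-3-5-8-10.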

Example 59 (Minimax path).

We want to find a 1 − 10 path in which the maximum altitude (given by the arc weights) is minimized.

Boundary conditions are f4(7) = 13, f4(8) = 8, f4(9) = 9 and d(7) = d(8) = d(9) = 10.

Consider stage m − 2 = 3. The dynamic programming formula gives us

f3(5) = min{ max{c57, f4(7)} = max{8, 13} = 13  (j = 7),
             max{c58, f4(8)} = max{7, 8} = 8    (j = 8),
             max{c59, f4(9)} = max{10, 9} = 10  (j = 9) }

thus f3(5) = 8 and d(5) = 8;

f3(6) = min{ max{c67, f4(7)} = max{8, 13} = 13  (j = 7),
             max{c68, f4(8)} = max{6, 8} = 8    (j = 8),
             max{c69, f4(9)} = max{7, 9} = 9    (j = 9) }

hence f3(6) = 8 and d(6) = 8.

Focusing on stage 2, the recursion gives

f2(2) = max{c25, f3(5)} = max{9, 8} = 9

and d(2) = 5;

f2(3) = max{c35, f3(5)} = max{7, 8} = 8

and d(3) = 5;

f2(4) = min{ max{c45, f3(5)} = max{11, 8} = 11  (j = 5),
             max{c46, f3(6)} = max{7, 8} = 8    (j = 6) }

hence f2(4) = 8 and d(4) = 6.

Finally, at stage 1,

f1(1) = min{ max{c12, f2(2)} = max{10, 9} = 10  (j = 2),
             max{c13, f2(3)} = max{7, 8} = 8    (j = 3),
             max{c14, f2(4)} = max{6, 8} = 8    (j = 4) }

thus the optimal value is f1(1) = 8, with d(1) alternatively set to 3 or 4. Assume that d(1) = 3. Then the

optimal path is 1-3-5-8-10, with a highest arc cost of 8.

The above method does not work on general (not acyclic) networks. In that case, the dynamic programming

function is modified into

f_k^i = the highest arc cost of the best path from node i to node t, among all the paths that use at most k arcs.

We define the boundary conditions as

f_1^i = c_it if (i, t) ∈ A,  ∞ otherwise,

and the dynamic programming recursion, for k = 1, . . . , n − 2 and nodes i ∈ N, as

f_{k+1}^i = min_{j:(i,j)∈A} {max[c_ij, f_k^j]}.

The optimal value is reached at f_{n−1}^s, where |N| = n. The support array d^i is employed with the same

codification.

The new recursion is based on repeated evaluations, similarly to the Bellman-Ford algorithm for the

classic shortest path problem.
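That repeated-evaluation recursion might be sketched as follows (a Python sketch with names of my own; nodes are labelled 0..n−1, and f[t] is set to −∞ so that max(c_it, f[t]) reduces to the boundary condition c_it):

```python
import math

def minimax_general(n, arcs, s, t):
    """Minimax s-t path on a general (possibly cyclic) directed network.

    Applies f_{k+1}[i] = min over arcs (i, j) of max(c_ij, f_k[j])
    for k = 1, ..., n-2, starting from f_1[i] = c_it (inf if no arc).
    Returns (f[s], d): optimal value and successor array d[i].
    """
    f = [arcs.get((i, t), math.inf) for i in range(n)]   # boundary f_1
    f[t] = -math.inf                                     # empty path at t itself
    d = [t if (i, t) in arcs else None for i in range(n)]
    for _ in range(n - 2):                               # k = 1, ..., n-2
        g = f[:]
        for (i, j), c in arcs.items():
            cand = max(c, f[j])                          # allow one more arc
            if cand < g[i]:
                g[i], d[i] = cand, j
        f = g
    return f[s], d
```

On a small cyclic network with arcs (0,1) of cost 5, (1,2) of 4, (2,0) of 1, (1,3) of 9 and (2,3) of 2, the recursion settles on the path 0-1-2-3 of minimax value 5 rather than 0-1-3 of value 9.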

Example 60 (Non-additive recursion).

Here is an example where we multiply terms to get the recursion.

A student is currently taking three courses. It is important that he not fail all of them. If the probability

of failing French is p1 , the probability of failing English is p2 , and the probability of failing Statistics is p3 ,

then the probability of failing all of them is p1 × p2 × p3 . He has left himself with four hours to study. How

should he minimize his probability of failing all his courses?

The following gives the probability of failing each course given he studies for a certain number of hours

on that subject.

Table 1: Student failure probabilities.

Hours French English Statistics

0 0.8 0.75 0.9

1 0.7 0.7 0.7

2 0.65 0.67 0.6

3 0.62 0.65 0.55

4 0.6 0.62 0.5

We let stage 1 correspond to studying French, stage 2 to English, and stage 3 to Statistics. The state

will correspond to the number of hours studying for that stage and all following stages. Let ft (x) be the

probability of failing t and all following courses, assuming x hours are available. Denote the entries in the

above table as pt (k) , the probability of failing course t given k hours are spent on it.

Boundary conditions are set for the final stage:

f3 (x) = p3 (x) ∀x ≤ 4

The recursion is given as follows:

ft(x) = min_{k: 0 ≤ k ≤ x} {pt(k) · ft+1(x − k)}

and dt(x) = k is used to record the k value at which the minimum is attained.

Starting the dynamic programming, boundary conditions are f3 (0) = 0.9, f3 (1) = 0.7, f3 (2) = 0.6, f3 (3) =

0.55, f3 (4) = 0.5 and d3 (0) = 0, d3 (1) = 1, d3 (2) = 2, d3 (3) = 3, d3 (4) = 4.

At stage 2, the recursion gives

f2(0) = p2(0) f3(0) = 0.75 · 0.9 = 0.68

f2(1) = min{ p2(0) f3(1) = 0.75 · 0.7 = 0.52  (k = 0),
             p2(1) f3(0) = 0.7 · 0.9 = 0.63   (k = 1) }

hence f2(1) = 0.52 and d2(1) = 0. By iterating, one obtains the following:

Table 2: Stage 2 results.

x k=0 k=1 k=2 k=3 k=4 f2 (x) d2 (x)

0 0.75· 0.9 = 0.68 0.68 0

1 0.75· 0.7 = 0.52 0.7·0.9=0.63 0.52 0

2 0.75· 0.6 = 0.45 0.7·0.7=0.49 0.67·0.9=0.60 0.45 0

3 0.75· 0.55 = 0.41 0.7· 0.6=0.42 0.67·0.7=0.47 0.65·0.9=0.59 0.41 0

4 0.75· 0.5 = 0.37 0.7·0.55 = 0.39 0.67·0.6=0.40 0.65·0.7=0.45 0.62·0.9=0.56 0.37 0

Finally, in stage 1 it is sufficient to compute the value for x = 4 (the full study time available):

f1(4) = min{ p1(0) f2(4) = 0.8 · 0.37 = 0.30    (k = 0),
             p1(1) f2(3) = 0.7 · 0.41 = 0.287   (k = 1),
             p1(2) f2(2) = 0.65 · 0.45 = 0.293  (k = 2),
             p1(3) f2(1) = 0.62 · 0.52 = 0.32   (k = 3),
             p1(4) f2(0) = 0.6 · 0.68 = 0.41    (k = 4) }

The overall optimal strategy is given by f1 (4) = 0.287 with d1 (4) = 1. That is, spend one hour on French, no

time on English (d2 (3) = 0) and three on Statistics (d3 (3) = 3). The probability of failing all three courses

is about 29%.
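The whole computation can be reproduced programmatically. Below is a Python sketch (function and variable names are my own) that fills ft(x) and dt(x) by backward recursion over the stages and then traces the optimal hour allocation:

```python
def min_fail_probability(pfail, hours):
    """Allocate study hours to minimize the product of failure probabilities.

    pfail[t][k]: probability of failing course t when k hours are spent on it
    hours: total study hours available
    Returns (optimal probability, hours allocated to each course).
    """
    T = len(pfail)
    # f[t][x]: minimal product of failure probs over courses t..T-1 with x hours
    f = [[None] * (hours + 1) for _ in range(T)] + [[1.0] * (hours + 1)]
    d = [[0] * (hours + 1) for _ in range(T)]
    for t in range(T - 1, -1, -1):         # backward over the stages
        for x in range(hours + 1):
            best, arg = float("inf"), 0
            for k in range(x + 1):         # k hours devoted to course t
                cand = pfail[t][k] * f[t + 1][x - k]
                if cand < best:
                    best, arg = cand, k
            f[t][x], d[t][x] = best, arg
    alloc, x = [], hours                   # rebuild the allocation from d
    for t in range(T):
        alloc.append(d[t][x])
        x -= d[t][x]
    return f[0][hours], alloc
```

Run on Table 1 with 4 hours, this returns the exact optimum 0.7 · 0.75 · 0.55 ≈ 0.2888 (the 0.287 in the hand computation reflects intermediate rounding) and the allocation of 1, 0, and 3 hours.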

