0% found this document useful (0 votes)
41 views52 pages

ESI 4313 Operations Research 2: Dynamic Programming

Dynamic programming is a technique for solving optimization problems by breaking them into smaller subproblems. It works by solving subproblems from the end stages backwards to the initial stages. This example shows how to model an optimization problem of finding the minimum cost route from NYC to LA as a dynamic programming problem by defining stages as days of travel, states as locations, decisions as daily travel choices, and a recursion to solve the subproblems. Dynamic programming can solve many types of optimization problems, including production and inventory planning problems, more efficiently than enumerating all possible solutions.

Uploaded by

Ruchi Kashyap
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
41 views52 pages

ESI 4313 Operations Research 2: Dynamic Programming

Dynamic programming is a technique for solving optimization problems by breaking them into smaller subproblems. It works by solving subproblems from the end stages backwards to the initial stages. This example shows how to model an optimization problem of finding the minimum cost route from NYC to LA as a dynamic programming problem by defining stages as days of travel, states as locations, decisions as daily travel choices, and a recursion to solve the subproblems. Dynamic programming can solve many types of optimization problems, including production and inventory planning problems, more efficiently than enumerating all possible solutions.

Uploaded by

Ruchi Kashyap
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 52

ESI 4313

Operations Research 2

Dynamic Programming
Dynamic Programming
 Dynamic programming is a technique for
solving certain types of optimization
problems
 The idea is to break up a large, complex
problem into many smaller, much easier ones
 Usually, this technique can be applied to
problems in which a sequence of decisions
over time needs to be made to optimize
some criterion
Dynamic Programming
 In many cases, solving a problem by
dynamic programming means
 formulating this problem as a shortest path
problem in an acyclic network
 The art of dynamic programming lies in
how to construct this network!
Example:
Travel from coast to coast
 You currently live in NYC (1), but plan to move
to LA (10)
 You will drive
 To save money, you will spend each night of your
trip at a friend’s house
 You structure your potential stopovers as follows:
 In 1 day you can reach Columbus (2), Nashville (3), or
Louisville (4)
 On the 2nd day, you can reach Kansas City (5), Omaha (6),
or Dallas (7)
 On the 3rd day, you can reach San Antonio (8) or Denver (9)
 On the 4th day, you can reach LA
 To minimize your gas expenses, you are looking for
the route of minimum length
Example:
Travel from coast to coast
 We can classify the cities as follows:
 Call all cities that you can be in at the
beginning of your nth day the Stage n cities
 The idea of solving this problem by
dynamic programming is to start by
solving easy problems that will eventually
help you solve the entire problem
 In particular, we will work backward
Example:
Travel from coast to coast
 Denote the distance between city i and
city j by ci,j
 If city i is a stage t city, we denote the
length of the shortest path from city i to
LA by ft(i)
 Clearly, we would like to find f0(1)
Example:
Travel from coast to coast
 First, find the shortest path to LA from
each of the cities from which you can
reach LA in 1 day – the stage 4 cities
 Note that these problems are trivial, since in
each case there’s only 1 way to go to LA
 More formally,
 f4(8) = c8,10
 f4(9) = c9,10
Example:
Travel from coast to coast
 Then, find the shortest path to LA from
each of the stage 3 cities
 Note that this means that you should first go
to a stage 4 city, and then use the shortest
path from this stage 4 city to LA
 These problems are not as trivial as the first
ones, but by simply looking at all possible city
4 problems and the solutions to the first set
of problems this remains relatively easy
Example:
Travel from coast to coast
 From each stage 3 city
 go to a stage 4 city, and then use the
shortest path from this stage 4 city to LA
 So, for example, f3(5) is equal to
 c5,8 + f4(8), or
 c5,9 + f4(9)
 Since we’re interested in the shortest
path, we have
 f3(5) = min{c5,8 + f4(8) , c5,9 + f4(9)}
Example:
Travel from coast to coast
 Perform the same procedure for the stage
2 cities
 Perform the same procedure for the stage
1 city, NYC
 From NYC you should first go to a stage 2
city, and then use the shortest path from this
stage 2 city to LA
 We can find the best route from NYC to LA
by considering all possible stage 2 cities
Example:
Travel from coast to coast
 In general, in stage t we are interested in
finding ft(i) for all stage t cities i
 Using the earlier approach, we can write
 ft(i) = minj: j is a stage t+1 city {ci,j + ft+1(j) }
for all stage t cities i
Computational efficiency of
dynamic programming
 In the example, we could simply enumerate all
possible paths from NYC to LA
 It is easy to see that there are 3x3x2=18 paths
 However, suppose that we have more options:
 Starting city is again stage 1
 5 cities in each of 5 stages (stages 2,…,6)
 Destination city is stage 7
 Then there are 55=3,125 paths
 Determining the length of each of these paths takes
a total of 5x55 = 15,625 additions and 3,124
comparisons
Computational efficiency of
dynamic programming
 How much work is the dynamic
programming algorithm?
 The stage 6 problems are trivial
 Each of the other problems require
 5 additions (potential choices for next city to
visit) and 4 comparisons
 For a total of 4x5x5 + 5 = 105 additions and
4x5x4 + 4 = 84 comparisons
Characteristics of dynamic
programming
 The problem should have stages
 Each stage corresponds to a point at which a
decision needs to be made
 Each stage should have a number of
associated states
 The state contains all information that is
needed to make an optimal decision for the
remaining problem
Characteristics of dynamic
programming
 The decision chosen at each stage
describes how the state at the current
stage is transformed in the state at the
next stage
 The optimal decision at the current state
should not depend on previously visited
states or previous decisions
 This is called the principle of optimality
Characteristics of dynamic
programming
 There must be a recursion that relates
the cost or reward for stages t, t+1, …, T
to the cost or reward for stages t+1, t+2,
…, T
 This recursion formalizes the procedure of
working backwards from the last stage to the
first stage
Dynamic programming
formulation
 Stages: t =1,…,5
 States: city
 Decision in each stage:
 Choose the stage t+1 city to go to
 Dynamic programming recursion:
 f4(i) = ci,10 for all stage 4 cities i
 ft(i) = minj: j is a stage t+1 city {ci,j + ft+1(j) }
for all stage t cities i
Dynamic programming
without stages
 You must drive from Bloomington to
Cleveland
 You are interested in the route that takes
the least amount of time
Dynamic programming
without stages

Gary Toledo Cleveland


3 hours 3 hours

2 hours 1 hour 3 hours

Indianapolis Dayton Columbus


3 hours 2 hours

1 hour 2 hours
2.5 hours
Bloomington Cincinnati
3 hours
Production & inventory
planning
 Consider the following production & inventory
planning problem for a single item:
 Consider a planning period of T periods, and assume
that
 the demand for the item in each of the periods is known
 the initial inventory level is known
 At the start of each period, you must decide how
many units to produce; production capacity is limited
 Each period’s demand must be met on time
 There is a limited amount of storage space available
 The goal is to minimize the total production &
inventory costs over the planning horizon
Production & inventory
planning
 This is a periodic review model
 Denote the demand in period t by dt
(t =1,…,T )
 Denote the cost of producing x units in
period t by ct(x) (often, this function is
independent of t, i.e., ct(x)=c(x) )
 If at the end of period t the inventory
level is I, a cost of ht(I) is charged (often,
these costs are independent of t, i.e.,
ht(I)=h(I) )
Production & inventory
planning
 If the production and inventory holding
cost functions are linear, we can
formulate this problem as an LP problem
(how?)
 Often, the production costs are assumed
to have a fixed-charge structure:
 c(x) = 0 if x = 0, c(x) = a + bx if x > 0
 In that case, we can formulate this problem
as a mixed-integer LP problem (how?)
Production & inventory
planning
 More generally, we can formulate this
problem as an NLP problem (how?)
 Production (and inventory) costs are
often assumed to be concave – reflecting
economies of scale
 What does that mean for the ease of
solvability of the NLP problem?
Production & inventory
planning
 NLP formulation:
T T
min  ct ( xt )   h (I )
t t
t 1 t 1

subject to
It 1  xt  dt  It t  1,..., T
0  It  B t  1,..., T
0  xt  C t  1,..., T
Production & inventory
planning
 Dynamic programming provides a solution
methodology that can be applied for
general cost functions
 We only need to assume that the units of
demand, production, and inventory are
integers – which is not unrealistic in many
practical situations
 This methodology will be efficient if the
magnitude of the numbers involved is not too
large
Production & inventory
planning
 We must identify:
 Stages
time: t  1,..., T
 States
(starting) inventory level: I  0,..., B
 Decisions
production quantity: x  0,..., C
 Recursion
minimal cost from start of stage t : ft (I )
 Clearly, we are looking for f1(I0)
Production & inventory
planning
 Recursion:
 Cost at the beginning of stage T :

fT (I )  min  cT (x)  hT (I  x  dT )
0 x C

 Note that you will always want to end up


with 0 inventory – so the final period’s
production will be x  dT  I
 So: fT (I )  cT (dT  I )
 (what if dT-I < 0 or dT-I > C ???)
Production & inventory
planning
 Recursion:
 Cost at the beginning of stage t :
ft (I )  min
max(0,d  I ) x min(C ,d  B  I )
 ct ( x)  ht (I  x  dt )
 ft 1(I  x  dt )
t t

 We have to make sure that we have


sufficient storage capacity, i.e., we need
I  x  dt  B or x  dt  B  I
 We have to make sure that we deliver on
time I  x  d  0 or x  d  I
t t
Production & inventory
planning example
 Example:
 T = 4 periods
 Demands: 1, 4, 2, 3
 Inventory holding costs: $0.50 per unit
 Production costs:
 fixed setup cost $3
 variable cost $1 per unit
 Production capacity C = 5 units
 Inventory capacity B = 4 units
Production & inventory
planning
 Initialization: T = 4
 d4 = 3
 Cost at the beginning of stage T = 4 :
fT (I )  cT (dT  I )
 I = 0: f4(0) = c4(3-0) = 3 + 31 = 6
 I = 1: f4(1) = c4(3-1) = 3 + 21 = 5
 I = 2: f4(2) = c4(3-2) = 3 + 11 = 4
 I = 3: f4(3) = c4(3-3) = 0
 I = 4: f4(4) = c4(3-4) = ???  0
Production & inventory
planning
 Next stage: t = 3
 d3 = 2
 Cost at the beginning of stage t = 3:

ft (I )  min
max(0,dt  I ) x min(C ,dt  B  I )

 ct (x)  ht (I  x  dt )  ft 1(I  x  dt )
f3 (I )  min
max(0,2  I ) x min(5,2  4  I )

 c3( x)  h3(I  x  2)  f4(I  x  2)


Production & inventory
planning
 I = 0:

f3 (0)  min
max(0,2 0) x min(5,2  4 0)

 c3(x)  h3(0  x  2)  f4(0  x  2)

f3 (0)  min  c3 (x)  12 (x  2)  f4 ( x  2)


2 x 5
Production & inventory
planning
 I = 0:
f3 (0) 
x  2: 2  1 12  2  f4 (0)  5  6  11

x  3: 2  1 12  3  f4 (1)  5 12  5  10 12
min 
x  4: 2  1 12  4  f4 (2)  6  4  10
 x  5: 2  1 12  5  f4 (3)  6 12  0  6 12
Production & inventory
planning
 I = 2:
f3 (2)  min
max(0,2 2) x min(5,2  4 2)

 c3(x)  h3(2  x  2)  f4(2  x  2)


f3 (2)  min  c3 (x )  1
2
x  f4 (x )
0 x  4


f3 (2)  min f4 (0),min  3  1 12 x  f4 ( x)
1 x  4

Production & inventory
planning
 I = 2:
f3 (2) 
x  0 : f4 (0)  6
x  1 : 3  1 12  1  f4 (1)  4 12  5  9 12

min  x  2 : 3  1 12  2  f4 (2)  4 12  4  8 12
x  3 : 3  1 12  3  f4 (3)  4 12  0  4 12

 x  4 : 3  1 12  4  f4 (4)  4 12  0  4 12
Production & inventory
planning
 Network representation:
 Nodes: stage/state combinations (t,I)
 Arcs: decisions x
 Arc from node (t,I) corresponding to decision
x leads to node (t+1,I+x-dt)
 Cost of this arc is ct(x) + ht(I+x-dt)
Resource allocation:
the knapsack problem (1)
 Stockco is considering 4 investments
 Investment 1 will yield a NPV of $16K, but
requires a cash outflow of $5K
 Investment 2 will yield a NPV of $22K, but
requires a cash outflow of $7K
 Investment 3 will yield a NPV of $12K, but
requires a cash outflow of $4K
 Investment 4 will yield a NPV of $8K, but
requires a cash outflow of $3K
 You have a budget of $14K
Resource allocation:
the knapsack problem (1)
 IP formulation:
4
max  NPVi xi  16 x1  22 x2  12 x3  8 x4
i 1

subject to
4

C x
i 1
i i  5x1  7 x2  4 x3  3x4  14

xi  {0,1} i  1,..., 4
Resource allocation:
the knapsack problem (2)
 You are planning an overnight hike, and are
considering taking 4 items along on your trip
 Item 1 yields a “benefit” of 16, but weighs 5 lbs
 Item 2 yields a “benefit” of 22, but weighs 7 lbs
 Item 3 yields a “benefit” of 12, but weighs 4 lbs
 Item 4 yields a “benefit” of 8, but weighs 3 lbs
 You do not want to carry more than 14 lbs
 You want to maximize your “benefit”
 Mathematically, this is the same problem as the
investment problem!
Resource allocation:
more general
 Stockco is considering n investments
 Investment n will yield a NPV of rn(dn) when
dn$1,000 is invested
 You only want to (or can) invest in integer multiples of
$1,000
 You have a budget of B  $1,000
 Example
 n = 3, B = 6
 r1(d1) = 7d1+2 (d1>0), r1(0) = 0
 r2(d2) = 3d2+7 (d2>0), r2(0) = 0
 r3(d3) = 4d3+5 (d3>0), r3(0) = 0
Resource allocation:
more general
 NLP formulation:
n
max  ri (di )
i 1

subject to
n

d
i 1
i B

di  {0,1,...} i  1,..., n
Resource allocation
 To formulate this problem as a DP problem, we
must identify:
 Stages
investment categories: i  1,2,3
 States
budget available: y  0,1,...,6
 Decisions
investment amount: d  0,1,...,6
 Recursion
maximal return from inv. categories i,…,3 : fi (y )
 Clearly, we are looking for f1(6)
Resource allocation
 Recursion:
 Return from investment in category 3 only:
f3 (y )  max r3 (d )
0 d  y
 Note that you will always invest all remaining
budget in category 3 at this stage, i.e., d=y

0 if y  0
f3 (y )  r3 (y )  
4y  5 if y  1,...,6
Resource allocation
 Recursion:
 Return from investment in categories 2 and 3:
f2 (y )  max  r2 (d )  f3(y  d )
0 d  y
 These subproblems are a little harder…
 y=0: f2(0) = 0
 y=1: f2(1) =max(r2(0)+f3(1),r2(1)+f3(0))
=max(0+9,10+0) = 10
 y=2: f2(2) =
max(r2(0)+f3(2),r2(1)+f3(1),r2(2)+f3(0))
=max(0+13,10+9,13+0) = 19
Resource allocation
 Network representation:
 Nodes: stage/state combinations (i,y)
 Arcs: decisions d
 Arc from node (i,y) corresponding to decision
x leads to node (i+1,y-d)
 Return of this arc is ri(d)
Resource allocation:
even more general
 NLP formulation:
n
max  ri (di )
i 1

subject to
n

 g (d )  B
i 1
i i

di  {0,1,..., Ui } i  1,..., n
 Find the DP formulation for this general
case
Equipment replacement
problem
 A company faces the problem of how long a
machine should be utilized before it should be
traded in for a new one
 Example
 A new machine costs p=$1,000, and has a useful
lifetime of 3 years
 Maintaining a machine during its first 3 years costs
m1=$60, m2=$80, m3=$120, respectively
 If a machine is traded in, a salvage value is
obtained: s1=$800, s2=$600, and s3=$500,
respectively, after the first 3 years
Equipment replacement
problem
 We currently have a y year old machine
 Find a policy that minimizes total net costs
over the next 5 years
Equipment replacement
problem
 To formulate this problem as a DP problem, we
must identify:
 Stages
time: t  0,1,...,5
 States
age of machine: y  0,1,2,3
 Decisions
keep or trade-in: d  0,1
 Recursion
minimal net cost after period t : ft (y )
 Clearly, we are looking for f0(y)
Equipment replacement
problem
 Recursion:
 Note that you will always salvage the
machine at the end of year 5:
 Net cost after period 5:
f5 (y )  sy y  1,2,3
Equipment replacement
problem
 Recursion:
 At the end of period t < 5, you must decide
whether to keep or trade-in the machine
 If y=3, you must trade it in

ft (3)  s3  p  ft 1(1) t  0,1,..., 4


 If y<3, you have a real choice:
 x  0 : my  ft 1(y  1)
ft (y )  
 x  1 :  sy  p  ft 1(1)
y  1,2; t  0,1,..., 4
Equipment replacement
problem
 Network representation:
 Nodes: stage/state combinations (t,y)
 Arcs: decisions x
 Arc from node (t,y) corresponding to decision
 x=0 leads to node (t+1,y+1)
 x=1 leads to node (t+1,1)
 Return of the arc is
 my when x=0
 – sy + p when x=1

You might also like