Introduction to
Stochastic Dynamic
Programming
Sheldon Ross
University of California
Berkeley, California
ACADEMIC PRESS
A Subsidiary of Harcourt Brace Jovanovich, Publishers
New York
Paris
London
San Diego
San Francisco-. Sao Paulo
Sydney
Tokyo
Toronto
Contents
Preface
xi
I. Finite-Stage Models
1.
2.
3.
4.
5.
6.
7.
Introduction
1
A Gambling Model
2
A Stock-Option Model
4
Modular Functions and Monotone Policies
Accepting the Best Offer
11
A Sequential Allocation Model
14
The Interchange Argument in Sequencing
Problems
21
Notes and References 26
17
II. Discounted Dynamic Programming
1.
2.
3.
4.
5.
6.
Introduction
29
The Optimality Equation and Optimal Policy
Method of Successive Approximations
35
Policy Improvement
38
Solution by Linear Programming
40
Extension to Unbounded Rewards
42
Problems
44
References
48
30
III. Minimizing CostsNegative Dynamic Programming
1. Introduction and Some Theoretical Results
2. Optimal Stopping Problems
51
49
viii
CONTENTS
3. Bayesian SequentiaPAnalysis
4. Computational Approaches
5. Optimal Search
63
Problems
68
References
71
58
60
IV. Maximizing RewardsPositive Dynamic Programming
1. Introduction and Main Theoretical Results
2. Applications to Gambling Theory
76
3. Computational Approaches to Obtaining V
Problems
85
Notes and References
88
73
83
V. Average Reward Criterion
1. Introduction and Counterexamples
89
2. Existence of an Optimal Stationary Policy
3. Computational Approaches
98
Problems
103
Notes and References
105
93
VI. Stochastic Scheduling
1.
2.
3.
4.
5.
6.
7.
Introduction
107
Maximizing Finite-Time ReturnsSingle Processor
Minimizing Expected MakespanProcessors in Parallel
Minimizing Expected MakespanProcessors in Series
Maximizing Total Field Life
118
A Stochastic Knapsack Model
122
A Sequential-Assignment Problem
124
Problems
127
Notes and References
129
VII. Bandit Processes
1.
2.
3.
4.
5.
Introduction
131
Single-Project Bandit Processes
131
Multiproject Bandit Processes
133
An Extension and a Nonextension
143
Generalizations of the Classical Bandit Problem
Problems
150
Notes and References
151
145
108
109
114
CONTENTS
Appendix: Stochastic Order Relations
1.
2.
3.
4.
Stochastically Larger
153
Coupling
154
Hazard-Rate Ordering
156
Likelihood-Ratio Ordering
157
Problems
160
Reference
161
Index
163
IX