Stochastic Quadratic Knapsack with recourse
Lisser Abdel∗
Lopez Rafael∗
Hu Xu∗
Laboratoire de Recherche en Informatique, Université Paris-Sud XI
Bât. 490, Université Paris-Sud, 91405 Orsay, France
Abstract
This paper is dedicated to a study of different extensions of the classical knapsack problem to
the case when different elements of the problem formulation are subject to a degree of uncertainty
described by random variables. This brings the knapsack problem into the realm of stochastic programming. Four different model formulations are proposed, based on the introduction of probability
constraints. The first one is a two-stage quadratic knapsack with recourse (2QKR), and serves as
the base on which we develop the three other models. The second one is a 2QKR with a probability
constraint on the capacity of the knapsack on the first stage. The third one is a 2QKR where we introduce a probability constraint on the capacity of the knapsack in the second stage. Finally, the last
model is also a 2QKR which uses probability constraints on the capacity constraints of both stages.
As far as we know, this is the first time such a constraint has been used in a two-stage model. The
solution techniques are based on semidefinite relaxations, which allow solving large instances
for which exact methods cannot be used.
Keywords: Semidefinite programming, knapsack, stochastic, recourse.
1 Introduction
The knapsack problem (KP) is a well-known and well-studied problem in combinatorial optimization.
Knapsack problems are often used to model industrial situations, financial decisions or network design
problems. They may also appear as sub-problems of larger or more complex problems. The most famous
form of KP is the single-constraint binary version: we are given N items, with return $p_i$ and
weight $w_i$ for item i, i = 1, ..., N, and a knapsack capacity c. The problem is to select a subset
of items whose total weight does not exceed c and whose total return is maximal. In this
form, the problem is known to be NP-hard [GJ79]. It has been intensively studied in the past decades,
and several exact and approximate algorithms are known for it. In particular, it admits an
FPTAS [IK75]. For the quadratic knapsack problem, the survey by David Pisinger [Pis07] gives
detailed information on the problem and a number of results on the performance of various relaxations
and algorithms used to solve or approximate it.
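To make the binary version concrete, here is a minimal Python sketch of the textbook dynamic program for it. It is not one of the algorithms cited above; the function name `knapsack_01` and the restriction to non-negative integer weights are assumptions made for this illustration only.

```python
def knapsack_01(profits, weights, capacity):
    """Binary knapsack by dynamic programming (integer weights assumed).

    Returns (best total return, sorted list of selected item indices).
    """
    n = len(profits)
    # best[c] = best return achievable with total weight at most c
    best = [0] * (capacity + 1)
    # keep[i][c] records whether item i improved the value at capacity c
    keep = [[False] * (capacity + 1) for _ in range(n)]
    for i in range(n):
        # iterate capacities downwards so each item is used at most once
        for c in range(capacity, weights[i] - 1, -1):
            candidate = best[c - weights[i]] + profits[i]
            if candidate > best[c]:
                best[c] = candidate
                keep[i][c] = True
    # walk back through the decisions to recover the selected subset
    chosen, c = [], capacity
    for i in range(n - 1, -1, -1):
        if keep[i][c]:
            chosen.append(i)
            c -= weights[i]
    return best[capacity], sorted(chosen)
```

For example, `knapsack_01([60, 100, 120], [10, 20, 30], 50)` selects the second and third items. The running time is proportional to N times the capacity, which is pseudo-polynomial and therefore consistent with the NP-hardness of the problem.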
In the case of modeling financial decisions, transportation, or production plans, however, this formulation shows its limitations, since it does not take into account uncertainty on the problem parameters,
such as the prices $p_i$ or the weights $w_i$. Similarly, such decisions are not static, and this model cannot
take into account new information that becomes available on the prices or the weights. Several studies
of the stochastic knapsack problem have appeared in recent years, proposing heuristics [CB98] and
approximation algorithms [KP98], [DGV04], [SS06].
Stochastic knapsack problems can involve so many binary variables and constraints that commercial packages cannot find a solution within reasonable time or memory, requiring
the use of linear relaxations to obtain an upper bound on the problem. While linear relaxations have been successful for many combinatorial optimization problems, it turns out that knapsack problems, especially
in their quadratic formulation, cannot be approximated tightly by linearization-based methods. Stronger
relaxation methods, namely semidefinite relaxations (called SDP hereafter), have turned out to be particularly interesting for such combinatorial optimization problems [GW95], [HR98]. See also [Pis07] for
a survey on the quadratic knapsack problem.
More precisely, semidefinite programming is a recent development in convex optimization that deals
with optimization problems over symmetric positive semidefinite matrices with a linear cost function and
linear constraints. Grötschel et al. showed that semidefinite optimization problems can be solved in
polynomial time [GLS88].
In this paper we present the two-stage quadratic knapsack with recourse, which is used as the base
for three variants of stochastic optimization problems. The first one is a two-stage quadratic knapsack
with a probability constraint on the capacity in the first stage. The probability constraint is used to
model the risk we are willing to take when making our initial decision. We only know some information
about the weights of the items, but we have to take a decision with this limited knowledge at the risk of
breaking the capacity constraint, knowing that a second stage decision (the recourse) will come
afterwards and allow us to correct the decision to some extent. In this paper, we consider a two-fold recourse,
in the sense that items can be removed if they turn out to be suboptimal or too heavy,
or added if they appeared uninteresting at first but turn out to be desirable. The
second model is a two-stage quadratic knapsack with a probability constraint on the second stage capacity
constraint. This two-stage formulation models a situation where we
make a decision with limited information, but where the second stage decision is made after receiving
additional (but still incomplete) information about the weights or prices. This allows us to modify the initial
decisions. Since some information is still unknown at the second stage, we face a risk of
breaking the capacity constraint when taking the second stage decision. Finally, the last model combines
the probability constraints on the capacity constraint in both stages. In this model, we are willing to
take a risk in all the decisions.
This paper is organized as follows: first, we present the static knapsack problem with probability
constraint wherein we give a presentation of the problem. Then we reformulate this problem under the
form of an SDP problem and detail two different relaxations. We then present the stochastic knapsack
problem with recourse. Finally, concluding remarks are given.
2 Two-stage quadratic knapsack problem
The initial decision is made during the first stage, before the realizations of the random variables are known.
These realizations are then (partially) revealed and the second stage decision is made, correcting the
first stage decision in light of this information. This modeling scheme is known as a stochastic
program with recourse. Moreover, we introduce probability constraints in the second stage. To the best
of our knowledge, this formulation has not been studied in the literature. We start by formulating the
general quadratic stochastic program with recourse with probability constraint in the second stage. A
generic quadratic stochastic problem with recourse can be modeled as follows:
\[
\max_{x \in \{0,1\}^n} \; x^T C x + E_\omega\, Q(u, \omega) \tag{1}
\]
\[
R x \le s \tag{2}
\]
where (2) models generic linear constraints, with $R \in \mathbb{R}^{m \times n}$ and $s \in \mathbb{R}^m$. The second stage value is given
by the solution of the problem:
\[
Q(u, \omega) = \max_{u} \; u^T D(\omega) u \tag{3}
\]
\[
W(\omega) u + T(\omega) x \le h(\omega) \tag{4}
\]
In this model, the uncertainty is described by the random vector ω with a given probability distribution. When the initial decision is made, ω is unknown. After this decision, part of the information is
revealed; this corresponds to the realization of the random vector ω. The second stage decision can then be
taken, with knowledge of ω and of the first stage decision x. More details can be found in [AG08].
First stage decision
We now need to adapt this generic model to our knapsack problem. We assume we have n items, and
each item is characterized by its value $c_{ii}$ and weight $w_i$, i = 1 : n. Each item pair is characterized
in the same manner by its value $c_{ij}$. The objective is to maximize the value of the items contained in
the knapsack, with the constraint that it has a limited capacity d. The selection of an item during the
first stage is defined by a binary decision variable $x_i$ which takes value 1 if the item i is included in the
selection and 0 otherwise. The formulation of this first stage decision of the problem is the following:
\[
\max_{x} \; \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_i x_j + E_\omega\, Q(u, \omega) \tag{5}
\]
\[
\sum_{i=1}^{N} w_i x_i \le d \tag{6}
\]
The constraint (6) describes the knapsack capacity constraint. Equation (5) consists of two parts: the
value of the knapsack during the first stage, and the expected value of the same knapsack during the
second stage. This expected value depends on the items selected in the first stage (the vector x), and the
realization of the random vector ω.
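As a small illustration of the first-stage part of (5) and of the capacity constraint (6), the following Python sketch evaluates both for a given selection x. The function names and the list-of-lists representation of the value matrix are assumptions made for this example, and the expected recourse term of (5) is deliberately left out.

```python
def first_stage_value(c, x):
    """Quadratic first-stage value: sum over i, j of c[i][j] * x[i] * x[j]."""
    n = len(x)
    return sum(c[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def first_stage_feasible(w, x, d):
    """Capacity constraint (6): total weight of the selection is at most d."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) <= d
```

With c = [[3, 1], [1, 2]], selecting both items gives the value 3 + 1 + 1 + 2 = 7: the diagonal entries are the individual item values and the off-diagonal entries are the pair values.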
Second stage decision
After the first stage decision is made, the values of items may change, as well as their weights. During
the second stage, the item i has the value $b_{ii}(\omega)$ and the weight $v_i(\omega)$. Each item pair (i, j) has the value
$b_{ij}(\omega)$. Similarly to the first stage, there is a constraint on the capacity of the knapsack, which is subject
to change too. We denote the new capacity by $h(\omega)$. The realization of vector ω is known before making the
second stage decision.
The second stage decision allows us to change the initial decision in order to correct mistakes which
appear after extra information is known. There are two possibilities. First, an item which was selected
during the first stage can be removed. We describe this decision by a binary variable $u^-_i$ set
to 1 if item i is removed from the knapsack, 0 otherwise. Likewise, an item that was previously rejected
can be selected. In this case, we use a binary variable $u_i$, set to 1 if we select the item i in the second
stage, 0 otherwise. Note that if item i was selected during the first stage and is not removed during
the second, it is considered selected again (i.e. $u_i = 1$) in the second stage. Removing an item i
comes at a cost, which includes penalties such as the time or manipulation costs necessary to reorganize
the knapsack. This allows us to formulate the second stage decision as follows:
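\[
Q(u, \omega) = \max_{u,\, u^-} \; \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ij}(\omega)\, u_i u_j - \sum_{i=1}^{N} \sum_{j=1}^{N} b^-_{ij}(\omega)\, u^-_i u^-_j \tag{7}
\]
\[
u_i \ge x_i - u^-_i, \quad i = 1 : n \tag{8}
\]
\[
u^-_i \le x_i, \quad i = 1 : n \tag{9}
\]
\[
\sum_{i=1}^{N} v_i(\omega) (u_i + x_i - u^-_i) \le h(\omega) \tag{10}
\]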
Equation (7) is our objective: we want to maximize the value of the knapsack, from which we deduct the
cost of removing items previously selected. The two constraints (8) and (9) link the first stage and second
stage decisions: the first one means that if an item i was selected during the first stage and not deselected,
then it is necessarily considered selected during the second stage. Conversely, the constraint (9) means
that only an item that was selected during the first stage can be deselected. Constraint (10) represents
the capacity constraint in the second stage.
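For very small instances, the second stage problem (7)-(10) can be solved for one realized scenario by plain enumeration, which makes the role of each constraint easy to see. The Python sketch below is only an illustration under that assumption, not the solution technique of this paper; the name `recourse_value` and the argument layout are hypothetical.

```python
from itertools import product


def recourse_value(x, b, b_minus, v, h):
    """Brute-force the second stage (7)-(10) for one realized scenario.

    x        first stage selection (list of 0/1)
    b        second stage value matrix b_ij
    b_minus  removal cost matrix b^-_ij
    v, h     second stage weights and capacity
    Returns (best value, u, u_minus), or None if no recourse is feasible.
    """
    n = len(x)
    best = None
    for u in product((0, 1), repeat=n):
        for um in product((0, 1), repeat=n):
            # (8): items kept from the first stage stay selected
            # (9): only items selected in the first stage can be removed
            if any(u[i] < x[i] - um[i] or um[i] > x[i] for i in range(n)):
                continue
            # (10): second stage capacity, written exactly as in the text
            if sum(v[i] * (u[i] + x[i] - um[i]) for i in range(n)) > h:
                continue
            gain = sum(b[i][j] * u[i] * u[j]
                       for i in range(n) for j in range(n))
            cost = sum(b_minus[i][j] * um[i] * um[j]
                       for i in range(n) for j in range(n))
            if best is None or gain - cost > best[0]:
                best = (gain - cost, list(u), list(um))
    return best
```

The enumeration visits 4^n pairs (u, u^-), so it is usable only for a handful of items, which is precisely why relaxations are needed for realistic sizes.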
This model serves as the base on which we build stochastic extensions: first we introduce a probability
constraint on the first stage of the problem, in the first sub-section, then in the second sub-section we look
at the model where the probability constraint is on the second stage. Finally, in the third sub-section,
we combine probability constraints in both stages.
3 Probability constraints in the two-stage quadratic knapsack
In order to model the risk, we need to introduce probability constraints. Taking a risk in a two-stage
decision process can happen in either or both of the stages, therefore leading to three variants of the
two-stage quadratic knapsack.
Two-stage quadratic knapsack problem with probability constraint in the first stage
In order to model risk-taking in the first stage decision, we have to replace the capacity constraint (6)
with a probability constraint.
\[
P\left\{ \sum_{i=1}^{N} w_i(\phi)\, x_i \le d \right\} \ge 1 - \alpha_1 \tag{11}
\]
where φ is a random vector with a given known distribution representing the uncertainty on the
weights of the items, and $\alpha_1$ is the risk we take of violating the capacity constraint. This case corresponds
to a situation in which we must make a first decision under very limited knowledge (on the weights and
on the future), knowing that we will be able to correct the decision in a second stage.
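When φ takes finitely many values, checking constraint (11) for a given x amounts to adding up the probabilities of the weight scenarios in which the capacity holds. The Python sketch below illustrates this under that finite-support assumption; the function name, the argument layout and the small numerical tolerance are choices made for this example.

```python
def satisfies_chance_constraint(x, weight_scenarios, probs, d, alpha, tol=1e-12):
    """Check P{ sum_i w_i(phi) x_i <= d } >= 1 - alpha for a discrete phi.

    weight_scenarios[l] is the weight vector under scenario l and
    probs[l] its probability; tol guards against rounding in the sum.
    """
    satisfied_mass = sum(
        p
        for w, p in zip(weight_scenarios, probs)
        if sum(w_i * x_i for w_i, x_i in zip(w, x)) <= d
    )
    return satisfied_mass >= 1 - alpha - tol
```

For instance, with three scenarios of probabilities 0.5, 0.25 and 0.25 in which the selected items weigh 5, 6 and 8, and a capacity of 6, the capacity holds with probability 0.75, so the selection is acceptable for a risk level of 0.25 but not for 0.1.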
Two-stage quadratic knapsack problem with probability constraint in the second stage
Similarly to the case with a probability constraint in the first stage, the base problem (5)-(10) can be
modified to allow risk-taking in the second stage decision. We have to replace constraint (10). This time,
we model the added uncertainty with a second random vector ψ, conditioned on the vector ω. While
the realization of vector ω is known before making the second stage decision, only the distribution of ψ
is known, conditioned on ω.
\[
P\left\{ \sum_{i=1}^{N} v_i(\omega, \psi) (u_i + x_i - u^-_i) \le h(\omega, \psi) \;\middle|\; \omega \right\} \ge 1 - \alpha_2 \tag{12}
\]
This situation corresponds to the case where information about the future is very limited. In fact,
the distribution of weights in the second stage is not known until after ω is realized.
Two-stage quadratic knapsack problem with probability constraint in both stages
Finally, the last variant combines risk taking in both stages. This combines the limited information
of both cases above: the first decision is taken with uncertainty about the first stage weights, then
information is revealed, but is still incomplete, which leads to a second stage decision under uncertainty.
In this case the problem becomes:
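\[
\max_{x} \; \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_i x_j + E_\omega\, Q(u, \omega) \tag{13}
\]
\[
P\left\{ \sum_{i=1}^{N} w_i(\phi)\, x_i \le d \right\} \ge 1 - \alpha_1 \tag{14}
\]
\[
Q(u, \omega) = \max_{u,\, u^-} \; \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ij}(\omega)\, u_i u_j - \sum_{i=1}^{N} \sum_{j=1}^{N} b^-_{ij}(\omega)\, u^-_i u^-_j \tag{15}
\]
\[
u_i \ge x_i - u^-_i, \quad i = 1 : n \tag{16}
\]
\[
u^-_i \le x_i, \quad i = 1 : n \tag{17}
\]
\[
P\left\{ \sum_{i=1}^{N} v_i(\omega, \psi) (u_i + x_i - u^-_i) \le h(\omega, \psi) \;\middle|\; \omega \right\} \ge 1 - \alpha_2 \tag{18}
\]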
As we showed in [AG08], formulation (5)-(10) of the two-stage quadratic knapsack problem is quite
general and covers many specific cases. In particular, it allows the modeling
of additional characteristics which do not appear explicitly in the problem description. The main such
feature is that the model allows a different composition of the allowable set of items during the first and
the second stage. Since the weights of the items and the capacity of the knapsack may change (based on
the realization of ω) between the first and second stage, we can model cases where some items
are not allowed during one of the two stages. For example, an item that is unavailable during the first
stage and only available during the second stage would have $w_i > d$. This also allows
having different sets of items for the two stages, which is traditional in recourse models: items which
can only be decided on in the first stage will have a deselection cost $b^-_{ii}$ set to a very large number (to
prevent deselection), and second stage values $b_{ii}, b_{ij} \le 0$ (which means selection in the second stage is
suboptimal, since it only affects the weight and does not increase the profit).
We will now work with this problem and reformulate it in order to be able to use solution techniques
such as SDP. The first step of the reformulation is to rewrite the problem as a deterministic equivalent
problem. We will then be able to use semidefinite relaxations on the deterministic equivalent problems.
4 Deterministic equivalent problem
In order to rewrite the stochastic quadratic knapsack with recourse in a deterministic form, we need
to consider the case where the distribution of the random vectors φ, ω and ψ is concentrated on a finite
number of points. We assume that the random vector ω is concentrated on the finite number of points
$\omega_k$, k = 1 : K, with probabilities $p^\omega_k$. We will refer to these points as scenarios. In this case the problem
(5)-(10) can be rewritten as follows. The objective function in the first stage becomes:
\[
\max_{x} \; \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_i x_j + \sum_{k=1}^{K} p^\omega_k\, Q(u, k) \tag{19}
\]
with the second stage decision becoming:
\[
Q(u, k) = \max_{u,\, u^-} \; \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ijk}\, u_{ik} u_{jk} - \sum_{i=1}^{N} \sum_{j=1}^{N} b^-_{ijk}\, u^-_{ik} u^-_{jk} \tag{20}
\]
The second stage capacity constraint also changes:
\[
\sum_{i=1}^{N} v_{ik} (u_{ik} + x_i - u^-_{ik}) \le h_k \quad \forall k = 1 : K \tag{21}
\]
where
\[
Q(u, \omega_k) = Q(u, k), \quad b_{ij}(\omega_k) = b_{ijk}, \quad b^-_{ij}(\omega_k) = b^-_{ijk}, \quad v_i(\omega_k) = v_{ik}, \quad h(\omega_k) = h_k.
\]
Substituting (20) into (19) and collecting constraints for each scenario we obtain an equivalent problem:
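\[
\max_{x,\, u_{ik},\, u^-_{ik}} \; \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_i x_j + \sum_{k=1}^{K} p^\omega_k \left( \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ijk}\, u_{ik} u_{jk} - \sum_{i=1}^{N} \sum_{j=1}^{N} b^-_{ijk}\, u^-_{ik} u^-_{jk} \right) \tag{22}
\]
\[
\sum_{i=1}^{N} w_i x_i \le d \tag{23}
\]
\[
u_{ik} \ge x_i - u^-_{ik}, \quad i = 1 : n, \; k = 1 : K \tag{24}
\]
\[
u^-_{ik} \le x_i, \quad i = 1 : n, \; k = 1 : K \tag{25}
\]
\[
\sum_{i=1}^{N} v_{ik} (u_{ik} + x_i - u^-_{ik}) \le h_k, \quad k = 1 : K \tag{26}
\]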
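To give a feeling for how quickly the deterministic equivalent (22)-(26) grows, the following Python sketch simply counts its binary variables and linear constraints for n items and K scenarios. The counts are read directly off the formulation above; the function name is an assumption made for this illustration.

```python
def deterministic_equivalent_size(n, K):
    """Count binary variables and linear constraints of problem (22)-(26)."""
    # x_i (n variables), u_ik and u^-_ik (n * K variables each)
    num_variables = n + 2 * n * K
    # (23): one constraint; (24) and (25): n * K each; (26): K constraints
    num_constraints = 1 + 2 * n * K + K
    return num_variables, num_constraints
```

Already for n = 50 items and K = 100 scenarios this gives 10050 binary variables and 10101 constraints, before any probability constraint is added.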
Probabilistic constraints
Probability constraints (11) and (12) are reformulated as deterministic equivalent constraints. Suppose
that the random vector φ (resp. ψ) is concentrated on the finite number of points $\phi_l$, l = 1 : L (resp.
$\psi_{kr}$, k = 1 : K, r = 1 : R) with probabilities $p^\phi_l$ (resp. $p^\psi_{kr}$) such that
\[
\sum_{l=1}^{L} p^\phi_l = 1, \quad p^\phi_l \ge 0
\]
\[
\sum_{r=1}^{R} p^\psi_{kr} = 1, \quad p^\psi_{kr} \ge 0, \quad k = 1 : K
\]
Then constraint (11) is equivalent to the pair:
\[
\sum_{i=1}^{N} w_{il}\, x_i \le d_l, \quad l \in \Gamma
\]
\[
\sum_{l \in \Gamma} p^\phi_l \ge 1 - \alpha_1
\]
where $w_{il} = w_i(\phi_l)$, $d_l = d(\phi_l)$ and Γ is a subset of the scenario set {1, ..., L} on which the capacity constraint
is satisfied, while the set {1, ..., L} \ Γ corresponds to the scenarios where risk is taken. These constraints
can be reformulated as binary constraints by introducing the auxiliary binary variable $y^\phi_l$ (resp. $y^\psi_{kr}$) for
each scenario l = 1 : L (resp. observation r = 1 : R and scenario k = 1 : K) as follows:
\[
y^\phi_l = \begin{cases} 0 & \text{if } l \in \Gamma \\ 1 & \text{otherwise} \end{cases}
\]
This yields the following deterministic equivalent constraints:
\[
\sum_{i=1}^{N} w_{il}\, x_i \le d_l + M^\phi_l\, y^\phi_l, \quad l = 1 : L
\]
\[
\sum_{l=1}^{L} p^\phi_l\, y^\phi_l \le \alpha_1 \tag{27}
\]
where $M^\phi_l$ is an arbitrary number such that
\[
M^\phi_l \ge \sum_{i=1}^{N} w_{il} - d_l
\]
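The effect of the constants $M^\phi_l$ can be checked on a tiny example: setting $y^\phi_l = 1$ relaxes the capacity constraint of scenario l completely, and the budget constraint (27) limits the total probability of the relaxed scenarios. The Python sketch below takes the smallest admissible constant and searches by enumeration for an indicator vector y that makes a fixed x feasible. The function names and the enumeration are assumptions made for this illustration (non-negative weights are assumed), not the solution method of the paper.

```python
from itertools import product


def big_m(w_l, d_l):
    """Smallest admissible constant for scenario l: sum_i w_il - d_l."""
    return sum(w_l) - d_l


def feasible_with_indicators(x, weights, caps, probs, alpha):
    """Search y in {0,1}^L satisfying the big-M constraints and (27) for fixed x.

    weights[l] and caps[l] are the weight vector and capacity of scenario l.
    Returns the first feasible y found, or None if there is none.
    """
    L = len(weights)
    loads = [sum(w_i * x_i for w_i, x_i in zip(weights[l], x)) for l in range(L)]
    for y in product((0, 1), repeat=L):
        # budget constraint (27): probability mass of relaxed scenarios
        if sum(p * y_l for p, y_l in zip(probs, y)) > alpha:
            continue
        # capacity constraint, relaxed by M_l whenever y_l = 1
        if all(loads[l] <= caps[l] + big_m(weights[l], caps[l]) * y[l]
               for l in range(L)):
            return list(y)
    return None
```

With three weight scenarios of probabilities 0.5, 0.25 and 0.25, loads 5, 6 and 8 and a capacity of 6 in every scenario, a risk level of 0.25 admits y = (0, 0, 1), that is, only the third scenario is relaxed, whereas no feasible y exists for a risk level of 0.1.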
In the same manner, constraint (12) is equivalent to:
\[
\sum_{i=1}^{N} v_{ikr} (u_{ik} + x_i - u^-_{ik}) \le h_{kr}, \quad r \in \Lambda_k, \; k = 1 : K
\]
\[
\sum_{r \in \Lambda_k} p^\psi_{kr} \ge 1 - \alpha_2, \quad k = 1 : K
\]
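where $v_{ikr} = v_{ik}(\psi_{kr})$, $h_{kr} = h_k(\psi_{kr})$ and $\Lambda_k$ is a subset of {1, ..., R}. Again we can reformulate them using
binary variables:
\[
y^\psi_{kr} = \begin{cases} 0 & \text{if } r \in \Lambda_k \\ 1 & \text{otherwise} \end{cases}
\]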
yielding the following deterministic equivalent constraints:
\[
\sum_{i=1}^{N} v_{ikr} (u_{ik} + x_i - u^-_{ik}) \le h_{kr} + M^\psi_k\, y^\psi_{kr}, \quad r = 1 : R, \; k = 1 : K
\]
\[
\sum_{r=1}^{R} p^\psi_{kr}\, y^\psi_{kr} \le \alpha_2, \quad k = 1 : K \tag{28}
\]
where $M^\psi_k$ is an arbitrary number such that
\[
M^\psi_k \ge \max_{r} \left( \sum_{i=1}^{N} v_{ikr} - h_{kr} \right)
\]
5 Conclusion
In this paper we detailed the models for two-stage quadratic knapsack problems with recourse, to which
we added probability constraints on the first stage, the second stage, or both. Numerical
experiments are currently underway to complete those presented in [AG08]. Furthermore, these experiments
distinguish between models where the deselection of items is allowed, as presented in this paper, and
the case where items can only be added and not removed. These experiments compare the problem size
and results between a MIP formulation solved by CPLEX, an LP relaxation, and various SDP relaxations
for instances of various sizes of each variation. In particular, we show that these problems, even for low
values of the size parameters (number of items, number of scenarios), quickly grow into large numerical
problems where exact resolution is impossible. When this happens, we look at the improvements brought
by the SDP relaxations over the LP relaxation.
References
[AG08] R. Lopez, A. Gaivoronski, and A. Lisser. Knapsack problem with probability constraints. Technical Report RR1498, LRI, July 2008.
[CB98] Amy Mainville Cohn and Cynthia Barnhart. The stochastic knapsack problem with random weights: A heuristic approach to robust transportation planning. In Proceedings from TRISTAN III, San Juan, Puerto Rico, 1998.
[DGV04] Brian C. Dean, Michel X. Goemans, and Jan Vondrak. Approximating the stochastic knapsack problem: The benefit of adaptivity. In FOCS '04: Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, pages 208–217, Washington, DC, USA, 2004. IEEE Computer Society.
[GJ79] M. R. Garey and D. S. Johnson. Computers and Intractability. W. H. Freeman and Company, New York, 1979.
[GLS88] Martin Grötschel, László Lovász, and Alexander Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer, 1988.
[GW95] Michel X. Goemans and David P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115–1145, 1995.
[HR98] C. Helmberg and F. Rendl. Solving quadratic (0,1)-problems by semidefinite programs and cutting planes. Mathematical Programming, 82:291–315, 1998.
[IK75] Oscar H. Ibarra and Chul E. Kim. Fast approximation algorithms for the knapsack and sum of subset problems. J. ACM, 22(4):463–468, 1975.
[KP98] Anton J. Kleywegt and Jason D. Papastavrou. The dynamic and stochastic knapsack problem. Oper. Res., 46(1):17–35, 1998.
[Pis07] David Pisinger. The quadratic knapsack problem: a survey. Discrete Appl. Math., 155(5):623–648, 2007.
[SS06] David B. Shmoys and Chaitanya Swamy. An approximation scheme for stochastic linear programming and its application to stochastic integer programs. J. ACM, 53(6):978–1012, 2006.