Stochastic Quadratic Knapsack with Recourse

2010, Electronic Notes in Discrete Mathematics

This paper studies several extensions of the classical knapsack problem to the case where elements of the problem formulation are subject to uncertainty described by random variables. This brings the knapsack problem into the realm of stochastic programming. Four model formulations are proposed, based on the introduction of probability constraints. The first one is a two-stage quadratic knapsack with recourse (2QKR), and serves as the base on which we develop the three other models. The second one is a 2QKR with a probability constraint on the capacity of the knapsack in the first stage. The third one is a 2QKR with a probability constraint on the capacity of the knapsack in the second stage. Finally, the last model is a 2QKR with probability constraints on the capacity constraints of both stages. As far as we know, this is the first time such a constraint has been used in a two-stage model. The solution techniques are based on semidefinite relaxations, which allow solving large instances for which exact methods cannot be used.

Lisser Abdel*, Lopez Rafael*, Hu Xu*

* Laboratoire de Recherche en Informatique, Université Paris-Sud XI, Bât. 490, Université Paris-Sud, 91405 Orsay, France

Keywords: Semidefinite programming, knapsack, stochastic, recourse.

1 Introduction

The knapsack problem (KP) is a well-known and well-studied problem in combinatorial optimization. Knapsack problems are often used to model industrial situations, financial decisions or network design problems. They may also appear as sub-problems of larger or more complex problems. The most famous form of KP is the single-constraint binary version: we are given $N$ items, where item $i$ has a return $p_i$ and a weight $w_i$, $i = 1, \dots, N$, and a knapsack capacity $c$.
The problem is to select a subset of items whose total weight does not exceed $c$ and whose total return is maximal. In this form, the problem is known to be NP-hard [GJ79]. It has been studied intensively over the past decades, and several exact and approximate algorithms are known for it. In particular, it admits an FPTAS [IK75]. For the quadratic knapsack problem, the survey by David Pisinger [Pis07] gives detailed information on the problem and a number of results on the performance of various relaxations and algorithms used to solve or approximate it.

When modeling financial decisions, transportation, or production plans, however, this formulation shows its limitations, since it does not take into account uncertainty on the problem parameters, such as the prices $p_i$ or the weights $w_i$. Moreover, such decisions are not static, and this model cannot take into account new information that becomes available on the prices or the weights. Several studies of the stochastic knapsack problem have appeared in recent years, proposing heuristics [CB98] and approximation algorithms [KP98], [DGV04], [SS06].

Stochastic knapsack problems can reach a number of binary variables and constraints of such magnitude that commercial packages cannot find a solution within reasonable time or memory, which requires the use of linear relaxations to find an upper bound on the problem. While linear relaxations have been successful for many combinatorial optimization problems, it turns out that knapsack problems, especially in their quadratic formulation, cannot be approximated tightly by linearization-based methods. Stronger relaxation methods, namely semidefinite relaxations (called SDP hereafter), have turned out to be particularly interesting for such combinatorial optimization problems [GW95], [HR98]. See also [Pis07] for a survey on the quadratic knapsack problem.
More precisely, semidefinite programming is a recent development of convex optimization, which deals with optimization problems over symmetric positive semidefinite matrices with a linear cost function and linear constraints. Grötschel et al. showed that semidefinite optimization problems can be solved in polynomial time [GLS88].

In this paper we present the two-stage quadratic knapsack with recourse, which is used as the base for three variants of stochastic optimization problems. The first one is a two-stage quadratic knapsack with a probability constraint on the capacity in the first stage. The probability constraint models the risk we are willing to take when making our initial decision. We only have partial information about the weights of the items, but we have to make a decision with this limited knowledge, at the risk of violating the capacity constraint, knowing that a second stage decision (the recourse) will follow and allow us to correct the decision to some extent. In this paper, we consider a two-fold recourse: items can be removed if they turn out to be suboptimal or too heavy, or added if they appeared uninteresting at first but turn out to be desirable.

The second model is a two-stage quadratic knapsack with a probability constraint on the second stage capacity constraint. This two-stage formulation also models a situation where we make a decision with limited information, but where the second stage decision is made after receiving additional (but still incomplete) information about the weights or prices. This allows us to modify the initial decisions. Since some information is still unknown at the second stage, we face a risk of violating the capacity constraint when making the second stage decision.

Finally, the last model combines the probability constraints on the capacity constraint in both stages.
In this model, we are willing to take a risk in all the decisions.

This paper is organized as follows: first, we present the static knapsack problem with probability constraint, wherein we give a presentation of the problem. Then we reformulate this problem in the form of an SDP problem and detail two different relaxations. We then present the stochastic knapsack problem with recourse. Finally, concluding remarks are given.

2 Two-stage quadratic knapsack problem

The initial decision is made during the first stage, before the realizations of the random variables are known. Then these realizations are (partially) revealed and the second stage decision is made, which corrects the first stage decision in light of this information. This modeling scheme is known as a stochastic program with recourse. Moreover, we introduce probability constraints in the second stage. To the best of our knowledge, this formulation has not been studied in the literature.

We start by formulating the general quadratic stochastic program with recourse with probability constraint in the second stage. A generic quadratic stochastic problem with recourse can be modeled as follows:

\[
\max_{x \in \{0,1\}^n} \; x^T C x + \mathbb{E}_\omega\, Q(u, \omega) \tag{1}
\]
\[
R x \le s \tag{2}
\]

where (2) models generic linear constraints, with $R \in \mathbb{R}^{m \times n}$ and $s \in \mathbb{R}^m$. The second stage value is given by the solution of the problem:

\[
Q(u, \omega) = \max_{u} \; u^T D(\omega) u \tag{3}
\]
\[
W(\omega) u + T(\omega) x \le h(\omega) \tag{4}
\]

In this model, the uncertainty is described by the random vector $\omega$ with a given probability distribution. When the initial decision is made, $\omega$ is unknown. After this decision, part of the information is revealed. This corresponds to the realization of the random vector $\omega$. The second stage decision can then be made, with knowledge of $\omega$ and of the first stage decision $x$. More details can be found in [AG08].

First stage decision

We now need to adapt this generic model to our knapsack problem.
We assume we have $N$ items, and each item is characterized by its value $c_{ii}$ and weight $w_i$, $i = 1:N$. Each item pair $(i, j)$ is characterized in the same manner by its value $c_{ij}$. The objective is to maximize the value of the items contained in the knapsack, subject to its limited capacity $d$. The selection of an item during the first stage is defined by a binary decision variable $x_i$, which takes value 1 if item $i$ is included in the selection and 0 otherwise. The first stage decision of the problem is formulated as follows:

\[
\max_{x} \; \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_i x_j + \mathbb{E}_\omega\, Q(u, \omega) \tag{5}
\]
\[
\sum_{i=1}^{N} w_i x_i \le d \tag{6}
\]

Constraint (6) is the knapsack capacity constraint. The objective (5) consists of two parts: the value of the knapsack during the first stage, and the expected value of the same knapsack during the second stage. This expected value depends on the items selected in the first stage (the vector $x$) and on the realization of the random vector $\omega$.

Second stage decision

After the first stage decision is made, the values of the items may change, as well as their weights. During the second stage, item $i$ has value $b_{ii}(\omega)$ and weight $v_i(\omega)$. Each item pair $(i, j)$ has value $b_{ij}(\omega)$. As in the first stage, there is a constraint on the capacity of the knapsack, which is subject to change too. We denote the new capacity by $h(\omega)$. The realization of the vector $\omega$ is known before the second stage decision is made.

The second stage decision allows us to change the initial decision in order to correct mistakes that become apparent once extra information is known. There are two possibilities. First, an item which was selected during the first stage can be removed. We describe this decision by a binary variable $u_i^-$, set to 1 if item $i$ is removed from the knapsack and 0 otherwise. Likewise, an item that was previously rejected can be selected. In this case, we use a binary variable $u_i$, set to 1 if we select item $i$ in the second stage and 0 otherwise.
Note that if item $i$ was selected during the first stage and is not removed during the second, it is considered selected again (i.e. $u_i = 1$) in the second stage. Removing an item $i$ comes at a cost, which includes penalties such as the time or manipulation costs necessary to reorganize the knapsack. This allows us to formulate the second stage decision as follows:

\[
Q(u, \omega) = \max_{u, u^-} \; \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ij}(\omega) u_i u_j - \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ij}^-(\omega) u_i^- u_j^- \tag{7}
\]
\[
u_i \ge x_i - u_i^-, \quad i = 1:N \tag{8}
\]
\[
u_i^- \le x_i, \quad i = 1:N \tag{9}
\]
\[
\sum_{i=1}^{N} v_i(\omega) (u_i + x_i - u_i^-) \le h(\omega) \tag{10}
\]

Equation (7) is our objective: we want to maximize the value of the knapsack, from which we deduct the cost of removing previously selected items. The two constraints (8) and (9) link the first stage and second stage decisions. Constraint (8) means that if an item $i$ was selected during the first stage and not deselected, then it is necessarily considered selected during the second stage. Conversely, constraint (9) means that only an item that was selected during the first stage can be deselected. Constraint (10) is the capacity constraint in the second stage.

This model serves as the base on which we build stochastic extensions: in the first sub-section we introduce a probability constraint on the first stage of the problem, in the second sub-section we look at the model where the probability constraint is on the second stage, and in the third sub-section we combine probability constraints in both stages.

3 Probability constraints in the two-stage quadratic knapsack

In order to model the risk, we need to introduce probability constraints. Taking a risk in a two-stage decision process can happen in either or both of the stages, which leads to three variants of the two-stage quadratic knapsack.
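As an illustration of the base recourse problem (7)-(10) of Section 2, the following minimal Python sketch evaluates $Q$ for a single realized scenario by enumerating all pairs $(u, u^-)$. The function name `second_stage_value` and the numerical data are hypothetical and are not taken from the paper. Exhaustive enumeration is only practical for a handful of items and is not the solution technique advocated here. Constraint (10) is implemented literally as it is written in the text.

```python
from itertools import product


def second_stage_value(x, b, b_minus, v, h):
    """Brute-force the second stage problem (7)-(10) for one realized scenario.

    x       : first stage selection (tuple of 0/1)
    b       : second stage values b_ij (square list of lists)
    b_minus : removal costs b^-_ij (square list of lists)
    v       : second stage weights v_i
    h       : second stage capacity
    Returns (best value, u, u_minus), or (None, None, None) if infeasible.
    """
    n = len(x)
    best_value, best_u, best_u_minus = None, None, None
    for u in product((0, 1), repeat=n):
        for u_minus in product((0, 1), repeat=n):
            # (8): a first stage item that is not removed stays selected.
            if any(u[i] < x[i] - u_minus[i] for i in range(n)):
                continue
            # (9): only items selected in the first stage can be removed.
            if any(u_minus[i] > x[i] for i in range(n)):
                continue
            # (10): second stage capacity constraint, as written in the text.
            load = sum(v[i] * (u[i] + x[i] - u_minus[i]) for i in range(n))
            if load > h:
                continue
            # (7): value of the new selection minus the removal costs.
            gain = sum(b[i][j] * u[i] * u[j]
                       for i in range(n) for j in range(n))
            penalty = sum(b_minus[i][j] * u_minus[i] * u_minus[j]
                          for i in range(n) for j in range(n))
            value = gain - penalty
            if best_value is None or value > best_value:
                best_value, best_u, best_u_minus = value, u, u_minus
    return best_value, best_u, best_u_minus


# Hypothetical two-item instance: item 1 was selected in the first stage.
x = (1, 0)
b = [[3, 1], [1, 5]]
b_minus = [[1, 0], [0, 0]]
v = [2, 3]
h = 4
print(second_stage_value(x, b, b_minus, v, h))  # (4, (0, 1), (1, 0))
```

In this toy instance the best recourse removes item 1 at a cost of 1 and adds item 2 with value 5, which shows both directions of the two-fold recourse in one decision.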
Two-stage quadratic knapsack problem with probability constraint in the first stage

In order to model risk-taking in the first stage decision, we replace the capacity constraint (6) with a probability constraint:

\[
\mathbb{P}\left\{ \sum_{i=1}^{N} w_i(\phi) x_i \le d \right\} \ge 1 - \alpha_1 \tag{11}
\]

where $\phi$ is a random vector with a given known distribution representing the uncertainty on the weights of the items, and $\alpha_1$ is the risk we accept of violating the capacity constraint. This case corresponds to a situation in which we must make a first decision under very limited knowledge (on the weights and on the future), knowing that we will be able to correct the decision in a second stage.

Two-stage quadratic knapsack problem with probability constraint in the second stage

As in the case with a probability constraint in the first stage, the base problem (5)-(10) can be modified to allow risk-taking in the second stage decision. We have to replace constraint (10). This time, we model the added uncertainty with a second random vector $\psi$, conditioned on the vector $\omega$. While the realization of $\omega$ is known before the second stage decision is made, only the distribution of $\psi$, conditioned on $\omega$, is known:

\[
\mathbb{P}\left\{ \sum_{i=1}^{N} v_i(\omega, \psi) (u_i + x_i - u_i^-) \le h(\omega, \psi) \;\middle|\; \omega \right\} \ge 1 - \alpha_2 \tag{12}
\]

This situation corresponds to the case where information about the future is very limited. In fact, the distribution of the weights in the second stage is not known until after $\omega$ is realized.

Two-stage quadratic knapsack problem with probability constraint in both stages

Finally, the last variant combines risk-taking in both stages. This combines the limited information of both cases above: the first decision is made with uncertainty about the first stage weights, then information is revealed but is still incomplete, which leads to a second stage decision under uncertainty.
In this case the problem becomes:

\[
\max_{x} \; \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_i x_j + \mathbb{E}_\omega\, Q(u, \omega) \tag{13}
\]
\[
\mathbb{P}\left\{ \sum_{i=1}^{N} w_i(\phi) x_i \le d \right\} \ge 1 - \alpha_1 \tag{14}
\]
\[
Q(u, \omega) = \max_{u, u^-} \; \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ij}(\omega) u_i u_j - \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ij}^-(\omega) u_i^- u_j^- \tag{15}
\]
\[
u_i \ge x_i - u_i^-, \quad i = 1:N \tag{16}
\]
\[
u_i^- \le x_i, \quad i = 1:N \tag{17}
\]
\[
\mathbb{P}\left\{ \sum_{i=1}^{N} v_i(\omega, \psi) (u_i + x_i - u_i^-) \le h(\omega, \psi) \;\middle|\; \omega \right\} \ge 1 - \alpha_2 \tag{18}
\]

As we showed in [AG08], formulation (5)-(10) of the two-stage quadratic knapsack problem is quite general and covers many specific cases. In particular, it allows the modeling of additional characteristics which do not appear explicitly in the problem description. The main such feature is that the set of allowable items may differ between the first and the second stage. Since the weights of the items and the capacity of the knapsack may change (based on the realization of $\omega$) between the first and second stage, we can model cases where some items are not allowed during either stage. For example, an item could be unavailable during the first stage and only available during the second stage, in which case it would have $w_i > d$. This also allows different sets of items for the two stages, which is traditional in recourse models: items which can only be decided on in the first stage have a deselection cost $b_{ii}^-$ set to a very large number (to prevent deselection), and second stage values $b_{ii}, b_{ij} \le 0$ (which means that selection in the second stage is suboptimal, since it only affects the weight and does not increase the profit).

We now work with this problem and reformulate it in order to be able to use solution techniques such as SDP. The first step of the reformulation is to rewrite the problem as deterministic equivalent problems. We are then able to use semidefinite relaxations on the deterministic equivalent problems.
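To make the first stage probability constraint (11), which reappears as (14), more concrete, the sketch below checks it for a fixed selection $x$ when $\phi$ takes only finitely many values, the setting assumed in Section 4. The function names `satisfied_mass` and `chance_constraint_holds`, as well as all numerical data, are hypothetical and chosen for illustration only. The probabilities are deliberately exact binary fractions so that the floating-point comparison is not affected by rounding.

```python
def satisfied_mass(x, weight_scenarios, probs, d):
    """Total probability of the scenarios in which selection x fits capacity d."""
    mass = 0.0
    for weights, p in zip(weight_scenarios, probs):
        load = sum(w * xi for w, xi in zip(weights, x))
        if load <= d:
            mass += p
    return mass


def chance_constraint_holds(x, weight_scenarios, probs, d, alpha):
    """Check P{sum_i w_i(phi) x_i <= d} >= 1 - alpha for a finite distribution."""
    return satisfied_mass(x, weight_scenarios, probs, d) >= 1.0 - alpha


# Hypothetical data: 3 items, 3 equally structured weight scenarios.
weight_scenarios = [
    [2, 3, 4],  # scenario 1
    [2, 4, 4],  # scenario 2
    [3, 5, 6],  # scenario 3
]
probs = [0.5, 0.25, 0.25]
d = 7
x = (1, 0, 1)

print(satisfied_mass(x, weight_scenarios, probs, d))                 # 0.75
print(chance_constraint_holds(x, weight_scenarios, probs, d, 0.25))  # True
print(chance_constraint_holds(x, weight_scenarios, probs, d, 0.1))   # False
```

Here the selection overloads the knapsack only in the third scenario, so it is acceptable at risk level $\alpha_1 = 0.25$ but not at $\alpha_1 = 0.1$.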
4 Deterministic equivalent problem

In order to rewrite the stochastic quadratic knapsack with recourse in a deterministic form, we consider the case where the distribution of the random vectors $\phi$, $\omega$ and $\psi$ is concentrated in a finite number of points. We assume that the random vector $\omega$ is concentrated in the finite number of points $\omega_k$, $k = 1:K$, with probabilities $p_k^\omega$. We will refer to these points as scenarios. In this case the problem (5)-(10) can be rewritten as follows. The objective function in the first stage becomes:

\[
\max_{x} \left[ \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_i x_j + \sum_{k=1}^{K} p_k^\omega Q(u, k) \right] \tag{19}
\]

with the second stage decision becoming:

\[
Q(u, k) = \max_{u, u^-} \left[ \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ijk} u_{ik} u_{jk} - \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ijk}^- u_{ik}^- u_{jk}^- \right] \tag{20}
\]

The second stage capacity constraint also changes:

\[
\sum_{i=1}^{N} v_{ik} (u_{ik} + x_i - u_{ik}^-) \le h_k, \quad \forall k = 1:K \tag{21}
\]

where

\[
Q(u, \omega_k) = Q(u, k), \quad b_{ij}(\omega_k) = b_{ijk}, \quad b_{ij}^-(\omega_k) = b_{ijk}^-, \quad v_i(\omega_k) = v_{ik}, \quad h(\omega_k) = h_k.
\]

Substituting (20) into (19) and collecting the constraints for each scenario, we obtain an equivalent problem:

\[
\max_{x, u_{ik}, u_{ik}^-} \left[ \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_i x_j + \sum_{k=1}^{K} p_k^\omega \left( \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ijk} u_{ik} u_{jk} - \sum_{i=1}^{N} \sum_{j=1}^{N} b_{ijk}^- u_{ik}^- u_{jk}^- \right) \right] \tag{22}
\]
\[
\sum_{i=1}^{N} w_i x_i \le d \tag{23}
\]
\[
u_{ik} \ge x_i - u_{ik}^-, \quad i = 1:N, \; k = 1:K \tag{24}
\]
\[
u_{ik}^- \le x_i, \quad i = 1:N, \; k = 1:K \tag{25}
\]
\[
\sum_{i=1}^{N} v_{ik} (u_{ik} + x_i - u_{ik}^-) \le h_k, \quad k = 1:K \tag{26}
\]

Probabilistic constraints

Probability constraints (11) and (12) are reformulated as deterministic equivalent constraints. Suppose that the random vector $\phi$ (resp. $\psi$) is concentrated in the finite number of points $\phi_l$, $l = 1:L$ (resp. $\psi_{kr}$, $k = 1:K$, $r = 1:R$) with probabilities $p_l^\phi$ (resp. $p_{kr}^\psi$) such that

\[
\sum_{l=1}^{L} p_l^\phi = 1, \quad p_l^\phi \ge 0
\]
\[
\sum_{r=1}^{R} p_{kr}^\psi = 1, \quad p_{kr}^\psi \ge 0, \quad k = 1:K
\]

Then constraint (11) is equivalent to the pair:

\[
\begin{cases}
\displaystyle \sum_{i=1}^{N} w_{il} x_i \le d_l, & l \in \Gamma \\[2ex]
\displaystyle \sum_{l \in \Gamma} p_l^\phi \ge 1 - \alpha_1
\end{cases}
\]

where $w_{il} = w_i(\phi_l)$, $d_l = d(\phi_l)$, and $\Gamma$ is the subset of scenarios of the set $\{1, \dots, L\}$ in which the capacity constraint is satisfied, while the set $\{1, \dots, L\} \setminus \Gamma$ corresponds to the scenarios in which risk is taken. These constraints can be reformulated as binary constraints by introducing the auxiliary binary variable $y_l^\phi$ (resp. $y_{kr}^\psi$) for each scenario $l = 1:L$ (resp. observation $r = 1:R$ and scenario $k = 1:K$) as follows:

\[
y_l^\phi = \begin{cases} 0 & \text{if } l \in \Gamma \\ 1 & \text{otherwise} \end{cases}
\]

This yields the following deterministic equivalent constraints:

\[
\begin{cases}
\displaystyle \sum_{i=1}^{N} w_{il} x_i \le d_l + M_l^\phi y_l^\phi, & l = 1:L \\[2ex]
\displaystyle \sum_{l=1}^{L} p_l^\phi y_l^\phi \le \alpha_1
\end{cases} \tag{27}
\]

where $M_l^\phi$ is an arbitrary number such that

\[
M_l^\phi \ge \sum_{i=1}^{N} w_{il} - d_l
\]

In the same manner, constraint (12) is equivalent to:

\[
\begin{cases}
\displaystyle \sum_{i=1}^{N} v_{ikr} (u_{ik} + x_i - u_{ik}^-) \le h_{kr}, & r \in \Lambda_k, \; k = 1:K \\[2ex]
\displaystyle \sum_{r \in \Lambda_k} p_{kr}^\psi \ge 1 - \alpha_2, & k = 1:K
\end{cases}
\]

where $v_{ikr} = v_{ik}(\psi_{kr})$, $h_{kr} = h_k(\psi_{kr})$, and $\Lambda_k$ is a subset of $\{1, \dots, R\}$. Again we can reformulate them using binary variables:

\[
y_{kr}^\psi = \begin{cases} 0 & \text{if } r \in \Lambda_k \\ 1 & \text{otherwise} \end{cases}
\]

yielding the following deterministic equivalent constraints:

\[
\begin{cases}
\displaystyle \sum_{i=1}^{N} v_{ikr} (u_{ik} + x_i - u_{ik}^-) \le h_{kr} + M_k^\psi y_{kr}^\psi, & r = 1:R, \; k = 1:K \\[2ex]
\displaystyle \sum_{r=1}^{R} p_{kr}^\psi y_{kr}^\psi \le \alpha_2, & k = 1:K
\end{cases} \tag{28}
\]

where $M_k^\psi$ is an arbitrary number such that

\[
M_k^\psi \ge \max_{r} \left( \sum_{i=1}^{N} v_{ikr} - h_{kr} \right)
\]

5 Conclusion

In this paper we detailed the models for two-stage quadratic knapsack problems with recourse, into which we introduced probability constraints on the first stage, the second stage, or both. Numerical experiments are currently underway to complete those presented in [AG08]. Furthermore, these experiments distinguish between models where the deselection of items is allowed, as presented in this paper, and the case where items can only be added and not removed.
These experiments compare the problem size and results between an MIP formulation solved by CPLEX, an LP relaxation, and various SDP relaxations, for various instance sizes of each variant. In particular, we show that these problems, even for low values of the size parameters (number of items, number of scenarios), quickly grow into large numerical problems for which exact solution is impossible. When this happens, we look at the improvements brought by the SDP relaxations over the LP relaxation.

References

[AG08] A. Gaivoronski, A. Lisser, and R. Lopez. Knapsack problem with probability constraints. Technical Report RR1498, LRI, July 2008.

[CB98] Amy Mainville Cohn and Cynthia Barnhart. The stochastic knapsack problem with random weights: A heuristic approach to robust transportation planning. In Proceedings from TRISTAN III, San Juan, Puerto Rico, 1998.

[DGV04] Brian C. Dean, Michel X. Goemans, and Jan Vondrak. Approximating the stochastic knapsack problem: The benefit of adaptivity. In FOCS '04: Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, pages 208–217, Washington, DC, USA, 2004. IEEE Computer Society.

[GJ79] M. R. Garey and D. S. Johnson. Computers and Intractability. W. H. Freeman and Company, New York, 1979.

[GLS88] Martin Grötschel, László Lovász, and Alexander Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer, 1988.

[GW95] Michel X. Goemans and David P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115–1145, 1995.

[HR98] C. Helmberg and F. Rendl. Solving quadratic (0,1)-problems by semidefinite programs and cutting planes. Mathematical Programming, 82:291–315, 1998.

[IK75] Oscar H. Ibarra and Chul E. Kim. Fast approximation algorithms for the knapsack and sum of subset problems. J. ACM, 22(4):463–468, 1975.

[KP98] Anton J. Kleywegt and Jason D. Papastavrou. The dynamic and stochastic knapsack problem. Oper. Res., 46(1):17–35, 1998.

[Pis07] David Pisinger. The quadratic knapsack problem: a survey. Discrete Appl. Math., 155(5):623–648, 2007.

[SS06] David B. Shmoys and Chaitanya Swamy. An approximation scheme for stochastic linear programming and its application to stochastic integer programs. J. ACM, 53(6):978–1012, 2006.