
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, Vol. 119, No. 3, pp. 499–533, December 2003

Augmented Lagrangian Active Set Methods for Obstacle Problems¹

T. KÄRKKÄINEN,² K. KUNISCH,³ and P. TARVAINEN⁴

Communicated by R. Glowinski

Abstract. Active set strategies for two-dimensional and three-dimensional, unilateral and bilateral obstacle problems are described. Emphasis is given to algorithms resulting from the augmented Lagrangian (i.e., primal-dual) formulation of the discretized obstacle problems, for which convergence and rate of convergence are considered. For the bilateral case, modifications of the basic primal-dual algorithm are also introduced and analyzed. Finally, efficient computer realizations that are based on multigrid and multilevel methods are suggested and different aspects of the proposed techniques are investigated through numerical experiments.

Key Words. Obstacle problems, active set strategies, augmented Lagrangian methods, multigrid and multilevel methods.

¹ This work was partially supported by the Academy of Finland under Project 772494.
² Professor of Software Engineering, Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland.
³ Professor of Mathematics, Institute for Mathematics, University of Graz, Graz, Austria.
⁴ Director, Numerola Oy, Jyväskylä, Finland.

1. Introduction

Projected relaxation methods are the most well-known solution techniques for discretized obstacle problems (Refs. 1–2). They are known to be convergent and easy to implement. However, their convergence rates depend crucially on the choice of the relaxation parameter and they deteriorate rapidly when the triangulations are refined. Moreover, they do not take into account the characteristic domain decomposition of the computational domain, given by the (a priori unknown) free boundary of the coincidence set.

Various domain decomposition and multilevel algorithms have been proposed for obstacle problems (Refs. 3–9). Their motivation is twofold. First, the convergence rates can be independent of the mesh refinement. Second, one can develop iterative algorithms where the characteristic domain decomposition of the obstacle problem is taken into account to construct reasonable domain partitions. In Ref. 8, a Schwarz method based on the particular monotonicity properties of the unilateral obstacle problems was introduced. Its convergence rate is independent of the mesh refinement, but it requires a nonlinear solution algorithm in some subdomain. The method was compared numerically to projected relaxation methods and to a primal active set strategy in Ref. 10.

Another efficient way to solve discrete obstacle problems is given by active set strategies (Refs. 11–14). Their basic iteration consists of two steps. In the first phase, the (mesh) domain is decomposed into active and inactive parts, based on a criterion specifying a certain active set method. In the second phase, a reduced linear system associated with the inactive set is solved. To make such iterative processes efficient, a fast linear system solver, suitable for problems posed on domains with nonregular geometry, is required. Contrary to the projected relaxation and domain decomposition methods, no additional linear or nonlinear subproblem solvers are needed.
In this paper, we continue the efforts in developing the theory and efficient computer realization of active set strategies that are based on the augmented Lagrangian, i.e., primal-dual formulation of the unilateral and bilateral obstacle problems (Refs. 15–16). For example, a rate of convergence estimate valid for a wide class of active set methods is given. The most significant findings of the present article are related to the regularization coefficient $c$ [cf. (7)] resulting from the augmentation of the inequality constraint. Both theoretical analysis by means of the stability of the active set iterations and numerical observations on convergence, accuracy, and overall efficiency indicate that it is always advantageous to choose $c$ large. This gives new insight into the overall treatment of obstacle problems, because the primal active set methods (e.g., in Refs. 11–12 and 17) correspond to $c = 1$.

Numerically, we obtain high computational efficiency for solving obstacle problems with even a few million unknowns by exploiting the new multigrid linear system solver from Refs. 18–19. Together with nested iterations, our multilevel approach shows optimal performance in the sense that the numbers of both outer (active set) iterations and inner (multigrid preconditioned conjugate gradient) iterations are independent of the mesh size. However, some aspects of the discretization could be further improved. Specifically, the discretization is based on a piecewise linear finite-element method on an orthogonal, uniform mesh. Therefore, the approximation of the boundary of the coincidence set is not optimal and one could apply mesh adaptation techniques to obtain better accuracy (Ref. 17). Notice that our methods are readily applicable to such cases as well, due to the good performance of the multigrid preconditioner on locally fitted meshes (Ref. 19).

The remainder of the paper is organized as follows. In Section 2, we introduce some notation and the model problems, as well as give equivalent characterizations for the solution of the discretized obstacle problem. We consider the unilateral case in Section 3 and the bilateral case in Section 4. In Section 5, we describe some numerical experiments made with two-dimensional (2D) and three-dimensional (3D) examples. We remark that a more extensive coverage of the numerical experiments with the proposed methods is given in Ref. 20. Finally, the conclusions are given in Section 6.

2. Preliminaries

In this section, we formulate the obstacle problems for which the solution algorithms are developed in the forthcoming sections. Moreover, we fix some notation and assumptions, as well as give equivalent characterizations for the solutions of the discrete obstacle problems. We consider the following problem:

$$\min_{u \in H_0^1(\Omega)} \; a(u, u) - 2l(u), \tag{1a}$$
$$\text{s.t.} \quad \varphi(x) \le u(x) \le \psi(x), \quad \text{a.e. } x \in \Omega, \tag{1b}$$

where $a$ is a symmetric, $H_0^1$-elliptic bilinear form, $l$ is a linear form, and the functions $-\infty < \varphi < \psi < \infty$ a.e. in $\Omega$ represent given lower and upper obstacles, respectively, in the bounded domain $\Omega \subset \mathbb{R}^d$, $d = 2, 3$. Discretization of problem (1) leads to the finite-dimensional obstacle problem

$$\min_{u \in C^n} \; J(u) = \langle Au, u \rangle - 2\langle f, u \rangle, \tag{2}$$
$$C^n = \{u \in \mathbb{R}^n : \varphi_i \le u_i \le \psi_i \text{ for all } i = 1, \ldots, n\}, \tag{3}$$

where $f, \varphi, \psi \in \mathbb{R}^n$ and $\langle \cdot, \cdot \rangle$ denotes the usual Euclidean inner product in $\mathbb{R}^n$. We assume that the inequality $\varphi < \psi$ holds componentwise and that $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ is a Stieltjes matrix, that is, a symmetric and nonsingular M-matrix (Ref. 21).
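As a concrete instance of this setting, the following minimal sketch (ours, not from the paper; Python with SciPy, and with simple illustrative 1D data rather than the test problems of Section 5) assembles discrete data of the form (2)–(3). The tridiagonal finite-difference Laplacian used here is a Stieltjes matrix, so the assumptions on $A$ are satisfied.

```python
import numpy as np
import scipy.sparse as sp

def discrete_obstacle_data(n):
    """Illustrative discrete data (A, f, phi, psi) for problem (2)-(3).

    A is the 1D finite-difference Laplacian, a symmetric nonsingular
    M-matrix (Stieltjes matrix).  The load and obstacles are simple
    example choices, not the paper's test data.
    """
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)                  # interior grid points
    A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
    f = 10.0 * np.ones(n)                           # constant right-hand side
    phi = -np.inf * np.ones(n)                      # no lower obstacle (unilateral)
    psi = np.minimum(x, 1.0 - x)                    # distance to the boundary
    return A.tocsr(), f, phi, psi
```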
In particular, $A$ is symmetric and positive definite. Let $\alpha > 0$ and $\beta < \infty$ denote the smallest and the largest eigenvalue of $A$, respectively. We denote the set of indices by $N = \{1, 2, \ldots, n\}$ and by $\langle \cdot, \cdot \rangle_S$ the restriction of $\langle \cdot, \cdot \rangle$ to an index subset $S \subseteq N$, with the corresponding norm $|v|_S^2 = \langle v, v \rangle_S$. Moreover, for any vector $v \in \mathbb{R}^n$, the notation $v > 0$ [$v \ge 0$] on $S$ means that $v_j > 0$ [$v_j \ge 0$] for all $j \in S$. The operator $A$ induces the energy norm defined by $\|u\|^2 = \langle Au, u \rangle$; the restriction of this norm to a subset $S \subseteq N$ is given by $\|u\|_S^2 = \langle Au, u \rangle_S$. Finally, we recall that, for every decomposition $N = I \cup J$, every principal minor $A_{II} = P_I A P_I$ of $A$ (i.e., the rows and columns of $A$ indexed by $I$) is monotone, that is, $A_{II}^{-1} \ge 0$, and every codiagonal block $A_{IJ}$, $I \ne J$, is nonpositive, that is, $A_{IJ} \le 0$ (i.e., $v_J \ge 0$ implies $A_{IJ} v_J \le 0$; Ref. 21).

A typical obstacle problem to which our algorithms are applicable is given by the following example.

Example 2.1. Set
$$a(u, u) = \int_\Omega B \nabla u \cdot \nabla u \, dx, \qquad l(u) = \int_\Omega \tilde{f} u \, dx, \tag{4}$$
where $\tilde{f}$ is a given function and $B$ is a symmetric and positive definite $d \times d$ (coefficient) matrix. Discretizing (4) by the finite-element (FE) method with a piecewise linear basis $\{\varphi_i\}_{i=1}^n$ without obtuse-angled triangles, or using the finite-difference (FD) five-point (2D) or seven-point (3D) scheme, one obtains a discrete problem with an M-matrix (Ref. 22).

The next result gives equivalent forms of the optimality conditions for problem (2).

Theorem 2.1. See Refs. 1–2, 20. The following three conditions are equivalent:
(i) $u^*$ is the unique solution of (2).
(ii) $u^* \in C^n$ satisfies the variational inequality
$$\langle Au^* - f, v - u^* \rangle \ge 0, \quad \text{for all } v \in C^n. \tag{5}$$
(iii) $N = J^* \cup I^*$, where $J^* = J^*_\psi \cup J^*_\varphi$ and
$$f - Au^* > 0 \;\text{ and }\; u^* = \psi, \quad \text{on } J^*_\psi, \tag{6a}$$
$$f - Au^* < 0 \;\text{ and }\; u^* = \varphi, \quad \text{on } J^*_\varphi, \tag{6b}$$
$$f - Au^* = 0 \;\text{ and }\; \varphi \le u^* \le \psi, \quad \text{on } I^*. \tag{6c}$$

Primal-dual active set methods are based on a regularization of the optimality condition given in Theorem 2.1. Regularization refers to the fact that the inequality involving the operator $A$ is replaced by an equality by means of an appropriate Lagrangian (slack) variable. To carry out this process, we can use the general augmented Lagrangian formulation as developed in Refs. 15–16 and 23. Alternatively, as in the present case, one can resort to direct computation: introducing the Lagrange variable $\lambda^* = f - Au^*$, one observes that (6) is equivalent to the existence of a pair $(u^*, \lambda^*)$ such that $u^* \in C^n$ and
$$Au^* + \lambda^* = f, \tag{7a}$$
$$\lambda^*_j + c(u^* - \psi)_j > 0, \quad \text{for all } j \in J^*_\psi, \tag{7b}$$
$$\lambda^*_j + c(u^* - \varphi)_j < 0, \quad \text{for all } j \in J^*_\varphi, \tag{7c}$$
$$\lambda^*_i = 0, \quad \text{for all } i \in I^*, \tag{7d}$$
where $c$ is any positive constant resulting from the augmentation.

3. Unilateral Obstacle Problem

A unilateral (upper) obstacle problem results from the original problem (2) by setting (formally) $\varphi = -\infty$. In this case, the active/inactive decomposition of the indices in the optimality condition is given by
$$\lambda^*_j > 0 \;\text{ and }\; u^*_j = \psi_j, \quad \text{for all } j \in J^*, \tag{8a}$$
$$\lambda^*_i = 0 \;\text{ and }\; u^*_i \le \psi_i, \quad \text{for all } i \in I^*. \tag{8b}$$
The regularized optimality condition for (8) is given by (e.g., Ref. 15)
$$Au^* + \lambda^* = f, \tag{9a}$$
$$\lambda^* = \max\{0, \lambda^* + c(u^* - \psi)\}, \quad \text{for all } c > 0, \tag{9b}$$
and thus necessarily $\lambda^* \ge 0$. In (9), the max-operator is understood componentwise.
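As a small illustration (our sketch, not from the paper), the regularized system (9) can be checked numerically for a candidate pair $(u, \lambda)$; both residuals below vanish exactly at $(u^*, \lambda^*)$, for any $c > 0$.

```python
import numpy as np

def unilateral_residuals(A, f, u, psi, lam, c=1.0):
    """Sup-norm residuals of the regularized optimality system (9).

    Returns the norms of  A u + lam - f  and of
    lam - max{0, lam + c (u - psi)}; both are zero at the solution.
    """
    r1 = A @ u + lam - f
    r2 = lam - np.maximum(0.0, lam + c * (u - psi))
    return np.linalg.norm(r1, np.inf), np.linalg.norm(r2, np.inf)
```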
3.1. Primal-Dual Algorithm. In order to develop a Newton-type method for the unilateral obstacle problem, we apply a primal-dual active set strategy suggested by (7) and (9).

Algorithm A1. Primal-Dual Active Set Algorithm for the Unilateral Obstacle Problem.

Step 1. Initialize $u^0$ and $\lambda^0 \ge 0$. Choose $c > 0$. Set $k = 0$.
Step 2. Compute $\hat{\lambda}^k = \max\{0, \lambda^k + c(u^k - \psi)\}$. Determine the active and inactive sets by
$$J^k = \{\, j \in N : \hat{\lambda}^k_j > 0 \,\} \quad \text{(active)}, \tag{10a}$$
$$I^k = \{\, i \in N : \hat{\lambda}^k_i = 0 \,\} \quad \text{(inactive)}. \tag{10b}$$
Step 3. If $k \ge 1$ and $J^k = J^{k-1}$, then stop; the solution is $u^k$.
Step 4. Let $u$ and $\lambda$ be the solution of the linear system
$$Au + \lambda = f, \tag{11a}$$
$$\lambda = 0 \;\text{ on } I^k \quad \text{and} \quad u = \psi \;\text{ on } J^k. \tag{11b}$$
Set $u^{k+1} = u$, $\lambda^{k+1} = \max\{0, \lambda\}$, $k = k + 1$, and go to Step 2.

The only laborious step in Algorithm A1 is the realization of (11), which can be reduced to solving
$$A_{II} u_I = f_I - A_{IJ} \psi_J, \tag{12a}$$
$$u_J = \psi_J, \quad \lambda = f - Au, \tag{12b}$$
$$\text{for } I = I^k \text{ and } J = J^k. \tag{12c}$$
The matrices $A_{II}$ and $A_{IJ}$ result from the partition of $A$ according to the active and inactive sets. This implies that $u$ is unique and $\lambda$ is determined uniquely from $u$. The computational domain in (11) and (12) is determined by the inactive set $I$, so that the linear system is posed on a varying and presumably complicated geometry during the active set iterations. It is well known that this makes the iterative solution of the linear problem a difficult task (Refs. 19, 24).

3.2. Monotone Convergence. We commence the analysis of Algorithm A1 by justifying the stopping criterion in Step 3.

Lemma 3.1. If Algorithm A1 stops in Step 3, then the current iterate $(u^k, \lambda^k)$ satisfies the optimality condition (8).

Proof. Let $k \ge 1$. Then, the set $J^{k-1}$ can be decomposed as
$$J^{k-1} = J_1^{k-1} \cup J_2^{k-1},$$
where
$$J_1^{k-1} = \{\, j \in J^{k-1} : \lambda^k_j > 0 \,\}, \qquad J_2^{k-1} = \{\, j \in J^{k-1} : \lambda^k_j = 0 \,\}.$$
Similarly,
$$I^{k-1} = I_1^{k-1} \cup I_2^{k-1},$$
where
$$I_1^{k-1} = \{\, i \in I^{k-1} : u^k_i \le \psi_i \,\}, \qquad I_2^{k-1} = \{\, i \in I^{k-1} : u^k_i > \psi_i \,\}.$$
Using Step 2, we then obtain
$$J^k = J_1^{k-1} \cup I_2^{k-1} \quad \text{and} \quad I^k = I_1^{k-1} \cup J_2^{k-1},$$
where the sets $J_1^{k-1}, J_2^{k-1}, I_1^{k-1}, I_2^{k-1}$ are disjoint and this decomposition is independent of $c$. Hence, if $J^k = J^{k-1}$, then we have $I_2^{k-1} = J_2^{k-1} = \emptyset$, and thus $(u^k, \lambda^k)$ satisfies (9). □

Notice from the proof above that, for $k \ge 1$, changes in the sets $I^k$ and $J^k$ are independent of the parameter $c$.

Theorem 3.1. Let $u^* \le u^0 \le \psi$ be given. Define $\lambda^0 = \max\{0, f - Au^0\}$ and assume that we obtain a disjoint decomposition $N = I^{-1} \cup J^{-1}$, with $I^{-1} = \{i \in N : \lambda^0_i = 0\}$ and $J^{-1} = \{j \in N : \lambda^0_j > 0 \text{ and } u^0_j = \psi_j\}$, such that $J^* \subseteq J^{-1}$. Then, Algorithm A1 converges monotonically; that is, for all $k \ge 0$,
$$u^k \ge u^{k+1} \ge u^* \quad \text{and} \quad \lambda^k \ge \lambda^{k+1} \ge \lambda^*.$$
Furthermore, Algorithm A1 converges in finitely many steps.

Proof. The proof is rather similar to that of the primal active set method in Ref. 12 but, for completeness and for the needs of the further analysis, it is included in the Appendix (Section 7). □

Remark 3.1. For $u^0 = \psi$, the index set $N$ can be decomposed in the desired manner. The fact that $J^* \subseteq J^{-1}$ follows using the same technique as in (43), (44).

Remark 3.2. Consider the assumption $u^0 \ge u^*$. Using a technique similar to the one in Ref. 9 (see also Refs. 12, 20), it can be shown that, when $u$ is the solution of the reduced system
$$u_J = \psi_J, \qquad A_{II} u_I = f_I - A_{IJ} u_J, \tag{13}$$
we have $u \ge u^*$ independently of the decomposition $N = I \cup J$. This implies that, after one iteration of Algorithm A1, an upper solution $u^1 \ge u^*$ is obtained for any $u^0$. Moreover, since the infeasible components of $I_2^1$ enter $J^2$, where $u^2 = \psi$ and $u^2 \le u^1$, we have necessarily after two iterations the feasibility $u^2 \le \psi$ and the decomposition $N = I^{-1} \cup J^{-1}$ in the form required by Theorem 3.1.
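Before comparing with the primal strategy, we note that Steps 1–4 of Algorithm A1 admit a very compact implementation. The following sketch is ours (Python with SciPy); the reduced system (12) is solved directly here, whereas the paper's implementation in Section 5 uses a multigrid-preconditioned CG method instead.

```python
import numpy as np
from scipy.sparse.linalg import spsolve

def primal_dual_unilateral(A, f, psi, u0, lam0, c=1.0, max_iter=100):
    """Sketch of Algorithm A1 (primal-dual active set, unilateral case)."""
    u, lam = u0.copy(), lam0.copy()
    J_old = None
    for k in range(max_iter):
        # Step 2: partition N via lam_hat = max{0, lam + c(u - psi)}, cf. (10)
        lam_hat = np.maximum(0.0, lam + c * (u - psi))
        J = np.flatnonzero(lam_hat > 0.0)           # active set J^k
        I = np.flatnonzero(lam_hat == 0.0)          # inactive set I^k
        # Step 3: stop when the active set repeats
        if J_old is not None and np.array_equal(J, J_old):
            return u, lam, k
        # Step 4: reduced system (12) on the inactive set
        u = np.empty_like(f)
        u[J] = psi[J]
        u[I] = spsolve(A[I][:, I].tocsc(), f[I] - A[I][:, J] @ psi[J])
        lam = np.maximum(0.0, f - A @ u)            # lam^{k+1} = max{0, f - Au}
        J_old = J
    return u, lam, max_iter
```

With a direct solve, $(f - Au)_I = 0$ exactly, so the projection in the last line leaves the inactive components of $\lambda$ at zero, as required by (11b).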
Our numerical experiments indicate that sometimes one active set iteration can be saved by projecting $u^1$ onto $C^n$ already at this stage (cf. the implemented algorithms in Section 5.2). To this end, let us compare briefly the primal-dual algorithm of this paper to the corresponding primal strategy in Refs. 11–12 and 17. We recall further that there is much similarity between these algorithms and the projected Newton method by Bertsekas (Ref. 15, p. 76). In the primal approach, the parameter $c$ is not visible (i.e., it is fixed to unity) and the determination of the active set $J^k$ is based on the condition
$$(f - Au^k)_j + (u^k - \psi)_j = \lambda^k_j + (u^k - \psi)_j \ge 0, \tag{14}$$
without a projection step $\max\{0, f - Au^{k+1}\}$. As in the proof of Lemma 3.1, we obtain that, after the first iteration, the new sets for the primal strategy consist of
$$J^k = \{\, j \in N : \lambda^k_j + (u^k - \psi)_j \ge 0 \,\} = \{\, j \in J^{k-1} : \lambda^k_j \ge 0 \,\} \cup \{\, i \in I^{k-1} : u^k_i \ge \psi_i \,\}, \tag{15a}$$
$$I^k = \{\, i \in N : \lambda^k_i + (u^k - \psi)_i < 0 \,\} = \{\, i \in I^{k-1} : u^k_i < \psi_i \,\} \cup \{\, j \in J^{k-1} : \lambda^k_j < 0 \,\}. \tag{15b}$$
Thus, the difference between the strategies is due to the initialization and the different decomposition of the discrete variational inequality in Theorem 2.1(iii): the primal-dual strategy is based on the augmented disjoint decomposition according to the Lagrangian variable $\lambda^* = f - Au^*$, whereas the primal strategy uses a disjoint partition of the constraint $u^* \le \psi$.

3.3. Convergence Rate. In this section, we prove a basic convergence rate result, valid for both the primal and primal-dual active set strategies. For this purpose, let us define the local stencil neighborhood of a matrix $A = (a_{ij})$ for a row index $i \in N$ by setting
$$N_A(i) = \{\, j \in N : a_{ij} \ne 0 \,\}. \tag{16}$$
Hence, in the finite-difference or finite-element framework, the local stencil neighborhood consists of the node indices in the stencil connecting two discretization points $x_i$ and $x_j$ via a nonzero entry of $A$. For a subset $S \subseteq N$, we define the stencil interior relative to $A$ by
$$N_A(S) = \{\, i \in S : N_A(i) \subseteq S \,\}. \tag{17}$$
The following result for the monotone phase of the primal-dual and primal active set methods is the basis for the convergence rate result.

Lemma 3.2. If $I^k \subseteq I^{k+1}$, then $N_A(J^k) \subseteq J^{k+1}$.

Proof. (i) We consider first the primal-dual strategy. From the proof of Theorem 3.1, it follows that $I^{k+1} = I^k \cup J^k_2$, where
$$J^k_2 = \{\, j \in J^k : \lambda^{k+1}_j = 0 \,\} = \{\, j \in J^k : \max\{0, (f - Au^{k+1})_j\} = 0 \,\} = \{\, j \in J^k : (f - Au^{k+1})_j \le 0 \,\}. \tag{18}$$
By definition, for every $j \in N_A(J^k)$, we have $a_{ji} = 0$ for all $i \notin J^k$. This implies
$$(f - Au^{k+1})_j = (f - A\psi)_j = (f - Au^k)_j > 0, \quad \text{for all } j \in N_A(J^k), \tag{19}$$
so that the indices in $N_A(J^k)$ cannot belong to $J^k_2$.
(ii) The same technique applies to show that also for the primal strategy the residual in the stencil interior of the set $J^k$ in (15) retains its value over one iteration. This means that a change from $J^k$ into $I^{k+1}$ in the stencil interior is impossible for the primal strategy, which proves the result. □
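The stencil sets (16)–(17) are directly computable from the sparsity pattern of $A$; the following small sketch (ours) computes the stencil interior of a boolean index mask.

```python
import numpy as np

def stencil_interior(A, S):
    """Stencil interior N_A(S) of an index set S, cf. (16)-(17).

    S is a boolean mask over N; index i belongs to N_A(S) iff every
    column j with A[i, j] != 0 also lies in S.
    """
    A = A.tocsr()
    inside = np.zeros(len(S), dtype=bool)
    for i in np.flatnonzero(S):
        cols = A.indices[A.indptr[i]:A.indptr[i + 1]]   # N_A(i)
        inside[i] = S[cols].all()
    return inside
```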
Corollary 3.1. The number of outer iterations of the active set methods grows linearly with respect to $h^{-1}$ for the discretizations proposed in Example 2.1.

Proof. The monotone phase $I^k \subseteq I^{k+1}$ of the active set algorithm starts inevitably after the first iteration, independently of the initial vector $u^0$ (cf. Remark 3.2). With the proposed discretizations, only the neighboring points $x_i \in \{(x_1^j \pm h, x_2^j), (x_1^j, x_2^j \pm h)\}$ in 2D (with the corresponding 7-point neighborhood in 3D) belong to the local stencil neighborhood $N_A(j)$. Hence, in each iteration only an $h$-wide boundary strip of points in $J^k_2 \subseteq J^k \setminus N_A(J^k)$ can be inactivated, so that the result follows from Lemma 3.2. □

The result is illustrated in Fig. 1. Additional nodes which can be inactivated in one iteration when using a nine-point stencil, resulting from the Mehrstellenverfahren of Collatz or from bilinear finite elements (Ref. 25), for example, are also included. In Section 5, it will be shown that practically mesh-independent behavior of the active set methods can be recovered by using multilevel techniques.

Fig. 1. Inactive and active nodes and the nodes which can be inactivated in one iteration.

4. Bilateral Obstacle Problem

4.1. Basic Primal-Dual Algorithm. The formulation (7) suggests the following primal-dual active set strategy for solving the bilateral obstacle problem (2). Notice that, from a different viewpoint (i.e., the decision between a proportioning and a projective step), the so-called release coefficient $\Gamma$, playing a role similar to that of $c$ in Algorithm A2, was introduced for box-constrained quadratic programming problems in Ref. 26.

Algorithm A2. Primal-Dual Algorithm for the Bilateral Obstacle Problem.

Step 1. Let $(u^0, \lambda^0)$ be given. Choose $c > 0$ and set $k = 0$.
Step 2. Determine the active/inactive partition,
$$J_\varphi^k = \{\, j \in N : \lambda^k_j + c(u^k - \varphi)_j < 0 \,\},$$
$$J_\psi^k = \{\, j \in N : \lambda^k_j + c(u^k - \psi)_j > 0 \,\}.$$
Set $J^k = J_\varphi^k \cup J_\psi^k$ and $I^k = N \setminus J^k$. If $k \ge 1$ and $J^k = J^{k-1}$, then stop; the solution is $u^k$. Determine $(u^{k+1}, \lambda^{k+1})$ as the unique solution of the linear system
$$Au^{k+1} + \lambda^{k+1} = f, \tag{20a}$$
$$\lambda^{k+1} = 0 \quad \text{on } I^k, \tag{20b}$$
$$u^{k+1} = \varphi \quad \text{on } J_\varphi^k, \tag{20c}$$
$$u^{k+1} = \psi \quad \text{on } J_\psi^k. \tag{20d}$$
Step 3. Set $k = k + 1$ and go to Step 2.

As before, we notice that (20) actually reduces to solving
$$A_{II} u_I = f_I - A_{IJ_\varphi} \varphi_{J_\varphi} - A_{IJ_\psi} \psi_{J_\psi}, \quad u_{J_\varphi} = \varphi_{J_\varphi}, \quad u_{J_\psi} = \psi_{J_\psi}, \quad \lambda = f - Au, \tag{21}$$
for $(u, \lambda) = (u^{k+1}, \lambda^{k+1})$ and $J_\varphi = J_\varphi^k$, $J_\psi = J_\psi^k$, $I = I^k$.

For analyzing Algorithm A2, we investigate first the behavior of the active/inactive index sets.

Proposition 4.1. There exists $\bar{c} < \infty$ such that, for all $c \ge \bar{c}$, the condition
$$J_\psi^k \cap J_\varphi^{k+1} = \emptyset = J_\varphi^k \cap J_\psi^{k+1} \tag{22}$$
holds true for all $k \ge 0$ in Algorithm A2.

Proof. Let $I \cup J_\varphi \cup J_\psi$ be any mutually disjoint decomposition of $N$. If $u$ is the corresponding solution of (21), then it follows readily that
$$\alpha |u|_I^2 \le \bigl(|f| + \beta(|\varphi| + |\psi|)\bigr) |u|_I. \tag{23}$$
Consequently, there exists $c_1 > 0$ such that $|u| \le c_1$, with $u$ the solution to (21), regardless of the decomposition of $N$. Let $\hat{c} = \sup\{|Av| : |v| \le c_1\}$. Then, $|Au^k| \le \hat{c}$ for all $k \ge 1$. If there exists $j \in J_\varphi^{k+1} \cap J_\psi^k$, then $u_j^{k+1} = \psi_j$ and, by Step 2 of Algorithm A2,
$$\lambda_j^{k+1} + c(u^{k+1} - \varphi)_j = f_j - (Au^{k+1})_j + c(\psi - \varphi)_j < 0.$$
Consequently,
$$c \min_{i \in N} (\psi - \varphi)_i^2 < (Au^{k+1} - f)_j (\psi - \varphi)_j \le (\hat{c} + |f|)\,|\psi - \varphi|, \tag{24}$$
which is impossible if $c$ is chosen sufficiently large. Thus, $J_\varphi^{k+1} \cap J_\psi^k = \emptyset$. Analogously, one shows that $J_\varphi^k \cap J_\psi^{k+1} = \emptyset$. This ends the proof. □

The role of $c$ becomes evident from the above result. While in the unilateral case $c$ influences only the first iteration, in the bilateral case the distance between the given obstacles and the size of $c$ affect the behavior of the primal-dual algorithm throughout all iterations. The significance of (22) will be investigated more thoroughly in Section 5.
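For illustration, one sweep of Steps 2–3 of Algorithm A2 can be written as follows (our sketch, in Python with SciPy; the reduced system (21) is solved directly here, in place of the paper's preconditioned CG).

```python
import numpy as np
from scipy.sparse.linalg import spsolve

def bilateral_step(A, f, phi, psi, u, lam, c):
    """One sweep of Steps 2-3 of Algorithm A2 (bilateral case)."""
    J_phi = np.flatnonzero(lam + c * (u - phi) < 0.0)   # lower active set
    J_psi = np.flatnonzero(lam + c * (u - psi) > 0.0)   # upper active set
    mask = np.ones(len(f), dtype=bool)
    mask[J_phi] = mask[J_psi] = False
    I = np.flatnonzero(mask)                            # inactive set
    # Reduced system (21): fix u on the active sets, solve on I
    u_new = np.empty_like(u)
    u_new[J_phi], u_new[J_psi] = phi[J_phi], psi[J_psi]
    rhs = f[I] - A[I][:, J_phi] @ phi[J_phi] - A[I][:, J_psi] @ psi[J_psi]
    u_new[I] = spsolve(A[I][:, I].tocsc(), rhs)
    lam_new = f - A @ u_new               # vanishes on I in exact arithmetic
    return u_new, lam_new, J_phi, J_psi
```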
Next, we give a flowchart of the indices between two consecutive iteration steps.

Flowchart of Algorithm A2.

Inactive Set: $I^k \subseteq \{i \in N : \lambda^{k+1}_i = 0\}$.
  $I_1^k = \{i \in I^k : \varphi_i \le u^{k+1}_i \le \psi_i\}$. Then, $\lambda^{k+1}_i + c(u^{k+1} - \varphi)_i \ge 0$ and $\lambda^{k+1}_i + c(u^{k+1} - \psi)_i \le 0$, so that $I_1^k \subseteq I^{k+1}$.
  $I_2^k = \{i \in I^k : u^{k+1}_i < \varphi_i\}$. Then, $\lambda^{k+1}_i + c(u^{k+1} - \varphi)_i < 0$, so that $I_2^k \subseteq J_\varphi^{k+1}$.
  $I_3^k = \{i \in I^k : u^{k+1}_i > \psi_i\}$. Then, $\lambda^{k+1}_i + c(u^{k+1} - \psi)_i > 0$, so that $I_3^k \subseteq J_\psi^{k+1}$.

Lower Active Set: $J_\varphi^k \subseteq \{j \in N : u^{k+1}_j = \varphi_j\}$.
  $J_{\varphi,1}^k = \{j \in J_\varphi^k : \lambda^{k+1}_j < 0\}$. Then, $\lambda^{k+1}_j + c(u^{k+1} - \varphi)_j < 0$, so that $J_{\varphi,1}^k \subseteq J_\varphi^{k+1}$.
  $J_{\varphi,2}^k = \{j \in J_\varphi^k : \lambda^{k+1}_j \ge 0\}$. Then, $\lambda^{k+1}_j + c(u^{k+1} - \varphi)_j \ge 0$ and (22) implies $J_{\varphi,2}^k \subseteq I^{k+1}$.

Upper Active Set: $J_\psi^k \subseteq \{j \in N : u^{k+1}_j = \psi_j\}$.
  $J_{\psi,1}^k = \{j \in J_\psi^k : \lambda^{k+1}_j > 0\}$. Then, $\lambda^{k+1}_j + c(u^{k+1} - \psi)_j > 0$, so that $J_{\psi,1}^k \subseteq J_\psi^{k+1}$.
  $J_{\psi,2}^k = \{j \in J_\psi^k : \lambda^{k+1}_j \le 0\}$. Then, $\lambda^{k+1}_j + c(u^{k+1} - \psi)_j \le 0$ and $J_{\psi,2}^k \subseteq I^{k+1}$ by (22).

Altogether, under the condition (22), we obtain
$$I^{k+1} = I_1^k \cup J_{\varphi,2}^k \cup J_{\psi,2}^k, \qquad J_\varphi^{k+1} = J_{\varphi,1}^k \cup I_2^k, \qquad J_\psi^{k+1} = J_{\psi,1}^k \cup I_3^k. \tag{25}$$

Lemma 4.1. If (22) holds true and if $J^k = J^{k-1}$ for some $k \ge 1$, then we have $(u^{k+1}, \lambda^{k+1}) = (u^k, \lambda^k) = (u^*, \lambda^*)$.

Proof. Due to (22) and the uniqueness of the solution $u_I$ of the reduced system (21), we have $u^{k+1} = u^k$ and $\lambda^{k+1} = \lambda^k$ if $J^k = J^{k-1}$. Since (25) is a partition into disjoint sets, we find from $J^k = J^{k-1}$ that $I_2^{k-1} \cup I_3^{k-1} = \emptyset$. Moreover, $I^k = I^{k-1}$ implies $J_{\varphi,2}^{k-1} \cup J_{\psi,2}^{k-1} = \emptyset$. Hence,
$$I^k = I_1^{k-1}, \qquad J_\varphi^k = J_{\varphi,1}^{k-1}, \qquad J_\psi^k = J_{\psi,1}^{k-1},$$
so that the optimality condition in Theorem 2.1(iii) is valid and $(u^k, \lambda^k) = (u^*, \lambda^*)$. □

The following theorem gives a sufficient condition for the decrease of the cost functional $J$ in every iteration step. The condition reveals the basic difficulty of estimating the indices of $I^k$ violating the obstacle constraints. It is clear that, when the coincidence set for the current iterate $u^k$ is strictly smaller than that of the true solution $u^*$, we have readily $J(u^k) < J(u^*)$. Then, there must be indices violating the constraints, or otherwise $u^*$ would not be the minimizer of $J$ over $C^n$. This suggests that, during the first iterations of Algorithm A2, we need to recover a situation where the current coincidence set is not strictly smaller than that of $u^*$.

Theorem 4.1. Let (22) be valid and assume that there exists $k_0 \in \mathbb{N}$ such that
$$\|u^k - u^{k+1}\|_{\bar{I}^{k-1}} < \|u^k - u^{k+1}\|_{\bar{J}^{k-1}} \tag{26}$$
holds for all $k \ge k_0$ unless $J^k = J^{k-1}$, where
$$\bar{I}^{k-1} = I_2^{k-1} \cup I_3^{k-1} \quad \text{and} \quad \bar{J}^{k-1} = J_{\varphi,2}^{k-1} \cup J_{\psi,2}^{k-1},$$
with $I_2^{k-1}, I_3^{k-1}, J_{\varphi,2}^{k-1}, J_{\psi,2}^{k-1}$ defined within the flowchart. Then, Algorithm A2 converges in finitely many steps.

Proof. If $J^k = J^{k-1}$, then we know from Lemma 4.1 that $(u^k, \lambda^k) = (u^*, \lambda^*)$. Therefore, it is enough to show that $J^k \ne J^{k-1}$ for all $k \ge k_0$ is impossible. A direct calculation using $A = A^T$ shows that, for every $u, v \in \mathbb{R}^n$,
$$J(u) - J(v) = -\|v - u\|^2 + 2\langle f - Au, v - u \rangle. \tag{27}$$
Moreover, from (25), we know that
$$u^{k+1} = u^k \;\text{ on } J_{\varphi,1}^{k-1} \cup J_{\psi,1}^{k-1}, \qquad \lambda^{k+1} = \lambda^k \;\text{ on } I_1^{k-1}, \qquad \lambda^{k+1} = 0 \;\text{ on } I^k,$$
which gives
$$\|u^k - u^{k+1}\|^2 = \langle A(u^k - u^{k+1}), u^k - u^{k+1} \rangle_{\bar{I}^{k-1} \cup \bar{J}^{k-1}}, \tag{28}$$
$$\langle f - Au^{k+1}, u^k - u^{k+1} \rangle = \langle A(u^k - u^{k+1}), u^k - u^{k+1} \rangle_{\bar{I}^{k-1}}. \tag{29}$$
Combining (27)–(29), we find that
$$J(u^{k+1}) - J(u^k) = -\|u^k - u^{k+1}\|^2_{\bar{I}^{k-1} \cup \bar{J}^{k-1}} + 2\|u^k - u^{k+1}\|^2_{\bar{I}^{k-1}} = -\|u^k - u^{k+1}\|^2_{\bar{J}^{k-1}} + \|u^k - u^{k+1}\|^2_{\bar{I}^{k-1}}. \tag{30}$$
Thus, $J(u^{k+1}) < J(u^k)$ if and only if (26) is valid. To this end, if $J^k \ne J^{k-1}$, then $J(u^{k+1}) < J(u^k)$, for all $k \ge k_0$. Since there are only a finite number of possible active sets, this leads to a contradiction.
Hence, there exists an index $k$ such that $J^k = J^{k-1}$ and the algorithm stops at the solution $(u^*, \lambda^*)$. □

Remark 4.1. Note that the strict inclusion $J^k \subset J^{k-1}$ implies (26). In fact, in this case, $\bar{I}^{k-1} = \emptyset$ and $\bar{J}^{k-1} \ne \emptyset$. Namely, if $u^{k+1}_{\bar{J}^{k-1}} = u^k_{\bar{J}^{k-1}}$, then we have $u^{k+1}_{J^{k-1}} = u^k_{J^{k-1}}$ and $(Au^{k+1})_{I^{k-1}} = (Au^k)_{I^{k-1}}$. It follows that $u^{k+1} = u^k$, contradicting $J^k \ne J^{k-1}$.

4.2. Modified Primal-Dual Algorithm. Next, we propose a modification of the basic algorithm which is unconditionally convergent. The underlying idea is to include extra steps in Algorithm A2, similarly to the primal method in Ref. 11. However, compared to Ref. 11, here the extra substeps are formulated and justified in a different way. In what follows, we replace the explicit Lagrange multiplier $\lambda^k$ by the residual $f - Au^k$ of the system. Moreover, we denote by $P_C$ the projection onto the admissible set $C^n$ of (3).

Algorithm A3. Modified Primal-Dual Algorithm for the Bilateral Obstacle Problem.

Step 1. Let $u^0$ be given. Choose $c > 0$ and set $k = 0$.
Step 2. Determine the active/inactive partition,
$$J_\varphi^k = \{\, j \in N : (f - Au^k)_j + c(u^k - \varphi)_j < 0 \,\},$$
$$J_\psi^k = \{\, j \in N : (f - Au^k)_j + c(u^k - \psi)_j > 0 \,\}.$$
Set $J^k = J_\varphi^k \cup J_\psi^k$ and $I^k = N \setminus J^k$. Determine $u^{k+1}$:
(a) Let $w^{k+1}$ be the unique solution of the reduced linear system
$$(Aw^{k+1})_i = f_i, \quad i \in I^k, \tag{31a}$$
$$w^{k+1}_j = \varphi_j, \quad j \in J_\varphi^k, \tag{31b}$$
$$w^{k+1}_j = \psi_j, \quad j \in J_\psi^k. \tag{31c}$$
If $w^{k+1} = u^k$ for $k \ge 1$, then stop; the solution is $u^k$. If $k = 0$ or $J(P_C w^{k+1}) < J(u^k)$, then take $u^{k+1} = P_C w^{k+1}$ and go to Step 3.
(b) Else, if $u^k_j = \varphi_j$ for all $j \in J_\varphi^k$ and $u^k_j = \psi_j$ for all $j \in J_\psi^k$, then determine the maximal admissible steplength $0 \le t \le 1$ such that
$$\bar{u} = u^k + t(w^{k+1} - u^k) \in C^n.$$
If $t > 0$, then take $u^{k+1} = \bar{u}$ and go to Step 3.
(c) Else, determine
$$R_\psi^k = \{\, j \in N : (f - Au^k)_j > 0 \,\}, \qquad I_\psi = \{\, j \in R_\psi^k : u^k_j < \psi_j \,\},$$
$$R_\varphi^k = \{\, j \in N : (f - Au^k)_j < 0 \,\}, \qquad I_\varphi = \{\, j \in R_\varphi^k : u^k_j > \varphi_j \,\}.$$
(i) $I_\psi \ne \emptyset$: Let $\bar{w}$ be the solution of the linear system
$$(A\bar{w})_i = f_i, \;\text{ for all } i \in I_\psi, \tag{32a}$$
$$\bar{w}_j = u^k_j, \;\text{ for all } j \in N \setminus I_\psi. \tag{32b}$$
Determine $0 < \bar{t} \le 1$ such that $\bar{u} = u^k + \bar{t}(\bar{w} - u^k) \in C^n$.
(ii) $I_\varphi \ne \emptyset$: Let $\hat{w}$ be the solution of the linear system
$$(A\hat{w})_i = f_i, \;\text{ for all } i \in I_\varphi, \tag{33a}$$
$$\hat{w}_j = u^k_j, \;\text{ for all } j \in N \setminus I_\varphi. \tag{33b}$$
Determine $0 < \hat{t} \le 1$ such that $\hat{u} = u^k + \hat{t}(\hat{w} - u^k) \in C^n$.
(iii) If $I_\varphi = \emptyset$, then take $u^{k+1} = \bar{u}$. If $I_\psi = \emptyset$, then take $u^{k+1} = \hat{u}$. Otherwise, take $u^{k+1} = (1/2)(\bar{u} + \hat{u})$.
Step 3. Set $k = k + 1$ and go to Step 2.

In the above, $I_\psi$ [resp. $I_\varphi$] is the set of indices for which the Lagrange multiplier indicates that the index is active, whereas from the point of view of the primal variables it is inactive. Note also that, due to Step 2(a), all the subsequent iterates $u^k$, $k \ge 1$, are feasible, i.e., $u^k \in C^n$ for $k \ge 1$.
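The two elementary operations used in Steps 2(a) and 2(b) are the projection $P_C$ and the maximal admissible steplength; a minimal sketch of both (ours, assuming the current iterate $u$ is already feasible) is the following.

```python
import numpy as np

def project_C(w, phi, psi):
    """Projection P_C onto the box C^n of (3)."""
    return np.minimum(np.maximum(w, phi), psi)

def max_feasible_step(u, w, phi, psi):
    """Largest t in [0, 1] with u + t(w - u) in C^n, as in Step 2(b).

    Assumes u in C^n; componentwise, the bound is the ratio of the
    remaining slack to the (signed) increment d = w - u.
    """
    d = w - u
    t = 1.0
    up = d > 0.0                                   # components moving toward psi
    if up.any():
        t = min(t, np.min((psi[up] - u[up]) / d[up]))
    dn = d < 0.0                                   # components moving toward phi
    if dn.any():
        t = min(t, np.min((phi[dn] - u[dn]) / d[dn]))
    return max(t, 0.0)
```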
To begin the analysis of the algorithm, we justify Step 2(c).

Lemma 4.2. If $u^k \ne u^*$ for $k \ge 1$, then Step 2(c) of Algorithm A3 is well posed and reduces the value of the cost functional $J$.

Proof. Let $k \ge 1$. When $u^k \ne u^*$, either $I_\psi$ or $I_\varphi$ must be nonempty; otherwise, $u^k$ satisfies the optimality condition of Theorem 2.1(iii). Assume first that $I_\psi \ne \emptyset$. When $\bar{w}$ satisfies (32), we have
$$(A\bar{w})_i = f_i > (Au^k)_i, \;\text{ for all } i \in I_\psi, \tag{34a}$$
$$\bar{w}_j = u^k_j, \;\text{ for all } j \in N \setminus I_\psi. \tag{34b}$$
As in the proof of Theorem 3.1, it follows that $A_{I_\psi I_\psi}(\bar{w} - u^k)_{I_\psi} > 0$, giving $\bar{w} \ge u^k$ on $I_\psi$. Since $\bar{w} \ne u^k$ and $\varphi \le u^k < \psi$ on $I_\psi$, there exists $0 < \bar{t} \le 1$ such that $\bar{u} = u^k + \bar{t}(\bar{w} - u^k) \in C^n$. A similar reasoning applies if $I_\varphi \ne \emptyset$; thus, Step 2(c) is well posed. From (32), we see that
$$\langle f - A\bar{w}, u^k - \bar{w} \rangle = 0.$$
Hence, due to (27), we obtain
$$J(\bar{w}) - J(u^k) = -\|u^k - \bar{w}\|^2 < 0, \tag{35}$$
since $\bar{w} \ne u^k$. Apparently, $\bar{w} - u^k$ gives a descent direction for $J$ and so does $\bar{u}$ for $0 < \bar{t} \le 1$, since the convexity of $J$ implies
$$J(\bar{u}) = J(\bar{t}\bar{w} + (1 - \bar{t})u^k) \le \bar{t}J(\bar{w}) + (1 - \bar{t})J(u^k) < J(u^k). \tag{36}$$
If $I_\varphi \ne \emptyset$, then the same technique is used to argue that $J(\hat{u}) < J(u^k)$, for $\hat{u} = u^k + \hat{t}(\hat{w} - u^k) \in C^n$, for some $0 < \hat{t} \le 1$. Summarizing, using once more the convexity of $J$, we find that
$$J(u^{k+1}) = J((\bar{u} + \hat{u})/2) \le J(\bar{u})/2 + J(\hat{u})/2 < J(u^k). \qquad \Box$$

Next, we give the desired convergence result.

Theorem 4.2. Algorithm A3 converges in a finite number of steps.

Proof. It is sufficient to show the strict decrease of the cost functional $J$ until $u^k$ satisfies the necessary optimality condition. Assume first that $w^{k+1} = u^k$ for $k \ge 1$. Then, $w^{k+1}_j = \varphi_j$ and $(f - Aw^{k+1})_j < 0$ for $j \in J_\varphi^k$, as well as $w^{k+1}_j = \psi_j$ and $(f - Aw^{k+1})_j > 0$ for $j \in J_\psi^k$. Because $w^{k+1}$ satisfies the linear problem (31) and $u^k$ is feasible, we have $w^{k+1} = u^k = u^*$ due to Theorem 2.1(iii). Assume now that $w^{k+1}$ is not the solution of the obstacle problem. According to the three cases in Step 2 of Algorithm A3, $P_C w^{k+1}$ is accepted as the next iterate only if $J(P_C w^{k+1}) < J(u^k)$. In Step 2(b), $u^k_{J_\varphi^k} = \varphi$ and $u^k_{J_\psi^k} = \psi$, so that
$$\langle f - Aw^{k+1}, u^k - w^{k+1} \rangle = 0.$$
As before, then
$$J(w^{k+1}) - J(u^k) = -\|u^k - w^{k+1}\|^2 < 0, \tag{37}$$
and for $t > 0$ we get
$$J(\bar{u}) = J(tw^{k+1} + (1 - t)u^k) \le tJ(w^{k+1}) + (1 - t)J(u^k) < J(u^k). \tag{38}$$
Finally, when $t = 0$, we arrive at Step 2(c), which was justified in the previous lemma. This ends the proof. □

Remark 4.2. We state some observations concerning the modified algorithm.
(i) It provides a method that decreases the cost functional $J$ without a line search.
(ii) In Step 2(c), any convex combination of $\bar{u}$ and $\hat{u}$ is acceptable. Also, a multiplicative (sequential) combination of steps (i) and (ii), instead of an additive (parallel) one, can be used. Alternatively, one can proceed by using directly the solution corresponding to the larger of the sets $I_\psi$, $I_\varphi$ as the next iterate.
(iii) According to our numerical experiments, Step 2(c) of Algorithm A3 is required very rarely. Usually, this can happen only in the first iteration step.
(iv) Actually, Step 2(c) realizes an independent method for solving the obstacle problem, which has been tested by some numerical experiments not reported here. The convergence rate of such an approach, compared to the active set methods, seems to be slow.
(v) It is also possible to define other modifications of Algorithm A2 along the lines of Algorithm A3 (Ref. 20). Finally, we remark that an inactivation limitation similar to the one proved in Lemma 3.2 for the unilateral active set methods is valid for the bilateral algorithms as well (cf. Table 5).

4.3. Inexact Solving. So far, we have assumed that the reduced systems (12), (21), (32), and (33) of Algorithms A1–A3 are solved exactly during the active set iterations. Let us next consider the case of inexact solving; i.e.,
$$(A\tilde{w})_i = \tilde{f}_i, \;\text{ for all } i \in \tilde{I},$$
for an index subset $\tilde{I} \subseteq N$ and a given vector $\tilde{f} \in \mathbb{R}^n$, is replaced by
$$|(\tilde{f} - A\tilde{w}_\varepsilon)_i| \le \varepsilon, \;\text{ for all } i \in \tilde{I}, \tag{39}$$
where $\varepsilon > 0$ is a small, given termination tolerance for an iterative solver.
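For illustration, a plain conjugate gradient loop stopped by the componentwise test (39) might look as follows (our sketch; the paper's solvers add multigrid or BPX preconditioning, which is omitted here).

```python
import numpy as np

def cg_inexact(A, b, x0, eps, max_iter=1000):
    """Unpreconditioned CG stopped by max_i |(b - A x)_i| <= eps, cf. (39)."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rr = r @ r
    for _ in range(max_iter):
        if np.abs(r).max() <= eps:        # termination rule (39)
            break
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x
```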
First, we notice that Proposition 4.1 remains valid, with an obvious modification in (23), when the linear system is solved only approximately. Moreover, the inexact solving evidently affects the determination of the active/inactive partitioning in all active set algorithms; but, again, the proof of Proposition 4.1 indicates that choosing $c$ large, so as to emphasize the difference between the current solution and the distinct obstacles relative to the inexact residual, allows one to recover the behavior of the algorithms stated in (25). When taking the inexact solving into account in Algorithm A3, Step 2(a) can remain the same, but Step 2(b) should be augmented with the additional test $J(w^{k+1}) < J(u^k)$ to ensure the strict decrease before seeking the steplength. The next proposition addresses the inexact solving in Step 2(c) of Algorithm A3, but the same technique applies to the other reduced systems in Algorithms A1–A3 as well.

Proposition 4.2. There exists $\bar{\varepsilon} > 0$ such that, for all $\varepsilon \in (0, \bar{\varepsilon})$, Step 2(c) is well posed and the solution $\bar{w}_\varepsilon$ satisfying (39) on $I_\psi$, instead of the exact equality in (32), reduces the value of the cost functional $J$.

Proof. Because $I_\psi \subseteq R_\psi^k$, the definition
$$\hat{\varepsilon} = \min_{i \in I_\psi} (f - Au^k)_i$$
yields $\hat{\varepsilon} > 0$ and
$$\hat{\varepsilon} \le f_i - (Au^k)_i, \;\text{ for all } i \in I_\psi. \tag{40}$$
In addition, there exists at least one index $i \in I_\psi$ such that strict inequality holds in (40) [if $I_\psi$ happens to be a singleton or $(Au^k)_i$ is constant over $I_\psi$, then we must redefine $\hat{\varepsilon} = \rho\hat{\varepsilon}$ for some $0 < \rho < 1$ to obtain this]. Let us first choose $\hat{\varepsilon}$ as the inexact termination criterion in (39) for (32), which means that
$$|(f - A\bar{w}_{\hat{\varepsilon}})_i| \le \hat{\varepsilon}, \;\text{ for all } i \in I_\psi. \tag{41}$$
We distinguish between two cases in (41).
Case 1: $I_\psi^1 = \{i \in I_\psi : (f - A\bar{w}_{\hat{\varepsilon}})_i \ge 0\}$. Then, for all $i \in I_\psi^1$, it holds that
$$0 \le f_i - (A\bar{w}_{\hat{\varepsilon}})_i \le \hat{\varepsilon} \le f_i - (Au^k)_i \;\Longrightarrow\; 0 \le (A(\bar{w}_{\hat{\varepsilon}} - u^k))_i.$$
Case 2: $I_\psi^2 = \{i \in I_\psi : (f - A\bar{w}_{\hat{\varepsilon}})_i < 0\}$. Then, for all $i \in I_\psi^2$, it holds that
$$0 < \hat{\varepsilon} \le f_i - (Au^k)_i < (A\bar{w}_{\hat{\varepsilon}})_i - (Au^k)_i \;\Longrightarrow\; 0 < (A(\bar{w}_{\hat{\varepsilon}} - u^k))_i.$$
Altogether, we have
$$(A\bar{w}_{\hat{\varepsilon}})_i \ge (Au^k)_i, \;\text{ for all } i \in I_\psi,$$
and the strict inequality holds for at least one index $i \in I_\psi$. Hence, as in the proof of Lemma 4.2, the inexact counterpart of Step 2(c) is well defined in the sense that $\bar{w}_{\hat{\varepsilon}} \ge u^k$ and $\bar{w}_{\hat{\varepsilon}} \ne u^k$. From
$$\langle a, b \rangle \le (1/2\gamma)|a|^2 + (\gamma/2)|b|^2, \;\text{ for all } a, b \in \mathbb{R}^n \text{ and } \gamma > 0,$$
and (27), (39), we obtain
$$J(u^k) - J(\bar{w}_\varepsilon) = \|u^k - \bar{w}_\varepsilon\|^2 - 2\langle f - A\bar{w}_\varepsilon, u^k - \bar{w}_\varepsilon \rangle \ge \alpha|u^k - \bar{w}_\varepsilon|^2 - \gamma|u^k - \bar{w}_\varepsilon|^2 - (1/\gamma)|f - A\bar{w}_\varepsilon|^2 \ge (\alpha - \gamma)|u^k - \bar{w}_\varepsilon|^2 - (\hat{c}\varepsilon)^2/\gamma, \;\text{ for all } \gamma > 0,$$
where $\hat{c}$ denotes the imbedding constant between $l^2$ and $l^\infty$. It is a straightforward calculation to show that, for all $0 < \varepsilon < \alpha|u^k - \bar{w}_\varepsilon|/(2\hat{c})$, we have $J(u^k) - J(\bar{w}_\varepsilon) > 0$. Choosing $\bar{\varepsilon} = \min\{\hat{\varepsilon}, \alpha|u^k - \bar{w}_\varepsilon|/(2\hat{c})\}$ ends the proof. □

5. Numerical Experiments

Here, we describe numerical experiments based on the proposed algorithms. All the experiments were performed on an HP9000/J280 workstation (180 MHz PA-8000 CPU). A more extensive coverage of the numerical experiments is contained in Ref. 20.

5.1. Model Problems. We consider the obstacle problem (1) with the Laplacian operator and homogeneous Dirichlet boundary conditions. Thus, $A$ from (2) is the matrix representation of the discretization of the symmetric form
$$a(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx.$$
All computations are performed in the domain $\Omega = (0, 1)^d$, $d = 2, 3$, and the discretization is based on linear finite elements on an orthogonal, uniform mesh with mesh size $h = 2^{-L}$ for an integer $L$. This choice also recovers the five-point (2D) or seven-point (3D) stencil for $A$. The total number of unknowns for the test problems in 2D is $N = n^2$ for $n = h^{-1} - 1$.
The numbers of nodes in our experiments vary from 225 ($L = 4$) to 1 046 529 ($L = 10$). In 3D, the number of unknowns at the different levels is between 3375 ($L = 4$) and 2 048 383 ($L = 7$). The numerical tests are based on the following examples. The related coincidence sets of the discrete problems with $h = 1/128$ are illustrated in Figs. 2 and 3.

Example 5.1. Unilateral 2D Problem. The problem describes the elastoplastic torsion of a cylindrical bar with square cross section (Ref. 27). For the resulting coincidence set, see Fig. 2 Left. With normalized physical constants, we have a constant right-hand side $\tilde{f}$ and the obstacle $\psi$ given as the distance function $\mathrm{dist}(x, \partial\Omega)$.

Example 5.2. Unilateral 2D Problem. Same obstacle function as in Example 5.1, but the right-hand side function is given by
$$\tilde{f}(x, y) = 45(x - x^2)[\sin(11\pi x) + 1].$$
The corresponding coincidence set is given in Fig. 2 Right.

Fig. 2. Left: Example 5.1 with $\tilde{f} = 7$. Right: Example 5.2.

Example 5.3. Unilateral 3D Problem. Constant right-hand side $\tilde{f} = 10.0$ and the obstacle $\psi(x, y, z) = z\psi(x, y)$, where $\psi(x, y)$ is the 2D distance function of Example 5.1. See Fig. 3 Left.

Example 5.4. Bilateral 2D Problem. Let $\psi(x, y) = 0.2$, $\varphi(x, y) = -\mathrm{dist}(x, \partial\Omega)$, and set
$$\tilde{f}(x, y) = \begin{cases} 300, & \text{if } (x, y) \in S = \{(x, y) \in \Omega : |x - y| \le 0.1 \text{ and } x \le 0.3\}, \\ -70\exp(y)g(x), & \text{if } x \le 1 - y \text{ and } (x, y) \notin S, \\ 15\exp(y)g(x), & \text{if } x > 1 - y \text{ and } (x, y) \notin S, \end{cases}$$
where
$$g(x) = \begin{cases} 6x, & \text{if } 0 < x \le 1/6, \\ 2(1 - 3x), & \text{if } 1/6 < x \le 1/3, \\ 6(x - 1/3), & \text{if } 1/3 < x \le 1/2, \\ 2(1 - 3(x - 1/3)), & \text{if } 1/2 < x \le 2/3, \\ 6(x - 2/3), & \text{if } 2/3 < x \le 5/6, \\ 2(1 - 3(x - 2/3)), & \text{if } 5/6 < x < 1. \end{cases}$$
See Fig. 3 Right.

Fig. 3. Left: Example 5.3 on the plane $\{z = x\}$. Right: Example 5.4.

5.2. Computer Realization: Implemented Algorithms.

Bilateral Case. The following form of the basic algorithm from Section 4 was implemented.

Step 1. Initialize $u^0 = \psi$. Choose $c > 0$, $\varepsilon > 0$, and set $k = 0$.
Step 2. Determine
$$J_\varphi = J_\varphi^k = \{\, j \in N : (f - Au^k)_j + c(u^k - \varphi)_j < 0 \,\},$$
$$J_\psi = J_\psi^k = \{\, j \in N : (f - Au^k)_j + c(u^k - \psi)_j > 0 \,\},$$
and set $J = J^{k+1} = J_\varphi \cup J_\psi$ and $I = N \setminus J$. If $k > 0$ and $J^{k+1} = J^k$, then stop; the solution is $u^k$. Set $u^{k+1}_{J_\varphi} = \varphi_{J_\varphi}$, $u^{k+1}_{J_\psi} = \psi_{J_\psi}$, and solve
$$P_I A P_I u^{k+1}_I = P_I f - P_I A P_{J_\varphi} \varphi - P_I A P_{J_\psi} \psi,$$
using (39) to terminate. If $k = 0$, set $u^1 = \min\{\max\{u^1, \varphi\}, \psi\}$.
Step 3. Set $k = k + 1$ and go to Step 2.

The unilateral algorithm is (formally) obtained by setting $\varphi = -\infty$. The projection of the first iterate is included in Step 2, since it sometimes decreased the number of active set iterations by one. For the modified algorithm, we followed Algorithm A3 of Section 4.2.

Description of the Reduced System Solvers. Due to the large size of the algebraic problems, we are forced to solve the linear systems iteratively. For this purpose, we apply a preconditioned conjugate gradient (CG) algorithm with an inexact termination tolerance $\varepsilon$. According to our earlier experience with the inexact solving of reduced systems (Refs. 18–19), the numerically tested values of $\varepsilon$ depend on $h$. The influence of $\varepsilon$ on the overall behavior of the algorithms will be addressed in Tables 2, 4, 5. We tested two different preconditioners for the CG method to solve the reduced systems in 2D. The first preconditioner is based on the use of the BPX preconditioner (Ref. 28) in the inactive part of the domain $\Omega$ with bilinear grid transfer operators.
Note that an implementation of the standard (unreduced) BPX preconditioner can be adapted easily to this case by using additional projections, as described, e.g., in Ref. 19. A basic difficulty in a multilevel procedure like the BPX for solving the reduced system is that only a very sparse collection of coarser subspaces is obtained when the reduced system has a complicated geometry. In Ref. 19, we developed a fast implementation of the odd-even multigrid for reduced problems and introduced a systematic way of enlarging the coarser subspaces. The numerical tests reported in Refs. 18–19 showed a very efficient performance of the multigrid-preconditioned CG method for reduced problems with an arbitrary geometry. In 3D, we used only the reduced system solver from Ref. 19, with the exception that only one symmetric multigrid iteration with extended coarser subspaces is applied in the preconditioning. This is due to the fact that, after the first active set iteration, we always have a good initial guess for the iterative solver, given by the previous iterate $u^k$.

Nested Iterations. From Corollary 3.1, we know that the number of active set iterations depends on the mesh size. To accelerate the algorithm, we apply nested iterations (Ref. 25). This technique consists of solving the obstacle problem using a hierarchy of grids with the mesh sizes $h = 2^{-k}$ for integers $1 \le k \le L - 1$. A good initial guess for the solution on the next finer grid is obtained by interpolation (prolongation) of the solution on the coarser level. According to the numerical experiments, this technique recovers a practically mesh-independent behavior of the primal-dual active set method. To define the coarser problems, which require the right-hand side $f$ and the discrete obstacles $\varphi, \psi$, one can use the trivial injection, which sets up these vectors using the grid points on the current level, or the bilinear restriction, which transfers averaged information from the finest grid. These two possibilities are compared shortly in Table 5.
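A minimal driver for the nested iteration strategy could be organized as follows (our sketch; `solve_on_level` and `prolongate` are hypothetical callables standing in for the active set solver on one grid and the grid-transfer operator, respectively).

```python
def nested_solve(levels, u_coarse, solve_on_level, prolongate):
    """Nested iteration driver, cf. Section 5.2.

    `levels` lists the level data from coarse to fine; the coarse-grid
    solution is prolongated to serve as the initial guess on the next
    finer level, which keeps the number of active set sweeps small.
    """
    u = solve_on_level(levels[0], u_coarse)
    for k in range(1, len(levels)):
        u = solve_on_level(levels[k], prolongate(u, k))
    return u
```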
5.3. Numerical Results. In this section, we collect the main observations from the numerous experiments made with the active set strategies. We use the following notation in the tables and figures below: AS and CG denote the numbers of active set (outer) and CG (inner) iterations, respectively, and CPU denotes the elapsed time. In Tables 2, 4, 5, the column labeled T (type) indicates whether nested iterations (N) were applied or not (O). The augmentation parameter is denoted by $c$ and, in the column labeled B, we distinguish between the multigrid (MG) and BPX type (BPX) preconditioners. The termination tolerance is specified by $\varepsilon$. To study the relation between the solutions obtained using different tolerances, we use the following measures:

(M1) difference between the final cost functionals, $D = J(u^*_{\varepsilon = h^d}) - J(u^*_{\varepsilon = 10^{-13}})$, for $d = 2, 3$;
(M2) relative difference between the sizes of the final active sets in percent,
$$R = 100 \, \frac{|m(J^*_{\varepsilon = 10^{-13}}) - m(J^*_{\varepsilon = h^d})|}{\max\bigl(m(J^*_{\varepsilon = h^d}), m(J^*_{\varepsilon = 10^{-13}})\bigr)},$$
where $m(J^*)$ denotes the cardinality of the solution active set $J^*$;
(M3) efficiency of the active set methods in terms of the quantity
$$C = \frac{\text{CPU (obstacle problem)}}{\text{CPU (final linear problem)}},$$
measuring how much more CPU time it takes to solve the obstacle problem than the linear system with the prescribed final index set $J^*$.

Table 1. Nested active set method (NASM) with BPX preconditioner in Example 5.1.

            L = 6        L = 7        L = 8        L = 9        L = 10
  c         AS CG CPU    AS CG CPU    AS CG CPU    AS CG CPU    AS CG CPU
  1         4  26 0.0    4  19 0.2    5  19 1.2    6  23 7.2    6  23 31.4
  10^3      3   9 0.0    3   7 0.1    3   8 0.6    3   8 3.1    3   8 13.1

Characteristic Behavior of Algorithms.

Comparison of Reduced System Solvers. Here, we compare the performance of the two preconditioners used to accelerate the CG solution of the reduced systems. The multigrid method with extended subspaces was more accurate and clearly faster than the BPX preconditioner (cf. Table 2), but the latter also worked well with nested iterations (cf. Table 1). For the multigrid preconditioner, the average number of preconditioned CG iterations in one outer iteration for $\varepsilon = h^2/h^3$ (2D/3D) was always extremely small, i.e., one or two with the nested iterations and between two and five without the nested iterations.

Table 2. BPX and MG preconditioners in Example 5.2.

  B    T  c     ε        L = 7                  L = 8                   L = 9                     L = 10
                         AS  CG   CPU  m(J*)    AS  CG   CPU   m(J*)    AS   CG    CPU    m(J*)    AS   CG    CPU     m(J*)
  MG   O  1     10^-13   32  366  5.1  7638     65  727  53.4  30614    130  1418  504.3  122474   253  2821  4141.7  489968
  BPX  O  1     h^2      32  301  2.0  7662     62  821  36.0  30688    123  2208  479.7  122602   249  5602  4800.2  490226
  MG   O  1     h^2      31   67  1.2  7646     62  158  14.2  30638    124   354  147.1  122526   249   800  1335.0  490084
  MG   O  10^6  h^2      31   67  1.2  7646     62  158  14.2  30638    124   354  147.1  122526   249   800  1335.0  490084
  BPX  N  10^6  h^2       4   14  0.2  7656      4   17   1.1  30632      4    17    5.9  122492     4    18    25.8  489986
  MG   N  10^6  h^2       4    5  0.2  7646      4    5   0.8  30614      4     5    3.6  122482     4     5    15.2  489972

Stopping Tolerance for Solving Linear Systems. In Ref. 19, we tested thoroughly the loose stopping tolerance $\varepsilon \le h^2/h^3$ (2D/3D) with different kinds of reduced linear systems corresponding to various geometries. These tests showed that the iteration error $\|u^k - A^{-1}f\|_{L^2}$ with a loose stopping criterion was always much smaller than the discretization error of the piecewise linear approximation. According to the columns D and R in Tables 2–5, it becomes evident that the same conclusion holds for its use within the primal-dual algorithms (cf. also Fig. 5 Right).

Convergence Rate. We conclude from Tables 2, 4, 5 that, with the chosen discretization, the number of active set iterations grows linearly with respect to $n = h^{-1} - 1$ (cf. Fig. 4 Left). However, for the nested iteration technique, a practically mesh-independent behavior is observed (Fig. 4 Right). Concerning the choice of setting up the coarser problems in the nested iterations, Table 5 (last two rows) suggests using the bilinear restriction. Also, we see that the number of inner iterations for solving the reduced linear systems remains almost constant for different mesh sizes when the multigrid preconditioner is applied. Thus, the numbers of both outer and inner iterations of the active set method with nested iterations and the multigrid preconditioner were independent of $h$ in our tests. This implies that only $O(N)$ operations are required for solving the obstacle problem with the given accuracy (cf. the CPU time in Fig. 4 Right).

Table 3. Nested active set method (NASM) with MG preconditioner for $c = 10^6$ and $\varepsilon = h^2$ in Example 5.2.

  L    6           7           8            9            10
  D    5.7·10^-8   2.2·10^-9   4.7·10^-10   5.1·10^-11   1.1·10^-11
  R    2.1·10^-1   1.0·10^-1   0            6.5·10^-3    8.2·10^-4
  C    3.0         2.0         1.6          1.5          1.4
Table 4. Results for $c = 10^6$ in the 3D Example 5.3.

  L  ε        T   AS   CG    m(J*)      D            R          CPU      C
  4  10^-13   O    6    67   2478       -1.002       —          0.2      4.0
     h^3      O    6    14   2478       1.2·10^-8    0.0        0.0      4.0
     h^3      N    4     7   2478       1.1·10^-8    0.0        0.0      3.0
  5  10^-13   O    9   130   21090      -1.013       —          3.4      8.3
     h^3      O    9    27   21090      1.5·10^-8    0.0        0.9      6.6
     h^3      N    4     7   21090      3.5·10^-8    0.0        0.4      2.7
  6  10^-13   O   18   276   173190     -1.016       —          95.7     15.9
     h^3      O   18    66   173190     1.0·10^-9    0.0        26.6     13.0
     h^3      N    4     6   173186     3.9·10^-8    2.3·10^-3  3.8      1.9
  7  10^-13   O   35   554   1398758    -1.016       —          1768.78  41.6
     h^3      O   34   159   1398762    1.1·10^-10   2.9·10^-4  560.0    30.3
     h^3      N    4     6   1398874    4.6·10^-9    8.3·10^-3  32.7     1.9

Computational Efficiency. Applying nested iterations in 2D with $\varepsilon = h^2$, $C$ was found to be in the interval [1.4, 3.0] in Table 3. Similarly, in 3D, $C$ was in the interval [1.9, 3.0] in Table 4 for nested iterations and $\varepsilon = h^3$. In the bilateral case in Table 5, the behavior of $C$ depended on the algorithm, but the performance coincided with the other tests for $L = 8$ or 10. Hence, altogether it took only one to three times more CPU time to solve the obstacle problem than the reduced linear system on the final active set. Moreover, in all tests, the difference between these two CPU times even decreases as the size of the problem increases. To summarize, with the nested active set algorithm, we are able to solve very large (up to $2 \cdot 10^6$ unknowns) 3D unilateral obstacle problems using an ordinary, single-processor workstation with the same efficiency as the corresponding linear problems.

Role of the Parameter c: Unilateral Case. We notice from Tables 1 and 2 (cf. also Fig. 5 Left) that choosing $c \gg 1$ is never a disadvantage in the unilateral case. Because a large value of $c$ was especially useful for the BPX type preconditioner in Table 1 with a loose stopping tolerance for the inner iterations, the role of the parameter seems to be related to the inexact solving: the more accurately the linear systems are solved, the less effect the parameter has. We also made a series of tests for fixed values of $L$, decreasing $c$ in powers of $10^{-1}$. The conclusion from these tests is that, as $c$ is decreased, the behavior of the algorithm, measured in terms of monotonic convergence, becomes increasingly more sensitive to $\varepsilon$.
While for some examples monotonic convergence is observed even for very small values of c, it is necessary for others to reduce e simultaneously with c to maintain monotonic convergence. Role of the Parameter c: Bilateral Case. The results of the bilateral example are given in Table 5, where E in the column T refers to Algorithm A3 in Section 4 (with extra substeps), for which the numbers of Step 2(c) are included in column AS inside brackets. For studying the role of condition (22) on the convergence (cf. Proposition 4.1), we also implemented and tested the more restrictive determination, k (42a) k (42b) Jfk = { j ˛Jfk–1 ¨ Ik–1 : l j + c(uk – f)j < 0}, Jyk = { j ˛Jyk–1 ¨ Ik–1 : l j + c(uk – y )j > 0}, of the active sets in Step 2 of Algorithm A2. This approach, indicated by MO=MN (modified ordinary=nested) in column T of Table 5, assures readily that (22) holds true, independently of c. The additional symbol (b) in the column EN in the last row indicates that there we have used the bilinear restriction, while the other coarser problems for nested iterations were 528 JOTA : VOL. 119, NO. 3, DECEMBER 2003 Fig. 4. Left: Numbers of active set iterations with respect to n in Table 2 for e = 10 – 13. Right: CPU times and iterations numbers with respect to N in a logarithmic scale for nested iterations and the multigrid preconditioner with c = 106 and e = h2 in Table 2. Fig. 5. Left: Numbers of nested AS=CG iterations for different values of c with respect to L in Table 1. Right: D for example 5.2 with respect to L in Table 3. defined using the trivial injection. Finally, the symbol ** in Table 5 means that the algorithm was not convergent. Characteristic Behavior. Table 5 demonstrates the significant influence of c on the performance. The basic algorithm for c = 1 was not always convergent, but choosing c = 106 restored the convergence. This was due to the fact that, for large c, conditions (22) and (26) were always valid after the first iteration, which excluded the direct jumps between the index sets Jf and Jy precisely according to (22) (cf. Tables 9 and 10 in Ref. 20). However, for c = 1, the nested iteration technique was convergent on the finest (i.e. actual) discretization level, even if the coarser problems were not convergent. JOTA : VOL. 119, NO. 3, DECEMBER 2003 529 Altogether, we conclude that choosing large c seems to extend the radius of convergence of active set methods and also yields the more efficient numerical performance. Role of Step 2(c) in Algorithm A3. We introduced the additional step [Step 2(c)] into Algorithm A3 to guarantee the unconditional convergence of the active set method. Based on Table 5, we observe that Step 2(c) was needed once or twice for c = 1 and never for c = 106. To this end, one notices that, by means of the active set iterations, the modified basic algorithm with (42) and the extended algorithm with the additional step share the same number of iterations for both c = 1 and c = 106. This illustrates also the intrinsic relation between the condition (22), size of c, and convergence of the active set methods. 6. Conclusions Active set algorithms for two-dimensional and three-dimensional, unilateral and bilateral obstacle problems were considered. Their convergence and rate of convergence was analyzed and tested numerically. Efficient multigrid and multilevel-based implementations for the proposed algorithms were described and compared. 
The significant influence of the size of the regularization parameter $c$ on the active set iterations, on the inexact solving, and on the convergence of the different algorithms was demonstrated.

7. Appendix: Proof of Theorem 3.1

Since $u^* \le u^0 \le \psi$, we have
$$J^* \subseteq C^* = \{j \in N : u^*_j = \psi_j\} \subseteq C^0 = \{j \in N : u^0_j = \psi_j\}. \tag{43}$$
Using (43) and the nonpositivity of the codiagonal block $A_{J^*I^*}$, we get
$$\lambda^0_{J^*} = \max\{0, [f - Au^0]_{J^*}\} \ge [f - Au^0]_{J^*} = f_{J^*} - A_{J^*J^*}u^0_{J^*} - A_{J^*I^*}u^0_{I^*} \ge f_{J^*} - A_{J^*J^*}\psi_{J^*} - A_{J^*I^*}u^*_{I^*} = [f - Au^*]_{J^*} = \lambda^*_{J^*}. \tag{44}$$
Together with $\lambda^*_{I^*} = 0$, this yields $\lambda^0 \ge \lambda^*$. Let $J^0$ and $I^0$ be determined by Step 2. Because $I^{-1} \cap J^{-1} = \emptyset$ and $u^0 \le \psi$, we have actually
$$\hat{\lambda}^0 = \lambda^0, \quad \text{for all } c > 0,$$
so that $J^0 = J^{-1}$ and $I^0 = I^{-1}$. Then, we infer from $\lambda^0 \ge \lambda^*$ that $I^0 \subseteq I^*$. Recalling (11), we notice that $(u^1, \lambda)$ is the solution of
$$Au^1 + \lambda = f, \tag{45a}$$
$$\lambda_i = 0, \quad i \in I^0, \tag{45b}$$
$$u^1_j = \psi_j, \quad j \in J^0, \tag{45c}$$
and we find that
$$[A(u^1 - u^0)]_i = (f - Au^0)_i \le \max\{0, (f - Au^0)_i\} = \lambda^0_i = 0, \quad \text{for all } i \in I^0. \tag{46}$$
Using $u^1_{J^0} = \psi_{J^0} = u^0_{J^0}$, this implies $A_{I^0I^0}(u^1 - u^0)_{I^0} \le 0$. Because $A_{I^0I^0}$ is monotone, we obtain
$$u^1_{I^0} \le u^0_{I^0} \quad \text{and} \quad u^1 \le u^0.$$
Furthermore, since $A_{J^0I^0} \le 0$, we have
$$\lambda_{J^0} = f_{J^0} - A_{J^0I^0}u^1_{I^0} - A_{J^0J^0}u^1_{J^0} \le f_{J^0} - A_{J^0I^0}u^0_{I^0} - A_{J^0J^0}u^0_{J^0} = \lambda^0_{J^0}. \tag{47}$$
Hence, $\lambda^1 = \max\{0, \lambda\} \le \lambda^0$ and, using the notation introduced in the proof of Lemma 3.1,
$$I^1 = I^0 \cup J^0_2 \quad \text{and} \quad J^1 = J^0_1.$$
Furthermore, we recall that $\lambda^1 = 0$ on $I^1$. Since $I^0 \subseteq I^*$, we have $[A(u^1 - u^*)]_{I^0} = 0$. By $u^1_{J^0} = \psi_{J^0} \ge u^*_{J^0}$ and $A_{I^0J^0} \le 0$, we get
$$0 = A_{I^0I^0}(u^1 - u^*)_{I^0} + A_{I^0J^0}(u^1 - u^*)_{J^0} \le A_{I^0I^0}(u^1 - u^*)_{I^0}, \tag{48}$$
which implies $u^1 \ge u^*$. Then, using $J^* \subseteq J^0$, one argues as in (44) to conclude that
$$\lambda^1_{J^*} = \max\{0, [f - Au^1]_{J^*}\} \ge [f - Au^1]_{J^*} \ge [f - Au^*]_{J^*} = \lambda^*_{J^*} \tag{49}$$
and $\lambda^1 \ge \lambda^*$, which together with $\lambda^0 \ge \lambda^1$ shows that $I^0 \subseteq I^1 \subseteq I^*$. The monotonicity of $u^{k+1}$ and $\lambda^{k+1}$ now follows from an induction argument, using $\lambda^k = 0$ on $I^k$ as the induction hypothesis.

Finally, if $J^k = J^{k-1}$ for $k \ge 1$, we stop at the solution. Otherwise, $J^k \ne J^{k-1}$ and the uniqueness of the solution of (11), together with the monotonicity of the iterates, yields $u^{k+1} \ne u^k$, so that $u^{k+1}_j < u^k_j$ for at least one index $j \in N$. The same argument shows that also $\lambda^{k+1}_j < \lambda^k_j$ for at least one $j \in N$. This implies the strict inclusion $J^{k+1} \subset J^k$. Because $N$ is finite, there exists only a finite number of possible active sets and we must arrive at $J^k = J^{k-1}$ in finitely many steps. □

References

1. GLOWINSKI, R., Numerical Methods for Nonlinear Variational Problems, Springer Verlag, New York, NY, 1984.
2. GLOWINSKI, R., LIONS, J. L., and TREMOLIERES, R., Numerical Analysis of Variational Inequalities, North-Holland, Amsterdam, Holland, 1981.
3. HACKBUSCH, W., and MITTELMANN, H., On Multigrid Methods for Variational Inequalities, Numerische Mathematik, Vol. 42, pp. 65–76, 1983.
4. HOFFMANN, K. H., and ZOU, J., Parallel Algorithms of Schwarz Variant for Variational Inequalities, Numerical Functional Analysis and Optimization, Vol. 13, pp. 449–462, 1992.
5. KORNHUBER, R., Monotone Multigrid Methods for Elliptic Variational Inequalities, I, Numerische Mathematik, Vol. 69, pp. 167–184, 1994.
6. LIONS, P. L., On the Schwarz Alternating Method, I, 1st International Symposium on Domain Decomposition Methods for Partial Differential Equations, Edited by R. Glowinski, G. Golub, G. Meurant, and J. Periaux, SIAM, Philadelphia, Pennsylvania, pp. 2–42, 1988.
7. SCHÖBERL, J., Solving the Signorini Problem on the Basis of Domain Decomposition Techniques, Computing, Vol. 60, pp. 323–344, 1998.
8. TARVAINEN, P., Two-Level Schwarz Method for Unilateral Variational Inequalities, IMA Journal of Numerical Analysis, Vol. 19, pp. 193–212, 1999.
9. ZENG, J., and ZHOU, S., On Monotone and Geometric Convergence of Schwarz Methods for Two-Sided Obstacle Problems, SIAM Journal on Numerical Analysis, Vol. 35, pp. 600–616, 1998.
10. TARVAINEN, P., Numerical Algorithms Based on Characteristic Domain Decomposition for Obstacle Problems, Communications in Numerical Methods in Engineering, Vol. 13, pp. 793–801, 1997.
11. HOPPE, R., Multigrid Algorithms for Variational Inequalities, SIAM Journal on Numerical Analysis, Vol. 24, pp. 1046–1065, 1987.
12. HOPPE, R., Two-Sided Approximations for Unilateral Variational Inequalities by Multigrid Methods, Optimization, Vol. 18, pp. 867–881, 1987.
13. KIRSTEN, H., and TICHATSCHKE, R., On a Method of Feasible Directions for Solving Variational Inequalities, Optimization, Vol. 16, pp. 535–546, 1985.
14. LIONS, P. L., and MERCIER, B., Approximation Numérique des Équations de Hamilton-Jacobi-Bellman, RAIRO Analyse Numérique, Vol. 14, pp. 375–387, 1980.
15. BERTSEKAS, D. P., Constrained Optimization and Lagrange Multiplier Methods, Academic Press, New York, NY, 1982.
16. ITO, K., and KUNISCH, K., Augmented Lagrangian Methods for Nonsmooth, Convex Optimization in Hilbert Spaces, Nonlinear Analysis: Theory, Methods, and Applications, Vol. 41, pp. 591–616, 2000.
17. HOPPE, R., and KORNHUBER, R., Adaptive Multilevel Methods for Obstacle Problems, SIAM Journal on Numerical Analysis, Vol. 31, pp. 301–323, 1994.
18. KÄRKKÄINEN, T., and TOIVANEN, J., Multigrid Methods with Extended Subspaces for Reduced Systems, Proceedings of the 3rd European Conference on Numerical Mathematics and Advanced Applications, Edited by P. Neittaanmäki, T. Tiihonen, and P. Tarvainen, World Scientific, Singapore, pp. 564–570, 2000.
19. KÄRKKÄINEN, T., and TOIVANEN, J., Building Blocks for Odd-Even Multigrid with Applications to Reduced Systems, Journal of Computational and Applied Mathematics, Vol. 131, pp. 15–33, 2001.
20. KÄRKKÄINEN, T., KUNISCH, K., and TARVAINEN, P., Primal-Dual Active Set Methods for Obstacle Problems, Report B2, Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland, 2000.
21. VARGA, R., Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, New Jersey, 1962.
22. WINDISCH, G., M-Matrices in Numerical Analysis, Teubner, Leipzig, Germany, 1989.
23. ITO, K., and KUNISCH, K., An Augmented Lagrangian Technique for Variational Inequalities, Applied Mathematics and Optimization, Vol. 21, pp. 223–241, 1990.
24. KORNHUBER, R., and YSERENTANT, H., Multilevel Methods for Elliptic Problems on Domains Not Resolved by the Coarse Grid, Domain Decomposition Methods in Scientific and Engineering Computing, Edited by D. E. Keyes and J. Xu, Contemporary Mathematics, American Mathematical Society, Providence, Rhode Island, Vol. 180, pp. 49–60, 1993.
25. HACKBUSCH, W., Multigrid Methods and Applications, Springer Series in Computational Mathematics, Springer Verlag, Berlin, Germany, 1985.
26. DOSTAL, Z., Box-Constrained Quadratic Programming with Proportioning and Projections, SIAM Journal on Optimization, Vol. 7, pp. 871–887, 1997.
27. RODRIGUES, J. F., Obstacle Problems in Mathematical Physics, North-Holland, New York, NY, 1987.
28. BRAMBLE, J., PASCIAK, J., and XU, J., Parallel Multilevel Preconditioners, Mathematics of Computation, Vol. 55, pp. 1–22, 1990.