A primal-dual splitting algorithm for composite monotone inclusions with minimal lifting

Aragón-Artacho, Francisco J.; Boţ, Radu I.; Torregrosa-Belén, David

doi:10.1007/s11075-022-01405-9

A primal-dual splitting algorithm for composite monotone inclusions with minimal lifting

Original Paper
Open access
Published: 18 November 2022

Volume 93, pages 103–130, (2023)
Cite this article

Download PDF

You have full access to this open access article

Numerical Algorithms Aims and scope Submit manuscript

A primal-dual splitting algorithm for composite monotone inclusions with minimal lifting

Download PDF

2289 Accesses
1 Altmetric
Explore all metrics

Abstract

In this work, we study resolvent splitting algorithms for solving composite monotone inclusion problems. The objective of these general problems is finding a zero in the sum of maximally monotone operators composed with linear operators. Our main contribution is establishing the first primal-dual splitting algorithm for composite monotone inclusions with minimal lifting. Specifically, the proposed scheme reduces the dimension of the product space where the underlying fixed point operator is defined, in comparison to other algorithms, without requiring additional evaluations of the resolvent operators. We prove the convergence of this new algorithm and analyze its performance in a problem arising in image deblurring and denoising. This work also contributes to the theory of resolvent splitting algorithms by extending the minimal lifting theorem recently proved by Malitsky and Tam to schemes with resolvent parameters.

A Reflected Forward-Backward Splitting Method for Monotone Inclusions Involving Lipschitzian Operators

Article 30 March 2020

An Inertial Semi-forward-reflected-backward Splitting and Its Application

Article 15 February 2022

Strengthened splitting methods for computing resolvents

Article 20 August 2021

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

In the last decades, monotone inclusion problems have become an attractive topic of research in operator theory and numerical optimization. The wide variety of situations in applied mathematics that can be modeled as finding a zero of the sum of mixtures of maximally monotone operators is one of the reasons for its increasing popularity. Among the methods that are usually employed for tackling these problems, splitting algorithms (see, e.g., [2, Chapter 26]) are the ones that have received more attention. Using simple operations, these methods define an iterative sequence which separately handles the operators defining the problem and is convergent to a solution to the inclusion problem. Furthermore, as these methods only use first-order information, they are well suited for large-scale optimization problems.

In this work, we focus on the study of primal-dual splitting algorithms for composite monotone inclusion problems in real Hilbert spaces of the following form.

Problem 1

Let ${\mathscr{H}}$ and $(\mathcal {G}_{j})_{1\leq j\leq m}$ be real Hilbert spaces. Let $A_{1}, \ldots , A_{n}:{\mathscr{H}} \rightrightarrows {\mathscr{H}}$ be maximally monotone operators, let $B_{j}:\mathcal {G}_{j}\rightrightarrows \mathcal {G}_{j}$ be maximally monotone and $L_{j}:{\mathscr{H}} \to \mathcal {G}_{j}$ be a bounded linear operator whose adjoint is denoted by $L_{j}^{*}$, for all j ∈{1,…,m}. The problem consists in solving the primal inclusion

$$ \text{find } x \in \mathcal{H} \text{ such that } 0\in\sum\limits_{i=1}^{n} A_{i}(x) + \sum\limits_{j=1}^{m}L_{j}^{*}B_{j} (L_{j}x), $$

(1)

together with its associated dual inclusion

$$ \text{find } (u_{1},\ldots,u_{m})\!\in{\kern-.5pt}\mathcal{G}_{1}{\kern-.5pt}\times\cdots\times{\kern-.5pt}\mathcal{G}_{m}~ \text{such that} ~\left( \exists x\! \in\!\mathcal{H}\right)\! \left\{ \begin{array}{lll} \displaystyle -\sum\limits_{j=1}^{m} L_{j}^{*}u_{j} \in \sum\limits_{i=1}^{n} A_{i}(x), \\ u_{j} {\kern-.5pt}\in{\kern-.5pt} B_{j}(L_{j}x) \quad j{\kern-.5pt}={\kern-.5pt}1{\kern-.5pt},\ldots{\kern-.5pt},{\kern-.5pt}m{\kern-.5pt}. \end{array} \right. $$

(2)

Problem 1 encompasses numerous important problems in mathematical optimization and real-world applications (see, e.g., [10, 11, 20]). In these settings, it is highly desirable to devise algorithms that simultaneously obtain solutions to both problems (1) and (2) –namely, a primal-dual solution– and which only make use of resolvents of the maximally monotone operators, forward evaluations of the linear operators and their adjoints, scalar multiplication, and vector addition. Many splitting methods can be found in the literature satisfying these conditions (see, e.g., [3,4,5, 12, 22]). One of the best-known primal-dual algorithm is the one proposed by Briceño-Arias and Combettes in [7], which was further studied in [6]. To derive this scheme, let us consider first the particular instance of Problem 1 in which n = m = 1 and let us define the pair of operators M and N given by

$$ \left\{ \begin{array}{ll} M:\mathcal{H}\times\mathcal{G}\rightrightarrows\mathcal{H}\times\mathcal{G} : (x,u) \to A(x) \times B^{-1}(u), \\ N:\mathcal{H}\times\mathcal{G}\to\mathcal{H}\times\mathcal{G} : (x,u) \to (L^{*}u,-Lx). \end{array} \right. $$

The operator M is maximally monotone and N is a skew symmetric bounded linear operator. Furthermore, the set of zeros of the sum M + N consists of primal-dual solutions to Problem 1. Applying the forward-backward-forward algorithm to the problem of finding the zeros of M + N results in the fixed point iteration given by

$$ x^{k+1} = \big(J_{\gamma M} \left( \text{Id} - \gamma N\right) + \gamma N \left( \text{Id} - J_{\gamma M}\left( \text{Id} - \gamma N\right)\right)\big)(x^{k}) \quad \forall k \geq 0, $$

(3)

where γ > 0, Id denotes the identity operator and J_γA stands for the resolvent of A with parameter γ (see Definition 3). Thus, since the resolvent of a cartesian product is the cartesian product of the resolvents, it can be seen that (3) is a full splitting algorithm, as it only requires evaluations of the resolvents J_γA and $J_{\gamma B^{-1}}$, and of the linear operator and its adjoint.

The general problem involving more than two operators can be addressed by setting

$$ \begin{array}{ll} A :=A_{1}, \quad B := A_{2}\times\cdots\times A_{n} \times B_{1}\times\cdots\times B_{m}\\ \text{and } \quad \qquad L := \text{Id} \times \stackrel{(n)}{\cdots} \times \text{Id} \times L_{1} \times {\cdots} \times L_{m}. \end{array} $$

In this case, according to (3), the resulting algorithm is generated by a fixed point iteration of an operator defined in the ambient space ${\mathscr{H}}^{n} \times \mathcal {G}_{1} \times {\cdots } \times \mathcal {G}_{m}$. The dimension of the underlying space is directly related to the memory requirements of the resulting algorithm. In general, a smaller dimension of the space translates into less consumption of computational resources. For this reason, the development of algorithms with reduced dimension for solving monotone inclusion problems has recently become an active topic of research [8, 14, 16, 19].

Lifted splitting algorithms

The notion of lifted splitting, first introduced in [19], relates a fixed point algorithm with the dimension of its underlying ambient space. Consider the simplest case of the classical monotone inclusion problem obtained by setting m = 0 in (1):

Problem 2

Let $A_{1},\ldots ,A_{n}:{\mathscr{H}}\rightrightarrows {\mathscr{H}}$ be maximally monotone operators and consider the problem

$$ \text{ find } x\in\mathcal{H} \text{ such that } 0\in\sum\limits_{i=1}^{n} A_{i}(x). $$

A fixed point algorithm for finding a solution to Problem 2 employs a d-fold lifting if its underlying fixed point operator can be defined on the d-fold Cartesian product ${\mathscr{H}}^{d}$. For example, if n = 2, the famous Douglas–Rachford algorithm [15] makes use of a 1-fold lifting, since it can be written as the fixed point iteration

$$ x^{k+1} = x^{k} + \lambda \left( J_{A_{2}}\left( 2J_{A_{1}} -\text{Id}\right) -J_{A_{1}} \right)(x^{k}) \quad \forall k \geq 0, $$

with λ ∈ ]0,2[. Until very recently, the only way to tackle the problem when n > 2 was using Pierra’s product space reformulation [18], which implies an n-fold lifting. Nowadays, various algorithms have been proposed allowing to solve the problem by only resorting to an (n − 1)-lifting (see, e.g., [8, 13, 14]). This reduction from n to n − 1 has been proven to be minimal [16] when the algorithms are required to be frugal resolvent splittings [19], which means that each of the resolvents $J_{A_{1}}, \ldots , J_{A_{n}}$ is evaluated only once per iteration.

To the best of the authors’ knowledge, the notion of lifting has not been developed in the setting of primal-dual inclusions given by Problem 1. We will say that a primal-dual splitting has (d,f)-lifting if the underlying fixed point operator can be written in the product space

$$ \mathcal{H}^{d} \times \mathcal{G}_{1}^{f_{1}} \times {\cdots} \times \mathcal{G}_{m}^{f_{m}}, $$

with $f = {\sum }_{j=1}^{m} f_{j}$. Thus, the Briceño–Arias–Combettes primal-dual splitting algorithm makes use of an (n,m)-fold lifting. This is also the case for the other primal-dual algorithms existing in the literature. In this work, we propose the first (n − 1,m)-lifted splitting method for solving primal-dual inclusions and demonstrate the minimality of the algorithm. In order to do this, it is important to note that the definition of frugal resolvent splitting does not allow the use of parametrized resolvents. The inclusion of these resolvent parameters is of crucial importance for controlling the Lipschitz constants of the linear operators in Problem 1, as can be seen in all the existent primal-dual schemes. This motivates the introduction of the concept of frugal parametrized resolvent splitting, whose definition coincides with the one of frugal resolvent splitting except that it permits the inclusion of resolvent parameters. Our contribution to the theory of minimal lifting splitting methods is double: (i) we extend the results of Malitsky–Tam in [16, Section 3] to frugal parametrized resolvent splitting algorithms, (ii) we prove that for a frugal primal-dual parametrized resolvent splitting (see Section 3 for a precise definition) with (d,m)-fold lifting to solve Problem 1, one necessarily has d ≥ n − 1. Our proposed algorithm is the first algorithm in the literature being minimal according to this relation.

The rest of this work is structured as follows. In Section 2, we recall some preliminary notions and results. In particular, in Section 2.1, we present the extension of the results by Malitsky–Tam [16] to parametrized resolvent splitting algorithms. In Section 3, we introduce the first primal-dual algorithm with reduced lifting for composite monotone inclusion problems and prove its convergence. The concept of parametrized resolvent splitting is adapted to primal-dual schemes in Section 3. We prove a minimality theorem under the hypothesis of frugality and show that our proposed algorithm verifies it. In Section 4, we include a numerical experiment on image deblurring and compare the performance of the new algorithm with the best performing primal-dual algorithm for this problem. The paper ends with some conclusions and possible future work directions in Section 5. Finally, in Appendix A, a detailed proof of the results in Section 2.1 is presented.

2 Preliminaries

Throughout this paper, ${\mathscr{H}}$, $\mathcal {G}$, and $(\mathcal {G}_{j})_{1 \leq j \leq m}$ are real Hilbert spaces. Otherwise stated, to simplify the notation we will employ 〈⋅,⋅〉 and ∥⋅∥ to denote the inner product and the induced norm, respectively, of any space. We use → to denote norm convergence of a sequence. We denote by ${\mathscr{H}}^{n}$ the product Hilbert space ${\mathscr{H}}^{n}={\mathscr{H}} \times \stackrel {(n)}{\cdots } \times {\mathscr{H}}$ with inner product defined as

$$ \langle (x_{1},\ldots,x_{n}), (\bar{x}_{1},\ldots,\bar{x}_{n})\rangle := \sum\limits_{i=1}^{n} \langle x_{i},\bar{x}_{i}\rangle\quad\forall (x_{1},\ldots,x_{n}), (\bar{x}_{1},\ldots,\bar{x}_{n})\in\mathcal{H}^{n}. $$

Sequences and sets in product spaces are marked with bold, e.g., $\mathbf {x}=(x_{1},\ldots ,x_{n})\in {\mathscr{H}}^{n}$.

For a set-valued operator, we write $A:{\mathscr{H}} \rightrightarrows {\mathscr{H}}$, in opposite to $A:{\mathscr{H}}\to {\mathscr{H}}$ which denotes a single-valued operator. The notation dom, Fix, zer and gra is used for the domain, the set of fixed points, the zeros and the graph of A, respectively, i.e.,

$$ \begin{array}{@{}rcl@{}} \text{dom} A:&=&\left\{x\in\mathcal{H} : A(x)\neq\emptyset\right\},\quad \text{Fix} A:=\left\{x\in\mathcal{H} : x\in A(x)\right\}, \\ \text{zer} A&:=&\left\{x\in\mathcal{H} : 0\in A(x)\right\},\quad \text{gra} A:=\left\{(x,u)\in\mathcal{H}\times\mathcal{H} : u\in A(x)\right\} . \end{array} $$

The inverse operator of A, denoted by A^− 1, is the operator whose graph is given by $\text {gra} A^{-1} = \left \{ (u,x)\in {\mathscr{H}}\times {\mathscr{H}} : u\in A(x)\right \}$. The identity operator is denoted by Id. When $L:{\mathscr{H}}\to \mathcal {G}$ is a bounded linear operator, we use $L^{*}:\mathcal {G}\to {\mathscr{H}}$ to denote its adjoint, which is the unique bounded linear operator such that 〈Lx,y〉 = 〈x,L^∗y〉, for all $x\in {\mathscr{H}}$ and $y\in \mathcal {G}$.

To simplify the notation, we will use ⟦k,l⟧ to denote the set of integers between $k,l\in \mathbb {N}$, i.e.,

$$ \llbracket k, l\rrbracket := \left\{\begin{array}{ll} \{k, k+1, {\ldots} ,l\} & \text{if } k\leq l, \\ \emptyset & \text{otherwise}. \end{array}\right. $$

Definition 1

An operator $T:{\mathscr{H}} \to {\mathscr{H}}$ is said to be

(i)
κ-Lipschitz continuous for κ > 0 if
$$ \|T(x)-T(y)\| \leq \kappa \|x-y\| \quad \forall x,y \in \mathcal{H}; $$
(ii)
nonexpansive if it is 1-Lipschitz continuous, i.e.,
$$ \|T(x)-T(y)\| \leq \|x-y\| \quad \forall x,y \in \mathcal{H}; $$
(iii)
α-averaged nonexpansive for α ∈ ]0,1[ if
$$ \|T(x)-T(y)\|^{2}+\frac{1-\alpha}{\alpha} \|(\text{Id}-T)(x)-(\text{Id}-T)(y)\|^{2} \leq \|x-y\|^{2} \quad \forall x,y\in\mathcal{H}. $$

Definition 2

A set-valued operator $A:{\mathscr{H}} \rightrightarrows {\mathscr{H}}$ is monotone if

$$ \langle x-y,u-v\rangle \geq 0 \quad \forall (x,u),(y,v)\in \text{gra} {A}. $$

Furthermore, A is said to be maximally monotone if there exists no monotone operator $B:{\mathscr{H}} \rightrightarrows {\mathscr{H}}$ such that graB properly contains graA.

Definition 3

Given an operator $A\colon {\mathscr{H}}\rightrightarrows {\mathscr{H}}$, the resolvent of A with parameter γ > 0 is the operator $J_{\gamma A}\colon {\mathscr{H}}\rightrightarrows {\mathscr{H}}$ defined by J_γA := (Id + γA)^− 1.

The next result contains Minty’s theorem [17].

Proposition 1 ([2, Proposition 23.10])

Let $A:{\mathscr{H}}\rightrightarrows {\mathscr{H}}$ be monotone and let γ > 0. Then,

(i)
J_γA is single-valued,
(ii)
$\text {dom} J_{\gamma A}={\mathscr{H}}$ if and only if A is maximally monotone.

2.1 Parametrized resolvent splitting

Besides developing lifted splitting algorithms with reduced dimension, different works have been devoted to determine the minimal dimension reduction that can be achieved under some conditions. This is the case of [16, 19], where a minimality result is obtained for the classical monotone inclusion Problem 2. In what follows, we employ T for denoting a fixed point operator and S for a solution operator, both depending on the maximally monotone operators appearing in the problem.

Definition 4 (Fixed point encoding 19)

A pair of operators (T,S) is a fixed point encoding for Problem 2 if, for all particular instance of the problem,

$$ \text{Fix} {T}\neq \emptyset \Longleftrightarrow \text{zer}~{\left( \sum\limits_{i=1}^{n} A_{i}\right)}\neq\emptyset \text{ and } \mathbf{z}\in \text{Fix} {T} \Longrightarrow S(\mathbf{z}) \in \text{zer } {\left( \sum\limits_{i=1}^{n} A_{i}\right)}. $$

Previous works on minimality are based on the concept of resolvent splitting, which does not allow employing parametrized resolvents (i.e., it only permits computation of the resolvents $J_{A_{1}},\ldots ,J_{A_{n}}$). In this work, we introduce the notion of parametrized resolvent splitting and adapt the minimality result in [16, Section 3] to the more general parametrized setting. Since the reasoning is very similar to the one in the mentioned reference, we only present the results here and refer the interested reader to Appendix A for a detailed demonstration.

Definition 5 (Parametrized resolvent splitting)

A fixed point encoding (T,S) for Problem 2 is a parametrized resolvent splitting if, for all particular instances of the problem, there is a finite procedure that evaluates T and S at a given point which only uses vector addition, scalar multiplication, and the parametrized resolvents of A₁,…,A_n.

Definition 6 (Frugality)

A parametrized resolvent splitting (T,S) for Problem 2 is frugal if, in addition, each of the parametrized resolvents of A₁,…,A_n is used exactly once.

Definition 7 (Lifting 19)

Let $d\in \mathbb {N}$. A fixed point encoding (T,S) is a d-fold lifting for Problem 2 if $T:{\mathscr{H}}^{d}\to {\mathscr{H}}^{d}$ and $S:{\mathscr{H}}^{d}\to {\mathscr{H}}$.

Example 1

In [8], a product space reformulation with reduced dimension is proposed, which applied to Problem 2 yields the following lifted splitting. Given any γ > 0 and λ ∈ ]0,2], the algorithm in [8, Theorem 5.1] can be defined by the operator $R:{\mathscr{H}}^{n-1}\to {\mathscr{H}}^{n-1}$ given by

$$ R(\mathbf{z}) := \mathbf{z} + \lambda \left( \begin{array}{c} x_{1} - x_{0} \\ x_{2} - x_{0} \\ {\vdots} \\ x_{n-1} - x_{0} \end{array}\right), $$

where z = (z₀,z₁,…,z_n− 1) and $\mathbf {x} = (x_{0},x_{1},\ldots ,x_{n-1})\in {\mathscr{H}}^{n}$ is the vector defined as

$$ \left\{ \begin{array}{l} x_{0} = J_{\frac{\gamma}{n-1} A_{n}} \left( \frac{1}{n-1} \sum\limits_{i=1}^{n-1} z_{i}\right), \\ x_{i} = J_{\gamma A_{i}} (2 x_{0} - z_{i}) \quad \forall i\in{\llbracket 1,n-1\rrbracket}. \end{array} \right. $$

Moreover, if we let $S:{\mathscr{H}}^{n-1} \to {\mathscr{H}}$ be the operator given by

$$ S(\mathbf{z}) := J_{\frac{\gamma}{n-1} A_{n}} \left( \frac{1}{n-1} \sum\limits_{i=1}^{n-1} z_{i}\right), $$

then the pair (R,S) is a frugal parametrized resolvent splitting with (n − 1)-fold lifting which is not a resolvent splitting, since it makes use of resolvent parameters.

Malitsky and Tam prove in [16, Theorem 3.3] that the minimal lifting that one can achieve for Problem 2 with frugal resolvent splittings is n − 1. From their proof, it cannot be directly determined whether the same result holds when the resolvents are allowed to have different parameters. The next theorem provides an affirmative answer to this question.

Theorem 2 (Minimal lifting for frugal parametrized splittings)

Let n ≥ 2 and let (T,S) be a frugal parametrized resolvent splitting with d-fold lifting for Problem 2. Then, d ≥ n − 1.

Proof

See Theorem 6 in Appendix A. □

3 A primal-dual splitting with minimal lifting

In this section, we devise a primal-dual splitting algorithm for Problem 1 with minimal lifting. We base our analysis in the case in which the primal problem involves only one linear composition, i.e., m = 1, and later extend to an arbitrary finite number of linearly composed maximally monotone operators by appealing to a product space reformulation. Lastly, we prove minimality of the algorithm by adapting the concept of lifted splitting to primal-dual algorithms.

The case with one linear composition

Let n ≥ 2. We start by considering the primal-dual problem given by

$$ \text{find } x \in \mathcal{H} \text{ such that } 0\in\sum\limits_{i=1}^{n} A_{i}(x) + L^{*} B (Lx), $$

(4)

and

$$ \text{find } u\in\mathcal{G} \text{ such that } 0\in -L \left( \sum\limits_{i=1}^{n} A_{i} \right)^{-1} \bigl(- L^{*} u\bigr) + B^{-1} (u), $$

(5)

where $A_{1},\ldots ,A_{n}:{\mathscr{H}}\rightrightarrows {\mathscr{H}}$ and $B:\mathcal {G} \rightrightarrows \mathcal {G}$ are maximally monotone operators and $L:{\mathscr{H}}\to \mathcal {G}$ is a bounded linear operator. Note that in this case, (5) corresponds to the Attouch–Théra dual problem of (4) (see [1]). In the following, we denote the set of solutions of (4) and (5) by $\mathcal {P}$ and $\mathcal {D}$, respectively, and consider the set Z defined as

$$ \mathbf{Z} := \left\{ (x,u)\in \mathcal{H} \times \mathcal{G} : -L^{*}u \in\sum\limits_{i=1}^{n} A_{i}(x) \text{ and } u \in B(Lx)\right\}, $$

which is useful for tackling primal-dual inclusion problems. It is well-known that Z is a subset of $\mathcal {P} \times \mathcal {D}$ and that

$$ \mathcal{P} \neq \emptyset \Longleftrightarrow \mathbf{Z} \neq \emptyset \Longleftrightarrow \mathcal{D} \neq \emptyset. $$

Indeed, we have

$$ \begin{array}{@{}rcl@{}} \exists x\in\mathcal{P} & \Longleftrightarrow & (\exists x\in\mathcal{H})\quad 0\in\sum\limits_{i=1}^{n} A_{i}(x)+L^{*}B(Lx) \\ & \Longleftrightarrow & (\exists (x,u) \in\mathcal{H} \times \mathcal{G}) \left\{ \begin{array}{ll} -L^{*}(u) & \in \sum\limits_{i=1}^{n} A_{i}(x), \\ & u \in B(Lx), \end{array} \right.\\ & \Longleftrightarrow & (\exists (x,u) \in\mathcal{H} \times \mathcal{G}) \left\{ \begin{array}{ll} x \in \left( \displaystyle\sum\limits_{i=1}^{n} A_{i}\right)^{-1}\bigl(-L^{*}u\bigr), \\ Lx\in B^{-1} (u), \end{array} \right. \\ &\Longleftrightarrow & (\exists u\in\mathcal{G}) \quad 0\in -L \left( \sum\limits_{i=1}^{n} A_{i} \right)^{-1} \bigl(- L^{*} u\bigr) + B^{-1} (u) \Longleftrightarrow \exists u \in\mathcal{D}. \end{array} $$

We refer to an element of Z as a primal-dual solution of (4) and (5).

Now, we introduce a fixed point algorithm for solving the primal-dual problem given by (4) and (5). Let λ,γ > 0 and let $T:{\mathscr{H}}^{n-1} \times \mathcal {G} \to {\mathscr{H}}^{n-1} \times \mathcal {G}$ be the operator given by

$$ T \left( \begin{array}{c} \mathbf{z}\\ v \end{array}\right) : = \left( \begin{array}{c} \mathbf{z}\\v \end{array}\right) + \lambda \left( \begin{array}{c} x_{2} - x_{1}\\ x_{3}- x_{2}\\ \vdots\\ x_{n}-x_{n-1}\\ \gamma(y-Lx_{n}), \end{array}\right) $$

(6)

where $(\mathbf {x},y)=(x_{1},\ldots ,x_{n},y) \in {\mathscr{H}}^{n}\times \mathcal {G}$ depends on $(\mathbf {z},v) = (z_{1},\ldots ,z_{n-1},v) \in {\mathscr{H}}^{n-1} \times \mathcal {G}$ in the following way

$$ \left\{ \begin{array}{lll} x_{1} &=& J_{A_{1}}(z_{1}), \\ x_{i} &=& J_{A_{i}}(z_{i}+x_{i-1}-z_{i-1}), \quad \forall i\in{\llbracket 2,n-1\rrbracket}, \\ x_{n} &=& J_{A_{n}}(x_{1}+x_{n-1} -z_{n-1}- L^{*} (\gamma Lx_{1} - v)), \\ y &=& J_{B/\gamma} \left( L(x_{1} + x_{n})- \frac{v}{\gamma}\right). \end{array} \right. $$

(7)

In the next lemma, we characterize the set of fixed points of the operator T by means of the set of primal-dual solutions to (4) and (5).

Lemma 1

Let n ≥ 2 and λ,γ > 0. The following assertions hold.

(i)
If $(\bar {x},\bar {u})\in \mathbf {Z}$, then there exists $\bar {\textbf {z}}\in {\mathscr{H}}^{n-1}$ such that $(\bar {\textbf {z}},\gamma L\bar {x}-\bar {u})\in \text {Fix} {T}$.
(ii)
If $(\bar {z}_{1},\ldots ,\bar {z}_{n-1},\bar {v})\in \text {Fix} {T}$, then $(J_{A_{1}}(\bar {z}_{1}),\gamma L\bar {x}-\bar {v})\in \mathbf {Z}$.

As a result,

$$ \text{Fix} {T} \neq \emptyset \Longleftrightarrow \mathbf{Z} \neq \emptyset. $$

Proof

(i) Let $(\bar {x},\bar {u})\in \mathbf {Z}$. Then, $\bar {u} \in B(L\bar {x})$ and there exists $(a_{1},\ldots ,a_{n})\in {\mathscr{H}}^{n}$ such that $a_{i}\in A_{i}(\bar {x})$ and $-L^{*}\bar {u} = {\sum }_{i=1}^{n} a_{i}$. Consider the vectors $(\bar {z}_{1},\ldots ,\bar {z}_{n-1},\bar {v})\in {\mathscr{H}}^{n-1}\times \mathcal {G}$ defined as

$$ \left\{ \begin{array}{lll} \bar{z}_{1} &:=& \bar{x} + a_{1} \in (\text{Id} + A_{1})(\bar{x}), \\ \bar{z}_{i} &:=& a_{i} + \bar{z}_{i-1} = (\text{Id} + A_{i})(\bar{x}) - \bar{x} + \bar{z}_{i-1}, \quad \forall i\in{\llbracket2,n-1\rrbracket}, \\ \bar{v} &: =& \gamma L\bar{x}-\bar{u} \in \left( \gamma \text{Id} - B\right)(L\bar{x}). \end{array} \right. $$

Then, we deduce that $\bar {x} = J_{A_{1}}(\bar {z}_{1})$ and $\bar {x} = J_{A_{i}} (\bar {z}_{i} + \bar {x}-\bar {z}_{i-1})$ for all i ∈ ⟦2,n − 1⟧. Moreover, we have

$$ \begin{array}{@{}rcl@{}} 2\bar{x} - \bar{z}_{n-1}- L^{*}(\gamma L\bar{x}-\bar{v})& = &2\bar{x} -\bar{z}_{n-1} - L^{*}(\bar{u}) \\ & =& \bar{x} + a_{n} + \bar{x}-\bar{z}_{n-1} + \sum\limits_{i=1}^{n-1} a_{i} \\ & =& \bar{x}+a_{n} + \bar{x}-\bar{z}_{n-1} + \sum\limits_{i=2}^{n-1} (\bar{z}_{i} - \bar{z}_{i-1}) + \bar{z}_{1}-\bar{x} \\ & =& (\text{Id} + A_{n})(\bar{x}). \end{array} $$

Altogether, we obtain

$$ \left\{ \begin{array}{lll} \bar{x} &=& J_{A_{1}}(\bar{z}_{1}), \\ \bar{x} &=& J_{A_{i}}(\bar{z}_{i}+\bar{x}-\bar{z}_{i-1}), \quad \forall i\in{\llbracket 2,n-1\rrbracket}, \\ \bar{x} &=& J_{A_{n}}(2\bar{x} -\bar{z}_{n-1}- L^{*} (\gamma L\bar{x} - \bar{v})), \\ L\bar{x} & =& J_{B/\gamma}\left (2L\bar{x}- \frac{\bar{v}}{\gamma}\right), \end{array} \right. $$

which implies that $(\bar {z}_{1},\ldots ,\bar {z}_{n-1},\bar {v})\in \text {Fix} {T}$.

(ii) Let $(\bar {z}_{1},\ldots ,\bar {z}_{n-1},\bar {v})\in \text {Fix} {T}$ and set $\bar {x}:= J_{A}(\bar {z}_{1})$. By (6), $y = L\bar {x}$ and $\bar {x}_{i} = \bar {x}$ for all i = 1,…,n. Consequently, from (7), we derive

$$ \left\{ \begin{array}{l} \bar{z}_{1}-\bar{x} \in A_{1}(\bar{x}), \\ \bar{z}_{i}-\bar{z}_{i-1} \in A_{i}(\bar{x}), \quad \forall i\in{\llbracket 2,n-1\rrbracket}, \\ \bar{x}-\bar{z}_{n-1}-L^{*}(\gamma L\bar{x}-\bar{v})\in A_{n}(\bar{x}), \\ \gamma L\bar{x}-\bar{v} \in B (L\bar{x}). \end{array} \right. $$

Summing together the first n inclusions above and setting $\bar {u}:=\gamma L\bar {x}-\bar {v}$, we deduce

$$ \left\{ \begin{array}{ll} -L^{*}\bar{u} & \in \sum\limits_{i=1}^{n} A_{i}(\bar{x}), \\ & \bar{u} \in B(L\bar{x}), \end{array} \right. $$

which implies $(\bar {x},\bar {u})\in \mathbf {Z}$, as claimed. □

The following technical lemma provides nonexpansive properties of the operator T in the Hilbert space ${\mathscr{H}}^{n-1} \times \mathcal {G} $ with scalar product given by

$$ \langle (z_{1},\ldots,z_{n-1},v),(\bar{z}_{1},\ldots,\bar{z}_{n-1},\bar{v}) \rangle_{\gamma} : = \sum\limits_{i=1}^{n-1} \langle z_{i}, \bar{z}_{i}\rangle_{\mathcal{H}} + \frac{1}{\gamma} \langle v, \bar{v} \rangle_{\mathcal{G}}, $$

(8)

for $(z_{1},\ldots ,z_{n-1},v),(\bar {z}_{1},\ldots ,\bar {z}_{n-1},\bar {v})\in {\mathscr{H}}^{n-1}\times \mathcal {G}$ and γ > 0.

Lemma 2

For all $(\mathbf {z},v) \! =\! (z_{1},\ldots ,z_{n-1},v)\! \in \! {\mathscr{H}}^{n-1}\! \times \! \mathcal {G}$ and $(\bar {\mathbf {z}},\bar {v}) \! =\! (\bar {z}_{1},\ldots ,\bar {z}_{n-1},\bar {v})$ $\in {\mathscr{H}}^{n-1}\times \mathcal {G}$,

$$ \begin{array}{@{}rcl@{}} \| T(\mathbf{z},v) &-& T({\bar{\mathbf{z}}},\bar{v})\|^{2}_{\gamma} + \frac{1-\lambda}{\lambda} \| \left( \mathrm{I}\mathrm{d}-T\right)(\mathbf{z},v)-\left( \mathrm{I}\mathrm{d}-T\right)({\bar{\mathbf{z}}},\bar{v})\|^{2}_{\gamma}\\ &+& \frac{1-\gamma \|L\|^{2}}{\lambda} \left\| \sum\limits_{i=1}^{n-1} \left( \mathrm{I}\mathrm{d}-T\right)(\mathbf{z},v)_{i} - \sum\limits_{i=1}^{n-1} \left( \mathrm{I}\mathrm{d}-T\right)({\bar{\mathbf{z}}},\bar{v})_{i}\right\|^{2}_{\gamma}\\ &\leq& \|(\mathbf{z},v)-({\bar{\mathbf{z}}},\bar{v})\|^{2}_{\gamma}, \end{array} $$

(9)

where ∥⋅∥_γ denotes the norm induced by the scalar product (8). In particular, if λ ∈ ]0,1[ and $\gamma \in {\left ]0,\frac {1}{\|L\|^{2}}\right ]}$, the operator T is λ-averaged nonexpansive.

Proof

Let $(x_{1},\ldots ,x_{n},y)\in {\mathscr{H}}^{n}\times \mathcal {G}$ and $(\bar {x}_{1},\ldots ,\bar {x}_{n},\bar {y})\in {\mathscr{H}}^{n} \times \mathcal {G}$ be given by (7) from (z,v) and $({\bar {\mathbf {z}}},\bar {v})$, respectively. For simplicity, we denote (z⁺,v⁺) = T(z,v) and $({\bar {\mathbf {z}}}^{+},\bar {v}^{+})=T({\bar {\mathbf {z}}},\bar {v})$. Since z₁ − x₁ ∈ A₁(x₁) and $\bar {z}_{1}-\bar {x}_{1}\in A_{1}(\bar {x}_{1})$, by monotonicity of A₁

$$ \begin{array}{@{}rcl@{}} 0 & \leq& \langle (z_{1}-x_{1}) - (\bar{z}_{1}-\bar{x}_{1}), x_{1}-\bar{x}_{1} \rangle \\ & =& \langle (z_{1}-x_{1}) -(\bar{z}_{1}-\bar{x}_{1}), x_{1}-x_{2}\rangle + \langle (z_{1}-x_{1}) -(\bar{z}_{1}-\bar{x}_{1}), x_{2} - \bar{x}_{1}\rangle. \end{array} $$

(10)

For every i ∈ ⟦2,n − 1⟧ , we have z_i + x_i− 1 − z_i− 1 − x_i ∈ A_i(x_i) and $\bar {z}_{i} + \bar {x}_{i-1}-\bar {z}_{i-1}-\bar {x}_{i}\in A_{i}(\bar {x}_{i})$ and thus, by monotonicity of A_i

$$ \begin{array}{@{}rcl@{}} 0 & \leq& \langle (z_{i}+x_{i-1}-z_{i-1}-x_{i})- (\bar{z}_{i} + \bar{x}_{i-1}-\bar{z}_{i-1}-\bar{x}_{i}), x_{i} - \bar{x}_{i} \rangle \\ & =& \langle (z_{i}-x_{i})-(\bar{z}_{i}-\bar{x}_{i}), x_{i}-\bar{x}_{i}\rangle - \langle (z_{i-1}-x_{i-1})-(\bar{z}_{i-1}-\bar{x}_{i-1}), x_{i}-\bar{x}_{i}\rangle \\ & =& \langle (z_{i}-x_{i})-(\bar{z}_{i}-\bar{x}_{i}),x_{i}-x_{i+1}\rangle + \langle (z_{i}-x_{i})-(\bar{z}_{i}-\bar{x}_{i}), x_{i+1}-\bar{x}_{i}\rangle \\ && \quad - \langle (z_{i-1}-x_{i-1})-(\bar{z}_{i-1}-\bar{x}_{i-1}), x_{i}-\bar{x}_{i-1} \rangle \\ & &\quad- \langle (z_{i-1}-x_{i-1})-(\bar{z}_{i-1}-\bar{x}_{i-1}),\bar{x}_{i-1}-\bar{x}_{i}\rangle. \end{array} $$

(11)

Now, since $x_{1}+x_{n-1}-z_{n-1}-x_{n} - L^{*}\left (\gamma Lx_{1}-v\right ) \in A_{n}(x_{n}) $ and $\bar {x}_{1}+\bar {x}_{n-1}-\bar {z}_{n-1}-\bar {x}_{n} - L^{*}\left (\gamma L\bar {x}_{1}-\bar {v}\right )\in A_{n}(\bar {x}_{n})$, again monotonicity of A_n results in the inequality

$$ \begin{array}{@{}rcl@{}} 0 &\leq& \left\langle x_{1}+x_{n-1}-z_{n-1}-x_{n} - L^{*}\left( \gamma Lx_{1}-v\right), x_{n}-\bar{x}_{n}\right\rangle\\ &&-\left\langle \bar{x}_{1}+\bar{x}_{n-1}-\bar{z}_{n-1}-\bar{x}_{n} - L^{*}\left( \gamma L\bar{x}_{1}-\bar{v}\right), x_{n}-\bar{x}_{n}\right\rangle\\ & = &\langle (x_{n-1}-z_{n-1}) -(\bar{x}_{n-1}-\bar{z}_{n-1}), x_{n}-\bar{x}_{n} \rangle + \langle (x_{1} - \bar{x}_{1})\!-(x_{n}\!-\bar{x}_{n}),x_{n} - \bar{x}_{n} \rangle \\ && - \langle \gamma \left( Lx_{1}-L\bar{x}_{1}\right)-(v-\bar{v}), Lx_{n}-L\bar{x}_{n}\rangle \\ & =& \langle (x_{n-1}-z_{n-1}) -(\bar{x}_{n-1}-\bar{z}_{n-1}), x_{n}- \bar{x}_{n-1} \rangle \\ &&+ \langle (x_{1}-\bar{x}_{1})-(x_{n}-\bar{x}_{n}),x_{n}-\bar{x}_{n} \rangle\\ & & + \langle (x_{n-1}-z_{n-1}) -(\bar{x}_{n-1}-\bar{z}_{n-1}), \bar{x}_{n-1}- \bar{x}_{n} \rangle \\ && - \langle \gamma \left( Lx_{1}-L\bar{x}_{1}\right)-(v-\bar{v}), Lx_{n}-L\bar{x}_{n}\rangle. \\ \end{array} $$

(12)

Finally, we have γL(x₁ + x_n) − v − γy ∈ B(y) and $\gamma L(\bar {x}_{1}+\bar {x}_{n})-\bar {v}-\gamma \bar {y}\in B(\bar {y})$, so by monotonicity of B we get

$$ 0 \leq \langle (\gamma L(x_{1}+x_{n}) -v-\gamma y)-(\gamma L(\bar{x}_{1}+\bar{x}_{n})-\bar{v}-\gamma \bar{y}), y-\bar{y} \rangle. \\ $$

(13)

Summing together (10)–(13) and rearranging, yields

$$ \begin{array}{@{}rcl@{}} 0 &\leq& \sum\limits_{i=1}^{n-1} \langle (x_{i}-x_{i+1})- (\bar{x}_{i}-\bar{x}_{i+1}), z_{i}-\bar{z}_{i}\rangle \\ & & + \sum\limits_{i=1}^{n-1} \langle (x_{i}-\bar{x}_{i})-(x_{i+1}-\bar{x}_{i+1}), \bar{x}_{i}-x_{i}\rangle \\ & & + \langle (x_{1}-\bar{x}_{1}) - (x_{n}-\bar{x}_{n}), x_{n}-\bar{x}_{n}\rangle +\langle \left( Lx_{n}-L\bar{x}_{n}\right)-(y-\bar{y}), v-\bar{v}\rangle \\ &&+ \gamma \langle \bigl(L(x_{1}+x_{n})-L(\bar{x}_{1}+\bar{x}_{n})\bigr)- (y-\bar{y}), y-\bar{y}\rangle \\ && - \gamma \langle Lx_{1}-L\bar{x}_{1},Lx_{n}-L\bar{x}_{n}\rangle . \end{array} $$

(14)

The sums in (14) can be written, respectively, as

$$ \begin{array}{@{}rcl@{}} \sum\limits_{i=1}^{n-1}\!\!\!&\langle&\!\!\!\! (x_{i}-x_{i+1})-(\bar{x}_{i}-\bar{x}_{i+1}), z_{i}-\bar{z}_{i}\rangle \\ &=& \frac{1}{\lambda} \sum\limits_{i=1}^{n-1}\langle (z_{i}-z_{i}^{+})-(\bar{z}_{i}-\bar{z}_{i}^{+}), z_{i}-\bar{z}_{i}\rangle \\ & =& \frac{1}{\lambda} \langle (\mathbf{z}-\mathbf{z}^{+})-({\bar{\mathbf{z}}}-{\bar{\mathbf{z}}}^{+}),\mathbf{z}-{\bar{\mathbf{z}}}\rangle \\ & =& \frac{1}{2\lambda} \left( \|(\mathbf{z}-\mathbf{z}^{+})-({\bar{\mathbf{z}}}-{\bar{\mathbf{z}}}^{+})\|^{2} - \| \mathbf{z}^{+}-{\bar{\mathbf{z}}}^{+}\|^{2} + \|\mathbf{z}- {\bar{\mathbf{z}}}\|^{2} \right), \end{array} $$

(15)

and

$$ \begin{array}{@{}rcl@{}} \sum\limits_{i=1}^{n-1}\!\!\!&\langle&\!\!\!\! (x_{i}-\bar{x}_{i})-(x_{i+1}-\bar{x}_{i+1}), \bar{x}_{i}-x_{i}\rangle \\ & =& \frac{1}{2} {\sum}_{i=1}^{n-1} \left( \|x_{i+1}-\bar{x}_{i+1}\|^{2} - \|x_{i}-\bar{x}_{i}\|^{2} - \| (x_{i}-x_{i+1})-(\bar{x}_{i}-\bar{x}_{i+1})\|^{2} \right) \\ & = &\frac{1}{2} \left( \|x_{n}-\bar{x}_{n}\|^{2} - \|x_{1} -\bar{x}_{1}\|^{2} - \frac{1}{\lambda^{2}} {\sum}_{i=1}^{n-1} \| (z_{i}-z_{i}^{+})-(\bar{z}_{i}- \bar{z}_{i}^{+})\|^{2} \right) \\ & =& \frac{1}{2} \left( \|x_{n}-\bar{x}_{n}\|^{2} - \|x_{1} -\bar{x}_{1}\|^{2} - \frac{1}{\lambda^{2}} \| (\mathbf{z}-\mathbf{z}^{+}) - ({\bar{\mathbf{z}}}-{\bar{\mathbf{z}}}^{+})\|^{2} \right). \end{array} $$

(16)

The third term in (14), becomes

$$ \begin{array}{@{}rcl@{}} \langle (x_{1}-\bar{x}_{1}) &-& (x_{n}-\bar{x}_{n}), x_{n}-\bar{x}_{n}\rangle \\ &=& \frac{1}{2} \left( \|x_{1}-\bar{x}_{1}\|^{2} - \| x_{n}-\bar{x}_{n}\|^{2} - \|(x_{1}-\bar{x}_{1})-(x_{n}-\bar{x}_{n})\|^{2} \right),\qquad \end{array} $$

(17)

while the fourth term yields

$$ \begin{aligned} \langle (Lx_{n}-&L\bar{x}_{n}) - (y-\bar{y}), v-\bar{v} \rangle \\ & = \frac{1}{\gamma \lambda } \langle (v-v^{+})-(\bar{v}-\bar{v}^{+}), v-\bar{v}\rangle \\ & = \frac{1}{2\gamma \lambda} \left( \|(v-v^{+}) -(\bar{v}-\bar{v}^{+})\|^{2} - \|v^{+}-\bar{v}^{+}\|^{2} + \|v-\bar{v}\|^{2} \right). \end{aligned} $$

(18)

Lastly, making use of the Cauchy–Schwarz and Young’s inequalities, the second last term of (14) gives

$$ \begin{aligned} \gamma \langle &\left( L(x_{1}+x_{n})-L(\bar{x}_{1}+\bar{x}_{n})\right)- (y-\bar{y}), y-\bar{y} \rangle \\ & = \gamma \left( \langle Lx_{1}-L\bar{x}_{1}, y-\bar{y}\rangle + \langle \left( Lx_{n}-L\bar{x}_{n}\right)-(y-\bar{y}), y-\bar{y}\rangle \right) \\ & = \frac{\gamma}{2} \left( \|Lx_{n}-L\bar{x}_{n}\|^{2} - \|\left( Lx_{n}-L\bar{x}_{n}\right)-(y-\bar{y})\|^{2} -\|y-\bar{y}\|^{2} \right) \\ &\quad + \gamma \langle Lx_{1}-L\bar{x}_{1}, y-\bar{y}\rangle \\ & \leq \frac{\gamma}{2} \left( \|Lx_{n}-L\bar{x}_{n}\|^{2} - \frac{1}{\gamma^{2} \lambda^{2}} \|(v-v^{+})-(\bar{v}- \bar{v}^{+})\|^{2} -\|y-\bar{y}\|^{2} \right)\\ & \quad + \frac{\gamma}{2} \|Lx_{1}-L\bar{x}_{1}\|^{2} + \frac{\gamma}{2} \|y-\bar{y}\|^{2} \\ & = \frac{\gamma}{2} \|Lx_{1}-L\bar{x}_{1}\|^{2} + \frac{\gamma}{2} \|Lx_{n}-L\bar{x}_{n}\|^{2} - \frac{1}{2 \gamma \lambda^{2}} \|(v-v^{+})-(\bar{v}- \bar{v}^{+})\|^{2}, \end{aligned} $$

(19)

while the last term can be rearranged as follows

$$ \begin{aligned} - \gamma \langle &Lx_{1}-L\bar{x}_{1}, Lx_{n}-L\bar{x}_{n}\rangle \\ & = \frac{\gamma }{2} \left( \| L(x_{1}-x_{n}) - L(\bar{x}_{1}-\bar{x}_{n})\|^{2} - \|Lx_{1}-L\bar{x}_{1}\|^{2} - \|Lx_{n}-L\bar{x}_{n}\|^{2} \right). \end{aligned} $$

(20)

Summing together (19) and (20) and using the Lipschitz continuity of L, we get

$$ \begin{aligned} \gamma \langle& \left( L(x_{1}+x_{n})-L(\bar{x}_{1}+\bar{x}_{n})\right)- (y-\bar{y}), y-\bar{y}\rangle - \gamma \langle Lx_{1}-L\bar{x}_{1},Lx_{n}-L\bar{x}_{n}\rangle \\ & = \frac{\gamma }{2} \| L(x_{1}-x_{n}) - L(\bar{x}_{1}-\bar{x}_{n})\|^{2} - \frac{1}{2 \gamma \lambda^{2}} \|(v-v^{+})-(\bar{v}- \bar{v}^{+})\|^{2} \\ & \leq \frac{\gamma \|L\|^{2} }{2} \| (x_{1}-x_{n}) - (\bar{x}_{1}-\bar{x}_{n})\|^{2} - \frac{1}{2 \gamma \lambda^{2}} \|(v-v^{+})-(\bar{v}- \bar{v}^{+})\|^{2}. \end{aligned} $$

(21)

Multiplying (14) by 2λ and substituting (15)–(21), we obtain the final inequality

$$ \begin{array}{@{}rcl@{}} \|{\mathbf{z}}^{+}-\bar{\mathbf{z}}^{+}\|^{2} &+& \left( \frac{1}{\lambda}-1\right)\left( \|(\mathbf{z}-\mathbf{z}^{+})-(\bar{\mathbf{z}}-\bar{\mathbf{z}}^{+})\|^{2} + \frac{1}{\gamma}\|(v-v^{+})-(\bar{v}-\bar{v}^{+})\|^{2} \right) \\ &+& \frac{1}{\gamma}\|v^{+}-\bar{v}^{+}\|^{2}+\lambda \left( 1-\gamma \|L\|^{2}\right) \|(x_{1}-x_{n}) -(\bar{x}_{1}-\bar{x}_{n})\|^{2} \\ &\leq& \|{\mathbf{z}}-\bar{\mathbf{z}}\|^{2} +\frac{1}{\gamma} \|v -\bar{v}\|^{2}. \end{array} $$

To complete the proof, just note that

$$ \begin{array}{@{}rcl@{}} \lambda (x_{1}-x_{n})-\lambda(\bar{x}_{1}-\bar{x}_{n})&= & \lambda \sum\limits_{i=1}^{n-1} (x_{i}-x_{i+1}) - \lambda \sum\limits_{i=1}^{n-1} (\bar{x}_{i}-\bar{x}_{i+1}) \\ &= & \sum\limits_{i=1}^{n-1} (z_{i}-z_{i}^{+}) - \sum\limits_{i=1}^{n-1} (\bar{z}_{i}-\bar{z}_{i}^{+}), \end{array} $$

from where (9) finally follows. □

Next, we state our main result, which establishes the convergence of the iterative algorithm defined by the operator T in (6) and (7).

Theorem 3

Let n ≥ 2, let $L:{\mathscr{H}} \to \mathcal {G}$ be a bounded linear operator and let $A_{1},\ldots ,A_{n}:{\mathscr{H}}\rightrightarrows {\mathscr{H}}$ and $B:\mathcal {G}\rightrightarrows \mathcal {G}$ be maximally monotone operators with $\text {zer}~{\left ({\sum }_{i=1}^{n} A_{i} + L^{*}BL\right )\neq \emptyset }$. Furthermore, let λ ∈ ]0,1[ and $\gamma \in {\left ]0,\frac {1}{\|L\|^{2}}\right ]}$. Given an initial point $(\mathbf {z}^{0},v^{0}) = ({z_{1}^{0}},\ldots ,z_{n-1}^{0},v^{0})\in {\mathscr{H}}^{n-1}\times \mathcal {G}$, consider the sequences given by

$$ \begin{pmatrix}{\mathbf{z}}^{k+1}\\v^{k+1}\end{pmatrix} = \begin{pmatrix}\mathbf{z}^{k}\\v^{k}\end{pmatrix} + \lambda \begin{pmatrix}{x_{2}^{k}} - {x_{1}^{k}}\\ {x_{3}^{k}}- {x_{2}^{k}}\\ \vdots\\ {x_{n}^{k}}-x_{n-1}^{k}\\ \gamma(y^{k}-L{x_{n}^{k}})\end{pmatrix} \quad \forall k\geq 0, $$

(22)

with

$$ \left\{ \begin{aligned} {x_{1}^{k}} &= J_{A_{1}}({z_{1}^{k}}), \\ {x_{i}^{k}} &= J_{A_{i}}({z_{i}^{k}}+x_{i-1}^{k}-z_{i-1}^{k}), \quad \forall i\in{\llbracket 2,n-1\rrbracket}, \\ {x_{n}^{k}} &= J_{A_{n}}({x_{1}^{k}}+x_{n-1}^{k} -z_{n-1}^{k}- L^{*} (\gamma L{x_{1}^{k}} - v^{k})), \\ y^{k} & = J_{B/\gamma} \left( L({x_{1}^{k}} + {x_{n}^{k}})- \frac{v^{k}}{\gamma}\right). \end{aligned} \right. $$

(23)

Then, the following statements hold.

(i)
The sequence $(\mathbf {z}^{k},v^{k})_{k\in \mathbb {N}}$ converges weakly to a point $(\bar {\mathbf {z}},\bar {v})\in \text {Fix} {T}$.
(ii)
The sequence $({x^{k}_{1}},\ldots ,{x^{k}_{n}},y^{k})_{k\in \mathbb {N}}$ converges weakly to $(\bar {x},\ldots ,\bar {x},L\bar {x})$ with $\bar {x} \in \mathcal {P}$.
(iii)
The sequence $\left (\gamma L{x_{i}^{k}}-v^{k} \right )_{k\in \mathbb {N}}$ converges weakly to $\gamma L\bar {x}-\bar {v}\in \mathcal {D}$, for all i ∈ ⟦1,n⟧.

Proof

(i) The sequence in (22) is the fixed point iteration generated as

$$ \begin{pmatrix}\mathbf{z}^{k+1}\\v^{k+1}\end{pmatrix} = T \begin{pmatrix}\mathbf{z}^{k}\\v^{k}\end{pmatrix} \quad \forall k\geq 0. $$

Since λ ∈ ]0,1[ and $\gamma \in {\left ]0,\|L\|^{-2}\right ]}$, T is averaged nonexpansive by Lemma 2 and, moreover, FixT = ∅, due to Z≠∅ and Lemma 1(i). Then, by [2, Theorem 5.15] the sequence $(\mathbf {z}^{k},v^{k})_{k\in \mathbb {N}}$ converges weakly to a point $(\bar {\mathbf {z}},\bar {v})\in \text {Fix} {T}$ and $\lim _{k\to \infty } \|(\mathbf {z}^{k+1},v^{k+1})-(\mathbf {z}^{k},v^{k})\|_{\gamma }= 0$.

(ii) From (i), the sequence $(\mathbf {z}^{k},v^{k})_{k\in \mathbb {N}}$ is bounded. Then, nonexpansivity of the resolvents and boundedness of the linear operator L imply that the sequence $(\mathbf {x}^{k},y^{k})_{k\in \mathbb {N}}=({x_{1}^{k}},\ldots ,{x_{n}^{k}},y^{k})_{k\in \mathbb {N}}$ is also bounded. Furthermore, the fact that $(\mathbf {z}^{k+1},v^{k+1})_{k\in \mathbb {N}}-(\mathbf {z}^{k},v^{k})_{k\in \mathbb {N}}\to 0$, as $k\to \infty $, implies by (22) that

$$ y^{k} - L{x_{n}^{k}} \to 0 \text{ and } x_{i+1}^{k} -{x_{i}^{k}} \to 0, \text{ for all } i\in{\llbracket 1,n-2\rrbracket}. $$

(24)

Next, by making use of the definition of resolvents and (23), we can write

$$ C\begin{pmatrix} {z_{1}^{k}}-{x_{1}^{k}} \\ ({z_{2}^{k}}-{x_{2}^{k}})-({z_{1}^{k}}-{x_{1}^{k}}) \\ {\vdots} \\ (z_{n-1}^{k}-x_{n-1}^{k})-(z_{n-2}^{k}-x_{n-2}^{k}) \\ {x_{n}^{k}} \\ \gamma \left( L({x_{1}^{k}}+{x_{n}^{k}})-y^{k}\right)-v^{k} \end{pmatrix} \ni \begin{pmatrix} {x_{1}^{k}}-{x_{n}^{k}} \\ {x_{2}^{k}}-{x_{n}^{k}} \\ {\vdots} \\ x_{n-1}^{k}-{x_{n}^{k}}\\ {x_{1}^{k}}-{x_{n}^{k}} +\gamma L^{*}\left( L{x_{n}^{k}}-y^{k}\right)\\ y^{k}- L{x_{n}^{k}} \end{pmatrix}, $$

(25)

where the operator $C\colon {\mathscr{H}}^{n}\times \mathcal {G}\rightrightarrows {\mathscr{H}}^{n}\times \mathcal {G}$ is given by

$$ C:= \begin{pmatrix} A_{1}^{-1}\\A_{2}^{-1} \\ {\vdots} \\ A_{n-1}^{-1} \\ A_{n}\\ B^{-1} \end{pmatrix} + \begin{pmatrix} 0 & 0 & {\dots} & 0 & -\text{Id} &0 \\ 0 & 0 & {\dots} & 0 & -\text{Id} &0\\ {\vdots} & {\vdots} & {\ddots} & {\vdots} & {\vdots} &\vdots\\ 0 & 0 & {\dots} & 0 & -\text{Id} &0\\ \text{Id} & \text{Id} & {\dots} & \text{Id} & 0 & L^{*}\\ 0 & 0 &{\dots} &0 &- L &0 \end{pmatrix}. $$

(26)

The operator C is maximally monotone as the sum of a maximally monotone operator and a skew symmetric linear operator (see, e.g., [2, Corollary 25.5 (i) & Example 20.35]). Thus, the graph of C is sequentially closed in the weak-strong topology, by demiclosedness of maximally monotone operators [2, Corollary 20.38].

Now, let $(\bar {\mathbf {x}},\bar {y})$ be a weak sequential cluster point of $(\mathbf {x}^{k},y^{k})_{k\in \mathbb {N}}$. Due to (24), $\bar {\mathbf {x}}$ is of the form $\bar {\mathbf {x}} = (\bar {x},\ldots ,\bar {x})\in {\mathscr{H}}^{n}$ and $\bar {y}=L\bar {x}$. Taking the limit along a subsequence of $(\mathbf {x}^{k},y^{k})_{k\in \mathbb {N}}$ which converges weakly to $(\bar {\mathbf {x}},\bar {y})$ and using demiclosedness of C, (25) and (26) yield the expression

$$ \left\{\begin{array}{l} \bar{z}_{1}-\bar{x} \in A_{1}(\bar{x}), \\ \bar{z}_{i}-\bar{z}_{i-1} \in A_{i}(\bar{x}), \quad \forall i\in{\llbracket 2,n-1\rrbracket}, \\ \bar{x}-\bar{z}_{n-1}-L^{*}(\gamma L \bar{x}-\bar{v})\in A_{n}(\bar{x}) , \\ \gamma L\bar{x}-\bar{v} \in B (L\bar{x}), \end{array}\right. $$

which, by summing the first n equations, implies that $(\bar {x},\gamma L\bar {x}-\bar {v})\in \mathbf {Z}$ with $\bar {x} = J_{A_{1}}(\bar {z}_{1})$. In particular, we have shown that $(\bar {\mathbf {x}},\bar {y})$ is directly obtained from $\bar {\mathbf {z}}$, implying that it is the unique weak sequential cluster point of the bounded sequence $(\mathbf {x}^{k},y^{k})_{k\in \mathbb {N}}$. Thus, the full sequence converges weakly to this point.

(iii) From (i)-(ii), for all i ∈ ⟦1,n⟧, we deduce that the sequence $(\gamma L{x_{i}^{k}}-v^{k})_{k\in \mathbb {N}}$ weakly converges to $\gamma L\bar {x}-\bar {v}$, which belongs to $\mathcal {D}$ since $(\bar {x},\gamma L\bar {x}-\bar {v})\in \mathbf {Z}$. □

Remark 1 (Malitsky–Tam resolvent splitting 16 as a special case)

Consider Problems (4) and (5) in the particular case in which L = Id. Then, $B:{\mathscr{H}}\rightrightarrows {\mathscr{H}}$ and (4) becomes the classical monotone inclusion problem with (n + 1)-operators. Furthermore, by setting γ = 1 in Theorem 3, it is straightforward to see that the sequences in (22) and (23) yield the Malitsky–Tam resolvent splitting with minimal lifting for (n + 1)-operators.

Remark 2 (On the parameter γ in the definition of the norm ∥⋅∥_γ)

In Lemma 2, we proved that the operator T is λ-averaged nonexpansive with respect to the norm ∥⋅∥_γ induced by the scalar product defined in (8). Although the use of this norm did not require detours from the usual procedure to prove convergence of the fixed point algorithm in Theorem 3, it may numerically affect the performance of the algorithm. To give an intuition about this, consider the norm of the sequence of residuals $\left (\|(\mathbf {z}^{k+1},v^{k+1})-(\mathbf {z}^{k},v^{k})\|_{\gamma }\right )_{k\in \mathbb {N}}$, which converges to 0 as the algorithm reaches a fixed point, and note that we have

$$ \left\|(\mathbf{z}^{k+1},v^{k+1})-(\mathbf{z}^{k},v^{k})\right\|^{2}_{\gamma} = \|\mathbf{z}^{k+1}-\mathbf{z}^{k}\|^{2} + \frac{1}{\gamma} \|v^{k+1}-v^{k}\|^{2} \quad \forall k\geq 0. $$

Lemma 2 implies that this sequence is monotone decreasing, but if γ is very small, the weight of the sequence of dual variables $(v^{k+1}- v^{k})_{k\in \mathbb {N}}$ in the norm would be much larger than the one of the sequence of primal variables $(\mathbf {z}^{k+1}-\mathbf {z}^{k})_{k\in \mathbb {N}}$, so a small decrease in the value of ∥v^k+ 1 − v^k∥ will readily imply a decrease of the norm of the sequence of residuals even if ∥z^k+ 1 −z^k∥ does not diminish much. Because of that, a larger number of iterations might be needed to achieve convergence of the primal sequence, which can slow down the overall convergence of the algorithm. Nonetheless, it is possible to perform some sort of pre-conditioning to prevent from having a large constant in the definition of the norm. We will further comment on this in the numerical experiments in Section 4.

The case with multiple linear compositions

A standard product space reformulation permits to extend our method to the more general inclusion Problem 1, which has finitely many linearly composed maximally monotone operators. We detail this in the following corollary, while the resulting scheme is displayed in Algorithm 1.

Corollary 1

Let n ≥ 2 and assume that Problem 1 has a solution. Let λ ∈ ]0,1[ and $\gamma \in {\left ]0,1/ {\sum }_{j=1}^{m} \|L_{j}\|^{2} \right ]}$. Given some initial points $\mathbf {z}^{0}= (z_{1},\ldots ,z_{n-1})\in {\mathscr{H}}^{n-1}$ and $\mathbf {v}^{0} = ({v_{1}^{0}},\ldots ,{v_{m}^{0}})\in \mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}$, consider the sequences $(\mathbf {z}^{k},\mathbf {v}^{k})_{k\in \mathbb {N}}$ and $(\mathbf {x}^{k},\mathbf {y}^{k})_{k\in \mathbb {N}}$ generated by Algorithm 1. Then, the following assertions hold:

(i)
The sequence $(\mathbf {z}^{k},\mathbf {v}^{k})_{k\in \mathbb {N}}$ converges weakly to a point $(\bar {\mathbf {z}},\bar {\mathbf {v}})\in {\mathscr{H}}^{n-1}\times \mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}$.
(ii)
The sequence $({x_{1}^{k}}, \ldots , {x_{n}^{k}},{y_{1}^{k}},\ldots ,{y_{m}^{k}})_{k\in \mathbb {N}}$ converges weakly to $(\bar {x},\ldots ,\bar {x},$ $L_{1}\bar {x},\ldots ,L_{m}\bar {x})$ with $\bar {x}\in {\mathscr{H}}$ solving the primal inclusion (1).
(iii)
For all i ∈ ⟦1,n⟧, the sequence $(\gamma L_{1} {x_{i}^{k}}-{v_{1}^{k}},\ldots ,\gamma L_{m} {x_{i}^{k}}-{v_{m}^{k}})_{k\in \mathbb {N}}$ converges weakly to $(\gamma L_{1}\bar {x}-\bar {v}_{1}, \ldots , \gamma L_{m}\bar {x}-\bar {v}_{m})$, which solves the dual inclusion (2).

Proof

Just note that Problem 1 can be reformulated as an instance of Problems (4) and (5) by replacing B by the operator $\mathbf {B}: \mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}\rightrightarrows \mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}$ defined as the cartesian product and L by the linear operator . In particular, $\|\mathbf {L}\|^{2} ={\sum }_{j=1}^{n}\|L_{j}\|^{2}$ and its adjoint operator is $\mathbf {L}^{*}:\mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}\to {\mathscr{H}}:(v_{1},\ldots ,v_{m})\to {\sum }_{j=1}^{m} L_{j}^{*} v_{j}$. Hence, the result follows by considering the averaged nonexpansive operator T in (6) for this choice of operators and applying Theorem 3. □

Minimality for primal-dual parametrized resolvent splitting

We begin by extending the definition of fixed point encoding to englobe primal-dual problems. As in Section 2.1, we denote by T a fixed point operator and by S a solution operator, both parametrized by the maximally monotone operators as well as the linear and adjoint operators appearing in Problem 1.

Definition 8 (Fixed point encoding)

A pair of operators (T,S) is a fixed point encoding for Problem 1 if, for all particular instance of the problem,

$$ \text{Fix} {T}\neq \emptyset \Longleftrightarrow \text{zer}~{\left( \sum\limits_{i=1}^{n} A_{i}+ \sum\limits_{j=1}^{m} L_{j}^{*}B_{j}L_{j}\right)}\neq\emptyset \text{ and } \mathbf{w}\in\text{Fix} {T} \Longrightarrow S(\mathbf{w}) \in\mathbf{Z}, $$

where we recall that Z denotes the set of primal-dual solutions of the problem.

When talking about lifting for primal-dual problems, the need to distinguish between variables in the space of primal solutions and dual solutions arises. This motivates the following definition.

Definition 9 (Primal-dual lifting)

Let $d,f\in \mathbb {Z}_{+}$. A fixed point encoding (T,S) is a (d,f)-fold lifting for Problem 1 if

$$ T:\mathcal{H}^{d}\times\mathcal{G}_{1}^{f_{1}}\times\cdots\times\mathcal{G}_{m}^{f_{m}} \to\mathcal{H}^{d}\times\mathcal{G}_{1}^{f_{1}}\times\cdots\times\mathcal{G}_{m}^{f_{m}} $$

and

$$ S:\mathcal{H}^{d}\times\mathcal{G}_{1}^{f_{1}}\times\cdots\times\mathcal{G}_{m}^{f_{m}}\to\mathcal{H}\times\mathcal{G}_{1}\times\cdots\times\mathcal{G}_{m}, $$

where f_j ≥ 0 for all j ∈ ⟦1,m⟧ and $f = {\sum }_{j=1}^{m} f_{j}$. We adopt the convention that the space $\mathcal {G}_{j}$ vanishes from the equation when f_j = 0.

The need to control the Lipschitz constants of the linear operators requires the introduction of parameters in the resolvents of the maximally monotone operators. This motivates the definition of parametrized resolvent splitting introduced in Section 2.1 and which we now adapt to primal-dual splitting algorithms.

Definition 10 (Primal-dual parametrized resolvent splitting)

A fixed point encoding (T,S) for Problem 1 is a primal-dual parametrized resolvent splitting if, for all particular instance of the problem, there is a finite procedure that evaluates T and S at a given point which only uses vector addition, scalar multiplication, the parametrized resolvents of A₁,…A_n and B₁,…,B_m, and forward evaluations of L₁,…,L_m and their adjoints.

Definition 11 (Frugality)

A primal-dual parametrized resolvent splitting (T,S) for Problem 1 is frugal if, in addition, each of the parametrized resolvents of A₁,…,A_n and B₁,…,B_m is used exactly once.

Remark 3 (On the absence of restrictions on the evaluation of the linear operators)

Since in the finite case, a forward evaluation of a linear operator is computationally equivalent to performing vector addition and scalar multiplication, this suggests that for practical applications there is no computational need to control the number of evaluations of the linear operators in the definition of frugality.

Example 4

Let n ≥ 2 and consider Problem 1. Let $T:{\mathscr{H}}^{n-1}\times \mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}\to {\mathscr{H}}^{n-1}\times \mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}$ be the operator defined in (6) by setting and . Let $S:{\mathscr{H}}^{n-1}\times \mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}\to {\mathscr{H}}\times \mathcal {G}_{1}\times \cdots \times \mathcal {G}_{m}$ be defined as

$$ S\begin{pmatrix}\mathbf{z}\\ \mathbf{v}\end{pmatrix} := \begin{pmatrix} J_{A_{1}}(z_{1}) \\ \gamma L_{1} J_{A_{1}}(z_{1}) - v_{1} \\{\vdots} \\ \gamma L_{m} J_{A_{1}}(z_{1}) - v_{m} \end{pmatrix}. $$

Then, by Lemma 1 and Corollary 1, the pair (T,S) is a frugal parametrized resolvent splitting with (n − 1,m)-fold lifting.

The following result shows that the lifting of Algorithm 1 is minimal among frugal primal-dual parametrized resolvent splitting algorithms with m dual variables.

Theorem 5 (Minimality theorem for frugal parametrized splitting)

Let (T,S) be a frugal primal-dual parametrized resolvent splitting for Problem 1 with (d,m)-fold lifting. Then, if n ≥ 2, necessarily d ≥ n − 1.

Proof

By way of contradiction, let (T,S) be a frugal primal-dual parametrized resolvent splitting for Problem 1 with (d,m) fold lifting and d < n − 1. Consider the instance of the problem in which $L_{j} =\text {Id} :{\mathscr{H}} \to {\mathscr{H}}$ for all j ∈ ⟦1,m⟦. Then, Problem 1 becomes the classical monotone inclusion problem with n + m operators and (T,S) is a frugal resolvent splitting with (d + m)-fold lifting for such problem with d + m < n + m − 1, which contradicts Theorem 2. □

Finally, we conclude this section by highlighting that Algorithm 1 can be applied with n < 2, by setting A_i = 0 if required. However, a reduction in the lifting is not obtained in this case.

Remark 4 (Algorithm 1 when n ≤ 1)

Consider Algorithm 1 applied to Problem 1 with n ≤ 1. We distinguish the two cases:

(i)
If n = 1, then Algorithm 1 has (1,m)-lifting. Indeed, (27) and (28) become
$$ \left( \begin{array}{ll} z^{k+1} \\ \mathbf{v}^{k+1} \end{array}\right) = \left( \begin{array}{ll} z^{k} \\ \mathbf{v}^{k} \end{array}\right) + \lambda \left( \begin{array}{cc} x^k-z^k \\ \gamma (y_1^k - L_1x^k) \\ {\vdots} \\ \gamma(y_m^k - L_m x^k) \end{array}\right) \quad \forall k\geq 0, $$
(29)
and
$$ \left\{\begin{array}{ll} x^{k} &= J_{A_{1}}\left( z^{k} -\sum\limits_{j=1}^{m} L_{j}^{*}(\gamma L_{j} z^{k} - {v_{j}^{k}})\right), \\ {y_{j}^{k}} &= J_{B_{j}/\gamma} \left( L_{j}(z^{k}+x^{k}) - \frac{{v^{k}_{j}}}{\gamma}\right), \quad \forall j\in{\llbracket 1,m\rrbracket}, \end{array}\right. $$
(30)
respectively. This means that, in contrast with what happens when n ≥ 2, there is no reduction in the lifting with respect to the number of operators involved.
(ii)
If n = 0, the scheme also has (1,m)-lifting. In fact, the scheme is the same as in the previous case but substituting $J_{A_{1}}$ by Id in (30). Note that this is also the lifting obtained by the already known algorithms in the literature applied to this case.

4 Numerical experiments

In this section, we test our algorithm for solving an ill-conditioned linear inverse problem which arises in image deblurring and denoising. Let $b\in \mathbb {R}^{n}$ be an observed blurred and noisy image of size M × N, with n = MN for grayscale and n = 3MN for color images, and denote by $A\in \mathbb {R}^{n\times n}$ the blur operator. The problem can be tackled by means of the regularized convex non-differentiable problem

$$ \inf\limits_{s \in \mathbb{R}^{n}} \left\{ \|A s- b\|_{1} + \alpha_{1} \|Ws\|_{1} + \alpha_{2} TV(s) + \delta_{[0,1]^{n}}(s) \right\}, $$

(31)

where α₁,α₂ > 0 are regularization parameters, $\delta _{[0,1]^{n}}$ denotes the indicator function of the set [0,1]ⁿ, $TV:\mathbb {R}^{n}\to \mathbb {R}$ is the discrete isotropic total variation function and W is the linear operator given by the normalized nonstandard Haar transform [21].

Recalling Remark 2, it is of interest to consider a mechanism which allows tuning the parameter γ appearing in the definition of the norm given by the inner product in (8) to an appropriate value. To this aim, we perform in (31) a change of variable of the form s = μx, with μ > 0, and instead handle the problem

$$ \inf_{x \in \mathbb{R}^{n}} \left\{ \mu \left\|A x- \frac{b}{\mu}\right\|_{1} + \alpha_{1}\mu \|Wx\|_{1} + \alpha_{2} TV(\mu x) + \delta_{[0,1/\mu]^{n}}(x) \right\}. $$

(32)

Below, we will see the way in which the choice of μ can help setting a suitable parameter γ.

The minimization problem in (32) can be modeled as a composite monotone inclusion problem. For this, define the operator $L:\mathbb {R}^{n}\to \mathbb {R}^{n}\times \mathbb {R}^{n} : x \to (L_{1} x, L_{2} x)$ where L₁ and L₂ are defined component-wise as

$$ \left\{\!\begin{array}{ll} \frac{x_{i+1,j} - x_{i,j}}{\mu}, & \text{if } i<M, \\ 0, & \text{otherwise}, \end{array}\right. \text{ and } (L_{2} x)_{i,j} \! =\! \left\{\! \begin{array}{ll} \frac{x_{i,j+1} - x_{i,j}}{\mu}, & \text{if } j<N,\\ 0, & \text{otherwise}. \end{array}\right. $$

(33)

Then, the parametrized total variation function can be written as TV (μ⋅) = ∥L(⋅)∥_×, with $\|(p,q)\|_{\times } := {\sum }_{i=1}^{m} {\sum }_{j=1}^{n} \sqrt {p_{i,j}^{2} + q_{i,j}^{2}}$. Furthermore, an upper bound of the Lipschitz constant of L is given by ∥L∥² ≤ 8μ² (see [9] for details).

By [2, Proposition 27.5], obtaining a solution to the following problem is equivalent to solving (32)

$$ \text{find } x\in\text{zer}~{\left( N_{[0,1/\mu]^{n}}+ W^{*} \circ \partial g_{1} \circ W + A^{*} \circ \partial g_{2} \circ A + L^{*} \circ \partial g_{3} \circ L \right)}, $$

(34)

with $g_{1}: \mathbb {R}^{n} \to \mathbb {R}$, g₁(y) = α₁μ∥y∥₁, $g_{2}:\mathbb {R}^{n} \to \mathbb {R}$, g₂(y) = μ∥y − b/μ∥₁, $g_{3}:\mathbb {R}^{n} \times \mathbb {R}^{n} \to \mathbb {R}$, g₃(p,q) = α₂∥(p,q)∥_×, and $N_{[0,1/\mu ]^{n}}$ the normal cone operator to the set [0,1/μ]ⁿ. In order to implement Algorithm 1 for solving (34), we need the expression of the following resolvents and proximity operators. By [2, Proposition 23.25 (iii)], the second term in (34) is a maximally monotone operator and its resolvent can be expressed as

$$ J_{W^{*} \circ \partial g_{1} \circ W} = \text{Id} - W^{*} \circ \left( \text{Id} - \text{prox}_{g_{1}}\right)\circ W = \text{Id} - W^{*}\circ \text{prox}_{g_{1}^{*}} \circ W, $$

where prox_g = J_∂g denotes the proximity operator of a function g, and $g_{1}^{*}$ is the conjugate function to g₁, which is equal to the indicator function $\delta _{[-\alpha _{1}\mu ,\alpha _{1}\mu ]^{n}}$, and thus $\text {prox}_{g_{1}^{*}} = P_{[-\alpha _{1}\mu ,\alpha _{1}\mu ]^{n}}$. Given σ > 0, the proximity operators of g₂ and g₃ are, respectively,

$$ \text{prox}_{\sigma g_{2}}(x) = \frac{b}{\mu}+\text{prox}_{\sigma \mu \|\cdot\|_{1}} \left( x-\frac{b}{\mu}\right) = \frac{b}{\mu} + \text{sign}{\left( x-\frac{b}{\mu}\right)} \odot \left[\biggl\vert x-\frac{b}{\mu}\biggr\vert- \sigma \mu \right]_{+}, $$

where ⊙ denotes element-wise product and [⋅]₊ and |⋅| are applied element-wise, and

$$ \text{prox}_{\sigma g_{3}} = \text{Id} - \sigma \text{prox}_{\frac{1}{\sigma} g_{3}^{*}} \circ \frac{1}{\sigma} \text{Id} = \text{Id} - \sigma P_{S} \circ \frac{1}{\sigma} \text{Id}, $$

since the conjugate function of g₃ is $g_{3}^{*}:\mathbb {R}^{n}\times \mathbb {R}^{n} \to \mathbb {R}^{n}$, $g_{3}^{*} = \delta _{S}$, with the set S defined as

$$ S:= \left\{ (p,q)\in\mathbb{R}^{n}\times\mathbb{R}^{n} : \max\limits_{1\leq i\leq M, 1\leq j\leq N} \sqrt{p_{i,j}^{2} + q_{i,j}^{2}} \leq \alpha_{2} \right\}, $$

and the projection operator $P_{S}:\mathbb {R}^{n}\times \mathbb {R}^{n}\to S$ is given component-wise by

$$ (p_{i,j},q_{i,j}) \mapsto \alpha_{2} \frac{(p_{i,j},q_{i,j})}{\max{\{\alpha_{2}, \sqrt{p_{i,j}^{2}+q_{i,j}^{2}}\}}}, \quad 1\leq i \leq M, 1\leq j\leq N. $$

Hence, when choosing $z^{0}\in \mathbb {R}^{n}$, ${v_{1}^{0}}\in \mathbb {R}^{n}$ and ${v_{2}^{0}}\in \mathbb {R}^{n}\times \mathbb {R}^{n}$ as starting values, and letting λ ∈ ]0,1[ and $\gamma \in {\left ]0,1/(\|A\|^{2}+\|L\|^{2})\right ]}$, the iterative scheme in Algorithm 1 becomes

$$ \left\lfloor \begin{array}{ll} {\kern9.5pt} {x_{1}^{k}} = &P_{[0,1/\mu]^{n}} (z^{k}), \\ {\kern9.5pt}{x_{2}^{k}} =& \left( \text{Id} - W^{*} \circ P_{[- \alpha_{1} \mu,\alpha_{1} \mu]^{n}} \circ W \right)\\ &\left( 2{x_{1}^{k}}-z^{k} - A^{*}(\gamma A{x_{1}^{k}} - {v_{1}^{k}}) - L^{*}(\gamma L{x_{1}^{k}} -{v_{2}^{k}}) \right), \\ {\kern9.5pt}{y_{1}^{k}} = &\frac{b}{\mu}+\text{prox}_{\frac{\mu}{\gamma} \|\cdot\|_{1}}\left( A({x_{1}^{k}}+{x_{2}^{k}}) - \frac{{v_{1}^{k}}}{\gamma} -\frac{b}{\mu}\right), \\ {\kern9.5pt}{y_{2}^{k}} = &\left( \text{Id} - \frac{1}{\gamma} P_{S}\right) \left( \gamma L({x^{k}_{1}}+{x_{2}^{k}})- {v_{2}^{k}}\right), \\ {\kern.5pt}z^{k+1} = &z^{k} + \lambda ({x_{2}^{k}}-{x_{1}^{k}}), \\ v_{1}^{k+1} =& {v_{1}^{k}} + \lambda \gamma ({y_{1}^{k}}-A {x_{2}^{k}}), \\ v_{2}^{k+1} =& {v_{2}^{k}} + \lambda \gamma ({y_{2}^{k}} - L {x_{2}^{k}}). \end{array}\right. $$

In our experiment, we replicate the problem in [5, Section 4.2], where an extensive comparison between different primal-dual algorithms is presented. Since the best performing algorithm is the Douglas–Rachford type primal-dual method in [5, Algorithm 3.1], we limit our comparison to this algorithm, whose detailed implementation is given in the cited work. We ran our experiments in MATLAB, making use of the inbuilt functions fspecial and imfilter to define an operator A which is a Gaussian blur operator of size 9 × 9 with standard deviation 4 and reflexive boundary conditions. In particular, A verifies ∥A∥ = 1 and A^∗ = A. We employed as observed image b a picture taken at the Schönbrunn Palace Gardens (Vienna) subjected to the already specified blur followed by the addition of a zero-mean Gaussian noise with standard deviation 10^− 3 (see Fig. 2). To test the influence on the performance of the picture size, we resized the original picture to different pixel resolutions (see Table 1).

Table 1 Results from running on the picture displayed in Fig. 2 (for various pixel resolutions) 400 iterations of Algorithm 1 with μ = 1 and $\mu =1/\sqrt {8}$, and DR1

Full size table

When measuring the quality of the restored images, we use the improvement in signal-to-noise-ratio (ISNR), which is given by

$$ \text{ISNR}_{k} = 10\log_{10}\left( \frac{\|x-b\|^{2}}{\|x-x^{k}\|^{2}}\right), $$

where x and x^k are the original and the reconstructed image at iteration k, respectively. We tuned the regularization parameters in order to guarantee an adequate ISNR value for the restored images, setting α₁ := 0.005 and α₂ := 0.009.

We recall that the stepsize parameter γ of Algorithm 1 must be taken in the interval $\gamma \in {\left ]0,1/(\|A\|^{2}+\|L\|^{2})\right ]}= {\left ]0,1/(1+8\mu ^{2})\right ]}$. When μ = 1 (i.e., we solve (31)), this interval is ]0,0.111]. In our numerical experiments, we empirically observed that a very small stepsize negatively affects the performance of the algorithm, as mentioned in Remark 2. After testing different options, the most convenient one seems to be $\mu = 1/\sqrt {8}$, which implies making the Lipschitz constant of both linear operators in the problem equal to 1.

The initialization of each of the methods was the following:

DR1([5, Algorithm 3.1]): starting points x₀ = b and (v_1,0,v_2,0,v_3,0) = (0,0,0), σ₁ = 1, σ₂ = 0.05, σ₃ = 0.05, τ = 1(σ₁ + σ₂ + 8σ₃)^− 1 − 0.01, λ_n = 1.5 for al $n\in \mathbb {N}$.
Algorithm 1 with μ = 1: starting points z₀ = b and $({v_{1}^{0}},{v_{2}^{0}}) = (0,0)$, λ = 0.99 and γ = 1/9;
Algorithm 1 with $\mu =1/\sqrt {8}$: starting points z₀ = b/μ and $({v_{1}^{0}},{v_{2}^{0}}) = (0,0)$, λ = 0.99 and γ = 1/2.

We performed 400 iterations of each of the algorithms and compared the values of the objective function in (32) and the ISNR with respect to the CPU time, which provides a more realistic comparison than iteration count, since DR1 has a higher computational cost per iteration than Algorithm 1. The tests were ran on a desktop of Intel Core i7-4770 CPU 3.40GHz with 32GB RAM, under Windows 10 (64-bit). The algorithms were ran 3 times, once for each of the RGB components of the picture. The evolution in CPU time of adding these 3 values of the objective function and those of the ISNR for the 640 × 768-sized picture are represented in Fig. 1, where we observe that Algorithm 1 with $\mu =1/\sqrt {8}$ obtains slightly better values than those returned by DR1, but in significantly less time.

The restored images are presented in Fig. 2. There is no much difference between the ones corresponding to Algorithm 1 with $\mu =1/\sqrt {8}$ (bottom-middle) and DR1 (bottom-right), but a close look at the image obtained with Algorithm 1 with μ = 1 permits to observe its worse quality. To show that this trend in the performance of the algorithms is not affected by the image size, we present in Table 1 the results from running the algorithms on the same picture for five different pixel resolutions. Overall, we notice that the CPU time required for computing the 400 iterations is significantly lower for Algorithm 1, as expected. On average, DR1 required 45% more time than Algorithm 1 to compute the 400 iterations, independently of the size of the image. Regarding the parameter μ, Algorithm 1 with μ = 1 is notably outperformed by the other two methods, making thus clear the influence that this parameter has on it. The function values obtained were slightly lower for DR1, while the ISNR was slightly lower for Algorithm 1 with $\mu =1/\sqrt {8}$, which implies that both algorithms performed similarly with respect to the restored image quality.

Interpretation of the results of the experiments

The experimental results show that, after performing the same number of iterations, Algorithm 1 with $\mu =1/\sqrt {8}$ obtains similar results in the function values and the measurement in the quality of the image recovery than those obtained by DR1, but in considerably less time. This decrease in the running time can be attributed to the reduction in the lifting of the operator. Although in the first iterations DR1 achieves a larger reduction of the objective function, the quality of the restored image is not sufficient, as assessed by the low ISNR values. On the other hand, Algorithm 1 with μ = 1 can be discarded, as it obtains higher objective and lower ISNR values. Consequently, Algorithm 1 with $\mu =1/\sqrt {8}$ is the preferable choice to address problem (31).

5 Conclusions and open questions

In this work, we have considered the composite monotone inclusion problem together with its dual counterpart given by Problem 1. We have extended the definition of resolvent splitting given in [19] to encompass primal-dual algorithms and the inclusion of parameters in the resolvent and presented a definition of minimal lifting for frugal schemes of this form. We have proposed the first primal-dual algorithm which presents minimal lifting in this sense, and show its good performance with a numerical example.

To conclude, we outline possible directions for further research.

Establishing an optimal criterion for tuning the stepsize γ

We pointed out in Remark 2 the influence that the parameter γ can have in the performance of the algorithm. In Section 4 we presented a possibility for controlling this parameter, by making use of a change of variable which modifies the Lipschitz constants of the linear operators, and we empirically showed that it significantly affects the speed of performance of the algorithm. However, there is no guarantee that this strategy is optimal. It would be interesting to further investigate which is the best way for tuning the value of γ.

Achieving lifting reduction in the dual variables

The reduction in the lifting with respect to the number of operators achieved in the algorithm here presented only affects the primal variables. It remains open the question of whether it is possible to reduce the dimension of the underlying space associated to the linearly composed operators. More precisely, if we consider the problem given by

$$ \text{find } x \in \mathcal{H} \text{ such that }0 \in\sum\limits_{j=1}^{m} L^{*}_{j} {B_{j}} (L_{j} x), $$

is it possible to obtain an algorithm for solving this problem with (0,m − 1)-fold lifting (according to Definition 9)? Or even with (1,m − 1) or (0,m)-fold lifting? All these questions remain open.

Notes

Here, we make use of an abuse of notation. Indeed (36), should be written as y = (Y_z ⊗Id)z + (Y_x⊗)x, where ⊗ denotes the Kronecker product.

References

Attouch, H., Théra, M.: A general duality principle for the sum of two operators. J. Convex Anal. 3, 1–24 (1996)
MathSciNet MATH Google Scholar
Bauschke, H.H., Combettes, P.L.: Convex analysis and monotone operator theory in Hilbert spaces, 2nd edn. Springer, Berlin (2017)
Book MATH Google Scholar
Boţ, R.I., Csetnek, E.R., Heinrich, A.: A primal-dual splitting algorithm for finding zeros of sums of maximally monotone operators. SIAM J. Optim. 23(4), 2011–2036 (2013)
Article MathSciNet MATH Google Scholar
Boţ, R.I., Csetnek, E.R., Heinrich, A., Hendrich, C.: On the convergence rate improvement of a primal-dual splitting algorithm for solving monotone inclusion problems. Math. Program. 150(2), 251–279 (2015)
Article MathSciNet MATH Google Scholar
Boţ, R.I., Hendrich, C.: A Douglas–Rachford type primal-dual method for solving inclusions with mixtures of composite and parallel-sum type monotone operators. SIAM. J. Optim. 23(4), 2541–2565 (2013)
MathSciNet MATH Google Scholar
Boţ, R.I., Hendrich, C: Solving monotone inclusions involving parallel sums of linearly composed maximally monotone operators. Inverse Probl. Imaging 10(3), 617–640 (2016)
Article MathSciNet MATH Google Scholar
Briceño-Arias, L., Combettes, P.L.: A monotone + skew splitting model for composite monotone inclusions in duality. SIAM J. Optim. 21(4), 1230–1250 (2011)
Article MathSciNet MATH Google Scholar
Campoy, R: A product space reformulation with reduced dimension for splitting algorithms. Comput. Optim. Appl 83, 319–348 (2022)
Article MathSciNet MATH Google Scholar
Chambolle, A.: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20(1–2), 89–97 (2004)
MathSciNet MATH Google Scholar
Chambolle, A., Lions, P.L.: Image recovery via total variation minimization and related problems. Numer. Math. 76(2), 167–188 (1997)
Article MathSciNet MATH Google Scholar
Chambolle, A., Pock, T: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)
Article MathSciNet MATH Google Scholar
Combettes, P.L., Pesquet, J.-C.: Primal-dual splitting algorithm for solving inclusions with mixture of composite, Lipschtizian, and parallel-sum type monotone operators. Set-valued Var. Anal. 20(2), 307–330 (2012)
Article MathSciNet MATH Google Scholar
Condat, L, Kitahara, D, Contreras, A, Hirabayashi, A: Proximal splitting algorithms for convex optimization: a tour of recent advances, with new twists. To be published in SIAM Review (2022)
Dao, M.N., Dizon, N., Hogan, J.A., Tam, M.K.: Constraint reduction reformulations for projection algorithms with applications to wavelet construction. J. Optim. Theory Appl. 190, 201–233 (2021)
Article MathSciNet MATH Google Scholar
Lions, P.L., Mercier, B: Splitting algorithms for the sum of two nonlinear operators. SIAM. J. Numer. Anal. 16(6), 964–979 (1979)
Article MathSciNet MATH Google Scholar
Malitsky, Y, Tam, M.K: Resolvent splitting for sums of monotone operators with minimal lifting. arXiv:2108.02897 (2021)
Minty, G.J.: Monotone (nonlinear) operators in Hilbert space. Duke. Math. J. 29, 341–346 (1962)
Article MathSciNet MATH Google Scholar
Pierra, G.: Decomposition through formalization in a product space. Math. Program. 28, 96–115 (1984)
Article MathSciNet MATH Google Scholar
Ryu, E.K.: Uniqueness of DRS as the 2-operator resolvent-splitting and impossibility of 3-operator resolvent-splitting. Math. Program. 182(1), 233–273 (2020)
Article MathSciNet MATH Google Scholar
Setzer, S., Steidl, G., Teuber, T.: Infimal convolution regularizations with discrete ℓ₁-type functionals. Commun. Math. Sci. 9(3), 797–827 (2011)
Article MathSciNet MATH Google Scholar
Stollnitz, E.J., DeRose, T.D., Salesim, H.D.: Wavelets for computer graphics: a primer, part 1. IEEE Comput. Graph. Appl. 15(3), 76–84 (1995)
Article Google Scholar
Vũ, B.C.: A splitting algorithm for dual monotone inclusions involving cocoercive operators. Adv. Comput. Math. 38, 667–681 (2013)
Article MathSciNet MATH Google Scholar

Download references

Funding

FJAA and DTB were partially supported by the Ministry of Science, Innovation and Universities of Spain and the European Regional Development Fund (ERDF) of the European Commission, Grant PGC2018-097960-B-C22. FJAA was partially supported by the Generalitat Valenciana (AICO/2021/165). RIB was partially supported by FWF (Austrian Science Fund), project P 34922-N. DTB was supported by MINECO and European Social Fund (PRE2019-090751) under the program “Ayudas para contratos predoctorales para la formación de doctores” 2019.

Author information

Authors and Affiliations

Department of Mathematics, University of Alicante, San Vicente del Raspeig, 03690, Alicante, Spain
Francisco J. Aragón-Artacho & David Torregrosa-Belén
Faculty of Mathematics, University of Vienna, Vienna, 1090, Austria
Radu I. Boţ

Authors

Francisco J. Aragón-Artacho
View author publications
You can also search for this author in PubMed Google Scholar
Radu I. Boţ
View author publications
You can also search for this author in PubMed Google Scholar
David Torregrosa-Belén
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to David Torregrosa-Belén.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proof of the minimality theorem for parametrized resolvent splitting

Throughout this section, we assume that n ≥ 2 and we denote by $\mathcal {A}_{n}$ the set of all n-tuples of maximally monotone operators on ${\mathscr{H}}$. Hence, an element $A\in \mathcal {A}_{n}$ is of the form A = (A₁,…,A_n), where $A_{i}:{\mathscr{H}} \rightrightarrows {\mathscr{H}}$ are maximally monotone operators for all i ∈ ⟦1,n⟧. Every instance of Problem 2 is determined by the choice of $A\in \mathcal {A}_{n}$. In particular, when considering a fixed point encoding for this problem, the fixed point operator and the solution operator are both parametrized in terms of $A\in \mathcal {A}_{n}$. To emphasize this idea and to facilitate the exposition, we denote these operators by T_A and S_A in the following.

Let (T_A,S_A) be a d-fold lifted frugal parametrized resolvent splitting for Problem 2. By definition, there exists a finite procedure for evaluating T_A and S_A using only vector addition, scalar multiplication and the resolvents $J_{\delta _{1} A_{1}}, \ldots , J_{\delta _{n} A_{n}}$ precisely once, where δ = (δ₁,…,δ_n)^T is a vector of positive parameters. Following the same reasoning than in [16, Section 3], we can completely describe the evaluation of a point $\mathbf {z} = (z_{1},\ldots ,z_{d})\in {\mathscr{H}}^{d}$ by T_A with a series of equations. We directly present them here.

(1)
There exists $\mathbf {x}= (x_{1},\ldots ,x_{n})\in {\mathscr{H}}^{n}$ and $\mathbf {y} = (y_{1},\ldots ,y_{n})\in {\mathscr{H}}^{n}$ such that
$$ \mathbf{x}= J_{\delta A} (\mathbf{y}) \Longleftrightarrow 0 \in \mathbf{x} - \mathbf{y} + \delta A(\mathbf{x}), $$
(35)
where $\delta A := (\delta _{1} A_{1}, \ldots , \delta _{n} A_{n})\in \mathcal {A}_{n}$.
(ii)
There exists $Y_{z} \in \mathbb {R}^{n\times d}$ and a lower-triangular matrix $Y_{x} \in \mathbb {R}^{n\times n}$ with zeros in the diagonal such that^{Footnote 1}
$$ \mathbf{y} = Y_{z} \mathbf{z} + Y_{x}\mathbf{x}. $$
(36)
(iii)
By frugality, there exists $T_{z}\in \mathbb {R}^{d\times d}$ and $T_{x}\in \mathbb {R}^{d \times n}$ such that
$$ T_{A}(\mathbf{z}) = T_{z}\mathbf{z} + T_{x}\mathbf{x}. $$
(37)

Similarly, also by frugality, the evaluation of z by the solution operator S can be expressed as

$$ S_{A}(\mathbf{z}) = S_{z} \mathbf{z} + S_{x} \mathbf{x}, $$

(38)

where $S_{z}\in \mathbb {R}^{1\times d}$ and $S_{x}\in \mathbb {R}^{1\times n}$.

The proof of the next technical lemma can be obtained by following the same steps than in [16, Lemma 3.1], so we do not replicate it here.

Lemma 3

Let (T_A,S_A) be a frugal parametrized resolvent splitting for Problem 2. Let M denote the block matrix given by

$$ M :=\left[ \begin{array}{cccc} 0 & \mathrm{I}\mathrm{d} & -\mathrm{I}\mathrm{d} & \delta^{T} \mathrm{I}\mathrm{d} \\ Y_{z} & Y_{x} & -\mathrm{I}\mathrm{d} & 0 \\ T_{z}-\mathrm{I}\mathrm{d} & T_{x} & 0 & 0 \end{array}\right]. $$

If z ∈FixT_A, then there exists $\mathbf {v} = \left [\mathbf {z}, \mathbf {x}, \mathbf {y}, \mathbf {a} \right ]^{T}\in \ker {M}$ with a ∈ A(x). Conversely, if $\mathbf {v} = \left [ \mathbf {z}, \mathbf {x}, \mathbf {y}, \mathbf {a} \right ]^{T}\in \ker {M}$ and a ∈ A(x), then z ∈FixT_A, x = J_δA(y) and S_A(z) = S_zz + S_xx.

Proposition 5 (Solution operator)

Let (T_A,S_A) be a frugal parametrized resolvent splitting for Problem 2 . Then, for all $\bar {\mathbf {z}}\in \text {Fix} {T_{A}}$ and $\bar {\mathbf {x}} = J_{\delta A}(\bar {\mathbf {y}})$, we have

$$ S_{A}(\bar{\mathbf{z}}) = \frac{1}{n} \sum\limits_{i=1}^{n} (\bar{y}_{i} -\delta_{i} \bar{a}_{i})= \bar{x}_{1} = {\cdots} = \bar{x}_{n}, $$

(39)

where $\bar {\mathbf {a}} =A(\bar {\mathbf {x}})$.

Proof

Consider a particular instance of Problem 2 given by some operators $A\in \mathcal {A}_{n}$. Let T_A and S_A be the fixed point and solution operators of this particular instance, respectively. Let $\bar {\mathbf {z}}\in \text {Fix} {T_{A}}$ and $x^{*} = S_{A}(\bar {\mathbf {z}})$. By Lemma 1, there exists $\mathbf {v} := \left [ \bar {\mathbf {z}}, \bar {\mathbf {x}}, \bar {\mathbf {y}}, \bar {\mathbf {a}} \right ]^{T}\in \ker {M}$ with $\bar {\mathbf {a}}\in A(\bar {\mathbf {x}})$ and $x^{*} = S_{A}(\bar {\mathbf {z}}) = S_{z} \bar {\mathbf {z}} + S_{z} \bar {\mathbf {x}}$.

Consider now the n + 1 instances of Problem 2 given by the n-tuples of maximally monotone operators $A^{(0)}, A^{(1)},\ldots , A^{(n)}\in \mathcal {A}_{n}$ defined as

$$ A^{(0)}({\mathbf{x}}) := \bar{\mathbf{a}} \text{ and } A^{(j)}(\mathbf{x}) := \bar{\mathbf{a}} + \left[ \begin{array}{c} 0 \\ {\vdots} \\ x_{j} - \bar{x}_{j} \\ {\vdots} \\ 0 \end{array} \right] \quad \forall j \in{\llbracket 1,n\rrbracket}. $$

Since ${\mathbf {v}}\in \ker M$ and $\bar {\mathbf {a}} = A^{(j)}(\bar {\mathbf {x}})$, for all j ∈ ⟦0,n⟧, Lemma 1 implies that $\bar {\mathbf {z}} \in \text {Fix} {T_{A^{(j)}}}$, $\bar {\mathbf {x}} = J_{\delta A^{(j)}}(\bar {\mathbf {y}})$ and thus, $S_{A^{(j)}}(\bar {\mathbf {z}})= S_{z} \bar {\mathbf {z}} + S_{x}\bar {\mathbf {x}}=x^{*}$ is a solution to every instance. Therefore, we have $0 = {\sum }_{i=1}^{n} A_{i}^{(0)}(x^{*}) = {\sum }_{i=}^{n} \bar {a}_{i}$ and hence

$$ 0 = \sum\limits_{i=1}^{n} A^{(j)}_{i}(x^{*}) = \sum\limits_{i=1}^{n} \bar{a}_{i} + x^{*} -\bar{x}_{j} = x^{*} -\bar{x}_{j} \quad \forall j \in{\llbracket 1,n\rrbracket}, $$

from where it follows that $x^{*} = \bar {x}_{1} = {\cdots } = \bar {x}_{n}$. Finally, since $\bar {\mathbf {x}} = J_{\delta A^{(0)}} (\bar {\mathbf {y}})$, we have that $\bar {\mathbf {y}}-\bar {\mathbf {x}} = \delta A^{(0)}(\bar {\mathbf {x}})=(\delta _{1}\bar {a}_{1},\ldots ,\delta _{n}\bar {a}_{n})$. Consequently, ${\sum }_{i=1}^{n} \bar {y}_{i} - n x^{*} = {\sum }_{i=1}^{n} \delta _{i} \bar {a}_{i}$, which completes the proof. □

Note that, although the expression for the solution operator given by (39) differs from the one obtained in [16, Proposition 3.2], it still holds that the vector $\bar {\mathbf {x}}$ belongs to the diagonal subspace of dimension n, which we denote by Δ_n. This is what we employ to prove the following theorem.

Theorem 6

Let (T_A,S_A) be a frugal parametrized resolvent splitting with d-fold lifting for Problem 2. Then, d ≥ n − 1.

Proof

Suppose, by contradiction, that (T_A,S_A) is a frugal parametrized resolvent splitting for Problem 2 with d-fold lifting such that d ≤ n − 2. Consider a particular instance of the problem given by $A\in \mathcal {A}_{n}$ such that $\text {zer}~{\left ({\sum }_{i=1}^{n} A_{i}\right )} \neq \emptyset $ and take z ∈FixT_A. By Lemma 1, there exists $\mathbf {v}:= \left [ \mathbf {z}, \mathbf {x}, \mathbf {y}, \mathbf {a} \right ]^{T}\in \ker M$ with a ∈ A(x). The last row of M implies that 0 = (T_z −Id)z + T_xx. Since $T_{x}\in \mathbb {R}^{d\times n}$ and d ≤ n − 2, by the rank-nullity theorem, $\dim \ker T_{x} = n - \dim \text {rank} T_{x} \geq n - d \geq 2$. Since Δ_n is a subspace of dimension 1, there exists $\bar {\mathbf {x}}\notin {\Delta }_{n}$ such that $T_{x} \mathbf {x} = T_{x}\bar {\mathbf {x}}$.

Now, set ${\bar {\mathbf {z}}} := \mathbf {z}$, $\bar {\mathbf {y}} := Y_{z} \bar {\mathbf {z}} + Y_{x} \bar {\mathbf {x}}$ and $\bar {\mathbf {a}} := ((\bar {y}_{1}-\bar {x}_{1})/\delta _{1},\ldots ,(\bar {y}_{n} -\bar {x}_{n})/\delta _{n})$ and consider the instance of the problem given by $\bar {A}\in \mathcal {A}_{n}$ defined as $\bar {A}(\mathbf {s}) := \bar {\mathbf {a}}$ for all $\mathbf {s}\in {\mathscr{H}}^{n}$. Then, $\bar {\mathbf {v}} := \left [\bar {\mathbf {z}}, \bar {\mathbf {x}}, \bar {\mathbf {y}}, \bar {\mathbf {a}}\right ]^{T}\in \ker M$ with $\bar {\mathbf {a}} =\bar {A}(\bar {\mathbf {x}})$. By Lemma 3 and Proposition 5, this implies that $\bar {\mathbf {x}} \in {\Delta }_{n}$, obtaining thus a contradiction which completes the proof. □

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Aragón-Artacho, F.J., Boţ, R.I. & Torregrosa-Belén, D. A primal-dual splitting algorithm for composite monotone inclusions with minimal lifting. Numer Algor 93, 103–130 (2023). https://doi.org/10.1007/s11075-022-01405-9

Download citation

Received: 23 February 2022
Accepted: 23 August 2022
Published: 18 November 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11075-022-01405-9

Keywords

Mathematics Subject Classification (2010)

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A primal-dual splitting algorithm for composite monotone inclusions with minimal lifting

Abstract

Similar content being viewed by others

A Reflected Forward-Backward Splitting Method for Monotone Inclusions Involving Lipschitzian Operators

An Inertial Semi-forward-reflected-backward Splitting and Its Application

Strengthened splitting methods for computing resolvents

1 Introduction

Problem 1

Lifted splitting algorithms

Problem 2

2 Preliminaries

Definition 1

Definition 2

Definition 3

Proposition 1 ([2, Proposition 23.10])

2.1 Parametrized resolvent splitting

Definition 4 (Fixed point encoding 19)

Definition 5 (Parametrized resolvent splitting)

Definition 6 (Frugality)

Definition 7 (Lifting 19)

Example 1

Theorem 2 (Minimal lifting for frugal parametrized splittings)

Proof

3 A primal-dual splitting with minimal lifting

The case with one linear composition

Lemma 1

Proof

Lemma 2

Proof

Theorem 3

Proof

Remark 1 (Malitsky–Tam resolvent splitting 16 as a special case)

Remark 2 (On the parameter γ in the definition of the norm ∥⋅∥γ)

The case with multiple linear compositions

Corollary 1

Proof

Minimality for primal-dual parametrized resolvent splitting

Definition 8 (Fixed point encoding)

Definition 9 (Primal-dual lifting)

Definition 10 (Primal-dual parametrized resolvent splitting)

Definition 11 (Frugality)

Remark 3 (On the absence of restrictions on the evaluation of the linear operators)

Example 4

Theorem 5 (Minimality theorem for frugal parametrized splitting)

Proof

Remark 4 (Algorithm 1 when n ≤ 1)

4 Numerical experiments

Interpretation of the results of the experiments

5 Conclusions and open questions

Establishing an optimal criterion for tuning the stepsize γ

Achieving lifting reduction in the dual variables

Notes

References

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Data availability

Publisher’s note

Appendix A: Proof of the minimality theorem for parametrized resolvent splitting

Appendix A: Proof of the minimality theorem for parametrized resolvent splitting

Lemma 3

Proposition 5 (Solution operator)

Proof

Theorem 6

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification (2010)

Search

Navigation

Remark 2 (On the parameter γ in the definition of the norm ∥⋅∥_γ)