arXiv:1001.1078v2 [math.AT] 10 May 2010
STABILITY OF MULTIDIMENSIONAL PERSISTENT
HOMOLOGY WITH RESPECT TO DOMAIN PERTURBATIONS
PATRIZIO FROSINI AND CLAUDIA LANDI
Abstract. Motivated by the problem of dealing with incomplete or imprecise
acquisition of data in computer vision and computer graphics, we extend results concerning the stability of persistent homology with respect to function
perturbations to results concerning the stability with respect to domain perturbations. Domain perturbations can be measured in a number of different
ways. An important method to compare domains is the Hausdorff distance. We
show that by encoding sets using the distance function, the multidimensional
matching distance between rank invariants of persistent homology groups is
always bounded from above by the Hausdorff distance between sets. Moreover,
we prove that our construction maintains information about the original set.
Other well known methods to compare sets are considered, such as the symmetric difference distance between classical sets and the sup-distance between
fuzzy sets. Also in these cases we present results stating that the multidimensional matching distance between rank invariants of persistent homology
groups is bounded from above by these distances.
An experiment showing the potential of our approach concludes the paper.
Introduction
Persistent topology is a theory for studying objects related to computer vision
and computer graphics, by analyzing the qualitative and quantitative behavior of
real-valued functions defined over topological spaces and measuring the shape properties of the topological space under study (e.g., roundness, elongation, bumpiness,
color). More precisely, persistent topology studies the sequence of nested lower level
sets of the considered measuring functions and encodes at which scale a topological
feature (e.g., a connected component, a tunnel, a void) is created, and when it
is annihilated along this filtration. At the very beginning of the development of
persistent topology, this encoding captured only the connected component changes
in the lower level sets of a real valued function, and took the name of size function
[13, 16]. Some years later, it was extended to consider all homotopy groups of the
lower level sets of a vector-valued function, under the name of size homotopy groups
[14]. Nowadays we have a wide choice of variants for this encoding, ranging from
persistent homology groups capturing the homology of a one-parameter increasing
family of spaces [12], to multidimensional persistent homology groups extending the
previous concept to a multi-parameter setting [4], to vineyards coping with changes
in the function over time [7], to interval persistence [9], just to cite a few. In this
paper we focus on multidimensional persistent homology groups. For application
purposes, these groups are further encoded by considering only their rank, yielding
a parametrized version of Betti numbers, called rank invariants (or persistent
Betti numbers).
2010 Mathematics Subject Classification. Primary: 55N35; Secondary: 68T10, 68U05, 55N05.
Key words and phrases. Persistent topology, shape analysis, Čech homology, matching distance, distance function, Hausdorff distance, symmetric difference distance.
Research partially carried out within the activities of ARCES (Università di Bologna).
The stability of multidimensional rank invariants is quite an important issue
in persistent homology theory and its applications because the lack of stability
would make this invariant useless, every data measurement being affected by noise.
Stability with respect to perturbations of the measuring function was proved in
[5], based on the results of [1, 2], comparing persistent homology groups by the
multidimensional matching distance.
In this paper we consider the problem of stability with respect to changes of the
topological space, which is as important as the stability with respect to
changes of the measuring functions. Changes of the space under study can be measured
in a number of different ways. Indeed, according to the kind of noise producing
the perturbation, some distances are more suitable than others to compare sets. For
example, the Hausdorff distance is useful to measure distortions of the domain,
while the symmetric difference distance can cope with the presence of outliers.
Due to the existence of many different ways to compare sets, we propose a general
approach to the problem of stability of persistent homology groups with respect
to domain perturbations, and we apply this approach in a few cases.
Our main idea is to reduce the problem of stability with respect to changes of
the topological space to that of stability with respect to changes of the measuring
functions. This is achieved by substituting the domain K we are interested in
with an appropriate function $f_K$ defined on a fixed set D containing K, so that
the perturbation of the set K becomes a perturbation of the function $f_K$. As a
consequence, the original measuring function $\vec\varphi_{|K} : K \to \mathbb{R}^k$ is replaced by a new
measuring function $\vec\Phi : D \to \mathbb{R}^{k+1}$, $\vec\Phi = (f_K, \vec\varphi)$. Rank invariants of $(D, \vec\Phi)$ can be
compared using the multidimensional matching distance. In this way we can prove
robustness of persistent homology groups under domain perturbations.
In particular, we use this strategy when sets are compared by the Hausdorff
distance. In this case, taking $f_K$ equal to the distance function from K, we prove
that the multidimensional matching distance between the rank invariants associated
with two compact sets K1 and K2 is always bounded from above by the Hausdorff
distance between K1 and K2 (Theorem 2.1). At the same time, we show that,
in our approach, the information about the original domain K and its original
measuring function $\vec\varphi$ is fully maintained in the persistent homology groups of
$(D, \vec\Phi)$ (Theorem 2.2).
As a further contribution, we show stability with respect to perturbations of
the domain measured using the symmetric difference distance. Also in this case,
associating with K a suitable function fK , we can prove stability with respect to
domain perturbations measured by the symmetric difference distance (Theorem
3.1).
We also consider the situation where sets are described in a fuzzy sense, by means
of probability density functions, easily obtaining a stability result also in this case
(Theorem 3.2).
An experiment on a binary image concludes the paper, illustrating our results.
To conclude this introduction, we emphasize three key points of this paper.
Firstly, there is a real need for techniques that make persistent homology
robust against domain perturbations, since the classical setting is not stable
in this respect, as the example in Section 4 clearly shows. Secondly, the technique
we propose in order to achieve stability makes sense only in the multidimensional
version of persistent homology, since it is based on passing from a k-dimensional
measuring function to a (k + 1)-dimensional one. Thirdly, we underline that, by substituting
the study of the domain K with that of a much simpler domain D, we
obtain the sought-after stability without forgetting the persistent
homology of K.
1. Preliminaries
1.1. Multidimensional persistent homology groups. The following relations
$\preceq$ and $\prec$ are defined in $\mathbb{R}^k$: for $\vec u = (u_1, \dots, u_k)$ and $\vec v = (v_1, \dots, v_k)$, we say
$\vec u \preceq \vec v$ (resp. $\vec u \prec \vec v$) if and only if $u_i \le v_i$ (resp. $u_i < v_i$) for every index $i = 1, \dots, k$.
Moreover, $\mathbb{R}^k$ is endowed with the usual max-norm: $\|(u_1, u_2, \dots, u_k)\|_\infty = \max_{1\le i\le k} |u_i|$.
We shall use the following notations: $\Delta^+$ will be the open set $\{(\vec u, \vec v) \in \mathbb{R}^k \times \mathbb{R}^k : \vec u \prec \vec v\}$.
Given a topological space X, for every k-tuple $\vec u = (u_1, \dots, u_k) \in \mathbb{R}^k$
and for every continuous function $\vec\varphi : X \to \mathbb{R}^k$, we shall denote by $X\langle\vec\varphi \preceq \vec u\rangle$ the
lower level set $\{x \in X : \varphi_i(x) \le u_i,\ i = 1, \dots, k\}$ and by $\|\vec\varphi\|_\infty$ the sup-norm of
$\vec\varphi$, i.e. $\|\vec\varphi\|_\infty = \max_{x\in X} \|\vec\varphi(x)\|_\infty$. The function $\vec\varphi$ will be called a k-dimensional
measuring (or filtering) function.
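Definition 1.1. Let $\pi_q^{(\vec u,\vec v)} : \check H_q(X\langle\vec\varphi \preceq \vec u\rangle) \to \check H_q(X\langle\vec\varphi \preceq \vec v\rangle)$ be the homomorphism
induced by the inclusion map $\pi^{(\vec u,\vec v)} : X\langle\vec\varphi \preceq \vec u\rangle \hookrightarrow X\langle\vec\varphi \preceq \vec v\rangle$ with $\vec u \preceq \vec v$,
where $\check H_q$ denotes the qth Čech homology group. If $\vec u \prec \vec v$, the image of $\pi_q^{(\vec u,\vec v)}$ is
called the multidimensional qth persistent homology group of $(X, \vec\varphi)$ at $(\vec u, \vec v)$, and
is denoted by $\check H_q^{(\vec u,\vec v)}(X, \vec\varphi)$.
In other words, the group $\check H_q^{(\vec u,\vec v)}(X, \vec\varphi)$ contains all and only the homology classes
of q-cycles born before $\vec u$ and still alive at $\vec v$.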
In what follows, we shall work with coefficients in a field K, so that homology
groups are vector spaces, and hence torsion-free. Therefore, they can be completely
described by their rank, leading to the following definition (cf. [4]).
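Definition 1.2 (qth rank invariant). Let X be a topological space, and $\vec\varphi : X \to \mathbb{R}^k$
a continuous function. Let $q \in \mathbb{Z}$. The qth rank invariant of the pair $(X, \vec\varphi)$ is the
function $\rho_{(X,\vec\varphi),q} : \Delta^+ \to \mathbb{N} \cup \{\infty\}$ defined as
$$\rho_{(X,\vec\varphi),q}(\vec u, \vec v) = \operatorname{rank}\, \operatorname{im}\, \pi_q^{(\vec u,\vec v)}.$$
If X is a triangulable space embedded in some $\mathbb{R}^n$, then $\rho_{(X,\vec\varphi),q}(\vec u, \vec v) < +\infty$, for
every $(\vec u, \vec v) \in \Delta^+$ and every $q \in \mathbb{Z}$ [3, 5].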
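As an illustration of Definition 1.2 in the simplest case q = 0, the following minimal sketch computes the rank invariant of a finite graph. It is not taken from the paper: the function name `rank_invariant_0` and the dictionary-based graph encoding are assumptions made for this example, and graph connectivity is used in place of Čech homology. For q = 0 the rank of the image counts the connected components of the lower level set at v that contain at least one vertex of the lower level set at u.

```python
def rank_invariant_0(values, edges, u, v):
    """Rank of H_0(level set at u) -> H_0(level set at v) for a finite graph.

    values: dict mapping each vertex to a tuple of k real numbers
    edges:  iterable of (vertex, vertex) pairs
    u, v:   k-tuples with u < v componentwise
    """
    def below(x, w):
        # x belongs to the lower level set at w
        return all(xi <= wi for xi, wi in zip(values[x], w))

    level_v = {x for x in values if below(x, v)}
    parent = {x: x for x in level_v}

    def find(x):
        # union-find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        if a in level_v and b in level_v:
            parent[find(a)] = find(b)

    # components of the level set at v that meet the level set at u
    return len({find(x) for x in level_v if below(x, u)})


# A path graph 0-1-2-3-4 with a 1-dimensional measuring function.
values = {0: (0.0,), 1: (2.0,), 2: (1.0,), 3: (3.0,), 4: (0.5,)}
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(rank_invariant_0(values, edges, (1.0,), (2.5,)))  # 2
print(rank_invariant_0(values, edges, (1.0,), (3.0,)))  # 1
```

The same function accepts vector-valued measuring functions, since level-set membership is tested componentwise.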
1.2. Matching distance. We now recall the construction of the distance Dmatch
to compare rank invariants of multidimensional persistent homology groups.
Definition 1.3 (Admissible pairs). For every vector $\vec l = (l_1, \dots, l_k)$ of $\mathbb{R}^k$ such
that $l_i > 0$ for $i = 1, \dots, k$ and $\sum_{i=1}^k l_i^2 = 1$, and for every vector $\vec b = (b_1, \dots, b_k)$ of
$\mathbb{R}^k$ such that $\sum_{i=1}^k b_i = 0$, we shall say that the pair $(\vec l, \vec b)$ is admissible. We shall
denote the set of all admissible pairs in $\mathbb{R}^k \times \mathbb{R}^k$ by $\mathrm{Adm}_k$. Given an admissible
pair $(\vec l, \vec b)$, we define the half-plane $\pi_{(\vec l,\vec b)}$ of $\mathbb{R}^k \times \mathbb{R}^k$ by the following parametric
equations:
$$\begin{cases} \vec u = s\vec l + \vec b \\ \vec v = t\vec l + \vec b \end{cases}$$
for $s, t \in \mathbb{R}$, with $s < t$.
The key property of this foliation of $\Delta^+$ by half-planes is that the restriction of
$\rho_{(X,\vec\varphi),q}$ to each leaf can be seen as the 1-dimensional rank invariant associated with
a suitable pair $(X, F^{\vec\varphi}_{(\vec l,\vec b)})$, where $F^{\vec\varphi}_{(\vec l,\vec b)} : X \to \mathbb{R}$. Precisely, the following statement
(proved in [2]) holds:
Theorem 1.4 (Reduction Theorem). Let $(\vec l, \vec b)$ be an admissible pair and let $F^{\vec\varphi}_{(\vec l,\vec b)} : X \to \mathbb{R}$
be defined by setting
$$F^{\vec\varphi}_{(\vec l,\vec b)}(x) = \max_{i=1,\dots,k} \frac{\varphi_i(x) - b_i}{l_i}.$$
Then, for every $(\vec u, \vec v) = (s\vec l + \vec b, t\vec l + \vec b) \in \pi_{(\vec l,\vec b)}$ we have that
$$\rho_{(X,\vec\varphi),q}(\vec u, \vec v) = \rho_{(X,F^{\vec\varphi}_{(\vec l,\vec b)}),q}(s, t).$$
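The mechanism behind the Reduction Theorem can be checked numerically at the level of lower level sets: a point x lies in the level set of the vector-valued function at $s\vec l + \vec b$ exactly when the reduced scalar function is at most s. The sketch below is only an illustration under stated assumptions; the names `reduce_to_1d` and `in_level_set`, and the sample function `phi`, are hypothetical and not part of the paper.

```python
import math


def reduce_to_1d(phi, l, b):
    """Return F(x) = max_i (phi_i(x) - b_i) / l_i for an admissible pair (l, b)."""
    if any(li <= 0 for li in l):
        raise ValueError("every l_i must be positive")
    if not math.isclose(sum(li * li for li in l), 1.0):
        raise ValueError("l must have unit Euclidean norm")
    if not math.isclose(sum(b), 0.0, abs_tol=1e-12):
        raise ValueError("the components of b must sum to zero")

    def F(x):
        return max((p - bi) / li for p, bi, li in zip(phi(x), b, l))

    return F


def in_level_set(phi, x, w):
    """True if phi_i(x) <= w_i for every component i."""
    return all(p <= wi for p, wi in zip(phi(x), w))


# Hypothetical 2-dimensional measuring function on the real line.
phi = lambda x: (x, x * x)
l, b = (0.6, 0.8), (0.5, -0.5)
F = reduce_to_1d(phi, l, b)

s = 1.0
u = tuple(s * li + bi for li, bi in zip(l, b))  # the point s*l + b
for x in (-2, -1, 0, 1, 2, 3):
    # membership in the level set of phi at u agrees with F(x) <= s
    print(x, in_level_set(phi, x, u), F(x) <= s)
```

Because each $l_i$ is positive, dividing by $l_i$ preserves the inequalities, which is why the maximum over the components captures the whole level set.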
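Since 1-dimensional rank invariants can be compared by a distance $d_{match}$ that
matches points of persistence diagrams [5, 6], we can construct the following distance between k-dimensional rank invariants, called multidimensional matching distance:
$$D_{match}\bigl(\rho_{(X,\vec\varphi),q}, \rho_{(X,\vec\psi),q}\bigr) = \sup_{(\vec l,\vec b)\in \mathrm{Adm}_k} \ \min_i l_i \cdot d_{match}\bigl(\rho_{(X,F^{\vec\varphi}_{(\vec l,\vec b)}),q}, \rho_{(X,G^{\vec\psi}_{(\vec l,\vec b)}),q}\bigr),$$
where $F^{\vec\varphi}_{(\vec l,\vec b)}(x) = \max_{i=1,\dots,k} \left\{ \frac{\varphi_i(x)-b_i}{l_i} \right\}$ and $G^{\vec\psi}_{(\vec l,\vec b)}(x) = \max_{i=1,\dots,k} \left\{ \frac{\psi_i(x)-b_i}{l_i} \right\}$.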
The key property of $D_{match}$ is its stability with respect to perturbations of the
measuring function $\vec\varphi$, as the following theorem states (cf. [5]).
Theorem 1.5 (Multidimensional Stability Theorem). Let X be triangulable. For
every $q \in \mathbb{Z}$, and for every two continuous functions $\vec\varphi, \vec\psi : X \to \mathbb{R}^k$,
$$D_{match}\bigl(\rho_{(X,\vec\varphi),q}, \rho_{(X,\vec\psi),q}\bigr) \le \|\vec\varphi - \vec\psi\|_\infty.$$
1.3. Comparison of sets. The problems of description and comparison of sets
can be dealt with in a myriad of different ways, each one more or less suitable
than another for a given application task.
In classical set theory, the membership of elements in a set is assessed in binary
terms according to a bivalent condition – an element either belongs or does not
belong to the set. By contrast, in fuzzy set theory [17, 11], a fuzzy set A in X
is characterized by a membership function fA : X → [0, 1], with the value fA (x)
representing the grade of membership of x in A. Usually, the nearer the value of
fA (x) to 1, the higher the grade of membership of x in A. The fuzzy set theory can
be used in a wide range of domains in which information is incomplete or imprecise.
If classical set theory is adopted, then a number of different dissimilarity measures exist to compare two sets [15, 10]. A frequently used dissimilarity measure is
the Hausdorff distance, which is defined for arbitrary non-empty compact subsets
$K_1, K_2$ of $\mathbb{R}^n$. If $K_1, K_2$ are contained in a non-empty compact subset D of $\mathbb{R}^n$,
the Hausdorff distance can be defined by
$$\delta_H(K_1, K_2) = \max\Bigl\{ \max_{x\in K_2} d_{K_1}(x),\ \max_{y\in K_1} d_{K_2}(y) \Bigr\},$$
where $d_K$ denotes the distance to K, that is the function $d_K : D \to \mathbb{R}$ defined by
$d_K(x) = \min_{y\in K} \|x - y\|$, $\|\cdot\|$ being any norm on $\mathbb{R}^n$ (e.g., the Euclidean norm).
This can be reformulated as follows (cf. [8, Ch.4, Sect. 2.2]):
$$\delta_H(K_1, K_2) = \|d_{K_1} - d_{K_2}\|_\infty. \tag{1.1}$$
The Hausdorff distance is robust against small deformations, but it is sensitive to
outliers: a single far-away noise point drastically increases the Hausdorff distance.
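For finite point sets, both the max-min definition of the Hausdorff distance and the sup-norm reformulation (1.1) can be evaluated directly. The following sketch is only an illustration: the function names, the point sets and the sampled domain `D` are assumptions chosen for this example, and the Euclidean norm is used. The set `K2` differs from `K1` by a single far-away point, so the example also shows the sensitivity to outliers.

```python
import math


def dist_to_set(x, K):
    """d_K(x): Euclidean distance from the point x to the finite set K."""
    return min(math.dist(x, y) for y in K)


def hausdorff(K1, K2):
    """Hausdorff distance via the max-min definition."""
    return max(max(dist_to_set(x, K1) for x in K2),
               max(dist_to_set(y, K2) for y in K1))


def sup_norm_difference(D, K1, K2):
    """Sup-norm of d_K1 - d_K2 over the sampled domain D, as in (1.1)."""
    return max(abs(dist_to_set(x, K1) - dist_to_set(x, K2)) for x in D)


K1 = [(0, 0), (1, 0)]
K2 = K1 + [(4, 3)]                                  # one outlier added
D = [(i, j) for i in range(6) for j in range(6)]    # grid containing K1 and K2

print(hausdorff(K1, K2))                # equals sqrt(18), about 4.2426
print(sup_norm_difference(D, K1, K2))   # same value, attained at the outlier
```

The two values coincide because the sampled domain contains both sets, so the maximum of $|d_{K_1} - d_{K_2}|$ is attained at a point of $K_1 \cup K_2$.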
A dissimilarity measure that is based on the area of the symmetric difference,
such as the symmetric difference pseudo-metric, overcomes the problem of outliers.
Denoting by µ the Lebesgue measure on Rn , the symmetric difference pseudo-metric
is defined between two measurable sets A, B with finite measure by d△ (A, B) =
µ(A△B) where A△B = (A ∪ B) \ (A ∩ B) is the symmetric difference of A and B.
It holds that d△ (A, B) = 0 if and only if A and B are equal almost everywhere.
Identifying two sets A and B if µ(A△B) = 0, we obtain the symmetric difference
metric.
Other dissimilarity measures for more restricted patterns are, for example, the
bottleneck distance between finite point sets and the Fréchet distance between
curves. Since many other distances could be considered, we limit our
study to stability with respect to the Hausdorff and symmetric difference
distances.
When fuzzy sets are used, their dissimilarity can be measured by any distance
between functions. In this case we confine ourselves to the sup-norm distance between
fuzzy sets.
2. Stability with respect to Hausdorff distance
Our main idea in proving stability of rank invariants with respect to noisy domains is to transform perturbations of sets into perturbations of functions. In this
way it is possible to apply the Multidimensional Stability Theorem 1.5. When domain perturbations are measured by the Hausdorff distance, in order to pass from
a set K to a function, we insert the distance function dK described in subsection
1.3 as the first component of the measuring function. In this way, assuming that
all the sets under study are contained in a larger set D, the original problem, i.e.
studying persistent homology groups of a set K endowed with the restriction to
K of a measuring function $\vec\varphi : D \to \mathbb{R}^k$, is transformed into the new problem of
studying the persistent homology groups of D endowed with the measuring function
$\vec\Phi = (d_K, \vec\varphi)$.
Given two domains $K_1$ and $K_2$, and two functions $\vec\varphi_1, \vec\varphi_2 : D \to \mathbb{R}^k$, our first
result relates the distance $D_{match}$ between the new pairs $(D, \vec\Phi_1)$, $(D, \vec\Phi_2)$ to the
change of the measuring functions $\vec\varphi_1$ and $\vec\varphi_2$, and to the Hausdorff distance between
the original sets $K_1$, $K_2$. More precisely, it proves stability with respect to both set
and function perturbations. Indeed, the multidimensional matching
distance $D_{match}$ is shown never to exceed the maximum of
the Hausdorff distance between the domains $K_1$ and $K_2$ and the
sup-norm distance between the measuring functions $\vec\varphi_1$ and $\vec\varphi_2$. In particular, if $\vec\varphi_1$ and
$\vec\varphi_2$ coincide then the multidimensional matching distance $D_{match}$ is
never greater than the Hausdorff distance between $K_1$ and $K_2$.
Theorem 2.1. Let $K_1, K_2$ be non-empty closed subsets of a triangulable subspace
D of $\mathbb{R}^n$. Let $d_{K_1}, d_{K_2} : D \to \mathbb{R}$ be their respective distance functions. Moreover,
let $\vec\varphi_1, \vec\varphi_2 : D \to \mathbb{R}^k$ be vector-valued continuous functions. Then, defining $\vec\Phi_1, \vec\Phi_2 : D \to \mathbb{R}^{k+1}$
by $\vec\Phi_1 = (d_{K_1}, \vec\varphi_1)$ and $\vec\Phi_2 = (d_{K_2}, \vec\varphi_2)$, the following inequality holds:
$$D_{match}\bigl(\rho_{(D,\vec\Phi_1),q}, \rho_{(D,\vec\Phi_2),q}\bigr) \le \max\{\delta_H(K_1, K_2), \|\vec\varphi_1 - \vec\varphi_2\|_\infty\}.$$
Proof. The Multidimensional Stability Theorem 1.5 for measuring function perturbations
implies that $D_{match}\bigl(\rho_{(D,\vec\Phi_1),q}, \rho_{(D,\vec\Phi_2),q}\bigr) \le \|\vec\Phi_1 - \vec\Phi_2\|_\infty$. It follows that
$$D_{match}\bigl(\rho_{(D,\vec\Phi_1),q}, \rho_{(D,\vec\Phi_2),q}\bigr) \le \max\{\|d_{K_1} - d_{K_2}\|_\infty, \|\vec\varphi_1 - \vec\varphi_2\|_\infty\}.$$
Hence, by equality (1.1), the claim is proved.
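The passage from the sup-norm of $\vec\Phi_1 - \vec\Phi_2$ to the maximum of the two separate norms can be spelled out as follows. This is a short sketch that uses only the max-norm on $\mathbb{R}^{k+1}$ fixed in Section 1.1 and the definitions of $\vec\Phi_1$ and $\vec\Phi_2$; it assumes, as in that section, that the maxima over D are attained.

```latex
\begin{aligned}
\|\vec\Phi_1-\vec\Phi_2\|_\infty
 &= \max_{x\in D}\,\max\bigl\{\,|d_{K_1}(x)-d_{K_2}(x)|,\ \|\vec\varphi_1(x)-\vec\varphi_2(x)\|_\infty\bigr\}\\
 &= \max\bigl\{\|d_{K_1}-d_{K_2}\|_\infty,\ \|\vec\varphi_1-\vec\varphi_2\|_\infty\bigr\}\\
 &= \max\bigl\{\delta_H(K_1,K_2),\ \|\vec\varphi_1-\vec\varphi_2\|_\infty\bigr\}
 \qquad\text{by (1.1).}
\end{aligned}
```

The second equality holds because the two maxima, over x and over the components, can be exchanged.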
We now consider the problem of retrieving the rank invariants of $(K, \vec\varphi_{|K})$ from
the rank invariants of $(D, \vec\Phi)$, with $\vec\Phi = (d_K, \vec\varphi)$. The next result shows that for any
sufficiently small value of $\beta > 0$ there exists a sufficiently small value $\alpha \in \mathbb{R}$ with
$0 \le \alpha < \beta$ such that $\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \rho_{(D,\vec\Phi),q}((\alpha, \vec u), (\beta, \vec v))$.
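Theorem 2.2. Let K be a non-empty triangulable subset of a triangulable subspace
D of $\mathbb{R}^n$. Moreover, let $\vec\varphi : D \to \mathbb{R}^k$ be a continuous function. Setting $\vec\Phi : D \to \mathbb{R}^{k+1}$,
$\vec\Phi = (d_K, \vec\varphi)$, for every $\vec u, \vec v \in \mathbb{R}^k$ with $\vec u \prec \vec v$, there exists a real number $\hat\beta > 0$
such that, for any $\beta \in \mathbb{R}$ with $0 < \beta \le \hat\beta$, there exists a real number $\hat\alpha = \hat\alpha(\beta)$, with
$0 < \hat\alpha < \beta$, for which
$$\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \rho_{(D,\vec\Phi),q}((\alpha, \vec u), (\beta, \vec v)),$$
for every $\alpha \in \mathbb{R}$ with $0 \le \alpha \le \hat\alpha$. In particular,
$$\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \lim_{\beta\to 0^+} \rho_{(D,\vec\Phi),q}((0, \vec u), (\beta, \vec v)).$$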
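Proof. For every $\vec u \in \mathbb{R}^k$, we have
$$\begin{aligned}
K\langle\vec\varphi_{|K} \preceq \vec u\rangle &= \{x \in K : \vec\varphi(x) \preceq \vec u\} \\
&= \{x \in D : d_K(x) \le 0\} \cap \{x \in D : \vec\varphi(x) \preceq \vec u\} \\
&= \{x \in D : \vec\Phi(x) \preceq (0, \vec u)\} \\
&= D\langle\vec\Phi \preceq (0, \vec u)\rangle.
\end{aligned}$$
Hence, for every $q \in \mathbb{Z}$, denoting by $\pi_q^{(\alpha,\vec u),(\beta,\vec v)}$ the homology homomorphism induced
by the inclusion $D\langle\vec\Phi \preceq (\alpha, \vec u)\rangle \to D\langle\vec\Phi \preceq (\beta, \vec v)\rangle$, with $(\alpha, \vec u) \preceq (\beta, \vec v)$, it
holds that
$$\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \operatorname{rank}\, \operatorname{im}\, \pi_q^{(0,\vec u),(0,\vec v)}.$$
We claim that there exists a positive real number $\hat\beta$ such that
$$\operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)} \cong \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$$
for every $\beta$ with $0 < \beta \le \hat\beta$ (the claim is trivial for $\beta = 0$). In particular, this fact
proves that $\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \lim_{\beta\to 0^+} \rho_{(D,\vec\Phi),q}((0, \vec u), (\beta, \vec v))$.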
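In order to prove this claim, we consider the inverse system of homomorphisms
$\pi_q^{(0,\vec u),(\beta,\vec v)} : \check H_q(D\langle\vec\Phi \preceq (0, \vec u)\rangle) \to \check H_q(D\langle\vec\Phi \preceq (\beta, \vec v)\rangle)$ over the directed set $\{\beta \in \mathbb{R} : \beta > 0\}$
decreasingly ordered. The following isomorphisms hold:
$$\operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)} \cong \operatorname{im} \varprojlim \pi_q^{(0,\vec u),(\beta,\vec v)} \cong \varprojlim \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}.$$
Indeed, $\operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)} \cong \operatorname{im} \varprojlim \pi_q^{(0,\vec u),(\beta,\vec v)}$ by the continuity of Čech homology, and
$\operatorname{im} \varprojlim \pi_q^{(0,\vec u),(\beta,\vec v)} \cong \varprojlim \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$ because the inverse limit of vector spaces is
an exact functor and therefore it preserves epimorphisms, and hence images.
It remains to prove that there exists a positive real number $\hat\beta$ such that, for
every $0 < \beta \le \hat\beta$, $\operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$ is isomorphic to $\varprojlim \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$. To this end, let
us consider the following commutative diagram, with $0 < \beta' \le \beta''$:
$$\begin{array}{ccc}
\check H_q(D\langle\vec\Phi \preceq (0, \vec u)\rangle) & \xrightarrow{\ \mathrm{id}\ } & \check H_q(D\langle\vec\Phi \preceq (0, \vec u)\rangle) \\
\Big\downarrow {\scriptstyle \pi_q^{(0,\vec u),(\beta',\vec v)}} & & \Big\downarrow {\scriptstyle \pi_q^{(0,\vec u),(\beta'',\vec v)}} \\
\check H_q(D\langle\vec\Phi \preceq (\beta', \vec v)\rangle) & \xrightarrow{\ \pi_q^{(\beta',\vec v),(\beta'',\vec v)}\ } & \check H_q(D\langle\vec\Phi \preceq (\beta'', \vec v)\rangle).
\end{array} \tag{2.1}$$
From the above diagram (2.1), we see that each $\pi_q^{(\beta',\vec v),(\beta'',\vec v)}$ induces a map
$\tau_q^{(\beta',\beta'')} : \operatorname{im} \pi_q^{(0,\vec u),(\beta',\vec v)} \to \operatorname{im} \pi_q^{(0,\vec u),(\beta'',\vec v)}$. From diagram (2.1) we see that these maps are
surjective. On the other hand, by the finiteness of the rank of $\operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)}$ and
the monotonicity of the rank invariants, there exists $\hat\beta > 0$ such that the rank of
$\pi_q^{(0,\vec u),(\beta',\vec v)}$ is finite and equal to the rank of $\pi_q^{(0,\vec u),(\beta'',\vec v)}$, whenever $0 < \beta' \le \beta'' \le \hat\beta$.
Hence the maps $\tau_q^{(\beta',\beta'')}$ are surjections between vector spaces of the same finite
dimension, i.e. isomorphisms for every $0 < \beta' \le \beta'' \le \hat\beta$. Thus, $\varprojlim \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$
is the inverse limit of a system of finite dimensional vector spaces isomorphic to
$\operatorname{im} \pi_q^{(0,\vec u),(\hat\beta,\vec v)}$, proving the claim.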
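We now claim that for every strictly positive real number $\beta$, there exists a strictly
positive real number $\hat\alpha < \beta$ such that
$$\operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)} \cong \operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$$
for every $\alpha$ with $0 \le \alpha \le \hat\alpha$.
This claim can be proved in much the same way as the previous one. We consider
the inverse system of homomorphisms $\pi_q^{(\alpha,\vec u),(\beta,\vec v)} : \check H_q(D\langle\vec\Phi \preceq (\alpha, \vec u)\rangle) \to \check H_q(D\langle\vec\Phi \preceq (\beta, \vec v)\rangle)$
over the directed set $\{\alpha \in \mathbb{R} : 0 \le \alpha < \beta\}$ decreasingly ordered. The
following isomorphisms follow again from the continuity of Čech homology and the
exactness of the inverse limit functor for vector spaces:
$$\operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)} \cong \operatorname{im} \varprojlim \pi_q^{(\alpha,\vec u),(\beta,\vec v)} \cong \varprojlim \operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}.$$
To prove that there exists a strictly positive real number $\hat\alpha$ such that, for every
$0 \le \alpha \le \hat\alpha$, $\operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$ is isomorphic to $\varprojlim \operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$, let us consider the
following commutative diagram, with $0 \le \alpha' \le \alpha''$: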
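$$\begin{array}{ccc}
\check H_q(D\langle\vec\Phi \preceq (\alpha', \vec u)\rangle) & \xrightarrow{\ \pi_q^{(\alpha',\vec u),(\alpha'',\vec u)}\ } & \check H_q(D\langle\vec\Phi \preceq (\alpha'', \vec u)\rangle) \\
\Big\downarrow {\scriptstyle \pi_q^{(\alpha',\vec u),(\beta,\vec v)}} & & \Big\downarrow {\scriptstyle \pi_q^{(\alpha'',\vec u),(\beta,\vec v)}} \\
\check H_q(D\langle\vec\Phi \preceq (\beta, \vec v)\rangle) & \xrightarrow{\ \mathrm{id}\ } & \check H_q(D\langle\vec\Phi \preceq (\beta, \vec v)\rangle).
\end{array} \tag{2.2}$$
From the above diagram (2.2), we see that each $\pi_q^{(\alpha',\vec u),(\alpha'',\vec u)}$ induces a map
$\sigma_q^{(\alpha',\alpha'')} : \operatorname{im} \pi_q^{(\alpha',\vec u),(\beta,\vec v)} \to \operatorname{im} \pi_q^{(\alpha'',\vec u),(\beta,\vec v)}$. From diagram (2.2) we see that these maps are
injective. On the other hand, by the finiteness of the rank of $\operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$, for any
$\alpha$ with $0 < \alpha < \beta$, and the monotonicity of the rank invariants, there exists $\hat\alpha$,
with $0 < \hat\alpha < \beta$, such that the rank of $\pi_q^{(\alpha',\vec u),(\beta,\vec v)}$ is finite and equal to the rank
of $\pi_q^{(\alpha'',\vec u),(\beta,\vec v)}$, whenever $0 \le \alpha' \le \alpha'' \le \hat\alpha$. Hence the maps $\sigma_q^{(\alpha',\alpha'')}$ are injections
between vector spaces of the same finite dimension, i.e. isomorphisms for every
$0 \le \alpha' \le \alpha'' \le \hat\alpha$. Thus, $\varprojlim \operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$ is the inverse limit of a system of finite
dimensional vector spaces isomorphic to $\operatorname{im} \pi_q^{(\hat\alpha,\vec u),(\beta,\vec v)}$, proving the claim.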
Many applications require that the presence of single outliers does not affect
the evaluation of similarity. In these cases, always assuming K triangulable, it
is sufficient to study the closure of the interior of K instead of K itself. Indeed,
applying Theorems 2.1 and 2.2 with the closure of the interior of K instead of
K, we obtain a result of stability of persistent homology groups with respect to
the perturbations of the studied set and a reconstruction result for the original
persistent homology groups modulo perturbations of zero measure.
We underline once more that the results of this section are based on the idea of
translating the problem of stability with respect to set perturbations into that of
stability with respect to function perturbations. Therefore, the use of the distance
function is only one among many ways to achieve this end and has the advantage
of working well when sets are compared using the Hausdorff distance. One could
conceive different ways, in connection with other methods to compare sets, as the
following sections show.
3. Stability with respect to other distances between sets
Our approach can be easily adapted to noise that is small with respect to distances other than the Hausdorff distance δH .
We first show how persistent homology can be made stable with respect to perturbations of the sets measured using the symmetric difference distance (Theorem
3.1). Then we show the stability with respect to perturbations of fuzzy sets (Theorem 3.2).
3.1. Stability with respect to the symmetric difference distance. We work
with a non-empty closed subset K of a triangulable set D in $\mathbb{R}^n$. In this case,
instead of the distance function $d_K$, our construction depends on the use of functions
$\lambda^\varepsilon_K : \mathbb{R}^n \to \mathbb{R}$, with $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$, defined as
$$\lambda^\varepsilon_K(x) = \mu(B_\varepsilon)^{-1} \cdot \int_{y\in B_\varepsilon(x)} \chi_K(y)\, dy$$
where $B_\varepsilon(x)$ denotes the n-disk centered at x with radius $\varepsilon$, $B_\varepsilon = B_\varepsilon(\vec 0)$, and $\chi_K$
denotes the characteristic function of K. The underlying idea of this choice is that
the closer a point x of the real plane is to a large part of K, the closer the value
of $\lambda^\varepsilon_K(x)$ is to 1. More precisely, $\lambda^\varepsilon_K(x) = 1$ if and only if $\mu(B_\varepsilon(x) \cap K) = \mu(B_\varepsilon)$,
whereas $\lambda^\varepsilon_K(x) = 0$ if and only if $\mu(B_\varepsilon(x) \cap K) = 0$. Clearly, $\lambda^\varepsilon_K$ is a continuous
function for every real number $\varepsilon > 0$.
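A discrete analogue of $\lambda^\varepsilon_K$ on a pixel grid may help to fix ideas. In the sketch below, which is an illustration and not the construction used in the paper, the Lebesgue measure is replaced by the counting measure on integer pixels (an assumption), and the names `disk_offsets` and `lam` are hypothetical. Adding a single outlier pixel to K changes the function by at most the measure of the symmetric difference divided by the measure of the disk.

```python
def disk_offsets(eps):
    """Integer offsets (i, j) with i^2 + j^2 <= eps^2: a pixel version of B_eps."""
    r = int(eps)
    return [(i, j) for i in range(-r, r + 1) for j in range(-r, r + 1)
            if i * i + j * j <= eps * eps]


def lam(K, x, eps):
    """Fraction of the pixel disk centred at x that lies in the pixel set K."""
    offsets = disk_offsets(eps)
    hits = sum((x[0] + i, x[1] + j) in K for i, j in offsets)
    return hits / len(offsets)


eps = 2
K1 = {(i, j) for i in range(5) for j in range(5)}   # a 5x5 block of pixels
K2 = K1 | {(10, 10)}                                # the same block plus one outlier
grid = [(i, j) for i in range(-3, 14) for j in range(-3, 14)]

sup_diff = max(abs(lam(K1, x, eps) - lam(K2, x, eps)) for x in grid)
bound = len(K1 ^ K2) / len(disk_offsets(eps))       # counting-measure analogue of mu(K1 sym. diff. K2) / mu(B_eps)
print(lam(K1, (2, 2), eps))   # 1.0: the whole disk around (2, 2) lies in K1
print(sup_diff <= bound)      # True
```

In this example the Hausdorff distance between the two pixel sets is large because of the outlier, while the sup-norm distance between the two functions stays small.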
Analogously to Theorem 2.1, in this case we have the following result.
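Theorem 3.1. Let $K_1, K_2$ be non-empty closed subsets of a triangulable subspace
D of $\mathbb{R}^n$. Moreover, let $\vec\varphi_1, \vec\varphi_2 : D \to \mathbb{R}^k$ be vector-valued continuous functions.
Then, defining $\vec\Psi^\varepsilon_1, \vec\Psi^\varepsilon_2 : D \to \mathbb{R}^{k+1}$ by $\vec\Psi^\varepsilon_1 = (-\lambda^\varepsilon_{K_1}, \vec\varphi_1)$ and $\vec\Psi^\varepsilon_2 = (-\lambda^\varepsilon_{K_2}, \vec\varphi_2)$, the
following inequality holds:
$$D_{match}\bigl(\rho_{(D,\vec\Psi^\varepsilon_1),q}, \rho_{(D,\vec\Psi^\varepsilon_2),q}\bigr) \le \max\left\{ \frac{d_\triangle(K_1, K_2)}{\mu(B_\varepsilon)},\ \|\vec\varphi_1 - \vec\varphi_2\|_\infty \right\}. \tag{3.1}$$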
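Proof. For every $x \in D$,
$$\begin{aligned}
|\lambda^\varepsilon_{K_1}(x) - \lambda^\varepsilon_{K_2}(x)| &= \mu(B_\varepsilon)^{-1} \cdot \left| \int_{y\in B_\varepsilon(x)} \chi_{K_1}(y) - \chi_{K_2}(y)\, dy \right| \\
&\le \mu(B_\varepsilon)^{-1} \cdot \int_D |\chi_{K_1}(y) - \chi_{K_2}(y)|\, dy \\
&= \mu(B_\varepsilon)^{-1} \cdot \mu(K_1 \triangle K_2).
\end{aligned}$$
Thus $\|\lambda^\varepsilon_{K_1} - \lambda^\varepsilon_{K_2}\|_\infty \le \mu(B_\varepsilon)^{-1} \cdot \mu(K_1 \triangle K_2)$. The Multidimensional Stability Theorem 1.5 for measuring function perturbations implies that
$$D_{match}\bigl(\rho_{(D,\vec\Psi^\varepsilon_1),q}, \rho_{(D,\vec\Psi^\varepsilon_2),q}\bigr) \le \|\vec\Psi^\varepsilon_1 - \vec\Psi^\varepsilon_2\|_\infty.$$
It follows that
$$\begin{aligned}
D_{match}\bigl(\rho_{(D,\vec\Psi^\varepsilon_1),q}, \rho_{(D,\vec\Psi^\varepsilon_2),q}\bigr) &\le \max\bigl\{ \|\lambda^\varepsilon_{K_1} - \lambda^\varepsilon_{K_2}\|_\infty,\ \|\vec\varphi_1 - \vec\varphi_2\|_\infty \bigr\} \\
&\le \max\bigl\{ \mu(B_\varepsilon)^{-1} \cdot \mu(K_1 \triangle K_2),\ \|\vec\varphi_1 - \vec\varphi_2\|_\infty \bigr\} \\
&= \max\bigl\{ \mu(B_\varepsilon)^{-1} \cdot d_\triangle(K_1, K_2),\ \|\vec\varphi_1 - \vec\varphi_2\|_\infty \bigr\}.
\end{aligned}$$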
The previous theorem shows that, under our hypotheses, if two compact subsets
$K_1, K_2$ of the real plane are close to each other in the sense that their symmetric
difference has a small measure, then also the rank invariants constructed using the
functions $\vec\Psi^\varepsilon_1, \vec\Psi^\varepsilon_2$ are close to each other.
We observe that the estimate in inequality (3.1) can be improved by substituting
$d_\triangle(K_1, K_2)$ with $\max_{x\in\mathbb{R}^n} \left| \int_{y\in B_\varepsilon(x)} \chi_{K_1}(y) - \chi_{K_2}(y)\, dy \right|$.
3.2. Stability with respect to perturbations of fuzzy sets. Now we consider
the case when sets are defined according to fuzzy theory, that is through functions
representing the grade of membership of points to the considered set. One obtains
a fuzzy set, for example, when a probability density p(x) is given, p(x) expressing the probability that a point of the considered set belongs to an infinitesimal
neighborhood of x. We confine ourselves to considering only probability densities
with compact support contained in a triangulable subspace D of Rn . From the
Multidimensional Stability Theorem 1.5 for measuring function perturbations we
immediately deduce the following result, whose simple proof is omitted, concerning the stability with respect to perturbations of fuzzy sets defined by probability
densities.
Theorem 3.2. Let $p_1, p_2$ be two probability density functions having support contained
in a compact and triangulable subspace D of $\mathbb{R}^n$. Defining $\vec\Psi_1, \vec\Psi_2 : D \to \mathbb{R}^{k+1}$
by $\vec\Psi_1 = (-p_1, \vec\varphi_1)$ and $\vec\Psi_2 = (-p_2, \vec\varphi_2)$, the following statement holds:
$$D_{match}\bigl(\rho_{(D,\vec\Psi_1),q}, \rho_{(D,\vec\Psi_2),q}\bigr) \le \max\{\|p_1 - p_2\|_\infty, \|\vec\varphi_1 - \vec\varphi_2\|_\infty\}.$$
4. An example
In this section the theoretical framework presented in Section 2 is applied in
a discrete setting. Our goal is to check the stability of the proposed framework
with respect to set perturbations measured by the Hausdorff distance. We confine
ourselves to the case q = 0.
With this aim in mind, we work with the binary digital image represented in
Figure 1 (left), and we corrupt this image by adding salt & pepper noise to a
neighborhood of the set of its black pixels, as shown in Figure 1 (right).
Figure 1. Two binary images of an octopus. The image on the
right is a noisy version of that on the left.
Black pixels of the left and right images represent the sets $K_1$, $K_2$ under study,
respectively, whereas in both cases the 269x256 rectangle of black and white pixels
together constitutes the set D. The noisy set $K_2$ so obtained is close to the original
set $K_1$ with respect to the Hausdorff distance.
A graph structure based on the local 4-neighbors adjacency relations of the
digital points is used in order to topologize the images.
Having fixed the point $c \in D$ corresponding to the center of mass of $K_1$, the chosen
measuring function for both instances is $\varphi : D \to \mathbb{R}$, $\varphi(p) = -\|p - c\|$.
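The construction of the two-component measuring function on a binary image can be sketched as follows. This is a toy illustration and not the code used for the experiment: the tiny image, the `#` encoding of black pixels, the brute-force distance computation and the names `black_pixels` and `build_Phi` are all assumptions made for this example.

```python
import math


def black_pixels(image):
    """Coordinates (row, column) of the pixels marked '#', i.e. the set K."""
    return [(r, c) for r, row in enumerate(image)
            for c, ch in enumerate(row) if ch == "#"]


def build_Phi(image):
    """Map each pixel p of D to (d_K(p), phi(p)) with phi(p) = -||p - c||."""
    K = black_pixels(image)
    D = [(r, c) for r in range(len(image)) for c in range(len(image[0]))]
    # c is the centre of mass of K
    centre = (sum(p[0] for p in K) / len(K), sum(p[1] for p in K) / len(K))
    return {p: (min(math.dist(p, y) for y in K), -math.dist(p, centre))
            for p in D}


image = [
    ".....",
    ".###.",
    ".#...",
    ".....",
]
K = set(black_pixels(image))
Phi = build_Phi(image)

# The first component vanishes exactly on K, so the lower level set of Phi at
# (0, u) is the lower level set of phi restricted to K at u.
u = -1.0
level = {p for p, (d, f) in Phi.items() if d <= 0 and f <= u}
print(sorted(level))
```

Pairing each pixel with its neighbours along the 4-neighbour adjacency then gives a graph on which level-set connectivity can be studied.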
Figure 2 (left) shows the persistence diagram of the 1-dimensional 0th rank
invariant ρ(K1 ,ϕ|K1 ),0 . It displays eight relevant points in the persistence diagram,
corresponding to the eight tentacles of the octopus. Only one of these points is
at infinity (and therefore depicted by a vertical line rather than by a circle) since
K1 has only one connected component. As for ρ(K2 ,ϕ|K2 ),0 , due to the presence
of a great quantity of connected components in the noisy octopus, its persistence
diagram has a very large number of points at infinity, and a figure showing them
all would be hardly readable. For this reason Figure 2 (right) shows only a small
subset of its persistence diagram. However it is sufficient to perceive how dissimilar
it is from ρ(K1 ,ϕ|K1 ),0 .
Figure 2. Left: The persistence diagram of the rank invariant
$\rho_{(K_1,\varphi_{|K_1}),0}$ corresponding to the original octopus image. Right: A
detail of the persistence diagram of the rank invariant $\rho_{(K_2,\varphi_{|K_2}),0}$
corresponding to the noisy octopus image. [Plots omitted; the axes are labelled u and v.]
As suggested by Theorem 2.1, if instead we compare $K_1$ and $K_2$ by means of
the rank invariants $\rho_{(D,\vec\Phi_1),0}$ and $\rho_{(D,\vec\Phi_2),0}$, where $\vec\Phi_1 : D \to \mathbb{R}^2$, $\vec\Phi_1 = (d_{K_1}, \varphi)$, and
$\vec\Phi_2 : D \to \mathbb{R}^2$, $\vec\Phi_2 = (d_{K_2}, \varphi)$, we can see the similarity between $K_1$ and $K_2$ modulo
the salt & pepper noise. This is illustrated in Figure 3. In Figure 3 (a)-(b), we show
the rank invariants $\rho_{(D,\vec\Phi_1),0}$ and $\rho_{(D,\vec\Phi_2),0}$ both restricted to the half-plane $\pi_{(\vec l,\vec b)}$,
with $\vec l = (0.1483, 0.9889)$ and $\vec b = (13.0434, -13.0434)$, that is the half-plane of the
foliation containing the point $((0, -100), (3, -80))$. In other words, Figure 3 (a)-(b)
shows $\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}$ and $\rho_{(D,F^{\vec\Phi_2}_{(\vec l,\vec b)}),0}$, respectively. We can appreciate their similarity,
even if their matching distance $d_{match}$ is not necessarily smaller than the Hausdorff
distance between $K_1$ and $K_2$. The considered half-plane has been chosen so that it
contains points where the rank invariant takes non-trivial values.
Indeed it is easy to verify that Theorem 2.1 does not guarantee the stability of
$\rho_{(D,F^{\vec\Phi}_{(\vec l,\vec b)}),q}$ but the stability of $\rho_{(D,\mu\cdot F^{\vec\Phi}_{(\vec l,\vec b)}),q}$, where $\mu = \min_i l_i$. We point out that
$\rho_{(D,\mu\cdot F^{\vec\Phi}_{(\vec l,\vec b)}),q}(\mu \cdot s, \mu \cdot t) = \rho_{(D,F^{\vec\Phi}_{(\vec l,\vec b)}),q}(s, t)$, and hence the passage from the measuring
function $F^{\vec\Phi}_{(\vec l,\vec b)}$ to the measuring function $\mu \cdot F^{\vec\Phi}_{(\vec l,\vec b)}$ corresponds to “rescaling
up” the domain of the rank invariant. In other words, when we change $K_1$ into a
new compact set $K_2$ that is close to $K_1$ with respect to the Hausdorff distance, the
matching distance between $\rho_{(K_1,\varphi_{|K_1}),q}$ and $\rho_{(K_2,\varphi_{|K_2}),q}$ may be not small, while the
one between $\rho_{(D,\mu\cdot F^{(d_{K_1},\varphi_{|K_1})}_{(\vec l,\vec b)}),q}$ and $\rho_{(D,\mu\cdot F^{(d_{K_2},\varphi_{|K_2})}_{(\vec l,\vec b)}),q}$ must be small.
This is illustrated in Figure 3, where the rank invariants $\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}$ and $\rho_{(D,F^{\vec\Phi_2}_{(\vec l,\vec b)}),0}$,
displayed in the top row, are not as similar as the rank invariants $\rho_{(D,\mu\cdot F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}$ and
$\rho_{(D,\mu\cdot F^{\vec\Phi_2}_{(\vec l,\vec b)}),0}$, displayed in the bottom row.
Figure 3. (a) The rank invariant $\rho_{(D,\vec\Phi_1),0}$ restricted to the half-plane
$\pi_{(\vec l,\vec b)}$, with $\vec l = (0.1483, 0.9889)$ and $\vec b = (13.0434, -13.0434)$,
that is the half-plane of the foliation containing the point
$((0, -100), (3, -80))$. (b) The rank invariant $\rho_{(D,\vec\Phi_2),0}$ restricted
to the same half-plane. (c)-(d) The same restrictions as in (a)-(b),
respectively, but rescaled by $\mu = \min\{l_1, l_2\}$. [Plots omitted; the axes are labelled s and t in panels (a)-(b), and µ·s and µ·t in panels (c)-(d).]
Next we show how it is possible to recover pointwise the rank invariant of
$(K_1, \varphi_{|K_1})$ from that of $(D, \vec\Phi_1)$. According to Theorem 2.2,
$\rho_{(K_1,\varphi_{|K_1}),0}(u, v) = \rho_{(D,\vec\Phi_1),0}((\alpha, u), (\beta, v))$ for $\alpha, \beta > 0$ sufficiently small.
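As shown in [1], in this case the following equalities hold (with reference to
Definition 1.3):
$$
\begin{aligned}
l_1(\alpha, u, \beta, v) &= \frac{\beta-\alpha}{\sqrt{(\beta-\alpha)^2+(v-u)^2}}, &
b_1(\alpha, u, \beta, v) &= \frac{\alpha(\beta+v)-\beta(\alpha+u)}{(\beta-\alpha)+(v-u)}, &
s(\alpha, u, \beta, v) &= \frac{\alpha+u}{l_1+l_2}, \\
l_2(\alpha, u, \beta, v) &= \frac{v-u}{\sqrt{(\beta-\alpha)^2+(v-u)^2}}, &
b_2(\alpha, u, \beta, v) &= \frac{u(\beta+v)-v(\alpha+u)}{(\beta-\alpha)+(v-u)}, &
t(\alpha, u, \beta, v) &= \frac{\beta+v}{l_1+l_2}.
\end{aligned}
\tag{4.1}
$$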
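Formulas (4.1) are easy to evaluate numerically. The following is a minimal Python sketch, not the authors' code; the function name `admissible_pair` is a hypothetical choice made here for illustration, and the input is assumed to satisfy α < β and u < v.

```python
import math


def admissible_pair(alpha, u, beta, v):
    """Evaluate formulas (4.1) for the points (alpha, u) and (beta, v).

    Returns the unit vector l, the vector b (with b1 + b2 = 0),
    and the parameters s, t. Assumes alpha < beta and u < v.
    """
    d_first = beta - alpha            # difference of first coordinates
    d_second = v - u                  # difference of second coordinates
    norm = math.hypot(d_first, d_second)
    l1, l2 = d_first / norm, d_second / norm

    denom = d_first + d_second
    b1 = (alpha * (beta + v) - beta * (alpha + u)) / denom
    b2 = (u * (beta + v) - v * (alpha + u)) / denom

    s = (alpha + u) / (l1 + l2)
    t = (beta + v) / (l1 + l2)
    return (l1, l2), (b1, b2), s, t


# The half-plane of Figure 3 contains the point ((0, -100), (3, -80)).
l, b, s, t = admissible_pair(0.0, -100.0, 3.0, -80.0)
print("l =", l)
print("b =", b)
print("s =", s, "t =", t)
```

For the point ((0, −100), (3, −80)) the computed pair agrees, up to rounding, with the values l = (0.1483, 0.9889) and b = (13.0434, −13.0434) quoted in the caption of Figure 3.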
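As a consequence, Theorems 2.2 and 1.4 (applied in this order) imply that for
every pair (u, v), with u < v, and for 0 < α < β, with α and β sufficiently small,
$$
\begin{aligned}
\rho_{(K_1,\varphi_{|K_1}),q}(u, v) &= \rho_{(D,\vec\Phi_1),q}((\alpha, u), (\beta, v)) \\
&= \rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),q}\bigl(s(\alpha, u, \beta, v),\ t(\alpha, u, \beta, v)\bigr) \\
&= \rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),q}\left(\frac{\alpha+u}{l_1+l_2},\ \frac{\beta+v}{l_1+l_2}\right),
\end{aligned}
$$
where $F^{\vec\Phi_1}_{(\vec l,\vec b)} : D \to \mathbb{R}$ is defined by setting, for every $x \in D$,
$$
F^{\vec\Phi_1}_{(\vec l,\vec b)}(x) = \max\left\{\frac{d_{K_1}(x) - b_1}{l_1},\ \frac{\varphi(x) - b_2}{l_2}\right\}.
$$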
Hence the finite value $\rho_{(K_1,\varphi_{|K_1}),q}(u, v)$ is equal to
$\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),q}\left(\frac{\alpha+u}{l_1+l_2}, \frac{\beta+v}{l_1+l_2}\right)$,
if we choose α and β small enough. The corresponding admissible pair $(\vec l, \vec b)$ turns out
to be close to the pair ((0, 1), (0, 0)).
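The one-dimensional measuring function defined above is a pointwise maximum of two affine rescalings. The following is a minimal Python sketch of that definition; the helper name `directional_measuring_function` and the toy data (K = {0} on the real line, ϕ(x) = x) are hypothetical and serve only as an illustration. It assumes l1, l2 > 0, as for admissible pairs.

```python
def directional_measuring_function(d_K, phi, l, b):
    """Return the function x -> max((d_K(x) - b1) / l1, (phi(x) - b2) / l2).

    d_K and phi are callables on the domain D; l = (l1, l2) must have
    positive components, b = (b1, b2) is the translation vector.
    """
    (l1, l2), (b1, b2) = l, b
    if l1 <= 0 or l2 <= 0:
        raise ValueError("both components of l must be positive")

    def F(x):
        return max((d_K(x) - b1) / l1, (phi(x) - b2) / l2)

    return F


# Hypothetical toy data: K = {0} in D = R, so d_K(x) = |x|, and phi(x) = x.
F = directional_measuring_function(abs, lambda x: x, (0.6, 0.8), (1.0, -1.0))
print(F(2.0))   # max((2 - 1) / 0.6, (2 + 1) / 0.8), approximately 3.75
```

Since l1 and l2 are positive, F(x) ≤ s holds exactly when d_K(x) ≤ s·l1 + b1 and ϕ(x) ≤ s·l2 + b2, which is why the lower level sets of F trace the filtration along the line through b with direction l.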
In other words, the information about the rank invariant of the original pair
$(K_1, \varphi_{|K_1})$ can be recovered on the leaves associated with the admissible pairs $(\vec l, \vec b)$
in a small neighborhood of the pair ((0, 1), (0, 0)), after re-parameterizing these
leaves by taking each point (u, v) to the point $\left(\frac{\alpha+u}{l_1+l_2}, \frac{\beta+v}{l_1+l_2}\right)$. Note that the pair
((0, 1), (0, 0)) is not admissible but is located on the boundary of the set Adm2 .
This leads to instabilities if we take α, β too small.
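The limiting behaviour can be read off formulas (4.1). The following is a short sketch, for fixed u < v and 0 < α < β both tending to 0; the last remark is only one plausible reading of the instability, not a statement taken from the source.

```latex
\[
  l_1 = \frac{\beta-\alpha}{\sqrt{(\beta-\alpha)^2+(v-u)^2}} \to 0, \qquad
  l_2 = \frac{v-u}{\sqrt{(\beta-\alpha)^2+(v-u)^2}} \to 1,
\]
\[
  b_1 = \frac{\alpha(\beta+v)-\beta(\alpha+u)}{(\beta-\alpha)+(v-u)}
      = \frac{\alpha v-\beta u}{(\beta-\alpha)+(v-u)} \to 0, \qquad
  b_2 = -b_1 \to 0,
\]
\[
  s = \frac{\alpha+u}{l_1+l_2} \to u, \qquad t = \frac{\beta+v}{l_1+l_2} \to v .
\]
% The pair (l, b) therefore approaches ((0,1),(0,0)), where l_1 = 0.
% The term (d_{K_1}(x) - b_1)/l_1 in the definition of F then has a
% vanishing denominator, so small numerical errors are strongly amplified.
```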
We underline that Theorem 2.2 ensures this approximation is good only point-by-point.
Thus, even for admissible pairs $(\vec l, \vec b)$ close enough to the pair ((0, 1), (0, 0)),
the matching distance between $\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),q}$ and $\rho_{(K_1,\varphi_{|K_1}),q}$ may be quite large.
To illustrate these issues we have computed the value taken by $\rho_{(K_1,\varphi_{|K_1}),0}$ at the
point (u, v) = (−100, −80), that is 8. Using formulas (4.1), we have computed the
admissible pairs $\vec l = \vec l(\alpha, \beta)$, $\vec b = \vec b(\alpha, \beta)$ such that the half-plane $\pi_{(\vec l,\vec b)}$ contains the
point ((α, −100), (β, −80)) for the values of α, β shown in Table 1. For the same
values of α, β, Table 1 also shows the parameters s = s(α, β) and t = t(α, β) for
which we have that $\rho_{(D,\vec\Phi_1),0}((\alpha, −100), (\beta, −80)) = \rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}(s, t)$.
Computations show that $\rho_{(K_1,\varphi_{|K_1}),0}(−100, −80) = \rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}(s, t) = 8$ for
small but positive values of α and β. It is worth noting that for α = 0, due to the
mentioned instabilities near the boundary of the set Adm2 , computations are not
reliable.
  α       u       β       v        s           t        $\rho_{(D,F^{\vec\Phi_1}_{(\vec l(\varepsilon),\vec b(\varepsilon))}),0}(s, t)$
 0.5    -100    24      -80    -70.5866    -39.7272     1
 0.5    -100    16      -80    -70.9216    -45.6179     3
 0.5    -100     8      -80    -77.2843    -55.9262     3
 0.5    -100     1      -80    -97.1120    -77.1040     8
 0.5    -100     0.65   -80    -98.7692    -78.7672     8
 0.3    -100     0.45   -80    -98.9677    -78.9657     8
 0.1    -100     0.25   -80    -99.1663    -79.1643     8
 0      -100     0.15   -80    -99.2655    -79.2635     0
Table 1. The parameters used to approximate the value of
$\rho_{(K_1,\varphi_{|K_1}),0}$ at (u, v) = (−100, −80), that is 8, using $\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}$.
The corresponding rank invariants $\rho_{(D,F^{\vec\Phi_1}_{(\vec l(\varepsilon),\vec b(\varepsilon))}),0}$ for the choices of α and β
considered in Table 1 are displayed in Figure 4.
5. Discussion
In this paper we have shown the stability of persistent homology groups with
respect to perturbations of the studied set. Measuring set perturbations by different
distances requires different constructions in order to achieve stability.
If set perturbations are measured through the Hausdorff distance, we replace
compact sets by distance functions. In this way, by the well-known property that if
K ′ is a good Hausdorff approximation of K then the distance function dK ′ is close
to dK , and by utilizing already available results of persistent homology stability
with respect to function perturbations, we deduce stability with respect to set
perturbations. We also show that while passing from sets to functions we are still
able to recover information about the persistent homology groups of the original
sets.
If set perturbations are measured through the symmetric difference distance, an
analogous procedure leads to the proof of stability also in this case. Finally, using
the sup-distance enables us to guarantee stability also with respect to perturbations
of fuzzy sets.
The common underlying idea is to compare sets by comparing functions describing the sets themselves.
While considering the Hausdorff distance and the symmetric difference distance
is certainly not exhaustive of all the possible ways of measuring set perturbations,
it accounts for two very widely used ones, and allows us to indicate a general
procedure that could be applied also when dealing with other distances.
Finally, we underline that the technique developed in this paper essentially relies
on the multidimensional generalization of persistent homology, showing once more
the importance of further pursuing this area of research.
References
[1] S. Biasotti, A. Cerri, P. Frosini, D. Giorgi, and C. Landi. Multidimensional size functions for
shape comparison. J. Math. Imaging Vision, 32(2):161–179, 2008.
[2] F. Cagliari, B. Di Fabio, and M. Ferri. One-dimensional reduction of multidimensional persistent homology. Posted on April 9, 2010 PII: S 0002-9939(10)10312-8 (to appear in print).
[3] F. Cagliari and C. Landi. Finiteness of rank invariants of multidimensional persistent homology groups. arXiv:1001.0358v1 [math.AT].
[4] G. Carlsson and A. Zomorodian. The theory of multidimensional persistence. Discrete &
Computational Geometry, 42(1):71–93, 2009.
[5] A. Cerri, B. Di Fabio, M. Ferri, P. Frosini, and C. Landi. Multidimensional persistent homology is stable. Technical Report, Università di Bologna, Luglio 2009.
http://amsacta.cib.unibo.it/2603/.
[6] D. Cohen-Steiner, H. Edelsbrunner, and J. Harer. Stability of persistence diagrams. Discrete
Comput. Geom., 37(1):103–120, 2007.
[7] D. Cohen-Steiner, H. Edelsbrunner, and D. Morozov. Vines and vineyards by updating persistence in linear time. In SCG ’06: Proceedings of the twenty-second annual symposium on
Computational geometry, pages 119–126, New York, NY, USA, 2006. ACM.
[8] M. C. Delfour and J.-P. Zolésio. Shapes and geometries: analysis, differential calculus, and
optimization. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2001.
[9] T. K. Dey and R. Wenger. Stability of critical points with interval persistence. Discrete
Comput. Geom., 38(3):479–512, 2007.
[10] M.-M. Deza and E. Deza. Dictionary of Distances. Elsevier, Burlington, 2006.
[11] D. Dubois and H. Prade. Fuzzy sets and systems - Theory and applications. Academic Press,
New York, 1980.
[12] H. Edelsbrunner, D. Letscher, and A. Zomorodian. Topological persistence and simplification.
Discrete & Computational Geometry, 28(4):511–533, 2002.
[13] P. Frosini. Measuring shapes by size functions. In D. P. Casasent, editor, Intelligent Robots
and Computer Vision X: Algorithms and Techniques, volume 1607, pages 122–133, 1991.
[14] P. Frosini and M. Mulazzani. Size homotopy groups for computation of natural size distances.
Bulletin of the Belgian Mathematical Society, 6(3):455–464, 1999.
[15] R. C. Veltkamp and M. Hagedoorn. State of the art in shape matching. Principles of visual
information retrieval, pages 87–119, 2001.
[16] A. Verri, C. Uras, P. Frosini, and M. Ferri. On the use of size functions for shape analysis.
Biol. Cybern., 70:99–107, 1993.
[17] L. Zadeh. Fuzzy sets. Information and Control, 8:338–353, 1965.
Dipartimento di Matematica, Università di Bologna, P.zza di Porta S. Donato 5,
I-40126 Bologna, Italia
E-mail address: frosini@dm.unibo.it
Dipartimento di Scienze e Metodi dell’Ingegneria, Università di Modena e Reggio
Emilia, Via Amendola 2, Pad. Morselli, I-42100 Reggio Emilia, Italia
E-mail address: clandi@unimore.it
[Figure 4: eight panels with axes s and t, titled α = 0.5, β = 24; α = 0.5, β = 16; α = 0.5, β = 8; α = 0.5, β = 1; α = 0.5, β = 0.65; α = 0.3, β = 0.45; α = 0.1, β = 0.25; α = 0, β = 0.15.]
Figure 4. The rank invariant $\rho_{(D,F^{\vec\Phi_1}_{(\vec l(\alpha),\vec b(\beta))}),0}$ as α and β tend
to 0. Red circles and red lines denote the points (proper or at infinity) of the corresponding persistence diagram. The blue diamonds
denote the point (s, t) corresponding to ((α, −100), (β, −80)).