
Stability of multidimensional persistent homology with respect to domain perturbations

2010, Arxiv preprint arXiv:1001.1078

arXiv:1001.1078v2 [math.AT] 10 May 2010

Patrizio Frosini and Claudia Landi

Abstract. Motivated by the problem of dealing with incomplete or imprecise acquisition of data in computer vision and computer graphics, we extend results on the stability of persistent homology with respect to function perturbations to results on stability with respect to domain perturbations. Domain perturbations can be measured in a number of different ways. An important method to compare domains is the Hausdorff distance. We show that, by encoding sets using the distance function, the multidimensional matching distance between rank invariants of persistent homology groups is always bounded above by the Hausdorff distance between sets. Moreover, we prove that our construction maintains information about the original set. Other well-known methods to compare sets are considered, such as the symmetric difference distance between classical sets and the sup-distance between fuzzy sets. In these cases too, we present results stating that the multidimensional matching distance between rank invariants of persistent homology groups is bounded above by these distances. An experiment showing the potential of our approach concludes the paper.

Introduction

Persistent topology is a theory for studying objects related to computer vision and computer graphics, by analyzing the qualitative and quantitative behavior of real-valued functions defined over topological spaces and measuring the shape properties of the topological space under study (e.g., roundness, elongation, bumpiness, color). More precisely, persistent topology studies the sequence of nested lower level sets of the considered measuring functions and encodes at which scale a topological feature (e.g., a connected component, a tunnel, a void) is created, and when it is annihilated along this filtration.
At the very beginning of the development of persistent topology, this encoding captured only the connected component changes in the lower level sets of a real-valued function, and took the name of size function [13, 16]. Some years later, it was extended to consider all homotopy groups of the lower level sets of a vector-valued function, under the name of size homotopy groups [14]. Nowadays we have a wide choice of variants for this encoding, ranging from persistent homology groups capturing the homology of a one-parameter increasing family of spaces [12], to multidimensional persistent homology groups extending the previous concept to a multi-parameter setting [4], to vineyards coping with changes in the function over time [7], to interval persistence [9], just to cite a few. In this paper we focus on multidimensional persistent homology groups. For application purposes, these groups are further encoded by considering only their rank, yielding a parametrized version of Betti numbers, called rank invariants (or persistent Betti numbers).

The stability of multidimensional rank invariants is an important issue in persistent homology theory and its applications, because a lack of stability would make this invariant useless, every data measurement being affected by noise. Stability with respect to perturbations of the measuring function was proved in [5], based on the results of [1, 2], comparing persistent homology groups by the multidimensional matching distance.

2010 Mathematics Subject Classification. Primary: 55N35; Secondary: 68T10, 68U05, 55N05.
Key words and phrases. Persistent topology, shape analysis, Čech homology, matching distance, distance function, Hausdorff distance, symmetric difference distance.
Research partially carried out within the activities of ARCES (Università di Bologna).
In this paper we consider the problem of stability with respect to changes of the topological space, which is as important as stability with respect to changes of the measuring function. Changes of the space under study can be measured in a number of different ways. Indeed, according to the kind of noise producing the perturbation, some distances are more suitable than others to compare sets. For example, the Hausdorff distance is useful to measure distortions of the domain, while the symmetric difference distance can cope with the presence of outliers.

Due to the existence of many different ways to compare sets, we propose a general approach to the problem of stability of persistent homology groups with respect to domain perturbations, and we apply this approach in a few cases. Our main idea is to reduce the problem of stability with respect to changes of the topological space to that of stability with respect to changes of the measuring functions. This is achieved by substituting the domain $K$ we are interested in with an appropriate function $f_K$ defined on a fixed set $D$ containing $K$, so that the perturbation of the set $K$ becomes a perturbation of the function $f_K$. As a consequence, the original measuring function $\vec\varphi_{|K} : K \to \mathbb{R}^k$ is replaced by a new measuring function $\vec\Phi : D \to \mathbb{R}^{k+1}$, $\vec\Phi = (f_K, \vec\varphi)$. Rank invariants of $(D, \vec\Phi)$ can be compared using the multidimensional matching distance. In this way we can prove robustness of persistent homology groups under domain perturbations.

In particular, we use this strategy when sets are compared by the Hausdorff distance. In this case, taking $f_K$ equal to the distance function from $K$, we prove that the multidimensional matching distance between the rank invariants associated with two compact sets $K_1$ and $K_2$ is always bounded above by the Hausdorff distance between $K_1$ and $K_2$ (Theorem 2.1).
At the same time, we show that, in our approach, the information about the original domain $K$ and its original measuring function $\vec\varphi$ is fully maintained in the persistent homology groups of $(D, \vec\Phi)$ (Theorem 2.2).

As a further contribution, we show stability with respect to perturbations of the domain measured using the symmetric difference distance. In this case too, by associating with $K$ a suitable function $f_K$, we can prove stability with respect to domain perturbations measured by the symmetric difference distance (Theorem 3.1). We also consider the situation where sets are described in a fuzzy sense, by means of probability density functions, easily obtaining a stability result in this case as well (Theorem 3.2). An experiment on a binary image concludes the paper, illustrating our results.

To conclude this introduction, we emphasize three key points of this paper. Firstly, there is a real need for techniques that make persistent homology robust against domain perturbations, since the classical setting is not stable in this respect, as the example in Section 4 clearly shows. Secondly, the technique we propose in order to achieve stability makes sense only in the multidimensional version of persistent homology, since it is based on passing from a $k$-dimensional measuring function to a $(k+1)$-dimensional one. Thirdly, we underline that, by replacing the study of the domain $K$ with that of a much simpler domain $D$, we obtain the desired stability without forgetting the persistent homology of $K$.

1. Preliminaries

1.1. Multidimensional persistent homology groups. The following relations $\preceq$ and $\prec$ are defined in $\mathbb{R}^k$: for $\vec u = (u_1, \ldots, u_k)$ and $\vec v = (v_1, \ldots, v_k)$, we say $\vec u \preceq \vec v$ (resp. $\vec u \prec \vec v$) if and only if $u_i \le v_i$ (resp. $u_i < v_i$) for every index $i = 1, \ldots, k$. Moreover, $\mathbb{R}^k$ is endowed with the usual max-norm: $\|(u_1, u_2, \ldots, u_k)\|_\infty = \max_{1 \le i \le k} |u_i|$.
We shall use the following notation: $\Delta^+$ will be the open set $\{(\vec u, \vec v) \in \mathbb{R}^k \times \mathbb{R}^k : \vec u \prec \vec v\}$. Given a topological space $X$, for every $k$-tuple $\vec u = (u_1, \ldots, u_k) \in \mathbb{R}^k$ and for every continuous function $\vec\varphi : X \to \mathbb{R}^k$, we shall denote by $X\langle\vec\varphi \preceq \vec u\rangle$ the lower level set $\{x \in X : \varphi_i(x) \le u_i,\ i = 1, \ldots, k\}$ and by $\|\vec\varphi\|_\infty$ the sup-norm of $\vec\varphi$, i.e. $\|\vec\varphi\|_\infty = \max_{x \in X} \|\vec\varphi(x)\|_\infty$. The function $\vec\varphi$ will be called a $k$-dimensional measuring (or filtering) function.

Definition 1.1. Let $\pi_q^{(\vec u,\vec v)} : \check H_q(X\langle\vec\varphi \preceq \vec u\rangle) \to \check H_q(X\langle\vec\varphi \preceq \vec v\rangle)$ be the homomorphism induced by the inclusion map $\pi^{(\vec u,\vec v)} : X\langle\vec\varphi \preceq \vec u\rangle \hookrightarrow X\langle\vec\varphi \preceq \vec v\rangle$ with $\vec u \preceq \vec v$, where $\check H_q$ denotes the $q$th Čech homology group. If $\vec u \prec \vec v$, the image of $\pi_q^{(\vec u,\vec v)}$ is called the multidimensional $q$th persistent homology group of $(X, \vec\varphi)$ at $(\vec u, \vec v)$, and is denoted by $\check H_q^{(\vec u,\vec v)}(X, \vec\varphi)$.

In other words, the group $\check H_q^{(\vec u,\vec v)}(X, \vec\varphi)$ contains all and only the homology classes of $q$-cycles born before $\vec u$ and still alive at $\vec v$.

In what follows, we shall work with coefficients in a field $\mathbb{K}$, so that homology groups are vector spaces, and hence torsion-free. Therefore, they can be completely described by their rank, leading to the following definition (cf. [4]).

Definition 1.2 ($q$th rank invariant). Let $X$ be a topological space, and $\vec\varphi : X \to \mathbb{R}^k$ a continuous function. Let $q \in \mathbb{Z}$. The $q$th rank invariant of the pair $(X, \vec\varphi)$ is the function $\rho_{(X,\vec\varphi),q} : \Delta^+ \to \mathbb{N} \cup \{\infty\}$ defined as

$$\rho_{(X,\vec\varphi),q}(\vec u, \vec v) = \operatorname{rank} \operatorname{im} \pi_q^{(\vec u,\vec v)}.$$

If $X$ is a triangulable space embedded in some $\mathbb{R}^n$, then $\rho_{(X,\vec\varphi),q}(\vec u, \vec v) < +\infty$ for every $(\vec u, \vec v) \in \Delta^+$ and every $q \in \mathbb{Z}$ [3, 5].

1.2. Matching distance. We now recall the construction of the distance $D_{match}$ used to compare rank invariants of multidimensional persistent homology groups.

Definition 1.3 (Admissible pairs). For every vector $\vec l = (l_1, \ldots, l_k)$ of $\mathbb{R}^k$ such that $l_i > 0$ for $i = 1, \ldots, k$ and $\sum_{i=1}^k l_i^2 = 1$, and for every vector $\vec b = (b_1, \ldots, b_k)$ of $\mathbb{R}^k$ such that $\sum_{i=1}^k b_i = 0$, we shall say that the pair $(\vec l, \vec b)$ is admissible. We shall denote the set of all admissible pairs in $\mathbb{R}^k \times \mathbb{R}^k$ by $Adm_k$. Given an admissible pair $(\vec l, \vec b)$, we define the half-plane $\pi_{(\vec l,\vec b)}$ of $\mathbb{R}^k \times \mathbb{R}^k$ by the following parametric equations:

$$\begin{cases} \vec u = s\vec l + \vec b \\ \vec v = t\vec l + \vec b \end{cases}$$

for $s, t \in \mathbb{R}$, with $s < t$.

The key property of this foliation of $\Delta^+$ by half-planes is that the restriction of $\rho_{(X,\vec\varphi),q}$ to each leaf can be seen as the 1-dimensional rank invariant associated with a suitable pair $(X, F^{\vec\varphi}_{(\vec l,\vec b)})$, where $F^{\vec\varphi}_{(\vec l,\vec b)} : X \to \mathbb{R}$. Precisely, the following statement (proved in [2]) holds:

Theorem 1.4 (Reduction Theorem). Let $(\vec l, \vec b)$ be an admissible pair and let $F^{\vec\varphi}_{(\vec l,\vec b)} : X \to \mathbb{R}$ be defined by setting

$$F^{\vec\varphi}_{(\vec l,\vec b)}(x) = \max_{i=1,\ldots,k} \left\{ \frac{\varphi_i(x) - b_i}{l_i} \right\}.$$

Then, for every $(\vec u, \vec v) = (s\vec l + \vec b, t\vec l + \vec b) \in \pi_{(\vec l,\vec b)}$ we have that

$$\rho_{(X,\vec\varphi),q}(\vec u, \vec v) = \rho_{(X,F^{\vec\varphi}_{(\vec l,\vec b)}),q}(s, t).$$

Since 1-dimensional rank invariants can be compared by a distance $d_{match}$ that matches points of persistence diagrams [5, 6], we can construct the following distance between $k$-dimensional rank invariants, called the multidimensional matching distance:

$$D_{match}\left(\rho_{(X,\vec\varphi),q}, \rho_{(X,\vec\psi),q}\right) = \sup_{(\vec l,\vec b) \in Adm_k} \min_i l_i \cdot d_{match}\left(\rho_{(X,F^{\vec\varphi}_{(\vec l,\vec b)}),q}, \rho_{(X,G^{\vec\psi}_{(\vec l,\vec b)}),q}\right),$$

where $F^{\vec\varphi}_{(\vec l,\vec b)} = \max_{i=1,\ldots,k} \left\{ \frac{\varphi_i(x) - b_i}{l_i} \right\}$ and $G^{\vec\psi}_{(\vec l,\vec b)} = \max_{i=1,\ldots,k} \left\{ \frac{\psi_i(x) - b_i}{l_i} \right\}$.

The key property of $D_{match}$ is its stability with respect to perturbations of the measuring function $\vec\varphi$, as the following theorem states (cf. [5]).

Theorem 1.5 (Multidimensional Stability Theorem). Let $X$ be triangulable. For every $q \in \mathbb{Z}$, and for every two continuous functions $\vec\varphi, \vec\psi : X \to \mathbb{R}^k$,

$$D_{match}\left(\rho_{(X,\vec\varphi),q}, \rho_{(X,\vec\psi),q}\right) \le \|\vec\varphi - \vec\psi\|_\infty.$$

1.3. Comparison of sets. The problems of description and comparison of sets can be dealt with in a myriad of different ways, each one more or less suitable than another for a given application task.
In classical set theory, the membership of elements in a set is assessed in binary terms according to a bivalent condition: an element either belongs or does not belong to the set. By contrast, in fuzzy set theory [17, 11], a fuzzy set $A$ in $X$ is characterized by a membership function $f_A : X \to [0, 1]$, with the value $f_A(x)$ representing the grade of membership of $x$ in $A$. Usually, the nearer the value of $f_A(x)$ to 1, the higher the grade of membership of $x$ in $A$. Fuzzy set theory can be used in a wide range of domains in which information is incomplete or imprecise.

If classical set theory is adopted, then a number of different dissimilarity measures exist to compare two sets [15, 10]. A frequently used dissimilarity measure is the Hausdorff distance, which is defined for arbitrary non-empty compact subsets $K_1, K_2$ of $\mathbb{R}^n$. If $K_1, K_2$ are contained in a non-empty compact subset $D$ of $\mathbb{R}^n$, the Hausdorff distance can be defined by

$$\delta_H(K_1, K_2) = \max\left\{ \max_{x \in K_2} d_{K_1}(x),\ \max_{y \in K_1} d_{K_2}(y) \right\},$$

where $d_K$ denotes the distance to $K$, that is the function $d_K : D \to \mathbb{R}$ defined by $d_K(x) = \min_{y \in K} \|x - y\|$, $\|\cdot\|$ being any norm on $\mathbb{R}^n$ (e.g., the Euclidean norm). This can be reformulated as follows (cf. [8, Ch. 4, Sect. 2.2]):

$$\delta_H(K_1, K_2) = \|d_{K_1} - d_{K_2}\|_\infty. \qquad (1.1)$$

The Hausdorff distance is robust against small deformations, but it is sensitive to outliers: a single far-away noise point drastically increases the Hausdorff distance.

A dissimilarity measure that is based on the area of the symmetric difference, such as the symmetric difference pseudo-metric, overcomes the problem of outliers. Denoting by $\mu$ the Lebesgue measure on $\mathbb{R}^n$, the symmetric difference pseudo-metric is defined between two measurable sets $A, B$ with finite measure by

$$d_\triangle(A, B) = \mu(A \triangle B),$$

where $A \triangle B = (A \cup B) \setminus (A \cap B)$ is the symmetric difference of $A$ and $B$. It holds that $d_\triangle(A, B) = 0$ if and only if $A$ and $B$ are equal almost everywhere.
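As an illustration of the two dissimilarity measures just recalled, the following minimal sketch compares the two expressions of the Hausdorff distance and the symmetric difference on a small pixel grid. It rests on assumptions that are not part of the paper: $D$ is replaced by a finite $8 \times 8$ grid, the Euclidean norm is used, the Lebesgue measure is replaced by pixel counting, and all function names are hypothetical.

```python
import math
from itertools import product


def dist_to_set(x, K):
    """d_K(x): Euclidean distance from the point x to the finite set K."""
    return min(math.dist(x, y) for y in K)


def hausdorff_classical(K1, K2):
    """max{ max_{x in K2} d_K1(x), max_{y in K1} d_K2(y) }."""
    return max(max(dist_to_set(x, K1) for x in K2),
               max(dist_to_set(y, K2) for y in K1))


def hausdorff_sup_norm(K1, K2, D):
    """Right-hand side of (1.1), with the sup taken over the finite grid D."""
    return max(abs(dist_to_set(x, K1) - dist_to_set(x, K2)) for x in D)


def symmetric_difference_measure(K1, K2, cell_area=1.0):
    """Pixel-counting stand-in for mu(K1 symmetric-difference K2)."""
    return cell_area * len(set(K1) ^ set(K2))


# Assumed toy data: a 3x3 block of pixels, and the same block plus one outlier.
D = list(product(range(8), range(8)))
K1 = [(x, y) for (x, y) in D if 2 <= x <= 4 and 2 <= y <= 4]
K2 = K1 + [(7, 7)]

delta_H = hausdorff_classical(K1, K2)
delta_H_sup = hausdorff_sup_norm(K1, K2, D)
d_sym = symmetric_difference_measure(K1, K2)
```

In this toy example the single outlier pixel at $(7, 7)$ moves the Hausdorff distance from 0 to $\sqrt{18}$, while the pixel-counting symmetric difference grows by only one cell, which mirrors the sensitivity to outliers described above.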
Identifying two sets $A$ and $B$ if $\mu(A \triangle B) = 0$, we obtain the symmetric difference metric.

Other dissimilarity measures for more restricted patterns are, for example, the bottleneck distance between finite point sets and the Fréchet distance between curves. Since many other distances could be considered, we limit our study to stability with respect to the Hausdorff and symmetric difference distances.

When fuzzy sets are used, their dissimilarity can be measured by any distance between functions. In this case we confine ourselves to the sup-norm distance between fuzzy sets.

2. Stability with respect to Hausdorff distance

Our main idea in proving stability of rank invariants with respect to noisy domains is to transform perturbations of sets into perturbations of functions. In this way it is possible to apply the Multidimensional Stability Theorem 1.5.

When domain perturbations are measured by the Hausdorff distance, in order to pass from a set $K$ to a function, we insert the distance function $d_K$ described in Subsection 1.3 as the first component of the measuring function. In this way, assuming that all the sets under study are contained in a larger set $D$, the original problem, i.e. studying the persistent homology groups of a set $K$ endowed with the restriction to $K$ of a measuring function $\vec\varphi : D \to \mathbb{R}^k$, is transformed into the new problem of studying the persistent homology groups of $D$ endowed with the measuring function $\vec\Phi = (d_K, \vec\varphi)$.

Given two domains $K_1$ and $K_2$, and two functions $\vec\varphi_1, \vec\varphi_2 : D \to \mathbb{R}^k$, our first result relates the distance $D_{match}$ between the new pairs $(D, \vec\Phi_1)$, $(D, \vec\Phi_2)$ to the change of the measuring functions $\vec\varphi_1$ and $\vec\varphi_2$, and to the Hausdorff distance between the original sets $K_1, K_2$. More precisely, it proves stability with respect to both set and function perturbations.
Indeed, the multidimensional matching distance $D_{match}$ is shown never to exceed the maximum of the Hausdorff distance between the domains $K_1$ and $K_2$ and the sup-norm distance between the measuring functions $\vec\varphi_1$ and $\vec\varphi_2$. In particular, if $\vec\varphi_1$ and $\vec\varphi_2$ coincide, then the multidimensional matching distance $D_{match}$ is never greater than the Hausdorff distance between $K_1$ and $K_2$.

Theorem 2.1. Let $K_1, K_2$ be non-empty closed subsets of a triangulable subspace $D$ of $\mathbb{R}^n$. Let $d_{K_1}, d_{K_2} : D \to \mathbb{R}$ be their respective distance functions. Moreover, let $\vec\varphi_1, \vec\varphi_2 : D \to \mathbb{R}^k$ be vector-valued continuous functions. Then, defining $\vec\Phi_1, \vec\Phi_2 : D \to \mathbb{R}^{k+1}$ by $\vec\Phi_1 = (d_{K_1}, \vec\varphi_1)$ and $\vec\Phi_2 = (d_{K_2}, \vec\varphi_2)$, the following inequality holds:

$$D_{match}\left(\rho_{(D,\vec\Phi_1),q}, \rho_{(D,\vec\Phi_2),q}\right) \le \max\left\{\delta_H(K_1, K_2), \|\vec\varphi_1 - \vec\varphi_2\|_\infty\right\}.$$

Proof. The Multidimensional Stability Theorem 1.5 for measuring function perturbations implies that $D_{match}\left(\rho_{(D,\vec\Phi_1),q}, \rho_{(D,\vec\Phi_2),q}\right) \le \|\vec\Phi_1 - \vec\Phi_2\|_\infty$. Since $\vec\Phi_1 - \vec\Phi_2 = (d_{K_1} - d_{K_2}, \vec\varphi_1 - \vec\varphi_2)$ and $\mathbb{R}^{k+1}$ carries the max-norm, it follows that

$$D_{match}\left(\rho_{(D,\vec\Phi_1),q}, \rho_{(D,\vec\Phi_2),q}\right) \le \max\left\{\|d_{K_1} - d_{K_2}\|_\infty, \|\vec\varphi_1 - \vec\varphi_2\|_\infty\right\}.$$

Hence, by equality (1.1), the claim is proved. □

We now consider the problem of retrieving the rank invariants of $(K, \vec\varphi_{|K})$ from the rank invariants of $(D, \vec\Phi)$, with $\vec\Phi = (d_K, \vec\varphi)$. The next result shows that for any sufficiently small value of $\beta > 0$ there exists a sufficiently small value $\alpha \in \mathbb{R}$ with $0 \le \alpha < \beta$ such that $\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \rho_{(D,\vec\Phi),q}((\alpha, \vec u), (\beta, \vec v))$.

Theorem 2.2. Let $K$ be a non-empty triangulable subset of a triangulable subspace $D$ of $\mathbb{R}^n$. Moreover, let $\vec\varphi : D \to \mathbb{R}^k$ be a continuous function.
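Setting $\vec\Phi : D \to \mathbb{R}^{k+1}$, $\vec\Phi = (d_K, \vec\varphi)$, for every $\vec u, \vec v \in \mathbb{R}^k$ with $\vec u \prec \vec v$, there exists a real number $\hat\beta > 0$ such that, for any $\beta \in \mathbb{R}$ with $0 < \beta \le \hat\beta$, there exists a real number $\hat\alpha = \hat\alpha(\beta)$, with $0 < \hat\alpha < \beta$, for which

$$\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \rho_{(D,\vec\Phi),q}((\alpha, \vec u), (\beta, \vec v))$$

for every $\alpha \in \mathbb{R}$ with $0 \le \alpha \le \hat\alpha$. In particular,

$$\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \lim_{\beta \to 0^+} \rho_{(D,\vec\Phi),q}((0, \vec u), (\beta, \vec v)).$$

Proof. For every $\vec u \in \mathbb{R}^k$, we have

$$\begin{aligned} K\langle\vec\varphi_{|K} \preceq \vec u\rangle &= \{x \in K : \vec\varphi(x) \preceq \vec u\} \\ &= \{x \in D : d_K(x) \le 0\} \cap \{x \in D : \vec\varphi(x) \preceq \vec u\} \\ &= \{x \in D : \vec\Phi(x) \preceq (0, \vec u)\} \\ &= D\langle\vec\Phi \preceq (0, \vec u)\rangle. \end{aligned}$$

Hence, for every $q \in \mathbb{Z}$, denoting by $\pi_q^{(\alpha,\vec u),(\beta,\vec v)}$ the homology homomorphism induced by the inclusion $D\langle\vec\Phi \preceq (\alpha, \vec u)\rangle \to D\langle\vec\Phi \preceq (\beta, \vec v)\rangle$, with $(\alpha, \vec u) \preceq (\beta, \vec v)$, it holds that

$$\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \operatorname{rank} \operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)}.$$

We claim that there exists a positive real number $\hat\beta$ such that

$$\operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)} \cong \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$$

for every $\beta$ with $0 < \beta \le \hat\beta$ (the claim is trivial for $\beta = 0$). In particular, this fact proves that $\rho_{(K,\vec\varphi_{|K}),q}(\vec u, \vec v) = \lim_{\beta \to 0^+} \rho_{(D,\vec\Phi),q}((0, \vec u), (\beta, \vec v))$.

In order to prove this claim, we consider the inverse system of homomorphisms $\pi_q^{(0,\vec u),(\beta,\vec v)} : \check H_q(D\langle\vec\Phi \preceq (0, \vec u)\rangle) \to \check H_q(D\langle\vec\Phi \preceq (\beta, \vec v)\rangle)$ over the directed set $\{\beta \in \mathbb{R} : \beta > 0\}$ decreasingly ordered. The following isomorphisms hold:

$$\operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)} \cong \operatorname{im} \varprojlim \pi_q^{(0,\vec u),(\beta,\vec v)} \cong \varprojlim \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}.$$

Indeed, $\operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)} \cong \operatorname{im} \varprojlim \pi_q^{(0,\vec u),(\beta,\vec v)}$ by the continuity of Čech homology, and $\operatorname{im} \varprojlim \pi_q^{(0,\vec u),(\beta,\vec v)} \cong \varprojlim \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$ because the inverse limit of vector spaces is an exact functor and therefore it preserves epimorphisms, and hence images.

It remains to prove that there exists a positive real number $\hat\beta$ such that, for every $0 < \beta \le \hat\beta$, $\operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$ is isomorphic to $\varprojlim \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$. To this end, let us consider the following commutative diagram, with $0 < \beta' \le \beta''$:

$$\begin{array}{ccc} \check H_q(D\langle\vec\Phi \preceq (0, \vec u)\rangle) & \xrightarrow{\ \mathrm{id}\ } & \check H_q(D\langle\vec\Phi \preceq (0, \vec u)\rangle) \\ \Big\downarrow {\scriptstyle \pi_q^{(0,\vec u),(\beta',\vec v)}} & & \Big\downarrow {\scriptstyle \pi_q^{(0,\vec u),(\beta'',\vec v)}} \\ \check H_q(D\langle\vec\Phi \preceq (\beta', \vec v)\rangle) & \xrightarrow{\ \pi_q^{(\beta',\vec v),(\beta'',\vec v)}\ } & \check H_q(D\langle\vec\Phi \preceq (\beta'', \vec v)\rangle) \end{array} \qquad (2.1)$$

From the above diagram (2.1), we see that each $\pi_q^{(\beta',\vec v),(\beta'',\vec v)}$ induces a map $\tau_q^{(\beta',\beta'')} : \operatorname{im} \pi_q^{(0,\vec u),(\beta',\vec v)} \to \operatorname{im} \pi_q^{(0,\vec u),(\beta'',\vec v)}$. From diagram (2.1) we see that these maps are surjective. On the other hand, by the finiteness of the rank of $\operatorname{im} \pi_q^{(0,\vec u),(0,\vec v)}$ and the monotonicity of the rank invariants, there exists $\hat\beta > 0$ such that the rank of $\pi_q^{(0,\vec u),(\beta',\vec v)}$ is finite and equal to the rank of $\pi_q^{(0,\vec u),(\beta'',\vec v)}$, whenever $0 < \beta' \le \beta'' \le \hat\beta$. Hence the maps $\tau_q^{(\beta',\beta'')}$ are surjections between vector spaces of the same finite dimension, i.e. isomorphisms for every $0 < \beta' \le \beta'' \le \hat\beta$. Thus, $\varprojlim \operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)}$ is the inverse limit of a system of finite-dimensional vector spaces isomorphic to $\operatorname{im} \pi_q^{(0,\vec u),(\hat\beta,\vec v)}$, proving the claim.

We now claim that for every strictly positive real number $\beta$, there exists a strictly positive real number $\hat\alpha < \beta$ such that

$$\operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)} \cong \operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$$

for every $\alpha$ with $0 \le \alpha \le \hat\alpha$.

This claim can be proved in much the same way as the previous one. We consider the inverse system of homomorphisms $\pi_q^{(\alpha,\vec u),(\beta,\vec v)} : \check H_q(D\langle\vec\Phi \preceq (\alpha, \vec u)\rangle) \to \check H_q(D\langle\vec\Phi \preceq (\beta, \vec v)\rangle)$ over the directed set $\{\alpha \in \mathbb{R} : 0 \le \alpha < \beta\}$ decreasingly ordered. The following isomorphisms follow again from the continuity of Čech homology and the exactness of the inverse limit functor for vector spaces:

$$\operatorname{im} \pi_q^{(0,\vec u),(\beta,\vec v)} \cong \operatorname{im} \varprojlim \pi_q^{(\alpha,\vec u),(\beta,\vec v)} \cong \varprojlim \operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}.$$

To prove that there exists a strictly positive real number $\hat\alpha$ such that, for every $0 \le \alpha \le \hat\alpha$, $\operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$ is isomorphic to $\varprojlim \operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$, let us consider the following commutative diagram, with $0 \le \alpha' \le \alpha''$:

$$\begin{array}{ccc} \check H_q(D\langle\vec\Phi \preceq (\alpha', \vec u)\rangle) & \xrightarrow{\ \pi_q^{(\alpha',\vec u),(\alpha'',\vec u)}\ } & \check H_q(D\langle\vec\Phi \preceq (\alpha'', \vec u)\rangle) \\ \Big\downarrow {\scriptstyle \pi_q^{(\alpha',\vec u),(\beta,\vec v)}} & & \Big\downarrow {\scriptstyle \pi_q^{(\alpha'',\vec u),(\beta,\vec v)}} \\ \check H_q(D\langle\vec\Phi \preceq (\beta, \vec v)\rangle) & \xrightarrow{\ \mathrm{id}\ } & \check H_q(D\langle\vec\Phi \preceq (\beta, \vec v)\rangle) \end{array} \qquad (2.2)$$

From the above diagram (2.2), we see that each $\pi_q^{(\alpha',\vec u),(\alpha'',\vec u)}$ induces a map $\sigma_q^{(\alpha',\alpha'')} : \operatorname{im} \pi_q^{(\alpha',\vec u),(\beta,\vec v)} \to \operatorname{im} \pi_q^{(\alpha'',\vec u),(\beta,\vec v)}$. From diagram (2.2) we see that these maps are injective. On the other hand, by the finiteness of the rank of $\operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$, for any $\alpha$ with $0 < \alpha < \beta$, and the monotonicity of the rank invariants, there exists $\hat\alpha$, with $0 < \hat\alpha < \beta$, such that the rank of $\pi_q^{(\alpha',\vec u),(\beta,\vec v)}$ is finite and equal to the rank of $\pi_q^{(\alpha'',\vec u),(\beta,\vec v)}$, whenever $0 \le \alpha' \le \alpha'' \le \hat\alpha$. Hence the maps $\sigma_q^{(\alpha',\alpha'')}$ are injections between vector spaces of the same finite dimension, i.e. isomorphisms for every $0 \le \alpha' \le \alpha'' \le \hat\alpha$. Thus, $\varprojlim \operatorname{im} \pi_q^{(\alpha,\vec u),(\beta,\vec v)}$ is the inverse limit of a system of finite-dimensional vector spaces isomorphic to $\operatorname{im} \pi_q^{(\hat\alpha,\vec u),(\hat\beta,\vec v)}$, proving the claim. □

Many applications require that the presence of single outliers does not affect the evaluation of similarity. In these cases, always assuming $K$ triangulable, it is sufficient to study the closure of the interior of $K$ instead of $K$ itself. Indeed, applying Theorems 2.1 and 2.2 with the closure of the interior of $K$ instead of $K$, we obtain a stability result for persistent homology groups with respect to perturbations of the studied set, and a reconstruction result for the original persistent homology groups modulo perturbations of zero measure.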
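The first step of the proof of Theorem 2.2, namely the identity between the lower level sets of $(K, \vec\varphi_{|K})$ at $\vec u$ and those of $(D, \vec\Phi)$ at $(0, \vec u)$, can be checked on a discrete example. The sketch below makes assumptions that are not in the paper: $D$ is a $10 \times 10$ pixel grid, $K$ is a $4 \times 4$ block, $k = 1$, and the measuring function is the one used in Section 4, $\varphi(p) = -\|p - c\|$, with a hand-picked center $c$. All names are hypothetical.

```python
import math
from itertools import product


def sublevel(D, Phi, bound):
    """D<Phi <= bound>: points of D where each component of Phi is <= the
    corresponding component of bound."""
    return {x for x in D if all(p <= b for p, b in zip(Phi(x), bound))}


# Assumed toy data: D is a pixel grid, K a square block inside it.
D = list(product(range(10), range(10)))
K = [(x, y) for (x, y) in D if 3 <= x <= 6 and 3 <= y <= 6]
c = (4.5, 4.5)  # assumed center of mass of K


def phi(p):
    """Measuring function of Section 4: minus the distance from c."""
    return -math.dist(p, c)


def d_K(p):
    """Distance function from K (Euclidean norm)."""
    return min(math.dist(p, y) for y in K)


def Phi(p):
    """New measuring function Phi = (d_K, phi) defined on all of D."""
    return (d_K(p), phi(p))


u = -1.0
level_set_in_K = {x for x in K if phi(x) <= u}    # K<phi|K <= u>
level_set_in_D = sublevel(D, Phi, (0.0, u))       # D<Phi <= (0, u)>
thickened = sublevel(D, Phi, (1.0, u))            # D<Phi <= (alpha, u)>, alpha = 1
```

With $\alpha = 0$ the two level sets coincide, because $d_K(x) \le 0$ holds exactly on $K$. With $\alpha = 1$ the level set of $\vec\Phi$ also picks up pixels at distance at most 1 from $K$, which is the kind of enlargement that the second half of the proof controls for small $\alpha$.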
We underline once more that the results of this section are based on the idea of translating the problem of stability with respect to set perturbations into that of stability with respect to function perturbations. Therefore, the use of the distance function is only one among many ways to achieve this end, and has the advantage of working well when sets are compared using the Hausdorff distance. One could conceive different ways, in connection with other methods to compare sets, as the following sections show.

3. Stability with respect to other distances between sets

Our approach can be easily adapted to noise that is small with respect to distances other than the Hausdorff distance $\delta_H$. We first show how persistent homology can be made stable with respect to perturbations of the sets measured using the symmetric difference distance (Theorem 3.1). Then we show the stability with respect to perturbations of fuzzy sets (Theorem 3.2).

3.1. Stability with respect to the symmetric difference distance. We work with a non-empty closed subset $K$ of a triangulable set $D$ in $\mathbb{R}^n$. In this case, instead of the distance function $d_K$, our construction depends on the use of functions $\lambda_K^\varepsilon : \mathbb{R}^n \to \mathbb{R}$, with $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$, defined as

$$\lambda_K^\varepsilon(x) = \mu(B_\varepsilon)^{-1} \cdot \int_{y \in B_\varepsilon(x)} \chi_K(y)\, dy,$$

where $B_\varepsilon(x)$ denotes the $n$-disk centered at $x$ with radius $\varepsilon$, $B_\varepsilon = B_\varepsilon(\vec 0)$, and $\chi_K$ denotes the characteristic function of $K$.

The underlying idea of this choice is that the closer a point $x$ of the real plane is to a large part of $K$, the closer the value of $\lambda_K^\varepsilon(x)$ is to 1. More precisely, $\lambda_K^\varepsilon(x) = 1$ if and only if $\mu(B_\varepsilon(x) \cap K) = \mu(B_\varepsilon)$, whereas $\lambda_K^\varepsilon(x) = 0$ if and only if $\mu(B_\varepsilon(x) \cap K) = 0$. Clearly, $\lambda_K^\varepsilon$ is a continuous function for every real number $\varepsilon > 0$.

Analogously to Theorem 2.1, in this case we have the following result.

Theorem 3.1. Let $K_1, K_2$ be non-empty closed subsets of a triangulable subspace $D$ of $\mathbb{R}^n$. Moreover, let $\vec\varphi_1, \vec\varphi_2 : D \to \mathbb{R}^k$ be vector-valued continuous functions. Then, defining $\vec\Psi_1^\varepsilon, \vec\Psi_2^\varepsilon : D \to \mathbb{R}^{k+1}$ by $\vec\Psi_1^\varepsilon = (-\lambda_{K_1}^\varepsilon, \vec\varphi_1)$ and $\vec\Psi_2^\varepsilon = (-\lambda_{K_2}^\varepsilon, \vec\varphi_2)$, the following inequality holds:

$$D_{match}\left(\rho_{(D,\vec\Psi_1^\varepsilon),q}, \rho_{(D,\vec\Psi_2^\varepsilon),q}\right) \le \max\left\{\frac{d_\triangle(K_1, K_2)}{\mu(B_\varepsilon)}, \|\vec\varphi_1 - \vec\varphi_2\|_\infty\right\}. \qquad (3.1)$$

Proof. For every $x \in D$,

$$\begin{aligned} |\lambda_{K_1}^\varepsilon(x) - \lambda_{K_2}^\varepsilon(x)| &= \mu(B_\varepsilon)^{-1} \cdot \left| \int_{y \in B_\varepsilon(x)} \left(\chi_{K_1}(y) - \chi_{K_2}(y)\right) dy \right| \\ &\le \mu(B_\varepsilon)^{-1} \cdot \int_D |\chi_{K_1}(y) - \chi_{K_2}(y)|\, dy \\ &= \mu(B_\varepsilon)^{-1} \cdot \mu(K_1 \triangle K_2). \end{aligned}$$

Thus $\|\lambda_{K_1}^\varepsilon - \lambda_{K_2}^\varepsilon\|_\infty \le \mu(B_\varepsilon)^{-1} \cdot \mu(K_1 \triangle K_2)$. The Multidimensional Stability Theorem 1.5 for measuring function perturbations implies that

$$D_{match}\left(\rho_{(D,\vec\Psi_1^\varepsilon),q}, \rho_{(D,\vec\Psi_2^\varepsilon),q}\right) \le \|\vec\Psi_1^\varepsilon - \vec\Psi_2^\varepsilon\|_\infty.$$

It follows that

$$\begin{aligned} D_{match}\left(\rho_{(D,\vec\Psi_1^\varepsilon),q}, \rho_{(D,\vec\Psi_2^\varepsilon),q}\right) &\le \max\left\{\|\lambda_{K_1}^\varepsilon - \lambda_{K_2}^\varepsilon\|_\infty, \|\vec\varphi_1 - \vec\varphi_2\|_\infty\right\} \\ &\le \max\left\{\mu(B_\varepsilon)^{-1} \cdot \mu(K_1 \triangle K_2), \|\vec\varphi_1 - \vec\varphi_2\|_\infty\right\} \\ &= \max\left\{\mu(B_\varepsilon)^{-1} \cdot d_\triangle(K_1, K_2), \|\vec\varphi_1 - \vec\varphi_2\|_\infty\right\}. \end{aligned}$$

□

The previous theorem shows that, under our hypotheses, if two compact subsets $K_1, K_2$ of the real plane are close to each other in the sense that their symmetric difference has a small measure, then the rank invariants constructed using the functions $\vec\Psi_1^\varepsilon, \vec\Psi_2^\varepsilon$ are also close to each other.

We observe that the estimate in inequality (3.1) can be improved by substituting $d_\triangle(K_1, K_2)$ with $\max_{x \in \mathbb{R}^n} \left| \int_{y \in B_\varepsilon(x)} \left(\chi_{K_1}(y) - \chi_{K_2}(y)\right) dy \right|$.

3.2. Stability with respect to perturbations of fuzzy sets. Now we consider the case when sets are defined according to fuzzy theory, that is, through functions representing the grade of membership of points in the considered set. One obtains a fuzzy set, for example, when a probability density $p(x)$ is given, $p(x)$ expressing the probability that a point of the considered set belongs to an infinitesimal neighborhood of $x$. We confine ourselves to considering only probability densities with compact support contained in a triangulable subspace $D$ of $\mathbb{R}^n$.
From the Multidimensional Stability Theorem 1.5 for measuring function perturbations we immediately deduce the following result, whose simple proof is omitted, concerning the stability with respect to perturbations of fuzzy sets defined by probability densities.

Theorem 3.2. Let $p_1, p_2$ be two probability density functions having support contained in a compact and triangulable subspace $D$ of $\mathbb{R}^n$. Defining $\vec\Psi_1, \vec\Psi_2 : D \to \mathbb{R}^{k+1}$ by $\vec\Psi_1 = (-p_1, \vec\varphi_1)$ and $\vec\Psi_2 = (-p_2, \vec\varphi_2)$, the following statement holds:

$$D_{match}\left(\rho_{(D,\vec\Psi_1),q}, \rho_{(D,\vec\Psi_2),q}\right) \le \max\left\{\|p_1 - p_2\|_\infty, \|\vec\varphi_1 - \vec\varphi_2\|_\infty\right\}.$$

4. An example

In this section the theoretical framework presented in Section 2 is applied in a discrete setting. Our goal is to check the stability of the proposed framework with respect to set perturbations measured by the Hausdorff distance. We confine ourselves to the case $q = 0$.

With this aim in mind, we work with the binary digital image represented in Figure 1 (left), and we corrupt this image by adding salt & pepper noise to a neighborhood of the set of its black pixels, as shown in Figure 1 (right).

Figure 1. Two binary images of an octopus. The image on the right is a noisy version of that on the left. Black pixels of the left and right images represent the sets $K_1, K_2$ under study, respectively, whereas in both cases the 269×256 rectangle of black and white pixels together constitutes the set $D$.

The noisy set $K_2$ obtained in this way is close to the original set $K_1$ with respect to the Hausdorff distance. A graph structure based on the local 4-neighbor adjacency relations of the digital points is used in order to topologize the images. Having fixed the point $c \in D$ corresponding to the center of mass of $K_1$, the chosen measuring function for both instances is $\varphi : D \to \mathbb{R}$, $\varphi(p) = -\|p - c\|$.

Figure 2 (left) shows the persistence diagram of the 1-dimensional 0th rank invariant $\rho_{(K_1,\varphi_{|K_1}),0}$.
It displays eight relevant points in the persistence diagram, corresponding to the eight tentacles of the octopus. Only one of these points is at infinity (and therefore depicted by a vertical line rather than by a circle), since $K_1$ has only one connected component. As for $\rho_{(K_2,\varphi_{|K_2}),0}$, due to the large number of connected components in the noisy octopus, its persistence diagram has a very large number of points at infinity, and a figure showing them all would be hardly readable. For this reason Figure 2 (right) shows only a small subset of its persistence diagram. However, it is sufficient to perceive how dissimilar it is from $\rho_{(K_1,\varphi_{|K_1}),0}$.

[Figure 2: two persistence diagrams, with horizontal axis $u$ and vertical axis $v$.]

Figure 2. Left: The persistence diagram of the rank invariant $\rho_{(K_1,\varphi_{|K_1}),0}$ corresponding to the original octopus image. Right: A detail of the persistence diagram of the rank invariant $\rho_{(K_2,\varphi_{|K_2}),0}$ corresponding to the noisy octopus image.

As suggested by Theorem 2.1, if instead we compare $K_1$ and $K_2$ by means of the rank invariants $\rho_{(D,\vec\Phi_1),0}$ and $\rho_{(D,\vec\Phi_2),0}$, where $\vec\Phi_1 : D \to \mathbb{R}^2$, $\vec\Phi_1 = (d_{K_1}, \varphi)$, and $\vec\Phi_2 : D \to \mathbb{R}^2$, $\vec\Phi_2 = (d_{K_2}, \varphi)$, we can see the similarity between $K_1$ and $K_2$ modulo the salt & pepper noise. This is illustrated in Figure 3. In Figure 3 (a)-(b), we show the rank invariants $\rho_{(D,\vec\Phi_1),0}$ and $\rho_{(D,\vec\Phi_2),0}$, both restricted to the half-plane $\pi_{(\vec l,\vec b)}$, with $\vec l = (0.1483, 0.9889)$ and $\vec b = (13.0434, -13.0434)$, that is the half-plane of the foliation containing the point $((0, -100), (3, -80))$. In other words, Figure 3 (a)-(b) shows $\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}$ and $\rho_{(D,F^{\vec\Phi_2}_{(\vec l,\vec b)}),0}$, respectively. We can appreciate their similarity, even if their matching distance $d_{match}$ is not necessarily smaller than the Hausdorff distance between $K_1$ and $K_2$.
The considered half-plane has been chosen so that it contains points where the rank invariant takes non-trivial values. Indeed, it is easy to verify that Theorem 2.1 does not guarantee the stability of $\rho_{(D,F^{\vec\Phi}_{(\vec l,\vec b)}),q}$ but the stability of $\rho_{(D,\mu \cdot F^{\vec\Phi}_{(\vec l,\vec b)}),q}$, where $\mu = \min_i l_i$. We point out that

$$\rho_{(D,\mu \cdot F^{\vec\Phi}_{(\vec l,\vec b)}),q}(\mu \cdot s, \mu \cdot t) = \rho_{(D,F^{\vec\Phi}_{(\vec l,\vec b)}),q}(s, t),$$

and hence the passage from the measuring function $F^{\vec\Phi}_{(\vec l,\vec b)}$ to the measuring function $\mu \cdot F^{\vec\Phi}_{(\vec l,\vec b)}$ corresponds to "rescaling up" the domain of the rank invariant.

In other words, when we change $K_1$ into a new compact set $K_2$ that is close to $K_1$ with respect to the Hausdorff distance, the matching distance between $\rho_{(K_1,\varphi_{|K_1}),q}$ and $\rho_{(K_2,\varphi_{|K_2}),q}$ may not be small, while the one between $\rho_{(D,\mu \cdot F^{(d_{K_1},\varphi)}_{(\vec l,\vec b)}),q}$ and $\rho_{(D,\mu \cdot F^{(d_{K_2},\varphi)}_{(\vec l,\vec b)}),q}$ must be small.

This is illustrated in Figure 3, where the rank invariants $\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}$ and $\rho_{(D,F^{\vec\Phi_2}_{(\vec l,\vec b)}),0}$, displayed in the top row, are not as similar as the rank invariants $\rho_{(D,\mu \cdot F^{\vec\Phi_1}_{(\vec l,\vec b)}),0}$ and $\rho_{(D,\mu \cdot F^{\vec\Phi_2}_{(\vec l,\vec b)}),0}$, displayed in the bottom row.

[Figure 3: four panels. Panels (a) and (b) have axes $s$ and $t$; panels (c) and (d) have axes $\mu \cdot s$ and $\mu \cdot t$.]

Figure 3. (a) The rank invariant $\rho_{(D,\vec\Phi_1),0}$ restricted to the half-plane $\pi_{(\vec l,\vec b)}$, with $\vec l = (0.1483, 0.9889)$ and $\vec b = (13.0434, -13.0434)$, that is the half-plane of the foliation containing the point $((0, -100), (3, -80))$. (b) The rank invariant $\rho_{(D,\vec\Phi_2),0}$ restricted to the same half-plane. (c)-(d) The same restrictions as in (a)-(b), respectively, but rescaled by $\mu = \min\{l_1, l_2\}$.

Next we show how it is possible to recover pointwise the rank invariant of $(K_1, \varphi_{|K_1})$ from that of $(D, \vec\Phi_1)$. According to Theorem 2.2, $\rho_{(K_1,\varphi_{|K_1}),0}(u, v) = \rho_{(D,\vec\Phi_1),0}((\alpha, u), (\beta, v))$ for $\alpha, \beta > 0$ sufficiently small.

As shown in [1], in this case the following equalities hold (with reference to Definition 1.3):

$$\begin{aligned} l_1(\alpha, u, \beta, v) &= \frac{\beta - \alpha}{\sqrt{(\beta - \alpha)^2 + (v - u)^2}}, & l_2(\alpha, u, \beta, v) &= \frac{v - u}{\sqrt{(\beta - \alpha)^2 + (v - u)^2}}, \\ b_1(\alpha, u, \beta, v) &= \frac{\alpha(\beta + v) - \beta(\alpha + u)}{(\beta - \alpha) + (v - u)}, & b_2(\alpha, u, \beta, v) &= \frac{u(\beta + v) - v(\alpha + u)}{(\beta - \alpha) + (v - u)}, \\ s(\alpha, u, \beta, v) &= \frac{\alpha + u}{l_1 + l_2}, & t(\alpha, u, \beta, v) &= \frac{\beta + v}{l_1 + l_2}. \end{aligned} \qquad (4.1)$$

As a consequence, Theorems 2.2 and 1.4 (applied in this order) imply that for every pair $(u, v)$, with $u < v$, and for $0 < \alpha < \beta$, with $\alpha$ and $\beta$ sufficiently small,

$$\begin{aligned} \rho_{(K_1,\varphi_{|K_1}),q}(u, v) &= \rho_{(D,\vec\Phi_1),q}((\alpha, u), (\beta, v)) \\ &= \rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),q}\left(s(\alpha, u, \beta, v), t(\alpha, u, \beta, v)\right) \\ &= \rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),q}\left(\frac{\alpha + u}{l_1 + l_2}, \frac{\beta + v}{l_1 + l_2}\right), \end{aligned}$$

where $F^{\vec\Phi_1}_{(\vec l,\vec b)} : D \to \mathbb{R}$ is defined by setting, for every $x \in D$,

$$F^{\vec\Phi_1}_{(\vec l,\vec b)}(x) = \max\left\{\frac{d_{K_1}(x) - b_1}{l_1}, \frac{\varphi(x) - b_2}{l_2}\right\}.$$

Hence the finite value $\rho_{(K_1,\varphi_{|K_1}),q}(u, v)$ is equal to $\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),q}\left(\frac{\alpha + u}{l_1 + l_2}, \frac{\beta + v}{l_1 + l_2}\right)$, if we choose $\alpha$ and $\beta$ small enough. The corresponding admissible pair $(\vec l, \vec b)$ turns out to be close to the pair $((0, 1), (0, 0))$.

In other words, the information about the rank invariant of the original pair $(K_1, \varphi_{|K_1})$ can be recovered on the leaves associated with the admissible pairs $(\vec l, \vec b)$ in a small neighborhood of the pair $((0, 1), (0, 0))$, after re-parameterizing these leaves by taking each point $(u, v)$ to the point $\left(\frac{\alpha + u}{l_1 + l_2}, \frac{\beta + v}{l_1 + l_2}\right)$. Note that the pair $((0, 1), (0, 0))$ is not admissible but is located on the boundary of the set $Adm_2$. This leads to instabilities if we take $\alpha, \beta$ too small.

We underline that Theorem 2.2 ensures this approximation is good only point by point. Thus, even for admissible pairs $(\vec l, \vec b)$ close enough to the pair $((0, 1), (0, 0))$, the matching distance between $\rho_{(D,F^{\vec\Phi_1}_{(\vec l,\vec b)}),q}$ and $\rho_{(K_1,\varphi_{|K_1}),q}$ may be quite large.
Φ 1 D,F (~ l,~ b) ,q To illustrate these issues we have computed the value taken by ρ(K1 ,ϕ|K1 ),0 at the point (u, v) = (−100, −80), that is 8. Using formulas (4.1), we have computed the admissible pairs ~l = ~l(α, β), ~b = ~b(α, β) such that the half-plane π(~l,~b) contains the point ((α, −100), (β, −80)) for the values of α, β shown in Table 4. For the same values of α, β, Table 4 also shows the parameters s = s(α, β) and t = t(α, β) for  (s, t).  which we have that ρ(D,Φ ~ ~ 1 ),0 ((α, −100), (β, −80)) = ρ Φ 1 D,F (~ l,~ b) Computations show that ρ(K1 ,ϕ|K1 ),0 (−100, −80) = ,0  (s, t) ρ ~ Φ D,F ~1~ ,0 = 8 for (l,b) small but positive values of α and β. It is noticeable that for α = 0, due to the mentioned instabilities near the boundary of the set Adm2 , computations are not reliable. 14 PATRIZIO FROSINI AND CLAUDIA LANDI α u β v 0.5 0.5 0.5 0.5 0.5 0.3 0.1 0 -100 -100 -100 -100 -100 -100 -100 -100 24 16 8 1 0.65 0.45 0.25 0.15 -80 -80 -80 -80 -80 -80 -80 -80 s t ρ -70.5866 -39.7272 -70.9216 -45.6179 -77.2843 -55.9262 -97.1120 -77.1040 -98.7692 -78..7672 -98.9677 -78.9657 -99.1663 -79.1643 -99.2655 -79.2635 ~ Φ 1 (~ l(ε),~ b(ε)) D,F  ,0 (s, t) 1 3 3 8 8 8 8 0 Table 1. The parameters used to approximate the value of  . ρ(K1 ,ϕ|K1 ),0 at (u, v) = (−100, −80), that is 8, using ρ ~ Φ 1 D,F (~ l,~ b) The corresponding rank invariants ρ ~ Φ 1 (~ l(ε),~ b(ε)) D,F  ,0 ,0 for the choices of α and β considered in Table 4 are displayed in Figure 4. 5. Discussion In this paper we have shown the stability of persistent homology groups with respect to perturbations of the studied set. Measuring set perturbations by different distances requires different constructions in order to achieve stability. If set perturbations are measured through the Hausdorff distance, we replace compact sets by distance functions. 
In this way, by the well-known property that if $K'$ is a good Hausdorff approximation of $K$ then the distance function $d_{K'}$ is close to $d_K$, and by utilizing already available results of persistent homology stability with respect to function perturbations, we deduce stability with respect to set perturbations. We also show that while passing from sets to functions we are still able to recover information about the persistent homology groups of the original sets. If set perturbations are measured through the symmetric difference distance, an analogous procedure leads to the proof of stability also in this case. Finally, using the sup-distance enables us to guarantee stability also with respect to perturbations of fuzzy sets. The common underlying idea is to compare sets by comparing functions describing the sets themselves.

While considering the Hausdorff distance and the symmetric difference distance is certainly not exhaustive of all the possible ways of measuring set perturbations, it accounts for two very widely used ones, and allows us to indicate a general procedure that could be applied also when dealing with other distances.

Finally, we underline that the technique developed in this paper essentially relies on the multidimensional generalization of persistent homology, showing once more the importance of further pursuing this area of research.

References

[1] S. Biasotti, A. Cerri, P. Frosini, D. Giorgi, and C. Landi. Multidimensional size functions for shape comparison. J. Math. Imaging Vision, 32(2):161–179, 2008.
[2] F. Cagliari, B. Di Fabio, and M. Ferri. One-dimensional reduction of multidimensional persistent homology. Posted on April 9, 2010 PII: S 0002-9939(10)10312-8 (to appear in print).
[3] F. Cagliari and C. Landi. Finiteness of rank invariants of multidimensional persistent homology groups. arXiv:1001.0358v1 [math.AT].
[4] G. Carlsson and A. Zomorodian.
The theory of multidimensional persistence. Discrete & Computational Geometry, 42(1):71–93, 2009.
[5] A. Cerri, B. Di Fabio, M. Ferri, P. Frosini, and C. Landi. Multidimensional persistent homology is stable. Technical Report, Università di Bologna, July 2009. http://amsacta.cib.unibo.it/2603/.
[6] D. Cohen-Steiner, H. Edelsbrunner, and J. Harer. Stability of persistence diagrams. Discrete Comput. Geom., 37(1):103–120, 2007.
[7] D. Cohen-Steiner, H. Edelsbrunner, and D. Morozov. Vines and vineyards by updating persistence in linear time. In SCG '06: Proceedings of the twenty-second annual symposium on Computational geometry, pages 119–126, New York, NY, USA, 2006. ACM.
[8] M. C. Delfour and J.-P. Zolésio. Shapes and geometries: analysis, differential calculus, and optimization. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2001.
[9] T. K. Dey and R. Wenger. Stability of critical points with interval persistence. Discrete Comput. Geom., 38(3):479–512, 2007.
[10] M.-M. Deza and E. Deza. Dictionary of Distances. Elsevier, Burlington, 2006.
[11] D. Dubois and H. Prade. Fuzzy sets and systems - Theory and applications. Academic Press, New York, 1980.
[12] H. Edelsbrunner, D. Letscher, and A. Zomorodian. Topological persistence and simplification. Discrete & Computational Geometry, 28(4):511–533, 2002.
[13] P. Frosini. Measuring shapes by size functions. In D. P. Casasent, editor, Intelligent Robots and Computer Vision X: Algorithms and Techniques, volume 1607, pages 122–133, 1991.
[14] P. Frosini and M. Mulazzani. Size homotopy groups for computation of natural size distances. Bulletin of the Belgian Mathematical Society, 6(3):455–464, 1999.
[15] R. C. Veltkamp and M. Hagedoorn. State of the art in shape matching. Principles of visual information retrieval, pages 87–119, 2001.
[16] A. Verri, C. Uras, P. Frosini, and M. Ferri. On the use of size functions for shape analysis. Biol. Cybern., 70:99–107, 1993.
[17] L. Zadeh. Fuzzy sets.
Information and Control, 8:338–353, 1965.

Dipartimento di Matematica, Università di Bologna, P.zza di Porta S. Donato 5, I-40126 Bologna, Italia
E-mail address: frosini@dm.unibo.it

Dipartimento di Scienze e Metodi dell'Ingegneria, Università di Modena e Reggio Emilia, Via Amendola 2, Pad. Morselli, I-42100 Reggio Emilia, Italia
E-mail address: clandi@unimore.it

[Figure 4: eight panels with horizontal axis $s$ and vertical axis $t$, titled α = 0.5, β = 24; α = 0.5, β = 16; α = 0.5, β = 8; α = 0.5, β = 1; α = 0.5, β = 0.65; α = 0.3, β = 0.45; α = 0.1, β = 0.25; α = 0, β = 0.15.]

Figure 4. The rank invariant $\rho_{(D,F^{\vec\Phi_1}_{(\vec l(\alpha),\vec b(\beta))}),0}$ as $\alpha$ and $\beta$ tend to 0. Red circles and red lines denote the points (proper or at infinity) of the corresponding persistence diagram. The blue diamonds denote the point $(s, t)$ corresponding to $((\alpha, -100), (\beta, -80))$.
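The positions of the points $(s, t)$ marked in Figure 4 follow from formulas (4.1). The following is a minimal sketch of that computation, assuming only those formulas; the function name `leaf_parameters` and the variable names are hypothetical and not taken from the paper. For each choice of $\alpha$ and $\beta$, with $(u, v) = (-100, -80)$, it evaluates the admissible pair $(\vec l, \vec b)$ and the parameters $s$, $t$.

```python
import math

def leaf_parameters(alpha, u, beta, v):
    """Evaluate formulas (4.1) for the half-plane through (alpha, u) and (beta, v).

    Assumes alpha < beta and u < v, so that both components of l are positive.
    Returns ((l1, l2), (b1, b2), s, t).
    """
    d_first, d_second = beta - alpha, v - u
    norm = math.hypot(d_first, d_second)
    l1, l2 = d_first / norm, d_second / norm          # unit direction vector l
    b1 = (alpha * (beta + v) - beta * (alpha + u)) / (d_first + d_second)
    b2 = (u * (beta + v) - v * (alpha + u)) / (d_first + d_second)
    s = (alpha + u) / (l1 + l2)                       # parameter of (alpha, u) on the line
    t = (beta + v) / (l1 + l2)                        # parameter of (beta, v) on the line
    return (l1, l2), (b1, b2), s, t

# The (alpha, beta) choices of the panels in Figure 4, with (u, v) = (-100, -80).
choices = [(0.5, 24), (0.5, 16), (0.5, 8), (0.5, 1),
           (0.5, 0.65), (0.3, 0.45), (0.1, 0.25), (0, 0.15)]
rows = [(a, b) + leaf_parameters(a, -100, b, -80) for a, b in choices]

for a, b, l, b_vec, s, t in rows:
    mu = min(l)  # rescaling factor mu = min{l1, l2}
    print(f"alpha={a:<4} beta={b:<5} l=({l[0]:.4f}, {l[1]:.4f}) "
          f"b=({b_vec[0]:.4f}, {b_vec[1]:.4f}) s={s:.4f} t={t:.4f} mu={mu:.4f}")
```

By construction, $s\,\vec l + \vec b = (\alpha, u)$ and $t\,\vec l + \vec b = (\beta, v)$, which gives a quick sanity check on the formulas. The values of $s$ and $t$ obtained this way agree with the tabulated ones to within about 0.01; the remaining difference is presumably due to rounding of $\vec l$ in the original computation. As $\alpha$ and $\beta$ shrink, $l_1$ tends to 0, so $\mu$ tends to 0 as well, which is consistent with the instability near the boundary of $Adm_2$ noted in the text.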