
Hierarchy Measures In Complex Networks

2004, Physical review letters

AI-generated Abstract

This research investigates the concept of topological hierarchy within complex networks, defining hierarchy based on node importance, characterized by degree. The study proposes a dynamical process to construct networks with varying levels of hierarchy and compares these with random scale-free networks. Key findings reveal that the extent of topological hierarchy declines with the degree distribution's exponent γ, impacting network robustness and signaling in the process.

Ala Trusina,1,2 Sergei Maslov,3 Petter Minnhagen,2,1 and Kim Sneppen2

1 Department of Physics, Umeå University, 90187 Umeå, Sweden
2 NORDITA, Blegdamsvej 17, 2100 Copenhagen Ø, Denmark∗
3 Department of Physics, Brookhaven National Laboratory, Upton, New York 11973, USA†

arXiv:cond-mat/0308339v2 [cond-mat.soft] 19 Feb 2004
(Dated: February 2, 2008)

Using each node’s degree as a proxy for its importance, the topological hierarchy of a complex network is introduced and quantified. We propose a simple dynamical process used to construct networks which are either maximally or minimally hierarchical. Comparison with these extremal cases as well as with random scale-free networks allows us to better understand hierarchical versus modular features in several real-life complex networks. For random scale-free topologies the extent of topological hierarchy is shown to smoothly decline with γ, the exponent of the degree distribution, reaching its highest possible value for γ ≤ 2 and quickly approaching zero for γ > 3.

PACS numbers: 89.75.-k, 89.75.Fb

Networks have recently become a focus of complex systems research. Indeed, most complex systems have an underlying network serving as a “backbone” for their dynamical processes. The large-scale topological organization of a particular complex network is related to both its functional role and its historical background. Thus it is important to develop quantitative tools allowing one to detect and measure significant features in the topology of a given network. Hierarchical organization is a common feature of many complex systems. As an example of a hierarchy one might think of the organizational structure of a large company. The defining feature of a hierarchical organization is the existence of a hierarchical path connecting any two of its nodes.
It can be thought of as a trajectory of a request initiated at one of the nodes and reaching its destination node through the “chain of command”. Such a request first goes up the steps of the hierarchy until it reaches the first common boss of its sender and recipient, after which it descends the hierarchical levels down to the destination node. In most real-life complex systems the simple tree-like hierarchical network may be augmented by “shortcuts” bypassing the chain of command. Such non-hierarchical shortcuts make the task of detecting the hierarchical structure considerably harder. The question we want to address in this work is how to detect and measure the extent of the hierarchy manifested in the topology of a given complex network. Real-world networks often have a very broad degree distribution [1], and thus the degree in itself gives a sensible characteristic of a node which has to be preserved in any randomized version of a network [2, 3]. In fact, in a number of systems nodes with higher degrees are on average more important than their lower-degree counterparts. For example, for the Internet the number of hardwired connections of a given Autonomous System serves as a proxy of its importance, with the most connected nodes being global Internet Service Providers. For the WWW the in-degree of a web page can serve as a measure of its popularity and hence importance; highly connected hub airports of airline networks are typically located in large cities, etc. In what follows we use the degree of a node as a proxy for its rank in the hierarchy based on the relative importance of nodes. Thus we propose a way to couple a local topological quantity, the degree, to a global structure, the hierarchy. Thereby we define and quantify topological hierarchy as a way to characterize networks beyond their degree distribution and its two-point correlation function [3, 4, 12].
However, we would like to point out that our methods could be generalized to hierarchies defined in terms of any other characteristic of individual nodes, be it wealth, mass, or some other appropriate quantity. We quantify the hierarchical topology of a network using the concept of a hierarchical path [5, 6]: a path between two nodes in a network is called hierarchical if it consists of an “up path”, where one is allowed to step from node i to node j only if their degrees k_i, k_j satisfy k_i ≤ k_j, followed by a “down path”, where only steps to nodes of lower or equal degree are allowed. Either the up path or the down path is allowed to have zero length. This definition of a hierarchical path follows the above-mentioned trajectory of a request which is first forwarded up and then descends down the levels of a hierarchy quantified by k_i. It is also similar to the definitions proposed in [5] and [6]. For a given pair of nodes, the shortest hierarchical path can be: 1) equal in length to the shortest path between these nodes; 2) longer than it; 3) nonexistent, if these two nodes cannot be connected by any hierarchical path. The fractions of pairs in these three categories are denoted F, S, and U = 1 − F − S, respectively. Equivalently, the hierarchical fraction F can be viewed as the fraction of shortest paths in the network that are hierarchical, and S as the probability of finding a non-hierarchical shortcut, that is, a path shorter than the shortest hierarchical path between a pair of nodes.
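The three categories defined above can be counted directly with two breadth-first searches per source node. The following is a minimal sketch, not the authors' implementation: it assumes an undirected graph given as an adjacency dictionary, restricts the count to pairs joined by at least one ordinary path, and uses hypothetical function names (`shortest_dist`, `hierarchical_dist`, `hierarchy_fractions`). The hierarchical search tracks a phase flag so that an up step is never taken after a down step.

```python
from collections import deque


def shortest_dist(adj, source):
    """Plain BFS distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def hierarchical_dist(adj, source):
    """Shortest up-then-down path lengths, via BFS over (node, phase) states."""
    deg = {u: len(adj[u]) for u in adj}
    UP, DOWN = 0, 1
    dist = {(source, UP): 0}
    queue = deque([(source, UP)])
    while queue:
        u, phase = queue.popleft()
        for v in adj[u]:
            states = []
            if phase == UP and deg[v] >= deg[u]:
                states.append((v, UP))    # keep climbing
            if deg[v] <= deg[u]:
                states.append((v, DOWN))  # start or continue descending
            for state in states:
                if state not in dist:
                    dist[state] = dist[(u, phase)] + 1
                    queue.append(state)
    best = {}
    for (v, _phase), d in dist.items():
        best[v] = min(d, best.get(v, d))
    return best


def hierarchy_fractions(adj):
    """Return (F, S, U) over all unordered pairs joined by some path."""
    nodes = list(adj)
    index = {u: i for i, u in enumerate(nodes)}
    f = s = u_count = 0
    for a in nodes:
        d = shortest_dist(adj, a)
        h = hierarchical_dist(adj, a)
        for b, d_ab in d.items():
            if index[b] <= index[a]:
                continue  # count each unordered pair once
            if b not in h:
                u_count += 1   # no hierarchical path at all
            elif h[b] == d_ab:
                f += 1         # a shortest path is hierarchical
            else:
                s += 1         # a non-hierarchical shortcut exists
    total = f + s + u_count
    return f / total, s / total, u_count / total
```

For instance, on a star graph every leaf-to-leaf path climbs to the hub and descends again, so the sketch returns F = 1. Two hubs joined only through a degree-2 node give U > 0, because any path between the two sides must descend to the bridge and then climb again.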
We analyzed in detail the hierarchical structure of several real-life complex networks: 1) The Internet consisting of autonomous systems (AS) hardwired to each other [7] (6474 nodes, 1572 edges); 2) The largest connected component of the yeast protein interaction network [8] (2839 nodes, 4220 edges); 3) The largest connected component of the E-mail mutual correspondence network (25151 nodes, 19963 edges) [9]; 4) The network of CEOs (executive company directors); two directors are connected if they both belong to at least one common board of directors (6193 nodes, 43077 edges) [10]. To have a reference point, we compare these networks with their random connected counterparts in which the degree of every individual node is strictly preserved. In practice this is done by multiple edge-rewiring moves [3, 4] where two links between two randomly selected pairs of nodes are rewired with the constraint that one only accepts moves in which no double links are created, and where the network remains connected. We find that the randomized version of the Internet, characterized by Fr = 0.99, is almost as hierarchical as the real Internet, where F = 0.95. The same is true for the E-mail network (Fr = 0.98 vs F = 0.97) and the CEO network (Fr = 0.84 vs F = 0.78). On the other hand, the randomized protein interaction network, Fr = 0.88, is significantly more hierarchical than the real one, F = 0.33. This anti-hierarchical feature of the protein interaction network reflects a topology where highly connected nodes are placed on the periphery, and not in the center of the network [3]. We also found that a small reduction in F for the Internet compared to its randomized counterpart is mainly due to an increase in the number of non-hierarchical shortcuts, S = 0.02 (Sr =0). This feature is even more pronounced for the yeast protein network, with S = 0.17 (Sr = 0.02). 
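The degree-preserving randomization used for the reference networks above can be sketched as follows. This is an illustrative version under stated assumptions, not the code of Refs. [3, 4]: the graph is a simple undirected edge list, the function names (`is_connected`, `randomize_preserving_degrees`) and the `n_moves` and `seed` parameters are hypothetical, and connectivity is re-checked by a full breadth-first search after every proposed move, which is simple but slow for large networks.

```python
import random
from collections import deque


def is_connected(adj):
    """True if every node is reachable from an arbitrary start node."""
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


def _swap(adj, a, b, c, d):
    """Replace links a-b and c-d by links a-d and c-b."""
    adj[a].remove(b)
    adj[b].remove(a)
    adj[c].remove(d)
    adj[d].remove(c)
    adj[a].add(d)
    adj[d].add(a)
    adj[c].add(b)
    adj[b].add(c)


def randomize_preserving_degrees(edges, n_moves, seed=0):
    """Attempt n_moves pair rewirings; every node keeps its degree."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    for _ in range(n_moves):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick one of the two possible reconnections
        if len({a, b, c, d}) < 4:
            continue     # would create a self-loop
        if d in adj[a] or b in adj[c]:
            continue     # would create a double link
        _swap(adj, a, b, c, d)
        if is_connected(adj):
            edges[i], edges[j] = (a, d), (c, b)
        else:
            _swap(adj, a, d, c, b)  # undo: the move disconnects the network
    return edges
```

Rejected moves leave the network unchanged, so the number of accepted rewirings is at most `n_moves`. In practice one would attempt many moves per edge before measuring F on the randomized network.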
One possible explanation for this excess of non-hierarchical shortcuts is the natural tendency toward shorter distances and thus toward faster and more specific signaling. As was shown above, a random network with a given degree distribution provides a useful benchmark for the extent of hierarchy in real-life complex networks. It is also interesting to consider the extreme cases, that is, to construct networks that are the most or, alternatively, the least hierarchical for a given degree distribution. This is important for positioning a real network not only with respect to its random counterpart, but also with respect to the extreme limits that a network with a given degree distribution can achieve. Similar to the randomized version, the maximally hierarchical version of a network is generated by multiple rewirings of pairs of edges. One has to add, however, a particular preference for reconnection: at each step one selects two pairs of connected nodes and attempts to connect the node with the highest degree among these four nodes to the node of the next highest degree in this subset. The remaining two nodes are then linked together (multiple links are forbidden, and the network should always remain connected). The maximally anti-hierarchical version of a network can be constructed by the same algorithm but with the opposite preference of reconnection: the node with the highest degree is linked with that with the lowest degree.

FIG. 1: Maximally hierarchical (a), random (b) and maximally anti-hierarchical (c) networks of size N = 50 nodes and node degree distribution $f(k) \propto 1/k^{2.5}$.

In Figs. 1a,c we show the maximally hierarchical and anti-hierarchical networks, respectively, with the same node degree distribution as the random network shown in Fig. 1b. We have found that by applying the above algorithm to all four empirical networks it is possible to reach the limits F = 1 for maximal hierarchy and F ≈ 0 for maximal anti-hierarchy. As one can see from Fig.
1, the maximally hierarchical or anti-hierarchical networks show strong correlations between degrees of connected nodes [11] that can be quantified through either the correlation profile [3] or the assortativity measure r introduced in Ref. [12]. We consider a modified assortativity measure, similar to the one used in Ref. [12], but here defined as $r_{AD} = \ln\left( \langle k_i k_j \rangle / \langle\langle k_i k_j \rangle\rangle_r \right)$, where $\langle k_i k_j \rangle$ is the average over all pairs i and j of nearest-neighbor nodes in the network and $\langle\langle k_i k_j \rangle\rangle_r$ is the average of $\langle k_i k_j \rangle$ in an ensemble of randomized networks generated as described above [3, 4]. We find the maximally hierarchical topology to be always assortative ($r_{AD} > 0$), while the maximally anti-hierarchical topology is disassortative ($r_{AD} < 0$). For example, for a network with the node degree distribution $f(k) \propto 1/k^{2.5}$, $r_{AD}$ = 0.14 and −1.24 for the maximal hierarchy and the anti-hierarchy, respectively. For comparison, the protein-protein interaction network in yeast, which is well described by $f(k) \propto 1/k^{2.5}$, has $r_{AD}$ = −0.82. We further stress that assortativity and hierarchical topology are in general not prerequisites for each other. Motivated by the abundance of real-life networks characterized by a broad, often scale-free, degree distribution (as is the case for the empirical networks we are considering here), we quantified the hierarchical fraction F as a function of the exponent γ in random scale-free networks with a power-law degree distribution $f(k) \propto 1/k^{\gamma}$. Such networks were constructed by first generating a set of power-law distributed degrees of individual nodes, then linking the edges to create a single-component network, and finally randomizing the resulting network using the algorithm of Ref. [4], which preserves individual degrees and connectedness of the network.

FIG. 2: a) The fraction of shortest paths that are also hierarchical, F, and b) the fraction of non-hierarchical shortcuts, S, as a function of γ for three system sizes, N = 300, 1000, 3000. The shadowed area in a) corresponds to error estimates for a network of size N = 3000.

Fig. 2a shows F vs γ measured in random scale-free networks for different system sizes. The decrease from F = 1 to F ∼ 0 happens as γ grows from 2 to 3, with a smooth transition around γ ∼ 2.6 that weakly depends on the system size. We remark that for γ ≤ 2, F = 1 and is nearly independent of the upper cutoff of the degree distribution, which is required in this case. Fig. 2b shows S vs γ in random scale-free networks. One can see that S → 0 both as γ → 2 and as γ → 3. Indeed, as γ → 2 the largest hubs become dominant and the typical distance in a network approaches 2 (almost any pair of nodes is connected via at least one hub). Most shortest paths then pass through a hub and are therefore hierarchical. In the limit γ → 3, the topology of the network is very close to a tree, which in turn implies that the number of alternative paths approaches zero, and thus again S → 0. We have seen that as γ approaches 2 almost all pairs of nodes tend to have at least one hierarchical path connecting them (F → 1 and hence U = 1 − F − S → 0). The existence of hierarchical paths connecting most pairs of nodes means that at the very least the majority of nodes have at least one neighbor with a degree higher than their own. Let us first calculate the probability that a given edge is attached to a node with degree larger than k:

$$P_{edge}(\geq k) \propto \int_k^K k' f(k')\,dk' \propto \begin{cases} 1 - \left(\frac{k}{K}\right)^{2-\gamma}, & \text{for } \gamma < 2 \\ k^{2-\gamma}, & \text{for } \gamma > 2 \end{cases} \qquad (1)$$

Here for γ < 2 one can only have a scale-free distribution below an upper cutoff K.
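The two cases of Eq. (1) follow from a single elementary integral. The steps below are a sketch under two assumptions that are not spelled out in the text: degrees are treated as continuous, and the distribution is a pure power law $f(k') \propto k'^{-\gamma}$ between a lower cutoff $k_{\min}$ (a symbol introduced here for illustration) and the upper cutoff K.

```latex
\begin{align*}
\int_k^K k' f(k')\,dk' &\propto \int_k^K k'^{\,1-\gamma}\,dk'
  = \frac{K^{2-\gamma} - k^{2-\gamma}}{2-\gamma}, \qquad \gamma \neq 2. \\[4pt]
\gamma < 2,\ K \gg k_{\min}: \quad
P_{edge}(\geq k) &= \frac{K^{2-\gamma} - k^{2-\gamma}}{K^{2-\gamma} - k_{\min}^{2-\gamma}}
  \approx 1 - \left(\frac{k}{K}\right)^{2-\gamma}. \\[4pt]
\gamma > 2,\ K \to \infty: \quad
P_{edge}(\geq k) &= \frac{k^{2-\gamma}}{k_{\min}^{2-\gamma}}
  \propto k^{2-\gamma}.
\end{align*}
```

In the first case the integral is dominated by its upper limit, which is why the cutoff K cannot be removed. In the second case it is dominated by its lower limit, so K drops out.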
Thus, in the absence of degree-degree correlations, the probability that a node of degree k has at least one neighbor of degree higher than itself is given by

$$P(k_{neighbor} \geq k) \propto 1 - \left(1 - P_{edge}(\geq k)\right)^k \propto \begin{cases} 1 - \left(\frac{k}{K}\right)^{(2-\gamma)k}, & \text{for } \gamma < 2 \\ 1 - \left(1 - k^{2-\gamma}\right)^k, & \text{for } \gamma > 2 \end{cases} \qquad (2)$$

FIG. 3: The probability $P(k_{neighbor} \geq k)$ for a node of degree k to have a neighbor of degree $k_{neighbor} > k$ in an infinitely large scale-free network, which is not necessarily connected, as a function of a) the exponent γ and b) the degree of the node k for γ = 2.5, 3 and 4.

In Figure 3 we plot this probability of having a boss (a neighbor with a higher degree) as a function of γ for three different values of the degree k (Fig. 3a), and as a function of k for different values of γ (Fig. 3b). For γ ≤ 2 both low- and high-degree nodes always have a more highly connected neighbor, and for γ > 3 the high-k nodes nearly never have a boss. For 2 < γ < 3, nodes of low degree often have no more highly connected neighbors (see Fig. 3b), but as $P(k_{neighbor} > k) \to 1$ for increasing k there is a hierarchical core of highly connected nodes. In popular terms, at these intermediate values of γ many low-degree nodes escape the hierarchy, while medium and highly connected nodes have bosses. Above γ = 3, $P(k_{neighbor} > k)$ decreases to zero with degree. Thus for these high values of γ a network becomes modular, with each of the modules centered around a local hub. Figure 4a shows the possible values of F for hierarchies and anti-hierarchies for γ ∈ (2, 3). One can see that even networks with a narrow degree distribution can be organized hierarchically (see the upper limit for F at γ = 3), and that networks with a broad degree distribution can be rearranged to suppress “self-hierarchical” features (see the lower limit for F at γ = 2).
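The trends described for Eq. (2) can be checked numerically. The sketch below is illustrative only: it turns the proportionalities of Eqs. (1) and (2) into equalities, which amounts to assuming a minimal degree of 1, and the function names `p_edge` and `p_boss` are hypothetical.

```python
def p_edge(k, gamma, K=None):
    """Eq. (1) with the proportionality taken as an equality (assumption)."""
    if gamma < 2:
        if K is None:
            raise ValueError("an upper cutoff K is required for gamma < 2")
        return 1.0 - (k / K) ** (2.0 - gamma)
    if gamma > 2:
        return float(k) ** (2.0 - gamma)
    raise ValueError("the marginal case gamma == 2 is not covered by Eq. (1)")


def p_boss(k, gamma, K=None):
    """Eq. (2): chance that at least one of k neighbors has a higher degree."""
    return 1.0 - (1.0 - p_edge(k, gamma, K)) ** k


if __name__ == "__main__":
    for gamma in (2.5, 3.0, 4.0):
        row = [round(p_boss(k, gamma), 3) for k in (2, 5, 10, 15)]
        print(f"gamma={gamma}: {row}")
```

Under these assumptions the probability grows with k for γ = 2.5 and falls with k for γ = 4, in line with the hierarchical core and the modular regime described above. At γ = 3 the expression becomes $1 - (1 - 1/k)^k$, which tends to $1 - 1/e$ for large k.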
In Figure 4a we summarize the results of our study of real and random scale-free networks by displaying the hierarchical fraction F observed in real-world networks (black dots) relative to its value for random scale-free networks with the corresponding value of γ (solid line). As discussed above, the Internet, e-mail and CEO networks are about as hierarchical as their random scale-free counterparts, while the network of protein-protein interactions in yeast is significantly anti-hierarchical. Dark shaded regions in this figure correspond to the range between maximally hierarchical and maximally anti-hierarchical networks for a given value of γ.

FIG. 4: a) Possible hierarchical organizations of networks: the dark-shadowed area shows the limits for the average values, and the light-shadowed area shows the corresponding limits including possible variations from sample to sample in a random scale-free network with N = 10^3 nodes. The black dots show values of the hierarchical fraction F and the degree exponent γ for the Internet, E-mail, Yeast Protein Interaction and CEO networks discussed in the text. The solid line follows F vs γ in random scale-free networks. b) The results of the robustness analysis for maximally hierarchical (H), random (R) and anti-hierarchical (AH) scale-free networks with γ = 2.5. The effect of an intentional attack by deletion of the single most connecting node or edge in the network is reflected in the reduced size S_LC of the largest connected component after such an attack. The probability P_break for a network to break up following the removal of a randomly selected node or edge shows how sensitive it is to a random, non-intentional failure.

Another interesting aspect is the impact the hierarchical structure has on the overall robustness of the network. In Figure 4b we show the average size of the largest connected component S_LC of a scale-free network with γ = 2.5 and N = 1000 after an intentional attack, which consists of choosing and removing a single edge (left column) or node (right column) in such a way as to minimize S_LC in the resulting network. The smaller the average S_LC, the more vulnerable the network is to attacks. To characterize the robustness of a network with respect to random failures we specify the likelihood P_break that the removal of a single node or edge disconnects the network. We find that anti-hierarchical topologies are most vulnerable with respect to attacks on their edges, while hierarchical topologies are sensitive to node attacks. Apart from that, hierarchies are most vulnerable to random failures.

In summary, we have discussed the hierarchical organization manifested in the topology of complex networks, and demonstrated how it can be used to characterize possible network architectures beyond the degree distribution of their nodes. We quantified the hierarchical structure as the fraction of shortest paths that are also hierarchical. It was found that this quantity approaches its maximum value for marginally divergent scale-free networks with γ ≤ 2. It was also shown that anti-hierarchy is naturally related to modular features of networks. Finally, we found that hierarchical as well as anti-hierarchical network topologies have implications for signaling and for robustness against various types of attacks and malfunctions, with anti-hierarchies being quite reliable against most types of perturbations.

Acknowledgments: We thank the Aspen Center for Physics for hospitality. Work at Brookhaven National Laboratory was carried out under Contract No. DE-AC02-98CH10886, Division of Material Science, U.S. Department of Energy.
∗ Electronic address: trusina@tp.umu.se
† Electronic address: maslov@bnl.gov
[1] For a recent collection of articles describing different examples of scale-free networks see e.g. “Handbook of Graphs and Networks”, edited by S. Bornholdt and H. G. Schuster, John Wiley and VCH publishers (2002) or J. F. F. Mendes, S. N. Dorogovtsev, A. F. Ioffe, “Evolution of Networks: From Biological Nets to the Internet and WWW”, Oxford Press (2003).
[2] M.E.J. Newman, S.H. Strogatz, and D.J. Watts, Phys. Rev. E 64, 026118 (2001); M.E.J. Newman, cond-mat/0202208 (2001).
[3] S. Maslov and K. Sneppen, Science 296, 910-913 (2002).
[4] S. Maslov, K. Sneppen, and A. Zaliznyak, cond-mat/0205379 (2002); Physica A 333, 529 (2004).
[5] H. Tangmunarunkit, R. Govindan, S. Jamin, S. Shenker, and W. Willinger, Tech. Rep. 01-746, Computer Science Department, University of Southern California (2001).
[6] L. Gao, Proc. IEEE INFOCOM, November (2000).
[7] The dataset collected by the Oregon Views project was downloaded from the website of the National Laboratory for Applied Network Research (http://moat.nlanr.net/AS/).
[8] T. Ito et al., Proc. Natl. Acad. Sci. USA 98, 4569-4574 (2001).
[9] H. Ebel, L. I. Mielsch, and S. Bornholdt, Phys. Rev. E 66, 035103 (2002).
[10] G. F. Davis, M. Yoo, and W. E. Baker, preprint, University of Michigan Business School (2001).
[11] P. L. Krapivsky and S. Redner, Phys. Rev. E 63, 066123 (2001).
[12] M.E.J. Newman, Phys. Rev. Lett. 89, 208701 (2002); J. Park and M.E.J. Newman, Phys. Rev. E 68, 026112 (2003).