- 1 -
中国科技论文在线
Robustness of centrality measures against network
manipulation#
Qikai Niu
1
, ZengAn
2
, FanYing
1
, Zengru Di
1**
5
(1. School of Systems Science, Beijing Normal University, Beijing 100875, PR China;
2. Department of Physics, University of Fribourg, Chemin du Mus e´e 3, CH-1700 Fribourg,)
Foundations: National Natural Science Foundation of China under Grant(No. 61374175 ,61174150),The Chi-
Brief author introduction:Qikai Niu,male,PHD,major research direction:complex network
Correspondance author: Ying Fan,female, professor,major research direction:complex network, scientific
metrology. E-
Abstract: Node centrality is an important quantity for complex networks as it is related to many
applications ranging from the prediction of network structure to the control of dynamics on
networks. In the literature, much effort has been devoted to design new centrality measurements. 10
However, the reliability of these centrality measurements hasn’t been fully understood. Many real
networks are facing different kinds of manipulations such as addition, removal or rewiring of links.
In this paper, we focus on the robustness of classic centrality measures against network
manipulation. Our analysis is based on both artificial and real networks. We find that the centrality
measurements are generally more robust in heterogenous networks. Moreover, the top part of the 15
centrality ranking is more resistant to manipulation. Among the centrality measures we considered,
the eigenvector centrality is in general the most robust one.
Key words: Node centrality;robustness; network manipulation
0 Introduction 20
Networks provide us a powerful and versatile tool to recognize and analyze complex systems.
Many social, biological, and information systems can be naturally modeled by networks,where
nodes represent individuals, biological elements (proteins, genes, etc.), computers, web users, and
so on, and links denote the relations or interactions between nodes
[1, 2, 3, 4, 5]
.The study of complex
networks has therefore become a common focus of many branches of science
[6]
. Great effort has 25
been devoted to understand the evolution of networks, the relations between topologies and
functions, and the network characteristics. It is now well-known that many dynamical processes
such as cascading, propagation, and synchronization are highly affected by a small fraction of
influential nodes
[7]
. In network science, the importance of a node within a network was usually
quantified by its centrality
[8]
. Research on node centrality not only helps to reveal microscopic 30
structural features
[9, 10]
, but also provides assistant tools on analyzing network dynamics
[11, 12]
.
Meanwhile, it can find real applications with great social and economic values
[13, 14, 15]
.
1 Measures of the networks
There are already several classic node centrality measures. Degree centrality is a
straightforward and efficient metric. A node with larger degree is likely to have higher influence 35
than a node with smaller degree. However, this method may fail in some cases for it considers
only very limited information. Another group of methods considering the global information gives
better ranking results, such as betweenness centrality
[16]
and closeness centrality
[17]
(two
prominent geodesic-path-based ranking measures). Betweenness centrality of a node v is the sum
of the fraction of shortest paths that pass through v. Betweenness is, in some sense, a measure of 40
the influence of a node over the information spread through the network or the expected load of a
node in a transportation network. The closeness centrality is inverse of the mean shortest path
length of a node to all other nodes in the network. Closeness centrality can be considered as a
- 2 -
中国科技论文在线
measure of how long it will spread information from a given node to other reachable nodes in the
network. Eigenvector centrality
[18]
assigns relative scores to all nodes in the network based on an 45
iterative principle that connections to high-scoring nodes contribute more to the score of the node
than equal-number connections to low-scoring nodes. One of the most famous algorithm of this
type is Pagerank which is used by to rank web pages
[19]
. It can be easily calculated by
computing the eigenvector of the greatest eigenvalue of the adjacency matrix. The k-shell
calculates the node centrality by decomposing the network
[20]
. The nodes are removed according 50
to the residual degree and the nodes removed later in the procedure has higher centrality. The
k-shell method has been shown to perform effectively in identifying the influential nodes for
spreading.
2 Model of the networks
Besides the above centrality measures, researches are still making efforts to develop new 55
centrality indices for a more accurate estimating of nodes’ importance in networks. For example,
Dangalchev modified the definition of closeness to make it applicable in the network with low
connectivity
[21]
. Barthelemy proposed a local betweenness centrality measure to reduce the high
computational complex of the original definition
[22]
. L¨u et al. introduced a ground node to
modify the random-walk process in the well-known Pagerank
[23]
. The so-called Leaderrank 60
centrality outperforms Pagerank in identifying the most influential spreaders. Very recently,Yao et
al. consider the nonlinear interaction in the random-walk and improve the performance of
Pagerank algorithm in ranking scientific publications
[24]
. For the k-shell centrality, Zeng et
to take the exhausted degree into account when decomposing networks, such that the
degeneracy of the k-shell ranking is effectively broken
[25]
. 65
Even though there are many centrality measures, the robustness of these methods against
manipulation is not yet clear. In fact, in many real systems the network we constructed from data
inevitably contains unreliable information
[26]
. For example, in biological networks, whether a link
between two nodes exists must be demonstrated by field and/or laboratorial experiments. Any
inaccurate or biased measurement in the experiments may lead to noisy or missing data for 70
network construction. In social networks, the problem of the data may come from some malicious
behavior of people. In online social networks, for instance, some fake account may be created to
connect to some small degree users to boost up their popularity
[27]
.On the other hand, the rewiring
of edges, such as the alteration of the bus lines or the airlines are also appeared in society
[28]
.
Therefore, the performance of the node centrality indices should be judged not only by its 75
effectiveness, but also by its tolerance of manipulations. In ref. [29], the robustness of three
centrality measures under random removal and addition of links is studied. However, the
robustness of these centrality measures in real networks and against malicious manipulation hasn’
t been fully understand.
3 Mathematical treatment 80
In this paper, we study the robustness of five classic node centrality measures in both
artificial and real networks. Here, the manipulation means the addition, removal or rewiring of
links to the original networks. We consider both random and biased manipulation scenarios, and
investigate the correlation of the centrality of nodes before and after manipulation. The results
show that the centrality measurements are generally more robust in heterogenous networks. 85
Moreover, the top part of the centrality ranking is more resistant to manipulation. Among the
centrality measures we considered, the eigenvector centrality is in general the most robust one.
- 3 -
中国科技论文在线
We start our analysis by describing the networks we considered. For clarification, we report
the results of two artificial networks and two representative real networks in the figures. The
results of the rest real networks are listed in table 1. The artificial networks are the Watts-Strogatz 90
model (also called Small-World network)
[30]
and Barabasi-Albert model (also called Scale-Free
network)
[31]
. In this paper, we refer them as SW and BA networks of the two
real networks is a social network of coauthorship relation between scientists working on network
theory and experiment
[32]
. The second real network is the neural network of C. elegans
[33]
. The
original network is a directed one, here we consider it as an undirected network. 95
The rest of networks shown in table 1 are from both social and nonsocial systems:
Dolphins(friendship)
[34]
, Word (adjacency relation in English text)
[33]
, Jazz (musical
collaboration)
[35]
, E. coli (metabolic)
[36]
, Email (communication)
[37]
, TAP (yeast protein–protein
binding network generated by tandem affinity purification experiments)
[38]
, Y2H (yeast protein–
protein binding network generated using yeast two hybridization)
[39]
. 100
To assess the effect of manipulation on a node’s centrality measure C, we calculated the
Pearson correlation between the true measure CT and manipulated measure CM. CT is
computed in the real network and CM is calculated in the manipulated network where some
fraction of links are added, removed or rewired. Mathematically, the formula of reads
2 2 2 2( )( )
T N T N
T T N N
C C C C
C C C C
(1) 105
The higher is, the more robust the centrality measure is. In this paper, we consider five
well-known centrality indices, namely degree, betweenness, closeness, eigenvector and k-shell as
C, respectively.
4 Result
Result of model “random link change” 110
The influence of adding and removing links on is presented in Fig. 1. When the fraction
of manipulated links is positive, it means that some ratio of links are added to the real networks.
When the fraction of manipulated links is negative, some ratio of links are removed from the real
networks. By definition, when the fraction of manipulated links is 0, = 1. For each centrality
measure, will become smaller than 1 if some links are added or removed. The interesting 115
feature here is how fast _ decreases with the fraction of manipulated links. The first observation is
that the behavior of is different between SW and BA networks. In SW networks, can
decreases to as low as 0:2. However, in BA networks, can stay mostly higher than 0:
results indicate that the centrality in networks with heterogenous degree distribution is generally
more robust than that in networks with homogeneous degree distribution. In fact, this phenomenon 120
is consistent with a recent finding of the super stable nodes in complex networks
[40]
. The results of
the real networks are between the SW and BA networks. When different centrality measures are
compared, one can easily find that the betweenness and eigenvector centrality are most robust.
The curve of the k-shell method is not shown in BA network because all the nodes in the original
BA network are with the same k-shell value. In the coauthorship network, one can see that the 125
betweenness and closeness centrality are strongly affected by adding or removing links. This is
because the coauthorship network has obvious community structure, removing or adding links will
destroy the community structure and significantly influence the shortest paths in networks.
- 4 -
中国科技论文在线
130
The influence of adding and removing links on the Pearson correlation between the true measure CT and
manipulated measure CM.
Note:When the fraction of manipulated links is positive, it means that some ratio of links are added to the real
networks. When the fraction of manipulated links is negative, some ratio of links are removed from the real
SW networks, N = 500, ⟨k⟩ = 12. p = 0:2. For BA networks, N = 500, ⟨k⟩ = 12. The results are 135
obtained from 50 independent realizations.
In Fig. 1, one can also see an asymmetric phenomenon. It means that when some links are
added to the network, the effect on is different when the same amount of links are removed
from the network. For example, in SW networks, when links are removed, the most robust 140
centrality is betweenness. However, when links are added, the most robust centrality becomes
degree. In BA networks, the removal of links affects more on than adding links. This is
because the removal of links mainly reduces the degree of the hubs in BA networks. Without a
strong hub, the shortest paths in the networks will be substantially increased. Similar phenomenon
can be also observed in real networks. 145
Result of model “link rewiring”
Besides link addition and removal, link rewiring is a common way to manipulate complex
networks. In this procedure, two links, namely a-b and c-d, are randomly selected. Then they are
modified as a-d and c-b. After rewiring, the degree of the nodes will stay exactly unchanged while
the network structure is randomized. This procedure can mimic the case where the data contain 150
accurate information about node degree, but the information of the network structure has some
noise. The influence of link rewiring on the Pearson correlation is shown in Fig. expected,
of the degree centrality stays always 1. Among other centralities, the eigenvector and
betweenness outperform the other two. The rewiring process in general influences less on than
adding and removing links. However, in the coauthorship network where the community structure 155
is obvious, the rewiring process also causes serious change on as the link addition and
removal.
- 5 -
中国科技论文在线
The influence of rewiring links on the Pearson correlation between the true measure CT and manipulated
measure CM 160
Note:For SW networks, N = 500, ⟨k⟩ = 12. p = 0:2. For BA networks, N = 500, ⟨k⟩ = 12. The results are obtained
from 50 independent realizations.
In and , the manipulation of links is based on random selection. We further
consider a more complicated case where the manipulation is biased to node degree. This is 165
actually common in reality. As we discussed above, in real social systems, some fake accounts are
created to boost up the popularity of some specific nodes. In lab experiment, some measures may
also have some bias, such that some nodes receive or miss more links than , in
Fig. 3, we consider a biased link addition procedure to model this procedure is
simple. To add a link, the first node i is randomly selected from the existing nodes, but the second 170
node j is selected with probability proportional to jk
(note that j stands for the nodes that are not
yet connected to i). When < 0, the new links will be more likely to be added to small degree
nodes, and vice versa. The results in Fig. 3 show that actually significantly affects the results.
In SW networks, keeps increasing with . In BA, of degree increases with while of
closeness and eigenvector decrease with . In the coauthorship network, of degree and 175
k-shell increase with . However, the effect is minor. In the Neural network, the most strong
effect happens on of k-shell method. If links are added to small degree nodes, the k-shell value
of these small degree nodes will be largely increased since the increment of connectivity
immediately forms many triangle and square on these small degree nodes, and thus increases their
k-shell values. 180
- 6 -
中国科技论文在线
Fig. 3 The influence of biased adding links on the Pearson correlation between the true measure CT and
manipulated measure CM.
Note: is the parameter controlling the bias towards nodes’ degree when links are added. In this figure, 30% of
links are added. For SW networks, N = 500, ⟨k⟩ = 12. p = 0:2. For BA networks, N = 500, ⟨k⟩ =12. The results are 185
obtained from 50 independent realizations.
Result of model “top part change”
In fact, in most cases the top part of the ranking is more important [24]. Therefore, for each
centrality, we only consider the top L fraction of the nodes in the centrality ranking, and only
calculate the values of these nodes before and after manipulation. 190
The Pearson correlation between the top L fraction of nodes in the true measure CT and manipulated
measure CM.
Note: In this figure, 30% of links are randomly SW networks, N = 500, ⟨k⟩ = 12. p = 0:2. For BA
networks, N = 500, ⟨k⟩ = 12. The results are obtained from 50 independent realizations. 195
We show the dependence of on L in Fig. 4. One can see that _ decreases with L for BA
and real networks. It indicates that the top part of the centrality ranking is more resistant to
manipulation. In SW, increases with L for BA and real networks. This is because the degree
- 7 -
中国科技论文在线
distribution in SW is homogeneous. The difference of the high centrality nodes and low centrality 200
nodes are not big, such that a small change in the low centrality nodes can make them jump to the
top part of the centrality ranking.
Tab. 1Caption of Tablethe Pearson correlation between the true measure CT and manipulated measure CM
Networks properties degree betweenness closeness eigenvector k-shell
N E Add delete Add delete Add delete Add delete Add delete
Dolphins 62 159
Word 112 425
Jazz 198 2742
E. coli 230 695
Neural 297 2148
Coauthorship 379 914
Email 1133 5451
TAP 1373 6833
Y2H 1458 1948
Note: when links are manipulated in different real networks. For both link addition and removal, 30% links 205
are manipulated. The results in this table are obtained from 50 independent centrality with the best
performance is highlighted in bold.
Finally, we present the results on real network in table 1. Again, we can see that the
phenomenon we observed in the four networks above can be seen in this table. For example,the 210
asymmetric phenomenon is still present when we compare _ of adding links and removing links.
When different methods are compared, we can see that the eigenvector centrality is the most
robust one. The value of this method stays over in most cases. Moreover, we observe in this
table that the denser network (. networks with higher connectivity) tend to have more robust
centrality. 215
5 Conclusion
In this paper, we focus on the robustness of node centrality against network
consider adding, removing and rewiring links. Several well-known centrality
measures are used. We find that the centrality measurements are generally more robust in
heterogenous networks. Removing links usually results in a more significant change in centrality 220
(in the paper, we refer it as the asymmetric phenomenon). The effect of network manipulation will
be amplified if the procedure is biased to small degree nodes. Moreover, the top part of the
centrality ranking is more resistant to manipulation. Among the centrality measures we considered,
the eigenvector centrality is generally the most robust one.
Our work actually raises more questions that needed to be addressed in the near 225
results suggest that the network properties will be seriously altered if there are some noisy links or
missing links in the networks. Therefore, how to remove such effect and make the centrality of
nodes the same as the original networks is an important question. In addition, our results indicate
that if a centrality measure can well separate the centrality score of nodes, it in general more
robust. This in fact shed some light to design a more reliable centrality measure in the future. 230
Acknowledgements (Optional)
This work is supported by the National Natural Science Foundation of China under Grant
Nos. 61374175 and 61174150, and the major project of National Social Science
Fund(12&ZD217).
235
- 8 -
中国科技论文在线
References
[1] Berners-Lee T, Hall W, Hendler J, et al. Creating a Science of the ,313(5788):
769-771(2006).
[2] Wang J W, Rong L L. Cascade-based attack vulnerability on the US power grid. Safety 240
Science, 47(10): 1332-1336(2009).
[3] Barabasi A L, Oltvai Z N. Network biology: understanding the cell's functional organization.
Nature Reviews Genetics,5(2): 101-113(2004).
[4] Schweitzer F, Fagiolo G, Sornette D, et al. Economic networks: The new challenges. sci-
ence, 325(5939): 422(2009). 245
[5] Milgram S. The small world today,2(1): 60-67(1967).
[6] Boccaletti S, Latora V, Moreno Y, et al. Complex networks: Structure and dynamics[J].
Physics reports, 2006, 424(4): 175-308.
[7] Kitsak M, Gallos L K, Havlin S, et al. Identification of influential spreaders in complex
networks[J]. Nature Physics, 2010, 6(11): 888-893. 250
[8] Sabidussi G. The centrality index of a graph[J]. Psychometrika, 1966, 31(4): 581-603.
[9] Colizza V, Flammini A, Serrano M A, et al. Detecting rich-club ordering in complex
networks[J]. Nature physics, 2006, 2(2): 110-115.
[10] Newman M E J. Assortative mixing in networks[J]. Physical review letters, 2002, 89(20):
208701. 255
[11] Schneider C M, Moreira A A, Andrade J S, et al. Mitigation of malicious attacks on
networks[J]. Proceedings of the National Academy of Sciences, 2011, 108(10): 3838-3841.
[12] G o´mez-Garde˜nes J, G o´mez S, Arenas A, et al. Explosive synchronization transitions in
scale-free networks[J]. Physical review letters, 2011, 106(12): 128701.
[13] Caldarelli G, Cristelli M, Gabrielli A, et al. A network analysis of countries'export flows: 260
firm grounds for the building blocks of the economy[J]. PloS one, 2012, 7(10): e47278.
[14] Hidalgo C A, Blumm N, Barab a´si A L, et al. A dynamic network approach for the study
of human phenotypes[J]. PLoS computational biology, 2009, 5(4): e1000353.
[15] S. Battiston, M. Puliga, R. Kaushik, P. Tasca and G. Caldarelli, Sci. Rep. 2, 541 (2012).
[16] Freeman L C. A set of measures of centrality based on ,35- 265
41(1977).
[17] Krackhardt D. Assessing the political landscape: Structure, cognition, and power in organizations.
Administrative Science Quarterly,342-369(1990).
[18] Ruhnau B. Eigenvector-centrality - a networks, 22(4): 357-
365(2000). 270
[19] Page L, Brin S, Motwani R, et al. The PageRank citation ranking: Bringing order to the
web.(1999).
[20] Carmi S, Havlin S, Kirkpatrick S, et al. A model of Internet topology using k-shell decomposition.
Proceedings of the National Academy of Sciences,104(27): 11150-11154(2007).
[21] Dangalchev C. Residual closeness in networks[J]. Physica A: Statistical Mechanics and its 275
Applications, 2006, 365(2): 556-564.
[22] Barthelemy M. Betweenness centrality in large complex networks[J]. The European Physical
Journal B-Condensed Matter and Complex Systems, 2004, 38(2): 163-168.
[23] L u¨L, Zhang Y C, Yeung C H, et al. Leaders in social networks, the delicious case[J]. PloS
one, 2011, 6(6): e21202. 280
[24] Yao L, Wei T, Zeng A, et al. Ranking scientific publications: the effect of nonlinearity[J].
Scientific reports, 2014, 4.
[25] A. Zeng and C.-J. Zhang, Phys. Lett. A, 377, 1031 (2013).
[26] Guimera R, Sales-Pardo M, Amaral L A N. Modularity from fluctuations in random graphs
and complex networks[J]. Physical Review E, 2004, 70(2): 025101. 285
[27] Laureti P, Moret L, Zhang Y C, et al. Information filtering via iterative refinement[J]. EPL
(Europhysics Letters), 2006, 75(6): 1006.
[28] Schneider C M, Moreira A A, Andrade J S, et al. Mitigation of malicious attacks on
networks[J]. Proceedings of the National Academy of Sciences, 2011, 108(10): 3838-3841.
[29] Platig J, Ott E, Girvan M. Robustness of network measures to link errors[J]. Physical 290
Review E, 2013, 88(6): 062812.
[30] Watts D J, Strogatz S H. Collective dynamics of 'small-world'
,393(6684): 440-442(1998).
[31] Barab a´si A L, Albert R. Emergence of scaling in random ,286(5439): 509-
512(1999). 295
[32] Newman M E J. Finding community structure in networks using the eigenvectors of matrices.
Physical review E, 74(3): 036104(2006).
[33] Duch J, Arenas A. Community detection in complex networks using extremal optimization.
Physical review E,72(2): 027104(2005).
[34] Lusseau D, Schneider K, Boisseau O J, et al. The bottlenose dolphin community of Doubtful 300
Sound features a large proportion of long-lasting Ecology and
- 9 -
中国科技论文在线
Sociobiology,54(4): 396-405(2003).
[35] Gleiser P M, Danon L. Community structure in in complex systems,6(04):
565-573(2003).
[36] Jeong H, Tombor B, Albert R, et al. The large-scale organization of metabolic networks. 305
Nature,407(6804): 651-654(2000).
[37] Guimera R, Danon L, Diaz-Guilera A, et al. Self-similar community structure in a network
of human review E,68(6): 065103(2003).
[38] Gavin A C, B?sche M, Krause R, et al. Functional organization of the yeast proteome by
systematic analysis of protein ,415(6868): 141-147(2002). 310
[39] Jeong H, Mason S P, Barab a´si A L, et al. Lethality and centrality in protein networks.
Nature, 411(6833): 41-42(2001).
[40] Santos F C, Pacheco J M, Lenaerts T. Evolutionary dynamics of social dilemmas in structured heterogeneous
of the National Academy of Sciences of the United States of America ,103(9):
3490-3494(2006). 315
实证网络中节点中心性的鲁棒性研究
牛琪锴1,曾安2,樊瑛1,狄增如1
(1. 北京师范大学系统科学学院,北京 100875; 320
2. 瑞士弗里堡大学物理学系,瑞士)
摘要:节点中心性是复杂网络的一个重要方向,涉及到很多应用领域,比如网络结构动力学,
网络控制等。在很多研究工作中,大量精力被花费在设计新的中心性指标。然而,目前已有的
中心性指标的可靠性还没有完全理解。许多真实网络,正面临着各种各样的操作,如添加、
删除或交叉重连的操作。在本文中,我们关注的是网络操作对传统的中心性指标的影响。我325
们的分析是基于人工和实际网络。我们发现中心性指标通常在异构网络更鲁棒。此外,中心
性指标排在前面的节点往往有更强的鲁棒性。在所以的中心性指标中,特征向量中心性一般
是最鲁棒的。
关键词:节点中心性;鲁棒性;实际网络
中图分类号:N94 330