Volume 48 Issue 3
May  2019
Article Contents

WANG Kai, LIU Shu-xin, YU Hong-tao, LI Xing. Predicting Missing Links of Complex Network via Effective Common Neighbors[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(3): 432-439. doi: 10.3969/j.issn.1001-0548.2019.03.020
Citation: WANG Kai, LIU Shu-xin, YU Hong-tao, LI Xing. Predicting Missing Links of Complex Network via Effective Common Neighbors[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(3): 432-439. doi: 10.3969/j.issn.1001-0548.2019.03.020

Predicting Missing Links of Complex Network via Effective Common Neighbors

doi: 10.3969/j.issn.1001-0548.2019.03.020
More Information
  • Corresponding author: 刘树新, E-mail:liushuxin11@126.com
  • Received Date: 2018-03-21
  • Rev Recd Date: 2018-11-15
  • Publish Date: 2019-05-30
  • Link prediction can predict the missing links of complex networks, which promotes a better understanding of evolution mechanisms in real networks. Many similarity indices have been proposed based on a topology structure for link prediction. Local topological information, especially common neighbors, plays an important role in calculating the similarity between two endpoints. However, plenty of similarity indices ignore the effectiveness of common neighbors under different topology structures. Considering the local topological information around common neighbors, an effective common neighbor index is proposed. Firstly, we analyze the effectiveness of all neighbor links of common neighbors. Then, based on the local topology on both sides of two endpoints around common neighbors, the effectiveness of two sides of common neighbors is quantified separately. Finally, the similarity between two endpoints is described through the effect of common neighbors' effectiveness on bilateral resource allocation process. Empirical study on 15 real networks shows that the index proposed can achieve higher prediction accuracy, compared with 9 mainstream baselines.
  • 加载中
  • [1] BARABÁSI A L. Scale-free networks:A decade and beyond[J]. Science, 2009, 325(5939):412-413. doi:  10.1126/science.1173299
    [2] LI K, LI H J, WANG H. Situation analysis of relationship in social networks based on link entropy[J]. Modern Physics Letters B, 2015, 29(13):1550061. doi:  10.1142/S021798491550061X
    [3] BULLMORE E, SPORNS O. Complex brain networks:Graph theoretical analysis of structural and functional systems[J]. Nature Reviews Neuroscience, 2009, 10(3):186-198. doi:  10.1038/nrn2575
    [4] SCHWEITZER F, FAGIOLO G, SORNETTE D, et al. Economic networks:The new challenges[J]. Science, 2009, 325(5939):422-425. doi:  10.1126/science.1173644
    [5] LIN J, BAN Y. Complex network topology of transportation systems[J]. Transport Reviews, 2013, 33(6):658-685. doi:  10.1080/01441647.2013.848955
    [6] WANG P, XU B W, WU Y R, et al. Link prediction in social networks:The state-of-the-art[J]. Science China Information Sciences, 2015, 58(1):1-38. http://d.old.wanfangdata.com.cn/OAPaper/oai_arXiv.org_1303.7264
    [7] VON M C, JENSEN L J, SNEL B, et al. STRING:Known and predicted protein-protein associations, integrated and transferred across organisms[J]. Nucleic Acids Research, 2005, 33(suppl.1):433-437. http://d.old.wanfangdata.com.cn/Periodical/jsjkxjsxb-e201706006
    [8] SCELLATO S, NOULAS A, MASCOLO C. Exploiting place features in link prediction on location-based social networks[C]//Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego, CA, USA: ACM, 2011: 1046-1054.
    [9] HOLLAND P W, LASKEY K B, LEINHARDT S. Stochastic blockmodels:First steps[J]. Social Networks, 1983, 5(2):109-137. doi:  10.1016/0378-8733(83)90021-7
    [10] LIU S, XU X F, XIE R G, et al. Anomalous heat conduction and anomalous diffusion in low dimensional nanoscale systems[J]. The European Physical Journal B, 2012, 85(10):337. doi:  10.1140/epjb/e2012-30383-8
    [11] LÜ L, ZHOU T. Link prediction in complex networks:A survey[J]. Physica A:Statistical Mechanics and its Applications, 2011, 390(6):1150-1170. doi:  10.1016/j.physa.2010.11.027
    [12] LORRAIN F, WHITE H C. Structural equivalence of individuals in social networks[J]. Social Networks. 1977(1):67-98.
    [13] ZHOU T, LÜ L, ZHANG Y C. Predicting missing links via local information[J]. The European Physical Journal B, 2009, 71(4):623-630. doi:  10.1140/epjb/e2009-00335-8
    [14] ADAMIC L A, ADAR E. Friends and neighbors on the web[J]. Social Networks, 2003, 25(3):211-230. doi:  10.1016/S0378-8733(03)00009-1
    [15] LÜ L, JIN C H, ZHOU T. Similarity index based on local paths for link prediction of complex networks[J]. Physical Review E, 2009, 80(4):046122. doi:  10.1103/PhysRevE.80.046122
    [16] LIU S, JI X, LIU C, et al. Extended resource allocation index for link prediction of complex network[J]. Physica A:Statistical Mechanics and its Applications, 2017, 479:174-183. doi:  10.1016/j.physa.2017.02.078
    [17] LIU S, JI X, LIU C, et al. Similarity indices based on link weight assignment for link prediction of unweighted complex networks[J]. International Journal of Modern Physics B, 2017, 31(2):1650254. doi:  10.1142/S0217979216502544
    [18] YAO Y, ZHANG R, YANG F, et al. Link prediction based on local weighted paths for complex networks[J]. International Journal of Modern Physics C, 2017, 28(4):1750053. doi:  10.1142/S012918311750053X
    [19] MA Y, CHENG G, LIU Z, et al. Clustering-based link prediction in scientific coauthorship networks[J]. International Journal of Modern Physics C, 2017, 28(06):1750082. doi:  10.1142/S0129183117500826
    [20] FENG X, ZHAO J C, XU K. Link prediction in complex networks:A clustering perspective[J]. The European Physical Journal B, 2012, 85(1):1-3. doi:  10.1140/epjb/e2011-20818-1
    [21] LÜ L, PAN L, ZHOU T, et al. Toward link predictability of complex networks[J]. Proceedings of the National Academy of Sciences, 2015, 112(8):2325-2330. doi:  10.1073/pnas.1424644112
    [22] KATZ L. A new status index derived from sociometric analysis[J]. Psychometrika, 1953, 18(1):39-43. doi:  10.1007/BF02289026
    [23] LEICHT E A, HOLME P, NEWMAN M E J. Vertex similarity in networks[J]. Physical Review E, 2006, 73(2):026120. doi:  10.1103/PhysRevE.73.026120
    [24] KLEIN D J, RANDIĆ M. Resistance distance[J]. Journal of Mathematical Chemistry, 1993, 12(1):81-95. doi:  10.1007/BF01164627
    [25] FOUSS F, PIROTTE A, RENDERS J M, et al. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation[J]. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(3):355-369. doi:  10.1109/TKDE.2007.46
    [26] CANNISTRACI C V, ALANIS-LOBATO G, RAVASI T. From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks[J]. Scientific Reports, 2013, 3:1613-1618. doi:  10.1038/srep01613
    [27] HERLOCKER J L, KONSTAN J A, TERVEEN L G, et al. Evaluating collaborative filtering recommender systems[J]. ACM Transactions on Information Systems (TOIS), 2004, 22(1):5-53. doi:  10.1145/963770
    [28] GLEISER P M, DANON L. Community structure in jazz[J]. Advances in Complex Systems, 2003, 6(4):565-573. doi:  10.1142/S0219525903001067
    [29] WATTS D J, STROGATZ S H. Collective dynamics of 'small-world' networks[J]. Nature, 1998, 393(6684):440-445. doi:  10.1038/30918
    [30] ZHANG Q M, LÜ L, WANG W Q, et al. Potential theory for directed networks[J]. PloS One, 2013, 8(2):e55437. http://d.old.wanfangdata.com.cn/OAPaper/oai_arXiv.org_1202.2709
    [31] BATAGELJ V, MRVAR A. Pajek-program for large network analysis[J]. Connections, 1998, 21(2):47-57. http://cn.bing.com/academic/profile?id=ae97bbc52dc8eb50567885156695e92f&encoded=0&v=paper_preview&mkt=zh-cn
    [32] LÜ L, PAN L, ZHOU T, et al. Toward link predictability of complex networks[J]. Proceedings of the National Academy of Sciences, 2015, 112(8):2325-2330. doi:  10.1073/pnas.1424644112
    [33] DUCH J, ARENAS A. Community detection in complex networks using extremal optimization[J]. Physical Review E, 2005, 72(2):027104. doi:  10.1103/PhysRevE.72.027104
    [34] MICHALSKI R, PALUS S, KAZIENKO P. Matching organizational structure and social network extracted from email communication[C]//International Conference on Business Information Systems. Berlin, Heidelberg: Springer, 2011: 197-206.
    [35] BU D, ZHAO Y, CAI L, et al. Topological structure analysis of the protein-protein interaction network in budding yeast[J]. Nucleic Acids Research, 2003, 31(9):2443-2450. doi:  10.1093/nar/gkg340
    [36] ADAMIC L A, GLANCE N. The political blogosphere and the 2004 US election: Divided they blog[C]//Proceedings of the 3rd International Workshop on Link Discovery. Chicago, USA: ACM, 2005: 36-43.
    [37] OPSAHL T, AGNEESSENS F, SKVORETZ J. Node centrality in weighted networks:Generalizing degree and shortest paths[J]. Social Networks, 2010, 32(3):245-251. doi:  10.1016/j.socnet.2010.03.006
    [38] ULANOWICZ R E, DEANGELIS D L. Network analysis of trophic dynamics in south florida ecosystems[J]. US Geological Survey Program on the South Florida Ecosystem, 2005, 114:45-48.
    [39] SPRING N, MAHAJAN R, WETHERALL D. Measuring ISP topologies with Rocketfuel[J]. ACM SIGCOMM Computer Communication Review, 2002, 32(4):133-145. doi:  10.1145/964725
    [40] NEWMAN M E J. Assortative mixing in networks[J]. Physical Review Letters, 2002, 89(20):208701. doi:  10.1103/PhysRevLett.89.208701
  • 加载中
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Figures(2)  / Tables(2)

Article Metrics

Article views(4240) PDF downloads(129) Cited by()

Related
Proportional views

Predicting Missing Links of Complex Network via Effective Common Neighbors

doi: 10.3969/j.issn.1001-0548.2019.03.020

Abstract: Link prediction can predict the missing links of complex networks, which promotes a better understanding of evolution mechanisms in real networks. Many similarity indices have been proposed based on a topology structure for link prediction. Local topological information, especially common neighbors, plays an important role in calculating the similarity between two endpoints. However, plenty of similarity indices ignore the effectiveness of common neighbors under different topology structures. Considering the local topological information around common neighbors, an effective common neighbor index is proposed. Firstly, we analyze the effectiveness of all neighbor links of common neighbors. Then, based on the local topology on both sides of two endpoints around common neighbors, the effectiveness of two sides of common neighbors is quantified separately. Finally, the similarity between two endpoints is described through the effect of common neighbors' effectiveness on bilateral resource allocation process. Empirical study on 15 real networks shows that the index proposed can achieve higher prediction accuracy, compared with 9 mainstream baselines.

WANG Kai, LIU Shu-xin, YU Hong-tao, LI Xing. Predicting Missing Links of Complex Network via Effective Common Neighbors[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(3): 432-439. doi: 10.3969/j.issn.1001-0548.2019.03.020
Citation: WANG Kai, LIU Shu-xin, YU Hong-tao, LI Xing. Predicting Missing Links of Complex Network via Effective Common Neighbors[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(3): 432-439. doi: 10.3969/j.issn.1001-0548.2019.03.020
  • With the development of complex network research, many more real complex systems have been described as complex network[1-5]. Link prediction aims to predict the likelihood that a link exists between two endpoints in complex networks[6]. Predicting missing links plays an important role in discovering the unknown connections in protein-protein interaction networks[7], forecasting the future relationships in online social networks[8], and identifying spurious flights in air transportation networks[9].

    Based on evolution mechanisms of complex networks, a lot of link prediction methods have been proposed. Among these methods, topology-based similarity indices have got a great success due to their high performance and low complexity[10]. Similarity indices can be classified as local indices and global indices according to complexity and length of paths[11]. Considering the local information including common neighbors and short paths, local indices, such as common neighbor (CN) index[12]directly using the number of common neighbors, resource allocation (RA) index[13]and Adamic-Adar (AA) index[14] weighting the common neighbors by their node degree, local path (LP) index[15]and extended resource allocation (ERA)[16]index adding paths with length 3 to CN and RA respectively, perform very well in real networks with lower complexity. Ref.[17] proposed a local weighted method, which can improve the prediction accuracy of several similarity indices (including CN, AA, RA, Salton, Jaccard and LP). Similarly, Ref.[18]proposed a local weighted path (LWP) index to differentiate the contributions between paths. Considering the clustering coefficient of nodes, some clustering-based methods[19-20]have been proposed to improve the prediction accuracy. Ref.[21]believed that the regularity of a network was reflected in the consistency of structural features before and after a random removal of a small set of links and proposed a universal structural consistency index. To improve the prediction accuracy, many global indices have been proposed based on global paths, including Katz index[22], LHN-II[23], average commute time (ACT)[24] and cosine similarity time (Cos+)[25]. The global indices can achieve a good performance, but most of them are not suitable for large networks due to their higher complexity. Therefore, how to effectively use the local topological information may be a good way to predict the missing links of large-scale networks.

    In the real world, such as human relationship network, if there are more indirect connections between two people and their common friends, they are more likely to be friend. Considering two pairs of endpoints x, y and x', y' with a common neighbor. There are two longer paths between x', y', and two indirect connections between x, y and their common neighbor. According to CN, RA, AA and LP index, the similarity between two pairs of nodes are the same, though the links between endpoints x and y are more likely to exist. Because there are more indirect connections between x, y and their common neighbor, common neighbor of x and y is more effective in similarity calculating. In a word, the topology information related to common neighbors plays an important role in link prediction.

    Based on the above discussion, an effective common neighbor index (ECN) is proposed, which weights the common neighbors by the local topology information around them. With a parameter adjusting the strength of effective common neighbors, ECN can achieve the optimal prediction accuracy for different networks. Empirical study has shown that the ECN index proposed can significantly improve the prediction accuracy, compared with 9 similarity indices in 15 real networks.

  • Several classical similarity indices are introduced here, including 5 local indices: CN, RA, AA, CAR and LP index, and 4 global indices: Katz, ACT, Cos+ and SPM index. A brief description is shown as follows:

    1) CN index[12]measures the similarity of two endpoints by the number of their common neighbors:

    where ${\mathit{\Gamma}} (x)$is the set of neighbors of node x, and ${\mathit{\Gamma}} (x) \cap {\mathit{\Gamma}} (y)$denotes the common neighbors between nodes x and y.

    2) RA index[13]publishes the common neighbors with big degree:

    where ${k_z}$ is the node degree of node $z$.

    3) AA index[14]weights the common neighbors by nodes degree:

    4) CAR index[26]believes that two nodes are more likely to link together if their common-first-neighbors are members of a strongly inner-linked cohort:

    where $\left| {\gamma (z)} \right|$refers to the sub-set of neighbors of $z$ that are also common neighbors of nodes$x$and$y$.

    5) LP index[15]adds the path information with length 3 to CN:

    where $\alpha $ is the parameter and A is the adjacency matrix.

    6) Katz index[22]considers all the path information between two endpoints and gives more weights to the shorter paths:

    where ${\rm{path}}_{xy}^l$ is the set of paths with length l between x and y, and $\varepsilon $ is the adjust parameter.

    7) ACT[24]measures similarity by calculating the average steps of random walk between two endpoints:

    where $l_{xy}^ + $ is the corresponding entry in ${{\mathit{\boldsymbol{L}}}^{\mathit{\boldsymbol{ + }}}}$, and ${{\mathit{\boldsymbol{L}}}^{\mathit{\boldsymbol{ + }}}}$ denotes the pseudo-inverse of matrix ${\mathit{\boldsymbol{L}}} = {\mathit{\boldsymbol{D}}} - {\mathit{\boldsymbol{A}}}$.

    8) Cos+[25]is based on ${{\mathit{\boldsymbol{L}}}^{\mathit{\boldsymbol{ + }}}}$ by calculating similarity of two vectors:

    9) SPM[21]measures the structural consistency of a network by perturbing its adjacency matrix and observing the change of eigenvalues provided the fixed eigenvectors:

    where ${\lambda _k}$ and ${x_k}$ are the eigenvalue and the corresponding orthogonal and normalized eigenvector for${A^{\rm{R}}}$, respectively.

  • Considering an unweighted undirected network G(V, E), V and E are the sets of nodes and links respectively. A score ${s_{xy}}$ is assigned to each pair of nodes x and y by prediction algorithms and it can be a measure of similarity between two endpoints. For each nonexistent link of network, this score represents the likelihood that the link exists.

    Definition 1 Considering a pair of nodes, $x,y \in V$, node z is one of their common neighbors. Among all the links connected to node z, if there are more links on the local paths (here, length $l \leqslant 2$) between endpoints and node z, it will be more effective for common neighbor z in calculating the similarity of two endpoints. From the influence of paths connected to node x, the effectiveness of common neighbors between endpoints x and y can be defined as:

    where ${c_{xz}}$ denotes the number of different paths between node x and z with length no more than 2, and $\delta $ is the adjust parameter.

    Definition 2 On an unweighted undirected network G(V, E). The effective common neighbor index of endpoints x and y, is defined as:

    Here, the $\delta (\delta \geqslant 0)$ aims to adjust the strength of effectiveness. When $\delta {\rm{ = }}0$, the ECN index is the same as the CN index 12. TakeFig.1for example, the similarity of two endpoints is $s_{xy}^{{\rm{ECN}}} = ({3^\delta } + {2^\delta })/{9^\delta }$.

    Figure 1.  Effectiveness of common neighbors between two endpoints.

  • To quantify the prediction accuracy, the standard metric called precision[27]is introduced. Precision is defined as the ratio of relevant links to the top L predicted links (here we set L=100). If there are m relevant links appeared in the probe set, the precision value is:

    Clearly, the higher precision value means higher prediction accuracy.

    Our experiments are performed on fifteen different real networks: 1) Jazz[28]:the network of Jazz musicians. 2) Caenorhabditis elegans (CE)[29]: the neural network of the nematode worm. 3) Kohonen[30]: the citing network of articles with topic "self-organizing maps" or references to "Kohonen T". 4) USAir[31]: the USA airline network. 5) Hamster[32]: a friendship network of users on the website hamsterster.com. 6) Metabolic[33]: the metabolic network of Caenorhabditis elegans. 7) Email network (Email)[34]: the internal email communication network between employees of a mid-sized manufacturing company. 8) Yeast PPI (Yeast)[35]: a protein-protein interaction network of yeast. 9) Political blogs (PB)[36]: a network of US political blogs. 10) Openflights (Flight)[37]: the network of flights between air-ports of the world. 11) Scientometrics (SciMet)[30]: the citing network of the articles from scientometrics. 12) Food Web of Everglades ecosystem (FWEW)[38]: the network of carbon exchanges occurring during the wet season in the Everglades Graminoid Marshes of South Florida. 13) Route topology of internet (Router)[39]: the topology of internet in router-level. 14) Power Grid (Power)[29]: an electrical power grid network of western US; 15) Food Web of Florida Bay ecosystem (FWFB)[38]: the network of carbon exchanges occurring during the wet season in Florida Bay.

    Table 1 shows the basic topological features of these networks. $\left| V \right|$is the number of nodes, $\left| E \right|$ denotes the number of links, $\left\langle k \right\rangle $ indicates the average degree. $\left\langle d \right\rangle $denotes the average distance. r is the assortativity coefficient[40]. C represents the clustering coefficient[29]. H is the degree heterogeneity.

    Data $\left| V \right|$ $\left| E \right|$ $\langle k\rangle $ $\langle d\rangle $ r C H
    Jazz 198 2 742 27.7 2.24 0.02 0.633 1.4
    CE 297 2 148 14.46 2.46 -0.163 0.308 1.8
    Kohonen 3 704 12 683 3.4 5.64 -0.121 0.252 11.2
    USAir 332 2 128 12.81 2.74 -0.208 0.749 3.36
    Hamster 1 858 12 534 13.49 3.39 -0.085 0.090 3.36
    Metabolic 453 2 025 8.94 2.67 -0.226 0.647 4.49
    Email 167 5 784 69.26 1.87 -0.295 0.541 1.66
    Yeast 2 375 11 693 9.85 5.10 0.469 0.378 3.48
    PB 1 222 16 717 27.36 2.74 -0.221 0.361 2.97
    Flight 2 939 30 501 20.75 4.18 0.051 0.255 5.22
    SciMet 3 084 10 399 6.74 4.18 -0.033 0.151 2.78
    FWEW 69 880 25.51 1.64 -0.298 0.552 1.27
    Router 5 022 6 258 2.49 5.99 -0.138 0.033 5.5
    Power 4 941 6 594 2.67 18.9 0.003 0.107 1.45
    FWFB 128 2 075 32.42 1.78 -0.112 0.335 1.24

    Table 1.  Basic topological features of real networks

    Each original data is randomly divided into training set containing 90% of links, and the probe set containing the remaining 10%.

  • Firstly, we report the impact of adjust parameter$\delta $ on prediction accuracy in different networks, and each precision value is the average of 20 realizations. As shown in Fig. 2, all the precision curves can achieve the optimal values when $\delta > 1$, and each of them exceeds the precision value at $\delta = 1$ without enhancement. It demonstrates that higher effective neighbors have a much greater influence on the similarity between endpoints than expected and the contribution of different effective common neighbors for similarity is not linear (for different networks, its power exponent is the optimal $\delta $). The precision values show a rising trend with an increase of $\delta $ before reaching the highest point, which indicates that the effective common neighbors can improve the prediction accuracy compared with CN (when $\delta = 0$). The measurement of effective common neighbors and their enhancement are both beneficial to the description of similarity between two endpoints. For some of datasets, the precision curves decline gradually after peak point. However, the precision values are stable around the maximum value in some networks such as metabolic, Email, Yeast, FWEW, Router and FWFB. In these networks, the role that the lower effective common neighbors play is very limited in link prediction of missing links.

    Table 2 shows the comparison between ECN index (optimal values) and 9 similarity indices in 15 networks, and each precision data is the average of 20 realizations. In 10 out of 15 datasets, ECN index obtains the best performance, but worse than the SPM index in Jazz, Hamster, PB, FWEW and FWFB. For some lower clustering network such as Hamster and Router, the improvement of precision is significant for ECN index, which indicates that common neighbors with different topological structures perform different roles in calculating the similarity between endpoints in these lower clustering networks. With the consideration of longer path information, the precision values of path-based indices (LP and Katz) are higher than neighbor-based indices (CN, RA and AA) in most of dataset. However, in some high clustering network such as Jazz, USAir, Metabolic and Email, the neighbor-weighted indices (RA and AA) perform better than most of the global indices. Also, the local- community-based index CAR can perform better than some global indices in Jazz, Kohonen, PB, Flight and SciMet. Having considered the effectiveness of common neighbors, ECN index can get a good performance by fusing some topology information used in neighbor-weighted indices and path-based indices. With the strength parameter, ECN index enhance the performance of effectiveness by adjusting the influence of different effective common neighbors on similarity in different networks. Having considered the structural consistency of a network, SPM performs very well for all the datasets. Surprisingly, some global indices (ACT and Cos+) perform worse than expected, and their precision values are close to zero in SciMet, Router, and Power networks. To summarize, the common neighbors between endpoints seem to be more effective for similarity calculating under the standard metric precision, and the consideration of local information around common neighbors can significantly improve the precision value.

    Figure 2.  Precision curves of ECN index with the change of parameter for 15 real networks. Each precision point is the average of 20 realizations

    Data CN RA AA CAR LPa LPb Katza Katzb ACT Cos+ SPM ECN
    Jazz 0.810 0.809 0.830 0.851 0.798 0.775 0.798 0.740 0.258 0.356 0.895 0.861
    CE 0.129 0.131 0.137 0.129 0.138 0.138 0.138 0.138 0.064 0.069 0.207 0.210
    Kohonen 0.157 0.145 0.154 0.211 0.159 0.159 0.159 0.155 0.010 0.020 0.089 0.304
    USAir 0.611 0.636 0.632 0.612 0.615 0.612 0.615 0.604 0.492 0.104 0.589 0.676
    Hamster 0.015 0.004 0.011 0.037 0.017 0.054 0.017 0.075 0.089 0.020 0.898 0.186
    Metabolic 0.206 0.305 0.253 0.195 0.201 0.200 0.201 0.200 0.120 0.110 0.445 0.456
    Email 0.724 0.733 0.727 0.691 0.726 0.722 0.726 0.714 0.633 0.637 0.809 0.889
    Yeast 0.668 0.495 0.693 0.670 0.671 0.733 0.670 0.721 0.565 0.239 0.699 0.876
    PB 0.431 0.253 0.389 0.472 0.436 0.465 0.436 0.467 0.127 0.342 0.669 0.522
    Flight 0.511 0.354 0.443 0.631 0.521 0.559 0.522 0.554 0.344 0.030 0.127 0.646
    SciMet 0.149 0.130 0.147 0.165 0.149 0.150 0.149 0.147 0.000 0.000 0.079 0.203
    FWEW 0.147 0.163 0.154 0.142 0.158 0.189 0.158 0.197 0.271 0.000 0.543 0.387
    Router 0.110 0.092 0.113 0.109 0.132 0.132 0.131 0.130 0.000 0.019 0.015 0.306
    Power 0.130 0.071 0.097 0.131 0.148 0.148 0.146 0.146 0.000 0.086 0.017 0.178
    FWFB 0.090 0.087 0.086 0.089 0.096 0.133 0.096 0.146 0.178 0.030 0.718 0.378
    aParameter $\alpha {\rm{ = 0}}{\rm{.001}}$. bParameter$\alpha {\rm{ = 0}}{\rm{.01}}$.

    Table 2.  Comparison of the precision value between ECN (optimal value) and some similarity indices. Each value is the average of 20 realizations, each of which corresponds to an independent division of ${E^T}$and ${E^P}$

  • There are a lot of similarity indices considering local information between endpoints such as common neighbors and shorter paths. However, most of them ignore the effectiveness of common neighbors. Based on local information, an ECN index is proposed, which weights the common neighbors by the local topology information around them. Empirical results demonstrate that ECN index can significantly improve the prediction accuracy in 15 real networks, compared with 7 similarity indices. The measurement of effective common neighbors and their enhancement are both beneficial to the improvement of prediction accuracy or description of similarity between two endpoints. The ECN index is more suitable for the link prediction of networks concerning more about the prediction accuracy of the top L predicted links (such as protein-protein interaction networks). In addition, the complexity of ECN is between $O(N{\left\langle k \right\rangle ^2})$ (CN) and ${\rm{2}}O(N{\left\langle k \right\rangle ^2})$. Therefore, it can also be applied to predict missing links of large-scale networks.

Reference (40)

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return