Review of User Identification across Social Networks: The Complex Network Approach

XING Ling (邢玲); DENG Kai-kai (邓凯凯); WU Hong-hai (吴红海); XIE Ping (谢萍)

doi: 10.12178/1001-0548.2019182

School of Information Engineering, Henan University of Science and Technology, Luoyang, Henan 471023

Funding: National Natural Science Foundation of China (61771185, 61772175, 61801171); Program for Science and Technology Innovation Teams in Universities of Henan Province (21IRTSTHN015)

Author biography: XING Ling (1978-), female, PhD, professor. Her research covers multimedia semantic mining, social computing, and privacy protection. E-mail: xingling_my@163.com

CLC number: TP391

Abstract: A social network is a complex network with interactive characteristics. By exploiting the network properties of complex networks, nodes in different social networks can be linked and the connections between them analyzed. Combined with suitable matching algorithms, this makes it possible to effectively identify a user's virtual accounts on different social networks, which helps social networks provide better services to users. This paper presents a systematic review of the cross-social-network user identification techniques proposed in the data mining field over the past decade or so. It elaborates on the similarity computation methods for three types of user identification techniques and on a unified identification framework, evaluates the performance of the classified techniques using relevant evaluation metrics, and finally discusses future research directions for user identification across social networks.

Keywords: across social networks; complex network; data mining; entity user; user identification

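Figure 1. Analysis of the user identification problem across social networks

Figure 2. Schematic of the supervised learning method

Figure 3. Framework for user identification across social networks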

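Table 1. Performance evaluation of user identification techniques across social networks

User identification technique	Entity identifiability	Computational cost	Data missing	Difficulty of data acquisition
Based on user profile information	High	Medium	High	High
Based on network topology	High	High	Low	Low
Based on user-generated content	High	Low	Low	Medium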

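Table 2. Comparative analysis of user identification techniques across social networks

User identification technique	Main advantages	Main disadvantages	Representative methods
Based on user profile information	High entity identifiability; simple to implement	Data are often severely missing and easily forged	FOAF matching framework^[46], MADM^[48], UISN-UD model^[56]
Based on network topology	Data are relatively easy to obtain and complete	Network heterogeneity	De-anonymization algorithm^[59], COSNET model^[65], FRUI-P^[30]
Based on user-generated content	Little missing information; low computational cost	Some users lack usable behavioral information	Bayesian model^[71], MNA algorithm^[61], U-UIM^[35]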

[1]	LIU X Z, XIA T, YU Y Y, et al. Cross social media recommendation[C]//The International AAAI Conference on Web and Social Media. [S.l.]: AAAI, 2016: 1-10.
[2]	ZAFARANI R, TANG L, LIU H. User identification across social media[J]. ACM Transactions on Knowledge Discovery from Data, 2015, 10(2): 1-30.
[3]	LIU K, ZHANG L M, ZHOU L J. Survey of deep learning applied in information recommendation system[J]. Journal of Chinese Computer Systems, 2019, 40(4): 738-743.
[4]	CHEN Ling-jiao, CAI Shi-min, ZHANG Qian-ming, et al. Improved research on resource-allocation recommendation algorithm based on trust relationship[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(3): 449-455. doi: 10.3969/j.issn.1001-0548.2019.03.022 (in Chinese)
[5]	ZHONG E H, FAN W, YANG Q. User behavior learning and transfer in composite social networks[J]. ACM Transactions on Knowledge Discovery from Data, 2014, 8(1): 1-32.
[6]	SHEN W, WANG J Y, LUO P, et al. Linking named entities in Tweets with knowledge base via user interest modeling[C]//Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. [S.l.]: ACM, 2013: 68-76.
[7]	SHAO Peng, HU Ping. The influence mechanism of special members on opinion evolution of group members in complex network[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(4): 604-612. doi: 10.3969/j.issn.1001-0548.2019.04.019 (in Chinese)
[8]	WICKER S B. The loss of location privacy in the cellular age[J]. Communications of the ACM, 2012, 55(8): 60-68. doi: 10.1145/2240236.2240255
[9]	NARAYANAN A, SHMATIKOV V. Robust de-anonymization of large sparse datasets[C]//Proceedings of the IEEE Symposium on Security and Privacy. [S.l.]: IEEE, 2008: 111-125.
[10]	WANG Lu, MENG Xiao-feng. Location privacy preservation in big data era: A survey[J]. Journal of Software, 2014, 25(4): 693-712. (in Chinese)
[11]	CHEN Chen. Research on web crawler for web text mining[D]. Chengdu: University of Electronic Science and Technology of China, 2017. (in Chinese)
[12]	ZHU Jun-fang, CHEN Duan-bing, ZHOU Tao, et al. A survey on mining relatively important nodes in network science[J]. Journal of University of Electronic Science and Technology of China, 2019, 48(4): 595-603. doi: 10.3969/j.issn.1001-0548.2019.04.018 (in Chinese)
[13]	LI Y J, LIU B. A normalized Levenshtein distance metric[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6): 1091-1095. doi: 10.1109/TPAMI.2007.1078
[14]	KONDRAK G, MARCU D, KNIGHT K. Cognates can improve statistical translation models[C]//Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology. [S.l.]: ACM, 2003: 46-48.
[15]	CALIANO D, FERSINI E, MANCHANDA P, et al. UniMiB: Entity linking in tweets using Jaro-Winkler distance, popularity and coherence[C]//Proceedings of the 6th International Workshop on Making Sense of Microposts. [S.l.]: Microposts, 2016: 70-72.
[16]	LIU D, WU Q, HAN W, et al. User identification across multiple websites based on username features[J]. Chinese Journal of Computers, 2015, 38(10): 2028-2040.
[17]	VOSECKY J, HONG D, SHEN V Y. User identification across multiple social networks[C]//Proceedings of the 2009 First International Conference on Networked Digital Technologies. [S.l.]: IEEE, 2009: 360-365.
[18]	ZHAO Sheng-hui, LI Ji-yue, XU Bi, et al. Improved TFIDF-based question similarity algorithm for community interlocution system[J]. Transactions of Beijing Institute of Technology, 2017, 37(9): 982-985. (in Chinese)
[19]	LI Y J, PENG Y, ZHANG Z, et al. A deep dive into user display names across social networks[J]. Information Sciences, 2018, 447: 186-204. doi: 10.1016/j.ins.2018.02.072
[20]	FUGLEDE B, TOPSOE F. Jensen-Shannon divergence and Hilbert space embedding[C]//International Symposium on Information Theory. [S.l.]: IEEE, 2005, DOI: 10.1109/ISIT.2004.1365067.
[21]	WU Zheng. Research on user identification algorithms across multiple online social networks[D]. Zhengzhou: The PLA Information Engineering University, 2017. (in Chinese)
[22]	REXFORD J, DOVROLIS C. Future internet architecture: clean-slate versus evolutionary research[J]. Communications of the ACM, 2010, 53(9): 36-40.
[23]	ADAMIC L A, ADAR E. Friends and neighbors on the web[J]. Social Networks, 2003, 25(3): 211-230. doi: 10.1016/S0378-8733(03)00009-1
[24]	GREENHALGH A, HUICI F, HOERDT M, et al. Flow processing and the rise of commodity network hardware[J]. ACM SIGCOMM Computer Communication Review, 2009, 39(2): 20-26. doi: 10.1145/1517480.1517484
[25]	MACCHERANI E, FEMMINELLA M, LEE J W, et al. Extending the NetServ autonomic management capabilities using OpenFlow[C]//IEEE Network Operations and Management Symposium. [S.l.]: IEEE, 2012: 582-585.
[26]	KIM H, FEAMSTER N. Improving network management with software defined networking[J]. IEEE Communications Magazine, 2013, 51(2): 114-119. doi: 10.1109/MCOM.2013.6461195
[27]	QAZI Z A, TU C C, CHIANG L, et al. SIMPLY-fying middlebox policy enforcement using SDN[C]// Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM. [S.l.]: ACM, 2013: 27-38.
[28]	MILOJEVIC S. Modes of collaboration in modern science: Beyond power laws and preferential attachment[J]. Journal of the Association for Information Science and Technology, 2010, 61(7): 1410-1423.
[29]	ZHOU X P, LIANG X, ZHANG H Y, et al. Cross-platform identification of anonymous identical users in multiple social media networks[J]. IEEE Trans Knowl Data Eng, 2016, 28(2): 411-424. doi: 10.1109/TKDE.2015.2485222
[30]	ZHOU X P, LIANG X, DU X Y, et al. Structure based user identification across social networks[J]. IEEE Trans Knowl Data Eng, 2018, 30(6): 1178-1191. doi: 10.1109/TKDE.2017.2784430
[31]	NGUYEN H V, BAI L. Cosine similarity metric learning for face verification[C]//Asian Conference on Computer Vision. [S.l.]: Springer-Verlag, 2010: 709-720.
[32]	LIU D H, CHEN X H, PENG D. Some cosine similarity measures and distance measures between q-rung orthopair fuzzy sets[J]. International Journal of Intelligent Systems, 2019, 34(7): 1-16. doi: 10.1002/int.22108
[33]	HUANG Dan-yang, WANG Fei-fei, YANG Yang, et al. Dynamic interest identification based on social network structure and user generated contents[J]. Journal of Beijing University of Posts and Telecommunications, 2018, 41(2): 103-108. (in Chinese)
[34]	BI Juan, QIN Zhi-guang. Hierarchical community discovery for social networks based on probabilistic topic model[J]. Journal of University of Electronic Science and Technology of China, 2014, 43(6): 898-903. doi: 10.3969/j.issn.1001-0548.2014.06.018 (in Chinese)
[35]	LI Y J, ZHANG Z, PENG Y, et al. Matching user accounts based on user generated content across social networks[J]. Future Generation Computer Systems, 2018, 83: 104-115. doi: 10.1016/j.future.2018.01.041
[36]	NIWATTANAKUL S, SINGTHONGCHAI J, NAENUDORM E, et al. Using of Jaccard coefficient for keywords similarity[C]//IAENG International Conference on Internet Computing. Hong Kong, China: IAENG, 2013: 380-384.
[37]	MA J T, QIAO Y Q, HU G W, et al. Social account linking via weighted bipartite graph matching[J]. International Journal of Communication Systems, 2018, 31(7): e3471. doi: 10.1002/dac.3471
[38]	MODI S, SHAGARI N M, WADATA B. Implementation of stable marriage algorithm in student project allocation[J]. Asian Journal of Research in Computer Science, 2018, 1(4): 1-9.
[39]	ZAFARANI R, LIU H. Connecting corresponding identities across communities[C]//International Conference on Weblogs and Social Media. [S.l.]: AAAI, 2009, 9: 354-357.
[40]	PERITO D, CASTELLUCCIA C, KAAFAR M A, et al. How unique and traceable are usernames?[J]. International Symposium on Privacy Enhancing Technologies Symposium, 2011, 6794: 1-17.
[41]	LIU J, ZHANG F, SONG X Y, et al. What’s in a name?: An unsupervised approach to link users across communities[C]//Proceedings of the 6th ACM International Conference on Web Search and Data Mining. [S.l.]: ACM, 2013: 495-504.
[42]	ZAFARANI R, LIU H. Connecting users across social media sites: A behavioral-modeling approach[C]// Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’13). [S.l.]: ACM, 2013: 41-49.
[43]	WANG Y B, LIU T W, TAN Q F, et al. Identifying users across different sites using usernames[J]. Procedia Computer Science, 2016, 80: 376-385. doi: 10.1016/j.procs.2016.05.336
[44]	LI Y J, PENG Y, JI W L, et al. User Identification based on display names across online social networks[J]. IEEE Access, 2017, 5: 17342-17353. doi: 10.1109/ACCESS.2017.2744646
[45]	MOTOYAMA M, VARGHESE G. I seek you: Searching and matching individuals in social networks[C]// Proceedings of the 11th International Workshop on Web Information and Data Management. HongKong, China: ACM, 2009: 67-75.
[46]	RAAD E, CHBEIR R, DIPANDA A. User profile matching in social networks[C]//Proceedings of the 13th International Conference on Network-Based Information Systems. [S.l.]: IEEE, 2010: 297-304.
[47]	IOFCIU T, FANKHAUSER P, ABEL F, et al. Identifying users across social tagging systems[C]//Proceedings of the 5th International AAAI Conference on Weblogs and Social Media. [S.l.]: AAAI, 2011: 522-525.
[48]	YE N, ZHAO Y L, DONG L L, et al. User identification based on multiple attribute decision making in social networks[J]. China Communications, 2013, 10(12): 37-49. doi: 10.1109/CC.2013.6723877
[49]	WU Zheng, YU Hong-tao, LIU Shu-xin, et al. User identification across multiple social networks based on information entropy[J]. Journal of Computer Applications, 2017, 37(8): 2374-2380. doi: 10.11772/j.issn.1001-9081.2017.08.2374 (in Chinese)
[50]	DENG K K, XING L, ZHENG L S, et al. A user identification algorithm based on user behavior analysis in social networks[J]. IEEE Access, 2019, 7: 47114-47123.
[51]	GOGA O, PERITO D, LEI H, et al. Large-scale correlation of accounts across social networks[EB/OL]. [2019-05-06]. http://www.icsi.berkeley.edu/pubs/techreports/ICSI_TR-13-002.pdf.
[52]	LI H X, ZHU H J, DU S G, et al. Privacy leakage of location sharing in mobile social networks: Attacks and defense[J]. IEEE Transactions on Dependable and Secure Computing, 2018, 15(4): 646-660. doi: 10.1109/TDSC.2016.2604383
[53]	VENI R H, REDDY A H, KESAVULU C. Identifying malicious web links and their attack types in social networks[J]. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 2018, 3(4): 1060-1066.
[54]	ZAMANI K, PALIOURAS G, VOGIATZIS D. Similarity-based user identification across social networks[C]//International Workshop on Similarity-based Pattern Recognition. [S.l.]: Springer, 2015: 171-185.
[55]	ESFANDYARI A, ZIGNANI M, GAITO S, et al. User identification across online social networks in practice: Pitfalls and solutions[J]. Journal of Information Science, 2016, 44(3): 377-391.
[56]	LI Y J, PENG Y, ZHANG Z, et al. Matching user accounts across social networks based on username and display name[J]. World Wide Web, 2018, 22(7): 1-23.
[57]	LI Y J, PENG Y, ZHANG Z, et al. Understanding the user display names across social networks[C]//Proceedings of the 26th International World Wide Web Conference Committee (IW3C2). [S.l.]: ACM, 2017: 1319-1326.
[58]	MISHRA R. Entity resolution in online multiple social networks[J]. Emerging Technologies in Data Mining and Information Security, 2019, 813: 221-237.
[59]	NARAYANAN A, SHMATIKOV V. De-anonymizing social networks[C]//The 30th IEEE Symposium on Security and Privacy. [S.l.]: IEEE, 2009: 173-187.
[60]	CUI Y, PEI J, TANG G T, et al. Finding email correspondents in online social networks[J]. World Wide Web, 2013, 16(2): 195-218. doi: 10.1007/s11280-012-0168-2
[61]	KONG X N, ZHANG J W, YU P S. Inferring anchor links across multiple heterogeneous social networks[C]//ACM International Conference on Information & Knowledge Management. [S.l.]: ACM, 2013: 179-188.
[62]	KORULA N, LATTANZI S. An efficient reconciliation algorithm for social networks[J]. Proceedings of the VLDB Endowment, 2014, 7(5): 377-388. doi: 10.14778/2732269.2732274
[63]	TAN S L, GUAN Z Y, CAI D, et al. Mapping users across networks by manifold alignment on hypergraph[C]// Proceedings of the 28th AAAI Conference on Artificial Intelligence. [S.l.]: AAAI, 2014, 14: 159-165.
[64]	JIN T S, YU Z T, GAO Y, et al. Robust ℓ2-Hypergraph and its applications[J]. Information Sciences, 2019, 501: 708-723. doi: 10.1016/j.ins.2019.03.012
[65]	ZHANG Y T, TANG J, YANG Z L, et al. COSNET: Connecting heterogeneous social networks with local and global consistency[C]//Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. [S.l.]: ACM, 2015: 1485-1494.
[66]	LEE J Y, HUSSAIN R, RIVERA V, et al. Second-level degree-based entity resolution in online social networks[J]. Social Network Analysis and Mining, 2018, 8: 19. doi: 10.1007/s13278-018-0499-9
[67]	ZHANG W, SHU K, LIU H, et al. Graph neural networks for user identity linkage[EB/OL]. [2019-11-03]. https://arxiv.org/pdf/1903.02174.pdf.
[68]	WANG N, ZHOU Y D, SUN Q D, et al. A study on influential user identification in online social networks[J]. Chinese Journal of Electronics, 2016, 25(3): 467-473. doi: 10.1049/cje.2016.05.012
[69]	SHI C, LI Y T, ZHANG J W, et al. A survey of heterogeneous information network analysis[J]. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(1): 17-37. doi: 10.1109/TKDE.2016.2598561
[70]	DENG K K, XING L, ZHANG M C, et al. A multiuser identification algorithm based on internet of things[J]. Wireless Communications and Mobile Computing, 2019, DOI: 10.1155/2019/6974809.
[71]	ALMISHARI M, TSUDIK G. Exploring linkability of user reviews[J]. Computer Security-ESORICS, 2012, 7459: 307-324.
[72]	LIU S Y, WANG S H, ZHU F D, et al. HYDRA: Large-scale social identity linkage via heterogeneous behavior modeling[C]//ACM SIGMOD International Conference on Management of Data. [S.l.]: ACM, 2014: 51-62.
[73]	NIE Y P, JIA Y, LI S D, et al. Identifying users across social networks based on dynamic core interests[J]. Neurocomputing, 2016, 210: 107-115. doi: 10.1016/j.neucom.2015.10.147
[74]	SHA Y, LIANG Q, ZHENG K J. Matching user accounts across social networks based on users message[J]. Procedia Computer Science, 2016, 80: 2423-2427. doi: 10.1016/j.procs.2016.05.541
[75]	ROEDLER R, KERGL D, RODOSEK G D. Profile matching across online social networks based on geo-tags[J]. Advances in Nature and Biologically Inspired Computing, 2016, 419: 417-428.
[76]	GOGA O, LEI H, PARTHASARATHI S H K, et al. Exploiting innocuous activity for correlating users across sites[C]//Proceedings of the 22nd International Conference on World Wide Web. [S.l.]: ACM, 2013: 447-458.
[77]	CAO W, WU Z W, WANG D, et al. Automatic user identification method across heterogeneous mobility data sources[C]//IEEE 32nd International Conference on Data Engineering. [S.l.]: IEEE, 2016: 978-989.
[78]	HAO T Y, ZHOU J B, CHENG Y S, et al. User identification in cyber-physical space: A case study on mobile query logs and trajectories[C]//Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. [S.l.]: ACM, 2016, 71: 1-4.
[79]	HAN X H, WANG L H, XU S J, et al. Linking social network accounts by modeling user spatiotemporal habits [C]//IEEE International Conference on Intelligence and Security Informatics. [S.l.]: IEEE, 2017: 19-24.
[80]	RIEDERER C, KIM Y, CHAINTREAU A, et al. Linking users across domains with location data: Theory and validation[C]//Proceedings of the 25th International Conference on World Wide Web. [S.l.]: ACM, 2016: 707-719.
[81]	HAN X H, WANG L H, XU L J, et al. Social media account linkage using user-generated geo-location data[C]//IEEE Conference on Intelligence and Security Informatics. [S.l.]: IEEE, 2016: 157-162.
[82]	QI M J, WANG Z Y, HE Z, et al. User identification across asynchronous mobility trajectories[J]. Sensors, 2019, 19(9): 2020.
[83]	JAIN P, KUMARAGURU P, JOSHI A. @i seek 'fb.me': Identifying users across multiple online social networks[C]//Proceedings of the 22nd International Conference on World Wide Web Companion. [S.l.]: ACM, 2013: 1259-1268.


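In recent years, data mining technology has developed rapidly, moving our understanding of natural and social phenomena from the macroscopic level down to the microscopic level. Nodes, as the microscopic units, are the basic elements of a complex network. A social network is a social service platform formed by a subset of the nodes in a complex network, and people use social networks to meet their own needs. Because the services offered by different social networks differ, people selectively join several of them. Each social network depicts a user's real life from a different angle and is a mapping of the real world onto a virtual network. Because data are not shared between social networks and user privacy must be protected, a user's complete data are difficult to obtain, which makes it hard to build a complete social graph for that user. Identifying a user's multiple identities across different social networks makes it possible to integrate and complete user information as far as possible, and thus to provide more convenient services.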

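The essence of user identification across social networks is to find the entity user, that is, the real person, behind multiple virtual accounts. Solving this problem matters in many fields, mainly in the following four respects:

1) Completing user information: the user data available in a single social network are limited. If a user's multiple social accounts can be identified, the user's information can be understood more comprehensively.

2) Personalized service recommendation: analyzing the user data of a single social network does not support personalized recommendation well. If user data from multiple social networks are fused and the information that users generate is fully exploited, recommendation quality improves significantly^[1-4].

3) Data mining: mining the data of multiple linked social accounts yields more information of research value^[5-6].

4) Support for scientific research: the relationships among users can form a complex network^[7]. The properties of complex networks have been studied in depth within single social networks; whether new properties emerge when the analysis is extended to multiple social networks requires further study.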

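While this technology brings great benefits, it also carries the risk of leaking personal information. For example, malicious users can infer sensitive information about normal users from location data^[8-10]. Only when the leakage of user privacy is minimized will people be willing to submit their data to network applications, which in turn allows those applications to better meet the needs of daily life.

Research on user identification across social networks has already produced a series of important results. From the perspective of complex networks, this paper introduces the concepts, models, categories, similarity computation methods, and basic framework of user identification across social networks, analyzes in detail the performance evaluation of existing methods, and, by comparing their strengths and weaknesses, discusses future research directions.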

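6. Future Research Directions

In the big data era, there are more and more channels for obtaining information, and the user information obtained is increasingly diverse. This section discusses research trends for user identification in the data mining field from three aspects: weight allocation for user data, multi-dimensional data fusion, and large-scale user identification.

6.1. Weight Allocation for User Data

Different types of user data affect the degree of user identification differently, so reasonable weight allocation is essential. When determining the weight coefficient of each item of user data, the traditional subjective expert weighting and objective weighting methods suffer from poor robustness and poor generality. Researchers have used a variant of entropy to identify each source account to be matched; however, when the application scenario changes, the weight coefficients of the data must be reallocated, which incurs a large computational cost.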

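In information theory, entropy reflects the degree of disorder of information: the smaller the entropy, the more information the data contain. Information entropy can therefore be used to evaluate the orderliness and effectiveness of the data used. Entropy is computed from the probabilities of the user data, so describing these probabilities more accurately allows more effective weights to be assigned to each data item. Computing the posterior probability of the user data on top of the information entropy used in ^[50] helps improve identification accuracy, and combining the posterior probability with information entropy allows weights to be allocated effectively to the relevant data. When the number of users to be identified changes, the corresponding posterior probabilities do not change, so the computation required during identification can be greatly reduced. Whether a second-level weight allocation, in which the weight coefficients of user data items are computed on top of the posterior probability, yields better identification results will be the subject of our next work.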
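A minimal sketch of entropy-based weighting may help make the idea concrete. It implements the classic entropy weight method, which is an assumption made for illustration: it is not the variant entropy of ^[50] and it does not include the posterior-probability step discussed above. The function name `entropy_weights`, the input layout, and the sample values are all hypothetical.

```python
import math


def entropy_weights(similarity_rows):
    """Return one weight per attribute using the entropy weight method.

    similarity_rows[i][j] is the non-negative similarity of candidate
    account pair i on attribute j (hypothetical input layout).
    """
    n = len(similarity_rows)
    m = len(similarity_rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in similarity_rows]
        total = sum(column)
        if n < 2 or total == 0:
            divergences.append(0.0)  # attribute carries no usable signal
            continue
        probs = [value / total for value in column]
        # Shannon entropy normalized to [0, 1] by dividing by log(n).
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Lower entropy means the attribute separates pairs better.
        divergences.append(max(0.0, 1.0 - entropy))
    norm = sum(divergences)
    if norm == 0:
        return [1.0 / m] * m  # fall back to equal weights
    return [d / norm for d in divergences]


# Illustrative values only: columns are username and location similarity.
rows = [
    [0.9, 0.5],
    [0.1, 0.5],
    [0.2, 0.5],
]
weights = entropy_weights(rows)
fused_scores = [sum(w * s for w, s in zip(weights, row)) for row in rows]
```

In this toy input the second attribute is identical for every pair, so it receives almost no weight and the fused score is driven by the first attribute. Note that these weights depend on the set of candidate pairs and must be recomputed when that set changes, which is the cost that motivates the posterior-probability approach described in the text.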

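6.2. User Identification Based on Multi-dimensional Data Fusion

User identification based on multi-dimensional data fusion combines two or all three of the above types of user data to identify users. Some special institutions need to identify users with high accuracy, and entity identification based on a single dimension of information is then limited. Related work that uses network topology for entity identification has taken into account the previously neglected role of non-friend relationships: it proposes an intimacy function to judge how important friend and non-friend relationships are for identifying users, and uses matching algorithms to unify user profile information with link relationships, both friend and non-friend, in order to solve the entity identification problem.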
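The fusion idea in this paragraph can be illustrated with a small sketch. The source does not give the intimacy function or the matching algorithm, so everything below is assumed for illustration: Jaccard overlap stands in for link similarity, the constant `intimacy` stands in for the intimacy function, `alpha` is a hypothetical fusion weight, and neighbors in both networks are assumed to have already been mapped to shared identifiers (for example through known matched accounts).

```python
def jaccard(left, right):
    """Jaccard similarity of two collections, 0.0 when both are empty."""
    left, right = set(left), set(right)
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def link_similarity(friends_a, friends_b, others_a, others_b, intimacy=0.7):
    """Blend friend and non-friend overlap; `intimacy` is a hypothetical
    constant standing in for the intimacy function mentioned in the text."""
    return (intimacy * jaccard(friends_a, friends_b)
            + (1.0 - intimacy) * jaccard(others_a, others_b))


def fused_similarity(profile_sim, link_sim, alpha=0.5):
    """Weighted fusion of profile and link similarity (alpha is assumed)."""
    return alpha * profile_sim + (1.0 - alpha) * link_sim


# Illustrative values only: identical friend sets, disjoint non-friend sets.
link_sim = link_similarity({"u1", "u2"}, {"u1", "u2"}, {"u3"}, {"u4"})
score = fused_similarity(profile_sim=0.9, link_sim=link_sim)
```

A candidate pair would then be accepted or ranked by `score`; how the threshold or the one-to-one matching is chosen is left open here because the source does not specify it.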

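In addition, similar work jointly considers the information interaction among network topology, user profile information, and user-generated content^[83] to match virtual accounts across social networks. At present there are not many studies on cross-social-network user identification that fuse multi-dimensional user data, and this line of work will be an important part of future user identification research.

Moreover, although fusing user information can improve identification performance, it also gives malicious attackers a way to obtain information about normal users. Balancing the use of user data against privacy protection from a game-theoretic perspective will therefore also be a future research hotspot.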

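6.3. Large-scale User Identification

When the number of users to be identified is very large, the performance of existing cross-social-network user identification methods degrades as the number of users increases. Community detection in complex networks can effectively divide the large user population of a social network into communities, and using the relationships between communities to identify users will be one of the most promising directions for future research.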
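One way to read this direction is as a blocking step: accounts are compared only when their communities are believed to correspond, which shrinks the set of candidate pairs. The sketch below assumes that community labels have already been produced by some community detection algorithm and that links between communities are known (for instance from a few already-matched accounts). The function name `candidate_pairs` and the data layout are hypothetical, and this is not a method proposed in the source.

```python
from collections import defaultdict


def candidate_pairs(communities_a, communities_b, community_links):
    """List account pairs whose communities are linked.

    communities_a / communities_b map account -> community id.
    community_links is an iterable of (community_in_a, community_in_b).
    """
    members_b = defaultdict(list)
    for account, community in communities_b.items():
        members_b[community].append(account)

    linked = defaultdict(set)
    for community_a, community_b in community_links:
        linked[community_a].add(community_b)

    pairs = []
    for account_a, community_a in communities_a.items():
        for community_b in linked.get(community_a, ()):
            for account_b in members_b.get(community_b, []):
                pairs.append((account_a, account_b))
    return pairs


# Illustrative values only.
communities_a = {"a1": 0, "a2": 0, "a3": 1}
communities_b = {"b1": "x", "b2": "y", "b3": "y"}
pairs = candidate_pairs(communities_a, communities_b, [(0, "x"), (1, "y")])
all_pairs = len(communities_a) * len(communities_b)
```

Here only 4 of the 9 possible account pairs remain for detailed similarity computation. The trade-off is that a true match is missed whenever its two accounts fall into communities that are not linked, so the quality of the community correspondence bounds the achievable recall.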

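7. Conclusion

From the perspective of complex networks, this paper reviews the state of research on user identification across social networks over the past decade or so. User identification methods have become relatively mature and play an important role in many fields. They help social networks serve users better and reduce the consumption of network resources. This paper first describes the concept and problem of user identification across social networks, and then compares and analyzes existing work in three categories with respect to models, similarity computation methods, identification frameworks, research status, and performance evaluation. Finally, future research directions are discussed in light of existing work. In short, user identification across social networks is an emerging research field driven by the big data era, and many key problems still require thorough study.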
