Two-Stage P2P Botnet Detection Method Based on Flow Similarity

Niu Weina; Zhang Xiaosong; Sun Enbo; Yang Guowu; Zhao Lingyuan

doi:10.3969/j.issn.1001-0548.2017.06.019

1. Center for Cyber Security, University of Electronic Science and Technology of China, Chengdu 611731
2. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731

Funding:

National Natural Science Foundation of China 61572115

National Natural Science Foundation of China 61502086

National Natural Science Foundation of China 61402080

Sichuan Province Major Basic Research Project 2016JY0007

About the author:
Niu Weina (b. 1990), female, PhD, whose research focuses on network attack detection and software vulnerabilities

CLC number: TP311

Abstract: Botnets use conventional malware such as worms, Trojans, and rootkits to launch distributed denial-of-service attacks, send phishing links, and provide malicious services, and they have become one of the main threats to network security. Because P2P botnets are decentralized and distributed, they are harder to detect than IRC, HTTP, and other types of botnets. To address this problem, this paper proposes a two-stage traffic classification method for detecting P2P botnets. First, non-P2P traffic is filtered out according to well-known ports, DNS queries, flow counting, and port judgment. Second, conversation features are extracted based on data flow features and flow similarity. Finally, a random forest algorithm built on the decision tree model is used to detect P2P botnets. The method is validated on the UNB ISCX botnet dataset, and the experimental results show that the two-stage detection method achieves higher accuracy than traditional P2P botnet detection methods.
Keywords: botnet detection; conversation feature; flow similarity; P2P traffic identification

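Figure 1 Architecture of the proposed method

Figure 2 Non-P2P traffic filtering

Figure 3 Web traffic identification under different thresholds

Figure 4 Effect of the number and depth of classification trees on the detection rate

Figure 5 P2P botnet detection results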

Table 1 Common applications and their corresponding ports

Application	Port numbers
SSH	22
Telnet	23
MAIL	25, 110, 143, 465, 220, 993, 995
NetBios	125, 137, 139, 445
Remote	3389
FTP	20, 21
NTP	123
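The ports in Table 1 lend themselves to a simple lookup filter. The following is a minimal sketch, assuming each flow is a dictionary with hypothetical `src_port` and `dst_port` keys; the flow representation and the function names are illustrative and are not taken from the paper.

```python
# Well-known ports copied from Table 1. The flow representation
# (dicts with 'src_port' / 'dst_port') is an assumption for illustration.
WELL_KNOWN_PORTS = {
    22,                                 # SSH
    23,                                 # Telnet
    25, 110, 143, 465, 220, 993, 995,   # MAIL
    125, 137, 139, 445,                 # NetBios
    3389,                               # Remote
    20, 21,                             # FTP
    123,                                # NTP
}

def is_well_known(flow):
    """Return True if either endpoint of the flow uses a port from Table 1."""
    return (flow["src_port"] in WELL_KNOWN_PORTS
            or flow["dst_port"] in WELL_KNOWN_PORTS)

def drop_well_known(flows):
    """Keep only the flows that do not touch a well-known port."""
    return [f for f in flows if not is_well_known(f)]

flows = [
    {"src_port": 51000, "dst_port": 22},    # SSH, filtered out
    {"src_port": 40123, "dst_port": 6881},  # no Table 1 port, kept
]
remaining = drop_well_known(flows)
```

A set lookup keeps the check constant-time per flow, which matters when the filter runs over large captures before any feature extraction.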

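Table 2 Conversation features

Feature	Description
avg_dura	Mean of the total durations of the different network flows in the same conversation
std_dura	Standard deviation of the total durations of the different network flows in the same conversation
min_dura	Minimum of the total durations of the different network flows in the same conversation
max_dura	Maximum of the total durations of the different network flows in the same conversation
avg_f(b)int	Mean inter-arrival time of upstream (downstream) packets over the different network flows in the same conversation
max_f(b)pl	Mean, over the different network flows in the same conversation, of the maximum upstream (downstream) packet length
avg_f(b)pl	Mean, over the different network flows in the same conversation, of the mean upstream (downstream) packet length
min_f(b)pl	Mean, over the different network flows in the same conversation, of the minimum upstream (downstream) packet length
std_avg_f(b)pl	Standard deviation, over the different network flows in the same conversation, of the mean upstream (downstream) packet length
avg_f(b)pen	Mean number of valid upstream (downstream) packets over the different network flows in the same conversation
std_avg_f(b)pen	Standard deviation of the number of valid upstream (downstream) packets over the different network flows in the same conversation
avg_f(b)pb	Mean total number of upstream (downstream) bytes over the different network flows in the same conversation
std_f(b)pb	Standard deviation of the total number of upstream (downstream) bytes over the different network flows in the same conversation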
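Several of the features in Table 2 are plain aggregates over the flows of one conversation. The sketch below computes the duration and total-byte features under stated assumptions: each flow is a dictionary with hypothetical keys `duration`, `fwd_bytes`, and `bwd_bytes`, and the standard deviation is the population form, since the excerpt does not say which form the authors use.

```python
from statistics import mean, pstdev

def conversation_features(flows):
    """Aggregate per-flow values into conversation-level features.

    Each flow is a dict with hypothetical keys: 'duration' (seconds),
    'fwd_bytes' and 'bwd_bytes' (total bytes sent in each direction).
    Population standard deviation is an assumption, not the paper's choice.
    """
    durations = [f["duration"] for f in flows]
    fwd = [f["fwd_bytes"] for f in flows]
    bwd = [f["bwd_bytes"] for f in flows]
    return {
        "avg_dura": mean(durations),
        "std_dura": pstdev(durations),
        "min_dura": min(durations),
        "max_dura": max(durations),
        "avg_fpb": mean(fwd),
        "std_fpb": pstdev(fwd),
        "avg_bpb": mean(bwd),
        "std_bpb": pstdev(bwd),
    }

# Two flows that belong to the same conversation.
conversation = [
    {"duration": 2.0, "fwd_bytes": 100, "bwd_bytes": 300},
    {"duration": 4.0, "fwd_bytes": 300, "bwd_bytes": 500},
]
features = conversation_features(conversation)
```

The remaining features in Table 2 (packet lengths, packet counts, inter-arrival times) follow the same pattern once the corresponding per-flow values are available.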

References

[1]	ZHU Z, LU G, CHEN Y, et al. Botnet research survey[C]//IEEE International Computer Software and Applications Conference. Turku:IEEE, 2008:967-972. http://ieeexplore.ieee.org/document/4591703/
[2]	LIVADAS C, WALSH R, LAPSLEY D, et al. Using machine learning techniques to identify botnet traffic[C]//31st IEEE Conference on Local Computer Networks. Tampa:IEEE, 2006:967-974. http://www.mendeley.com/catalog/using-machine-learning-techniques-identify-botnet-traffic-12/
[3]	CAI T, ZOU F. Detecting HTTP botnet with clustering network traffic[C]//2012 8th International Conference on Wireless Communications, Networking and Mobile Computing (WiCOM). Shanghai:IEEE, 2012:1-7. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6478491
[4]	ZEIDANLOO H R, MANAF A B A, AHMAD R B, et al. A proposed framework for P2P botnet detection[J]. International Journal of Engineering and Technology, 2010, 2(2):161-168. https://www.researchgate.net/publication/310793596_A_Proposed_Framework_for_P2P_Botnet_Detection
[5]	HADDADI F, CONG D L, PORTER L, et al. On the effectiveness of different botnet detection approaches[C]//International Conf on Information Security Practice and Experience. Beijing:ACM, 2015:121-135. https://www.researchgate.net/publication/278667543_On_the_Effectiveness_of_Different_Botnet_Detection_Approaches
[6]	WANG J S, LIU F, ZHANG J. Botnet detecting method based on group-signature filter[J]. Journal on Communications, 2010, 31(2):29-35. https://www.researchgate.net/publication/291313592_Botnet_detecting_method_based_on_group-signature_filter
[7]	ZHANG J, PERDISCI R, LEE W, et al. Detecting stealthy P2P botnets using statistical traffic fingerprints[C]//2011 IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN). IEEE, 2011. doi: 10.1109/DSN.2011.5958212
[8]	ABDULLAH R S, ABDOLLAH M F, NOH Z A M, et al. Preliminary study of host and network-based analysis on P2P botnet detection[C]//TIME-E':International Conference on Technology, Informatics, Management, Engineering & Environment. Bandung:IEEE, 2013:105-109. http://ieeexplore.ieee.org/document/6611973/
[9]	ZHAO Y. The novel approach of P2P botnet node-based detection and applications[J]. Journal of Chemical and Pharmaceutical Research, 2014, 6(7):1055-1063. https://www.researchgate.net/publication/297510433_The_novel_approach_of_P2P_Botnet_node-based_detection_and_applications
[10]	ZHAO D, TRAORE I, SAYED B, et al. Botnet detection based on traffic behavior analysis and flow intervals[J]. Computers & Security, 2013, 39(4):2-16. http://www.sciencedirect.com/science/article/pii/S0167404813000837
[11]	ZHANG J, PERDISCI R, LEE W, et al. Building a scalable system for stealthy P2P-Botnet detection[J]. IEEE Transactions on Information Forensics & Security, 2014, 9(1):27-38. http://ieeexplore.ieee.org/document/6661360/
[12]	SHARIFNYA R, ABADI M. Dfbotkiller:Domain-flux botnet detection based on the history of group activities and failures in DNS traffic[J]. Digital Investigation, 2015, 12(12):15-26. http://www.sciencedirect.com/science/article/pii/S1742287614001182
[13]	BUCZAK A L, GUVEN E. A survey of data mining and machine learning methods for cyber security intrusion detection[J]. IEEE Communications Surveys & Tutorials, 2015, 18(2):1153-1176. http://ieeexplore.ieee.org/document/7307098/
[14]	YIN C, AWLLA A H, YIN Z, et al. Botnet detection based on genetic neural network[J]. International Journal of Security and Its Applications, 2015, 9(11):97-104. doi: 10.14257/ijsia
[15]	CONSTANTINOU F, MAVROMMATIS P. Identifying known and unknown peer-to-peer traffic[C]//IEEE International Symposium on Network Computing & Applications. Cambridge:IEEE, 2006:93-102. http://dl.acm.org/citation.cfm?id=1158211
[16]	MADHUKAR A, WILLIAMSON C. A longitudinal study of P2P traffic classification[C]//MASM' 06:14th IEEE International Symposium on Modeling, Analysis, and Simulation. Monterey:IEEE, 2006:179-188. http://dl.acm.org/citation.cfm?id=1158127
[17]	KARAGIANNIS T, BROIDO A, FALOUTSOS M, et al. Transport layer identification of P2P traffic[C]//ACM SIGCOMM Conference on Internet Measurement. Taormina:ACM, 2004:121-134. http://dl.acm.org/citation.cfm?id=1028804
[18]	WANG C, ZHOU X, YOU F, et al. Design of P2P traffic identification based on DPI and DFI[C]//CNMT' 09:Computer Network and Multimedia Technology. Wuhan:IEEE, 2009:1-4. http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=5374577
[19]	BEIGI E B, JAZI H H, STAKHANOVA N, et al. Towards effective feature selection in machine learning-based botnet detection approaches[C]//CNS' 14:18th IEEE Conference on Communications and Network Security. San Francisco:IEEE, 2014:247-255. http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=6997492


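In today's complex network environment, security problems are increasingly prominent. Because the C&C servers of botnets are highly concealed, bot programs are frequently used by attackers who launch large-scale network attacks; almost all DDoS attacks and 80%~90% of spam attacks are launched by botnets^[1]. Botnets have therefore become a problem that cannot be ignored in network security.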

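Early botnets mainly used IRC^[2] and HTTP^[3] as communication protocols; they suffered from a single point of failure and were easy to detect and take down. Today, most botnets use P2P technology to build the C&C (command and control) mechanism, which makes their communication stealthier^[4]. Compared with botnets based on IRC and HTTP, P2P botnets, which have no central node, are more threatening and more concealed. As a result, P2P botnets are increasingly favored by attackers, and P2P botnet detection^[5] has become a research hotspot in the security field.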

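P2P applications have driven explosive growth in Internet traffic, which poses a major challenge for both data storage and real-time analysis. It is therefore particularly important to filter out non-P2P traffic at an early stage of P2P botnet detection.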

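This paper proposes a two-stage detection method for P2P botnets. The first stage filters non-P2P traffic based on port judgment, DNS queries, and the count of data flows in a conversation. The second stage identifies P2P botnets based on conversation features; working at the conversation level effectively reduces the number of data records that need to be analyzed. A random forest algorithm based on the decision tree model then classifies the traffic and thereby detects the botnet. The proposed algorithm is also compared experimentally with two existing algorithms on the UNB dataset, and the results show that the random forest algorithm achieves higher detection accuracy for P2P botnets.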
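The reduction in records can be illustrated by grouping flows into conversations. The sketch below is a minimal example, assuming a conversation is the set of flows between the same unordered pair of IP addresses; the exact definition used by the authors is not given in this excerpt, and the field names are hypothetical.

```python
from collections import defaultdict

def group_into_conversations(flows):
    """Group flows by unordered IP pair, ignoring ports and protocol.

    Treating a conversation as an unordered IP pair is an assumption
    made for illustration; the paper may define it differently.
    """
    conversations = defaultdict(list)
    for flow in flows:
        key = tuple(sorted((flow["src_ip"], flow["dst_ip"])))
        conversations[key].append(flow)
    return dict(conversations)

flows = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 50001, "dst_port": 6881},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.0.1", "src_port": 6881, "dst_port": 50002},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.3", "src_port": 50003, "dst_port": 6881},
]
conversations = group_into_conversations(flows)

# Three flow records collapse into two conversation records.
print(len(flows), len(conversations))
```

The classifier then sees one feature vector per conversation instead of one per flow, which is where the saving in analyzed records comes from.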

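1. Related Work

Depending on the detection strategy, P2P botnet detection methods fall into four categories: signature-based^[6-7], host-behavior-based^[8-9], flow-behavior-feature-based^[10], and flow-similarity-based^[11].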

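1.1. Signature-Based Detection

Signature-based detection^[6-7] designs detection rules from features (such as MD5 hashes and PE header formats) extracted by analyzing botnet applications or their communication traffic. However, the original rules become ineffective once the botnet application changes its communication method or packet format. Moreover, if the signatures in use do not effectively represent the characteristics of the bot program, this strategy suffers from a high false-positive rate.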
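The fragility of hash-style signatures is easy to demonstrate. The following sketch uses an MD5 blacklist as the signature set; the sample bytes and helper names are hypothetical and stand in for a real bot binary and a real rule base.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()

def matches_signature(data: bytes, known_bad: set) -> bool:
    """Return True if the digest of the data is in the signature set."""
    return md5_hex(data) in known_bad

# Hypothetical sample standing in for a known bot binary.
sample = b"hypothetical bot binary"
known_bad = {md5_hex(sample)}

# A variant that differs by a single appended byte.
variant = sample + b"\x00"

print(matches_signature(sample, known_bad))   # the known sample is flagged
print(matches_signature(variant, known_bad))  # the variant is not
```

Any change to the program, however small, produces a different digest, so the rule base has to be updated for every variant.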

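1.2. Host-Behavior-Based Detection

Host-behavior-based detection^[8-9] detects bot programs by monitoring, in a controlled environment, changes to the processes, files, network connections, and registry contents of a host. This approach cannot detect new or variant bot programs; for example, attackers can evade it with newer attack and concealment techniques such as rootkits and anti-debugging.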

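1.3. Flow-Behavior-Feature-Based Detection

Detection based on flow behavior features^[10] is mainly applied during the C&C phase of a botnet^[12], because traffic in that phase differs from normal network traffic in its flow features and communication patterns, for example in average packet size and periodic connections. It can therefore be combined with machine learning^[13] and neural networks^[14] to monitor botnets in real time.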

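Botnet detection methods based on flow behavior features mainly analyze two kinds of features: the connection failure rate and flow features. Flow features include the number of upstream and downstream packets; the number of upstream and downstream bytes; the average length, maximum length, and variance of upstream and downstream packets; the duration of the flow; and the total length of the payload-bearing packets in a flow.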
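Most of these flow features can be derived from a list of per-packet records. The sketch below is one possible way to compute them, assuming each packet is a hypothetical `(timestamp, direction, length)` tuple with direction `"up"` or `"down"`, and using population variance because the excerpt does not specify the form.

```python
from statistics import mean, pvariance

def flow_features(packets):
    """Compute per-direction packet statistics and the flow duration.

    packets: list of (timestamp, direction, length) tuples, where
    direction is 'up' or 'down'. This record layout is an assumption.
    """
    feats = {}
    for d in ("up", "down"):
        lengths = [length for _, direction, length in packets if direction == d]
        feats[d + "_packets"] = len(lengths)
        feats[d + "_bytes"] = sum(lengths)
        feats[d + "_avg_len"] = mean(lengths) if lengths else 0
        feats[d + "_max_len"] = max(lengths, default=0)
        feats[d + "_var_len"] = pvariance(lengths) if lengths else 0
    times = [t for t, _, _ in packets]
    feats["duration"] = max(times) - min(times) if times else 0
    return feats

packets = [
    (0.0, "up", 60),
    (0.5, "down", 1500),
    (1.5, "up", 100),
    (2.0, "down", 500),
]
feats = flow_features(packets)
```

The connection failure rate is not covered here, since it needs connection-level state (for example failed handshakes) rather than per-packet lengths.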

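This approach achieves a high detection rate because it extracts a common feature vector of flows without depending on the botnet type, and it has therefore attracted wide attention from traffic-analysis researchers. In high-speed, complex, and changing network environments, the main factors that determine detection efficiency and accuracy are the features extracted and the classification strategy used.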

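1.4. Flow-Similarity-Based Detection

Research shows^[11] that bots belonging to the same botnet exhibit similar communication behavior. P2P botnet traffic can therefore be identified as follows: first, analyze and process the captured network traffic and extract features; then cluster the extracted flow data with a clustering algorithm; finally, determine which cluster contains the P2P botnet traffic.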
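The clustering step can be sketched with a very simple threshold-based scheme. The example below uses leader clustering, chosen here only for brevity; it is not the clustering algorithm of reference [11], and the feature vectors and threshold value are made up for illustration.

```python
from math import dist

def leader_cluster(vectors, threshold):
    """Assign each vector to the first cluster whose leader is within threshold.

    The leader of a cluster is its first member. A vector that is not
    close to any existing leader starts a new cluster.
    """
    leaders, clusters = [], []
    for v in vectors:
        for i, leader in enumerate(leaders):
            if dist(v, leader) <= threshold:
                clusters[i].append(v)
                break
        else:
            # No existing leader is close enough: start a new cluster.
            leaders.append(v)
            clusters.append([v])
    return clusters

# Hypothetical (average packet length, duration) feature vectors.
vectors = [(100.0, 2.0), (102.0, 2.1), (1500.0, 30.0), (101.0, 1.9)]
clusters = leader_cluster(vectors, threshold=10.0)
```

In this toy input the three similar flows end up in one cluster and the outlier in another; deciding which cluster corresponds to botnet traffic is a separate step that this sketch does not attempt.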

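This scheme improves detection accuracy by setting thresholds and does not need existing botnet flows for training. However, if the network contains only one bot, or if the captured packets contain no traffic between different bots, the method is of little use.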

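4. Conclusion

This paper proposes a P2P botnet detection method based on conversation features. It first filters non-P2P traffic at the packet, flow, and conversation levels, and then detects P2P botnets with a supervised machine learning algorithm built on conversation features, combining the advantages of flow-feature-based and flow-similarity-based detection. The method is validated on a public dataset, and the experimental results show that it detects P2P botnet traffic efficiently.

Future work will focus on optimizing the non-P2P traffic filtering algorithm to further improve its performance. We also hope to extend conversation-feature-based detection to the detection and classification of other types of botnets.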
