ICAL: A Threat Intelligence IOC Identification Method Combined with Active Learning

罗琴; 杨根; 刘智; 唐宾徽

doi:10.12178/1001-0548.2022090

Abstract

Indicators of compromise (IOCs) characterize cyber threats and serve as key evidence for identifying and defending against cyberattacks. Current IOC identification relies mainly on neural network models, whose performance depends on the amount of labeled data. However, the field lacks a widely recognized dataset, and IOCs can only be labeled manually by security experts, so labeling is costly and large labeled datasets are hard to obtain. To address this problem, we propose ICAL (IOC identification combined with active learning), a threat intelligence IOC identification method that incorporates active learning. The method first selects initial samples for manual labeling based on sample representativeness, then pseudo-labels clustered samples based on the clustering hypothesis, and finally continues to label samples iteratively based on sample uncertainty until a termination condition is met. Using CNNPLUS as the classification model, experiments are conducted on a self-built threat intelligence dataset. The results show that, compared with traditional automatic IOC identification strategies, ICAL reaches 94.2% accuracy and 94.1% recall while reducing the manual labeling workload by about 58%, which gives it high practical value.
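The three stages named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: plain k-means stands in for the clustering step, a nearest-class-mean rule stands in for CNNPLUS, a distance-margin score stands in for the uncertainty measure, and the names `ical_sketch`, `radius`, and `budget` are hypothetical. The toy data are 2-D points rather than threat intelligence text.

```python
import math

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))

def kmeans(points, k, iters=10):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, [nearest(p, centroids) for p in points]

def class_means(points, labels):
    """Stand-in classifier (assumption): one mean vector per class."""
    means = {}
    for y in set(labels.values()):
        members = [points[i] for i, lab in labels.items() if lab == y]
        means[y] = tuple(sum(x) / len(members) for x in zip(*members))
    return means

def margin(p, means):
    """Gap between the two closest class means; a small gap means uncertain.
    Assumes at least two classes have been labelled so far."""
    d = sorted(math.dist(p, m) for m in means.values())
    return d[1] - d[0]

def ical_sketch(points, oracle, k=2, radius=1.0, budget=4):
    labels, manual = {}, set()
    centroids, assign = kmeans(points, k)
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        # Stage 1: the sample closest to the cluster centre is taken as the
        # representative and sent to the human annotator (the oracle).
        rep = min(members, key=lambda i: math.dist(points[i], centroids[c]))
        labels[rep] = oracle(rep)
        manual.add(rep)
        # Stage 2: cluster members near the representative inherit its label
        # as a pseudo-label (the radius cut-off is an assumption).
        for i in members:
            if i not in labels and math.dist(points[i], points[rep]) <= radius:
                labels[i] = labels[rep]
    # Stage 3: repeatedly query the most uncertain remaining sample until the
    # pool is empty or the manual-label budget (termination condition) is hit.
    pool = [i for i in range(len(points)) if i not in labels]
    while pool and len(manual) < budget:
        means = class_means(points, labels)
        query = min(pool, key=lambda i: margin(points[i], means))
        labels[query] = oracle(query)
        manual.add(query)
        pool.remove(query)
    return labels, manual

# Toy pool: two tight groups plus two borderline samples between them.
blob0 = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
blob1 = [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(3) for j in range(3)]
points = blob0 + blob1 + [(2.0, 2.0), (3.0, 3.0)]
truth = [0] * 9 + [1] * 9 + [0, 1]
labels, manual = ical_sketch(points, oracle=lambda i: truth[i])
print(len(manual), "manual labels for", len(points), "samples")
```

In this sketch the annotator is only consulted for cluster representatives and for the samples the stand-in classifier is least sure about; the samples close to a representative receive pseudo-labels at no manual cost, which is the source of the labeling savings the abstract describes.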
