Parallel Algorithm of Feature Reduction in Intrusion Data
Abstract: This paper defines the relative importance of features in terms of the conditional rough entropy of knowledge and presents a parallel feature selection algorithm for intrusion data based on that measure. The algorithm first partitions the intrusion-data decision table into several sub-tables, then uses the relative importance of features to process the sub-tables in parallel, and finally derives a reduct of the original decision table from the local features selected in the sub-tables. Experimental results show that the algorithm is suited to large-scale intrusion datasets: the selected feature subset substantially reduces the cost of storing, analyzing, and sharing the data among components, and it increases detection speed while maintaining or improving intrusion classification accuracy.
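The three steps described in the abstract (partition, parallel local selection, merge) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact definition of conditional rough entropy, so the Shannon conditional entropy of the decision attribute given a feature subset is used here as a stand-in. The round-robin partitioning, the way local results are merged, and all function names (`conditional_entropy`, `greedy_reduct`, `parallel_feature_selection`) are hypothetical as well.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor
from math import log2


def conditional_entropy(rows, features, decision):
    """Shannon entropy of `decision` given the equivalence classes induced by
    `features` (an assumed stand-in for the paper's conditional rough entropy)."""
    blocks = defaultdict(Counter)
    for row in rows:
        blocks[tuple(row[f] for f in features)][row[decision]] += 1
    n = len(rows)
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())
        for c in counts.values():
            h -= (size / n) * (c / size) * log2(c / size)
    return h


def greedy_reduct(rows, features, decision, start=(), eps=1e-12):
    """Forward-select features until the entropy matches that of the full
    feature set, then drop any feature that has become redundant."""
    target = conditional_entropy(rows, features, decision)
    selected = list(start)
    while conditional_entropy(rows, selected, decision) - target > eps:
        remaining = [f for f in features if f not in selected]
        if not remaining:
            break
        # The most important feature is the one that lowers the entropy most.
        best = min(
            remaining,
            key=lambda f: conditional_entropy(rows, selected + [f], decision),
        )
        selected.append(best)
    for f in list(selected):
        rest = [g for g in selected if g != f]
        if conditional_entropy(rows, rest, decision) - target <= eps:
            selected = rest
    return selected


def parallel_feature_selection(rows, features, decision, n_parts=2):
    # Step 1: split the decision table into sub-tables (round-robin, assumed).
    sub_tables = [t for t in (rows[i::n_parts] for i in range(n_parts)) if t]
    # Step 2: select local features for every sub-table concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(sub_tables))) as pool:
        local = list(
            pool.map(lambda t: greedy_reduct(t, features, decision), sub_tables)
        )
    # Step 3: start from the union of the local features and repair or prune
    # it against the full table (assumed merge strategy).
    candidates = [f for f in features if any(f in sel for sel in local)]
    return greedy_reduct(rows, features, decision, start=candidates)


# Toy decision table: the label depends only on `flag`.
rows = [
    {"proto": "tcp", "flag": "S0", "dur": 0, "label": "attack"},
    {"proto": "udp", "flag": "S0", "dur": 0, "label": "attack"},
    {"proto": "tcp", "flag": "SF", "dur": 0, "label": "normal"},
    {"proto": "udp", "flag": "SF", "dur": 0, "label": "normal"},
    {"proto": "tcp", "flag": "S0", "dur": 0, "label": "attack"},
    {"proto": "icmp", "flag": "SF", "dur": 0, "label": "normal"},
]
selected = parallel_feature_selection(rows, ["proto", "flag", "dur"], "label")
print(selected)
```

Threads keep the sketch short and self-contained. Because the entropy computation is CPU-bound pure Python, a real speedup on large tables would more likely come from process-based workers (for example `concurrent.futures.ProcessPoolExecutor` with a module-level worker function instead of a lambda).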