K-Anonymity Algorithm Based on Multi-Attribute Generalization

Abstract: Existing K-anonymity models suffer from two problems: the attribute chosen for generalization is not uniquely determined, and data tend to be over-generalized. To address both, this paper proposes a K-anonymity algorithm based on multi-attribute generalization. The algorithm introduces the notion of attribute approximation degree, which quantifies how dispersed the values of each quasi-identifier attribute are and thereby determines which quasi-identifier attribute to generalize. It also generalizes in breadth-first order to avoid over-generalizing the data, continuing until the table satisfies the K-anonymity requirement. Experimental results show that the proposed algorithm improves the precision of the generalized data, while its processing efficiency is comparable to that of the Datafly algorithm. The algorithm resolves the choice of generalization attribute when several quasi-identifier attributes tie for the largest number of distinct values, prevents attributes from being over-generalized, and improves the usability of the data.
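To make the selection problem concrete, the following is a minimal sketch of a Datafly-style baseline, not the algorithm proposed in the paper. It assumes the usual Datafly heuristic of generalizing the quasi-identifier attribute with the most distinct values, and it omits Datafly's suppression step. The abstract does not define the attribute approximation degree, so it is not implemented here. The function names, the toy table, and the two generalization hierarchies (`generalize_age`, `generalize_zip`) are all hypothetical.

```python
from collections import Counter


def generalize_zip(value):
    """Mask the right-most unmasked digit, e.g. '47677' -> '4767*'."""
    stripped = value.rstrip("*")
    if not stripped:
        return value  # already fully generalized
    return stripped[:-1] + "*" * (len(value) - len(stripped) + 1)


def generalize_age(value):
    """Exact age -> decade range -> '*' (hypothetical three-level hierarchy)."""
    if value.isdigit():
        low = int(value) // 10 * 10
        return f"{low}-{low + 9}"
    return "*"


def is_k_anonymous(rows, qi, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(tuple(r[a] for a in qi) for r in rows)
    return all(c >= k for c in counts.values())


def tied_candidates(rows, qi):
    """Attributes sharing the largest number of distinct values."""
    counts = {a: len({r[a] for r in rows}) for a in qi}
    most = max(counts.values())
    return [a for a in qi if counts[a] == most]


def datafly_like(rows, hierarchies, k):
    """Generalize one attribute per round until the table is k-anonymous."""
    rows = [dict(r) for r in rows]  # work on a copy
    qi = list(hierarchies)
    while not is_k_anonymous(rows, qi, k):
        # Arbitrary tie-break: take the first tied attribute. This is the
        # ambiguity the abstract refers to, not the paper's selection rule.
        attr = tied_candidates(rows, qi)[0]
        changed = False
        for r in rows:
            new_value = hierarchies[attr](r[attr])
            changed = changed or new_value != r[attr]
            r[attr] = new_value
        if not changed:
            break  # nothing left to generalize
    return rows


HIERARCHIES = {"age": generalize_age, "zip": generalize_zip}
rows = [
    {"age": "23", "zip": "47677"},
    {"age": "27", "zip": "47602"},
    {"age": "35", "zip": "47678"},
    {"age": "38", "zip": "47905"},
]
result = datafly_like(rows, HIERARCHIES, k=2)
```

On this toy table, `age` and `zip` tie twice for the most distinct values. Because the sketch always takes the first tied attribute, it ends with `age` fully generalized to `*`. Choosing `zip` at the second tie would have reached 2-anonymity while keeping the decade ranges, which illustrates why an explicit selection criterion and a guard against over-generalization matter.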

     
