Matrix Factorization Recommendation Algorithm for Differential Privacy Protection

WANG Yong; RAN Xun; YIN En-ming; WANG Li

doi:10.12178/1001-0548.2020359

Collaborative filtering requires large amounts of personal data to provide personalized recommendation services, which has raised growing concerns about privacy leakage. Most existing methods for privacy protection in recommender systems tend to introduce excessive noise, which significantly degrades recommendation quality. To address this problem, a matrix factorization algorithm satisfying differential privacy is proposed. The method first converts the matrix factorization problem into two alternating optimization problems, in which the user latent factors and the item latent factors are optimized in turn. A genetic algorithm is then used to solve these two problems: the enhanced exponential mechanism is applied to individual selection, and a novel mutation operation is designed around the idea of searching for important latent factors. Theoretical analysis and experimental results show that the algorithm provides strong differential privacy protection for user data while preserving recommendation accuracy. It therefore has good application value in recommender systems.


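Recommender systems are one of the main technical means by which Internet businesses provide users with personalized information services. Collaborative filtering, a mainstream class of recommendation algorithms, uses users' historical ratings of items to predict their preferences for unseen items and makes recommendations accordingly. Because collaborative filtering relies on large amounts of user data, it carries a risk of leaking users' personal information^[1]. In neighborhood-based collaborative filtering, an attacker can infer a target user's ratings by tracking changes in the recommendation lists of neighboring users^[2]; in matrix-factorization-based collaborative filtering, the latent factor matrices obtained from the factorization carry information about the data and may be exploited by an attacker to infer users' ratings, for example through reconstruction attacks^[3-4]. Leaked ratings may in turn be used to infer a user's gender, age, and other attributes, violating user privacy^[5]. If users refuse to provide some information out of security concerns, the performance of the recommender system may degrade, or it may be unable to provide personalized services at all. It is therefore necessary to protect user information in recommender systems.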

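Reference [6] proposed the definition of differential privacy, which provides a sound theoretical basis for effective privacy protection in recommender systems. Reference [7] introduced differential privacy into collaborative filtering, achieving protection by perturbing the item covariance matrix. Reference [8] applied differential privacy to neighborhood-based collaborative filtering, adding noise during neighbor selection and similarity measurement. Reference [9] proposed two privacy-preserving schemes that add Laplace noise to the original ratings and to the user similarity computation, respectively.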

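For matrix-factorization-based recommendation, reference [10] considers an untrusted recommender system, perturbs the objective function of the matrix factorization algorithm, and uses the privacy-protected item latent factor matrix for the recommendation task. Reference [11] assumes that users have different privacy requirements and proposes a personalized differentially private recommendation algorithm based on probabilistic matrix factorization. Reference [12] proposes a private matrix factorization scheme based on joint optimization by perturbing the objective function. References [13-14] apply differential privacy to matrix factorization recommendation algorithms and design three ways of adding noise: to the input, during training, and to the output. Following this idea, reference [15] designs three differentially private models on top of SVD++. Most existing work achieves differential privacy by adding noise terms to various quantities in the matrix factorization process (such as gradients, latent factor matrices, or the objective function). Such schemes have the following problems: 1) The noise is large. A stronger privacy requirement or a higher sensitivity increases the variance of the noise distribution, so excessive noise is added. 2) They lack generality. Adding noise may make the final solution infeasible for constrained problems. 3) They ignore the relative importance of latent factors, which reduces solution efficiency.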
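The first of the three noise-adding strategies, perturbing the input ratings, can be sketched as follows. This is a minimal illustration rather than the method of references [13-14]: the function names, the rating range of 1 to 5, and the choice of the rating range as the sensitivity are assumptions made here. It shows why the noise becomes large when the privacy budget epsilon is small, since the Laplace scale is the sensitivity divided by epsilon.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_ratings(ratings, epsilon, r_min=1.0, r_max=5.0, seed=0):
    """Input perturbation: add Laplace noise to every rating, then clamp.

    Assumption (not from the paper): neighbouring datasets differ in a
    single rating, so the sensitivity is the rating range r_max - r_min.
    """
    rng = random.Random(seed)
    scale = (r_max - r_min) / epsilon  # smaller epsilon -> larger noise
    noisy = []
    for r in ratings:
        value = r + laplace_noise(scale, rng)
        # Clamping is post-processing and does not weaken the guarantee.
        noisy.append(min(r_max, max(r_min, value)))
    return noisy


ratings = [5.0, 3.0, 4.0, 1.0, 2.0]  # hypothetical ratings
for eps in (0.1, 1.0, 10.0):
    noisy = perturb_ratings(ratings, eps)
    distortion = sum(abs(a - b) for a, b in zip(ratings, noisy)) / len(ratings)
    print(f"epsilon={eps}: mean absolute distortion = {distortion:.2f}")
```

With a small epsilon most perturbed ratings end up pinned to the ends of the range, which is the loss of utility that motivates perturbing the selection step instead.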

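To address these problems, this paper introduces a genetic algorithm into the matrix factorization task, so that differential privacy can be achieved by perturbing the selection of candidate solutions rather than by adding noise as above^[16]. In addition, a genetic algorithm searches for solutions within the feasible region, so it extends readily to constrained matrix factorization problems. However, applying a genetic algorithm directly faces two difficulties. First, matrix factorization is non-convex and has a large number of parameters, which makes it hard to solve. Second, reducing the perturbation introduced by the privacy mechanism is also a major challenge. To overcome these difficulties, this paper improves the key steps of the genetic algorithm and proposes a matrix factorization scheme that satisfies differential privacy. The main contributions are: 1) Matrix factorization is converted into two alternating optimization problems, one over user latent factors and one over item latent factors, which effectively overcomes the high dimensionality of the solution space and the non-convexity of the optimization. 2) Taking into account that users or items weight latent factors differently, the mutation step of the genetic algorithm is redesigned to improve search efficiency; on this basis, the enhanced exponential mechanism is used to reduce the perturbation the algorithm suffers, giving a better balance between privacy level and algorithm utility.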
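The idea of achieving privacy by perturbing the selection of candidate solutions can be sketched with the standard exponential mechanism of reference [18]. The enhanced variant used in this paper is not reproduced here. The function names, the toy item factors and ratings, and the sensitivity bound of 16.0 are hypothetical values chosen only for illustration.

```python
import math
import random


def exponential_mechanism(scores, epsilon, sensitivity, rng):
    """Return an index drawn with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    # Subtracting the maximum score avoids overflow and leaves the
    # selection probabilities unchanged.
    top = max(scores)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


def private_select(candidates, fitness, k, epsilon, sensitivity, seed=0):
    """Select k distinct candidates, spending epsilon / k on each draw."""
    rng = random.Random(seed)
    pool = list(candidates)
    chosen = []
    for _ in range(k):
        scores = [fitness(c) for c in pool]
        idx = exponential_mechanism(scores, epsilon / k, sensitivity, rng)
        chosen.append(pool.pop(idx))
    return chosen


# Toy setting (assumed): item factors are fixed, and candidate latent
# vectors for one user are scored by negative squared prediction error.
item_factors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
user_ratings = [4.0, 2.0, 5.0]


def fitness(user_vec):
    err = 0.0
    for q, r in zip(item_factors, user_ratings):
        pred = sum(a * b for a, b in zip(user_vec, q))
        err += (r - pred) ** 2
    return -err


candidates = [[4.0, 2.0], [1.0, 1.0], [3.0, 2.0], [0.0, 0.0]]
# sensitivity=16.0 is a placeholder bound, not a value derived in the paper.
parents = private_select(candidates, fitness, k=2, epsilon=1.0, sensitivity=16.0)
print(parents)
```

Because the randomness enters only through which candidates survive, every selected vector is one of the original candidates, so any feasibility constraint they satisfy is preserved.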

5. Conclusion

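This paper proposes a matrix factorization algorithm satisfying differential privacy to address privacy issues in recommender systems. The algorithm converts the matrix factorization problem into two alternating optimization problems. The enhanced exponential mechanism is used in the selection operation of the genetic algorithm so that the whole matrix factorization process satisfies differential privacy. Based on the idea of searching for important latent factors, the mutation operation of the genetic algorithm is designed to mutate latent factors in both the positive and negative directions, which improves both the efficiency of the algorithm and the quality of its solutions. Experimental results on two standard datasets show that the proposed algorithm better balances privacy and recommendation accuracy; in particular, it still achieves good recommendation performance when the privacy requirement is high, and thus has good application potential.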

References (19)

[1]	KENTHAPADI K, MIRONOV I, THAKURTA A G. Privacy-preserving data mining in industry[C]//Proceedings of the 12th ACM International Conference on Web Search and Data Mining. [S.l.]: ACM, 2019: 840-841.
[2]	CALANDRINO J A, KILZER A, NARAYANAN A, et al. "You might also like:" Privacy risks of collaborative filtering[C]//2011 IEEE Symposium on Security and Privacy. [S.l.]: IEEE, 2011: 231-246.
[3]	FREDRIKSON M, JHA S, RISTENPART T. Model inversion attacks that exploit confidence information and basic countermeasures[C]//Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. [S.l.]: ACM, 2015: 1322-1333.
[4]	WEINSBERG U, BHAGAT S, IOANNIDIS S, et al. BlurMe: Inferring and obfuscating user gender based on ratings[C]//Proceedings of the 6th ACM Conference on Recommender Systems. [S.l.]: ACM, 2012: 195-202.
[5]	NIKOLAENKO V, IOANNIDIS S, WEINSBERG U, et al. Privacy-preserving matrix factorization[C]//Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security. [S.l.]: ACM, 2013: 801-812.
[6]	DWORK C. Differential privacy[C]//Proceedings of the 33rd International Colloquium on Automata, Languages and Programming. Berlin: Springer, 2006: 1-12.
[7]	MCSHERRY F, MIRONOV I. Differentially private recommender systems: Building privacy into the net[C]//Proceedings of the 2009 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2009: 627-636.
[8]	ZHU Tian-qing, REN Yong-li, ZHOU Wan-lei, et al. An effective privacy preserving algorithm for neighborhood-based collaborative filtering[J]. Future Generation Computer Systems, 2014, 36: 142-155.
[9]	YANG Jing, LI Xiao-ye, SUN Zhen-long, et al. A differential privacy framework for collaborative filtering[J]. Mathematical Problems in Engineering, 2019, DOI: 10.1155/2019/1460234.
[10]	HUA Jing-yu, XIA Chang, ZHONG Sheng. Differentially private matrix factorization[C]//Proceedings of the 24th International Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2015: 1763-1770.
[11]	ZHANG Shun, LIU Lai-xiang, CHEN Zhi-li, et al. Probabilistic matrix factorization with personalized differential privacy[J]. Knowledge-Based Systems, 2019, 183: 104864.
[12]	ZHANG F, LEE V E, CHOO K K R. Jo-DPMF: Differentially private matrix factorization learning through joint optimization[J]. Information Sciences, 2018, 467: 271-281.
[13]	FRIEDMAN A, BERKOVSKY S, KAAFAR M A. A differential privacy framework for matrix factorization recommender systems[J]. User Modeling and User-Adapted Interaction, 2016, 26(5): 1-34.
[14]	BERLIOZ A, FRIEDMAN A, KAAFAR M A, et al. Applying differential privacy to matrix factorization[C]//Proceedings of the 9th ACM Conference on Recommender Systems. [S.l.]: ACM, 2015: 107-114.
[15]	XIAN Zheng-zheng, LI Qi-liang, HUANG Xiao-yu, et al. Collaborative filtering via SVD++ with differential privacy[J]. Control and Decision, 2019, 34(1): 43-54.
[16]	ZHANG Jun, XIAO Xiao-kui, YANG Yin, et al. PrivGene: Differentially private model fitting using genetic algorithms[C]//Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. [S.l.]: ACM, 2013: 665-676.
[17]	KOREN Y, BELL R, VOLINSKY C. Matrix factorization techniques for recommender systems[J]. Computer, 2009, 42(8): 30-37.
[18]	MCSHERRY F, TALWAR K. Mechanism design via differential privacy[C]//The 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07). [S.l.]: IEEE, 2007: 94-103.
[19]	XIAN Zheng-zheng, LI Qi-liang, LI Gai, et al. Research on application of differential privacy in collaborative filtering algorithms[J]. Computer Science, 2017(5): 81-88.

Attribute	Movielens100K	YahooMusic
Number of users	943	8089
Number of items	1682	1000
Density/%	6.3	1.8
Mean rating	3.5299	2.6321
Rating variance	1.2671	2.3821
Average ratings per user	106	33
Average ratings per item	59.4	270.1
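The derived rows of the table follow from the user, item, and rating counts. The short check below recomputes them for Movielens100K, assuming the usual total of 100,000 ratings for that dataset (a figure not stated in the table); the function name is hypothetical.

```python
def dataset_stats(n_users, n_items, n_ratings):
    """Derive density and per-user / per-item averages from raw counts."""
    return {
        "density_pct": 100.0 * n_ratings / (n_users * n_items),
        "ratings_per_user": n_ratings / n_users,
        "ratings_per_item": n_ratings / n_items,
    }


# 100_000 ratings is an assumption about Movielens100K, not a table entry.
stats = dataset_stats(n_users=943, n_items=1682, n_ratings=100_000)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

The recomputed values agree with the Movielens100K column to within about 0.1.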

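Algorithm	Description
PGMF	The algorithm proposed in this paper
ALSBase^[17]	Matrix factorization solved by alternating least squares (ALS), without differential privacy protection
DPSGD^[14]	Matrix factorization solved by stochastic gradient descent (SGD), with privacy protection applied by perturbing the gradients
DPSGDInput^[13]	Matrix factorization solved by SGD after perturbing the original ratings
DPALS^[14]	Privacy protection applied by perturbing the result of ALS
DPALSObj^[19]	Privacy protection applied by perturbing the ALS objective function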
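The non-private ALSBase baseline alternates between two sub-problems: fix the item factors and solve for the user factors, then fix the user factors and solve for the item factors. The sketch below illustrates that alternation for a rank-1 model only, which keeps each update a scalar closed form. The rank-1 restriction, the function name, the regularization value, and the toy ratings are all assumptions for illustration, not the experimental setup of the paper.

```python
def als_rank1(ratings, n_users, n_items, reg=0.1, sweeps=20):
    """Alternating least squares for a rank-1 model r_ij ~ u[i] * v[j].

    ratings: dict mapping (user, item) -> rating, observed entries only.
    Minimizes squared error plus reg * (sum of u[i]^2 + sum of v[j]^2).
    """
    u = [1.0] * n_users
    v = [1.0] * n_items
    for _ in range(sweeps):
        # Step 1: hold item factors fixed; each user factor has a closed form.
        for i in range(n_users):
            obs = [(j, r) for (a, j), r in ratings.items() if a == i]
            u[i] = sum(r * v[j] for j, r in obs) / (reg + sum(v[j] ** 2 for j, _ in obs))
        # Step 2: hold user factors fixed; update each item factor the same way.
        for j in range(n_items):
            obs = [(i, r) for (i, b), r in ratings.items() if b == j]
            v[j] = sum(r * u[i] for i, r in obs) / (reg + sum(u[i] ** 2 for i, _ in obs))
    return u, v


# Toy ratings (hypothetical) generated from an exact rank-1 structure.
toy = {(0, 0): 1.0, (0, 1): 2.0, (0, 2): 3.0,
       (1, 0): 2.0, (1, 1): 4.0, (1, 2): 6.0}
u, v = als_rank1(toy, n_users=2, n_items=3, reg=1e-6)
for (i, j), r in sorted(toy.items()):
    print(f"user {i}, item {j}: rating {r}, prediction {u[i] * v[j]:.3f}")
```

Each sub-problem is convex once the other side is fixed, which is the property the alternating formulation exploits.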

Corresponding author: CHEN Bin, bchen63@163.com