Sequence Recommendation Based on Contrast Learning and Fourier Transform

ZHANG Shaodong; YANG Xingyao; YU Jiong; LI Ziyang; LIU Yansong

doi:10.12178/1001-0548.2022164

This paper proposes CSFTRec, a sequence recommendation algorithm based on a self-attention mechanism and the Fourier transform. By filtering noise out of the original data, the algorithm maximizes the ability of the self-attention mechanism to capture features of the sequence data. Drawing on the characteristics of contrastive learning, a new contrastive loss is introduced on top of Bayesian personalized ranking for joint training, which pulls the representations of similar sequences closer together. Experiments on eight public datasets show that CSFTRec converges faster and improves recommendation accuracy by 3% to 5%, which indicates that CSFTRec is better suited to processing sequence data.


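Recommendation algorithms have developed rapidly in recent years^[1] and have drawn attention across industries. E-commerce in particular relies heavily on them to improve user retention and experience. As the volume of information grows, recommendation algorithms have also become a primary means of coping with information overload: they help users filter target items out of massive data and predict the items a user is likely to interact with next.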

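Early recommendation algorithms made recommendations based on the similarity between users or between items^[2], but their results were unsatisfactory. In practice, user interaction behavior is usually dynamic and changes over time. How to model each user's behavior individually and mine latent interests has therefore become a focus of recommendation research. Sequence recommendation is an important class of recommendation algorithm that dynamically captures behavioral features from a user's history: it takes a user's interaction sequence as input and predicts the user's next action. Traditional sequence recommendation algorithms follow the supervised learning paradigm during training, introducing one or more negative samples and optimizing with the Bayesian personalized ranking (BPR) loss function^[3]. Supervised learning, however, concentrates on how the relationship between sample labels and training data affects the result, and so depends heavily on the accuracy of the labels^[4]. Modeling user interaction sequences is also easily affected by noisy data, which makes the model fragile and unable to recommend accurately.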
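As a concrete reading of the BPR objective mentioned above, the following minimal sketch computes the loss for paired positive and sampled-negative scores. The function names and the plain-float scores are illustrative assumptions; this excerpt does not show how the paper computes item scores.

```python
import math


def bpr_pair_loss(pos_score: float, neg_score: float) -> float:
    """BPR loss for one pair: -log(sigmoid(pos_score - neg_score))."""
    diff = pos_score - neg_score
    # -log(sigmoid(d)) equals log(1 + exp(-d)); this form avoids overflow.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def bpr_loss(pos_scores, neg_scores) -> float:
    """Mean BPR loss over paired positive and sampled-negative scores."""
    losses = [bpr_pair_loss(p, n) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical scores: the first pair is ranked correctly, the second is not.
    print(bpr_loss([2.0, 0.1], [0.5, 1.2]))
```

The loss is small when the interacted item outscores the sampled negative and grows roughly linearly when the order is reversed, so training pushes positives above negatives rather than fitting absolute score values.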

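To address these problems, this paper proposes CSFTRec, a self-attention sequence recommendation algorithm based on contrastive learning and the Fourier transform, which improves performance by changing the training method and removing noisy data. First, a Fourier transform filters noise from the sequence data before it enters the encoder. At the same time, two data augmentation methods are applied to the sequence data, and the original and augmented sequences are fed together into the self-attention layer. Finally, the loss is computed with an improved contrastive loss function. Experiments on eight public datasets show that, compared with current mainstream recommendation algorithms, the proposed algorithm effectively improves recommendation accuracy.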
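The idea of filtering noise in the frequency domain before encoding can be illustrated with a small sketch. It is not the paper's filter, whose exact form is not given in this excerpt: it assumes a fixed low-pass cutoff (`keep`, a hypothetical parameter) and uses a naive O(n^2) discrete Fourier transform on a single feature channel so that it runs with the standard library alone.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def low_pass(signal, keep):
    """Zero every frequency bin whose index exceeds `keep`, then invert.

    Bins k and n - k are treated as the same frequency, so the mask is
    symmetric and the filtered signal stays real up to rounding error.
    """
    n = len(signal)
    spectrum = dft(signal)
    filtered = [c if min(k, n - k) <= keep else 0j for k, c in enumerate(spectrum)]
    return [value.real for value in idft(filtered)]


if __name__ == "__main__":
    # Hypothetical channel: a constant level of 2.0 plus alternating noise.
    noisy = [2.0 + 0.5 * (-1) ** t for t in range(8)]
    print([round(v, 6) for v in low_pass(noisy, keep=3)])
```

In a sequence model the same operation would be applied along the time axis of every embedding dimension, normally with an FFT routine from a numerical library rather than this naive transform.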

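1. Current Research on Sequence Recommendation

Existing sequence recommendation algorithms fall into three types according to the kind of user preference they model.

1) Sequence recommendation algorithms based on short-term preference, represented by Markov chain recommendation. A low-order Markov chain^[5] predicts the user's next interaction item from the current one. A high-order Markov chain^[6] increases the number of interaction items that can influence the target item, but when the interaction sequence is long it still cannot capture the dependencies between items effectively, so it cannot reflect the user's long-term preference. Convolutional neural networks have also been applied to user interaction sequences^[7], with slightly better results than Markov chains, but the core idea is still to model short-term preference, so these algorithms remain limited.

2) Recurrent neural networks (RNNs) are a class of neural networks designed specifically for sequence data. Sequence recommendation algorithms built on RNNs^[8] consider the whole interaction sequence from a global perspective and thereby model the user's long-term preference. RNNs have been extended to networks based on long short-term memory (LSTM)^[9] and on the gated recurrent unit (GRU). These two networks have more parameters and better performance, are suited to building large recommendation networks, and can also effectively guard against overfitting. In addition, memory networks (MemNN)^[10] can model long-term preference by introducing an external memory that stores the dependencies between a sequence and the next interaction item. Compared with RNN-family networks, MemNN effectively reduces the storage and computation burden of the algorithm.

3) Many other models consider the influence of both long-term and short-term preferences. In sequence recommendation, graph neural networks (GNNs)^[11] represent user interaction sequences as a directed graph: each node represents an interaction object, and each sequence maps to a path in the graph. GNNs are highly extensible. GCE-GNN^[12], which incorporates context enhancement, and FGNN^[13], which incorporates multiple weighted graphs, refine the use of GNNs in recommendation from different angles. GNNs suffer from information loss, however, and models such as LESSR^[14] were later proposed to handle it. The attention mechanism is another popular sequence-processing approach that models long- and short-term preferences; it was first applied to machine translation^[15]. NARM^[16] uses an ordinary attention mechanism for recommendation, emphasizing that different parts of the input influence the output to different degrees, and its success led to the wide adoption of attention in recommendation algorithms. This paper takes the self-attention mechanism as the core of its sequence recommendation algorithm; the proposed method is based mainly on contrastive learning and self-attention.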
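The directed-graph view of an interaction sequence described for GNN-based models can be sketched as follows. The function name and the use of raw transition counts as edge weights are assumptions for illustration; the cited GNN models apply their own edge normalization, which this excerpt does not describe.

```python
from collections import defaultdict


def build_session_graph(sequence):
    """Return {source_item: {target_item: transition_count}} for one sequence.

    Each distinct item is one node and each consecutive pair of interactions
    adds one directed edge, so the sequence traces a path through the graph.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst in zip(sequence, sequence[1:]):
        graph[src][dst] += 1
    # Convert to plain dicts so the result compares and prints cleanly.
    return {src: dict(targets) for src, targets in graph.items()}


if __name__ == "__main__":
    # Hypothetical item IDs; "b" is revisited, so it has two outgoing edges.
    print(build_session_graph(["a", "b", "c", "b", "d"]))
```

Because a repeated item maps to the same node, the graph keeps which transitions occurred but not the order in which they occurred.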

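5. Conclusion

Research on sequence recommendation algorithms has a broad application background and practical value. Existing sequence recommendation algorithms are easily affected by noisy data when modeling the features of sequence data, and traditional contrastive losses perform poorly on recommendation tasks. To address these problems, this paper builds on a self-attention encoder and uses the Fourier transform to filter noise from the sequence data, maximizing the encoder's ability to capture sequence features. It further introduces a Context-Context loss, trained jointly with a contrastive loss generalized from BPR, to improve recommendation performance. Finally, experiments on five Amazon industrial datasets, with comparisons against currently popular models, verify that the proposed algorithm performs well, which is significant for the application of contrastive learning to recommendation.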
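This excerpt does not give the form of the paper's contrastive loss or of the Context-Context loss, so the sketch below is not the paper's formula. It only illustrates one standard sense in which a contrastive loss generalizes BPR: the InfoNCE loss with a single negative and temperature 1 equals the BPR pair loss. The function names are hypothetical.

```python
import math


def info_nce(pos_score: float, neg_scores, temperature: float = 1.0) -> float:
    """-log(exp(pos/t) / (exp(pos/t) + sum_j exp(neg_j/t))), via log-sum-exp."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


def bpr_pair_loss(pos_score: float, neg_score: float) -> float:
    """-log(sigmoid(pos_score - neg_score)); adequate for moderate scores."""
    return math.log1p(math.exp(-(pos_score - neg_score)))


if __name__ == "__main__":
    # Hypothetical similarity scores.
    print(info_nce(1.3, [0.4]), bpr_pair_loss(1.3, 0.4))
    print(info_nce(1.3, [0.4, 0.2, -0.5]))
```

With more than one negative the denominator grows, so the loss penalizes every negative that scores close to the positive instead of only one sampled item.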

References (37)

[1]	ZHANG S, YAO L, SUN A, et al. Deep learning based recommender system: A survey and new perspectives[J]. ACM Computing Surveys (CSUR), 2019, 52(1): 1-38.
[2]	FAN H, WU K, PARVIN H, et al. A hybrid recommender system using knn and clustering[J]. International Journal of Information Technology & Decision Making, 2021, 20(2): 553-596.
[3]	RENDLE S, FREUDENTHALER C, GANTNER Z, et al. BPR: Bayesian personalized ranking from implicit feedback[EB/OL]. [2022-5-29]. https://arxiv.org/abs/1205.2618.
[4]	SINGH A, THAKUR N, SHARMA A. A review of supervised machine learning algorithms[C]//2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom). [S.l.]: IEEE, 2016: 1310-1315.
[5]	SHANI G, HECKERMAN D, BRAFMAN R I, et al. An MDP-based recommender system[J]. Journal of Machine Learning Research, 2005, 6(9): 1265-1295.
[6]	POLYZOU A, NIKOLAKOPOULOS A N, KARYPIS G. Scholars walk: A Markov chain framework for course recommendation[J]. International Educational Data Mining Society, 2019: 396-401.
[7]	TANG J, WANG K. Personalized top-n sequential recommendation via convolutional sequence embedding[C]//Proceedings of the 11th ACM International Conference on Web Search and Data Mining. [S.l.]: ACM, 2018: 565-573.
[8]	DONKERS T, LOEPP B, ZIEGLER J. Sequential user-based recurrent neural network recommendations[C]//Proceedings of the 11th ACM Conference on Recommender Systems. [S.l.]: ACM, 2017: 152-160.
[9]	BHARADHWAJ H, JOSHI S. Explanations for temporal recommendations[J]. KI-Künstliche Intelligenz, 2018, 32(4): 267-272.
[10]	CHEN X, XU H, ZHANG Y, et al. Sequential recommendation with user memory networks[C]//Proceedings of the 11th ACM International Conference on Web Search and Data Mining. [S.l.]: ACM, 2018: 108-116.
[11]	WU S, TANG Y, ZHU Y, et al. Session-Based recommendation with graph neural networks[C]//Proceedings of the AAAI Conference on Artificial Intelligence. [S.l.]: AAAI, 2019: 346-353.
[12]	WANG Z, WEI W, CONG G, et al. Global context enhanced graph neural networks for session-based recommendation[C]//Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. [S.l.]: ACM, 2020: 169-178.
[13]	QIU R, LI J, HUANG Z, et al. Rethinking the item order in session-based recommendation with graph neural networks[C]//Proceedings of the 28th ACM International Conference on Information and Knowledge Management. [S.l.]: ACM, 2019: 579-588.
[14]	CHEN T, WONG R C W. Handling information loss of graph neural networks for session-based recommendation[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. [S.l.]: ACM, 2020: 1172-1180.
[15]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in Neural Information Processing Systems, 2017, 30.
[16]	LI J, REN P, CHEN Z, et al. Neural attentive session-based recommendation[C]//Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. [S.l.]: ACM, 2017: 1419-1428.
[17]	VAN DEN OORD A, LI Y, VINYALS O. Representation learning with contrastive predictive coding[EB/OL]. [2022-7-10]. https://arxiv.org/abs/1807.03748.
[18]	CHEN X, FAN H, GIRSHICK R, et al. Improved baselines with momentum contrastive learning[EB/OL]. [2022-7-28]. https://arxiv.org/abs/2003.04297.
[19]	HE K, FAN H, WU Y, et al. Momentum contrast for unsupervised visual representation learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. [S.l.]: IEEE, 2020: 9729-9738.
[20]	MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[EB/OL]. [2022-8-10]. https://arxiv.org/abs/1301.3781.
[21]	KANG W C, MCAULEY J. Self-Attentive sequential recommendation[C]//2018 IEEE International Conference on Data Mining (ICDM). [S.l.]: IEEE, 2018: 197-206.
[22]	WANG C, MA W, CHEN C. Sequential recommendation with multiple contrast signals[J]. ACM Transactions on Information Systems (TOIS), 2022, 41(1): 1-27.
[23]	CHEN T, KORNBLITH S, NOROUZI M, et al. A simple framework for contrastive learning of visual representations[C]//International Conference on Machine Learning. [S.l.]: PMLR, 2020: 1597-1607.
[24]	JAIN P, JAIN A, ZHANG T, et al. Contrastive code representation learning[EB/OL]. [2022-8-28]. https://arxiv.org/abs/2007.04973.
[25]	WU Z, WANG S, GU J, et al. Clear: Contrastive learning for sentence representation[EB/OL]. [2022-6-19]. https://arxiv.org/abs/2012.15466.
[26]	SANG Y F, WANG D, WU J C, et al. The relation between periods’ identification and noises in hydrologic series data[J]. Journal of Hydrology, 2009, 368(1-4): 165-177.
[27]	STAMMLER K. SeismicHandler-programmable multichannel data handler for interactive and automatic processing of seismological analyses[J]. Computers & Geosciences, 1993, 19(2): 135-140.
[28]	UNUMA M, ANJYO K, TAKEUCHI R. Fourier principles for emotion-based human figure animation[C]//Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques. [S.l.]: ACM, 1995: 91-96.
[29]	ZHOU K, YU H, ZHAO W X, et al. Filter-Enhanced MLP is all you need for sequential recommendation[C]//Proceedings of the ACM Web Conference 2022. [S.l.]: ACM, 2022: 2388-2399.
[30]	BA J L, KIROS J R, HINTON G E. Layer normalization[EB/OL]. [2022-6-29]. https://arxiv.org/abs/1607.06450.
[31]	SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: A simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
[32]	ZHAO W X, CHEN J, WANG P, et al. Revisiting alternative experimental settings for evaluating top-N item recommendation algorithms[C]//Proceedings of the 29th ACM International Conference on Information & Knowledge Management. [S.l.]: ACM, 2020: 2329-2332.
[33]	CEN Y, ZHANG J, ZOU X, et al. Controllable multi-interest framework for recommendation[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. [S.l.]: ACM, 2020: 2942-2951.
[34]	RENDLE S, FREUDENTHALER C, SCHMIDT-THIEME L. Factorizing personalized markov chains for next-basket recommendation[C]//Proceedings of the 19th International Conference on World Wide Web. [S.l.]: ACM, 2010: 811-820.
[35]	HIDASI B, KARATZOGLOU A, BALTRUNAS L, et al. Session-Based recommendations with recurrent neural networks[EB/OL]. [2022-9-11]. https://arxiv.org/abs/1511.06939.
[36]	SUN F, LIU J, WU J, et al. BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer[C]//Proceedings of the 28th ACM International Conference on Information and Knowledge Management. [S.l.]: ACM, 2019: 1441-1450.
[37]	LI J, WANG Y, MCAULEY J. Time interval aware self-attention for sequential recommendation[C]//Proceedings of the 13th International Conference on Web Search and Data Mining. [S.l.]: ACM, 2020: 322-330.

Dataset	Sequences	Items	Sparsity/%
Baby	19445	7011	99.91
Beauty	22363	12068	99.94
Movies and TV	2088620	186349	99.99
Toys and Games	1342911	283394	99.99
Grocery and Gourmet Food	14684	8687	99.90
Yelp-2018	104072	54034	99.92
ML-1M	6040	3704	95.58
Gowalla	76894	304440	99.77
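The sparsity column follows the usual definition: the share of the user-item matrix with no recorded interaction. The table does not list interaction counts, so its values cannot be recomputed here. The sketch below only shows the calculation, under the assumption that each sequence corresponds to one user, and the numbers in the usage example are hypothetical.

```python
def sparsity_percent(num_interactions: int, num_users: int, num_items: int) -> float:
    """Percentage of the user-item matrix that has no recorded interaction."""
    if num_users <= 0 or num_items <= 0:
        raise ValueError("num_users and num_items must be positive")
    density = num_interactions / (num_users * num_items)
    return 100.0 * (1.0 - density)


if __name__ == "__main__":
    # Hypothetical dataset: 1000 users, 500 items, 2500 interactions.
    print(sparsity_percent(2500, 1000, 500))
```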

Dataset	Method	HR@5	HR@10	NDCG@5	NDCG@10
Yelp	SASRec	0.368 5	0.498 2	0.259 9	0.301 8
	FMLP-Rec	0.360 7	0.493 9	0.253 4	0.296 4
	CSFTRec+	0.378 0	0.515 0	0.263 2	0.307 5
	CSFTRec	0.382 4	0.521 1	0.268 0	0.312 8
MovieLens	SASRec	0.702 0	0.805 0	0.550 2	0.583 8
	FMLP-Rec	0.713 9	0.809 8	0.562 0	0.593 1
	CSFTRec+	0.703 8	0.798 3	0.553 5	0.584 4
	CSFTRec	0.715 4	0.812 9	0.566 8	0.598 6
Food	SASRec	0.385 5	0.474 2	0.287 6	0.316 4
	FMLP-Rec	0.396 3	0.483 7	0.300 4	0.328 7
	CSFTRec+	0.404 9	0.489 9	0.310 7	0.338 3
	CSFTRec	0.401 7	0.479 9	0.309 1	0.335 1
Gowalla	SASRec	0.617 4	0.738 5	0.477 1	0.516 3
	FMLP-Rec	0.618 2	0.737 2	0.478 4	0.517 0
	CSFTRec+	0.642 3	0.749 8	0.500 3	0.535 2
	CSFTRec	0.669 8	0.776 3	0.524 1	0.558 8
Movies	SASRec	0.694 9	0.781 6	0.572 9	0.601 2
	FMLP-Rec	0.691 0	0.775 9	0.571 3	0.598 9
	CSFTRec+	0.707 6	0.786 9	0.595 4	0.621 2
	CSFTRec	0.704 8	0.793 1	0.595 5	0.620 3

Dataset	Method	HR@5	HR@10	NDCG@5	NDCG@10
Baby	POP	0.204 6	0.314 4	0.131 9	0.167 1
	BPRMF	0.229 1	0.339 8	0.151 2	0.186 8
	FPMC	0.260 6	0.371 4	0.181 2	0.216 8
	ComiRec	0.267 9	0.388 5	0.180 0	0.218 9
	GRU4Rec	0.291 9	0.415 8	0.197 9	0.237 9
	CSFTRec	0.300 8	0.422 2	0.208 7	0.244 3
Beauty	POP	0.173 5	0.290 0	0.114 1	0.151 6
	BPRMF	0.363 6	0.464 9	0.264 3	0.297 1
	FPMC	0.344 8	0.428 6	0.264 8	0.291 6
	ComiRec	0.367 5	0.479 0	0.264 6	0.300 8
	GRU4Rec	0.326 8	0.438 3	0.234 2	0.270 1
	CSFTRec	0.376 2	0.469 7	0.290 1	0.320 4
Food	POP	0.206 5	0.334 7	0.130 1	0.171 3
	BPRMF	0.354 7	0.460 9	0.248 4	0.282 9
	FPMC	0.359 1	0.438 1	0.279 6	0.305 1
	ComiRec	0.366 3	0.474 4	0.259 6	0.294 7
	GRU4Rec	0.362 6	0.472 4	0.257 8	0.293 3
	CSFTRec	0.401 7	0.479 9	0.309 1	0.335 1
Toys	POP	0.338 5	0.408 3	0.276 6	0.299 1
	BPRMF	0.376 0	0.460 1	0.281 9	0.309 2
	FPMC	0.405 1	0.491 0	0.315 1	0.342 9
	ComiRec	0.437 6	0.527 7	0.338 0	0.367 2
	GRU4Rec	0.395 8	0.513 3	0.284 9	0.323 0
	CSFTRec	0.445 9	0.575 1	0.325 2	0.367 0
Movies	POP	0.635 3	0.757 1	0.500 7	0.540 2
	BPRMF	0.456 0	0.563 1	0.373 9	0.452 3
	FPMC	0.630 0	0.746 1	0.498 3	0.536 0
	ComiRec	0.661 3	0.770 4	0.524 1	0.559 5
	GRU4Rec	0.601 3	0.734 5	0.451 5	0.494 7
	CSFTRec	0.704 8	0.793 1	0.595 5	0.620 3

Dataset	Method	HR@5	HR@10	NDCG@5	NDCG@10
Baby	SASRec	0.290 8	0.409 9	0.202 9	0.241 3
	FMLP-Rec	0.282 8	0.393 1	0.196 6	0.232 1
	CSFTRec+	0.297 1	0.417 9	0.205 3	0.247 7
	CSFTRec	0.300 8	0.422 2	0.208 7	0.244 3
Beauty	SASRec	0.367 3	0.460 1	0.277 6	0.307 5
	FMLP-Rec	0.371 1	0.464 4	0.281 0	0.311 0
	CSFTRec+	0.368 3	0.459 2	0.280 1	0.309 8
	CSFTRec	0.376 2	0.469 7	0.290 1	0.320 4
Food	SASRec	0.385 5	0.474 2	0.287 6	0.316 4
	FMLP-Rec	0.390 2	0.482 0	0.293 6	0.322 5
	CSFTRec+	0.389 2	0.477 4	0.293 8	0.322 3
	CSFTRec	0.401 7	0.479 9	0.309 1	0.335 1
Toys	SASRec	0.368 3	0.450 5	0.290 4	0.316 9
	FMLP-Rec	0.406 2	0.488 8	0.320 6	0.347 3
	CSFTRec+	0.419 3	0.510 2	0.323 3	0.352 8
	CSFTRec	0.445 9	0.575 1	0.325 2	0.367 0
Movies	SASRec	0.694 9	0.781 6	0.572 9	0.601 2
	FMLP-Rec	0.708 4	0.785 6	0.580 6	0.609 4
	CSFTRec+	0.701 1	0.790 8	0.578 1	0.607 2
	CSFTRec	0.704 8	0.793 1	0.595 5	0.620 3
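The HR@K and NDCG@K columns in the tables above are standard top-K ranking metrics. The following minimal sketch assumes the common protocol in which each user has exactly one held-out target item and the model reports that item's 1-based rank among the candidates; the evaluation protocol itself is not described in this excerpt, and the function names are illustrative.

```python
import math


def hit_rate_at_k(rank: int, k: int) -> float:
    """1.0 if the held-out item is ranked within the top k (rank is 1-based)."""
    return 1.0 if rank <= k else 0.0


def ndcg_at_k(rank: int, k: int) -> float:
    """With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1)."""
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0


def evaluate(ranks, k: int):
    """Average HR@k and NDCG@k over the rank of each user's target item."""
    n = len(ranks)
    hr = sum(hit_rate_at_k(r, k) for r in ranks) / n
    ndcg = sum(ndcg_at_k(r, k) for r in ranks) / n
    return hr, ndcg


if __name__ == "__main__":
    # Hypothetical ranks of the target item for three users.
    print(evaluate([1, 3, 12], k=10))
```

HR@K only counts whether the target appears in the top K, while NDCG@K also rewards placing it nearer the top, which is why a method can lead on one metric and trail on the other in the same row group.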


