User Clustering Method Based on Multi-dimensional Behavior Analysis

Abstract: Cluster analysis is an important data mining technique. Clustering users by their multi-dimensional behavior gives managers more accurate and effective user-level evaluation information. This paper first extracts multi-dimensional behavior features from user behavior data, then applies an unsupervised feature selection model based on mutual information (UFS-MI) to rank and filter the extracted features and determine their weights, yielding a weighted feature vector for each user's behavior. A network is constructed from the similarity between user behaviors, and the resulting user behavior network is clustered with the Blondel community detection algorithm. Experiments on an empirical data set from a bus line show that the method reaches an accuracy of 92%, a clear improvement over the traditional K-means clustering algorithm. The results can serve as a reference for bus company management when organizing unified management and training. This work extends the application of network science to the clustering of multi-dimensional user behavior data, enriches the approaches available for clustering multi-dimensional driving behavior data, and provides a reference for decision makers.
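The last two steps of the pipeline (similarity network, then community detection) can be illustrated with a minimal sketch. The abstract does not state the similarity measure, the edge threshold, or the feature weights, so the weighted cosine similarity, the threshold of 0.8, the weight values, and the toy vectors below are all assumptions made for illustration. The function names are hypothetical as well. The clustering step implements only the local-moving phase of Blondel-style modularity optimization (the method commonly known as Louvain) and omits the aggregation phase, so it is a simplification rather than the authors' implementation.

```python
import math
from itertools import combinations


def weighted_cosine(u, v, w):
    """Cosine similarity with each feature dimension scaled by its weight."""
    num = sum(wi * a * b for wi, a, b in zip(w, u, v))
    den_u = math.sqrt(sum(wi * a * a for wi, a in zip(w, u)))
    den_v = math.sqrt(sum(wi * b * b for wi, b in zip(w, v)))
    return num / (den_u * den_v) if den_u and den_v else 0.0


def build_network(vectors, weights, threshold):
    """Return {node: {neighbor: similarity}}, keeping edges at or above threshold."""
    adj = {i: {} for i in range(len(vectors))}
    for i, j in combinations(range(len(vectors)), 2):
        s = weighted_cosine(vectors[i], vectors[j], weights)
        if s >= threshold:
            adj[i][j] = s
            adj[j][i] = s
    return adj


def local_moving(adj):
    """Greedy local-moving phase: move each node to the neighboring
    community with the largest modularity gain until no node moves."""
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    two_m = sum(degree.values())          # twice the total edge weight
    community = {n: n for n in adj}       # start with one community per node
    comm_degree = dict(degree)            # total degree of each community
    if two_m == 0:
        return community
    improved = True
    while improved:
        improved = False
        for n in adj:
            old = community[n]
            comm_degree[old] -= degree[n]  # temporarily remove n from its community
            # Total edge weight from n to each neighboring community.
            links = {}
            for nbr, w in adj[n].items():
                c = community[nbr]
                links[c] = links.get(c, 0.0) + w
            # Gain (up to a constant factor) of placing n into community c.
            best = old
            best_gain = links.get(old, 0.0) - comm_degree[old] * degree[n] / two_m
            for c, k_in in links.items():
                gain = k_in - comm_degree[c] * degree[n] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            community[n] = best
            comm_degree[best] += degree[n]
            if best != old:
                improved = True
    return community


# Toy weighted feature vectors for six hypothetical users (assumed data).
vectors = [
    [1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1],
    [0.0, 0.1, 1.0], [0.1, 0.0, 0.9], [0.0, 0.2, 1.0],
]
weights = [0.5, 0.2, 0.3]   # assumed feature weights, standing in for UFS-MI output
network = build_network(vectors, weights, threshold=0.8)
labels = local_moving(network)
print(labels)
```

In this toy data the first three users and the last three users form two disconnected groups in the similarity network, so each group ends up sharing one community label. On real data the choice of threshold controls how dense the network is and therefore how many communities are found.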

     
