A Method of Micro-Blog Users' Interests Topic Extraction

Abstract: The bag-of-words model is first improved to suit the characteristics of short social media texts. A semantic representation model is then proposed that exploits the semantic relations between features, and a sequence graph model is constructed from the order in which features appear within a sentence. Building on these models and introducing a time factor, we propose a user interest topic model based on the Single-Pass algorithm to extract the topics that microblog users follow. Experimental results show that, compared with the FSC-LDA method, the proposed method improves the FM, AA, and F metrics by 200.40%, 46.50%, and 80.05%, respectively.
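To illustrate the general idea of Single-Pass clustering with a time factor, the following is a minimal sketch and not the implementation used in this work. It makes several assumptions: raw term-frequency vectors with cosine similarity stand in for the semantic representation and sequence graph models, the time factor is assumed to be an exponential decay over the gap since a topic was last updated, and the names `single_pass`, `threshold`, and `decay` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(posts, threshold=0.3, decay=0.1):
    """Cluster posts in one pass.

    posts: iterable of (timestamp_in_days, list_of_tokens), in time order.
    threshold, decay: hypothetical parameters chosen for illustration only.
    """
    topics = []  # each topic: centroid vector, last update time, member indices
    for idx, (t, tokens) in enumerate(posts):
        vec = Counter(tokens)
        best, best_score = None, 0.0
        for topic in topics:
            # Assumed time factor: similarity decays with the time gap.
            gap = max(0.0, t - topic["last_seen"])
            score = cosine(vec, topic["centroid"]) * math.exp(-decay * gap)
            if score > best_score:
                best, best_score = topic, score
        if best is not None and best_score >= threshold:
            # Close enough to an existing topic: merge the post into it.
            best["centroid"].update(vec)
            best["last_seen"] = t
            best["members"].append(idx)
        else:
            # Otherwise the post starts a new topic.
            topics.append({"centroid": Counter(vec), "last_seen": t, "members": [idx]})
    return topics


posts = [
    (0, ["football", "match", "goal"]),
    (1, ["football", "goal", "team"]),
    (2, ["python", "code", "bug"]),
    (40, ["football", "match", "goal"]),
]
for topic in single_pass(posts):
    print(topic["members"], topic["centroid"].most_common(3))
```

In this sketch each post is compared only against the topics found so far, so the data are read once. Under the assumed decay, the last post opens a new topic even though its words match the first topic, because that topic has not been updated for a long time; setting `decay=0.0` removes the time factor and the post joins the first topic instead.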