Abstract:
We analyze topics in Sina microblog data using the Twitter-LDA topic model. An analysis of the correlation between users' topic interests shows that these interests are correlated within three degrees. Within a given topic, as the number of microblogs a user publishes increases, the number published by that user's fans within three degrees also increases, though with fluctuations, and the similarity of topic interests between a user and their multi-degree fans decreases as the degree increases. Comparing diffusion across topic categories, we find that users prefer information on lifestyle topics, that reposting probability differs significantly among microblogs in different topic categories, and that the average reposting count can differ by a factor of 10. In microblog information diffusion trees, diffusion depth, diffusion time interval, and users' diffusion ability all show different characteristics for microblogs in different topic categories.