LDA-Based Topic Analysis of Overall Research Trends in Complex Networks

Evolution Properties of Complex Networks in Terms of the LDA

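  • Abstract (translated from the Chinese): Research on complex networks has developed very rapidly and has already had a profound influence on disciplines such as automatic control, statistical physics, computer science, and management. However, the development of research topics in China has lacked systematic and intuitive analysis. Taking the abstract texts of the 13th National Complex Network Conference (2017) as the research object, this paper studies the overall development of domestic complex network research from the perspective of topic analysis of conference abstracts. The abstract texts are first preprocessed: a custom dictionary and a stop-word list are built, and the texts are segmented with jieba to obtain a document-word matrix. The LDA topic model is then used to mine the topics of the abstracts, with the number of topics determined by SVD. Agglomerative hierarchical clustering is performed on the JS distance between abstracts, and the Blondel algorithm is used to partition institutions into communities based on the JS distance between institutions, finally yielding 10 conference topic categories and 4 research communities. The empirical results not only reveal macro-level research trends in complex networks and the popularity of different research directions, but also, through the 4 identified research communities, offer reference institutions to researchers new to complex networks who are looking for literature in a corresponding research direction.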

     

    Abstract: Research on complex networks has developed rapidly and has had a profound impact on disciplines such as automatic control, statistical physics, computer science, and management. However, the development of research topics in China has lacked systematic and intuitive analysis. Taking the abstracts of the 13th National Complex Network Conference (2017) as the research object, we investigate the overall topic trends of domestic complex network research. First, the abstract texts are preprocessed and segmented using a custom dictionary and a stop-word list to obtain a document-word matrix. Then the LDA topic model is used to mine the topics of the abstracts, and SVD is applied to determine the number of topics. Ten conference topics are found through agglomerative hierarchical clustering based on the JS distance between abstracts, and four research communities are identified through community detection based on the JS distance between institutions. The results not only give insight into macro-level research trends and the popularity of different research directions in complex networks, but also provide reference institutions for new researchers looking for literature in a corresponding research direction.
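
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: the document-topic vectors in `theta` are hypothetical values standing in for LDA output, the "JS distance" is taken to be the square root of the base-2 Jensen-Shannon divergence, and average linkage is used because the abstract does not name a linkage criterion. The function names are illustrative only.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute 0 by convention.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def js_distance(p, q):
    # Assumption: "JS distance" means the square root of the divergence.
    # Clamp at 0 to guard against tiny negative values from rounding.
    return math.sqrt(max(js_divergence(p, q), 0.0))


def agglomerate(dists, n_clusters):
    """Naive average-linkage agglomerative clustering of topic distributions.

    Returns a list of clusters, each a list of row indices into `dists`.
    """
    clusters = [[i] for i in range(len(dists))]

    def linkage(a, b):
        # Average pairwise JS distance between the members of two clusters.
        total = sum(js_distance(dists[i], dists[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical document-topic distributions for five abstracts, three topics.
theta = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
]
clusters = agglomerate(theta, 3)
print(clusters)  # [[0, 1], [2], [3, 4]]
```

The same `js_distance` function could be applied to institution-level topic distributions to build the weighted graph on which community detection is run; that step is not sketched here.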

     
