GU Yi-ran, XU Meng-xin. Keyword Extraction from News Articles Based on PageRank Algorithm[J]. Journal of University of Electronic Science and Technology of China, 2017, 46(5): 777-783. DOI: 10.3969/j.issn.1001-0548.2017.05.021

Keyword Extraction from News Articles Based on PageRank Algorithm

  • Most existing keyword extraction methods based on complex networks ignore the characteristics of natural language when building the weighted text network, and they seldom draw on the classical algorithms of the complex network field. Building on the PageRank algorithm, we propose a keyword extraction method named LTWPR (located and TF-weighted PageRank) that accounts for both term frequency and the characteristics of human language. The algorithm defines a term-frequency-shared weight, which distributes a node's term-frequency value across its links, and a position weight coefficient, which expresses the differing importance of words at different positions in a news article. LTWPR thus takes both the local and the global features of the text network into account, which makes its results more accurate. Comprehensive experiments are conducted on news articles collected from Sina News. The experimental results show that LTWPR is more effective and better covers the keywords tagged by the authors.
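
The abstract names two ingredients, a term-frequency-shared link weight and a position weight coefficient, without giving their formulas. The following is a minimal sketch of how such a PageRank variant could look, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names `build_graph` and `ltwpr_sketch`, the co-occurrence window, the rule that a node passes its score to each neighbor in proportion to that neighbor's term frequency, and the `title_boost` factor that stands in for the position coefficient.

```python
from collections import Counter, defaultdict


def build_graph(sentences, window=2):
    """Undirected co-occurrence graph over tokenized sentences.

    Two distinct words are linked when they fall inside the same sliding
    window of `window` tokens (window=2 links adjacent words only).
    """
    neighbors = defaultdict(set)
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for v in tokens[i + 1 : i + window]:
                if v != w:
                    neighbors[w].add(v)
                    neighbors[v].add(w)
    return neighbors


def ltwpr_sketch(title, body, damping=0.85, title_boost=2.0, iters=50, window=2):
    """Rank words with a TF-weighted, position-aware PageRank (illustrative only)."""
    sentences = [title] + body
    tf = Counter(w for s in sentences for w in s)
    neighbors = build_graph(sentences, window)
    words = sorted(neighbors)

    # ASSUMPTION: the position coefficient is a flat boost for title words,
    # applied through the teleport (base) distribution.
    title_words = set(title)
    pos = {w: (title_boost if w in title_words else 1.0) for w in words}
    total_pos = sum(pos.values())
    base = {w: pos[w] / total_pos for w in words}

    # ASSUMPTION: node u hands its score to neighbor v in proportion to tf[v],
    # so frequent words receive a larger share of each link.
    out_norm = {u: sum(tf[v] for v in neighbors[u]) for u in words}

    rank = dict(base)
    for _ in range(iters):
        new_rank = {}
        for v in words:
            incoming = sum(rank[u] * tf[v] / out_norm[u] for u in neighbors[v])
            new_rank[v] = (1 - damping) * base[v] + damping * incoming
        rank = new_rank
    return sorted(rank.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    title = ["pagerank", "keyword"]
    body = [
        ["pagerank", "ranks", "keyword", "nodes"],
        ["keyword", "extraction", "uses", "pagerank"],
    ]
    for word, score in ltwpr_sketch(title, body):
        print(f"{word}\t{score:.4f}")
```

In this sketch the link weights carry the local information (term frequency and direct neighbors), while the PageRank iteration spreads scores across the whole graph. That is one plausible reading of the abstract's "local and global features"; the weights used in the paper may be defined differently.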
