Citation: CHENG Shaohuan, TANG Yujia, LIU Qiao, CHEN Wenyu. Cross-Lingual Summarization Method Based on Joint Training and Self-Training in Low-Resource Scenarios[J]. Journal of University of Electronic Science and Technology of China, 2024, 53(5): 762-770. DOI: 10.12178/1001-0548.2024173

Cross-Lingual Summarization Method Based on Joint Training and Self-Training in Low-Resource Scenarios

  • As globalization deepens, cross-lingual summarization has become an important topic in natural language processing. In low-resource scenarios, existing methods face challenges such as limited representation transfer and insufficient data utilization. To address these issues, this paper proposes a method based on joint training and self-training. Specifically, two models handle the translation and cross-lingual summarization tasks respectively, which unifies the language vector space of their outputs and avoids the problem of limited representation transfer. Joint training is then performed by aligning the output features and probabilities of the two models on parallel training pairs, thereby enhancing semantic sharing between them. On top of joint training, a self-training technique is introduced that generates synthetic data from additional monolingual summarization data, effectively mitigating the data scarcity of low-resource scenarios. Experimental results demonstrate that this method outperforms existing approaches in multiple low-resource scenarios, achieving significant improvements in ROUGE scores.
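The abstract describes the method only at a high level. The following is a minimal, non-authoritative sketch of what the joint-training objective could look like, assuming two HuggingFace seq2seq models that share the target-language vocabulary and hidden size; the checkpoint paths, batch field names, and loss weights alpha and beta are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of a joint-training loss matching the abstract's description.
# Assumptions (not from the paper): both models are HuggingFace seq2seq models
# sharing the target-language vocabulary and hidden size; checkpoint paths and
# loss weights are placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

mt_model = AutoModelForSeq2SeqLM.from_pretrained("path/to/translation-model")    # hypothetical
cls_model = AutoModelForSeq2SeqLM.from_pretrained("path/to/summarization-model") # hypothetical

def joint_loss(batch, alpha=1.0, beta=1.0):
    # Translation branch: source-language summary -> target-language summary.
    mt_out = mt_model(input_ids=batch["src_summary_ids"],
                      labels=batch["tgt_summary_ids"],
                      output_hidden_states=True)
    # Cross-lingual summarization branch: source document -> target summary.
    cls_out = cls_model(input_ids=batch["src_doc_ids"],
                        labels=batch["tgt_summary_ids"],
                        output_hidden_states=True)

    # Supervised cross-entropy for both tasks (teacher forcing).
    ce = mt_out.loss + cls_out.loss

    # Probability alignment: symmetric KL between the per-token output
    # distributions, which both range over the same target vocabulary.
    p = F.log_softmax(mt_out.logits, dim=-1)
    q = F.log_softmax(cls_out.logits, dim=-1)
    kl = 0.5 * (F.kl_div(p, q, log_target=True, reduction="batchmean")
                + F.kl_div(q, p, log_target=True, reduction="batchmean"))

    # Feature alignment: MSE between the last decoder hidden states,
    # pushing the two decoders toward a shared semantic space.
    feat = F.mse_loss(mt_out.decoder_hidden_states[-1],
                      cls_out.decoder_hidden_states[-1])

    return ce + alpha * kl + beta * feat
```

Likewise, a hedged sketch of the self-training step: extra monolingual pairs (document, source-language summary) lack a target-language summary, so the trained translation model generates one, producing a pseudo-labeled cross-lingual pair for the next training round. The decoding settings below are arbitrary illustrative choices.

```python
# Sketch of pseudo-label generation for self-training (settings illustrative).
tokenizer = AutoTokenizer.from_pretrained("path/to/translation-model")  # hypothetical

def make_pseudo_pair(doc_text: str, src_summary_text: str):
    inputs = tokenizer(src_summary_text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        pseudo_ids = mt_model.generate(**inputs, num_beams=4, max_new_tokens=128)
    pseudo_summary = tokenizer.decode(pseudo_ids[0], skip_special_tokens=True)
    # The synthetic (document, target-language summary) pair is added to the
    # cross-lingual summarization training set.
    return doc_text, pseudo_summary
```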
