A Gradient-Similarity Based Multi-Topic Jointly Pre-Training Method for Automated Essay Scoring

  • Abstract: This paper proposes a gradient-similarity-based automatic weighting method for multi-topic joint pre-training in automated essay scoring (AES). In the pre-training stage, training data from multiple topics are used at the same time, and each training sample from an external (non-target) topic is assigned a loss weight equal to the similarity between its gradient vector and the gradient vector of the target topic. The paper also combines deep learning with feature engineering by designing three types of handcrafted features. Comparative experiments on publicly available datasets show that, compared with existing baseline models, both the proposed multi-topic joint pre-training method and the handcrafted features effectively improve the scoring accuracy of the AES model.
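The weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, how the target-topic gradient is aggregated, or the scoring model, so everything below is an assumption made for illustration: cosine similarity, a mean gradient over the target-topic samples, negative similarities clipped to zero, and a toy linear model with squared loss standing in for the neural scorer. All function names are hypothetical.

```python
import math

def sample_gradient(w, x, y):
    """Gradient of 0.5 * (w.x - y)**2 with respect to w for one sample."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]

def mean_gradient(w, samples):
    """Average per-sample gradient over a list of (x, y) pairs."""
    grads = [sample_gradient(w, x, y) for x, y in samples]
    return [sum(col) / len(grads) for col in zip(*grads)]

def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm = math.sqrt(sum(ai * ai for ai in a)) * math.sqrt(sum(bi * bi for bi in b))
    return dot / norm if norm > 0 else 0.0

def similarity_weights(w, external, target):
    """Loss weight for each external-topic sample: similarity between its
    gradient and the mean target-topic gradient, clipped at 0 (assumption)."""
    g_target = mean_gradient(w, target)
    return [
        max(0.0, cosine_similarity(sample_gradient(w, x, y), g_target))
        for x, y in external
    ]

def weighted_external_loss(w, external, weights):
    """Mean of the per-sample squared losses, each scaled by its weight."""
    total = 0.0
    for (x, y), weight in zip(external, weights):
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        total += weight * 0.5 * residual ** 2
    return total / len(external)

# Toy data: one target-topic sample and three external-topic samples.
w = [0.0, 0.0]
target = [([1.0, 0.0], 1.0)]
external = [([1.0, 0.0], 1.0), ([-1.0, 0.0], 1.0), ([0.0, 1.0], 1.0)]
weights = similarity_weights(w, external, target)
loss = weighted_external_loss(w, external, weights)
```

In this toy setting, an external sample whose gradient points the same way as the target-topic gradient keeps its full weight, while samples with opposing or orthogonal gradients contribute nothing to the pre-training loss. Whether the paper clips, rescales, or normalizes the weights is not stated in the abstract.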

     
