Antimicrobial Peptides Prediction Based on BERT and Ensemble Learning

  • Abstract: As the best substitute for antibiotics, antimicrobial peptides (AMPs) are of considerable research significance, and accurately identifying them with computational methods has been a key problem in bioinformatics in recent years. Traditional machine learning methods, however, require features to be manually extracted and selected from sequence information, which results in low AMP identification accuracy. To address this, a deep learning prediction method based on Bidirectional Encoder Representations from Transformers (BERT) is proposed. To evaluate existing BERT-based AMP tools comprehensively and further improve computational AMP prediction, four existing BERT-based AMP prediction models are compared in terms of pre-training strategy, word vector embedding, and prediction performance, and a new AMP prediction tool is then proposed based on the idea of ensemble learning. Experimental results show that the proposed model improves on several performance evaluation metrics.
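
The ensemble idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which ensemble strategy the authors use, so the soft-voting scheme below (averaging the AMP probabilities produced by several base predictors and thresholding the mean) is only an assumption, and the model names, the 0.5 threshold, and the helper functions `soft_vote` and `ensemble_predict` are hypothetical.

```python
from statistics import mean


def soft_vote(probabilities, threshold=0.5):
    """Average the AMP probabilities from several models for one peptide.

    Returns (mean_probability, is_amp). Soft voting is an assumed
    strategy here, not necessarily the one used in the paper.
    """
    if not probabilities:
        raise ValueError("need at least one model probability")
    score = mean(probabilities)
    return score, score >= threshold


def ensemble_predict(per_model_scores, threshold=0.5):
    """Combine per-model scores into one prediction per peptide.

    per_model_scores maps a (hypothetical) model name to a list of
    probabilities, one entry per peptide, in the same peptide order.
    """
    lengths = {len(scores) for scores in per_model_scores.values()}
    if len(lengths) != 1:
        raise ValueError("every model must score the same number of peptides")
    # zip(*...) regroups the scores so each tuple holds one peptide's
    # probabilities across all models.
    per_peptide = zip(*per_model_scores.values())
    return [soft_vote(list(column), threshold) for column in per_peptide]


# Illustrative scores only; these are not results from the paper.
example_scores = {
    "model_a": [0.9, 0.2],
    "model_b": [0.7, 0.4],
    "model_c": [0.8, 0.1],
    "model_d": [0.6, 0.3],
}
predictions = ensemble_predict(example_scores)
```

Each entry of `predictions` pairs the averaged probability with a boolean AMP label. Weighted voting or a stacked meta-classifier would fit into the same interface by changing only `soft_vote`.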

     
