Abstract:
Chinese electronic medical record texts are highly specialized and grammatically complex, which makes named entity recognition (NER), a core natural language processing (NLP) task, difficult to perform on them. To accurately identify medical entities in electronic medical record data, a named entity recognition algorithm combining semantic and boundary information is proposed. In this algorithm, the graphic information of Chinese characters is extracted with a convolutional neural network (CNN), and the semantic information of the characters is enriched with Wubi features. The text is then matched against a medical dictionary to obtain the potential words associated with each character, using the Lattice structure of the FLAT model. Finally, the Lattice model incorporating this semantic information is used for named entity recognition in Chinese electronic medical records. Experimental results show that the method achieves better recognition performance than other existing methods on the Yidu-S4K dataset and an F1 score of 96.06% on the Resume dataset.
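The dictionary-matching step can be illustrated with a minimal sketch. It assumes the flat-lattice representation used by FLAT, in which every character and every dictionary-matched word becomes a span carrying a head and a tail position. The function name `build_flat_lattice`, the `max_word_len` parameter, and the toy lexicon are hypothetical and are not taken from the paper; a real system would load a full medical dictionary and would probably use a trie for matching.

```python
from typing import List, Set, Tuple

# (token, head index, tail index); both indices are inclusive character offsets
Span = Tuple[str, int, int]


def build_flat_lattice(text: str, lexicon: Set[str], max_word_len: int = 8) -> List[Span]:
    """Return character spans followed by dictionary-matched word spans.

    Hypothetical helper for illustration only, not the authors' implementation.
    """
    # Every character is a span whose head and tail coincide.
    spans: List[Span] = [(ch, i, i) for i, ch in enumerate(text)]

    # Brute-force lookup of every substring of length >= 2 in the lexicon.
    for start in range(len(text)):
        longest_end = min(len(text), start + max_word_len)
        for end in range(start + 2, longest_end + 1):
            word = text[start:end]
            if word in lexicon:
                spans.append((word, start, end - 1))
    return spans


# Toy lexicon (an assumption for this example).
toy_lexicon = {"患者", "糖尿病"}
lattice = build_flat_lattice("患者有糖尿病", toy_lexicon)
for token, head, tail in lattice:
    print(token, head, tail)
```

Matched words carry boundary information through their head and tail indices, while the character spans are where glyph and Wubi features would be attached in the proposed model.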