Abstract:
Entities such as drug names are difficult to identify accurately in Chinese medical consultation texts because of frequent colloquial, irregular expressions and jargon. To make full use of the important role of inter-word relations in Chinese sentences, a medical named entity recognition model that enhances global information is proposed. The model enhances the word embedding representation with an attention mechanism and, building on the sequence-processing capability of a bidirectional long short-term memory network to obtain contextual information, enriches the global information representation of sentences in two ways. First, a graph convolutional network layer is constructed over syntactic relations to capture additional dependencies between words; second, an auxiliary task is constructed to predict the class of the syntactic dependency between word pairs. Experimental results on a Chinese medical consultation dataset show that the model is highly competitive, achieving an F1 score of 94.54%, with significant improvements over other models in recognizing entity classes such as drugs and symptoms. Experiments on the public Weibo dataset further show that the model also applies to the general domain.
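The core idea of the graph layer described above, propagating word features along syntactic dependency arcs, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the example sentence, feature sizes, and weight matrix are all assumed for demonstration, and real systems would use a deep-learning framework with learned parameters.

```python
def gcn_step(features, arcs, weight):
    """One graph-convolution step over a dependency tree.

    Each token aggregates the linearly transformed features of its
    syntactic neighbours: h_i' = ReLU(sum_j A_hat[i][j] * (h_j @ W)).
    """
    n = len(features)
    # Adjacency matrix with self-loops, built from (head, dependent) arcs.
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for head, dep in arcs:
        adj[head][dep] = adj[dep][head] = 1.0
    # Symmetric degree normalization: A_hat = D^{-1/2} A D^{-1/2}.
    deg = [sum(row) for row in adj]
    a_hat = [[adj[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)]
             for i in range(n)]
    # Linear transform of each node's feature vector: h_j @ W.
    hw = [[sum(h[k] * weight[k][c] for k in range(len(h)))
           for c in range(len(weight[0]))] for h in features]
    # Aggregate over syntactic neighbours, then apply ReLU.
    return [[max(0.0, sum(a_hat[i][j] * hw[j][c] for j in range(n)))
             for c in range(len(weight[0]))] for i in range(n)]

# Toy example (assumed): 3 tokens, dependency arcs 0-1 and 1-2,
# 2-dimensional features, identity weight matrix.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
arcs = [(0, 1), (1, 2)]
W = [[1.0, 0.0], [0.0, 1.0]]
out = gcn_step(feats, arcs, W)
print(out)
```

In the full model, the outputs of such a layer would be combined with the BiLSTM's contextual representations, so that each word's representation reflects both its linear context and its syntactic neighbours.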