Abstract:
In order to measure the relationship between features more accurately, an objective function representation method based on mutual information adaptive estimation is proposed for speaker verification systems. This objective function introduces an adaptive metric learning method, and the optimization objective is maximizing the intra-class similarity and minimizing the inter-class similarity. Meanwhile, the objective function can dynamically adjust the similarity according to the real distribution of deep features. Based on dynamically adjusting, the deep neural networks can be optimized towards the direction of stronger discrimination. In addition, the adaptive metric method is used for feature sampling and update the parameters according to the characteristics of the features. Thus, the feature can be more typical and beneficial to improve the supervised ability of the optimization direction of the deep neural networks. Experimental results show that, compared with other deep neural networks, the relative equal error rate of the proposed method is reduced by up to 28%, and the performance of the speaker verification system is significantly improved.