Abstract:
With the spread of video surveillance and camera networks, face recognition over continuous video frames in unconstrained scenes is attracting growing interest. Most traditional methods for face recognition over continuous video frames suffer from fluctuating recognition results and heavy computational cost. In this paper, we design an efficient 3D decomposition convolution that reduces the computational cost of video face recognition while improving recognition accuracy. We also propose a temporal pyramid network that mines complementary information between frames to further improve recognition accuracy. We evaluate the method on the YTF and PaSC datasets.
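
To give a rough sense of why decomposing a 3D convolution can save computation, the sketch below compares the weight count of a full 3D kernel with that of a spatial convolution followed by a temporal one. The abstract does not specify how the decomposition is performed, so the spatial-then-temporal split, the channel sizes, and the function names `conv3d_params` and `factorized_params` are all assumptions chosen for illustration, not the method proposed here.

```python
# Illustrative only: assumes a (1 x kh x kw) spatial convolution followed by a
# (kt x 1 x 1) temporal convolution. The paper's actual decomposition may differ.

def conv3d_params(c_in, c_out, kt, kh, kw):
    """Number of weights in a full 3D convolution (bias ignored)."""
    return c_in * c_out * kt * kh * kw


def factorized_params(c_in, c_out, kt, kh, kw, c_mid=None):
    """Number of weights in a spatial conv followed by a temporal conv.

    c_mid is the (hypothetical) channel count between the two convolutions;
    it defaults to c_out.
    """
    if c_mid is None:
        c_mid = c_out
    spatial = c_in * c_mid * kh * kw   # 1 x kh x kw kernel
    temporal = c_mid * c_out * kt      # kt x 1 x 1 kernel
    return spatial + temporal


# Example layer: 64 input channels, 64 output channels, 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)
fact = factorized_params(64, 64, 3, 3, 3)
ratio = fact / full

print(f"full 3D: {full} weights, factorized: {fact} weights, ratio: {ratio:.3f}")
```

Under these assumptions the factorized layer needs fewer than half the weights of the full 3D layer, and the multiply-accumulate count per output position shrinks by the same factor. The actual savings depend on the decomposition and intermediate channel width used.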