Abstract:
To improve the accuracy of anthropometric feature point localization against complex backgrounds and under arbitrary clothing, the stacked hourglass network (SHN) is applied to the localization of anthropometric feature points in human body images. However, the resolution of the SHN model’s output feature map is too low to localize feature points with high accuracy, so a Deconv-SHN model is proposed to address this problem. First, the output layer of the original model is replaced by several deconvolution layers to increase the resolution of the output feature map. Second, the objective function is optimized based on Smooth L1 and local response. Experimental results on a self-built dataset of 6 700 human body images show that, against complex backgrounds and under arbitrary clothing, the localization precision of the Deconv-SHN model is significantly higher than that of the traditional algorithm and clearly higher than that of the SHN model, and that it basically meets the requirements of anthropometric applications.
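
The two modifications can be illustrated with a minimal sketch. It is not the authors' implementation: the 64-pixel starting heatmap size, the number of deconvolution layers (two), their kernel/stride/padding settings, and the threshold `beta` are all assumptions chosen for illustration, and the local-response term of the objective is not modeled because the abstract does not define it. The sketch shows only the standard Smooth L1 penalty and the standard output-size relation for a transposed convolution.

```python
def smooth_l1(residual, beta=1.0):
    """Standard Smooth L1 penalty for one residual (beta is an assumed threshold).

    Quadratic for small residuals, linear for large ones, which makes it
    less sensitive to outliers than a pure squared error.
    """
    abs_r = abs(residual)
    if abs_r < beta:
        return 0.5 * abs_r * abs_r / beta
    return abs_r - 0.5 * beta


def deconv_output_size(size, kernel, stride, padding, output_padding=0):
    """Spatial output size of a transposed convolution (deconvolution) layer."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Assumed starting resolution of the low-resolution output feature map.
heatmap_size = 64

# Hypothetical stack of two deconvolution layers, each doubling the resolution.
sizes = [heatmap_size]
for _ in range(2):
    sizes.append(deconv_output_size(sizes[-1], kernel=4, stride=2, padding=1))

print(sizes)
print(smooth_l1(0.5), smooth_l1(-2.0))
```

Under these assumed settings each deconvolution layer doubles the side length of the feature map, so a feature point can be read off a finer grid than the one the unmodified output layer provides.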